Company: takealot.com
Industry: Sales / Retail
Deadline: Not specified
Job Type: Full Time
Experience: 3 – 5 years
Location: Cape Town
Province: Western Cape
Field: ICT / Computer
Your mission, should you choose to accept it:
- Cloud Infrastructure Management: Design, implement, and manage scalable and resilient infrastructure using AWS services like EC2, Lambda, Aurora RDS PostgreSQL, and DynamoDB.
- Container Orchestration: Deploy, manage, and scale applications in Kubernetes.
- Monitoring & Observability: Set up and maintain comprehensive monitoring using Grafana Cloud, Mimir, Loki, Tempo, and OpenTelemetry.
- CI/CD Integration: Automate deployments with robust CI/CD pipelines. Familiarity with tools like GitHub Actions and AWS CodeBuild is essential.
- Log Management & Analysis: Utilize tools like OpenSearch/Elasticsearch and Loki for log analysis and troubleshooting.
- Scripting & Automation: Develop scripts and tools using Python and Golang to automate tasks and processes.
- Database Management: Manage and optimize data workflows across databases like Aurora RDS PostgreSQL and DynamoDB.
- Stream Processing: Work with Kafka for real-time data processing and integration workflows.
- Incident Management: Participate in on-call rotations, providing expertise in incident resolution and system troubleshooting.
The skills we need:
- AWS Expertise: An understanding of AWS services and cloud architecture best practices.
- Kubernetes Proficiency: Hands-on experience in deploying and managing Kubernetes clusters.
- Programming Skills: Proficiency in Python and Golang for automation and development.
- Observability Tools: Experience with Grafana, OpenTelemetry, Mimir, Loki, and Tempo.
- Database Technologies: Knowledge of OpenSearch/Elasticsearch and Aurora RDS PostgreSQL.
- Streaming Platforms: Practical experience with Kafka for data stream processing.
- Infrastructure as Code (IaC): Use tools like Terraform or CloudFormation to manage AWS resources as code.
- Strong problem-solving skills and attention to detail.
- Excellent communication and collaboration skills.
- Ability to work independently and within a team in a fast-paced environment.
- Passion for continuous learning and staying updated with industry trends.
Nice to have:
- Experience with NoSQL, PostgreSQL, DynamoDB, Elasticsearch
- Experience with common web stack applications (nginx, tornado, FastAPI)
- Experience with messaging platforms (Kafka, Kinesis, SQS, SNS)
- Experience with Google (GCP, Firebase)
Qualifications & Experience:
- A Bachelor’s Degree or Advanced Diploma in Information Systems, Computer Science, Mathematics, or Engineering, plus 3 years of hands-on experience in a DevOps or Site Reliability Engineering role, is required.
- Candidates without a Bachelor’s Degree or Advanced Diploma in one of these fields must instead have equivalent experience: a minimum of 6 years’ experience in a software/technology environment.
- Certifications in AWS or Kubernetes are advantageous.
- 3 – 5 years of hands-on experience in a DevOps or Site Reliability Engineering role.
- 2 – 5 years of experience in Python and Golang for automation, scripting, and development.
- AWS Expertise: 3 years of comprehensive experience with AWS services, including EC2, Lambda, DynamoDB, Aurora RDS PostgreSQL, and AWS OpenSearch. Ability to design and manage scalable and resilient cloud architectures.
- Kubernetes Proficiency: 3 years of hands-on experience with deploying, managing, and scaling applications in Kubernetes environments. Practical experience with Helm and ArgoCD. Understanding of containerization concepts and tools like Docker/Podman.
- Infrastructure as Code (IaC): 3 years of experience with IaC tools like Terraform or CloudFormation to manage and automate cloud resources effectively. CloudFormation is preferred.