Job Summary
The DevOps Engineering Lead will play a crucial role in bridging the gap between the Engineering and Technology Operations teams. The primary responsibility will be to streamline the software development process, automate infrastructure deployment, and enhance system reliability, scalability, and security. You will collaborate closely with developers, system administrators, and other stakeholders while reporting to the Chief Technology Officer.
Responsibilities
Design, implement, and maintain continuous integration and continuous deployment (CI/CD) pipelines to automate software delivery processes.
Configure and manage infrastructure as code (IaC) using tools like Terraform, Helm, or CloudFormation to provision and manage cloud resources, including Kubernetes platforms such as EKS and Karpenter.
Monitor, analyse, and optimize system performance, resource utilization, and reliability metrics.
Implement and maintain robust monitoring, logging, and alerting solutions to ensure timely detection and resolution of issues.
Collaborate with development teams to improve application architecture for scalability, performance, and security.
Ensure adherence to security best practices and compliance requirements throughout the software development lifecycle.
Troubleshoot and resolve technical issues in cloud environments.
Provide guidance and support to development teams on DevOps best practices, tools, and processes.
Participate in the on-call rotation and incident response activities to ensure system availability and reliability.
Requirements
7-10 years’ experience in a similar DevOps role managing enterprise-grade applications and infrastructure.
Hands-on experience with CI/CD tools like Jenkins, GitLab CI/CD, or GitHub Actions.
Familiarity with containerization and orchestration tools like Docker and Kubernetes.
Experience with cloud platforms such as AWS or Azure, and with serverless computing.
Understanding of networking concepts and protocols (TCP/IP, DNS, HTTP, SSL/TLS).
Proficiency in Linux/Unix system administration and troubleshooting.
Familiarity with monitoring and logging tools like Prometheus, the ELK stack (Elasticsearch, Logstash, Kibana), Grafana, or Splunk.
Strong analytical and problem-solving skills with the ability to troubleshoot complex issues in distributed systems.
Excellent communication and collaboration skills to work effectively in a cross-functional team environment.