Overview of Job Role:
We are looking for a skilled and motivated DevOps Engineer to join our growing team. The ideal candidate will have expertise in AWS, CI/CD pipelines, and Terraform, with a passion for building and optimizing scalable, reliable, and secure infrastructure. This role involves close collaboration with development, QA, and operations teams to streamline deployment processes and enhance system performance.
Roles & Responsibilities:
Leadership & Strategy
- Lead and mentor a team of DevOps engineers, fostering a culture of automation, innovation, and continuous improvement.
- Define and implement DevOps strategies aligned with business objectives to enhance scalability, security, and reliability.
- Collaborate with cross-functional teams, including software engineering, security, MLOps, and infrastructure teams, to drive DevOps best practices.
- Establish KPIs and performance metrics for DevOps operations, ensuring optimal system performance, cost efficiency, and high availability.
- Advocate for CPU throttling, auto-scaling, and workload optimization strategies to improve system efficiency and reduce costs.
- Drive MLOps adoption, integrating machine learning workflows into CI/CD pipelines and cloud infrastructure.
- Ensure compliance with ISO 27001 standards, implementing security controls and risk management measures.
Infrastructure & Automation
- Oversee the design, implementation, and management of scalable, secure, and resilient infrastructure on AWS.
- Lead the adoption of Infrastructure as Code (IaC) using Terraform, CloudFormation, and configuration management tools like Ansible or Chef.
- Spearhead automation efforts for infrastructure provisioning, deployment, and monitoring to reduce manual overhead and improve efficiency.
- Ensure high availability and disaster recovery strategies, leveraging multi-region architectures and failover mechanisms.
- Manage Kubernetes (or AWS ECS/EKS) clusters, optimizing container orchestration for large-scale applications.
- Drive cost optimization initiatives, implementing intelligent cloud resource allocation strategies.
CI/CD & Observability
- Architect and oversee CI/CD pipelines, ensuring seamless automation of application builds, testing, and deployments.
- Enhance observability and monitoring by implementing tools like CloudWatch, Prometheus, Grafana, ELK Stack, or Datadog.
- Develop robust logging, alerting, and anomaly detection mechanisms to ensure proactive issue resolution.
Security & Compliance (ISO 27001 Implementation)
- Lead the implementation and enforcement of ISO 27001 security standards, ensuring compliance with information security policies and regulatory requirements.
- Develop and maintain an Information Security Management System (ISMS) to align with ISO 27001 guidelines.
- Implement secure access controls, encryption, IAM policies, and network security measures to safeguard infrastructure.
- Conduct risk assessments, vulnerability management, and security audits to identify and mitigate threats.
- Ensure security best practices are embedded into all DevOps workflows, following DevSecOps principles.
- Work closely with auditors and compliance teams to maintain compliance with SOC2, GDPR, and other regulatory frameworks.
Required Skills and Qualifications:
- 5+ years of experience in DevOps, cloud infrastructure, and automation, with at least 3 years in a managerial or leadership role.
- Proven experience managing AWS cloud infrastructure at scale, including EC2, S3, RDS, Lambda, VPC, IAM, and CloudFormation.
- Expertise in Terraform and Infrastructure as Code (IaC) principles.
- Strong background in CI/CD pipeline automation with tools like Jenkins, GitHub Actions, GitLab CI, or CircleCI.
- Hands-on experience with Docker and Kubernetes (or AWS ECS/EKS) for container orchestration.
- Experience in CPU throttling, auto-scaling, and performance optimization for cloud-based applications.
- Strong knowledge of Linux/Unix systems, shell scripting, and network configurations.
- Proven experience with ISO 27001 implementation, ISMS development, and security risk management.
- Familiarity with MLOps frameworks like Kubeflow, MLflow, or SageMaker, and integrating ML pipelines into DevOps workflows.
- Deep understanding of observability tools such as ELK Stack, Grafana, Prometheus, or Datadog.
- Strong stakeholder management and communication skills, and the ability to collaborate across teams.
- Experience in regulatory compliance, including SOC2, ISO 27001, and GDPR.
Professional Attributes:
- Strong interpersonal and communication skills: an effective team player who can work with individuals at all levels of the organization and build relationships remotely.
- Excellent prioritization skills, the ability to work well under pressure, and the ability to multi-task.