About us: Where elite tech talent meets world-class opportunities! At Xenon7, we work with leading enterprises and innovative startups on exciting, cutting-edge projects that leverage the latest technologies across various domains of IT, including Data, Web, Infrastructure, AI, and many others. Our expertise in IT solutions development and on-demand resources allows us to partner with clients on transformative initiatives, driving innovation and business growth. Whether it's empowering global organizations or collaborating with trailblazing startups, we are committed to delivering advanced, impactful solutions that meet today's most complex challenges.
Requirements
Location: Remote (India / UK)
Experience: 5-7 years
Employment Type: Full-time
About The Role
We are seeking a skilled Infrastructure & DevOps Engineer to support the development and deployment of AWS SageMaker Unified Studio, building on the existing SageMaker ecosystem. The role involves designing, automating, and maintaining cloud-native infrastructure that enables scalable, secure, and efficient machine learning workflows.
You will collaborate with data scientists, ML engineers, and platform teams to ensure seamless integration of new features into Unified Studio, focusing on reliability, automation, and operational excellence.
Key Responsibilities
- Design and implement cloud infrastructure to support SageMaker Unified Studio features
- Automate deployment pipelines using CI/CD tools (CodePipeline, Jenkins, GitHub Actions, etc.)
- Manage infrastructure as code (IaC) with Terraform/CloudFormation
- Ensure scalability, security, and compliance of ML workloads in AWS
- Monitor and optimize SageMaker Studio environments, including notebooks, pipelines, and endpoints
- Collaborate with ML engineers to integrate new Unified Studio capabilities into existing workflows
- Implement observability solutions (CloudWatch, Prometheus, Grafana) for proactive monitoring
- Troubleshoot infrastructure and deployment issues across distributed ML systems
- Drive DevOps best practices for automation, testing, and release management
Qualifications
- Bachelor's degree in Computer Science, Engineering, or related field
- 3-5 years of experience in Infrastructure/DevOps roles
- Strong expertise in AWS services: SageMaker, EC2, S3, IAM, CloudFormation, Lambda, EKS
- Hands-on experience with CI/CD pipelines and automation frameworks
- Proficiency in Terraform, CloudFormation, or Ansible for IaC
- Solid understanding of Docker & Kubernetes for containerized ML workloads
- Familiarity with ML workflows and SageMaker Studio (preferred)
- Strong scripting skills in Python, Bash, or Go
- Experience with monitoring/logging tools (CloudWatch, ELK, Prometheus)
- Excellent problem-solving and communication skills
Preferred Skills
- Exposure to SageMaker Unified Studio or similar ML orchestration platforms
- Knowledge of data engineering pipelines and ML lifecycle management
- Experience working in financial services or regulated industries (bonus)