Site Reliability Engineer

5 - 7 years

15 - 20 Lacs

Platform: Naukri

Work Mode

Work from Office

Job Type

Full Time

Job Description

Key Responsibilities:

Identity-Related Efforts

• Lead identity efforts including authentication, authorization, and directory services integration.

• Work extensively with Okta for identity and access management.

• Design, build, test, and launch Terraform workflows end-to-end.

• Define and manage Infrastructure as Code (IaC) using Terraform to provision and manage infrastructure.

• Integrate identity systems into the larger cloud and application infrastructure securely and efficiently.

• Ensure identity-related components are scalable, secure, and resilient.

Observability-Related Efforts

• Plan, build, test, and launch an end-to-end observability platform.

• Deploy and manage the Kubermetheus stack, including Kubernetes, Prometheus, Loki, Grafana, and Alertmanager.

• Integrate PagerDuty for effective incident response and alerting.

• Monitor system health and performance; define and tune alerts to ensure proactive resolution of issues.

• Use AWS to manage cloud infrastructure and observability in production environments.

DevOps & Engineering Responsibilities

• Develop and maintain scalable CI/CD pipelines for application and infrastructure deployment.

• Write clean, reusable, and efficient Python code for automation, tooling, and scripting tasks.

• Collaborate closely with development and operations teams to ensure system reliability and best practices.

• Participate in on-call rotations and incident response activities.

• Maintain technical documentation and share knowledge through internal sessions.

Required Skills:

• Strong understanding of authentication, authorization, and directory services

• Hands-on experience with Okta for IAM

• Expertise in Terraform; must be able to plan, build, test, and deploy complete workflows

• Proficiency in Python scripting, automation, and tooling

• CI/CD pipeline experience (e.g., Jenkins, GitHub Actions, GitLab CI/CD)

• Infrastructure as Code (IaC) using tools like Terraform, Ansible, or CloudFormation

• Strong troubleshooting and problem-solving skills

• Observability stack experience: Kubernetes, Prometheus, Grafana, Loki, Alertmanager

• Cloud experience, especially AWS

• Experience with PagerDuty for incident alerting

• A portfolio of recent projects is required, and interviews will include coding challenges

Keyutech

Information Technology

Dallas
