Site Reliability Engineer

2 - 5 years

25 Lacs

Posted: Just now | Platform: Naukri

Work Mode

Work from Office

Job Type

Full Time

Job Description

  • Build and operate production platforms across Azure (e.g., AKS, App Services, Functions), Windows/Linux, and networking layers in partnership with Platform/Server/Network teams.
  • Engineer end-to-end observability: metrics, logs, and traces via Azure Monitor, Application Insights, Log Analytics, Prometheus, Grafana, and centralized logging.
  • Automate provisioning and configuration using Infrastructure as Code (Terraform/Bicep) and configuration management (Ansible/PowerShell DSC).
  • Design and maintain CI/CD pipelines (Azure DevOps/GitHub Actions) with automated testing, canary/blue-green deployments, and change control alignment.
  • Establish runbooks, SOPs, and self-healing automations to reduce MTTR and ticket volume from the NOC and Service Desk.
  • Harden platform security (identity, secrets, certificates, network segmentation) leveraging Azure Key Vault, managed identities, and policy guardrails.
  • Perform capacity planning, performance tuning, and cost optimization (FinOps) for compute, storage, and networking.
  • Partner with Data/ETL teams to ensure reliability of batch and streaming jobs, scheduling, and dependencies.
  • Create and maintain documentation (architecture, runbooks, dashboards) and support audits and compliance requirements.
  • Bachelor's degree in Computer Science, Engineering, or equivalent experience.
  • 2-5+ years in SRE/DevOps/Platform Engineering with hands-on production ownership.
  • Proficiency with Azure services (AKS, App Services, Functions, Azure Monitor, Log Analytics, Application Insights).
  • Strong Kubernetes/Docker skills; Helm, ingress, service mesh (e.g., Istio/Linkerd) experience is a plus.
  • IaC (Terraform or Bicep) and scripting (PowerShell and/or Python); Git-based workflows.
  • CI/CD (Azure DevOps or GitHub Actions), artifact management, and release strategies (canary/blue-green).
  • Observability tooling (Prometheus, Grafana, ELK/OpenSearch, Azure Monitor) and alert design to minimize noise.
  • Experience with ITIL processes (incident, change, problem) and tools (ServiceNow/Jira).
  • Knowledge of networking, DNS, TLS/certificates, load balancers, and security fundamentals.
  • Excellent troubleshooting, communication, and cross-functional collaboration skills.
  • Certifications such as Microsoft Azure Administrator/DevOps, CKA/CKAD, or ITIL Foundation are a plus.
