Site Reliability Engineer

Globallogic

Experience: 10 - 15 years

Salary: INR 20 - 35 Lacs

Location: Hyderabad

Posted: 2 hours ago

Skills Required

fundamentals, continuous integration, kubernetes, functional, python, ci/cd, networking, cloud platforms, monitoring, docker, scripting, unix, system environment, git, collaboration, linux, leadership, cloud infrastructure, debugging, aws, programming, devops tools, communication skills

Work Mode

Work from Office

Job Type

Full Time

Job Description

Description:

Hiring an SRE Lead for the Hyderabad location.

Requirements:

Qualifications:
• Bachelor’s or Master’s degree in Computer Science, Information Systems, Engineering, or a related technical field.
• 12+ years of total experience in infrastructure, platform engineering, or software development roles, including at least 3–5 years in an SRE or DevOps leadership role.
• Deep understanding of Linux/Unix systems, networking fundamentals, and containerized environments (Docker, Kubernetes).
• Proven experience managing large-scale production systems, including high-availability, distributed, and event-driven architectures.
• Strong hands-on experience with cloud platforms such as AWS, GCP, or Azure and infrastructure-as-code tools (e.g., Terraform, CloudFormation).
• Proficiency in at least one scripting or programming language (Python, Go, Shell, Java, etc.).
• Demonstrated experience building observability solutions (metrics, logs, traces) and integrating them into proactive monitoring and alerting systems.
• Solid understanding of incident response practices, runbook automation, on-call rotation management, and disaster recovery planning.
• Familiarity with modern CI/CD tools (Jenkins, GitLab CI, Argo CD, Spinnaker) and release automation best practices.
• Strong problem-solving and debugging skills, especially in high-pressure, production-critical environments.
• Excellent leadership, communication, and cross-functional collaboration skills.

Responsibilities:
• Lead the SRE function, owning end-to-end service reliability, observability, incident management, capacity planning, and production readiness.
• Establish SLOs, SLIs, and error budgets in collaboration with product and engineering teams to drive service quality goals.
• Build and maintain highly available, fault-tolerant, and self-healing infrastructure leveraging IaC, automation, and scalable architectures.
• Design and implement monitoring, alerting, and observability platforms using tools like Prometheus, Grafana, Datadog, ELK/EFK stack, or equivalent.
• Drive the evolution of CI/CD pipelines, release automation, and safe deployment practices using GitOps or similar methodologies.
• Lead and refine the incident management lifecycle, including root cause analysis (RCA), incident postmortems, and production runbooks.
• Optimize cost, performance, and scalability of cloud infrastructure across hybrid or multi-cloud environments (AWS, GCP, Azure).
• Champion DevSecOps and SRE best practices, advocating for early detection, chaos engineering, and continuous improvement in resilience engineering.
• Mentor and develop a team of SREs and platform engineers; conduct performance reviews and technical coaching.
• Serve as a key advisor in architectural reviews to ensure systems are built with reliability, scalability, and observability in mind.
• Maintain strong partnerships with Security, Product, QA, and Engineering teams to support agile development and delivery.

What We Offer:

Exciting Projects:

Collaborative Environment:

Work-Life Balance:

Professional Development:

Excellent Benefits:

Fun Perks:

Globallogic

Software Development

Santa Clara CA
