Sr. Principal Site Reliability Engineer

7 - 11 years

17 - 22 Lacs

Posted: 1 week ago | Platform: Naukri

Work Mode

Work from Office

Job Type

Full Time

Job Description

Position Summary
  • F5 Inc. is actively seeking an exceptional Sr Principal Software Engineer (Individual Contributor) to play a pivotal role in our SRE Operations team for the groundbreaking F5XC Product.
  • Are you an SRE Operations specialist with automation in your DNA? Do you thrive in fast-paced SaaS environments where

    Why This Role is Unique:
    Our SaaS is hybrid, running across public cloud and a global network of 50+ PoPs and delivering terabits of capacity. Our infrastructure spans cloud-native services and physical networking gear (routers, switches, firewalls), creating a uniquely challenging and exciting observability landscape. The Analytics & Observability platform will have deep reach across these layers, ensuring reliability, security, and performance at massive scale.

    What You'll Do:
    Be the Force Behind Observability & Stability
  • Drive end-to-end Observability (Logs, Metrics, and Alerts) across our hybrid SaaS stack, spanning cloud, edge, and physical network devices.
  • Take ownership of Alerting strategy, cutting through noise while ensuring actionable, high-fidelity alerts.
  • Implement intelligent automation to reduce operational toil and enhance real-time visibility.
    Own & Automate Operations
  • Design, build, and manage automation for self-healing infrastructure across cloud + global PoPs.
  • Develop automation for Kubernetes, ArgoCD, Helm Charts, Golang-based services, AWS, GCP, Terraform.
  • Improve networking observability, ensuring our routers, switches, and firewalls are monitored at scale.
  • Continuously eliminate manual ops work through automation and platform improvements.
    Lead Incident Response & Operational Excellence
  • Participate in on-call rotations, ensuring rapid incident response across our cloud + edge stack.
  • Drive incident response automation, reducing MTTR and increasing system resilience.
  • Ensure security, compliance, and best practices in observability & automation.
    Collaborate & Mentor
  • Work closely with application teams, network engineers, and SREs to improve reliability and performance.
  • Mentor junior engineers, fostering a culture of automation-first thinking and deep observability.

    What Makes You a Great Fit?
  • Deep expertise in Logs, Metrics, and Alerting, with a strong focus on Alerting automation.
  • Experience in hybrid SaaS environments spanning cloud-native and global infrastructure.
  • Strong background in Kubernetes, Infrastructure-as-Code (Terraform), Golang, AWS/GCP, and networking observability.
  • Proven track record of eliminating toil and improving operational efficiency through automation.
  • Passion for deep observability, networking-scale analytics, and automation at the edge.

    If you love solving reliability challenges at global scale, automating everything, and working in a hybrid cloud + networking environment, we want to talk to you!

    The About The Role section is intended to be a general representation of the responsibilities and requirements of the job. However, the description may not be all-inclusive, and responsibilities and requirements are subject to change.

    Must-Have:

  • Observability & Alerting Expertise: Strong experience with Logs, Metrics, and Alerts, with a focus on high-fidelity alerting and automation.
  • Automation & Infrastructure as Code: Deep knowledge of Terraform, ArgoCD, Helm, Kubernetes, and Golang for automation.
  • Cloud & Hybrid SaaS Experience: Hands-on experience managing cloud-native (AWS/GCP) and edge infrastructure.
  • Incident Response & Reliability Engineering: Strong on-call experience, with a track record of reducing MTTR through automation.
  • Kubernetes Mastery: Hands-on experience deploying, managing, and troubleshooting Kubernetes in production environments.

    Nice-to-Have:

  • Networking & Edge Observability: Familiarity with monitoring routers, switches, and firewalls in a global PoP environment.
  • Data & Analytics in Observability: Experience with time-series databases and observability tooling (Prometheus, Grafana, OpenTelemetry, etc.).
  • Security & Compliance Awareness: Understanding of secure-by-design principles for monitoring & alerting.
  • Mentorship & Collaboration: Ability to mentor junior engineers and work cross-functionally with SREs, application teams, and network engineers.
  • High Availability & Disaster Recovery: Experience with HA/DR and migration.

    Qualifications

  • Typically requires at least 18 years of related experience with a bachelor's degree, 15 years with a master's degree, or 12 years with a PhD; or equivalent experience.
  • Excellent organizational agility and communication skills across the organization.
    Environment
  • Empowered Work Culture: Experience an environment that values autonomy, fostering a culture where creativity and ownership are encouraged.
  • Continuous Learning: Benefit from the mentorship of experienced professionals with solid backgrounds across diverse domains, supporting your professional growth.
  • Team Cohesion: Join a collaborative and supportive team where you'll feel at home from day one, contributing to a positive and inspiring workplace.
  • F5 Networks, Inc. is an equal opportunity employer and strongly supports diversity in the workplace.
