Spydra - Site Reliability Engineer - DevOps

Spydra

3 years

0 Lacs

Hyderabad Telangana India

Posted: 3 days ago

Skills Required

reliability, DevOps, Kubernetes, support, learning, optimization, data, scalability, ML, automation, design, Jenkins, GitHub, code, Ansible, Terraform, monitoring, logging, MLflow, TensorFlow, training, inference, management, analysis, automate, service, deployment, Python, collaboration, engineering, security, integrity, tooling, GitOps, scripting, Spark, Kafka, AWS, GCP

Work Mode

On-site

Job Type

Full Time

Job Description

We are seeking a highly skilled and self-driven Site Reliability Engineer to join our dynamic team. This role is ideal for someone with a strong foundation in Kubernetes, DevOps, and observability who can also support machine learning infrastructure, GPU optimization, and Big Data ecosystems. You will play a pivotal role in ensuring the reliability, scalability, and performance of our production systems, while also enabling innovation across ML and data teams.

Key Responsibilities

Automation & Reliability

Design, build, and maintain Kubernetes clusters across hybrid or cloud environments (e.g., EKS, GKE, AKS).
Implement and optimize CI/CD pipelines using tools like Jenkins, ArgoCD, and GitHub Actions.
Develop and maintain Infrastructure as Code (IaC) using Ansible, Terraform, or …

Observability

Deploy and maintain monitoring, logging, and tracing tools (e.g., Thanos, Prometheus, Grafana, Loki, Jaeger).
Establish proactive alerting and observability practices to identify and address issues before they impact users.

ML Ops & GPU Optimization

Support and scale ML workflows using tools like Kubeflow, MLflow, and TensorFlow Serving.
Work with data scientists to ensure efficient use of GPU resources, optimizing training and inference.

Incident Management

Lead root cause analysis for infrastructure and application-level incidents.
Participate in the on-call rotation and improve incident response.

Automation

Automate operational tasks and service deployment using Python, Shell, Groovy, or Ansible.
Write reusable scripts and tools to improve team productivity and reduce manual work.

Learning & Collaboration

Stay up to date with emerging technologies in SRE, ML Ops, and observability.
Collaborate with cross-functional teams, including engineering, data science, and security, to ensure system integrity.

Requirements

3+ years of experience as an SRE, DevOps Engineer, or equivalent role.
Strong experience with the Kubernetes ecosystem and container orchestration.
Proficiency in DevOps tooling including Jenkins, ArgoCD, and GitOps workflows.
Deep understanding of observability tools, with hands-on experience using the Thanos and Prometheus stack.
Experience with ML platforms (MLflow, Kubeflow) and supporting GPU workloads.
Strong scripting skills in Python, Shell, Ansible, or …

Preferred Qualifications

CKS (Certified Kubernetes Security Specialist) certification.
Exposure to Big Data platforms (e.g., Spark, Kafka, Hadoop).
Experience with cloud-native environments (AWS, GCP, or Azure).
Background in infrastructure security and compliance.

(ref:hirist.tech)