Site Reliability Engineer - Configuration Management Tools

HARP Technologies and Services

8 - 13 years

8 - 12 Lacs

Mumbai, Chennai

Posted: 4 weeks ago

Skills Required

Site Reliability Engineering, Configuration Management, Ansible, Bitbucket, Bash, Puppet, Python

Work Mode

Work from Office

Job Type

Full Time

Job Description

Notice period: Immediate to 30 days max

Responsibilities of Senior SRE:

- The Site Reliability Engineering (SRE) team is responsible for the reliability, scalability, stability and performance of systems and services.
- They work with cross-functional teams to design, build and maintain systems, and they troubleshoot issues when they arise. They bridge the gap between development and operations teams.
- They work closely with business teams to define Service Level Objectives (SLOs) and Service Level Agreements (SLAs) for critical systems. They also monitor and maintain the uptime of these systems in line with the defined SLOs and SLAs.
- They deploy and manage monitoring tools to gain insights into system health and performance.
- They analyze performance, identify bottlenecks and implement solutions to improve a system's scalability and latency.
- They develop scripts and implement tools and automation frameworks to reduce the manual effort involved in deployment, monitoring and scaling.
- They work with development teams on the design and development of observability practices such as logging, metrics and tracing. They aim to diagnose and troubleshoot issues proactively.
- They create actionable alerts on monitoring systems to ensure rapid response to potential production incidents.
- They forecast resource needs and provision adequately for current and future demand.
- They design and execute "chaos experiments" to test a system's resilience to failure.
- They own, define and implement the Disaster Recovery (DR) processes for systems. They also conduct planned and unplanned mock DR drills to test response preparedness for production incidents.
- They ensure that security best practices are followed and implemented during the design and operation of systems.
- They also own and maintain documentation of processes, playbooks and systems.
- They publish KPI reports and other system health updates to the business on a regular basis.

Requirements:

- Must-have: Bachelor's degree, preferably in CS or a related field, or equivalent experience.
- Must-have: 12+ years of overall IT experience.
- Must-have: 7+ years of proven work experience as a Senior Site Reliability Engineer or in a similar position.
- Must-have: 5+ years of AWS Cloud experience, with an AWS certification such as AWS Certified DevOps Engineer, SysOps or Security.
- Must-have: 3+ years of experience using a broad range of AWS technologies (e.g. EC2, RDS, ELB, S3, VPC, CloudWatch and monitoring tools) to develop and maintain an AWS-based cloud solution, with an emphasis on best-practice cloud security.
- Must-have: 2+ years of experience with CDN and/or cache systems such as Fastly, Akamai, CloudFront, etc.
- Proven understanding of and strong experience with cloud deployments (AWS / Docker / Kubernetes).
- Knowledge of provisioning with IaC tools and scripting languages such as Terraform, Chef, Ansible, Shell, Groovy, Python, etc.
- Experience with monitoring systems such as CloudWatch, New Relic, Datadog/Splunk and the ELK stack.
- Experience managing cloud network resources (AWS preferred) such as CloudWatch, VPC, URL proxies, PrivateLink, DNS, ACLs, firewalls and C2S access points.
- Platform or application engineering and operational knowledge of CI/CD tooling such as GitHub Actions, Jenkins, etc.
- Experience with other tooling such as JIRA, Bitbucket, Jenkins, Fortify, SonarQube, Nexus and Nexus IQ.
- Experience with configuration automation tools such as Puppet/Ansible/Chef/Salt.
- Scripting skills: strong scripting (e.g. Bash and Python) and automation skills.
- Operating systems: Windows and Linux system administration.
- Problem solving: ability to analyze and resolve complex infrastructure resource and application deployment issues.
- Strong attention to detail. Excellent verbal and written communication skills. Strong documentation skills.

Good To Have:

- Experience with Terraform/Ansible/Chef/Puppet
- Experience with GitHub Actions
- Experience with CloudFront, Fastly

More Jobs at HARP Technologies and Services

Site Reliability Engineer - Configuration Management Tools | Mumbai, Chennai | 8.0 - 13.0 yrs | INR 8 - 12 Lacs

Java Developer - Spring Frameworks | Bengaluru | 4.0 - 7.0 yrs | INR 6 - 9 Lacs

.Net Developer - ASP/C# | Pune | 5.0 - 10.0 yrs | INR 3 - 6 Lacs

Azure Solution Architect | Noida, Hyderabad, Bengaluru | 8.0 - 13.0 yrs | INR 12 - 17 Lacs

Angular Developer - JavaScript | Pune | 4.0 - 8.0 yrs | INR 5 - 8 Lacs

HARP Technologies and Services

Technology Services

Innovation City

250 Employees

5 Jobs

Key People

Alice Smith, CEO

Bob Johnson, CTO

Catherine Davis, COO
