Site Reliability Engineer

5 - 9 years

0 Lacs

Posted: 1 day ago | Platform: Shine

Work Mode

On-site

Job Type

Full Time

Job Description

As a Site Reliability Engineer, you will play a crucial role in ensuring the reliability and uptime of critical services for our client's team. Your primary responsibilities will revolve around Kubernetes administration, CentOS server management, Java application support, incident handling, and change management.

The ideal candidate has a solid background in ArgoCD for Kubernetes management, Linux proficiency, basic scripting skills, and familiarity with modern monitoring, alerting, and automation tools. We are seeking a self-motivated individual with strong verbal and written communication skills who can work effectively both independently and collaboratively.

Your daily tasks will include monitoring, maintaining, and managing applications on CentOS servers to ensure high availability and performance. You will carry out routine system and application maintenance following standard operating procedures to prevent issues and resolve them promptly. You will also respond to and manage incidents, facilitate post-mortem meetings, conduct root cause analysis, and ensure timely issue resolution. In addition, you will monitor production systems and applications and their overall performance, using tools to detect abnormal software behavior and collect the information developers need to understand and address the underlying causes. Your responsibilities also include security checks, policy and procedure documentation, writing scripts and code to develop tools and services, post-mortem learning, and administration of tools such as Jira and New Relic.

In terms of technical skills, you should have at least 5 years of experience in a SaaS and Cloud environment. Proficiency in the following is essential: Kubernetes cluster administration, Linux scripting, database systems (MySQL, DB2), Linux (CentOS / RHEL) administration, change management procedures, on-call responsibilities, deployment management using Jenkins, monitoring tools (e.g., New Relic, Splunk, Nagios), log aggregation tools (e.g., Splunk, Loki, Grafana), and scripting in at least one language. Experience with API programming and integrating tools such as Jira, Slack, and xMatters/PagerDuty will be advantageous for this role.
