SRE Engineer

5 - 7 years

14 - 16 Lacs

Posted: 2 hours ago | Platform: Naukri

Work Mode

Work from Office

Job Type

Full Time

Job Description

Critical Skills to Possess:

  • 5+ years of Site Reliability Engineering, DevOps, or Infrastructure Engineering experience
  • SRE Principles: Deep understanding of SLOs, SLIs, error budgets, and reliability engineering practices
  • Incident Management: Proven experience with incident response, on-call rotations, and post-mortem processes
  • Automation: Strong scripting abilities in PowerShell, Python, or Bash for automation and tooling
Monitoring and Tools
  • SolarWinds: Advanced experience with SolarWinds NPM, SAM, and custom monitoring setup
  • Azure Monitor: Proficient in Azure Monitor, Log Analytics, and Application Insights
  • Ivanti: Experience with Ivanti ITSM for incident and change management
  • Backup Solutions: Enterprise backup strategy implementation and monitoring
Professional Skills
  • Strong analytical and troubleshooting skills with a systematic problem-solving approach
  • Excellent communication skills for incident coordination and stakeholder updates
  • Experience working in 24/7 production environments with strict SLA requirements
  • Ability to balance reliability with feature velocity and business requirements



Preferred Qualifications:

  • BS degree in Computer Science or Engineering, or equivalent experience

Roles and Responsibilities

Service Reliability and Availability
  • Design and implement service level objectives (SLOs) and service level indicators (SLIs) for critical systems
  • Monitor and maintain 99.9%+ uptime for production environments across hybrid infrastructure
  • Develop and execute incident response procedures and post-incident reviews
  • Implement chaos engineering practices to proactively identify system weaknesses
  • Lead root cause analysis and implement permanent fixes to prevent recurring issues
Monitoring and Observability
  • Design comprehensive monitoring strategies using SolarWinds, Azure Monitor, and custom solutions
  • Implement alerting systems with appropriate escalation procedures and noise reduction
  • Create and maintain dashboards for system health, performance metrics, and business KPIs
  • Establish logging strategies and log aggregation across all platforms
  • Develop automated health checks and synthetic monitoring for critical services
Automation and Infrastructure as Code
  • Develop automation scripts and tools to reduce manual operational overhead
  • Implement infrastructure as code practices for consistent environment provisioning
  • Create self-healing systems and automated remediation procedures
  • Build CI/CD pipelines for infrastructure changes and application deployments
  • Automate backup, recovery, and disaster recovery procedures
Database Reliability Engineering
  • Ensure high availability and performance of Oracle and SQL Server database systems
  • Implement database monitoring, alerting, and automated maintenance procedures
  • Manage database backup strategies and recovery time and recovery point objectives (RTO/RPO)
  • Optimize database performance through query tuning and resource management
  • Coordinate with Informatica ETL processes for data pipeline reliability
Capacity Planning and Performance
  • Conduct capacity planning for compute, storage, and network resources
  • Tune performance across Windows, Linux, and Azure environments
  • Implement auto-scaling solutions for cloud workloads
  • Analyze system performance trends and proactively address bottlenecks
  • Optimize cost efficiency while maintaining performance standards
Security and Compliance
  • Implement security best practices across all infrastructure components
  • Manage automated patching and vulnerability remediation
  • Ensure compliance with security policies and regulatory requirements
  • Implement security monitoring and incident response procedures
  • Coordinate with security teams for threat detection and response

