Site Reliability Operations Engineer (SROE)

0 years

0 Lacs

Posted: 6 days ago | Platform: LinkedIn

Work Mode: On-site

Job Type: Full Time

Job Description

Key Responsibilities

  • Engage with the product and engineering teams to design the best operational structure and processes.
  • Identify and drive opportunities to build resilient systems that help maintain business continuity.
  • Proactively troubleshoot, perform root cause analysis (RCA), and implement permanent fixes for issues across the stack: hardware, software, database, network, and so on.
  • Proactive performance engineering activities on infrastructure.
  • Proactive documentation of architecture diagrams and processes
  • Implementation of proactive monitoring, alerting, trend analysis and self-healing systems
  • Develop continuous delivery for multiple platforms in production and staging environments
  • Regular benchmarking and capacity planning of the infrastructure
  • Test and deploy new technologies/tools as per the project's needs.
  • Infrastructure and platform security
  • Effectively use and maintain infrastructure and config management tools like Puppet, Chef, Ansible, Terraform to deploy and manage infrastructure
  • Demonstrate technical mentoring and coaching to team members
  • Adaptable to work in a fast-paced environment and alter priorities as per business needs

Required Skills

  • Experience with Unix/Linux operating system internals and administration (e.g. filesystems, inodes, system calls, etc.)
  • Good understanding of the network stack (e.g. TCP/IP, routing, network topologies and hardware, SDN, etc.)
  • Knowledge of performance engineering tools.
  • Hands-on experience with any of the public clouds, such as AWS, GCP, or Azure
  • Proactive in learning and testing new tools to improve infrastructure
  • Understanding of scripting languages like Python and Bash, and the ability to learn new languages when needed
  • Strong understanding of project and infrastructure operational needs and infrastructure architecture
  • Expertise in some of the tools/skills below:
  • Container orchestration technologies like Kubernetes and Mesos
  • Understanding of Infrastructure as Code (we use Puppet, Ansible and Terraform) and containerization tool sets (we use Docker).
  • Data intensive applications and platforms like Kafka, Hadoop, Spark, Zookeeper, Cassandra, PostgreSQL OLAP, Druid
  • Relational databases like MySQL, Oracle, PostgreSQL etc
  • NoSQL databases like Redis, MongoDB, Cassandra, CouchDB etc
  • One or more CI tools like Jenkins, TeamCity
  • Strong knowledge of centralized logging systems, metrics, and tooling frameworks such as ELK, Prometheus, and Grafana.
  • Web and application servers like Apache, Nginx, Tomcat
  • Version control tools such as Git.
  • Ability to work independently and own problem statements end-to-end.
  • Great communication, interpersonal and teamwork skills
