
869 Prometheus Jobs - Page 23

Set up a job alert
JobPe aggregates listings for easy access, but applications are submitted directly on the employer's job portal.

2.0 - 4.0 years

4 - 6 Lacs

Bengaluru

Work from Office

ZS is a place where passion changes lives. As a management consulting and technology firm focused on improving life and how we live it, our most valuable asset is our people. Here you'll work side-by-side with a powerful collective of thinkers and experts shaping life-changing solutions for patients, caregivers and consumers worldwide. ZSers drive impact by bringing a client-first mentality to each and every engagement. We partner collaboratively with our clients to develop custom solutions and technology products that create value and deliver company results across critical areas of their business. Bring your curiosity for learning, bold ideas, courage and passion to drive life-changing impact to ZS.

At ZS we honor the visible and invisible elements of our identities, personal experiences and belief systems: the ones that comprise us as individuals, shape who we are and make us unique. We believe your personal interests, identities, and desire to learn are part of your success here. Learn more about our diversity, equity, and inclusion efforts and the networks ZS supports to assist our ZSers in cultivating community spaces, obtaining the resources they need to thrive, and sharing the messages they are passionate about.

The Platform and Product team is shaping one of the key growth vectors for ZS, with engagements spanning clients from industries such as quick-service restaurants, technology, food and beverage, hospitality, travel, insurance, consumer products and goods, and others across North America, Europe and Southeast Asia. The Platform and Product India team currently has a presence across the New Delhi, Pune and Bengaluru offices and is continuously expanding at a great pace. The team works with colleagues across clients and geographies to create and deliver pragmatic real-world solutions leveraging AI SaaS products and platforms, generative AI applications, and other advanced analytics solutions at scale.

What You'll Do:
- Work with cloud technologies: AWS, Azure or GCP.
- Create container images and maintain container registries.
- Create, update, and maintain production-grade applications on Kubernetes clusters and cloud.
- Follow a GitOps approach to maintain deployments.
- Create YAML scripts and Helm charts for Kubernetes deployments as required.
- Take part in cloud design and architecture decisions and support lead architects in building cloud-agnostic applications.
- Create and maintain infrastructure-as-code templates to automate cloud infrastructure deployment.
- Create and manage CI/CD pipelines to automate containerized deployments to cloud and Kubernetes.
- Maintain Git repositories, and establish proper branching strategies and release management processes.
- Support and maintain source code management and build tools.
- Monitor applications on cloud and Kubernetes using tools like ELK, Grafana and Prometheus.
- Automate day-to-day activities using scripting.
- Work closely with the development team to implement new build processes and strategies to meet new product requirements.
- Troubleshoot, problem-solve, perform root cause analysis, and produce documentation related to builds, releases, and deployments.
- Ensure that systems are secure and compliant with industry standards.

What You'll Bring:
- A master's or bachelor's degree in computer science or a related field from a top university.
- 2-4+ years of hands-on experience in DevOps.
- Hands-on experience designing and deploying applications to the cloud (AWS/Azure/GCP).
- Expertise in deploying and maintaining applications on Kubernetes.
- Technical expertise in release automation engineering, CI/CD or related roles.
- Hands-on experience writing Terraform templates as IaC, Helm charts, and Kubernetes manifests.
- A strong hold on Linux commands and script automation.
- Technical understanding of development tools, source control, and continuous integration build systems, e.g. Azure DevOps, Jenkins, GitLab, TeamCity, etc.
- Knowledge of deploying LLM models and toolchains.
- Configuration management of various environments.
- Experience working in agile teams with short release cycles.
- Good to have: programming experience in Python or Go.
- The characteristics of a forward thinker and self-starter who thrives on new challenges and adapts quickly to new knowledge.

Perks & Benefits: ZS offers a comprehensive total rewards package including health and well-being, financial planning, annual leave, personal growth and professional development. Our robust skills development programs, multiple career progression options, internal mobility paths and collaborative culture empower you to thrive as an individual and global team member.

We are committed to giving our employees a flexible and connected way of working. A flexible and connected ZS allows us to combine work from home and on-site presence at clients/ZS offices for the majority of our week. The magic of ZS culture and innovation thrives in both planned and spontaneous face-to-face connections.

Travel: Travel is a requirement at ZS for client-facing ZSers; the business needs of your project and client are the priority. While some projects may be local, all client-facing ZSers should be prepared to travel as needed. Travel provides opportunities to strengthen client relationships, gain diverse experiences, and enhance professional growth by working in different environments and cultures.

Considering applying? At ZS, we're building a diverse and inclusive company where people bring their passions to inspire life-changing impact and deliver better outcomes for all. We are most interested in finding the best candidate for the job and recognize the value that candidates with all backgrounds, including non-traditional ones, bring. If you are interested in joining us, we encourage you to apply even if you don't meet 100% of the requirements listed above.

To Complete Your Application: Candidates must possess or be able to obtain work authorization for their intended country of employment. An online application, including a full set of transcripts (official or unofficial), is required to be considered.
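
Editor's note: the responsibilities above revolve around scripted Kubernetes deployments and GitOps-style automation. As a minimal illustrative sketch (not part of the posting), the snippet below uses the official kubernetes Python client to create or update a Deployment; the application name, image, port, and namespace are hypothetical placeholders.

```python
# pip install kubernetes
from kubernetes import client, config
from kubernetes.client.rest import ApiException

def deploy(name: str, image: str, replicas: int = 2, namespace: str = "default") -> None:
    """Create the Deployment, or replace it if it already exists."""
    config.load_kube_config()  # local kubeconfig; use load_incluster_config() in-cluster
    container = client.V1Container(
        name=name,
        image=image,
        ports=[client.V1ContainerPort(container_port=8080)],
    )
    body = client.V1Deployment(
        metadata=client.V1ObjectMeta(name=name),
        spec=client.V1DeploymentSpec(
            replicas=replicas,
            selector=client.V1LabelSelector(match_labels={"app": name}),
            template=client.V1PodTemplateSpec(
                metadata=client.V1ObjectMeta(labels={"app": name}),
                spec=client.V1PodSpec(containers=[container]),
            ),
        ),
    )
    apps = client.AppsV1Api()
    try:
        apps.create_namespaced_deployment(namespace=namespace, body=body)
    except ApiException as e:
        if e.status == 409:  # already exists: replace instead
            apps.replace_namespaced_deployment(name=name, namespace=namespace, body=body)
        else:
            raise

if __name__ == "__main__":
    deploy("demo-api", "registry.example.com/demo-api:1.0.0")  # hypothetical image
```

In a GitOps setup this kind of imperative script would typically be replaced by manifests in Git reconciled by a controller, but it shows the same Deployment object the YAML would declare.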

Posted 1 month ago

Apply

4.0 - 7.0 years

11 - 16 Lacs

Pune

Hybrid

So, what's the role all about?

As a Sr. Cloud Services Automation Engineer, you will be responsible for designing, developing, and maintaining robust end-to-end automation solutions that support our customer onboarding processes from an on-prem software solution to the Azure SaaS platform and streamline cloud operations. You will work closely with Professional Services, Cloud Operations, and Engineering teams to implement tools and frameworks that ensure seamless deployment, monitoring, and self-healing of applications running in Azure.

How will you make an impact?
- Design and develop automated workflows that orchestrate complex processes across multiple systems, databases, endpoints, and storage solutions on-prem and in the public cloud.
- Design, develop, and maintain internal tools/utilities using C#, PowerShell, Python, and Bash to automate and optimize cloud onboarding workflows.
- Create integrations with REST APIs and other services to ingest and process external/internal data.
- Query and analyze data from various sources such as SQL databases, Elasticsearch indices, and log files (structured and unstructured).
- Develop utilities to visualize, summarize, or otherwise make data actionable for Professional Services and QA engineers.
- Work closely with test, ingestion, and configuration teams to understand bottlenecks and build self-healing mechanisms for high availability and performance.
- Build automated data pipelines with data consistency and reconciliation checks, using tools like Power BI/Grafana to collect metrics from multiple endpoints and generate centralized, actionable dashboards.
- Automate resource provisioning across Azure services including AKS, Web Apps, and storage solutions.
- Build infrastructure-as-code (IaC) solutions using tools like Terraform, Bicep, or ARM templates.
- Develop end-to-end workflow automation across the customer onboarding journey, from Day 1 to Day 2, with minimal manual intervention.

Have you got what it takes?
- Bachelor's degree in computer science, engineering, or a related field (or equivalent experience).
- Proficiency in scripting and programming languages (e.g., C#, .NET, PowerShell, Python, Bash).
- Experience working with and integrating REST APIs.
- Experience with IaC and configuration management tools (e.g., Terraform, Ansible).
- Familiarity with monitoring and logging solutions (e.g., Azure Monitor, Log Analytics, Prometheus, Grafana).
- Familiarity with modern version control systems (e.g., GitHub).
- Excellent problem-solving skills and attention to detail.
- Ability to work with development and operations teams to achieve desired results on common projects.
- Strategic thinker, capable of learning new technologies quickly.
- Good communication with peers, subordinates and managers.

You will have an advantage if you also have:
- Experience with AKS infrastructure administration.
- Experience orchestrating automation with Azure Automation tools like Logic Apps.
- Experience working in a secure, compliance-driven environment (e.g. CJIS/PCI/SOX/ISO).
- Certifications in vendor- or industry-specific technologies.

What's in it for you? Join an ever-growing, market-disrupting global company where the teams, comprised of the best of the best, work in a fast-paced, collaborative, and creative environment! As the market leader, every day at NiCE is a chance to learn and grow, and there are endless internal career opportunities across multiple roles, disciplines, domains, and locations. If you are passionate, innovative, and excited to constantly raise the bar, you may just be our next NiCEr!

Enjoy NiCE-FLEX! At NiCE, we work according to the NiCE-FLEX hybrid model, which enables maximum flexibility: 2 days working from the office and 3 days of remote work each week. Naturally, office days focus on face-to-face meetings, where teamwork and collaborative thinking generate innovation, new ideas, and a vibrant, interactive atmosphere.

Requisition ID: 7454
Reporting into: Director
Role Type: Individual Contributor
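
Editor's note: as a rough illustration of the data-plumbing utilities this role describes, here is a minimal Python sketch that pages through a REST API and summarizes records into dashboard-ready counts; the endpoint URL and field names are hypothetical.

```python
# pip install requests
import collections
import requests

API_URL = "https://api.example.com/v1/onboarding/events"  # hypothetical endpoint

def fetch_events(page_size: int = 100):
    """Page through a REST API and yield event records."""
    page = 1
    while True:
        resp = requests.get(API_URL, params={"page": page, "size": page_size}, timeout=30)
        resp.raise_for_status()
        batch = resp.json().get("items", [])
        if not batch:
            return
        yield from batch
        page += 1

def summarize(events):
    """Count events per status so the result can feed a Grafana/Power BI table."""
    return dict(collections.Counter(e.get("status", "unknown") for e in events))

if __name__ == "__main__":
    print(summarize(fetch_events()))
```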

Posted 1 month ago

Apply

6.0 - 11.0 years

10 - 16 Lacs

Pune

Remote

What You'll Do

We are looking for experienced Machine Learning Engineers with a background in software development and a deep enthusiasm for solving complex problems. You will lead a dynamic team dedicated to designing and implementing a large language model framework to power diverse applications across Avalara. Your responsibilities will span the entire development lifecycle, including conceptualization, prototyping and delivery of LLM platform features. You will build core agent infrastructure (A2A orchestration and MCP-driven tool discovery) so teams can launch secure, scalable agent workflows. You will be reporting to the Senior Manager, Machine Learning.

What Your Responsibilities Will Be

We are looking for engineers who can think quickly and have a background in implementation. Your responsibilities will include:
- Build on top of the foundational framework for supporting Large Language Model applications at Avalara.
- Work with LLMs such as GPT, Claude, Llama and other Bedrock models.
- Leverage best practices in software development, including continuous integration/continuous deployment (CI/CD), with appropriate functional and unit testing in place.
- Promote innovation by researching and applying the latest technologies and methodologies in machine learning and software development.
- Write, review, and maintain high-quality code that meets industry standards, contributing to the project's success.
- Lead code review sessions, ensuring good code quality and documentation.
- Mentor junior engineers, encouraging a culture of collaboration.
- Develop and debug software, preferably in Python; familiarity with additional programming languages is valued and encouraged.

What You'll Need to be Successful
- 6+ years of experience building machine learning models and deploying them in production environments as part of creating solutions to complex customer problems.
- Proficiency working in cloud computing environments (AWS, Azure, GCP), machine learning frameworks, and software development best practices.
- Experience working with technological innovations in AI & ML (esp. GenAI) and applying them.
- Experience with design patterns and data structures.
- Good analytical, design and debugging skills.

Technologies you will work with: Python, LLMs, Agents, A2A, MCP, MLflow, Docker, Kubernetes, Terraform, AWS, GitLab, Postgres, Prometheus, and Grafana.
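
Editor's note: since the posting lists MLflow among its technologies, here is a minimal, generic sketch of experiment tracking with MLflow; the model, parameters, and run name are illustrative, not Avalara's actual setup.

```python
# pip install mlflow scikit-learn
import mlflow
import mlflow.sklearn
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1_000, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

with mlflow.start_run(run_name="rf-baseline"):  # hypothetical run name
    model = RandomForestClassifier(n_estimators=100, random_state=42)
    model.fit(X_train, y_train)
    acc = accuracy_score(y_test, model.predict(X_test))
    mlflow.log_param("n_estimators", 100)
    mlflow.log_metric("accuracy", acc)
    mlflow.sklearn.log_model(model, "model")  # versioned artifact for later serving
```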

Posted 1 month ago

Apply

5.0 - 10.0 years

8 - 15 Lacs

Pune

Remote

What You'll Do

We are looking for experienced Machine Learning Engineers with a background in software development and a deep enthusiasm for solving complex problems. You will lead a dynamic team dedicated to designing and implementing a large language model framework to power diverse applications across Avalara. Your responsibilities will span the entire development lifecycle, including conceptualization, prototyping and delivery of LLM platform features. You will bring a blend of technical skills in AI and machine learning, especially LLMs, and a deep-seated understanding of software development practices, working with a team to ensure our systems are scalable, performant and accurate. You will be reporting to the Senior Manager, AI/ML.

What Your Responsibilities Will Be

We are looking for engineers who can think quickly and have a background in implementation. Your responsibilities will include:
- Build on top of the foundational framework for supporting Large Language Model applications at Avalara.
- Work with LLMs such as GPT, Claude, Llama and other Bedrock models.
- Leverage best practices in software development, including continuous integration/continuous deployment (CI/CD), with appropriate functional and unit testing in place.
- Inspire creativity by researching and applying the latest technologies and methodologies in machine learning and software development.
- Write, review, and maintain high-quality code that meets industry standards.
- Lead code review sessions, ensuring good code quality and documentation.
- Mentor junior engineers, encouraging a culture of collaboration.
- Develop and debug software, preferably in Python; familiarity with additional programming languages is valued and encouraged.

What You'll Need to be Successful
- Bachelor's/Master's degree in computer science with 5+ years of industry experience in software development, along with experience building machine learning models and deploying them in production environments.
- Proficiency working in cloud computing environments (AWS, Azure, GCP), machine learning frameworks, and software development best practices.
- Experience working with technological innovations in AI & ML (esp. GenAI).
- Experience with design patterns and data structures.
- Good analytical, design and debugging skills.

Technologies you will work with: Python, LLMs, MLflow, Docker, Kubernetes, Terraform, AWS, GitLab, Postgres, Prometheus, Grafana.

Posted 1 month ago

Apply

4.0 - 7.0 years

9 - 12 Lacs

Pune

Hybrid

So, what's the role all about?

At NiCE, a Senior Software Engineer specializes in designing, developing, and maintaining applications and systems using the Java programming language, playing a critical role in building scalable, robust, and high-performing applications for a variety of industries, including finance, healthcare, technology, and e-commerce.

How will you make an impact?
- Working knowledge of unit testing.
- Working knowledge of user stories or use cases.
- Working knowledge of design patterns or equivalent experience.
- Working knowledge of object-oriented software design.
- Team player.

Have you got what it takes?
- Bachelor's degree in computer science, business information systems or a related field, or equivalent work experience, is required.
- 4+ years (SE) of experience in software development.
- Well-established technical problem-solving skills.
- Experience in Java, Spring Boot and microservices.
- Experience with Kafka, Kinesis, KDA, Apache Flink.
- Experience with Kubernetes operators, Grafana, Prometheus.
- Experience with AWS technology, including EKS, EMR, S3, Kinesis, Lambdas, Firehose, IAM, CloudWatch, etc.

You will have an advantage if you also have:
- Experience with Snowflake or any DWH solution.
- Excellent communication, problem-solving and decision-making skills.
- Experience with databases.
- Experience in CI/CD: Git, GitHub Actions, Jenkins-based pipeline deployments.
- Strong experience in SQL.

What's in it for you? Join an ever-growing, market-disrupting global company where the teams, comprised of the best of the best, work in a fast-paced, collaborative, and creative environment! As the market leader, every day at NiCE is a chance to learn and grow, and there are endless internal career opportunities across multiple roles, disciplines, domains, and locations. If you are passionate, innovative, and excited to constantly raise the bar, you may just be our next NiCEr!

Enjoy NiCE-FLEX! At NiCE, we work according to the NiCE-FLEX hybrid model, which enables maximum flexibility: 2 days working from the office and 3 days of remote work each week. Naturally, office days focus on face-to-face meetings, where teamwork and collaborative thinking generate innovation, new ideas, and a vibrant, interactive atmosphere.

Requisition ID: 6965
Reporting into: Tech Manager
Role Type: Individual Contributor
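
Editor's note: the posting is Java-centric, but to illustrate the Kafka produce/consume model it mentions, here is a short sketch in Python using the kafka-python library; the broker address and topic are hypothetical.

```python
# pip install kafka-python
import json

from kafka import KafkaConsumer, KafkaProducer

BOOTSTRAP = "localhost:9092"  # hypothetical broker address

# Produce a couple of JSON events to the "orders" topic.
producer = KafkaProducer(
    bootstrap_servers=BOOTSTRAP,
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)
producer.send("orders", {"order_id": 1, "amount": 42.0})
producer.send("orders", {"order_id": 2, "amount": 13.5})
producer.flush()

# Consume them back from the beginning of the topic.
consumer = KafkaConsumer(
    "orders",
    bootstrap_servers=BOOTSTRAP,
    auto_offset_reset="earliest",
    consumer_timeout_ms=5000,  # stop iterating once the topic is drained
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
)
for msg in consumer:
    print(msg.topic, msg.offset, msg.value)
```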

Posted 1 month ago

Apply

5.0 - 8.0 years

15 - 19 Lacs

Pune

Hybrid

So, what's the role all about?

We are seeking a skilled and experienced DevOps Engineer to design, produce, and test high-quality software that meets specified functional and non-functional requirements within the given time and resource constraints.

How will you make an impact?
- Design, implement, and maintain CI/CD pipelines using Jenkins to support automated builds, testing, and deployments.
- Manage and optimize AWS infrastructure for scalability, reliability, and cost-effectiveness.
- Develop automation scripts and tools using shell scripting and other programming languages to streamline operational workflows.
- Collaborate with cross-functional teams (Development, QA, Operations) to ensure seamless software delivery and deployment.
- Monitor and troubleshoot infrastructure, build failures, and deployment issues to ensure high availability and performance.
- Implement and maintain robust configuration management practices and infrastructure-as-code principles.
- Document processes, systems, and configurations to ensure knowledge sharing and maintain operational consistency.
- Perform ongoing maintenance and upgrades (production and non-production).
- Occasional weekend or after-hours work as needed.

Have you got what it takes?
- Experience: 5-8 years in DevOps or a similar role.
- Cloud expertise: proficient in AWS services such as EC2, S3, RDS, Lambda, IAM, CloudFormation, or similar.
- CI/CD tools: hands-on experience with Jenkins pipelines (declarative and scripted).
- Scripting skills: proficiency in either shell scripting or PowerShell.
- Programming knowledge: familiarity with at least one programming language (e.g., Python, Java, or Go). Important: scripting/programming is integral to this role and will be a key focus in the interview process.
- Version control: experience with Git and Git-based workflows.
- Monitoring tools: familiarity with tools like CloudWatch, Prometheus, or similar.
- Problem-solving: strong analytical and troubleshooting skills in a fast-paced environment.
- Knowledge of AWS CDK.

You will have an advantage if you also have:
- Prior experience in development or automation (a significant advantage).
- Windows system administration (a significant advantage).
- Experience with monitoring and log analysis tools.
- Jenkins pipeline knowledge.

What's in it for you? Join an ever-growing, market-disrupting global company where the teams, comprised of the best of the best, work in a fast-paced, collaborative, and creative environment! As the market leader, every day at NiCE is a chance to learn and grow, and there are endless internal career opportunities across multiple roles, disciplines, domains, and locations. If you are passionate, innovative, and excited to constantly raise the bar, you may just be our next NiCEr!

Enjoy NiCE-FLEX! At NiCE, we work according to the NiCE-FLEX hybrid model, which enables maximum flexibility: 2 days working from the office and 3 days of remote work each week. Naturally, office days focus on face-to-face meetings, where teamwork and collaborative thinking generate innovation, new ideas, and a vibrant, interactive atmosphere.

Requisition ID: 7318
Reporting into: Tech Manager
Role Type: Individual Contributor
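
Editor's note: as a small taste of the CloudWatch monitoring work mentioned above, here is a hedged boto3 sketch that creates a CPU alarm; the alarm name, instance ID, and SNS topic ARN are hypothetical.

```python
# pip install boto3  (assumes AWS credentials are configured in the environment)
import boto3

cloudwatch = boto3.client("cloudwatch", region_name="ap-south-1")

# Alarm when average CPU on one instance stays above 80% for two 5-minute periods.
cloudwatch.put_metric_alarm(
    AlarmName="demo-api-high-cpu",  # hypothetical name
    Namespace="AWS/EC2",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],
    Statistic="Average",
    Period=300,
    EvaluationPeriods=2,
    Threshold=80.0,
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=["arn:aws:sns:ap-south-1:123456789012:ops-alerts"],  # hypothetical SNS topic
)
print("alarm created/updated")
```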

Posted 1 month ago

Apply

8.0 - 13.0 years

12 - 16 Lacs

Bengaluru

Work from Office

Key Job Responsibilities and Duties:

The core premise for the Booking SRE lies in treating operational and reliability problems of software systems as a software engineering problem. We code our way out of problems where operations are concerned, addressing availability, scalability, latency, and efficiency challenges within the vast infrastructure here at Booking. We expect our SRE engineers to be software engineers who optimize systems rather than system operators.
- You will impact millions of people all over the globe with your creative solutions.
- You will work in one of the biggest e-commerce companies in the world.
- You will solve exciting problems at scale by writing and deploying code across tens of thousands of servers, ensuring an "everything as code" mindset for yourself and your team.
- You will have the opportunity to collaborate with many of the world's leading SREs.
- You will be free to launch your own ideas and solutions within our sophisticated production environment.

Here are some of the tools and technologies we use to achieve this: Python, Go, Puppet, Kubernetes, Elasticsearch, Prometheus, HAProxy, Cassandra, Kafka, etc.

What you'll be doing:
- Design, develop and implement software that improves the stability, scalability, availability and latency of Booking.com products.
- Take ownership of one or more services and have the freedom to do what is best for our business and customers.
- Solve problems occurring with our highly available production systems and build solutions and automation to prevent them from happening again.
- Build effective monitoring to supervise the health of your system, and jump in to handle outages.
- Build and run capacity tests to manage the growth of your systems.
- Plan for reliability by designing systems to work across our multinational data centers.
- Develop tools to assist the product development teams with successfully deploying thousands of change sets every day.
- Be an advocate of engineering standard processes.
- Share the on-call rotation and be an escalation contact for incidents.
- Contribute to Booking.com's growth through interviewing, onboarding, or other recruitment efforts.

What you'll bring:
- 8+ years of hands-on experience in software and site reliability engineering within the technology sector, coupled with expertise in building, operating and maintaining sophisticated and scalable systems.
- Solid experience in at least one programming language; we use Java, Python, Go, Ruby, Perl.
- Experience with Infrastructure as Code technologies.
- Knowledge of cloud computing fundamentals.
- Solid foundation in Linux administration and troubleshooting.
- Understanding of service level agreements and objectives.
- Additional experience in OpenStack, Kubernetes, networking, security or storage is desirable.
- Observability technologies like Prometheus, Graphite, Grafana, Kibana and Elasticsearch are a plus.
- Good interpersonal skills.
- Proficient command of the English language, both written and spoken.
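
Editor's note: the "everything as code" mindset extends to metrics. Below is a minimal sketch of instrumenting a Python service with the prometheus_client library so Prometheus can scrape it; the metric names and port are illustrative, not Booking's actual conventions.

```python
# pip install prometheus-client
import random
import time

from prometheus_client import Counter, Histogram, start_http_server

REQUESTS = Counter("app_requests_total", "Total requests handled", ["status"])
LATENCY = Histogram("app_request_latency_seconds", "Request latency in seconds")

@LATENCY.time()
def handle_request():
    time.sleep(random.uniform(0.01, 0.1))  # stand-in for real work
    status = "200" if random.random() > 0.05 else "500"
    REQUESTS.labels(status=status).inc()

if __name__ == "__main__":
    start_http_server(8000)  # Prometheus scrapes http://localhost:8000/metrics
    while True:
        handle_request()
```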

Posted 1 month ago

Apply

3.0 - 5.0 years

6 - 16 Lacs

Pune

Work from Office

Primary Job Responsibilities:
- Collaborate with team members to maintain, monitor, and improve data ingestion pipelines on the Data & AI platform.
- Attend the office 3 times a week for collaborative sessions and team alignment.
- Drive innovation in the ingestion and analytics domains to enhance performance and scalability.
- Work closely with the domain architect to implement and evolve data engineering strategies.

Required Skills:
- Minimum 5 years of experience in Python development focused on data engineering.
- Hands-on experience with Databricks and the Delta Lake format.
- Strong proficiency in SQL, data structures, and robust coding practices.
- Solid understanding of scalable data pipelines and performance optimization.

Preferred / Nice to Have:
- Familiarity with monitoring tools like Prometheus and Grafana.
- Experience using Copilot or AI-based tools for code enhancement and efficiency.
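
Editor's note: to ground the Databricks/Delta Lake requirement, here is a minimal PySpark sketch of an ingestion step writing to a Delta table. It assumes a Databricks notebook, where `spark` is predefined and Delta is the default format; the paths, columns, and table name are hypothetical.

```python
# Runs as-is in a Databricks notebook, where `spark` is predefined.
from pyspark.sql import functions as F

# Hypothetical raw events landing zone.
raw = spark.read.json("/mnt/raw/events/")

cleaned = (
    raw.dropDuplicates(["event_id"])
       .filter(F.col("event_type").isNotNull())
       .withColumn("ingested_at", F.current_timestamp())
)

# Append to a Delta table; Delta provides ACID guarantees and time travel.
(cleaned.write
        .format("delta")
        .mode("append")
        .saveAsTable("analytics.events_clean"))
```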

Posted 1 month ago

Apply

5.0 - 10.0 years

7 - 12 Lacs

Bengaluru

Work from Office

Educational Requirements: MCA, MTech, Master of Business Administration, Bachelor of Engineering, BCA, BTech

Service Line: Cloud & Infrastructure Services

Responsibilities:
As a Tools SME in SolarWinds/Splunk/Dynatrace/DevOps tooling, you will work on the design, setup and configuration of observability platforms with correlation, anomaly detection, visualization and dashboards, AIOps, DevOps, and tool integration.
- Collaborate with DevOps architects, development teams, and operations teams to understand their tool requirements and identify opportunities for optimizing the DevOps toolchain.
- Evaluate and recommend new tools and technologies that can enhance our DevOps capabilities, considering factors like cost, integration, and local support.
- Lead the implementation, configuration, and integration of various DevOps tools, including CI/CD platforms (e.g., Jenkins, GitLab CI, Azure DevOps), infrastructure-as-code (IaC) tools (e.g., Terraform, Ansible), containerization and orchestration tools (e.g., Docker, Kubernetes), monitoring and logging tools (e.g., Prometheus, Grafana, ELK stack), and testing frameworks.
- Establish standards and best practices for the usage and management of the DevOps toolset.
- Ensure the availability, performance, and stability of the DevOps toolchain.
- Perform regular maintenance tasks, including upgrades, patching, and backups of the DevOps tools.
- Provide technical support and troubleshooting assistance to development and operations teams regarding the usage of the DevOps tools.
- Monitor the health and performance of the toolset and implement proactive measures to prevent issues.
- Design and implement integrations between different tools in the DevOps pipeline to create seamless and automated workflows.
- Develop automation scripts and utilities to streamline tool provisioning, configuration, and management within the environment.
- Work with development teams to integrate testing and security tools into the CI/CD pipeline.

Additional Responsibilities:
Besides the professional qualifications of candidates, we place great importance on various aspects of the personality profile, including:
- High analytical skills
- A high degree of initiative and flexibility
- High customer orientation
- High quality awareness
- Excellent verbal and written communication skills

Technical and Professional Requirements:
- At least 6+ years of experience in SolarWinds, Splunk, Dynatrace or a DevOps toolset.
- Proven experience with several key DevOps tools, including CI/CD platforms (e.g., Jenkins, GitLab CI, Azure DevOps), IaC tools (e.g., Terraform, Ansible), containerization (Docker, Kubernetes), and monitoring tools (e.g., Prometheus, Grafana, ELK stack).
- Good knowledge of the Linux environment.
- Good working knowledge of YAML and Python.
- Good working knowledge of event correlation and observability.
- Good communication skills.
- Good analytical and problem-solving skills.

Preferred Skills:
Technology->Infra_ToolAdministration-Others->Solarwinds
Technology->Infra_ToolAdministration-Others->Splunk Admin
Technology->DevOps->DevOps Architecture Consultancy
Technology->Dynatrace->Digital Performance Management Tool
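
Editor's note: as a loose illustration of the event correlation mentioned above (my own sketch, not any vendor's actual API), here is a small Python/YAML example that groups alerts from the same host within a time window. In practice the alert feed would come from SolarWinds, Splunk, or Dynatrace via their APIs.

```python
# pip install pyyaml
import collections
import yaml

# Hypothetical alert export in YAML form.
ALERTS_YAML = """
- {host: web-01, check: cpu,  severity: critical, ts: 1712000000}
- {host: web-01, check: load, severity: warning,  ts: 1712000030}
- {host: db-02,  check: disk, severity: critical, ts: 1712000100}
"""

def correlate(alerts, window_s=120):
    """Group alerts on the same host that all fire within `window_s` seconds."""
    by_host = collections.defaultdict(list)
    for alert in sorted(alerts, key=lambda a: a["ts"]):
        by_host[alert["host"]].append(alert)
    incidents = []
    for host, items in by_host.items():
        if items[-1]["ts"] - items[0]["ts"] <= window_s:
            incidents.append({"host": host, "alerts": [i["check"] for i in items]})
    return incidents

print(correlate(yaml.safe_load(ALERTS_YAML)))
# [{'host': 'web-01', 'alerts': ['cpu', 'load']}, {'host': 'db-02', 'alerts': ['disk']}]
```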

Posted 1 month ago

Apply

10.0 - 15.0 years

22 - 37 Lacs

Bengaluru

Work from Office

Who We Are

At Kyndryl, we design, build, manage and modernize the mission-critical technology systems that the world depends on every day. So why work at Kyndryl? We are always moving forward, always pushing ourselves to go further in our efforts to build a more equitable, inclusive world for our employees, our customers and our communities.

The Role

As an ELK (Elasticsearch, Logstash & Kibana) Data Engineer, you will be responsible for developing, implementing, and maintaining ELK stack-based solutions for Kyndryl's clients. This role is responsible for developing efficient and effective data and log ingestion, processing, indexing, and visualization for monitoring, troubleshooting, and analysis purposes.

Responsibilities:
- Design, implement, and maintain scalable data pipelines using the ELK Stack (Elasticsearch, Logstash, Kibana) and Beats for monitoring and analytics.
- Develop data processing workflows to handle real-time and batch data ingestion, transformation and visualization.
- Implement techniques like grok patterns, regular expressions, and plugins to handle complex log formats and structures.
- Configure and optimize Elasticsearch clusters for efficient indexing, searching, and performance tuning.
- Collaborate with business users to understand their data integration and visualization needs and translate them into technical solutions.
- Create dynamic and interactive dashboards in Kibana for data visualization and insights that enable detection of the root cause of an issue.
- Leverage open-source tools such as Beats and Python to integrate and process data from multiple sources.
- Collaborate with cross-functional teams to implement ITSM solutions integrating ELK with tools like ServiceNow and other ITSM platforms.
- Perform anomaly detection using Elastic ML and create alerts using the Watcher functionality.
- Extract data via Python programming using APIs.
- Build and deploy solutions in containerized environments using Kubernetes.
- Monitor Elasticsearch clusters for health, performance, and resource utilization.
- Automate routine tasks and data workflows using scripting languages such as Python or shell scripting.
- Provide technical expertise in troubleshooting, debugging, and resolving complex data and system issues.
- Create and maintain technical documentation, including system diagrams, deployment procedures, and troubleshooting guides.

If you're ready to embrace the power of data to transform our business and embark on an epic data adventure, then join us at Kyndryl. Together, let's redefine what's possible and unleash your potential.

Your Future at Kyndryl
Every position at Kyndryl offers a way forward to grow your career. We have opportunities that you won't find anywhere else, including hands-on experience, learning opportunities, and the chance to certify in all four major platforms. Whether you want to broaden your knowledge base or narrow your scope and specialize in a specific sector, you can find your opportunity here.

Who You Are
You're good at what you do and possess the required experience to prove it. However, equally as important, you have a growth mindset: keen to drive your own personal and professional development. You are customer-focused, someone who prioritizes customer success in their work. And finally, you're open and borderless, naturally inclusive in how you work with others.

Required Technical and Professional Experience:
- Minimum of 5 years of experience in the ELK Stack and Python programming.
- Graduate/postgraduate in computer science, computer engineering, or equivalent, with a minimum of 10 years of experience in the IT industry.
- ELK Stack: deep expertise in Elasticsearch, Logstash, Kibana, and Beats.
- Programming: proficiency in Python for scripting and automation.
- ITSM platforms: hands-on experience with ServiceNow or similar ITSM tools.
- Containerization: experience with Kubernetes and containerized applications.
- Operating systems: strong working knowledge of Windows, Linux, and AIX environments.
- Open-source tools: familiarity with various open-source data integration and monitoring tools.
- Knowledge of network protocols, log management, and system performance optimization.
- Experience in integrating ELK solutions with enterprise IT environments.
- Strong analytical and problem-solving skills with attention to detail.
- Knowledge of MySQL or NoSQL databases is an added advantage.
- Fluent in English (written and spoken).

Preferred Technical and Professional Experience:
- "Elastic Certified Analyst" or "Elastic Certified Engineer" certification is preferable.
- Familiarity with additional monitoring tools like Prometheus, Grafana, or Splunk.
- Knowledge of cloud platforms (AWS, Azure, or GCP).
- Experience with DevOps methodologies and tools.

Being You
Diversity is a whole lot more than what we look like or where we come from; it's how we think and who we are. We welcome people of all cultures, backgrounds, and experiences. But we're not doing it single-handedly: our Kyndryl Inclusion Networks are only one of many ways we create a workplace where all Kyndryls can find and provide support and advice. This dedication to welcoming everyone into our company means that Kyndryl gives you, and everyone next to you, the ability to bring your whole self to work, individually and collectively, and support the activation of our equitable culture. That's the Kyndryl Way.

What You Can Expect
With state-of-the-art resources and Fortune 100 clients, every day is an opportunity to innovate, build new capabilities, new relationships, new processes, and new value. Kyndryl cares about your well-being and prides itself on offering benefits that give you choice, reflect the diversity of our employees and support you and your family through the moments that matter, wherever you are in your life journey. Our employee learning programs give you access to the best learning in the industry to receive certifications, including Microsoft, Google, Amazon, Skillsoft, and many more. Through our company-wide volunteering and giving platform, you can donate, start fundraisers, volunteer, and search over 2 million non-profit organizations. At Kyndryl, we invest heavily in you; we want you to succeed so that together, we will all succeed.

Get Referred! If you know someone that works at Kyndryl, when asked 'How Did You Hear About Us' during the application process, select 'Employee Referral' and enter your contact's Kyndryl email address.
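
Editor's note: to illustrate the grok-pattern/regex work listed above, here is a small Python sketch that parses an access-log line into a structured document ready for indexing into Elasticsearch; the combined log format is a common one, used here purely for illustration.

```python
import re

# A grok-style pattern for a common (combined) access-log line.
LOG_PATTERN = re.compile(
    r'(?P<client>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<bytes>\d+|-)'
)

line = '10.0.0.7 - - [12/Apr/2025:09:15:32 +0000] "GET /health HTTP/1.1" 200 512'

m = LOG_PATTERN.match(line)
if m:
    doc = m.groupdict()  # this dict could be indexed into Elasticsearch
    doc["status"] = int(doc["status"])
    print(doc)
```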

Posted 1 month ago

Apply

4.0 - 6.0 years

20 - 25 Lacs

Pune

Work from Office

Greetings from Peoplefy Infosolutions!

We are hiring for one of our reputed MNC clients based in Pune. We are looking for candidates with 4+ years of experience in the skills below.

Primary skills:
- Python
- DBT
- SSIS
- Snowflake
- Linux
- Datadog
- Prometheus
- Grafana

Interested candidates for the above position, kindly share your CV at chitralekha.so@peoplefy.com with the below details:
- Experience:
- CTC:
- Expected CTC:
- Notice Period:
- Location:

Posted 1 month ago

Apply

5.0 - 8.0 years

15 - 25 Lacs

Hyderabad, Pune, Bengaluru

Hybrid

Warm greetings from SP Staffing!

Role: Azure DevOps
Experience Required: 5 to 8 yrs
Work Location: Hyderabad/Pune/Bangalore

Required skills:
- Azure DevOps
- Terraform
- Bash/PowerShell/Python
- Prometheus/Grafana

Interested candidates can send resumes to nandhini.spstaffing@gmail.com

Posted 1 month ago

Apply

5.0 - 7.0 years

11 - 12 Lacs

Hyderabad

Work from Office

We are seeking a highly skilled DevOps Engineer to join our dynamic development team. In this role, you will be responsible for designing, developing, and maintaining both frontend and backend components of our applications using DevOps practices and associated technologies. You will collaborate with cross-functional teams to deliver robust, scalable, and high-performing software solutions that meet our business needs. The ideal candidate will have a strong background in DevOps, experience with modern frontend frameworks, and a passion for full-stack development.

Requirements:
- Bachelor's degree in Computer Science, Engineering, or a related field.
- 5 to 7+ years of experience in full-stack development, with a strong focus on DevOps.

DevOps with AWS Data Engineer - Roles & Responsibilities:
- Use AWS services like EC2, VPC, S3, IAM, RDS, and Route 53.
- Automate infrastructure using infrastructure-as-code (IaC) tools like Terraform or AWS CloudFormation.
- Build and maintain CI/CD pipelines using tools such as AWS CodePipeline, Jenkins, and GitLab CI/CD.
- Automate build, test, and deployment processes for Java applications.
- Use Ansible, Chef, or AWS Systems Manager for managing configurations across environments.
- Containerize Java apps using Docker; deploy and manage containers using Amazon ECS, EKS (Kubernetes), or Fargate.
- Monitoring and logging using Amazon CloudWatch, Prometheus + Grafana, the ELK Stack (Elasticsearch, Logstash, Kibana), and AWS X-Ray for distributed tracing.
- Manage access with IAM roles/policies; use AWS Secrets Manager / Parameter Store for managing credentials.
- Enforce security best practices, encryption, and audits.
- Automate backups for databases and services using AWS Backup, RDS snapshots, and S3 lifecycle rules; implement disaster recovery (DR) strategies.
- Work closely with development teams to integrate DevOps practices; document pipelines, architecture, and troubleshooting runbooks.
- Monitor and optimize AWS resource usage using AWS Cost Explorer, Budgets, and Savings Plans.

Must-Have Skills:
- Experience working on Linux-based infrastructure.
- Excellent understanding of Ruby, Python, Perl, and Java.
- Configuration and management of databases such as MySQL and MongoDB.
- Excellent troubleshooting skills.
- Selecting and deploying appropriate CI/CD tools.
- Working knowledge of various tools, open-source technologies, and cloud services.
- Awareness of critical concepts in DevOps and Agile principles.
- Managing stakeholders and external interfaces; setting up tools and required infrastructure.
- Defining and setting development, testing, release, update, and support processes for DevOps operation.
- The technical skills to review, verify, and validate the software code developed in the project.

Interview Mode: F2F for candidates residing in Hyderabad / Zoom for other states
Location: 43/A, MLA Colony, Road no 12, Banjara Hills, 500034
Time: 2 - 4 pm
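
Editor's note: as one concrete example of the S3 lifecycle automation listed above, here is a hedged boto3 sketch; the bucket name, prefix, and retention periods are hypothetical.

```python
# pip install boto3  (assumes AWS credentials are configured in the environment)
import boto3

s3 = boto3.client("s3")

# Transition backups to Glacier after 30 days and expire them after 365.
s3.put_bucket_lifecycle_configuration(
    Bucket="example-backup-bucket",  # hypothetical bucket name
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "archive-then-expire-backups",
                "Status": "Enabled",
                "Filter": {"Prefix": "backups/"},
                "Transitions": [{"Days": 30, "StorageClass": "GLACIER"}],
                "Expiration": {"Days": 365},
            }
        ]
    },
)
print("lifecycle policy applied")
```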

Posted 1 month ago

Apply

3.0 - 8.0 years

5 - 10 Lacs

Pune

Work from Office

Since its inception in 2003, driven by visionary college students transforming online rent payment, Entrata has evolved into a global leader serving property owners, managers, and residents. Honored with prestigious awards like the Utah Business Fast 50, the Silicon Slopes Hall of Fame (Software Company, 2022), and the Women Tech Council Shatter List, our comprehensive software suite spans rent payments, insurance, leasing, maintenance, marketing, and communication tools, reshaping property management worldwide.

Our 2200+ global team members embody intelligence and adaptability, engaging actively from top executives to part-time employees. With offices across Utah, Texas, India, Israel, and the Netherlands, Entrata blends startup innovation with established stability, evident in our transparent communication values and executive town halls. Our product isn't just desirable; it's industry essential. At Entrata, we passionately refine living experiences and uphold collective excellence.

Job Summary
Entrata Software is seeking a DevOps Engineer to join our R&D team in Pune, India. This role will focus on automating infrastructure, streamlining CI/CD pipelines, and optimizing cloud-based deployments to improve software delivery and system reliability. The ideal candidate will have expertise in Kubernetes, AWS, Terraform, and automation tools to enhance scalability, security, and observability. Success in this role requires strong problem-solving skills, collaboration with development and security teams, and a commitment to continuous improvement. If you thrive in fast-paced, Agile environments and enjoy solving complex infrastructure challenges, we encourage you to apply!

Key Responsibilities
- Design, implement, and maintain CI/CD pipelines using Jenkins, GitHub Actions, and ArgoCD to enable seamless, automated software deployments.
- Deploy, manage, and optimize Kubernetes clusters in AWS, ensuring reliability, scalability, and security.
- Automate infrastructure provisioning and configuration using Terraform, CloudFormation, Ansible, and scripting languages like Bash, Python, and PHP.
- Monitor and enhance system observability using Prometheus, Grafana, and the ELK Stack to ensure proactive issue detection and resolution.
- Implement DevSecOps best practices by integrating security scanning, compliance automation, and vulnerability management into CI/CD workflows.
- Troubleshoot and resolve cloud infrastructure, networking, and deployment issues in a timely and efficient manner.
- Collaborate with development, security, and IT teams to align DevOps practices with business and engineering objectives.
- Optimize AWS cloud resource utilization and cost while maintaining high availability and performance.
- Establish and maintain disaster recovery and high-availability strategies to ensure system resilience.
- Improve incident response and on-call processes by following SRE principles and automating issue resolution.
- Promote a culture of automation and continuous improvement, identifying and eliminating manual inefficiencies in development and operations.
- Stay up to date with emerging DevOps tools and trends, implementing best practices to enhance processes and technologies.
- Ensure compliance with security and industry standards, enforcing governance policies across cloud infrastructure.
- Support developer productivity by providing self-service infrastructure and deployment automation to accelerate the software development lifecycle.
- Document processes, best practices, and troubleshooting guides to ensure clear knowledge sharing across teams.

Minimum Qualifications
- 3+ years of experience as a DevOps Engineer or in a similar role.
- Strong proficiency in Kubernetes, Docker, and AWS.
- Hands-on experience with Terraform, CloudFormation, and CI/CD tools (Jenkins, GitHub Actions, GitLab CI/CD, ArgoCD).
- Solid scripting and automation skills with Bash, Python, PHP, or Ansible.
- Expertise in monitoring and logging tools such as New Relic, Prometheus, Grafana, and the ELK Stack.
- Understanding of DevSecOps principles, security best practices, and vulnerability management.
- Strong problem-solving skills and the ability to troubleshoot cloud infrastructure and deployment issues effectively.

Preferred Qualifications
- Experience with GitOps methodologies using ArgoCD or Flux.
- Familiarity with SRE principles and managing incident response for high-availability applications.
- Knowledge of serverless architectures and AWS cost optimization strategies.
- Hands-on experience with compliance and governance automation for cloud security.
- Previous experience working in Agile, fast-paced environments with a focus on DevOps transformation.
- Strong communication skills and the ability to mentor junior engineers on DevOps best practices.

If you're passionate about automation, cloud infrastructure, and building scalable DevOps solutions, we encourage you to apply!
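
Editor's note: to illustrate the observability-automation theme above, here is a minimal sketch that queries the Prometheus HTTP API for pods with frequent restarts. The server URL is hypothetical, and the metric assumes kube-state-metrics is deployed in the cluster.

```python
# pip install requests
import requests

PROM_URL = "http://prometheus.example.internal:9090"  # hypothetical server

def query(promql: str):
    """Run an instant query against the Prometheus HTTP API."""
    resp = requests.get(f"{PROM_URL}/api/v1/query", params={"query": promql}, timeout=10)
    resp.raise_for_status()
    payload = resp.json()
    if payload["status"] != "success":
        raise RuntimeError(f"query failed: {payload}")
    return payload["data"]["result"]

# Pods restarting repeatedly in the last hour: a common signal worth acting on.
for series in query("increase(kube_pod_container_status_restarts_total[1h]) > 3"):
    pod = series["metric"].get("pod", "unknown")
    print(f"pod {pod} restarted {float(series['value'][1]):.0f} times in the last hour")
```

A script like this can feed an alerting or self-healing workflow, although in practice recording rules and Alertmanager would usually handle the alerting itself.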

Posted 1 month ago

Apply

6.0 - 8.0 years

18 - 22 Lacs

Hyderabad

Hybrid

Job Title: Automation Lead - Infrastructure & Scripting
Location: Hyderabad (Hybrid)
Job Type: Contract to Hire

About the Role:
We are seeking a results-driven Automation Lead with strong expertise in Python, Ansible, and other scripting tools to drive automation initiatives across our IT infrastructure landscape. The ideal candidate will have a solid background in networking, Active Directory, and general infrastructure operations, along with a passion for solving complex problems through automation.

Key Responsibilities:
- Lead the design, development, and deployment of automation solutions for infrastructure operations using Python, Ansible, and other tools.
- Identify manual processes and develop scripts/playbooks to automate configuration, provisioning, and monitoring.
- Collaborate with network, server, and platform teams to understand requirements and develop end-to-end automation workflows.
- Maintain and enhance existing automation frameworks, ensuring scalability and maintainability.
- Implement and manage configuration management, compliance, and orchestration strategies.
- Mentor junior engineers and establish automation best practices across teams.
- Integrate with CI/CD pipelines to streamline delivery and deployment processes.
- Monitor automation performance and provide continuous improvements and updates.

Required Skills and Experience:
- 6+ years of experience in infrastructure engineering or automation.
- Strong hands-on experience with Python for scripting and automation.
- Expertise in Ansible for configuration management and orchestration.
- Experience with other scripting tools such as PowerShell, Bash, shell scripting, etc., is a plus.
- Solid understanding of network fundamentals (switching, routing, VLANs, firewalls).
- Exposure to Active Directory, DNS, DHCP, and other Windows infrastructure services.
- Experience integrating with REST APIs for automation and monitoring purposes.
- Exposure to version control systems such as Git and CI/CD tools like Jenkins, GitLab CI, etc.
- Strong troubleshooting and analytical skills with an automation-first mindset.

Nice to Have:
- Experience with infrastructure-as-code (IaC) tools like Terraform.
- Familiarity with containerization (Docker) and orchestration platforms (Kubernetes).
- Experience with monitoring tools like Prometheus, Grafana, Nagios, etc.
- Cloud automation experience with Azure or GCP.
- Knowledge of ITIL practices and change management processes.

Education:
Bachelor's degree in Computer Science, Engineering, Information Technology, or a related field (or equivalent practical experience).
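
Editor's note: for flavor, here is a minimal sketch of driving an Ansible playbook from Python with the ansible-runner library, the kind of glue scripting this role describes; the playbook, inventory, and variable names are hypothetical.

```python
# pip install ansible-runner  (requires Ansible to be installed as well)
import ansible_runner

# Hypothetical playbook, inventory, and extra variables.
result = ansible_runner.run(
    private_data_dir="/tmp/runner",  # working directory for runner artifacts
    playbook="harden_linux.yml",
    inventory="inventories/prod/hosts.ini",
    extravars={"ntp_server": "time.example.com"},
)

print(f"status={result.status} rc={result.rc}")
for event in result.events:
    if event.get("event") == "runner_on_failed":
        print("FAILED:", event["event_data"].get("task"))
```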

Posted 1 month ago

Apply

8.0 - 12.0 years

11 - 15 Lacs

Kochi

Work from Office

Job Title: Cloud Platform Engineer Associate Manager - ACS Song
Management Level: Level 8 - Associate Manager
Location: Kochi, Coimbatore, Trivandrum
Must have skills: AWS, Terraform
Good to have skills: Hybrid Cloud
Experience: 8-12 years of experience is required
Educational Qualification: Graduation

Job Summary
Within our Cloud Platforms & Managed Services Solution Line, we apply an agile approach to provide true on-demand cloud platforms. We implement and operate secure cloud and hybrid global infrastructures using automation techniques for our clients' business-critical application landscape. As a Cloud Platform Engineer you are responsible for implementing cloud and hybrid global infrastructures using infrastructure-as-code.

Roles and Responsibilities
- Implement cloud and hybrid infrastructures using infrastructure-as-code.
- Automate provisioning and maintenance for streamlined operations.
- Design and estimate infrastructure with an emphasis on observability and security.
- Establish CI/CD pipelines for seamless application deployment.
- Ensure data integrity and security through robust mechanisms.
- Implement backup and recovery procedures for data protection.
- Build self-service systems for enhanced developer autonomy.
- Collaborate with development and operations teams for platform optimization.

Professional and Technical Skills
- Customer-focused communicator adept at engaging cross-functional teams.
- Cloud infrastructure expert in AWS, Azure, or GCP.
- Proficient in infrastructure-as-code with tools like Terraform.
- Experienced in container orchestration (Kubernetes, OpenShift, Docker Swarm).
- Skilled in observability tools like Prometheus and Grafana, competent in log aggregation tools (Loki, ELK, Graylog), and familiar with tracing systems such as Tempo.
- CI/CD and GitOps savvy, ideally with knowledge of Argo CD or Flux.
- Automation proficiency in Bash and high-level languages (Python, Golang).
- Linux, networking, and database knowledge for robust infrastructure management.
- Hybrid cloud experience is a plus.

Posted 1 month ago

Apply

8.0 - 13.0 years

10 - 15 Lacs

Bengaluru

Work from Office

We are seeking a Senior DevOps Engineer to build pipeline automation, integrating DevSecOps principles into the operations of product builds and releases, and to mentor and guide DevOps teams, fostering a culture of technical excellence and continuous learning.

What You'll Do
- Design & Architecture: Architect and implement scalable, resilient, and secure Kubernetes-based solutions on Amazon EKS.
- Deployment & Management: Deploy and manage containerized applications, ensuring high availability, performance, and security.
- Infrastructure as Code (IaC): Develop and maintain Terraform scripts for provisioning cloud infrastructure and Kubernetes resources.
- CI/CD Pipelines: Design and optimize CI/CD pipelines using tools like Jenkins, GitHub Actions, GitLab CI/CD, or ArgoCD, along with automated builds, tests (unit, regression), and deployments.
- Monitoring & Logging: Implement monitoring, logging, and alerting solutions using Prometheus, Grafana, the ELK stack, or CloudWatch.
- Security & Compliance: Ensure security best practices in Kubernetes, including RBAC, IAM policies, network policies, and vulnerability scanning.
- Automation & Scripting: Automate operational tasks using Bash, Python, or Go for improved efficiency.
- Performance Optimization: Tune Kubernetes workloads and optimize cost/performance of Amazon EKS clusters.
- Test Automation & Regression Pipelines: Integrate automated regression testing and build sanity checks into pipelines to ensure high-quality releases.
- Security & Resource Optimization: Manage Kubernetes security (RBAC, network policies) and optimize resource usage with Horizontal Pod Autoscalers (HPA) and Vertical Pod Autoscalers (VPA).
- Collaboration: Work closely with development, security, and infrastructure teams to enhance DevOps processes.

Minimum Qualifications
- Bachelor's degree (or above) in Engineering/Computer Science.
- 8+ years of experience in DevOps, cloud, and infrastructure automation in a DevOps engineer role.
- Expertise with Helm charts, Kubernetes operators, and service meshes (Istio, Linkerd, etc.).
- Strong expertise in Amazon EKS and Kubernetes (design, deployment, and management).
- Expertise in Terraform, Jenkins and Ansible.
- Expertise with CI/CD tools (Jenkins, GitHub Actions, GitLab CI/CD, ArgoCD, etc.).
- Strong experience with monitoring and logging tools (Prometheus, Grafana, ELK, CloudWatch).
- Proficiency in Bash and Python for automation and scripting.
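
Editor's note: as a small sketch of the HPA work mentioned above (assuming the autoscaling/v1 API and a hypothetical Deployment name), the kubernetes Python client can create a Horizontal Pod Autoscaler like so:

```python
# pip install kubernetes
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() inside the cluster

# Scale the hypothetical "demo-api" Deployment between 2 and 10 replicas,
# targeting 70% average CPU utilization.
hpa = client.V1HorizontalPodAutoscaler(
    metadata=client.V1ObjectMeta(name="demo-api-hpa"),
    spec=client.V1HorizontalPodAutoscalerSpec(
        scale_target_ref=client.V1CrossVersionObjectReference(
            api_version="apps/v1", kind="Deployment", name="demo-api"
        ),
        min_replicas=2,
        max_replicas=10,
        target_cpu_utilization_percentage=70,
    ),
)
client.AutoscalingV1Api().create_namespaced_horizontal_pod_autoscaler(
    namespace="default", body=hpa
)
print("HPA created")
```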

Posted 1 month ago

Apply

6.0 - 8.0 years

22 - 25 Lacs

Bengaluru

Work from Office

Technical Skills: DevOps | Kubernetes | CI/CD | AWS | Azure | GCP | Docker | Helm | Terraform | Ansible | Cloud Infrastructure | Prometheus | Grafana | Jenkins | Git | Bash | Python | Microservices | Cloud-Native | DevSecOps | GitLab CI | Istio

Job Description:
We are seeking a DevOps + Kubernetes Engineer to join our dynamic team in Bengaluru or Hyderabad. In this role, you will be responsible for building, maintaining, and scaling our infrastructure using Kubernetes and DevOps best practices. You will work closely with development and operations teams to implement automation processes, manage CI/CD pipelines, and ensure efficient infrastructure management for scalable and reliable applications.

Key Responsibilities:
- Design, implement, and maintain Kubernetes clusters for production, staging, and development environments.
- Manage CI/CD pipelines for automated application deployment and infrastructure provisioning.
- Use DevOps best practices to ensure efficient infrastructure management, automation, and monitoring.
- Automate infrastructure provisioning and application deployments with tools such as Helm, Terraform, or Ansible.
- Monitor and optimize the performance, scalability, and availability of applications and infrastructure.
- Collaborate with software engineers to improve system performance and optimize cloud infrastructure.
- Handle the containerization of applications using Docker and manage them via Kubernetes.
- Troubleshoot, debug, and resolve issues in production environments in a timely manner.
- Implement security best practices for managing containerized environments and DevOps workflows.
- Contribute to the continuous improvement of development and deployment processes using DevOps tools.

Required Skills and Qualifications:
- 6-8 years of experience in DevOps with a strong focus on Kubernetes and containerized environments.
- Expertise in Kubernetes cluster management and orchestration.
- Proficiency in CI/CD pipeline tools like Jenkins, GitLab CI, or CircleCI.
- Strong experience with cloud platforms such as AWS, Azure, or GCP.
- Knowledge of Docker for containerization and Helm for managing Kubernetes applications.
- Experience with infrastructure as code (IaC) using tools like Terraform, CloudFormation, or Ansible.
- Familiarity with monitoring and logging tools such as Prometheus, Grafana, the ELK stack, or Datadog.
- Strong scripting skills in Bash, Python, or Groovy.
- Experience with version control systems like Git.
- Excellent problem-solving and troubleshooting skills, especially in distributed environments.
- Good understanding of security best practices in cloud and containerized environments.

Posted 1 month ago

Apply

15.0 - 20.0 years

5 - 9 Lacs

Chennai

Work from Office

Project Role: Application Developer
Project Role Description: Design, build and configure applications to meet business process and application requirements.
Must have skills: DevOps
Good to have skills: NA
Minimum 12 year(s) of experience is required
Educational Qualification: 15 years full time education

Summary: As an Application Developer, you will design, build, and configure applications to meet business process and application requirements in a fast-paced environment, ensuring seamless integration and functionality.

Roles & Responsibilities:
- Expected to be an SME; collaborate with and manage the team to perform.
- Responsible for team decisions.
- Engage with multiple teams and contribute to key decisions.
- Expected to provide solutions to problems that apply across multiple teams.
- Lead the development and implementation of software solutions.
- Collaborate with cross-functional teams to define, design, and ship new features.
- Ensure the best possible performance, quality, and responsiveness of applications.
- Identify bottlenecks and bugs, and devise solutions to mitigate and address these issues.

Professional & Technical Skills:
- Must-have skills: proficiency in DevOps.
- Strong understanding of continuous integration and continuous deployment (CI/CD) pipelines.
- Experience with infrastructure-as-code (IaC) tools like Terraform or CloudFormation.
- Knowledge of containerization technologies such as Docker and Kubernetes.
- Hands-on experience with monitoring and logging tools like Prometheus and the ELK stack.

Additional Information:
- The candidate should have a minimum of 12 years of experience in DevOps.
- This position is based at our Chennai office.
- A 15 years full-time education is required.

Posted 1 month ago

Apply

8.0 - 13.0 years

10 - 15 Lacs

Bengaluru

Work from Office

What you will do:
- Design, implement, and maintain observability solutions (logging, monitoring, tracing) for cloud-native applications and infrastructure.
- Develop and optimize diagnostics tooling to quickly identify and resolve system- or application-level issues.
- Monitor cloud infrastructure to ensure uptime, performance, and scalability, responding promptly to incidents and outages.
- Collaborate with development, operations, and support teams to drive improvements in system observability and troubleshooting workflows.
- Lead root cause analysis for major incidents, driving long-term fixes to prevent recurrence.
- Resolve customer-facing operational issues in a timely and effective manner.
- Automate operational processes and incident response tasks to reduce manual interventions and improve efficiency.
- Continuously assess and improve cloud observability tools, integrating new features and technologies where necessary.
- Create and maintain comprehensive documentation on cloud observability frameworks, tools, and processes.

Who you will work with
As a member of the Site Reliability Engineering (SRE) team, you will collaborate with a diverse group of professionals across various functions and regions. You will work closely with:
- Software Engineering Teams: Partner with developers to ensure that new features and services are reliable, scalable, and observable from the outset. You'll participate in design reviews and contribute to the overall architecture to enhance system performance and reliability.
- Product Management: Engage with product managers to understand customer requirements and ensure that reliability and performance are integral parts of product roadmaps.
- DevOps and Infrastructure Teams: Coordinate with the SRE team to automate deployment processes, manage infrastructure as code, and ensure seamless deployment pipelines.
- Customer Support: Collaborate with customer support teams to diagnose and resolve incidents, providing insights and tools that enable faster troubleshooting and improved user experiences.
- Security and Compliance Teams: Work alongside security experts to maintain compliance with industry standards, ensuring that all systems and processes adhere to security best practices.
- Global Network Operations Teams: Interact with global operations staff spread across India, Europe, Canada, and the USA to support 24/7 service reliability and incident response.
- Data Analytics and Reporting: Team up with data analysts to create meaningful dashboards and reports that provide insights.

Who you are:
- Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent work experience.
- 8+ years of experience in cloud engineering, SRE, or DevOps.
- Expertise with cloud platforms (AWS, Azure, GCP) and related monitoring/observability tools (e.g., Prometheus, Grafana, Datadog, ELK Stack).
- Strong experience with diagnostics and troubleshooting tools for cloud services.
- Proficient in scripting languages (Python, Bash, etc.) and infrastructure-as-code (Terraform, CloudFormation).
- Experience in operational incident management, including root cause analysis and post-mortem reviews.
- Solid understanding of containerization (Docker, Kubernetes) and microservices architecture.
- Knowledge of network performance monitoring and debugging techniques.
- Desire to solve complex problems.
- Proactive in communicating with and handling stakeholders remotely and across time zones.
- Demonstrated ability to collaborate with engineering teams.

Posted 1 month ago

Apply

0.0 - 1.0 years

1 - 2 Lacs

Bengaluru

Work from Office

About The Role
The Site Reliability Engineering team focused on Efficiency and Performance is responsible for driving AWS cost intelligence, managing the ThousandEyes infrastructure, and ensuring optimal resource utilization and performance. In this role, the Senior Site Reliability Engineer will play a crucial part in optimizing the tools, services, and infrastructure that support the ThousandEyes platform.

What You'll Do
By strategically managing cloud resources and infrastructure, this team enhances the overall performance and reliability of our services. In this role you will:
Lead efforts to optimize cloud expenditure, streamline infrastructure management, and ensure that all resources are utilized efficiently, driving continuous improvement in service reliability and performance.
Think self-service: participate in and contribute to improving our "follow the sun" incident response model and on-call rotation.

Qualifications
Ability to design and implement scalable, well-tested solutions, with a focus on streamlining operations.
Strong hands-on experience in the cloud, preferably AWS.
Strong Infrastructure-as-Code skills, ideally with Terraform and Kubernetes.
Previous experience in AWS cost management.
Understanding of Prometheus and its ecosystem, including Alertmanager.
Ability to write high-quality code in Python, Go, or equivalent languages.
Good understanding of Unix/Linux systems, the kernel, system libraries, file systems, and client-server protocols.
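As a rough illustration of the AWS cost-intelligence side of this role, here is a minimal sketch using boto3's Cost Explorer API to surface yesterday's top spenders per service. It assumes credentials are already configured in the environment; the "top 5" cutoff is arbitrary.

```python
# Pull yesterday's unblended cost per AWS service and print the top 5.
import boto3
from datetime import date, timedelta

# Cost Explorer is served from us-east-1 regardless of workload region
ce = boto3.client("ce", region_name="us-east-1")

end = date.today()
start = end - timedelta(days=1)

resp = ce.get_cost_and_usage(
    TimePeriod={"Start": start.isoformat(), "End": end.isoformat()},
    Granularity="DAILY",
    Metrics=["UnblendedCost"],
    GroupBy=[{"Type": "DIMENSION", "Key": "SERVICE"}],
)

rows = [
    (g["Keys"][0], float(g["Metrics"]["UnblendedCost"]["Amount"]))
    for g in resp["ResultsByTime"][0]["Groups"]
]
for service, cost in sorted(rows, key=lambda r: r[1], reverse=True)[:5]:
    print(f"{service:<45} ${cost:,.2f}")
```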

Posted 1 month ago

Apply

7.0 - 10.0 years

11 - 16 Lacs

Mumbai, Hyderabad, Pune

Work from Office

Key Responsibilities:
Design, build, and maintain CI/CD pipelines for ML model training, validation, and deployment.
Automate and optimize ML workflows, including data ingestion, feature engineering, model training, and monitoring.
Deploy, monitor, and manage LLMs and other ML models in production (on-premises and/or cloud).
Implement model versioning, reproducibility, and governance best practices.
Collaborate with data scientists, ML engineers, and software engineers to streamline the end-to-end ML lifecycle.
Ensure security, compliance, and scalability of ML/LLM infrastructure.
Troubleshoot and resolve issues related to ML model deployment and serving.
Evaluate and integrate new MLOps/LLMOps tools and technologies.
Mentor junior engineers and contribute to best-practices documentation.

Required Skills & Qualifications:
8+ years of experience in DevOps, with at least 3 years in MLOps/LLMOps.
Strong experience with cloud platforms (AWS, Azure, GCP) and container orchestration (Kubernetes, Docker).
Proficient in CI/CD tools (Jenkins, GitHub Actions, GitLab CI, etc.).
Hands-on experience deploying and managing different types of AI models (e.g., OpenAI, HuggingFace, custom models) for use in solution development.
Experience with model serving tools such as TGI, vLLM, BentoML, etc.
Solid scripting and programming skills (Python, Bash, etc.).
Familiarity with monitoring/logging tools (Prometheus, Grafana, ELK stack).
Strong understanding of security and compliance in ML environments.

Preferred Skills:
Knowledge of model explainability, drift detection, and model monitoring.
Familiarity with data engineering tools (Spark, Kafka, etc.).
Knowledge of data privacy, security, and compliance in AI systems.
Strong communication skills to collaborate effectively with various stakeholders.
Critical thinking and problem-solving skills are essential.
Proven ability to lead and manage projects with cross-functional teams.
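To make the model-serving side concrete, here is a minimal post-deployment smoke test of the kind a CI/CD pipeline might run against an OpenAI-compatible endpoint (as vLLM exposes, for example). The URL and model id are hypothetical placeholders.

```python
# Smoke-test a freshly deployed model server; a non-zero exit fails the stage.
import sys
import requests

ENDPOINT = "http://model-serving.internal:8000/v1/chat/completions"  # assumed
PAYLOAD = {
    "model": "my-org/finetuned-llm",  # hypothetical model id
    "messages": [{"role": "user", "content": "Reply with the word: ready"}],
    "max_tokens": 8,
    "temperature": 0.0,
}

def smoke_test() -> bool:
    resp = requests.post(ENDPOINT, json=PAYLOAD, timeout=30)
    resp.raise_for_status()
    text = resp.json()["choices"][0]["message"]["content"]
    print(f"model replied: {text!r}")
    return "ready" in text.lower()

if __name__ == "__main__":
    sys.exit(0 if smoke_test() else 1)
```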

Posted 1 month ago

Apply

7.0 - 10.0 years

8 - 13 Lacs

Mumbai, Hyderabad, Pune

Work from Office

Key Responsibilities:
Design, build, and maintain CI/CD pipelines for ML model training, validation, and deployment.
Automate and optimize ML workflows, including data ingestion, feature engineering, model training, and monitoring.
Deploy, monitor, and manage LLMs and other ML models in production (on-premises and/or cloud).
Implement model versioning, reproducibility, and governance best practices.
Collaborate with data scientists, ML engineers, and software engineers to streamline the end-to-end ML lifecycle.
Ensure security, compliance, and scalability of ML/LLM infrastructure.
Troubleshoot and resolve issues related to ML model deployment and serving.
Evaluate and integrate new MLOps/LLMOps tools and technologies.
Mentor junior engineers and contribute to best-practices documentation.

Required Skills & Qualifications:
8+ years of experience in DevOps, with at least 3 years in MLOps/LLMOps.
Strong experience with cloud platforms (AWS, Azure, GCP) and container orchestration (Kubernetes, Docker).
Proficient in CI/CD tools (Jenkins, GitHub Actions, GitLab CI, etc.).
Hands-on experience deploying and managing different types of AI models (e.g., OpenAI, HuggingFace, custom models) for use in solution development.
Experience with model serving tools such as TGI, vLLM, BentoML, etc.
Solid scripting and programming skills (Python, Bash, etc.).
Familiarity with monitoring/logging tools (Prometheus, Grafana, ELK stack).
Strong understanding of security and compliance in ML environments.

Preferred Skills:
Knowledge of model explainability, drift detection, and model monitoring.
Familiarity with data engineering tools (Spark, Kafka, etc.).
Knowledge of data privacy, security, and compliance in AI systems.
Strong communication skills to collaborate effectively with various stakeholders.
Critical thinking and problem-solving skills are essential.
Proven ability to lead and manage projects with cross-functional teams.
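Since the preferred skills above mention drift detection, here is a minimal sketch of one common approach: a two-sample Kolmogorov-Smirnov test comparing a feature's training distribution against live traffic. The data and threshold are synthetic illustrations only.

```python
# Flag potential feature drift with a two-sample KS test.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(42)
train_feature = rng.normal(loc=0.0, scale=1.0, size=5_000)  # stand-in for training data
live_feature = rng.normal(loc=0.3, scale=1.0, size=1_000)   # stand-in for production data

stat, p_value = ks_2samp(train_feature, live_feature)
DRIFT_P_THRESHOLD = 0.01  # assumed alerting threshold

if p_value < DRIFT_P_THRESHOLD:
    print(f"drift suspected: KS={stat:.3f}, p={p_value:.2e}")
else:
    print(f"no significant drift: KS={stat:.3f}, p={p_value:.2e}")
```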

Posted 1 month ago

Apply

2.0 - 7.0 years

8 - 14 Lacs

Pune, Coimbatore

Work from Office

Job Summary:
We are seeking a skilled Erlang Developer to join our backend engineering team. The ideal candidate will have a strong background in Erlang, with working experience in Elixir and RabbitMQ. You will play a key role in designing, building, and maintaining scalable, fault-tolerant systems used in high-availability environments.

Key Responsibilities:
- Design, develop, test, and maintain scalable Erlang-based backend applications.
- Collaborate with cross-functional teams to understand requirements and deliver efficient solutions.
- Integrate messaging systems such as RabbitMQ to ensure smooth communication between services.
- Write reusable, testable, and efficient code in Erlang and Elixir.
- Monitor system performance and troubleshoot issues in production.
- Ensure high availability and responsiveness of services.
- Participate in code reviews and contribute to best practices in functional programming.

Required Skills:
- Proficiency in Erlang with hands-on development experience.
- Working knowledge of Elixir and the Phoenix framework.
- Strong experience with RabbitMQ and messaging systems.
- Good understanding of distributed systems and concurrency.
- Experience with version control systems like Git.
- Familiarity with CI/CD pipelines and containerization (Docker is a plus).

Preferred Qualifications:
- Experience working in telecom, fintech, or real-time systems.
- Knowledge of OTP (Open Telecom Platform) and BEAM VM internals.
- Familiarity with monitoring tools like Prometheus, Grafana, etc.
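The role itself is Erlang-centric, but as a language-neutral illustration of the RabbitMQ consume-and-acknowledge pattern the responsibilities mention, here is a minimal Python sketch using the pika client. The host, queue name, and prefetch value are assumptions.

```python
# Durable-queue consumer with manual acknowledgements.
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
channel = connection.channel()
channel.queue_declare(queue="orders", durable=True)  # hypothetical queue

def handle(ch, method, properties, body):
    print(f"received: {body!r}")
    ch.basic_ack(delivery_tag=method.delivery_tag)  # ack only after processing

channel.basic_qos(prefetch_count=10)  # bound unacked messages per consumer
channel.basic_consume(queue="orders", on_message_callback=handle)
channel.start_consuming()
```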

Posted 1 month ago

Apply

1.0 - 3.0 years

10 - 15 Lacs

Bengaluru

Work from Office

SRE 1 (Cloud Ops)
Locations: Bengaluru & Pune
Experience: 1 to 3 years
Candidates only from B2C product companies
Experience with: GCP, Prometheus, Grafana, ELK, New Relic, Pingdom or PagerDuty, Kubernetes
Experience with CI/CD tools
5-day week; rotational shift
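Given the Prometheus and Grafana stack named in this listing, here is a minimal sketch of instrumenting a service with the official prometheus_client library so those tools have something to scrape. The port and metric names are illustrative assumptions.

```python
# Expose request count and latency metrics on :8000/metrics.
import time
import random
from prometheus_client import Counter, Histogram, start_http_server

REQUESTS = Counter("app_requests_total", "Total requests handled", ["status"])
LATENCY = Histogram("app_request_seconds", "Request latency in seconds")

@LATENCY.time()  # records each call's duration into the histogram
def handle_request() -> None:
    time.sleep(random.uniform(0.01, 0.1))  # simulated work
    REQUESTS.labels(status="200").inc()

if __name__ == "__main__":
    start_http_server(8000)  # Prometheus scrapes this endpoint
    while True:
        handle_request()
```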

Posted 1 month ago

Apply