35 Observability Jobs

JobPe aggregates results for easy application access, but you actually apply on the job portal directly.

10.0 - 15.0 years

30 - 45 Lacs

Noida

Work from Office

Naukri logo

Your Role

We are building 5ive.ai, a deeptech AI platform that personalizes video-based learning at scale for K-12 students. As our Engineering Manager, you'll lead the full engineering stack: from cloud architecture and DevOps to scalable frontends and personalized content pipelines. You will shape the technology roadmap and manage an elite team delivering real-world AI at scale.

What You'll Own

Engineering Leadership
- Lead, hire, mentor, and grow a high-performing engineering team.
- Manage the end-to-end technical delivery of features and infrastructure.
- Collaborate cross-functionally with product, design, and AI/ML teams.

Full Stack Architecture
- Architect frontend (React/Next.js) and backend (Node.js/Python) systems.
- Lead design of scalable, modular, and reusable components and APIs.
- Implement observability and performance tracking in client-side apps.

DevOps & Cloud Strategy
- Architect secure and scalable systems on AWS (EC2, Lambda, S3, RDS, CloudFront, MediaConvert, etc.).
- Own CI/CD pipelines and environment workflows (dev, staging, prod).
- Automate infra using Terraform, CloudFormation, or AWS CDK.

Adaptive Video Streaming
- Build and optimize our personalized video delivery platform.
- Integrate HLS/DASH-based streaming, FFmpeg pipelines, and DRM/CDN workflows.
- Evaluate vendors (Vimeo, Mux, AWS Media Services) and self-hosting tradeoffs.

AI/ML Infrastructure Integration
- Collaborate with Data Science to deploy and scale ML models.
- Build support for GPU inferencing, microservices, and async workers.
- Support real-time content personalization based on user traits.

Security & Compliance
- Ensure best practices for cloud IAM, VPC security, and access control.
- Help align infrastructure and data flows with COPPA, GDPR, and FERPA.

Team Processes & Quality
- Define workflows for code reviews, TDD, BDD, and release cycles.
- Instill a strong engineering culture, balancing speed and quality.

What We're Looking For
- 10+ years of experience in software engineering; 2+ in engineering leadership.
- Hands-on expertise across the full stack: Node.js, Python, React/Next.js.
- Deep knowledge of the AWS ecosystem and DevOps (CI/CD, monitoring, cost optimization).
- Experience building and deploying enterprise-grade applications.
- Track record of managing scalable video delivery infrastructure.
- Strong understanding of microservices and container orchestration (Docker/Kubernetes).

Nice to Have
- Experience with FFmpeg, DRM, and CDN optimization.
- Knowledge of student data privacy and security standards.
- Prior work in edtech or AI/ML deployment environments.
- Familiarity with frontend observability tools (Sentry, Lighthouse).

Key Skills Checklist
- Full Stack (Node.js, Python, React, Next.js)
- AWS (EC2, S3, Lambda, CloudFront, RDS, MediaConvert)
- DevOps (CI/CD, GitHub Actions/GitLab CI, Jenkins)
- Infrastructure as Code (Terraform, CloudFormation, AWS CDK)
- Video Streaming (FFmpeg, HLS/DASH, DRM, Mux, Vimeo, AWS Media)
- Observability (CloudWatch, Prometheus, ELK, Datadog)
- AI/ML Infra (GPU inference, async workers, SageMaker familiarity)
- Security (IAM, VPC, secrets management)
- Microservices (Docker, Kubernetes)
- Team Leadership & Agile Delivery

Posted 1 day ago

4.0 - 7.0 years

9 - 15 Lacs

Bengaluru

Hybrid

Naukri logo

Job Description: We are looking for a highly motivated SRE Observability Engineer with strong experience in observability platforms and automation. The ideal candidate will have excellent Python coding skills and hands-on experience with Prometheus and Grafana.

Key Skills:
- SRE & Observability practices
- Prometheus (Monitoring and Alerting)
- Grafana (Dashboarding & Visualization)
- Strong Python programming for automation and graphing
- Good understanding of infrastructure monitoring

Posted 1 day ago

6.0 - 11.0 years

6 - 15 Lacs

Pune, Bengaluru

Work from Office

Naukri logo

This is a FULL TIME POSITION with Infosys. A face-to-face (F2F) interview is mandatory for these roles.

Multiple roles: 8-10 positions, including Architect level
Location: Bangalore or Pune

Are you an SRE or Observability Enthusiast? Do you thrive on turning complex systems into transparent ones? Are you passionate about diving deep into metrics, logs, and traces to uncover insights and optimize performance?

We're seeking experienced professionals in the following roles (with a minimum of 2-3 years of relevant experience in any of the skills below):

SRE Engineer / Architect / Consultant
- Design and implement SRE practices
- Design and implement robust monitoring and alerting systems
- Automate routine tasks and streamline operations
- Ensure system reliability, scalability, and performance
- Strong understanding of cloud platforms and containerization technologies

Observability Engineer / Lead
- Design and implement effective observability strategies
- Analyze logs, metrics, and traces to identify performance bottlenecks
- Set up alerts and notifications for critical issues
- Experience in tools like Datadog, Dynatrace, New Relic, Splunk, Prometheus, and Grafana

We'd love to hear from you if you think you fit into any of the above roles. Let's build the future of technology together!

Abhishek.Sharma@ZentekInfosoft.com

Posted 2 days ago

10.0 - 15.0 years

0 Lacs

Navi Mumbai, Maharashtra, India

On-site

Foundit logo

Job Description for Pre-Sales Consultants

About Jio's Hyper Automation Product Engineering Team

We are building next-generation Infra services & operations platforms catering to a Hybrid cloud environment. This platform will have all the necessary functions and features to achieve Operational Intelligence, viz. Consumer Onboarding, Multi-Cloud Market Place (including managed on-prem choice), Provisioning of various cloud services/platforms, Infra/Platform Usage Analysis & Optimization advice, Billing, and Observability (Infra/Applications Metrics & Logs) with AI/ML-based Advanced Analytics (Predictive, Prescriptive and Descriptive) for proactive preventive measures.

Keywords: Pre-Sales, AWS, Azure, GCP, Solution Architect, Cloud Migration, RFP

Years of Experience
1. Total IT: 10-15 years
2. Total Relevant: 3 years

Location: Navi Mumbai, Bangalore, Delhi

Roles and Responsibilities
- Pitch solutions to a customer and explain all the features and benefits of a particular product or service.
- Prepare cost estimates and technical proposals that meet the client's requirements.
- Help sales executives during technical presentations and respond to requests for information (RFIs) or requests for proposals (RFPs) from customers.
- Determine the technical requirements to meet customer goals.

Product Knowledge/Tech Skills

1. Must have
a. 3+ years of pre-sales experience in Cloud Native Enterprise Solutions / SaaS Solutions / Cloud Platform Offerings
b. Hands-on expertise in building POCs to demonstrate product capabilities
c. Good understanding of Cloud platform management functions: Onboarding, Provisioning, Monitoring (Observability), Billing, Cost-Advisory, and Compliance
d. Good understanding of cloud network and information security features
e. Experience in implementing cloud-native applications, containerization, and distributed computing
f. Expertise in implementing any of the cloud platforms: Azure, GCP, Red Hat or AWS
g. Strong analytical skills to assess technical capabilities and constraints
h. Strong verbal/written communication and presentation skills
i. Good understanding of design patterns/frameworks and best practices, e.g., Sync/Async APIs/Services, Authorization/Authentication, distributed computing, information security, extensible data modeling, etc.
j. Ability to multi-task and handle numerous competing priorities

2. Nice to have
a. Exposure to SQL/NoSQL/time-series persistent stores, e.g., Oracle, MongoDB, Redis, PostgreSQL
b. Experience in implementing continuous integration and delivery (CI & CD)
c. Exposure to Cloud Management Product Engineering Solutions
d. Government Community Cloud norms

3. Generic Skills
Team player, Networking, Social & Cultural awareness

Posted 2 days ago

10.0 - 12.0 years

0 Lacs

Hyderabad, Telangana, India

On-site

Foundit logo

Director - Hyderabad Infrastructure Operations Lead
Level: M2
Supervisor: Harsh Chadha, Sr. Director IT - Lilly Hyderabad

About the Team:
The Hyderabad Infrastructure Operations Lead is responsible for leading the Digital Core Infrastructure operations teams spanning 2 shifts based in Hyderabad, India. This strategic leader shares responsibility for the governance and operational excellence of the enterprise InfraOPS with their global counterparts. This role oversees 2 of the 3 global operations teams of infrastructure platforms (Servers, Storage, Cloud Ops) to enhance service delivery, automation, and efficiency across the organization.

The ideal candidate will lead key operations transformation initiatives, driving related service management processes, while ensuring alignment with business goals, industry best practices, and emerging technologies, like AI and Automation. They will collaborate with cross-functional teams, lead innovation efforts, and manage vendor relationships to optimize platform performance and scalability. Additionally, they will have a strong background in Infrastructure, platform operations, hyperscale cloud, leading high performing global teams, and a proven track record of managing large-scale service delivery. These skills will enable improved user efficiency and experience and help support the broader Company purpose of making life better for people around the world.

What you'll be doing:
As the leader of the Hyderabad Infrastructure Operations Team, you'll be operating as a highly effective People, Transformation, and Relationship Leader. You will have the desire and proven ability to cut through ambiguity and re-imagine how services should be established and managed to ensure the highest levels of efficiency. You will be a respected and robust partner who feels obligated to focus on enterprise value-based outcomes - one that can establish new enterprise capabilities through engagement with cross functional partners and vendors whilst minimizing technical debt.

Key Responsibilities:

Hyderabad Infrastructure Operations Team Leadership
- Be a Leader: Lead multiple teams with multiple first line leaders focused on the ongoing operational support of Lilly's global Technology infrastructure.
- Be Bold: You will drive Infrastructure Operations to never have to fix the same problem twice through adoption of AI OPS, Event Driven Automation, and robust Observability.
- Be Fast: You will accelerate initiatives in areas such as: Infrastructure AI OPS automation, cloud IaaS management, and cloud infrastructure as code to enable critical business projects.
- Be Proactive: You will have groundbreaking opportunities to transform our operations processes using proactive, predictive, and automated AI & Observability capabilities.
- Be Your Best: You will bring high learning agility and Infrastructure operations / engineering skills to help us enable the Lilly Technology strategy, identify tech opportunities, and accelerate our AI OPS journey.

Incident and Change Leadership
- Follow ITIL-based incident, problem, and change management processes using ServiceNow.
- Manage incident resolution and root cause analysis for critical server issues.
- Oversee change management processes, ensuring minimal impact to production environments.

Incident, Change and Request Management:
- Participate in incident response and root cause analysis to prevent recurrences, be available on-call as needed, and participate in an on-call schedule.
- Able to work off-hours and weekends if needed for any major incidents/critical activities.
- Work under pressure to guide teams in resolving incidents quickly.
- Oversee changes to all infrastructure teams, ensuring adherence to processes with minimal production impact.
- Partner with Tech@Lilly, Cyber, Quality, Procurement, and other partner organizations to ensure high Shared Consciousness in transformation roadmaps.

Other responsibilities
- Partner with a cross functional group of architects, technologists, and service area leadership to establish and execute against an ongoing engineering excellence program focusing on continuous improvement.
- Demonstrate the ability to drive, lead and coach others, and influence others outside their sphere of influence.
- Manage a team: responsible for staff performance evaluations and management (e.g., disciplinary), training and development, with authority to hire.
- Act as a member of the Lilly Hyderabad T@L Lead Team to ensure governance, process and compliance consistency across the various Lilly Hyderabad T@L service areas.
- Provide coaching and mentorship to others within the function to enhance the team's ongoing technical development and understanding of technologies, services, quality and security compliance standards, and methodologies.
- Identify and hire talent to foster innovation and excellence.
- Proven experience in assessing business value, risk mitigation, cost optimization, and return on investment.
- Deliver results based upon annual goals, department goals and management requests.
- Develop department budget, performance standards, and schedules.
- Establish operating policies and procedures.
- Implement initiatives for continuous improvement and ideas for positive disruption.

Basic Qualifications:
- A bachelor's degree in an IT subject area (computer science, information systems, etc.) or equivalent experience.
- 10+ years of experience in IT Infrastructure operations, with a strong focus on server & storage platforms (e.g., Windows Server, Linux, Storage & Backups, Virtualization).
- Proven leadership experience managing or working on global/diverse teams.
- Strong knowledge of ITIL frameworks, service operations, and process improvement methodologies.
- Demonstrated leadership, influence, communication, presentation, and facilitation skills.
- Demonstrated strong partnership skills and influence with business partners inside business unit context.
- Demonstrated influence and communication skills across all levels of IT.
- Strong organizational and communication skills with multiple examples of being able to convey complex ideas and thoughts in manners that resulted in definitive directions and results.
- Strong negotiation skills.
- Deep vendor management experience.
- Proactive, demonstrated ability to challenge the status quo and strong ability to drive peers and above to timely decisions.
- A high level of intellectual curiosity, external perspective, technical aptitude and innovation interest.
- Demonstrated experience in service transformation with a focus on people, process, and technology.
- Experienced in delivering and sustaining solutions throughout the software development lifecycle: design, engineering, construct, testing, deployment, and support of software solutions, platforms, services, and capabilities.
- Demonstrated ownership of sustainable capabilities and services within the budget, timeline, and scope constraints.
- Demonstrated business and technical acumen through interactions with key business and IT leadership.

Additional Skills / Preferences:
- Master's degree in IT subject area (computer science, information systems, etc.).
- Basic understanding of cloud technologies (Azure, AWS) and hybrid cloud environments.
- Proficient in utilizing monitoring tools such as Splunk or similar platforms.
- System Maintenance and Monitoring: Ensure the stability, performance, and security of Linux/Windows/Cloud-based/Virtualization/Storage systems. Monitor system health, troubleshoot issues, and implement necessary fixes.
- Customer Support: Provide timely and effective support to customers on an as-needed basis. Address and resolve technical issues, ensuring minimal disruption to services.
- Experience with Agile and DevOps methodologies.

Lilly is dedicated to helping individuals with disabilities to actively engage in the workforce, ensuring equal opportunities when vying for positions. If you require accommodation to submit a resume for a position at Lilly, please complete the accommodation request form for further assistance. Please note this is for individuals to request an accommodation as part of the application process and any other correspondence will not receive a response.

Lilly does not discriminate on the basis of age, race, color, religion, gender, sexual orientation, gender identity, gender expression, national origin, protected veteran status, disability or any other legally protected status.

#WeAreLilly

Posted 1 week ago

3.0 - 8.0 years

9 - 15 Lacs

Pune, Bengaluru

Hybrid

Naukri logo

Dear Applicant,

We have an exciting opportunity in the field of SRE Engineering (Python Scripting). The successful candidate will resolve SRE incidents and proactively improve observability.

About this position:
We are looking for a skilled SRE/DevOps Engineer with expertise in scripting, cloud infrastructure, monitoring, and incident management to ensure the reliability, scalability, and performance of our systems. The ideal candidate will have hands-on experience in Python/Go scripting, GCP, Kubernetes, and CI/CD tools, along with strong troubleshooting skills in Linux and networking.

Impact you will realize:
- Enhances Cloud & DevOps Expertise: Working with GCP, Kubernetes, and CI/CD tools will deepen your cloud infrastructure and automation skills.
- Sharpens Scripting & Debugging Abilities: Developing and optimizing Python/Go scripts will improve your coding efficiency and troubleshooting mindset.
- Builds Strong Observability & Incident Management Skills: Hands-on experience with monitoring tools (Grafana, Datadog) and log analysis will make you adept at maintaining system reliability.
- Boosts Problem-Solving in Real-World Scenarios: Troubleshooting Linux, networking, and cloud security issues will refine your ability to diagnose and resolve production challenges effectively.

Key skills you will require:
- Strong scripting skills in Python (must) and/or Go (preferred).
- Hands-on experience with GCP (logging, security, resource management).
- Familiarity with monitoring tools (Grafana, Datadog, Prometheus).
- Knowledge of Linux, Kubernetes, and networking fundamentals.
- Experience with CI/CD pipelines (Jenkins, Terraform, Ansible).
- Ability to analyze logs, debug issues, and optimize performance.

Qualifications you will require:
- Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent work experience.

Posted 1 week ago

0.0 years

0 Lacs

Bengaluru, Karnataka, India

On-site

Foundit logo

Ready to shape the future of work?

At Genpact, we don't just adapt to change - we drive it. AI and digital innovation are redefining industries, and we're leading the charge. Genpact's AI Gigafactory, our industry-first accelerator, is an example of how we're scaling advanced technology solutions to help global enterprises work smarter, grow faster, and transform at scale. From large-scale models to agentic AI, our breakthrough solutions tackle companies' most complex challenges. If you thrive in a fast-moving, tech-driven environment, love solving real-world problems, and want to be part of a team that's shaping the future, this is your moment.

Genpact (NYSE: G) is an advanced technology services and solutions company that delivers lasting value for leading enterprises globally. Through our deep business knowledge, operational excellence, and cutting-edge solutions - we help companies across industries get ahead and stay ahead. Powered by curiosity, courage, and innovation, our teams implement data, technology, and AI to create tomorrow, today. Get to know us at genpact.com and on LinkedIn, X, YouTube, and Facebook.

Inviting applications for the role of Senior Principal Consultant - Senior Data Engineer - Snowflake, AWS, Cortex AI & Horizon Catalog

Role Summary:
We are seeking an experienced Senior Data Engineer with deep expertise in modernizing Data & Analytics platforms on Snowflake, leveraging AWS services, Cortex AI, and Horizon Catalog for high-performance, AI-driven data management. The role involves designing scalable data architectures, integrating AI-powered automation, and optimizing data governance, lineage, and analytics frameworks.

Key Responsibilities:
- Architect & modernize enterprise Data & Analytics platforms on Snowflake, utilizing AWS, Cortex AI, and Horizon Catalog.
- Design and optimize Snowflake-based Lakehouse architectures, integrating AWS services (S3, Redshift, Glue, Lambda, EMR, etc.).
- Leverage Cortex AI for AI-driven data automation, predictive analytics, and workflow orchestration.
- Implement Horizon Catalog for enhanced data lineage, governance, metadata management, and security.
- Develop high-performance ETL/ELT pipelines, integrating Snowflake with AWS and AI-powered automation frameworks.
- Utilize Snowflake's native capabilities like Snowpark, Streams, Tasks, and Dynamic Tables for real-time data processing.
- Establish data quality automation, lineage tracking, and AI-enhanced data governance strategies.
- Collaborate with data scientists, ML engineers, and business stakeholders to drive AI-led data initiatives.
- Continuously evaluate emerging AI and cloud-based data engineering technologies to improve efficiency and innovation.

Qualifications we seek in you!

Minimum Qualifications
- Experience in Data Engineering, AI-powered automation, and cloud-based analytics.
- Expertise in Snowflake (Warehousing, Snowpark, Streams, Tasks, Dynamic Tables).
- Strong experience with AWS services (S3, Redshift, Glue, Lambda, EMR).
- Deep understanding of Cortex AI for AI-driven data engineering automation.
- Proficiency in Horizon Catalog for metadata management, lineage tracking, and data governance.
- Advanced knowledge of SQL, Python, and Scala for large-scale data processing.
- Experience in modernizing Data & Analytics platforms and migrating on-premises solutions to Snowflake.
- Strong expertise in Data Quality, AI-driven Observability, and ModelOps for data workflows.
- Familiarity with Vector Databases & Retrieval-Augmented Generation (RAG) architectures for AI-powered analytics.
- Excellent leadership, problem-solving, and stakeholder collaboration skills.

Preferred Skills:
- Experience with Knowledge Graphs (Neo4J, TigerGraph) for structured enterprise data systems.
- Exposure to Kubernetes, Terraform, and CI/CD pipelines for scalable cloud deployments.
- Background in streaming technologies (Kafka, Kinesis, AWS MSK, Snowflake Snowpipe).

Why Join Us
- Lead Data & AI platform modernization initiatives using Snowflake, AWS, Cortex AI, and Horizon Catalog.
- Work on cutting-edge AI-driven automation for cloud-native data architectures.
- Competitive salary, career progression, and an opportunity to shape next-gen AI-powered data solutions.

Why join Genpact?
- Be a transformation leader - Work at the cutting edge of AI, automation, and digital innovation
- Make an impact - Drive change for global enterprises and solve business challenges that matter
- Accelerate your career - Get hands-on experience, mentorship, and continuous learning opportunities
- Work with the best - Join 140,000+ bold thinkers and problem-solvers who push boundaries every day
- Thrive in a values-driven culture - Our courage, curiosity, and incisiveness - built on a foundation of integrity and inclusion - allow your ideas to fuel progress

Come join the tech shapers and growth makers at Genpact and take your career in the only direction that matters: Up. Let's build tomorrow together.

Genpact is an Equal Opportunity Employer and considers applicants for all positions without regard to race, color, religion or belief, sex, age, national origin, citizenship status, marital status, military/veteran status, genetic information, sexual orientation, gender identity, physical or mental disability or any other characteristic protected by applicable laws. Genpact is committed to creating a dynamic work environment that values respect and integrity, customer focus, and innovation.

Furthermore, please do note that Genpact does not charge fees to process job applications and applicants are not required to pay to participate in our hiring process in any other way. Examples of such scams include purchasing a 'starter kit,' paying to apply, or purchasing equipment or training.

Posted 1 week ago

3.0 - 7.0 years

3 - 7 Lacs

Hyderabad / Secunderabad, Telangana, India

On-site

Foundit logo

Job responsibilities
- You will be responsible for understanding requirements or SRE goals in depth from both tech and business perspectives.
- You will provide solutions to improve reliability, including identifying and implementing mechanisms and architectures that enable fault tolerance and faster median time to respond and median time to detect.
- You will be responsible for enhancing the incident management process, including the development of an incident prioritization matrix, triage, communication, mitigation, post-mortem analysis and implementation of corrective actions.
- You will manage client stakeholder expectations and queries during production incidents, providing detailed technical analysis of issues and remediation plans for mitigation and prevention in future, and act as the interface for C-level executives, if or when needed.
- You will be a liaison with client engineering teams, and build trust and productive relationships with senior client stakeholders and team leads to influence them in making better decisions.
- You will be responsible for identifying opportunities for enhancing system performance and reliability in alignment with business SLAs, SLOs, KPIs and objectives, and provide guidance and assistance to SRE teams in implementing the identified improvements.
- As an SRE expert, you will collaborate with Thoughtworks application development leads and solution architects, recommending changes in system design and adopting best practices for improved reliability from day one.
- You will oversee and mentor other SREs on the team, contributing to their growth and development.

Job qualifications

Technical Skills
- You can program with one or more high-level languages such as Python, Golang, Shell scripting, Ruby or Java.
- You are familiar with DevOps and GitOps practices, driving the integration of observability automation into CI/CD pipelines, e.g., GitLab, Jenkins, CircleCI or equivalent.
- You have in-depth knowledge of configuration management and Infrastructure as Code (IaC) tools such as Terraform, Ansible, ARM and CloudFormation for provisioning and managing infrastructure.
- You have expertise in observability, logs, tracing and monitoring tools such as Grafana (Loki and Tempo), Prometheus, Graylog, Jaeger, Zipkin, ELK stack or equivalent.
- You have a strong understanding of container-based architecture and hands-on experience with orchestration tools such as Kubernetes, AWS EKS, Docker Swarm, Nomad, etc.
- You have in-depth experience in application and infrastructure performance tuning and scaling to handle heavy loads under different scenarios, e.g., periodic traffic load and tsunami patterns.
- You have a good understanding of essential concepts such as quality gates encompassing SLI/SLO/SLA, chaos engineering, golden signals, blameless postmortem methodologies, synthetic monitoring, distributed tracing, end-user monitoring and performance testing.
- You have experience with network load balancing, security tech stacks, Transport Layer Security (TLS) and certificate management, and an understanding of standard networking protocols and configurations.

Professional Skills
- You have strong communication and articulation skills, and are proficient in English.
- You are able to convey resolutions to audiences with varying degrees of technical/business proficiency and bring them to consensus.
- You have excellent problem-solving and analytical skills, with a focus on continuous improvement.
- You have good listening and presentation skills.
- You solve challenging problems and difficult-to-debug issues with a never-give-up attitude.
- You can collaborate with cross-functional engineering teams to conduct capacity planning and scalability assessments, and design solutions for handling current and future growth.
- You have the ability to work under pressure, with composure, during production incidents.
- You understand requirements provided by the client on both technical and business aspects, and can break them down for successful implementation.

Posted 1 week ago

5.0 - 7.0 years

0 Lacs

Hyderabad / Secunderabad, Telangana, India

On-site

Foundit logo

Ready to shape the future of work At Genpact, we don&rsquot just adapt to change&mdashwe drive it. AI and digital innovation are redefining industries, and we&rsquore leading the charge. Genpact&rsquos , our industry-first accelerator, is an example of how we&rsquore scaling advanced technology solutions to help global enterprises work smarter, grow faster, and transform at scale. From large-scale models to , our breakthrough solutions tackle companies most complex challenges. If you thrive in a fast-moving, tech-driven environment, love solving real-world problems, and want to be part of a team that&rsquos shaping the future, this is your moment. Genpact (NYSE: G) is an advanced technology services and solutions company that delivers lasting value for leading enterprises globally. Through our deep business knowledge, operational excellence, and cutting-edge solutions - we help companies across industries get ahead and stay ahead. Powered by curiosity, courage, and innovation, our teams implement data, technology, and AI to create tomorrow, today. Get to know us at and on , , , and Inviting applications for the role of Principal Consultant -Lead MLOps Engineer! In this role, you will define, implement and oversee the MLOps strategy for scalable, compliant, and cost-efficient deployment of AI/ GenAI models across the enterprise. This role combines deep DevOps knowledge, infrastructure architecture, and AI platform design to guide how teams build and ship ML models securely and reliably. You will establish governance, reuse, and automation frameworks for AI infrastructure, including Terraform-first cloud automation, multi-environment CI/CD, and observability pipelines. Responsibilities Architect secure, reusable, modular IaC frameworks across cloud and regions for MLOps Lead the development of CI/CD pipelines and standardize deployment frameworks. Design observability and monitoring systems for ML/ GenAI workloads. 
Collaborate with platform, data science, compliance and Enterprise Architecture teams to ensure scalable ML operations. Define enterprise-wide MLOps architecture and standards (build ? deploy ? monitor) Lead design of GenAI / LLMOps platform (Bedrock/OpenAI/Hugging Face + RAG stack) Integrate governance controls (approvals, drift detection, rollback strategies) Define model metadata standards, monitoring SLAs, and re-training workflows Influence tooling, hiring, and roadmap decisions for AI/ML delivery Be engaging in the design, development and maintenance of data pipelines for various AI use cases Required to actively contribution to key deliverables as part of an agile development team Qualifications we seek in you! Minimum Qualifications Good years of experience in DevOps or MLOps roles. Degree/qualification in Computer Science or a related field, or equivalent work experience Strong Python programming skills. Hands on experience in containerised deployment. Proficient with AWS (SageMaker, Lambda, ECR), Terraform, and Python. Demonstrated experience deploying multiple GenAI systems into production. Hands-on experience deploying 3-4 ML/ GenAI models in AWS. Deep understanding of ML model lifecycle: train ? test ? deploy ? monitor ? retrain. Experience in developing, testing, and deploying data pipelines using public cloud. Clear and effective communication skills to interact with team members, stakeholders and end users Knowledge of governance and compliance policies, standards, and procedures Exposure to RAG/LLM workloads and model deployment infrastructure. Experience in developing, testing, and deploying data pipelines Preferred Qualifications/ Skills Experience designing model governance frameworks and CI/CD pipelines. Knowledge of governance and compliance policies, standards, and procedures Advanced understanding of platform security, cost optimization, and ML observability. 
Why join Genpact? Be a transformation leader - Work at the cutting edge of AI, automation, and digital innovation. Make an impact - Drive change for global enterprises and solve business challenges that matter. Accelerate your career - Get hands-on experience, mentorship, and continuous learning opportunities. Work with the best - Join 140,000+ bold thinkers and problem-solvers who push boundaries every day. Thrive in a values-driven culture - Our courage, curiosity, and incisiveness - built on a foundation of integrity and inclusion - allow your ideas to fuel progress. Come join the tech shapers and growth makers at Genpact and take your career in the only direction that matters: Up. Let's build tomorrow together. Genpact is an Equal Opportunity Employer and considers applicants for all positions without regard to race, color, religion or belief, sex, age, national origin, citizenship status, marital status, military/veteran status, genetic information, sexual orientation, gender identity, physical or mental disability or any other characteristic protected by applicable laws. Genpact is committed to creating a dynamic work environment that values respect and integrity, customer focus, and innovation. Furthermore, please do note that Genpact does not charge fees to process job applications and applicants are not required to pay to participate in our hiring process in any other way. Examples of such scams include purchasing a 'starter kit,' paying to apply, or purchasing equipment or training.

Posted 1 week ago

4.0 - 9.0 years

25 - 40 Lacs

Gurugram, Chennai, Bengaluru

Hybrid

Naukri logo

Software Engineer - Observability. Strong experience in Python coding; AWS services (CloudWatch, X-Ray and Lambda); OpenTelemetry (should know); Dynatrace on-prem and SaaS. The person should have hands-on experience in setting up and designing dashboards and should be hands-on in Python coding. Observability: Must have complete context of SLI/SLO/SLA, including how to set up, measure, track and communicate them. Open Source Observability Stack: Good understanding of OpenTelemetry and how to instrument applications to get the desired metrics, traces, logs, etc. AWS Services: CloudWatch, X-Ray, Lambda, overall data flow. OpenShift ROSA: Red Hat OpenShift Service on AWS. Development Experience: Any language; should be able to read code and develop utilities as required. Grafana.

Posted 1 week ago

5.0 - 7.0 years

10 - 20 Lacs

Pune, Chennai, Bengaluru

Hybrid

Naukri logo

Site Reliability Engineer. As a Senior Site Reliability Engineer, you will play a critical role in supporting application developers by providing expert guidance on application and infrastructure best practices from a reliability perspective. Your role covers the entire life cycle of a product/application. Your primary focus will be automation, observability, reliability and release management with CI/CD, with an emphasis on solving operations issues. Must have at least 5 years of SRE experience in large programs with a focus on release engineering, observability tasks and reliability. Must have a good understanding of Site Reliability Engineering (SRE) and release management processes. Should possess strong analytical and troubleshooting skills. Should be a strong team player who enjoys collaborating with different people and profiles, shares knowledge and strives for continuous development and learning. Excellent communication skills along with leadership skills. Responsibilities (include but are not limited to): Improve reliability, quality, and time-to-market of our suite of products/applications. Define suitable metrics for the system with SLOs/SLIs and set up an observability mechanism to track them. Define the error budget as per the SLO. Define the strategy for and set up a High Availability and Load Balancer based architecture. Drive a metrics-driven culture and software delivery process using data to measure overall system quality and reliability.
Balance feature development speed and reliability with well-defined service level objectives. Provide primary operational support and engineering for products/applications. Partner with solution architects and development teams to improve service reliability. Participate in system design, infra management and capacity planning. Participate in optimizing code, automating operational tasks and toil reduction. Provide solutions for performance management, disaster recovery, monitoring and observability. Work with business users to understand issues, develop root cause analysis and work with the development team on enhancements/fixes. Work on distributed traces to visualize the entire workflow and analyze the cause of problems/incidents. Improve security and performance of infrastructure and applications. Provide support for, improve, and implement infrastructure as code. Define, evangelize, and maintain SRE best practices. Design and implement DevSecOps best practices. Improve automation, including systems' self-healing capability. Manage and participate in on-call incidents (Priority Incident). Skills: Good experience in scripting or development languages, including expertise in any one of Python, Ruby, JSON, Java, Node.js, or PHP. Experience with scripting in PowerShell (M) and any one of Bash/Shell/Perl. Strong experience with one or more observability tools like New Relic, AppDynamics, Prometheus, Dynatrace, Datadog, Splunk. Experience in observability dashboard creation, custom metrics, Synthetic Monitoring and Real User Monitoring (RUM). Strong knowledge of microservices architecture with APIs and REST APIs. Experience in CI/CD tooling and best practices. Experience with cloud platforms such as AWS, Azure, and Google Cloud. Experience in container orchestration and practices, including Kubernetes and Docker Swarm. Experience in infrastructure automation tools like Terraform, CloudFormation, Ansible, and Puppet (any one). Systems administration and operating system experience on Linux and Windows,
including an understanding of networking. Knowledge of SQL and NoSQL (Oracle, Couchbase). Experience working with tools like Remedy, ServiceNow, Confluence, Jira. Experience with chaos engineering (good to have). Experience with cloud cost optimization (good to have). Knowledge of message broker applications such as RabbitMQ, Kafka or ActiveMQ (good to have).

Posted 1 week ago

14.0 - 24.0 years

50 - 60 Lacs

Noida, Hyderabad, Pune

Work from Office

Naukri logo

Expectations: Prior experience serving as an architect in Practice, COE, and HBUs, creating service offerings, solution accelerators, and unique selling propositions. Play a critical role in driving automation, continuous integration/continuous delivery (CI/CD), and monitoring capabilities to enhance the development and operations processes. Lead the design, definition, and prototyping of the end-to-end unified observability system leveraging New Relic, Splunk and the Grafana stack. Define build, implementation, and deployment strategies for DevOps, Observability and Site Reliability Engineering. Market technology and domain solutions/service offerings to internal/external stakeholders. Manage business relationships with technology partners and start-up ecosystems and demonstrate an edge over the competition. Passionate about technology and customer success, with excellent communication and articulation skills. Should have prior experience in presenting capabilities and solutions to end customers. Build initial prototypes of the observability solution and lead the demo sessions with the customer teams. Behavioral Competencies: Excellent communication, interpersonal and presentation skills; people management; conflict resolution; solutioning; customer service; accountability; judgement and decision making; ability to build and maintain relationships with stakeholders. Technical Skills: At least 4 years of pre-sales experience, working with RFIs/RFPs and developing and presenting technical designs and solutions to internal and external stakeholders. Extensive experience in assessing SRE, DevOps, and Observability maturity, with the ability to define a maturity improvement roadmap.
Extensive experience in defining and implementing SRE, DevOps, and Observability strategies for 3 or more large-scale projects. Experience with cloud platforms such as AWS, Azure or GCP. Deep expertise in time series database configuration and implementation on AWS cloud. Experience with at-scale observability projects as an architect, covering design, implementation, and cloud deployment of observability for containerized (Azure AKS or AWS EKS) applications using New Relic, Splunk and the Grafana stack or open-source Grafana and Prometheus products/tools. Deep expertise in designing and implementing end-to-end distributed tracing using several DaemonSets/agents and telemetry gathering patterns. 3+ years in monitoring and observability automation using New Relic, Splunk and the Grafana stack, including Prometheus-based alerting. Deep expertise in observability tools such as Splunk, New Relic, AWS CloudWatch, AWS OpenSearch, ELK, etc.

Posted 1 week ago

8.0 - 13.0 years

20 - 30 Lacs

Bangalore Rural, Bengaluru

Work from Office

Naukri logo

Immediate Hiring: Java + Observability Engineer (Apache Storm). Location: Bengaluru | Architect Level | Only Immediate Joiners. We are looking for a skilled and experienced Java + Observability Engineer with expertise in Apache Storm to join our team in Bengaluru. This is an exciting opportunity for professionals passionate about modern observability stacks and distributed systems. Key Skills Required: Java (version 8/11 or higher); Observability Tools: Prometheus, Grafana, OpenTelemetry, ELK, Jaeger, Zipkin, New Relic; Containerization: Docker, Kubernetes; CI/CD pipelines; experience designing and building scalable systems as an Architect; hands-on with Apache Storm. Note: This role is open to immediate joiners only. If you're ready to take on a challenging architect-level role and make an impact, send your resume to sushil@saisservices.com

Posted 2 weeks ago

8.0 - 13.0 years

40 - 65 Lacs

Hyderabad

Remote

Naukri logo

Technical Head of Cloud & DevOps. Location: 100% Remote (India, Eastern Europe, UK, or U.S.-based candidates; occasional travel to company hubs or conferences as needed). Type: Full-time, Senior Technical Leadership. Role Overview: We are seeking a Head of Cloud & DevOps to lead the hands-on management, scaling, and continuous improvement of our decentralized compute infrastructure. This position will serve as the primary technical leader for cloud operations, Kubernetes orchestration, infrastructure management, and DevOps pipelines, ensuring platform reliability, performance, and scalability. You will work closely with the CTO, product management, and cross-functional engineering teams to operationalize our company's evolving platform, drive our migration to the in-house Distributed Kubernetes Service (DKS), and ensure high uptime and SLA adherence for enterprise customers. This role requires deep technical expertise combined with strong leadership to guide and mentor teams, while remaining actively engaged in architecture reviews, troubleshooting, and hands-on problem solving. This role is designed for candidates who aspire to grow into a future CTOO position, taking on expanded enterprise leadership responsibilities as the platform scales globally.
Mandatory Skills: Kubernetes orchestration (multi-cluster, DKS, service mesh); cloud infrastructure scaling (AWS, hybrid, AI workloads); DevOps & CI/CD leadership (Jenkins, GitOps, version control); Infrastructure as Code (IaC) (Terraform, Helm, Ansible); incident response and uptime optimization (SRE, observability, 99.9%+ SLAs); security & compliance knowledge (SOC 2, ISO 27001, access control, encryption); team leadership in DevOps/SRE/Cloud Ops; monitoring and alerting systems; platform reliability and SLA adherence; 8+ years in cloud infrastructure, 4+ in Kubernetes/DevOps leadership. Non-Mandatory Skills: Experience with Distributed Kubernetes Service (DKS) migrations; passion for decentralized computing / Web3 / blockchain; NXQ Token or similar token incentive familiarity; cloud-native architecture for AI workloads; experience with hybrid or bare-metal Kubernetes deployments; global infrastructure experience; knowledge of performance-based DevOps metrics (error budgets, SLOs). Key Responsibilities Infrastructure Ownership & Uptime Leadership Own the full operational lifecycle of our company's decentralized compute infrastructure, spanning Kubernetes, VMs, AI workloads, hybrid cloud integrations, and blockchain components. • Develop and execute infrastructure scaling plans to meet growth demands while maintaining enterprise-grade SLAs (99.9%+ uptime). • Build robust monitoring, observability, alerting, and incident response systems to proactively manage global NanoServer operations. • Maintain deep involvement in diagnosing and resolving performance, capacity, and stability issues. Kubernetes Platform Management & DKS Migration Lead the architecture, deployment, and ongoing optimization of our company's Distributed Kubernetes Service (DKS). • Manage the transition from AWS EKS to DKS with zero downtime, thorough testing, rollbacks, and security assurance. • Ensure DKS delivers parity or superiority to leading cloud providers' managed Kubernetes offerings.
DevOps Leadership Drive maturity in CI/CD pipelines, infrastructure-as-code, configuration management, and automated testing practices. • Oversee deployment reliability, version control, rollbacks, and release management. • Lead incident response runbooks, playbooks, SRE error budgets, and continuous reliability improvements. Security & Compliance Implement strong security controls for Kubernetes clusters, network access, identity management, data privacy, and blockchain-related assets. • Collaborate with compliance teams on certifications (SOC 2, ISO 27001, etc.) as required by enterprise clients. • Maintain operational adherence to security standards and best practices. Team Leadership & Execution Lead, mentor, and grow cross-functional cloud operations teams: DevOps, SRE, infrastructure engineers, and backend developers. • Foster a culture of accountability, continuous improvement, operational excellence, and proactive ownership. • Set clear objectives, performance metrics, and technical execution roadmaps aligned to business goals. Collaboration & Stakeholder Alignment • Partner closely with the CTO, product management, and engineering leadership to translate platform objectives into actionable infrastructure projects. • Represent technical operations in cross-functional planning sessions and communicate platform health, SLAs, and operational risks. Qualifications & Experience 8+ years of experience managing complex cloud infrastructure, with at least 4+ years leading DevOps/SRE/Kubernetes operations at scale. • Strong hands-on expertise with Kubernetes orchestration, multi-cluster management, service mesh, container security, and high-scale distributed systems. • Proven success in infrastructure scaling, uptime optimization, incident response, and capacity planning. • In-depth knowledge of DevOps pipelines, CI/CD frameworks, Infrastructure-as-Code (Terraform, Helm), and automated deployments. 
• Demonstrated ability to lead migrations from managed cloud services to in-house infrastructure. • Strong understanding of cloud security, access controls, encryption, data privacy, and enterprise compliance. • Passion for decentralized cloud computing, Web3/blockchain concepts, or AI-driven infrastructure is a plus. • Excellent leadership, communication, and cross-functional collaboration skills. • Bachelor's or Master's degree in Computer Science, Engineering, or a related field; equivalent experience considered. Compensation & Benefits Competitive base salary depending on candidate location • Equity participation aligned to long-term growth of our company • Performance-based annual bonuses • NXQ token incentives aligned with ecosystem growth • Comprehensive healthcare coverage • Remote work flexibility with home office stipends • Opportunities for global collaboration and occasional travel • High-impact leadership role shaping the future of cloud technology • Structured career path to grow into CTOO based on organizational maturity and demonstrated leadership

Posted 2 weeks ago

15.0 - 24.0 years

22 - 37 Lacs

Greater Noida

Work from Office

Naukri logo

Role: Tools & Automation Architect. Experience: 12-18 years' experience in Cloud & Infrastructure Management Solutions with a solid understanding of architecture, design principles and management of IT Infrastructure Automation Platforms. Experience leading assessment, solution design, integration and implementation of BMC Helix, ServiceNow ITOM, ITSM, ITAM and monitoring tools. Experience in IT infrastructure delivery and support roles. Experience analyzing, planning, designing and implementing monitoring tools such as SolarWinds, Nagios, OpsRamp, ManageEngine, ServiceNow ITOM, BMC suite, and observability tools. Possess comprehensive knowledge of the configuration management field and the ability to complete difficult and complex assignments. Configure monitoring of the performance of key business transactions. Ensure proper methodology and standardization are used when implementing solutions. Deep analytical skills to understand complex procedural problems, identify root causes and provide solutions. Expertise in enterprise automation platforms, including GenAI, AI and ML. Self-starter who can work in a dynamic environment. Prepare technical documentation, including high- and low-level designs aligned with requirements, and drive continuous improvement through regular analysis, reporting and training. Ability to create reusable solutions and utilities for driving transformation and automation for our customers. Lead a technical team to deliver these solutions at scale, and be an evangelist for driving transformation and change within the organization. Must have knowledge of DR/HA/standalone and distributed architectures. Expertise in the use of network management protocols (e.g. SNMP, SNMP traps, Syslog, ICMP, NetFlow). Should have experience in describing provisions for configuration identification, change control, configuration status accounting and configuration audits.
Strong, demonstrable ability to anticipate and highlight project risks in terms of schedule, cost, resources and customer satisfaction. Help to prepare and track project costs to meet budget. Help to create the required SOW, OLA and other relevant documents for any project. Manage contracts with vendors and suppliers by assigning tasks and communicating expected deliverables as per the SOW. Report to the leadership team on project status in a timely and comprehensive manner.

Posted 2 weeks ago

7.0 - 10.0 years

20 - 30 Lacs

Bangalore Rural, Bengaluru

Work from Office

Naukri logo

Role & responsibilities: Design end-to-end monitoring and observability solutions to provide comprehensive visibility into infrastructure, applications, and networks. Implement monitoring tools and frameworks (e.g., Prometheus, Grafana, OpsRamp, Dynatrace, New Relic) to track key performance indicators and system health metrics. Integration of monitoring and observability solutions with IT Service Management Tools. Develop and deploy dashboards, alerts, and reports to proactively identify and address system performance issues. Architect scalable observability solutions to support hybrid and multi-cloud environments. Collaborate with infrastructure, development, and DevOps teams to ensure seamless integration of monitoring systems into CI/CD pipelines. Continuously optimize monitoring configurations and thresholds to minimize noise and improve incident detection accuracy. Automate alerting, remediation, and reporting processes to enhance operational efficiency. Utilize AIOps and machine learning capabilities for intelligent incident management and predictive analytics. Work closely with business stakeholders to define monitoring requirements and success metrics. Document monitoring architectures, configurations, and operational procedures. Required Skills: Strong understanding of infrastructure and platform development principles and experience with programming languages such as Python, Ansible, for developing custom scripts. Strong knowledge of monitoring frameworks, logging systems (ELK stack, Fluentd), and tracing tools (Jaeger, Zipkin) along with the OpenSource solutions like Prometheus, Grafana. Extensive experience with monitoring and observability solutions such as OpsRamp, Dynatrace, New Relic, must have worked with ITSM integration (e.g. integration with ServiceNow, BMC remedy, etc.) Working experience with RESTful APIs and understanding of API integration with the monitoring tools. 
Familiarity with AIOps and machine learning techniques for anomaly detection and incident prediction. Knowledge of ITIL processes and Service Management frameworks. Familiarity with security monitoring and compliance requirements. Excellent analytical and problem-solving skills, with the ability to debug and troubleshoot complex automation issues. Send CVs to angel@anveta.com

Posted 2 weeks ago

6.0 - 8.0 years

0 Lacs

Bengaluru / Bangalore, Karnataka, India

On-site

Foundit logo

Key Responsibilities: As a Tools SME for SolarWinds, Splunk, Dynatrace and DevOps tools, you will work on the design, setup and configuration of observability platforms with correlation, anomaly detection, visualization and dashboards, AIOps, and DevOps tool integration. Collaborate with DevOps architects, development teams and operations teams to understand their tool requirements and identify opportunities for optimizing the DevOps toolchain. Evaluate and recommend new tools and technologies that can enhance our DevOps capabilities, considering factors like cost, integration and local support. Lead the implementation, configuration and integration of various DevOps tools, including CI/CD platforms (e.g. Jenkins, GitLab CI, Azure DevOps), infrastructure as code (IaC) tools (e.g. Terraform, Ansible), containerization and orchestration tools (e.g. Docker, Kubernetes), monitoring and logging tools (e.g. Prometheus, Grafana, ELK stack), and testing frameworks. Establish standards and best practices for the usage and management of the DevOps toolset. Ensure the availability, performance and stability of the DevOps toolchain. Perform regular maintenance tasks, including upgrades, patching and backups of the DevOps tools. Provide technical support and troubleshooting assistance to development and operations teams regarding the usage of the DevOps tools. Monitor the health and performance of the toolset and implement proactive measures to prevent issues. Design and implement integrations between different tools in the DevOps pipeline to create seamless and automated workflows. Develop automation scripts and utilities to streamline tool provisioning, configuration and management within the environment. Work with development teams to integrate testing and security tools into the CI/CD pipeline. Technical Requirements: At least 6 years of experience in SolarWinds, Splunk, Dynatrace or a DevOps toolset. Proven experience with several key DevOps tools, including CI/CD platforms (e.g. Jenkins, GitLab CI, Azure DevOps), IaC tools (e.g. Terraform, Ansible), containerization (Docker, Kubernetes), and monitoring tools (e.g. Prometheus, Grafana, ELK stack). Good knowledge of the Linux environment. Good working knowledge of YAML and Python. Good working knowledge of event correlation and observability. Good communication skills. Good analytical and problem-solving skills. Additional Responsibilities: Besides professional qualifications, we place great importance on the candidate's personality profile. This includes high analytical skills, a high degree of initiative and flexibility, high customer orientation, high quality awareness, and excellent verbal and written communication skills. Preferred Skills: Technology->Dynatrace->Digital Performance Management Tool, Technology->Infra_ToolAdministration-Others->Solarwinds, Technology->Infra_ToolAdministration-Others->Splunk Admin, Technology->DevOps->DevOps Architecture Consultancy

Posted 2 weeks ago

12.0 - 15.0 years

30 - 35 Lacs

Bengaluru

Work from Office

Naukri logo

We are seeking a highly experienced and technically profound Cloud Application Architect to drive our cloud-first digital transformation initiatives. This pivotal role involves leading the design, development, and modernization of our enterprise application portfolio to deliver modern, scalable, secure, and business-aligned cloud-native solutions. The ideal candidate will possess a deep, hands-on technical background in application architecture, with a focus on transforming legacy systems into agile, customer-centric, and cloud-optimized experiences within either the Microsoft or Java enterprise stack. This role is critical for shaping our application landscape, ensuring robust end-to-end design, and guiding development teams through complex architectural challenges in a dynamic, cloud-first environment. Key Responsibilities As a Senior Cloud Application Architect, you will: Define Cloud-Native Application Architectures: Lead the definition, design, and implementation of comprehensive cloud-native application architectures and strategic modernization roadmaps for critical enterprise systems, primarily leveraging AWS EKS, Azure AKS, and serverless functions (e.g., AWS Lambda, Azure Functions). Own End-to-End Application Design: Hold ultimate accountability for the end-to-end application design, ensuring solutions meet stringent requirements for scalability (handling high transaction volumes), performance (low latency), robust security (integrating DevSecOps principles like SAST/DAST, Zero Trust), and high reliability (achieving stringent uptime targets). Guide Microservices/API Architecture & Containerization: Provide senior technical guidance and mentorship to multiple distributed project teams on advanced microservices and API-first design patterns, including choreography vs. orchestration, eventual consistency, and idempotent API design. 
Lead the adoption and implementation of Docker containerization and Kubernetes orchestration (AKS/EKS) for efficient application deployment and management. Develop Deployment & Operational Strategy: Define and enforce declarative deployment strategies (e.g., GitOps with ArgoCD/FluxCD). Design application-level disaster recovery and business continuity plans, including multi-region deployments with active-active/active-passive patterns and automated failover mechanisms. Collaborate Cross-Functionally: Collaborate extensively as a strategic partner with cross-functional teams including software developers (Java/.NET), product owners, business analysts, DevOps engineers, security specialists, and infrastructure teams. Translate complex business requirements into clear, actionable technical specifications. Lead Technical Design Sessions & Governance: Lead high-stakes technical design sessions, facilitate architecture review boards (ARB), and prepare comprehensive architectural documentation (e.g., Architecture Decision Records (ADRs), sequence diagrams, data flow diagrams) to ensure alignment, maintain architectural integrity, and govern new feature implementations. Support Build vs. Buy & Tool Selection: Actively support critical build vs. buy analyses for new functionalities. Evaluate, select, and champion various cloud services (PaaS, SaaS) and third-party tools (e.g., API Management gateways, caching solutions, message brokers) based on technical fit, business needs, and cost efficiency. Conduct and present Proof-of-Concepts (PoCs) for emerging technologies and strategic platform integrations. Drive DevSecOps & Observability Integration: Champion the integration of advanced DevSecOps practices, from "shift-left" security to automated CI/CD pipelines. Implement comprehensive application observability solutions (e.g., Prometheus, Grafana, Application Insights) to monitor SLOs/SLIs, diagnose performance issues, and proactively ensure system health. 
Optimize Application-Level Costs: Design and optimize application architectures to maximize cloud cost efficiency, leveraging serverless computing, right-sizing container workloads, and implementing intelligent autoscaling policies. Mentor & Foster Innovation: Mentor junior and mid-level developers and architects on cloud-native development best practices, application refactoring techniques, and effective utilization of cloud services. Explore and prototype the integration of emerging technologies (e.g., AI/ML, Generative AI) for intelligent features and digital workflow automation. Qualifications: Education: Bachelor's or Master's degree in Computer Science, Engineering, Information Technology, or a related field. Experience: 12+ years of progressive experience in application architecture, with a significant and demonstrable focus on cloud-native application design, digital-first transformations, and modernizing enterprise software. Application Development Background: Strong application background with hands-on experience in either the Microsoft (.NET Core, ASP.NET) or Java (Spring Boot, J2EE) enterprise/product software architecture. Cloud Platform Expertise: Proven experience delivering cloud-first solutions using public cloud platforms (AWS and Azure are preferred; GCP experience is a plus), with a deep understanding of their PaaS and IaaS offerings relevant to application development. Modern Application Design Principles: Deep knowledge and hands-on experience with microservices, API-driven development, event-driven architecture, serverless computing, and domain-driven design. Containerization & Orchestration: Expertise in Docker and Kubernetes (EKS, AKS), including deployment strategies and operational best practices for containerized applications. Agile, DevOps, & CI/CD: Strong understanding and practical experience with agile delivery models, comprehensive DevOps practices, and continuous integration/deployment (CI/CD) pipelines.
Communication & Stakeholder Management: Excellent communication, presentation, and stakeholder management skills, with a proven ability to bridge technical and business perspectives, and advise senior leadership. Leadership & Governance: Extensive experience in leading cross-functional development and architecture teams, managing architectural governance, and mentoring engineers in large-scale programs. Preferred Skills: Cloud Certifications: Relevant cloud certifications (e.g., AWS Certified Solutions Architect – Professional, Azure Solutions Architect Expert, Certified Kubernetes Application Developer - CKAD). Enterprise Architecture Frameworks: Knowledge of enterprise architecture frameworks (e.g., TOGAF) in the context of digital transformation. Observability Tools: Experience with comprehensive observability solutions for applications (e.g., Prometheus, Grafana, Datadog, Application Insights, distributed tracing tools like Jaeger). Security by Design: Direct experience implementing security best practices at the application architecture level (e.g., OWASP, threat modeling, secure coding standards). AI/ML Integration: Experience with integrating analytics, personalization, and AI/ML capabilities into application architectures. Low-Code/No-Code Platforms: Exposure to low-code/no-code development tools and digital workflow automation platforms.

Posted 2 weeks ago

8.0 - 12.0 years

2 - 11 Lacs

Bengaluru / Bangalore, Karnataka, India

On-site

Foundit logo

This might be a good fit for you if enabling people to do their best resonates with you, you love platform engineering, you want to build cool things with cool people, you love automating everything, and you love building high-impact tools and software that everyone depends on. What Your Responsibilities Will Be: Some areas of work are creating tools that smooth the journey from idea to running in production; learning and evangelizing best practices related to the build, test and deployment of software; and providing tools to our fellow engineers with a high degree of reliability and quality. What You'll Need to be Successful: Qualifications: Software Engineering: Understand software engineering fundamentals and have experience developing software among a team of engineers. Strong experience in the practice of testing. Build Automation: Experience getting artifacts in a variety of languages packaged and tested so that they can be trusted to go into production. Automatically. Release Automation: Experience in getting artifacts running in production in a reliable manner. Automatically. Observability: Experience with developing service level indicators and objectives, instrumenting software, and building meaningful alerts. Troubleshooting: A passion for tracking down technical root causes in distributed systems and software. Containers/Container Orchestration Systems: A solid understanding of how to manage and maintain container-based systems, especially on Kubernetes. Artificial Intelligence: A grounding in infrastructure for and the use of agentic systems. Infrastructure-as-Code: Experience with deploying and maintaining infrastructure as code with tools such as Terraform and Pulumi. Technical Writing: We will need to build documentation and diagrams for other engineering teams. Customer Satisfaction: Keen eye for customer satisfaction (our customers are other engineering teams and Avalara customers).
Passion for Learning : Interest in the broader technology space with a constant desire to expand your understanding. Adaptability: Experience working on a variety of projects. Preferred Qualifications GO : Our tooling is developed in GO Distributed Computing : Experience architecting, developing, and deploying distributed services across regions and clouds. GitLab : Experience in working with, managing, and deploying. Artifactory : Experience in working with, managing, and deploying. Technical Writing : writing technical documents that people love and adore. Open Source: Build side-projects or contribute to other open-source projects. Experience Minimum 8 years of experience in a SaaS environment Bachelors degree in computer science or equivalent Ability to participate in an on-call rotation Some areas of work are Creating tools that smooth the journey from idea to running in production Learning and evangelizing best practices related to the build, test and deployment of software Providing tools to our fellow engineers with a high degree of reliability and quality

Posted 2 weeks ago

Apply

8.0 - 13.0 years

3 - 12 Lacs

Hyderabad / Secunderabad, Telangana, India

On-site

Foundit logo

This might be a good fit for you if enabling people to do their best resonates with you, you love platform engineering, you want to build cool things with cool people, you love building high-impact tools and software that everyone depends on, and you love automating everything!
What Your Responsibilities Will Be
Some areas of work are: creating tools that smooth the journey from idea to running in production; learning and evangelizing best practices related to the build, test, and deployment of software; and providing tools to our fellow engineers with a high degree of reliability and quality.
What You'll Need to Be Successful
Qualifications
Software Engineering: Understand software engineering fundamentals and have experience developing software among a team of engineers. Strong experience in the practice of testing.
Build Automation: Experience getting artifacts in a variety of languages packaged and tested so that they can be trusted to go into production. Automatically.
Release Automation: Experience in getting artifacts running in production in a reliable manner. Automatically.
Observability: Experience with developing service level indicators and objectives, instrumenting software, and building meaningful alerts.
Troubleshooting: A passion for tracking down the technical root causes of problems in distributed systems and software.
Containers/Container Orchestration Systems: A solid understanding of how to manage and maintain container-based systems, especially on Kubernetes.
Artificial Intelligence: A grounding in the infrastructure for, and the use of, agentic systems.
Infrastructure-as-Code: Experience with deploying and maintaining infrastructure as code with tools such as Terraform and Pulumi.
Technical Writing: We will need to build documentation and diagrams for other engineering teams.
Customer Satisfaction: Keen eye for customer satisfaction (our customers are other engineering teams and Avalara customers).
Passion for Learning: Interest in the broader technology space with a constant desire to expand your understanding.
Adaptability: Experience working on a variety of projects.
Preferred Qualifications
Go: Our tooling is developed in Go.
Distributed Computing: Experience architecting, developing, and deploying distributed services across regions and clouds.
GitLab: Experience in working with, managing, and deploying.
Artifactory: Experience in working with, managing, and deploying.
Technical Writing: Writing technical documents that people love and adore.
Open Source: Build side projects or contribute to other open-source projects.
Experience
Minimum 8 years of experience in a SaaS environment. Bachelor's degree in computer science or equivalent. Ability to participate in an on-call rotation.

Posted 2 weeks ago

Apply

8.0 - 13.0 years

3 - 11 Lacs

Delhi, India

On-site

Foundit logo

This might be a good fit for you if enabling people to do their best resonates with you, you love platform engineering, you want to build cool things with cool people, you love building high-impact tools and software that everyone depends on, and you love automating everything!
What Your Responsibilities Will Be
Some areas of work are: creating tools that smooth the journey from idea to running in production; learning and evangelizing best practices related to the build, test, and deployment of software; and providing tools to our fellow engineers with a high degree of reliability and quality.
What You'll Need to Be Successful
Qualifications
Software Engineering: Understand software engineering fundamentals and have experience developing software among a team of engineers. Strong experience in the practice of testing.
Build Automation: Experience getting artifacts in a variety of languages packaged and tested so that they can be trusted to go into production. Automatically.
Release Automation: Experience in getting artifacts running in production in a reliable manner. Automatically.
Observability: Experience with developing service level indicators and objectives, instrumenting software, and building meaningful alerts.
Troubleshooting: A passion for tracking down the technical root causes of problems in distributed systems and software.
Containers/Container Orchestration Systems: A solid understanding of how to manage and maintain container-based systems, especially on Kubernetes.
Artificial Intelligence: A grounding in the infrastructure for, and the use of, agentic systems.
Infrastructure-as-Code: Experience with deploying and maintaining infrastructure as code with tools such as Terraform and Pulumi.
Technical Writing: We will need to build documentation and diagrams for other engineering teams.
Customer Satisfaction: Keen eye for customer satisfaction (our customers are other engineering teams and Avalara customers).
Passion for Learning: Interest in the broader technology space with a constant desire to expand your understanding.
Adaptability: Experience working on a variety of projects.
Preferred Qualifications
Go: Our tooling is developed in Go.
Distributed Computing: Experience architecting, developing, and deploying distributed services across regions and clouds.
GitLab: Experience in working with, managing, and deploying.
Artifactory: Experience in working with, managing, and deploying.
Technical Writing: Writing technical documents that people love and adore.
Open Source: Build side projects or contribute to other open-source projects.
Experience
Minimum 8 years of experience in a SaaS environment. Bachelor's degree in computer science or equivalent. Ability to participate in an on-call rotation.

Posted 2 weeks ago

Apply

6.0 - 9.0 years

8 - 11 Lacs

Pune

Work from Office

Naukri logo

We are hiring a DevOps / Site Reliability Engineer for a 6-month full-time onsite role in Pune (with possible extension). The ideal candidate will have 6-9 years of experience in DevOps/SRE roles with deep expertise in Kubernetes (preferably GKE), Terraform, Helm, and GitOps tools like ArgoCD or Flux. The role involves building and managing cloud-native infrastructure, CI/CD pipelines, and observability systems, while ensuring performance, scalability, and resilience. Experience in infrastructure coding, backend optimization (Node.js, Django, Java, Go), and cloud architecture (IAM, VPC, CloudSQL, Secrets) is essential. Strong communication and hands-on technical ability are musts. Immediate joiners only.

Posted 2 weeks ago

Apply

0.0 years

0 Lacs

Bengaluru / Bangalore, Karnataka, India

On-site

Foundit logo

Ready to shape the future of work? At Genpact, we don't just adapt to change; we drive it. AI and digital innovation are redefining industries, and we're leading the charge. Genpact's AI Gigafactory, our industry-first accelerator, is an example of how we're scaling advanced technology solutions to help global enterprises work smarter, grow faster, and transform at scale. From large-scale models to agentic AI, our breakthrough solutions tackle companies' most complex challenges. If you thrive in a fast-moving, tech-driven environment, love solving real-world problems, and want to be part of a team that's shaping the future, this is your moment.
Genpact (NYSE: G) is an advanced technology services and solutions company that delivers lasting value for leading enterprises globally. Through our deep business knowledge, operational excellence, and cutting-edge solutions, we help companies across industries get ahead and stay ahead. Powered by curiosity, courage, and innovation, our teams implement data, technology, and AI to create tomorrow, today. Get to know us at genpact.com and on LinkedIn, X, YouTube, and Facebook.
Inviting applications for the role of Senior Principal Consultant - Senior Data Engineer - Databricks, Azure & Mosaic AI
Role Summary: We are seeking a Senior Data Engineer with extensive expertise in Data & Analytics platform modernization using Databricks, Azure, and Mosaic AI. This role will focus on designing and optimizing cloud-based data architectures, leveraging AI-driven automation to enhance data pipelines, governance, and processing at scale.
Key Responsibilities:
- Architect and modernize Data & Analytics platforms using Databricks on Azure.
- Design and optimize Lakehouse architectures integrating Azure Data Lake, Databricks Delta Lake, and Synapse Analytics.
- Implement Mosaic AI for AI-driven automation, predictive analytics, and intelligent data engineering solutions.
- Lead the migration of legacy data platforms to a modern cloud-native Data & AI ecosystem.
- Develop high-performance ETL pipelines, integrating Databricks with Azure services such as Data Factory, Synapse, and Purview.
- Utilize MLflow and Mosaic AI for AI-enhanced data processing and decision-making.
- Establish data governance, security, lineage tracking, and metadata management across modern data platforms.
- Work collaboratively with business leaders, data scientists, and engineers to drive innovation.
- Stay at the forefront of emerging trends in AI-powered data engineering and modernization strategies.
Qualifications we seek in you!
Minimum Qualifications:
- Experience in Data Engineering, Cloud Platforms, and AI-driven automation.
- Expertise in Databricks (Apache Spark, Delta Lake, MLflow) and Azure (Data Lake, Synapse, ADF, Purview).
- Strong experience with Mosaic AI for AI-powered data engineering and automation.
- Advanced proficiency in SQL, Python, and Scala for big data processing.
- Experience in modernizing Data & Analytics platforms, migrating from on-prem to cloud.
- Knowledge of Data Lineage, Observability, and AI-driven Data Governance frameworks.
- Familiarity with Vector Databases and Retrieval-Augmented Generation (RAG) architectures for AI-powered data analytics.
- Strong leadership, problem-solving, and stakeholder management skills.
Preferred Skills:
- Experience with Knowledge Graphs (Neo4J, TigerGraph) for data structuring.
- Exposure to Kubernetes, Terraform, and CI/CD for scalable cloud deployments.
- Background in streaming technologies (Kafka, Spark Streaming, Kinesis).
Why join Genpact?
- Be a transformation leader: Work at the cutting edge of AI, automation, and digital innovation.
- Make an impact: Drive change for global enterprises and solve business challenges that matter.
- Accelerate your career: Get hands-on experience, mentorship, and continuous learning opportunities.
- Work with the best: Join 140,000+ bold thinkers and problem-solvers who push boundaries every day.
- Thrive in a values-driven culture: Our courage, curiosity, and incisiveness, built on a foundation of integrity and inclusion, allow your ideas to fuel progress.
Come join the tech shapers and growth makers at Genpact and take your career in the only direction that matters: Up. Let's build tomorrow together.
Genpact is an Equal Opportunity Employer and considers applicants for all positions without regard to race, color, religion or belief, sex, age, national origin, citizenship status, marital status, military/veteran status, genetic information, sexual orientation, gender identity, physical or mental disability or any other characteristic protected by applicable laws. Genpact is committed to creating a dynamic work environment that values respect and integrity, customer focus, and innovation. Furthermore, please do note that Genpact does not charge fees to process job applications and applicants are not required to pay to participate in our hiring process in any other way. Examples of such scams include purchasing a 'starter kit,' paying to apply, or purchasing equipment or training.

Posted 3 weeks ago

Apply

9.0 - 14.0 years

20 - 35 Lacs

Chennai, Bengaluru

Work from Office

Naukri logo

Dynatrace Specialist, 9+ Years
Location: Bangalore / Chennai
Company: HCLTech
Experience: 9 to 13 Years
Employment Type: Full-Time | Permanent
About the Role: HCLTech is seeking an experienced Dynatrace Specialist to join our IT Observability and AIOps team. The ideal candidate will be responsible for implementing, managing, and optimizing Dynatrace-based performance monitoring for enterprise applications.
Key Responsibilities:
- Deploy, configure, and maintain Dynatrace for end-to-end observability.
- Create custom dashboards, alerts, and synthetic monitoring.
- Troubleshoot application and infrastructure performance issues using Dynatrace insights.
- Collaborate with development and DevOps teams to enhance performance tuning.
- Integrate Dynatrace with ITSM, CI/CD, and other APM tools.
Required Skills:
- 9+ years of IT experience with a minimum of 3 years in Dynatrace (APM, DEM, RUM, Synthetic).
- Strong knowledge of application stacks (Java, .NET, Node.js, containers).
- Experience with Kubernetes, Docker, and cloud-native environments.
- Exposure to ServiceNow, AppDynamics, Splunk, or similar tools (preferred).
- Strong scripting and automation skills (Python, Shell, PowerShell preferred).
Preferred Certification: Dynatrace Associate/Professional Certification

Posted 3 weeks ago

Apply

3.0 - 5.0 years

0 Lacs

Bengaluru / Bangalore, Karnataka, India

On-site

Foundit logo

The candidate is expected to write good-quality C/C++ and Java code and should be able to develop the corresponding unit tests and automation. He/She must have hands-on experience with cloud-native technologies such as Docker, Kubernetes, monitoring, and observability. Additional skill sets include Perl and Python scripting, and DB and XML concepts. The candidate should be familiar with Agile methodology and the CI/CD process, and should have exposure to messaging frameworks like Kafka. The candidate should be able to understand requirements and deliver independently. Experience in the billing domain will be an added advantage. Career Level - IC2

Posted 1 month ago

Apply

Start Your Job Search Today

Browse through a variety of job opportunities tailored to your skills and preferences. Filter by location, experience, salary, and more to find your perfect fit.

Job Application AI Bot

Apply to 20+ Portals in one click

Download Now

Download the Mobile App

Instantly access job listings, apply easily, and track applications.

Featured Companies