MLOps Engineering Manager

5 - 7 years

5 - 7 Lacs

Gurgaon / Gurugram, Haryana, India

Posted: 1 month ago | Platform: Foundit

Work Mode

On-site

Job Type

Full Time

Job Description

Dynamic Yield, a Mastercard company, is dedicated to powering an inclusive, digital economy that benefits everyone, everywhere. Our SSO Data Science team, specifically the Horizontal Data Science Enablement Team, is looking for an MLOps Engineering Manager. This critical leadership role involves solving complex MLOps challenges, overseeing the entire organization's Databricks platform, building robust CI/CD and automation pipelines, and championing MLOps best practices. You'll lead the charge in optimizing the machine learning lifecycle, ensuring platform stability, and collaborating closely with data engineers, data scientists, and other key stakeholders to support their data processing and analytics needs.

All About You

As an MLOps Engineering Manager, you will:

Databricks Platform Leadership: Oversee the administration, configuration, and maintenance of Databricks clusters and workspaces for the entire organization. Continuously monitor Databricks clusters for high workloads or excessive usage costs, proactively alerting relevant stakeholders to address issues impacting overall cluster health. Implement and manage security protocols, including access controls and data encryption, to safeguard sensitive information in adherence with Mastercard standards. Facilitate the integration of various data sources into Databricks, ensuring seamless data flow and consistency. Identify and resolve issues related to Databricks infrastructure, providing timely support to users and stakeholders.

MLOps Solution Ownership: Bring deep MLOps expertise to the table, including but not limited to model monitoring, feature catalog/store, model lineage maintenance, and CI/CD pipelines that gatekeep the model lifecycle from development to production. Own and maintain MLOps solutions, either by leveraging open-source options or through third-party vendors. Build LLMOps pipelines using open-source solutions, recommend alternatives, and onboard new products to the solution.

Operational Excellence & Collaboration: Maintain services once they are live by measuring and monitoring availability, latency, and overall system health. Work closely with data engineers, data scientists, and other stakeholders to support their data processing and analytics needs. Maintain comprehensive documentation of Databricks configurations, processes, and best practices. Lead participation in security and architecture reviews of the infrastructure.

What Experience You Need

Education: Master's degree in computer science, software engineering, or a similar field.

Databricks Expertise: Strong experience with Databricks and its management of roles and resources.

Cloud & APIs: Experience with cloud technologies and operations, and with supporting APIs and cloud technologies.

MLOps Solutions: Experience with MLOps solutions like MLflow.

Data Skills: Experience with performing data analysis, data observability, data ingestion, and data integration.

DevOps/SRE Background: 5+ years of DevOps, SRE, or general systems engineering experience.

CI/CD Proficiency: 2+ years of hands-on experience with industry-standard CI/CD tools like Git/Bitbucket, Jenkins, Maven, Artifactory, and Chef.

Data Governance: Experience architecting and implementing data governance processes and tooling (such as data catalogs, lineage tools, role-based access control, PII handling).

Programming: Strong coding ability in Python or other languages like Java and C++, plus a solid grasp of SQL fundamentals.

Problem-Solving & Ownership: A systematic problem-solving approach, coupled with strong communication skills and a strong sense of ownership and drive.

What Could Set You Apart

SQL Tuning: Experience with SQL tuning.

Automation: Strong automation experience.

Data Observability: Strong data observability experience.

Operations: Operations experience in supporting highly scalable systems.

Global Operations: Ability to operate in a 24x7 environment encompassing global time zones.

Self-Motivation: Self-motivated, with the ability to solve software problems creatively while keeping modeling systems operational.
