We are seeking a highly capable MLOps Engineer to join our growing AI/ML Team. You will bridge the gap between data science and operations, ensuring that machine learning models are efficiently tested, deployed, monitored, and maintained in production environments. You will work closely with data scientists, software engineers, and infrastructure and development teams to build scalable and reliable ML infrastructure. Your work will be instrumental in supporting clinical decision-making, operational efficiency, quality outcomes, and patient care.
What You Will Be Doing:
Model Deployment and Infrastructure
- Design, build, and maintain scalable, secure ML pipelines for model training, validation, deployment, and monitoring
- Automate deployment workflows using CI/CD pipelines and infrastructure-as-code tools
- Partner with Infrastructure Teams to manage Azure cloud-based ML infrastructure, ensuring compliance with InfoSec and AI policies
- Keep applications running at peak efficiency by improving code efficiency, right-sizing infrastructure usage, and reducing system latency
Model Testing, Monitoring, and Validation
- Develop rigorous testing frameworks for ML models, including clinical validation, traditional model performance measures, population segmentation, and edge-case analysis
- Build monitoring systems to detect model drift, overfitting, data anomalies, and performance degradation in real time
- Continuously analyze model performance metrics and operational logs to identify improvement opportunities
- Translate monitoring insights into actionable recommendations for data scientists to improve model precision, recall, fairness, and efficiency
Model Transparency and Governance
- Maintain detailed audit trails, logs, and metadata for all model versions, training datasets, and configurations to ensure full traceability and support internal audits
- Ensure models meet transparency and explainability standards using tools like SHAP, LIME, or integrated explainability APIs
- Collaborate with data scientists and clinical teams to ensure models are interpretable, actionable, and aligned with practical applications
- Support corporate Compliance and AI Governance policies
- Advocate for best practices in ML engineering, including reproducibility, version control, and ethical AI
- Develop product guides, model documentation, and model cards for internal and external stakeholders
Required Qualifications
- Bachelor's degree in Computer Science, Machine Learning, Data Science, or a related field
- 2+ years of experience in MLOps, DevOps, or ML engineering
- Proficiency in Python and ML frameworks such as Keras, PyTorch, Scikit-Learn, TensorFlow, and XGBoost
- Experience with containerization (Docker), orchestration (Kubernetes), and CI/CD tools
- Familiarity with healthcare datasets and privacy regulations
- Strong analytical skills to interpret model performance data and identify optimization opportunities
- Proven ability to optimize application performance, including improving code efficiency, right-sizing infrastructure usage, and reducing system latency
- Experience implementing rollback strategies, including version control, rollback triggers, and safe deployment practices across lower and upper environments
- 2+ years of experience developing in a cloud environment (AWS, GCP, Azure)
- 2+ years of experience with GitHub, GitHub Actions, CI/CD, and source control
- 2+ years working within an Agile environment
Preferred Qualifications:
- Experience with MLOps platforms like MLflow, TFX, or Kubeflow
- Healthcare experience, particularly using administrative and prior authorization data
- Proven experience developing and deploying ML systems into production environments
- Experience working with Product, Engineering, Infrastructure, and Architecture teams
- Proficiency using Azure cloud-based services and infrastructure such as Azure MLOps
- Experience with feature flagging tools and strategies