LLM & ML Ops Engineer

4 - 9 years

25 - 35 Lacs

Posted: 1 day ago | Platform: Naukri

Work Mode: Remote

Job Type: Full Time

Job Description

Gainwell is seeking LLM Ops Engineers and ML Ops Engineers

Summary

LLM Ops Engineers and ML Ops Engineers

Your role in our mission

Core LLM Ops Responsibilities:

  • Develop and manage scalable deployment strategies specifically tailored for LLMs (GPT, Llama, Claude, etc.).
  • Optimize LLM inference performance, including model parallelization, quantization, pruning, and fine-tuning pipelines.
  • Integrate prompt management, version control, and retrieval-augmented generation (RAG) pipelines.
  • Manage vector databases, embedding stores, and document stores used in conjunction with LLMs.
  • Monitor hallucination rates and token usage, and optimize overall cost for LLM APIs or on-prem deployments.
  • Continuously monitor model performance and ensure an alerting system is in place.
  • Ensure compliance with ethical AI practices, privacy regulations, and responsible AI guidelines in LLM workflows.

Core ML Ops Responsibilities:

  • Design, build, and maintain robust CI/CD pipelines for ML model training, validation, deployment, and monitoring.
  • Implement version control, model registry, and reproducibility strategies for ML models.
  • Automate data ingestion, feature engineering, and model retraining workflows.
  • Monitor model performance and drift, and ensure proper alerting systems are in place.
  • Implement security, compliance, and governance protocols for model deployment.
  • Collaborate with Data Scientists to streamline model development and experimentation.

What we're looking for

  • Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
  • Strong experience with ML Ops tools (Kubeflow, MLflow, TFX, SageMaker, etc.).
  • Experience with LLM-specific tools and frameworks (LangChain, LangGraph, LlamaIndex, Hugging Face, OpenAI APIs, and vector databases such as Pinecone, FAISS, Weaviate, and Chroma DB).
  • Solid experience in deploying models in cloud (AWS, Azure, GCP) and on-prem environments.
  • Proficient in containerization (Docker, Kubernetes) and CI/CD practices.
  • Familiarity with monitoring tools like Prometheus, Grafana, and ML observability platforms.
  • Strong coding skills in Python and Bash, and familiarity with infrastructure-as-code tools (Terraform, Helm, etc.).
  • Knowledge of healthcare AI applications and regulatory compliance (HIPAA, CMS) is a plus.
  • Strong skills in LLM evaluation and testing tools such as Giskard and DeepEval.

What you should expect in this role

  • Fully Remote Opportunity – Work from anywhere in India.
  • Minimal Travel Required – Occasional travel opportunities (0-10%). 
  • Opportunity to Work on Cutting-Edge AI Solutions in a mission-driven healthcare technology environment. 

Gainwell Technologies

Information Technology and Services

Los Angeles

approximately 5,000 Employees

117 Jobs

Key People

  • Megan McMahon, CEO
  • Michael Behm, CFO
