Posted: 14 hours ago | Platform: LinkedIn


Work Mode: On-site

Job Type: Full Time

Job Description

Position Summary:


Knorr-Bremse, the world leader in braking systems and safety-critical mobility solutions, is building an AI Center of Excellence to accelerate innovation across its truck and rail businesses.


We are seeking a skilled ML/GenAI Engineer to be the operational backbone of our AI team. Your mission is to take the innovative models and prototypes developed by our scientists and transform them into production-grade, scalable, and highly efficient services.


This is a hands-on engineering role focused on the post-modeling lifecycle: optimization, deployment, and MLOps. You will be the expert on making our cutting-edge Generative AI systems (including LLMs and RAG pipelines) run fast, consume minimal resources, and operate reliably at scale. You will bridge the critical gap between experimental AI and robust, enterprise-ready applications.


If you are passionate about the engineering challenges of making large models performant and cost-effective in real-world environments, this role is for you.


Essential Functions:

  • Model Optimization and Inference Acceleration:

    Analyze and optimize inference scripts for low latency and high throughput. Implement advanced techniques to reduce model memory consumption and computational cost, such as quantization (e.g., bitsandbytes, GPTQ), pruning, and knowledge distillation. Leverage specialized serving runtimes and compilers (e.g., vLLM, TensorRT-LLM, Triton Inference Server) to maximize hardware utilization.
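To illustrate the memory-saving idea behind quantization mentioned above, here is a hand-rolled int8 sketch in plain Python, for intuition only. Libraries such as bitsandbytes or GPTQ operate on GPU tensors with far more sophistication; the function names below are invented for this example.

```python
# Illustrative 8-bit symmetric quantization: store weights as int8 values
# plus one float scale, instead of 32-bit floats (~4x memory saving).

def quantize_int8(weights):
    """Map floats to int8 range [-127, 127] with a single scale factor."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs else 1.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize_int8(q, scale):
    """Recover approximate float weights from the quantized values."""
    return [v * scale for v in q]

weights = [0.12, -0.53, 1.27, -1.27, 0.0]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)
# Each restored value is within half a quantization step of the original.
max_err = max(abs(a - b) for a, b in zip(weights, restored))
assert max_err <= scale / 2 + 1e-9
```

The same trade-off, in vastly more refined form, is what lets a 4-bit quantized LLM fit on a single GPU at a small accuracy cost.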


  • Deployment and Infrastructure Management:

    Containerize AI applications and their dependencies using Docker for reproducible deployments. Deploy models as scalable and resilient microservices on cloud platforms (AWS, GCP and/or Azure) using services like Kubernetes (GKE/AKS), Cloud Run, or Vertex AI Endpoints. Build and manage the cloud infrastructure required to support our AI services, using Infrastructure as Code (IaC) principles where applicable (e.g., Terraform).



  • MLOps and Automation:

    Design and implement robust CI/CD pipelines for automated model testing, building, and deployment. Develop comprehensive monitoring and logging solutions to track model performance, resource usage, and system health in production. Implement alerting mechanisms to proactively identify and address issues like model drift, performance degradation, or infrastructure failures.
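The drift-alerting idea described above can be reduced to a toy sketch: compare a rolling window of a production metric against a validation-time baseline and fire when it degrades. The class name, thresholds, and metric here are invented for illustration; a real deployment would express this as Prometheus alert rules or a cloud-native equivalent.

```python
# Toy drift alert: flag when the rolling mean of a production metric
# (e.g., model confidence) drops more than a tolerance below baseline.
from collections import deque

class DriftMonitor:
    def __init__(self, baseline, tolerance=0.05, window=100):
        self.baseline = baseline    # expected mean from validation
        self.tolerance = tolerance  # allowed absolute drop
        self.values = deque(maxlen=window)

    def record(self, value):
        """Record one observation; return True if an alert should fire."""
        self.values.append(value)
        mean = sum(self.values) / len(self.values)
        return mean < self.baseline - self.tolerance

monitor = DriftMonitor(baseline=0.90, tolerance=0.05, window=5)
healthy = [monitor.record(v) for v in [0.91, 0.89, 0.92]]   # near baseline
degraded = [monitor.record(v) for v in [0.70, 0.65, 0.60]]  # drifting down
assert not any(healthy)
assert degraded[-1]  # rolling mean has fallen below 0.85
```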


  • API and Service Development:

    Develop clean, well-documented, and efficient APIs (e.g., REST, gRPC) using frameworks like FastAPI or Flask to serve model predictions to downstream applications. Ensure the security, authentication, and scalability of the AI service endpoints.
  • Cross-Functional Collaboration:

    Work closely with AI scientists, product managers, software engineers, and UX designers to align AI solutions with business needs. Partner with the AI Scientist to understand model architecture and dependencies, providing feedback to improve its "deployability," and with the lead Generative AI Scientist to ensure that the production environment can meet the performance and cost requirements of their proposed solutions.


  • Ensuring Responsible & Secure AI:

    Implement AI solutions in line with ethical and security guidelines. The AI Engineer is expected to incorporate principles of Responsible AI – ensuring models are fair, transparent, and privacy-compliant. They must guard against risks (bias, data leakage, prompt injection attacks on agents, etc.) and follow security best practices for AI deployments. For instance, when deploying an agentic AI system that can take actions, the engineer will put guardrails in place to prevent unintended behaviors and to log decisions for auditability.
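The guardrail-plus-audit-log pattern for agentic systems mentioned above might be sketched as follows. The action names and allowlist policy are hypothetical, not from any particular agent framework.

```python
# Sketch of a guardrail for an agentic system: only allowlisted actions
# execute, and every decision is logged for auditability.
import datetime

ALLOWED_ACTIONS = {"search_docs", "summarize", "create_ticket"}  # hypothetical

audit_log = []

def guarded_execute(action, payload, executor):
    """Run `executor` only for allowlisted actions; log every decision."""
    allowed = action in ALLOWED_ACTIONS
    audit_log.append({
        "time": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "action": action,
        "allowed": allowed,
    })
    if not allowed:
        return {"status": "blocked", "action": action}
    return {"status": "ok", "result": executor(payload)}

result = guarded_execute("delete_database", {}, executor=lambda p: None)
assert result["status"] == "blocked"       # unintended action is stopped
assert audit_log[-1]["allowed"] is False   # and the decision is recorded
```

Production guardrails would add authorization, rate limits, and human-in-the-loop approval for sensitive actions, but the shape is the same: a policy check in front of every action, with a durable audit trail behind it.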



  • Continuous Learning & Improvement:

    Stay up to date with the latest AI research and emerging tools, and continuously improve AI models in production. AI Engineers track new developments (new model architectures, improved frameworks such as LangChain updates, etc.) and assess how these can enhance existing systems. They also gather user feedback and model performance data to iteratively refine solutions (e.g., fine-tuning an LLM for better accuracy or reducing latency by optimizing code).


Skills:

  • Strong Programming & Scripting: Expert-level Python; proficiency in shell scripting (Bash).
  • Cloud Expertise: Deep, hands-on experience with GCP (GKE, Cloud Run, Vertex AI, Cloud Build), Azure (AKS, Azure ML, Azure Functions, Azure DevOps), or AWS (EKS/ECS, SageMaker, Lambda, AWS Developer Tools). Experience in more than one is a major advantage.
  • MLOps & DevOps Tooling:

    Containerization & Orchestration: Docker (expert), Kubernetes (strongly required).

    CI/CD: Experience building pipelines with tools like GitHub Actions, Jenkins, Azure DevOps, or CircleCI.

    Infrastructure as Code: Familiarity with Terraform or similar tools is highly desirable.

    Monitoring: Experience with tools like Prometheus, Grafana, or cloud-native monitoring services (e.g., Google Cloud Monitoring, Azure Monitor).
  • Model Serving & Optimization: Practical experience with modern model serving frameworks (e.g., Triton, TorchServe, KServe, vLLM). Knowledge of model optimization techniques (quantization, pruning) is a significant plus.
  • API Development: Proficiency in building APIs with frameworks like FastAPI, Flask, or similar.


Behavioral Competencies:


  • Systems Thinker: Ability to see the entire system and understand how different components interact, from the model code to the underlying infrastructure.
  • Automation-First Mindset: A strong drive to automate repetitive tasks and build resilient, self-healing systems.
  • Performance and Efficiency Driven: A relentless focus on optimizing performance, latency, and cost.
  • Reliability and Ownership: A deep sense of responsibility for the stability and performance of systems in production.


Experience & Education:


  • Bachelor’s or Master’s Degree in Computer Science, Software Engineering, or a related technical field.
  • At least 3–5 years of hands-on experience in a DevOps, MLOps, or Software Engineering role, with a focus on deploying and managing machine learning systems in production.



Position Requirements:


  • Ability to travel up to 10% domestically and internationally.


What We Offer

  • Solve the most challenging engineering problems at the intersection of AI and cloud infrastructure.
  • Be the expert who makes cutting-edge AI research a practical reality at enterprise scale.
  • Work in a collaborative, high-impact team where your engineering skills are essential to the success of our AI strategy.
  • Gain deep expertise in the rapidly growing field of MLOps for Generative AI.
