Lead Generative AI Research Engineer

5 - 9 years

0 Lacs

Posted: 4 days ago | Platform: Shine

Work Mode

On-site

Job Type

Full Time

Job Description

You are an experienced Lead Generative AI Engineer responsible for training, optimizing, scaling, and deploying generative AI models, including large language models, voice/speech foundation models, vision models, and multimodal foundation models, using cutting-edge techniques and frameworks. Your role involves architecting and implementing state-of-the-art neural architectures and robust training and inference infrastructure to take complex models with billions of parameters to production efficiently while optimizing for low latency, high throughput, and cost efficiency.

Your key responsibilities include:
- Architecting and refining foundation model infrastructure to support optimized AI model deployment, with a focus on C/C++, CUDA, and kernel-level programming enhancements.
- Implementing optimization techniques such as quantization, distillation, sparsity, streaming, and caching to improve model performance.
- Spearheading the development of vision pipelines to ensure scalable training and inference workflows for foundation models with billions of parameters.
- Innovating on state-of-the-art architectures for panoptic segmentation, image classification, and image generation.
- Designing, developing, and innovating on state-of-the-art large multimodal models.
- Executing training and inference processes that minimize latency and maximize throughput on GPU clusters and custom hardware.
- Integrating and tailoring frameworks such as PyTorch, TensorFlow, DeepSpeed, Lightning, FSDP, and Habana for fast model training and inference.
- Enhancing post-deployment mechanisms with exhaustive testing, real-time monitoring, and robustness checks.
- Driving continuous improvement of deployed models with automated pipelines that detect drift and performance degradation.
- Leading model management, including version control, reproducibility, and lineage tracking.
- Cultivating a culture of high-performance computing and optimization within the AI/ML domain.

Qualifications:
- Ph.D. with 5+ years or MS with 8+ years of experience in ML Engineering, Data Science, or related fields.
- Demonstrated expertise in high-performance computing, with proficiency in Python, C/C++, CUDA, and kernel-level programming for AI applications.
- Extensive experience in optimizing training and inference for large-scale AI models.
- Understanding of diffusion models, variational autoencoders, Bayesian modelling, and reinforcement learning is beneficial.
- Experience building generative AI foundation models with billions of parameters.
- Proven success in deploying optimized ML systems at large scale using cloud infrastructure and GPU resources.
- In-depth understanding of and hands-on experience with advanced model optimization frameworks and MLOps tools.
- Familiarity with contemporary MLOps frameworks and their application in production environments.
- Strong grasp of state-of-the-art ML infrastructure, deployment strategies, and optimization methodologies.
- Innovative problem-solving skills and a collaborative mindset.
- Exceptional communication and team collaboration skills.
