Posted: 3 weeks ago
Platform: On-site
Full Time
Key Responsibilities:
- Develop and fine-tune LLMs (e.g., GPT-4, Claude, LLaMA, Mistral, Gemini) using instruction tuning, prompt engineering, chain-of-thought prompting, and fine-tuning techniques.
- Build RAG pipelines: implement Retrieval-Augmented Generation solutions leveraging embeddings, chunking strategies, and vector databases like FAISS, Pinecone, Weaviate, and Qdrant (see the illustrative sketch appended below).
- Implement and orchestrate agents: utilize frameworks like MCP, OpenAI Agent SDK, LangChain, LlamaIndex, Haystack, and DSPy to build dynamic multi-agent systems and serverless GenAI applications.
- Deploy models at scale: manage model deployment using HuggingFace, Azure Web Apps, vLLM, and Ollama, including handling local models with GGUF, LoRA/QLoRA, PEFT, and quantization methods (see the illustrative sketch appended below).
- Integrate APIs: seamlessly integrate with APIs from OpenAI, Anthropic, Cohere, Azure, and other GenAI providers.
- Ensure security and compliance: implement guardrails, perform PII redaction, ensure secure deployments, and monitor model performance using advanced observability tools.
- Optimize and monitor: lead LLMOps practices focusing on performance monitoring, cost optimization, and model evaluation.
- Work with AWS services: hands-on usage of AWS Bedrock, SageMaker, S3, Lambda, API Gateway, IAM, CloudWatch, and serverless computing to deploy and manage scalable AI solutions (see the illustrative sketch appended below).
- Contribute to use cases: develop AI-driven solutions like AI copilots, enterprise search engines, summarizers, and intelligent function-calling systems.
- Cross-functional collaboration: work closely with product, data, and DevOps teams to deliver scalable and secure AI products.

Required Skills and Experience:
- Deep knowledge of LLMs and foundational models (GPT-4, Claude, Mistral, LLaMA, Gemini).
- Strong expertise in prompt engineering, chain-of-thought reasoning, and fine-tuning methods.
- Proven experience building RAG pipelines and working with modern vector stores (FAISS, Pinecone, Weaviate, Qdrant).
- Hands-on proficiency in the LangChain, LlamaIndex, Haystack, and DSPy frameworks.
- Model deployment skills using HuggingFace, vLLM, and Ollama, including handling LoRA/QLoRA, PEFT, and GGUF models.
- Practical experience with AWS serverless services: Lambda, S3, API Gateway, IAM, CloudWatch.
- Strong coding ability in Python or similar programming languages.
- Experience with MLOps/LLMOps for monitoring, evaluation, and cost management.
- Familiarity with security standards: guardrails, PII protection, secure API interactions.
- Use case delivery experience: proven record of delivering AI copilots, summarization engines, or enterprise GenAI applications.

Experience:
- 6-8 years of experience in AI/ML roles, focusing on LLM agent development, data science workflows, and system deployment.
- Demonstrated experience in designing domain-specific AI systems and integrating structured/unstructured data into AI models.
- Proficiency in designing scalable solutions using LangChain and vector databases.

Job Type: Full-time
Pay: ₹2,000,000.00 - ₹3,000,000.00 per year
Benefits: Health insurance
Schedule: Monday to Friday
Work Location: In person
Quantanite
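The RAG pipeline responsibility above relies on chunking documents, embedding the chunks, and querying a vector index. Below is a minimal sketch of that retrieval step, assuming the sentence-transformers and faiss-cpu packages; the embedding model, chunking parameters, and placeholder corpus are illustrative assumptions, not requirements stated in this posting.

```python
# Minimal RAG retrieval sketch (assumptions: sentence-transformers + FAISS,
# illustrative model name and chunking parameters).
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

def chunk(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    # Naive fixed-size character chunking with overlap.
    return [text[i:i + size] for i in range(0, len(text), size - overlap)]

corpus = ["<enterprise document text goes here>"]        # placeholder corpus
chunks = [c for doc in corpus for c in chunk(doc)]

embedder = SentenceTransformer("all-MiniLM-L6-v2")       # assumed embedding model
vectors = embedder.encode(chunks, normalize_embeddings=True)

index = faiss.IndexFlatIP(vectors.shape[1])              # inner product on normalized vectors = cosine
index.add(np.asarray(vectors, dtype="float32"))

query = embedder.encode(["example user question"], normalize_embeddings=True)
scores, ids = index.search(np.asarray(query, dtype="float32"), 3)
context = "\n\n".join(chunks[i] for i in ids[0])         # top-k chunks passed to the LLM prompt
```

In production, a managed vector store such as Pinecone, Weaviate, or Qdrant would typically replace the in-memory FAISS index.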
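For the LoRA/QLoRA and PEFT items under model deployment, the sketch below shows how a LoRA adapter can be attached to a causal language model with the Hugging Face peft library; the base model name, rank, and target modules are assumptions chosen for illustration only.

```python
# Minimal LoRA fine-tuning setup sketch (assumed base model and adapter settings).
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = "mistralai/Mistral-7B-v0.1"                 # assumed base checkpoint
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

lora_cfg = LoraConfig(
    r=16,                                          # low-rank adapter dimension
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],           # attention projections to adapt
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()                 # only the adapter weights are trainable
# Training would then proceed with a standard Trainer/SFTTrainer loop; the adapter
# can later be merged or quantized (e.g., to GGUF) for local serving via Ollama or vLLM.
```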
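For the AWS serverless items (Lambda, API Gateway, Bedrock), the following is a minimal sketch of a Lambda handler that invokes a Bedrock-hosted model through boto3; the model ID and request body shape follow the Anthropic Messages format on Bedrock and are assumptions for illustration, not specifics from this posting.

```python
# Minimal serverless GenAI endpoint sketch: Lambda handler calling Amazon Bedrock.
import json
import boto3

bedrock = boto3.client("bedrock-runtime")

def lambda_handler(event, context):
    # Prompt arrives via an API Gateway proxy request body.
    prompt = json.loads(event.get("body", "{}")).get("prompt", "")
    response = bedrock.invoke_model(
        modelId="anthropic.claude-3-haiku-20240307-v1:0",   # assumed model ID
        contentType="application/json",
        body=json.dumps({
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": 512,
            "messages": [{"role": "user", "content": prompt}],
        }),
    )
    completion = json.loads(response["body"].read())
    return {"statusCode": 200, "body": json.dumps(completion)}
```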