Jobs
Interviews

337 Quantization Jobs

Set up a job alert
JobPe aggregates listings for easy access, but you apply directly on each employer's own job portal.

0 years

0 Lacs

Mohali district, India

On-site

Skill Sets:
- Expertise in ML/DL, model lifecycle management, and MLOps (MLflow, Kubeflow)
- Proficiency in Python, TensorFlow, PyTorch, Scikit-learn, and Hugging Face models
- Strong experience in NLP, fine-tuning transformer models, and dataset preparation
- Hands-on with cloud platforms (AWS, GCP, Azure) and scalable ML deployment (SageMaker, Vertex AI)
- Experience with containerization (Docker, Kubernetes) and CI/CD pipelines
- Knowledge of distributed computing (Spark, Ray), vector databases (FAISS, Milvus), and model optimization (quantization, pruning)
- Familiarity with model evaluation, hyperparameter tuning, and model monitoring for drift detection
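As a minimal illustration of the drift-detection monitoring this listing asks for, the sketch below compares a production window of a feature against the training-time (reference) window. The window values and the thresholds are invented for illustration; real monitoring stacks use richer statistics (PSI, KS tests) over many features.

```python
# Minimal sketch of mean-shift drift detection: measure how far the
# live feature mean has moved from the training-time mean, in units
# of the reference standard deviation. All values are illustrative.

def mean(xs):
    return sum(xs) / len(xs)

def std(xs):
    m = mean(xs)
    return (sum((x - m) ** 2 for x in xs) / len(xs)) ** 0.5

def drift_score(reference, live):
    """Absolute mean shift in units of the reference std deviation."""
    s = std(reference) or 1.0  # guard against a constant reference
    return abs(mean(live) - mean(reference)) / s

reference = [0.9, 1.1, 1.0, 0.95, 1.05]    # feature values at training time
stable    = [1.0, 0.98, 1.02, 1.01, 0.99]  # production window, no drift
shifted   = [2.1, 1.9, 2.0, 2.05, 1.95]    # production window, drifted

assert drift_score(reference, stable) < 1.0
assert drift_score(reference, shifted) > 3.0
```

In practice a score above a chosen threshold would trigger an alert or a retraining job rather than an assertion.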

Posted 5 hours ago

Apply

5.0 years

0 Lacs

Coimbatore, Tamil Nadu, India

Remote

Location: Coimbatore / Remote
Experience: Minimum 3–5 years
Type: Full-Time

About Xlorit
Xlorit, based in Coimbatore, is a premier digital solutions provider specializing in web, app, and UI/UX experiences tailored to client needs. Our mission is to simplify success by offering cutting-edge tools, agile expertise, and transparent partnerships. Join us to help businesses thrive by driving efficiency, creating exceptional customer experiences, and staying ahead in the digital era.

Role Overview
We are looking for a skilled, hands-on AI/ML Developer with strong expertise in model fine-tuning, AI system deployment, and hardware optimization. The role focuses on building and implementing AI models: you will work directly on training, optimizing, and deploying large AI models, with an emphasis on performance and scalability.

Key Responsibilities
- Assemble and configure GPU-based hardware environments (e.g., A100, H100, RTX series) for AI workloads.
- Deploy open-source and commercial AI models, including LLMs and SLMs, for high-throughput inference.
- Fine-tune models using techniques such as LoRA, QLoRA, PEFT, and instruction tuning.
- Prepare and preprocess training datasets, including formatting, tokenization, and data cleaning.
- Participate in the complete ML pipeline: training, validation, benchmarking, and evaluation.
- Use tools such as Hugging Face Transformers, vLLM, TGI, DeepSpeed, and Weights & Biases.

Required Qualifications
- 3–5 years of hands-on experience in AI/ML model development and deployment.
- Proficiency in Python and PyTorch.
- Experience with model training, fine-tuning, and hardware optimization.
- Familiarity with LLM architectures and transformer-based models.
- Knowledge of evaluation metrics (e.g., perplexity, BLEU, MMLU, QA-F1).
- Strong understanding of AI system performance tuning and memory-efficient inference.

Preferred (Nice to Have)
- Experience with RLHF pipelines, quantization (GGUF, GPTQ), or MLOps practices.
- Exposure to multi-modal models (text, vision, or audio).
- Experience using tools like FastAPI, Docker, or Triton Inference Server.

Ready to join? If you're passionate about building innovative solutions and thrive in a collaborative environment, we'd love to hear from you! Please submit your resume detailing your experience and why you would be a great fit for our team.
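As a small illustration of perplexity, one of the evaluation metrics this listing names: it is the exponential of the negative mean log-probability a model assigns to each token of a held-out sequence. The token log-probabilities below are invented, not real model output.

```python
import math

# Perplexity from per-token log-probabilities:
#   PPL = exp( -(1/N) * sum(log p_i) )
# Lower is better: a model that assigns p=0.5 to every token has
# perplexity 2; one that assigns p=0.1 has perplexity 10.

def perplexity(token_logprobs):
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

confident = [math.log(0.5)] * 4   # model assigns p=0.5 to every token
uncertain = [math.log(0.1)] * 4   # model assigns p=0.1 to every token

assert abs(perplexity(confident) - 2.0) < 1e-9
assert abs(perplexity(uncertain) - 10.0) < 1e-9
```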

Posted 5 hours ago

Apply

10.0 years

0 Lacs

Pune, Maharashtra, India

On-site

About Fusemachines
Fusemachines is a 10+ year old AI company dedicated to delivering state-of-the-art AI products and solutions to a diverse range of industries. Founded by Sameer Maskey, Ph.D., an Adjunct Associate Professor at Columbia University, our company is on a steadfast mission to democratize AI and harness the power of global AI talent from underserved communities. With a robust presence in four countries and a dedicated team of over 400 full-time employees, we are committed to fostering AI transformation journeys for businesses worldwide. At Fusemachines, we not only bridge the gap between AI advancement and its global impact but also strive to deliver the most advanced technology solutions to the world.

Role Overview:
We are seeking a highly skilled and motivated MLOps Engineer with a strong background in computer vision. In this role, you will be responsible for the full lifecycle of our machine learning models, from development and optimization to deployment and scaling. You will build and maintain the infrastructure that allows our cutting-edge computer vision algorithms to run reliably and efficiently in production. The ideal candidate has a deep understanding of both MLOps principles and 3D computer vision, with hands-on experience in containerization, model optimization, and scalable systems.

Key Responsibilities:
- Design, build, and maintain robust, scalable, and automated MLOps pipelines for model training, evaluation, and deployment (CI/CD for ML)
- Containerize machine learning applications using Docker for scalable and reproducible deployments
- Deploy and manage ML models at scale
- Optimize deep learning models for inference performance, including techniques like quantization, pruning, and distillation
- Work with and extend state-of-the-art AI models for tasks such as depth estimation, 6D object pose estimation, image and video segmentation, and dense point tracking and feature matching
- Develop and maintain monitoring systems to track model performance, detect data drift, and ensure the reliability of production systems
- Collaborate with AI researchers and software engineers to transition models from research to production
- Manage and optimize pipelines for processing large-scale 3D data, including point clouds, LiDAR, and stereoscopic imagery
- Apply a strong mathematical understanding of spatial transformations, rigid body rotations, and coordinate frame alignment to ensure algorithmic integrity in production

Required Qualifications:
- Bachelor's in Computer Science, Engineering, or a related field
- Proven experience in an MLOps, DevOps, or similar role with a focus on machine learning
- Strong programming skills in Python and/or C++
- A portfolio of projects or publications in computer vision or MLOps
- Hands-on experience with model optimization techniques (quantization, etc.) and frameworks (e.g., TensorRT, ONNX Runtime)
- Hands-on experience with containerization technologies
- Experience with CI/CD tools (e.g., Jenkins, GitLab CI, CircleCI) and version control (Git)
- Solid understanding of general and 3D computer vision principles
- Experience with deep learning frameworks such as PyTorch or TensorFlow
- A strong mathematical foundation in spatial transformations, rigid body rotations, coordinate frame alignment, and triangulation

Fusemachines is an Equal Opportunities Employer, committed to diversity and inclusion. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or any other characteristic protected by applicable federal, state, or local laws.

Posted 6 hours ago

Apply

4.0 years

0 Lacs

Kochi, Kerala, India

Remote

Job Overview
Codelynks is an IT consulting and services company helping businesses build innovative and scalable solutions. We are looking for a talented AI/ML Engineer with 2–4 years of experience to work on AI-enabling a SaaS application. If you have hands-on experience in AI/ML model development and a strong interest in applying these skills to real-world applications, we want to hear from you.

Key Responsibilities
- Assist in the design, development, and integration of AI/ML features into a SaaS application.
- Work on training, fine-tuning, and deploying large language models (LLMs) with embedding techniques.
- Develop Natural Language Processing (NLP) components, including tokenization, attention mechanisms, and sentiment analysis.
- Implement AI tasks such as text summarization, semantic search, and basic supervised/unsupervised learning models.
- Support optimization of AI/ML models through methods like quantization and pruning.
- Contribute to creating searchable vector databases from structured and unstructured data.
- Assist in building conversational AI solutions to improve user experience.
- Collaborate with senior engineers and cross-functional teams to deliver AI-enabled features.
- Monitor and test model performance, making improvements where necessary.

Required Skills and Experience
- 2–4 years of hands-on experience in AI/ML software development.
- Familiarity with LLMs and basic fine-tuning techniques.
- Experience with NLP frameworks and libraries (e.g., Hugging Face, spaCy, NLTK).
- Understanding of sentiment analysis, text summarization, and semantic search concepts.
- Basic knowledge of supervised and unsupervised learning approaches.
- Proficiency in Python for AI/ML development.
- Experience integrating AI models into applications.
- Ability to work with APIs and build simple chatbot or conversational AI features.

Preferred Qualifications
- Exposure to AI/ML applications in SaaS or cloud-based platforms.
- Understanding of vector databases and embedding search.
- Experience in agile or remote work environments.
- Strong problem-solving and analytical skills.

Key Competencies
- Willingness to learn and adapt quickly to new AI technologies.
- Ability to work collaboratively in a team environment.
- Attention to detail and focus on delivering quality work.
- Good communication skills to work effectively with technical and non-technical stakeholders.
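As a minimal illustration of the semantic-search step this listing describes: rank documents by cosine similarity of their embedding vectors against a query embedding. The 3-dimensional embeddings below are toy values standing in for real model output, and the document names are invented.

```python
import math

# Cosine similarity between two dense vectors; semantic search is
# "embed everything, then return the nearest documents by cosine".

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

docs = {
    "refund policy":  [0.9, 0.1, 0.0],
    "shipping times": [0.1, 0.9, 0.1],
    "privacy notice": [0.0, 0.1, 0.9],
}
query = [0.8, 0.2, 0.0]  # toy embedding of "how do I get my money back?"

best = max(docs, key=lambda name: cosine(query, docs[name]))
assert best == "refund policy"
```

A vector database replaces the `max` over a dict with an approximate nearest-neighbor index so the same lookup scales to millions of documents.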

Posted 12 hours ago

Apply

5.0 - 9.0 years

0 Lacs

karnataka

On-site

You will be joining an innovative company that is transforming retail checkout experiences by using cutting-edge computer vision technology to replace traditional barcodes. Our platform streamlines and accelerates checkout, elevating the shopping journey for both retailers and consumers. As we continue to expand rapidly, we are seeking an experienced Computer Vision Scientist to help shape the future of retail technology.

Your main responsibilities will include leading the research, design, and development of advanced computer vision models for tasks such as object detection, tracking, segmentation, OCR, scene understanding, and 3D vision. You will translate business requirements into scalable scientific solutions using state-of-the-art deep learning and classical computer vision techniques. Designing and executing experiments to assess the performance, robustness, and accuracy of computer vision models in real-world production settings will also be a key part of your role. Collaborating with cross-functional teams, including software engineering, product, and data teams, to integrate vision models into applications is essential. You will also drive innovation by generating internal intellectual property (patents, publications) and contributing to the long-term AI/ML roadmap. Providing scientific and technical leadership, mentoring junior scientists, and reviewing designs and architectures will be crucial. To stay at the forefront of the field, you will engage with the latest academic and industrial research in AI, deep learning, and computer vision.

Requirements:
- M.S. or Ph.D. in Computer Science, Electrical Engineering, or a related field with a focus on Computer Vision or Machine Learning.
- A minimum of 5 years of practical experience building and deploying production-grade computer vision models.
- A solid theoretical foundation and practical experience with deep learning frameworks such as PyTorch, and model architectures such as CNNs, Vision Transformers, and Diffusion Models.
- Experience handling large-scale datasets, training pipelines, and performance evaluation metrics.
- Proficiency in Python and scientific computing libraries such as NumPy, OpenCV, and scikit-learn.
- Familiarity with model optimization for edge deployment (ONNX, TensorRT, pruning/quantization) is advantageous.
- Excellent written and verbal communication skills, and a history of mentoring and collaborating with others.

Preferred qualifications:
- Experience with computer vision in real-time systems (e.g., AR/VR, robotics, automotive, surveillance).
- Research papers published in top-tier conferences (CVPR, ICCV, NeurIPS, etc.).
- Exposure to MLOps or the ML model lifecycle in production environments.
- Familiarity with cloud platforms (AWS/GCP/Azure), containerization tools (Docker, Kubernetes), and basic bash scripting.

Posted 18 hours ago

Apply

5.0 years

3 - 8 Lacs

Thiruvananthapuram

On-site

Job Requirements
- Design and develop AI/ML-based applications with a focus on deployment on embedded hardware platforms (e.g., Renesas RZ/V2H, NVIDIA Jetson, STM32)
- Port and optimize AI models for real-time performance on resource-constrained embedded systems
- Perform model quantization, pruning, and conversion (e.g., ONNX, TensorRT, TVM, TFLite, DRP-AI) for deployment
- Own end-to-end AI model lifecycle development, including data preparation, training, validation, and inference optimization
- Customize and adapt AI network architectures for specific edge AI use cases (e.g., object detection, classification, audio detection)
- Data preparation & preprocessing: collect, organize, and preprocess audio/image datasets

Work Experience
- Minimum 5 years of experience in AI/ML application development
- Strong Python programming skills, including AI frameworks such as PyTorch, TensorFlow, and Keras
- Solid experience developing deep learning-based solutions for computer vision, imaging, and audio
- Deep understanding of DL architectures such as CNNs and FCNs and their application to visual tasks
- Experience with model optimization techniques such as quantization, pruning, layer fusion, and INT8 calibration for edge inference
- Hands-on experience deploying AI models on embedded platforms
- Proficiency with tools such as OpenCV, ONNX, TVM, TFLite, or custom inference engines
- Understanding of system constraints such as memory, compute, and power on edge devices
- Exposure to real-time audio processing, video processing, and robotics
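As a framework-free sketch of the post-training quantization these requirements mention: map float weights onto 8-bit integers with a scale and zero-point, then dequantize at inference time. The weight values are illustrative; production flows would use a toolchain such as TFLite or TensorRT with calibration data rather than this hand-rolled version.

```python
# Minimal sketch of affine INT8 quantization:
#   q = clamp(round(w / scale) + zero_point), w ~ (q - zero_point) * scale

def quantize(weights, qmin=-128, qmax=127):
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # guard: constant weights
    zero_point = round(qmin - lo / scale)
    q = [max(qmin, min(qmax, round(w / scale) + zero_point)) for w in weights]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    return [(qi - zero_point) * scale for qi in q]

weights = [-1.0, -0.5, 0.0, 0.25, 1.0]
q, scale, zp = quantize(weights)
recovered = dequantize(q, scale, zp)

# Round-trip error is bounded by half a quantization step.
assert all(abs(w - r) <= scale / 2 + 1e-9 for w, r in zip(weights, recovered))
```

The memory win is the point on edge devices: each FP32 weight (4 bytes) becomes a single INT8 byte plus one shared scale/zero-point per tensor.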

Posted 1 day ago

Apply

3.0 years

6 - 8 Lacs

Gurgaon

On-site

Required Experience: 3 - 7 Years
Skills: LLM, model optimization, model evaluation + 4 more

Senior Software Engineer - AI (3-5 yrs of relevant experience)
AI Model Engineering Expertise: Deep proficiency in AI model engineering, including:
- Training and fine-tuning Vision Language Models (VLMs), large language models (LLMs), and other deep learning (DL) architectures.
- Model optimization, quantization, and deployment for latency and throughput.
- Model evaluation and monitoring in production environments.
- Creating small language models using distillation, pruning, etc.
- Accelerating AI training and inference with GPUs and NPUs on compute, edge, and mobile platforms with SoCs from NVIDIA, Qualcomm, etc.
AI Solution Development: Build and integrate AI capabilities (e.g., computer vision, NLP, recommendation systems) into production-grade web/mobile app software.
Scalability & Performance: Experience scaling AI systems for high-volume, real-time applications, leveraging cloud-native or edge-AI technologies.
Data-Driven Development: Strong understanding of data pipelines and feature engineering for AI applications.

Posted 1 day ago

Apply

8.0 years

2 - 8 Lacs

Gurgaon

On-site

Required Experience: 8 - 15 Years
Skills: LLM, model optimization, model evaluation + 4 more

AI Software Lead (7-10 yrs of relevant experience)
Production-Focused AI: Design, develop, and deploy scalable AI solutions, emphasizing MLOps best practices.
Model Expertise: Deep proficiency in AI model engineering, including:
- Training and fine-tuning large language models (LLMs) and other deep learning architectures.
- Model optimization, quantization, and deployment for latency and throughput.
- Model evaluation and monitoring in production environments.
AI Solution Development: Build and integrate AI capabilities (e.g., computer vision, NLP, recommendation systems) into production-grade web/mobile app software.
Scalability & Performance: Experience scaling AI systems for high-volume, real-time applications, leveraging cloud-native or edge-AI technologies.
Data-Driven Development: Strong understanding of data pipelines and feature engineering for AI applications.

Posted 1 day ago

Apply

3.0 - 5.0 years

13 - 15 Lacs

Pune, Maharashtra, India

On-site

About The Opportunity
We are a high-growth enterprise AI platform provider in the cloud services & SaaS sector, modernizing data pipelines and automating knowledge work for Fortune 500 clients. Our hybrid teams in Pune and Mumbai build production-grade generative AI solutions on Microsoft Azure, enabling real-time insights, intelligent agents, and scalable RAG applications with robust security and responsible-AI guardrails.

Role & Responsibilities
- Architect, prototype, and deploy GenAI applications (LLMs, RAG, multimodal) on Azure OpenAI, Cognitive Search, and Kubernetes-based microservices.
- Build and orchestrate agentic frameworks (LangChain, AutoGen) for multi-agent reasoning, tool-calling, and end-to-end workflow automation.
- Engineer low-latency, high-throughput data and prompt pipelines using Azure Data Factory, Event Hub, and Cosmos DB.
- Optimize model performance and cost via fine-tuning, quantization, and scalable caching on Azure ML and AKS.
- Implement production-grade CI/CD, observability (App Insights, Prometheus), security, and responsible-AI guardrails.
- Collaborate cross-functionally with product, design, and customer success teams to deliver measurable business impact.

Skills & Qualifications
Must-Have
- 3-5 years of hands-on Generative AI/LLM engineering (GPT, Llama 2, Claude) with at least one solution in production.
- Proficiency with Microsoft Azure services: Azure OpenAI, Functions, Data Factory, Cosmos DB, AKS.
- Strong Python & TypeScript skills with experience in agentic frameworks (LangChain, AutoGen, Semantic Kernel) and REST/GraphQL APIs.
- Solid foundation in cloud MLOps: Docker, Helm, Terraform/Bicep, GitHub Actions or Azure DevOps.
- Proven ability to optimize end-to-end GenAI pipelines for performance, cost efficiency, and reliability.
Preferred
- Experience scaling GenAI pipelines to >10K QPS using vector databases (Pinecone, Qdrant) and distributed caching.
- Familiarity with prompt engineering, fine-tuning methodologies, and retrieval-augmented generation best practices.
- Knowledge of Kubernetes operators, Dapr, and service mesh patterns for resilient microservices.

Benefits & Culture Highlights
- Competitive salary and flexible hybrid work model in Pune and Mumbai offices.
- Rapid career growth within a pioneering AI leadership team.
- Collaborative, innovation-driven culture emphasizing ethical and responsible AI.

Skills: Generative AI, Azure, Python, LLMs, SQL Azure, agentic frameworks, LangGraph, AutoGen, CI/CD, Kubernetes, Microsoft Azure, Cloud

Posted 1 day ago

Apply

3.0 years

0 Lacs

Gurugram, Haryana, India

On-site

Job Title: Applied AI Builder (AI/GenAI/ML Engineer)
Experience: 3+ Years
Location: Gurugram (Work from Office)
Availability: Immediate Joiner

About the Role:
We are seeking a passionate, hands-on Applied AI Builder to join our team and drive innovation at the intersection of Generative AI, machine learning, and cloud technologies. This role is ideal for someone with a strong foundation in ML/AI, practical experience with LLMs and GenAI frameworks, and a desire to solve real-world problems using cutting-edge tools and techniques. As an Applied AI Builder, you'll design, develop, and deploy AI solutions across multiple domains, leveraging state-of-the-art models and frameworks. You will collaborate with cross-functional teams to build scalable applications that integrate GenAI, Retrieval-Augmented Generation (RAG), and vector-based search.

Key Responsibilities:
- Design and develop AI/GenAI applications using LLMs and frameworks like LangChain and Haystack.
- Implement Retrieval-Augmented Generation (RAG) pipelines using vector databases and semantic search techniques.
- Build, fine-tune, and deploy ML and DL models using PyTorch, TensorFlow, or JAX.
- Develop robust, production-ready Python code with reusable and scalable modules.
- Optimize and deploy models on cloud platforms such as AWS, GCP, or Azure.
- Collaborate with data engineers, product managers, and designers to deliver AI-driven features and services.
- Stay current with the latest research and advancements in GenAI, LLMs, and emerging ML technologies.

Required Skills & Experience:
- 3+ years of hands-on experience in machine learning, deep learning, and Python programming.
- Strong exposure to Generative AI and working with large language models (LLMs).
- Experience with prompt engineering, LangChain, or Haystack.
- Proficiency with vector databases (e.g., FAISS, Pinecone, Weaviate, Milvus).
- Solid understanding of RAG-based architectures and semantic search.
- Familiarity with cloud platforms (AWS, GCP, or Azure) and deploying AI models at scale.
- Experience with at least one deep learning framework: PyTorch, TensorFlow, or JAX.

Good to Have:
- Experience building AI-based chatbots or intelligent agents.
- Knowledge of model optimization techniques (quantization, pruning, distillation).
- Exposure to MLOps tools and workflows for scalable AI deployment.
- Published work, open-source contributions, or participation in AI competitions (e.g., Kaggle, Hugging Face).

Why Join Us?
- Work at the forefront of AI and Generative AI innovation.
- Build impactful, production-grade solutions that are deployed and used at scale.
- Collaborate with a highly skilled, passionate team in a fast-paced, agile environment.
- Opportunity for rapid growth and work on real-world AI applications across industries.
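As a minimal illustration of magnitude pruning, one of the optimization techniques listed above: zero out the fraction of weights with the smallest absolute value, keeping the rest. The weight values and sparsity target are invented for illustration; real pipelines prune per-layer and usually fine-tune afterwards to recover accuracy.

```python
# Minimal sketch of unstructured magnitude pruning: drop (set to 0.0)
# the smallest-magnitude fraction of weights in a tensor.

def prune(weights, sparsity):
    """Return weights with the smallest-|w| `sparsity` fraction zeroed."""
    k = int(len(weights) * sparsity)  # number of weights to drop
    if k == 0:
        return list(weights)
    threshold = sorted(abs(w) for w in weights)[k - 1]
    return [0.0 if abs(w) <= threshold else w for w in weights]

weights = [0.9, -0.05, 0.4, 0.01, -0.7, 0.02, -0.3, 0.1]
pruned = prune(weights, sparsity=0.5)

assert pruned.count(0.0) == 4                 # half the weights dropped
assert 0.9 in pruned and -0.7 in pruned       # large weights survive
```

The zeroed weights can then be stored sparsely or skipped at inference time, which is where the speed and memory savings come from.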

Posted 1 day ago

Apply

0 years

13 - 15 Lacs

Pune, Maharashtra, India

On-site

About The Opportunity
We're a fast-growing enterprise AI platform provider in the cloud services & software (SaaS) sector, helping Fortune 500 clients modernize data pipelines, automate knowledge work, and unlock new revenue with Generative AI. Backed by a deep bench of AI researchers and cloud architects, we build scalable, production-grade solutions on Microsoft Azure. Join our Pune-based hybrid team to shape the next generation of agentic AI products.

Role & Responsibilities
- Design, prototype, and deploy GenAI applications (LLMs, RAG, multimodal) on Azure OpenAI, Cognitive Search, and Kubernetes-based microservices.
- Build and orchestrate agentic frameworks (LangGraph / AutoGen) to enable multi-agent reasoning, tool-calling, and workflow automation at scale.
- Engineer robust data & prompt pipelines using Azure Data Factory, Event Hub, and Cosmos DB, ensuring low-latency, high-throughput inference.
- Optimize model performance & cost via fine-tuning, quantization, and scalable caching on Azure ML and AKS.
- Harden solutions for production with end-to-end CI/CD, observability (App Insights, Prometheus), and security & responsible-AI guardrails.
- Collaborate cross-functionally with product managers, designers, and customer success to deliver measurable business impact.

Skills & Qualifications
Must-Have
- 3-5 yrs hands-on in Generative AI / LLM engineering (GPT, Llama 2, Claude, etc.) with at least one product in production.
- Proven expertise in Microsoft Azure services: Azure OpenAI, Functions, Data Factory, Cosmos DB, AKS.
- Strong Python/TypeScript with agentic frameworks (LangChain, AutoGen, Semantic Kernel) and REST/GraphQL APIs.
- Solid grounding in cloud MLOps: Docker, Helm, Terraform/Bicep, GitHub Actions or Azure DevOps.
Preferred
- Experience benchmarking and scaling pipelines to >10K QPS using vector DBs (Qdrant, Pinecone) and distributed caching.
- Familiarity with prompt engineering, fine-tuning, and retrieval-augmented generation (RAG) best practices.
- Knowledge of Kubernetes operators, Dapr, and service meshes for fault-tolerant microservices.

Skills: Generative AI, Azure, Python, LLMs, SQL Azure, agentic frameworks, LangGraph, AutoGen, CI/CD, Kubernetes, Microsoft Azure, Cloud

Posted 1 day ago

Apply

10.0 - 12.0 years

0 Lacs

Bengaluru, Karnataka, India

On-site

Job Description
Organization: At CommBank, we never lose sight of the role we play in other people's financial wellbeing. Our focus is to help people and businesses move forward and progress: to make the right financial decisions and achieve their dreams, targets, and aspirations. Regardless of where you work within our organisation, your initiative, talent, ideas, and energy all contribute to the impact that we can make with our work. Together we can achieve great things.

Job Title: Senior Data Scientist
Location: Bangalore
Business & Team: BB Advanced Analytics and Artificial Intelligence COE

Impact & Contribution:
As a Senior Data Scientist, you will be instrumental in pioneering GenAI and multi-agentic systems at scale within CommBank. You will architect, build, and operationalize advanced generative AI solutions, leveraging large language models (LLMs), collaborative agentic frameworks, and state-of-the-art toolchains. You will drive innovation, helping set the organizational strategy for advanced AI, multi-agent collaboration, and responsible next-gen model deployment.

Roles & Responsibilities:
- GenAI Solution Development: Lead end-to-end development, fine-tuning, and evaluation of state-of-the-art LLMs and multi-modal generative models (e.g., transformers, GANs, VAEs, diffusion models) tailored for financial domains.
- Multi-Agentic System Engineering: Architect, implement, and optimize multi-agent systems, enabling swarms of AI agents (using frameworks like LangChain, LangGraph, and MCP) to dynamically collaborate, chain, reason, critique, and autonomously execute tasks.
- LLM-Backed Application Design: Develop robust, scalable GenAI-powered APIs and agent workflows using FastAPI, Semantic Kernel, and orchestration tools. Integrate observability and evaluation using Langfuse for tracing, analytics, and prompt/response feedback loops.
- Guardrails & Responsible AI: Employ frameworks like Guardrails AI to enforce robust safety, compliance, and reliability in LLM deployments. Establish programmatic checks for prompt injections, hallucinations, and output boundaries.
- Enterprise-Grade Deployment: Productionize and manage at-scale GenAI and agent systems on cloud infrastructure (GCP/AWS/Azure), using model optimization (quantization, pruning, knowledge distillation) for latency/throughput trade-offs.
- Toolchain Innovation: Leverage and contribute to open-source projects in the GenAI ecosystem (e.g., LangChain, LangGraph, Semantic Kernel, Langfuse, Hugging Face, FastAPI). Continuously experiment with emerging frameworks and research.
- Stakeholder Collaboration: Partner with product, engineering, and business teams to define high-impact use cases for GenAI and agentic automation; communicate actionable technical strategies and drive proof-of-value experiments into production.
- Mentorship & Thought Leadership: Guide junior team members in best practices for GenAI, prompt engineering, agentic orchestration, responsible deployment, and continuous learning. Represent CommBank in the broader AI community through papers, patents, talks, and open source.

Essential Skills:
- 10+ years of hands-on experience in machine learning, deep learning, or Generative AI, including practical expertise with LLMs, multi-agent frameworks, and prompt engineering.
- Proficient in building and scaling multi-agent AI systems using LangChain, LangGraph, Semantic Kernel, MCP, or similar agentic orchestration tools.
- Advanced experience developing and deploying GenAI APIs using FastAPI; operational familiarity with Langfuse for LLM evaluation, tracing, and error analytics.
- Experience with transformer architectures (BERT, GPT, etc.), fine-tuning LLMs, and model optimization (distillation/quantization/pruning).
- Experience integrating open and commercial LLM APIs and building retrieval-augmented generation (RAG) pipelines.
- Familiarity with robust experimentation using tools like LangSmith, GitHub Copilot, and experiment tracking systems.
- Papers, patents, or open-source contributions to the GenAI/LLM/agentic AI ecosystem.
- Experience with financial services or regulated industries for secure and responsible deployment of AI.

Education Qualifications: Bachelor's or Master's degree in Computer Science, Engineering, or Information Technology.

Posted 1 day ago

Apply

7.0 years

0 Lacs

Hyderabad, Telangana, India

On-site

Job Description YOUR IMPACT Are you passionate about leveraging cutting-edge AI/ML techniques, including Large Language Models, to solve complex, mission-critical problems in a dynamic environment? Do you want to contribute to safeguarding a leading global financial institution? OUR IMPACT We are Compliance Engineering, a global team of engineers and scientists dedicated to preventing, detecting, and mitigating regulatory and reputational risks across Goldman Sachs. We build and operate a suite of platforms and applications that protect the firm and its clients. We Offer Access to petabyte scaleof structured and unstructured data to fuel your AI/ML models, including textual data suitable for LLM applications. The opportunity to work with state-of-the-art LLM models and agentic framework. A collaborative environment where you can learn from and contribute to a team of experienced engineers and scientists. The chance to make a tangible impact on the firm's ability to manage risk and maintain its reputation. Within Compliance Engineering, we are seeking an experienced AI/ML Engineer to join our Engineering team. This role will focus on solving highly complex business problems using AI/ML techniques, incorporating latest emerging trends om building out vertical AI agents to run on data at massive scale. How You Will Fulfill Your Potential As a member of our team, you will: Design and architect scalable and reliable end-to-end AI/ML solutions specifically tailored for compliance applications, ensuring adherence to relevant regulatory requirements. This encompasses the development and implementation of GenAI-driven solutions, including agentic frameworks for automating compliance processes, RAG pipelines, and the creation and utilization of embeddings for compliance knowledge bases. Explore diverse AI/ML problems, such as model fine-tuning, prompt engineering, and experimentation with different algorithmic approaches to address novel business challenges. 
Develop, test, and maintain high-quality, production-ready code. Lead technical projects from inception to completion, providing guidance and mentorship to junior engineers. Collaborate effectively with compliance officers, legal counsel, and other stakeholders to understand business requirements and translate them into technical solutions. Participate in code reviews to ensure code quality, maintainability, and adherence to coding standards. Promote best practices for AI/ML development, including version control, testing, and documentation. Stay current with the latest advancements in AI/ML platforms, tools, and techniques to solve business problems. Qualifications A successful candidate will possess the following attributes: A Bachelor's, Master's or PhD degree in Computer Science, Machine Learning, Mathematics, or a similar field of study. Preferably 7+ years AI/ML industry experience for Bachelor’s/Masters, 4+ years for PhD with a focus on Language Models. Strong foundation in machine learning algorithms, including deep learning architectures (e.g., transformers, RNNs, CNNs) Proficiency in Python and relevant libraries/frameworks such as TensorFlow, PyTorch, Hugging Face Transformers, scikit-learn. Demonstrated expertise in GenAI techniques, including but not limited to Retrieval-Augmented Generation (RAG), model fine-tuning, prompt engineering, AI agents, and evaluation techniques. Experience working with embedding models and vector databases. Experience with MLOps practices, including model deployment, containerization (Docker, kubernetes), CI/CD, and model monitoring. Strong verbal and written communication skills. Curiosity, ownership and willingness to work in a collaborative environment. Proven ability to mentor and guide junior engineers. Experience in some of the following is desired and can set you apart from other candidates: Experience with Agentic Frameworks (e.g., Langchain, AutoGen) and their application to real-world problems. 
Understanding of scalability and performance optimization techniques for real-time inference, such as quantization, pruning, and knowledge distillation.
Experience with model interpretability techniques.
Prior experience in code reviews/architecture design for distributed systems.
Experience with data governance and data quality principles.

About Goldman Sachs
At Goldman Sachs, we commit our people, capital and ideas to help our clients, shareholders and the communities we serve to grow. Founded in 1869, we are a leading global investment banking, securities and investment management firm. Headquartered in New York, we maintain offices around the world. We believe who you are makes you better at what you do. We're committed to fostering and advancing diversity and inclusion in our own workplace and beyond by ensuring every individual within our firm has a number of opportunities to grow professionally and personally, from our training and development opportunities and firmwide networks to benefits, wellness and personal finance offerings and mindfulness programs. Learn more about our culture, benefits, and people at GS.com/careers. We're committed to finding reasonable accommodations for candidates with special needs or disabilities during our recruiting process. Learn more: https://www.goldmansachs.com/careers/footer/disability-statement.html
© The Goldman Sachs Group, Inc., 2023. All rights reserved. Goldman Sachs is an equal employment/affirmative action employer Female/Minority/Disability/Veteran/Sexual Orientation/Gender Identity
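To make the inference-optimization techniques named above concrete, here is a minimal, framework-free sketch of symmetric INT8 quantization (the quantize/dequantize round-trip behind quantized serving). It is an illustration only; production work would use a framework's tooling (e.g. PyTorch or ONNX quantization), not hand-rolled code.

```python
# Hedged sketch: symmetric INT8 quantization of a weight vector, pure Python.
# Shows why quantization shrinks models at a bounded accuracy cost: each float
# is replaced by an int8 plus one shared scale, with error <= scale / 2.

def quantize_int8(weights):
    """Map floats to int8 range [-127, 127] with a single symmetric scale."""
    scale = max(abs(w) for w in weights) / 127 or 1.0  # avoid zero scale
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate floats from the int8 codes."""
    return [x * scale for x in q]

weights = [0.42, -1.3, 0.07, 0.99]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
# Round-trip error stays within half a quantization step per weight.
```

The same idea extends per-channel or per-tensor in real deployments; knowledge distillation and pruning attack model size from different angles (fewer/smaller weights rather than coarser ones).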

Posted 2 days ago

Apply

0 years

0 Lacs

Gurugram, Haryana, India

On-site

Position: AI/ML Engineer - GenAI Stack Hacker

Who You Are
You're a tool-native LLM hacker who ships AI systems faster than most people write specs. You build with GPT-4o, LangChain, Hugging Face, and Together.ai like it's second nature. You don't just use AI APIs — you orchestrate them, fine-tune them, and make them work together intelligently.

What You'll Build
Unified Model Router: LangChain-style orchestration layer that seamlessly switches between Together.ai, OpenAI, self-hosted HF models, and custom fine-tuned models
Intelligent Cost Optimizer: ML-driven system that routes requests based on complexity, cost, speed, and quality requirements
Storytelling AI Pipeline: Prompt templates, context managers, and narrative consistency engines specifically for interactive storytelling
Model Evaluation Framework: Systematic testing for story quality, character consistency, and user engagement across different models
Custom Fine-tuning Pipeline: Data prep, LoRA training, and deployment for UIXBridge-specific storytelling models

Tech Stack You'll Master
Core: Python, FastAPI, async workflows, WebSockets
AI Orchestration: LangChain, Semantic Kernel, custom routing logic
Models: Transformers, Diffusers, PEFT, HuggingFace Hub, Together.ai APIs
Infrastructure: Replicate, RunPod, modal.com for self-hosted models
Prompt Engineering: Advanced templating, few-shot learning, chain-of-thought reasoning

You Should Know
How to build production AI systems that don't break when APIs go down
Prompt engineering for creative writing and narrative consistency
Model deployment, quantization, and GPU optimization
How to make AI feel magical in user-facing applications
Cost optimization across multiple AI providers

Bonus Points
Built AI storytelling tools, writing assistants, or creative agents
Experience with RLHF, constitutional AI, or safety alignment
Open source contributions to AI tooling or model repos

Day 1 Goals
Audit current AI integration and identify optimization opportunities
Build unified model calling framework with intelligent fallbacks
Implement cost tracking and optimization across all AI providers

Apply with: GitHub showing AI experiments, deployed AI tools, or hackathon wins. Show us something cool you built with LLMs.
Please Note: This will be a part-time contractual role
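The "unified model router with intelligent fallbacks" idea can be sketched in a few lines. This is a toy illustration under stated assumptions: the provider names, costs, and the `complete` interface are all invented stand-ins, not real APIs.

```python
# Hedged sketch of cost-aware routing with fallback across LLM providers.
# Providers are simulated; a real router would wrap actual SDK clients.

class Provider:
    def __init__(self, name, cost_per_1k, healthy=True):
        self.name, self.cost_per_1k, self.healthy = name, cost_per_1k, healthy

    def complete(self, prompt):
        if not self.healthy:
            raise ConnectionError(f"{self.name} is down")
        return f"[{self.name}] response to: {prompt}"

class ModelRouter:
    """Try providers cheapest-first; fall back when one fails."""
    def __init__(self, providers):
        self.providers = sorted(providers, key=lambda p: p.cost_per_1k)

    def complete(self, prompt):
        errors = []
        for p in self.providers:
            try:
                return p.complete(prompt)
            except ConnectionError as e:
                errors.append(str(e))
        raise RuntimeError("all providers failed: " + "; ".join(errors))

router = ModelRouter([
    Provider("together", cost_per_1k=0.2, healthy=False),  # cheapest, but down
    Provider("openai", cost_per_1k=2.0),
    Provider("self-hosted", cost_per_1k=0.5),
])
print(router.complete("hello"))  # falls back to the cheapest healthy provider
```

A production version would add the factors the posting lists (complexity, speed, quality) to the sort key and track spend per provider.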

Posted 2 days ago

Apply

6.0 years

0 Lacs

Jamshedpur, Jharkhand, India

Remote

🌟 Position Overview
We are hiring an experienced Python Developer with deep expertise in AI automation, workflow orchestration, and custom model development to build next-generation healthcare applications. This role is ideal for developers who love training, modifying, and deploying production-grade AI systems while navigating the challenges of compliance, data governance, and real-time performance in healthcare environments.

🧠 Key Responsibilities
🤖 AI Workflow Automation & Orchestration
Build intelligent automation pipelines for clinical workflows
Orchestrate real-time and batch AI workflows using Airflow, Prefect, or Dagster
Develop event-driven architectures and human-in-the-loop validation layers
Automate data ingestion, processing, and inference pipelines for medical data

🧪 Custom AI Model Development
Train custom NLP, CV, or multi-modal models for medical tasks
Fine-tune open-source models for domain-specific adaptation
Use transfer learning on small datasets for clinical use
Build ensemble learning systems for diagnostic accuracy

🧬 Open-Source Model Adaptation
Modify models from Hugging Face, Meta, or Google for medical understanding
Build custom tokenizers for EMR/EHR and medical terminology
Apply quantization, pruning, and other inference optimizations
Develop custom loss functions, training loops, and architectural variations

⚙️ MLOps & Deployment
Create training and CI/CD pipelines using MLflow, Kubeflow, or W&B
Scale distributed training across multi-GPU environments (Ray/Horovod)
Deploy models with FastAPI/Flask, supporting both batch and real-time inference
Monitor for model drift, performance degradation, and compliance alerts

🏥 Healthcare AI Specialization
Build systems for: medical text classification & entity recognition, radiology report generation, clinical risk prediction, auto-coding & billing, and real-time care alerts
Ensure HIPAA compliance in all stages of the model lifecycle

✅ Required Qualifications
💻 Technical Skills
4–6 years of Python development
2+ years in ML/AI with deep learning frameworks (PyTorch/TensorFlow)
Experience modifying open-source transformer models
Strong expertise in workflow orchestration tools (Airflow, Prefect, Dagster)
Hands-on with MLOps tools (MLflow, W&B, SageMaker, DVC)

🔍 Core Competencies
Strong foundation in transformers, NLP, and CV
Experience in distributed computing, GPU programming, and model compression
Ability to explain and interpret model decisions (XAI, SHAP, LIME)
Familiarity with containerized deployments (Docker, K8s)

🧰 Technical Stack
Languages: Python 3.9+, CUDA
ML Frameworks: PyTorch, TensorFlow, Hugging Face, ONNX
Workflow Tools: Airflow, Prefect, Dagster
MLOps: MLflow, Weights & Biases, SageMaker
Infra: AWS, Kubernetes, GPU Clusters
Data: Spark, Dask, Pandas
Databases: PostgreSQL, MongoDB, Delta Lake, S3
Versioning: Git, DVC

🌟 Preferred Qualifications
Healthcare experience (EHR, medical NLP, radiology, DICOM, FHIR)
Knowledge of federated learning, differential privacy, and AutoML
Experience with multi-modal, multi-task, or edge model deployment
Contributions to open-source projects or research publications
Knowledge of explainable AI and responsible ML practices

🎯 Key Projects You'll Work On
Real-time clinical documentation automation
Custom NER models for ICD/CPT tagging
LLM adaptation for medical conversation understanding
Real-time risk stratification pipelines for hospitals

🎁 What We Offer
Comprehensive health plans
Flexible work options (remote/hybrid) with quarterly in-person meetups

📤 Application Requirements
Resume with ML/AI experience
GitHub or portfolio links (model code, notebooks, demos)
Cover letter describing your AI workflow or custom model build
Code samples (open-source or private repos)
Optional: Research papers, Kaggle profile, open challenges

🧪 Interview Process
Initial HR screening (30 mins)
Take-home Python + ML coding challenge
Technical ML/AI deep-dive (90 mins)
Model training/modification practical (2 hrs)
System design for ML pipeline (60 mins)
Presentation or walkthrough of past AI work (45 mins)
Culture fit + final discussion
References + offer

🏥 About the Role
You will help build AI that doesn't just analyze data — it augments clinical decisions, automates medical documentation, and assists doctors in real time. This role will impact thousands of patients and redefine how AI powers healthcare.
Aarna Tech Consultants Pvt. Ltd. (Atcuality) is an equal opportunity employer. We believe in diversity, ethics, and inclusive AI systems for healthcare.
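The "monitor for model drift" responsibility above boils down to comparing live feature distributions against a training baseline. A minimal sketch, assuming a z-score on the feature mean as the drift signal (production systems typically use KS tests, PSI, or dedicated tooling):

```python
# Hedged sketch: flag drift when a live feature's mean shifts far from the
# training baseline, measured in baseline standard deviations.
import statistics

def mean_drift_score(baseline, live):
    """Standardized shift of the live mean vs. the training baseline."""
    mu = statistics.mean(baseline)
    sigma = statistics.stdev(baseline)
    return abs(statistics.mean(live) - mu) / (sigma or 1.0)

baseline = [0.9, 1.1, 1.0, 0.95, 1.05]   # e.g. a lab value at training time
stable   = [1.0, 0.98, 1.02]
shifted  = [2.1, 2.0, 2.2]

assert mean_drift_score(baseline, stable) < 1.0    # within noise
assert mean_drift_score(baseline, shifted) > 3.0   # alert: retrain / review
```

In a healthcare setting the alert threshold and the response (human review, retraining) would be set with clinical stakeholders, per the compliance requirements the posting describes.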

Posted 4 days ago

Apply

5.0 years

0 Lacs

Jamshedpur, Jharkhand, India

Remote

🌟 Position Overview
We are seeking a highly skilled Python Developer with a strong focus on AI integration, prompt engineering, and AWS Bedrock optimization to join our AI-native healthcare team. The ideal candidate will bring 3–5 years of Python experience with a deep understanding of LLM configuration, token cost optimization, and compliance-first AI development in regulated healthcare environments. At Atcuality, you'll be instrumental in scaling LLM-powered systems for clinical intelligence, agent orchestration, and real-time decision support.

🧠 Key Responsibilities
🤖 AI Implementation & Integration
Integrate LLMs via AWS Bedrock, OpenAI, or Claude APIs for clinical use cases
Build scalable, secure Python APIs and wrappers for model consumption
Implement streaming AI responses and fallback mechanisms for fail-safe delivery
Configure multi-agent decision orchestration for healthcare workflows
Build prompt+context orchestration layers that ensure factual accuracy and compliance

☁️ AWS Bedrock Configuration & Management
Set up and manage Bedrock endpoints for Claude, Titan, and Llama 3
Configure access control, PrivateLink, VPC endpoints, and model A/B testing
Monitor performance, scaling, latency, and token metrics in real time
Implement model versioning and failover strategies

🧮 Token Optimization & Cost Control
Build systems for token counting, estimation, and analytics
Develop prompt compression, token-aware caching, and context window managers
Implement budgeted inference logic for controlled spending
Monitor token usage via dashboards and alerts

✍️ Prompt Engineering & Optimization
Design healthcare-safe prompts using few-shot, chain-of-thought, and constitutional AI strategies
Create dynamic prompt builders, version control, and testing harnesses
Implement prompt injection detection and sanitization
Measure and improve prompt accuracy, safety, and efficiency

🔒 Healthcare AI Compliance & Security
Ensure HIPAA and PHI security in all prompt interactions and API calls
Integrate PHI redaction, audit trails, and explanation consistency
Implement content filtering, reasoning traceability, and medical fact validation
Design AI systems that comply with FHIR, HITECH, and HITRUST controls

⚙️ Technical Implementation
Write clean, modular, and async-friendly Python 3.10+ code
Use FastAPI, Pydantic, and LangChain for LLM orchestration
Build comprehensive test suites using Pytest
Maintain documentation for all AI-based endpoints and workflows
Optimize for low-latency and production-scale LLM workloads

✅ Required Qualifications
👨‍💻 Technical Skills
3–5 years of Python development, with focus on backend AI systems
1+ years of LLM experience (OpenAI, Claude, Llama, or similar)
Hands-on with AWS Bedrock or equivalent managed LLM service
Strong understanding of prompt engineering and token budgets
Proficiency with asyncio, aiohttp, and async concurrency patterns
REST API development using FastAPI or Flask

🔧 Core Competencies
Understanding of transformers, attention mechanisms, and LLM internals
Familiarity with LangChain, LlamaIndex, embedding models, and vector stores
Experience with retrieval-augmented generation (RAG) patterns
Ability to handle streamed responses, retries, and multi-agent orchestration
Prompt design for accuracy, context compression, and LLM alignment

💎 Preferred Qualifications
Healthcare domain experience or exposure to clinical data workflows
Knowledge of HIPAA, FHIR, DICOM, or ICD-10 medical codes
Experience with Anthropic Claude API and OpenAI migration strategies
Proficiency with WebSockets, event-driven agents, and prompt chaining
Understanding of RLHF, constitutional AI, or reward modeling
Experience with model quantization, optimization, or open-source AI tools
Contributions to open-source LLM or prompt engineering frameworks

🧰 Technical Environment
Languages: Python 3.10+, TypeScript (for frontend integrations)
AI Platforms: AWS Bedrock, OpenAI, Claude
Frameworks: FastAPI, LangChain, Pydantic, asyncio
Vector DBs: Pinecone, Weaviate
Databases: PostgreSQL, Redis
Monitoring: Datadog, CloudWatch, Weights & Biases
Infrastructure: AWS, Docker, Kubernetes
Version Control: Git, GitHub, GitLab

🎁 What We Offer
Competitive salary based on experience and impact
Comprehensive health, dental, and vision insurance
Flexible remote/hybrid work setup with Jamshedpur base
Stock options and equity participation in the company's AI product stack
Work on impactful, compliant AI healthcare systems
Paid time off, wellness support, and parental leave policies

📤 Application Requirements
Please email the following to 📧 career@atcuality.com:
Updated resume focused on Python, AI integration, and LLM usage
GitHub portfolio or code samples showing AI/LLM-related work
Short cover letter explaining your prompt engineering philosophy
Blog posts or public demos (if applicable)

🧪 Interview Process
Initial HR screening call (30 mins)
Python + AI coding task (take-home or live)
LLM and prompt engineering deep dive (90 mins)
System design for AI workflow orchestration (60 mins)
Live prompt optimization exercise (45 mins)
Team culture fit interview (45 mins)
Reference checks + offer 🚀

💡 Why This Role Matters
You'll help define how AI thinks, responds, and behaves in a clinical setting — shaping intelligent systems that assist doctors, empower patients, and accelerate digital healthcare. Your code won't just talk to models — it'll help save lives.
Aarna Tech Consultants Pvt. Ltd. (Atcuality) is an equal opportunity employer. We encourage applicants from diverse and non-traditional tech backgrounds to apply.
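The "budgeted inference logic" described above can be reduced to a small admission check. This is a sketch under a loud assumption: the 4-characters-per-token estimate is a rough heuristic, not a real tokenizer; production code would count tokens with the provider's tokenizer.

```python
# Hedged sketch: admit a request only if its estimated token cost fits the
# remaining budget, and track spend. The token estimate is a crude heuristic.

def estimate_tokens(text):
    return max(1, len(text) // 4)  # rough assumption, not a real tokenizer

class TokenBudget:
    """Refuse calls that would push usage past a token ceiling."""
    def __init__(self, max_tokens, price_per_1k_usd):
        self.max_tokens = max_tokens
        self.price_per_1k_usd = price_per_1k_usd
        self.used = 0

    def admit(self, prompt, reserve_for_output=256):
        need = estimate_tokens(prompt) + reserve_for_output
        if self.used + need > self.max_tokens:
            return False  # over budget: caller can compress or defer
        self.used += need
        return True

    def spend_usd(self):
        return self.used / 1000 * self.price_per_1k_usd

budget = TokenBudget(max_tokens=1000, price_per_1k_usd=3.0)
assert budget.admit("summarize this discharge note " * 20)  # fits the budget
assert not budget.admit("x" * 4000)                         # would exceed it
```

Prompt compression and token-aware caching, also named in the posting, plug in naturally before the `admit` call to lower `need`.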

Posted 4 days ago

Apply

7.0 years

0 Lacs

Pune, Maharashtra, India

On-site

Role & Responsibilities
Agentic AI Platform Delivery
Develop and maintain autonomous software agents using modern LLM frameworks.
Build reusable components for business process automation.
Design agent orchestration, prompt engineering, and LLM integrations.
Enable deployment across CRM systems and enterprise data platforms.

Generative AI & Model Optimization
Fine-tune LLMs/SLMs with proprietary NBFC data.
Work on model distillation, quantization, and edge deployment readiness.

Self-Learning Systems
Create adaptive frameworks that learn from interaction outcomes.
Implement lightweight models to support real-time decision-making.

Ideal Candidate
B.E./B.Tech/M.Tech in Computer Science or related field
4–7 years in AI/ML roles with proficiency in:
Languages: Python, Node.js, JavaScript, React, Java
Tools/Frameworks: LangChain, Semantic Kernel, LangGraph, CrewAI
Platforms: GCP, MS Foundry, Copilot Studio, BigQuery, Power Apps/BI
Agent Tools: Agent Development Kit (ADK), Multi-agent Communication Protocol (MCP)
Strong understanding of: Prompt engineering, LLM integration, and orchestration

Posted 6 days ago

Apply

2.0 years

0 Lacs

India

Remote

Company Description Friska.AI redefines healthcare with advanced artificial intelligence, delivering tailored health solutions designed uniquely for individuals. Our mission is to empower people and communities with personalized healthcare recommendations that address specific needs and promote healthier lives. We analyze individual health data to create customized plans for nutrition, diet, fitness, and more. Detailed health reports provide actionable insights to achieve and sustain better health. Our platform also offers tools for population health management, aiding organizations and healthcare providers in understanding community health trends. Role Description This is a full-time remote role for ASR Engineers / Machine Learning Researchers. The role involves developing algorithms, working on pattern recognition and machine learning models including neural networks, and applying statistical analysis to improve our AI-driven health solutions. Day-to-day tasks include collaborating with cross-functional teams to design and implement AI models, analyzing large datasets, and continuously enhancing the performance and accuracy of our healthcare recommendations through innovative research. Experience: Minimum 2 years of hands-on experience in ASR, speech processing, or closely related ML/NLP domains. Prior work with impaired speech datasets or accessibility-focused technologies is highly preferred. Background in projects involving assistive tech, voice accessibility, or inclusive AI will be considered a strong advantage. Qualifications Programming & Frameworks: Proficiency in Python. Hands-on experience with PyTorch and/or TensorFlow for model development. ASR Expertise: Practical experience with advanced ASR models like Whisper, Conformer, Wav2Vec 2.0, etc. Familiarity with speaker-adaptive and impaired speech models is a strong plus. 
Model Optimization: Knowledge of optimization methods such as quantization, pruning, distillation, and LoRA (Low-Rank Adaptation) for lightweight deployment. Data Handling: Proficiency in handling large-scale speech datasets. Strong understanding of data preprocessing, augmentation, and labeling pipelines for robust model training. Evaluation: Experience with ASR evaluation metrics: WER (Word Error Rate), CER (Character Error Rate), RTF (Real-Time Factor). Knowledge of SER (Sentence Error Rate) and PER (Phoneme Error Rate) is desirable. Engineering Practices: Familiarity with software engineering workflows (e.g., Git, GitHub/GitLab, CI/CD pipelines). Ability to build and integrate RESTful APIs for deploying ASR functionalities. Cloud & Deployment: Experience with cloud platforms such as AWS, Google Cloud, or Azure for training, hosting, and scaling models. Responsibilities: Design, train, and optimize speaker-adaptive ASR models, focusing on high accuracy and real-time performance, especially for users with speech impairments. Apply domain adaptation techniques (e.g., fine-tuning, transfer learning, LoRA) to customize models across diverse speech patterns and demographics. Build feedback-based learning systems to continually refine ASR performance based on user corrections and interactions. Collaborate closely with frontend/backend developers, UI/UX designers, and clinical experts to deliver user-centered, accessible speech solutions. Rigorously test and benchmark models using both standard evaluation metrics and real-world test environments. Stay up-to-date with state-of-the-art research in ASR and bring innovative ideas into the development pipeline.
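The WER metric named in the evaluation requirements above is word-level edit distance normalized by reference length. A self-contained sketch of the standard dynamic-programming computation:

```python
# Hedged sketch: Word Error Rate via Levenshtein distance over words.
# WER = (substitutions + insertions + deletions) / reference word count.

def wer(reference, hypothesis):
    r, h = reference.split(), hypothesis.split()
    # d[i][j] = edit distance between the first i ref words and j hyp words
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i          # deleting i reference words
    for j in range(len(h) + 1):
        d[0][j] = j          # inserting j hypothesis words
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            sub = d[i - 1][j - 1] + (r[i - 1] != h[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(r)][len(h)] / max(1, len(r))

assert wer("the cat sat", "the cat sat") == 0.0
assert abs(wer("the cat sat", "the bat sat") - 1 / 3) < 1e-9  # 1 substitution
```

CER is the same computation over characters instead of words; note WER can exceed 1.0 when the hypothesis inserts many extra words.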

Posted 6 days ago

Apply

4.0 years

0 Lacs

Bengaluru, Karnataka, India

On-site

Job Category: AIML Job Type: Full Time Job Location: Bengaluru Mangalore Experience: 4-8 Years
Skills: AI AWS/AZURE/GCP Azure ML C computer vision data analytics Data Modeling Data Visualization deep learning Descriptive Analytics GenAI Image processing Java LLM models ML ONNX Predictive Analytics Python R Regression/Classification Models SageMaker SQL TensorFlow

Position Overview
We are looking for an experienced AI/ML Engineer to join our team. The ideal candidate will bring a deep understanding of machine learning, artificial intelligence, and big data technologies, with proven expertise in developing scalable AI/ML solutions. This role will lead technical efforts, mentor team members, and collaborate with cross-functional teams to design, develop, and deploy cutting-edge AI/ML applications.

Job Details
Job Category: AI/ML Engineer
Job Type: Full-Time
Job Location: Bangalore/Mangalore
Experience Required: 4-8 Years

Key Responsibilities
Design, develop, and deploy deep learning models for object classification, detection, and segmentation using CNNs and Transfer Learning.
Implement image preprocessing and advanced computer vision pipelines.
Optimize deep learning models using pruning, quantization, and ONNX for deployment on edge devices.
Work with PyTorch, TensorFlow, and ONNX frameworks to develop and convert models.
Accelerate model inference using GPU programming with CUDA and cuDNN.
Port and test models on embedded and edge hardware platforms (Orin, Jetson, Hailo).
Conduct research and experiments to evaluate and integrate GenAI technologies in computer vision tasks.
Explore and implement cloud-based AI workflows, particularly using AWS/Azure AI/ML services.
Collaborate with cross-functional teams for data analytics, data processing, and large-scale model training.

Desired Profile
Strong programming experience in Python.
Solid background in deep learning, CNNs, transfer learning, and machine learning basics.
Expertise in object detection, classification, segmentation. Proficiency with PyTorch, TensorFlow, and ONNX. Experience with GPU acceleration (CUDA, cuDNN). Hands-on knowledge of model optimization (pruning, quantization). Experience deploying models to edge devices (e.g., Jetson, mobile, Orin, Hailo ) Understanding of image processing techniques. Familiarity with data pipelines, data preprocessing, and data analytics. Willingness to explore and contribute to Generative AI and cloud-based AI solutions. Good problem-solving and communication skills. Good to have Experience with C/C++. Familiarity with AWS Cloud AI/ML tools (e.g., SageMaker, Rekognition). Exposure to GenAI frameworks like OpenAI, Stable Diffusion, etc. Knowledge of real-time deployment systems and streaming analytics. Qualifications Graduation/Post-graduation in Computers, Engineering, or Statistics from a reputed institute If you are passionate to work in a collaborative and challenging environment, apply now!

Posted 6 days ago

Apply

3.0 years

0 Lacs

India

On-site

🚀 We're Hiring: Senior AI/ML Engineer | Join Our Mission to Build the Future of Intelligent Systems Are you passionate about cutting-edge AI and machine learning? Are you eager to build and deploy Large Language Models, intelligent agents, and transformative AI systems at scale? We’re looking for a " Senior AI/ML Engineer" to join our fast-moving team and make a real impact. 🔍 What You’ll Do: Fine-tune and optimize state-of-the-art models (Transformers, CNNs, RNNs) for real-world use cases Apply quantization and other optimization techniques (INT8, INT4, dynamic quantization) to maximize model performance Build, deploy, and scale intelligent AI agents and multi-agent systems Design APIs and services using FastAPI, Flask, or Django REST, and deploy using Docker/Kubernetes across AWS/GCP/Azure Collaborate on production-grade solutions using PyTorch, TensorFlow, Hugging Face Transformers Leverage vector databases (e.g., Pinecone, Weaviate, Chroma) to power intelligent applications ✅ What We’re Looking For: 3+ years of AI/ML development experience with a proven track record of delivering impactful projects Expertise in fine-tuning and deploying LLMs and deep learning models Deep understanding of modern AI workflows and frameworks (Transformers, Agents, RAG, etc.) Strong Python skills and solid software engineering fundamentals Hands-on experience with production environments, CI/CD, and cloud platforms Proficiency in databases (PostgreSQL + vector DBs) and collaborative Git workflows 🌟 Bonus Points For: Experience with distributed training and large-scale model development Familiarity with retrieval-augmented generation (RAG), prompt engineering, and chain-of-thought reasoning. Background in AI ethics or responsible AI practices Start-up or rapid development experience Bachelor’s or Master’s degree in Computer Science, AI, or a related field 🌐 Why Join Us? 
Work on high-impact, technically challenging problems Be part of a collaborative, fast-moving, and forward-thinking team Build products that leverage cutting-edge AI to solve real-world challenges Shape the direction of intelligent systems from the ground up #AIJobs #MachineLearning #DeepLearning #LLMs #MLJobs #TechHiring #StartupJobs #ArtificialIntelligence #FastAPI #HuggingFace #PyTorch #Kubernetes #AIEngineer
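The vector databases this listing names (Pinecone, Weaviate, Chroma) all rest on one primitive: nearest-neighbor search over embedding vectors by cosine similarity. A toy sketch with invented three-dimensional "embeddings" (real ones come from an embedding model and have hundreds of dimensions):

```python
# Hedged sketch: cosine-similarity retrieval, the core operation behind
# vector-database lookups in RAG pipelines. Vectors here are illustrative.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

docs = {
    "refund policy":  [0.9, 0.1, 0.0],
    "shipping times": [0.1, 0.9, 0.1],
    "api reference":  [0.0, 0.2, 0.95],
}

def top_doc(query_vec):
    """Return the document whose embedding is closest to the query."""
    return max(docs, key=lambda name: cosine(query_vec, docs[name]))

assert top_doc([0.85, 0.2, 0.05]) == "refund policy"
```

Dedicated vector stores add the parts that matter at scale: approximate-nearest-neighbor indexes, metadata filtering, and persistence.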

Posted 1 week ago

Apply

4.0 years

3 - 5 Lacs

Vadodara

On-site

Role & Responsibilities
4+ years of experience applying AI to practical uses
Develop and train computer vision models for tasks like:
Object detection and tracking (YOLO, Faster R-CNN, etc.)
Image classification, segmentation, OCR (e.g., PaddleOCR, Tesseract)
Face recognition/blurring, anomaly detection, etc.
Optimize models for performance on edge devices (e.g., NVIDIA Jetson, OpenVINO, TensorRT).
Process and annotate image/video datasets; apply data augmentation techniques.
Proficiency in Large Language Models.
Strong understanding of statistical analysis and machine learning algorithms.
Hands-on experience implementing various machine learning algorithms such as linear regression, logistic regression, decision trees, and clustering algorithms.
Understanding of image processing concepts (thresholding, contour detection, transformations, etc.)
Experience in model optimization, quantization, or deploying to edge (Jetson Nano/Xavier, Coral, etc.)
Strong programming skills in Python (or C++), with expertise in OpenCV, NumPy, PyTorch/TensorFlow, and computer vision models like YOLOv5/v8, Mask R-CNN, and DeepSORT.
Implement and optimize machine learning pipelines and workflows for seamless integration into production systems.
Hands-on experience with at least one real-time CV application (e.g., surveillance, retail analytics, industrial inspection, AR/VR).
Engage with multiple teams and contribute on key decisions.
Expected to provide solutions to problems that apply across multiple teams.
Lead the implementation of large language models in AI applications.
Research and apply cutting-edge AI techniques to enhance system performance.
Contribute to the development and deployment of AI solutions across various domains

Requirements
Design, develop, and deploy ML models for:
OCR-based text extraction from scanned documents (PDFs, images)
Table and line-item detection in invoices, receipts, and forms
Named entity recognition (NER) and information classification
Evaluate and integrate third-party OCR tools (e.g., Tesseract, Google Vision API, AWS Textract, Azure OCR, PaddleOCR, EasyOCR)
Develop pre-processing and post-processing pipelines for noisy image/text data
Familiarity with video analytics platforms (e.g., DeepStream, Streamlit-based dashboards).
Experience with MLOps tools (MLflow, ONNX, Triton Inference Server).
Background in academic CV research or published papers.
Knowledge of GPU acceleration, CUDA, or hardware integration (cameras, sensors).
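The OCR post-processing step mentioned above often starts as pattern-based extraction over noisy text. A minimal sketch, assuming a deliberately simple description-plus-amount pattern; production invoice extraction uses trained NER and table-detection models rather than one regex.

```python
# Hedged sketch: pull (description, amount) line items out of noisy OCR text.
# The pattern is illustrative only and tolerates stray non-letter characters
# before a description.
import re

LINE_ITEM = re.compile(r"(?P<desc>[A-Za-z][A-Za-z ]+?)\s+(?P<amount>\d+\.\d{2})")

def extract_line_items(ocr_text):
    """Return every (description, amount) pair the pattern can recover."""
    return [(m["desc"].strip(), float(m["amount"]))
            for m in LINE_ITEM.finditer(ocr_text)]

noisy = "Widget A   12.50\nShipping    4.99\n#### total 17.49"
items = extract_line_items(noisy)
assert ("Widget A", 12.5) in items
assert ("Shipping", 4.99) in items
```

In a full pipeline this sits after image cleanup and the OCR engine (Tesseract, PaddleOCR, etc.) and before NER-based classification of the extracted fields.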

Posted 1 week ago

Apply

4.0 years

0 Lacs

Vadodara, Gujarat, India

On-site

Role & Responsibilities
4+ years of experience applying AI to practical uses
Develop and train computer vision models for tasks like:
Object detection and tracking (YOLO, Faster R-CNN, etc.)
Image classification, segmentation, OCR (e.g., PaddleOCR, Tesseract)
Face recognition/blurring, anomaly detection, etc.
Optimize models for performance on edge devices (e.g., NVIDIA Jetson, OpenVINO, TensorRT).
Process and annotate image/video datasets; apply data augmentation techniques.
Proficiency in Large Language Models.
Strong understanding of statistical analysis and machine learning algorithms.
Hands-on experience implementing various machine learning algorithms such as linear regression, logistic regression, decision trees, and clustering algorithms.
Understanding of image processing concepts (thresholding, contour detection, transformations, etc.)
Experience in model optimization, quantization, or deploying to edge (Jetson Nano/Xavier, Coral, etc.)
Strong programming skills in Python (or C++), with expertise in OpenCV, NumPy, PyTorch/TensorFlow, and computer vision models like YOLOv5/v8, Mask R-CNN, and DeepSORT.
Implement and optimize machine learning pipelines and workflows for seamless integration into production systems.
Hands-on experience with at least one real-time CV application (e.g., surveillance, retail analytics, industrial inspection, AR/VR).
Engage with multiple teams and contribute on key decisions.
Expected to provide solutions to problems that apply across multiple teams.
Lead the implementation of large language models in AI applications.
Research and apply cutting-edge AI techniques to enhance system performance.
Contribute to the development and deployment of AI solutions across various domains

Requirements
Design, develop, and deploy ML models for:
OCR-based text extraction from scanned documents (PDFs, images)
Table and line-item detection in invoices, receipts, and forms
Named entity recognition (NER) and information classification
Evaluate and integrate third-party OCR tools (e.g., Tesseract, Google Vision API, AWS Textract, Azure OCR, PaddleOCR, EasyOCR)
Develop pre-processing and post-processing pipelines for noisy image/text data
Familiarity with video analytics platforms (e.g., DeepStream, Streamlit-based dashboards).
Experience with MLOps tools (MLflow, ONNX, Triton Inference Server).
Background in academic CV research or published papers.
Knowledge of GPU acceleration, CUDA, or hardware integration (cameras, sensors).

Posted 1 week ago

Apply

5.0 years

0 Lacs

Chennai, Tamil Nadu, India

On-site

Job Description:
We are looking for a Lead Generative AI Engineer with 3–5 years of experience to spearhead development of cutting-edge AI systems involving Large Language Models (LLMs), Vision-Language Models (VLMs), and Computer Vision (CV). You will lead model development, fine-tuning, and optimization for text, image, and multi-modal use cases. This is a hands-on leadership role that requires a deep understanding of transformer architectures, generative model fine-tuning, prompt engineering, and deployment in production environments.

Roles and Responsibilities:
Lead the design, development, and fine-tuning of LLMs for tasks such as text generation, summarization, classification, Q&A, and dialogue systems.
Develop and apply Vision-Language Models (VLMs) for tasks like image captioning, VQA, multi-modal retrieval, and grounding.
Work on Computer Vision tasks including image generation, detection, segmentation, and manipulation using SOTA deep learning techniques.
Leverage frameworks like Transformers, Diffusion Models, and CLIP to build and fine-tune multi-modal models.
Fine-tune open-source LLMs and VLMs (e.g., LLaMA, Mistral, Gemma, Qwen, MiniGPT, Kosmos, etc.) using task-specific or domain-specific datasets.
Design data pipelines, model training loops, and evaluation metrics for generative and multi-modal AI tasks.
Optimize model performance for inference using techniques like quantization, LoRA, and efficient transformer variants.
Collaborate cross-functionally with product, backend, and ML ops teams to ship models into production.
Stay current with the latest research and incorporate emerging techniques into product pipelines.

Requirements:
Bachelor's or Master's degree in Computer Science, Artificial Intelligence, Machine Learning, or related field.
3–5 years of hands-on experience in building, training, and deploying deep learning models, especially in the LLM, VLM, and/or CV domains.
Strong proficiency with Python, PyTorch (or TensorFlow), and libraries like Hugging Face Transformers, OpenCV, Datasets, LangChain, etc.
Deep understanding of transformer architecture, self-attention mechanisms, tokenization, embedding, and diffusion models.
Experience with LoRA, PEFT, RLHF, prompt tuning, and transfer learning techniques.
Experience with multi-modal datasets and fine-tuning vision-language models (e.g., BLIP, Flamingo, MiniGPT, Kosmos, etc.).
Familiarity with MLOps tools, containerization (Docker), and model deployment workflows (e.g., Triton Inference Server, TorchServe).
Strong problem-solving, architectural thinking, and team mentorship skills.
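The LoRA technique named above replaces full weight updates with a trained low-rank pair: the effective weight becomes W' = W + (alpha / r) * B @ A, where A is r x d_in and B is d_out x r. A pure-Python illustration of just the arithmetic (real training uses a framework such as PEFT; the tiny matrices here are invented for clarity):

```python
# Hedged sketch of the LoRA weight update. The base matrix W stays frozen;
# only the small adapter matrices A and B are trained.

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def lora_update(W, A, B, alpha):
    r = len(A)                      # adapter rank
    delta = matmul(B, A)            # (d_out x r) @ (r x d_in) = full-size delta
    scale = alpha / r
    return [[W[i][j] + scale * delta[i][j] for j in range(len(W[0]))]
            for i in range(len(W))]

W = [[1.0, 0.0], [0.0, 1.0]]        # frozen base weights (2x2)
A = [[0.5, 0.5]]                    # rank-1 adapter, shape (1 x 2)
B = [[2.0], [0.0]]                  # shape (2 x 1)
W_prime = lora_update(W, A, B, alpha=1.0)
# Only the 4 adapter numbers were trainable; at LLM scale (d >> r) that ratio
# is what makes LoRA fine-tuning cheap.
```

At inference time the scaled delta can be merged into W once, so a LoRA-tuned model serves at the same speed as the base model.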

Posted 1 week ago

Apply

7.0 - 10.0 years

0 Lacs

Chandigarh

On-site

bebo Technologies is a leading complete software solution provider. bebo stands for 'be extension be offshore'. We are a business partner of QASource, Inc., USA (www.QASource.com). We offer outstanding services in the areas of software development, sustenance engineering, quality assurance, and product support. bebo is dedicated to providing high-caliber offshore software services and solutions. Our goal is to 'Deliver in time-every time'. For more details visit our website: www.bebotechnologies.com
Take a 360-degree tour of our bebo premises via the link below: https://www.youtube.com/watch?v=S1Bgm07dPmM

Key Required Skills:
Bachelor's or Master's degree in Computer Science, Data Science, or a related field.
7–10 years of industry experience, with at least 5 years in machine learning roles.
Advanced proficiency in Python and common ML libraries: TensorFlow, PyTorch, Scikit-learn.
Experience with distributed training, model optimization (quantization, pruning), and inference at scale.
Hands-on experience with cloud ML platforms: AWS (SageMaker), GCP (Vertex AI), or Azure ML.
Familiarity with MLOps tooling: MLflow, TFX, Airflow, or Kubeflow; and data engineering frameworks like Spark, dbt, or Apache Beam.
Strong grasp of CI/CD for ML, model governance, and post-deployment monitoring (e.g., data drift, model decay).
Excellent problem-solving, communication, and documentation skills.
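The quantization listed above under model optimization has a small core idea that a framework-free sketch can show: map floats to 8-bit integers via a per-tensor scale, trading a bounded rounding error for a 4x smaller representation. The function names and values below are invented for illustration and stand in for what frameworks like PyTorch do per layer.

```python
# Toy symmetric int8 post-training quantization: pick a per-tensor scale
# so the largest magnitude maps to 127, round to integer codes, and
# dequantize back. The round trip introduces a small, bounded error.

def quantize_int8(values):
    """Return (int8 codes, scale) for a list of floats."""
    scale = max(abs(v) for v in values) / 127 or 1.0  # avoid 0 for all-zero input
    codes = [max(-128, min(127, round(v / scale))) for v in values]
    return codes, scale

def dequantize(codes, scale):
    return [c * scale for c in codes]

weights = [0.02, -1.27, 0.5, 0.0]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)
# Round-trip error is at most half a quantization step (scale / 2).
assert all(abs(w - r) <= scale / 2 for w, r in zip(weights, restored))
print(codes)
```

Production pipelines add per-channel scales, zero points for asymmetric ranges, and calibration over real activations, but the accuracy/size trade-off is this same rounding step.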

Posted 1 week ago

Apply

4.0 years

4 - 8 Lacs

Noida

On-site

Position Overview:
We are looking for an experienced AI Engineer to design, build, and optimize AI-powered applications, leveraging both traditional machine learning and large language models (LLMs). The ideal candidate will have a strong foundation in LLM fine-tuning, inference optimization, backend development, and MLOps, with the ability to deploy scalable AI systems in production environments.

ShyftLabs is a leading data and AI company, helping enterprises unlock value through AI-driven products and solutions. We specialize in data platforms, machine learning models, and AI-powered automation, offering consulting, prototyping, solution delivery, and platform scaling. Our Fortune 500 clients rely on us to transform their data into actionable insights.

Key Responsibilities:
Design and implement traditional ML and LLM-based systems and applications.
Optimize model inference for performance and cost-efficiency.
Fine-tune foundation models using methods like LoRA, QLoRA, and adapter layers.
Develop and apply prompt engineering strategies including few-shot learning, chain-of-thought, and RAG.
Build robust backend infrastructure to support AI-driven applications.
Implement and manage MLOps pipelines for full AI lifecycle management.
Design systems for continuous monitoring and evaluation of ML and LLM models.
Create automated testing frameworks to ensure model quality and performance.

Basic Qualifications:
Bachelor's degree in Computer Science, AI, Data Science, or a related field.
4+ years of experience in AI/ML engineering, software development, or data-driven solutions.

LLM Expertise:
Experience with parameter-efficient fine-tuning (LoRA, QLoRA, adapter layers).
Understanding of inference optimization techniques: quantization, pruning, caching, and serving.
Skilled in prompt engineering and design, including RAG techniques.
Familiarity with AI evaluation frameworks and metrics.
Experience designing automated evaluation and continuous monitoring systems.

Backend Engineering:
Strong proficiency in Python and frameworks like FastAPI or Flask.
Experience building RESTful APIs and real-time systems.
Knowledge of vector databases and traditional databases.
Hands-on experience with cloud platforms (AWS, GCP, Azure) focusing on ML services.

MLOps & Infrastructure:
Familiarity with model serving tools (vLLM, SGLang, TensorRT).
Experience with Docker and Kubernetes for deploying ML workloads.
Ability to build monitoring systems for performance tracking and alerting.
Experience building evaluation systems using custom metrics and benchmarks.
Proficient in CI/CD and automated deployment pipelines.
Experience with orchestration tools like Airflow.
Hands-on experience with LLM frameworks (Transformers, LangChain, LlamaIndex).
Familiarity with LLM-specific monitoring tools and general ML monitoring systems.
Experience with distributed training and inference in multi-GPU environments.
Knowledge of model compression techniques like distillation and quantization.
Experience deploying models for high-throughput, low-latency production use.
Research background or strong awareness of the latest developments in LLMs.

Tools & Technologies We Use:
Frameworks: PyTorch, TensorFlow, Hugging Face Transformers
Serving: vLLM, TensorRT-LLM, SGLang, OpenAI API
Infrastructure: Docker, Kubernetes, AWS, GCP
Databases: PostgreSQL, Redis, vector databases

We are proud to offer a competitive salary alongside a strong healthcare insurance and benefits package. We pride ourselves on the growth of our employees, offering extensive learning and development resources.
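The RAG prompting that recurs in these qualifications can be sketched without any LLM stack: retrieve the best-matching document chunk, then splice it into the prompt. The word-overlap scoring below is a deliberately crude stand-in for embedding search in a vector database, and the corpus, template, and function names are invented for illustration.

```python
# Minimal RAG sketch: rank chunks by Jaccard word overlap with the query
# (a stand-in for embedding similarity search), then build a prompt that
# grounds the model's answer in the top-ranked chunk.

def score(query, chunk):
    """Jaccard word-overlap similarity between query and chunk."""
    q, c = set(query.lower().split()), set(chunk.lower().split())
    return len(q & c) / len(q | c)

def build_rag_prompt(query, chunks, top_k=1):
    ranked = sorted(chunks, key=lambda ch: score(query, ch), reverse=True)
    context = "\n".join(ranked[:top_k])
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

chunks = [
    "Invoices are processed within 30 days of receipt.",
    "Refunds require a signed approval form.",
    "Office hours are 9am to 5pm on weekdays.",
]
prompt = build_rag_prompt("are invoices processed within 30 days", chunks)
print(prompt)  # context line comes from the invoices chunk
```

A production version swaps `score` for embedding similarity over FAISS or Milvus and chunks documents during ingestion, but the retrieve-then-prompt shape stays the same.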

Posted 1 week ago

Apply