Applied Machine Learning Scientist – Voice AI, NLP & GenAI Applications
About Darwix AI
Darwix AI is a GenAI-powered platform transforming how enterprise sales, support, and credit teams engage with customers. Our proprietary AI stack ingests data across calls, chat, email, and CCTV streams to generate:
- Real-time nudges for agents and reps
- Conversational analytics and scoring to drive performance
- CCTV-based behavior insights to boost in-store conversion
We work with enterprises such as IndiaMart, Wakefit, Emaar, GIVA, and Bank Dofar.
Role Overview
As an Applied Machine Learning Scientist, you will be a core member of our AI/ML team, responsible for building the foundational ML capabilities that drive our real-time sales intelligence platform. You will work on large-scale multilingual voice-to-text pipelines, transformer-based intent detection, and retrieval-augmented generation systems used in live enterprise deployments.
Key Responsibilities
Voice-to-Text (ASR) Engineering
- Deploy and fine-tune ASR models such as WhisperX, wav2vec 2.0, or DeepSpeech for Indian and GCC languages (see the transcription sketch after this list)
- Integrate diarization and punctuation recovery pipelines
- Benchmark and improve transcription accuracy across noisy call environments
- Optimize ASR latency for real-time and batch processing modes
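To give a flavour of this work, here is a minimal transcription sketch using the Hugging Face pipeline API with a Whisper checkpoint. The checkpoint name, audio path, and language setting are illustrative assumptions rather than our production configuration, and the sketch omits diarization, punctuation recovery, and noise handling.

```python
# Minimal ASR sketch (assumptions: openai/whisper-small checkpoint, a local
# sample_call.wav file, Hindi audio). Production pipelines would add
# diarization, punctuation recovery, and noise-robust preprocessing.
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="openai/whisper-small",   # illustrative checkpoint; swap in a fine-tuned model
    chunk_length_s=30,              # chunk long call recordings for transcription
)

result = asr(
    "sample_call.wav",              # hypothetical audio file
    generate_kwargs={"language": "hindi", "task": "transcribe"},
)
print(result["text"])
```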
NLP & Conversational Intelligence
- Train and deploy NLP models for sentence classification, intent tagging, sentiment, emotion, and behavioral scoring (see the intent-tagging sketch after this list)
- Build call scoring logic aligned to domain-specific taxonomies (sales pitch, empathy, CTA, etc.)
- Fine-tune transformers (BERT, RoBERTa, etc.) for multilingual performance
- Contribute to real-time inference APIs for NLP outputs in live dashboards
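As a sketch of the intent-tagging piece, the snippet below runs a multilingual transformer through the text-classification pipeline. The base checkpoint and the code-mixed example utterance are assumptions; in practice you would load a model fine-tuned on our call-scoring taxonomy, since an untuned base model's labels carry no meaning.

```python
# Minimal intent-tagging sketch (assumption: xlm-roberta-base stands in for a
# checkpoint fine-tuned on the call-scoring taxonomy).
from transformers import pipeline

intent_clf = pipeline(
    "text-classification",
    model="xlm-roberta-base",   # placeholder; use a fine-tuned multilingual checkpoint
)

# Hypothetical Hindi-English code-mixed utterance from a sales call
utterance = "Main aapko demo ka link bhej deta hoon, kal follow-up call karein?"

pred = intent_clf(utterance)[0]   # top-scoring label for the utterance
print(pred["label"], round(pred["score"], 3))
```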
GenAI & LLM Systems
- Design and test GenAI prompts for summarization, coaching, and feedback generation
- Integrate retrieval-augmented generation (RAG) using OpenAI, HuggingFace, or open-source LLMs (see the retrieval sketch after this list)
- Collaborate with product and engineering teams to deliver LLM-based features with measurable accuracy and latency metrics
- Implement prompt tuning, caching, and fallback strategies to ensure system reliability
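As an illustration of the retrieval side of RAG, the sketch below embeds a tiny knowledge base with sentence-transformers, pulls the most relevant passages for a query, and assembles a grounded prompt. The embedding model, knowledge-base snippets, and prompt template are assumptions, and the call to the chosen LLM (OpenAI or open source) is left as a placeholder.

```python
# Minimal RAG retrieval sketch (assumptions: all-MiniLM-L6-v2 embeddings,
# a toy in-memory knowledge base, and a hand-written prompt template).
import numpy as np
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")   # illustrative embedding model

kb = [
    "Acknowledge the customer's concern before moving to the pitch.",
    "Every sales call should end with a clear call-to-action and next step.",
]
kb_vecs = encoder.encode(kb, normalize_embeddings=True)

def retrieve(query: str, k: int = 2) -> list[str]:
    q = encoder.encode([query], normalize_embeddings=True)[0]
    scores = kb_vecs @ q                    # cosine similarity on unit vectors
    return [kb[i] for i in np.argsort(-scores)[:k]]

query = "How should the agent close this call?"
context = "\n".join(retrieve(query))
prompt = (
    "Answer using only the context below.\n\n"
    f"Context:\n{context}\n\nQuestion: {query}"
)
# `prompt` would then be sent to the selected LLM endpoint, wrapped in the
# caching and fallback logic mentioned above.
```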
Experimentation & Deployment
- Own model lifecycle: data preparation, training, evaluation, deployment, monitoring
- Build reproducible training pipelines using MLflow, DVC, or similar tools (see the tracking sketch after this list)
- Write efficient, well-structured, production-ready code for inference APIs
- Document experiments and share insights with cross-functional teams
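For the experiment-tracking piece, a minimal MLflow sketch is shown below; the run name, parameters, and metric values are placeholders rather than real results.

```python
# Minimal MLflow tracking sketch (all names and values are illustrative).
import mlflow

with mlflow.start_run(run_name="asr-finetune-hi"):                 # hypothetical run name
    mlflow.log_params({"model": "whisper-small", "lr": 1e-5, "epochs": 3})
    mlflow.log_metric("wer", 0.21)                                  # placeholder metric
    mlflow.set_tag("language", "hi")
```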
Required Qualifications
- Bachelor’s or Master’s degree in Computer Science, AI, Data Science, or related fields
- 3–7 years of experience applying ML in production, including NLP and/or speech
- Experience with transformer-based architectures for text or audio (e.g., BERT, Wav2Vec, Whisper)
- Strong Python skills with experience in PyTorch or TensorFlow
- Experience with REST APIs, model packaging (FastAPI, Flask, etc.), and containerization (Docker)
- Familiarity with audio pre-processing, signal enhancement, or feature extraction (MFCC, spectrograms; see the feature-extraction sketch after this list)
- Knowledge of MLOps tools for experiment tracking, monitoring, and reproducibility
- Ability to work collaboratively in a fast-paced startup environment
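As an example of the audio feature extraction referenced above, the snippet below computes MFCCs and a log-mel spectrogram with librosa; the file path and parameter choices are illustrative.

```python
# Minimal feature-extraction sketch (assumptions: a local sample_call.wav file,
# 16 kHz resampling, 13 MFCCs, 80 mel bands).
import librosa

y, sr = librosa.load("sample_call.wav", sr=16000)            # hypothetical audio file
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)           # cepstral features
mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=80)
log_mel = librosa.power_to_db(mel)                           # log-scaled mel spectrogram
print(mfcc.shape, log_mel.shape)
```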
Preferred Skills
- Prior experience working with multilingual datasets (Hindi, Arabic, Tamil, etc.)
- Knowledge of diarization and speaker separation algorithms
- Experience with LLM APIs (OpenAI, Cohere, Mistral, LLaMA) and RAG pipelines
- Familiarity with inference optimization techniques (quantization, ONNX, TorchScript; see the quantization sketch after this list)
- Contributions to open-source ASR or NLP projects
- Working knowledge of AWS/GCP/Azure cloud platforms
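For the inference-optimization skill set, the sketch below applies PyTorch dynamic quantization to a transformer's linear layers; the checkpoint is a stand-in, and ONNX or TorchScript export would be evaluated separately.

```python
# Minimal dynamic-quantization sketch (assumption: xlm-roberta-base as a
# stand-in for a deployed classifier). Linear weights are converted to int8,
# which typically shrinks the model and speeds up CPU inference.
import torch
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained("xlm-roberta-base")
model.eval()

quantized = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)
```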
What Success Looks Like
- Transcription accuracy of ≥ 85% across core languages
- NLP pipelines used in ≥ 80% of Darwix AI’s daily analyzed calls
- 3–5 LLM-driven product features delivered in the first year
- Inference latency reduced by 30–50% through model and infra optimization
- AI features embedded across all Tier 1 customer accounts within 12 months
Life at Darwix AI
You will be working in a high-velocity product organization where AI is core to our value proposition. You’ll collaborate directly with the founding team and cross-functional leads, have access to enterprise datasets, and work on ML systems that impact large-scale, real-time operations.
We value rigor, ownership, and speed. Model ideas become experiments in days, and successful experiments become deployed product features in weeks.
Compensation & Perks
- Competitive fixed salary based on experience
- Quarterly/Annual performance-linked bonuses
- ESOP eligibility post 12 months
- Compute credits and model experimentation environment
- Health insurance, mental wellness stipend
- Premium tools and GPU access for model development
- Learning wallet for certifications, courses, and AI research access
Career Path
- Year 1: Deliver production-grade ASR/NLP/LLM systems for high-usage product modules
- Year 2: Transition into Senior Applied Scientist or Tech Lead for conversation intelligence
- Year 3: Grow into Head of Applied AI or Architect-level roles across vertical product lines
How to Apply
Email the following to careers@darwix.ai:
- Updated resume (PDF)
- A short write-up (200 words max) answering: “How would you design and optimize a multilingual voice-to-text and NLP pipeline for noisy call center data in Hindi and English?”
- Optional: GitHub or portfolio links demonstrating your work
Subject Line