Posted: 1 week ago
On-site
Full Time
We are looking for a highly skilled and motivated Data Scientist with deep experience in building recommendation systems to join our team. This role demands expertise in deep learning, embedding-based retrieval, and the Google Cloud Platform (GCP). You will play a critical role in developing intelligent systems that enhance user experiences through personalized content discovery.

Key Responsibilities:
- Develop, train, and deploy recommendation models using two-tower, multi-tower, and cross-encoder architectures.
- Generate and utilize text/image embeddings (e.g., CLIP, BERT, Sentence Transformers) for content-based recommendations.
- Design semantic similarity search pipelines using vector databases (FAISS, ScaNN, Qdrant, Matching Engine).
- Create and manage scalable ML pipelines using Vertex AI, Kubeflow Pipelines, and GKE.
- Handle large-scale data preparation and feature engineering using Dataproc (PySpark) and Dataflow.
- Implement cold-start strategies leveraging metadata and multimodal embeddings.
- Work on user modeling, temporal personalization, and re-ranking strategies.
- Run A/B tests and interpret results to measure real-world impact.
- Collaborate with cross-functional teams (Engineering, Product, DevOps) for model deployment and monitoring.

Must-Have Skills:
- Strong command of Python and ML libraries: pandas, polars, numpy, scikit-learn, matplotlib, tensorflow, torch, transformers.
- Deep understanding of modern recommender systems and embedding-based retrieval.
- Experience with TensorFlow, Keras, or PyTorch for building deep learning models.
- Hands-on with semantic search, ANN search, and real-time vector matching.
- Proven experience with Vertex AI, Kubeflow on GKE, and ML pipeline orchestration.
- Familiarity with vector DBs such as Qdrant, FAISS, ScaNN, or Matching Engine on GCP.
- Experience in deploying models via Vertex AI Online Prediction, TF Serving, or Cloud Run.
- Knowledge of feature stores, embedding versioning, and MLOps practices (CI/CD, monitoring).

Preferred / Good to Have:
- Experience with ranking models (e.g., XGBoost, LightGBM, DLRM) for candidate scoring.
- Exposure to LLM-powered personalization or hybrid retrieval systems.
- Familiarity with streaming pipelines using Pub/Sub, Dataflow, Cloud Functions.
- Hands-on with multi-modal retrieval (text + image + tabular data).
- Strong grasp of cold-start problem solving, using enriched metadata and embeddings.

GCP Stack You’ll Work With:
- ML & Pipelines: Vertex AI, Vertex Pipelines, Kubeflow on GKE
- Embedding & Retrieval: Matching Engine, Qdrant, FAISS, ScaNN, Milvus
- Processing: Dataproc (PySpark), Dataflow
- Ingestion & Serving: Pub/Sub, Cloud Functions, Cloud Run, TF Serving
- CI/CD & Automation: GitHub Actions, GitLab CI, Terraform
VAYUZ Technologies