Posted: 1 day ago | Platform: LinkedIn

Work Mode: On-site

Job Type: Full Time

Job Description

Data Scientist


Key Responsibilities:

  • Develop, train, and deploy recommendation models using two-tower, multi-tower, and cross-encoder architectures.
  • Generate and utilize text/image embeddings (e.g., CLIP, BERT, Sentence Transformers) for content-based recommendations.
  • Design semantic similarity search pipelines using vector databases (FAISS, ScaNN, Qdrant, Matching Engine).
  • Create and manage scalable ML pipelines using Vertex AI, Kubeflow Pipelines, and GKE.
  • Handle large-scale data preparation and feature engineering using Dataproc (PySpark) and Dataflow.
  • Implement cold-start strategies leveraging metadata and multimodal embeddings.
  • Work on user modeling, temporal personalization, and re-ranking strategies.
  • Run A/B tests and interpret results to measure real-world impact.
  • Collaborate with cross-functional teams (Engineering, Product, DevOps) for model deployment and monitoring.


Must-Have Skills:

  • Strong command of Python and ML libraries: pandas, polars, numpy, scikit-learn, matplotlib, tensorflow, torch, transformers.
  • Deep understanding of modern recommender systems and embedding-based retrieval.
  • Experience with TensorFlow, Keras, or PyTorch for building deep learning models.
  • Hands-on experience with semantic search, ANN search, and real-time vector matching.
  • Proven experience with Vertex AI, Kubeflow on GKE, and ML pipeline orchestration.
  • Familiarity with vector DBs such as Qdrant, FAISS, ScaNN, or Matching Engine on GCP.
  • Experience deploying models via Vertex AI Online Prediction, TF Serving, or Cloud Run.
  • Knowledge of feature stores, embedding versioning, and MLOps practices (CI/CD, monitoring).


Preferred / Good to Have:

  • Experience with ranking models (e.g., XGBoost, LightGBM, DLRM) for candidate scoring.
  • Exposure to LLM-powered personalization or hybrid retrieval systems.
  • Familiarity with streaming pipelines using Pub/Sub, Dataflow, and Cloud Functions.
  • Hands-on experience with multi-modal retrieval (text + image + tabular data).
  • Strong grasp of cold-start problem solving using enriched metadata and embeddings.

GCP Stack You’ll Work With:

  • ML & Pipelines: Vertex AI, Vertex Pipelines, Kubeflow on GKE
  • Embedding & Retrieval: Matching Engine, Qdrant, FAISS, ScaNN, Milvus
  • Processing: Dataproc (PySpark), Dataflow
  • Ingestion & Serving: Pub/Sub, Cloud Functions, Cloud Run, TF Serving
  • CI/CD & Automation: GitHub Actions, GitLab CI, Terraform
