LLM Evaluation Engineer (GenAI QE)

Eucloid Data Solutions

Experience: 6 years

Salary: 0 Lacs

Gurugram, Haryana, India

Posted: 1 week ago

Skills Required

evaluation, AI, data, healthcare, SaaS, Databricks, drive, design, testing, metrics, logging, model, reporting, visualize, ML, OpenAI, retrieval, integration, versioning, compliance, content, tooling, communication, documentation, research, development, creativity, teamwork, engineering, consulting, diversity

Work Mode

On-site

Job Type

Full Time

Job Description

Note:

Please apply only if you:

Have 6 years or more of relevant experience (excluding internships)
Are comfortable working 5 days a week from Gurugram, Haryana
Are an immediate joiner or currently serving your notice period

About Eucloid

At Eucloid, innovation meets impact. As a leader in AI and Data Science, we create solutions that redefine industries—from Hi-tech and D2C to Healthcare and SaaS. With partnerships with giants like Databricks, Google Cloud, and Adobe, we’re pushing boundaries and building next-gen technology.

Join our talented team of engineers, scientists, and visionaries from top institutes like IITs, IIMs, and NITs. At Eucloid, growth is a promise, and your work will drive transformative results for Fortune 100 clients.

What You’ll Do

Design and implement robust frameworks for evaluating large language models (LLMs) across dimensions like accuracy, safety, hallucination, and reasoning.
Build modular pipelines for automated, semi-automated, and human-in-the-loop evaluations.
Integrate GenAI testing tools such as Giskard, RAGAS, DeepEval, TruLens, Opik/Comet, and LangSmith.
Define and implement custom evaluation metrics tailored to use cases like RAG, agents, and safety guardrails.
Curate or generate high-quality evaluation datasets across domains (e.g., legal, medical, QA, coding).
Collaborate with developers to instrument tracing and logging for real-world model behavior capture.
Build dashboards and reporting mechanisms to visualize performance, regressions, and model comparisons.
Conduct prompt-based testing, chain-of-thought evaluations, adversarial testing, and A/B comparisons.
Contribute to red-teaming and stress-testing efforts to uncover vulnerabilities and ethical risks.
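
As a rough illustration of what a custom metric inside a small automated evaluation loop can look like, here is a minimal sketch. It is not taken from Eucloid's stack or from any of the tools named above: `EvalCase`, `exact_match`, `context_overlap`, `evaluate`, and `stub_model` are hypothetical names, the stub stands in for a real LLM call, and token overlap is only a crude proxy for groundedness.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class EvalCase:
    question: str
    context: str    # retrieved passage the answer should be grounded in
    reference: str  # expected answer


def exact_match(answer: str, case: EvalCase) -> float:
    """Return 1.0 if the normalized answer equals the reference, else 0.0."""
    return float(answer.strip().lower() == case.reference.strip().lower())


def context_overlap(answer: str, case: EvalCase) -> float:
    """Fraction of answer tokens that also appear in the context.

    A crude stand-in for a groundedness / hallucination score.
    """
    answer_tokens = answer.lower().split()
    if not answer_tokens:
        return 0.0
    context_tokens = set(case.context.lower().split())
    return sum(t in context_tokens for t in answer_tokens) / len(answer_tokens)


Metric = Callable[[str, EvalCase], float]


def evaluate(model: Callable[[str, str], str],
             cases: List[EvalCase],
             metrics: Dict[str, Metric]) -> Dict[str, float]:
    """Run the model on every case and average each metric over the cases."""
    if not cases:
        raise ValueError("at least one evaluation case is required")
    totals = {name: 0.0 for name in metrics}
    for case in cases:
        answer = model(case.question, case.context)
        for name, metric in metrics.items():
            totals[name] += metric(answer, case)
    return {name: total / len(cases) for name, total in totals.items()}


def stub_model(question: str, context: str) -> str:
    """Hypothetical placeholder for an LLM call: returns the context's last word."""
    return context.split()[-1]


cases = [
    EvalCase("What is the capital of France?",
             "The capital of France is Paris", "Paris"),
    EvalCase("What color is the sky?",
             "On a clear day the sky is blue", "green"),
]
scores = evaluate(stub_model, cases,
                  {"exact_match": exact_match,
                   "context_overlap": context_overlap})
print(scores)
```

Keeping each metric as a plain function with the same signature means new metrics can be added to the dictionary without touching the loop, which is one simple way to keep such a pipeline modular.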

What Makes You a Fit

Academic Background:

Bachelor’s or Master’s degree in Computer Science, Data Science, Artificial Intelligence, or a related field.

Technical Expertise:

Minimum 6 years of hands-on experience in building, testing, or evaluating AI/ML systems, with a strong focus on LLMs or Generative AI applications.
Proficiency in Python, along with experience using ML/NLP libraries such as Hugging Face, LangChain, OpenAI SDK, or Cohere.
Experience in building evaluation pipelines or benchmarks for LLM performance across metrics like accuracy, robustness, safety, and hallucination.
Deep understanding of prompt engineering, retrieval-augmented generation (RAG), and agentic evaluation techniques.
Hands-on familiarity with evaluation tools such as Giskard, RAGAS, DeepEval, TruLens, LangSmith, Opik/Comet, or similar.
Working knowledge of vector databases like FAISS, Pinecone, or Weaviate, and embedding-based evaluation methods.
Experience with CI/CD pipelines, unit/integration testing for LLM apps, and model versioning for reproducibility.
Ability to define custom evaluation metrics tailored to specific use cases (e.g., RAG performance, guardrail compliance, hallucination detection).
Strong grasp of model instrumentation techniques for tracing/logging model behavior in real-world flows.

Extra Skills:

Experience in developing LLM-based applications such as chatbots, copilots, or RAG systems.
Exposure to designing or evaluating AI safety systems (e.g., jailbreaking prevention, content filters).
Open-source contributions to GenAI tooling or evaluation libraries.
Strong communication and documentation skills.
Comfort working in fast-paced, research-heavy environments.

Why You’ll Love It Here

Innovate with the Best Tech:
Work on groundbreaking projects using AI, GenAI, LLMs, and massive-scale data platforms. Tackle challenges that push the boundaries of innovation.
Impact Industry Giants:
Deliver business-critical solutions for Fortune 100 clients across Hi-tech, D2C, Healthcare, SaaS, and Retail. Partner with platforms like Databricks, Google Cloud, and Adobe to create high-impact products.
Collaborate with a World-Class Team:
Join exceptional professionals from IITs, IIMs, NITs, and global leaders like Walmart, Amazon, Accenture, and ZS. Learn, grow, and lead in a team that values expertise and collaboration.
Accelerate Your Growth:
Access our Centres of Excellence to upskill and work on industry-leading innovations. Your professional development is a top priority.
Work in a Culture of Excellence:
Be part of a dynamic workplace that fosters creativity, teamwork, and a passion for building transformative solutions. Your contributions will be recognized and celebrated.

About Our Leadership

Anuj Gupta –

Raghvendra Kushwah

Key Benefits

Competitive salary and performance-based bonus.
Comprehensive benefits package, including health insurance and flexible work hours.
Opportunities for professional development and career growth.

Location: Gurugram, Haryana, India

Application: Role Name.

Eucloid is an equal-opportunity employer. We celebrate diversity and are committed to creating an inclusive environment.
