Senior Software Test Engineer-GenAI Testing

5 - 9 years

10 - 20 Lacs

Posted: 1 week ago | Platform: Naukri

Work Mode

Hybrid

Job Type

Full Time

Job Description

Role & responsibilities

• Design and implement end-to-end QA strategies for Node.js applications integrated with LLMs, retrieval-augmented generation (RAG), and Agentic AI workflows.
• Establish comprehensive benchmarks and quality metrics for GenAI components, including accuracy, coherence, relevance, stability, and safety.
• Develop structured evaluation datasheets for LLM behaviour validation: test prompts, expected responses, classification criteria, and scoring rubrics.
• Perform data quality testing for RAG databases and ensure relevant, high-quality retrieval to minimize hallucinations and improve grounding.
• Conduct A/B testing across model versions, prompt designs, and system configurations to measure and compare output quality.
• Define methodologies and simulate non-deterministic behaviours using Agentic AI testing techniques.
• Collaborate closely with developers, product owners, and AI engineers to test prompt engineering pipelines, function-calling interfaces, and fallback logic.
• Build QA automation where applicable and integrate GenAI evaluations into CI/CD pipelines.
• Lead internal capability development by mentoring QA peers on GenAI testing practices and helping evolve the organization's AI quality maturity.

Preferred candidate profile

• 6+ years of experience in software quality assurance, with at least 3 years working in or around GenAI or LLM-based systems.
• Deep understanding of GenAI quality dimensions: response grounding, factual correctness, context awareness, and hallucination minimization.
• Experience creating and maintaining LLM evaluation datasets and designing test cases for dynamic prompt behaviour.
• Hands-on experience with tools and techniques for testing retrieval pipelines, embedding quality, and vector similarity results in RAG architectures.
• Familiarity with non-deterministic testing strategies, agent loop evaluation, and multi-step LLM task validation.
• Comfortable working with APIs, logs, test scripts, and tracing tools to validate both system and AI behaviour.
• Strong analytical thinking and a methodical approach to identifying bugs, regressions, and inconsistencies in AI outputs.

Kongsberg Software And Services

Industrial Machinery Manufacturing

Kongsberg Oslo
