Posted: 1 month ago
Hybrid
Full Time
Senior Software Engineer / LLM Ops Engineer

What You Will Do
- Design, implement, and maintain LLM operations workflows using tools like Langfuse to monitor performance, track usage, and create feedback loops for continuous improvement
- Develop and maintain infrastructure-as-code for AI deployments using Terraform and AWS services (Lambda, SQS, API Gateway, OpenSearch, CloudWatch)
- Build and enhance monitoring, logging, and alerting systems to ensure optimal performance and reliability of our LLM infrastructure
- Collaborate with AI engineers to design and implement evaluation frameworks (including LLM-as-judge systems) to measure and improve model performance
- Manage prompt versioning, testing, and deployment pipelines through CI/CD and custom tooling
- Implement and maintain security guardrails for LLM interactions, ensuring compliance with best practices
- Create comprehensive documentation for LLM operations, including runbooks for production incidents
- Participate in on-call rotations to support mission-critical AI systems
- Drive innovation in LLM operations by researching and implementing best practices and emerging tools in the rapidly evolving GenAI space
- Deep understanding of prompt engineering strategies

What You Will Bring
To succeed in this role, you will need a combination of experience, technology skills, personal qualities, and education.
Required Qualifications
- 3+ years of experience in DevOps, SRE, or similar roles, with at least 1 year specifically working with LLMs or AI systems in production
- Strong hands-on experience with AWS cloud services, particularly Bedrock, Lambda, SQS, API Gateway, OpenSearch, and CloudWatch
- Experience with infrastructure-as-code using Terraform, CloudFormation, or similar tools
- Proficiency in Python and experience building automation tooling and pipelines
- Familiarity with LangOps platforms such as Langfuse for LLM observability and evaluation
- Experience with CI/CD pipelines
- Knowledge of logging, monitoring, and alerting systems
- Understanding of security best practices for AI systems, including prompt injection mitigation techniques
- Excellent troubleshooting and problem-solving skills
- Strong communication skills and ability to work effectively with cross-functional teams
- Must be legally entitled to work in the country where the role is located

Preferred Qualifications
- Experience with prompt engineering and testing tools like Promptfoo
- Familiarity with vector databases and retrieval-augmented generation (RAG) systems
- Knowledge of serverless architectures and event-driven systems
- Experience with AWS Guardrails for LLM security
- Background in data engineering or machine learning operations
- Understanding of financial systems and data security requirements in the finance industry
- Familiarity with implementing technical solutions to meet compliance requirements outlined in SOC2, ISAE 3402, and ISO 27001
Xoriant
Pune, Maharashtra, India
Salary: Not disclosed