Posted: 6 days ago
Remote
Full Time
About Us
We’re an early-stage startup building LLM-native products that turn unstructured documents into intelligent, usable insights. We work with RAG pipelines, multi-cloud LLMs, and fast data processing, and we’re looking for someone who can build, deploy, and own these systems end-to-end.

Key Responsibilities:
- RAG Application Development: Design and build end-to-end Retrieval-Augmented Generation (RAG) pipelines using LLMs deployed on Vertex AI and Amazon Bedrock, integrated with Qdrant for vector search.
- OCR & Multimodal Data Extraction: Use OCR tools (e.g., Amazon Textract) and vision-language models (VLMs) to extract structured and unstructured data from PDFs, images, and multimodal content.
- LLM Orchestration & Agent Design: Build and optimize workflows using LangChain, LlamaIndex, and custom agent frameworks. Implement autonomous task execution using agent strategies such as ReAct, function calling, and tool-use APIs.
- API & Streaming Interfaces: Build and expose production-ready APIs (e.g., with FastAPI) for LLM services, and implement streaming outputs for real-time response generation and latency optimization.
- Data Pipelines & Retrieval: Develop pipelines for ingestion, chunking, embedding, and storage using Qdrant and PostgreSQL, applying hybrid retrieval techniques (dense + keyword search), rerankers, and GraphRAG.
- Serverless AI Workflows: Deploy serverless ML components (e.g., AWS Lambda, GCP Cloud Functions) for scalable inference and data processing.
- MLOps & Model Evaluation: Deploy, monitor, and iterate on AI systems with lightweight MLOps workflows (Docker, MLflow, CI/CD). Benchmark and evaluate embeddings, retrieval strategies, and model performance.

Qualifications:
- Strong Python development skills (must-have).
- LLMs: hands-on experience with Claude and Gemini models.
- Experience building AI agents and LLM-powered reasoning pipelines.
- Deep understanding of embeddings, vector search, and hybrid retrieval techniques.
- Experience with Qdrant.
- Experience designing multi-step task automation and execution chains.
- Streaming: ability to implement and debug LLM streaming and async flows.
- Knowledge of memory and context-management strategies for LLM agents (e.g., vector memory, scratchpad memory, episodic memory).
- Experience with AWS Lambda for serverless AI workflows and API integrations.
- Bonus: LLM fine-tuning, multimodal data processing, knowledge graph integration, or advanced AI planning techniques.
- Prior experience at startups only (not IT services or enterprises), and a short notice period.

Who You Are
- 2–4 years of real-world AI/ML experience, ideally with production LLM apps.
- Startup-ready: fast, hands-on, and comfortable with ambiguity.
- A clear communicator who can take ownership and push features end-to-end.
- Available to join immediately.

Why Join Us?
- Founding-level role with high ownership.
- Build systems from scratch using the latest AI stack.
- Fully remote, async-friendly, fast-paced team.
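To give candidates a concrete sense of the hybrid retrieval work described above (dense + keyword search), here is a deliberately simplified, dependency-free sketch. The scoring functions, the alpha blend weight, and the two-dimensional toy embeddings are illustrative assumptions, not a description of our production system, which uses Qdrant and real embedding models:

```python
import math

def dense_score(query_vec, doc_vec):
    """Cosine similarity between two embedding vectors (the 'dense' signal)."""
    dot = sum(q * d for q, d in zip(query_vec, doc_vec))
    norm = math.sqrt(sum(q * q for q in query_vec)) * math.sqrt(sum(d * d for d in doc_vec))
    return dot / norm if norm else 0.0

def keyword_score(query, doc):
    """Fraction of query terms present in the document (the 'keyword' signal)."""
    q_terms = set(query.lower().split())
    d_terms = set(doc.lower().split())
    return len(q_terms & d_terms) / len(q_terms) if q_terms else 0.0

def hybrid_rank(query, query_vec, corpus, alpha=0.5):
    """Blend dense and keyword scores and rank; corpus is [(text, embedding), ...]."""
    scored = []
    for text, vec in corpus:
        score = alpha * dense_score(query_vec, vec) + (1 - alpha) * keyword_score(query, text)
        scored.append((score, text))
    return [text for _, text in sorted(scored, reverse=True)]

# Toy corpus with hypothetical 2-d embeddings, purely for illustration.
corpus = [
    ("lease term and rent schedule", [1.0, 0.0]),
    ("force majeure clause", [0.0, 1.0]),
]
ranked = hybrid_rank("lease rent", [1.0, 0.0], corpus)
```

In production, the dense side would come from a vector store such as Qdrant and the keyword side from BM25 or PostgreSQL full-text search, with a reranker on top; the blending idea is the same.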
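The ingestion-and-chunking step in the data pipeline responsibilities can likewise be sketched minimally. This word-window chunker with overlap is a hedged illustration; the sizes are arbitrary defaults, and real pipelines would typically chunk by tokens or document structure before embedding:

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into overlapping word-window chunks ready for embedding.

    chunk_size and overlap are measured in words; both values are
    illustrative, not production settings.
    """
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the final window already covers the tail of the text
    return chunks

# Example: 10 words, windows of 4 with an overlap of 2.
sample = " ".join(f"w{i}" for i in range(10))
pieces = chunk_text(sample, chunk_size=4, overlap=2)
```

Overlap preserves context that would otherwise be cut at chunk boundaries, at the cost of some duplicated storage in the vector index.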
Bryckel AI
Salary: Not disclosed