Posted: 6 days ago
Remote
Full Time
About Us
We’re an early-stage startup building LLM-native products that turn unstructured documents into intelligent, usable insights. We work with RAG pipelines, multi-cloud LLMs, and fast data processing, and we’re looking for someone who can build, deploy, and own these systems end-to-end.

Key Responsibilities:
- RAG Application Development: Design and build end-to-end Retrieval-Augmented Generation (RAG) pipelines using LLMs deployed on Vertex AI and Amazon Bedrock, integrated with Qdrant for vector search.
- OCR & Multimodal Data Extraction: Use OCR tools (e.g., Amazon Textract) and vision-language models (VLMs) to extract structured and unstructured data from PDFs, images, and multimodal content.
- LLM Orchestration & Agent Design: Build and optimize workflows using LangChain, LlamaIndex, and custom agent frameworks. Implement autonomous task execution using agent strategies such as ReAct, function calling, and tool-use APIs.
- API & Streaming Interfaces: Build and expose production-ready APIs (e.g., with FastAPI) for LLM services, and implement streaming outputs for real-time response generation and latency optimization.
- Data Pipelines & Retrieval: Develop pipelines for ingestion, chunking, embedding, and storage using Qdrant and PostgreSQL, applying hybrid retrieval techniques (dense + keyword search), rerankers, and GraphRAG.
- Serverless AI Workflows: Deploy serverless ML components (e.g., AWS Lambda, GCP Cloud Functions) for scalable inference and data processing.
- MLOps & Model Evaluation: Deploy, monitor, and iterate on AI systems with lightweight MLOps workflows (Docker, MLflow, CI/CD). Benchmark and evaluate embeddings, retrieval strategies, and model performance.

Qualifications:
- Strong Python development skills (must-have).
- LLMs: hands-on experience with Claude and Gemini models.
- Experience building AI agents and LLM-powered reasoning pipelines.
- Deep understanding of embeddings, vector search, and hybrid retrieval techniques.
- Experience with Qdrant.
- Experience designing multi-step task automation and execution chains.
- Streaming: ability to implement and debug LLM streaming and async flows.
- Knowledge of memory and context-management strategies for LLM agents (e.g., vector memory, scratchpad memory, episodic memory).
- Experience with AWS Lambda for serverless AI workflows and API integrations.
- Bonus: LLM fine-tuning, multimodal data processing, knowledge graph integration, or advanced AI planning techniques.
- Prior experience at startups only (not IT services or enterprises), and a short notice period.

Who You Are
- 2–4 years of real-world AI/ML experience, ideally with production LLM apps.
- Startup-ready: fast, hands-on, and comfortable with ambiguity.
- A clear communicator who can take ownership and push features end-to-end.
- Available to join immediately.

Why Join Us?
- Founding-level role with high ownership.
- Build systems from scratch using the latest AI stack.
- Fully remote, async-friendly, fast-paced team.
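To give candidates a concrete sense of the hybrid retrieval work described above (dense + keyword search), here is a deliberately simplified, dependency-free sketch. The scoring functions, the alpha blend weight, and the two-dimensional toy embeddings are illustrative assumptions, not a description of our production system, which uses Qdrant and real embedding models:

```python
import math

def dense_score(query_vec, doc_vec):
    """Cosine similarity between two embedding vectors (the 'dense' signal)."""
    dot = sum(q * d for q, d in zip(query_vec, doc_vec))
    norm = math.sqrt(sum(q * q for q in query_vec)) * math.sqrt(sum(d * d for d in doc_vec))
    return dot / norm if norm else 0.0

def keyword_score(query, doc):
    """Fraction of query terms present in the document (the 'keyword' signal)."""
    q_terms = set(query.lower().split())
    d_terms = set(doc.lower().split())
    return len(q_terms & d_terms) / len(q_terms) if q_terms else 0.0

def hybrid_rank(query, query_vec, corpus, alpha=0.5):
    """Blend dense and keyword scores and rank; corpus is [(text, embedding), ...]."""
    scored = []
    for text, vec in corpus:
        score = alpha * dense_score(query_vec, vec) + (1 - alpha) * keyword_score(query, text)
        scored.append((score, text))
    return [text for _, text in sorted(scored, reverse=True)]

# Toy corpus with hypothetical 2-d embeddings, purely for illustration.
corpus = [
    ("lease term and rent schedule", [1.0, 0.0]),
    ("force majeure clause", [0.0, 1.0]),
]
ranked = hybrid_rank("lease rent", [1.0, 0.0], corpus)
```

In production, the dense side would come from a vector store such as Qdrant and the keyword side from BM25 or PostgreSQL full-text search, with a reranker on top; the blending idea is the same.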
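The ingestion-and-chunking step in the data pipeline responsibilities can likewise be sketched minimally. This word-window chunker with overlap is a hedged illustration; the sizes are arbitrary defaults, and real pipelines would typically chunk by tokens or document structure before embedding:

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into overlapping word-window chunks ready for embedding.

    chunk_size and overlap are measured in words; both values are
    illustrative, not production settings.
    """
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the final window already covers the tail of the text
    return chunks

# Example: 10 words, windows of 4 with an overlap of 2.
sample = " ".join(f"w{i}" for i in range(10))
pieces = chunk_text(sample, chunk_size=4, overlap=2)
```

Overlap preserves context that would otherwise be cut at chunk boundaries, at the cost of some duplicated storage in the vector index.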
Bryckel AI
Salary: Not disclosed