AION is building a next-generation AI cloud platform, transforming the future of high-performance computing (HPC) through its decentralized AI cloud. Purpose-built for bare-metal performance, AION democratizes access to compute power for AI training, fine-tuning, inference, data labeling, and beyond. By leveraging underutilized resources such as idle GPUs and data centers, AION provides a scalable, cost-effective, and sustainable solution tailored for developers, researchers, and enterprises. Led by high-pedigree founders with previous exits, AION is well-funded by major VCs and has strategic global partnerships. Headquartered in the US with a global presence, the company is building its initial core team in India.

Who You Are

You're an ML systems engineer who's passionate about building high-performance inference infrastructure. You don't need to be an expert in everything - this field is evolving too rapidly for that - but you have strong fundamentals and the curiosity to dive deep into optimization challenges. You thrive in early-stage environments where you'll learn cutting-edge techniques while building production systems. You think systematically about performance bottlenecks and are excited to push the boundaries of what's possible in AI infrastructure.

Key Responsibilities

Build and optimize LLM inference systems, working towards 2-4x performance improvements over standard frameworks like vLLM and TensorRT-LLM
Implement modern inference optimizations including KV-cache management, dynamic batching, speculative decoding, and compression and quantization strategies
Develop GPU optimization solutions using CUDA, with opportunities to learn advanced techniques like Triton kernel development and CUDA graphs
Design model evaluation and benchmarking systems to assess performance across reasoning, coding, and safety metrics
Contribute to training and fine-tuning infrastructure supporting distributed workloads and RLHF pipeline development
Research and integrate trending open-source models (DeepSeek R1, Qwen 3, Llama 4, Mistral variants) with optimized configurations
Build performance monitoring and profiling tools for GPU cluster analysis, bottleneck identification, and cost optimization
Create cost-performance optimization strategies that balance throughput, latency, and infrastructure costs
Explore agent orchestration capabilities for multi-step reasoning and tool integration workflows
Collaborate with tech and product teams to identify optimization opportunities and translate them into production improvements

Requirements

High-agency individual looking to own and influence product architecture and company direction
5+ years of software engineering experience with a focus on performance-critical systems and production deployments
Strong Python expertise and working knowledge of C++ for performance optimization
Working understanding of deep learning fundamentals including transformer architectures, attention mechanisms, and neural network training/inference
Hands-on experience with PyTorch including model development, training loops, and basic distributed computing concepts
Basic GPU programming experience with CUDA or willingness to quickly learn GPU optimization techniques
Experience with at least one modern inference framework (vLLM, TensorRT-LLM, SGLang or similar) in a production setting
Understanding of distributed systems concepts including load balancing, auto-scaling, and fault tolerance
Strong debugging and performance profiling skills for identifying and resolving system bottlenecks

Benefits

Join the ground floor of a mission-driven AI startup revolutionizing compute infrastructure
Work with a high-caliber, globally distributed team backed by major VCs
Competitive compensation and benefits
Fast-paced, flexible work environment with room for ownership and impact
Hybrid model: 3 days in-office and 2 days remote, with flexibility to work remotely for part of the year
