Remote
Full Time
Scry AI is a research-led enterprise AI company that builds intelligent platforms to drive efficiency, insight, and compliance. Our platforms Collatio®, Auriga®, and Concentio® streamline complex workflows by automating data extraction, validation, and reconciliation, and by delivering real-time intelligence.
We are seeking a DevOps Manager to lead our infrastructure, CI/CD, and reliability practices across cloud and on-prem deployments. You will own uptime, performance, security, and cost efficiency for AI/ML workloads powering Collatio®, Auriga®, and Concentio®.
As DevOps Manager, you will lead a small team of DevOps/SRE engineers to design, automate, and operate secure, compliant, and highly available platforms across AWS/Azure/GCP and customer on-prem environments. You will standardize IaC, improve CI/CD velocity, build robust observability, and enable GPU-accelerated AI inference at scale for enterprise clients.
Platform Reliability & Operations
• Own SLOs/SLIs, availability, latency, and capacity planning across services.
• Lead incident response, root-cause analysis, postmortems, and on-call processes.
• Implement backup, disaster recovery, and business continuity for multi-region and on-prem deployments.
• Architect Kubernetes platforms (managed and self-hosted), including RBAC, network policies, and secrets management.
• Standardize infrastructure with Terraform, Helm, and GitOps (Argo CD) for repeatable customer deployments.
• Support Concentio® edge/IoT rollouts with secure remote updates and telemetry pipelines.
• Enable GPU scheduling and drivers (CUDA, NVIDIA), inference runtimes (Triton), and model packaging.
• Build MLOps foundations (MLflow, feature stores) and artifact/version governance.
• Operate data services (Kafka, PostgreSQL, Redis, MinIO/S3, Elasticsearch/OpenSearch) for high-throughput pipelines.
• Own CI/CD with GitHub Actions/GitLab CI/Jenkins; establish trunk-based development, automated testing, and canary/blue-green releases.
• Maintain internal developer platforms, templates, and golden paths to improve delivery speed and quality.
• Implement least-privilege access, SSO (Okta/AAD), Vault-based secrets, image scanning (Trivy), and policy as code.
• Ensure alignment with SOC 2, ISO 27001, and HIPAA/GDPR, backed by audit trails and immutable logs.
• Build end-to-end observability using Prometheus, Grafana, Loki/EFK, and OpenTelemetry.
• Track cloud spend, rightsize resources, and negotiate quotas for GPU/compute.
• Partner with Product, Data Science, and Customer Success to plan capacity for new features and enterprise go-lives.
• Strong Kubernetes expertise (production operations, networking, security, Helm, GitOps).
• Proven IaC experience with Terraform and configuration management (Ansible).
• CI/CD at scale with GitHub Actions/GitLab CI/Jenkins; artifact registries and SBOMs.
• Observability: Prometheus, Grafana, ELK/EFK or Loki, alerting and runbooks.
• Cloud proficiency in at least one major provider (AWS/Azure/GCP) and Linux fundamentals.
• Security fundamentals: network segmentation, TLS, secrets management, container hardening.
• Experience running data/streaming systems (Kafka, Redis, PostgreSQL) in production.
• Excellent communication, incident leadership, and stakeholder management.
• GPU orchestration, Triton Inference Server, Hugging Face model serving.
• Service mesh (Istio/Linkerd), API gateways, and zero-trust patterns.
• MLOps tooling (MLflow, Feast), Airflow, dbt.
• Compliance implementations for regulated industries (BFSI, healthcare).
• Certifications: CKA/CKAD, AWS/Azure/GCP Architect, Security+.
• Drives reliability with automation, not toil.
• Balances speed and safety with measurable delivery improvements.
• Thrives in customer-facing, hybrid cloud, and on-prem environments.
• Coaches teams with clear standards, runbooks, and continuous improvement.
If you want to build secure, high-performance platforms for real-world AI at enterprise scale, follow our page for more relevant job openings.
Scry AI