DevOps Engineer

4 - 9 years

15 - 20 Lacs

Posted: 10 hours ago | Platform: Naukri

Work Mode

Work from Office

Job Type

Full Time

Job Description

OVERVIEW:

The DevOps & ML Ops Engineer is responsible for developing and maintaining scalable, stable services that deliver machine learning models to end users with guaranteed uptime. The primary focus will be on the infrastructure, deployment, and continuous integration/continuous delivery (CI/CD) processes for our ML services.

DESCRIPTION:

  • Manage resource allocation and workload scheduling for multiple DevOps services, ensuring efficient utilization of CPU/GPU resources and creating reliable queues based on service priorities.
  • Maintain VM environments, manage OS updates, and keep the VM inventory up to date.
  • Work alongside the Dev and QA teams to detect hot spots in our applications and put preventative measures in place before they become live issues.
  • Troubleshoot and provide solutions for system configurations.
  • Plan, execute, and test disaster recovery.
  • Monitor and examine all application, performance, event, and system logs to assist in troubleshooting.
  • File all IT/colocation tickets, ensure requests are fulfilled, and escalate to the right person if necessary.
  • Design, develop, and maintain the infrastructure required for deploying and scaling machine learning services.
  • Implement and manage the CI/CD pipelines to ensure seamless and efficient deployment of ML models.
  • Collaborate with data scientists, ML researchers, and language experts to understand the requirements for deploying ML models and provide the necessary infrastructure support.
  • Automate and streamline the build, test, and deployment processes to enhance efficiency and reduce time-to-market.
  • Monitor and optimize the performance, availability, and scalability of production ML systems.
  • Develop and maintain robust monitoring, logging, and alerting systems to proactively identify and address issues.
  • Implement security best practices to protect sensitive data and ensure compliance with relevant regulations.
  • Stay up to date with industry trends and emerging technologies related to DevOps and ML Ops, and propose innovative solutions to improve our service delivery.
  • Complete all other tasks that are deemed appropriate for this role and assigned by the manager/supervisor.

REQUIRED SKILLS:

  • Strong knowledge of cloud platforms (such as AWS, Azure, or GCP) and local cluster deployments, and experience in deploying and managing ML services on these platforms.
  • Knowledge of distributed computing frameworks (e.g., Spark) and big data technologies (e.g., Hadoop, Kafka).
  • Proficiency in Python, Shell, Ruby, Golang, or C++ and experience with infrastructure-as-code tools (e.g., Terraform, CloudFormation).
  • Hands-on experience with containerization technologies (e.g., Docker) and orchestration frameworks (e.g., Kubernetes).
  • Familiarity with CI/CD tools (e.g., Jenkins, GitLab CI/CD) and version control systems (e.g., Git).
  • Solid understanding of networking, security, and system administration concepts.
  • Strong problem-solving and troubleshooting skills, with the ability to quickly analyze and resolve issues in complex ML systems.
  • Excellent communication and collaboration skills, with the ability to work effectively in a team-oriented environment.

REQUIRED EXPERIENCE AND QUALIFICATIONS:

  • Bachelor's or higher degree in Computer Science, Engineering, or a related field.
  • Proven experience as a DevOps Engineer, ML Engineer, or in a similar role, with a focus on deploying and maintaining machine learning models in production environments.

DESIRED SKILLS AND EXPERIENCE:

  • Experience with machine learning frameworks and libraries, such as TensorFlow, PyTorch, or scikit-learn.
  • Familiarity with serverless computing and event-driven architectures.
  • Experience with logging and monitoring tools (e.g., ELK Stack, Prometheus, Grafana).
  • Understanding of software development methodologies and agile practices.

TransPerfect

Translation and Localization

New York, NY

5001-10000 Employees

Key People

  • Phil Shawe, CEO and Co-Founder
  • Liz Elting, Co-Founder and Former CEO
