3.0 - 8.0 years
3 - 12 Lacs
Hyderabad / Secunderabad, Telangana, India
On-site
The role is responsible for the design, integration, and management of high-performance computing (HPC) systems, encompassing both hardware and software, within the organization's network infrastructure. This individual manages system administration and supports business platforms while incorporating new technologies in a sophisticated, evolving technology landscape. The role ensures seamless system integration to meet organizational requirements.

Roles & Responsibilities:
Implement and manage cloud-based infrastructure supporting HPC environments for data science (e.g., AI/ML workflows, image analysis)
Collaborate with data scientists and ML engineers to deploy scalable machine learning models into production
Ensure security, scalability, and reliability of HPC systems in the cloud
Optimize cloud resources for cost-effectiveness and efficiency
Stay updated on the latest cloud services and industry best practices
Provide technical leadership and guidance on cloud and HPC systems management
Develop and maintain CI/CD pipelines for multi-cloud resource deployment
Monitor and troubleshoot cluster operations, applications, and cloud environments
Document system design and operational procedures

What We Expect of You
We value diverse talents united by the goal of serving patients. We seek a professional with the following qualifications:

Basic Qualifications:
Master's degree with 4-6 years of hands-on HPC administration experience in Computer Science, IT, or related field, OR
Bachelor's degree with 6-8 years of hands-on HPC administration experience, OR
Diploma with 10-12 years of hands-on HPC administration experience
Demonstrated expertise in cloud computing (preferably AWS) and cloud architecture
Experience with containerization (Singularity, Docker) and cloud HPC solutions
Proficiency with infrastructure-as-code (IaC) tools like Terraform, CloudFormation, Packer, Ansible, Git
Expert scripting skills (Python or Bash) and Linux/Unix system administration (Red Hat or Ubuntu preferred)
Experience with job scheduling/resource management tools (SLURM, PBS, LSF)
Knowledge of storage architectures and distributed file systems (Lustre, GPFS, Ceph)
Understanding of networking architecture and security best practices

Preferred Qualifications:
Experience supporting healthcare life sciences research
Experience with Kubernetes (EKS) and service mesh architectures
Knowledge of AWS Lambda and event-driven architectures
Exposure to multi-cloud environments (Azure, GCP)
Familiarity with ML frameworks (TensorFlow, PyTorch) and data pipelines
Cloud certifications (AWS Certified Solutions Architect, Google Cloud Professional Cloud Architect)
Experience in Agile development environments
Experience with distributed computing and big data technologies (Hadoop, Spark)

Professional Certifications:
Red Hat Certified Engineer (RHCE) or Linux Professional Institute Certification (LPIC) (preferred)
AWS Certified Solutions Architect Associate or Professional (preferred)

Soft Skills:
Strong analytical and problem-solving skills
Effective communication and collaboration with global, virtual, and cross-functional teams
Ability to work in fast-paced, cloud-first environments
Posted 2 weeks ago
3 - 8 years
10 - 19 Lacs
Chennai
Work from Office
Role & responsibilities:
Design, implementation, and support of high-performance compute clusters.
Solid knowledge of HPC systems, including CPU/GPU architecture, scalable and robust storage, high-bandwidth interconnects, and cloud-based computing architectures.
Apply attention to detail to generate HW BOMs for the HPC clusters, provide vendor management, and oversee HW release activities.
Use strong Linux OS skills to configure appropriate operating systems for the HPC system.
Understand and assemble the project specifications and performance requirements at the subsystem and system levels.
Adhere to and drive project timelines to ensure program milestones are completed on time.
Support design and release of new products to manufacturing and ultimately the customer, providing quality golden images, procedures, scripts, and documentation to the manufacturing team and customer support team.

Validated in-depth and flavor-agnostic knowledge of Linux systems (SuSE, RedHat, Rocky, Ubuntu).
Experience crafting and maintaining robust storage.
Strong HPC HW knowledge, especially in the server, GPU, networking, storage, BIOS, and BMC arenas.
Experience with systemd, network boot (PXE), and Linux HA.
Strong understanding of TCP/IP fundamentals and knowledge of protocols: DNS, DHCP, HTTP, LDAP, SMTP.
Ability to code and develop Shell and Python scripts.
Experience with one or more configuration management utilities (Salt, Chef, Puppet, etc.).

Preferred candidate profile:
Strong DevOps focus: knowledge of setting up a continuous development pipeline (Jenkins), repository software (Git-based), and Singularity and Docker containers.
Kubernetes, Prometheus, and Grafana experience.
Knowledge of Apache/Nginx, setting up proxy/reverse proxy, application server routing, and load balancing (HAProxy).
BS or MS degree plus 3 to 5 years of validated experience in Computer Engineering, Electrical Engineering, or related fields.

Team Orientation & Interpersonal: Highly motivated teammate with the ability to develop and maintain collaborative relationships with all levels within and external to the organization.
Organization & Time Management: Able to plan, schedule, organize, and follow up on tasks related to the job to achieve goals within or ahead of established time frames.
Multi-tasking: Ability to expeditiously organize, coordinate, manage, prioritize, and perform multiple tasks simultaneously; to swiftly assess a situation, determine a logical course of action, and apply the appropriate response.
Adaptability to Change: Able to be flexible and supportive, and able to assimilate change positively and proactively in a rapid-growth environment.
Outstanding teammate with excellent written and verbal communication skills.

Education: Doctorate (Academic) Degree and 0 years related work experience; Master's Level Degree and related work experience of 3 years; Bachelor's Level Degree and related work experience of 5 years
Posted 1 month ago
8 - 13 years
25 - 40 Lacs
Chennai
Hybrid
Role & responsibilities:
Design, implementation, and support of high-performance compute clusters.
Solid knowledge of HPC systems, including CPU/GPU architecture, scalable and robust storage, high-bandwidth interconnects, and cloud-based computing architectures.
Apply attention to detail to generate HW BOMs for the HPC clusters, provide vendor management, and oversee HW release activities.
Use strong Linux OS skills to configure appropriate operating systems for the HPC system.
Understand and assemble the project specifications and performance requirements at the subsystem and system levels.
Adhere to and drive project timelines to ensure program milestones are completed on time.
Support design and release of new products to manufacturing and ultimately the customer, providing quality golden images, procedures, scripts, and documentation to the manufacturing team and customer support team.

Required Qualifications:
Validated in-depth and flavor-agnostic knowledge of Linux systems (SuSE, RedHat, Rocky, Ubuntu).
Experience crafting and maintaining robust storage.
Strong HPC HW knowledge, especially in the server, GPU, networking, storage, BIOS, and BMC arenas.
Experience with systemd, network boot (PXE), and Linux HA.
Strong understanding of TCP/IP fundamentals and knowledge of protocols: DNS, DHCP, HTTP, LDAP, SMTP.
Ability to code and develop Shell and Python scripts.
Experience with one or more configuration management utilities (Salt, Chef, Puppet, etc.).

Preferred Qualifications:
Strong DevOps focus: knowledge of setting up a continuous development pipeline (Jenkins), repository software (Git-based), and Singularity and Docker containers.
Kubernetes, Prometheus, and Grafana experience.
Knowledge of Apache/Nginx, setting up proxy/reverse proxy, application server routing, and load balancing (HAProxy).
BS or MS degree plus 3 to 5 years of validated experience in Computer Engineering, Electrical Engineering, or related fields.

Skills and Abilities:
Team Orientation & Interpersonal: Highly motivated teammate with the ability to develop and maintain collaborative relationships with all levels within and external to the organization.
Organization & Time Management: Able to plan, schedule, organize, and follow up on tasks related to the job to achieve goals within or ahead of established time frames.
Multi-tasking: Ability to expeditiously organize, coordinate, manage, prioritize, and perform multiple tasks simultaneously; to swiftly assess a situation, determine a logical course of action, and apply the appropriate response.
Adaptability to Change: Able to be flexible and supportive, and able to assimilate change positively and proactively in a rapid-growth environment.
Outstanding teammate with excellent written and verbal communication skills.

Qualifications: Doctorate (Academic) Degree and 0 years related work experience; Master's Level Degree and related work experience of 3 years; Bachelor's Level Degree and related work experience of 5 years

Perks and benefits: Excellent benefits.
Posted 1 month ago