
290 Monitoring Tools Jobs - Page 5

JobPe aggregates results for easy application access, but you actually apply on the job portal directly.

4.0 - 8.0 years

8 - 14 Lacs

Bengaluru

Work from Office


Job Description: We are actively seeking an experienced Server Management Administrator (Linux with RedHat & Networking Troubleshooting) to join our team.

Job Title: Server Management Administrator (Linux with RedHat & Networking Troubleshooting)
Experience: 4 to 8 Years
Location: Pan India
Notice Period: Immediate Joiners

Job Summary: We are hiring a Server Management Administrator with expertise in Linux (RedHat) and Networking Troubleshooting to manage and maintain IT infrastructure. The role involves server administration, networking, security, and performance monitoring to ensure smooth operations.

Key Responsibilities:
- Linux Server Administration: Install, configure, and maintain RedHat-based servers.
- Networking Troubleshooting: Resolve network issues (DNS, DHCP, TCP/IP, Firewalls, VPNs, Load Balancers).
- System Monitoring & Security: Monitor performance, apply patches, and enhance security.
- User & Access Management: Manage accounts, roles, and permissions.
- Backup & Recovery: Implement data backup and recovery strategies.
- Virtualization & Cloud Exposure (Preferred): Experience with AWS, Azure, or GCP is a plus.
- Incident & Problem Management: Collaborate with IT teams to resolve issues.

Required Skills:
- Strong experience in RedHat Linux administration.
- Hands-on experience in network troubleshooting.
- Knowledge of security & compliance standards.
- Familiarity with automation tools (Ansible, Puppet, Shell scripting preferred).
- Exposure to virtualization (VMware, KVM a plus).

Qualifications:
- Bachelor's degree in IT, Computer Science, or related field.
- Preferred Certifications: RHCE, RHCSA, CCNA.
- Immediate joiners preferred. Apply now if you have the required skills.

Posted 2 weeks ago


7.0 - 12.0 years

4 - 8 Lacs

Hyderabad

Work from Office


As part of the ETG Product Ops team, take ownership and proactively lead the Ops team (L2 and L3 teams) as a Tech Leader on resolving L2/L3 support issues for all the ETG Products (Technology Workflows, AIX, DevX).
- Ensure all incidents and requests are tracked and addressed in a timely manner with a sense of urgency, or escalated to the appropriate ETG Engineering or Product teams when needed.
- Track key performance metrics using the ETG and DT Ops dashboard (SLAs for response time, resolution time, customer satisfaction) and ensure all SLA metrics are met.
- Keep ETG Leadership regularly informed with key updates, SLAs, and performance metrics.
- Work closely with ETG Engineering teams to understand upcoming feature or product releases, and train L2/L3 Operations teams in those areas.
- Analyze recurring support issues and customer feedback with ETG Product teams for potential feature improvements.
- Ensure smooth communication with other teams in DT or ServiceNow on product support needs and goals.
- Drive continuous improvement and lead strategic initiatives to refine operational efficiencies and reinforce customer trust.

To be successful in this role you have:
- Experience in leveraging or critically thinking about how to integrate AI into work processes, decision-making, or problem-solving. This may include using AI-powered tools, automating workflows, analyzing AI-driven insights, or exploring AI's potential impact on the function or industry.
- 7+ years of experience working on the ServiceNow platform (CSA, CAD certifications preferred).
- Expertise in writing JavaScript and ServiceNow scripting, and problem-solving skills.
- Experience working with Virtual Agent, AI Search, and Conversational Interfaces highly preferred.
- Ability to work in a fast-paced and dynamic environment with a sense of urgency towards resolving issues, a growth mindset, and an interest in learning and upskilling.
- Experience with monitoring tools, dashboards, and analytics.
- Demonstrated aptitude for learning new technologies quickly.
- Experience with AI/ML and automation in product operations is preferred.
- Strong interpersonal skills, customer-centric attitude, and ability to deal with cultural diversity.
- Strong communication skills (both written and verbal).
- Knowledge of industry best practices in product support and operations.

Posted 2 weeks ago


2.0 - 7.0 years

1 - 2 Lacs

Chennai

Work from Office


Job Description
- Monitoring the client's entire infrastructure using monitoring tools such as SCOM, SolarWinds, Telegraph, and OEM.
- Monitoring various types of alerts: CPU utilization, memory utilization, database-related alerts, DR replication issues, backup failure alerts, Exchange mail queue threshold alerts, service mailbox quota breach alerts, Adobe Experience Manager / Site 24/7 alerts, and application URL alerting.
- Scheduling maintenance mode for planned activity.
- Daily repeat CI analysis of events/alerts/incidents and raising proactive problem tickets, which helps reduce major incidents.
- Handling major incidents, driving the major incident bridge, and sending communication about major incidents to stakeholders.
- CMDB inventory management: onboarding and offboarding of devices as they are commissioned/decommissioned.
- Coordinating with the service provider for MPLS-related outages.
- Daily follow-ups with regional and internal teams to ensure all nodes are up and running.

Candidate Skills: Good communication
Salary: Up to 4.5 LPA
Notice Period: Immediate Joiner
Experience: Minimum 2 years relevant
Job Category: NOC Engineer
Job Type: Full Time
Job Location: Chennai

Posted 2 weeks ago


5.0 - 9.0 years

7 - 11 Lacs

Bengaluru

Work from Office


Client: Societe Generale
Roles: L2 & L3
Experience: 5-7 Years & 7-9 Years
Skills: Citrix Admin + NetScaler
Location: Bangalore
Budget: Citrix L3 (7-9 Yrs) 15 LPA/16 LPA; Citrix L2 (5-7 Yrs) 9 LPA/10 LPA

Specialist Systems Engineer - Citrix Administrator
- Responsible for handling Incident and Request Management.
- Participate in Change and Problem Management.
- Monitor the infra proactively and fix issues before users report them.
- Provide support for L1 engineers for incident investigation, diagnosis and resolution.
- Ensure resolution of most of the incidents and service requests.
- Provide input to Level 3/4 for problem management and resolution of major or elevated incidents.
- Provide required inputs to stakeholders involved in case of critical incidents like outages.
- Raise change requests where required.
- Implement standard and minor changes.
- Ensure ITIL compliance for all incidents and service calls.
- Ensure KPI compliance for all incidents and service calls.
- Adhere to the documented notification and escalation process.
- Communicate with the customer while responding to a case and after resolution of the case.
- Participate in regular reviews with the team leads.
- Update daily reports and checklists as defined.
- Create and update documentation.
- Perform daily health checks on the Citrix environment to ensure service availability.
- Administration of XenApp 7.15 and above.
- Support Virtual Desktop Infrastructure hosted on XenDesktop.
- Maintain Citrix XenApp, VDIs and other core Citrix components.
- Administration of Ivanti User Workspace Manager.
- Application installation and publication in the Citrix environment.
- Act on alerts received through SCOM, Zabbix, Qlik, Citrix Director or other monitoring tools.
- Basic troubleshooting of Windows Server 2012 and 2016.
- Strong multi-tasking and organizational skills; ability to prioritize simultaneous high-visibility customer and internal escalations.
- Should have at least basic scripting knowledge and be able to automate recurring manual tasks.

Sl. No | Vendor Name | Candidate First Name | Candidate Last Name | Contact Number | Skill | Monthly PO Rate in INR | Candidate Current Location | Notice Period | Interview Status (completed/awaited) | Feedback (pending/rejected/shortlisted) | Comments

Posted 2 weeks ago


5.0 - 8.0 years

6 - 10 Lacs

Bengaluru

Work from Office


Specialist Systems Engineer - Citrix Administrator
- Responsible for handling Incident and Request Management.
- Participate in Change and Problem Management.
- Monitor the infra proactively and fix issues before users report them.
- Provide support for L1 engineers for incident investigation, diagnosis and resolution.
- Ensure resolution of most of the incidents and service requests.
- Provide input to Level 3/4 for problem management and resolution of major or elevated incidents.
- Provide required inputs to stakeholders involved in case of critical incidents like outages.
- Raise change requests where required.
- Implement standard and minor changes.
- Ensure ITIL compliance for all incidents and service calls.
- Ensure KPI compliance for all incidents and service calls.
- Adhere to the documented notification and escalation process.
- Communicate with the customer while responding to a case and after resolution of the case.
- Participate in regular reviews with the team leads.
- Update daily reports and checklists as defined.
- Create and update documentation.
- Perform daily health checks on the Citrix environment to ensure service availability.
- Administration of XenApp 7.15 and above.
- Support Virtual Desktop Infrastructure hosted on XenDesktop.
- Maintain Citrix XenApp, VDIs and other core Citrix components.
- Administration of Ivanti User Workspace Manager.
- Application installation and publication in the Citrix environment.
- Act on alerts received through SCOM, Zabbix, Qlik, Citrix Director or other monitoring tools.
- Basic troubleshooting of Windows Server 2012 and 2016.
- Strong multi-tasking and organizational skills; ability to prioritize simultaneous high-visibility customer and internal escalations.
- Should have at least basic scripting knowledge and be able to automate recurring manual tasks.

Posted 2 weeks ago


4.0 - 7.0 years

5 - 9 Lacs

Bengaluru

Work from Office


Specialist Systems Engineer - Citrix Administrator

Missions:
- Responsible for handling Incident and Request Management.
- Participate in Change and Problem Management.
- Monitor the infra proactively and fix issues before users report them.
- Provide support for L1 engineers for incident investigation, diagnosis and resolution.
- Ensure resolution of most of the incidents and service requests.
- Provide input to Level 3/4 for problem management and resolution of major or elevated incidents.
- Provide required inputs to stakeholders involved in case of critical incidents like outages.
- Raise change requests where required.
- Implement standard and minor changes.
- Ensure ITIL compliance for all incidents and service calls.
- Ensure KPI compliance for all incidents and service calls.
- Adhere to the documented notification and escalation process.
- Communicate with the customer while responding to a case and after resolution of the case.
- Participate in regular reviews with the team leads.
- Update daily reports and checklists as defined.
- Create and update documentation.
- Perform daily health checks on the Citrix environment to ensure service availability.
- Administration of XenApp 7.15 and above.
- Support Virtual Desktop Infrastructure hosted on XenDesktop.
- Maintain Citrix XenApp, VDIs and other core Citrix components.
- Administration of Ivanti User Workspace Manager.
- Application installation and publication in the Citrix environment.
- Act on alerts received through SCOM, Zabbix, Qlik, Citrix Director or other monitoring tools.
- Basic troubleshooting of Windows Server 2012 and 2016.
- Strong multi-tasking and organizational skills; ability to prioritize simultaneous high-visibility customer and internal escalations.
- Should have at least basic scripting knowledge and be able to automate recurring manual tasks.

Posted 2 weeks ago


2.0 - 3.0 years

7 - 10 Lacs

Hyderabad

Work from Office


AI Ops/Monitoring Specialist openings at Advantum Health Pvt Ltd, Hyderabad.

Overview: We're seeking an AI Ops/Monitoring Specialist to ensure the stability, transparency, and performance of AI systems in production. You will monitor, log, and troubleshoot AI and RPA models to ensure continuous reliability and compliance.

Key Responsibilities:
- Monitor AI model health (drift, performance, latency, bias).
- Build dashboards and alerts using tools like Prometheus, Grafana, or Datadog.
- Establish SLAs and SLOs for AI/RPA models and pipelines.
- Collaborate with AI teams to integrate observability into model lifecycles.
- Document anomalies and assist in root cause analysis and mitigation.

Qualifications:
- Bachelor's in Data Science, IT, or a related field.
- 2+ years in systems monitoring, SRE, or MLOps.
- Experience with model monitoring tools (e.g., MLflow, Arize, WhyLabs).
- Familiarity with AI/ML lifecycles and performance metrics.
- Background in healthcare or compliance-heavy environments is ideal.

Ph: 9177078628
Email id: jobs@advantumhealth.com
Address: Advantum Health Private Limited, Cyber Gateway, Block C, 4th Floor, Hitech City, Hyderabad.

Follow us on LinkedIn, Facebook, Instagram, YouTube and Threads:
Advantum Health LinkedIn Page: https://lnkd.in/gVcQAXK3
Advantum Health Facebook Page: https://lnkd.in/g7ARQ378
Advantum Health Instagram Page: https://lnkd.in/gtQnB_Gc
Advantum Health India YouTube link: https://lnkd.in/g_AxPaPp
Advantum Health Threads link: https://lnkd.in/gyq73iQ6

Posted 2 weeks ago


6.0 - 8.0 years

8 - 14 Lacs

Chennai

Work from Office


Must-have skills:
- 6-8 years of DevOps/SRE/IaC experience; AWS and scripting are a must (Python/Ansible or Terraform).
- Infrastructure/cloud/containers/automation/monitoring/support.

Requirements:
- Four-year university degree or college diploma in the field of computer science and/or 7-8 years equivalent work experience.
- Experience using Prometheus, Datadog, Splunk, and Grafana would be preferred.
- Experienced using MySQL and cloud-based relational database solutions like Cloud SQL, database replication, and scalability.
- Experienced with load balancing (ELB/ILB), reverse proxies, CDNs, etc.
- Experienced using Python/Bash.
- Experience with Jenkins CI/CD automation.
- Experience with designing, building, and implementing full end-to-end SDLC, automating infrastructure.
- Experienced running big infrastructure platforms at scale.
- Experienced with Kubernetes, maintenance and deployment of services.
- Experienced with AWS/GCP cloud environments.
- Minimum of 6 years of experience in the DevOps infrastructure field.
- Ability to make sound and logical judgments.
- Demonstrated personnel/project management skills.
- Good understanding of the organization's goals and objectives.
- Strong interpersonal, written, and oral communication skills in English.

Posted 2 weeks ago


6.0 - 8.0 years

8 - 14 Lacs

Bengaluru

Work from Office


Must-have skills:
- 6-8 years of DevOps/SRE/IaC experience; AWS and scripting are a must (Python/Ansible or Terraform).
- Infrastructure/cloud/containers/automation/monitoring/support.

Requirements:
- Four-year university degree or college diploma in the field of computer science and/or 7-8 years equivalent work experience.
- Experience using Prometheus, Datadog, Splunk, and Grafana would be preferred.
- Experienced using MySQL and cloud-based relational database solutions like Cloud SQL, database replication, and scalability.
- Experienced with load balancing (ELB/ILB), reverse proxies, CDNs, etc.
- Experienced using Python/Bash.
- Experience with Jenkins CI/CD automation.
- Experience with designing, building, and implementing full end-to-end SDLC, automating infrastructure.
- Experienced running big infrastructure platforms at scale.
- Experienced with Kubernetes, maintenance and deployment of services.
- Experienced with AWS/GCP cloud environments.
- Minimum of 6 years of experience in the DevOps infrastructure field.
- Ability to make sound and logical judgments.
- Demonstrated personnel/project management skills.
- Good understanding of the organization's goals and objectives.
- Strong interpersonal, written, and oral communication skills in English.

Posted 2 weeks ago


7.0 - 9.0 years

8 - 14 Lacs

Hyderabad

Work from Office


As a Server Management Administrator (Linux) at ZettaMine Labs, you will be responsible for the comprehensive management, maintenance, and support of Linux-based server infrastructure across a portfolio of client projects. You will utilize your extensive experience in Linux administration, virtualization technologies, networking fundamentals, cloud platforms, and automation tools to ensure the high availability, performance, and security of our clients' server environments. This role requires a strong technical foundation, excellent problem-solving abilities, and the capacity to manage multiple tasks effectively in a fast-paced environment.

Responsibilities:
- Provide expert-level administration, configuration, and troubleshooting for a variety of Linux distributions including RHEL, SUSE, Ubuntu, and CentOS.
- Implement and manage user and group accounts, permissions, and security policies.
- Perform advanced system tuning and optimization for performance and stability.
- Manage and troubleshoot file systems, storage solutions (local and network), and logical volume management (LVM).
- Implement and manage system security hardening based on industry best practices and client requirements.
- Plan, execute, and document the installation, configuration, and deployment of new Linux servers (physical and virtual).
- Perform routine server maintenance tasks, including patching, upgrades, and configuration changes.
- Monitor server health and performance using various tools and proactively address potential issues.
- Manage the decommissioning and secure disposal of end-of-life servers.
- Demonstrate extensive hands-on experience in managing and troubleshooting virtualized environments using VMware (vSphere, ESXi), XEN, or other relevant hypervisors.
- Provision, clone, migrate, and manage virtual machines efficiently.
- Monitor and optimize the performance of virtualized infrastructure.
- Troubleshoot issues related to virtual networking and storage.
- Possess a strong understanding of TCP/IP networking, DNS, DHCP, routing, VLANs, and load balancing concepts.
- Configure and manage software firewalls (iptables, firewalld) and implement network security policies.
- Troubleshoot network connectivity issues at the server level and collaborate with network engineers as needed.
- Implement and manage VPN connections and other secure communication protocols.
- Demonstrate practical experience in deploying, managing, and troubleshooting Linux server infrastructure on major cloud platforms such as AWS, Azure, and/or GCP.
- Utilize cloud-native services for server management, monitoring, and security.
- Implement and manage infrastructure-as-code (IaC) using automation tools like Ansible, Chef, Terraform, SaltStack, or similar.
- Develop and maintain automation scripts for provisioning, configuration management, and routine tasks.
- Exhibit advanced proficiency in scripting languages such as Bash, Perl, and/or Python for automating complex system administration tasks, creating sophisticated monitoring scripts, and managing infrastructure configurations at scale.
- Develop custom tools and scripts to enhance server management capabilities.
- Design, implement, and manage comprehensive monitoring solutions using tools like Prometheus, Nagios, Zabbix, Grafana, or similar.
- Configure alerts and notifications for critical system events and performance thresholds.
- Implement and manage centralized logging solutions using the ELK stack (Elasticsearch, Logstash, Kibana) or similar tools for log analysis and troubleshooting.
- Utilize GitHub for version control of scripts and configurations.
- Manage incidents, problems, and changes effectively using Jira and adhering to ITIL-based processes.
- Participate in root cause analysis and contribute to knowledge base articles.
- Lead troubleshooting efforts for complex server-related incidents, identifying root causes and implementing effective and timely resolutions.
- Manage incident escalations and communicate effectively with stakeholders.
- Possess excellent verbal and written communication skills to interact effectively with technical and non-technical stakeholders, including application developers, database administrators, network engineers, project managers, and client representatives.
- Provide clear and concise technical documentation, including runbooks and standard operating procedures.

Required Skills:
- 7+ years of deep, hands-on experience administering various Linux distributions (RHEL, SUSE, Ubuntu, CentOS).
- Proven ability to manage servers through their entire lifecycle.
- Extensive hands-on knowledge of VMware, XEN, and physical server management, including troubleshooting complex virtualization issues.
- Strong understanding and practical application of networking concepts, firewall management, and security protocols.
- Significant experience with at least one major cloud platform (AWS/Azure/GCP) and deep proficiency in using automation tools (Ansible, Chef, Terraform, etc.).
- Expert-level proficiency in Bash, Perl, or Python scripting for complex automation tasks.
- Proven experience implementing and utilizing monitoring tools (Prometheus, Nagios, ELK stack) for proactive issue detection and analysis.
- Familiarity and practical experience with GitHub, Jira, and ITIL-based incident/change management processes.

Posted 2 weeks ago


3.0 - 4.0 years

8 - 14 Lacs

Chennai, Bengaluru

Work from Office


- 3-4 years of experience in technical support roles.
- Proficiency in understanding the support ticketing lifecycle and tools such as JIRA and ServiceNow.
- Basic understanding of SQL for querying logs and databases.
- Knowledge of backend server logging, including API and error codes.
- Experience with monitoring tools such as SignalFX, Splunk, and Kibana.
- Good understanding of different error codes and HTTP response codes.
- Strong knowledge of server and client logs, with the ability to analyze and interpret logs effectively.

Responsibilities:
- Collaborate with the team to understand current issues impacting operations and drive them to resolution.
- Validate and escalate issues using internal logging systems and tools.
- Work with server and client logs to understand and troubleshoot issues.
- Analyze logs to identify root causes of problems and develop effective solutions.
- Manage support tickets through JIRA and ServiceNow, ensuring accurate and timely updates.
- Communicate with reporters to gather necessary information and clarify ticket details.
- Investigate and identify the root cause of issues when information is incomplete or unclear.
- Use backend server logs and error codes to determine the impact of issues and develop strategies for resolution.
- Utilize monitoring tools such as SignalFX, Splunk, and Kibana to track system performance and identify potential issues.
- Generate reports and provide insights on recurring issues or trends affecting system performance.

Posted 2 weeks ago


4.0 - 7.0 years

4 - 8 Lacs

Hyderabad, Bengaluru

Work from Office


Title: Kubernetes Modernization Engineers
Experience: 5 - 7 years
Location: Hyderabad/Bangalore

Key Responsibilities:
- Lead Automation: Design and implement an automation framework to migrate workloads to Kubernetes platforms such as AWS EKS, Azure AKS, Google GKE, Oracle OKE, and OpenShift.
- Develop Cloud-Native Automation Tools: Build automation tools using Go (Golang) for workload discovery, planning, and transformation into Kubernetes artifacts.
- Migrate Kubernetes Across Cloud Providers: Plan and execute seamless migrations of Kubernetes workloads from one cloud provider to another (AWS to Azure, GCP to OCI, etc.) with minimal disruption.
- Leverage Open-Source Technologies: Utilize Helm, Kustomize, ArgoCD, and other popular open-source frameworks to streamline cloud-native adoption.
- CI/CD & DevOps Integration: Architect and implement CI/CD pipelines using Jenkins (including Jenkinsfile generation) and cloud-native tools like AWS CodePipeline, Azure DevOps, and GCP Cloud Build to support diverse Kubernetes deployments.
- Security & Compliance: Define and enforce security best practices, implement zero-trust principles, and proactively address vulnerabilities in automation workflows.
- Technical Leadership & Mentorship: Lead and mentor a team of developers, fostering expertise in Golang development, Kubernetes, and DevOps best practices.
- Stakeholder Collaboration: Work closely with engineering, security, and cloud teams to align modernization and migration efforts with business goals and project timelines.
- Performance & Scalability: Ensure high performance, scalability, and security across automation frameworks and multi-cloud Kubernetes deployments.
- Continuous Innovation: Stay ahead of industry trends, integrating emerging tools and methodologies to enhance automation and Kubernetes portability.

Qualifications & Experience:
- Education: Bachelor's or Master's degree in Computer Science, Engineering, or a related field (or equivalent experience).
- Experience: 8+ years in software development, DevOps, or cloud engineering, with 3+ years in a leadership role.
- Programming Expertise: Strong proficiency in Go (Golang) and Python for building automation frameworks and tools.
- Kubernetes & Containers: Deep knowledge of Kubernetes (K8s), OpenShift, Docker, and container orchestration.
- Cloud & DevOps: Hands-on experience with AWS, Azure, GCP, OCI, self-managed Kubernetes, OpenShift, and DevOps practices.
- CI/CD & Infrastructure-as-Code: Strong background in CI/CD tools (Jenkins, Git, AWS CodePipeline, Azure DevOps, GCP Cloud Build) and Infrastructure-as-Code (IaC) with Terraform, Helm, or similar tools.
- Kubernetes Migration Experience: Proven track record in migrating Kubernetes workloads between cloud providers, addressing networking, security, and data consistency challenges.
- Security & Observability: Expertise in cloud-native security best practices, vulnerability remediation, and observability solutions.
- Leadership & Communication: Proven ability to lead teams, manage projects, and collaborate with stakeholders across multiple domains.

Preferred Skills & Certifications:
- Experience in self-managed Kubernetes provisioning (e.g., kubeadm, Kubespray) and OpenShift customization (e.g., Operators).
- Industry certifications: CKA, CKAD, or cloud-specific credentials (e.g., AWS Certified DevOps Engineer).
- Exposure to multi-cloud and hybrid cloud migration projects.

Posted 2 weeks ago


4.0 - 8.0 years

8 - 14 Lacs

Bengaluru

Work from Office


Purpose: Administers and maintains multiple high-availability database servers and database services supporting various HMI systems. Establishes positive working relationships within IT, with business partners, and with other functional areas of the company. Utilizes technical expertise to define, design and implement database solutions within functional area(s) of responsibility. Establishes and maintains priorities for projects and simultaneous assignments. Implements, monitors and enforces data security controls and measures to ensure integrity and high availability of database systems. Manages, mentors, guides and instructs team members on work delivery, prioritization and capacity planning. This role extends beyond geographical boundaries as it aims to support the Herman Miller business across the globe, and involves some travel and flexible working in shift rotations, extended hours and weekend coverage to ensure the business is supported.

Responsibilities:
- Ensure security and SOX requirements are adhered to and respond to audits as required.
- Complete the periodic internal SOX audits, ensuring documented evidence and IPE are maintained.
- Provide effective ERP and DBA support across the international sites.
- Manage SQL Server databases.
- Configure and maintain database servers and processes.
- Monitor system health and performance.
- In charge of database backup and recovery methods, database access security and integrity, physical data storage design, and data storage administration.
- In charge of database migrations and server updates.
- Ensure high levels of performance, availability, sustainability and security.
- Analyze, solve, and correct issues in real time.
- Provide suggestions for solutions.
- Refine and automate regular processes, track issues, and document changes.
- Proactively monitor SQL Server maintenance tasks, troubleshoot failed processes, and address issues as soon as possible.
- Assist developers with query tuning and schema refinement.
- Provide 24x7 support for critical production systems.
- Perform scheduled maintenance and support release deployment activities after hours.
- Participate in continuous process improvement.
- Make recommendations for system architecture per Microsoft SQL Server best practices.
- Ensure that the infrastructure is built out for high availability, recoverability, DR, and performance.
- Ensure integration of enterprise application and database performance, monitoring and alerting management systems.
- Manage the periodic and ad-hoc refresh of Test and Stage environments.
- Upgrade ERP systems and apply patches as required.
- Provide support for the development teams in US, UK, India and APAC.

Minimum Requirements:
- Bachelor's / Master's degree in Computer Science or equivalent.
- One or more technical certifications in database management / administration, preferably coupled with certification in Windows Server administration.
- 6 years of hands-on database administration experience with Microsoft SQL in large multi-server, multi-location enterprise environments (including virtualized environments).
- Excellent interpersonal skills. Strong oral and written communication skills to work closely with internal customers (IT and users) and third-party vendors.
- Strong analytical and problem-solving skills with attention to detail.
- Ability to view opportunities and solutions from a broad, long-term perspective.
- Good working knowledge of SQL 2016, 2019 with experience in SSRS, SSIS, SSMS.
- Good working knowledge of SQL security.
- Knowledge of Windows operating systems: 2016, 2019.
- Knowledge of VM and SAN database optimization.
- Proactive, methodical and willing to own and resolve issues.
- A team player with the ability to work on their own.
- Able to work to tight deadlines.
- Able to work closely with the leadership to design, implement and improve processes regularly.

Essential Experience:
- Database administration experience of at least 6 years in a busy commercial and enterprise environment.
- Certification in Microsoft SQL Server administration (2014 and above).
- SQL Profiler & Tuning skills.
- Experience of database replication, mirroring or log shipping.
- SQL security management.
- Experience of managing a small team of DBAs.
- Experience of business application administration and support.
- Clear, confident communication skills.
- Experience building and maintaining a strategic plan.

Desirable:
- Experience of Infor Syteline ERP system (highly desirable).
- Certification in Oracle Database Administration (preferred).
- Foglight / Spotlight on SQL Server, SharePoint, Red-Gate.
- Exposure to SOX and ITIL.

Posted 2 weeks ago


6.0 - 11.0 years

20 - 25 Lacs

Hyderabad, Ahmedabad

Hybrid


Hi Aspirant, Greetings from TechBlocks - IT Software of Global Digital Product Development - Hyderabad !!! About us : TechBlocks is a global digital product engineering company with 16+ years of experience helping Fortune 500 enterprises and high-growth brands accelerate innovation, modernize technology, and drive digital transformation. From cloud solutions and data engineering to experience design and platform modernization, we help businesses solve complex challenges and unlock new growth opportunities. Job Title: Senior DevOps Site Reliability Engineer (SRE) Location : Hyderabad & Ahmedabad Employment Type: Full-Time Work Model - 3 Days from office Job Overview Dynamic, motivated individuals deliver exceptional solutions for the production resiliency of the systems. The role incorporates aspects of software engineering and operations, DevOps skills to come up with efficient ways of managing and operating applications. The role will require a high level of responsibility and accountability to deliver technical solutions. Summary: As a Senior SRE, you will ensure platform reliability, incident management, and performance optimization. You'll define SLIs/SLOs, contribute to robust observability practices, and drive proactive reliability engineering across services. Experience Required: 610 years of SRE or infrastructure engineering experience in cloud-native environments. 
Mandatory: Cloud: GCP (GKE, Load Balancing, VPN, IAM). Observability: Prometheus, Grafana, ELK, Datadog. Containers & Orchestration: Kubernetes, Docker. Incident Management: On-call, RCA, SLIs/SLOs. IaC: Terraform, Helm. Incident Tools: PagerDuty, OpsGenie. Nice to Have: GCP Monitoring, Skywalking Service Mesh, API Gateway GCP Spanner. Scope: Drive operational excellence and platform resilience. Reduce MTTR and increase service availability. Own incident and RCA processes. Roles and Responsibilities: Define and measure Service Level Indicators (SLIs) and Service Level Objectives (SLOs), and manage error budgets across services. Lead incident management for critical production issues; drive Root Cause Analysis (RCA) and postmortems. Create and maintain runbooks and standard operating procedures for high-availability services. Design and implement observability frameworks using ELK, Prometheus, and Grafana; drive telemetry adoption. Coordinate cross-functional war-room sessions during major incidents and maintain response logs. Develop and improve automated system recovery, alert suppression, and escalation logic. Use GCP tools like GKE, Cloud Monitoring, and Cloud Armor to improve performance and security posture. Collaborate with DevOps and Infrastructure teams to build highly available and scalable systems. Analyze performance metrics and conduct regular reliability reviews with engineering leads. Participate in capacity planning, failover testing, and resilience architecture reviews. If you are interested, please share your updated resume with kranthikt@tblocks.com Warm Regards, Kranthi Kumar kranthikt@tblocks.com Contact: 8522804902 Senior Talent Acquisition Specialist Toronto | Ahmedabad | Hyderabad | Pune www.tblocks.com

Posted 2 weeks ago

Apply

1.0 - 6.0 years

4 - 9 Lacs

Hyderabad

Work from Office

Naukri logo

1 to 2 years of experience in system engineering. Analytical and problem-solving skills, with attention to detail and the ability to work in a fast-paced environment. Strong understanding of the Linux operating system and Ansible scripting. Required Candidate profile: Knowledge of Kubernetes and DevOps tools (Git, Jenkins, ArgoCD) is an added advantage.

Posted 2 weeks ago

Apply

6.0 - 11.0 years

0 - 0 Lacs

Pune

Hybrid

Naukri logo

JD: Skills: NGINX-F5, DevOps tools, XLR, Jenkins, Unix, Database, Bitbucket, Monitoring tool. Responsibilities: The successful candidate will fulfil three main missions: monitor, support and deploy applications. Application monitoring: implement and maintain dashboards and tools to monitor application health and performance, participate in recurring reviews, and detect and analyse anomalies. Application support: as a support engineer, help platform users (internal teams), troubleshoot issues with the platform, dispatch and follow up with other more specialized teams (development, infrastructure) or third-party vendors, and investigate incorrect application behaviours. Application deployment: prepare and implement new application release rollouts and migrations. The successful candidate will work closely with the other team members (application engineers, system & network administrators, technical architects), the application development teams, and other operational teams (customer onboarding, business operations and customer support teams, compliance and finance operations). This is not a customer-facing position. Roles: Prepare and execute procedures for new application releases/updates. Troubleshoot application issues so that other teams can fix or work around them; implement restoration procedures as needed; follow up on raised issues. Monitor application health and performance. Improve and maintain tools to monitor application health (mainly Grafana & Splunk dashboards & alerts). Participate in 24/7 on-call duties on a rotation basis. Help improve team processes (planned works, ticket management, communication flows). Schedule planned works as requested. Review procedures so that they minimize customer impact. Make sure communication to impacted users is appropriate and timely. Write RCAs (root cause analyses) for production incidents.

Posted 2 weeks ago

Apply

4.0 - 8.0 years

3 - 8 Lacs

Noida, Gurugram, Delhi / NCR

Work from Office

Naukri logo

Role & responsibilities: Supporting applications in a web-based services / agile software development environment using Service Management tools like ServiceNow. Experience using ticketing systems like JIRA and application monitoring tools like Dynatrace and Datadog. Excellent communication skills, including the ability to troubleshoot complex issues. Preferred candidate profile: Immediate joiner

Posted 2 weeks ago

Apply

6.0 - 10.0 years

20 - 32 Lacs

Bengaluru

Hybrid

Naukri logo

Job Description: We are looking for an experienced Senior DevOps Engineer to join our team. As a Senior Cloud DevOps Engineer, you will be responsible for automating and streamlining our cloud infrastructure and deployment processes, ensuring the scalability, reliability, and security of our systems. You will work with cutting-edge technologies, including AWS, Kubernetes, and CI/CD pipelines, with a focus on strong scripting and automation skills to improve workflows and deployment pipelines. Requirements: 6+ years of experience as an SRE Engineer / DevOps Engineer or similar software engineering role. Strong technical skills in Cloud Infrastructure (AWS), CI/CD pipelines (like Jenkins, GitHub), Kubernetes. Act as a subject matter expert for proposed designs and implementations. Hands-on experience in production monitoring, incident management, emergency response, and root cause analysis. Contribute to the automation of manual work and improving existing tooling. Experience with observability tools like Grafana, Prometheus / Datadog / Splunk. Track record of debugging and problem-solving in applications running on a microservices architecture. Good to have: Akamai knowledge. Good communication skills for communicating effectively within the team and with stakeholders. Work directly with development teams to integrate and build reliable solutions. Write infrastructure-related code when needed (Python / Linux Shell / Groovy / Perl). Experience with infrastructure-as-code tools like Terraform / CloudFormation, etc. Good knowledge of cloud network design and security. Participation in an on-call rotation with the team.

Posted 2 weeks ago

Apply

3.0 - 5.0 years

10 - 19 Lacs

Noida

Work from Office

Naukri logo

Join Us in Transforming Cybersecurity Through Exceptional Design. At ThreatModeler, we’re redefining how organizations build secure systems by helping teams shift left and design security into their workflows from the very beginning. Our mission isn’t just about building powerful products; it’s about creating intuitive, human-centered experiences that empower users to do their best work. As part of our growing team, you’ll work at the intersection of design, technology, and cybersecurity, translating complex workflows into intuitive, elegant interfaces. You’ll collaborate closely with engineers, product managers, and researchers to shape the future of threat modeling. With $60 million in institutional funding from Invictus Growth Partners, we’re scaling fast, and we want designers who are just as passionate about usability and clean design as we are about security. What You’ll Do: Analyze, troubleshoot, and resolve advanced application and infrastructure-level incidents reported via escalations from Tier-1/Tier-2 support teams. As a Tier-3 technical support engineer, serve as the technical point of contact for critical production issues, with a focus on root cause analysis and permanent resolution. Understanding of SQL Server and SQL queries. Investigate third-party integrations with SSO (e.g., Okta, Azure AD, ADFS), DevOps tools (e.g., Jenkins, Azure Pipelines), and cloud services (e.g., AWS, Azure, GCP). Collaborate with engineering to identify software defects, contribute to hotfixes, and validate bug resolutions. Have an understanding of monitoring, logging, and alerting systems to proactively identify potential issues. Help develop internal tooling or scripts to streamline troubleshooting and deployment processes. Write and maintain internal knowledge base articles, runbooks, and postmortem reports. What You Bring: Bachelor’s in Computer Science or a related field.
3–5+ years of experience in technical support or systems engineering, with at least 2 years in Tier-3 or escalation-level support. Some SQL Server skills, including query optimization, data extraction, and troubleshooting (versions 2014–2019). Working knowledge of REST APIs, Postman, HTTP debugging tools, authentication mechanisms (OAuth, SAML, etc.), and cloud providers (AWS, Azure, GCP). Familiarity with SaaS architecture and CI/CD tools (e.g., Bitbucket Pipelines, Azure DevOps, Jenkins). Excellent analytical thinking and problem-solving skills, with a calm, methodical approach to issue resolution. Strong interpersonal and communication skills, capable of working across teams and engaging with enterprise clients.

Posted 2 weeks ago

Apply

8.0 - 13.0 years

12 - 22 Lacs

Noida, Gurugram, Delhi / NCR

Work from Office

Naukri logo

Role & responsibilities The position We are seeking a highly skilled and motivated individual to join our team as Lead - Monitoring. As the Lead Monitoring, you will play a crucial role in overseeing and optimizing our systems and networks. Your responsibilities will include monitoring the performance metrics of our IT infrastructure. Additionally, you will lead troubleshooting efforts, identify and resolve system issues, and implement proactive measures to minimize downtime and disruptions. This role requires a keen eye for detail, strong analytical skills, and the ability to collaborate effectively with technical teams to implement solutions and improve overall system performance. Roles and Responsibilities Monitor System Performance: Oversee the monitoring of system performance metrics, including uptime, response times, and resource utilization, using monitoring tools such as Nagios, Microsoft SCOM, Site24X7, and other third-party tools, to ensure optimal performance and availability. Troubleshooting and Issue Resolution: Lead the identification, troubleshooting, and resolution of system issues, working closely with technical teams to implement solutions and minimize downtime. Capacity Planning: Develop and implement capacity planning strategies to forecast future resource needs and optimize system scalability and performance. Incident Response: Develop and maintain incident response protocols and procedures, including escalation paths and response timelines, to address system outages and critical incidents promptly. Monitoring Tools Management: Evaluate, select, and manage monitoring tools and technologies to support efficient and effective monitoring of systems, networks, and applications. Performance Analysis: Conduct performance analysis and trend analysis to identify potential bottlenecks, areas for improvement, and optimization opportunities. 
Cloud Monitoring: Implement and manage cloud monitoring solutions for platforms such as Azure and AWS, ensuring visibility into cloud-based resources, performance metrics, and cost optimization strategies. Monitor cloud infrastructure, services, and applications to identify and resolve issues proactively. Synthetic Monitoring: Design and implement synthetic monitoring solutions to simulate user interactions and transactions across applications, websites, and services. Analyze synthetic monitoring data to identify performance bottlenecks and optimize user experience. Documentation and Reporting: Maintain accurate documentation of monitoring processes, configurations, and incident reports. Generate regular reports on system performance, uptime, and incident resolution metrics. KPIs and Dashboards: Develop and publish key performance indicators (KPIs), dashboards, and other reporting mechanisms to provide insights into system performance, trends, and areas for improvement. Present findings and recommendations to stakeholders and management. Team Leadership: Provide leadership and guidance to monitoring team members, fostering a culture of collaboration, continuous improvement, and excellence in monitoring practices. Skills Bachelor's degree in Computer Science, Information Technology, or a related field. (Master's degree preferred) Proven experience (5+ years) in system monitoring, performance analysis, and incident response, preferably in a lead or supervisory role. Strong technical expertise in monitoring tools such as Nagios, Microsoft SCOM, Site24X7, and other third-party tools. Solid understanding of network protocols, server infrastructure, and cloud environments (e.g., AWS, Azure), with experience in cloud monitoring, synthetic monitoring, and optimization. Experience with scripting languages (e.g., Python, PowerShell) for automation and monitoring tasks. Excellent analytical, problem-solving, and decision-making skills. 
Strong leadership, communication, and team collaboration abilities. Experience in publishing KPIs, dashboards, and other reporting mechanisms. Good to have : Relevant certifications such as ITIL Foundation, Certified Monitoring Professional (CMP), Microsoft Certified: Azure Administrator Associate, and Microsoft Certified: SCOM (if available) are a plus Interested candidates please apply on the given link https://apply.workable.com/ezrecruiting/j/1217C832C7/

Posted 2 weeks ago

Apply

3.0 - 5.0 years

5 - 7 Lacs

Coimbatore

Work from Office

Naukri logo

Senior Backend Developer - Scalable Applications & Systems Location: Bengaluru / Hybrid Experience: 3-5 years Type: Full-time About Us: At Lyzr, we're building intelligent, high-performance platforms that power next-gen digital products. We're looking for a Senior Backend Developer who thrives on building robust, scalable systems and has hands-on experience in end-to-end application development, payment gateway integration, and architecting backend solutions that scale. If you love designing clean APIs, solving real-world challenges, and optimizing systems for performance, you'll love this role. What You'll Do: Own development of scalable backend systems for production-grade applications. Design and implement RESTful APIs, microservices, and data models. Own payment gateway integrations (e.g., Razorpay, Stripe, etc.) and ensure secure, seamless transactions. Architect and deploy backend solutions across cloud platforms (AWS, GCP, etc.). Build and maintain resilient infrastructure to support high-traffic applications. Collaborate with frontend, DevOps, and product teams to ship end-to-end features. Optimize systems for performance, reliability, and scalability. Drive code quality, system security, and architectural best practices. What We're Looking For: 3-5 years of experience in backend development with end-to-end ownership of feature delivery. Proficiency in Python (or similar backend languages like Node.js, Java, or Go). Strong knowledge of relational and non-relational databases. Proven experience with payment gateway integration and order processing systems. Experience designing scalable systems and working with event-driven or distributed architectures. Familiarity with Docker, Kubernetes, CI/CD pipelines, and monitoring tools. Ability to translate business needs into robust technical solutions. Excellent debugging, communication, and collaboration skills. Nice to Have: Experience working in startups or fast-paced product teams. Exposure to AI/ML-based systems or microservice architectures. Prior experience in fintech or e-commerce platforms. Contributions to open-source backend projects. Why Join Us: Work on impactful projects with real users and scale. Join a fast-moving, no-nonsense product team that loves shipping. Flexible hybrid working model with high ownership and zero micromanagement. Be part of a company that's building for the future.

Posted 2 weeks ago

Apply

2.0 - 4.0 years

2 - 6 Lacs

Hyderabad

Work from Office

Naukri logo

Join Our Journey at VuNet. At VuNet, we're at the forefront of developing an innovative Business Observability platform. Our approach integrates big data and machine learning to revolutionize how customer journeys are monitored and user experiences enhanced. Our cutting-edge solutions are transforming digital payment experiences for major financial institutions, fostering financial inclusion nationwide. In our dynamic environment, we encourage our teams to tackle challenging customer and business issues. We value creativity and efficiency, rapidly transforming brilliant ideas into exceptional products that our customers adore. Our approach is grounded in teamwork, with cross-functional groups delving into details and engaging in constructive debates, all united by our mission to establish VuNet as a leader in the tech product industry. What You Can Make Happen: We are looking for smart, self-motivated people with excellent communication skills to join our Customer Operations Team as Observability Engineers. Work with customers to understand their business, application and IT landscape to design and deliver solutions for unified visibility, monitoring and analytics. Are you ready to be a part of VuNet's trailblazing team? Join us and contribute to making a positive impact on the world. Roles & Responsibilities: Excellent verbal and written communication and a good attitude. Flexible to work in shifts, including night shifts. Basic understanding of process and its importance, along with managing reports and documentation. Experience working in an SLA-driven NOC / Command Centre. Experience in ITIL Incident Management and Major / P1 incident management. Fluency in MS Excel and PowerPoint is required. Skills & Experience: Preference will be given to candidates who have experience with NOC monitoring tools (such as ITOM, ITSM) as an L1 resource. Willingness to learn new technologies and upgrade themselves as and when required.
Understanding of infrastructure and flow, and how all the components (network, server, application and database) work hand in hand. Educational Qualifications: Must have completed a three-year undergraduate degree such as a Bachelor of Science or Bachelor of Commerce, with a minimum of 2 years of work experience. Good to Have Skills & Attributes: Good verbal and written communication skills to connect with customers at varying levels of the organization. Ability to operate independently and make decisions with little direct supervision. Additional Information: Should be open to working in 24x7 shifts. Benefits: 100% health coverage through medical insurance, including family. Financial protection for disability, life and accidental death for the employee. 100% parents' coverage for medical insurance. Mental wellness programs and counseling with 1:1 sessions. Our Culture: VuNet believes in delivering WoW through service. VuNet supports career development programs to expand our people's skills and enhance their expertise through various training programs. VuNet advocates an open culture of transparency, inclusivity, adaptability, and collaboration, with an environment that fosters employee satisfaction, motivation, and trust, and open communication, collaboration, and innovation within teams.

Posted 2 weeks ago

Apply

3.0 - 5.0 years

5 - 7 Lacs

Bengaluru

Work from Office

Naukri logo

Job Description Experience: 3-5 years in a DevOps, Site Reliability Engineering (SRE), or similar role. Required Qualifications: Strong experience with the following AWS cloud services: EKS, S3, VPC, Lambda, EC2, RDS, IAM, Landing Zone. Hands-on experience with setting up ArgoCD and integrating it with Kubernetes. Hands-on experience with setting up Kubernetes and configuring Ingress, RBAC, and Network Policies. Hands-on experience with Helm charts. Basic development experience and understanding of OOP concepts in any programming language (Python, Java or TypeScript). Hands-on experience in scripting languages like Python or Bash. Good to have: Azure DevOps experience. Experience with monitoring tools such as Prometheus and Grafana, and observability tools such as Dynatrace or New Relic. Soft Skills: Excellent problem-solving abilities, communication skills, and a proactive approach to continuous improvement. Monitor systems and troubleshoot issues to ensure high availability and performance. Work closely with development, QA, and IT teams to enhance productivity and resolve technical challenges. Good to know: Knowledge of serverless architecture and technologies. Experience with distributed systems and microservices architecture. Familiarity with Agile methodologies and tools like Jira or Confluence.

Posted 2 weeks ago

Apply

3.0 - 4.0 years

5 - 6 Lacs

Mumbai

Work from Office

Naukri logo

JD for the RE to be deployed at the client site (Noida location). Should have L2-level knowledge. The RE should operate from the client site. 3-4 years of experience. The RE should coordinate day-to-day activity on the Sentinel SIEM product for incident monitoring. The RE should coordinate with the Inspira team, with the SIEM as the primary tool for incident monitoring and analysis, and should coordinate with the team to raise incidents on other monitoring tools such as the XDR console for further analysis of relevant logs and activities as needed. This helps in establishing the cause of an incident and its possible impact, and in identifying remediation requirements accurately.

Posted 2 weeks ago

Apply

8.0 - 13.0 years

7 - 8 Lacs

Hyderabad

Work from Office

Naukri logo

Generative AI (GenAI) Engineer. Job Summary: We are seeking a talented and highly motivated Generative AI (GenAI) Engineer. As a GenAI Engineer, you will be at the forefront of developing and deploying innovative solutions leveraging cutting-edge generative models. You will also be responsible for building, training, and fine-tuning GenAI models for various applications, such as text generation, image synthesis, and code generation. This role requires a strong foundation in machine learning and deep learning, and a passion for exploring the potential of generative AI. Qualifications: Education: Master's or PhD degree. Good to have: any certification in Artificial Intelligence, Machine Learning, or a related field. Experience: 8+ years of experience in developing and deploying machine learning models. 2+ years of experience with generative models. Experience with cloud platforms such as AWS, Azure, or GCP. Skills: Technical Expertise: Strong understanding of generative AI models, such as GANs, VAEs, diffusion models, and large language models. Proficiency in Python and other programming languages commonly used in machine learning. Experience with model training and fine-tuning techniques. Knowledge of data preprocessing and feature engineering methods. Familiarity with model deployment and monitoring tools. Soft Skills: Strong analytical and problem-solving skills. Excellent communication, presentation, and interpersonal skills. Ability to work independently and as part of a team. Creativity and a passion for exploring the potential of generative AI.

Posted 2 weeks ago

Apply