MONITORING TOOL - L2

2 - 5 years

4 - 7 Lacs

Posted: 1 day ago | Platform: Foundit

Work Mode

On-site

Job Type

Full Time

Job Description

Key Responsibilities:

  1. Monitoring Tool Support:

  • Provide L2 support for various monitoring tools (e.g., Nagios, Zabbix, Splunk, Prometheus, SolarWinds, AppDynamics, New Relic).
  • Troubleshoot and resolve escalated alerts, incidents, and issues related to system performance, application health, network connectivity, and infrastructure availability.
  • Collaborate with L1 support teams to assist in the diagnosis and resolution of simpler issues.
  2. Incident & Problem Management:

  • Handle escalated incidents from L1 support, providing root cause analysis (RCA) and resolution.
  • Track and maintain records of incidents, problems, and resolutions within the ticketing system (e.g., ServiceNow, JIRA).
  • Ensure SLA compliance for issue resolution and follow up on tickets to meet agreed-upon timelines.
  3. Alert Management:

  • Review and manage monitoring alerts for critical systems, servers, databases, and applications.
  • Ensure alerts are appropriately categorized and routed for resolution.
  • Investigate and respond to false positives or irrelevant alerts to maintain the integrity of the monitoring system.
  4. Performance Monitoring & Reporting:

  • Continuously monitor system health, application performance, and network traffic to proactively identify issues before they affect services.
  • Maintain and improve monitoring dashboards to reflect the current health of the environment.
  • Generate regular reports for system performance and uptime, providing recommendations for improvements or preventive actions.
  5. Tool Configuration & Optimization:

  • Assist in the configuration and tuning of monitoring tools to ensure they provide meaningful and actionable data.
  • Customize monitoring thresholds, alerts, and notifications to align with the organization's operational needs.
  • Continuously improve the monitoring setup to ensure that it effectively supports the evolving infrastructure and application stack.
  6. Documentation & Knowledge Sharing:

  • Document troubleshooting procedures, known issues, and best practices for the monitoring tools.
  • Share knowledge and insights with L1 support teams to improve their troubleshooting capabilities.
  • Maintain user manuals or standard operating procedures (SOPs) for monitoring tool management and escalation processes.
  7. Collaboration & Communication:

  • Collaborate with DevOps, System Admins, and Network Engineers to resolve infrastructure or application performance issues.
  • Communicate effectively with internal teams regarding ongoing incidents, resolution timelines, and potential impacts on services.
  8. Proactive System Improvements:

  • Work with the IT Operations team to identify and implement proactive measures to improve the overall system performance and reduce downtime.
  • Provide input for optimizing monitoring thresholds, reducing false alarms, and implementing new monitoring solutions or features.

Required Qualifications:

  • 2-5 years of experience in L2 support or operations with monitoring tools.
  • Strong understanding of IT infrastructure, including servers, databases, networks, and applications.
  • Hands-on experience with monitoring tools (e.g., Nagios, Zabbix, Prometheus, Splunk, AppDynamics, New Relic).
  • Experience working with alert management systems and troubleshooting complex issues.
  • Familiarity with cloud environments (AWS, Azure, GCP) and the related monitoring tools.
  • Solid understanding of system performance metrics and the ability to identify and troubleshoot issues based on performance data.
  • Experience using ticketing systems (e.g., ServiceNow, JIRA, Zendesk) for incident management and tracking.
  • Proficiency in Linux/Unix and Windows Server operating systems.
  • Scripting knowledge (e.g., Bash, Python, PowerShell) for automating monitoring tasks and alerts.
  • Good understanding of networking concepts (DNS, HTTP, TCP/IP, etc.) and their impact on monitoring.

Teamware Solutions

IT Services and IT Consulting

Chennai, Tamil Nadu
