Job Description
As a Site Reliability Engineer, you will work in an agile, collaborative environment to build, deploy, configure, and maintain systems for the IBM client business. In this role, you will lead the problem resolution process for our clients, from analysis and troubleshooting to deploying the latest software updates and fixes.
Your primary responsibilities include:
24x7 Observability: Be part of a worldwide team that monitors the health of production systems and services around the clock, ensuring continuous reliability and optimal customer experience.
Cross-Functional Troubleshooting: Collaborate with engineering teams to provide initial assessments and possible workarounds for production issues. Troubleshoot and resolve production issues effectively.
Deployment and Configuration: Leverage Continuous Integration/Continuous Delivery (CI/CD) tools to deploy services and configuration changes at enterprise scale.
Security and Compliance Implementation: Implement security measures that meet or exceed industry standards and regulations such as GDPR, SOC2, ISO 27001, PCI, HIPAA, and FBA.
Maintenance and Support: Apply Couchbase security patches and upgrades, support Cassandra and Mongo as part of the pager duty rotation, and collaborate with Couchbase product support on issue resolution.
Required education: Bachelor's Degree
Preferred education: Bachelor's Degree
Required technical and professional expertise
Bachelor’s degree in Computer Science, IT, or equivalent.
5+ years of experience with a database such as Netezza, Db2, or MSSQL.
5+ years of experience in DevOps, CloudOps, or SRE roles.
Foundational experience with Linux/Unix systems.
Hands-on exposure to cloud platforms (IKS, AWS, or Azure).
Understanding of networking and databases.
Strong troubleshooting and problem-solving skills.
Preferred technical and professional experience
Databases: Strongly preferred experience in Netezza/Db2 database administration. Monitor and optimize database performance and reliability. Configure databases and troubleshoot database issues.
Kubernetes/OpenShift: Strongly preferred experience in working with production Kubernetes/OpenShift environments.
Automation/Scripting: In-depth experience with Ansible, Python, Terraform, and CI/CD tools such as Jenkins, IBM Continuous Delivery, and ArgoCD.
Monitoring/Observability: Hands-on experience crafting alerts and dashboards using tools such as Instana, New Relic, and Grafana/Prometheus.