Job Description
As a DevOps + Site Reliability Engineer, you will work in an agile, collaborative environment to build, deploy, configure, and support services in the IBM Cloud. Your responsibilities will encompass designing and implementing innovative features and automation, fine-tuning and sustaining existing code for optimal performance, uncovering efficiencies, supporting adopters globally, and driving delivery of a highly available cloud offering within IBM Cloud Security Services.

In this role, you will implement and consume APIs in the IBM Cloud infrastructure environment while configuring and integrating services. You will be a motivated self-starter who loves to solve challenging problems and feels comfortable managing multiple, changing priorities and meeting deadlines in an entrepreneurial environment.

Your primary responsibilities include:
- Contribute to new features and improve existing capabilities or processes while relentlessly troubleshooting problems to deliver.
- Practice secure development principles that support continuous integration and delivery, leveraging tools such as Tekton, Ansible, and Terraform.
- Orchestrate and maintain Kubernetes/OpenShift clusters to ensure high availability and resilience.
- Collaborate across teams in activities including code reviews, testing, audit support, and issue mitigation.
- Continuously improve code, automation, testing, monitoring, and alerting processes to ensure proactive identification and resolution of potential issues.
- Participate in the on-call rotation and lead or contribute to the problem resolution process for our clients, from analysis and troubleshooting to deploying workarounds or fixes.
Required education: Bachelor's Degree

Preferred education: Master's Degree

Required technical and professional expertise:
- 1-3 years of experience delivering code and debugging problems
- 1-3 years of experience in an SRE, DevOps, or similar role
- A strong preference for collaborative teamwork
- A rigorous approach to problem-solving
- Experience with cloud computing technologies
- Programming skills: scripting, Go, Python, or similar
- Hands-on experience with container technologies: Kubernetes (IKS), Red Hat OpenShift, Docker, Rancher, Podman
- Proficiency with automation tools and CI/CD
Preferred technical and professional experience:
- Experience working with production Kubernetes/OpenShift environments (strongly preferred)
- Excellent Git skills (merges, rebase, branching, forking, submodules)
- Experience with Tekton, Ansible, Terraform, Jenkins
- Experience with Rust, C/C++, or Java
- Experience using, configuring, and troubleshooting CI/CD
- Excellent record of improving solutions through automation
- Experience with monitoring and alerting tools (e.g., Prometheus, Grafana, Kibana, Sysdig, LogDNA)
- SQL or PostgreSQL experience