Posted: 3 days ago
On-site
Full Time
Infra360 is an emerging global leader in cloud consulting that specializes in innovative cloud-native solutions and exceptional customer service. We partner with clients to modernize and optimize their cloud, ensuring resilience, scalability, cost efficiency and innovation.
Our core services include Cloud Strategy, Site Reliability Engineering (SRE), DevOps, Cloud Security Posture Management (CSPM), and related Managed Services. We specialize in driving operational excellence across multi-cloud environments, helping businesses achieve their goals with agility and reliability.
We thrive on ownership, collaboration, problem-solving, and excellence, fostering an environment where innovation and continuous learning are at the forefront. Join us as we expand and redefine what’s possible in cloud technology and infrastructure.
The Director of DevOps and Cloud Operations will lead and scale Infra360’s technology team, driving growth, operational excellence, and client success. The role involves strategic leadership, project management, and delivering innovative solutions in cloud, DevOps, SRE, and security. The ideal candidate will foster a culture of collaboration and innovation while ensuring high-quality service delivery and identifying opportunities to expand client engagements.
Lead, mentor, and grow a team of engineers, scaling the team from 10 to 50.
Foster a culture of innovation, collaboration, ownership, and excellence.
Oversee talent acquisition, retention, and professional development within the team.
Time Management: Prioritize tasks effectively to balance strategic initiatives, team management, and client interactions.
Accountability: Take ownership of deliverables and decisions, ensuring alignment with company goals and values.
Pressure Handling: Maintain composure under pressure and manage competing priorities effectively.
Client Needs Analysis: As and when required, conduct detailed requirement-gathering sessions with clients to understand their objectives, pain points, and technical needs.
Audit Facilitation: Coordinate with the tech team to perform cloud audits, identifying areas for cost optimization, security improvements, and enhanced reliability.
SOW Creation: As and when required, draft and finalize comprehensive Statements of Work (SOW) that clearly outline deliverables, timelines, and expectations.
Discovery Calls: Participate actively in client discovery calls.
SOW Understanding: Thoroughly review and understand the SOW, including scope, deliverables, timelines, milestones, and SLAs, in order to own the process end to end.
Resource Allocation & Onboarding: Identify and onboard the right resources for the project, ensuring team members are briefed on client requirements, project scope, and deliverables.
Stakeholder Alignment: Ensure alignment with clients and internal teams on all aspects of the SOW to avoid scope creep and ensure clear expectations.
Onboarding Process: Develop and execute a structured client onboarding process, ensuring a smooth transition and setup of services.
Access & Tools Setup: Facilitate timely access to client environments, tools, and necessary documentation for the team.
Documentation: Provide regular documentation on service usage, reporting, and escalation processes.
Project Monitoring: Run weekly sprint planning with clients and daily stand-up calls with project teams to ensure timely, high-quality, and efficient delivery.
Work Review & Oversight: Regularly review team members’ work and technical approaches to ensure alignment with best practices.
Quality Assurance: Implement processes to maintain high-quality standards across all deliverables.
Delivery Excellence: Ensure timely and successful delivery of projects, meeting client expectations and SLAs.
Milestone Tracking: Ensure work progresses according to the SOW and that milestones are achieved.
Monthly Reporting: Present monthly SOW progress and achievements to clients, incorporate their feedback, and obtain sign-off.
Regular Client Meetings: Schedule and conduct weekly/bi-weekly meetings with clients to discuss project progress, address concerns, and gather feedback.
Client Rapport Building: Establish and maintain strong relationships with clients through proactive engagement and communication.
Act as a subject matter expert to clients, helping them achieve their cloud and infrastructure goals.
Case Study Development: Provide technical insights and content for creating impactful case studies that highlight successful client engagements and solutions.
Architecture Diagrams: Design and deliver detailed architecture diagrams to visually represent technical solutions for marketing and sales materials.
Collaboration with Marketing: As and when required, work with the marketing team to ensure technical accuracy and relevance in promotional content, showcasing the company’s expertise.
Account Growth Strategy: Develop and execute strategies to expand service offerings within existing client accounts.
Client Needs Assessment: Regularly engage with clients to identify evolving needs and opportunities for additional services in cloud, DevOps, SRE, and security.
Service Expansion: Identify and introduce premium services, add-ons, or long-term engagements that enhance client outcomes.
Cross-Selling Opportunities: Collaborate with internal teams to bundle services and present holistic solutions.
Process Standardization: Identify areas for improvement and implement standardized processes across projects to enhance efficiency and consistency.
Automation: Leverage automation tools and frameworks to streamline repetitive tasks and improve operational workflows.
Continuous Improvement: Foster a culture of continuous improvement by encouraging feedback, conducting regular process reviews, and implementing best practices.
Innovation Initiatives: Drive innovation by introducing new tools, technologies, and methodologies that align with business goals and client needs.
Metrics & KPIs: Define and track key performance indicators (KPIs) to measure process effectiveness and drive data-driven decisions.
Technical Expertise:
Deep knowledge of Infrastructure, Cloud, DevOps, SRE, Database Management, Observability, and Cybersecurity services.
10+ years of experience as an SRE or DevOps engineer, with a proven track record of handling large-scale production environments.
Strong experience with databases and data platforms (PostgreSQL, MongoDB, Elasticsearch, Kafka).
Hands-on experience with ELK or other logging and observability tools
Hands-on experience with Prometheus, Grafana, and Alertmanager, and with on-call tooling such as PagerDuty.
Strong skills in Kubernetes (K8s), Terraform, Helm, Argo CD, and AWS/GCP/Azure.
Proficient in scripting and automation with Python or Go.
Strong fundamentals in DNS, networking, and Linux.
Experience with APM and observability tools such as New Relic, Datadog, and OpenTelemetry.
Solid experience with incident response, incident management, and writing detailed RCAs.
Experience with Git and coding best practices
Solutioning & Architecture: Proven ability to design, implement, and optimize end-to-end cloud solutions, following well-architected frameworks and best practices.
Leadership & Team Management: Demonstrated success in scaling teams, fostering a collaborative and innovative work culture, and mentoring talent to achieve excellence.
Problem-Solving & Innovation: Strong analytical skills to understand complex client needs and deliver creative, scalable, and impactful solutions.
Project & Stakeholder Management: Expertise in project planning, execution, and stakeholder management, ensuring alignment with business objectives and client expectations.
Effective Communication: Exceptional verbal and written communication skills to engage with clients, teams, and stakeholders effectively.
Documentation & Organization: Ability to maintain well-organized, structured documentation and adhere to standardized folder structures.
Attention to Detail & Follow Through: Consistently capture key points, action items, and follow-ups during meetings and ensure timely execution.
Time Management & Prioritization: Strong time management skills, with the ability to balance multiple priorities, meet deadlines, and optimize productivity.
Task Tracking & Accountability: Maintain a personal task tracker to manage work priorities, monitor progress, and ensure accountability.
Results-Driven & Growth Mindset: A proactive, results-oriented approach with a focus on continuous learning and improvement.
Experience: 12+ years in technology operations, with at least 5 years in a leadership role, managing teams and delivering complex solutions.
Education: Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.
35.0 - 40.0 Lacs P.A.
Gurgaon, Haryana, India
Salary: Not disclosed