Sr. Engineer Data & ML Platform

8 - 13 years

12 - 17 Lacs

Posted: 1 week ago | Platform: Naukri

Work Mode

Work from Office

Job Type

Full Time

Job Description

The charter of the Data + ML Platform team is to harness all the data that is ingested and cataloged within the Data Lakehouse for exploration, insights, model development, ML Engineering and Insights Activation. This team is situated within the larger Data Platform group, which serves as one of the core pillars of our company. We process data at a truly immense scale. Our processing spans various facets, including threat events collected via telemetry data, associated metadata, IT asset information, and contextual information about threat exposure based on additional processing. These facets comprise the overall data platform, which is currently over 200 PB and is maintained in a hyperscale Data Lakehouse, built and owned by the Data Platform team. The ingestion mechanisms include both batch and near real-time streams that form the core Threat Analytics Platform used for insights, threat hunting, incident investigations and more.
As an engineer in this team, you will play an integral role as we build out our ML Experimentation Platform from the ground up. You will collaborate closely with Data Platform Software Engineers, Data Scientists, and Threat Analysts to design, implement, and maintain scalable ML pipelines for Data Preparation, Cataloging, Feature Engineering, Model Training, and Model Serving that influence critical business decisions. You'll be a key contributor in a production-focused culture that bridges the gap between model development and operational success. Future plans include generative AI investments for use cases such as modeling attack paths for IT assets.
What you'll Do:
  • Help design, build, and facilitate adoption of a modern Data+ML platform
  • Modularize complex ML code into standardized and repeatable components
  • Establish and facilitate adoption of repeatable patterns for model development, deployment, and monitoring
  • Build a platform that scales to thousands of users and offers self-service capability to build ML experimentation pipelines
  • Leverage workflow orchestration tools to deploy efficient and scalable execution of complex data and ML pipelines
  • Review code changes from data scientists and champion software development best practices
  • Leverage cloud services like Kubernetes, blob storage, and queues in our cloud-first environment
What you'll Need:
  • B.S. in Computer Science, Data Science, Statistics, Applied Mathematics, or a related field and 10+ years of related experience; or M.S. with 8+ years of experience; or Ph.D. with 6+ years of experience.
  • 3+ years of experience developing and deploying machine learning solutions to production. Familiarity with typical machine learning algorithms from an engineering perspective (how they are built and used, not necessarily the theory); familiarity with supervised / unsupervised approaches: how, why, and when labeled data is created and used
  • 3+ years of experience with ML Platform tools like Jupyter Notebooks, NVIDIA Workbench, MLflow, Ray, Vertex AI, etc.
  • Experience building data platform product(s) or features with (one of) Apache Spark, Flink or comparable tools in GCP. Experience with Iceberg is highly desirable.
  • Proficiency in distributed computing and orchestration technologies (Kubernetes, Airflow, etc.)
  • Production experience with infrastructure-as-code tools such as Terraform, FluxCD
  • Expert-level experience with Python; Java/Scala exposure is recommended. Ability to write Python interfaces that give data scientists standardized, simplified access to internal CrowdStrike tools
  • Expert-level experience with CI/CD frameworks such as GitHub Actions
  • Expert-level experience with containerization frameworks
  • Strong analytical and problem-solving skills, capable of working in a dynamic environment
  • Exceptional interpersonal and communication skills. Work with stakeholders across multiple teams and synthesize their needs into software interfaces and processes.
Critical Skills Needed for Role:
  • Distributed Systems Knowledge
  • Data Platform Experience
  • Machine Learning concepts
Experience with the Following is Desirable:
  • Go
  • Iceberg
  • Pinot or other time-series/OLAP-style database
  • Jenkins
  • Parquet
  • Protocol Buffers/gRPC
Benefits of Working at CrowdStrike:
  • Remote-friendly and flexible work culture
  • Market leader in compensation and equity awards
  • Comprehensive physical and mental wellness programs
  • Competitive vacation and holidays for recharge
  • Paid parental and adoption leaves
  • Professional development opportunities for all employees regardless of level or role
  • Employee Networks, geographic neighborhood groups, and volunteer opportunities to build connections
  • Vibrant office culture with world-class amenities
  • Great Place to Work Certified across the globe

CrowdStrike

Computer and Network Security

Remote