Senior Database Infrastructure Engineer - Cassandra, DataStax, Big Data Pipelines

Experience: 5 years


Posted: 1 week ago | Platform: LinkedIn


Work Mode: Remote

Job Type: Full Time

Job Description

HEROIC Cybersecurity (HEROIC.com) is seeking a Senior Data Infrastructure Engineer with deep expertise in DataStax Enterprise (DSE) and Apache Cassandra to help architect, scale, and maintain the data infrastructure that powers our cybersecurity intelligence platforms.

You will be responsible for designing and managing fully automated, big data pipelines that ingest, process, and serve hundreds of billions of breached and leaked records sourced from the surface, deep, and dark web. You'll work with DSE Cassandra, Solr, and Spark, helping us move toward a 99% automated pipeline for data ingestion, enrichment, deduplication, and indexing — all built for scale, speed, and reliability.

This position is critical in ensuring our systems are fast, reliable, and resilient as we ingest thousands of unique datasets daily from global threat intelligence sources.

What you will do:

  • Design, deploy, and maintain high-performance Cassandra clusters using DataStax Enterprise (DSE)
  • Architect and optimize automated data pipelines to ingest, clean, enrich, and store billions of records daily
  • Configure and manage DSE Solr and Spark to support search and distributed processing at scale
  • Automate dataset ingestion workflows from unstructured surface, deep, and dark web sources
  • Own cluster management, replication strategy, capacity planning, and performance tuning
  • Ensure data integrity, availability, and security across all distributed systems
  • Write and manage ETL processes, scripts, and APIs to support data flow automation
  • Monitor systems for bottlenecks, optimize queries and indexes, and resolve production issues
  • Research and integrate third-party data tools or AI-based enhancements (e.g., smart data parsing, deduplication, ML-based classification)
  • Collaborate with engineering, data science, and product teams to support HEROIC’s AI-powered cybersecurity platform
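As a rough illustration of the deduplication step described above, a hash-based pass over incoming records might look like the following minimal sketch. The record fields (`email`, `password`, `source`) and the `fingerprint` helper are hypothetical examples, not HEROIC's actual pipeline:

```python
import hashlib
import json

def fingerprint(record: dict) -> str:
    """Stable hash over the fields assumed to define a duplicate.
    Field names here are illustrative only."""
    key = json.dumps(
        {k: record.get(k) for k in ("email", "password", "source")},
        sort_keys=True,
    )
    return hashlib.sha256(key.encode("utf-8")).hexdigest()

def dedupe(records):
    """Yield each logically unique record once, keeping the first occurrence."""
    seen = set()
    for rec in records:
        fp = fingerprint(rec)
        if fp not in seen:
            seen.add(fp)
            yield rec

batch = [
    {"email": "a@x.com", "password": "p1", "source": "dump-1"},
    {"email": "a@x.com", "password": "p1", "source": "dump-1"},  # exact duplicate
    {"email": "a@x.com", "password": "p1", "source": "dump-2"},  # distinct source
]
unique = list(dedupe(batch))
print(len(unique))  # 2
```

At the scale the role describes (billions of records daily), an in-memory set would be replaced by a distributed store or a Spark job, but the fingerprint-and-filter shape stays the same.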

Requirements
  • Minimum 5 years' experience with Cassandra / DataStax Enterprise in production environments
  • Hands-on experience with DSE Cassandra, Solr, Apache Spark, CQL, and data modeling at scale
  • Strong understanding of NoSQL architecture, sharding, replication, and high availability
  • Advanced knowledge of Linux/Unix, shell scripting, and automation tools (e.g., Ansible, Terraform)
  • Proficient in at least one programming language: Python, Java, or Scala
  • Experience building large-scale automated data ingestion systems or ETL workflows
  • Solid grasp of AI-enhanced data processing, including smart cleaning, deduplication, and classification
  • Excellent written and spoken English communication skills
  • Prior experience with cybersecurity or dark web data (preferred but not required)
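One recurring concern behind "data modeling at scale" in Cassandra is keeping partitions bounded. A common technique is to compose a bucketed partition key, so that a high-volume source cannot grow one partition without limit. The sketch below shows the idea; the `(source, day)` bucketing scheme is an assumption for illustration, not a prescribed design:

```python
from datetime import datetime, timezone

def partition_key(source: str, ingested_at: datetime) -> tuple:
    """Compose a bucketed partition key (source, day-bucket) so records
    from one source are spread across many Cassandra partitions instead
    of accumulating into a single unbounded one."""
    day_bucket = ingested_at.strftime("%Y-%m-%d")
    return (source, day_bucket)

k1 = partition_key("darkweb-forum-42", datetime(2024, 5, 1, 9, 30, tzinfo=timezone.utc))
k2 = partition_key("darkweb-forum-42", datetime(2024, 5, 2, 0, 15, tzinfo=timezone.utc))
print(k1)        # ('darkweb-forum-42', '2024-05-01')
print(k1 != k2)  # True: consecutive days land in different partitions
```

The bucket granularity (day, hour, or a hashed shard number) would be chosen from the expected write rate per source, which is exactly the kind of capacity-planning judgment this role calls for.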

Benefits
  • Position Type: Full-time
  • Location: Pune, India (Remote – Work from anywhere)
  • Compensation: Competitive salary based on experience
  • Benefits: Paid Time Off + Public Holidays
  • Professional Growth: Amazing upward mobility in a rapidly expanding company
  • Innovative Culture: Fast-paced, innovative, and mission-driven. Be part of a team that leverages AI and cutting-edge technologies.

