Title: Data Engineer (Neo4j, JanusGraph)
- Look for 3-5 years of experience.
- Must have experience with graph databases (e.g., Neo4j, JanusGraph) and graph query languages; this is a must.
- Look for batch and real-time streaming experience.
- Look for experience with Kafka or Flink.
- Experience with data structures and algorithms.
- Hadoop ecosystem tools such as Hive, Iceberg, Spark SQL.
- At least one coding language is a must (Java, Python, or Scala).
Job Description:
The Cloud Data Technologies (CDT) team at eBay oversees data infrastructure and the management of the end-to-end data lifecycle. As a Data Engineer on our team, you will work on our Hadoop-based data warehouse, contributing to scalable and reliable big data solutions for analytics and business insights. This is a hands-on role focused on building, optimizing, and maintaining large data pipelines and warehouse infrastructure.
Key Responsibilities
- Design, develop, and maintain robust data pipelines in Hadoop and related ecosystems, ensuring data reliability, scalability, and performance.
- Implement data ETL processes for batch and streaming analytics requirements.
- Optimize and troubleshoot distributed systems for ingestion, storage, and processing.
- Collaborate with data engineers, analysts, and platform engineers to align solutions with business needs.
- Ensure data security, integrity, and compliance throughout the infrastructure.
- Maintain documentation and contribute to architecture reviews.
- Participate in incident response and operational excellence initiatives for the data warehouse.
- Continuously learn and apply new Hadoop ecosystem tools and data technologies.
Required Skills and Experience
- Extensive experience with Apache Kafka, Apache Flink, and other relevant streaming technologies.
- Proficiency in Hadoop ecosystem tools such as Hive, Iceberg, and Spark SQL.
- Good understanding of Apache Airflow for orchestrating complex computational workflows and data processing pipelines.
- Proven ability to design and implement automated data pipelines and materialized views.
- Proficiency in Python, Unix shell scripting, or similar languages.
- Good understanding of SQL on Oracle, SQL Server, or similar databases.
- Experience with graph databases (e.g., Neo4j, JanusGraph) and graph query languages.
- Ops & CI/CD: Monitoring (Prometheus/Grafana), logging, pipelines (Jenkins/GitHub Actions).
- Core Engineering: Data structures/algorithms, testing (JUnit/pytest), Git, clean code.
- Cloud Native: Docker, Kubernetes (deploy, network, scale, troubleshoot).
- 6+ years of directly applicable experience.
- BS in Computer Science, Engineering, or equivalent experience.