Data Engineer
As a Data Engineer, you will own the end-to-end lifecycle of our data infrastructure. You will design and implement robust, scalable data pipelines and architect modern data solutions using a best-in-class technology stack. Your work will transform raw, messy data into clean, reliable, and actionable data products that power decision-making across the business. You'll collaborate cross-functionally with product managers, data analysts, data scientists, and software engineers to understand data needs and deliver high-performance data solutions. Your impact will be measured by how effectively data is delivered, modeled, and leveraged to drive business outcomes.

Responsibilities :
- Architect & Build : Design, implement, and manage a cloud-based data platform using a modern ELT (Extract, Load, Transform) approach.
- Data Ingestion : Develop and maintain robust data ingestion pipelines from a variety of sources, including operational databases (MongoDB, RDS), real-time IoT streams, and third-party APIs using services like AWS Kinesis/Lambda or Azure Event Hubs/Functions.
- Data Lake Management : Build and manage a scalable and cost-effective data lake on AWS S3 or Azure Data Lake Storage (ADLS Gen2), using open table formats like Apache Iceberg or Delta Lake.
- Data Transformation : Develop, test, and maintain complex data transformation models using dbt. Champion a software engineering mindset by applying principles of version control (Git), CI/CD, and automated testing to all data logic.
- Orchestration : Implement and manage data pipeline orchestration using modern tools like Dagster, Apache Airflow, or Azure Data Factory.
- Data Quality & Governance : Establish and enforce data quality standards. Implement automated testing and monitoring to ensure the reliability and integrity of all data assets.
- Performance & Cost Optimization : Continuously monitor and optimize the performance and cost of the data platform, ensuring our serverless query engines and storage layers are used efficiently.
- Collaboration : Work closely with data analysts and business stakeholders to understand their needs, model data effectively, and deliver datasets that power our BI tools (Metabase, Power BI).

Skills & Experience (Must-Haves) :
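The "raw to clean" transformation and data-quality responsibilities above can be sketched in a few lines of plain Python. This is a minimal illustration only; the record fields, cleaning rules, and sample data are hypothetical, and in practice this logic would live in dbt models run by an orchestrator such as Dagster or Airflow.

```python
from datetime import datetime

# Minimal sketch of a "raw to clean" step: normalize, validate, and
# deduplicate raw IoT readings. Field names and rules are hypothetical.

RAW_EVENTS = [
    {"device_id": "A1", "ts": "2024-05-01T10:00:00+00:00", "temp_c": "21.5"},
    {"device_id": "a1", "ts": "2024-05-01T10:00:00+00:00", "temp_c": "21.5"},  # duplicate (case-insensitive id)
    {"device_id": "B2", "ts": "not-a-date", "temp_c": "19.0"},                  # invalid timestamp -> rejected
]

def clean(events):
    """Normalize ids, reject unparseable timestamps, drop duplicates."""
    seen, out = set(), []
    for e in events:
        device = e["device_id"].upper()
        try:
            ts = datetime.fromisoformat(e["ts"])
        except ValueError:
            continue  # data-quality gate: drop rows that fail validation
        key = (device, ts)
        if key in seen:
            continue  # drop exact duplicates
        seen.add(key)
        out.append({"device_id": device, "ts": ts.isoformat(), "temp_c": float(e["temp_c"])})
    return out

cleaned = clean(RAW_EVENTS)
```

The same pattern (normalize, validate, dedupe) scales up when expressed as dbt tests and models over a warehouse table instead of in-memory dicts.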
- 3+ years of professional experience in a data engineering role.
- Expert-level proficiency in SQL and the ability to write complex, highly-performant queries.
- Proficiency with Python-based data cleaning packages and tools; strong Python experience is a must.
- Hands-on experience building data solutions on a major cloud provider (AWS or Azure), utilizing core services like AWS S3/Glue/Athena or Azure ADLS/Data Factory/Synapse.
- Proven experience building and maintaining data pipelines in Python.
- Experience with NoSQL databases like MongoDB, including an understanding of its data modeling, aggregation framework, and query patterns.
- Deep understanding of data warehousing concepts, including dimensional modeling, star/snowflake schemas, and data modeling best practices.
- Hands-on experience with modern data transformation tools, specifically dbt.
- Familiarity with data orchestration tools like Apache Airflow, Dagster, or Prefect.
- Proficiency with Git and experience working with CI/CD pipelines for data workflows.

Skills & Experience (Nice-to-Haves) :
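To make the dimensional-modeling requirement above concrete, here is a minimal star-schema split sketched in plain Python: one flat extract becomes a customer dimension plus an orders fact joined by a surrogate key. Table and column names are hypothetical; in a real warehouse this would be expressed as dbt SQL models.

```python
# Minimal star-schema sketch: a flat orders extract is split into
# dim_customer (one row per customer) and fact_orders (one row per order,
# referencing the dimension via a surrogate key). Names are hypothetical.

FLAT_ORDERS = [
    {"order_id": 101, "customer": "Acme", "country": "DE", "amount": 250.0},
    {"order_id": 102, "customer": "Acme", "country": "DE", "amount": 100.0},
    {"order_id": 103, "customer": "Beta", "country": "US", "amount": 75.0},
]

def to_star(rows):
    """Return (dim_customer, fact_orders) with surrogate customer keys."""
    dim, fact = {}, []
    for r in rows:
        nk = (r["customer"], r["country"])      # natural key
        sk = dim.setdefault(nk, len(dim) + 1)   # surrogate key, assigned on first sight
        fact.append({"order_id": r["order_id"], "customer_sk": sk, "amount": r["amount"]})
    dim_customer = [
        {"customer_sk": sk, "customer": nk[0], "country": nk[1]}
        for nk, sk in dim.items()
    ]
    return dim_customer, fact

dim_customer, fact_orders = to_star(FLAT_ORDERS)
```

The surrogate key decouples the fact table from the natural key, which is what lets a star schema absorb dimension changes without rewriting facts.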
- Experience with real-time data streaming technologies, specifically AWS Kinesis or Azure Event Hubs.
- Experience with data cataloging and governance tools (e.g., Open Metadata, Data Hub, Microsoft Purview).
- Knowledge of infrastructure-as-code tools like Terraform or CloudFormation.
- Experience with containerization technologies (Docker, Kubernetes).
(ref:hirist.tech)