About The Organisation
DataFlow Group is a pioneering global provider of specialized Primary Source Verification (PSV) solutions and background screening and immigration compliance services, helping public and private organizations mitigate risk and make informed, cost-effective decisions about their Applicants and Registrants.
About The Role
We are looking for a highly skilled Senior ETL & Data Streaming Engineer with over 10 years of experience to play a pivotal role in designing, developing, and maintaining our robust data pipelines. The ideal candidate will have deep expertise in both batch ETL processes and real-time data streaming technologies, coupled with extensive hands-on experience with AWS data services. A proven track record of working with Data Lake architectures and traditional Data Warehousing environments is essential.
Duties And Responsibilities
- Design, develop, and implement highly scalable, fault-tolerant, and performant ETL processes using industry-leading ETL tools to extract, transform, and load data from various source systems into our Data Lake and Data Warehouse.
- Architect and build batch and real-time data streaming solutions using technologies such as Talend, Informatica, Apache Kafka, or AWS Kinesis to support immediate data ingestion and processing requirements (see the ingestion sketch after this list).
- Utilize and optimize a wide array of AWS data services, including S3, Glue, Lake Formation, and EMR.
- Collaborate with data architects, data scientists, and business stakeholders to understand data requirements and translate them into efficient data pipeline solutions.
- Ensure data quality, integrity, and security across all data pipelines and storage solutions.
- Monitor, troubleshoot, and optimize existing data pipelines for performance, cost-efficiency, and reliability.
- Develop and maintain comprehensive documentation for all ETL and streaming processes, data flows, and architectural designs.
- Implement data governance policies and best practices within the Data Lake and Data Warehouse environments.
- Mentor junior engineers and contribute to fostering a culture of technical excellence and continuous improvement.
- Stay abreast of emerging technologies and industry best practices in data engineering, ETL, and streaming.
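As an illustration of the real-time ingestion work described above, here is a minimal sketch of a consumer that micro-batches events from Kafka into an S3 data lake. The topic, broker, bucket, and prefix names are hypothetical, and it assumes the kafka-python and boto3 libraries; a production pipeline would add error handling, schema validation, and checkpointing.

```python
import json
import time

import boto3
from kafka import KafkaConsumer

# Hypothetical names: adjust topic, brokers, and bucket for your environment.
TOPIC = "orders-events"
BROKERS = ["localhost:9092"]
BUCKET = "example-data-lake"
PREFIX = "raw/orders"
BATCH_SIZE = 500

consumer = KafkaConsumer(
    TOPIC,
    bootstrap_servers=BROKERS,
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
    auto_offset_reset="earliest",
    enable_auto_commit=False,
)
s3 = boto3.client("s3")

batch = []
for message in consumer:
    batch.append(message.value)
    if len(batch) >= BATCH_SIZE:
        # Write one newline-delimited JSON file per micro-batch.
        key = f"{PREFIX}/{int(time.time())}.json"
        body = "\n".join(json.dumps(record) for record in batch)
        s3.put_object(Bucket=BUCKET, Key=key, Body=body.encode("utf-8"))
        consumer.commit()  # commit offsets only after a successful write
        batch = []
```

Committing offsets only after the S3 write succeeds gives at-least-once delivery, which is the usual starting point for a fault-tolerant ingestion pipeline.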
Qualifications
- 10+ years of progressive experience in data engineering, with a strong focus on ETL, ELT, and data pipeline development.
- Deep expertise in ETL Tools: Extensive hands-on experience with commercial ETL tools such as Talend.
- Strong proficiency in Data Streaming Technologies: Proven experience with real-time data ingestion and processing using platforms such as AWS Glue, Apache Kafka, AWS Kinesis, or similar.
- Extensive AWS Data Services Experience:
- Proficiency with AWS S3 for data storage and management.
- Hands-on experience with AWS Glue for ETL orchestration and data cataloging (a minimal Glue job sketch follows this list).
- Familiarity with AWS Lake Formation for building secure data lakes.
- Experience with AWS EMR for big data processing is a plus.
- Data Warehouse (DWH) Knowledge: Strong background in traditional data warehousing concepts, dimensional modeling (Star Schema, Snowflake Schema), and DWH design principles (see the star-schema sketch after this list).
- Programming Languages: Proficient in SQL and at least one scripting language (e.g., Python, Scala) for data manipulation and automation.
- Database Skills: Strong understanding of relational and NoSQL databases.
- Version Control: Experience with version control systems (e.g., Git).
- Problem-Solving: Excellent analytical and problem-solving skills with a keen eye for detail.
- Communication: Strong verbal and written communication skills, with the ability to articulate complex technical concepts to both technical and non-technical audiences.
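To make the AWS Glue expectation above concrete, the following is a minimal sketch of a Glue ETL job script. The catalog database, table, and output path are hypothetical, and the script only runs inside a Glue job environment, where the awsglue libraries are available.

```python
import sys

from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

# Standard Glue job boilerplate: resolve arguments and initialize contexts.
args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read a source table registered in the Glue Data Catalog
# (database and table names are hypothetical).
source = glue_context.create_dynamic_frame.from_catalog(
    database="raw_db", table_name="orders"
)

# A simple transform: keep only records with a populated business key.
cleaned = source.filter(lambda row: row["order_id"] is not None)

# Write the result to the data lake as Parquet.
glue_context.write_dynamic_frame.from_options(
    frame=cleaned,
    connection_type="s3",
    connection_options={"path": "s3://example-data-lake/curated/orders/"},
    format="parquet",
)
job.commit()
```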
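And to illustrate the dimensional-modeling expectation, here is a minimal star-schema sketch: one fact table keyed to two dimension tables. It uses Python's built-in sqlite3 module purely so the DDL is runnable anywhere; the table and column names are illustrative, and a real warehouse would live in a dedicated DWH platform.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

conn.executescript(
    """
    -- Dimension tables hold descriptive attributes.
    CREATE TABLE dim_date (
        date_key   INTEGER PRIMARY KEY,  -- surrogate key, e.g. 20240131
        full_date  TEXT NOT NULL,
        month      INTEGER NOT NULL,
        year       INTEGER NOT NULL
    );

    CREATE TABLE dim_applicant (
        applicant_key INTEGER PRIMARY KEY,  -- surrogate key
        applicant_id  TEXT NOT NULL,        -- natural/business key
        country       TEXT
    );

    -- The fact table holds measures plus foreign keys to each dimension:
    -- the classic star layout.
    CREATE TABLE fact_verification (
        date_key       INTEGER NOT NULL REFERENCES dim_date (date_key),
        applicant_key  INTEGER NOT NULL REFERENCES dim_applicant (applicant_key),
        checks_run     INTEGER NOT NULL,
        turnaround_hrs REAL
    );
    """
)

# A typical star-schema query: join facts to a dimension and aggregate.
rows = conn.execute(
    """
    SELECT d.year, d.month, COUNT(*) AS verifications
    FROM fact_verification f
    JOIN dim_date d ON d.date_key = f.date_key
    GROUP BY d.year, d.month
    """
).fetchall()
```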
(ref:hirist.tech)