Posted: 1 week ago
Work from Office
Full Time
You will contribute to:
Owning the design and scalability of the data lake architecture for both streaming and batch workloads, leveraging AWS-native services.
Leading the development of ingestion, transformation, and storage pipelines using AWS Glue, DMS, Kinesis/Kafka, and PySpark (a minimal PySpark sketch follows this list).
Structuring and evolving data into OTF formats (Apache Iceberg, Delta Lake) to support real-time and time-travel queries for downstream services.
Driving data productization, enabling API-first and self-service access to curated datasets for fraud detection, reconciliation, and reporting use cases.
Defining and tracking SLAs and SLOs for critical data pipelines, ensuring high availability and data accuracy in a regulated fintech environment.
Collaborating with InfoSec, SRE, and Data Governance teams to enforce data security, lineage tracking, access control, and compliance (GDPR, MAS TRM).
Using Generative AI tools to enhance developer productivity, including auto-generating test harnesses, schema documentation, transformation scaffolds, and performance insights.
Mentoring data engineers, setting technical direction, and ensuring delivery of high-quality, observable data pipelines.
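For illustration, the Glue/PySpark and Iceberg work described above might look roughly like the sketch below. This is a minimal sketch under stated assumptions, not the client's actual pipeline: the S3 paths and the catalog, database, and table names are hypothetical, and it assumes a Spark session configured with the Iceberg runtime and an AWS Glue-backed catalog.

```python
# Minimal sketch: batch PySpark job curating raw payment events into an
# Apache Iceberg table registered in an AWS Glue-backed catalog.
# All paths and catalog/database/table names are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = (
    SparkSession.builder.appName("curate-transactions")
    # Assumed Iceberg catalog configuration backed by AWS Glue.
    .config("spark.sql.catalog.lake", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.lake.catalog-impl", "org.apache.iceberg.aws.glue.GlueCatalog")
    .config("spark.sql.catalog.lake.warehouse", "s3://example-data-lake/warehouse/")
    .getOrCreate()
)

# Raw JSON events landed in S3 by DMS/Kinesis (hypothetical path).
raw = spark.read.json("s3://example-data-lake/raw/payments/")

# Basic curation: cast types, derive a partition-friendly date, drop duplicates.
curated = (
    raw.withColumn("amount", F.col("amount").cast("decimal(18,2)"))
       .withColumn("event_date", F.to_date("event_ts"))
       .dropDuplicates(["transaction_id"])
)

# Append into an existing Iceberg table; the partition spec and table
# properties would be defined once when the table is created.
curated.writeTo("lake.analytics.curated_transactions").append()
```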
Architect scalable, cost-optimized pipelines across real-time and batch paradigms, using tools such as AWS Glue, Step Functions, Airflow, or EMR.
Manage ingestion from transactional sources using AWS DMS, with a focus on schema drift handling and low-latency replication.
Design efficient partitioning, compression, and metadata strategies for Iceberg or Hudi tables stored in S3, and cataloged with Glue and Lake Formation.
Build data marts, audit views, and analytics layers that support both machine-driven processes (e.g. fraud engines) and human-readable interfaces (e.g. dashboards).
Ensure robust data observability with metrics, alerting, and lineage tracking via OpenLineage or Great Expectations.
Lead quarterly reviews of data cost, performance, schema evolution, and architecture design with stakeholders and senior leadership.
Enforce version control, CI/CD, and infrastructure-as-code practices using GitOps and tools like Terraform.
Requirements
At least 7 years of experience in data engineering.
Deep hands-on experience with the AWS data stack: Glue (Jobs & Crawlers), S3, Athena, Lake Formation, DMS, and Redshift Spectrum.
Expertise in designing data pipelines for real-time, streaming, and batch systems, including schema design, format optimization, and SLAs.
Strong programming skills in Python (PySpark) and advanced SQL for analytical processing and transformation.
Proven experience managing data architectures using open table formats (Iceberg, Delta Lake, Hudi) at scale.
Understanding of stream processing with Kinesis/Kafka and orchestration via Airflow or Step Functions.
Experience implementing data access controls, encryption policies, and compliance workflows in regulated environments.
Ability to integrate GenAI tools into data engineering processes to drive measurable productivity and quality gains with strong engineering hygiene.
Demonstrated ability to lead teams, drive architectural decisions, and collaborate with cross-functional stakeholders.
Experience working in a PCI DSS or other central-bank-regulated environment with audit logging and data retention requirements.
Experience in the payments or banking domain, with use cases around reconciliation, chargeback analysis, or fraud detection (a minimal reconciliation sketch follows this list).
Familiarity with data contracts, data mesh patterns, and data-as-a-product principles.
Experience using GenAI to automate data documentation, generate data tests, or support reconciliation use cases.
Exposure to performance tuning and cost optimization strategies in AWS Glue, Athena, and S3.
Experience building data platforms for ML/AI teams or integrating with model feature stores.
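To ground the reconciliation use cases named above, here is a minimal, library-agnostic PySpark sketch. The table names, join key, and amount tolerance are illustrative assumptions rather than the client's actual logic, and it assumes the referenced curated tables already exist in the same Glue-backed catalog.

```python
# Illustrative reconciliation sketch: compare internal ledger entries against
# processor settlement records and surface mismatches for audit/ops review.
# Table names, the join key, and the 0.01 tolerance are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("txn-reconciliation").getOrCreate()

ledger = spark.table("lake.analytics.ledger_transactions")          # assumed table
settlements = spark.table("lake.analytics.processor_settlements")   # assumed table

# Full outer join on the shared business key so rows missing on either side surface.
joined = ledger.alias("l").join(settlements.alias("s"), on="transaction_id", how="full_outer")

reconciled = joined.select(
    "transaction_id",
    F.col("l.amount").alias("ledger_amount"),
    F.col("s.amount").alias("settled_amount"),
).withColumn(
    "status",
    F.when(F.col("ledger_amount").isNull(), F.lit("missing_in_ledger"))
     .when(F.col("settled_amount").isNull(), F.lit("missing_in_settlement"))
     .when(F.abs(F.col("ledger_amount") - F.col("settled_amount")) > 0.01, F.lit("amount_mismatch"))
     .otherwise(F.lit("matched")),
)

# Exceptions feed an audit view; matched counts feed SLA/observability metrics.
exceptions = reconciled.filter(F.col("status") != "matched")
exceptions.writeTo("lake.analytics.reconciliation_exceptions").append()
```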
Engagement Model: Direct placement with client
This is a remote role.
Shift timings: 10 AM to 7 PM
Location | Compensation
Faridabad, Haryana, India | 40.0 - 40.0 Lacs P.A.
Greater Hyderabad Area | 40.0 - 40.0 Lacs P.A.
Mumbai, New Delhi, Bengaluru | 9.0 - 10.0 Lacs P.A.
Kochi, Kerala, India | 40.0 - 40.0 Lacs P.A.
Greater Bhopal Area | 40.0 - 40.0 Lacs P.A.
Indore, Madhya Pradesh, India | 40.0 - 40.0 Lacs P.A.
Visakhapatnam, Andhra Pradesh, India | 40.0 - 40.0 Lacs P.A.
Chandigarh, India | 40.0 - 40.0 Lacs P.A.
Thiruvananthapuram, Kerala, India | 40.0 - 40.0 Lacs P.A.