Data Engineer - AWS Data Pipelines & ETL

Experience: 4 years

Posted: 4 days ago | Platform: LinkedIn

Work Mode: On-site

Job Type: Full Time

Job Description

Job Title: Data Engineer – AWS Data Pipelines & ETL

Location: Nagpur, Maharashtra, India (On-site)

Experience: 4+ years

Work Type: Direct Hire/Permanent Employment


We are seeking an experienced and motivated Data Engineer to design, develop, and maintain scalable data pipelines and ETL (Extract, Transform, Load) processes in an AWS cloud environment. This role will be critical in enabling efficient data integration, transformation, and reporting across the organization, driving data-driven decision-making.


Key Responsibilities:


Design, Develop, and Maintain Scalable Data Pipelines:

- Architect, build, and manage robust and scalable data pipelines to ingest, transform, and load data from various structured and unstructured sources into data lakes or data warehouses.

- Leverage AWS-native services to build cloud-native solutions that are highly available, reliable, and cost-efficient.


Collaborate with Stakeholders:

- Work closely with data architects, analysts, business intelligence teams, and other stakeholders to gather data requirements, understand reporting and analytics needs, and translate them into actionable technical solutions.

- Act as a technical liaison to align data engineering efforts with organizational goals.


Implement Data Integration and Transformation:

- Develop and automate ETL workflows using AWS Glue, Redshift, S3, Lambda, Athena, Step Functions, and other relevant AWS services.

- Integrate data from disparate systems, ensuring data consistency and conformity to standards.
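
As an illustration of the kind of Glue-based workflow this responsibility covers, the following is a minimal sketch of a PySpark AWS Glue job; the catalog database, table, and S3 bucket names are placeholders, not actual project resources.

    # Minimal AWS Glue PySpark job sketch: read raw data from the Glue Data
    # Catalog, apply a simple schema mapping, and write partitioned Parquet to S3.
    # All database, table, and bucket names below are illustrative placeholders.
    import sys
    from awsglue.transforms import ApplyMapping
    from awsglue.utils import getResolvedOptions
    from awsglue.context import GlueContext
    from awsglue.job import Job
    from pyspark.context import SparkContext

    args = getResolvedOptions(sys.argv, ["JOB_NAME"])
    glue_context = GlueContext(SparkContext())
    job = Job(glue_context)
    job.init(args["JOB_NAME"], args)

    # Read the raw dataset registered in the Glue Data Catalog.
    raw = glue_context.create_dynamic_frame.from_catalog(
        database="raw_db", table_name="orders_csv"
    )

    # Rename and cast columns so the output conforms to the warehouse schema.
    mapped = ApplyMapping.apply(
        frame=raw,
        mappings=[
            ("order_id", "string", "order_id", "string"),
            ("amount", "string", "amount", "double"),
            ("order_date", "string", "order_date", "date"),
        ],
    )

    # Write partitioned Parquet to the curated zone of the data lake.
    glue_context.write_dynamic_frame.from_options(
        frame=mapped,
        connection_type="s3",
        connection_options={
            "path": "s3://example-curated-bucket/orders/",
            "partitionKeys": ["order_date"],
        },
        format="parquet",
    )

    job.commit()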


Ensure Data Quality, Integrity, and Security:

- Establish and maintain data validation, cleansing, and monitoring processes to guarantee high data quality across all stages of the data lifecycle.

- Implement security best practices for data access control, encryption, and compliance with regulatory requirements.


Optimize SQL and Query Performance:

- Write, debug, and optimize complex SQL queries for data extraction, transformation, and loading.

- Optimize queries and processing workflows for performance, scalability, and cost-efficiency on large datasets.
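
To make the cost and performance angle concrete, here is a small illustrative snippet (placeholder database, table, region, and S3 paths) that runs an Athena query from Python with boto3; filtering on a partition column lets Athena prune partitions and scan less data.

    # Illustrative only: run a partition-pruned Athena query via boto3.
    # The database, table, region, and output bucket are placeholder names.
    import boto3

    athena = boto3.client("athena", region_name="ap-south-1")

    query = """
        SELECT order_id, amount
        FROM curated_db.orders
        WHERE order_date = DATE '2024-01-01'  -- partition predicate limits data scanned
    """

    response = athena.start_query_execution(
        QueryString=query,
        QueryExecutionContext={"Database": "curated_db"},
        ResultConfiguration={"OutputLocation": "s3://example-athena-results/"},
    )
    print("Started Athena query:", response["QueryExecutionId"])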


Performance Tuning and Optimization:

- Continuously monitor and improve the performance of data pipelines, addressing bottlenecks, reducing latency, and minimizing processing costs.


Implement Rigorous Testing and Validation:

- Establish unit tests, integration tests, and validation frameworks to ensure the accuracy, completeness, and reliability of data pipelines.

- Perform root cause analysis and troubleshoot data discrepancies and failures in ETL processes.
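
As an illustration of the kind of testing meant here, the following pytest-style unit test checks a small pandas transform; the function and column names are invented for the example, not project code.

    # Illustrative unit test for a small transform; function and column names
    # are examples only.
    import pandas as pd

    def clean_orders(df: pd.DataFrame) -> pd.DataFrame:
        """Drop rows with missing order IDs and cast amounts to numeric."""
        out = df.dropna(subset=["order_id"]).copy()
        out["amount"] = out["amount"].astype(float)
        return out

    def test_clean_orders_drops_missing_ids_and_casts_amounts():
        raw = pd.DataFrame(
            {"order_id": ["A1", None, "A3"], "amount": ["10.5", "3.0", "7"]}
        )
        cleaned = clean_orders(raw)
        assert len(cleaned) == 2                    # row with missing ID removed
        assert cleaned["amount"].dtype.kind == "f"  # amounts cast to float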


Documentation and Knowledge Sharing:

- Develop and maintain clear, comprehensive documentation for data pipelines, workflows, architecture diagrams, and standard operating procedures.

- Create technical guides and training materials to support cross-functional teams in utilizing data platforms.


Technology Requirements:


AWS Cloud Services (Required):

- AWS Glue

- Amazon Redshift

- Amazon S3

- AWS Lambda

- Amazon Athena

- AWS Step Functions

- Amazon CloudWatch (for monitoring and logging)


Databases and Data Warehousing (Required):

- PostgreSQL, MySQL (or other RDBMS)

- Redshift Spectrum

- Exposure to NoSQL systems (optional)


Data Integration and Transformation Tools (Nice to have):

- PySpark, Apache Spark

- Pandas (Python)

- SQL-based ETL solutions


Programming Languages (Nice to have):

- Python (preferred)

- SQL

- (Optional: Scala, Java for Spark-based pipelines)


Workflow Orchestration (Nice to have):

- Airflow, AWS Step Functions, or similar


Version Control & DevOps (Nice to have):

- Git

- Experience with CI/CD pipelines for data workflows

- Infrastructure as Code (CloudFormation, Terraform), optional but preferred


BI and Querying Tools (Nice to have):

- Amazon QuickSight

- Tableau, Power BI (exposure preferred)


Qualifications:


- Proven experience building ETL pipelines and data integration solutions on AWS.

- Strong expertise in SQL, data modeling, and query optimization.

- Familiarity with data security, governance, and compliance best practices.

- Hands-on experience with data lake and data warehouse architectures.

- Excellent problem-solving, debugging, and troubleshooting skills.

- Ability to work collaboratively in a cross-functional, agile environment.

- Strong communication and documentation skills.
