PySpark Developer

5 years

14 - 18 Lacs

Posted: 10 hours ago | Platform: GlassDoor


Work Mode: On-site

Job Type: Full Time

Job Description

Job Title: PySpark Developer

Locations: Chennai, Hyderabad, Kolkata
Work Schedule: Monday–Friday (5 days work from office)
Experience: 5+ years in Backend/Data Engineering
Notice Period: Immediate – 15 days
Must-Have: Python, PySpark, Amazon Redshift, PostgreSQL

About the Role

We are seeking an experienced PySpark Developer with strong data engineering expertise to design, develop, and optimize scalable data pipelines for large-scale data processing. The role involves working across distributed systems, ETL/ELT frameworks, cloud data platforms, and analytics-driven architecture. You will collaborate closely with cross-functional teams to ensure efficient ingestion, transformation, and delivery of high-quality data.

Key Responsibilities

  • Design and develop robust, scalable ETL/ELT pipelines using PySpark to process data from databases, APIs, logs, and file-based sources.
  • Convert raw data into analysis-ready datasets for data hubs and analytical data marts.
  • Build reusable, parameterized Spark jobs for batch and micro-batch processing (a short sketch follows this list).
  • Optimize PySpark performance to handle large and complex datasets.
  • Ensure data quality, consistency, and lineage, and maintain detailed documentation for all ingestion workflows.
  • Collaborate with Data Architects, Data Modelers, and Data Scientists to implement data ingestion logic aligned with business requirements.
  • Work with AWS services (S3, Glue, EMR, Redshift) for data ingestion, storage, and processing.
  • Support version control, CI/CD practices, and infrastructure-as-code workflows as needed.
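
For illustration only (not part of the original posting): a minimal sketch of the kind of reusable, parameterized PySpark batch job described above, assuming pyspark is installed; the S3 paths, argument names, and column names are hypothetical.

    # Hypothetical reusable batch ETL job; paths, arguments, and columns are
    # illustrative, not taken from the posting.
    import argparse

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    def main() -> None:
        parser = argparse.ArgumentParser(description="Parameterized batch ETL job")
        parser.add_argument("--source-path", required=True)  # e.g. s3://bucket/raw/orders/
        parser.add_argument("--target-path", required=True)  # e.g. s3://bucket/marts/orders/
        parser.add_argument("--run-date", required=True)     # e.g. 2024-01-31
        args = parser.parse_args()

        spark = SparkSession.builder.appName("orders-batch-etl").getOrCreate()

        # Ingest raw JSON, keep only the requested date, and derive
        # analysis-ready columns.
        orders = (
            spark.read.json(args.source_path)
            .where(F.col("event_date") == args.run_date)
            .withColumn("amount", F.col("amount").cast("decimal(18,2)"))
            .dropDuplicates(["order_id"])
        )

        # Write an analysis-ready dataset, partitioned by date, for the data mart.
        orders.write.mode("overwrite").partitionBy("event_date").parquet(args.target_path)

        spark.stop()

    if __name__ == "__main__":
        main()

A job of this shape would typically be submitted via spark-submit on EMR, with the arguments supplied by an orchestrator such as AWS Step Functions.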

Must-Have Skills

  • 5+ years of data engineering experience, with a strong focus on PySpark/Spark.
  • Proven experience building ingestion frameworks for relational, semi-structured (JSON, XML), and unstructured data (logs, PDFs); a short ingestion sketch follows this list.
  • Strong command of Python, including its key data-processing libraries.
  • Advanced SQL proficiency (Redshift, PostgreSQL, or similar).
  • Hands-on experience with distributed computing platforms (Spark on EMR, Databricks, etc.).
  • Familiarity with workflow orchestration tools (AWS Step Functions or similar).
  • Strong understanding of data lake and data warehouse architectures, including core data modeling concepts.
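
For illustration only: a minimal sketch of ingesting semi-structured JSON and flattening it into a relational, analysis-ready shape; the schema, field names, and paths are hypothetical.

    # Hypothetical semi-structured ingestion: each element of the nested
    # `items` array becomes its own row in a flat, relational table.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("json-flatten-demo").getOrCreate()

    raw = spark.read.json("s3://example-bucket/raw/events/")  # placeholder path

    flat = (
        raw.withColumn("item", F.explode("items"))
        .select(
            "event_id",
            F.col("customer.id").alias("customer_id"),
            F.col("item.sku").alias("sku"),
            F.col("item.qty").cast("int").alias("qty"),
        )
    )

    flat.write.mode("append").parquet("s3://example-bucket/curated/events/")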

Good-to-Have Skills

  • Experience with AWS services: Glue, S3, Redshift, Lambda, CloudWatch, etc.
  • Exposure to Delta Lake or similar large-scale storage frameworks.
  • Experience with real-time streaming tools such as Spark Structured Streaming and Kafka (a minimal sketch follows this list).
  • Understanding of data governance, lineage, and cataloging tools (Glue Catalog, Apache Atlas).
  • Knowledge of DevOps and CI/CD pipelines (Git, Jenkins, etc.).
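
For illustration only: a minimal sketch of the Spark Structured Streaming + Kafka pattern listed above; the broker address, topic, schema, and storage locations are placeholder assumptions, and the spark-sql-kafka connector package is assumed to be on the classpath.

    # Hypothetical streaming ingestion: parse JSON events from Kafka and land
    # them as parquet with checkpointing for fault tolerance.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F
    from pyspark.sql.types import StringType, StructField, StructType, TimestampType

    spark = SparkSession.builder.appName("clickstream-streaming").getOrCreate()

    schema = StructType([
        StructField("user_id", StringType()),
        StructField("page", StringType()),
        StructField("ts", TimestampType()),
    ])

    # Read micro-batches from Kafka and parse the JSON payload.
    events = (
        spark.readStream.format("kafka")
        .option("kafka.bootstrap.servers", "broker:9092")  # placeholder broker
        .option("subscribe", "clickstream")                # placeholder topic
        .load()
        .select(F.from_json(F.col("value").cast("string"), schema).alias("e"))
        .select("e.*")
    )

    query = (
        events.writeStream.format("parquet")
        .option("path", "s3://example-bucket/streaming/clickstream/")
        .option("checkpointLocation", "s3://example-bucket/checkpoints/clickstream/")
        .trigger(processingTime="1 minute")
        .start()
    )
    query.awaitTermination()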

Job Type: Full-time

Pay: ₹1,400,000.00 - ₹1,800,000.00 per year

Application Question(s):

  • How many years of experience do you have as a PySpark Developer?
  • Have you worked with Python, Amazon Redshift, and PostgreSQL?
  • What is your current location?
  • What are your notice period, current CTC, and expected CTC?

Work Location: In person
