2.0 - 5.0 years

4 - 8 Lacs

Kolkata

Hybrid

Type: Contract-to-Hire (C2H)

Job Summary
We are looking for a skilled PySpark Developer with hands-on experience in building scalable data pipelines and processing large datasets. The ideal candidate will have deep expertise in Apache Spark, Python, and modern data engineering tools in cloud environments such as AWS.

Key Skills & Responsibilities
- Strong expertise in PySpark and Apache Spark for batch and real-time data processing.
- Experience in designing and implementing ETL pipelines, including data ingestion, transformation, and validation.
- Proficiency in Python for scripting, automation, and building reusable components.
- Hands-on experience with scheduling tools like Airflow or Control-M to orchestrate workflows.
- Familiarity with the AWS ecosystem, especially S3 and related file system operations.
- Strong understanding of Unix/Linux environments and shell scripting.
- Experience with Hadoop, Hive, and platforms like Cloudera or Hortonworks.
- Ability to handle CDC (Change Data Capture) operations on large datasets.
- Experience in performance tuning, optimizing Spark jobs, and troubleshooting.
- Strong knowledge of data modeling, data validation, and writing unit test cases.
- Exposure to real-time and batch integration with downstream/upstream systems.
- Working knowledge of Jupyter Notebook, Zeppelin, or PyCharm for development and debugging.
- Understanding of Agile methodologies, with experience in CI/CD tools (e.g., Jenkins, Git).

Preferred Skills
- Experience in building or integrating APIs for data provisioning.
- Exposure to ETL or reporting tools such as Informatica, Tableau, Jasper, or QlikView.
- Familiarity with AI/ML model development using PySpark in cloud environments.

Skills: PySpark, Apache Spark, Python, SQL, Hadoop, Hive, Cloudera, Hortonworks, AWS, AWS S3, Airflow, Control-M, CDC, ETL tools, ETL pipelines, data modeling, data validation, performance tuning, unit test cases, batch integration, real-time integration, API integration, Unix/Linux, shell scripting, CI/CD, Jenkins, Git, Jupyter Notebook, Zeppelin, PyCharm, Informatica, Tableau, Jasper, QlikView, Agile methodologies, AI/ML model development

Posted 1 week ago

PySpark Developer Apex One

6.0 - 11.0 years

8 - 12 Lacs

Chennai

Hybrid

Work Mode: Hybrid
Interview Mode: Virtual (2 Rounds)
Type: Contract-to-Hire (C2H)

Job Summary
We are looking for a skilled PySpark Developer with hands-on experience in building scalable data pipelines and processing large datasets. The ideal candidate will have deep expertise in Apache Spark, Python, and modern data engineering tools in cloud environments such as AWS.

Key Skills & Responsibilities
- Strong expertise in PySpark and Apache Spark for batch and real-time data processing.
- Experience in designing and implementing ETL pipelines, including data ingestion, transformation, and validation.
- Proficiency in Python for scripting, automation, and building reusable components.
- Hands-on experience with scheduling tools like Airflow or Control-M to orchestrate workflows.
- Familiarity with the AWS ecosystem, especially S3 and related file system operations.
- Strong understanding of Unix/Linux environments and shell scripting.
- Experience with Hadoop, Hive, and platforms like Cloudera or Hortonworks.
- Ability to handle CDC (Change Data Capture) operations on large datasets.
- Experience in performance tuning, optimizing Spark jobs, and troubleshooting.
- Strong knowledge of data modeling, data validation, and writing unit test cases.
- Exposure to real-time and batch integration with downstream/upstream systems.
- Working knowledge of Jupyter Notebook, Zeppelin, or PyCharm for development and debugging.
- Understanding of Agile methodologies, with experience in CI/CD tools (e.g., Jenkins, Git).

Preferred Skills
- Experience in building or integrating APIs for data provisioning.
- Exposure to ETL or reporting tools such as Informatica, Tableau, Jasper, or QlikView.
- Familiarity with AI/ML model development using PySpark in cloud environments.

Skills: PySpark, Apache Spark, Python, SQL, Hadoop, Hive, Cloudera, Hortonworks, AWS, AWS S3, Airflow, Control-M, CDC, ETL tools, ETL pipelines, data modeling, data validation, performance tuning, unit test cases, batch integration, real-time integration, API integration, Unix/Linux, shell scripting, CI/CD, Jenkins, Git, Jupyter Notebook, Zeppelin, PyCharm, Informatica, Tableau, Jasper, QlikView, Agile methodologies, AI/ML model development

Posted 1 week ago

PySpark Developer Apex One

6.0 - 8.0 years

5 - 8 Lacs

Mumbai

Hybrid

Work Mode: Hybrid
Interview Mode: Virtual (2 Rounds)
Type: Contract-to-Hire (C2H)

Job Summary
We are looking for a skilled PySpark Developer with hands-on experience in building scalable data pipelines and processing large datasets. The ideal candidate will have deep expertise in Apache Spark, Python, and modern data engineering tools in cloud environments such as AWS.

Key Skills & Responsibilities
- Strong expertise in PySpark and Apache Spark for batch and real-time data processing.
- Experience in designing and implementing ETL pipelines, including data ingestion, transformation, and validation.
- Proficiency in Python for scripting, automation, and building reusable components.
- Hands-on experience with scheduling tools like Airflow or Control-M to orchestrate workflows.
- Familiarity with the AWS ecosystem, especially S3 and related file system operations.
- Strong understanding of Unix/Linux environments and shell scripting.
- Experience with Hadoop, Hive, and platforms like Cloudera or Hortonworks.
- Ability to handle CDC (Change Data Capture) operations on large datasets.
- Experience in performance tuning, optimizing Spark jobs, and troubleshooting.
- Strong knowledge of data modeling, data validation, and writing unit test cases.
- Exposure to real-time and batch integration with downstream/upstream systems.
- Working knowledge of Jupyter Notebook, Zeppelin, or PyCharm for development and debugging.
- Understanding of Agile methodologies, with experience in CI/CD tools (e.g., Jenkins, Git).

Preferred Skills
- Experience in building or integrating APIs for data provisioning.
- Exposure to ETL or reporting tools such as Informatica, Tableau, Jasper, or QlikView.
- Familiarity with AI/ML model development using PySpark in cloud environments.

Skills: PySpark, Apache Spark, Python, SQL, Hadoop, Hive, Cloudera, Hortonworks, AWS, AWS S3, Airflow, Control-M, CDC, ETL tools, ETL pipelines, data modeling, data validation, performance tuning, unit test cases, batch integration, real-time integration, API integration, Unix/Linux, shell scripting, CI/CD, Jenkins, Git, Jupyter Notebook, Zeppelin, PyCharm, Informatica, Tableau, Jasper, QlikView, Agile methodologies, AI/ML model development

Posted 2 weeks ago

PySpark Developer Apex One

5.0 - 7.0 years

9 - 11 Lacs

Hyderabad

Work from Office

Role: PySpark Developer
Locations: Multiple
Work Mode: Hybrid
Interview Mode: Virtual (2 Rounds)
Type: Contract-to-Hire (C2H)

Job Summary
We are looking for a skilled PySpark Developer with hands-on experience in building scalable data pipelines and processing large datasets. The ideal candidate will have deep expertise in Apache Spark, Python, and modern data engineering tools in cloud environments such as AWS.

Key Skills & Responsibilities
- Strong expertise in PySpark and Apache Spark for batch and real-time data processing.
- Experience in designing and implementing ETL pipelines, including data ingestion, transformation, and validation.
- Proficiency in Python for scripting, automation, and building reusable components.
- Hands-on experience with scheduling tools like Airflow or Control-M to orchestrate workflows.
- Familiarity with the AWS ecosystem, especially S3 and related file system operations.
- Strong understanding of Unix/Linux environments and shell scripting.
- Experience with Hadoop, Hive, and platforms like Cloudera or Hortonworks.
- Ability to handle CDC (Change Data Capture) operations on large datasets.
- Experience in performance tuning, optimizing Spark jobs, and troubleshooting.
- Strong knowledge of data modeling, data validation, and writing unit test cases.
- Exposure to real-time and batch integration with downstream/upstream systems.
- Working knowledge of Jupyter Notebook, Zeppelin, or PyCharm for development and debugging.
- Understanding of Agile methodologies, with experience in CI/CD tools (e.g., Jenkins, Git).

Preferred Skills
- Experience in building or integrating APIs for data provisioning.
- Exposure to ETL or reporting tools such as Informatica, Tableau, Jasper, or QlikView.
- Familiarity with AI/ML model development using PySpark in cloud environments.

Posted 2 weeks ago

Compliance-Hyderabad-Associate-Software Engineering Goldman Sachs

5.0 - 10.0 years

10 - 12 Lacs

Hyderabad / Secunderabad, Telangana, India

On-site

YOUR IMPACT
Are you passionate about developing mission-critical, high-quality software solutions, using cutting-edge technology, in a dynamic environment?

OUR IMPACT
We are Compliance Engineering, a global team of more than 300 engineers and scientists who work on the most complex, mission-critical problems. We:
- build and operate a suite of platforms and applications that prevent, detect, and mitigate regulatory and reputational risk across the firm.
- have access to the latest technology and to massive amounts of structured and unstructured data.
- leverage modern frameworks to build responsive and intuitive UX/UI and Big Data applications.

Compliance Engineering is looking to fill several big data software engineering roles. Your first deliverable and success criteria will be the deployment, in 2025, of new complex data pipelines and surveillance models to detect inappropriate trading activity.

HOW YOU WILL FULFILL YOUR POTENTIAL
As a member of our team, you will:
- partner globally with sponsors, users, and engineering colleagues across multiple divisions to create end-to-end solutions,
- learn from experts,
- leverage various technologies including Java, Spark, Hadoop, Flink, MapReduce, HBase, JSON, Protobuf, Presto, Elastic Search, Kafka, and Kubernetes,
- be able to innovate and incubate new ideas,
- have an opportunity to work on a broad range of problems, including negotiating data contracts, capturing data quality metrics, processing large-scale data, and building surveillance detection models,
- be involved in the full life cycle: defining, designing, implementing, testing, deploying, and maintaining software systems across our products.

QUALIFICATIONS
A successful candidate will possess the following attributes:
- A Bachelor's or Master's degree in Computer Science, Computer Engineering, or a similar field of study.
- Expertise in Java, as well as proficiency with databases and data manipulation.
- Experience in end-to-end solutions, automated testing, and SDLC concepts.
- The ability (and tenacity) to clearly express ideas and arguments in meetings and on paper.

Experience in some of the following is desired and can set you apart from other candidates:
- developing in large-scale systems, such as MapReduce on Hadoop/HBase,
- data analysis using tools such as SQL, Spark SQL, and Zeppelin/Jupyter,
- API design, such as to create interconnected services,
- knowledge of the financial industry and compliance or risk functions,
- ability to influence stakeholders.

Posted 4 weeks ago
