Get alerts for new jobs matching your selected skills, preferred locations, and experience range. Manage Job Alerts
2.0 - 5.0 years
4 - 8 Lacs
Kolkata
Hybrid
Type: Contract-to-Hire (C2H) Job Summary We are looking for a skilled PySpark Developer with hands-on experience in building scalable data pipelines and processing large datasets. The ideal candidate will have deep expertise in Apache Spark , Python , and working with modern data engineering tools in cloud environments such as AWS . Key Skills & Responsibilities Strong expertise in PySpark and Apache Spark for batch and real-time data processing. Experience in designing and implementing ETL pipelines, including data ingestion, transformation, and validation. Proficiency in Python for scripting, automation, and building reusable components. Hands-on experience with scheduling tools like Airflow or Control-M to orchestrate workflows. Familiarity with AWS ecosystem, especially S3 and related file system operations. Strong understanding of Unix/Linux environments and Shell scripting. Experience with Hadoop, Hive, and platforms like Cloudera or Hortonworks. Ability to handle CDC (Change Data Capture) operations on large datasets. Experience in performance tuning, optimizing Spark jobs, and troubleshooting. Strong knowledge of data modeling, data validation, and writing unit test cases. Exposure to real-time and batch integration with downstream/upstream systems. Working knowledge of Jupyter Notebook, Zeppelin, or PyCharm for development and debugging. Understanding of Agile methodologies, with experience in CI/CD tools (e.g., Jenkins, Git). Preferred Skills Experience in building or integrating APIs for data provisioning. Exposure to ETL or reporting tools such as Informatica, Tableau, Jasper, or QlikView. Familiarity with AI/ML model development using PySpark in cloud environments Skills: ci/cd,zeppelin,pycharm,pyspark,etl tools,control-m,unit test cases,tableau,performance tuning,jenkins,qlikview,informatica,jupyter notebook,api integration,unix/linux,git,aws s3,hive,cloudera,jasper,airflow,cdc,pyspark, apache spark, python, aws s3, airflow/control-m, sql, unix/linux, hive, hadoop, data modeling, and performance tuning,agile methodologies,aws,s3,data modeling,data validation,ai/ml model development,batch integration,apache spark,python,etl pipelines,shell scripting,hortonworks,real-time integration,hadoop
Posted 1 week ago
6.0 - 11.0 years
8 - 12 Lacs
Chennai
Hybrid
Work Mode: Hybrid Interview Mode: Virtual (2 Rounds) Type: Contract-to-Hire (C2H) Job Summary We are looking for a skilled PySpark Developer with hands-on experience in building scalable data pipelines and processing large datasets. The ideal candidate will have deep expertise in Apache Spark , Python , and working with modern data engineering tools in cloud environments such as AWS . Key Skills & Responsibilities Strong expertise in PySpark and Apache Spark for batch and real-time data processing. Experience in designing and implementing ETL pipelines, including data ingestion, transformation, and validation. Proficiency in Python for scripting, automation, and building reusable components. Hands-on experience with scheduling tools like Airflow or Control-M to orchestrate workflows. Familiarity with AWS ecosystem, especially S3 and related file system operations. Strong understanding of Unix/Linux environments and Shell scripting. Experience with Hadoop, Hive, and platforms like Cloudera or Hortonworks. Ability to handle CDC (Change Data Capture) operations on large datasets. Experience in performance tuning, optimizing Spark jobs, and troubleshooting. Strong knowledge of data modeling, data validation, and writing unit test cases. Exposure to real-time and batch integration with downstream/upstream systems. Working knowledge of Jupyter Notebook, Zeppelin, or PyCharm for development and debugging. Understanding of Agile methodologies, with experience in CI/CD tools (e.g., Jenkins, Git). Preferred Skills Experience in building or integrating APIs for data provisioning. Exposure to ETL or reporting tools such as Informatica, Tableau, Jasper, or QlikView. Familiarity with AI/ML model development using PySpark in cloud environments Skills: ci / cd , zeppelin , pycharm , pyspark , etl tools,control-m,unit test cases,tableau,performance tuning , jenkins , qlikview , informatica , jupyter notebook,api integration,unix/linux,git,aws s3 , hive , cloudera , jasper , airflow , cdc , pyspark , apache spark, python, aws s3, airflow/control-m, sql, unix/linux, hive, hadoop, data modeling, and performance tuning,agile methodologies,aws,s3,data modeling,data validation,ai/ml model development,batch integration,apache spark,python,etl pipelines,shell scripting,hortonworks,real-time integration,hadoop
Posted 1 week ago
6.0 - 8.0 years
5 - 8 Lacs
Mumbai
Hybrid
Work Mode: Hybrid Interview Mode: Virtual (2 Rounds) Type: Contract-to-Hire (C2H) Job Summary We are looking for a skilled PySpark Developer with hands-on experience in building scalable data pipelines and processing large datasets. The ideal candidate will have deep expertise in Apache Spark , Python , and working with modern data engineering tools in cloud environments such as AWS . Key Skills & Responsibilities Strong expertise in PySpark and Apache Spark for batch and real-time data processing. Experience in designing and implementing ETL pipelines, including data ingestion, transformation, and validation. Proficiency in Python for scripting, automation, and building reusable components. Hands-on experience with scheduling tools like Airflow or Control-M to orchestrate workflows. Familiarity with AWS ecosystem, especially S3 and related file system operations. Strong understanding of Unix/Linux environments and Shell scripting. Experience with Hadoop, Hive, and platforms like Cloudera or Hortonworks. Ability to handle CDC (Change Data Capture) operations on large datasets. Experience in performance tuning, optimizing Spark jobs, and troubleshooting. Strong knowledge of data modeling, data validation, and writing unit test cases. Exposure to real-time and batch integration with downstream/upstream systems. Working knowledge of Jupyter Notebook, Zeppelin, or PyCharm for development and debugging. Understanding of Agile methodologies, with experience in CI/CD tools (e.g., Jenkins, Git). Preferred Skills Experience in building or integrating APIs for data provisioning. Exposure to ETL or reporting tools such as Informatica, Tableau, Jasper, or QlikView. Familiarity with AI/ML model development using PySpark in cloud environments Skills: ci/cd,zeppelin,pycharm,pyspark,etl tools,control-m,unit test cases,tableau,performance tuning,jenkins,qlikview,informatica,jupyter notebook,api integration,unix/linux,git,aws s3,hive,cloudera,jasper,airflow,cdc,pyspark, apache spark, python, aws s3, airflow/control-m, sql, unix/linux, hive, hadoop, data modeling, and performance tuning,agile methodologies,aws,s3,data modeling,data validation,ai/ml model development,batch integration,apache spark,python,etl pipelines,shell scripting,hortonworks,real-time integration,hadoop
Posted 2 weeks ago
5.0 - 7.0 years
9 - 11 Lacs
Hyderabad
Work from Office
Role: PySpark DeveloperLocations:MultipleWork Mode: Hybrid Interview Mode: Virtual (2 Rounds) Type: Contract-to-Hire (C2H) Job Summary We are looking for a skilled PySpark Developer with hands-on experience in building scalable data pipelines and processing large datasets. The ideal candidate will have deep expertise in Apache Spark, Python, and working with modern data engineering tools in cloud environments such as AWS. Key Skills & Responsibilities Strong expertise in PySpark and Apache Spark for batch and real-time data processing. Experience in designing and implementing ETL pipelines, including data ingestion, transformation, and validation. Proficiency in Python for scripting, automation, and building reusable components. Hands-on experience with scheduling tools like Airflow or Control-M to orchestrate workflows. Familiarity with AWS ecosystem, especially S3 and related file system operations. Strong understanding of Unix/Linux environments and Shell scripting. Experience with Hadoop, Hive, and platforms like Cloudera or Hortonworks. Ability to handle CDC (Change Data Capture) operations on large datasets. Experience in performance tuning, optimizing Spark jobs, and troubleshooting. Strong knowledge of data modeling, data validation, and writing unit test cases. Exposure to real-time and batch integration with downstream/upstream systems. Working knowledge of Jupyter Notebook, Zeppelin, or PyCharm for development and debugging. Understanding of Agile methodologies, with experience in CI/CD tools (e.g., Jenkins, Git). Preferred Skills Experience in building or integrating APIs for data provisioning. Exposure to ETL or reporting tools such as Informatica, Tableau, Jasper, or QlikView. Familiarity with AI/ML model development using PySpark in cloud environments
Posted 2 weeks ago
5.0 - 10.0 years
10 - 12 Lacs
Hyderabad / Secunderabad, Telangana, Telangana, India
On-site
YOUR IMPACT Are you passionate about developing mission-critical, high quality software solutions, using cutting-edge technology,in a dynamic environment OUR IMPACT We are Compliance Engineering,a global team of more than 300 engineers and scientists whowork on the most complex, mission-critical problems. We: build and operate? a suite of platforms and applications that prevent, detect, and mitigate regulatory and reputational risk across the firm. have access to the latest technology andto massive amounts of structured and unstructured data. leverage modern frameworks to build responsive and intuitive UX/UI and Big Data applications. Compliance Engi??neering is looking to fillseveralbig data software engineering roles Your first deliverable and success criteria will be the deployment, in 2025, of newcomplex data pipelines and surveillance modelstodetect inappropriatetrading activity. ?HOW YOU WILL FULFILL YOUR POTENTIAL As a member of our team, you will: partner globally with sponsors,usersand engineering colleagues across multiple divisions to create end-to-end solutions, learn from experts, leverage varioustechnologies including; Java,Spark,Hadoop, Flink, MapReduce, HBase, JSON, Protobuf, Presto, Elastic Search, Kafka, Kubernetes be able to innovate and incubate new ideas, havean opportunity to work on a broad range of problems, includingnegotiatingdata contracts, capturing data quality metrics, processing large scaledata, buildingsurveillance detection models, be involved inthe full life cycle; defining,designing, implementing, testing, deploying, and maintaining software systems acrossour products. QUALIFICATIONS A successful candidate will possessthe followingattributes: A Bachelor's or Master's degreein Computer Science, Computer Engineering, or a similar field of study. Expertise in java, as well as proficiency with databasesand data manipulation. Experience in end-to-end solutions, automated testingand SDLC concepts. The ability (and tenacity) to clearly express ideas and arguments in meetings and on paper.? Experience inthe some offollowing is desired and can set you apart from other candidates: developing in large-scale systems, such as MapReduceon Hadoop/Hbase, data analysis using tools such as SQL, Spark SQL,Zeppelin/Jupyter, API design, such as to create interconnected services, knowledge of the financial industry and compliance or risk functions, abilityto influence stakeholders.
Posted 4 weeks ago
Upload Resume
Drag or click to upload
Your data is secure with us, protected by advanced encryption.
Browse through a variety of job opportunities tailored to your skills and preferences. Filter by location, experience, salary, and more to find your perfect fit.
We have sent an OTP to your contact. Please enter it below to verify.
Accenture
22558 Jobs | Dublin
Wipro
12294 Jobs | Bengaluru
EY
8435 Jobs | London
Accenture in India
7026 Jobs | Dublin 2
Uplers
6784 Jobs | Ahmedabad
Amazon
6588 Jobs | Seattle,WA
IBM
6430 Jobs | Armonk
Oracle
6230 Jobs | Redwood City
Virtusa
4470 Jobs | Southborough
Capgemini
4309 Jobs | Paris,France