
192 Dataproc Jobs

JobPe aggregates listings for easy access, but applications are submitted directly on the original job portal.

5.0 - 8.0 years

5 - 8 Lacs

Bengaluru

Work from Office

Skills desired: strong SQL (complex multi-table joins); Python (FastAPI or Flask framework); PySpark; commitment to work overlapping hours; GCP knowledge (BigQuery, Dataproc, and Dataflow); Amex experience preferred (not mandatory); Power BI preferred (not mandatory).
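As a rough, hedged illustration of how the skills above combine (not taken from the posting), the sketch below exposes a BigQuery join through a small Flask endpoint. The project, dataset, table, and column names are placeholders.

```python
# Minimal sketch: a Flask endpoint that runs a parameterized BigQuery join.
# All project/dataset/table names are illustrative placeholders.
from flask import Flask, jsonify
from google.cloud import bigquery

app = Flask(__name__)
bq = bigquery.Client()  # uses application-default credentials

@app.route("/orders/<customer_id>")
def customer_orders(customer_id: str):
    sql = """
        SELECT o.order_id, o.amount, c.segment
        FROM `my_project.sales.orders` AS o
        JOIN `my_project.sales.customers` AS c USING (customer_id)
        WHERE o.customer_id = @customer_id
    """
    job = bq.query(
        sql,
        job_config=bigquery.QueryJobConfig(
            query_parameters=[
                bigquery.ScalarQueryParameter("customer_id", "STRING", customer_id)
            ]
        ),
    )
    # Assumes the selected columns are JSON-serializable types.
    return jsonify([dict(row) for row in job.result()])

if __name__ == "__main__":
    app.run(port=8080)
```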

Posted 1 hour ago

Apply

5.0 - 9.0 years

0 Lacs

Pune, Maharashtra

On-site

As a Senior Engineer, VP at our Pune location in India, you will be responsible for managing and performing work across various areas of the bank's IT platform and infrastructure. Your role will involve analysis, development, and administration, with possible oversight of engineering delivery for specific departments. Your day-to-day tasks will include planning and developing engineering solutions to achieve business goals, ensuring reliability and resiliency in solutions, and promoting maintainability and reusability. You will play a key role in architecting well-integrated solutions and reviewing engineering plans to enhance capability and reusability.

You will collaborate with a cross-functional agile delivery team, bringing an innovative approach to software development and using the latest technologies and practices to deliver business value efficiently. Your focus will be on fostering a collaborative environment, open code sharing, and supporting all stages of software delivery from analysis to production support. In this role, you will enjoy benefits such as a best-in-class leave policy, gender-neutral parental leave, sponsorship for industry certifications, employee assistance programs, comprehensive insurance coverage, and health screening. You will be expected to lead engineering efforts, champion best practices, collaborate with stakeholders to achieve business outcomes, and acquire functional knowledge of the business capabilities being digitized.

Key skills required:
- GCP services: Composer, BigQuery, Dataproc, GCP cloud architecture, etc.
- Big Data / Hadoop: Hive, HQL, HDFS
- Programming: Python, PySpark, SQL query writing
- Scheduler: Control-M or any other scheduler
- Experience with database engines (e.g., SQL Server, Oracle), ETL pipeline development, Tableau, Looker, and performance tuning
- Proficiency in architecture design, technical documentation, and mapping business requirements to technology

Desired skills:
- Understanding of workflow automation and Agile methodology
- Terraform coding and project management experience
- Prior experience in the banking/finance domain and hybrid cloud solutions, preferably using GCP
- Product development experience

Join us to excel in your career with training, coaching, and continuous learning opportunities. Our culture promotes responsibility, commercial thinking, initiative, and collaboration. We value a positive, fair, and inclusive work environment where we celebrate the successes of our people. Embrace the empowering culture at Deutsche Bank Group and be part of our success together. For more information about our company and teams, please visit https://www.db.com/company/company.htm.
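To make the Composer + Dataproc + PySpark + scheduler combination above concrete, here is a hedged sketch of a Cloud Composer (Airflow) DAG that submits a PySpark job to an existing Dataproc cluster. The project, region, cluster, and GCS paths are assumptions, not values from the posting.

```python
# Sketch: an Airflow DAG (runnable on Cloud Composer) that submits a
# PySpark job to a Dataproc cluster once a day.
from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.operators.dataproc import (
    DataprocSubmitJobOperator,
)

PROJECT_ID = "my-project"      # placeholder
REGION = "us-central1"         # placeholder
CLUSTER_NAME = "etl-cluster"   # placeholder, assumed to already exist

PYSPARK_JOB = {
    "reference": {"project_id": PROJECT_ID},
    "placement": {"cluster_name": CLUSTER_NAME},
    "pyspark_job": {"main_python_file_uri": "gs://my-bucket/jobs/transform.py"},
}

with DAG(
    dag_id="daily_dataproc_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    run_transform = DataprocSubmitJobOperator(
        task_id="run_pyspark_transform",
        project_id=PROJECT_ID,
        region=REGION,
        job=PYSPARK_JOB,
    )
```

In practice the same DAG would also carry data-quality and load tasks; this sketch only shows the submission step.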

Posted 18 hours ago

Apply

2.0 - 6.0 years

0 Lacs

Karnataka

On-site

As a GCP Senior Data Engineer/Architect, you will play a crucial role in our team by designing, developing, and implementing robust and scalable data solutions on the Google Cloud Platform (GCP). Collaborating closely with architects and business analysts, especially for our US clients, you will translate data requirements into effective technical solutions.

Your responsibilities will include designing and implementing scalable data warehouse and data lake solutions, orchestrating complex data pipelines, leading cloud data lake implementation projects, participating in cloud migration projects, developing containerized applications, optimizing SQL queries, writing automation scripts in Python, and utilizing various GCP data services such as BigQuery, Bigtable, and Cloud SQL.

Your expertise in data warehouse and data lake design and implementation, experience in data pipeline development and tuning, hands-on involvement in cloud migration and data lake projects, proficiency in Docker and GKE, strong SQL and Python scripting skills, and familiarity with GCP services such as BigQuery, Cloud SQL, Dataflow, and Composer will be essential for this role. Additionally, knowledge of data governance principles, experience with dbt, and the ability to work effectively within a team and adapt to project needs are highly valued. Strong communication skills, willingness to work UK shift timings, and openness to giving and receiving feedback are important traits that will contribute to your success in this role.
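As a hedged illustration of the "automation scripts in Python" and BigQuery skills listed above (not part of the posting), the snippet below loads a CSV file from Cloud Storage into a BigQuery table. Bucket, project, dataset, and table names are hypothetical.

```python
# Sketch: load a CSV export from GCS into BigQuery with autodetected schema.
from google.cloud import bigquery

client = bigquery.Client()

table_id = "my-project.lake.raw_events"            # placeholder
source_uri = "gs://my-bucket/exports/events.csv"   # placeholder

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,
    skip_leading_rows=1,
    autodetect=True,
    write_disposition=bigquery.WriteDisposition.WRITE_TRUNCATE,
)

load_job = client.load_table_from_uri(source_uri, table_id, job_config=job_config)
load_job.result()  # wait for the load job to finish

table = client.get_table(table_id)
print(f"Loaded {table.num_rows} rows into {table_id}")
```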

Posted 19 hours ago

Apply

8.0 - 13.0 years

0 Lacs

Hyderabad, Telangana

On-site

You are an experienced GCP Data Engineer with 8+ years of expertise in designing and implementing robust, scalable data architectures on Google Cloud Platform. Your role involves defining and leading the implementation of data architecture strategies using GCP services to meet business and technical requirements.

As a visionary GCP Data Architect, you will be responsible for architecting and optimizing scalable data pipelines using Google Cloud Storage, BigQuery, Dataflow, Cloud Composer, Dataproc, and Pub/Sub. You will design solutions for large-scale batch processing and real-time streaming, leveraging tools like Dataproc for distributed data processing. Your responsibilities also include establishing and enforcing data governance, security frameworks, and best practices for data management. You will conduct architectural reviews and performance tuning for GCP-based data solutions, ensuring cost-efficiency and scalability, and collaborate with cross-functional teams to translate business needs into technical requirements and deliver innovative data solutions.

Required skills include strong expertise in GCP services such as Google Cloud Storage, BigQuery, Dataflow, Cloud Composer, Dataproc, and Pub/Sub; proficiency in designing and implementing data processing frameworks for ETL/ELT, batch, and real-time workloads; an in-depth understanding of data modeling, data warehousing, and distributed data processing using tools like Dataproc and Spark; and hands-on experience with Python, SQL, and modern data engineering practices. Knowledge of data governance, security, and compliance best practices on GCP will be crucial in this role, as will strong problem-solving, leadership, and communication skills for guiding teams and engaging stakeholders effectively.
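For the batch-processing side described above, here is a hedged sketch of a typical Dataproc PySpark job: read Parquet from Cloud Storage, aggregate, and write to BigQuery via the spark-bigquery connector (bundled on standard Dataproc images). All paths, table names, and columns are assumptions.

```python
# Sketch: batch Dataproc job, GCS -> aggregate -> BigQuery.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("daily-aggregation").getOrCreate()

# Placeholder input path partitioned by date.
events = spark.read.parquet("gs://my-bucket/raw/events/dt=2024-01-01/")

daily_totals = (
    events.groupBy("customer_id")
          .agg(F.count("*").alias("event_count"),
               F.sum("amount").alias("total_amount"))
)

(daily_totals.write
    .format("bigquery")
    .option("table", "my_project.analytics.daily_totals")   # placeholder
    .option("temporaryGcsBucket", "my-temp-bucket")          # placeholder
    .mode("overwrite")
    .save())
```

The streaming counterpart would typically read from Pub/Sub via Dataflow or Spark Structured Streaming instead of a static GCS path.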

Posted 20 hours ago

Apply

5.0 - 10.0 years

25 - 35 Lacs

Noida, Pune, Bengaluru

Work from Office

Description: We are seeking a proficient Data Governance Engineer to lead the development and management of robust data governance frameworks on Google Cloud Platform (GCP). The ideal candidate will bring in-depth expertise in data management, metadata frameworks, compliance, and security within cloud environments to ensure high-quality, secure, and compliant data practices aligned with organizational goals.

Requirements:
- 4+ years of experience in data governance, data management, or data security.
- Hands-on experience with Google Cloud Platform (GCP), including BigQuery, Dataflow, Dataproc, and Google Data Catalog.
- Strong command of metadata management, data lineage, and data quality tools (e.g., Collibra, Informatica).
- Deep understanding of data privacy laws and compliance frameworks.
- Proficiency in SQL and Python for governance automation.
- Experience with RBAC, encryption, and data masking techniques.
- Familiarity with ETL/ELT pipelines and data warehouse architectures.

Job Responsibilities:
- Develop and implement comprehensive data governance frameworks, focusing on metadata management, lineage tracking, and data quality.
- Define, document, and enforce data governance policies, access control mechanisms, and security standards using GCP-native services such as IAM, DLP, and KMS.
- Manage metadata repositories using tools like Collibra, Informatica, Alation, or Google Data Catalog.
- Collaborate with data engineering and analytics teams to ensure compliance with GDPR, CCPA, SOC 2, and other regulatory standards.
- Automate data classification, monitoring, and reporting processes using Python and SQL.
- Support data stewardship initiatives, including the development of data dictionaries and governance documentation.
- Optimize ETL/ELT pipelines and data workflows to meet governance best practices.

What We Offer:
- Exciting projects: We focus on industries like high-tech, communication, media, healthcare, retail, and telecom. Our customer list is full of fantastic global brands and leaders who love what we build for them.
- Collaborative environment: You can expand your skills by collaborating with a diverse team of highly talented people in an open, laid-back environment, or even abroad in one of our global centers or client facilities.
- Work-life balance: GlobalLogic prioritizes work-life balance, which is why we offer flexible work schedules, opportunities to work from home, and paid time off and holidays.
- Professional development: Our dedicated Learning & Development team regularly organizes communication skills training (GL Vantage, Toastmasters), stress management programs, professional certifications, and technical and soft-skill training.
- Excellent benefits: We provide our employees with competitive salaries, family medical insurance, group term life insurance, group personal accident insurance, NPS (National Pension Scheme), periodic health awareness programs, extended maternity leave, annual performance bonuses, and referral bonuses.
- Fun perks: We want you to love where you work, which is why we host sports events and cultural activities, offer subsidized food and corporate parties, and maintain vibrant offices with dedicated GL Zones, rooftop decks, and the GL Club, where you can enjoy coffee or tea with colleagues over a game, plus discounts at popular stores and restaurants.
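As a hedged example of the "governance automation with Python" and DLP work described above (not from the posting), the snippet below scans a text sample with the Cloud DLP API to find sensitive info types before classification or masking decisions. The project ID, info types, and sample text are illustrative.

```python
# Sketch: inspect a text sample for sensitive data with Cloud DLP.
from google.cloud import dlp_v2

client = dlp_v2.DlpServiceClient()
parent = "projects/my-project"  # placeholder project

inspect_config = {
    "info_types": [{"name": "EMAIL_ADDRESS"}, {"name": "PHONE_NUMBER"}],
    "min_likelihood": dlp_v2.Likelihood.POSSIBLE,
    "include_quote": True,
}

item = {"value": "Contact Jane at jane.doe@example.com or +1 555-0100."}

response = client.inspect_content(
    request={"parent": parent, "inspect_config": inspect_config, "item": item}
)

for finding in response.result.findings:
    # In a governance pipeline, findings would drive tagging/masking policies.
    print(finding.info_type.name, finding.likelihood, finding.quote)
```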

Posted 1 day ago

Apply

7.0 - 10.0 years

20 - 27 Lacs

Noida

Work from Office

Job Responsibilities:

Technical Leadership:
• Provide technical leadership and mentorship to a team of data engineers.
• Design, architect, and implement highly scalable, resilient, and performant data pipelines; experience with GCP technologies (e.g., Dataproc, Cloud Composer, Pub/Sub, BigQuery) is a plus.
• Guide the team in adopting best practices for data engineering, including CI/CD, infrastructure-as-code, and automated testing.
• Conduct code reviews and design reviews, and provide constructive feedback to team members.
• Stay up to date with the latest technologies and trends in data engineering.

Data Pipeline Development:
• Develop and maintain robust and efficient data pipelines to ingest, process, and transform large volumes of structured and unstructured data from various sources.
• Implement data quality checks and monitoring systems to ensure data accuracy and integrity.
• Collaborate with cross-functional teams and business stakeholders to understand data requirements and deliver data solutions that meet their needs.

Platform Building & Maintenance:
• Design and implement secure and scalable data storage solutions.
• Manage and optimize cloud infrastructure costs related to data engineering workloads.
• Contribute to the development and maintenance of data engineering tooling and infrastructure to improve team productivity and efficiency.

Collaboration & Communication:
• Effectively communicate technical designs and concepts to both technical and non-technical audiences.
• Collaborate effectively with other engineering teams, product managers, and business stakeholders.
• Contribute to knowledge sharing within the team and across the organization.

Required Qualifications:
• Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
• 7+ years of experience in data engineering and software development.
• 7+ years of experience coding in SQL and Python/Java.
• 3+ years of hands-on experience building and managing data pipelines in a cloud environment such as GCP.
• Strong programming skills in Python or Java, with experience developing data-intensive applications.
• Expertise in SQL and data modeling techniques for both transactional and analytical workloads.
• Experience with CI/CD pipelines and automated testing frameworks.
• Excellent communication, interpersonal, and problem-solving skills.
• Experience leading or mentoring a team of engineers.
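To illustrate the "data quality checks and monitoring" responsibility above, here is a hedged sketch of a simple check run against a BigQuery table from Python; in a pipeline this would typically be an Airflow task that fails loudly. Table and column names are assumptions.

```python
# Sketch: minimal data-quality checks over a BigQuery table.
from google.cloud import bigquery

client = bigquery.Client()
TABLE = "my-project.warehouse.orders"  # placeholder

checks = {
    "null_order_ids": f"SELECT COUNT(*) AS n FROM `{TABLE}` WHERE order_id IS NULL",
    "duplicate_order_ids": f"""
        SELECT COUNT(*) AS n FROM (
            SELECT order_id FROM `{TABLE}` GROUP BY order_id HAVING COUNT(*) > 1
        )
    """,
    "negative_amounts": f"SELECT COUNT(*) AS n FROM `{TABLE}` WHERE amount < 0",
}

failures = {}
for name, sql in checks.items():
    bad_rows = list(client.query(sql).result())[0]["n"]
    if bad_rows:
        failures[name] = bad_rows

if failures:
    # Failing the task here would block downstream loads and trigger alerting.
    raise ValueError(f"Data quality checks failed: {failures}")
print("All data quality checks passed.")
```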

Posted 1 day ago

Apply

3.0 - 7.0 years

0 Lacs

Karnataka

On-site

As a Data Specialist, you will be responsible for utilizing your expertise in ETL fundamentals, SQL, BigQuery, Dataproc, Python, Data Catalog, data warehousing, and various other tools to contribute to the successful implementation of data projects. Your role will involve working with technologies such as Cloud Trace, Cloud Logging, Cloud Storage, and Data Fusion to build and maintain a modern data platform.

To excel in this position, you should possess a minimum of 5 years of experience in the data engineering field, with a focus on the GCP cloud data implementation suite, including BigQuery, Pub/Sub, Dataflow/Apache Beam, Airflow/Composer, and Cloud Storage. Your strong understanding of very large-scale data architecture and hands-on experience in data warehouses, data lakes, and analytics platforms will be crucial for the success of our projects.

Key requirements:
- Minimum 5 years of experience in data engineering
- Hands-on experience with the GCP cloud data implementation suite
- Strong expertise in BigQuery (GBQ), Python, Apache Airflow, and SQL (BigQuery preferred)
- Extensive hands-on experience with SQL and Python for working with data

If you are passionate about data and have a proven track record of delivering results in a fast-paced environment, we invite you to apply for this exciting opportunity to be a part of our dynamic team.
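For the Dataflow/Apache Beam skill mentioned above, here is a hedged batch-pipeline sketch: read CSV lines from Cloud Storage, parse them, and write rows to BigQuery. The bucket, project, runner choice, and schema are illustrative assumptions.

```python
# Sketch: Apache Beam batch pipeline (GCS CSV -> BigQuery).
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

def parse_line(line: str) -> dict:
    user_id, event, ts = line.split(",")
    return {"user_id": user_id, "event": event, "ts": ts}

options = PipelineOptions(
    runner="DataflowRunner",          # use DirectRunner for local testing
    project="my-project",             # placeholder
    region="us-central1",
    temp_location="gs://my-bucket/tmp",
)

with beam.Pipeline(options=options) as p:
    (
        p
        | "Read" >> beam.io.ReadFromText("gs://my-bucket/raw/events.csv",
                                         skip_header_lines=1)
        | "Parse" >> beam.Map(parse_line)
        | "Write" >> beam.io.WriteToBigQuery(
            "my-project:analytics.events",
            schema="user_id:STRING,event:STRING,ts:TIMESTAMP",
            write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
        )
    )
```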

Posted 1 day ago

Apply

15.0 - 20.0 years

9 - 14 Lacs

Hyderabad

Work from Office

Project Role: AI/ML Engineer
Project Role Description: Develops applications and systems that utilize AI to improve performance and efficiency, including but not limited to deep learning, neural networks, chatbots, and natural language processing.
Must-have skills: Google Cloud Machine Learning Services
Good-to-have skills: Google Pub/Sub, GCP Dataflow, Google Dataproc
Minimum 2 years of experience is required
Educational Qualification: 15 years of full-time education

Summary: As an AI/ML Engineer, you will engage in the development of applications and systems that leverage artificial intelligence to enhance performance and efficiency. Your typical day will involve collaborating with cross-functional teams to design and implement innovative solutions, utilizing advanced technologies such as deep learning and natural language processing. You will also be responsible for analyzing data and refining algorithms to ensure optimal functionality and user experience, while continuously exploring new methodologies to drive improvements in AI applications.

Roles & Responsibilities:
- Expected to perform independently and become an SME.
- Active participation and contribution in team discussions is required.
- Contribute to providing solutions to work-related problems.
- Assist in the design and development of AI-driven applications to meet project requirements.
- Collaborate with team members to troubleshoot and resolve technical challenges.

Professional & Technical Skills:
- Must have: proficiency in Google Cloud Machine Learning Services.
- Good to have: experience with GCP Dataflow, Google Pub/Sub, and Google Dataproc.
- Strong understanding of machine learning frameworks and libraries.
- Experience in deploying machine learning models in cloud environments.
- Familiarity with data preprocessing and feature engineering techniques.

Additional Information:
- The candidate should have a minimum of 2 years of experience in Google Cloud Machine Learning Services.
- This position is based at our Hyderabad office.
- 15 years of full-time education is required.

Posted 2 days ago

Apply

4.0 - 8.0 years

10 - 14 Lacs

Chennai

Work from Office

Role Description:
Provides leadership for the overall architecture, design, development, and deployment of a full-stack, cloud-native data analytics platform. Designs and augments solution architecture for data ingestion, data preparation, data transformation, data load, ML and simulation modelling, Java back end and front end, state machines, API management, and intelligence consumption using data products on cloud. Understands business requirements and helps develop high-level and low-level data engineering and data processing documentation for the cloud-native architecture. Develops conceptual, logical, and physical target-state architecture, engineering, and operational specs.

Responsibilities include working with the customer, users, technical architects, and application designers to define the solution requirements and structure for the platform; modelling and designing the application data structure, storage, and integration; leading the database analysis, design, and build effort; working with the application architects and designers to design the integration solution; and ensuring that database designs fulfil the requirements, including data volume, frequency needs, and long-term data growth. The role requires the ability to perform data engineering tasks using Spark; knowledge of developing efficient frameworks for development and testing (Sqoop/NiFi/Kafka/Spark/Streaming/WebHDFS/Python) to enable seamless data ingestion onto Hadoop/BigQuery platforms; enabling data governance and data discovery; exposure to job monitoring frameworks with validation automation; and exposure to handling structured, unstructured, and streaming data.

Technical Skills:
- Experience building data platforms on cloud (data lake, data warehouse environments, Databricks)
- Strong technical understanding of data modeling, design, and architecture principles and techniques across master data, transaction data, and derived/analytic data
- Proven background designing and implementing architectural solutions that solve strategic and tactical business needs
- Deep knowledge of best practices across data-related disciplines and technologies, particularly enterprise-wide data architectures, data management, data governance, and data warehousing
- Highly competent with database design and data modeling
- Strong data warehousing and business intelligence skills, including handling ELT and scalability issues for enterprise-level data warehouses and creating ETLs/ELTs to handle data from various sources and formats
- Strong hands-on experience with programming languages such as Python and Scala, with Spark and Beam
- Solid hands-on and solution-architecting experience in cloud technologies: AWS, Azure, and GCP (GCP preferred)
- Hands-on experience with data processing at scale using event-driven systems and message queues (Kafka/Flink/Spark Streaming)
- Hands-on experience with GCP services such as BigQuery, Dataproc, Pub/Sub, Dataflow, Cloud Composer, API Gateway, data lake, Bigtable, Spark, Apache Beam, and feature engineering/data processing for model development
- Experience gathering and processing raw data at scale (including writing scripts, web scraping, calling APIs, writing SQL queries, etc.)
- Experience building data pipelines for structured/unstructured, real-time/batch, and event-driven synchronous/asynchronous workloads using MQ, Kafka, and stream processing
- Hands-on experience analyzing source system data and data flows, working with structured and unstructured data
- Must be very strong in writing Spark SQL queries
- Strong organizational skills, with the ability to work autonomously as well as lead a team
- Pleasant personality, strong communication and interpersonal skills

Qualifications:
A bachelor's degree in computer science, computer engineering, or a related discipline is required to work as a technical lead. Certification in GCP would be a big plus. Individuals in this field can further display their leadership skills by completing the Project Management Professional certification offered by the Project Management Institute.
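Since the role above calls for strong Spark SQL, here is a hedged sketch of the kind of query involved: register two DataFrames as temporary views and run a join with aggregation and a window function. Table paths and columns are hypothetical.

```python
# Sketch: Spark SQL over temp views with a join, aggregation, and ranking.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("sparksql-example").getOrCreate()

orders = spark.read.parquet("gs://my-bucket/curated/orders/")       # placeholder
customers = spark.read.parquet("gs://my-bucket/curated/customers/") # placeholder
orders.createOrReplaceTempView("orders")
customers.createOrReplaceTempView("customers")

top_customers = spark.sql("""
    SELECT c.customer_id,
           c.region,
           SUM(o.amount)                                 AS total_spend,
           RANK() OVER (PARTITION BY c.region
                        ORDER BY SUM(o.amount) DESC)     AS region_rank
    FROM orders o
    JOIN customers c ON o.customer_id = c.customer_id
    WHERE o.order_date >= DATE '2024-01-01'
    GROUP BY c.customer_id, c.region
""")

# Top 10 customers per region by spend.
top_customers.filter("region_rank <= 10").show()
```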

Posted 4 days ago

Apply

7.0 - 12.0 years

11 - 15 Lacs

Noida

Work from Office

Primary Skill(s): Lead Data Visualization Engineer with experience in Sigma BI
Experience: 7+ years of experience in data visualization with Sigma BI, Power BI, Tableau, or Looker

Job Summary: Lead Data Visualization Engineer with deep expertise in Sigma BI and a strong ability to craft meaningful, insight-rich visual stories for business stakeholders. This role will be instrumental in transforming raw data into intuitive dashboards and visual analytics, helping cross-functional teams make informed decisions quickly and effectively.

Key Responsibilities:
- Lead the design, development, and deployment of Sigma BI dashboards and reports tailored to various business functions.
- Translate complex data sets into clear, actionable insights using advanced visualization techniques.
- Collaborate with business stakeholders to understand goals, KPIs, and data requirements.
- Build data stories that communicate key business metrics, trends, and anomalies.
- Serve as a subject matter expert in Sigma BI and guide junior team members on best practices.
- Ensure visualizations follow design standards, accessibility guidelines, and performance optimization.
- Partner with data engineering and analytics teams to source and structure data effectively.
- Conduct workshops and training sessions to enable business users to consume and interact with dashboards.
- Drive the adoption of self-service BI tools and foster a data-driven decision-making culture.

Required Skills & Experience:
- 7+ years of hands-on experience in business intelligence, with at least 2 years using Sigma BI.
- Proven ability to build end-to-end dashboard solutions that tell a story and influence decisions.
- Strong understanding of data modeling, SQL, and cloud data platforms (Snowflake, BigQuery, etc.).
- Demonstrated experience working with business users, gathering requirements, and delivering user-friendly outputs.
- Proficiency in data storytelling, UX design principles, and visualization best practices.
- Experience integrating Sigma BI with modern data stacks and APIs is a plus.
- Excellent communication and stakeholder management skills.

Preferred Qualifications:
- Experience with other BI tools (Tableau, Power BI, Looker) is a plus.
- Familiarity with AWS cloud data ecosystems (AWS Databricks).
- Background in data analysis, statistics, or business analytics.

Working Hours: 2 PM to 11 PM IST (approximately 4:30 AM to 1:30 PM ET).
Communication skills: good.

Mandatory Competencies:
- BI and Reporting Tools: Power BI, Tableau
- Database Programming: SQL
- Cloud (GCP): Cloud Data Fusion, Dataproc, BigQuery, Cloud Composer, Cloud Bigtable
- Data Science and Machine Learning: Databricks
- Cloud (AWS): ECS
- DMS: data analysis skills
- Behavioural: communication and collaboration

Posted 4 days ago

Apply

5.0 - 9.0 years

0 Lacs

Chennai, Tamil Nadu

On-site

As a Software Engineer Practitioner at TekWissen in Chennai, you will be a crucial part of the team responsible for developing and maintaining the Enterprise Data Platform. Your main focus will be on designing, building, and optimizing scalable data pipelines within the Google Cloud Platform (GCP) environment. Working with GCP-native technologies such as BigQuery, Dataform, Dataflow, and Pub/Sub, you will ensure data governance, security, and optimal performance. This role offers you the opportunity to utilize your full-stack expertise, collaborate with talented teams, and establish best practices for data engineering at the client.

To be successful in this role, you should possess a Bachelor's or Master's degree in Computer Science, Engineering, or a related field of study, and have at least 5 years of experience with a strong understanding of database concepts and multiple database technologies to optimize query and data processing performance. Proficiency in SQL, Python, and Java is essential, along with experience programming engineering transformations in Python or similar languages. Additionally, you should have the ability to work effectively across different organizations, product teams, and business partners, along with knowledge of Agile (Scrum) methodology and experience writing user stories.

Your skills should include expertise in data architecture, data warehousing, and Google Cloud Platform tools such as BigQuery, Dataflow, Dataproc, Data Fusion, and others. Experience with data warehouse concepts, ETL processes, and data service ecosystems is crucial for this role. Strong communication skills are necessary for both internal team collaboration and external stakeholder interactions, and your role will involve advocating for user experience through empathetic stakeholder relationships.

You should have excellent communication, collaboration, and influence skills to energize the team. Your knowledge of data, software, and architecture operations, data engineering, and data management standards will be valuable in this role. Hands-on experience in Python using libraries like NumPy and Pandas is required, along with extensive knowledge of GCP offerings and bundled services related to data operations. You should also have experience re-developing and optimizing data operations, data science, and analytical workflows and products.

TekWissen Group is an equal opportunity employer that supports workforce diversity, and we encourage applicants from diverse backgrounds to apply. Join us in shaping the future of data engineering and making a positive impact on lives, communities, and the planet.
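For the Pub/Sub ingestion piece of the platform described above, here is a hedged sketch of publishing JSON events to a topic that a downstream Dataflow/BigQuery pipeline would consume. Project, topic, and event fields are placeholders.

```python
# Sketch: publish a JSON event to a Pub/Sub topic.
import json
from google.cloud import pubsub_v1

publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path("my-project", "raw-events")  # placeholders

event = {"user_id": "u-123", "event": "page_view", "ts": "2024-01-01T00:00:00Z"}

future = publisher.publish(
    topic_path,
    data=json.dumps(event).encode("utf-8"),
    source="web",  # message attribute, useful for routing/filtering downstream
)
print("Published message ID:", future.result())
```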

Posted 6 days ago

Apply

5.0 - 8.0 years

6 - 10 Lacs

Pune

Hybrid

Position: Cloud Data Engineer
Mandatory skills: Cloud PaaS - GCP (Google Cloud Platform)
Experience required: 5-8 years (additional band: 8-13 years)
Work location: Wipro, PAN India
Work arrangement: Hybrid, 3 days in a Wipro office

Job description:
- Strong SQL
- Strong Python
- Excellent knowledge of any cloud technology (AWS, Azure, GCP, etc.); GCP preferred
- PySpark preferred

Posted 6 days ago

Apply

1.0 - 5.0 years

0 Lacs

Karnataka

On-site

Capgemini Invent is the digital innovation, consulting, and transformation brand of the Capgemini Group, a global business line that combines market-leading expertise in strategy, technology, data science, and creative design to help CxOs envision and build what's next for their businesses.

In this role, you should have developed or worked on at least one Gen AI project and have experience in data pipeline implementation with cloud providers such as AWS, Azure, or GCP. You should also be familiar with cloud storage, cloud databases, cloud data warehousing, and data lake solutions such as Snowflake, BigQuery, AWS Redshift, ADLS, and S3. Additionally, a good understanding of cloud compute services, load balancing, identity management, authentication, and authorization in the cloud is essential.

Your profile should include good knowledge of infrastructure capacity sizing and costing of cloud services to drive optimized solution architecture, leading to optimal infrastructure investment versus performance and scaling. You should be able to contribute to architectural choices using various cloud services and solution methodologies. Proficiency in programming with Python is required, along with expertise in cloud DevOps practices such as infrastructure as code, CI/CD components, and automated deployments on the cloud. Understanding of networking, security, design principles, and best practices in the cloud is also important.

At Capgemini, we value flexible work arrangements to support a healthy work-life balance. You will have opportunities for career growth through various career development programs and diverse professions tailored to support you in exploring a world of opportunities. Additionally, you can equip yourself with valuable certifications in the latest technologies such as generative AI.

Capgemini is a global business and technology transformation partner with a rich heritage of over 55 years. We have a diverse team of 340,000 members in more than 50 countries, working together to accelerate the dual transition to a digital and sustainable world while creating tangible impact for enterprises and society. Trusted by clients to unlock the value of technology, we deliver end-to-end services and solutions leveraging strengths from strategy and design to engineering, fueled by market-leading capabilities in AI, cloud, and data, combined with deep industry expertise and a partner ecosystem. Our global revenues in 2023 were reported at €22.5 billion.

Posted 1 week ago

Apply

6.0 - 11.0 years

6 - 9 Lacs

Hyderabad

Work from Office

At least 8+ years of experience in any of the ETL tools (Prophecy, DataStage 11.5/11.7, Pentaho, etc.). At least 3 years of experience in PySpark with GCP (Airflow, Dataproc, BigQuery), capable of configuring data pipelines. Strong experience writing complex SQL queries to perform data analysis on databases such as SQL Server, Oracle, and Hive.

Technical skills: SQL, Python, PySpark, Hive, ETL, Unix, Control-M (or similar scheduling tools).

Ability to work independently on specialized assignments within the context of project deliverables. Takes ownership of providing solutions and tools that iteratively increase engineering efficiencies. Designs should help embed standard processes, systems, and operational models into the BAU approach for end-to-end execution of data pipelines. Proven problem-solving and analytical abilities, including the ability to critically evaluate information gathered from multiple sources, reconcile conflicts, decompose high-level information into details, and apply sound business and technical domain knowledge. Communicates openly and honestly; advanced oral, written, and visual communication and presentation skills, with the ability to communicate efficiently at a global level. Ability to deliver materials of the highest quality to management against tight deadlines, and to work effectively under pressure with competing and rapidly changing priorities.

Posted 1 week ago

Apply

7.0 - 12.0 years

6 - 9 Lacs

Hyderabad

Work from Office

Understanding of Spark core concepts: RDDs, DataFrames, Datasets, Spark SQL, and Spark Streaming. Experience with Spark optimization techniques. Deep knowledge of Delta Lake features such as time travel, schema evolution, and data partitioning. Ability to design and implement data pipelines using Spark with Delta Lake as the data storage layer. Proficiency in Python/Scala/Java for Spark development and integration with ETL processes. Knowledge of data ingestion techniques from various sources (flat files, CSV, APIs, databases). Understanding of data quality best practices and data validation techniques.

Other skills: understanding of data warehouse concepts and data modelling techniques; expertise in Git for code management; familiarity with CI/CD pipelines and containerization technologies. Nice to have: experience using data integration tools such as DataStage, Prophecy, Informatica, or Ab Initio.
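As a hedged sketch of the Delta Lake features named above (schema evolution and time travel) in PySpark; it assumes the delta-spark package is available on the cluster, and the table path and sample rows are placeholders.

```python
# Sketch: Delta Lake write, schema evolution via mergeSchema, and time travel.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder.appName("delta-demo")
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()
)

path = "gs://my-bucket/delta/orders"  # placeholder table location

# Version 0: initial write, partitioned by date.
df_v0 = spark.createDataFrame(
    [("o1", "2024-01-01", 100.0)], ["order_id", "order_date", "amount"]
)
df_v0.write.format("delta").partitionBy("order_date").mode("overwrite").save(path)

# Version 1: append rows that carry a new column; mergeSchema evolves the schema.
df_v1 = spark.createDataFrame(
    [("o2", "2024-01-02", 50.0, "web")],
    ["order_id", "order_date", "amount", "channel"],
)
df_v1.write.format("delta").mode("append") \
     .option("mergeSchema", "true").save(path)

# Time travel: read the table as it looked at version 0.
original = spark.read.format("delta").option("versionAsOf", 0).load(path)
original.show()
```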

Posted 1 week ago

Apply

8.0 - 13.0 years

6 - 10 Lacs

Hyderabad

Work from Office

Skills: Extensive experience with Google data products (Cloud Data Fusion, BigQuery, Dataflow, Dataproc, AI Building Blocks, Looker, Dataprep, etc.), with particular expertise in Cloud Data Fusion, BigQuery, and Dataproc. Experience in MDM, metadata management, data quality, and data lineage tools. End-to-end data engineering and lifecycle management (including non-functional requirements and operations). Experience with SQL and NoSQL modern data stores. End-to-end solution design skills: prototyping, usability testing, and data visualization literacy. Excellent knowledge of the software development life cycle.

Posted 1 week ago

Apply

10.0 - 15.0 years

35 - 40 Lacs

Bengaluru

Work from Office

Proficiency in Google Cloud Platform (GCP) services, including Dataflow, Datastream, Dataproc, BigQuery, and Cloud Storage. Strong experience with Apache Spark and Apache Flink for distributed data processing. Knowledge of real-time data streaming technologies (e.g., Apache Kafka, Pub/Sub). Familiarity with data orchestration tools such as Apache Airflow or Cloud Composer. Expertise in Infrastructure as Code (IaC) tools such as Terraform or Cloud Deployment Manager. Experience with CI/CD tools such as Jenkins, GitLab CI/CD, or Cloud Build. Knowledge of containerization and orchestration tools such as Docker and Kubernetes. Strong scripting skills for automation (e.g., Bash, Python). Experience with monitoring tools such as Cloud Monitoring, Prometheus, and Grafana. Familiarity with logging tools such as Cloud Logging or the ELK Stack.
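For the real-time streaming side listed above, here is a hedged sketch of an Apache Beam streaming pipeline that reads from Pub/Sub, applies one-minute fixed windows, and writes per-event counts to BigQuery. Subscription, project, and table names are placeholders; the runner is left to the default for brevity.

```python
# Sketch: Beam streaming pipeline, Pub/Sub -> windowed counts -> BigQuery.
import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions, StandardOptions
from apache_beam.transforms import window

options = PipelineOptions(project="my-project", region="us-central1",
                          temp_location="gs://my-bucket/tmp")  # placeholders
options.view_as(StandardOptions).streaming = True

with beam.Pipeline(options=options) as p:
    (
        p
        | "ReadPubSub" >> beam.io.ReadFromPubSub(
            subscription="projects/my-project/subscriptions/raw-events-sub")
        | "Decode" >> beam.Map(lambda b: json.loads(b.decode("utf-8")))
        | "Window" >> beam.WindowInto(window.FixedWindows(60))
        | "KeyByEvent" >> beam.Map(lambda e: (e["event"], 1))
        | "Count" >> beam.CombinePerKey(sum)
        | "ToRow" >> beam.Map(lambda kv: {"event": kv[0], "count": kv[1]})
        | "WriteBQ" >> beam.io.WriteToBigQuery(
            "my-project:analytics.event_counts_per_minute",
            schema="event:STRING,count:INTEGER",
            write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND)
    )
```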

Posted 1 week ago

Apply

4.0 - 7.0 years

7 - 14 Lacs

Gurugram

Work from Office

Must have: Big Data, GCP.
Years of experience: 4 to 7 years.

The candidate should have extensive production experience (1-2 years) in GCP; other cloud experience would be a strong bonus. Strong background in data engineering, with 4-5 years of experience in Big Data technologies including Hadoop, NoSQL, Spark, Kafka, etc. Exposure to production applications is a must, along with operating knowledge of cloud computing platforms (GCP, especially BigQuery, Dataflow, Dataproc, Storage, VMs, Networking, Pub/Sub, Cloud Functions, and Composer services).

Posted 1 week ago

Apply

5.0 - 7.0 years

6 - 7 Lacs

Chennai

Hybrid

Overview: TekWissen is a global workforce management provider throughout India and many other countries in the world. The client below is a global company with shared ideals and a deep sense of family. From our earliest days as a pioneer of modern transportation, we have sought to make the world a better place, one that benefits lives, communities, and the planet.

Job Title: Specialty Development Practitioner
Location: Chennai
Work Type: Hybrid

Position Description: This role is for a proactive Full Stack Software Engineer responsible for creating products to host Supply Chain Analytics algorithms. You will ensure software engineering excellence while developing web applications and tools, employing practices like pair programming and Test-Driven Development (TDD) within an Agile environment. Key responsibilities include acting as a change agent, mentoring teams on Agile methodologies, and contributing to the client's institutional knowledge. Strong written and oral communication skills are essential for interacting with client leadership, along with a self-starting approach.

Required Skills: Bachelor's or Master's degree in Computer Science, Computer Engineering, or a related technical field. 5-7+ years of software engineering and testing experience, including Agile methodologies and Jira. Technical requirements include 3+ years in Python, Java, and Spring Boot development; 3+ years with REST APIs; and 3+ years developing web-based UIs using JavaScript, React, Angular, Vue, or TypeScript, along with Pub/Sub, Apigee, and Cloud Storage. Experience with relational (e.g., PostgreSQL, SQL Server), NoSQL, and columnar databases (e.g., BigQuery) is necessary. At least 1 year of experience developing and deploying to cloud platforms such as Google Cloud Platform, Pivotal Cloud Foundry, Amazon Web Services, or Microsoft Azure is also required. A passion for clean code and a strong desire for continuous learning are key.

Desired Skills: Full-stack expertise; automated testing (unit, integration, E2E); cloud computing/infrastructure experience (especially Google Cloud Platform, Cloud Run containerization, and Google Cloud Storage); and proficiency with continuous integration/continuous delivery tools such as Jenkins, Tekton, or Gradle.

Skills Required: BigQuery, Python, Angular, relational databases, Google Cloud Platform (BigQuery, Dataflow, Dataproc, Data Fusion), Terraform, Tekton, Cloud SQL, Airflow, Postgres, PySpark, APIs.

Experience Required: 5+ years
Education Required: Bachelor's degree

TekWissen Group is an equal opportunity employer supporting workforce diversity.

Posted 1 week ago

Apply

5.0 - 6.0 years

5 - 6 Lacs

Chennai

Hybrid

Overview: TekWissen is a global workforce management provider throughout India and many other countries in the world. The client below is a global company with shared ideals and a deep sense of family. From our earliest days as a pioneer of modern transportation, we have sought to make the world a better place, one that benefits lives, communities, and the planet.

Job Title: Software Engineer Practitioner
Location: Chennai
Work Type: Hybrid

Position Description: We're seeking a highly skilled and experienced Full Stack Data Engineer to play a pivotal role in the development and maintenance of our Enterprise Data Platform. In this role, you'll be responsible for designing, building, and optimizing scalable data pipelines within our Google Cloud Platform (GCP) environment. You'll work with GCP-native technologies such as BigQuery, Dataform, Dataflow, and Pub/Sub, ensuring data governance, security, and optimal performance. This is a fantastic opportunity to leverage your full-stack expertise, collaborate with talented teams, and establish best practices for data engineering at the client.

Basic Qualifications: Bachelor's or Master's degree in Computer Science, Engineering, or a related field of study. 5+ years with a strong understanding of database concepts and experience with multiple database technologies, optimizing query and data processing performance. 5+ years of full-stack data engineering competency in a public cloud (Google), with the critical thinking skills to propose data solutions, test them, and make them a reality. 5+ years of high proficiency in SQL, Python, and Java, with experience programming engineering transformations in Python or a similar language. 5+ years of ability to work effectively across organizations, product teams, and business partners. 5+ years of knowledge of Agile (Scrum) methodology and experience writing user stories. Deep understanding of data service ecosystems, including data warehouses, lakes, and marts. User experience advocacy through empathetic stakeholder relationships. Effective communication both internally (with team members) and externally (with stakeholders). Knowledge of data warehouse concepts and experience with data warehouse/ETL processes. Strong process discipline and a thorough understanding of IT processes (ISP, data security).

Skills Required: data architecture, data warehousing, Dataform, Google Cloud Platform (BigQuery, Dataflow, Dataproc, Data Fusion), Terraform, Tekton, Cloud SQL, Airflow, Postgres, PySpark, Python, APIs.

Experience Required: Excellent communication, collaboration, and influence skills, with the ability to energize a team. Knowledge of data, software, and architecture operations, data engineering, and data management standards, governance, and quality. Hands-on experience in Python using libraries like NumPy and Pandas. Extensive knowledge and understanding of GCP offerings and bundled services, especially those associated with data operations (Cloud Console, BigQuery, Dataflow, Dataform, Pub/Sub). Experience re-coding, re-developing, and optimizing data operations, data science, and analytical workflows and products. 5+ years of experience overall.

Education Required: Bachelor's degree

TekWissen Group is an equal opportunity employer supporting workforce diversity.

Posted 1 week ago

Apply

7.0 - 12.0 years

20 - 30 Lacs

Hyderabad, Pune, Bengaluru

Work from Office

Work Location: Bangalore/Pune/Hyderabad/NCR
Experience: 5-12 years

Required Skills:
- Proven experience as a Data Engineer with expertise in GCP.
- Strong understanding of data warehousing concepts and ETL processes.
- Experience with BigQuery, Dataflow, and other GCP data services.

Responsibilities:
- Design, develop, and maintain data pipelines on GCP.
- Implement data storage solutions and optimize data processing workflows.
- Ensure data quality and integrity throughout the data lifecycle.
- Collaborate with data scientists and analysts to understand data requirements.
- Monitor and maintain the health of the data infrastructure.
- Troubleshoot and resolve data-related issues.

Thanks & Regards,
Suganya R
Suganya@spstaffing.in

Posted 1 week ago

Apply

3.0 - 6.0 years

13 - 18 Lacs

Bengaluru

Work from Office

We are looking to hire a Data Engineer for the Platform Engineering team. The team is a collection of highly skilled individuals, ranging from development to operations, with a security-first mindset, who strive to push the boundaries of technology. We champion a DevSecOps culture and raise the bar on how and when we deploy applications to production. Our core principles are centered around automation, testing, quality, and immutability, all via code. The role is responsible for building self-service capabilities that improve our security posture and productivity and reduce time to market, with automation at the core of these objectives. The individual collaborates with teams across the organization to ensure applications are designed for Continuous Delivery (CD) and are well architected for their target platform, which can be on-premise or the cloud. If you are passionate about developer productivity, cloud-native applications, and container orchestration, this job is for you!

Principal Accountabilities: The incumbent is mentored by senior individuals on the team to capture the flow and bottlenecks in the holistic IT delivery process and define future tool sets.

Skills and Software Requirements:
- Experience with a language such as Python, Go, SQL, Java, or Scala
- GCP data services (BigQuery, Dataflow, Dataproc, Cloud Composer, Pub/Sub, Google Cloud Storage, IAM)
- Experience with Jenkins, Maven, Git, Ansible, or Chef
- Experience working with containers, orchestration tools (Kubernetes, Mesos, Docker Swarm, etc.), and container registries (GCE, Docker Hub, etc.)
- Experience with [SPI]aaS: Software-as-a-Service, Platform-as-a-Service, or Infrastructure-as-a-Service
- Acquire, cleanse, and ingest structured and unstructured data on the cloud
- Combine data from disparate sources into a single, unified, authoritative view of data (e.g., a data lake)
- Enable and support data movement from one system or service to another
- Experience implementing or supporting automated solutions to technical problems
- Experience working in a team environment, proactively executing tasks while meeting agreed delivery timelines
- Ability to contribute to effective and timely solutions
- Excellent oral and written communication skills

Posted 1 week ago

Apply

3.0 - 4.0 years

3 - 7 Lacs

Mumbai

Work from Office

Job Summary: We are seeking an experienced and motivated Data Engineer to join our growing team, preferably with experience in the Banking, Financial Services, and Insurance (BFSI) sector. The ideal candidate will have a strong background in designing, building, and maintaining robust and scalable data infrastructure. You will play a crucial role in developing our data ecosystem, ensuring data quality, and empowering data-driven decisions across the organization. This role requires hands-on experience with the Google Cloud Platform (GCP) and a passion for working with cutting-edge data technologies.

Responsibilities:
- Design and develop end-to-end data engineering pipelines: build and maintain scalable, reliable data pipelines to ingest, process, and transform large volumes of structured and unstructured data from various sources.
- Implement data quality and governance: establish and enforce processes for data validation, transformation, auditing, and reconciliation to ensure data accuracy, completeness, and consistency.
- Build and maintain data storage solutions: design, implement, and manage the data vault and data marts to support business intelligence, analytics, and reporting requirements.
- Orchestrate and automate workflows: utilize workflow management tools to schedule, monitor, and automate complex data workflows and ETL processes.
- Optimize data infrastructure: continuously evaluate and improve the performance, reliability, and cost-effectiveness of our data infrastructure and pipelines.
- Collaborate with stakeholders: work closely with data analysts, data scientists, and business stakeholders to understand their data needs and deliver effective data solutions.
- Documentation: create and maintain comprehensive documentation for data pipelines, processes, and architectures.

Key Skills:
- Python: proficient in Python for data engineering tasks, including scripting, automation, and data manipulation.
- PySpark: strong experience with PySpark for large-scale data processing and analytics.
- SQL: expertise in writing complex SQL queries for data extraction, transformation, and analysis.

Tech Stack (Must Have), Google Cloud Platform (GCP):
- Dataproc: for managing and running Apache Spark and Hadoop clusters.
- Composer (Airflow): for creating, scheduling, and monitoring data workflows.
- Cloud Functions: for event-driven serverless data processing.
- Cloud Run: for deploying and scaling containerized data applications.
- Cloud SQL: for managing relational databases.
- BigQuery: for data warehousing, analytics, and large-scale SQL queries.

Qualifications: Bachelor's degree in Computer Science, Engineering, Information Technology, or a related field; 3+ years of proven experience in a Data Engineer role; demonstrable experience with the specified must-have tech stack; strong problem-solving skills and the ability to work independently and as part of a team; excellent communication and interpersonal skills.

Good to Have: experience in the BFSI (Banking, Financial Services, and Insurance) domain; Apache NiFi for data flow automation and management; familiarity with Qlik for business intelligence and data visualization; knowledge of AWS data services; understanding of DevOps principles and practices (CI/CD, IaC) and cloud financial management (FinOps) to optimize cloud spending.
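To illustrate the event-driven serverless pattern mentioned in the tech stack above, here is a hedged sketch of a background Cloud Function (Python runtime) that fires when a file lands in a Cloud Storage bucket and loads it into BigQuery. The dataset, table, and file-type filter are assumptions.

```python
# Sketch: GCS-triggered Cloud Function that appends a CSV into BigQuery.
from google.cloud import bigquery

def gcs_to_bigquery(event, context):
    """Entry point for a google.storage.object.finalize trigger."""
    bucket = event["bucket"]
    name = event["name"]

    if not name.endswith(".csv"):
        print(f"Skipping non-CSV object: {name}")
        return

    client = bigquery.Client()
    job_config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.CSV,
        skip_leading_rows=1,
        autodetect=True,
        write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
    )
    uri = f"gs://{bucket}/{name}"
    load_job = client.load_table_from_uri(
        uri, "my-project.bronze.raw_transactions",  # placeholder table
        job_config=job_config,
    )
    load_job.result()
    print(f"Loaded {uri} into bronze.raw_transactions")
```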

Posted 1 week ago

Apply

5.0 - 9.0 years

0 Lacs

Chennai, Tamil Nadu

On-site

You should have a strong understanding of the tech stack, including GCP services such as BigQuery, Cloud Dataflow, Pub/Sub, Dataproc, and Cloud Storage. Experience with data processing tools such as Apache Beam (batch/stream), Apache Kafka, and Cloud Dataprep is crucial. Proficiency in programming languages such as Python, Java/Scala, and SQL is required. Your expertise should extend to orchestration tools such as Apache Airflow (Cloud Composer) and Terraform, and to security aspects including IAM, Cloud Identity, and Cloud Security Command Center. Knowledge of containerization using Docker and Kubernetes (GKE) is essential. Familiarity with machine learning platforms such as Google AI Platform, TensorFlow, and AutoML is expected. Candidates with certifications such as Google Cloud Data Engineer and Cloud Architect are preferred.

You should have a proven track record of designing scalable AI/ML systems in production, focusing on high-performance and cost-effective solutions, and strong experience with cloud platforms (Google Cloud, AWS, Azure) and cloud-native AI/ML services such as Vertex AI and SageMaker. Your role will involve implementing MLOps practices, including model deployment, monitoring, retraining, and version control. Leadership skills are key to guiding teams, mentoring engineers, and collaborating effectively with cross-functional teams to achieve business objectives. A deep understanding of frameworks such as TensorFlow, PyTorch, and scikit-learn for designing, training, and deploying models is necessary, as is experience with data engineering principles, scalable pipelines, and distributed systems (e.g., Apache Kafka, Spark, Kubernetes).

Nice-to-have requirements include strong leadership and mentorship capabilities to guide teams toward best practices and high-quality deliverables, excellent problem-solving skills focused on designing efficient, high-performance systems, effective project management abilities to handle multiple initiatives and ensure timely delivery, and a collaborative, team-oriented approach that fosters a positive and productive work environment.
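As a hedged MLOps sketch of the Vertex AI flow referenced above (not from the posting): train a scikit-learn model, stage the artifact in Cloud Storage, register it in Vertex AI, and deploy it to an endpoint. The bucket, project, region, and prebuilt serving image are illustrative assumptions; a real setup would pin exact versions and add monitoring.

```python
# Sketch: train with scikit-learn, then register and deploy on Vertex AI.
import joblib
from google.cloud import aiplatform, storage
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

# 1. Train and serialize a small model.
X, y = load_iris(return_X_y=True)
model = RandomForestClassifier(n_estimators=100).fit(X, y)
joblib.dump(model, "model.joblib")

# 2. Upload the artifact to GCS (Vertex expects a directory URI).
bucket = storage.Client().bucket("my-models-bucket")  # placeholder bucket
bucket.blob("iris/model.joblib").upload_from_filename("model.joblib")

# 3. Register the model and deploy it to an endpoint.
aiplatform.init(project="my-project", location="us-central1")  # placeholders
vertex_model = aiplatform.Model.upload(
    display_name="iris-rf",
    artifact_uri="gs://my-models-bucket/iris",
    serving_container_image_uri=(
        # Assumed prebuilt sklearn serving image; verify the exact tag.
        "us-docker.pkg.dev/vertex-ai/prediction/sklearn-cpu.1-0:latest"
    ),
)
endpoint = vertex_model.deploy(machine_type="n1-standard-2")
print(endpoint.predict(instances=[[5.1, 3.5, 1.4, 0.2]]))
```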

Posted 1 week ago

Apply

3.0 - 7.0 years

6 - 16 Lacs

Chennai

Hybrid

Greetings from Getronics! We have permanent opportunities for GCP Data Engineers in Chennai. Hope you are doing well! This is Jogeshwari from the Getronics Talent Acquisition team. We have multiple opportunities for GCP Data Engineers. Please find the company profile and job description below. If interested, please share your updated resume, a recent professional photograph, and Aadhaar proof at the earliest to jogeshwari.k@getronics.com.

Company: Getronics (permanent role)
Client: automobile industry
Experience required: 3+ years in IT and a minimum of 2+ years in GCP data engineering
Location: Chennai

Skills required:
- GCP data engineering, Hadoop, Spark/PySpark, and Google Cloud Platform services: BigQuery, Dataflow, Pub/Sub, Bigtable, Data Fusion, Dataproc, Cloud Composer, Cloud SQL, Compute Engine, Cloud Functions, and App Engine.
- 6+ years of professional experience in data engineering, data product development, and software product launches.
- 4+ years of cloud data engineering experience building scalable, reliable, and cost-effective production batch and streaming data pipelines using: data warehouses such as Google BigQuery; workflow orchestration tools such as Airflow; relational database management systems such as MySQL, PostgreSQL, and SQL Server; and real-time data streaming platforms such as Apache Kafka and GCP Pub/Sub.

LOOKING FOR IMMEDIATE TO 30 DAYS NOTICE CANDIDATES ONLY.

Regards,
Jogeshwari
Senior Specialist

Posted 2 weeks ago

Apply