
17 Spark Core Jobs

JobPe aggregates listings for easy access, but you apply directly on the original job portal.

8.0 - 11.0 years

6 - 16 Lacs

Hyderabad, Chennai, Bengaluru

Work from Office

We are seeking an experienced Software Engineer with deep expertise in Scala programming and Big Data technologies to design, develop, and maintain large-scale distributed data processing systems. The ideal candidate will be a hands-on developer with a strong understanding of data pipelines, the Spark ecosystem, and related technologies, capable of delivering clean, efficient, and scalable code in an Agile environment.

Key Responsibilities:
- Develop and maintain scalable, efficient, and robust data processing pipelines using Scala and Apache Spark (Spark Core, Spark SQL, Spark Streaming).
- Write clean, maintainable, and well-documented Scala code following industry best practices and coding standards.
- Design and implement batch and real-time data processing workflows that handle large volumes of data.
- Work closely with cross-functional teams to understand business requirements and translate them into technical solutions that meet quality standards.
- Use Hadoop ecosystem components such as HDFS, Hive, Sqoop, Impala, and related tools to support data storage and retrieval needs.
- Develop and optimize ETL processes and data warehousing solutions leveraging Big Data technologies.
- Apply deep knowledge of data structures and algorithms to ensure efficient data processing and system performance.
- Conduct unit testing, code reviews, and performance tuning of data processing jobs.
- Automate job scheduling and execution using UNIX shell scripting (advantageous).
- Participate actively in Agile development processes, including daily standups, sprint planning, reviews, and retrospectives.
- Collaborate effectively with upstream and downstream teams to identify, troubleshoot, and resolve data pipeline issues.
- Stay current with emerging technologies, frameworks, and industry trends to continuously improve the architecture and implementation of data solutions.
- Support production environments through incident handling, root cause analysis, and continuous improvement.

Required Skills & Experience:
- Minimum 8 years of professional software development experience with a strong emphasis on Scala programming.
- Extensive experience designing and building distributed data processing pipelines using Apache Spark (Spark Core, Spark SQL, Spark Streaming).
- Strong understanding of Hadoop ecosystem technologies, including HDFS, Hive, Sqoop, Impala, and related tools.
- Proficiency in SQL and NoSQL databases, with sound knowledge of database concepts and operations.
- Familiarity with data warehousing concepts and ETL methodologies.
- Solid foundation in data structures, algorithms, and object-oriented programming.
- Experience in UNIX/Linux shell scripting to manage and schedule data jobs (preferred).
- Proven track record of working in Agile software development environments.
- Excellent problem-solving skills, with the ability to analyze complex issues and provide efficient solutions.
- Strong verbal and written communication skills, with experience working in diverse, global delivery teams.
- Ability to manage multiple tasks, collaborate across teams, and adapt to changing priorities.

Desired Qualifications:
- Bachelor's or Master's degree in Computer Science, Engineering, or a related technical field.
- Previous experience working in a global delivery or distributed team environment.
- Certification or formal training in Big Data technologies or Scala programming is a plus.
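For orientation, here is a minimal sketch of the kind of batch pipeline this role describes, combining Spark Core and Spark SQL in Scala. The paths, column names, and aggregation logic are illustrative assumptions, not part of the posting.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object DailySalesPipeline {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("daily-sales-pipeline")
      .getOrCreate()

    // Read raw CSV events from HDFS (hypothetical input path).
    val raw = spark.read
      .option("header", "true")
      .csv("hdfs:///data/raw/sales/")

    // Spark SQL transformations: cast, then aggregate per day and region.
    val daily = raw
      .withColumn("amount", col("amount").cast("double"))
      .groupBy(col("sale_date"), col("region"))
      .agg(sum("amount").alias("total_amount"))

    // Write a partitioned Parquet result (hypothetical output path).
    daily.write
      .mode("overwrite")
      .partitionBy("sale_date")
      .parquet("hdfs:///data/curated/daily_sales/")

    spark.stop()
  }
}
```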

Posted 1 day ago

Apply

5.0 - 10.0 years

5 - 10 Lacs

Bengaluru

Work from Office

The Team: The Data Engineering team is responsible for architecting, building, and maintaining our evolving data infrastructure, as well as curating and governing the data assets created on our platform. We work closely with various stakeholders to acquire, process, and refine vast datasets, focusing on creating scalable and optimized data pipelines. Our team possesses broad expertise in critical data domains, technology stacks, and architectural patterns. We foster knowledge sharing and collaboration, resulting in a unified strategy and seamless data management.

The Impact: This role is the foundation of the products delivered. The data onboarded is the base for the company: it feeds into our products and platforms and is essential for supporting our advanced analytics and machine learning initiatives.

What's in it for you:
- Be part of a successful team that delivers top-priority projects contributing directly to the company's strategy.
- Drive testing initiatives, including supporting automation strategy, performance, and security testing. This is the place to enhance your testing skills while adding value to the business.
- As an experienced member of the team, own and drive a project end to end, collaborating with developers, business analysts, and product managers who are experts in their domains, which can help you build multiple skill sets.

Responsibilities:
- Design, develop, and maintain scalable and efficient data pipelines to process large volumes of data.
- Implement ETL processes to acquire, validate, and process incoming data from diverse sources.
- Collaborate with cross-functional teams, including data scientists, analysts, and software engineers, to understand data requirements and translate them into technical solutions.
- Implement data ingestion, transformation, and integration processes to ensure data quality, accuracy, and consistency.
- Optimize Spark jobs and data processing workflows for performance, scalability, and reliability.
- Troubleshoot and resolve issues related to data pipelines, data processing, and performance bottlenecks.
- Conduct code reviews and provide constructive feedback to junior team members to ensure code quality and adherence to best practices.
- Stay updated on the latest advancements in Spark and related technologies, and evaluate their potential to enhance existing data engineering processes.
- Develop and maintain documentation, including technical specifications, data models, and system architecture diagrams.
- Stay abreast of emerging trends and technologies in the data engineering and big data space, and propose innovative solutions to enhance data processing capabilities.

What We're Looking For:
- 5+ years of experience in data engineering or a related field.
- Strong experience in Python programming, with expertise in building data-intensive applications.
- Proven hands-on experience with Apache Spark, including Spark Core, Spark SQL, Spark Streaming, and Spark MLlib.
- Solid understanding of distributed computing concepts, parallel processing, and cluster computing frameworks.
- Proficiency in data modeling, data warehousing, and ETL techniques.
- Experience with workflow management platforms, preferably Airflow.
- Familiarity with big data technologies such as Hadoop, Hive, or HBase.
- Strong knowledge of SQL and experience with relational databases.
- Hands-on experience with the AWS cloud data platform.
- Strong problem-solving and troubleshooting skills, with the ability to analyze complex data engineering issues and provide effective solutions.
- Excellent communication and collaboration skills, with the ability to work effectively in cross-functional teams.
- Nice to have: experience with Databricks.

Preferred Qualifications: Bachelor's degree in Information Technology, Computer Information Systems, Computer Engineering, Computer Science, or another technical discipline.

Our Purpose: Progress is not a self-starter. It requires a catalyst to be set in motion. Information, imagination, people, technology: the right combination can unlock possibility and change the world. At S&P Global we transform data into Essential Intelligence, pinpointing risks and opening possibilities.

Our People, Our Values: Integrity, Discovery, Partnership. At S&P Global, we focus on Powering Global Markets. We start with a foundation of integrity in all we do, bring a spirit of discovery to our work, and collaborate in close partnership with each other and our customers to achieve shared goals.

Benefits: We take care of you, so you can take care of business. We care about our people, and that's why we provide everything you and your career need to thrive at S&P Global:
- Health & Wellness: health care coverage designed for the mind and body.
- Flexible Downtime: generous time off helps keep you energized for your time on.
- Continuous Learning: access a wealth of resources to grow your career and learn valuable new skills.
- Invest in Your Future: secure your financial future through competitive pay, retirement planning, a continuing-education program with a company-matched student loan contribution, and financial wellness programs.
- Family Friendly Perks: it's not just about you; S&P Global has perks for your partners and little ones too, with some best-in-class benefits for families.
- Beyond the Basics: from retail discounts to referral incentive awards, small perks can make a big difference.
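This role calls for Python plus hands-on Spark Streaming. Purely as an illustration (shown in Scala to keep this page's examples in one language; the Structured Streaming API is analogous in PySpark), here is a minimal windowed streaming sketch. The schema, S3 landing path, and window sizes are assumptions; the JSON file source is built into Spark.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object ClickstreamStream {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("clickstream").getOrCreate()
    import spark.implicits._

    // File streams require an explicit schema (given here as a DDL string).
    val events = spark.readStream
      .format("json")
      .schema("user_id STRING, url STRING, ts TIMESTAMP")
      .load("s3://bucket/landing/clicks/") // hypothetical landing zone

    // Tumbling 5-minute windows with a watermark to bound late data.
    val counts = events
      .withWatermark("ts", "10 minutes")
      .groupBy(window($"ts", "5 minutes"), $"url")
      .count()

    counts.writeStream
      .outputMode("update")
      .format("console")
      .start()
      .awaitTermination()
  }
}
```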

Posted 3 days ago

Apply

5.0 - 10.0 years

0 Lacs

Karnataka

On-site

As a software developer, you will work in a constantly evolving environment driven by technological advances and the strategic direction of the organization that employs you. Your primary responsibilities will include creating, maintaining, auditing, and enhancing systems to meet specific needs, often based on recommendations from systems analysts or architects. You will test both hardware and software systems to identify and resolve faults, write diagnostic programs, and design and develop code for operating systems and software to ensure optimal efficiency. Where necessary, you will also provide recommendations for future developments.

Joining us offers numerous benefits, including the opportunity to work on challenging projects and solve complex technical problems. You can expect rapid career growth and the chance to assume leadership roles. Our mentorship program allows you to learn from experienced mentors and industry experts, while our global opportunities enable you to collaborate with clients from around the world and gain international experience. We offer competitive compensation packages and benefits to our employees. If you are passionate about technology and interested in working on innovative projects with a skilled team, pursuing a career as an Infosys Power Programmer could be an excellent choice for you.

To be considered for this role, you must possess the following mandatory skills:
- Proficiency in AWS Glue, AWS Redshift/Spectrum, S3, API Gateway, Athena, Step Functions, and Lambda.
- Experience with Extract-Transform-Load (ETL) and Extract-Load-Transform (ELT) data integration patterns.
- Expertise in designing and constructing data pipelines.
- Development experience in one or more object-oriented programming languages, preferably Python.

In terms of job specifications, we are looking for candidates who meet the following criteria:
- At least 5 years of hands-on experience developing, testing, deploying, and debugging Spark jobs using Scala on the Hadoop platform.
- Profound knowledge of Spark Core, working with RDDs, and Spark SQL.
- Familiarity with Spark optimization techniques and best practices.
- Strong understanding of Scala functional programming concepts such as Try, Option, Future, and Collections (see the sketch below).
- Proficiency in Scala object-oriented programming, covering classes, traits, objects (singleton and companion), and case classes.
- Sound knowledge of Scala language features, including the type system and implicits/givens.
- Hands-on experience working in the Hadoop environment (HDFS/Hive), AWS S3, and EMR.
- Proficiency in Python programming.
- Working experience with workflow orchestration tools such as Airflow and Oozie.
- Experience making API calls from Scala.
- Familiarity with file formats such as Apache Avro, Parquet, and JSON.
- Desirable: knowledge of Protocol Buffers and geospatial data analytics.
- Ability to write test cases using frameworks such as scalatest.
- Good understanding of build tools such as Gradle and SBT.
- Experience using Git, resolving conflicts, and working with branches.
- Preferred: experience with workflow systems such as Airflow.
- Strong programming skills focusing on data structures and algorithms.
- Excellent analytical and communication skills.

Candidates applying for this position should have 7-10 years of industry experience and a BE/B.Tech in Computer Science or an equivalent qualification.
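Since the specification names Try, Option, and Future explicitly, here is a compact, self-contained sketch of those constructs; the parsing and lookup logic are illustrative assumptions.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import scala.util.{Failure, Success, Try}

object FunctionalBasics {
  // Try captures exceptions as values instead of throwing them.
  def parseAmount(raw: String): Try[Double] = Try(raw.trim.toDouble)

  // Option models a possibly-missing lookup.
  val regionCodes: Map[String, Int] = Map("EMEA" -> 1, "APAC" -> 2)
  def regionCode(name: String): Option[Int] = regionCodes.get(name)

  def main(args: Array[String]): Unit = {
    parseAmount("42.5") match {
      case Success(v)  => println(s"parsed $v")
      case Failure(ex) => println(s"bad input: ${ex.getMessage}")
    }

    println(regionCode("EMEA").getOrElse(-1)) // prints 1

    // Future runs asynchronously on the implicit execution context;
    // invalid rows are dropped via Try#toOption before summing.
    val total: Future[Double] =
      Future(Seq("1.0", "2.5", "bad").flatMap(s => parseAmount(s).toOption).sum)
    println(Await.result(total, 5.seconds)) // prints 3.5
  }
}
```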

Posted 1 week ago

Apply

4.0 - 8.0 years

0 Lacs

Karnataka

On-site

As a Senior AWS Data Engineer (Cloud Data Platform) at Teamware Solutions, a division of Quantum Leap Consulting Pvt. Ltd, located in Bangalore, you will be responsible for end-to-end implementation of cloud data engineering solutions such as an enterprise data lake and data hub in AWS. Working onsite five days a week, you will collaborate with the offshore manager and onsite business analyst to understand requirements and deliver scalable, distributed, cloud-based enterprise data solutions.

You should have a strong background in AWS cloud technology, with 4-8 years of hands-on experience. Requirements include:
- Proficiency in architecting and delivering highly scalable solutions.
- Expertise in cloud data engineering solutions, Lambda and Kappa architectures, data management concepts, and data modeling.
- Proficiency in AWS services such as EMR, Glue, S3, Redshift, and DynamoDB.
- Experience with Big Data frameworks such as Hadoop and Spark.
- Hands-on experience with AWS compute, storage, and streaming services.
- Troubleshooting and performance tuning in the Spark framework.
- Knowledge of application DevOps tools such as Git and CI/CD frameworks.
- Familiarity with AWS CloudWatch, CloudTrail, Account Config, Config Rules, security, key management, and data migration processes.
- Strong analytical skills and good communication and presentation skills.

Desired skills include experience building stream-processing systems, Big Data ML toolkits, Python, offshore/onsite engagements, flow tools such as Airflow, NiFi, or Luigi, and AWS services such as Step Functions and Lambda. A professional background of BE/B.Tech/MCA/M.Sc/M.E/M.Tech/MBA is preferred, and an AWS Certified Data Engineer certification is recommended.

If you are interested in this position and meet the qualifications above, please send your resume to netra.s@twsol.com.

Posted 1 week ago

Apply

2.0 - 6.0 years

8 - 12 Lacs

Gurugram

Work from Office

At American Express, our culture is built on a 175-year history of innovation, shared values and Leadership Behaviors, and an unwavering commitment to back our customers, communities, and colleagues. As part of Team Amex, you'll experience this powerful backing with comprehensive support for your holistic well-being and many opportunities to learn new skills, develop as a leader, and grow your career. Here, your voice and ideas matter, your work makes an impact, and together, you will help us define the future of American Express. Join Team Amex and let's lead the way together.

From building next-generation apps and microservices in Kotlin to using AI to help protect our franchise and customers from fraud, you could be doing entrepreneurial work that brings our iconic, global brand into the future. As a part of our tech team, we could work together to bring ground-breaking and diverse ideas to life that power our digital systems, services, products and platforms. If you love to work with APIs, contribute to open source, or use the latest technologies, we'll support you with an open environment and learning culture.

Function Description: American Express is looking for energetic, successful and highly skilled Engineers to help shape our technology and product roadmap. Our Software Engineers not only understand how technology works, but how that technology intersects with the people who count on it every day. Today, innovative ideas, insight and new points of view are at the core of how we create a more powerful, personal and fulfilling experience for our customers and colleagues, with batch/real-time analytical solutions using ground-breaking technologies to deliver innovative solutions across multiple business units. This Engineering role is based in our Global Risk and Compliance Technology organization and will have a keen focus on platform modernization, bringing to life the latest technology stacks to support the ongoing needs of the business as well as compliance with global regulatory requirements.

Qualifications:
- Support the Compliance and Operations Risk data delivery team in India to lead and assist in the design and development of applications.
- Take responsibility for specific functional areas within the team; this involves project management and taking business specifications.
- Independently run projects and tasks delegated to you.

Technology Skills:
- Bachelor's degree in Engineering or Computer Science or equivalent; 2 to 5 years of experience is required.
- GCP professional certification (Data Engineer); expert in the Google BigQuery tool for data warehousing needs.
- Experience with Big Data (Spark Core and Hive) preferred.
- Familiarity with GCP offerings; experience building data pipelines on GCP is a plus.
- Knowledge of Hadoop architecture (Hadoop, MapReduce, HBase); UNIX shell scripting experience is good to have.
- Creative (innovative) problem solving.

We back you with benefits that support your holistic well-being so you can be and deliver your best. This means caring for you and your loved ones' physical, financial, and mental health, as well as providing the flexibility you need to thrive personally and professionally:
- Competitive base salaries and bonus incentives
- Support for financial well-being and retirement
- Comprehensive medical, dental, vision, life insurance, and disability benefits (depending on location)
- Flexible working model with hybrid, onsite or virtual arrangements depending on role and business need
- Generous paid parental leave policies (depending on your location)
- Free access to global on-site wellness centers staffed with nurses and doctors (depending on location)
- Free and confidential counseling support through our Healthy Minds program
- Career development and training opportunities

American Express is an equal opportunity employer and makes employment decisions without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, veteran status, disability status, age, or any other status protected by law. An offer of employment with American Express is conditioned upon the successful completion of a background verification check, subject to applicable laws and regulations.

Posted 3 weeks ago

Apply

4.0 - 9.0 years

10 - 20 Lacs

Pune, Chennai, Bengaluru

Hybrid

Experience: 4-10 years. Location: Pune, Bangalore, Chennai, Noida and Gurgaon. Notice period: immediate to 30 days only. Mandatory skills: Apache Spark, Java programming.
- Strong knowledge of the Apache Spark framework: Core Spark, Spark DataFrames, Spark Streaming
- Hands-on experience in any one of the programming languages (Java)
- Good understanding of distributed programming concepts
- Experience optimizing Spark DAGs and Hive queries on Tez (see the sketch below)
- Experience using tools like Git, Autosys, Bitbucket, Jira
- Ability to apply DWH principles within Hadoop environments and NoSQL databases
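As a hedged illustration of the Spark DAG tuning this posting mentions (shown in Scala; table paths, column names, and the partition count are assumptions), the sketch below caches a reused DataFrame, inspects the physical plan, and repartitions on the aggregation key before a wide operation.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.storage.StorageLevel

object TuningSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("tuning").getOrCreate()

    val orders    = spark.read.parquet("hdfs:///data/orders")    // hypothetical
    val customers = spark.read.parquet("hdfs:///data/customers") // hypothetical

    // Persist a DataFrame that feeds several downstream actions so its
    // lineage is not recomputed for each one.
    val enriched = orders
      .join(customers, "customer_id")
      .persist(StorageLevel.MEMORY_AND_DISK)

    enriched.count()   // materializes the cache
    enriched.explain() // inspect the physical plan / DAG

    // Repartition by the grouping key to spread the shuffle evenly.
    enriched.repartition(200, enriched("customer_id"))
      .groupBy("customer_id").count()
      .write.mode("overwrite").parquet("hdfs:///data/out") // hypothetical

    enriched.unpersist()
    spark.stop()
  }
}
```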

Posted 4 weeks ago

Apply

7.0 - 12.0 years

9 - 12 Lacs

Bengaluru

Work from Office

Responsibilities:
* Design, develop, test & maintain Scala applications using Spark.
* Collaborate with cross-functional teams on project delivery.
* Optimize application performance through data analysis.

Posted 1 month ago

Apply

6.0 - 11.0 years

5 - 15 Lacs

Chennai, Bengaluru, Mumbai (All Areas)

Hybrid

Mandatory skill: Spark and Scala data engineering. Secondary skill: Python.
- 5+ years of in-depth, hands-on experience developing, testing, deploying, and debugging Spark jobs using Scala on the Hadoop platform
- In-depth knowledge of Spark Core, working with RDDs, and Spark SQL
- In-depth knowledge of Spark optimization techniques and best practices
- Good knowledge of Scala functional programming: Try, Option, Future, Collections
- Good knowledge of Scala OOP: classes, traits and objects (singleton and companion), case classes
- Good understanding of Scala language features: type system, implicits/givens
- Hands-on experience working in the Hadoop environment (HDFS/Hive), AWS S3, EMR
- Working experience with workflow orchestration tools like Airflow and Oozie
- Working with API calls in Scala
- Understanding of and exposure to file formats such as Apache Avro, Parquet, and JSON
- Good to have: knowledge of Protocol Buffers and geospatial data analytics
- Writing test cases using frameworks such as scalatest (see the sketch below)
- Good in-depth knowledge of build tools such as Gradle and SBT
- Experience using Git, resolving conflicts, and working with branches
- Good to have: Python programming skills
- Good to have: experience with workflow systems such as Airflow
- Strong programming skills using data structures and algorithms
- Excellent analytical skills and good communication skills
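For the scalatest requirement above, here is a minimal sketch assuming ScalaTest 3.x on the classpath; the function under test is an illustrative assumption.

```scala
import org.scalatest.funsuite.AnyFunSuite

// A small helper to test: parses an amount, returning None on bad input.
object Amounts {
  def parse(raw: String): Option[Double] =
    scala.util.Try(raw.trim.toDouble).toOption
}

class AmountsSpec extends AnyFunSuite {
  test("parses a well-formed amount") {
    assert(Amounts.parse(" 12.5 ").contains(12.5))
  }

  test("returns None for malformed input") {
    assert(Amounts.parse("n/a").isEmpty)
  }
}
```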

Posted 1 month ago

Apply

6.0 - 11.0 years

5 - 15 Lacs

Hyderabad, Chennai, Bengaluru

Hybrid

Mandatory skill: Spark and Scala data engineering. Secondary skill: Python.
- 5+ years of in-depth, hands-on experience developing, testing, deploying, and debugging Spark jobs using Scala on the Hadoop platform
- In-depth knowledge of Spark Core, working with RDDs, and Spark SQL
- In-depth knowledge of Spark optimization techniques and best practices
- Good knowledge of Scala functional programming: Try, Option, Future, Collections
- Good knowledge of Scala OOP: classes, traits and objects (singleton and companion), case classes (see the sketch below)
- Good understanding of Scala language features: type system, implicits/givens
- Hands-on experience working in the Hadoop environment (HDFS/Hive), AWS S3, EMR
- Working experience with workflow orchestration tools like Airflow and Oozie
- Working with API calls in Scala
- Understanding of and exposure to file formats such as Apache Avro, Parquet, and JSON
- Good to have: knowledge of Protocol Buffers and geospatial data analytics
- Writing test cases using frameworks such as scalatest
- Good in-depth knowledge of build tools such as Gradle and SBT
- Experience using Git, resolving conflicts, and working with branches
- Good to have: Python programming skills
- Good to have: experience with workflow systems such as Airflow
- Strong programming skills using data structures and algorithms
- Excellent analytical skills and good communication skills
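For the Scala OOP items above (traits, case classes, singleton and companion objects), here is a short self-contained sketch; the domain model and factory logic are illustrative assumptions.

```scala
trait Identifiable {
  def id: String
}

// Case classes give structural equality, pattern matching, and copy().
case class Job(id: String, title: String, location: String) extends Identifiable

// Companion object: shares the class's name and acts as a factory.
object Job {
  def fromCsvRow(row: String): Option[Job] = row.split(",").map(_.trim) match {
    case Array(id, title, location) => Some(Job(id, title, location))
    case _                          => None
  }
}

object Demo { // singleton object as the entry point
  def main(args: Array[String]): Unit = {
    val parsed = Job.fromCsvRow("42, Data Engineer, Bengaluru")
    println(parsed)                                // Some(Job(42,Data Engineer,Bengaluru))
    println(parsed.map(_.copy(location = "Chennai"))) // copy() on a case class
  }
}
```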

Posted 1 month ago

Apply

5.0 - 10.0 years

3 - 7 Lacs

Bengaluru

Work from Office

Job Title: EMR/Spark SME. Experience: 5-10 Years. Location: Bangalore.

Technical Skills:
- 5+ years of experience in big data technologies with hands-on expertise in AWS EMR and Apache Spark
- Proficiency in Spark Core, Spark SQL, and Spark Streaming for large-scale data processing
- Strong experience with data formats (Parquet, Avro, JSON) and data storage solutions (Amazon S3, HDFS)
- Solid understanding of distributed systems architecture and cluster resource management (YARN)
- Familiarity with AWS services (S3, IAM, Lambda, Glue, Redshift, Athena)
- Experience in scripting and programming languages such as Python, Scala, and Java
- Knowledge of containerization and orchestration (Docker, Kubernetes) is a plus

Responsibilities:
- Architect and develop scalable data processing solutions using AWS EMR and Apache Spark
- Optimize and tune Spark jobs for performance and cost efficiency on EMR clusters (see the sketch below)
- Monitor, troubleshoot, and resolve issues related to EMR and Spark workloads
- Implement best practices for cluster management, data partitioning, and job execution
- Collaborate with data engineering and analytics teams to integrate Spark solutions with broader data ecosystems (S3, RDS, Redshift, Glue, etc.)
- Automate deployments and cluster management using infrastructure-as-code tools like CloudFormation, Terraform, and CI/CD pipelines
- Ensure data security and governance in EMR and Spark environments in compliance with company policies
- Provide technical leadership and mentorship to junior engineers and data analysts
- Stay current with new AWS EMR features and Spark versions to recommend improvements and upgrades

Requirements and Skills:
- Performance tuning and optimization of Spark jobs
- Problem-solving skills with the ability to diagnose and resolve complex technical issues
- Strong experience with version control systems (Git) and CI/CD pipelines
- Excellent communication skills to explain technical concepts to both technical and non-technical audiences

Qualification: B.Tech, BE, BCA, MCA, M.Tech or equivalent technical degree from a reputed college.

Certifications:
- AWS Certified Solutions Architect - Associate/Professional
- AWS Certified Data Analytics - Specialty
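As a hedged sketch of job-level tuning for Spark on EMR, the example below sets a few common configuration flags at session build time; the values are illustrative assumptions, not recommendations, and EMR typically enables dynamic allocation by default. The bucket paths are hypothetical.

```scala
import org.apache.spark.sql.SparkSession

object EmrTunedJob {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("emr-tuned-job")
      // Let YARN scale executor count with load instead of fixing it.
      .config("spark.dynamicAllocation.enabled", "true")
      // Shuffle partitions sized for the expected data volume.
      .config("spark.sql.shuffle.partitions", "400")
      // Adaptive query execution coalesces small shuffle partitions at runtime.
      .config("spark.sql.adaptive.enabled", "true")
      .getOrCreate()

    // Partitioned Parquet on S3 keeps scans pruned to the dates queried.
    spark.read.parquet("s3://my-bucket/events/")
      .where("event_date = '2024-01-01'")
      .groupBy("event_type").count()
      .write.mode("overwrite")
      .parquet("s3://my-bucket/reports/by_type/")

    spark.stop()
  }
}
```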

Posted 1 month ago

Apply

5.0 - 8.0 years

7 - 10 Lacs

Pune

Work from Office

Java + Spark. Primary skill: Apache Spark. Secondary skill: Java.
- Strong knowledge of the Apache Spark framework: Core Spark, Spark DataFrames, Spark Streaming
- Hands-on experience in any one of the programming languages (Java)
- Good understanding of distributed programming concepts
- Experience optimizing Spark DAGs and Hive queries on Tez
- Experience using tools like Git, Autosys, Bitbucket, Jira
- Ability to apply DWH principles within Hadoop environments and NoSQL databases

Mandatory Skills: Apache Spark. Experience: 5-8 Years.

Posted 1 month ago

Apply

8.0 - 13.0 years

25 - 40 Lacs

Chennai

Work from Office

Architect & Build Scalable Systems: Design and implement petabyte-scale lakehouse architectures to unify data lakes and warehouses. Real-Time Data Engineering: Develop and optimize streaming pipelines using Kafka, Pulsar, and Flink (see the sketch below).

Required candidate profile:
- Data engineering experience with large-scale systems
- Expert proficiency in Java for data-intensive applications
- Hands-on experience with lakehouse architectures, stream processing, and event streaming

Posted 2 months ago

Apply

5.0 - 10.0 years

3 - 7 Lacs

Bengaluru

Work from Office

Job Title: EMR/Spark SME. Experience: 5-10 Years. Location: Bangalore.

Technical Skills:
- 5+ years of experience in big data technologies with hands-on expertise in AWS EMR and Apache Spark
- Proficiency in Spark Core, Spark SQL, and Spark Streaming for large-scale data processing
- Strong experience with data formats (Parquet, Avro, JSON) and data storage solutions (Amazon S3, HDFS)
- Solid understanding of distributed systems architecture and cluster resource management (YARN)
- Familiarity with AWS services (S3, IAM, Lambda, Glue, Redshift, Athena)
- Experience in scripting and programming languages such as Python, Scala, and Java
- Knowledge of containerization and orchestration (Docker, Kubernetes) is a plus

Responsibilities:
- Architect and develop scalable data processing solutions using AWS EMR and Apache Spark
- Optimize and tune Spark jobs for performance and cost efficiency on EMR clusters
- Monitor, troubleshoot, and resolve issues related to EMR and Spark workloads
- Implement best practices for cluster management, data partitioning, and job execution
- Collaborate with data engineering and analytics teams to integrate Spark solutions with broader data ecosystems (S3, RDS, Redshift, Glue, etc.)
- Automate deployments and cluster management using infrastructure-as-code tools like CloudFormation, Terraform, and CI/CD pipelines
- Ensure data security and governance in EMR and Spark environments in compliance with company policies
- Provide technical leadership and mentorship to junior engineers and data analysts
- Stay current with new AWS EMR features and Spark versions to recommend improvements and upgrades

Requirements and Skills:
- Performance tuning and optimization of Spark jobs
- Problem-solving skills with the ability to diagnose and resolve complex technical issues
- Strong experience with version control systems (Git) and CI/CD pipelines
- Excellent communication skills to explain technical concepts to both technical and non-technical audiences

Qualification: B.Tech, BE, BCA, MCA, M.Tech or equivalent technical degree from a reputed college.

Certifications:
- AWS Certified Solutions Architect - Associate/Professional
- AWS Certified Data Analytics - Specialty

Posted 3 months ago

Apply

9 - 11 years

37 - 40 Lacs

Ahmedabad, Bengaluru, Mumbai (All Areas)

Work from Office

Dear Candidate,

We are hiring a Scala Developer to work on high-performance distributed systems, leveraging the power of functional and object-oriented paradigms. This role is perfect for engineers passionate about clean code, concurrency, and big data pipelines.

Key Responsibilities:
- Build scalable backend services using Scala and the Play or Akka frameworks (see the sketch below).
- Write concurrent and reactive code for high-throughput applications.
- Integrate with Kafka, Spark, or Hadoop for data processing.
- Ensure code quality through unit tests and property-based testing.
- Work with microservices, APIs, and cloud-native deployments.

Required Skills & Qualifications:
- Proficiency in Scala, with a strong grasp of functional programming
- Experience with Akka, Play, or Cats
- Familiarity with Big Data tools and RESTful API development
- Bonus: experience with ZIO, Monix, or Slick

Soft Skills:
- Strong troubleshooting and problem-solving skills
- Ability to work independently and in a team
- Excellent communication and documentation skills

Note: If interested, please share your updated resume and a preferred time for a discussion. If shortlisted, our HR team will contact you.

Kandi Srinivasa Reddy
Delivery Manager
Integra Technologies
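For the Akka requirement above, here is a minimal classic-Akka actor sketch; the message protocol and actor names are illustrative assumptions, and it assumes the akka-actor dependency is available.

```scala
import akka.actor.{Actor, ActorSystem, Props}

// The message protocol: immutable case classes sent between actors.
final case class Greet(name: String)

// An actor processes one message at a time, so no locking is needed.
class Greeter extends Actor {
  def receive: Receive = {
    case Greet(name) => println(s"Hello, $name")
  }
}

object Main {
  def main(args: Array[String]): Unit = {
    val system  = ActorSystem("demo")
    val greeter = system.actorOf(Props[Greeter](), "greeter")

    greeter ! Greet("Scala developer") // fire-and-forget, asynchronous

    Thread.sleep(500) // crude wait so the message is processed before shutdown
    system.terminate()
  }
}
```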

Posted 3 months ago

Apply

7 - 11 years

50 - 60 Lacs

Mumbai, Delhi / NCR, Bengaluru

Work from Office

Role: Resident Solution Architect. Location: Remote.

The Solution Architect at Koantek builds secure, highly scalable big data solutions to achieve tangible, data-driven outcomes, all while keeping simplicity and operational effectiveness in mind. This role collaborates with teammates, product teams, and cross-functional project teams to lead the adoption and integration of the Databricks Lakehouse Platform into the enterprise ecosystem and AWS/Azure/GCP architecture. This role is responsible for implementing securely architected big data solutions that are operationally reliable, performant, and deliver on strategic initiatives.

Specific requirements for the role include:
- Expert-level knowledge of data frameworks, data lakes, and open-source projects such as Apache Spark, MLflow, and Delta Lake
- Expert-level hands-on coding experience in Python, SQL, Spark/Scala, or PySpark
- In-depth understanding of Spark architecture, including Spark Core, Spark SQL, DataFrames, Spark Streaming, RDD caching, and Spark MLlib
- IoT/event-driven/microservices in the cloud; experience with private and public cloud architectures, their pros/cons, and migration considerations
- Extensive hands-on experience implementing data migration and data processing using AWS/Azure/GCP services
- Extensive hands-on experience with the industry technology stack for data management, ingestion, capture, processing, and curation: Kafka, StreamSets, Attunity, GoldenGate, MapReduce, Hadoop, Hive, HBase, Cassandra, Spark, Flume, Impala, etc.
- Experience using Azure DevOps and CI/CD as well as Agile tools and processes, including Git, Jenkins, Jira, and Confluence
- Experience creating tables, partitioning, bucketing, and loading and aggregating data using Spark SQL/Scala (see the sketch below)
- Ability to build ingestion to ADLS and enable a BI layer for analytics, with a strong understanding of data modeling and defining conceptual, logical, and physical data models
- Proficient-level experience with architecture design, build, and optimization of big data collection, ingestion, storage, processing, and visualization

Responsibilities:
- Work closely with team members to lead and drive enterprise solutions, advising on key decision points, trade-offs, best practices, and risk mitigation
- Guide customers in transforming big data projects, including the development and deployment of big data and AI applications
- Promote, emphasize, and leverage big data solutions to deploy performant systems that appropriately auto-scale, are highly available, fault-tolerant, self-monitoring, and serviceable
- Use a defense-in-depth approach in designing data solutions and AWS/Azure/GCP infrastructure
- Assist and advise data engineers in the preparation and delivery of raw data for prescriptive and predictive modeling
- Aid developers in identifying, designing, and implementing process improvements with automation tools to optimize data delivery
- Implement processes and systems to monitor data quality and security, ensuring production data is accurate and available for key stakeholders and the business processes that depend on it
- Employ change management best practices to ensure that data remains readily accessible to the business
- Implement reusable design templates and solutions to integrate, automate, and orchestrate cloud operational needs, with experience in MDM using data governance solutions

Qualifications:
- Overall experience of 12+ years in the IT field
- Hands-on experience designing and implementing multi-tenant solutions using Azure Databricks for data governance, data pipelines for near-real-time data warehousing, and machine learning solutions
- Design and development experience with scalable and cost-effective Microsoft Azure/AWS/GCP data architecture and related solutions
- Experience in software development, data engineering, or data analytics using Python, Scala, Spark, Java, or equivalent technologies
- Bachelor's or Master's degree in Big Data, Computer Science, Engineering, Mathematics, or a similar area of study, or equivalent work experience

Good to have: advanced technical certifications such as Azure Solutions Architect Expert, AWS Certified Data Analytics, DASCA Big Data Engineering and Analytics, AWS Certified Cloud Practitioner, Solutions Architect Professional, or Google Cloud certifications.

Location: Mumbai, Delhi / NCR, Bengaluru, Kolkata, Chennai, Hyderabad, Ahmedabad, Pune, Remote
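For the "creating tables, partitioning, loading and aggregating data using Spark SQL/Scala" requirement above, here is a hedged sketch using plain Spark SQL; the database, table, staging source, and dates are illustrative assumptions (bucketing could be added to the DDL with CLUSTERED BY ... INTO n BUCKETS).

```scala
import org.apache.spark.sql.SparkSession

object TableManagement {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("table-mgmt").getOrCreate()

    spark.sql("CREATE DATABASE IF NOT EXISTS analytics")

    // A partitioned Parquet datasource table.
    spark.sql("""
      CREATE TABLE IF NOT EXISTS analytics.page_views (
        user_id STRING,
        url STRING,
        ts TIMESTAMP,
        view_date DATE
      )
      USING PARQUET
      PARTITIONED BY (view_date)
    """)

    // Load one day's rows into a static partition from a staging table.
    spark.sql("""
      INSERT OVERWRITE TABLE analytics.page_views
      PARTITION (view_date = '2024-01-01')
      SELECT user_id, url, ts
      FROM staging.page_views_raw
      WHERE to_date(ts) = '2024-01-01'
    """)

    // Aggregate across partitions.
    spark.sql("""
      SELECT view_date, COUNT(*) AS views
      FROM analytics.page_views
      GROUP BY view_date
      ORDER BY view_date
    """).show()

    spark.stop()
  }
}
```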

Posted 3 months ago

Apply

5.0 - 8.0 years

4 - 8 Lacs

Bengaluru

Work from Office

Primary skills: Spark, Java programming.
- Strong knowledge of the Apache Spark framework: Core Spark, Spark DataFrames, Spark Streaming
- Hands-on experience in any one of the programming languages (Java, Scala)
- Good understanding of distributed programming concepts
- Experience optimizing Spark DAGs and Hive queries on Tez
- Experience using tools like Git, Autosys, Bitbucket, Jira

Mandatory Skills: Apache Spark. Experience: 5-8 Years.

Posted Date not available

Apply

5.0 - 8.0 years

4 - 8 Lacs

Pune

Work from Office

Java + Spark. Primary skill: Apache Spark. Secondary skill: Java.
- Strong knowledge of the Apache Spark framework: Core Spark, Spark DataFrames, Spark Streaming
- Hands-on experience in any one of the programming languages (Java)
- Good understanding of distributed programming concepts
- Experience optimizing Spark DAGs and Hive queries on Tez
- Experience using tools like Git, Autosys, Bitbucket, Jira
- Ability to apply DWH principles within Hadoop environments and NoSQL databases

Mandatory Skills: Apache Spark. Experience: 5-8 Years.

Posted Date not available

Apply