6.0 - 10.0 years
0 Lacs
Navi Mumbai, Maharashtra
On-site
You should have 6-8 years of experience with a deep understanding of the Spark framework, along with hands-on experience in Spark SQL and PySpark. Your expertise should include Python programming and familiarity with common Python libraries. Strong analytical skills are essential, especially in database management, including writing complex queries, query optimization, debugging, user-defined functions, views, and indexes. Your problem-solving abilities will be crucial in designing, implementing, and maintaining efficient data models and pipelines. Experience with Big Data technologies is a must, while familiarity with any ETL tool would be advantageous.

As part of your responsibilities, you will work on projects to deliver, review, and design PySpark and Spark SQL-based data engineering analytics solutions. Your tasks will involve writing clean, efficient, reusable, testable, and scalable Python logic for analytical solutions. Emphasis will be on building solutions for data cleaning, data scraping, and exploratory data analysis, ensuring compatibility with any BI tool. Collaboration with Data Analysts/BI developers to provide clean and processed data will be essential.

You will design data processing pipelines using ETL techniques, develop and deliver complex requirements to achieve business goals, and work with unstructured, structured, and semi-structured data and their respective databases. Effective coordination with internal engineering and development teams to understand requirements and develop solutions is critical. Communication with stakeholders to grasp business logic and provide optimal data engineering solutions will also be part of your role. It is important to adhere to best coding practices and standards throughout your work.
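For orientation, here is a minimal sketch of the kind of PySpark and Spark SQL data-cleaning logic this role describes; the file path, view name, and column names are illustrative assumptions, not details from the posting.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("cleaning-sketch").getOrCreate()

# Ingest raw data (hypothetical path and schema)
raw = spark.read.option("header", True).csv("/data/raw/orders.csv")

# Basic cleaning: deduplicate, cast types, handle nulls
clean = (
    raw.dropDuplicates(["order_id"])
       .withColumn("amount", F.col("amount").cast("double"))
       .fillna({"amount": 0.0})
       .filter(F.col("order_date").isNotNull())
)

# Expose the cleaned data to Spark SQL for downstream BI consumption
clean.createOrReplaceTempView("orders_clean")
spark.sql("""
    SELECT order_date, COUNT(*) AS orders, SUM(amount) AS revenue
    FROM orders_clean
    GROUP BY order_date
""").show()
```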
Posted 3 weeks ago
6.0 - 10.0 years
0 Lacs
Punjab
On-site
The Data Engineer position at Databricks in Melbourne requires a candidate with 6-10 years of experience in Data Lake and Azure Databricks. You should have expertise in designing and deploying Databricks platforms on AWS, Azure, or Google Cloud, as well as building and deploying data engineering pipelines with automation best practices for CI/CD. Your role will involve guiding clients in implementing transformative big data projects, including end-to-end development and deployment of cutting-edge big data and AI applications. Experience in working with the Scrum Agile methodology and streamlining the customer machine learning lifecycle is essential. Proficiency in Big Data technologies is a must.

Key Skills:
- Azure Data Lake
- Azure Databricks
- Spark SQL

Desirable Skills:
- Apache Spark certifications
- Databricks certifications

In this role, you will play a crucial part in the successful execution of data engineering projects and contribute to the advancement of big data technologies.
Posted 4 weeks ago
5.0 - 9.0 years
0 Lacs
Hyderabad, Telangana
On-site
This role, based in Bengaluru or Gurugram, requires a minimum of 5 years of experience. You should have expertise in Python, Airflow (orchestration), GCP (cloud), Spark SQL, PySpark, CI/CD, Git, and GitHub. Your responsibilities will include designing and constructing data models, developing and managing data ingestion and processing systems, implementing data storage solutions, ensuring data consistency and accuracy, and collaborating with cross-functional teams to address data-related issues. Proficiency in Python programming and experience with GCP and Airflow are necessary. You should be familiar with security and governance aspects, such as role-based access control and data lineage tools. Knowledge of database management systems like MySQL will be advantageous. Strong problem-solving and analytical skills are essential. NucleusTeq values a positive and supportive culture that encourages associates to excel and offers well-being programs for a healthy and happy work environment.
Posted 1 month ago
4.0 - 12.0 years
0 Lacs
Karnataka
On-site
As a Big Data Lead with 7-12 years of experience, you will be responsible for software development using multiple computing languages. Your role will involve working on distributed data processing systems and applications, specifically in Business Intelligence/Data Warehouse (BIDW) programs. Additionally, you should have prior experience in development through testing, preferably on the J2EE stack. Your knowledge and understanding of best practices and concepts in Data Warehouse applications will be crucial to your success in this role.

You should possess a strong foundation in distributed systems and computing systems, with hands-on engineering skills. Hands-on experience with technologies such as Spark, Scala, Kafka, Hadoop, HBase, Pig, and Hive is required. An understanding of NoSQL data stores, data modeling, and data management is essential for this position. Good interpersonal communication skills, along with excellent oral and written communication and analytical skills, are necessary for effective collaboration within the team. Experience with Data Lake implementation as an alternative to a Data Warehouse is preferred. You should have hands-on experience with DataFrames using Spark SQL and proficiency in SQL. A minimum of 2 end-to-end implementations in either a Data Warehouse or a Data Lake is required for this role as a Big Data Lead.
Posted 1 month ago
2.0 - 6.0 years
0 Lacs
Maharashtra
On-site
Job Description: We are looking for a skilled PySpark Developer with 4-5 or 2-3 years of experience to join our team. As a PySpark Developer, you will be responsible for developing and maintaining data processing pipelines using PySpark, Apache Spark's Python API. You will work closely with data engineers, data scientists, and other stakeholders to design and implement scalable and efficient data processing solutions. A Bachelor's or Master's degree in Computer Science, Data Science, or a related field is required. The ideal candidate should have strong expertise in the Big Data ecosystem, including Spark, Hive, Sqoop, HDFS, MapReduce, Oozie, YARN, HBase, and NiFi. The candidate should be below 35 years of age and have experience in designing, developing, and maintaining PySpark data processing pipelines to process large volumes of structured and unstructured data.

Additionally, the candidate should collaborate with data engineers and data scientists to understand data requirements and design efficient data models and transformations. Optimizing and tuning PySpark jobs for performance, scalability, and reliability is a key responsibility. Implementing data quality checks, error handling, and monitoring mechanisms to ensure data accuracy and pipeline robustness is crucial. The candidate should also develop and maintain documentation for PySpark code, data pipelines, and data workflows. Experience in developing production-ready Spark applications using Spark RDD APIs, DataFrames, Datasets, Spark SQL, and Spark Streaming is required. Strong experience with Hive bucketing and partitioning, as well as writing complex Hive queries using analytical functions, is essential. Knowledge of writing custom UDFs in Hive to support custom business requirements is a plus. If you meet the above qualifications and are interested in this position, please email your resume, mentioning the position applied for in the subject line, to: careers@cdslindia.com.
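As a rough, hedged illustration of the pipeline work this posting lists (a Python UDF plus a partitioned and bucketed managed table write), consider the sketch below; the input path, database, and column names are invented for the example.

```python
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StringType

spark = SparkSession.builder.appName("pipeline-sketch").enableHiveSupport().getOrCreate()

events = spark.read.parquet("/data/landing/events")  # hypothetical input path

# A simple Python UDF, analogous to the custom Hive UDFs mentioned above
@F.udf(returnType=StringType())
def normalize_channel(value):
    return value.strip().lower() if value else "unknown"

enriched = events.withColumn("channel", normalize_channel(F.col("channel")))

# Partitioned and bucketed managed table, mirroring Hive partitioning/bucketing practice
(enriched.write
    .mode("overwrite")
    .partitionBy("event_date")
    .bucketBy(8, "user_id")
    .sortBy("user_id")
    .saveAsTable("analytics.events_enriched"))
```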
Posted 1 month ago
8.0 - 12.0 years
0 Lacs
Karnataka
On-site
Join the Consumer & Community Banking division at Chase, a leading U.S. financial services firm, as a skilled data professional in our Data & Analytics team. As an Analytical Solutions Manager within the Consumer and Community Banking (CCB) Finance Data & Insights Team, you will be part of an agile product team responsible for the development, production, and transformation of financial data and reporting across the division. Your ability and passion to think beyond raw and disparate data will enable you to create data visualizations and intelligence solutions that are utilized by the organization's top leaders to achieve key strategic imperatives.

You will lead conversations with business teams, identify and assess opportunities to eliminate manual processes, and use automation tools such as Alteryx or ThoughtSpot to implement automated solutions. You will extract, analyze, and summarize data for ad hoc stakeholder requests and play a significant role in transforming our data environment to a modernized cloud platform. You will transform raw data into actionable insights, demonstrating a history of learning and implementing new technologies, and you will help improve the lives of our people and increase value to the firm by leveraging the power of data and the best tools to analyze data, generate insights, save time, improve processes and controls, and lead the organization in developing the skills of the future.

Required qualifications: a minimum of 8 years of experience in SQL is a must, along with a minimum of 8 years of experience developing data visualizations and presentations. You should have experience with data wrangling tools such as Alteryx; experience with relational databases, using SQL to pull and summarize large datasets, report creation, and ad hoc analyses; knowledge of modern MPP databases and big-data (Hadoop) concepts; experience in reporting development and testing, with the ability to interpret unstructured data and draw objective inferences given known limitations of the data; a demonstrated ability to think beyond raw data, understand the underlying business context, and sense business opportunities hidden in data; strong written and oral communication skills, with the ability to communicate effectively with all levels of management and partners from a variety of business functions; and data architecture experience.

Preferred qualifications, capabilities, and skills include experience with LLMs, Hive, Spark SQL, Impala, or other big-data query tools; Home Lending business understanding as a major advantage; experience with AWS, Databricks, Snowflake, or other cloud data warehouses; and ThoughtSpot experience.
Posted 1 month ago
5.0 - 15.0 years
0 Lacs
Maharashtra
On-site
At Derevo, we empower companies and people by unlocking the value of data within organizations. With more than 15 years of experience, we design end-to-end data and AI solutions, from integration into modern architectures to the implementation of intelligent models in key business processes. We are looking for your talent as a Data Engineer (MS Fabric)! It is important that you live in Mexico or Colombia.

As a Data Engineer at Derevo, your mission will be key to creating and implementing modern, high-quality data architectures, driving analytical solutions based on Big Data technologies. You will design, maintain, and optimize parallel multiprocessing systems, applying best practices for storage and management in data warehouses, data lakes, and lakehouses. You will be the person who collects, processes, cleans, and orchestrates large volumes of data, understanding structured and semi-structured models, in order to integrate and transform multiple sources effectively. You will define the optimal strategy according to business objectives and technical requirements, turning complex problems into achievable solutions that help our clients make data-driven decisions.

You will join the project and its sprints and carry out development activities, always applying data best practices and the technologies we implement. You will identify requirements and define scope, participating in sprint planning and engineering sessions with a consultant's perspective that adds extra value. You will collaborate proactively in workshops and meetings with the internal team and with the client. You will classify and estimate activities under agile methodologies (epics, features, technical/user stories) and follow up daily to keep the sprint on pace. You will meet committed delivery dates and manage risks by communicating deviations in time.

To join Derevo as a Data Engineer, you need an advanced command of English (technical and business conversations, B2+ or C1) and technical skills in:
- Query and programming languages: T-SQL / Spark SQL, Python (PySpark), JSON / REST APIs, Microsoft Fabric.

It is also important that you identify with soft and business skills such as close communication, working in squads, proactivity and collaboration, continuous learning, responsibility and organization, data consulting, requirements management, client-aligned strategy, and client presentations. Among the benefits you will have at Derevo are support for your overall well-being, the opportunity to specialize in different areas and technologies, freedom to create, participation in cutting-edge technology projects, and a flexible, structured remote work scheme. If you meet most of the requirements and are interested in the profile, do not hesitate to apply to become a derevian and develop your superpower. Our Talent team will contact you!
Posted 1 month ago
7.0 - 11.0 years
0 Lacs
Karnataka
On-site
As a skilled Senior Engineer at Impetus Technologies, you will utilize your expertise in Java and Big Data technologies to design, develop, and deploy scalable data processing applications. Your responsibilities will include collaborating with cross-functional teams, developing high-quality code, and optimizing data processing workflows. Additionally, you will mentor junior engineers and contribute to architectural decisions to enhance system performance and scalability.

Key Responsibilities:
- Design, develop, and maintain high-performance applications using Java and Big Data technologies.
- Implement data ingestion and processing workflows with frameworks like Hadoop and Spark.
- Collaborate with the data architecture team to define efficient data models.
- Optimize existing applications for performance, scalability, and reliability.
- Mentor junior engineers, provide technical leadership, and promote continuous improvement.
- Participate in code reviews and ensure best practices for coding, testing, and documentation.
- Stay up-to-date with technology trends in Java and Big Data, and evaluate new tools and methodologies.

Skills and Tools Required:
- Strong proficiency in Java programming for building complex applications.
- Hands-on experience with Big Data technologies like Apache Hadoop, Apache Spark, and Apache Kafka.
- Understanding of distributed computing concepts and technologies.
- Experience with data processing frameworks and libraries such as MapReduce and Spark SQL.
- Familiarity with database systems like HDFS, NoSQL databases (e.g., Cassandra, MongoDB), and SQL databases.
- Strong problem-solving skills and the ability to troubleshoot complex issues.
- Knowledge of version control systems like Git and familiarity with CI/CD pipelines.
- Excellent communication and teamwork skills for effective collaboration.

About the Role: You will be responsible for designing and developing scalable Java applications for Big Data processing, collaborating with cross-functional teams to implement innovative solutions, and ensuring code quality and performance through best practices and testing methodologies.

About the Team: You will work with a diverse team of skilled engineers, data scientists, and product managers in a collaborative environment that encourages knowledge sharing and continuous learning. Technical workshops and brainstorming sessions will provide opportunities to enhance your skills and stay updated with industry trends.

Responsibilities:
- Developing and maintaining high-performance Java applications for efficient data processing.
- Implementing data integration and processing frameworks using Big Data technologies.
- Troubleshooting and optimizing systems to enhance performance and scalability.

To succeed in this role, you should have:
- Strong proficiency in Java and experience with Big Data technologies and frameworks.
- A solid understanding of data structures, algorithms, and software design principles.
- Excellent problem-solving skills and the ability to work independently and within a team.
- Familiarity with cloud platforms and distributed computing concepts is a plus.

Qualification: Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
Experience: 7 to 10 years
Job Reference Number: 13131
Posted 1 month ago
6.0 - 11.0 years
5 - 15 Lacs
Chennai, Bengaluru, Mumbai (All Areas)
Hybrid
Mandatory Skill: Spark and Scala Data Engineering. Secondary Skill: Python.
- 5+ years of in-depth, hands-on experience developing, testing, deploying, and debugging Spark jobs using Scala on the Hadoop platform
- In-depth knowledge of Spark Core, working with RDDs, and Spark SQL
- In-depth knowledge of Spark optimization techniques and best practices
- Good knowledge of Scala functional programming: Try, Option, Future, Collections
- Good knowledge of Scala OOP: Classes, Traits, and Objects (Singleton and Companion), Case Classes
- Good understanding of Scala language features: Type System, Implicits/Givens
- Hands-on experience working in a Hadoop environment (HDFS/Hive), AWS S3, EMR
- Working experience with workflow orchestration tools like Airflow and Oozie
- Working with API calls in Scala
- Understanding of and exposure to file formats such as Apache Avro, Parquet, and JSON
- Good to have: knowledge of Protocol Buffers and geospatial data analytics
- Writing test cases using frameworks such as ScalaTest
- Good in-depth knowledge of build tools such as Gradle and SBT
- Experience using Git: resolving conflicts, working with branches
- Good to have: Python programming skills
- Good to have: experience with workflow systems such as Airflow
- Strong programming skills using data structures and algorithms
- Excellent analytical skills
- Good communication skills
Posted 1 month ago
6.0 - 11.0 years
5 - 15 Lacs
Hyderabad, Chennai, Bengaluru
Hybrid
Mandatory Skill: Spark and Scala Data Engineering. Secondary Skill: Python.
- 5+ years of in-depth, hands-on experience developing, testing, deploying, and debugging Spark jobs using Scala on the Hadoop platform
- In-depth knowledge of Spark Core, working with RDDs, and Spark SQL
- In-depth knowledge of Spark optimization techniques and best practices
- Good knowledge of Scala functional programming: Try, Option, Future, Collections
- Good knowledge of Scala OOP: Classes, Traits, and Objects (Singleton and Companion), Case Classes
- Good understanding of Scala language features: Type System, Implicits/Givens
- Hands-on experience working in a Hadoop environment (HDFS/Hive), AWS S3, EMR
- Working experience with workflow orchestration tools like Airflow and Oozie
- Working with API calls in Scala
- Understanding of and exposure to file formats such as Apache Avro, Parquet, and JSON
- Good to have: knowledge of Protocol Buffers and geospatial data analytics
- Writing test cases using frameworks such as ScalaTest
- Good in-depth knowledge of build tools such as Gradle and SBT
- Experience using Git: resolving conflicts, working with branches
- Good to have: Python programming skills
- Good to have: experience with workflow systems such as Airflow
- Strong programming skills using data structures and algorithms
- Excellent analytical skills
- Good communication skills
Posted 1 month ago
3.0 - 7.0 years
0 Lacs
Hyderabad, Telangana
On-site
As a Senior Data Scientist with a focus on Predictive Analytics and expertise in Databricks, your primary responsibilities will involve designing and implementing predictive models for applications such as forecasting, churn analysis, and fraud detection. You will utilize tools like Python, SQL, Spark MLlib, and Databricks ML to deploy these models effectively. Your role will also include building end-to-end machine learning pipelines on the Databricks Lakehouse platform, encompassing data ingestion, feature engineering, model training, and deployment. It will be essential to optimize model performance through techniques like hyperparameter tuning, AutoML, and leveraging MLflow for tracking.

Collaboration with engineering teams will be a key aspect of your job to ensure the operationalization of models, in both batch and real-time scenarios, using Databricks Jobs or REST APIs. You will be responsible for implementing Delta Lake to support scalable and ACID-compliant data workflows, as well as enabling CI/CD for machine learning pipelines using Databricks Repos and GitHub Actions. In addition to your technical duties, troubleshooting Spark jobs and resolving issues within the Databricks environment will be part of your routine tasks.

To excel in this role, you should possess 3 to 5 years of experience in predictive analytics, with a strong background in regression, classification, and time-series modeling. Hands-on experience with Databricks Runtime for ML, Spark SQL, and PySpark is crucial for success in this position. Familiarity with tools like MLflow, Feature Store, and Unity Catalog for governance purposes will be advantageous. Industry experience in Life Insurance or Property & Casualty (P&C) is preferred, and holding a certification as a Databricks Certified ML Practitioner would be considered a plus. Your technical skill set should include proficiency in Python, PySpark, MLflow, and Databricks AutoML. Expertise in predictive modeling techniques such as classification, clustering, regression, time-series analysis, and NLP is required. Familiarity with cloud platforms like Azure or AWS, Delta Lake, and Unity Catalog will also be beneficial for this role.
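For context, here is a minimal sketch of a churn model tracked with MLflow, the kind of workflow this role describes; the dataset path, feature set, and experiment name are assumptions for illustration only.

```python
import mlflow
import mlflow.sklearn
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Hypothetical feature table exported from the Lakehouse
df = pd.read_parquet("/dbfs/tmp/churn_features.parquet")
X, y = df.drop(columns=["churned"]), df["churned"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

mlflow.set_experiment("/Shared/churn-prediction")  # hypothetical experiment path
with mlflow.start_run():
    model = RandomForestClassifier(n_estimators=200, max_depth=8, random_state=42)
    model.fit(X_train, y_train)
    auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
    mlflow.log_params({"n_estimators": 200, "max_depth": 8})
    mlflow.log_metric("test_auc", auc)
    mlflow.sklearn.log_model(model, "model")  # logged for later batch or REST serving
```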
Posted 1 month ago
7.0 - 11.0 years
0 Lacs
Coimbatore, Tamil Nadu
On-site
As a Data Engineer specializing in supply chain applications at NovintiX in Coimbatore, India, you will play a crucial role in enhancing our Supply Chain Analytics team. Your primary focus will be on developing intelligent data solutions that drive real-world logistics, procurement, and demand planning. Your responsibilities will include:
- Creating and optimizing scalable data pipelines for inventory, shipping, and procurement data
- Integrating data from ERP, PLM, and external sources through the development of APIs
- Designing, building, and maintaining enterprise-grade data warehouses and data lakes while ensuring data quality, integrity, and security
- Collaborating with stakeholders to develop reporting dashboards using tools like Power BI, Tableau, or QlikSense
- Supporting supply chain decision-making with data-driven insights
- Constructing data models and algorithms for demand forecasting and logistics optimization, utilizing ML libraries and concepts
- Coordinating with supply chain, logistics, and IT teams to translate technical solutions into understandable business language
- Implementing robust data governance frameworks and ensuring compliance and audit readiness

To qualify for this role, you should have:
- 7+ years of experience in Data Engineering
- A Bachelor's degree in Computer Science, IT, or a related field
- Proficiency in Python, Java, SQL, Spark SQL, Hadoop, PySpark, NoSQL, Power BI, Tableau, QlikSense, Azure Data Factory, Azure Databricks, and AWS
- Strong collaboration and communication skills
- Exposure to fast-paced, agile environments

If you are passionate about leveraging data to drive supply chain efficiencies and meet business objectives, we encourage you to apply for this full-time position. Please send your resume to shanmathi.saravanan@novintix.com before the application deadline on 13/07/2025. Please note that the ability to commute or relocate to Coimbatore, Tamil Nadu, is preferred for this role, as it requires in-person work.
Posted 1 month ago
13.0 - 20.0 years
30 - 45 Lacs
Pune
Hybrid
Hi, Wishes from GSN!!! Pleasure connecting with you!!!

We have been in Corporate Search Services, identifying and bringing in stellar, talented professionals for our reputed IT / Non-IT clients in India, and have been successfully delivering on the varied needs of our clients for the last 20 years. At present, GSN is hiring a DATA ENGINEERING - Solution Architect for one of our leading MNC clients. PFB the details for your better understanding:
1. WORK LOCATION: PUNE
2. Job Role: DATA ENGINEERING - Solution Architect
3. EXPERIENCE: 13+ yrs
4. CTC Range: Rs. 35 LPA to Rs. 50 LPA
5. Work Type: WFO Hybrid

****** Looking for SHORT JOINERS ******

Job Description - who are we looking for:

Architectural Vision & Strategy: Define and articulate the technical vision, strategy, and roadmap for Big Data, data streaming, and NoSQL solutions, aligning with the overall enterprise architecture and business goals.

Required Skills:
- 13+ years of progressive experience in software development, data engineering, and solution architecture roles, with a strong focus on large-scale distributed systems.
- Expertise in Big Data technologies. Apache Spark: deep expertise in Spark architecture, Spark SQL, Spark Streaming, performance tuning, and optimization techniques, plus experience with batch and real-time data processing paradigms. Hadoop ecosystem: strong understanding of HDFS, YARN, Hive, and other related Hadoop components.
- Real-time data streaming. Apache Kafka: expert-level knowledge of Kafka architecture, topics, partitions, producers, consumers, Kafka Streams, KSQL, and best practices for high-throughput, low-latency data pipelines.
- NoSQL databases: in-depth experience with Couchbase (or MongoDB or Cassandra), including data modeling, indexing, querying (N1QL), replication, scaling, and operational best practices.
- API design & development: extensive experience in designing and implementing robust, scalable, and secure APIs (RESTful, GraphQL) for data access and integration.
- Programming & code review: hands-on coding proficiency in at least one relevant language (Python, Scala, Java), with a preference for Python and/or Scala for data engineering tasks; proven experience in leading and performing code reviews, ensuring code quality, performance, and adherence to architectural guidelines.
- Cloud platforms: extensive experience in designing and implementing solutions on at least one major cloud platform (AWS, Azure, GCP), leveraging their Big Data, streaming, and compute services.
- Database fundamentals: solid understanding of relational database concepts, SQL, and data warehousing principles.
- System design & architecture patterns: deep knowledge of various architectural patterns (e.g., Microservices, Event-Driven Architecture, Lambda/Kappa Architecture, Data Mesh) and their application in data solutions.
- DevOps & CI/CD: familiarity with DevOps principles, CI/CD pipelines, infrastructure as code (IaC), and automated deployment strategies for data platforms.

****** Looking for SHORT JOINERS ******

Interested? Don't hesitate to call NAK @ 9840035825 / 9244912300 for an IMMEDIATE response.

Best,
ANANTH | GSN | Google review: https://g.co/kgs/UAsF9W
Posted 1 month ago
7.0 - 11.0 years
0 Lacs
Coimbatore, Tamil Nadu
On-site
As a Data Engineer specializing in supply chain applications, you will play a crucial role in the Supply Chain Analytics team at NovintiX, based in Coimbatore, India. Your primary responsibility will be to design, develop, and optimize scalable data solutions that support various aspects of logistics, procurement, and demand planning.

Your key responsibilities will include building and enhancing data pipelines for inventory, shipping, and procurement data, integrating data from ERP, PLM, and third-party sources, and creating APIs to facilitate seamless data exchange. Additionally, you will be tasked with designing and maintaining enterprise-grade data lakes and warehouses while ensuring high standards of data quality, integrity, and security. Collaborating with stakeholders, you will be involved in developing reporting dashboards using tools like Power BI, Tableau, or QlikSense to support supply chain decision-making through data-driven insights. You will also work on building data models and algorithms for demand forecasting and logistics optimization, leveraging ML libraries and concepts for predictive analysis. Your role will involve cross-functional collaboration with supply chain, logistics, and IT teams, translating complex technical solutions into business language to drive operational efficiency. Implementing robust data governance frameworks and ensuring data compliance and audit readiness will be essential aspects of your job.

To qualify for this position, you should have at least 7 years of experience in Data Engineering, a Bachelor's degree in Computer Science/IT or a related field, and expertise in technologies such as Python, Java, SQL, Spark SQL, Hadoop, PySpark, NoSQL, Power BI, Tableau, QlikSense, Azure Data Factory, Azure Databricks, and AWS. Strong collaboration and communication skills, along with experience in fast-paced, agile environments, are also desired. This is a full-time position based in Coimbatore, Tamil Nadu, requiring in-person work. If you are passionate about leveraging data to drive supply chain efficiency and are ready to take on this exciting challenge, please send your resume to shanmathi.saravanan@novintix.com before the application deadline on 13/07/2025.
Posted 1 month ago
8.0 - 12.0 years
12 - 18 Lacs
Noida
Work from Office
General Roles & Responsibilities:
- Technical Leadership: Demonstrate leadership and the ability to guide business and technology teams in the adoption of best practices and standards.
- Design & Development: Design, develop, and maintain a robust, scalable, and high-performance data estate.
- Architecture: Architect and design robust data solutions that meet business requirements, including scalability, performance, and security.
- Quality: Ensure the quality of deliverables through rigorous reviews and adherence to standards.
- Agile Methodologies: Actively participate in agile processes, including planning, stand-ups, retrospectives, and backlog refinement.
- Collaboration: Work closely with system architects, data engineers, data scientists, data analysts, cloud engineers, and other business stakeholders to determine the optimal solution and architecture that is also future-proof.
- Innovation: Stay updated with the latest industry trends and technologies, and drive continuous improvement initiatives within the development team.
- Documentation: Create and maintain technical documentation, including design documents and architectural user guides.

Technical Responsibilities:
- Optimize data pipelines for performance and efficiency.
- Work with Databricks clusters and configuration management tools.
- Use appropriate tools in cloud data lake development and deployment.
- Develop and implement cloud infrastructure to support current and future business needs.
- Provide technical expertise and ownership in the diagnosis and resolution of issues.
- Ensure all cloud solutions exhibit a high level of cost efficiency, performance, security, scalability, and reliability.
- Manage cloud data lake development and deployment on AWS Databricks.
- Manage and create workspaces, configure cloud resources, view usage data, and manage account identities, settings, and subscriptions in Databricks.

Required Technical Skills:
- Experience and proficiency with the Databricks platform: Delta Lake storage and Spark (PySpark, Spark SQL). Must be well versed with the Databricks Lakehouse and Unity Catalog concepts and their implementation in enterprise environments.
- Familiarity with the medallion architecture data design pattern for organizing data in a Lakehouse (illustrated in the sketch after the competencies list below).
- Experience and proficiency with AWS data services (S3, Glue, Athena, Redshift, etc.) and Airflow scheduling.
- Proficiency in SQL and experience with relational databases.
- Proficiency in at least one programming language (e.g., Python, Java) for data processing and scripting.
- Experience with DevOps practices: AWS DevOps for CI/CD, Terraform/CDK for infrastructure as code.
- Good understanding of data principles and cloud data lake design and development, including data ingestion, data modeling, and data distribution.
- Jira: proficient in using Jira for managing projects and tracking progress.

Other Skills:
- Strong communication and interpersonal skills.
- Collaborate with data stewards, data owners, and IT teams for effective implementation.
- Understanding of business processes and terminology, preferably Logistics.
- Experienced with Scrum and Agile methodologies.

Qualification: Bachelor's degree in information technology or a related field. Equivalent experience may be considered.
Overall experience of 8-12 years in Data Engineering.

Mandatory Competencies:
- Data Science and Machine Learning - Databricks
- Data on Cloud - Azure Data Lake (ADL)
- Agile
- Data Analysis
- Big Data - PySpark
- Data on Cloud - AWS S3
- Data on Cloud - Redshift
- ETL - AWS Glue
- Python
- DevOps - CI/CD
- Communication and collaboration
- Cloud (Azure) - Azure Data Factory (ADF), Azure Databricks, Azure Data Lake Storage, Event Hubs, HDInsight
- Database Programming - SQL
- Agile - SCRUM
- QA Analytics - Data Analysis
- Cloud (AWS) - AWS S3, S3 Glacier, AWS EBS
- Cloud (AWS) - TensorFlow on AWS, AWS Glue, AWS EMR, Amazon Data Pipeline, AWS Redshift
- Programming Language - Python Shell
- Development Tools and Management - CI/CD
- Cloud (AWS) - AWS Lambda, AWS EventBridge, AWS Fargate
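To make the medallion (bronze/silver) pattern referenced above concrete, here is a hedged PySpark and Delta Lake sketch; the storage locations and column names are placeholders, and it assumes an environment (such as Databricks) where Delta Lake is available.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("medallion-sketch").getOrCreate()

# Bronze: land raw records as-is in Delta format (hypothetical bucket/paths)
raw = spark.read.json("s3://example-bucket/landing/shipments/")
raw.write.format("delta").mode("append").save("s3://example-bucket/bronze/shipments")

# Silver: deduplicated, typed, conformed records ready for modeling
bronze = spark.read.format("delta").load("s3://example-bucket/bronze/shipments")
silver = (
    bronze.dropDuplicates(["shipment_id"])
          .withColumn("ship_date", F.to_date("ship_date"))
          .filter(F.col("shipment_id").isNotNull())
)
silver.write.format("delta").mode("overwrite").save("s3://example-bucket/silver/shipments")
```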
Posted 1 month ago
4.0 - 12.0 years
0 Lacs
Karnataka
On-site
As a Big Data Lead with 7-12 years of experience, you will be responsible for leading the development of data processing systems and applications, specifically in the area of Data Warehousing (DWH). Your role will involve utilizing your strong software development skills in multiple computing languages, with a focus on distributed data processing systems and BIDW programs.

You should have a minimum of 4 years of software development experience and a proven track record in developing and testing applications, preferably on the J2EE stack. A sound understanding of best practices and concepts related to Data Warehouse applications is crucial for this role. Additionally, you should possess a strong foundation in distributed systems and computing systems, with hands-on experience in Spark & Scala, Kafka, Hadoop, HBase, Pig, and Hive. Experience with NoSQL data stores, data modeling, and data management will be beneficial for this role. Strong interpersonal communication skills are essential, along with excellent oral and written communication abilities. Knowledge of Data Lake implementation as an alternative to Data Warehousing is desirable. Hands-on experience with Spark SQL and SQL proficiency are mandatory requirements for this role, and you should have a minimum of 2 end-to-end implementations in either Data Warehousing or Data Lake projects. Your role as a Big Data Lead will involve collaborating with cross-functional teams and driving data-related initiatives to meet business objectives effectively.
Posted 1 month ago
8.0 - 12.0 years
0 Lacs
Karnataka
On-site
You are a strategic thinker passionate about driving solutions in Data Analytics, and you have found the right team. As an Analytics Solutions Vice President in our Finance team, you will define, refine, and deliver our firm's goals. If you're a skilled data professional passionate about transforming raw data into actionable insights and eager to learn and implement new technologies, you've found the right team.

Join us in the Finance Data & Insights Team, an agile product team focused on developing, producing, and transforming financial data and reporting across CCB. Your role will involve creating data visualizations and intelligence solutions for top leaders to achieve strategic goals. You'll identify opportunities to eliminate manual processes and use automation tools like Alteryx, Tableau, and ThoughtSpot to develop automated solutions. Additionally, you'll extract, analyze, and summarize data for ad hoc requests and contribute to modernizing our data environment to a cloud platform.

Job responsibilities:
- Lead Data & Analytics requirements gathering sessions with varying levels of leadership and complete detailed project planning using JIRA to record planned project execution steps.
- Understand databases and ETL processes, and translate logic into requirements for the Technology team.
- Develop and enhance Alteryx workflows by collecting data from disparate sources and summarizing it as defined in requirements gathering with stakeholders, following best practices to source data from authoritative sources.
- Develop data visualization solutions using Tableau and/or ThoughtSpot to provide intuitive insights to key stakeholders.
- Conduct thorough control testing of each component of the intelligence solution, providing evidence that all data and visualizations offer accurate insights and evidence in the control process.
- Seek to understand stakeholder use cases to anticipate their requirements, questions, and objections.
- Become a subject matter expert in these responsibilities and support team members in becoming more proficient.

Required qualifications, capabilities, and skills:
- Bachelor's degree in MIS or Computer Science, Mathematics, Engineering, Statistics, or another quantitative or financial subject area.
- At least 3 years of people management experience.
- Experience with business intelligence analytics and data wrangling tools such as Alteryx, SAS, or Python.
- Experience with relational databases, optimizing SQL to pull and summarize large datasets, report creation and ad hoc analyses, Databricks, and cloud solutions.
- Experience in reporting development and testing, and the ability to interpret unstructured data and draw objective inferences given known limitations of the data.
- Demonstrated ability to think beyond raw data, understand the underlying business context, and sense business opportunities hidden in data.
- Strong written and oral communication skills; ability to communicate effectively with all levels of management and partners from a variety of business functions.
- Experience with ThoughtSpot or similar tools empowering stakeholders to better understand their data.
- Highly motivated, self-directed, and curious to learn new technologies.

Preferred qualifications, capabilities, and skills:
- Experience with ThoughtSpot / Python is a major advantage.
- Experience with AI/ML or LLMs is an added advantage but not a must-have.
- A minimum of 8 years of experience developing advanced data visualizations and presentations, preferably with Tableau.
- Experience with Hive, Spark SQL, Impala, or other big-data query tools; AWS, Databricks, Snowflake, or other cloud data warehouse experience.
- A minimum of 8 years of experience working on data analytics projects, preferably related to the financial services domain.
Posted 1 month ago
6.0 - 10.0 years
0 Lacs
Hyderabad, Telangana
On-site
You should have a minimum of 6 years of experience in the technical field and possess the following skills: Python, Spark SQL, PySpark, Apache Airflow, DBT, Snowflake, CI/CD, Git, GitHub, and AWS. Your role will involve understanding the existing code base in AWS services and SQL and converting it to a tech stack primarily using Airflow, Iceberg, Python, and SQL.

Your responsibilities will include designing and building data models to support business requirements, developing and maintaining data ingestion and processing systems, implementing data storage solutions, ensuring data consistency and accuracy through validation and cleansing techniques, and collaborating with cross-functional teams to address data-related issues. Proficiency in Python, experience with big data and Spark, orchestration experience with Airflow, and AWS knowledge are essential for this role. You should also have experience with security and governance practices such as role-based access control (RBAC) and data lineage tools, as well as knowledge of database management systems like MySQL. Strong problem-solving and analytical skills, along with excellent communication and collaboration abilities, are key attributes for this position.

At NucleusTeq, we foster a positive and supportive culture that encourages our associates to perform at their best every day. We value and celebrate individual uniqueness, offering flexibility for making daily choices that contribute to overall well-being. Our well-being programs and continuous efforts to enhance our culture aim to create an environment where our people can thrive, lead healthy lives, and excel in their roles.
Posted 1 month ago
3.0 - 6.0 years
5 - 8 Lacs
Hyderabad, Bengaluru, Delhi / NCR
Work from Office
As a Senior Azure Data Engineer, your responsibilities will include:
- Building scalable data pipelines using Databricks and PySpark
- Transforming raw data into usable business insights
- Integrating Azure services like Blob Storage, Data Lake, and Synapse Analytics
- Deploying and maintaining machine learning models using MLlib or TensorFlow
- Executing large-scale Spark jobs with performance tuning on Spark pools
- Leveraging Databricks Notebooks and managing workflows with MLflow

Qualifications:
- Bachelor's/Master's in Computer Science, Data Science, or equivalent
- 7+ years in Data Engineering, with 3+ years in Azure Databricks
- Strong hands-on experience with PySpark, Spark SQL, RDDs, Pandas, NumPy, and Delta Lake
- Azure ecosystem: Data Lake, Blob Storage, Synapse Analytics

Location: Remote - Bengaluru, Hyderabad, Delhi / NCR, Chennai, Pune, Kolkata, Ahmedabad, Mumbai
Posted 1 month ago
10.0 - 20.0 years
20 - 35 Lacs
Pune
Hybrid
Hi, Wishes from GSN!!! Pleasure connecting with you!!!

We have been in Corporate Search Services, identifying and bringing in stellar, talented professionals for our reputed IT / Non-IT clients in India, and have been successfully delivering on the varied needs of our clients for the last 20 years. At present, GSN is hiring for one of our leading MNC clients. PFB the details for your better understanding:

WORK LOCATION: PUNE
Job Role: Big Data Solution Architect
EXPERIENCE: 10 Yrs - 20 Yrs
CTC Range: 25 LPA - 35 LPA
Work Type: Hybrid

Required Skills & Experience:
- 10+ years of progressive experience in software development, data engineering, and solution architecture roles, with a strong focus on large-scale distributed systems.
- Expertise in Big Data technologies. Apache Spark: deep expertise in Spark architecture, Spark SQL, Spark Streaming, performance tuning, and optimization techniques, plus experience with batch and real-time data processing paradigms. Hadoop ecosystem: strong understanding of HDFS, YARN, Hive, and other related Hadoop components.
- Real-time data streaming. Apache Kafka: expert-level knowledge of Kafka architecture, topics, partitions, producers, consumers, Kafka Streams, KSQL, and best practices for high-throughput, low-latency data pipelines.
- NoSQL databases. Couchbase: in-depth experience with Couchbase (or similar document/key-value NoSQL databases like MongoDB or Cassandra), including data modeling, indexing, querying (N1QL), replication, scaling, and operational best practices.
- API design & development: extensive experience in designing and implementing robust, scalable, and secure APIs (RESTful, GraphQL) for data access and integration.
- Programming & code review: hands-on coding proficiency in at least one relevant language (Python, Scala, Java), with a preference for Python and/or Scala for data engineering tasks; proven experience in leading and performing code reviews, ensuring code quality, performance, and adherence to architectural guidelines.
- Cloud platforms: extensive experience designing and implementing solutions on at least one major cloud platform (AWS, Azure, GCP), leveraging their Big Data, streaming, and compute services.
- Database fundamentals: solid understanding of relational database concepts, SQL, and data warehousing principles.
- System design & architecture patterns: deep knowledge of various architectural patterns (e.g., Microservices, Event-Driven Architecture, Lambda/Kappa Architecture, Data Mesh) and their application in data solutions.
- DevOps & CI/CD: familiarity with DevOps principles, CI/CD pipelines, infrastructure as code (IaC), and automated deployment strategies for data platforms.

If interested, kindly APPLY for an IMMEDIATE response.

Thanks & Rgds
SHOBANA
GSN | Mob: 8939666294 (WhatsApp) | Email: Shobana@gsnhr.net | Web: www.gsnhr.net
Google Reviews: https://g.co/kgs/UAsF9W
Posted 1 month ago
10.0 - 20.0 years
25 - 40 Lacs
Chennai, Pune, Gurgaon, Hyderabad/Bangalore
Work from Office
Function: Software Engineering - Big Data / DWH / ETL. Skills: Azure Data Factory, Azure Synapse, ETL, Spark SQL, Scala.

Responsibilities:
- Designing and implementing scalable and efficient data architectures.
- Creating data models and optimizing data structures for performance and usability.
- Implementing and managing data lakehouses and real-time analytics solutions using Microsoft Fabric.
- Leveraging Fabric's OneLake, Dataflows, and Synapse Data Engineering for seamless data management.
- Enabling end-to-end analytics and AI-powered insights.
- Developing and orchestrating data pipelines in Azure Data Factory.
- Managing ETL/ELT processes for data integration across various sources.
- Optimizing data workflows for performance and cost efficiency.
- Designing interactive dashboards and reports in Power BI.
- Implementing data models, DAX calculations, and performance optimizations.
- Ensuring data quality, security, and governance in reporting solutions.

Requirements:
- Data Architect with 10+ years of experience and Microsoft Fabric skills, who designs and implements data solutions using Fabric, focusing on data integration, analytics, and automation, while ensuring data quality, security, and compliance.
- Primary Skills (Must Have): Azure Data Pipeline, Apache Spark, ETL, Azure Data Factory, Azure Synapse, Azure Functions, Spark SQL, SQL.
- Secondary Skills (Good to Have): other Azure services, Python/Scala, DataStage (preferably), and Fabric.
Posted 1 month ago
6.0 - 10.0 years
30 - 35 Lacs
Bengaluru
Work from Office
We are seeking an experienced PySpark Developer / Data Engineer to design, develop, and optimize big data processing pipelines using Apache Spark and Python (PySpark). The ideal candidate should have expertise in distributed computing, ETL workflows, data lake architectures, and cloud-based big data solutions.

Key Responsibilities:
- Develop and optimize ETL/ELT data pipelines using PySpark on distributed computing platforms (Hadoop, Databricks, EMR, HDInsight).
- Work with structured and unstructured data to perform data transformation, cleansing, and aggregation.
- Implement data lake and data warehouse solutions on AWS (S3, Glue, Redshift), Azure (ADLS, Synapse), or GCP (BigQuery, Dataflow).
- Optimize PySpark jobs for performance through tuning, partitioning, and caching strategies (see the sketch after this listing's requirements).
- Design and implement real-time and batch data processing solutions.
- Integrate data pipelines with Kafka, Delta Lake, Iceberg, or Hudi for streaming and incremental updates.
- Ensure data security, governance, and compliance with industry best practices.
- Work with data scientists and analysts to prepare and process large-scale datasets for machine learning models.
- Collaborate with DevOps teams to deploy, monitor, and scale PySpark jobs using CI/CD pipelines, Kubernetes, and containerization.
- Perform unit testing and validation to ensure data integrity and reliability.

Required Skills & Qualifications:
- 6+ years of experience in big data processing, ETL, and data engineering.
- Strong hands-on experience with PySpark (Apache Spark with Python).
- Expertise in SQL, the DataFrame API, and RDD transformations.
- Experience with big data platforms (Hadoop, Hive, HDFS, Spark SQL).
- Knowledge of cloud data processing services (AWS Glue, EMR, Databricks, Azure Synapse, GCP Dataflow).
- Proficiency in writing optimized queries, partitioning, and indexing for performance tuning.
- Experience with workflow orchestration tools like Airflow, Oozie, or Prefect.
- Familiarity with containerization and deployment using Docker, Kubernetes, and CI/CD pipelines.
- Strong understanding of data governance, security, and compliance (GDPR, HIPAA, CCPA, etc.).
- Excellent problem-solving, debugging, and performance optimization skills.
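As a hedged illustration of the tuning techniques named above (broadcast joins, repartitioning, caching, partitioned writes), here is a minimal sketch; the table paths, join keys, and partition counts are assumptions.

```python
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.functions import broadcast

spark = SparkSession.builder.appName("tuning-sketch").getOrCreate()

facts = spark.read.parquet("/data/facts/transactions")  # large fact table (hypothetical)
dims = spark.read.parquet("/data/dims/merchants")        # small dimension (hypothetical)

# Broadcast the small dimension to avoid shuffling the large fact table
joined = facts.join(broadcast(dims), "merchant_id")

# Repartition on the aggregation key and cache before repeated use
joined = joined.repartition(200, "merchant_id").cache()

daily = joined.groupBy("merchant_id", "txn_date").agg(F.sum("amount").alias("total"))
daily.write.mode("overwrite").partitionBy("txn_date").parquet("/data/marts/daily_totals")
```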
Posted 1 month ago
12.0 - 14.0 years
12 - 20 Lacs
Hyderabad, Bengaluru
Hybrid
Please note: the notice period should be 0-15 days.

We are looking for a highly experienced (10+ years) and deeply hands-on Data Architect to lead the design, build, and optimization of our data platforms on AWS and Databricks. This role requires a strong blend of architectural vision and direct implementation expertise, ensuring scalable, secure, and performant data solutions from concept to production.

Requirements:
- Strong hands-on experience in data engineering/architecture, with hands-on architectural and implementation experience on AWS and Databricks, and schema modeling.
- AWS: deep hands-on expertise with key AWS data services and infrastructure.
- Databricks: expert-level hands-on development with Databricks (Spark SQL, PySpark), Delta Lake, and Unity Catalog.
- Coding: exceptional proficiency in Python, PySpark, Spark, AWS services, and SQL.
- Architectural: strong data modeling and architectural design skills with a focus on practical implementation.
- Preferred: AWS/Databricks certifications, experience with streaming technologies, and other data tools.

Responsibilities:
- Design & build: lead and personally execute the design, development, and deployment of complex data architectures and pipelines on AWS (S3, Glue, Lambda, Redshift, etc.) and Databricks (PySpark/Spark SQL, Delta Lake, Unity Catalog).
- Databricks expertise: own the hands-on development, optimization, and performance tuning of Databricks jobs, clusters, and notebooks.
Posted 2 months ago
7.0 - 10.0 years
10 - 14 Lacs
Bengaluru
Work from Office
The Data Scientist-3 in Bangalore (or Mumbai) will be part of the 811 Data Strategy Group that comprises Data Engineers, Data Scientists, and Data Analytics professionals. He/she will be associated with one of the key functional areas such as Product Strategy, Cross Sell, Asset Risk, Fraud Risk, or Customer Experience and will help build robust and scalable solutions that are deployed for real-time or near real-time consumption and integrated into our proprietary Customer Data Platform (CDP). This is an exciting opportunity to work on data-driven analytical solutions and have a profound influence on the growth trajectory of a fast-evolving digital product.

Key Requirements of the Role:
- Advanced degree in an analytical field (e.g., Data Science, Computer Science, Engineering, Applied Mathematics, Statistics, Data Analysis) or substantial hands-on work experience in the space.
- 7-10 years of relevant experience in the space.
- Expertise in mining AI/ML opportunities from open-ended business problems and driving solution design/development while closely collaborating with engineering, product, and business teams.
- Strong understanding of advanced data mining techniques, and of curating, processing, and transforming data to produce sound datasets. Strong experience in NLP, time-series forecasting, and recommendation engines preferred.
- Ability to create great data stories, with expertise in robust EDA and statistical inference. Should have at least a foundational understanding of experimentation design.
- Strong understanding of the machine learning lifecycle: feature engineering, training, validation, scaling, deployment, scoring, monitoring, and the feedback loop. Exposure to deep learning applications and tools like TensorFlow, Theano, Torch, and Caffe preferred.
- Experience with analytical programming languages, tools, and libraries (Python a must) as well as shell scripting. Should be proficient in developing production-ready code as per best practices. Experience in using Scala/Java/Go-based libraries a big plus.
- Very proficient in SQL and other relational databases, along with PySpark or Spark SQL. Proficient in using NoSQL databases. Experience in using graph databases like Neo4j a plus. Candidates should be able to handle unstructured data with ease.
- Experience working with MLEs and proficiency (with experience) in using MLOps tools. Should be able to consume the capabilities of these tools with a deep understanding of the deployment lifecycle. Experience in CI/CD deployment is a big plus.
- Knowledge of key concepts in distributed systems like replication, serialization, and concurrency control a big plus.
- Good understanding of programming best practices and building code artifacts for reuse. Should be comfortable with version control and collaborating in tools like Git.
- Ability to create frameworks that can perform model RCAs using analytical and interpretability tools. Should be able to peer review model documentation/code bases and find opportunities.
- Experience in end-to-end delivery of AI-driven solutions (deep learning and traditional data science projects).
- Strong communication, partnership, and teamwork skills. Should be able to guide and mentor teams while leading them by example, and should be an integral part of creating a team culture focused on driving collaboration, technical expertise, and partnerships with other teams.
- Ability to work in an extremely fast-paced environment, meet deadlines, and perform at high standards with limited supervision.
- A self-starter looking to build from the ground up and contribute to the making of a potential big name in the space.
- Experience in banking and financial services is a plus; however, sound logical reasoning and first-principles problem solving are even more critical.

Job role:
1. As a key partner at the table, attend key meetings with the business team to bring the data perspective to the discussions.
2. Perform comprehensive data explorations to generate inquisitive insights and scope out the problem.
3. Develop simplistic to advanced solutions to address the problem at hand. We believe in making swift (albeit sometimes marginal) impact to business KPIs and hence adopt an MVP approach to solution development.
4. Build reusable code analytical frameworks to address commonly occurring business questions.
5. Perform 360-degree customer profiling and opportunity analyses to guide new product strategy. This is a nascent business, and hence opportunities to guide business strategy are plenty.
6. Guide team members on data science and analytics best practices to help them overcome bottlenecks and challenges.
7. The role will be approximately 60% individual contribution and 40% leading, and the ratio can vary based on need and fit.
8. Develop Customer-360 features that will be integrated into the Customer Data Platform (CDP) to enhance the single view of our customer.
Posted 2 months ago
3.0 - 5.0 years
22 - 25 Lacs
Bengaluru
Work from Office
Job Description: We are looking for an energetic, self-motivated, and exceptional Data Engineer to work on extraordinary enterprise products based on AI and Big Data engineering, leveraging the AWS/Databricks tech stack. You will work with a star team of Architects, Data Scientists/AI Specialists, Data Engineers, and Integration specialists.

Skills and Qualifications:
- 5+ years of experience in the DWH/ETL domain with the Databricks/AWS tech stack.
- 2+ years of experience building data pipelines with Databricks/PySpark/SQL.
- Experience in writing and interpreting SQL queries and designing data models and data standards.
- Experience with SQL Server databases, Oracle, and/or cloud databases.
- Experience in data warehousing and data marts, and with Star and Snowflake models.
- Experience in loading data into databases from databases and files.
- Experience in analyzing and drawing design conclusions from data profiling results.
- Understanding of business processes and the relationships between systems and applications.
- Must be comfortable conversing with end-users.
- Must have the ability to manage multiple projects/clients simultaneously.
- Excellent analytical, verbal, and communication skills.

Role and Responsibilities:
- Work with business stakeholders and build data solutions to address analytical and reporting requirements.
- Work with application developers and business analysts to implement and optimize Databricks/AWS-based implementations meeting data requirements.
- Design, develop, and optimize data pipelines using Databricks (Delta Lake, Spark SQL, PySpark), AWS Glue, and Apache Airflow (a minimal orchestration sketch follows this listing).
- Implement and manage ETL workflows using Databricks notebooks, PySpark, and AWS Glue for efficient data transformation.
- Develop and optimize SQL scripts, queries, views, and stored procedures to enhance data models and improve query performance on managed databases.
- Conduct root cause analysis and resolve production problems and data issues.
- Create and maintain up-to-date documentation of the data model, data flow, and field-level mappings.
- Provide support for production problems and daily batch processing.
- Provide ongoing maintenance and optimization of database schemas, data lake structures (Delta tables, Parquet), and views to ensure data integrity and performance.
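For orientation, here is a minimal Apache Airflow DAG sketch of the extract-transform-load orchestration this role describes (assuming Airflow 2.x); the DAG id, task callables, and schedule are placeholders rather than the actual workflow.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    print("pull data from source databases and files")  # placeholder step

def transform():
    print("run Databricks/PySpark or AWS Glue transformations")  # placeholder step

def load():
    print("publish curated tables and views for reporting")  # placeholder step

with DAG(
    dag_id="example_dwh_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",  # "schedule" assumes Airflow 2.4+; older 2.x releases use schedule_interval
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)
    load_task = PythonOperator(task_id="load", python_callable=load)

    extract_task >> transform_task >> load_task
```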
Posted 2 months ago