3.0 - 8.0 years
5 - 10 Lacs
Mumbai
Work from Office
This position supports batch and real-time data pipelines built on various data analytics processing frameworks, serving data science practices for the Marketing and Finance business units. It covers integration of data from multiple sources, extract, transform, load (ETL) conversions, and data cleansing and enrichment. The role performs full systems life cycle management activities such as analysis, technical requirements, design, coding, testing, and implementation of systems and applications software, and contributes to synthesizing disparate data sources into reusable and reproducible data assets.

Responsibilities
- Supervises and supports data engineering projects and builds solutions by leveraging a strong foundational knowledge in software/application development.
- Develops and delivers data engineering documentation.
- Gathers requirements, defines the scope, and performs the integration of data for data engineering projects.
- Recommends analytic reporting products/tools and supports the adoption of emerging technology.
- Performs data engineering maintenance and support.
- Provides the implementation strategy and executes backup, recovery, and technology solutions to perform analysis.
- Uses ETL tools to pull data from various sources and load the transformed data into a database or business intelligence platform (a minimal sketch follows this posting).

Required Qualifications
- Codes in a programming language used for statistical analysis and modeling, such as Python, Java, Scala, or C#.
- Strong understanding of database systems and data warehousing solutions.
- Strong understanding of the data interconnections between the organization's operational and business functions.
- Strong understanding of the data life cycle stages: data collection, transformation, analysis, secure storage, and data accessibility.
- Strong understanding of the data environment and its ability to scale for demands such as increasing data pipeline throughput, analyzing large amounts of data, real-time predictions, insights and customer feedback, data security, and data regulations and compliance.
- Strong knowledge of data structures, data filtering, and data optimization.
- Strong understanding of analytic reporting technologies and environments (e.g., Power BI, Looker, Qlik).
- Strong understanding of a cloud services platform (e.g., GCP, Azure, or AWS) and all the data life cycle stages; Azure preferred.
- Understanding of distributed systems and the underlying business problem being addressed; guides team members by performing data analysis and presenting findings to stakeholders.
- Bachelor's degree in MIS, mathematics, statistics, or computer science, an international equivalent, or equivalent job experience.

Required Skills
- 3 years of experience with Databricks
- SSIS/SSAS, Apache Spark, Python, R, SQL, SQL Server

Preferred Skills
- Scala, Delta Lake, Unity Catalog, Azure Logic Apps, cloud services platform (e.g., GCP, Azure, or AWS)
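As a rough illustration of the extract-transform-load flow this role describes, here is a minimal PySpark sketch. It assumes a Databricks-style environment with a managed catalog; the paths, column names, and table name are hypothetical.

```python
# Minimal PySpark ETL sketch (hypothetical paths and table names) illustrating
# the extract-transform-load flow described above: pull raw data, cleanse it,
# and load the result into a Delta table for downstream analytics.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("marketing_etl_example").getOrCreate()

# Extract: read raw marketing events from a landing zone (path is illustrative)
raw = spark.read.option("header", True).csv("/mnt/landing/marketing_events/")

# Transform: basic cleansing and enrichment
cleaned = (
    raw.dropDuplicates(["event_id"])
       .filter(F.col("event_timestamp").isNotNull())
       .withColumn("event_date", F.to_date("event_timestamp"))
       .withColumn("revenue", F.col("revenue").cast("double"))
)

# Load: write the curated data as a Delta table for BI / data science consumption
cleaned.write.format("delta").mode("overwrite").saveAsTable("analytics.marketing_events_curated")
```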
Posted 1 week ago
3.0 - 8.0 years
6 - 14 Lacs
Gurugram
Work from Office
Data Engineer: The ideal candidate will have strong expertise in Python, Apache Spark, and Databricks, along with experience in machine learning.
Posted 1 week ago
3.0 - 6.0 years
3 - 7 Lacs
Hyderabad, Bengaluru
Hybrid
Locations: Hyderabad & Bangalore | Work Mode: Hybrid | Interview Mode: Virtual (2 Rounds) | Type: Contract-to-Hire (C2H)

Key Skills & Responsibilities
- Hands-on experience with AWS services: S3, Lambda, Glue, API Gateway, and SQS (an illustrative Lambda handler follows this posting).
- Strong data engineering expertise on AWS, with proficiency in Python, PySpark, and SQL.
- Experience in batch job scheduling and managing data dependencies across pipelines.
- Familiarity with data processing tools such as Apache Spark and Airflow.
- Ability to automate repetitive tasks and build reusable frameworks for improved efficiency.
- Provide RunOps/DevOps support, and manage the ongoing operation and monitoring of data services.
- Ensure high performance, scalability, and reliability of data workflows in cloud environments.

Skills: AWS, S3, Lambda, Glue, API Gateway, SQS, Apache Spark, Airflow, SQL, PySpark, Python, DevOps support
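One common pattern behind the S3 / Lambda / SQS stack listed above is a Lambda function that drains SQS messages into an S3 landing zone for a downstream Glue or Spark job. The sketch below is a hedged example; the bucket name, key layout, and payload shape are assumptions, not details from the posting.

```python
# Hedged AWS Lambda handler sketch: consume SQS-delivered records and land their
# payloads in S3 as JSON objects for later batch processing. Bucket and prefix
# names are illustrative assumptions.
import json
from datetime import datetime, timezone

import boto3

s3 = boto3.client("s3")
LANDING_BUCKET = "example-data-landing"  # assumption: illustrative bucket name

def handler(event, context):
    """Triggered by SQS; writes each record body to S3 as a JSON object."""
    records = event.get("Records", [])
    for record in records:
        payload = json.loads(record["body"])
        key = f"raw/events/{datetime.now(timezone.utc):%Y/%m/%d}/{record['messageId']}.json"
        s3.put_object(Bucket=LANDING_BUCKET, Key=key, Body=json.dumps(payload).encode("utf-8"))
    return {"status": "ok", "processed": len(records)}
```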
Posted 1 week ago
6.0 - 9.0 years
4 - 7 Lacs
Pune
Hybrid
Work Mode: Hybrid | Interview Mode: Virtual (2 Rounds) | Type: Contract-to-Hire (C2H)

Job Summary
We are looking for a skilled PySpark Developer with hands-on experience in building scalable data pipelines and processing large datasets. The ideal candidate will have deep expertise in Apache Spark, Python, and modern data engineering tools in cloud environments such as AWS.

Key Skills & Responsibilities
- Strong expertise in PySpark and Apache Spark for batch and real-time data processing.
- Experience in designing and implementing ETL pipelines, including data ingestion, transformation, and validation.
- Proficiency in Python for scripting, automation, and building reusable components.
- Hands-on experience with scheduling tools like Airflow or Control-M to orchestrate workflows (see the DAG sketch after this posting).
- Familiarity with the AWS ecosystem, especially S3 and related file system operations.
- Strong understanding of Unix/Linux environments and shell scripting.
- Experience with Hadoop, Hive, and platforms like Cloudera or Hortonworks.
- Ability to handle CDC (Change Data Capture) operations on large datasets.
- Experience in performance tuning, optimizing Spark jobs, and troubleshooting.
- Strong knowledge of data modeling, data validation, and writing unit test cases.
- Exposure to real-time and batch integration with downstream/upstream systems.
- Working knowledge of Jupyter Notebook, Zeppelin, or PyCharm for development and debugging.
- Understanding of Agile methodologies, with experience in CI/CD tools (e.g., Jenkins, Git).

Preferred Skills
- Experience in building or integrating APIs for data provisioning.
- Exposure to ETL or reporting tools such as Informatica, Tableau, Jasper, or QlikView.
- Familiarity with AI/ML model development using PySpark in cloud environments.

Skills: PySpark, Apache Spark, Python, SQL, AWS, S3, Airflow, Control-M, CDC, ETL pipelines, ETL tools, Hive, Hadoop, Cloudera, Hortonworks, Unix/Linux, shell scripting, data modeling, data validation, performance tuning, unit test cases, CI/CD, Jenkins, Git, Agile methodologies, Jupyter Notebook, Zeppelin, PyCharm, API integration, Informatica, Tableau, Jasper, QlikView, AI/ML model development, real-time integration, batch integration
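For the Airflow orchestration mentioned above, a daily batch PySpark job might be scheduled roughly as follows. This is an illustrative sketch for a recent Airflow release with the Apache Spark provider installed; the DAG id, job script path, and connection id are hypothetical.

```python
# Illustrative Airflow DAG (names and paths are hypothetical) showing how a PySpark
# ETL job like the one described above might be orchestrated on a daily schedule.
# Assumes the apache-airflow-providers-apache-spark package is installed.
from datetime import datetime

from airflow import DAG
from airflow.providers.apache.spark.operators.spark_submit import SparkSubmitOperator

with DAG(
    dag_id="daily_pyspark_etl_example",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",          # daily batch window
    catchup=False,
) as dag:
    run_etl = SparkSubmitOperator(
        task_id="run_ingest_transform",
        application="/opt/jobs/ingest_transform.py",   # illustrative job script
        conn_id="spark_default",
        application_args=["--run-date", "{{ ds }}"],   # pass the logical date to the job
    )
```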
Posted 1 week ago
6.0 - 10.0 years
4 - 8 Lacs
Bengaluru
Hybrid
Role: PySpark Developer | Work Mode: Hybrid | Interview Mode: Virtual (2 Rounds) | Type: Contract-to-Hire (C2H)

Job Summary
We are looking for a skilled PySpark Developer with hands-on experience in building scalable data pipelines and processing large datasets. The ideal candidate will have deep expertise in Apache Spark, Python, and modern data engineering tools in cloud environments such as AWS.

Key Skills & Responsibilities
- Strong expertise in PySpark and Apache Spark for batch and real-time data processing.
- Experience in designing and implementing ETL pipelines, including data ingestion, transformation, and validation.
- Proficiency in Python for scripting, automation, and building reusable components.
- Hands-on experience with scheduling tools like Airflow or Control-M to orchestrate workflows.
- Familiarity with the AWS ecosystem, especially S3 and related file system operations.
- Strong understanding of Unix/Linux environments and shell scripting.
- Experience with Hadoop, Hive, and platforms like Cloudera or Hortonworks.
- Ability to handle CDC (Change Data Capture) operations on large datasets.
- Experience in performance tuning, optimizing Spark jobs, and troubleshooting.
- Strong knowledge of data modeling, data validation, and writing unit test cases.
- Exposure to real-time and batch integration with downstream/upstream systems.
- Working knowledge of Jupyter Notebook, Zeppelin, or PyCharm for development and debugging.
- Understanding of Agile methodologies, with experience in CI/CD tools (e.g., Jenkins, Git).

Preferred Skills
- Experience in building or integrating APIs for data provisioning.
- Exposure to ETL or reporting tools such as Informatica, Tableau, Jasper, or QlikView.
- Familiarity with AI/ML model development using PySpark in cloud environments.

Skills: PySpark, Apache Spark, Python, SQL, AWS, S3, Airflow, Control-M, CDC, ETL pipelines, ETL tools, Hive, Hadoop, Cloudera, Hortonworks, Unix/Linux, shell scripting, data modeling, data validation, performance tuning, unit test cases, CI/CD, Jenkins, Git, Agile methodologies, Jupyter Notebook, Zeppelin, PyCharm, API integration, Informatica, Tableau, Jasper, QlikView, AI/ML model development, real-time integration, batch integration
Posted 1 week ago
8.0 - 12.0 years
12 - 22 Lacs
Hyderabad, Secunderabad
Work from Office
Proficiency in SQL, Python, and data pipeline frameworks such as Apache Spark, Databricks, or Airflow. Hands-on experience with cloud data platforms (e.g., Azure Synapse, AWS Redshift, Google BigQuery). Strong understanding of data modeling, ETL/ELT, and data lake/warehouse/data mart architectures. Knowledge of Data Factory or AWS Glue. Experience in developing reports and dashboards using tools like Power BI, Tableau, or Looker.
Posted 1 week ago
7.0 - 12.0 years
0 - 0 Lacs
Kochi
Work from Office
Greetings from TCS Recruitment Team!

Role: Databricks Lead / Databricks Solution Architect / Databricks ML Engineer
Years of experience: 7 to 18 years
Walk-in Drive Location: Kochi
Walk-in Location Details: Tata Consultancy Services, TCS Centre SEZ Unit, Infopark Kochi Phase 1, Infopark Kochi P.O, Kakkanad, Kochi - 682042, Kerala, India
Drive Time: 9 am to 1:00 PM
Date: 21-Jun-25

Must have:
- 5+ years of experience in data engineering or related fields
- At least 2-3 years of hands-on experience with Databricks (using Apache Spark, Delta Lake, etc.)
- Solid experience in working with big data technologies such as Hadoop, Spark, Kafka, or similar
- Experience with cloud platforms (AWS, Azure, or GCP) and cloud-native data tools
- Experience with machine learning frameworks and pipelines, particularly in Databricks
- Experience with AI/ML model deployment, MLOps, and ML lifecycle management using Databricks and related tools
Posted 1 week ago
5.0 - 10.0 years
8 - 16 Lacs
Bhubaneswar, Bengaluru, Delhi / NCR
Work from Office
Project Role: Application Developer
Project Role Description: Design, build and configure applications to meet business process and application requirements.
Must have skills: Apache Spark
Good to have skills: Oracle Procedural Language Extensions to SQL (PL/SQL)
Minimum 5 year(s) of experience is required
Educational Qualification: 15 years full time education

Summary: As an Application Developer, you will design, build, and configure applications to meet business process and application requirements. You will be responsible for ensuring that the applications are developed and implemented efficiently and effectively, while meeting the needs of the organization. Your typical day will involve collaborating with the team, making team decisions, engaging with multiple teams, and providing solutions to problems for your immediate team and across multiple teams. You will also contribute to key decisions and provide expertise in application development.

Roles & Responsibilities:
- Expected to be an SME
- Collaborate and manage the team to perform
- Responsible for team decisions
- Engage with multiple teams and contribute on key decisions
- Provide solutions to problems for their immediate team and across multiple teams
- Design, build, and configure applications to meet business process and application requirements
- Ensure that applications are developed and implemented efficiently and effectively
- Contribute expertise in application development

Professional & Technical Skills:
- Must To Have Skills: Proficiency in Apache Spark
- Good To Have Skills: Experience with Oracle Procedural Language Extensions to SQL (PL/SQL), Google BigQuery
- Strong understanding of statistical analysis and machine learning algorithms
- Experience with data visualization tools such as Tableau or Power BI
- Hands-on implementing various machine learning algorithms such as linear regression, logistic regression, decision trees, and clustering algorithms
- Solid grasp of data munging techniques, including data cleaning, transformation, and normalization to ensure data quality and integrity

Additional Information:
- The candidate should have a minimum of 5 years of experience in Apache Spark
- This position is based at our Gurugram office
- A 15 years full time education is required
Posted 1 week ago
12.0 - 22.0 years
30 - 45 Lacs
Chennai
Work from Office
Project Role Description: Design, build and configure applications to meet business process and application requirements.
Must have skills: Apache Spark
Good to have skills: Google BigQuery, PySpark

Professional & Technical Skills:
- Must To Have Skills: Proficiency in Apache Spark, PySpark, Google BigQuery.
- Strong understanding of statistical analysis and machine learning algorithms.
- Experience with data visualization tools such as Tableau or Power BI.
- Hands-on implementing various machine learning algorithms such as linear regression, logistic regression, decision trees, and clustering algorithms.
- Solid grasp of data munging techniques, including data cleaning, transformation, and normalization to ensure data quality and integrity.

Additional Information:
- The candidate should have a minimum of 12 years of experience in Apache Spark.
Posted 1 week ago
8.0 - 13.0 years
25 - 40 Lacs
Bengaluru
Work from Office
Must-Have Skills:
- Azure Databricks / PySpark hands-on
- SQL/PL-SQL advanced level
- Snowflake: 2+ years
- Spark/data pipeline development: 2+ years
- Azure Repos / GitHub, Azure DevOps
- Unix shell scripting
- Cloud technology experience

Key Responsibilities:
1. Design, build, and manage data pipelines using Azure Databricks, PySpark, and Snowflake.
2. Analyze and resolve production issues (Tier 2 support with weekend/on-call rotation).
3. Write and optimize complex SQL/PL-SQL queries.
4. Collaborate on low-level and high-level design for data solutions.
5. Document all project deliverables and support deployment.

Good to Have: Knowledge of Oracle, Qlik Replicate, GoldenGate, Hadoop; job scheduler tools like Control-M or Airflow
Behavioral: Strong problem-solving & communication skills
Posted 1 week ago
3.0 - 5.0 years
15 - 17 Lacs
Pune
Work from Office
Performance Testing Specialist: Databricks Pipelines

Key Responsibilities:
- Design and execute performance testing strategies specifically for Databricks-based data pipelines.
- Identify performance bottlenecks and provide optimization recommendations across Spark/Databricks workloads.
- Collaborate with development and DevOps teams to integrate performance testing into CI/CD pipelines.
- Analyze job execution metrics, cluster utilization, memory/storage usage, and latency across various stages of data pipeline processing.
- Create and maintain performance test scripts, frameworks, and dashboards using tools like JMeter, Locust, or custom Python utilities (see the sketch after this posting).
- Generate detailed performance reports and suggest tuning at the code, configuration, and platform levels.
- Conduct root cause analysis for slow-running ETL/ELT jobs and recommend remediation steps.
- Participate in production issue resolution related to performance and contribute to RCA documentation.

Technical Skills:
Mandatory
- Strong understanding of Databricks, Apache Spark, and performance tuning techniques for distributed data processing systems.
- Hands-on experience in Spark (PySpark/Scala) performance profiling, partitioning strategies, and job parallelization.
- 2+ years of experience in performance testing and load simulation of data pipelines.
- Solid skills in SQL, Snowflake, and analyzing performance via query plans and optimization hints.
- Familiarity with Azure Databricks, Azure Monitor, Log Analytics, or similar observability tools.
- Proficient in scripting (Python/Shell) for test automation and pipeline instrumentation.
- Experience with DevOps tools such as Azure DevOps, GitHub Actions, or Jenkins for automated testing.
- Comfortable working in Unix/Linux environments and writing shell scripts for monitoring and debugging.

Good to Have
- Experience with job schedulers like Control-M, Autosys, or Azure Data Factory trigger flows.
- Exposure to CI/CD integration for automated performance validation.
- Understanding of network/storage I/O tuning parameters in cloud-based environments.
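A "custom Python utility" of the kind referenced above can be as simple as timing a representative Spark action under different configuration values. The sketch below compares shuffle-partition settings using Spark's no-op write sink; the table name and partition values are assumptions.

```python
# Minimal performance probe: time a representative aggregation under different
# shuffle-partition settings to compare Spark pipeline performance.
# Table name and candidate settings are illustrative assumptions.
import time

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("perf_probe_example").getOrCreate()

def time_aggregation(shuffle_partitions: int) -> float:
    """Run a representative aggregation and return its wall-clock duration in seconds."""
    spark.conf.set("spark.sql.shuffle.partitions", str(shuffle_partitions))
    df = spark.table("analytics.orders")          # hypothetical source table
    start = time.perf_counter()
    # The "noop" sink executes the full plan without writing anywhere, which is
    # handy for benchmarking transformation and shuffle cost in isolation.
    df.groupBy("customer_id").count().write.format("noop").mode("overwrite").save()
    return time.perf_counter() - start

for partitions in (50, 200, 800):
    print(f"shuffle.partitions={partitions}: {time_aggregation(partitions):.1f}s")
```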
Posted 1 week ago
4.0 - 9.0 years
25 - 32 Lacs
Ahmedabad
Remote
Key Responsibilities:
- Design and implement robust, scalable search architectures using Solr and Elasticsearch.
- Write, optimize, and maintain complex search queries (including full-text, faceted, fuzzy, geospatial, and nested queries) using Solr Query Parser and Elasticsearch DSL.
- Work with business stakeholders to understand search requirements and translate them into performant and accurate queries.
- Build and manage custom analyzers, tokenizers, filters, and index mappings/schemas tailored to domain-specific search needs.
- Develop and optimize indexing pipelines using Apache Spark for processing large-scale structured and unstructured datasets.
- Perform query tuning and search relevance optimization based on precision, recall, and user engagement metrics.
- Create and maintain query templates and search APIs for integration with enterprise applications.
- Monitor, troubleshoot, and improve search performance and infrastructure reliability.
- Conduct evaluations and benchmarking of search quality, query latency, and index refresh times.

Required Skills and Qualifications:
- 4 to 5 years of hands-on experience with Apache Solr and/or Elasticsearch in production environments.
- Proven ability to write and optimize complex Solr queries (standard, dismax, edismax parsers) and Elasticsearch Query DSL, including: full-text search with analyzers, faceted and filtered search, boolean and range queries, aggregations and suggesters, and nested and parent/child queries (an illustrative DSL query follows this posting).
- Strong understanding of indexing principles, Lucene internals, and relevance scoring mechanisms (BM25, TF-IDF).
- Proficiency with Apache Spark for custom indexing workflows and large-scale data processing.
- Experience with document parsing and extraction (JSON, XML, PDFs, etc.) for search indexing.
- Experience integrating search into web applications or enterprise software platforms.
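For reference, the Elasticsearch Query DSL skills described above typically look like the following: a boolean full-text query with fuzziness, a range filter, and a terms (facet-style) aggregation. The index name, field names, and endpoint are illustrative assumptions, and the request is sent with plain HTTP so it stays client-agnostic.

```python
# Hedged example of Elasticsearch Query DSL: full-text match with fuzziness,
# a range filter, and a facet-style terms aggregation. Index, fields, and
# endpoint are illustrative assumptions.
import json

import requests

query = {
    "query": {
        "bool": {
            "must": [{"match": {"title": {"query": "wireless headphones", "fuzziness": "AUTO"}}}],
            "filter": [{"range": {"price": {"gte": 50, "lte": 300}}}],
        }
    },
    "aggs": {"by_brand": {"terms": {"field": "brand.keyword", "size": 10}}},
    "size": 20,
}

resp = requests.post(
    "http://localhost:9200/products/_search",   # hypothetical cluster and index
    headers={"Content-Type": "application/json"},
    data=json.dumps(query),
    timeout=10,
)
print(resp.json()["hits"]["total"])
```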
Posted 2 weeks ago
9.0 - 12.0 years
7 - 12 Lacs
Hyderabad
Work from Office
Role Description: We are looking for a highly motivated, expert Senior Data Engineer who can own the design and development of complex data pipelines, solutions, and frameworks. The ideal candidate will be responsible for designing, developing, and optimizing data pipelines, data integration frameworks, and metadata-driven architectures that enable seamless data access and analytics. This role requires deep expertise in big data processing, distributed computing, data modeling, and governance frameworks to support self-service analytics, AI-driven insights, and enterprise-wide data management.

Roles & Responsibilities:
- Design, develop, and maintain scalable ETL/ELT pipelines to support structured, semi-structured, and unstructured data processing across the Enterprise Data Fabric.
- Implement real-time and batch data processing solutions, integrating data from multiple sources into a unified, governed data fabric architecture.
- Optimize big data processing frameworks using Apache Spark, Hadoop, or similar distributed computing technologies to ensure high availability and cost efficiency.
- Work with metadata management and data lineage tracking tools to enable enterprise-wide data discovery and governance.
- Ensure data security, compliance, and role-based access control (RBAC) across data environments.
- Optimize query performance, indexing strategies, partitioning, and caching for large-scale data sets (see the sketch after this posting).
- Develop CI/CD pipelines for automated data pipeline deployments, version control, and monitoring.
- Implement data virtualization techniques to provide seamless access to data across multiple storage systems.
- Collaborate with cross-functional teams, including data architects, business analysts, and DevOps teams, to align data engineering strategies with enterprise goals.
- Stay up to date with emerging data technologies and best practices, ensuring continuous improvement of Enterprise Data Fabric architectures.

Must-Have Skills:
- Hands-on experience in data engineering technologies such as Databricks, PySpark, SparkSQL, Apache Spark, AWS, Python, SQL, and Scaled Agile methodologies.
- Proficiency in workflow orchestration and performance tuning on big data processing.
- Strong understanding of AWS services.
- Experience with Data Fabric, Data Mesh, or similar enterprise-wide data architectures.
- Ability to quickly learn, adapt, and apply new technologies.
- Strong problem-solving and analytical skills.
- Excellent communication and teamwork skills.
- Experience with Scaled Agile Framework (SAFe), Agile delivery practices, and DevOps practices.

Good-to-Have Skills:
- Deep expertise in the biotech and pharma industries.
- Experience in writing APIs to make data available to consumers.
- Experience with SQL/NoSQL databases and vector databases for large language models.
- Experience with data modeling and performance tuning for both OLAP and OLTP databases.
- Experience with software engineering best practices, including but not limited to version control (Git, Subversion, etc.), CI/CD (Jenkins, Maven, etc.), automated unit testing, and DevOps.

Education and Professional Certifications
- 9 to 12 years of Computer Science, IT, or related field experience
- AWS Certified Data Engineer preferred
- Databricks certification preferred
- Scaled Agile SAFe certification preferred

Soft Skills:
- Excellent analytical and troubleshooting skills.
- Strong verbal and written communication skills.
- Ability to work effectively with global, virtual teams.
- High degree of initiative and self-motivation.
- Ability to manage multiple priorities successfully.
- Team-oriented, with a focus on achieving team goals.
- Ability to learn quickly; organized and detail-oriented.
- Strong presentation and public speaking skills.
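As a rough sketch of the partitioning and caching tactics called out in the responsibilities above, the snippet below caches a frequently reused subset and rewrites it partitioned by date so downstream queries can prune partitions. It assumes a Spark environment with Delta Lake available; paths and column names are hypothetical.

```python
# Partitioning and caching sketch (paths and column names are assumptions):
# cache a reused subset once, then persist it partitioned by date for pruning.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("partitioning_example").getOrCreate()

events = spark.read.format("delta").load("/mnt/lake/events")   # illustrative path

# Cache a frequently reused, filtered subset instead of recomputing it per query
recent = events.filter(F.col("event_date") >= "2024-01-01").cache()
recent.count()   # materialize the cache

# Write partitioned by date so downstream queries can prune partitions
(recent.repartition("event_date")
       .write.format("delta")
       .mode("overwrite")
       .partitionBy("event_date")
       .save("/mnt/lake/events_curated"))
```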
Posted 2 weeks ago
0.0 - 3.0 years
3 - 5 Lacs
Hyderabad
Work from Office
What you will do: In this vital role, we are seeking an Associate Data Engineer to design, build, and maintain scalable data solutions that drive business insights. You will work with large datasets, cloud platforms (AWS preferred), and big data technologies to develop ETL pipelines, ensure data quality, and support data governance initiatives.
- Develop and maintain data pipelines, ETL/ELT processes, and data integration solutions.
- Design and implement data models, data dictionaries, and documentation for accuracy and consistency.
- Ensure data security, privacy, and governance standard processes are followed.
- Use Databricks, Apache Spark (PySpark, SparkSQL), AWS, and Redshift for scalable data processing.
- Collaborate with cross-functional teams to understand data needs and deliver actionable insights.
- Optimize data pipeline performance and explore new tools for efficiency.
- Follow best practices in coding, testing, and infrastructure-as-code (CI/CD, version control, automated testing).

What we expect of you: We are all different, yet we all use our unique contributions to serve patients.
- Strong problem-solving, critical thinking, and communication skills.
- Ability to collaborate effectively in a team setting.
- Proficiency in SQL, data analysis tools, and data visualization.
- Hands-on experience with big data technologies (Databricks, Apache Spark, AWS, Redshift).
- Experience with ETL tools, workflow orchestration, and performance tuning for big data.

Basic Qualifications:
- Bachelor's degree and 0 to 3 years of experience OR Diploma and 4 to 7 years of experience in Computer Science, IT, or a related field.

Preferred Qualifications:
- Knowledge of data modeling, warehousing, and graph databases.
- Experience with Python, SageMaker, and cloud data platforms.
- AWS Certified Data Engineer or Databricks certification preferred.
Posted 2 weeks ago
3.0 - 8.0 years
5 - 10 Lacs
Hyderabad
Work from Office
Role Description: We are looking for a highly motivated, expert Senior Data Engineer who can own the design and development of complex data pipelines, solutions, and frameworks. The ideal candidate will be responsible for designing, developing, and optimizing data pipelines, data integration frameworks, and metadata-driven architectures that enable seamless data access and analytics. This role requires deep expertise in big data processing, distributed computing, data modeling, and governance frameworks to support self-service analytics, AI-driven insights, and enterprise-wide data management.

Roles & Responsibilities:
- Design, develop, and maintain scalable ETL/ELT pipelines to support structured, semi-structured, and unstructured data processing across the Enterprise Data Fabric.
- Implement real-time and batch data processing solutions, integrating data from multiple sources into a unified, governed data fabric architecture.
- Optimize big data processing frameworks using Apache Spark, Hadoop, or similar distributed computing technologies to ensure high availability and cost efficiency.
- Work with metadata management and data lineage tracking tools to enable enterprise-wide data discovery and governance.
- Ensure data security, compliance, and role-based access control (RBAC) across data environments.
- Optimize query performance, indexing strategies, partitioning, and caching for large-scale data sets.
- Develop CI/CD pipelines for automated data pipeline deployments, version control, and monitoring.
- Implement data virtualization techniques to provide seamless access to data across multiple storage systems.
- Collaborate with cross-functional teams, including data architects, business analysts, and DevOps teams, to align data engineering strategies with enterprise goals.
- Stay up to date with emerging data technologies and best practices, ensuring continuous improvement of Enterprise Data Fabric architectures.

Must-Have Skills:
- Hands-on experience in data engineering technologies such as Databricks, PySpark, SparkSQL, Apache Spark, AWS, Python, SQL, and Scaled Agile methodologies.
- Proficiency in workflow orchestration and performance tuning on big data processing.
- Strong understanding of AWS services.
- Experience with Data Fabric, Data Mesh, or similar enterprise-wide data architectures.
- Ability to quickly learn, adapt, and apply new technologies.
- Strong problem-solving and analytical skills.
- Excellent communication and teamwork skills.
- Experience with Scaled Agile Framework (SAFe), Agile delivery practices, and DevOps practices.

Good-to-Have Skills:
- Deep expertise in the biotech and pharma industries.
- Experience in writing APIs to make data available to consumers.
- Experience with SQL/NoSQL databases and vector databases for large language models.
- Experience with data modeling and performance tuning for both OLAP and OLTP databases.
- Experience with software engineering best practices, including but not limited to version control (Git, Subversion, etc.), CI/CD (Jenkins, Maven, etc.), automated unit testing, and DevOps.

Education and Professional Certifications
- Master's degree and 3 to 4+ years of Computer Science, IT, or related field experience OR Bachelor's degree and 5 to 8+ years of Computer Science, IT, or related field experience.
- AWS Certified Data Engineer preferred.
- Databricks certification preferred.
- Scaled Agile SAFe certification preferred.

Soft Skills:
- Excellent analytical and troubleshooting skills.
- Strong verbal and written communication skills.
- Ability to work effectively with global, virtual teams.
- High degree of initiative and self-motivation.
- Ability to manage multiple priorities successfully.
- Team-oriented, with a focus on achieving team goals.
- Ability to learn quickly; organized and detail-oriented.
- Strong presentation and public speaking skills.
Posted 2 weeks ago
0.0 - 2.0 years
2 - 4 Lacs
Hyderabad
Work from Office
Role Description: We are looking for an Associate Data Engineer with deep expertise in writing data pipelines to build scalable, high-performance data solutions. The ideal candidate will be responsible for developing, optimizing, and maintaining complex data pipelines, integration frameworks, and metadata-driven architectures that enable seamless access and analytics. This role requires a deep understanding of big data processing, distributed computing, data modeling, and governance frameworks to support self-service analytics, AI-driven insights, and enterprise-wide data management.

Roles & Responsibilities:
- Own the development of complex ETL/ELT data pipelines to process large-scale datasets.
- Contribute to the design, development, and implementation of data pipelines, ETL/ELT processes, and data integration solutions.
- Ensure data integrity, accuracy, and consistency through rigorous quality checks and monitoring.
- Explore and implement new tools and technologies to enhance the ETL platform and the performance of the pipelines.
- Proactively identify and implement opportunities to automate tasks and develop reusable frameworks.
- Understand the biotech/pharma domains and build highly efficient data pipelines to migrate and deploy complex data across systems.
- Work in an Agile and Scaled Agile (SAFe) environment, collaborating with cross-functional teams, product owners, and Scrum Masters to deliver incremental value.
- Use JIRA, Confluence, and Agile DevOps tools to manage sprints, backlogs, and user stories.
- Support continuous improvement, test automation, and DevOps practices in the data engineering lifecycle.
- Collaborate and communicate effectively with product teams and cross-functional teams to understand business requirements and translate them into technical solutions.

Must-Have Skills:
- Experience in data engineering with a focus on Databricks, AWS, Python, SQL, and Scaled Agile methodologies.
- Proficiency in and strong understanding of data processing and transformation with big data frameworks (Databricks, Apache Spark, Delta Lake, and distributed computing concepts).
- Strong understanding of AWS services and the ability to demonstrate the same.
- Ability to quickly learn, adapt, and apply new technologies.
- Strong problem-solving and analytical skills.
- Excellent communication and teamwork skills.
- Experience with Scaled Agile Framework (SAFe), Agile delivery, and DevOps practices.

Good-to-Have Skills:
- Data engineering experience in the biotechnology or pharma industry.
- Exposure to APIs and full stack development.
- Experience with SQL/NoSQL databases and vector databases for large language models.
- Experience with data modeling and performance tuning for both OLAP and OLTP databases.
- Experience with software engineering best practices, including but not limited to version control (Git, Subversion, etc.), CI/CD (Jenkins, Maven, etc.), automated unit testing, and DevOps.

Education and Professional Certifications
- Bachelor's degree and 2 to 5+ years of Computer Science, IT, or related field experience OR Master's degree and 1 to 4+ years of Computer Science, IT, or related field experience.
- AWS Certified Data Engineer preferred.
- Databricks certification preferred.
- Scaled Agile SAFe certification preferred.

Soft Skills:
- Excellent analytical and troubleshooting skills.
- Strong verbal and written communication skills.
- Ability to work effectively with global, virtual teams.
- High degree of initiative and self-motivation.
- Ability to manage multiple priorities successfully.
- Team-oriented, with a focus on achieving team goals.
- Ability to learn quickly; organized and detail-oriented.
- Strong presentation and public speaking skills.
Posted 2 weeks ago
9.0 - 14.0 years
11 - 16 Lacs
Hyderabad
Work from Office
Role Description: We are seeking a seasoned Solution Architect to drive the architecture, development, and implementation of data solutions for Amgen functional groups. The ideal candidate should be able to work on large-scale data analytics initiatives, engage and work alongside Business, Program Management, Data Engineering, and Analytics Engineering teams, and champion the enterprise data analytics strategy, data architecture blueprints, and architectural guidelines. As a Solution Architect, you will play a crucial role in designing, building, and optimizing data solutions for Amgen functional groups such as R&D, Operations, and GCO.

Roles & Responsibilities:
- Implement and manage large-scale data analytics solutions for Amgen functional groups that align with the Amgen data strategy.
- Collaborate with Business, Program Management, Data Engineering, and Analytics Engineering teams to deliver data solutions.
- Responsible for the design, development, optimization, delivery, and support of data solutions on AWS and Databricks architecture.
- Leverage cloud platforms (AWS preferred) to build scalable and efficient data solutions.
- Provide expert guidance and mentorship to team members, fostering a culture of innovation and best practices.
- Be passionate and hands-on, quickly experimenting with new data-related technologies.
- Define guidelines, standards, strategies, security policies, and change management policies to support the Enterprise Data platform.
- Collaborate and align with EARB, Cloud Infrastructure, Security, and other technology leaders on Enterprise Data Architecture changes.
- Work with different project and application groups to drive growth of the Enterprise Data Platform using effective written/verbal communication skills, and lead demos at different roadmap sessions.
- Overall management of the Enterprise Data Platform on the AWS environment to ensure that service delivery is cost-effective and business SLAs around uptime, performance, and capacity are met.
- Ensure scalability, reliability, and performance of data platforms by implementing best practices for architecture, cloud resource optimization, and system tuning.
- Collaborate with RunOps engineers to continuously increase our ability to push changes into production with as little manual overhead and as much speed as possible.
- Maintain knowledge of market trends and developments in data integration, data management, and analytics software/tools.
- Work as part of a team in a SAFe Agile/Scrum model.

Basic Qualifications and Experience:
- Master's degree with 6 to 8 years of experience in Computer Science, IT, or a related field OR Bachelor's degree with 9 to 12 years of experience in Computer Science, IT, or a related field.

Functional Skills:

Must-Have Skills:
- 7+ years of hands-on experience in data integration, data management, and the BI technology stack.
- Strong experience with one or more data management tools such as AWS data lake, Snowflake, or Azure Data Fabric.
- Expert-level proficiency with Databricks and experience in optimizing data pipelines and workflows in Databricks environments.
- Strong experience with Python, PySpark, and SQL for building scalable data workflows and pipelines.
- Experience with Apache Spark, Delta Lake, and other relevant technologies for large-scale data processing.
- Familiarity with BI tools including Tableau and Power BI.
- Demonstrated ability to enhance cost-efficiency, scalability, and performance for data solutions.
- Strong analytical and problem-solving skills to address complex data solutions.

Good-to-Have Skills:
- Experience in life sciences, tech, or consultative solution architecture roles.
- Experience working with agile development methodologies such as Scaled Agile.

Professional Certifications
- AWS Certified Data Engineer preferred
- Databricks certification preferred

Soft Skills:
- Excellent analytical and troubleshooting skills.
- Strong verbal and written communication skills.
- Ability to work effectively with global, virtual teams.
- High degree of initiative and self-motivation.
- Ability to manage multiple priorities successfully.
- Team-oriented, with a focus on achieving team goals.
- Strong presentation and public speaking skills.
Posted 2 weeks ago
5.0 - 8.0 years
7 - 11 Lacs
Bengaluru
Work from Office
Role Purpose: The purpose of the role is to support process delivery by ensuring daily performance of the Production Specialists, resolving technical escalations, and developing technical capability within the Production Specialists.

Do:
- Oversee and support the process by reviewing daily transactions on performance parameters
- Review the performance dashboard and the scores for the team
- Support the team in improving performance parameters by providing technical support and process guidance
- Record, track, and document all queries received, problem-solving steps taken, and total successful and unsuccessful resolutions
- Ensure standard processes and procedures are followed to resolve all client queries
- Resolve client queries as per the SLAs defined in the contract
- Develop understanding of the process/product for the team members to facilitate better client interaction and troubleshooting
- Document and analyze call logs to spot the most frequently occurring trends to prevent future problems
- Identify red flags and escalate serious client issues to the Team Leader in cases of untimely resolution
- Ensure all product information and disclosures are given to clients before and after the call/email requests
- Avoid legal challenges by monitoring compliance with service agreements
- Handle technical escalations through effective diagnosis and troubleshooting of client queries
- Manage and resolve technical roadblocks/escalations as per SLA and quality requirements; if unable to resolve the issues, escalate them to TA & SES in a timely manner
- Provide product support and resolution to clients by performing question diagnosis while guiding users through step-by-step solutions
- Troubleshoot all client queries in a user-friendly, courteous, and professional manner
- Offer alternative solutions to clients (where appropriate) with the objective of retaining customers' and clients' business
- Organize ideas and effectively communicate oral messages appropriate to listeners and situations
- Follow up and make scheduled call-backs to customers to record feedback and ensure compliance to contract SLAs
- Build people capability to ensure operational excellence and maintain superior customer service levels of the existing account/client
- Mentor and guide Production Specialists on improving technical knowledge
- Collate trainings to be conducted as triage to bridge the skill gaps identified through interviews with the Production Specialists
- Develop and conduct trainings (triages) within products for Production Specialists as per target; inform the client about the triages being conducted
- Undertake product trainings to stay current with product features, changes, and updates
- Enroll in product-specific and any other trainings per client requirements/recommendations
- Identify and document the most common problems and recommend appropriate resolutions to the team
- Update job knowledge by participating in self-learning opportunities and maintaining personal networks

Deliver:
1. Process: No. of cases resolved per day, compliance to process and quality standards, meeting process-level SLAs, Pulse score, customer feedback, NSAT/ESAT
2. Team Management: Productivity, efficiency, absenteeism
3. Capability development: Triages completed, Technical Test performance

Mandatory Skills: Apache Spark.
Posted 2 weeks ago
10.0 - 15.0 years
12 - 18 Lacs
Maharashtra
Work from Office
Staff Software Engineers are the technology leaders of our highest impact projects. Your high energy is contagious, you actively collaborate with others across the engineering organization, and you seek to learn as much as you like to teach. You personify the notion of constant improvement as you work with your team and the larger engineering group to build software that delivers on our mission. You use your extraordinary technical competence to ensure a high bar for excellence while you mentor other engineers on their own path towards craftsmanship. You are most likely T-shaped, with broad knowledge across many technologies plus strong skills in a specific area. Staff Software Engineers embrace the opportunity to represent HMH in industry groups and open-source communities.

Area of Responsibility: You will be working on the HMH Assessment Platform, which is part of the HMH Educational Online/Digital Learning Platform. The Assessment team builds a highly scalable and available platform. The platform is built using a microservices architecture: Java microservices backend, React JavaScript UI frontend, REST APIs, Postgres database, AWS cloud technologies, AWS Kafka, Kubernetes or Mesos orchestration, Datadog for logging/monitoring/alerting, Concourse CI or Jenkins, Maven, etc.

Responsibilities:
- Be the technical lead for feature development in a team of 5-10 engineers, influencing the technical direction of the overall engineering organization.
- Decompose business objectives into valuable, incrementally releasable user features, accurately estimating the effort to complete each.
- Contribute code to feature development efforts, demonstrating to others efficient design, delivery, and testing patterns and techniques.
- Strive for high quality outcomes; continuously look for ways to improve team productivity and product reliability, performance, and security.
- Develop the talents and abilities of peers and colleagues.
- Create a memorable legacy as you progress toward your personal and professional objectives.
- Foster your personal and professional development, continually seeking assignments that challenge you.

Skills & Experience: Successful candidates must demonstrate an appropriate combination of:
- 10+ years of experience as a software engineer.
- 3+ years of experience as a Staff or lead software engineer.
- Bachelor's degree in computer science or a STEM field.
- A portfolio of thought leadership and individual technical accomplishments.
- Full understanding of Agile software development methodologies and practices.
- Strong communication skills, both verbal and written.
- Extensive experience working with technologies and concepts such as:
  - Behavior-driven or test-driven development
  - JVM-based languages such as Java and Scala
  - Development frameworks such as Spring Boot
  - Asynchronous programming concepts, including event processing
  - Database technologies such as SQL, Postgres/MySQL, AWS Aurora DBs, Redshift, Liquibase or Flyway
  - NoSQL technologies such as Redis, MongoDB, and Cassandra
  - Streaming technologies such as Apache Kafka, Apache Spark, or Amazon Kinesis
  - Unit-testing frameworks such as JUnit
  - Performance testing frameworks such as Gatling
  - Architectural concepts such as microservices and separation of concerns
  - Expert knowledge of class-based, object-oriented programming and design patterns
  - Development tools such as GitHub, Jira, Jenkins, Concourse, and Maven
  - Cloud technologies such as AWS and Azure
  - Data center operating technologies such as Kubernetes, Apache Mesos, Apache Aurora, and Terraform, and container services such as Docker and Kubernetes
  - Monitoring and operational data analysis practices and tools such as Datadog, Splunk, and ELK
Posted 2 weeks ago
6.0 - 11.0 years
22 - 30 Lacs
Hyderabad
Work from Office
Required Qualifications:
- Bachelor's degree in Computer Information Systems or other technology-related fields
- 14+ years of software application development and documentation experience with a focus on quality, performance, scalability, and resilience using front-end technologies and databases
- 8+ years of experience in Apache Spark and Azure with the .NET / Java framework and related technologies such as ASP.NET, C#, VB.NET, .NET Core, Angular/React, Java, J2EE, Spring Boot, Microservices, etc. (preferred)
- 8+ years of experience with a solid understanding of object-oriented programming (OOP) principles and design patterns
- 8+ years of experience with software technologies commonly used in .NET/Java development, such as SQL Server, MySQL, Spring Boot, and Microservices
- 6+ years of knowledge of web development technologies like HTML, CSS, JavaScript, and front-end frameworks
- 4+ years of experience in CI/CD using TFS, Azure DevOps, or Git tools for data solution management and delivery
- 2+ years of familiarity with cloud services and solutions, preferably Azure, including infrastructure as a service (IaaS), platform as a service (PaaS), and software as a service (SaaS)
- Experience in software products engineering using the .NET framework and related technologies such as ASP.NET, C#, VB.NET, .NET Core, etc.
- Solid understanding of agile software development methodology (Scrum) and industry best practices
Posted 2 weeks ago
6.0 - 10.0 years
8 - 12 Lacs
Pune, Gurugram, Bengaluru
Work from Office
Contractual hiring. Hiring manager profile: linkedin.com/in/yashsharma1608. Payroll of: https://www.nyxtech.in/

1. Azure Data Engineer with Fabric
The Role: Lead Data Engineer (payroll), Client: Brillio

About the Role:
- Experience: 6 to 8 years
- Location: Bangalore, Hyderabad, Pune, Chennai, Gurgaon (Hyderabad is preferred)
- Notice: 15 days / 30 days
- Budget: 15 LPA
- Azure Fabric experience is mandatory
- Skills: Azure OneLake, data pipelines, Apache Spark, ETL, Data Factory, Azure Fabric, SQL, Python/Scala

Key Responsibilities:
- Data Pipeline Development: Lead the design, development, and deployment of data pipelines using Azure OneLake, Azure Data Factory, and Apache Spark, ensuring efficient, scalable, and secure data movement across systems.
- ETL Architecture: Architect and implement ETL (Extract, Transform, Load) workflows, optimizing the process for data ingestion, transformation, and storage in the cloud.
- Data Integration: Build and manage data integration solutions that connect multiple data sources (structured and unstructured) into a cohesive data ecosystem. Use SQL, Python, Scala, and R to manipulate and process large datasets.
- Azure OneLake Expertise: Leverage Azure OneLake and Azure Synapse Analytics to design and implement scalable data storage and analytics solutions that support big data processing and analysis.
- Collaboration with Teams: Work closely with Data Scientists, Data Analysts, and BI Engineers to ensure that the data infrastructure supports analytical needs and is optimized for performance and accuracy.
- Performance Optimization: Monitor, troubleshoot, and optimize data pipeline performance to ensure high availability, fast processing, and minimal downtime.
- Data Governance & Security: Implement best practices for data governance, data security, and compliance within the Azure ecosystem, ensuring data privacy and protection.
- Leadership & Mentorship: Lead and mentor a team of data engineers, promoting a collaborative and high-performance team culture. Oversee code reviews, design decisions, and the implementation of new technologies.
- Automation & Monitoring: Automate data engineering workflows, job scheduling, and monitoring to ensure smooth operations. Use tools like Azure DevOps, Airflow, and other relevant platforms for automation and orchestration.
- Documentation & Best Practices: Document data pipeline architecture, data models, and ETL processes, and contribute to the establishment of engineering best practices, standards, and guidelines.
- Innovation: Stay current with industry trends and emerging technologies in data engineering, cloud computing, and big data analytics, driving innovation within the team.
Posted 2 weeks ago
8.0 - 10.0 years
10 - 12 Lacs
Hyderabad
Work from Office
ABOUT THE ROLE
Role Description: We are seeking a highly skilled and experienced hands-on Test Automation Engineering Manager with deep expertise in Data Quality (DQ), Data Integration (DIF), and Data Governance. In this role, you will design and implement automated frameworks that ensure data accuracy, metadata consistency, and compliance throughout the data pipeline, leveraging technologies like Databricks, AWS, and cloud-native tools. You will have a major focus on Data Cataloging and Governance, ensuring that data assets are well-documented, auditable, and secure across the enterprise. You will be responsible for the end-to-end design and development of a test automation framework, working collaboratively with the team. As the delivery owner for test automation, your primary focus will be on building and automating comprehensive validation frameworks for data cataloging, data classification, and metadata tracking, while ensuring alignment with internal governance standards. You will also work closely with data engineers, product teams, and data governance leads to enforce data quality and governance policies. Your efforts will play a key role in driving data integrity, consistency, and trust across the organization. The role is highly technical and hands-on, with a strong focus on automation, metadata validation, and ensuring data governance practices are seamlessly integrated into development pipelines.

Roles & Responsibilities:

Data Quality & Integration Frameworks
- Design and implement Data Quality (DQ) frameworks that validate schema compliance, transformations, completeness, null checks, duplicates, threshold rules, and referential integrity (a minimal sketch follows this posting).
- Build Data Integration Frameworks (DIF) that validate end-to-end data pipelines across ingestion, processing, storage, and consumption layers.
- Automate data validations in Databricks/Spark pipelines, integrated with AWS services like S3, Glue, Athena, and Lake Formation.
- Develop modular, reusable validation components using PySpark, SQL, and Python, with orchestration via CI/CD pipelines.

Data Cataloging & Governance
- Integrate automated validations with the AWS Glue Data Catalog to ensure metadata consistency, schema versioning, and lineage tracking.
- Implement checks to verify that data assets are properly cataloged, discoverable, and compliant with internal governance standards.
- Validate and enforce data classification, tagging, and access controls, ensuring alignment with data governance frameworks (e.g., PII/PHI tagging, role-based access policies).
- Collaborate with governance teams to automate policy enforcement and compliance checks for audit and regulatory needs.

Visualization & UI Testing
- Automate validation of data visualizations in tools like Tableau, Power BI, Looker, or custom React dashboards.
- Ensure charts, KPIs, filters, and dynamic views correctly reflect backend data using UI automation (Selenium with Python) and backend validation logic.
- Conduct API testing (via Postman or Python test suites) to ensure accurate data delivery to visualization layers.

Technical Skills and Tools
- Hands-on experience with data automation tools like Databricks and AWS is essential, as the manager will be instrumental in building and managing data pipelines.
- Leverage automated testing frameworks and containerization tools to streamline processes and improve efficiency.
- Experience in UI and API functional validation using tools such as Selenium with Python and Postman, ensuring comprehensive testing coverage.

Technical Leadership, Strategy & Team Collaboration
- Define and drive the overall QA and testing strategy for UI and search-related components with a focus on scalability, reliability, and performance, while establishing alerting and reporting mechanisms for test failures, data anomalies, and governance violations.
- Contribute to system architecture and design discussions, bringing a strong quality and testability lens early into the development lifecycle.
- Lead test automation initiatives by implementing best practices and scalable frameworks, embedding test suites into CI/CD pipelines to enable automated, continuous validation of data workflows, catalog changes, and visualization updates.
- Mentor and guide QA engineers, fostering a collaborative, growth-oriented culture focused on continuous learning and technical excellence.
- Collaborate cross-functionally with product managers, developers, and DevOps to align quality efforts with business goals and release timelines.
- Conduct code reviews, test plan reviews, and pair-testing sessions to ensure team-level consistency and high-quality standards.

Good-to-Have Skills:
- Experience with data governance tools such as Apache Atlas, Collibra, or Alation
- Understanding of DataOps methodologies and practices
- Familiarity with monitoring/observability tools such as Datadog, Prometheus, or CloudWatch
- Experience building or maintaining test data generators
- Contributions to internal quality dashboards or data observability systems
- Awareness of metadata-driven testing approaches and lineage-based validations
- Experience working with agile testing methodologies such as Scaled Agile
- Familiarity with automated testing frameworks like Selenium, JUnit, TestNG, or PyTest

Must-Have Skills:
- Strong hands-on experience with Data Quality (DQ) framework design and automation
- Expertise in PySpark, Python, and SQL for data validations
- Solid understanding of ETL/ELT pipeline testing in Databricks or Apache Spark environments
- Experience validating structured and semi-structured data formats (e.g., Parquet, JSON, Avro)
- Deep familiarity with AWS data services: S3, Glue, Athena, Lake Formation, Data Catalog
- Integration of test automation with the AWS Glue Data Catalog or similar catalog tools
- UI automation using Selenium with Python for dashboard and web interface validation
- API testing using Postman, Python, or custom API test scripts
- Hands-on testing of BI tools such as Tableau, Power BI, Looker, or custom visualization layers
- CI/CD test integration with tools like Jenkins, GitHub Actions, or GitLab CI
- Familiarity with containerized environments (e.g., Docker, AWS ECS/EKS)
- Knowledge of data classification, access control validation, and PII/PHI tagging
- Understanding of data governance standards (e.g., GDPR, HIPAA, CCPA)
- Understanding of data structures: knowledge of various data structures and their applications; ability to analyze data and identify inconsistencies
- Proven hands-on experience in test automation and data automation using Databricks and AWS
- Strong knowledge of Data Integrity Frameworks (DIF) and Data Quality (DQ) principles
- Familiarity with automated testing frameworks like Selenium, JUnit, TestNG, or PyTest
- Strong understanding of data transformation techniques and logic

Education and Professional Certifications
- Bachelor's degree in computer science and engineering preferred; other engineering fields considered.
- Master's degree and 6+ years of experience OR Bachelor's degree and 8+ years of experience.

Soft Skills:
- Excellent analytical and troubleshooting skills.
- Strong verbal and written communication skills.
- Ability to work effectively with global, virtual teams.
- High degree of initiative and self-motivation.
- Ability to manage multiple priorities successfully.
- Team-oriented, with a focus on achieving team goals.
- Strong presentation and public speaking skills.
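A minimal sketch of the reusable PySpark data-quality checks this role describes (schema compliance, null checks, duplicate detection) might look like the following. The expected schema, table name, and check names are illustrative assumptions.

```python
# Hedged sketch of reusable PySpark data-quality checks: schema compliance,
# null checks on key columns, and duplicate detection. Names are illustrative.
from pyspark.sql import SparkSession, DataFrame, functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

spark = SparkSession.builder.appName("dq_checks_example").getOrCreate()

EXPECTED_SCHEMA = StructType([
    StructField("record_id", StringType(), False),
    StructField("measure", DoubleType(), True),
])

def run_dq_checks(df: DataFrame) -> dict:
    """Return a dict of check name -> pass/fail for a curated dataset."""
    results = {}
    results["schema_matches"] = df.schema == EXPECTED_SCHEMA
    results["no_null_keys"] = df.filter(F.col("record_id").isNull()).count() == 0
    results["no_duplicate_keys"] = df.count() == df.dropDuplicates(["record_id"]).count()
    return results

checks = run_dq_checks(spark.table("curated.observations"))   # hypothetical table
failed = [name for name, ok in checks.items() if not ok]
assert not failed, f"Data quality checks failed: {failed}"
```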
Posted 2 weeks ago
12.0 - 17.0 years
14 - 19 Lacs
Hyderabad
Work from Office
W e are seeking a highly skilled , hands -on and technically proficient Test Automation Engineering Manager with strong experience in data quality , data integration , and a specific focus on semantic layer validation . This role combines technical ownership of automated data testing solutions with team leadership responsibilities, ensuring that the data infrastructure across platforms remains accurate , reliable, and high performing . As a leader in the QA and Data Engineering space, you will be responsible for building robust automated testing frameworks, validating GraphQL -based data layers, and driving the teams technical growth. Your work will ensure that all data flows, transformations, and API interactions meet enterprise-grade quality standards across the data lifecycle. Y ou will be responsible for the end-to-end design and development of test automation frameworks, working collaboratively with your team. As the delivery owner for test automation, your primary responsibilities will include building and automating comprehensive validation frameworks for semantic layer testing, GraphQL API validation, and schema compliance , ensuring alignment with data quality, performance, and integration reliability standards. You will also work closely with data engineers, product teams, and platform architects to validate data contracts and integration logic, supporting the integrity and trustworthiness of enterprise data solutions. This is a highly technical and hands-on role, with strong emphasis on automation, data workflow validation , and the seamless integration of testing practices into CI/CD pipelines . Roles & Responsibilities: Design and implement robust data validation frameworks focused on the semantic layer, ensuring accurate data model, schema compliance, and contract adherence across services and platforms. Build and automate end-to-end data pipeline validations across ingestion, transformation, and consumption layers using Databricks, Apache Spark, and AWS services such as S3, Glue, Athena, and Lake Formation. Lead test automation initiatives by developing scalable, modular test frameworks and embedding them into CI/CD pipelines for continuous validation of semantic models, API integrations, and data workflows. Validate GraphQL APIs by testing query/mutation structures, schema compliance, and end-to-end integration accuracy using tools like Postman, Python, and custom test suites. Oversee UI and visualization testing for tools like Tableau, Power BI, and custom front-end dashboards, ensuring consistency with backend data through Selenium with Python and backend validations. Define and drive the overall QA strategy with emphasis on performance, reliability, and semantic data accuracy, while setting up alerting and reporting mechanisms for test failures, schema issues, and data contract violations. Collaborate closely with product managers, data engineers, developers, and DevOps teams to align quality assurance initiatives with business goals and agile release cycles. Actively contribute to architecture and design discussions, ensuring quality and testability are embedded from the earliest stages of development. Mentor and manage QA engineers, fostering a collaborative environment focused on technical excellence, knowledge sharing, and continuous professional growth. Must-Have Skills: Team Leadership Experience is also required. Strong 6+ years of experience in Requested Data Ops/Testing is required 7+ to 12 years of Overall experience is expected in Test Automation. 
Strong experience in designing and implementing test automation frameworks integrated with CI/CD pipelines.
Expertise in validating data pipelines at the syntactic layer, including schema checks, null/duplicate handling, and transformation validation.
Hands-on experience with Databricks, Apache Spark, and AWS services (S3, Glue, Athena, Lake Formation).
Proficiency in Python, PySpark, and SQL for writing validation scripts and automation logic.
Solid understanding of GraphQL APIs, including schema validation and query/mutation testing.
Experience with API testing tools like Postman and Python-based test frameworks.
Proficient in UI and visualization testing using Selenium with Python, especially for tools like Tableau, Power BI, or custom dashboards.
Familiarity with CI/CD tools such as Jenkins, GitHub Actions, or GitLab CI for test orchestration.
Ability to implement alerting and reporting for test failures, anomalies, and validation issues.
Strong background in defining QA strategies and leading test automation initiatives in data-centric environments.
Excellent collaboration and communication skills, with the ability to work closely with cross-functional teams in Agile settings.
Proven ability to mentor and manage QA engineers, fostering a collaborative environment focused on technical excellence, knowledge sharing, and continuous professional growth.
Good-to-Have Skills:
Experience with data governance tools such as Apache Atlas, Collibra, or Alation.
Understanding of DataOps methodologies and practices.
Contributions to internal quality dashboards or data observability systems.
Awareness of metadata-driven testing approaches and lineage-based validations.
Experience working with agile testing methodologies such as Scaled Agile.
Familiarity with automated testing frameworks like Selenium, JUnit, TestNG, or PyTest.
Education and Professional Certifications:
Bachelor's/Master's degree in Computer Science or Engineering preferred.
Soft Skills:
Excellent analytical and troubleshooting skills.
Strong verbal and written communication skills.
Ability to work effectively with global, virtual teams.
High degree of initiative and self-motivation.
Ability to manage multiple priorities successfully.
Team-oriented, with a focus on achieving team goals.
Strong presentation and public speaking skills.
EQUAL OPPORTUNITY STATEMENT
We provide reasonable accommodations for individuals with disabilities during the application and interview process, in job functions, and in employment benefits. Contact us to request an accommodation.
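As a rough illustration of the GraphQL semantic-layer validation described in this posting, the hypothetical pytest sketch below checks that a query response is schema-compliant. The endpoint URL, query fields, and expected types are assumptions made for the example and are not taken from the posting.

```python
# Hypothetical sketch: validating a GraphQL query against an assumed endpoint.
# The endpoint URL, query fields, and expected types are illustrative only.
import pytest
import requests

GRAPHQL_URL = "https://example.internal/api/graphql"  # assumed endpoint

CUSTOMER_QUERY = """
query ($id: ID!) {
  customer(id: $id) {
    id
    email
    lifetimeValue
  }
}
"""

@pytest.mark.parametrize("customer_id", ["1001", "1002"])
def test_customer_query_schema_compliance(customer_id):
    # Execute the query and assert the response honours the expected contract.
    resp = requests.post(
        GRAPHQL_URL,
        json={"query": CUSTOMER_QUERY, "variables": {"id": customer_id}},
        timeout=30,
    )
    assert resp.status_code == 200
    body = resp.json()

    # A schema or contract violation surfaces as a GraphQL "errors" entry.
    assert "errors" not in body, body.get("errors")

    customer = body["data"]["customer"]
    # Field-level type checks stand in for full schema validation here.
    assert isinstance(customer["id"], str)
    assert isinstance(customer["email"], str)
    assert isinstance(customer["lifetimeValue"], (int, float))
```

In practice, tests of this kind are usually generated from the published data contracts and wired into the CI/CD pipeline so that every schema change is validated automatically.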
Posted 2 weeks ago
4.0 - 6.0 years
6 - 8 Lacs
Hyderabad
Work from Office
ABOUT THE ROLE
Role Description:
We are seeking a highly experienced and hands-on Test Automation Engineering Manager with strong leadership skills and deep expertise in data integration, data quality, and automated data validation across real-time and batch pipelines. In this strategic role, you will lead the design, development, and implementation of scalable test automation frameworks that validate data ingestion, transformation, and delivery from diverse sources into AWS-based analytics platforms, leveraging technologies like Databricks, PySpark, and cloud-native services.
As a lead, you will drive the overall testing strategy, lead a team of test engineers, and collaborate cross-functionally with data engineering, platform, and product teams. Your focus will be on delivering high-confidence, production-grade data pipelines with built-in validation layers that support enterprise analytics, ML models, and reporting platforms. The role is highly technical and hands-on, with a strong focus on automation, metadata validation, and ensuring data governance practices are seamlessly integrated into development pipelines.
Roles & Responsibilities:
Define and drive the test automation strategy for data pipelines, ensuring alignment with enterprise data platform goals.
Lead and mentor a team of data QA/test engineers, providing technical direction, career development, and performance feedback.
Own delivery of automated data validation frameworks across real-time and batch data pipelines using Databricks and AWS services.
Collaborate with data engineering, platform, and product teams to embed data quality checks and testability into pipeline design.
Design and implement scalable validation frameworks for data ingestion, transformation, and consumption layers.
Automate validations for multiple data formats, including JSON, CSV, Parquet, and other structured/semi-structured file types, during ingestion and transformation.
Automate data testing workflows for pipelines built on Databricks/Spark, integrated with AWS services like S3, Glue, Athena, and Redshift.
Establish reusable test components for schema validation, null checks, deduplication, threshold rules, and transformation logic.
Integrate validation processes with CI/CD pipelines, enabling automated and event-driven testing across the development lifecycle.
Drive the selection and adoption of tools/frameworks that improve automation, scalability, and test efficiency.
Oversee testing of data visualizations in Tableau, Power BI, or custom dashboards, ensuring backend accuracy via UI and data-layer validations.
Ensure accuracy of API-driven data services, managing functional and regression testing via Postman, Python, or other automation tools.
Track test coverage, quality metrics, and defect trends, providing regular reporting to leadership and ensuring continuous improvement.
Establish alerting and reporting mechanisms for test failures, data anomalies, and governance violations.
Contribute to system architecture and design discussions, bringing a strong quality and testability lens early into the development lifecycle.
Lead test automation initiatives by implementing best practices and scalable frameworks, embedding test suites into CI/CD pipelines to enable automated, continuous validation of data workflows, catalog changes, and visualization updates.
Mentor and guide QA engineers, fostering a collaborative, growth-oriented culture focused on continuous learning and technical excellence.
Collaborate cross-functionally with product managers, developers, and DevOps to align quality efforts with business goals and release timelines.
Conduct code reviews, test plan reviews, and pair-testing sessions to ensure team-level consistency and high quality standards.
Must-Have Skills:
Hands-on experience with Databricks and Apache Spark for building and validating scalable data pipelines.
Strong expertise in AWS services including S3, Glue, Athena, Redshift, and Lake Formation.
Proficiency in Python, PySpark, and SQL for developing test automation and validation logic.
Experience validating data from various file formats such as JSON, CSV, Parquet, and Avro.
In-depth understanding of data integration workflows, including batch and real-time (streaming) pipelines.
Strong ability to define and automate data quality checks: schema validation, null checks, duplicates, thresholds, and transformation validation (a minimal PySpark sketch of these checks appears at the end of this posting).
Experience designing modular, reusable automation frameworks for large-scale data validation.
Skilled in integrating tests with CI/CD tools like GitHub Actions, Jenkins, or Azure DevOps.
Familiarity with orchestration tools such as Apache Airflow, Databricks Jobs, or AWS Step Functions.
Hands-on experience with API testing using Postman, pytest, or custom automation scripts.
Proven track record of leading and mentoring QA/test engineering teams.
Ability to define and own the test automation strategy and roadmap for data platforms.
Strong collaboration skills to work with engineering, product, and data teams.
Excellent communication skills for presenting test results, quality metrics, and project health to leadership.
Contributions to internal quality dashboards or data observability systems.
Awareness of metadata-driven testing approaches and lineage-based validations.
Experience working with agile testing methodologies such as Scaled Agile.
Familiarity with automated testing frameworks like Selenium, JUnit, TestNG, or PyTest.
Good-to-Have Skills:
Experience with data governance tools such as Apache Atlas, Collibra, or Alation.
Understanding of DataOps methodologies and practices.
Familiarity with monitoring/observability tools such as Datadog, Prometheus, or CloudWatch.
Experience building or maintaining test data generators.
Education and Professional Certifications:
Bachelor's/Master's degree in Computer Science or Engineering preferred.
Soft Skills:
Excellent analytical and troubleshooting skills.
Strong verbal and written communication skills.
Ability to work effectively with global, virtual teams.
High degree of initiative and self-motivation.
Ability to manage multiple priorities successfully.
Team-oriented, with a focus on achieving team goals.
Strong presentation and public speaking skills.
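As a rough sketch of the data quality checks named above (schema validation, null checks, duplicates, and threshold rules), the PySpark snippet below validates a hypothetical orders dataset. The S3 path, column names, and rules are illustrative assumptions only, not details from the posting.

```python
# Illustrative sketch, assuming a Databricks/Spark environment and a
# hypothetical "orders" Parquet dataset; paths and column names are placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

spark = SparkSession.builder.appName("order-validation").getOrCreate()

EXPECTED_SCHEMA = StructType([
    StructField("order_id", StringType(), False),
    StructField("customer_id", StringType(), False),
    StructField("amount", DoubleType(), True),
])

df = spark.read.parquet("s3://example-bucket/curated/orders/")  # assumed path

failures = []

# 1. Schema check: every expected column must exist with the expected type.
actual_types = {f.name: f.dataType for f in df.schema.fields}
for field in EXPECTED_SCHEMA.fields:
    if actual_types.get(field.name) != field.dataType:
        failures.append(f"schema mismatch on {field.name}: {actual_types.get(field.name)}")

# 2. Null checks on key columns.
for col_name in ("order_id", "customer_id"):
    null_count = df.filter(F.col(col_name).isNull()).count()
    if null_count:
        failures.append(f"{null_count} nulls in {col_name}")

# 3. Duplicate check on the business key.
dup_count = df.count() - df.dropDuplicates(["order_id"]).count()
if dup_count:
    failures.append(f"{dup_count} duplicate order_id rows")

# 4. Threshold rule: amounts must be non-negative.
bad_amounts = df.filter(F.col("amount") < 0).count()
if bad_amounts:
    failures.append(f"{bad_amounts} rows with negative amount")

# Fail loudly so an orchestrator or CI/CD job can flag the run.
assert not failures, "; ".join(failures)
```

Checks like these would typically be packaged as reusable components and triggered from the orchestration and CI/CD tools named in the posting, with failures feeding the alerting and reporting mechanisms described above.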
Posted 2 weeks ago
1.0 - 3.0 years
3 - 5 Lacs
Hyderabad
Work from Office
What you will do
In this vital role you will be responsible for designing, building, maintaining, analyzing, and interpreting data to provide actionable insights that drive business decisions. This role involves working with large datasets, developing reports, supporting and performing data governance initiatives, and visualizing data to ensure data is accessible, reliable, and efficiently managed. The ideal candidate has deep technical skills, experience with big data technologies, and a deep understanding of data architecture and ETL processes.
Roles & Responsibilities:
Design, develop, and maintain data solutions for data generation, collection, and processing.
Be a crucial team member that assists in the design and development of the data pipeline.
Build data pipelines and ensure data quality by implementing ETL processes to migrate and deploy data across systems (a minimal ETL sketch appears at the end of this posting).
Contribute to the design, development, and implementation of data pipelines, ETL/ELT processes, and data integration solutions.
Take ownership of data pipeline projects from inception to deployment; manage scope, timelines, and risks.
Collaborate with cross-functional teams to understand data requirements and design solutions that meet business needs.
Develop and maintain data models, data dictionaries, and other documentation to ensure data accuracy and consistency.
Implement data security and privacy measures to protect sensitive data.
Leverage cloud platforms (AWS preferred) to build scalable and efficient data solutions.
Collaborate and communicate effectively with product teams.
Collaborate with Data Architects, Business SMEs, and Data Scientists to design and develop end-to-end data pipelines that meet fast-paced business needs across geographic regions.
Identify and resolve complex data-related challenges.
Adhere to best practices for coding, testing, and designing reusable code/components.
Explore new tools and technologies that will help to improve ETL platform performance.
Participate in sprint planning meetings and provide estimations on technical implementation.
Basic Qualifications:
Master's degree and 1 to 3 years of Computer Science, IT, or related field experience OR
Bachelor's degree and 3 to 5 years of Computer Science, IT, or related field experience OR
Diploma and 7 to 9 years of Computer Science, IT, or related field experience
Preferred Qualifications:
Must-Have Skills:
Hands-on experience with big data technologies and platforms such as Databricks and Apache Spark (PySpark, SparkSQL), including workflow orchestration and performance tuning of big data processing.
Proficiency in data analysis tools (e.g., SQL) and experience with data visualization tools.
Excellent problem-solving skills and the ability to work with large, complex datasets.
Solid understanding of data governance frameworks, tools, and best practices.
Knowledge of data protection regulations and compliance requirements.
Good-to-Have Skills:
Experience with ETL tools such as Apache Spark, and various Python packages related to data processing and machine learning model development.
Good understanding of data modeling, data warehousing, and data integration concepts.
Knowledge of Python/R, Databricks, SageMaker, and cloud data platforms.
Professional Certifications:
Certified Data Engineer / Data Analyst (preferred on Databricks or cloud environments).
Soft Skills:
Excellent critical-thinking and problem-solving skills.
Good communication and collaboration skills.
Demonstrated awareness of how to function in a team setting.
Demonstrated presentation skills.
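For illustration only, the following minimal PySpark sketch shows the shape of an ingest-cleanse-load ETL step of the kind described in this posting. The bucket paths and column names are hypothetical and are not taken from the role description.

```python
# Minimal ETL sketch, assuming a Spark runtime (e.g., Databricks); source path,
# target path, and column names are hypothetical and purely illustrative.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("customer-etl").getOrCreate()

# Extract: raw CSV landed by an upstream process (path is assumed).
raw = (
    spark.read
    .option("header", True)
    .csv("s3://example-bucket/raw/customers/")
)

# Transform: standardise types, normalise strings, drop records missing the key.
cleansed = (
    raw
    .withColumn("customer_id", F.trim(F.col("customer_id")))
    .withColumn("signup_date", F.to_date(F.col("signup_date"), "yyyy-MM-dd"))
    .withColumn("email", F.lower(F.trim(F.col("email"))))
    .filter(F.col("customer_id").isNotNull())
    .dropDuplicates(["customer_id"])
)

# Load: write a partitioned Parquet dataset for downstream consumers.
(
    cleansed.write
    .mode("overwrite")
    .partitionBy("signup_date")
    .parquet("s3://example-bucket/curated/customers/")
)
```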
Posted 2 weeks ago