
8,569 PySpark Jobs

JobPe aggregates listings for easy access, but applications are submitted directly on the original job portal.

0 years

6 - 8 Lacs

Hyderābād

Remote

Official Title: Data Operations Analyst About YipitData: YipitData is the leading market research and analytics firm for the disruptive economy and recently raised up to $475M from The Carlyle Group at a valuation over $1B. We analyze billions of alternative data points every day to provide accurate, detailed insights on ridesharing, e-commerce marketplaces, payments, and more. Our on-demand insights team uses proprietary technology to identify, license, clean, and analyze the data many of the world's largest investment funds and corporations depend on. For three years and counting, we have been recognized as one of Inc's Best Workplaces . We are a fast-growing technology company backed by The Carlyle Group and Norwest Venture Partners. Our offices are located in NYC, Austin, Miami, Denver, Mountain View, Seattle , Hong Kong, Shanghai, Beijing, Guangzhou, and Singapore. We cultivate a people-centric culture focused on mastery, ownership, and transparency. Why You Should Apply NOW: We are seeking a highly skilled and detail-oriented Data Operations Analyst to play a crucial role in developing custom product attribution solutions based on unique customer needs. This position requires a deep understanding of consumer product data, a strong combination of technical abilities, and a proactive approach to delivering high-quality results. The Data Operations Analyst will work closely with our corporate retail and brand customers, who may have different approaches to organizing and structuring their categories. Your primary responsibility will be to map products to customer requirements using a blend of manual tagging and machine-learning tools. About The Role: The Data Operations Analyst plays a critical role in developing custom product attribution based on unique customer needs. Each of our corporate retail and brand customers thinks about the structure of their categories in slightly different ways, and the Data Operations Analyst will execute the mapping of products to their requirements using a combination of manual tagging and machine learning tools. Interpretation and execution design require strong judgment. The ideal candidate should bring a combination of technical skills, knowledge of consumer product data, strong attention to detail, and accountability to plan and deliver projects on time with a high degree of accuracy. This is a fully-remote opportunity based in India. The start date is June 30, 2025. During onboarding and training period, we expect several hours of overlap with US time zones. Afterward, hires should be available for meetings and check-ins with their US managers and colleagues; however, outside of these specific times, standard work hours can be flexible. As Our Data Operations Analyst, You Will: Work with corporate retail and brand customers to understand their category structures and product attribution requirements. Execute the mapping of products to customers' category needs, utilizing both manual tagging and machine learning tools. Apply strong judgment and interpretation skills to ensure that data mapping aligns with customer specifications and business goals. Collaborate with cross-functional teams to plan and deliver projects on time, ensuring high levels of accuracy and precision. Continuously monitor and improve product attribution processes to increase efficiency and quality. Leverage technical skills to enhance existing processes and tools that support data operations. 
Maintain strong attention to detail and ensure accuracy in every aspect of the data mapping process. Take accountability for successfully executing projects, including meeting deadlines and client expectations.
You Are Likely To Succeed If You Have… 1-2 years of experience developing a strong technical background, including applied Python, PySpark, and/or SQL skills. RegEx experience is not mandatory but would be considered a nice-to-have skill (an illustrative tagging sketch follows this listing). Proficiency in data analysis, tagging systems, and machine learning tools. Knowledge of consumer product data and experience working with retail or brand data. Exceptional attention to detail, with a focus on delivering high-quality, accurate results. Excellent problem-solving and critical-thinking skills. The ability to manage multiple tasks and projects while maintaining a high degree of accuracy. Strong communication and collaboration skills to work with internal teams and external clients. Proven ability to execute projects efficiently and meet deadlines.
Preferred Skills for This Position Include: Experience with data management platforms, product attribution systems, or machine learning tools. Familiarity with data mapping, tagging, and categorization practices. If you are a proactive, technically skilled individual with a passion for data and product attribution, we invite you to join our dynamic team! Apply today to help shape the future of data operations for our retail and brand customers.
What We Offer: Our compensation package includes comprehensive benefits, perks, and a competitive salary. We care about your personal life and we mean it: we offer vacation time, medical insurance, parental leave, learning reimbursement, and more! Your growth at YipitData is determined by the impact that you are making, not by tenure, unnecessary facetime, or office politics. Everyone at YipitData is empowered to learn, self-improve, and master their skills in an environment focused on ownership, respect, and trust.
We are committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, marital status, disability, gender, gender identity or expression, or veteran status. We are proud to be an equal-opportunity employer. Job Applicant Privacy Notice
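The posting above describes mapping products to customer categories with a blend of manual tagging, machine learning tools, and optional RegEx skills, but gives no code. Purely as a hypothetical illustration (product titles, rules, and category labels are all invented, and this is not YipitData's actual process), a rule-based first pass in PySpark might look like this:

```python
from pyspark.sql import SparkSession
import pyspark.sql.functions as F

spark = SparkSession.builder.appName("product-tagging-sketch").getOrCreate()

# Hypothetical input: one row per product with a free-text title.
products = spark.createDataFrame(
    [("p1", "Organic Whole Milk 1L"),
     ("p2", "Wireless Gaming Mouse"),
     ("p3", "Strawberry Yogurt 4-pack")],
    ["product_id", "title"],
)

# Hypothetical customer-specific rules: regex pattern -> category label.
rules = [
    (r"(?i)\b(milk|yogurt|cheese)\b", "Dairy"),
    (r"(?i)\b(mouse|keyboard|headset)\b", "PC Accessories"),
]

# Apply the first matching rule; unmatched rows are left for manual tagging.
category = F.lit(None).cast("string")
for pattern, label in reversed(rules):
    category = F.when(F.col("title").rlike(pattern), F.lit(label)).otherwise(category)

tagged = products.withColumn("category", category)
tagged.show(truncate=False)
```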

Posted 14 hours ago

Apply

2.0 - 3.0 years

8 Lacs

Thiruvananthapuram

On-site

2 - 3 Years | 2 Openings | Trivandrum
Role description
Role Proficiency: This role requires proficiency in data pipeline development, including coding, testing, and implementing data pipelines for ingesting, wrangling, transforming, and joining data from various sources. Must be adept at using ETL tools such as Informatica, Glue, Databricks, and DataProc, along with coding skills in Python, PySpark, and SQL. Works independently according to work allocation.
Outcomes: Operate with minimal guidance to develop error-free code, test applications, and document the development process. Understand application features and component designs to develop them in accordance with user stories and requirements. Code, debug, test, document, and communicate the stages of product, component, or feature development. Develop optimized code using appropriate approaches and algorithms while adhering to standards and security guidelines independently. Complete foundational-level certifications in Azure, AWS, or GCP. Demonstrate proficiency in writing advanced SQL queries.
Measures of Outcomes: Adherence to engineering processes and standards. Adherence to schedule/timelines. Adherence to SLAs where applicable. Number of defects post delivery. Number of non-compliance issues. Reduction of recurrence of known defects. Quick turnaround of production bugs. Completion of applicable technical/domain certifications. Completion of all mandatory training requirements.
Outputs Expected: Code Development: Develop data processing code independently, ensuring it meets performance and scalability requirements. Documentation: Create comprehensive documentation for personal work and ensure it aligns with project standards. Configuration: Follow the configuration process diligently. Testing: Create and conduct unit tests for data pipelines and transformations to ensure data quality and correctness. Domain Relevance: Develop features and components with a solid understanding of the business problems being addressed for the client. Defect Management: Raise, fix, and retest defects in accordance with project standards. Estimation: Estimate time, effort, and resource dependencies for personal work. Knowledge Management: Consume and contribute to project-related documents, SharePoint libraries, and client universities. Release Management: Adhere to the release management process for seamless deployment. Design Understanding: Understand the design and low-level design (LLD) and link it to requirements and user stories. Certifications: Obtain relevant technology certifications to enhance skills and knowledge.
Skill Examples: Proficiency in SQL, Python, or other programming languages utilized for data manipulation. Experience with ETL tools such as Apache Airflow, Talend, Informatica, AWS Glue, Dataproc, and Azure ADF. Hands-on experience with cloud platforms like AWS, Azure, or Google Cloud, particularly with data-related services (e.g. AWS Glue, BigQuery). Conduct tests on data pipelines and evaluate results against data quality and performance specifications.
Knowledge Examples: Knowledge of various ETL services used by cloud providers, including Apache PySpark, AWS Glue, GCP DataProc/DataFlow, and Azure ADF/ADLF. Proficiency in SQL for analytics, including windowing functions. Understanding of data schemas and models.
Additional Comments: Strong written and verbal communication skills in English. Ability to work in 24x7 shift schedules, including night shifts for extended periods. Analytical and problem-solving skills to diagnose and address data-related issues. Proficiency in writing SQL queries for data extraction and analysis. Hands-on experience with MS Excel for data analysis. Ability to work independently under minimal supervision while following SOPs. Strong attention to detail and ability to manage multiple monitoring tasks effectively.
As an L1 Data Ops Analyst, you will be responsible for monitoring data pipelines, dashboards, and databases to ensure smooth operations. You will follow Standard Operating Procedures (SOPs) and runbooks to identify, escalate, and resolve issues with minimal supervision. Strong analytical skills, attention to detail, and the ability to work in a fast-paced, 24x7 environment are critical for this role.
Key Responsibilities: Monitor various dashboards and databases continuously for a 9-hour shift. Identify and escalate system or data anomalies based on predefined thresholds. Follow SOPs and runbooks to troubleshoot and resolve basic data issues. Work closely with L2 and L3 support teams for issue escalation and resolution. Write and execute basic SQL queries for data validation and troubleshooting (see the sketch after this listing). Analyze and interpret data using MS Excel to identify trends or anomalies. Maintain detailed logs of incidents, resolutions, and escalations. Communicate effectively with stakeholders, both verbally and in writing.
Skills: SQL, Data Analysis, MS Excel, Dashboards
About UST: UST is a global digital transformation solutions provider. For more than 20 years, UST has worked side by side with the world’s best companies to make a real impact through transformation. Powered by technology, inspired by people and led by purpose, UST partners with their clients from design to operation. With deep domain expertise and a future-proof philosophy, UST embeds innovation and agility into their clients’ organizations. With over 30,000 employees in 30 countries, UST builds for boundless impact—touching billions of lives in the process.
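As a rough illustration of the "basic SQL queries for data validation" mentioned above, here is a small, hypothetical PySpark/SQL sketch; the table and column names (`orders_staging`, `order_id`, `updated_at`) are invented and nothing here comes from UST's actual runbooks:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("l1-data-validation-sketch").getOrCreate()

# Hypothetical staging table loaded by an upstream pipeline.
validation_sql = """
SELECT
    ingest_date,
    COUNT(*)                                           AS row_count,
    SUM(CASE WHEN order_id IS NULL THEN 1 ELSE 0 END)  AS null_order_ids,
    MIN(updated_at)                                    AS oldest_update,
    MAX(updated_at)                                    AS latest_update
FROM   orders_staging
GROUP  BY ingest_date
ORDER  BY ingest_date DESC
"""

# A zero row_count for today, or a non-zero null_order_ids, would be
# escalated per the SOP thresholds described in the posting.
spark.sql(validation_sql).show()
```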

Posted 14 hours ago

Apply

4.0 - 7.0 years

3 - 7 Lacs

Gurgaon

On-site

Gurugram, Haryana, India; Bangalore, Karnataka, India
Qualification: Big Data, PySpark, Hive, Spark optimization. Good to have: GCP.
Skills Required: Big Data, Spark, PySpark, optimization, Python, Java
Role: Big Data, PySpark, Hive, Spark optimization. Good to have: GCP.
Experience: 4 to 7 years
Job Reference Number: 13159

Posted 14 hours ago

Apply

0 years

5 - 9 Lacs

Haryāna

On-site

Role Description: We are seeking a skilled professional to maintain and support batch jobs in a legacy environment. The role involves managing and monitoring ETL processes, addressing issues, and enhancing existing PL/SQL scripts. The ideal candidate will have strong expertise in Informatica, SQL Server, and data warehousing concepts, along with experience in troubleshooting and improving batch job performance.
Key Responsibilities: Design and implement robust ETL pipelines using AWS Glue, Lambda, Redshift, and S3 (see the sketch after this listing). Monitor and optimize the performance of data workflows and batch processing jobs. Troubleshoot and resolve issues related to data pipeline failures, inconsistencies, and performance bottlenecks. Collaborate with cross-functional teams to define data requirements and ensure data quality and accuracy. Develop and maintain automated solutions for data transformation, migration, and integration tasks. Implement best practices for data security, data governance, and compliance within AWS environments. Continuously improve and optimize AWS Glue jobs, Lambda functions, and S3 storage management. Maintain comprehensive documentation for data pipeline architecture, job schedules, and issue resolution processes.
Required Skills and Experience: Strong experience with data engineering practices. Experience in AWS services, particularly AWS Glue, Lambda, S3, and other AWS data tools. Proficiency in SQL, Python, PySpark, NumPy, etc., and experience in working with large-scale data sets. Experience in designing and implementing ETL pipelines in cloud environments. Expertise in troubleshooting and optimizing data processing workflows. Familiarity with data warehousing concepts and cloud-native data architecture. Knowledge of automation and orchestration tools in a cloud-based environment. Strong problem-solving skills and the ability to debug and improve the performance of data jobs. Excellent communication skills and the ability to work collaboratively with cross-functional teams. Good to have: knowledge of DBT and Snowflake.
We are an Equal Opportunity Employer: We value diversity at Incedo. We do not discriminate based on race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.
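To make the pipeline responsibilities above concrete, here is a minimal PySpark sketch of the kind of S3-to-S3 transform step such an ETL job might perform. The bucket paths and column names are hypothetical, and this is not Incedo's actual pipeline; a real deployment would typically run as an AWS Glue job or on a comparable managed runtime.

```python
from pyspark.sql import SparkSession
import pyspark.sql.functions as F

spark = SparkSession.builder.appName("etl-sketch").getOrCreate()

# Hypothetical raw input landed by an upstream process.
raw = spark.read.json("s3://example-raw-bucket/orders/2025/08/")

# Basic cleanup: drop malformed rows, normalise types, add a load timestamp.
cleaned = (
    raw.dropna(subset=["order_id"])
       .withColumn("order_amount", F.col("order_amount").cast("double"))
       .withColumn("load_ts", F.current_timestamp())
)

# Write a partitioned, analytics-friendly copy; a Redshift COPY or a Glue
# crawler could pick up this location downstream.
(cleaned.write
        .mode("overwrite")
        .partitionBy("order_date")
        .parquet("s3://example-curated-bucket/orders/"))
```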

Posted 15 hours ago

Apply

0 years

3 - 7 Lacs

Gurgaon

Remote

Official Title: Data Operations Analyst About YipitData: YipitData is the leading market research and analytics firm for the disruptive economy and recently raised up to $475M from The Carlyle Group at a valuation over $1B. We analyze billions of alternative data points every day to provide accurate, detailed insights on ridesharing, e-commerce marketplaces, payments, and more. Our on-demand insights team uses proprietary technology to identify, license, clean, and analyze the data many of the world's largest investment funds and corporations depend on. For three years and counting, we have been recognized as one of Inc's Best Workplaces . We are a fast-growing technology company backed by The Carlyle Group and Norwest Venture Partners. Our offices are located in NYC, Austin, Miami, Denver, Mountain View, Seattle , Hong Kong, Shanghai, Beijing, Guangzhou, and Singapore. We cultivate a people-centric culture focused on mastery, ownership, and transparency. Why You Should Apply NOW: We are seeking a highly skilled and detail-oriented Data Operations Analyst to play a crucial role in developing custom product attribution solutions based on unique customer needs. This position requires a deep understanding of consumer product data, a strong combination of technical abilities, and a proactive approach to delivering high-quality results. The Data Operations Analyst will work closely with our corporate retail and brand customers, who may have different approaches to organizing and structuring their categories. Your primary responsibility will be to map products to customer requirements using a blend of manual tagging and machine-learning tools. About The Role: The Data Operations Analyst plays a critical role in developing custom product attribution based on unique customer needs. Each of our corporate retail and brand customers thinks about the structure of their categories in slightly different ways, and the Data Operations Analyst will execute the mapping of products to their requirements using a combination of manual tagging and machine learning tools. Interpretation and execution design require strong judgment. The ideal candidate should bring a combination of technical skills, knowledge of consumer product data, strong attention to detail, and accountability to plan and deliver projects on time with a high degree of accuracy. This is a fully-remote opportunity based in India. The start date is June 30, 2025. During onboarding and training period, we expect several hours of overlap with US time zones. Afterward, hires should be available for meetings and check-ins with their US managers and colleagues; however, outside of these specific times, standard work hours can be flexible. As Our Data Operations Analyst, You Will: Work with corporate retail and brand customers to understand their category structures and product attribution requirements. Execute the mapping of products to customers' category needs, utilizing both manual tagging and machine learning tools. Apply strong judgment and interpretation skills to ensure that data mapping aligns with customer specifications and business goals. Collaborate with cross-functional teams to plan and deliver projects on time, ensuring high levels of accuracy and precision. Continuously monitor and improve product attribution processes to increase efficiency and quality. Leverage technical skills to enhance existing processes and tools that support data operations. 
Maintain strong attention to detail and ensure accuracy in every aspect of the data mapping process. Take accountability for successfully executing projects, including meeting deadlines and client expectations.
You Are Likely To Succeed If You Have… 1-2 years of experience developing a strong technical background, including applied Python, PySpark, and/or SQL skills. RegEx experience is not mandatory but would be considered a nice-to-have skill. Proficiency in data analysis, tagging systems, and machine learning tools. Knowledge of consumer product data and experience working with retail or brand data. Exceptional attention to detail, with a focus on delivering high-quality, accurate results. Excellent problem-solving and critical-thinking skills. The ability to manage multiple tasks and projects while maintaining a high degree of accuracy. Strong communication and collaboration skills to work with internal teams and external clients. Proven ability to execute projects efficiently and meet deadlines.
Preferred Skills for This Position Include: Experience with data management platforms, product attribution systems, or machine learning tools. Familiarity with data mapping, tagging, and categorization practices. If you are a proactive, technically skilled individual with a passion for data and product attribution, we invite you to join our dynamic team! Apply today to help shape the future of data operations for our retail and brand customers.
What We Offer: Our compensation package includes comprehensive benefits, perks, and a competitive salary. We care about your personal life and we mean it: we offer vacation time, medical insurance, parental leave, learning reimbursement, and more! Your growth at YipitData is determined by the impact that you are making, not by tenure, unnecessary facetime, or office politics. Everyone at YipitData is empowered to learn, self-improve, and master their skills in an environment focused on ownership, respect, and trust.
The annual salary for this position is anticipated to be ₹16,60,000 - ₹20,75,000 (INR). The final offer may be determined by a number of factors, including, but not limited to, the applicant's experience, knowledge, skills, and abilities, as well as internal team benchmarks.
We are committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, marital status, disability, gender, gender identity or expression, or veteran status. We are proud to be an equal-opportunity employer. Job Applicant Privacy Notice

Posted 15 hours ago

Apply

175.0 years

2 - 7 Lacs

Gurgaon

On-site

At American Express, our culture is built on a 175-year history of innovation, shared values and Leadership Behaviors, and an unwavering commitment to back our customers, communities, and colleagues. As part of Team Amex, you'll experience this powerful backing with comprehensive support for your holistic well-being and many opportunities to learn new skills, develop as a leader, and grow your career. Here, your voice and ideas matter, your work makes an impact, and together, you will help us define the future of American Express. Description: The Analytics, Investment and Marketing Enablement (AIM) team – a part of GCS Marketing Organization – is the analytical engine that enables the Global Commercial Card business. The team drives Profitable Growth in Acquisitions through Data, Analytics, AI powered Targeting & Personalization Capabilities. This B30 role would be a part of AIM India team, based out of Gurgaon, and would be responsible for proactive retention and save a card analytics across the SME segment across marketing and sales distribution channels. This critical role represents a unique opportunity to make charge volume impact of 2+ Billion. A very important focus for the role shall be quantitatively determining the value, deriving insights, and then assuring the insights are leveraged to create positive impact that cause a meaningful difference to the business. Key Responsibilities include: Develop/enhance precursors in AI models partnering with Decision science and collaborate across Marketing, Risk, and Sales to help design customized treatments depending upon the precursors. Be a key analytical partner to the Marketing and Measurement teams to report on Digital, Field and Phone Programs that promote growth and retention. Support and enable the GCS partners with actionable, insightful analytical solutions (such as triggers, Prioritization Tiers) to help the Field and Phone Sales team prioritize efforts effectively. Partner with functional leaders, Strategic Business Partners, and Senior leaders to assess and identify opportunities for better customer engagement and revenue growth. Excellent communication skills with the ability to engage, influence, and inspire partners and stakeholders to drive collaboration and alignment. Exceptional execution skills – be able to resolve issues, identify opportunities, and define success metrics and make things happen. Drive Automation and ongoing refinement of analytical frameworks. Willingness to challenge the status quo; breakthrough thinking to generate insights, alternatives, and opportunities for business success. High degree of organization, individual initiative, and personal accountability Minimum Qualifications: Strong programming skills & experience with building models & analytical data products are required. Experience with technologies such as Java, Big Data, PySpark, Hive, Scala, Python Proficiency & experience in applying cutting edge statistical and machine learning techniques to business problems and leverage external thinking (from academia and/or other industries) to develop best in class data science solutions. Excellent communication and interpersonal skills, and ability to build and retain strong working relationships. Ability to interact effectively and deliver compelling messages to business leaders across various band levels. 
Preferred Qualifications: Good knowledge of statistical techniques like hypothesis testing, regression, k-NN, t-tests, and chi-square tests (a small illustrative example follows this listing). Demonstrated ability to work independently and across a matrix organization, partnering with capabilities, decision sciences, technology teams, and external vendors to deliver solutions at top speed. Experience with commercial data and the ability to create insights and drive results.
We back you with benefits that support your holistic well-being so you can be and deliver your best. This means caring for you and your loved ones' physical, financial, and mental health, as well as providing the flexibility you need to thrive personally and professionally: Competitive base salaries. Bonus incentives. Support for financial well-being and retirement. Comprehensive medical, dental, vision, life insurance, and disability benefits (depending on location). Flexible working model with hybrid, onsite, or virtual arrangements depending on role and business need. Generous paid parental leave policies (depending on your location). Free access to global on-site wellness centers staffed with nurses and doctors (depending on location). Free and confidential counseling support through our Healthy Minds program. Career development and training opportunities.
American Express is an equal opportunity employer and makes employment decisions without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, veteran status, disability status, age, or any other status protected by law. Offer of employment with American Express is conditioned upon the successful completion of a background verification check, subject to applicable laws and regulations.
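The preferred qualifications above mention hypothesis testing (t-tests, chi-square). Purely as a generic illustration not tied to any American Express data, a two-sample t-test of the kind referenced might look like this in Python, using made-up treatment and control samples:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Hypothetical charge-volume samples for two customer treatment groups.
treated = rng.normal(loc=1050.0, scale=200.0, size=500)
control = rng.normal(loc=1000.0, scale=200.0, size=500)

# Welch's two-sample t-test (does not assume equal variances).
t_stat, p_value = stats.ttest_ind(treated, control, equal_var=False)

print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
if p_value < 0.05:
    print("Difference in means is statistically significant at the 5% level.")
else:
    print("No statistically significant difference detected.")
```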

Posted 15 hours ago

Apply

0 years

0 Lacs

Delhi

Remote

Official Title: Data Operations Analyst About YipitData: YipitData is the leading market research and analytics firm for the disruptive economy and recently raised up to $475M from The Carlyle Group at a valuation over $1B. We analyze billions of alternative data points every day to provide accurate, detailed insights on ridesharing, e-commerce marketplaces, payments, and more. Our on-demand insights team uses proprietary technology to identify, license, clean, and analyze the data many of the world's largest investment funds and corporations depend on. For three years and counting, we have been recognized as one of Inc's Best Workplaces . We are a fast-growing technology company backed by The Carlyle Group and Norwest Venture Partners. Our offices are located in NYC, Austin, Miami, Denver, Mountain View, Seattle , Hong Kong, Shanghai, Beijing, Guangzhou, and Singapore. We cultivate a people-centric culture focused on mastery, ownership, and transparency. Why You Should Apply NOW: We are seeking a highly skilled and detail-oriented Data Operations Analyst to play a crucial role in developing custom product attribution solutions based on unique customer needs. This position requires a deep understanding of consumer product data, a strong combination of technical abilities, and a proactive approach to delivering high-quality results. The Data Operations Analyst will work closely with our corporate retail and brand customers, who may have different approaches to organizing and structuring their categories. Your primary responsibility will be to map products to customer requirements using a blend of manual tagging and machine-learning tools. About The Role: The Data Operations Analyst plays a critical role in developing custom product attribution based on unique customer needs. Each of our corporate retail and brand customers thinks about the structure of their categories in slightly different ways, and the Data Operations Analyst will execute the mapping of products to their requirements using a combination of manual tagging and machine learning tools. Interpretation and execution design require strong judgment. The ideal candidate should bring a combination of technical skills, knowledge of consumer product data, strong attention to detail, and accountability to plan and deliver projects on time with a high degree of accuracy. This is a fully-remote opportunity based in India. The start date is June 30, 2025. During onboarding and training period, we expect several hours of overlap with US time zones. Afterward, hires should be available for meetings and check-ins with their US managers and colleagues; however, outside of these specific times, standard work hours can be flexible. As Our Data Operations Analyst, You Will: Work with corporate retail and brand customers to understand their category structures and product attribution requirements. Execute the mapping of products to customers' category needs, utilizing both manual tagging and machine learning tools. Apply strong judgment and interpretation skills to ensure that data mapping aligns with customer specifications and business goals. Collaborate with cross-functional teams to plan and deliver projects on time, ensuring high levels of accuracy and precision. Continuously monitor and improve product attribution processes to increase efficiency and quality. Leverage technical skills to enhance existing processes and tools that support data operations. 
Maintain strong attention to detail and ensure accuracy in every aspect of the data mapping process. Take accountability for successfully executing projects, including meeting deadlines and client expectations.
You Are Likely To Succeed If You Have… 1-2 years of experience developing a strong technical background, including applied Python, PySpark, and/or SQL skills. RegEx experience is not mandatory but would be considered a nice-to-have skill. Proficiency in data analysis, tagging systems, and machine learning tools. Knowledge of consumer product data and experience working with retail or brand data. Exceptional attention to detail, with a focus on delivering high-quality, accurate results. Excellent problem-solving and critical-thinking skills. The ability to manage multiple tasks and projects while maintaining a high degree of accuracy. Strong communication and collaboration skills to work with internal teams and external clients. Proven ability to execute projects efficiently and meet deadlines.
Preferred Skills for This Position Include: Experience with data management platforms, product attribution systems, or machine learning tools. Familiarity with data mapping, tagging, and categorization practices. If you are a proactive, technically skilled individual with a passion for data and product attribution, we invite you to join our dynamic team! Apply today to help shape the future of data operations for our retail and brand customers.
What We Offer: Our compensation package includes comprehensive benefits, perks, and a competitive salary. We care about your personal life and we mean it: we offer vacation time, medical insurance, parental leave, learning reimbursement, and more! Your growth at YipitData is determined by the impact that you are making, not by tenure, unnecessary facetime, or office politics. Everyone at YipitData is empowered to learn, self-improve, and master their skills in an environment focused on ownership, respect, and trust.
The annual salary for this position is anticipated to be ₹16,60,000 - ₹20,75,000 (INR). The final offer may be determined by a number of factors, including, but not limited to, the applicant's experience, knowledge, skills, and abilities, as well as internal team benchmarks.
We are committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, marital status, disability, gender, gender identity or expression, or veteran status. We are proud to be an equal-opportunity employer. Job Applicant Privacy Notice

Posted 15 hours ago

Apply

0 years

5 - 10 Lacs

Noida

On-site

Posted On: 7 Aug 2025
Location: Noida, UP, India
Company: Iris Software
Why Join Us? Are you inspired to grow your career at one of India’s Top 25 Best Workplaces in the IT industry? Do you want to do the best work of your life at one of the fastest-growing IT services companies? Do you aspire to thrive in an award-winning work culture that values your talent and career aspirations? It’s happening right here at Iris Software.
About Iris Software: At Iris Software, our vision is to be our client’s most trusted technology partner, and the first choice for the industry’s top professionals to realize their full potential. With over 4,300 associates across India, the U.S.A., and Canada, we help our enterprise clients thrive with technology-enabled transformation across financial services, healthcare, transportation & logistics, and professional services. Our work covers complex, mission-critical applications built with the latest technologies, such as high-value complex Application & Product Engineering, Data & Analytics, Cloud, DevOps, Data & MLOps, Quality Engineering, and Business Automation.
Working at Iris: Be valued, be inspired, be your best. At Iris Software, we invest in and create a culture where colleagues feel valued, can explore their potential, and have opportunities to grow. Our employee value proposition (EVP) is about “Being Your Best” – as a professional and person. It is about being challenged by work that inspires us, being empowered to excel and grow in your career, and being part of a culture where talent is valued. We’re a place where everyone can discover and be their best version.
Job Description: Data Services Architect to work on and lead the design and implementation of data architecture solutions for a logistics enterprise. This role requires a good understanding of canonical architecture patterns, medallion (bronze-silver-gold) data architecture, and end-to-end data processing pipelines, enabling analytics-ready data from raw files stored in AWS S3, using Databricks as the core processing platform. The ideal candidate will collaborate closely with business stakeholders to understand domain knowledge.
Key Responsibilities: Canonical architecture design. Medallion architecture implementation (Databricks). Raw data mapping and transformation: build ingestion pipelines to process raw files from AWS S3 into the Bronze layer; map and transform raw data into structured canonical formats aligned with logistics business rules; implement scalable DevOps pipelines using PySpark, Delta Lake, and Databricks Workflows (a Bronze-layer ingestion sketch follows this listing). Business engagement: work closely with business SMEs, operations teams, and product managers to understand logistics processes, business entities, and KPIs; translate the business requirements into data models and semantic layers. Collaboration & leadership: guide data engineers in the development and maintenance of data pipelines; provide architectural oversight and best practices for data processing and integration.
Mandatory Competencies: Cloud - Azure - Azure Data Factory (ADF), Azure Databricks, Azure Data Lake Storage, Event Hubs, HDInsight; Data Science and Machine Learning - Data Science and Machine Learning - Databricks; Big Data - Big Data - PySpark; DevOps/Configuration Mgmt - Cloud Platforms - AWS; Beh - Communication and collaboration.
Perks and Benefits for Irisians: At Iris Software, we offer world-class benefits designed to support the financial, health, and well-being needs of our associates to help achieve harmony between their professional and personal growth. From comprehensive health insurance and competitive salaries to flexible work arrangements and ongoing learning opportunities, we're committed to providing a supportive and rewarding work environment. Join us and experience the difference of working at a company that values its employees' success and happiness.
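The following is a minimal, hypothetical sketch of the Bronze-layer ingestion step referenced in the Data Services Architect role above: raw S3 files appended to a Delta table with audit columns. Paths, schema, and table names are assumptions, not Iris Software's actual design.

```python
from pyspark.sql import SparkSession
import pyspark.sql.functions as F

spark = (
    SparkSession.builder
    .appName("bronze-ingestion-sketch")
    .getOrCreate()  # on Databricks a SparkSession is already provided
)

# Hypothetical raw shipment files dropped by source systems into S3.
raw_path = "s3://example-logistics-raw/shipments/"

bronze = (
    spark.read.format("json").load(raw_path)
         # Keep the data as-is in Bronze; only add lineage/audit columns.
         .withColumn("_ingest_ts", F.current_timestamp())
         .withColumn("_source_file", F.input_file_name())
)

# Append into a Bronze Delta table; Silver/Gold jobs would read from here,
# apply the canonical logistics model, and aggregate for analytics.
(bronze.write
       .format("delta")
       .mode("append")
       .saveAsTable("bronze.shipments_raw"))
```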

Posted 15 hours ago

Apply

0 years

7 - 8 Lacs

Noida

On-site

Posted On: 6 Aug 2025
Location: Noida, UP, India
Company: Iris Software
Why Join Us? Are you inspired to grow your career at one of India’s Top 25 Best Workplaces in the IT industry? Do you want to do the best work of your life at one of the fastest-growing IT services companies? Do you aspire to thrive in an award-winning work culture that values your talent and career aspirations? It’s happening right here at Iris Software.
About Iris Software: At Iris Software, our vision is to be our client’s most trusted technology partner, and the first choice for the industry’s top professionals to realize their full potential. With over 4,300 associates across India, the U.S.A., and Canada, we help our enterprise clients thrive with technology-enabled transformation across financial services, healthcare, transportation & logistics, and professional services. Our work covers complex, mission-critical applications built with the latest technologies, such as high-value complex Application & Product Engineering, Data & Analytics, Cloud, DevOps, Data & MLOps, Quality Engineering, and Business Automation.
Working at Iris: Be valued, be inspired, be your best. At Iris Software, we invest in and create a culture where colleagues feel valued, can explore their potential, and have opportunities to grow. Our employee value proposition (EVP) is about “Being Your Best” – as a professional and person. It is about being challenged by work that inspires us, being empowered to excel and grow in your career, and being part of a culture where talent is valued. We’re a place where everyone can discover and be their best version.
Job Description: Must have 3+ years of hands-on experience in test automation development using Python. Must have basic knowledge of the Big Data and AI ecosystem. Must have API testing experience using any framework available in the market, using Python (a small illustrative test follows this listing). Continuous testing experience and expertise required. Proven success in a position of similar responsibilities in a QA environment. Must be strong in writing efficient code in Python using data frames. Must have hands-on experience with Python, PySpark, Linux, Big Data (data validation), Jenkins, and GitHub. Good to have: AWS-Hadoop commands, QTest, Java, Rest Assured, Selenium, Pytest, Playwright, Cypress, Cucumber, Behave, JMeter, LoadRunner.
Mandatory Competencies: QA/QE - QA Automation - Python; Beh - Communication; QA/QE - QA Manual - API Testing; Big Data - Big Data - PySpark; Operating System - Operating System - Linux.
Perks and Benefits for Irisians: At Iris Software, we offer world-class benefits designed to support the financial, health, and well-being needs of our associates to help achieve harmony between their professional and personal growth. From comprehensive health insurance and competitive salaries to flexible work arrangements and ongoing learning opportunities, we're committed to providing a supportive and rewarding work environment. Join us and experience the difference of working at a company that values its employees' success and happiness.
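As a hedged illustration of the Python API testing mentioned above (the base URL, endpoint, and response fields are invented, and no specific framework is mandated by the posting), a small pytest suite using requests might look like this:

```python
import pytest
import requests

BASE_URL = "https://api.example.com"  # hypothetical service under test


def test_get_order_returns_expected_fields():
    # Hypothetical endpoint; in a real suite the base URL and IDs would come
    # from fixtures or configuration rather than being hard-coded.
    response = requests.get(f"{BASE_URL}/orders/123", timeout=10)
    assert response.status_code == 200
    body = response.json()
    assert {"order_id", "status", "amount"} <= body.keys()


@pytest.mark.parametrize("order_id", ["", "not-a-number"])
def test_invalid_order_id_is_rejected(order_id):
    response = requests.get(f"{BASE_URL}/orders/{order_id}", timeout=10)
    assert response.status_code in (400, 404)
```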

Posted 15 hours ago

Apply

0 years

0 Lacs

Trivandrum, Kerala, India

On-site

Role Description
Role Proficiency: This role requires proficiency in data pipeline development, including coding, testing, and implementing data pipelines for ingesting, wrangling, transforming, and joining data from various sources. Must be adept at using ETL tools such as Informatica, Glue, Databricks, and DataProc, along with coding skills in Python, PySpark, and SQL. Works independently according to work allocation.
Outcomes: Operate with minimal guidance to develop error-free code, test applications, and document the development process. Understand application features and component designs to develop them in accordance with user stories and requirements. Code, debug, test, document, and communicate the stages of product, component, or feature development. Develop optimized code using appropriate approaches and algorithms while adhering to standards and security guidelines independently. Complete foundational-level certifications in Azure, AWS, or GCP. Demonstrate proficiency in writing advanced SQL queries.
Measures of Outcomes: Adherence to engineering processes and standards. Adherence to schedule/timelines. Adherence to SLAs where applicable. Number of defects post delivery. Number of non-compliance issues. Reduction of recurrence of known defects. Quick turnaround of production bugs. Completion of applicable technical/domain certifications. Completion of all mandatory training requirements.
Outputs Expected: Code Development: Develop data processing code independently, ensuring it meets performance and scalability requirements. Documentation: Create comprehensive documentation for personal work and ensure it aligns with project standards. Configuration: Follow the configuration process diligently. Testing: Create and conduct unit tests for data pipelines and transformations to ensure data quality and correctness. Domain Relevance: Develop features and components with a solid understanding of the business problems being addressed for the client. Defect Management: Raise, fix, and retest defects in accordance with project standards. Estimation: Estimate time, effort, and resource dependencies for personal work. Knowledge Management: Consume and contribute to project-related documents, SharePoint libraries, and client universities. Release Management: Adhere to the release management process for seamless deployment. Design Understanding: Understand the design and low-level design (LLD) and link it to requirements and user stories. Certifications: Obtain relevant technology certifications to enhance skills and knowledge.
Skill Examples: Proficiency in SQL, Python, or other programming languages utilized for data manipulation. Experience with ETL tools such as Apache Airflow, Talend, Informatica, AWS Glue, Dataproc, and Azure ADF. Hands-on experience with cloud platforms like AWS, Azure, or Google Cloud, particularly with data-related services (e.g. AWS Glue, BigQuery). Conduct tests on data pipelines and evaluate results against data quality and performance specifications.
Knowledge Examples: Knowledge of various ETL services used by cloud providers, including Apache PySpark, AWS Glue, GCP DataProc/DataFlow, and Azure ADF/ADLF. Proficiency in SQL for analytics, including windowing functions (see the sketch after this listing). Understanding of data schemas and models.
Additional Comments: Strong written and verbal communication skills in English. Ability to work in 24x7 shift schedules, including night shifts for extended periods. Analytical and problem-solving skills to diagnose and address data-related issues. Proficiency in writing SQL queries for data extraction and analysis. Hands-on experience with MS Excel for data analysis. Ability to work independently under minimal supervision while following SOPs. Strong attention to detail and ability to manage multiple monitoring tasks effectively.
As an L1 Data Ops Analyst, you will be responsible for monitoring data pipelines, dashboards, and databases to ensure smooth operations. You will follow Standard Operating Procedures (SOPs) and runbooks to identify, escalate, and resolve issues with minimal supervision. Strong analytical skills, attention to detail, and the ability to work in a fast-paced, 24x7 environment are critical for this role.
Key Responsibilities: Monitor various dashboards and databases continuously for a 9-hour shift. Identify and escalate system or data anomalies based on predefined thresholds. Follow SOPs and runbooks to troubleshoot and resolve basic data issues. Work closely with L2 and L3 support teams for issue escalation and resolution. Write and execute basic SQL queries for data validation and troubleshooting. Analyze and interpret data using MS Excel to identify trends or anomalies. Maintain detailed logs of incidents, resolutions, and escalations. Communicate effectively with stakeholders, both verbally and in writing.
Skills: SQL, Data Analysis, MS Excel, Dashboards
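The knowledge examples above call out SQL windowing functions. Here is a small, generic illustration (not from this employer) of using ROW_NUMBER to keep only the latest record per key, written via Spark SQL against a hypothetical `order_updates` table:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("window-function-sketch").getOrCreate()

# Hypothetical: keep only the most recent row per order_id from a change feed.
dedup_sql = """
WITH ranked AS (
    SELECT
        *,
        ROW_NUMBER() OVER (
            PARTITION BY order_id
            ORDER BY updated_at DESC
        ) AS rn
    FROM order_updates
)
SELECT * FROM ranked WHERE rn = 1
"""

latest_orders = spark.sql(dedup_sql)
latest_orders.show()
```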

Posted 15 hours ago

Apply

0 years

0 Lacs

Gurgaon, Haryana, India

On-site

We are seeking a dynamic professional with strong experience in Databricks and Machine Learning to design and implement scalable data pipelines and ML solutions. The ideal candidate will work closely with data scientists, analysts, and business teams to deliver high-performance data products and predictive models.
Key Responsibilities: Design, develop, and optimize data pipelines using Databricks, PySpark, and Delta Lake. Build and deploy machine learning models at scale. Perform data wrangling, feature engineering, and model tuning. Collaborate with cross-functional teams for ML model integration and monitoring. Implement MLflow for model versioning and tracking (see the sketch after this listing). Ensure best practices in MLOps, code management, and automation.
Must-Have Skills: Hands-on experience with Databricks, Spark, and SQL. Strong knowledge of ML algorithms, Python (Pandas, Scikit-learn), and model deployment. Familiarity with cloud platforms (Azure / AWS / GCP). Experience with CI/CD pipelines and ML lifecycle management tools.
Good to Have: Exposure to data governance, monitoring tools, and performance optimization. Knowledge of Docker/Kubernetes and REST API integration.
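As an illustration of the MLflow model versioning and tracking mentioned above, here is a minimal, generic sketch using scikit-learn; the synthetic dataset, run name, and hyperparameters are placeholders and do not reflect this employer's actual setup.

```python
import mlflow
import mlflow.sklearn
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data; a real pipeline would read feature tables instead.
X, y = make_classification(n_samples=1_000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

with mlflow.start_run(run_name="rf-baseline-sketch"):
    model = RandomForestClassifier(n_estimators=200, random_state=0)
    model.fit(X_train, y_train)

    accuracy = accuracy_score(y_test, model.predict(X_test))

    # Parameters, metrics, and the model artifact are versioned per run.
    mlflow.log_param("n_estimators", 200)
    mlflow.log_metric("accuracy", accuracy)
    mlflow.sklearn.log_model(model, artifact_path="model")
```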

Posted 16 hours ago

Apply

4.0 years

0 Lacs

Greater Kolkata Area

On-site

Line of Service Advisory Industry/Sector Not Applicable Specialism Data, Analytics & AI Management Level Senior Associate Job Description & Summary At PwC, our people in data and analytics engineering focus on leveraging advanced technologies and techniques to design and develop robust data solutions for clients. They play a crucial role in transforming raw data into actionable insights, enabling informed decision-making and driving business growth. In data engineering at PwC, you will focus on designing and building data infrastructure and systems to enable efficient data processing and analysis. You will be responsible for developing and implementing data pipelines, data integration, and data transformation solutions. Why PWC At PwC, you will be part of a vibrant community of solvers that leads with trust and creates distinctive outcomes for our clients and communities. This purpose-led and values-driven work, powered by technology in an environment that drives innovation, will enable you to make a tangible impact in the real world. We reward your contributions, support your wellbeing, and offer inclusive benefits, flexibility programmes and mentorship that will help you thrive in work and life. Together, we grow, learn, care, collaborate, and create a future of infinite experiences for each other. Learn more about us. At PwC, we believe in providing equal employment opportunities, without any discrimination on the grounds of gender, ethnic background, age, disability, marital status, sexual orientation, pregnancy, gender identity or expression, religion or other beliefs, perceived differences and status protected by law. We strive to create an environment where each one of our people can bring their true selves and contribute to their personal growth and the firm’s growth. To enable this, we have zero tolerance for any discrimination and harassment based on the above considerations. Job Description & Summary: A career within Data and Analytics services will provide you with the opportunity to help organisations uncover enterprise insights and drive business results using smarter data analytics. We focus on a collection of organisational technology capabilities, including business intelligence, data management, and data assurance that help our clients drive innovation, growth, and change within their organisations in order to keep up with the changing nature of customers and technology. We make impactful decisions by mixing mind and machine to leverage data, understand and navigate risk, and help our clients gain a competitive edge. Responsibilities Design and develop data pipelines using Databricks and PySpark to ingest, process, and transform large volumes of data. Implement ETL/ELT workflows to move data from source systems to Data Warehouses, Data Lakes, and Lake Houses using cloud-native tools. Work with structured and unstructured data stored in AWS or Azure Data Lakes. Apply strong SQL and Python skills to manipulate and analyze data efficiently. Collaborate with cross-functional teams to deliver cloud-based serverless data solutions. Design innovative data solutions that address complex business requirements and support data-driven decision-making. Maintain documentation and enforce best practices for data architecture, governance, and performance optimization. Mandatory Skill Sets Databricks, PySpark, and SQL on any cloud platform (AWS or Azure). 
Preferred Skill Sets: ETL, PySpark
Years of experience required: 4 to 10 years
Education Qualification: Bachelor's degree in computer science, data science, or any other engineering discipline. Master’s degree is a plus.
Degrees/Field of Study required: Master Degree, Bachelor Degree
Degrees/Field of Study preferred: Not specified
Certifications: Not specified
Required Skills: PySpark
Optional Skills: Accepting Feedback, Active Listening, Agile Scalability, Amazon Web Services (AWS), Analytical Thinking, Apache Airflow, Apache Hadoop, Azure Data Factory, Communication, Creativity, Data Anonymization, Data Architecture, Database Administration, Database Management System (DBMS), Database Optimization, Database Security Best Practices, Databricks Unified Data Analytics Platform, Data Engineering, Data Engineering Platforms, Data Infrastructure, Data Integration, Data Lake, Data Modeling, Data Pipeline {+ 27 more}
Desired Languages: Not specified
Travel Requirements: Not Specified
Available for Work Visa Sponsorship? No
Government Clearance Required? No
Job Posting End Date
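The responsibilities above include working with structured and unstructured data in AWS or Azure data lakes. Purely as an illustration (the ADLS paths and fields are hypothetical, not PwC's), here is how nested, semi-structured JSON might be flattened with PySpark before loading it into a warehouse or lakehouse table:

```python
from pyspark.sql import SparkSession
import pyspark.sql.functions as F

spark = SparkSession.builder.appName("flatten-json-sketch").getOrCreate()

# Hypothetical semi-structured events: one JSON document per line,
# each containing an array of items.
events = spark.read.json("abfss://raw@examplelake.dfs.core.windows.net/events/")

flattened = (
    events.withColumn("item", F.explode("items"))   # one row per array element
          .select(
              "event_id",
              "event_ts",
              F.col("item.sku").alias("sku"),
              F.col("item.quantity").alias("quantity"),
          )
)

# Write the flattened, analysis-ready table to the curated zone.
flattened.write.mode("overwrite").parquet(
    "abfss://curated@examplelake.dfs.core.windows.net/events_items/"
)
```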

Posted 16 hours ago

Apply

0 years

0 Lacs

Ahmedabad, Gujarat, India

On-site

Mandatory: Proficiency in Python with experience in Databricks (PySpark).
Good to Have: Hands-on experience with Apache Airflow. Working knowledge of PostgreSQL and MongoDB. Basic experience with cloud technologies like Azure, AWS, and Google Cloud.

Posted 18 hours ago

Apply

2.0 years

0 Lacs

Gurugram, Haryana, India

On-site

At EXL, our collaboration is built on ongoing listening and learning to adapt our methodologies. We’re your business evolution partner—tailoring solutions that make the most of data to make better business decisions and drive more intelligence into your increasingly digital operations. (Note: We are only considering immediate joiners currently based in Delhi/NCR.)
Job Description: 2+ years of experience in Analytics. Develop, optimize, and maintain data pipelines using PySpark and Python for large-scale data processing. Proficient in SQL and Power BI. Experienced in data extraction and manipulation. In-depth data analysis, such as identifying major trends and univariate and bivariate analysis (see the sketch after this listing). Good communication and presentation skills with experience in client management.
Required Technical Skills: Python: strong programming skills with experience in data manipulation and analysis. PySpark: hands-on experience with Spark DataFrames and distributed computing concepts. Advanced SQL: deep understanding of joins, window functions, CTEs, and query optimization. Power BI: experience in building dashboards and data modeling. R: working knowledge of statistical analysis, data visualization, or reporting in R.
Qualifications: Bachelor’s or Master’s degree in Computer Science, Data Science, Information Technology, Statistics, or a related field. Minimum 2 years of relevant experience in analytics roles.
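As a generic illustration of the univariate and bivariate analysis mentioned in the job description (the data and column names below are invented), a first pass in Python might look like this:

```python
import pandas as pd

# Hypothetical transactions extract.
df = pd.DataFrame(
    {
        "spend": [120.0, 95.5, 310.0, 45.0, 210.0, 150.0],
        "visits": [3, 2, 8, 1, 5, 4],
        "segment": ["A", "B", "A", "B", "A", "B"],
    }
)

# Univariate: distribution of a single variable.
print(df["spend"].describe())

# Bivariate: relationships between two variables.
print(df[["spend", "visits"]].corr())          # correlation matrix
print(df.groupby("segment")["spend"].mean())   # spend by segment
```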

Posted 18 hours ago

Apply

2.0 years

13 - 15 Lacs

Mumbai Metropolitan Region

Remote

About The Opportunity A high-velocity IT consulting firm in the Cloud Infrastructure Support sector seeks a hands-on engineer to maintain and optimize enterprise Azure estates. You’ll work in a hybrid model from India, partnering with global teams to ensure uptime, drive automation, and deliver top-tier support for mission-critical applications. Role & Responsibilities Serve as primary point of contact for Azure platform incidents, managing ticket triage, escalation, and resolution. Perform root-cause analysis on infrastructure failures and implement preventative measures to reduce recurrence. Develop and maintain Python scripts and ARM templates for automation of routine tasks, deployments, and configuration management. Collaborate with DevOps teams to design, test, and optimize Azure DevOps pipelines for CI/CD workflows. Monitor system health using Azure Monitor, Log Analytics, and custom dashboards; respond to alerts and capacity thresholds. Document processes, runbooks, and knowledge-base articles to streamline support operations and enable self-service. Skills & Qualifications Must-Have 2+ years’ experience supporting Azure Cloud Services (VMs, App Services, Functions, Storage). Strong Python scripting skills for automation, data parsing, and API integration. Hands-on expertise with ARM templates, Azure CLI, and REST API for resource provisioning. Proven track record in troubleshooting infrastructure incidents and performing RCA in complex environments. Familiarity with Azure DevOps, CI/CD pipeline creation, and version control (Git). Preferred Experience with monitoring and alerting tools such as Azure Monitor, Log Analytics, and Application Insights. Knowledge of Linux/Windows server administration and SQL/NoSQL database support. Benefits & Culture Highlights Hybrid work model with flexible hours and remote days to promote work–life balance. Competitive compensation, training allowances, and certification support for Azure and Python. Collaborative culture with regular hackathons, knowledge-sharing sessions, and global team events. Skills: itil,storage architecture,networking concepts,app services,adf,vector db,shell scripting,azure active directory,command-line interfaces,sql,key vault,cli,python,azure cli,rag,pyspark,sql databases,virtual networks,azure python sdk,databricks,azure monitor,bash,data engineering,cosmos db,adb,storage accounts,powerbi,aks clusters,bash scripting,llm,powershell,aks,logic app,azure,python middleware
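The must-have skills above include Python automation against Azure services such as Key Vault and Storage via the Azure Python SDK. The snippet below is a small, hypothetical example of that pattern; the vault URL, secret name, and containers are placeholders and are not part of the posting.

```python
from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient
from azure.storage.blob import BlobServiceClient

# DefaultAzureCredential picks up managed identity, CLI login, or env vars.
credential = DefaultAzureCredential()

# Hypothetical vault and secret used to reach a storage account.
vault = SecretClient(
    vault_url="https://example-vault.vault.azure.net", credential=credential
)
conn_str = vault.get_secret("storage-connection-string").value

blob_service = BlobServiceClient.from_connection_string(conn_str)

# List containers as a simple health check a support engineer might script.
for container in blob_service.list_containers():
    print(container.name)
```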

Posted 18 hours ago

Apply

2.0 years

13 - 15 Lacs

Pune, Maharashtra, India

Remote

About The Opportunity A high-velocity IT consulting firm in the Cloud Infrastructure Support sector seeks a hands-on engineer to maintain and optimize enterprise Azure estates. You’ll work in a hybrid model from India, partnering with global teams to ensure uptime, drive automation, and deliver top-tier support for mission-critical applications. Role & Responsibilities Serve as primary point of contact for Azure platform incidents, managing ticket triage, escalation, and resolution. Perform root-cause analysis on infrastructure failures and implement preventative measures to reduce recurrence. Develop and maintain Python scripts and ARM templates for automation of routine tasks, deployments, and configuration management. Collaborate with DevOps teams to design, test, and optimize Azure DevOps pipelines for CI/CD workflows. Monitor system health using Azure Monitor, Log Analytics, and custom dashboards; respond to alerts and capacity thresholds. Document processes, runbooks, and knowledge-base articles to streamline support operations and enable self-service. Skills & Qualifications Must-Have 2+ years’ experience supporting Azure Cloud Services (VMs, App Services, Functions, Storage). Strong Python scripting skills for automation, data parsing, and API integration. Hands-on expertise with ARM templates, Azure CLI, and REST API for resource provisioning. Proven track record in troubleshooting infrastructure incidents and performing RCA in complex environments. Familiarity with Azure DevOps, CI/CD pipeline creation, and version control (Git). Preferred Experience with monitoring and alerting tools such as Azure Monitor, Log Analytics, and Application Insights. Knowledge of Linux/Windows server administration and SQL/NoSQL database support. Benefits & Culture Highlights Hybrid work model with flexible hours and remote days to promote work–life balance. Competitive compensation, training allowances, and certification support for Azure and Python. Collaborative culture with regular hackathons, knowledge-sharing sessions, and global team events. Skills: itil,storage architecture,networking concepts,app services,adf,vector db,shell scripting,azure active directory,command-line interfaces,sql,key vault,cli,python,azure cli,rag,pyspark,sql databases,virtual networks,azure python sdk,databricks,azure monitor,bash,data engineering,cosmos db,adb,storage accounts,powerbi,aks clusters,bash scripting,llm,powershell,aks,logic app,azure,python middleware

Posted 18 hours ago

Apply

4.0 years

0 Lacs

Gurugram, Haryana, India

On-site

Senior Data Engineer: Incedo is a US-based consulting, data science and technology services firm with over 2,000 people helping clients from our six offices across the US and India. We help our clients achieve competitive advantage through end-to-end digital transformation. Our uniqueness lies in bringing together strong engineering, data science, and design capabilities coupled with deep domain understanding. We combine services and products to maximize business impact for our clients in the telecom, financial services, product engineering, and life science & healthcare industries. Working at Incedo will provide you an opportunity to work with industry-leading client organizations, deep technology and domain experts, and global teams. Incedo University, our learning platform, provides ample learning opportunities, starting with a structured onboarding program and continuing throughout the various stages of your career. A variety of fun activities are also an integral part of our friendly work environment. Our flexible career paths allow you to grow into a program manager, a technical architect, or a domain expert based on your skills and interests.

Location: Gurugram & Pune
Experience: 4 to 7 years
Notice Period: Immediate, serving notice, or 30 days official notice only

Role Description: We are seeking a skilled professional to maintain and support batch jobs in a legacy environment. The role involves managing and monitoring ETL processes, addressing issues, and enhancing existing PL/SQL scripts. The ideal candidate will have strong expertise in Informatica, SQL Server, and data warehousing concepts, along with experience in troubleshooting and improving batch job performance.

Key Responsibilities:
  • Design and implement robust ETL pipelines using AWS Glue, Redshift, Lambda, and S3.
  • Monitor and optimize the performance of data workflows and batch processing jobs.
  • Troubleshoot and resolve issues related to data pipeline failures, inconsistencies, and performance bottlenecks.
  • Collaborate with cross-functional teams to define data requirements and ensure data quality and accuracy.
  • Develop and maintain automated solutions for data transformation, migration, and integration tasks.
  • Implement best practices for data security, data governance, and compliance within AWS environments.
  • Continuously improve and optimize AWS Glue jobs, Lambda functions, and S3 storage management.
  • Maintain comprehensive documentation for data pipeline architecture, job schedules, and issue resolution processes.

Required Skills and Experience:
  • Strong experience with data engineering practices.
  • Experience with AWS services, particularly AWS Redshift, Glue, Lambda, S3, and other AWS data tools.
  • Proficiency in SQL, Python, PySpark, NumPy, etc., and experience working with large-scale data sets.
  • Experience in designing and implementing ETL pipelines in cloud environments.
  • Expertise in troubleshooting and optimizing data processing workflows.
  • Familiarity with data warehousing concepts and cloud-native data architecture.
  • Knowledge of automation and orchestration tools in a cloud-based environment.
  • Strong problem-solving skills and the ability to debug and improve the performance of data jobs.
  • Excellent communication skills and the ability to work collaboratively with cross-functional teams.
  • Good to have: knowledge of DBT & Snowflake.

Preferred Qualifications:
  • Bachelor's degree in Computer Science, Information Technology, Data Engineering, or a related field.
  • Experience with other AWS data services like Redshift, Athena, or Kinesis.
  • Familiarity with Python or other scripting languages for data engineering tasks.
  • Experience with containerization and orchestration tools like Docker or Kubernetes.

We are an Equal Opportunity Employer: We value diversity at Incedo. We do not discriminate based on race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.
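For context on the kind of work these responsibilities describe, below is a minimal sketch of an AWS Glue PySpark job that reads raw CSV from S3, applies a basic cleansing step, and writes curated Parquet back to S3. The bucket paths and column names are illustrative placeholders, not details taken from the posting.

```python
import sys

from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from pyspark.sql import functions as F

# Resolve the job name passed in by the Glue runtime.
args = getResolvedOptions(sys.argv, ["JOB_NAME"])

sc = SparkContext()
glue_context = GlueContext(sc)
spark = glue_context.spark_session
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read raw CSV files from S3 (bucket and path are placeholders).
orders = spark.read.option("header", "true").csv("s3://example-raw-bucket/orders/")

# Basic cleansing: drop rows without an order id and cast the amount column to a number.
clean = (
    orders.dropna(subset=["order_id"])
    .withColumn("amount", F.col("amount").cast("double"))
)

# Write curated Parquet back to S3 for a downstream Redshift COPY or Spectrum query.
clean.write.mode("overwrite").parquet("s3://example-curated-bucket/orders/")

job.commit()
```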

Posted 19 hours ago

Apply

0 years

0 Lacs

Haryana, India

On-site

Role Description: We are seeking a skilled professional to maintain and support batch jobs in a legacy environment. The role involves managing and monitoring ETL processes, addressing issues, and enhancing existing PL/SQL scripts. The ideal candidate will have strong expertise in Informatica, SQL Server, and data warehousing concepts, along with experience in troubleshooting and improving batch job performance. Key Responsibilities: Design and implement robust ETL pipelines using AWS Glue, Lambda, Redshift and S3 Monitor and optimize the performance of data workflows and batch processing jobs Troubleshoot and resolve issues related to data pipeline failures, inconsistencies, and performance bottlenecks Collaborate with cross-functional teams to define data requirements and ensure data quality and accuracy Develop and maintain automated solutions for data transformation, migration, and integration tasks Implement best practices for data security, data governance, and compliance within AWS environments Continuously improve and optimize AWS Glue jobs, Lambda functions, and S3 storage management Maintain comprehensive documentation for data pipeline architecture, job schedules, and issue resolution processes. Required Skills and Experience: Strong experience with Data Engineering practices Experience in AWS services, particularly AWS Glue, Lambda, S3, and other AWS data tools Proficiency in SQL, python , Pyspark, numpy etc and experience in working with large-scale data sets Experience in designing and implementing ETL pipelines in cloud environments Expertise in troubleshooting and optimizing data processing workflows Familiarity with data warehousing concepts and cloud-native data architecture Knowledge of automation and orchestration tools in a cloud-based environment Strong problem-solving skills and the ability to debug and improve the performance of data jobs Excellent communication skills and the ability to work collaboratively with cross-functional teams Good to have knowledge of DBT & Snowflake We are an Equal Opportunity Employer: We value diversity at Incedo. We do not discriminate based on race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.

Posted 19 hours ago

Apply

3.0 years

13 - 15 Lacs

Pune, Maharashtra, India

On-site

Support Analyst (AI and Data Applications). The job involves working in shifts, plus extended hours on weekends or India holidays on a need basis. Experience: 3-5+ years supporting Python middleware applications on Azure or data engineering applications; knowledge of ITIL is a must.

Technical Expertise:
  • Understanding of cloud computing concepts, specifically Microsoft Azure.
  • Experience with Azure services like AKS clusters, App Services, Storage Accounts, Virtual Networks, Azure Active Directory, and Azure Monitor.
  • Ability to support API-based applications hosted on Azure.
  • Familiarity with command-line interfaces (CLI) and scripting (PowerShell, Bash).
  • Experience with monitoring tools and performance analysis.
  • Proven experience with the Azure Python SDK and integrating Azure services (Storage, Key Vault, Logic App, Cosmos DB) into software applications.
  • Good experience supporting Python-based web applications.
  • Knowledge of LLMs, vector databases, and RAG architecture is a plus.
  • Basic understanding of networking concepts (TCP/IP, DNS, firewalls).
  • Ability to troubleshoot software applications and their dependencies.
  • Knowledge of supporting data applications (ADB, ADF, Databricks, PySpark) is a huge plus.
  • Knowledge of SQL is a must.
  • Understanding of Power BI reports is a plus.

Communication and Collaboration:
  • Effectively communicate complex technical concepts to both technical and non-technical audiences.
  • Excellent problem-solving and analytical skills.
  • Strong customer service orientation and empathy.
  • Ability to work independently and as part of a team.

Skills: ITIL, Python Middleware, Azure, Data Engineering, AKS clusters, App Services, Storage Accounts, Virtual Networks, Azure Active Directory, Azure Monitor, command-line interfaces, CLI, PowerShell, Bash, Bash Scripting, Shell Scripting, Azure Python SDK, SQL, PowerBI, LLM, RAG, Vector DB, Storage Architecture, Key Vault, Logic App, Cosmos DB, ADB, ADF, Databricks, PySpark
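As a rough illustration of the "Azure Python SDK" skills listed above, the sketch below uses DefaultAzureCredential to read a secret from Key Vault and list blobs in a Storage account. The vault URL, account URL, secret name, and container name are assumptions made for the example, not values from the posting.

```python
from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient
from azure.storage.blob import BlobServiceClient

# Authenticates via environment variables, a managed identity, or an Azure CLI login.
credential = DefaultAzureCredential()

# Fetch a secret from Key Vault (vault URL and secret name are placeholders).
secret_client = SecretClient(
    vault_url="https://example-vault.vault.azure.net", credential=credential
)
api_key = secret_client.get_secret("example-api-key").value

# List blobs in a storage container (account URL and container name are placeholders).
blob_service = BlobServiceClient(
    account_url="https://examplestorage.blob.core.windows.net", credential=credential
)
container = blob_service.get_container_client("raw-data")
for blob in container.list_blobs():
    print(blob.name)
```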

Posted 20 hours ago

Apply

3.0 - 4.0 years

0 Lacs

Hyderabad, Telangana, India

On-site

ML model implementation, experimentation, and related software engineering focused role. Experience as a data scientist, preferably with forecasting algorithms. Python and PySpark development experience is a must, with 3-4 years of experience.

Essential Functions: Machine learning model implementations.

Qualifications: Ability to learn fast and handle multiple priorities.

We Offer:
  • Opportunity to work on bleeding-edge projects
  • Work with a highly motivated and dedicated team
  • Competitive salary
  • Flexible schedule
  • Benefits package: medical insurance, sports
  • Corporate social events
  • Professional development opportunities
  • Well-equipped office

About Us: Grid Dynamics (NASDAQ: GDYN) is a leading provider of technology consulting, platform and product engineering, AI, and advanced analytics services. Fusing technical vision with business acumen, we solve the most pressing technical challenges and enable positive business outcomes for enterprise companies undergoing business transformation. A key differentiator for Grid Dynamics is our 8 years of experience and leadership in enterprise AI, supported by profound expertise and ongoing investment in data, analytics, cloud & DevOps, application modernization, and customer experience. Founded in 2006, Grid Dynamics is headquartered in Silicon Valley with offices across the Americas, Europe, and India.
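Since the role centres on forecasting work in Python and PySpark, here is a minimal, hedged sketch of fitting a linear regression with pyspark.ml on lagged features. The column names and toy values are invented for illustration; a real forecasting pipeline would use a proper time-series approach and far more data.

```python
from pyspark.sql import SparkSession
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.regression import LinearRegression

spark = SparkSession.builder.appName("forecast-sketch").getOrCreate()

# Toy weekly sales data with lagged demand as features (values are illustrative).
df = spark.createDataFrame(
    [
        (1, 100.0, 95.0, 110.0),
        (2, 110.0, 100.0, 120.0),
        (3, 120.0, 110.0, 128.0),
        (4, 128.0, 120.0, 135.0),
    ],
    ["week", "lag_1", "lag_2", "sales"],
)

# Assemble the lag columns into the single feature vector expected by MLlib.
assembler = VectorAssembler(inputCols=["lag_1", "lag_2"], outputCol="features")
train = assembler.transform(df)

# Fit a simple linear model as a stand-in for a real forecasting algorithm.
model = LinearRegression(featuresCol="features", labelCol="sales").fit(train)
model.transform(train).select("week", "sales", "prediction").show()
```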

Posted 20 hours ago

Apply

0.0 - 7.0 years

0 Lacs

Gurugram, Haryana

On-site

Gurugram, Haryana, India; Bangalore, Karnataka, India
Qualification / Skills: Bigdata, Pyspark, Hive, Spark Optimization. Good to have: GCP
Skills Required: Bigdata, Spark, Pyspark, Optimization, Python, Java
Role / Skills: Bigdata, Pyspark, Hive, Spark Optimization. Good to have: GCP
Experience: 4 to 7 years
Job Reference Number: 13159

Posted 22 hours ago

Apply

2.0 years

0 Lacs

Delhi, India

Remote

Job Title: Data Engineer
Company: Enablemining
Location: Remote
Employment Type: Full-time
Seniority Level: Mid-Level
Experience: Minimum 2 years
Education: BE/BTech or MCA

About Us: Enablemining is a global mining consultancy headquartered in Australia. We specialize in strategy, mine planning, and technical evaluations for coal and metalliferous mines. Our work is grounded in structured problem-solving and innovation, helping clients maximize the value of their mining assets.

About the Role: We are looking for a skilled Data Engineer to join our data and analytics team. You'll be responsible for building and optimizing data pipelines, transforming raw datasets into usable formats, and enabling insight through interactive reporting solutions. You will work across modern tools such as PySpark, Python, SQL, Power BI, and DAX, and collaborate closely with business teams to create scalable, impactful data systems.

Key Responsibilities:
  • Design and maintain data pipelines using PySpark and SQL
  • Develop efficient ETL workflows and automate data ingestion
  • Support data transformation and analytics with Python
  • Create and manage interactive dashboards using Power BI and DAX
  • Integrate and manage data from Databricks and other platforms
  • Ensure accuracy, performance, and scalability of data solutions
  • Work with stakeholders to understand and deliver on reporting needs

Required Skills & Experience:
  • Minimum 2 years of experience as a Data Engineer or in a related role
  • Proficiency in PySpark, SQL, and Python
  • Experience in Power BI, with strong skills in DAX
  • Familiarity with Databricks or other data lakehouse platforms
  • Strong analytical, problem-solving, and communication skills

Posted 1 day ago

Apply

3.0 - 5.0 years

0 Lacs

Greater Kolkata Area

Remote

Key Responsibilities:
  • Build data pipelines for consumption by the data science team.
  • Demonstrate a clear understanding of, and experience with, Python and PySpark.
  • Write Python programs and SQL queries, including SQL query tuning.
  • Build and maintain data pipelines in PySpark with SQL and Python.
  • Knowledge of cloud (Azure/AWS) technologies is an additional advantage.
  • Suggest and implement best practices in data integration.
  • Split the planned deliverables into tasks and assign them to the team.
  • Good oral, written, and presentation skills.

Skills & Experience Required:
  • Degree in Computer Science, IT, or a similar field; a master's is a plus.
  • 3-5 years of hands-on experience in Python, PySpark, and SQL.
  • Great numerical and analytical skills.
  • Proven expertise in building and optimizing data pipelines using Spark and Hadoop ecosystem tools.
  • Experience working on cloud platforms like Azure or AWS.
  • Able to collaborate and coordinate in a remote environment.
  • A proactive problem solver who tackles the challenges that come their way.

(ref:hirist.tech)

Posted 1 day ago

Apply

8.0 - 15.0 years

0 Lacs

Chennai, Tamil Nadu

On-site

As the Control Monitoring & Testing Automation - Services, Markets & Banking Lead (C14) at Citi, your role is critical in leading a team of Automation Analysts to transition from manual to automated testing of controls across the Services, Markets, and Banking businesses. Your responsibilities include providing thought leadership in the technical design and development of control testing automation solutions, overseeing the automation request lifecycle, and identifying control and testing gaps to mitigate risks such as Regulatory/AML/KYC/Sanctions/Anti-Bribery/Sales Practice/Reputational/Fraud and Theft.

You will be responsible for end-to-end delivery of control monitoring & testing automation tools, acting as a technical subject matter expert in tools and capabilities such as SAS/SQL/Python development and low-code automation tools, and overseeing the automation pipeline. Your role also involves developing data analytics strategies to support business objectives, staying updated on industry trends, liaising with stakeholders to identify automation opportunities, and leading a team of Managers and SMEs in designing, developing, and testing automated solutions.

To qualify for this role, you should have 15+ years of relevant experience in data analytics and/or automation, with 8+ years of direct management experience. You should be proficient in software development and automation tools such as SAS, SQL, Python, PySpark, Hive, and Alteryx, and possess in-depth knowledge of the Services, Markets, Banking & Client domain. Effective communication, relationship management, analytical, and continuous improvement skills are essential, along with the ability to work under pressure, manage deadlines, and exhibit problem-solving and decision-making skills. A Master's degree in Information Systems/Technology, Statistics, Mathematics, Computer Science, Engineering, or a related quantitative field from a premier institute is required for this position.

Citi is an equal opportunity employer, and as the Control Monitoring & Testing Automation - Services, Markets & Banking Lead (C14), you will play a crucial role in driving the advancement of control testing automation solutions through data-based techniques and emerging technologies.

Posted 1 day ago

Apply

6.0 years

0 Lacs

Kolkata, West Bengal, India

On-site

At EY, you'll have the chance to build a career as unique as you are, with the global scale, support, inclusive culture and technology to become the best version of you. And we're counting on your unique voice and perspective to help EY become even better, too. Join us and build an exceptional experience for yourself, and a better working world for all.

Position: Senior - Data Engineer

Role Description:
  • Leads the delivery of solution or infrastructure development services for a large or more complex project, using advanced technical capabilities.
  • Takes accountability for the design, development, delivery, and maintenance of solutions or infrastructure, driving compliance with, and contributing to the development of, relevant standards.
  • Fully understands business and user requirements and ensures design specifications meet the requirements from a business and technical perspective.

Responsibilities:
  • Design and develop data ingestion and transformation components to support business requirements.
  • Process heterogeneous source data formats (Excel, CSV, TXT, PDF, database, web) and perform EDA (exploratory data analysis), handle outliers, and carry out data cleansing and data transformation.
  • Be highly data driven and able to write complex data transformation programs using PySpark, Databricks, and Python.
  • Experience in data integration and data processing using Spark and Python.
  • Hands-on experience applying performance tuning techniques to resolve performance issues in data pipelines.
  • Provide advanced technical expertise to maximize efficiency, reliability, and value from current solutions, infrastructure, and emerging technologies, showing technical leadership and identifying and implementing continuous improvement plans.
  • Work closely with the development lead and build components in an agile methodology.
  • Develop strong working relationships with peers across Development & Engineering and Architecture teams, collaborating to develop and engineer leading solutions.

Skills requirement:
  • 4-6 years of hands-on experience in big data technologies and distributed systems computing.
  • Strong experience in Azure Databricks/Python/Spark/PySpark.
  • Strong experience with SQL, RESTful APIs, and JSON.
  • Experience with Azure cloud resources is preferable.
  • Experience with Angular would be nice to have.
  • Exposure to any NoSQL databases (MongoDB, Cosmos DB, etc.) is a plus.

EY | Building a better working world: EY exists to build a better working world, helping to create long-term value for clients, people and society and build trust in the capital markets. Enabled by data and technology, diverse EY teams in over 150 countries provide trust through assurance and help clients grow, transform and operate. Working across assurance, consulting, law, strategy, tax and transactions, EY teams ask better questions to find new answers for the complex issues facing our world today.
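To make the cleansing and outlier-handling duties above concrete, the sketch below shows one common PySpark pattern: trimming strings, dropping duplicates, and filtering outliers with approxQuantile. The file paths, column names, and percentile thresholds are assumptions made for illustration, not details of EY's actual pipelines.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("cleansing-sketch").getOrCreate()

# Read one heterogeneous source; CSV is shown here, but the same flow applies to JSON, Parquet, or JDBC.
raw = (
    spark.read.option("header", "true")
    .option("inferSchema", "true")
    .csv("/data/transactions.csv")
)

# Basic cleansing: normalise strings, drop exact duplicates, remove rows missing key fields.
clean = (
    raw.withColumn("customer_id", F.trim(F.col("customer_id")))
    .dropDuplicates(["transaction_id"])
    .dropna(subset=["transaction_id", "amount"])
)

# Simple outlier handling: keep amounts within the 1st-99th percentile band.
low, high = clean.approxQuantile("amount", [0.01, 0.99], 0.0)
filtered = clean.filter((F.col("amount") >= low) & (F.col("amount") <= high))

filtered.write.mode("overwrite").parquet("/data/curated/transactions")
```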

Posted 1 day ago

Apply

Exploring PySpark Jobs in India

PySpark, a powerful data processing framework built on top of Apache Spark and Python, is in high demand in the job market in India. With the increasing need for big data processing and analysis, companies are actively seeking professionals with PySpark skills to join their teams. If you are a job seeker looking to excel in the field of big data and analytics, exploring PySpark jobs in India could be a great career move.

Top Hiring Locations in India

Here are 5 major cities in India where companies are actively hiring for PySpark roles:
  1. Bangalore
  2. Pune
  3. Hyderabad
  4. Mumbai
  5. Delhi

Average Salary Range

The estimated salary range for PySpark professionals in India varies based on experience levels. Entry-level positions can expect to earn around INR 6-8 lakhs per annum, while experienced professionals can earn upwards of INR 15 lakhs per annum.

Career Path

In the field of PySpark, a typical career progression may look like this:
  1. Junior Developer
  2. Data Engineer
  3. Senior Developer
  4. Tech Lead
  5. Data Architect

Related Skills

In addition to PySpark, professionals in this field are often expected to have or develop skills in:
  • Python programming
  • Apache Spark
  • Big data technologies (Hadoop, Hive, etc.)
  • SQL
  • Data visualization tools (Tableau, Power BI)

Interview Questions

Here are 25 interview questions you may encounter when applying for PySpark roles; a short code sketch illustrating a few of these topics follows the list:

  • Explain what PySpark is and its main features (basic)
  • What are the advantages of using PySpark over other big data processing frameworks? (medium)
  • How do you handle missing or null values in PySpark? (medium)
  • What is RDD in PySpark? (basic)
  • What is a DataFrame in PySpark and how is it different from an RDD? (medium)
  • How can you optimize performance in PySpark jobs? (advanced)
  • Explain the difference between map and flatMap transformations in PySpark (basic)
  • What is the role of a SparkContext in PySpark? (basic)
  • How do you handle schema inference in PySpark? (medium)
  • What is a SparkSession in PySpark? (basic)
  • How do you join DataFrames in PySpark? (medium)
  • Explain the concept of partitioning in PySpark (medium)
  • What is a UDF in PySpark? (medium)
  • How do you cache DataFrames in PySpark for optimization? (medium)
  • Explain the concept of lazy evaluation in PySpark (medium)
  • How do you handle skewed data in PySpark? (advanced)
  • What is checkpointing in PySpark and how does it help in fault tolerance? (advanced)
  • How do you tune the performance of a PySpark application? (advanced)
  • Explain the use of Accumulators in PySpark (advanced)
  • How do you handle broadcast variables in PySpark? (advanced)
  • What are the different data sources supported by PySpark? (medium)
  • How can you run PySpark on a cluster? (medium)
  • What is the purpose of the PySpark MLlib library? (medium)
  • How do you handle serialization and deserialization in PySpark? (advanced)
  • What are the best practices for deploying PySpark applications in production? (advanced)
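
A few of these topics lend themselves to a quick code illustration. The sketch below touches on null handling, a broadcast join, a UDF, and caching in one small script; the data and column names are invented for the example and are not tied to any listing above.

```python
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StringType

spark = SparkSession.builder.appName("interview-sketch").getOrCreate()

orders = spark.createDataFrame(
    [(1, "IN", 250.0), (2, None, 100.0), (3, "US", None)],
    ["order_id", "country", "amount"],
)
countries = spark.createDataFrame(
    [("IN", "India"), ("US", "United States")], ["code", "name"]
)

# Null handling: fill missing amounts with 0 and drop rows without a country.
filled = orders.fillna({"amount": 0.0}).dropna(subset=["country"])

# Broadcast join: hint that the small dimension table should be sent to every executor.
joined = filled.join(F.broadcast(countries), filled.country == countries.code, "left")

# UDF: a plain Python function wrapped for use on DataFrame columns.
label_size = F.udf(lambda amt: "large" if amt >= 200 else "small", StringType())
labelled = joined.withColumn("size", label_size(F.col("amount")))

# Caching: keep the result in memory because two separate actions reuse it below.
labelled.cache()
labelled.show()
print(labelled.count())
```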

Closing Remark

As you explore PySpark jobs in India, remember to prepare thoroughly for interviews and showcase your expertise confidently. With the right skills and knowledge, you can excel in this field and advance your career in the world of big data and analytics. Good luck!


Start Your Job Search Today

Browse through a variety of job opportunities tailored to your skills and preferences. Filter by location, experience, salary, and more to find your perfect fit.

Job Application AI Bot


Apply to 20+ Portals in one click

Download Now

Download the Mobile App

Instantly access job listings, apply easily, and track applications.

Featured Companies