Posted: 3 weeks ago
On-site
Full Time
Main Purpose:
▪ Collaborate with data scientists and business stakeholders to design, develop, and maintain efficient data pipelines feeding the organization's data lake.
▪ Maintain the integrity and quality of the data lake, enabling accurate, actionable insights for data scientists and informed decision-making for business stakeholders.
▪ Apply extensive knowledge of data engineering and cloud technologies to enhance the organization's data infrastructure, promoting a culture of data-driven decision-making.
▪ Define and optimize data pipelines using advanced data engineering concepts to improve the efficiency and accessibility of data storage.
▪ Own the development of an extensive data catalog, ensuring robust data governance and facilitating effective data access and utilization across the organization.

Knowledge, Skills and Abilities

Key Responsibilities:
▪ Contribute to the development of scalable, performant data pipelines on Databricks, leveraging Delta Lake, Delta Live Tables (DLT), and other core Databricks components.
▪ Develop data lakes and warehouses designed for optimized storage, querying, and real-time updates using Delta Lake.
▪ Implement effective data ingestion strategies from a variety of sources (streaming, batch, and API-based), ensuring seamless integration with Databricks.
▪ Ensure the integrity, security, quality, and governance of data across our Databricks-centric platforms.
▪ Collaborate with stakeholders (data scientists, analysts, product teams) to translate business requirements into Databricks-native data solutions.
▪ Build and maintain ETL/ELT processes, heavily utilizing Databricks, Spark (Scala or Python), SQL, and Delta Lake for transformations.
▪ Apply CI/CD and DevOps practices tailored to the Databricks environment.
▪ Monitor and optimize the cost-efficiency of data operations on Databricks, ensuring optimal resource utilization.
▪ Utilize a range of Databricks tools, including the Databricks CLI and REST API, alongside Apache Spark™, to develop, manage, and optimize data engineering solutions.

Work Experience:
▪ 5 years of overall experience, with at least 3 years of relevant experience.
▪ 3 years of experience working with Azure (or another cloud platform) and Databricks.

Skills:
▪ Proficiency in Spark, Delta Lake, Structured Streaming, and other Azure Databricks functionalities for sophisticated data pipeline construction.
▪ Strong capability in diagnosing and optimizing Spark applications and Databricks workloads, including strategic cluster sizing and configuration.
▪ Expertise in data-sharing solutions that leverage the Azure Databricks ecosystem for enhanced data management and processing efficiency.
▪ Profound knowledge of data governance and data security, coupled with an understanding of large-scale distributed systems and cloud architecture design.
▪ Experience with a variety of data sources and BI tools.

Key Relationships and Department Overview:
▪ Internal: Data Engineering Manager; developers across various departments; managers of departments in other regional hubs of Puma Energy.
▪ External: Platform providers.
Puma Energy
Mumbai Metropolitan Region
Salary: Not disclosed