Job Description
Role: Senior Data Engineer – Databricks
Experience: 5+ years
Job Type: Contract
Contract Duration: 6 months
Budget: 1.0 lakh per month
Location: Remote

JOB DESCRIPTION:
We are looking for a dynamic and experienced Senior Data Engineer – Databricks to design, build, and optimize robust data pipelines on the Databricks Lakehouse platform. The ideal candidate has strong hands-on skills in Apache Spark, PySpark, and cloud data services, along with a good grasp of Python and Java. This role involves close collaboration with architects, analysts, and developers to deliver scalable, high-performing data solutions across AWS, Azure, and GCP.

ESSENTIAL JOB FUNCTIONS

1. Data Pipeline Development
• Build scalable and efficient ETL/ELT workflows using Databricks and Spark for both batch and streaming data.
• Leverage Delta Lake and Unity Catalog for structured data management and governance.
• Optimize Spark jobs by tuning configurations, caching, partitioning, and serialization.

2. Cloud-Based Implementation
• Develop and deploy data workflows on AWS (S3, EMR, Glue), Azure (ADLS, ADF, Synapse), and/or GCP (GCS, Dataflow, BigQuery).
• Manage and optimize data storage, access control, and pipeline orchestration using native cloud tools.
• Use tools such as Databricks Auto Loader and SQL warehouses for efficient data ingestion and querying.

3. Programming & Automation
• Write clean, reusable, production-grade code in Python and Java.
• Automate workflows using orchestration tools (e.g., Airflow, ADF, or Cloud Composer).
• Implement robust testing, logging, and monitoring for data pipelines.

4. Collaboration & Support
• Collaborate with data analysts, data scientists, and business users to meet evolving data needs.
• Support production workflows, troubleshoot failures, and resolve performance bottlenecks.
• Document solutions, maintain version control, and follow Agile/Scrum processes.

Required Skills

Technical Skills:
• Databricks: Hands-on experience with notebooks, cluster management, Delta Lake, Unity Catalog, and job orchestration.
• Spark: Expertise in Spark transformations, joins, window functions, and performance tuning.
• Programming: Strong in PySpark and Java, with experience in data validation and error handling.
• Cloud Services: Good understanding of AWS, Azure, or GCP data services and security models.
• DevOps/Tools: Familiarity with Git, CI/CD, Docker (preferred), and data monitoring tools.

Experience:
• 5–8 years of data engineering or backend development experience.
• Minimum 1–2 years of hands-on work in Databricks with Spark.
• Exposure to large-scale data migration, processing, or analytics projects.

Certifications (nice to have):
• Databricks Certified Data Engineer Associate

Working Conditions
Hours of work - Full-time hours; remote work flexibility, with availability required during US business hours.
Overtime expectations - Overtime is generally not required, provided commitments are met.
Work environment - Primarily remote; occasional on-site work may be needed during client visits.
Travel requirements - No travel required.
On-call responsibilities - On-call duties during deployment phases.
Special conditions or requirements - Not applicable.

Workplace Policies and Agreements
Confidentiality Agreement: Required to safeguard sensitive client data.
Non-Compete Agreement: Must be signed to protect proprietary models.
Non-Disclosure Agreement: Must be signed to ensure client confidentiality and security.