Senior Data Engineer (Databricks, PySpark, SQL, Cloud Data Platforms, Data Pipelines)
Job Summary
Synechron is seeking a highly skilled and experienced Data Engineer to join our innovative analytics team in Bangalore. The primary purpose of this role is to design, develop, and maintain scalable data pipelines and architectures that empower data-driven decision-making and advanced analytics initiatives. As a critical contributor within our data ecosystem, you will enable the organization to harness large, complex datasets efficiently, supporting strategic business objectives and ensuring high standards of data quality, security, and performance. Your expertise will directly contribute to building robust, efficient, and secure data solutions that drive business value across multiple domains.

Software
Required Software & Tools:
- Databricks Platform (hands-on experience with Databricks notebooks, clusters, and workflows)
- PySpark (proficient in developing and optimizing Spark jobs)
- SQL (advanced proficiency in writing and optimizing complex SQL queries)
- Data orchestration tools such as Apache Airflow or similar (experience in scheduling and managing data workflows; see the orchestration sketch at the end of this posting)
- Cloud data platforms (experience with cloud environments such as AWS, Azure, or Google Cloud)
- Data warehousing solutions (Snowflake highly preferred)

Preferred Software & Tools:
- Kafka or other streaming frameworks (e.g., Confluent, MQTT)
- CI/CD tools for data pipelines (e.g., Jenkins, GitLab CI)
- DevOps practices for data workflows

Programming Languages
- Python (expert level); familiarity with other languages such as Java or Scala is advantageous

Overall Responsibilities
- Architect, develop, and maintain scalable, resilient data pipelines and architectures supporting business analytics, reporting, and data science use cases.
- Collaborate closely with data scientists, analysts, and cross-functional teams to gather requirements and deliver optimized data solutions aligned with organizational goals.
- Ensure data quality, consistency, and security across all data workflows, adhering to best practices and compliance standards.
- Optimize data processes for enhanced performance, reliability, and cost efficiency.
- Integrate data from multiple sources, including cloud data services and streaming platforms, ensuring seamless data flow and transformation.
- Lead performance tuning and troubleshooting of data pipelines to resolve bottlenecks and improve throughput.
- Stay up to date with emerging data engineering technologies and contribute to continuous improvement initiatives within the team.

Technical Skills (By Category)

Programming Languages:
- Essential: Python, SQL
- Preferred: Scala, Java

Databases/Data Management:
- Essential: Data modeling, ETL/ELT processes, data warehousing (Snowflake experience highly preferred)
- Preferred: NoSQL databases, Hadoop ecosystem

Cloud Technologies:
- Essential: Experience with cloud data services (AWS, Azure, GCP) and deployment of data pipelines in cloud environments
- Preferred: Cloud-native data tools and architecture design

Frameworks and Libraries:
- Essential: PySpark, Spark SQL, Kafka, Airflow
- Preferred: Streaming frameworks, TensorFlow (for data preparation)

Development Tools and Methodologies:
- Essential: Version control (Git), CI/CD pipelines, Agile methodologies
- Preferred: DevOps practices in data engineering, containerization (Docker, Kubernetes)

Security Protocols:
- Familiarity with data security, encryption standards, and compliance best practices

Experience
- Minimum of 8 years of professional experience in data engineering or related roles
- Proven track record of designing and deploying large-scale data pipelines using Databricks, PySpark, and SQL
- Practical experience in data modeling, data warehousing, and ETL/ELT workflows
- Experience working with cloud data platforms and streaming data frameworks such as Kafka or equivalent
- Demonstrated ability to work with cross-functional teams, translating business needs into technical solutions
- Experience with data orchestration and automation tools is highly valued
- Prior experience implementing CI/CD pipelines or DevOps practices for data workflows (preferred)

Day-to-Day Activities
- Design, develop, and troubleshoot data pipelines for ingestion, transformation, and storage of large datasets (an illustrative sketch appears at the end of this posting)
- Collaborate with data scientists and analysts to understand data requirements and optimize existing pipelines
- Automate data workflows and improve pipeline efficiency through performance tuning and best practices
- Conduct data quality audits and ensure data security protocols are followed
- Manage and monitor data workflows, troubleshoot failures, and implement fixes proactively
- Contribute to documentation, code reviews, and knowledge sharing within the team
- Stay informed of evolving data engineering tools, techniques, and industry best practices, incorporating them into daily work

Qualifications
- Bachelor's or Master's degree in Computer Science, Information Technology, or a related field
- Relevant certifications such as Databricks Certified Data Engineer, AWS Certified Data Analytics, or equivalent (preferred)
- Continuous learning through courses, workshops, or industry conferences on data engineering and cloud technologies

Professional Competencies
- Strong analytical and problem-solving skills with a focus on scalable solutions
- Excellent communication skills for collaborating with technical and non-technical stakeholders
- Ability to prioritize tasks, manage time effectively, and deliver within tight deadlines
- Demonstrated leadership in guiding team members and driving project success
- Adaptability to evolving technology landscapes and innovative thinking
- Commitment to data privacy, security, and ethical handling of information
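Illustrative example (not a requirement of the role): a minimal PySpark sketch of the kind of ingest-transform-store pipeline described under Day-to-Day Activities, assuming a Databricks/Spark environment with Delta Lake available. All paths, column names, and the aggregation itself are hypothetical placeholders.

    # Minimal batch pipeline sketch: ingest raw events, clean and aggregate
    # them, and write the result as a partitioned Delta table.
    # Paths and column names are hypothetical placeholders.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("daily-events-pipeline").getOrCreate()

    # Ingest: read raw JSON events from cloud object storage.
    raw = spark.read.json("s3://example-bucket/raw/events/")

    # Transform: drop malformed rows, derive a date column, and aggregate.
    daily_counts = (
        raw.where(F.col("event_id").isNotNull())
           .withColumn("event_date", F.to_date("event_ts"))
           .groupBy("event_date", "event_type")
           .agg(F.count("*").alias("event_count"))
    )

    # Store: write partitioned Delta output for downstream analytics
    # (assumes Delta Lake is available, as it is on Databricks).
    (daily_counts.write
        .format("delta")
        .mode("overwrite")
        .partitionBy("event_date")
        .save("s3://example-bucket/curated/daily_event_counts/"))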
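And a matching orchestration sketch: how such a pipeline might be scheduled with Apache Airflow, one of the orchestration tools named above. The DAG id, schedule, and task bodies are hypothetical placeholders; the schedule argument assumes Airflow 2.4 or later (earlier 2.x releases use schedule_interval).

    # Minimal Airflow DAG sketch: run the pipeline daily, then run
    # data-quality checks only if the pipeline task succeeds.
    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def run_pipeline():
        # Placeholder: submit the Spark/Databricks job here.
        print("submitting pipeline job")

    def run_quality_checks():
        # Placeholder: validate row counts, null rates, and freshness.
        print("running data quality checks")

    with DAG(
        dag_id="daily_events_pipeline",
        start_date=datetime(2024, 1, 1),
        schedule="@daily",  # assumes Airflow 2.4+; earlier releases use schedule_interval
        catchup=False,
    ) as dag:
        ingest_transform = PythonOperator(
            task_id="ingest_transform", python_callable=run_pipeline
        )
        quality_checks = PythonOperator(
            task_id="quality_checks", python_callable=run_quality_checks
        )

        # Order the tasks: quality checks depend on the pipeline task.
        ingest_transform >> quality_checks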