We are looking for an Associate Data Engineer with strong expertise in building data pipelines to deliver scalable, high-performance data solutions. The ideal candidate will be responsible for developing, optimizing, and maintaining complex data pipelines, integration frameworks, and metadata-driven architectures that enable seamless data access and analytics. This role calls for a deep understanding of big data processing, distributed computing, data modeling, and governance frameworks to support self-service analytics, AI-driven insights, and enterprise-wide data management.
Roles & Responsibilities:
- Own the development of complex ETL/ELT data pipelines that process large-scale datasets
- Contribute to the design, development, and implementation of data pipelines, ETL/ELT processes, and data integration solutions
- Ensure data integrity, accuracy, and consistency through rigorous quality checks and monitoring (a minimal sketch follows this list)
- Explore and implement new tools and technologies to enhance the ETL platform and pipeline performance
- Proactively identify and implement opportunities to automate tasks and develop reusable frameworks
- Learn the biotech/pharma domain and build highly efficient data pipelines to migrate and deploy complex data across systems
- Work in an Agile and Scaled Agile (SAFe) environment, collaborating with cross-functional teams, product owners, and Scrum Masters to deliver incremental value
- Use JIRA, Confluence, and Agile DevOps tools to manage sprints, backlogs, and user stories
- Support continuous improvement, test automation, and DevOps practices in the data engineering lifecycle
- Collaborate and communicate effectively with product teams and other cross-functional partners to understand business requirements and translate them into technical solutions
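As a minimal sketch of the quality checks mentioned above, the following PySpark snippet validates a Delta table before downstream use. The path and column names (/data/curated/orders, order_id, amount) are hypothetical, and the checks are illustrative rather than a prescribed framework:

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("quality-checks").getOrCreate()

# Hypothetical input: a curated orders table stored as Delta.
orders = spark.read.format("delta").load("/data/curated/orders")

# Basic integrity checks: uniqueness, completeness, and value ranges.
total = orders.count()
duplicates = total - orders.dropDuplicates(["order_id"]).count()
null_amounts = orders.filter(F.col("amount").isNull()).count()
negative_amounts = orders.filter(F.col("amount") < 0).count()

# Fail fast so downstream consumers never see bad data.
assert duplicates == 0, f"{duplicates} duplicate order_id values found"
assert null_amounts == 0, f"{null_amounts} rows with null amount"
assert negative_amounts == 0, f"{negative_amounts} rows with negative amount"
```

In practice a check like this would run as a pipeline step, with failures surfaced through monitoring rather than a bare assert.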
What we expect of you:
Must-Have Skills:
- Experience in Data Engineering with a focus on Databricks, AWS, Python, SQL, and Scaled Agile methodologies
- Strong understanding of data processing and transformation using big data frameworks such as Databricks, Apache Spark, and Delta Lake, along with distributed computing concepts (see the sketch after this list)
- Strong, demonstrable understanding of AWS services
- Ability to quickly learn, adapt, and apply new technologies
- Strong problem-solving and analytical skills
- Excellent communication and teamwork skills
- Experience with Scaled Agile Framework (SAFe), Agile delivery, and DevOps practices
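To make the Databricks/Spark expectation concrete, here is a minimal PySpark ETL sketch of the kind of pipeline this role involves. All paths and column names (/data/raw/orders, order_id, amount, order_date) are hypothetical:

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders-etl").getOrCreate()

# Extract: read raw orders from a Delta table.
raw = spark.read.format("delta").load("/data/raw/orders")

# Transform: deduplicate on the business key and flag invalid rows.
clean = (
    raw.dropDuplicates(["order_id"])
       .withColumn("is_valid", F.col("amount") > 0)
)

# Load: write the curated result back as Delta, partitioned by date.
(clean.write.format("delta")
      .mode("overwrite")
      .partitionBy("order_date")
      .save("/data/curated/orders"))
```

On Databricks, the same logic would typically target managed tables (saveAsTable) and run as a scheduled job rather than a standalone script.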
Good-to-Have Skills:
- Data engineering experience in the biotechnology or pharmaceutical industry
- Exposure to APIs and full-stack development
- Experience with SQL/NoSQL databases and vector databases for large language model applications
- Experience with data modeling and performance tuning for both OLAP and OLTP databases
- Experience with software engineering best practices, including but not limited to version control (Git, Subversion, etc.), CI/CD (Jenkins, Maven, etc.), automated unit testing, and DevOps
Education and Professional Certifications:
- Bachelor's degree and 2 to 5+ years of experience in Computer Science, IT, or a related field
- OR
- Master's degree and 1 to 4+ years of experience in Computer Science, IT, or a related field
- AWS Certified Data Engineer preferred
- Databricks certification preferred
- Scaled Agile SAFe certification preferred
Soft Skills:
- Excellent analytical and troubleshooting skills
- Strong verbal and written communication skills
- Ability to work effectively with global, virtual teams
- High degree of initiative and self-motivation
- Ability to manage multiple priorities successfully
- Team-oriented, with a focus on achieving team goals
- Quick to learn, organized, and detail-oriented
- Strong presentation and public speaking skills