Job Summary
Synechron is seeking a motivated and experienced Big Data Engineer to design, develop, and implement scalable big data solutions. The ideal candidate will possess strong hands-on experience with Hadoop, Spark, and NoSQL databases, enabling the organization to ingest, process, and analyze vast data sets efficiently. This role contributes directly to the organization's data-driven initiatives by building reliable data pipelines and collaborating with cross-functional teams to deliver insights that support strategic decision-making and operational excellence.
Purpose:
To build and maintain optimized big data architectures that support real-time and batch data processing, enabling analytics, reporting, and machine learning efforts.
Value:
By ensuring high-performance and scalable data platforms, this role accelerates data insights, enhances business agility, and ensures data integrity and security.
Software Requirements
Required Skills:
- Deep expertise in Hadoop ecosystem components, including the Hadoop Distributed File System (HDFS), Spark (batch and streaming), and related tools.
- Practical experience with NoSQL databases such as Cassandra, MongoDB, and HBase.
- Experience with data ingestion tools like Spark Streaming and Apache Flume.
- Strong programming skills in Java, Scala, or Python.
- Familiarity with DevOps tools such as Git, Jenkins, Docker, and container orchestration with OpenShift or Kubernetes.
- Working knowledge of cloud platforms like AWS and Azure for deploying and managing data solutions.
Preferred Skills:
- Knowledge of additional data ingestion and processing tools.
- Experience with data cataloging or governance frameworks.
Overall Responsibilities
- Design, develop, and optimize large-scale data pipelines and data lakes using Spark, Hadoop, and related tools.
- Implement data ingestion, transformation, and storage solutions to meet business and analytic needs.
- Collaborate with data scientists, analysts, and cross-functional teams to translate requirements into technical architectures.
- Monitor daily data operations, troubleshoot issues, and improve system performance and scalability.
- Automate deployment and maintenance workflows utilizing DevOps practices and tools.
- Ensure data security, privacy, and compliance standards are upheld across all systems.
- Stay updated with emerging big data technologies to incorporate innovative solutions.
Strategic objectives:
- Enable scalable, reliable, and efficient data processing platforms to support analytics and AI initiatives.
- Improve data quality, accessibility, and timeliness for organizational decision-making.
- Drive automation and continuous improvement in data infrastructure.
Performance outcomes:
- High reliability and performance of data pipelines with minimal downtime.
- Increased data ingestion and processing efficiency.
- Strong collaboration across teams leading to successful project outcomes.
Technical Skills (By Category)
Programming Languages:
- Essential: Java, Scala, or Python for developing data pipelines and processing scripts.
- Preferred: Knowledge of additional languages such as R or SQL scripting for data manipulation.
Databases & Data Management:
- Experience with Hadoop HDFS, HBase, Cassandra, MongoDB, and similar NoSQL data stores.
- Familiarity with data modeling, ETL workflows, and data warehousing strategies.
Cloud Technologies:
- Practical experience deploying and managing big data solutions on AWS (e.g., EMR, S3) and Azure.
- Knowledge of cloud security practices and resource management.
Frameworks & Libraries:
- Extensive use of Hadoop, Spark (structured and streaming), and related libraries.
- Familiarity with serialization formats like Parquet, Avro, or ORC.
Development Tools & Methodologies:
- Proficiency with Git, Jenkins, Docker, and OpenShift/Kubernetes for version control, CI/CD, and containerization.
- Experience working within Agile/Scrum environments.
Security & Data Governance:
- Comprehension of data security protocols, access controls, and compliance regulations.
Experience Requirements
- 4 to 7 years of hands-on experience in Big Data engineering or related roles.
- Demonstrable experience designing and maintaining large-scale data pipelines, data lakes, and data warehouses.
- Proven ability to use Spark, Hadoop, and NoSQL databases effectively in production environments.
- Prior experience in financial services, healthcare, retail, or telecommunications sectors is a plus.
- Ability to lead technical initiatives and collaborate with multidisciplinary teams.
Day-to-Day Activities
- Develop and optimize data ingestion, processing, and storage workflows.
- Collaborate with data scientists and analysts to architect solutions aligned with business needs.
- Build, test, and deploy scalable data pipelines ensuring high performance and reliability.
- Monitor system health, diagnose issues, and implement improvements for data systems.
- Conduct code reviews and knowledge sharing sessions within the team.
- Participate in sprint planning, daily stand-ups, and project reviews to ensure timely delivery.
- Stay current with evolving big data tools and best practices.
Qualifications
- Bachelor's or Master's degree in Computer Science, Information Technology, or a related field.
- Relevant certifications in big data technologies or cloud platforms are a plus.
- Demonstrable experience leading end-to-end data pipeline solutions.
Professional Competencies
- Strong analytical, troubleshooting, and problem-solving skills.
- Effective communicator with the ability to explain complex concepts to diverse audiences.
- Ability to work collaboratively in a team-oriented environment.
- Adaptability to emerging technologies and shifting priorities.
- High level of organization and attention to detail.
- Drive for continuous learning and process improvement.