Position Overview:
We are seeking a skilled and passionate Data Engineer with deep expertise in Google Cloud Platform (GCP) and Google BigQuery. In this role, you will architect, build, and maintain the scalable data pipelines that are the foundation of our analytics and data science initiatives.
You will be responsible for the entire data lifecycle, from ingestion and processing to warehousing and serving. You will work on creating a reliable, efficient, and high-quality data ecosystem that empowers our data analysts, data scientists, and business leaders to make critical data-informed decisions.
ShyftLabs is a growing data product company founded in early 2020 that works primarily with Fortune 500 companies. We deliver digital solutions that help accelerate our clients' growth across a range of industries, with a focus on creating value through innovation.
Key Responsibilities:
- Data Architecture & Pipeline Development: Design, build, and maintain scalable and reliable batch and real-time ETL/ELT data pipelines using GCP services such as Dataflow, Cloud Functions, Pub/Sub, and Cloud Composer (see the pipeline sketch after this list).
- Data Warehousing: Develop and manage our central data warehouse in Google BigQuery. Implement data models, schemas, and table structures optimized for performance and scalability.
- Data Processing & Transformation: Write clean, efficient, and robust code (primarily in SQL and Python) to transform raw data into curated, analysis-ready datasets.
- Infrastructure Optimization & Scalability: Monitor, troubleshoot, and optimize our data infrastructure for performance, reliability, and cost-effectiveness. Implement BigQuery best practices, including partitioning, clustering, and materialized views (see the table-definition sketch after this list).
- Data Accessibility & BI Enablement: Build and maintain curated data models that serve as the "source of truth" for business intelligence and reporting, ensuring data is ready for consumption by BI tools such as Looker.
- Data Governance & Quality: Implement automated data quality checks, validation rules, and monitoring to ensure the accuracy and integrity of our data pipelines and warehouse (see the quality-check sketch after this list).
- Collaboration: Work closely with software engineers, data analysts, and data scientists to understand their data requirements and provide the necessary infrastructure and data products.
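For illustration only, here is a minimal sketch of the kind of Cloud Composer (Airflow) pipeline task this role would own, assuming a hypothetical project, dataset, and table; the actual pipelines, names, and scheduling are defined by the team.

```python
# Minimal sketch of a Cloud Composer (Airflow) DAG running a daily BigQuery
# transformation. All project, dataset, and table names are hypothetical.
from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.operators.bigquery import BigQueryInsertJobOperator

with DAG(
    dag_id="daily_orders_elt",            # hypothetical pipeline name
    schedule_interval="@daily",
    start_date=datetime(2024, 1, 1),
    catchup=False,
) as dag:
    # Transform raw orders into a curated, analysis-ready table.
    transform_orders = BigQueryInsertJobOperator(
        task_id="transform_orders",
        configuration={
            "query": {
                "query": """
                    SELECT order_id, customer_id, order_total, DATE(created_at) AS order_date
                    FROM `my-project.raw.orders`   -- hypothetical source table
                """,
                "useLegacySql": False,
                "destinationTable": {
                    "projectId": "my-project",     # hypothetical project
                    "datasetId": "curated",
                    "tableId": "orders",
                },
                "writeDisposition": "WRITE_TRUNCATE",
            }
        },
    )
```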
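As an illustration of the partitioning and clustering practices mentioned above, a minimal sketch using the google-cloud-bigquery Python client; the project, dataset, table, and column names are hypothetical.

```python
# Minimal sketch: create a partitioned, clustered BigQuery table via DDL so
# queries scan only the relevant partitions. All names are hypothetical.
from google.cloud import bigquery

client = bigquery.Client()

ddl = """
CREATE TABLE IF NOT EXISTS `my-project.analytics.events`  -- hypothetical table
PARTITION BY DATE(event_timestamp)    -- prune scans (and cost) by date
CLUSTER BY customer_id, event_type    -- co-locate rows that are filtered together
AS
SELECT * FROM `my-project.raw.events`                     -- hypothetical source
"""

client.query(ddl).result()  # run the DDL job and wait for completion
```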
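And as a sketch of the kind of automated data quality check described above, assuming a hypothetical curated orders table with an order_id key and an order_date partition column:

```python
# Minimal sketch of an automated data quality check: fail loudly if yesterday's
# partition of a curated table contains null primary keys. Names are hypothetical.
from google.cloud import bigquery

def check_no_null_order_ids(client: bigquery.Client) -> None:
    query = """
        SELECT COUNT(*) AS null_ids
        FROM `my-project.curated.orders`
        WHERE order_date = DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY)
          AND order_id IS NULL
    """
    row = next(iter(client.query(query).result()))
    if row.null_ids > 0:
        raise ValueError(f"Quality check failed: {row.null_ids} null order_id rows")

if __name__ == "__main__":
    check_no_null_order_ids(bigquery.Client())
```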
Qualifications:
- Experience: 3-5+ years of hands-on experience in a Data Engineering, Software Engineering, or similar role.
- Programming Skills: Strong proficiency in a programming language such as Python or Java for data processing and automation.
- Expert SQL Proficiency: Mastery of SQL for complex data manipulation, DDL/DML operations, and query optimization.
- Google BigQuery: Proven expertise in using BigQuery as a data warehouse, including data modeling, performance tuning, and cost management.
- GCP Data Services: Hands-on experience building data pipelines in the GCP ecosystem (e.g., Dataflow, Pub/Sub, Cloud Storage, Cloud Composer/Airflow).
- Data Pipeline Concepts: Deep understanding of ETL/ELT principles and data warehousing architectures (e.g., star schema, data lakes).
- Engineering Mindset: Strong problem-solving and troubleshooting skills with a focus on building scalable, maintainable, and automated systems.
- BI Tool Integration: Experience building data models that power BI tools such as Looker (knowledge of LookML is a strong plus), Tableau, or Power BI.
- Modern Data Stack Tools: Experience with tools such as dbt, Dataform, or Fivetran for data transformation and integration.
- Infrastructure as Code (IaC): Familiarity with tools such as Terraform or Deployment Manager for managing cloud infrastructure.
- Containerization: Knowledge of Docker and Kubernetes is a plus.
- Certifications: Google Cloud Professional Data Engineer certification is highly desirable.
- Version Control: Proficiency with Git for code management and CI/CD pipelines.
We are proud to offer a competitive salary and a strong insurance package, and we invest in our employees' growth through extensive learning and development resources.