As entrepreneurs ourselves, we understand the unique challenges startups face in managing rollercoaster growth. We’ve lived it. We know that even well-funded teams can lack the bandwidth to recruit, train, and integrate the operations staff needed to meet growing demand, and that even when the right employees are in place, many companies lack the crucial mid-management layer needed to drive employee performance and process improvements. Hugo was created with the high-growth startup in mind. We custom-build or augment existing operations teams for companies in scaling mode, freeing founders and senior management to focus on what matters most: growth.
Key Responsibilities:
- Data Pipeline Execution: Build and maintain reliable data pipelines that automate ingestion from both structured and unstructured sources. Leverage Python and SQL to ensure data flows are secure and traceable.
- Transformation Layer: Develop and manage transformation workflows using dbt, ensuring data models are modular, tested, and version-controlled via Git.
- Orchestration & Scheduling: Schedule data workflows using tools such as Airflow, Prefect, or GCP Workflows, ensuring automated and timely data delivery across our systems.
- Cloud Warehouse Support: Maintain the data warehouse environment (BigQuery), focusing on query performance, cost monitoring, and schema organization.
- Observability & Quality: Implement data validation tests and lineage tracking using frameworks like Elementary to ensure high levels of data integrity and trust.
- Infrastructure as Code (IaC): Assist in managing and deploying cloud resources (BigQuery datasets, IAM roles, GCS buckets) using Terraform to ensure a reproducible and documented environment.
- Version Control & CI/CD: Maintain the integrity of our codebase using GitHub. Ensure that every dbt change or Python script follows our CI/CD patterns (GitHub Actions) for automated testing and deployment.
- Analytics Support: Collaborate with BI analysts and product teams to provide clean, optimized data sets for reporting and internal tools.
- Operational Documentation: Maintain clear documentation (dbt docs/SOPs) for pipelines and models to support team-wide data discovery and "self-service" BI.
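The observability work described above can be illustrated with a minimal, framework-free sketch: a batch-level check for missing required fields and duplicate keys before data is loaded downstream. The field names and sample records here are hypothetical, not part of Hugo's actual stack, and in practice this role would express such checks as dbt or Elementary tests rather than hand-rolled Python.

```python
def validate_rows(rows, required_fields, key_field):
    """Run two basic data-quality checks on a batch of ingested records:
    every required field is populated, and the primary key is unique."""
    errors = []
    seen_keys = set()
    for i, row in enumerate(rows):
        # Completeness check: flag empty or missing required fields
        for field in required_fields:
            if row.get(field) in (None, ""):
                errors.append(f"row {i}: missing '{field}'")
        # Uniqueness check on the primary key
        key = row.get(key_field)
        if key in seen_keys:
            errors.append(f"row {i}: duplicate key {key!r}")
        seen_keys.add(key)
    return errors

batch = [
    {"order_id": 1, "amount": 19.99},
    {"order_id": 1, "amount": None},  # duplicate key and missing amount
]
print(validate_rows(batch, ["order_id", "amount"], "order_id"))
```

Returning a list of findings, rather than raising on the first failure, mirrors how dbt-style test frameworks report all failing rows in one run.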
Competencies: What Qualifications You’ll Need
- Proficiency in SQL and Python for data manipulation and automation.
- Practical experience with dbt for building and maintaining modular data models.
- Familiarity with Git/GitHub workflows and CI/CD principles for managing code.
- Hands-on experience with BigQuery (or similar cloud DWH like Snowflake).
- Practical experience with GCP (preferred) or AWS/Azure. Understanding of IAM permissions and cloud storage.
- Familiarity with Terraform or a strong desire to learn how to manage infrastructure through code rather than manual console clicks.
- Solid understanding of data modeling, including star schemas and how to structure data for efficient reporting.
- Ability to troubleshoot broken pipelines and optimize slow-running queries.
- Strong communication skills and a desire to work collaboratively under the guidance of the Lead Data Engineer.
- Proactive about catching data issues and suggesting improvements to existing workflows.
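The star-schema knowledge listed above refers to pairing a central fact table with descriptive dimension tables so reporting queries reduce to a join plus an aggregation. A minimal sketch, using SQLite in place of a cloud warehouse like BigQuery (the table and column names are purely illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Dimension table: one row per customer, holding descriptive attributes
cur.execute("CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, region TEXT)")
# Fact table: one row per order, keyed to the dimension
cur.execute("CREATE TABLE fct_orders (order_id INTEGER, customer_id INTEGER, amount REAL)")

cur.executemany("INSERT INTO dim_customer VALUES (?, ?)",
                [(1, "EMEA"), (2, "NA")])
cur.executemany("INSERT INTO fct_orders VALUES (?, ?, ?)",
                [(100, 1, 50.0), (101, 1, 25.0), (102, 2, 40.0)])

# Typical reporting query: aggregate the fact table by a dimension attribute
cur.execute("""
    SELECT d.region, SUM(f.amount) AS revenue
    FROM fct_orders f
    JOIN dim_customer d USING (customer_id)
    GROUP BY d.region
    ORDER BY d.region
""")
print(cur.fetchall())  # → [('EMEA', 75.0), ('NA', 40.0)]
```

Keeping measures in the fact table and attributes in dimensions is what lets BI tools and analysts slice metrics by any attribute without restructuring the data.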
Experiences
- 2 to 3 years of progressive experience in Data Engineering or Analytics Engineering.
- Prior experience in a scale-up, product-led, or data-centric organization.
- Familiarity with BI tools (Tableau, Power BI, Looker) is a plus, though not a core requirement.
- Proven track record of building and managing dbt models in a production environment.
- Experience with API-based ingestion (using Airbyte, dlt, and/or custom scripts) is a plus.
- Ability to work effectively with people at all levels in an organization.
- Excellent written and oral communication skills, including the ability to present to varied audiences and distill complex ideas into clear, effective key messages.
Method of Application
Sign up to view application details.