Data Engineer (Middle-Senior)
About this position
The Data Engineer (Middle-Senior) is responsible for designing and implementing data pipelines, self-service tools, and processes to ensure data quality and observability within the data platform.
Responsibilities
• Design and implement data pipelines based on data lake/data lakehouse concepts
• Design and implement self-service tools in the data platform (orchestration and data quality tooling)
• Design and implement data validation processes to improve data quality in data pipelines
• Monitor the data platform cluster (Kubernetes)
• Design and implement data observability processes to monitor data pipelines
• Design and implement data lineage processes to track data movement
• Design and implement data models to support analysts
• Coach other team members
Requirements
• 3+ years of experience designing and implementing data pipelines based on data lake/data lakehouse concepts
• Experience with Apache Spark
• Experience with Apache Airflow
• Ability to manage multiple projects
• Understanding of data quality/data validation concepts
• Strong programming skills (SQL, Python, Scala)
• Strong proficiency with Docker and Kubernetes
• Experience with coding standards and testing
• A growth mindset and a willingness to learn new things and share knowledge with others
• Understanding of data lineage concepts is a plus
• Understanding of data observability concepts is a plus
• Understanding of data mesh concepts is a plus
• Experience as a data analytics engineer is a plus