Data Engineer (Data Infra & Ops)
About this position
Responsibilities
• Design and evaluate optimal data pipeline architecture.
• Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, maintaining data pipeline monitoring, etc.
• Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and AWS/GCP Big Data technologies.
• Collaborate with data scientists, engineers, and stakeholders to ensure effective deployment and integration of machine learning models in the AWS cloud environment using related AWS services.
• Help data scientists troubleshoot and debug machine learning applications, and provide technical support to resolve issues.
• Manage auto-scaling and performance monitoring of the data infrastructure, including for machine learning applications.
• Work with stakeholders including the Executive, Product, Data and Design teams to assist with data-related technical issues and support their data infrastructure needs.
• Work with data and analytics experts to strive for greater functionality in our data systems.
Requirements
• Bachelor's degree in Statistics, Mathematics, or Computer Science
• At least 3 years of experience as a Data Engineer
• Strong SQL skills, including writing efficient, optimized code for data integration, storage, processing, and manipulation, and good knowledge of ETL and/or ELT tools
• Proficiency in one or more programming languages. Python is required.
• Experience with cloud-based data warehousing solutions such as BigQuery or Redshift
• Experience with AWS services such as SageMaker, EMR, S3, DynamoDB, and EC2
• Experience with DevOps tools (such as GitHub Actions) and infrastructure-as-code is a plus
• Knowledge of data security measures, including role-based access control (RBAC) and data encryption, is a plus
• Good understanding of data quality and governance, including implementing data quality checks and monitoring processes to ensure that data is accurate, complete, and consistent, is a plus
• Able to communicate in English