Data Engineer
About this role
The Data Engineer role involves performing data exploration, cleaning, and feature engineering, as well as building and maintaining data pipelines for machine learning models.
Responsibilities
• Perform data exploration, data cleaning, data imputation, and feature engineering on unstructured and structured data.
• Build the infrastructure for optimal extraction, transformation, and loading (ETL) of data from a wide variety of data sources.
• Develop and maintain optimal data pipeline architecture for training statistical and machine learning models such as regression and classification.
• Develop and maintain evaluations to measure the effectiveness of training data. This includes measuring the capabilities of models on a variety of tasks and domains.
• Collaborate with data scientists and machine learning engineers to develop a comprehensive data science/machine learning solution pipeline.
Qualifications
• Bachelor's degree in computer science or a related field, or equivalent software engineering experience.
• Proficiency in Python programming language.
• Experience in dataset processing and feature engineering using tools such as NumPy, pandas, and scikit-learn.
• Visualization skills using tools such as Matplotlib, Seaborn, and Bokeh.
• Understanding of deep learning frameworks such as PyTorch and TensorFlow.
• Understanding of SQL and NoSQL databases.
• Understanding of Hadoop, Spark, Kafka, Hive, and Presto.
• Proficiency in source control, e.g. Git.
• Deep understanding of Object-Oriented Programming (OOP) concepts such as inheritance, delegation, and abstract classes.
• Understanding of cloud platforms such as AWS, GCP, and Azure.
• Experience in using Docker.
• Experience in using AWS services such as S3, EC2, Glue, and SageMaker.
• Experience with AWS Step Functions and/or AWS Lambda is a plus.
• Proficiency in Scala and Java programming languages.
• Enjoys iterating quickly on research prototypes and learning new technologies.