Senior Data Engineer
About this position
Responsibilities
• Develop and maintain scalable and reliable ETL pipelines and processes to ingest data from a large number and variety of data sources
• Develop a deep understanding of real-time data production and availability to inform real-time metric definitions
• Develop data quality checks and establish best practices for data governance, quality assurance, data cleansing, and ETL-related activities
• Develop familiarity with the existing built-in data platform tools and use them efficiently to set up data pipelines.
• Maintain and optimize the performance of our data analytics infrastructure to ensure accurate, reliable, and timely delivery of key insights for decision-making
• Design and deliver the next-generation suite of data lifecycle management tools and frameworks, including ingestion and consumption on top of the data lake, to support real-time, API-based, and serverless use cases, along with batch where relevant.
• Build solutions leveraging AWS services such as Glue, Redshift, Athena, Lambda, S3, Step Functions, EMR, and Kinesis to enable efficient data processing and analytics.
• Develop a deep understanding of real-time data production and availability to inform real-time metric definitions, using tools like Amazon MSK or Kinesis Data Streams.
• Implement and monitor data quality checks and establish best practices for data governance, quality assurance, data cleansing, and ETL-related activities using AWS Glue DataBrew or similar tools.
Requirements
• At least 5 years of relevant experience in developing scalable, secure, distributed, fault-tolerant, resilient, and mission-critical data solutions