Offer summary

Qualifications:

4-6 years of experience in ETL development and data engineering.

Proficiency in PySpark, SQL, and AWS services like Redshift and Glue.

Experience with data ingestion pipelines and optimizing data loading processes.

Familiarity with data modeling and reporting technologies such as Tableau or PowerBI.

Key responsibilities:

Analyze customer use cases and ingest data from various enterprise sources into Adobe Experience Platform.

Design and build data ingestion pipelines using PySpark while ensuring performance and compliance with best practices.

Develop and test complex SQL queries for data extraction and reporting.

Debug issues related to data ingestion and support Data Architects in implementing data models.

Job description

Skills:
SQL, Apache Kafka, Amazon Redshift, Apache Spark, Talend, Data Warehousing, ETL Testing, Python

Please rate the candidate from 1 to 5 (1 lowest, 5 highest) in these areas:

Big Data
PySpark
AWS
Redshift

Position Summary

We are looking for experienced ETL Developers and Data Engineers to ingest data from multiple enterprise sources into Adobe Experience Platform and analyze it.

Requirements

About 4-6 years of professional technology experience, mostly focused on the following:
4+ years of experience developing data ingestion pipelines using PySpark (batch and streaming).
4+ years of experience with multiple data engineering services on AWS, e.g. Glue, Athena, DynamoDB, Kinesis, Kafka, Lambda, Redshift.
1+ years of experience working with Redshift, especially the following:
Experience and knowledge of loading data from various sources, e.g. S3 buckets and on-prem data sources, into Redshift.
Experience optimizing data ingestion into Redshift.
Experience designing, developing, and optimizing queries on Redshift using SQL or PySpark SQL.
Experience designing tables in Redshift (distribution keys, compression, vacuuming, etc.).

Experience developing applications that consume services exposed as REST APIs.
Experience writing and analyzing complex, performant SQL queries.

Special Consideration given for

2 years of developing and supporting ETL pipelines using enterprise-grade ETL tools such as Pentaho, Informatica, or Talend
Good knowledge of data modeling (design patterns and best practices)
Experience with reporting technologies (e.g. Tableau, PowerBI)

What You'll Do

Analyze and understand customers' use cases and data sources; extract, transform, and load data from a multitude of customer enterprise sources and ingest it into Adobe Experience Platform.

Design and build data ingestion pipelines into the platform using PySpark.

Ensure ingestion is designed and implemented in a performant manner to support the required throughput and latency.

Develop and test complex SQL queries to extract, analyze, and report on the data ingested into Adobe Experience Platform.

Ensure the SQL queries comply with best practices so that they are performant.

Migrate platform configurations, including the data ingestion pipelines and SQL, across various sandboxes.

Debug and resolve any issues reported with data ingestion, SQL, or any other platform functionality.

Support Data Architects in implementing data models in the platform.

Contribute to the innovation charter and develop intellectual property for the organization.

Present on advanced features and complex use case implementations at multiple forums.

Attend regular scrum events or equivalent and provide updates on deliverables.

Work independently across multiple engagements with minimal or no supervision.

Required profile