Skills:
SQL, Apache Kafka, Amazon Redshift, Apache Spark, Talend, Data Warehousing, ETL Testing, Python
Please rate the candidate (from 1 to 5; 1 lowest, 5 highest) in these areas:
- Big Data
- PySpark
- AWS
- Redshift
Position Summary
Experienced ETL Developers and Data Engineers are needed to ingest data from multiple enterprise sources into Adobe Experience Platform and analyze it.
Requirements
- About 4-6 years of professional technology experience mostly focused on the following:
- 4+ years of experience developing data ingestion pipelines using PySpark (batch and streaming).
- 4+ years of experience with multiple data engineering services on AWS, e.g. Glue, Athena, DynamoDB, Kinesis, Kafka, Lambda, Redshift.
- 1+ years of experience working with Redshift, especially the following:
- Experience and knowledge of loading data from various sources, e.g. S3 buckets and on-prem data sources, into Redshift.
- Experience optimizing data ingestion into Redshift.
- Experience designing, developing, and optimizing queries on Redshift using SQL or PySpark SQL.
- Experience designing and maintaining tables in Redshift (distribution keys, compression, vacuuming, etc.).
- Experience developing applications that consume services exposed as REST APIs.
- Experience and ability to write and analyze complex, performant SQL queries.
Special Consideration Given For
- 2 years of developing and supporting ETL pipelines using enterprise-grade ETL tools such as Pentaho, Informatica, or Talend
- Good knowledge of data modelling (design patterns and best practices)
- Experience with reporting technologies (e.g. Tableau, Power BI)
What You'll Do
Analyze and understand customers' use cases and data sources, then extract, transform, and load data from a multitude of customers' enterprise sources into Adobe Experience Platform.
Design and build data ingestion pipelines into the platform using PySpark.
Ensure ingestion is designed and implemented in a performant manner to support the required throughput and latency.
Develop and test complex SQL queries to extract, analyze, and report on the data ingested into Adobe Experience Platform.
Ensure the SQL queries follow best practices so that they are performant.
Migrate platform configurations, including the data ingestion pipelines and SQL, across various sandboxes.
Debug and resolve any issues reported on data ingestion, SQL, or any other functionality of the platform.
Support Data Architects in implementing the data model in the platform.
Contribute to the innovation charter and develop intellectual property for the organization.
Present on advanced features and complex use case implementations at multiple forums.
Attend regular scrum events or equivalent and provide updates on the deliverables.
Work independently across multiple engagements with minimal or no supervision.