How can I do continuous data ingestion from on-prem data sources to Redshift?

0 votes

I have a requirement to ingest data from multiple on-prem data sources into Redshift. The ingestion will be a scheduled activity that runs every 6 hours. The process should identify the delta records and load only new or changed records into Redshift. A restart option should also be available for each of these steps. I am trying to do this either entirely with AWS services or with a combination of Python programs and AWS services.

My idea is to set up a data flow from the external sources to S3, temporarily launch an EC2 instance for any data processing/wrangling, write the curated data back to S3, terminate the EC2 instance, and load the data into Redshift using Data Pipeline.

Can you suggest some pointers to start with? If you have experience with a similar project, please share it. Also, if possible, please share a design and associated code for reference.

Aug 8, 2018 in AWS by bug_seeker
• 15,520 points

1 answer to this question.

0 votes

I recommend looking into the AWS Schema Conversion Tool (AWS SCT) and AWS Database Migration Service (AWS DMS).

DMS can help you establish ongoing movement of data from on-prem sources to Redshift, including staging the data in S3. Supported sources are listed in the docs.
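As a rough starting point, ongoing replication in DMS is set up as a replication task whose migration type combines a full load with change data capture. The sketch below only builds the parameters for boto3's `create_replication_task` call and keeps the actual AWS call in a separate function. The task identifier, the ARNs, the region, and the `SALES` schema are placeholders I made up for illustration, and the source endpoint, target endpoint, and replication instance are assumed to exist already.

```python
import json


def build_task_params(source_arn, target_arn, instance_arn, schema="%"):
    """Build keyword arguments for a DMS replication task (full load + CDC)."""
    # Selection rule: include every table in the given schema ("%" = all schemas).
    table_mappings = {
        "rules": [
            {
                "rule-type": "selection",
                "rule-id": "1",
                "rule-name": "1",
                "object-locator": {"schema-name": schema, "table-name": "%"},
                "rule-action": "include",
            }
        ]
    }
    return {
        "ReplicationTaskIdentifier": "onprem-to-redshift-cdc",  # hypothetical name
        "SourceEndpointArn": source_arn,
        "TargetEndpointArn": target_arn,
        "ReplicationInstanceArn": instance_arn,
        # Initial bulk copy, then keep applying changes from the source.
        "MigrationType": "full-load-and-cdc",
        "TableMappings": json.dumps(table_mappings),
    }


def create_task(params, region="us-east-1"):
    """Create the task in AWS. Requires boto3 and valid AWS credentials."""
    import boto3  # imported here so the builder above works without boto3

    dms = boto3.client("dms", region_name=region)
    return dms.create_replication_task(**params)


if __name__ == "__main__":
    # Placeholder ARNs; replace with the real ones from your account.
    params = build_task_params("arn:source", "arn:target", "arn:instance", "SALES")
    print(json.dumps(params, indent=2))
```

With change data capture running, DMS picks up the new and changed rows itself, so a separate 6-hour delta job may not be needed for sources that DMS supports.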

Start with the walkthrough in this blog post: "How to Migrate Your Oracle Data Warehouse to Amazon Redshift Using AWS SCT and AWS DMS"
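If you go with your own Python programs for the S3 staging route from the question instead, one common way to handle delta detection and restarts is a watermark checkpoint plus a staging-table merge in Redshift. This is only a minimal sketch under my own assumptions: each source row has an `updated_at` ISO-8601 timestamp, the checkpoint lives in a local JSON file, and the table names, S3 path, and IAM role are placeholders. It only generates the SQL; running it against Redshift is left to whatever client you use.

```python
import json
from pathlib import Path


def load_watermark(path, default="1970-01-01T00:00:00"):
    """Return the timestamp of the last successful load, or a default."""
    p = Path(path)
    if p.exists():
        return json.loads(p.read_text())["last_loaded"]
    return default


def save_watermark(path, value):
    """Write the checkpoint via a temp file so a crash cannot leave it half-written."""
    tmp = Path(str(path) + ".tmp")
    tmp.write_text(json.dumps({"last_loaded": value}))
    tmp.replace(path)


def select_delta(rows, watermark):
    """Keep rows changed after the watermark (ISO-8601 strings in one format sort correctly)."""
    return [r for r in rows if r["updated_at"] > watermark]


def merge_statements(target, staging, key, s3_path, iam_role):
    """SQL for the staging-table merge: replace matching rows, then insert the delta."""
    return [
        f"CREATE TEMP TABLE {staging} (LIKE {target});",
        f"COPY {staging} FROM '{s3_path}' IAM_ROLE '{iam_role}' FORMAT AS CSV;",
        "BEGIN;",
        f"DELETE FROM {target} USING {staging} WHERE {target}.{key} = {staging}.{key};",
        f"INSERT INTO {target} SELECT * FROM {staging};",
        "END;",
    ]


if __name__ == "__main__":
    rows = [
        {"id": 1, "updated_at": "2018-08-07T23:00:00"},
        {"id": 2, "updated_at": "2018-08-08T05:30:00"},
    ]
    watermark = load_watermark("checkpoint.json")
    delta = select_delta(rows, watermark)
    for stmt in merge_statements(
        "orders", "orders_stage", "id",
        "s3://my-bucket/curated/orders/",               # placeholder path
        "arn:aws:iam::123456789012:role/RedshiftCopy",  # placeholder role
    ):
        print(stmt)
    # Advance the checkpoint only after the load has succeeded.
    if delta:
        save_watermark("checkpoint.json", max(r["updated_at"] for r in delta))
```

Because the watermark moves forward only after a successful load, rerunning a failed job picks up the same delta again, and the delete-then-insert merge keeps that rerun from creating duplicate rows.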

If this doesn't answer your question, leave a comment and I will look into it further.

answered Aug 8, 2018 by Priyaj
• 58,090 points
