Moving Redshift data between databases

by Ben Lee | Apr 23, 2014

I recently needed to transfer some data between Redshift databases. Coming from an Oracle background, I started my search for the Redshift equivalent of import/export, only to find there isn’t one. However, using the UNLOAD and COPY commands you can achieve the same result. The commands have quite a few options, but here are the ones that worked for me on a large set of tables.
To extract the data from a table called PRODUCT in the XYZ schema, run the following against the source database. You need an S3 bucket to store the export files.

unload ('select * from xyz.product') to 's3://my-bucket/product' CREDENTIALS
'aws_access_key_id=<your key here>;aws_secret_access_key=<your key here>'
MANIFEST DELIMITER '|' NULL AS 'null_string' ESCAPE ALLOWOVERWRITE;

In your S3 bucket this will create a manifest file called productmanifest, which contains a list of all the export files created, along with the export data files themselves (e.g. product0000_part_00 and product0001_part_00).
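The manifest itself is just a small JSON file listing the data files to be loaded. Assuming the bucket and file names above, it will look roughly like this:

{
  "entries": [
    {"url": "s3://my-bucket/product0000_part_00"},
    {"url": "s3://my-bucket/product0001_part_00"}
  ]
}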
On the target database, issue the following command to load the data.

copy xyz.product from 's3://my-bucket/productmanifest' CREDENTIALS
'aws_access_key_id=<your key here>;aws_secret_access_key=<your key here>'
MANIFEST DELIMITER '|' NULL AS 'null_string' ESCAPE;

If the load fails, you can check for errors in the STL_LOAD_ERRORS table.
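For example, a quick query along these lines (the column list is trimmed to the most useful ones) will show the most recent load errors:

select starttime, filename, line_number, colname, err_reason
from stl_load_errors
order by starttime desc
limit 10;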
All the code, all the fun – Ben
 
Ben writes blogs about the technical side of BI: the code, all the code and not much other than the code.
You can read Oracle Apex 5 on AWS Linux or all of Ben’s blogs here.
We run regular business intelligence courses in both Wellington and Auckland. Find out more here.

11 Comments
  1. Eric

    Thank you so much. I spent hours trying to figure this out. Your blog post was the only thing that worked to allow migration of data between Redshift clusters.

  2. Steven Ensslen

    If you want to do this in bulk you can generate all of the commands with the following SELECT from the data dictionary:
    SELECT 'unload (''select * from ' || table_schema || '.' || table_name || ''')
    to ''s3://my-bucket/' || table_schema || '.' || table_name || '''
    CREDENTIALS ''aws_access_key_id=;aws_secret_access_key=''
    MANIFEST DELIMITER ''|'' NULL AS ''null_string'' ESCAPE ALLOWOVERWRITE;'
    FROM information_schema.tables
    WHERE table_type = 'BASE TABLE';
    Of course, you’ll need to modify the S3 details before they run.
    Steven

  3. tushar belorkar

My use case is: move all data from one Redshift cluster to a new cluster. I can use the above commands for the data, but how about system-level things like database, schema and user details, along with privileges?

  4. tushar belorkar

Thanks Lee. But our new cluster is encrypted, whereas the existing cluster is unencrypted. So would restoring the new cluster from the existing cluster’s snapshot work?

  5. tushar belorkar

Yeah, restoring won’t work. Thanks for the admin views… they look great and are very helpful indeed.
    Thanks Lee!

  6. Radha krishna

Is there a way I can restore the Redshift database to a local database (MySQL, etc.)?

  7. Ben Lee

    Hi Radha,
You can use a similar approach to the one described here: use the UNLOAD command to extract the data to CSV files and then load them into your target database. For example, MySQL has a LOAD DATA command to load data from CSV files; a rough sketch follows below.
    Regards,
    Ben

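    As a minimal sketch of that approach (the bucket, file paths and product table are only placeholders, and you would download the unloaded files from S3 before loading them into MySQL):

    unload ('select * from xyz.product') to 's3://my-bucket/product_csv' CREDENTIALS
    'aws_access_key_id=<your key here>;aws_secret_access_key=<your key here>'
    DELIMITER ',' ALLOWOVERWRITE;

    -- then, on the MySQL side, once the files are on local disk:
    LOAD DATA LOCAL INFILE '/tmp/product_csv0000_part_00'
    INTO TABLE product
    FIELDS TERMINATED BY ',';
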
  8. Sanket Nalavade

How can I move all the tables in the schema at once from one cluster to another?

  9. Ben Lee

    Hi Sanket,
It’s been a while since I looked into this issue, but I don’t believe there is a way to do what you want.
It is one table at a time or a snapshot of the whole DB.
    Regards,
    Ben
