OptimalBI | We do cool sh!t with data

I have been researching different ways that we can get data into AWS Redshift and found importing a CSV data into Redshift from AWS S3 is a very simple process.

Get the CSV file into S3 -> Define the Target Table -> Import the file

Get the CSV file into S3
Upload the CSV file into a S3 bucket using the AWS S3 interface (or your favourite tool). Make sure you have the right permissions on the bucket; The Access key you’ll use later needs the ability to read the file (by default only the User that created the bucket has access).
Define the Target Table

If not already created, you’ll need to create the table you’ll be populating. Make sure you have defined the columns correctly otherwise you’ll have some debugging to do later.

Something like (I say something like as I am not SQL guru 😉 )

CREATE TABLE <table_name> ( <variable_name> attributes, …);

Another thing to remember is to make sure you have granted the right permissions for the user that will be querying this table

GRANT SELECT on <table_name> to <user or group>;

Import the file

The simple bit, loading the CSV file into Redshift from S3 is one command.

COPY <table_name> FROM ‘s3://<bucket_name>/<csv_file>‘ CREDENTIALS ‘aws_access_key_id=< aws_access_key_id >;aws_secret_access_key=< aws_secret_access_key >‘ CSV <other_options> ;

And that is basically it.

Like always there are plenty of options on the COPY command that you’ll need to look at and there are good AWS resources to read

Tutorial: Loading Data from Amazon S3 (http://docs.aws.amazon.com/redshift/latest/dg/tutorial-loading-data.html)

Redshift Database Developer Guide (http://docs.aws.amazon.com/redshift/latest/dg/welcome.html).

Barry, Preventer Of Chaos.

Barry Stevens is part of our Data Ops team who specialises in SAS Administration, helping you prevent chaos.

Connect with Barry on LinkedIn or read some of his other blogs here.

‍

Importing CSV file from AWS S3 into Redshift