Moving data from disk to S3

The first step in getting data into a Redshift database is to put the raw files on S3. From there they can be loaded into Redshift with the COPY command. There are two ways to get a file onto S3:

- Use the S3 web interface (https://console.aws.amazon.com) for small files.
- Use the AWS CLI for large files. This is the preferred method and is easier once you are familiar with it.
  - E.g. aws s3 cp raw_data.csv s3://bucket-name/raw_data.csv
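
If the upload needs to be scripted, the CLI call above can be wrapped in a small Python helper. This is only a sketch: the function names, the optional prefix argument, and the dry-run switch are illustrative and not part of the original process, and it assumes the AWS CLI is installed and already configured with credentials. The --dryrun flag is a standard option of aws s3 cp that prints what would be copied without transferring anything.

```python
import subprocess
from pathlib import Path


def build_s3_cp_command(local_path, bucket, prefix="", dry_run=False):
    """Return the `aws s3 cp` argument list for uploading one local file."""
    file_name = Path(local_path).name
    clean_prefix = prefix.strip("/")
    # Keep the original file name as the object key, optionally under a prefix.
    key = f"{clean_prefix}/{file_name}" if clean_prefix else file_name
    command = ["aws", "s3", "cp", str(local_path), f"s3://{bucket}/{key}"]
    if dry_run:
        # Ask the CLI to report the planned copy without performing it.
        command.append("--dryrun")
    return command


def upload_to_s3(local_path, bucket, prefix="", dry_run=False):
    """Run the upload; raises CalledProcessError if the CLI exits non-zero."""
    command = build_s3_cp_command(local_path, bucket, prefix, dry_run)
    subprocess.run(command, check=True)


if __name__ == "__main__":
    # Show the command for the example file and bucket used above.
    print(" ".join(build_s3_cp_command("raw_data.csv", "bucket-name")))
```

Building the argument list separately from running it makes it easy to inspect the exact command before any data moves.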

S3 can also serve as a long-term data repository. Files on local and virtual machines are occasionally deleted, so S3 is a much safer place to store data.
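
Once a file is on S3, the COPY step mentioned at the top of this note might look roughly like the statement built below. This is a sketch under several assumptions: the table name, the IAM role ARN, and the helper name are hypothetical, the file is assumed to be a CSV with one header row, and the values are assumed to be trusted, since they are interpolated directly into the SQL text.

```python
def build_copy_statement(table, bucket, key, iam_role):
    """Return a Redshift COPY statement that loads one CSV file from S3."""
    return (
        f"COPY {table}\n"
        f"FROM 's3://{bucket}/{key}'\n"
        f"IAM_ROLE '{iam_role}'\n"
        "FORMAT AS CSV\n"
        "IGNOREHEADER 1;"  # assumes the first row of the file is a header
    )


if __name__ == "__main__":
    print(
        build_copy_statement(
            table="raw_data",  # hypothetical target table
            bucket="bucket-name",
            key="raw_data.csv",
            # Placeholder ARN; substitute a role that Redshift can assume.
            iam_role="arn:aws:iam::123456789012:role/hypothetical-redshift-role",
        )
    )
```

The resulting statement can be run from any SQL client connected to the Redshift cluster.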

Graph:

108.10.20.10.20.10 Data Analysis - Moving data from disk to S3 to 108.10.20 Data Analysis - Step 2 ingest clean validate
108.10.20.10.20 Data Analysis - Standardized process to ingest raw data to 108.10.20.10.20.10 Data Analysis - Moving data from disk to S3

Travis Giggy
