GIGAMIND

Folder:
108 Data Analysis
File:
108.10.20.10.20.10 Data Analysis - Moving data from disk to S3

Moving data from disk to S3

The first step to getting data into a Redshift database is to put the raw files on S3. From here it can be COPY'd onto Redshift. Getting a file onto S3 can be accomplished in a couple of different ways:

  • Use the S3 web interface (https://console.aws.amazon.com) for small files
  • Use the AWS CLI for large files. This is preferred and easier once you get it.
    • E.g. aws s3 cp raw_data.csv s3://bucket-name/raw_data.csv

S3 can also act like a long term data repository. The files on local and virtual machines are subject to being deleted occasionally and S3 is a much safer place to store data.


Source:
  • Me