GIGAMIND

Folder:
108 Data Analysis
File:
108.10.20 Data Analysis - Step 2 ingest clean validate

Step 2: Ingest/Clean/Validate

This is the most time-consuming, frustrating, and unstructured part of the analysis. It is also where the most can go wrong, so it is the most important step to get right.

1. Ingest raw data into database

108.10.20.10 Data Analysis - Ingest raw data into database

2. Understand

108.40.20.20 Data Analysis - Raw data understanding and summary statistics

3. Clean / map

108.10.20.20 Data Analysis - Clean and map data

4. Confirm

Never move to analysis before signing off on data quality.
108.10.20.30 Data Analysis - Summary statistics and confirmation
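
A minimal sketch of the four steps above, assuming a small in-memory SQLite database and made-up sample rows. The table names (raw_people, clean_people), the columns, and the sign-off rule are all hypothetical and only illustrate the flow; the linked notes describe the actual process.

```python
import sqlite3

# Hypothetical raw rows; in practice these would come from a source file.
RAW_ROWS = [
    ("1", " Alice ", "34"),
    ("2", "bob", ""),          # missing age
    ("3", "Carol", "abc"),     # unparseable age
    ("3", "Carol", "abc"),     # exact duplicate
]


def ingest(conn, rows):
    """Step 1: load raw values untouched, as text, into a raw table."""
    conn.execute("CREATE TABLE raw_people (id TEXT, name TEXT, age TEXT)")
    conn.executemany("INSERT INTO raw_people VALUES (?, ?, ?)", rows)


def summarize(conn, table):
    """Steps 2 and 4: row count, distinct ids, and missing ages."""
    rows, distinct_ids, missing_age = conn.execute(
        "SELECT COUNT(*), COUNT(DISTINCT id), "
        "SUM(CASE WHEN age IS NULL OR age = '' THEN 1 ELSE 0 END) "
        f"FROM {table}"
    ).fetchone()
    return {"rows": rows, "distinct_ids": distinct_ids, "missing_age": missing_age}


def clean(conn):
    """Step 3: trim and map values into a typed table, dropping exact duplicates."""
    conn.execute(
        "CREATE TABLE clean_people (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)"
    )
    raw = conn.execute("SELECT DISTINCT id, name, age FROM raw_people").fetchall()
    for id_, name, age in raw:
        # Ages that are not plain digits are mapped to NULL, not guessed.
        age_value = int(age) if age.strip().isdigit() else None
        conn.execute(
            "INSERT INTO clean_people VALUES (?, ?, ?)",
            (int(id_), name.strip().title(), age_value),
        )


def confirm(summary):
    """Step 4: sign-off gate; raise instead of continuing with bad data."""
    if summary["rows"] != summary["distinct_ids"]:
        raise ValueError("duplicate ids remain after cleaning")
    return True


def run_pipeline():
    """Run all four steps and return the raw and clean summaries."""
    conn = sqlite3.connect(":memory:")
    ingest(conn, RAW_ROWS)
    raw_summary = summarize(conn, "raw_people")
    clean(conn)
    clean_summary = summarize(conn, "clean_people")
    confirm(clean_summary)
    conn.close()
    return raw_summary, clean_summary
```

Keeping the raw table untouched and writing cleaned values to a separate table means the summary statistics can be compared before and after cleaning, which is what makes the sign-off in step 4 checkable.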


Source:
  • Me