GIGAMIND
Folder:
108 Data Analysis
File:
108.10.20 Data Analysis - Step 2 ingest clean validate
Step 2: Ingest/Clean/Validate
This is the most time-consuming, frustrating, and unstructured part of the analysis. It is also where the most can go wrong, so it is the most important step to get right.
1. Ingest raw data into database
108.10.20.10 Data Analysis - Ingest raw data into database
2. Understand
108.40.20.20 Data Analysis - Raw data understanding and summary statistics
3. Clean / map
108.10.20.20 Data Analysis - Clean and map data
4. Confirm
Never move to analysis before signing off on data quality.
108.10.20.30 Data Analysis - Summary statistics and confirmation
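The four steps above can be sketched end to end as follows. This is a minimal illustration, not the method from the linked notes: it assumes an in-memory SQLite database and a made-up CSV sample, and every name in it (RAW_CSV, ingest, summarize, clean, confirm, the raw and clean tables) is hypothetical. The one design choice worth noting is that raw values are loaded untouched as text, so cleaning never overwrites the original data.

```python
import csv
import io
import sqlite3

# Hypothetical raw sample with typical problems: inconsistent case,
# stray whitespace, a missing value, and a non-numeric amount.
RAW_CSV = """id,region,amount
1,north,10.5
2,North ,7
3,south,
4,south,abc
"""


def ingest(conn, text):
    """Step 1: load every raw value as TEXT, unmodified."""
    conn.execute("CREATE TABLE raw (id TEXT, region TEXT, amount TEXT)")
    rows = list(csv.DictReader(io.StringIO(text)))
    conn.executemany("INSERT INTO raw VALUES (:id, :region, :amount)", rows)
    return len(rows)


def summarize(conn):
    """Step 2: basic counts to understand the raw data."""
    total, missing, regions = conn.execute(
        "SELECT COUNT(*), SUM(amount = ''), COUNT(DISTINCT region) FROM raw"
    ).fetchone()
    return {"rows": total, "missing_amount": missing, "distinct_regions": regions}


def _to_float(value):
    """Return a float, or None when the raw value is empty or not numeric."""
    try:
        return float(value)
    except ValueError:
        return None


def clean(conn):
    """Step 3: write typed, normalized values to a separate table."""
    conn.execute("CREATE TABLE clean (id INTEGER, region TEXT, amount REAL)")
    raw_rows = conn.execute("SELECT id, region, amount FROM raw").fetchall()
    for id_, region, amount in raw_rows:
        conn.execute(
            "INSERT INTO clean VALUES (?, ?, ?)",
            (int(id_), region.strip().lower(), _to_float(amount)),
        )


def confirm(conn, expected_rows):
    """Step 4: summary checks to review before signing off on data quality."""
    rows, null_amounts, regions = conn.execute(
        "SELECT COUNT(*), SUM(amount IS NULL), COUNT(DISTINCT region) FROM clean"
    ).fetchone()
    return {
        "row_count_matches": rows == expected_rows,
        "null_amounts": null_amounts,
        "distinct_regions": regions,
    }


conn = sqlite3.connect(":memory:")
n = ingest(conn, RAW_CSV)
before = summarize(conn)
clean(conn)
report = confirm(conn, n)
```

Comparing before with report is the sign-off check: the row count should be unchanged, and any change in distinct values or nulls should be one you can explain.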
Source:
- Me