Folder:
108 Data Analysis
File:
108.40.20.50 Data Analysis - Data quality tool

Data quality tool

I tried to create a "Data Quality (DQ)" tool at Lucky8. The tool was incredibly important because consistent data quality is key to a successful engagement.

But the tool was never successful: it never got used, which was very disappointing. It never got used because it was an extra step in the process rather than the process itself. People did their work in SQL and had to copy/paste outputs into the tool to save them, in addition to copy/pasting them into Excel.

If this is going to work in practice in the future (and it would be a very good idea to have it as part of the flow), it needs to be integrated directly into the process, as part of getting data outputs from SQL/Jupyter into a display/plot/output:
- The SQL and data manipulation should be done in a single location. Jupyter Notebook could be a good fit for this.
- If the SQL is run from a client and the data manipulation is spread across Python, Tableau, and Excel, the tool will fail: a data scientist will copy data from SQL into Excel, do further analysis, format it, and copy it into the presentation, bypassing the tool.
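
A rough sketch of what "integrated into the process" could look like in a notebook: the function that runs the SQL also records the data quality metrics, so there is no separate step to skip. This is not the Lucky8 tool. The function name `run_query_with_dq`, the in-memory `dq_log` list, the `orders` table, and the choice of metrics (row count and null counts per column) are all assumptions made for illustration.

```python
import sqlite3
from datetime import datetime, timezone


def run_query_with_dq(conn, sql, dq_log):
    """Run a query and record basic data quality metrics as a side effect.

    Hypothetical helper: the analyst calls this instead of running the SQL
    directly, so the DQ record is produced by the same call as the data.
    """
    cursor = conn.execute(sql)
    columns = [d[0] for d in cursor.description]
    rows = cursor.fetchall()

    # Count NULLs per column (an assumed, minimal set of DQ metrics).
    null_counts = {
        col: sum(1 for row in rows if row[i] is None)
        for i, col in enumerate(columns)
    }

    dq_log.append({
        "run_at": datetime.now(timezone.utc).isoformat(),
        "sql": sql,
        "row_count": len(rows),
        "null_counts": null_counts,
    })
    return columns, rows


# Example with a throwaway in-memory database (illustrative data only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 10.0), (2, None), (3, 7.5)],
)

dq_log = []
columns, rows = run_query_with_dq(conn, "SELECT id, amount FROM orders", dq_log)
print(dq_log[0]["row_count"], dq_log[0]["null_counts"])
```

The point of the design is that the same call that feeds the display/plot/output also writes the DQ record, so saving quality information costs the analyst nothing extra. In a real setup the log would presumably go to a shared table or file rather than a Python list.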


Source:
  • Me