Data quality tool
I tried to create a "Data Quality (DQ)" tool at Lucky8. The tool was incredibly important because consistent data quality is key to a successful engagement.
But the tool was never successful, never got used, very disappointing. And the reason it never got used is because it was an extra step in the process - it wasn't the process. People did their work in SQL and had to copy/paste outputs into the tool to save them, in addition to copy/pasting it into Excel.
If this is going to work in practice, in the future (and it would be a very good idea to have it as part of the flow), then it needs to be integrated directly into the process - as part of getting data outputs from sql/jupyter into a display/plot/output.
- The SQL and data manipulation should be done in a single location. Jupyter Notebook could be a good fit for this.
- If the SQL is run from a client and the data manipulation is done in Python and Tableau and Excel, the tool will fail because a data scientist will copy data from SQL into Excel, do further analysis, format it, and copy into the presentation.