6 Signs of Dirty Data [Infographic]

Less risqué than Dirty Dancing and less provocative than Dirty Diana, dirty data is something you want to keep away from your company. You can tell from the name that it’s probably not good, but what is dirty data?

Simply put, it’s data that contains errors or is incomplete in some way, and it’s costing you and your company money. Chances are that you, or at least someone at your company, know that’s a problem, but more than 90 percent of companies still aren’t keeping their data clean.

For marketers and salespeople, dirty data means 30 percent of their data becomes totally unusable. It shows up as missing emails, duplicate records, and inaccurate (or fake) audience and fan records. That’s a tough pill to swallow for live events like sports and festivals, which already struggle to identify fans who aren’t in their database.

Not keeping your data clean comes at a cost, but if you’re not a data scientist, how do you even know you have a problem? Luckily, Umbel’s Data Science team has seen (and cleaned) it all. Our data scientists recently came together to look at how you can diagnose dirty data, with some tips on cleanup, across six areas:

  • Completeness

  • Uniqueness

  • Consistency

  • Timeliness

  • Validity

  • Accuracy
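If you want a feel for what checking these six areas might look like in practice, here is a minimal Python sketch. It is an illustration under stated assumptions, not the method Umbel’s team uses: the field names (`email`, `state`, `last_updated`), the one-year staleness cutoff, the two-letter state rule, and the placeholder-email list are all hypothetical. Accuracy in particular usually needs a trusted reference to compare against, so the check below is only a rough proxy.

```python
import re
from collections import Counter
from datetime import date

# Deliberately simple email pattern; real-world validation is stricter.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
# Hypothetical list of obviously fake addresses (assumption for this sketch).
PLACEHOLDER_EMAILS = {"test@test.com", "none@none.com"}


def diagnose(records, today, max_age_days=365):
    """Count records that fail a basic check in each of the six areas."""
    # Normalize emails so "ANA@example.com " and "ana@example.com" match.
    emails = [(r.get("email") or "").strip().lower() for r in records]
    present = [e for e in emails if e]
    counts = Counter(present)
    return {
        # Completeness: records with no email at all.
        "completeness": sum(1 for e in emails if not e),
        # Uniqueness: extra copies of an email that already appeared.
        "uniqueness": sum(n - 1 for n in counts.values() if n > 1),
        # Consistency: state not written as a two-letter uppercase code.
        "consistency": sum(
            1 for r in records
            if not re.fullmatch(r"[A-Z]{2}", r.get("state") or "")
        ),
        # Timeliness: records not updated within max_age_days.
        "timeliness": sum(
            1 for r in records
            if (today - r["last_updated"]).days > max_age_days
        ),
        # Validity: emails that do not look like emails.
        "validity": sum(1 for e in present if not EMAIL_RE.match(e)),
        # Accuracy (rough proxy): well-formed but known-fake emails.
        "accuracy": sum(1 for e in present if e in PLACEHOLDER_EMAILS),
    }


# Made-up sample records for illustration only.
records = [
    {"email": "ana@example.com", "state": "TX", "last_updated": date(2024, 5, 1)},
    {"email": "ANA@example.com ", "state": "Texas", "last_updated": date(2024, 4, 1)},
    {"email": "", "state": "CA", "last_updated": date(2021, 1, 15)},
    {"email": "bob[at]example.com", "state": "ny", "last_updated": date(2024, 2, 1)},
    {"email": "test@test.com", "state": "NY", "last_updated": date(2023, 12, 1)},
]

report = diagnose(records, today=date(2024, 6, 1))
print(report)
```

On this made-up sample, every area gets flagged at least once, and several of the problems sit in the same record. A count per area will not clean anything on its own, but it tells you where to start looking.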

In their discussion, they covered why some things are possible to fix after the fact, but also how you can save yourself headaches down the line with a solid data collection strategy from the very beginning. Beyond the don’ts, they also touched on:

  • Basic principles of ‘Tidy Data’

  • Common data ‘gotchas’

  • Best practices for getting the most out of data sets

You can watch their on-demand webinar below or scroll down for a recap infographic of the six signs of dirty data.


Infographic depicting the 6 signs of dirty data.
