Diving into Data Discovery: the #1 QA process your firm should prioritize each year

Estimated read time: 6 minutes

The data discovery process doesn’t get a lot of press when it comes to data conversion projects. It’s Step 1 and therefore lacks the excitement and glamor that come with the final go-live event. And yet, data discovery is the foundation that will make or break any conversion, extraction, or upgrade. How long will the project take? How complicated will it be? How many people will be needed? Data discovery establishes these parameters.

What many firm executives don’t realize is that data discovery also presents unique business transformation opportunities. It sets the stage for leaders to reevaluate the firm’s data collection protocols and align them with current needs and future goals. When was the last time you thought about these processes? What repercussions are your current processes having on your business? These seemingly minor activities can have a significant impact on your bottom line over time.

Done correctly, data discovery allows you to see your data holistically. It gives you the space to make management decisions that will lead to data that are more pertinent to the way your firm operates today. To help you get the most out of this process, we’re doing a deep dive on data discovery.

Data discovery: A new perspective

In short, data discovery is determining what your data currently looks like. Not what you think it looks like or what it’s supposed to look like, but what’s actually there.

Technology providers go about data discovery in a number of ways. It can be a laborious process, but it produces a database schema showing the differences between what the data actually are and what the data should be. At Helm360, we leverage our Digital Eye tool to automate this process and deliver an interactive report that zeroes in on these differences and indicates how to fix them.

Regardless of how it’s accomplished, data discovery is a baseline; it establishes a basis for comparison between how the firm’s data are intended to be used and how those data are actually used.
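To make the idea of a baseline concrete, here is a minimal sketch of comparing what the data should look like with what is actually there. It is not how Digital Eye works; the expected schema, the column names, and the `schema_deviations` function are all hypothetical and exist only for illustration.

```python
# Hypothetical "intended" structure: column name -> expected Python type.
EXPECTED_SCHEMA = {"client_id": int, "name": str, "postal_code": str}


def schema_deviations(rows, expected):
    """Return sorted (row_index, column, problem) tuples for every mismatch."""
    deviations = []
    for index, row in enumerate(rows):
        # Columns the schema requires but the row lacks.
        for column in expected.keys() - row.keys():
            deviations.append((index, column, "missing column"))
        # Columns present in the row that the schema never defined.
        for column in row.keys() - expected.keys():
            deviations.append((index, column, "unexpected column"))
        # Columns present in both, but holding the wrong type of value.
        for column in expected.keys() & row.keys():
            if not isinstance(row[column], expected[column]):
                found = type(row[column]).__name__
                wanted = expected[column].__name__
                deviations.append((index, column, f"expected {wanted}, found {found}"))
    return sorted(deviations)


# Illustrative rows only; the second one has drifted from the intended structure.
sample_rows = [
    {"client_id": 1, "name": "Acme", "postal_code": "75001"},
    {"client_id": "2", "name": "Beta", "fax": "n/a"},
]
deviations = schema_deviations(sample_rows, EXPECTED_SCHEMA)
```

Each tuple in `deviations` points to one gap between intent and reality, which is the kind of list a discovery report would summarize for decision-makers.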

Why is this benchmarking process important for executives? Beyond helping them get a sense for project parameters (timelines, schedules, staffing, etc.), it’s usually the first time executives get a bird’s eye view of their firm’s data and data quality. As mentioned above, it shows how much deviation has occurred since a new system was implemented and makes space for high-level questions like:

Does current data usage align with expectations?
Is data usage supporting our business goals?
Are there unexpected or hidden roadblocks hindering daily functionality?

This “map” of your current data environment leads to a trove of valuable information executives can use to create a relevant and future-focused technology infrastructure. It’s a value-add to your data conversion/extraction/upgrade project you don’t want to hurry past.

Data profiling: What’s really happening with your data?

For most firms, especially those on legacy systems like Enterprise or older Elite 3E releases, maintaining data quality is an ongoing struggle. Data conversions, extractions, upgrades, etc. are opportunities to evaluate, recalibrate, and improve data quality. This improvement happens via data profiling.

Data profiling is the process of examining, analyzing, and summarizing data from an existing data source. It is an exceedingly valuable part of the data discovery process.

As we know, data quality inevitably erodes over time. Customizations are added, workarounds are tucked in, redundancies propagate, record integrity diminishes, and so on. Each small anomaly contributes to a larger web of inconsistencies that reduces data quality, lengthens processing times, and generally mucks up your systems.

All this can happen without any serious disruption to workflow. And although your firm may be functioning, in the meantime your attorneys and staff are working with clunky applications and potentially faulty data. How does this impact your business? Take a look:

Mail is sent to incorrect addresses.
Processing reports, billing, WIP, etc. takes longer.
Management decisions are based on incorrect, incomplete, or faulty data.
Time is spent confirming and/or fixing data.

Individually, these are small inconveniences. Taken as a whole, they’re a significant number of lost productivity hours. They also erode ROI on your system as a whole.

With data profiling, you can see data quality issues quickly and easily. Data profiling reports show:

How data are distributed in core tables.
Where bad dates, incomplete records, or orphan records are hiding.
Where and how data entry protocols differ.
How setup tables are syncing to actual data.
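As a rough illustration of what sits behind a report like this, the sketch below profiles a small, made-up matter table. The table layout, field names, and the `profile_matters` function are assumptions for the example, not part of any specific product or system.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample data; real tables and column names will differ.
clients = [{"client_id": 1}, {"client_id": 2}]
matters = [
    {"matter_id": "M1", "client_id": 1, "opened": "2019-03-14", "ledger_code": "FEE"},
    {"matter_id": "M2", "client_id": 2, "opened": "2019-02-30", "ledger_code": "fee"},
    {"matter_id": "M3", "client_id": 9, "opened": "", "ledger_code": "EXP"},
]


def is_valid_date(text):
    """True if text is a real calendar date in YYYY-MM-DD form."""
    try:
        datetime.strptime(text, "%Y-%m-%d")
        return True
    except ValueError:
        return False


def profile_matters(matter_rows, client_rows):
    """Summarize distribution, bad dates, and orphan records in one pass."""
    known_clients = {c["client_id"] for c in client_rows}
    return {
        "row_count": len(matter_rows),
        # Blank or impossible dates (for example, February 30).
        "bad_dates": [m["matter_id"] for m in matter_rows if not is_valid_date(m["opened"])],
        # Matters that point at a client that does not exist.
        "orphans": [m["matter_id"] for m in matter_rows if m["client_id"] not in known_clients],
        # How values are distributed, which also exposes inconsistent entry ("FEE" vs "fee").
        "ledger_code_distribution": Counter(m["ledger_code"] for m in matter_rows),
    }


report = profile_matters(matters, clients)
```

Here `report` flags two bad dates, one orphan record, and a ledger code entered in two different ways, each of which is a question for the people who own the data.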

With this report in hand, decision-makers can determine next steps: Which data can be purged? Are all the data fields still pertinent? Do the setup tables (e.g., ledger codes) need adjusting?

Not only do these decisions improve data quality, they also boost data relevance, making everything from time entry to billing concise, expedient, and accurate.

Address cleaning: A data discovery use case

Address book maintenance is a major pain point for law firms. Key to effective client communication, address data are typically fraught with inconsistencies: duplications, incomplete records, no separators, etc. These inconsistencies make the data difficult to use and report on accurately.

Data discovery makes cleaning address data a manageable, efficient process. First, it uncovers common inconsistencies and reports them out in an actionable format. For instance, the data profiling report shows:

How many records are missing postal codes.
Discrepancies in state indicators (spelled out versus abbreviated).
How many records have names in one field versus two fields.
How many times any surname is listed (do any of these listings match? Which one is correct?).
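The counts above are simple to compute once the address data are in a workable form. The following is a sketch only, assuming a hypothetical record layout with `first_name`, `last_name`, `state`, and `postal_code` fields; your address book will almost certainly be structured differently.

```python
from collections import Counter

# Hypothetical address records for illustration.
addresses = [
    {"first_name": "Ana", "last_name": "Lopez", "state": "TX", "postal_code": "75001"},
    {"first_name": "Ana Lopez", "last_name": "", "state": "Texas", "postal_code": ""},
    {"first_name": "Ben", "last_name": "Lopez", "state": "CA", "postal_code": "94105"},
]


def surname_of(record):
    """Best-effort surname: the last_name field, else the last word of first_name."""
    if record["last_name"].strip():
        return record["last_name"].strip()
    parts = record["first_name"].split()
    return parts[-1] if parts else ""


def profile_addresses(records):
    """Count the inconsistencies an address profiling report would surface."""
    return {
        "missing_postal_code": sum(1 for r in records if not r["postal_code"].strip()),
        # Anything longer than two characters is treated as a spelled-out state.
        "state_spelled_out": sum(1 for r in records if len(r["state"].strip()) > 2),
        # Full name typed into a single field, with last_name left empty.
        "name_in_one_field": sum(
            1 for r in records
            if not r["last_name"].strip() and len(r["first_name"].split()) > 1
        ),
        # Repeated surnames are candidates for duplicate review.
        "surname_counts": Counter(surname_of(r).lower() for r in records),
    }


address_report = profile_addresses(addresses)
```

A surname that appears several times is not necessarily a duplicate, so the counts are a starting point for review rather than a list of records to delete.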

With the data profiling report in hand, tech teams can see what cleanup work needs to be done and where they need to do it.

Their next step is to parse the data, which puts them into a format that allows for manipulation; in other words, it makes the data consistent and therefore easy to work with. Once the data are parsed, unneeded data can be purged, field structure changes can be implemented (yes, states will be abbreviated), and data entry protocols can be established (such as always splitting names into two fields). And instead of wading through untold lines of data looking for errors, tech teams can query records with missing, duplicate, or incorrect data into a results location and batch update them.

Finally, the data are normalized; the information is synced to the new, amended, or established structure. The result? Clean address data.
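The parse, fix, and normalize steps described above might look something like this in practice. This is a simplified sketch under stated assumptions: the field names are hypothetical, the state lookup is deliberately partial, and `split_for_review` stands in for whatever "results location" your team uses for batch updates.

```python
# Partial, illustrative lookup; a real one would cover every state.
STATE_ABBREVIATIONS = {"texas": "TX", "california": "CA"}


def abbreviate_state(state):
    """Return a two-letter code when one is known; otherwise leave the value alone."""
    state = state.strip()
    if len(state) == 2:
        return state.upper()
    return STATE_ABBREVIATIONS.get(state.lower(), state)


def normalize_record(record):
    """Apply the agreed structure: abbreviated state, name split into two fields."""
    cleaned = dict(record)
    first = record["first_name"].strip()
    last = record["last_name"].strip()
    # A full name typed into one field is split on its last space.
    if not last and len(first.split()) > 1:
        first, last = first.rsplit(None, 1)
    cleaned["first_name"], cleaned["last_name"] = first, last
    cleaned["state"] = abbreviate_state(record["state"])
    cleaned["postal_code"] = record["postal_code"].strip()
    return cleaned


def split_for_review(records):
    """Normalize every record, then set aside those still missing a postal code."""
    clean, needs_review = [], []
    for record in map(normalize_record, records):
        if record["postal_code"]:
            clean.append(record)
        else:
            needs_review.append(record)
    return clean, needs_review


# Illustrative input only.
raw_records = [
    {"first_name": "Ana Lopez", "last_name": "", "state": "Texas", "postal_code": ""},
    {"first_name": "Ben", "last_name": "Lopez", "state": "ca", "postal_code": " 94105 "},
]
clean_records, review_queue = split_for_review(raw_records)
```

Records in `review_queue` still need a human decision (a missing postal code cannot be guessed), while `clean_records` already follow the new structure.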

Conclusion

Cleaning address data is one of many use cases for a robust data discovery process in any conversion, extraction, or upgrade project. Yes, it can be arduous and time-consuming. However, it brings clarity and creates opportunities to boost the quality of all your data and the efficiency with which they can be leveraged across the firm. Taking the time to do this data reality check advances your business operations and your ability to meet your goals in the long run.

Regularly running a data discovery process is one of the best ways to increase your firm’s ROI on its technology investments and position it to maintain or improve its competitive edge in the marketplace.

Want more information about how data discovery can improve your data quality and enhance your business? Want to see Helm360’s Digital Eye in action? Contact us! We’re happy to answer questions and give you a free demo.

Tags: data discovery, digital eye, qa testing