Data management is crucial for creating an environment where data can be useful across the entire organization. Effective data management minimizes, ideally before they occur, the problems that stem from bad data, such as added friction, poor predictions, and even simple inaccessibility.
Managing data, though, is a labor-intensive activity: It involves cleaning, extracting, integrating, cataloging, labeling, and organizing data, as well as defining and performing the many data-related tasks that often frustrate both data scientists and employees without “data” in their titles.
Artificial intelligence has been applied successfully in thousands of ways, but one of the less visible and less dramatic ones is in improving data management. There are five common data management areas where we see AI playing important roles:
- Classification: Broadly encompasses obtaining, extracting, and structuring data from documents, photos, handwriting, and other media.
- Cataloging: Helping to locate data.
- Quality: Reducing errors in the data.
- Security: Keeping data safe from bad actors and making sure it’s used in accordance with relevant laws, policies, and customs.
- Data integration: Helping to build “master lists” of data, including by merging lists.
Below, we discuss each of these areas in turn. We also describe the vendor landscape and the ways that humans are essential to data management.
AI to the (Partial) Rescue
Technology alone cannot replace good data management processes such as attacking data quality proactively, making sure everyone understands their roles and responsibilities, building organizational structures such as data supply chains, and establishing common definitions of key terms. But AI is a valuable resource that can dramatically improve both productivity and the value companies obtain from their data. Here are the five areas where AI can have the most impact on effective data management in an organization.
Area 1: Classification
Data classification and extraction is a broad area, and it has grown larger still as more media has been digitized and as social media has increasingly centered around images and video. In today’s online settings, moderating content to identify inappropriate postings would not be possible at scale without AI (although many humans are still employed in the field as well). We include in this area classification (Is this hate speech?), identity/entity resolution (Is this a human or a bot, and, if human, which one?), matching (Is the Jane Doe in database A the same human as J.E.