Building a Winning Data Strategy
In one of our recent research projects, a respondent told our team that “data is the food of AI. It’s what AI grows on.” This simple but powerful metaphor illustrates that generating value with data is not about having lots of data on hand; it is about using the right data (at the right time, one might argue), which explains why many organizations are still struggling to become data-driven.
At the same time, data breaches that we too frequently hear about underline the reality that having data is not without risk for organizations — as illustrated by a quick glance at GDPR rulings.1 To paraphrase the Spider-Man comics, with great data comes great responsibility. So how do organizations generate value by leveraging data while avoiding the issues that stem from generating, collecting, and processing data? To shed light on this pressing question, it’s important to discuss the relevance of data governance in data-driven organizations.
Governance: Bridging Strategy and Operations
Broadly speaking, data governance builds on the concepts of governance found in other disciplines, such as management, accounting, and IT. Think of it as a set of practices and guidelines that define the loci of accountability and responsibility related to data within the organization. These guidelines support the organization’s business model through generating and consuming data.
A recurring question I hear from executives is whether data governance takes place at the strategic level. Although it’s possible to approach it this way, it is not an ideal pattern, because it fails to translate strategy into concrete practices and guidelines. Where data governance really takes place is between strategy and the daily management of operations. Data governance should be a bridge: It takes a strategic vision that acknowledges the importance of data for the organization and codifies it into practices and guidelines that support operations, ensuring that products and services are delivered to customers.
By functioning as a bridge within the organizational design, data governance supports the execution of the strategy and enables innovation while providing the necessary safeguards to guarantee the security and confidentiality of information owned and/or processed by the organization. Unfortunately, in many organizations, data governance is implemented late, and efforts to make it a central piece of the organizational puzzle fall short. As a result, it does not support the execution of the strategy and is perceived by operations as an unnecessary nuisance.
Data governance initiatives often begin with a flurry of activity concerning a data governance “plan.” Usually, this takes the form of a document outlining rules and procedures that put a coercive spin on data generation, collection, and use. Often, procedures focus heavily on security policies and can get quite technical. While necessary, these documents can be problematic for four reasons. First, as with many other corporate documents, employees may not read them even though they know they exist. Second, they are highly impersonal. The plurality of data means that it is very difficult to outline all possible scenarios in one document. Third, documents must be maintained and evolve over time with the organization. Unfortunately, these changes are not always done in a timely manner. And finally, documentation can give a false impression that things are taken care of and create a sense of complacency. Things are done (or not) because “it’s in the data governance plan” rather than because they continue to create value for the organization.
To address these issues, we can think about governing data, rather than data governance. The difference is subtle but ties back to placing governance between strategy and operations, because the activities of governing bridge both and evolve in step with them. The literature highlights that governance is not just about rules and procedures. Rather, there are three primary categories of mechanisms that complement one another and that leaders can leverage to govern data:
- Structural mechanisms are the most formal and include elements such as the creation of special roles, official policies, and rules.
- Procedural mechanisms are used by the organization to ensure compliance with structural mechanisms, such as data audits and reviews. (This is where IT plays an important role.)
- Relational mechanisms are the least formal and include key activities such as communication and informal employee mentoring. In a large financial institution where I conducted research, for example, leaders relied heavily on the informal mentoring of junior AI developers to teach them about the ethics of using sensitive data for things such as credit ratings and loan applications.
While organizations must rely on a combination of all three mechanisms to successfully govern data, relational mechanisms are particularly important for creating a data-driven culture that serves the organization’s strategic objectives.
Governing Data Does Not Have To Be Monolithic
Organizations sometimes adopt a one-size-fits-all approach to data governance. Although this is easier to create and maintain, it’s often not ideal. For example, if your organization has different business units that use data with various levels of sensitivity, a monolithic approach based on the requirements of the most sensitive data may not meet the company’s needs; other units may require additional flexibility to support digital innovation. The framework offered by Vijay Khatri and Carol Brown is a useful tool to diagnose or design an agile approach to data governance, recognizing the different needs of organizations.2 It is built on five key dimensions that represent domains for management decision-making where a combination of structural, procedural, and relational mechanisms can be implemented:
- Principles are the foundation of the framework and ask questions related to the role of data as an asset for the organization.
- Quality defines the requirements for data to be usable and the mechanisms in place to assess that those requirements are met.
- Metadata defines the semantics that are crucial for interpreting and using data — for example, those found in a data catalog that data scientists use to work with large data sets hosted on a data lake.
- Accessibility establishes the requirements related to gaining access to data, including security requirements and risk mitigation procedures.
- Life cycle supports the production, retention, and disposal of data on the basis of organizational and/or legal requirements.
Although we can certainly apply this framework to the entire organization, the authors suggest creating a matrix where each dimension is evaluated on the basis of the location of the central point of authority and accountability. For example, an organization may need to comply with regulations related to data life cycle and keep decisions related to this domain in the hands of C-level executives relying more heavily on structural mechanisms (for example, organizational policies). Data quality decisions, on the other hand, may be deferred to business units on the basis of their own requirements.
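One way to make such a matrix concrete is to record it as data that can be reviewed and versioned alongside other organizational artifacts. The sketch below is a minimal illustration in Python and is not part of Khatri and Brown’s framework: the `Locus` and `DomainDecision` names are hypothetical, and only the two example decisions described above are filled in.

```python
from dataclasses import dataclass
from enum import Enum

# The five decision domains of the framework described above.
DOMAINS = ("principles", "quality", "metadata", "accessibility", "life cycle")


class Locus(Enum):
    """Hypothetical labels for where authority and accountability sit."""
    CENTRALIZED = "C-level executives"
    DECENTRALIZED = "business units"


@dataclass(frozen=True)
class DomainDecision:
    locus: Locus
    primary_mechanisms: tuple = ()  # e.g. ("structural",); empty if not yet chosen


# Only the two examples from the text; the rest is left for the organization to decide.
matrix = {
    "life cycle": DomainDecision(Locus.CENTRALIZED, ("structural",)),
    "quality": DomainDecision(Locus.DECENTRALIZED),
}


def undecided_domains(matrix):
    """Return the domains that have no assigned locus of authority yet."""
    return [domain for domain in DOMAINS if domain not in matrix]


print(undecided_domains(matrix))
```

Listing the undecided domains is a simple diagnostic: it shows where accountability for data has not yet been assigned, which is often where governance gaps appear.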
Think About Services
When we think about data, we often think about bits and bytes at rest, stored in dedicated data structures such as databases or, more commonly nowadays, text files (in CSV or JSON format, for instance). Although data storage and management are important, data is often in flux, and interactions are increasingly taking place using services — programmatic interfaces that allow users to access and/or manipulate data over a network.
In data governance, services are useful because they describe the semantics of data and their methods of access, regardless of the structure and the location of the underlying data, and policies such as usage quotas can be enforced programmatically (and customized on a tier basis, for instance) directly within those services. An upside is that this makes it easier to design, scale, and customize services based on the needs of a given business. On the flip side, it’s crucial to harness services in a way that helps mitigate security risks to avoid undesired outcomes, such as a data leak to data consumers (for example, third parties).
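To illustrate what enforcing a usage quota inside a service might look like, here is a minimal Python sketch. The tier names, the limits, and the `QuotaEnforcer` class are assumptions made for this example; a production service would typically also reset counters per time window and persist them outside process memory.

```python
from collections import defaultdict

# Hypothetical tiers and request limits, chosen only for illustration.
TIER_QUOTAS = {"internal": 10_000, "partner": 1_000, "public": 100}


class QuotaExceeded(Exception):
    """Raised when a data consumer has used up its allowance."""


class QuotaEnforcer:
    def __init__(self, quotas):
        self._quotas = quotas
        self._used = defaultdict(int)  # requests served per consumer

    def check(self, consumer_id, tier):
        """Count one request and return how many remain, or raise if none do."""
        limit = self._quotas[tier]
        if self._used[consumer_id] >= limit:
            raise QuotaExceeded(f"{consumer_id} exceeded the {tier} quota of {limit}")
        self._used[consumer_id] += 1
        return limit - self._used[consumer_id]


enforcer = QuotaEnforcer(TIER_QUOTAS)
remaining = enforcer.check("third-party-app", "public")
print(remaining)
```

Because the policy lives in one place, changing a tier’s allowance is a configuration change rather than a rewrite of every consumer’s integration.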
Overall, services force us to think about governing data as a piece of software, which also means that we need to carefully consider how we evaluate the viability of those services — by ensuring that they provide only the data elements they were designed to provide. The good news is that services can easily be tested using continuous automation to ensure continued compliance with data governance practices and guidelines.
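As a rough sketch of what such an automated check could look like, the Python example below verifies that a service response exposes only the fields it was designed to provide. The field names and the `get_customer_profile` function are hypothetical stand-ins; in practice the check would call the real service and run in the organization’s continuous integration pipeline.

```python
# Hypothetical data contract: the only fields this service is designed to expose.
ALLOWED_FIELDS = frozenset({"customer_id", "segment", "signup_year"})


def get_customer_profile(record):
    """Stand-in for a data service that filters a raw record down to its contract."""
    return {key: value for key, value in record.items() if key in ALLOWED_FIELDS}


def unexpected_fields(response, allowed=ALLOWED_FIELDS):
    """Return any fields in the response that the contract does not permit."""
    return set(response) - set(allowed)


def test_profile_exposes_only_contract_fields():
    raw_record = {
        "customer_id": 42,
        "segment": "retail",
        "signup_year": 2019,
        "credit_score": 710,           # sensitive: must not leave the service
        "home_address": "1 Main St.",  # sensitive: must not leave the service
    }
    response = get_customer_profile(raw_record)
    assert unexpected_fields(response) == set()


test_profile_exposes_only_contract_fields()
```

If a later change to the service starts returning an extra field, the check fails before the change reaches data consumers.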
Four Action Items to Govern Your Data
Good governance requires balance and adjustment, and when done well, it can fuel digital innovation without compromising security. Here are four simple action items to help govern data in an organization.
- Start at the top. To govern data, leaders need to acknowledge its strategic relevance. The formulation of AI strategies, or the revamping of existing business strategies in the current crisis context, gives leaders an opportunity to incorporate data into their strategy.
- Think beyond coercion to support data-based innovation. Governing data is often perceived as a way to rein in data within the organization. While that is important, making sure that data governance also supports innovation is equally important.
- Design and assert data governance frequently using frameworks. Delving into the details of daily governance mechanisms is a daunting task. Designing and asserting how data is governed within an organization using simple frameworks such as the one presented here is less tedious, more flexible, and more amenable to the way executives think about their organizations.
- Think beyond data at rest. Data in flux is an important area for data governance in 21st-century organizations. Services are software, but they must still be designed and tested to comply with the data governance practices and guidelines of the organization.
Governing data is not easy, but it is well worth the effort. Not only does it help an organization keep up with the changing legal and ethical landscape of data production and use; it also helps safeguard a precious strategic asset while supporting digital innovation.