Data Stewardship

Why we need trust in data

Our digital identity and integrity are as important as our physical identity and integrity. Data and digital flows are the foundations of our social functioning and the primary source of value in the global digital economy. However, a handful of big tech companies have been able to extract, monopolize and monetize our data over the last two decades.

More than 90% of Western data is hosted in the US. Geopolitical actors increasingly use digital technology and data as tools of power. Legal initiatives such as the US CLOUD Act or China's surveillance legislation give their governments extensive access to data held abroad. A cyber arms race is ongoing, using a wide range of techniques, from blunt nation-state cyber-attacks to sophisticated social manipulation.

In short, digitisation has made us – as individuals, businesses and countries – highly dependent and vulnerable, and has hindered us in capitalizing on our own assets. As a result, individuals, organisations and governments are re-evaluating their external exposure and launching digital sovereignty initiatives to maintain or acquire physical and digital control over their strategic assets, including data, algorithms and critical software.

Around the world, governments are grappling with questions regarding the use of, and who benefits from, public and private data. Data governance and data stewardship are ever-rising topics on the global policy agenda. The challenge is not new: policy makers and societies have long been building governance systems that support public and private benefit from shared resources. What is different, however, is the globalisation of data markets, which has to some degree outpaced the evolution of consumer protections, which are typically designed and enforced by governments. As with most self-regulation, the market is unlikely to enforce good practice on its own when it comes to data.

Just at the point where data is becoming more and more connected, the trust of regulators and data subjects in big tech companies' use of that data is approaching its lowest ebb.

This is the context in which one response of existing and emerging big tech is to find ways to demonstrate trustworthiness in data holding and usage, above and beyond what is enshrined in the GDPR, to give confidence to data subjects, regulators and investors. This is happening in parallel with the wider move towards ESG credentials being auditable and demonstrable.

The right kind of access to data is vital in tackling the big challenges we face in society – from the early detection and treatment of disease to reducing pollution. Data has an important role to play in driving economic growth and supporting the creation of new technologies, products and services. Getting this wrong threatens the supply and availability of data, with ramifications similar to cutting off oil and gas supplies. Perhaps equally significant is the risk of cutting off investment, at a time when the general investment environment is going through a period of introspection, especially in terms of tech and the previously perceived infallibility of “youth culture”.


Our building blocks - Data Foundations

In the traditional fiduciary world, “Trusts” are structures that hold assets, such as a boat or property (or even the entire financial estate of an individual), overseen by an independent group (Trustees) who have legal responsibility for the custody and management of those assets against a set of predefined rules. The Trustees never own the assets; rather, they control them and make decisions in respect of a defined beneficiary or beneficiaries.

When used for governance of data, there is a modern Trust/Commercial hybrid called a “Foundation” which can steward, maintain and manage how data is used and shared — from who is allowed access to it, and under what terms, to who gets to define the terms, and how. In addition, the “Trustees” in a Foundation can then license out the data under commercial arrangements, while maintaining the trust relationships they are obliged to uphold.
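As a minimal sketch only – every name below is hypothetical and not drawn from any real framework – the terms a Foundation's Trustees steward might be modelled as a simple policy record that is checked before any licence is granted, capturing who may access data, for what purpose, and under which commercial terms:

```python
from dataclasses import dataclass

# Hypothetical illustration of a Data Foundation's stewarded access terms.
# Trustees control the data without owning it: requests are approved only
# when they fall within the predefined rules.

@dataclass
class AccessTerms:
    permitted_purposes: set           # e.g. {"public-health-research"}
    licence_fee_per_use: float        # commercial arrangement set by the Trustees
    onward_sharing_allowed: bool = False  # Trustees may forbid redistribution

@dataclass
class AccessRequest:
    requester: str
    purpose: str
    wants_onward_sharing: bool = False

def trustees_approve(terms: AccessTerms, request: AccessRequest) -> bool:
    """Grant access only if the request stays within the stewarded terms."""
    if request.purpose not in terms.permitted_purposes:
        return False
    if request.wants_onward_sharing and not terms.onward_sharing_allowed:
        return False
    return True

terms = AccessTerms(permitted_purposes={"public-health-research"},
                    licence_fee_per_use=250.0)
request = AccessRequest("research-institute", "public-health-research")
print(trustees_approve(terms, request))  # a purpose-limited request is approved
```

The point of the sketch is the separation of roles: the terms are defined once by the Trustees, and every commercial licence is tested against them rather than negotiated ad hoc.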

Existing laws already provide a variety of data rights, but exercising them can demand considerable knowledge, time, and energy. These legislative frameworks also struggle to cope with a data environment where data can be used and re-used in different ways. These changing patterns of data use put pressure on our traditional forms of data governance. 

We are in the process of making subtle changes to our Foundation structures to better recognise Data as an Asset or Property, and to make it much easier to create commercial arrangements, such as licensing fees for data sharing. In conjunction with this we are using work done by organisations such as the Open Data Institute (ODI) to define a Code of Practice that Data Foundations are required to adhere to, which then underpins the use and sharing of data whilst addressing privacy risk.

Reaping the benefits of data and digital technologies will require robust new institutions or frameworks that can allow data sharing - helping develop new data-enabled products and services - while protecting individual rights and freedoms. Data Foundations offer a mechanism to achieve this goal. 


Data Stewardship

To realise the true benefit from the development of Data Foundations we also need trustworthy stewardship. The challenge is finding and maintaining a knowledgeable and independent set of Trustees, as the issues under consideration are complex.

Data stewardship is also one of the narratives that has been developed by the ODI. For the Ada Lovelace Institute, data stewardship is a ‘responsible, rights-preserving and participatory concept [which] aims to unlock the economic and societal value of data, while upholding the rights of individuals and communities to participate in decisions relating to its collection, management and use’. The Royal Society uses data stewardship to describe a body mandated to ensure responsible use of data, and the Mozilla Foundation uses it as a term to describe the act of empowering agents in relation to their own data and guidance toward a societal goal. The Aapti Institute describes data stewardship as a ‘paradigm which explores how the societal value of data can be unlocked while considering what it takes to empower individuals/communities to better negotiate on their data rights’.

The concept of stewardship inherently involves a dynamic relationship between at least two parties; stewarding data relates to the role of ‘looking after it on behalf of others’. The idea of stewardship has been around for a long time and, prior to its application to data, has tended to focus on the control or organisation of companies, land and money. Another widely recognised way of thinking about how data should be used is data ethics. The ODI defines ‘data ethics’ as:

A branch of ethics that evaluates data practices with the potential to adversely impact on people and society – in data collection, sharing and use.[1]

Assurance of data and data practices is vital for organisations seeking to build trust, decrease risk and maximise opportunities. At the ODI we define data assurance as:

The process, or set of processes, that increase confidence that data will meet a specific need, and that organisations collecting, accessing, using and sharing data are doing so in trustworthy ways

We believe the adoption of assurance products and services should both reassure organisations that want to share or reuse data and support better data governance practices, fostering trust and sustainable behaviour change within those organisations or communities. However, the role of assurance products and services in improving data governance and trust across a data ecosystem is neither mature nor widely understood, and we want to develop it to maturity.

It is important to look at data assurance from the perspectives of both the data sharer and the data collector or reuser. Organisations may not be willing to share their data if they are concerned about data quality, how it will be used, its security or what will happen to it after use. Where an organisation wants to limit data use and reuse to a defined purpose, data assurance practices embedded in the Data Foundations Code of Practice can help allay fears of the data being misused, misunderstood or mishandled.

Example: A research institute requests patient data from the local health authority in order to conduct research that may lead to an improvement in public health and a reduction in treatment costs. The health authority needs to be confident that the shared data will be used ethically, will not be further distributed and will be kept secure. Providing assurance that the data practices, skills and tools to be used are suitable for this purpose will raise confidence and help encourage data sharing.
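The example above can be sketched as a minimal assurance check – hypothetical names, not a real ODI tool – in which the health authority releases data only when the requester has evidenced each assurance criterion:

```python
# Hypothetical sketch of the patient-data example: the health authority
# compares a requester's declared, evidenced practices against its
# assurance criteria before agreeing to share.

ASSURANCE_CRITERIA = {
    "ethical_review_passed",   # the data will be used ethically
    "no_onward_distribution",  # the data will not be further distributed
    "secure_storage",          # the data will be kept secure
}

def assurance_gaps(declared_practices: set) -> set:
    """Return the criteria the requester has not yet evidenced."""
    return ASSURANCE_CRITERIA - declared_practices

# The research institute has evidenced ethics review and secure storage,
# but not its controls on onward distribution.
declared = {"ethical_review_passed", "secure_storage"}
gaps = assurance_gaps(declared)
if gaps:
    print("sharing withheld; missing:", sorted(gaps))
else:
    print("assurance met; data sharing can proceed")
```

In this sketch the gap set doubles as actionable feedback: the requester knows exactly which practice to evidence before the request can be reconsidered, which is how assurance raises confidence rather than simply blocking access.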

If configured correctly, Data Foundations can increase access to data while retaining trust; and as the AI revolution accelerates and quantum computing develops, global demand for data stewardship services will grow.

This only takes the principle so far. AI in particular is stretching the boundaries of data protection regulation, but onward access to data – its lifeblood – remains based on trust. In normal operation a data foundation can only act within existing regulation, so there is great mileage in a well-constructed local data foundation being given access to an ICO-run data sandbox, where it could test new regulatory boundaries within the local space. This would solidify the Data Foundation principle while providing access to ground-breaking ideas that could help shape progressive regulation, policy and legislation.



There is a pressing need to explore new data governance models that provide individuals with some control over data and the technologies that use it, and that advance the public good. A collapse of trust in data risks dragging the technologies that rely upon it – including AI – down with it.

We are designing structures and practices that support forward-looking models of data governance and encourage innovation through an iterative process. Clear definitions of new data governance models enable governments, industry and society to pilot their applications and learn through experimentation.

  • Data Foundations offer a flexible and inclusive model that enables government and industry to coevolve regulation and technology, allowing time for concepts of digital rights to mature while immediately strengthening the rights of citizen consumers.
  • Data Foundations can protect the public’s intellectual property rights in data against monopolization by private interests, enabling the sharing of public value.
  • Data Foundations leverage existing legal governance structures, such as trustees’ fiduciary duty, to provide the public with stronger protection against privacy violations and the unethical collection and use of their personal data.

An ecosystem of Data Foundations enables the public to choose a data governance regime that reflects their privacy preferences and supports their values.


[1] The Open Data Institute (2021). The Data Ethics Canvas