The Open Data Directive is here! – Approach to Open Data publication

By Deirdre Lee, Founder and Director of Derilinx

There’s no more putting it off – the Directive on Open Data and the Re-Use of Public Sector Information will be transposed into national law across EU Member States in July 2021. The Open Data Directive focuses on making public-sector and publicly funded data reusable, in order to drive social, economic, political and environmental change.

As a public body, you are undoubtedly collecting or using data as part of your day-to-day operations. Transport operators need to know the location of every public-transport stop, along with real-time information about imminent arrivals and departures. Environmental agencies need up-to-date information about the bathing-water quality of all swimming spots, and need to provide it to the public in a timely fashion.

In addition to using data for the delivery of public services, historical information is also used in analysis and reporting, as well as for future planning. Tourism agencies use statistics about airport passenger volumes, hotel bookings, and pedestrian footfall to understand tourism numbers over a period of time. Local Authorities use housing zoning and development plans as part of green-area and public-amenity planning.

But how can we transition from collecting and using data within our organisations to sharing data publicly for general reuse? Let’s approach Open Data publication in three bite-sized chunks, using the Open Data Directive as guidance:

What data do we manage?

Which data should we publish as Open Data?

How can we make data available as Open Data?

What data do we manage?

A common challenge for public bodies is having a complete view of what data they hold, as datasets can be created and managed in many units across an organisation. While technical teams may process the data, it is often business teams who use and understand it. Carrying out a Data Audit is a great way of getting a better understanding of your organisation’s data holdings. You can start small by compiling a simple list of the datasets a particular unit manages. Depending on your needs, this can be extended to a detailed description of all datasets in every unit. A data audit is a great first opportunity to identify organisational data silos, improve data sharing across the organisation and identify improvements to data-management practices.

An example of a public-data inventory is Ireland’s Public Service Data Catalogue, which catalogues the key data in public bodies, including personal data, business data, and data critical to business decisions or service delivery.
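A first-pass data audit of the kind described above can be as simple as a structured list exported to a spreadsheet. The following Python sketch is one possible shape for such a list, not a prescribed format: the DatasetEntry fields (title, unit, owner, update_frequency, contains_personal_data) and the inventory_to_csv helper are hypothetical names chosen for illustration.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class DatasetEntry:
    """One row in a hypothetical data-audit inventory."""
    title: str
    unit: str                     # unit that manages the dataset
    owner: str                    # contact person or role
    update_frequency: str         # e.g. "daily", "monthly"
    contains_personal_data: bool  # flags datasets unsuitable as Open Data


def inventory_to_csv(entries):
    """Serialise inventory entries to CSV text, one row per dataset."""
    buffer = io.StringIO()
    column_names = [f.name for f in fields(DatasetEntry)]
    writer = csv.DictWriter(buffer, fieldnames=column_names)
    writer.writeheader()
    for entry in entries:
        writer.writerow(asdict(entry))
    return buffer.getvalue()
```

Starting with a handful of columns like these keeps the audit lightweight, and more descriptive fields can be added as the inventory grows from one unit to the whole organisation.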

Which data should we publish as Open Data?

Note that Open Data does not include data that raises concerns about privacy, the protection of personal data, confidentiality, national security, and the like. In an ideal world, all suitable datasets managed by public bodies would be published as Open Data straightaway. In reality, public bodies have limited resources and therefore need a practical approach to publishing Open Data. A Publication Plan can help organisations to prioritise which datasets to focus on first. In our experience, selecting high-value, sustainable datasets leads to the greatest impact.

The Open Data Directive defines High-Value Datasets as those whose reuse is associated with important benefits for society and the economy, and which have high commercial potential. The Directive further highlights the following list of thematic categories of high-value datasets: Geospatial, Earth observation and environment, Meteorological, Statistics, Companies and company ownership, and Mobility.

Within your organisation, High-Value Datasets are those that you know to be in high demand. What data are you repeatedly asked to share – by your colleagues, by other public bodies, by researchers, by the general public? For example, some of the most in-demand data since the pandemic began has been the latest COVID-19 statistics on cases, hospitalisations, deaths and vaccinations.

Ireland's COVID-19 Data Hub

Sustainable Datasets are those that will be easy to maintain in the longer term. Making data available as Open Data isn’t just a once-off action, but an ongoing process to ensure the information is kept up to date. If an individual is responsible for manually publishing data, this can quickly become a bottleneck if that person isn’t available or changes position. Ideally, the publication process is repeatable or, even better, fully automated using a data harvester or API.
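To make the idea of a Publication Plan concrete, here is a minimal Python sketch of one way to order candidate datasets. It rests on assumptions that are not part of the Directive: demand is approximated by a hypothetical requests_per_year count, sustainability by a simple automated flag, and datasets with privacy or security concerns are marked restricted and left out. Putting automated datasets first is just one possible ordering.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A dataset being considered for publication (hypothetical fields)."""
    title: str
    requests_per_year: int  # rough proxy for demand
    automated: bool         # True if publication needs no manual steps
    restricted: bool        # True if privacy, confidentiality or security concerns apply


def publication_order(candidates):
    """Return publishable candidates: automated first, then most requested."""
    publishable = [c for c in candidates if not c.restricted]
    # False sorts before True, so automated datasets (not c.automated == False) lead;
    # negating the request count puts higher demand first within each group.
    return sorted(publishable, key=lambda c: (not c.automated, -c.requests_per_year))
```

An organisation that values demand over ease of maintenance could swap the two elements of the sort key; the point is only that the plan is explicit and repeatable.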

How can we make data available as Open Data?

Now that we have identified which data to publish as Open Data, the question remains: how can we make the information accessible to potential users? The simple answer is to share the data online. A dedicated Data Catalogue enables people to discover the data they are looking for, access the raw information, and use it for their own applications or analysis. For example, the World Bank Water Global Practice and the Energy and Extractives Global Practice have made thousands of datasets available on their Open Data Portals. This data powers numerous applications that support transparency, decision-making, and planning, all of which are also available via the portals.

World Bank Water Data website
World Bank - Energy Data Hub

A comprehensive data catalogue should list datasets in open and machine-readable formats. Additionally, if you want to strengthen the reusability of your data, then making it available through an Application Programming Interface (API) is a great option. APIs allow for programmatic access to data, which, as the Open Data Directive outlines, is particularly important for dynamic data (including environmental, traffic, satellite, meteorological and sensor-generated data), the economic value of which depends on the immediate availability of the information and of regular updates.
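As a rough illustration of what programmatic access can look like, the Python sketch below serves a small dataset as JSON, a machine-readable format, using only the standard library. Everything specific in it is an assumption made for the example: the sample bathing-water readings, the site query parameter, and the names query_readings and ReadingsHandler. A production API would sit in front of a live data source and add documentation, licensing information and versioning.

```python
import json
from http.server import BaseHTTPRequestHandler
from urllib.parse import parse_qs, urlparse

# Hypothetical sample readings; a real service would query a live source.
READINGS = [
    {"site": "North Beach", "sampled": "2021-06-01", "quality": "Excellent"},
    {"site": "South Cove", "sampled": "2021-06-01", "quality": "Good"},
]


def query_readings(url):
    """Return readings matching an optional ?site= filter, as JSON text."""
    params = parse_qs(urlparse(url).query)
    site = params.get("site", [None])[0]
    matches = [r for r in READINGS if site is None or r["site"] == site]
    return json.dumps(matches)


class ReadingsHandler(BaseHTTPRequestHandler):
    """Answer every GET request with the filtered readings as JSON."""

    def do_GET(self):
        body = query_readings(self.path).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


# To try it locally, pass ReadingsHandler to http.server.HTTPServer, e.g.
# HTTPServer(("localhost", 8000), ReadingsHandler).serve_forever()
```

Keeping the filtering logic in a plain function, separate from the HTTP handler, means the same code can also feed a scheduled file export for users who prefer bulk downloads.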

Next Steps

Hopefully this article has given you some tips on how to get started on, or to build upon, your Open Data journey. Start small, pick a couple of sustainable, high-value datasets, and grow from there.

For more information on this topic, you can watch the recording of our webinar How to Publish Open Data Effectively, which is part of the It’s Time to Open webinar series.