A core dataset for libraries

Categories: Data

Libraries Deliver: Ambition for Public Libraries in England 2016-2021 emphasised the importance of data to:

  • identify, understand and meet user needs better
  • support strategic planning
  • develop information they can use for advocacy purposes to secure future investment and encourage increased usage
  • identify areas for improvement
  • manage day to day operations in a more effective and timely way

Therefore, Action 2 in the Ambition Action Plan stated that the Libraries Taskforce would define and publish a core dataset, creating a transparent and automated (where possible) process to gather and share it.

In an earlier blog, we asked for validation of a draft of the core dataset for libraries. This was developed from the views we received from the library sector on what a core dataset should contain, gathered through our data workshops, the consultation on Libraries Deliver: Ambition, and the workshops in the sector forums we ran in January 2017.

Taking on board these comments, we’ve now published the finalised list of contents for the core dataset on GOV.UK.

What’s in the core dataset?

The core dataset has been split into ten sections containing information on:

  1. Individual libraries
  2. Users
  3. Events
  4. Visits
  5. Staff
  6. Volunteers
  7. Public Lending Right (physical and e-books)
  8. Stock
  9. Finance
  10. Impact

Apart from section one, we expect all the information to be collected at a library service level.
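As a rough illustration of that split, here is a minimal Python sketch. The section titles come from the list above; the `Level` enum and the `collection_level` function are hypothetical names used only for this example, not part of any published specification.

```python
from enum import Enum


class Level(Enum):
    """Level at which a section's data is expected to be collected."""
    LIBRARY = "individual library"
    SERVICE = "library service"


# Section titles as listed in the core dataset, in order.
SECTIONS = [
    "Individual libraries",
    "Users",
    "Events",
    "Visits",
    "Staff",
    "Volunteers",
    "Public Lending Right (physical and e-books)",
    "Stock",
    "Finance",
    "Impact",
]


def collection_level(section_number: int) -> Level:
    """Return the expected collection level for a section numbered 1 to 10."""
    if not 1 <= section_number <= len(SECTIONS):
        raise ValueError(f"unknown section number: {section_number}")
    # Only section one is expected per library; the rest are per service.
    return Level.LIBRARY if section_number == 1 else Level.SERVICE


for number, title in enumerate(SECTIONS, start=1):
    print(f"{number}. {title}: {collection_level(number).value}")
```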

You’ll notice the stock and financial data sections are lacking some detail. This is because the survey showed that, while people were keen for the core dataset to include financial and stock data, there is a need for further discussion to confirm exactly which data would be most valuable - there’s always the danger of collecting ‘War and Peace’ on these topics!

While the impact of library services isn’t straightforward to measure, we want to emphasise the need to collect information on it because it’s so important - this came up a lot during the consultation. Hence, it has been given its own section in the core dataset, though we know this will be one of the harder items to gather (see below for more information).

We’ve removed data on non-users and lapsed users from the core dataset because establishing why they aren’t using libraries sits better under our research strand of work. There are already a number of national research projects that touch on this area - such as the DCMS Taking Part survey - and, to support research at a local level, we’ll be running Masterclasses on how to conduct user research and will publish associated guidance on this.

You’ll notice information required for the administration of the Public Lending Right (PLR) scheme is also included. With the extension of PLR to include remote lending of e-books earlier this year (in the Digital Economy Act 2017), it makes sense to include the data required by the British Library (as administrator of PLR) for both physical and e-books in the core dataset. At present, because PLR has specific reporting requirements, we have separated this out under its own heading, but it might be incorporated into the more general stock heading over time.

Next steps

We want the core dataset to be something which all library services will be encouraged to collect, use and publish. A consistent dataset can be used to help inform and improve local library service delivery, as well as being used for advocacy purposes at a local and national level (when aggregated). There may, of course, also be other data which authorities choose to collect in addition to this for their own local purposes.
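To show why consistency matters for aggregation, here is a minimal Python sketch that sums service-level returns into a national total. The service names, field names (`visits`, `events`) and figures are all made up for illustration, and the `aggregate` function is a hypothetical helper rather than an agreed method.

```python
from collections import Counter

# Illustrative, made-up returns: service names, field names and figures
# are hypothetical and not drawn from any real library service.
service_returns = {
    "Service A": {"visits": 1200, "events": 15},
    "Service B": {"visits": 800, "events": 9},
}


def aggregate(returns):
    """Sum each numeric field across all library services."""
    totals = Counter()
    for figures in returns.values():
        # Counter.update adds the values for matching field names.
        totals.update(figures)
    return dict(totals)


national = aggregate(service_returns)
print(national)
```

The sum only works because every service reports the same field names, which is the point of agreeing a core dataset first.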

laptop showing data dashboard
Showing what can be done with library data: Photo credit: Libraries Taskforce

We know that more work is needed (at local and national level) for library services to be able to collect and share some of the items in the core dataset. We’ll be taking this forward in a number of ways:

1. Supporting library services to publish their own data

We are already seeing some library services starting to publish elements of the core dataset, as Newcastle Libraries are doing on their GitHub page. The Taskforce will work with library services to try to break down the barriers they face in publishing their own data in line with the core dataset.

2. Pilots

The Taskforce will run a small number of pilots to test out data collection for items in the core dataset which are harder to collect or which currently lack a straightforward means of collection. Let us know if you’d like to be involved in them or if you’d like to share how you’re already collecting this data.

3. Impact

The impact section of the core dataset will be one of the more difficult elements to get right. We want to show the impact of libraries around the 7 Outcomes described in Libraries Deliver: Ambition: cultural and creative enrichment, increased reading and literacy, improved digital access and literacy, helping everyone to achieve their full potential, healthier and happier lives, greater prosperity and stronger, more resilient communities. One way to do this is via outcomes frameworks, for example, the Reading Agency’s Reading Outcomes Framework Toolkit, or through local frameworks that have been developed (eg the Norfolk and Norwich Health Outcomes Framework). The research priorities of the Taskforce include looking to fill in the gaps with additional outcomes frameworks. If you use any frameworks (either local or national ones) which could be adopted by others, please let us know.

4. Development of existing systems

We need to see how existing (or new) systems can be used to streamline the process of data collection and publication. We’ll be speaking to suppliers to see how they can help with automation of collection, aggregation and publication of library data. We’ll also be talking to CIPFA to see how their annual data report can support publication of elements of the core dataset.

5. Using data innovatively

The potential for what can be done with published data is vast. We want to work with library services and other organisations to show the potential of what can be achieved, for example, through data visualisation techniques.

6. General Data Protection Regulation (GDPR)

The European GDPR will be enforced from 25 May 2018 following a 2-year transition period. It contains a right to greater transparency in data processing, a right for people not to be subject to automated decision making and profiling, and a right to transfer data between suppliers. Library services will need to ensure they are involved in the discussions within their local authority on responding to the GDPR.

If you’d like to be involved in any particular work streams, please let us know by emailing

Sharing and comments


  1. Comment by librariesmatter posted on

    Some basic questions:

    Doesn’t publication of library service performance need to be mandatory given that public libraries are a statutory service? Otherwise it is very likely that poorly performing library services won’t publish anything. Timely reporting and provision of comparative data is also essential (cf CIPFA’s current abysmal performance which councils and the DDCMS seem happy to accept). What is the Taskforce’s position on these points?

    Why is it efficient to set up another data collection stream separate from the existing CIPFA public library data collection? Why not one improved data collection and reporting process?

    Lending is still central to what public libraries do. What is the reason for not collecting lending data? The decline in lending isn’t a good reason for non-collection!

    The Taskforce not knowing by now what data is required about finance and stock is disappointing. What is the timescale for the data collection project?

    When will the public see tangible improvements to public libraries through better reporting of public library performance?

    • Replies to librariesmatter

      Comment by Former public librarian posted on

      Very well said indeed. Excellent points.

  2. Comment by Matthew Kerr posted on

    To make best use of this data, it needs to be formatted in a way that the different data sets can easily be linked. Presenting the linked data will then allow the real learning about how people are benefitting from library services

  3. Comment by Trevor Craig posted on

    Will you release the other data you collected? Not just basic details of all the libraries.