National Pupil Data releases March 2012 – May 2017: summary review

National Pupil Data releases Jan 2017 – May 2017

We campaign to make your school data safe, to stop it being sent to hundreds of companies, researchers and press without your consent, and to start better, more transparent processes for the children whose school life the data come from. Our review of the latest releases of confidential data by the Department for Education shows that, since 2012, 86% were of individual level, identifiable and sensitive or highly sensitive data. Only one release from the National Pupil Database (NPD) between January 2017 and May 2017 was of aggregated data, and even that may have been identifying.

Of the 121 requests in these 5 months, Jan – May 2017, published in the Third Party Register:

  • 92 were approved and received data.
  • 21 requests were not pursued to completion (seven times as many in these 5 months as the total previously published as not pursued in the 4.5 years March 2012 – December 2016).
  • 8 were rejected.
  • 161 more were in-progress “open cases,” pending completion.
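
As a quick check on the breakdown above, the three completed outcomes can be summed and expressed as shares of the 121 requests. This is a minimal sketch using only the counts listed; the variable and function names are ours, for illustration, and the 161 open cases are left out because they sit outside the 121.

```python
# Counts as reported in the list above for Jan - May 2017.
outcomes = {"approved": 92, "not pursued": 21, "rejected": 8}
total_requests = 121

# The three completed outcomes should account for every request.
assert sum(outcomes.values()) == total_requests


def share(count, total):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Hypothetical helper output: each outcome as a share of all 121 requests.
shares = {name: share(count, total_requests) for name, count in outcomes.items()}

for name, pct in shares.items():
    print(f"{name}: {pct}%")
```

On these figures, roughly three in four requests in the period were approved and received data.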

Tier 1 data are the most identifying: highly sensitive and individual level. Tier 4 data are also individual level and identifiable, but less sensitive. See the NPD User Guide (pp 19-21).

The data items released range from name (see case studies), date of birth and address, through special educational needs and health-related data, to lifetime attainment, reasons for absence, and reasons for exclusion such as theft and violence. They can be seen in the summary (pp 26-35). Every one of these releases is of identifying data. [For more detail see our previous review.] The DfE relies on the receiving organisation not publishing the pupil-level data, but hands out pupil data without small numbers suppression.

These figures exclude releases of open data, anonymous data, or statistics.

These figures also exclude access by the Home Office, police, internal DfE users and any other government body, which is still not published. We keep being told it will be, hopefully by the end of the year. That would help restore public, professional and parliamentary trust that the data are being used for what we are told they are.


Total releases March 2012 – May 2017

The total of 92 new approved releases in the first five months of 2017 brings the releases of identifiable pupil data to third parties to around 1,000 since March 2012.

Between March 2012 and May 2017, only 30 releases have been of aggregated data. A further 116 requests were open in May 2017, pending approval by the Data Management Advisory Panel (DMAP).

Of the known classifications, that means 86% are individual level, identifiable and sensitive or highly sensitive. Only between 3% and, at the outside, 10% of the data releases are aggregated, and those can also be identifying. Bear in mind that this is no indication of the volume of data released to each user; we have been told a release can cover the entire population-wide dataset of an ever-growing 23 million individuals.
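
The lower figure of 3% can be reproduced from the numbers already quoted. This is a rough sketch that assumes the approximate total of "around 1,000" releases as the denominator, so the result is indicative only; the variable names are ours.

```python
# 30 aggregated releases, as stated above for March 2012 - May 2017.
aggregated_releases = 30

# Assumption: "around 1,000" releases in total, taken as exactly 1,000 here.
approx_total_releases = 1000

# Aggregated releases as a percentage of all releases.
aggregated_share = 100 * aggregated_releases / approx_total_releases

print(f"Aggregated releases: about {aggregated_share:.0f}% of the total")
```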


Receiving organisation types Jan 2017 – May 2017

In Jan – May 2017, around 42% of requests came from universities and 18% from commercial organisations, with 13% each from charities and think tanks. Commercial use appears to be down, from a cumulative 28% in March 2012 – Dec 2016 to around 18% in these 5 months.

Many uses since 2012 are for direct interventions with children and families. [see end notes].

In answer to recent parliamentary question 106644, the Minister for School Standards, Nick Gibb, suggests that some of these users of identifying data are sharing that data onwards. We have no oversight of who receives it, because this is not published and there is no regular audit process for users of national pupil data.

In 2013 a DfE email mentioned The Telegraph’s “cast iron assurances” that no child would be identified, meaning the paper would not publish identifying data. In effect, the Department outsourced the management of children’s privacy to ten Telegraph journalists. They were given identifiable SEN status, the Free School Meals indicator, ethnicity and other personal data, plus entire lifetime attainment, school locator and other variables, for millions of children.

In addition to The Telegraph, recipients among the over 1,000 releases of identifying individual pupil-level data include BBC Newsnight, The Times, the private company Tutor Hunt, and data consultancies.

 


Total Rejections

The register records 23 rejections. The reply to the parliamentary question states that more have been refused.

Before December 2016 only 15 rejected requests for identifying data had ever been published. In this 5 month round alone, 8 rejections were recorded.

These included a UCL project to inform operational predictive policing, and mailshot surveys by the Department for Transport.

Also rejected were a statistical study by the University of Edinburgh of how month of birth relates to test outcomes through education in the UK, and a project examining the impact of Philosophy for Children on critical thinking, creativity and attainment.

The decision-making process is opaque. There is no clarity on why these academic projects were rejected when, for example, past surveys on health in 15 year olds were approved, why commercial projects are viewed more favourably, or why they failed when the latest release of identifiable data to TV journalists was approved for a purpose as vague as this:

“To cover the English school system for BBC Newsnight, a BBC2 TV programme, and its associated online outlets. To look at progress made by the current – and previous – administration in reducing geographical, social and economic disparities in education performance.”

The BBC Newsnight journalist previously received pupil level, Tier 1 data in 2014.


Releases missing from this third-party transparency register

The third-party register (albeit now broken into pre- and post-2017) includes the disclosures from March 2012 through to May 2017, but it has left out the releases to police, and to the Home Office Removals Casework Team for the purposes of immigration enforcement, confirmed via FOI.

We know that up to 1,500 individual pupils’ home address and school address may be made available to the Home Office on a monthly basis in an agreement in place since July 2015.

We look forward to the inclusion of those numbers in regular statistical publications, as promised by the Department in November 2016.


Footnotes:

While we have confidence in our figures using the available information, there are discrepancies between what is published in the third-party register and the totals published in parliamentary questions. The judgement of which category a data recipient falls into is ours alone, based on institution name; where that was not immediately clear, we used online information from Companies House (beta search), the Charity Commission, and the ICO registry of data controllers and processors.

i. Our previous review March 2012- Dec 2016 [link]

See page 5 of our briefing from earlier this year, for details on identifying data and types.

ii. Raw data source

These are only a breakdown of the external releases published in the Third Party Register. The answer to a parliamentary question in January 2017 stated that over 1,700 unique releases were approved and that “these include both the Department’s and external requests.”

‘National pupil database third-party requests’ has 3 tabs:

  • January 2017 to May 2017: third-party requests for data from the national pupil database completed during this period
  • open cases: open third-party requests which have received no data up to 31 May 2017
  • live cases April 12 to December 16: third-party requests within terms of agreement up to 31 March 2017

Previous versions of this register have been sent to the UK Government Web Archive.

iii. The Department may have placed the collection of census data on a statutory footing, but it cannot ignore its obligations under Section 7 of the DPA. It currently relies on s33 exemptions. Yet the school census data are now used for operational purposes by the Home Office, in the Troubled Families and NCS programmes, and in direct and covert behavioural research interventions. Named and individual pupil-level data make up over 86% of releases, including provision to companies that use the data in product development. All of this surely leaves the reliance on “research” under s33 open to challenge.

Access to copies of these data by the pupils and parents they are about is refused by the Department, even though Subject Access is good standard practice, recommended by the ICO.


Case Studies

Further detail of named data use in practice:

The named-data case studies that follow are selected not to show harm, but simply to show uses of the names.

MPs in the House of Commons were assured on the changes to the “Central Pupil Database” in 2002 by the then Minister of State for Education and Skills that “The Department has no interest in the identity of individual pupils as such, and will be using the database solely for statistical purposes, with only technical staff directly engaged in the data collation process having access to pupil names.”

The National Pupil Database is used as a population-wide source of named individuals who may be contacted for surveys, have their behaviour reviewed as the result of an intervention, or be used as control groups, all without their knowing that this is where their personal data came from.

While the published work from users of the data is intended not to reveal pupil names, the Department for Education appears to overlook that it is its own release of identifiable data to the third party that breaches the pupil’s expectation under the common law duty of confidentiality.

Whether the NPD gives the third party the name, or enough information to match to a name the third party already holds, the result for the individual is the same. A third party to whom they did not give consent has a whole lot of additional data stored against their name:

“This will involve matching the KS2 data to information on examination performance held within AQA’ databases, using a candidates’ name and date of birth. Again, the data will be reported at the overall cohort level so no individuals will be identifiable.”

Named data are given to charities: for example the NPD was used to give additional information on individuals to the Prince’s Trust as well as similar groups (the ‘control’ group). Names, date of birth and postcode were used. The Trust “hoped to send DfE a list of young people, and for DfE to return a list of their data (FSM, SEN, absences, exclusions, behaviours, attainment (at KS3 and KS4), alongside an equivalent control group.”

NPD data are used in surveys. What About Youth was created in 2014 by extracting named pupil data and matching it with health data, for an intensely detailed social survey questionnaire mailed to the homes of 15 year old pupils, almost 300,000 of them according to the published report.

Similarly, the IoE used it in 2014 to get all Year 7 pupils’ data and send them named maths tests for completion.

Consider the behavioural research targeting of children using data from the NPD. These projects are very explicit in their goal of individual interventions. In one randomised controlled trial, parents were selected to receive additional text messages about their child’s attainment, and the children were measured for the impact on behaviour and attainment.

The Behavioural Insights research [p16] used the children’s attainment data from the NPD in ongoing surveillance with extensive other data sources, in a longer term project, in which interventions will be evaluated for their effect on social action. We wonder how they measure any negative impact?