Never Mind The Data, Feel The Noise (or, seek the signal, young Jedi)

Everyone in education loves data.

image
This is the only time it is correct to use the word "Data" in the singular...

Or at least claims to. One sometimes wonders what would happen to the UK education system if a computer virus disabled every Excel spreadsheet overnight — h’mmm, perhaps someone should get in touch with those nice hacker people at Anonymous . . .

However, I digress. I wanted to share a recent epiphany that I had about data, particularly educational data. Perhaps it’s not much of an epiphany, but I’ve started so I’ll finish.

It came when I was listening to an interview on the evergreen The Jodcast (a podcast produced by the Jodrell Bank Radio Observatory). Dr Alan Duffy was talking about some of the new technologies that need to be invented in order to run the new massive Square Kilometre Array radio telescope (due to begin observing in 2018):

And then we have to deal with some of the data rates . . . essentially we recreate all of the information that exists on the internet today, and we do that every year without fail, it just keeps pouring off the instrument. And what you’re looking for is the proverbial needle in the haystack . . . how do you pick out the signal that you’re interested in from that amount of data?
The Jodcast, October 2014, 18:00 – 21:00 min approximately [emphasis added]

The realisation that hit me was: it isn’t the data that should be centre stage — it’s the signal that’s contained within that data. And that signal can be as hard to find as the proverbial needle in a haystack, even without data volumes that are multiples of the 2014 Internet.

A simple example from the history of science: Edwin Hubble’s famous graph from 1929 that was one of the first pieces of evidence that we exist in an expanding universe. The data are the difficult and painstaking measurements made by Hubble and his colleague Vesto Slipher that are plotted as small circles on the graph.

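[Image: Hubble’s 1929 graph, with the measurements plotted as small circles and a line of best fit drawn through them]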

The signal is the line of best fit that makes sense of the data by suggesting a possible relationship between the variables. Now, as you can see, not all the points lie on, or even close to, the line of best fit. This is because of noise: random fluctuations that affect any measurement process. Because Hubble and Slipher were pushing the envelope of available technology at the time, their measurements were unavoidably ‘noisy’, but they were still able to extract a signal, and that signal has been both confirmed and honed over the years.
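For the curious, here is a minimal sketch of what “extracting a signal” can look like in practice, assuming the simplest possible case: a straight-line relationship buried in random noise. The numbers are invented purely for illustration (they are not Hubble’s measurements), and the function names fit_line and make_noisy_data are my own hypothetical helpers rather than anything from a standard library.

```python
import random


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x about its mean, and how x and y vary together.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def make_noisy_data(true_slope, noise_sd, n_points, seed=0):
    """Synthetic (invented) data: a straight line through the origin plus Gaussian noise."""
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    xs = [rng.uniform(0.0, 2.0) for _ in range(n_points)]
    ys = [true_slope * x + rng.gauss(0.0, noise_sd) for x in xs]
    return xs, ys


# Illustrative values only: the "true" signal is a slope of 5, blurred by noise.
xs, ys = make_noisy_data(true_slope=5.0, noise_sd=2.0, n_points=24)
slope, intercept = fit_line(xs, ys)
print(f"Estimated slope: {slope:.2f}, intercept: {intercept:.2f}")
```

The individual points scatter widely, yet the fitted slope should land somewhere near the value used to generate them. How near depends on how noisy the measurements are and how many of them there are, which is rather the point.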

In my experience, when the dread phrase “let’s look at the data” is uttered in education, the “search for a signal” barely extends beyond simplistic numerical comparisons: increase=doubleplus good, decrease=doubledoubleplus ungood.

The way we currently use data in schools reminds me of SF author William Gibson’s coining of the term cyberspace (way back in the pre-internet 1980s) as the

consensual hallucination experienced daily by billions of legitimate operators . . . a graphic representation of data abstracted from the banks of every computer in the human system
— William Gibson, Neuromancer (1984)

In my opinion, almost the whole statistical shebang associated with UK education, from the precipitous data-mountains of the likes of RAISEOnline (TM) to the humblest tracking spreadsheet for a department of one, is actually nothing more than a ‘consensual hallucination’.

The numbers, levels and grades mean something because we say they mean something. And sometimes, it is true, they can tell a story.

Let’s say a student has variable test scores in one subject over a few months: does this tell us something about the child’s actual learning, or about possible inconsistencies in the department’s assessment regime, or about the child’s teachers?

My point is that WE DON’T KNOW without cross referencing other sources of information and using — wait for it — professional judgement.

I believe that the search for a signal should be central to any examination of data, and that this is best done with a human brain through the lens of professional experience. And, given the inevitability of noise and uncertainty in any measurement process, with a generous number of grains of statistical salt.

The Joy of Quotation Marks

A colleague of experimental psychologist Steven Pinker once joked that verbs were ‘his little friends’ as Pinker believed that the way they are used can give genuine insight into the hidden machinery of cognitive processes.

You know who my ‘little friends’ are? Punctuation marks. I think that they can often give the game away. Take this doozy:

The best secondary schools trusted the incoming ‘levels’ achieved by pupils in primary school as a starting point . . .
–OFSTED, Maintaining Curiosity in Science, November 2013, p.42

The writer asks schools to trust things called “‘levels’”, which the writer has deliberately placed in quotation marks. H’mmm, interesting. Now why would they choose to do that?

By my count, there are five reasons to use quotation marks:

1. Reported speech — this instance doesn’t seem to fit that usage.

2. When coining a new word or phrase — again, this usage is unlikely in this instance.

3. When referring to a word as a word — again, it doesn’t seem to be the intention here.

4. To indicate the title of a book or article — this is definitely not the case here.

By a process of elimination, this seems to leave only one plausible reason for the writer to choose to use quotation marks:

5. To imply that the quoted word or phrase is dubious.

So let’s be clear here: the writer is asking schools to trust things called “‘levels’” that he or she apparently considers dubious enough to wrap in ironic quotation marks.

In this paragraph, Ofsted are urging schools to trust what Ofsted themselves (going by their use of punctuation, at least) consider untrustworthy. What are they going to ask us to do next? Square the circle? Cut down the largest tree in the forest with a herring?

Now, where else have I seen ‘levels’ in quotation marks recently? Oh yes . . .

As part of our reforms to the national curriculum, the current system of ‘levels’ used to report children’s attainment and progress will be removed. It will not be replaced.
–DfE, June 2013

Let me summarise: in June 2013, the DfE tells us that ‘levels’ are gone, but then in November 2013, Ofsted admonishes us for not taking ‘levels’ seriously enough.

Sigh. Education: does thy right hand know what thy left hand doeth? Ever?

As a teacher, my way forward is crystal clear: it’s time to get busy cutting down the largest tree in the forest. Now, where did I put that herring . . .