It’s crazy how much TfL can learn about us from our mobile data (James O’Malley)

Big Brother meets Big Data
It’s a well-known maxim in the tech industry that if you’re not paying for the product, then you are the product. We get to use incredible services like Gmail, Facebook and Twitter1 for free – and in return, the big tech companies sell access to our eyeballs to advertisers2.
But this isn’t always the case. Sometimes, even when we pay for a service, we’re also the product being sold.
For example, something that EE, O2 and Vodafone all do, but don’t really like to shout about, is sell anonymised, aggregated data on our physical movements to local authorities, transit agencies and any other companies with a big enough chequebook.
And that’s why today I’m going to tell you about some of the really mad things that Transport for London (TfL) can work out about us by using our location data, provided by the O2 mobile network.
Using the Freedom of Information Act, I’ve managed to obtain the Data Protection Impact Assessment and the Statement of Work for TfL’s Project EDMOND – which stands for “Estimating Demand from Mobile Network Data”3.
That’s right, this week’s newsletter is dangerously close to being actual reporting instead of just my usual bloviating. And having now fallen down the rabbit-hole digging into it, I’m amazed by the quality of information it gives transport planners and policy makers. And honestly, I’m a little freaked out.
So let’s dive in and discover it collectively.
Careful now
The way EDMOND works is very clever. TfL isn’t actually tracking all of our phones all of the time, presumably because it knows that to do so would be hugely controversial.
So instead, it contracts with O24 to license data over shorter periods of time. For example, in 2023, it took data from ‘up to’ 40 normal weekdays between the start of April and the end of June, when nothing unusual was going on like school holidays or bank holidays5.
This is an enormous dataset, with potentially up to 25 million phones included in it6, but it still doesn’t include everyone in London, because some people use other networks like EE, Vodafone, and so on.
So it’s important to understand that EDMOND isn’t just a pile of data – it’s a model, where TfL has taken the data from O2 and done some clever maths to scale it up to estimate the movements of everyone in London over the age of 12.
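To give a rough feel for what “scaling up” can mean, here is a minimal sketch in Python. It is not TfL’s method, which isn’t described here. It assumes the simplest possible approach: for each home area, divide the resident population by the number of phones seen there to get a weight, then multiply the observed trips by that weight. The function names, area labels and every number are made up for illustration.

```python
def expansion_factors(observed_devices, population):
    """Per-area weight: residents divided by devices observed living there."""
    return {
        area: population[area] / devices
        for area, devices in observed_devices.items()
        if devices > 0  # an area with no devices cannot be scaled
    }


def scale_trips(observed_trips, factors):
    """Multiply each observed trip count by the weight for its home area."""
    return {
        (home, dest): count * factors[home]
        for (home, dest), count in observed_trips.items()
        if home in factors
    }


# Hypothetical figures for two imaginary areas, "A" and "B".
observed_devices = {"A": 2000, "B": 1500}   # phones on one network
population = {"A": 8000, "B": 4500}         # residents aged over 12
observed_trips = {("A", "B"): 300, ("B", "A"): 120}

factors = expansion_factors(observed_devices, population)
estimated_trips = scale_trips(observed_trips, factors)
print(estimated_trips)
```

A real model would need to do much more than this, such as correcting for the ways one network’s customers differ from the wider population, but the basic idea of weighting a sample up to a known total is the same.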
There’s also the elephant in the room. Though it might be surprising to learn that O2 is selling data insights on its users, it’s not selling personal data7. What’s being sold by O2 and licensed by TfL is aggregated, anonymised data.
This means TfL can’t see the movements of individual people, and of course everything is completely GDPR-compliant and above board – as you’d expect for a major corporation and a transport agency.
In fact, according to the 2018 Travel in London report, any time the data suggested there were fewer than ten phones in a given statistical area, the data was automatically excluded, so as to avoid inadvertently unmasking people based on their metadata.
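The fewer-than-ten rule is a form of small-count suppression, and it is simple enough to sketch in a few lines of Python. Only the threshold of ten comes from the report as described above. The function name, the area labels and the counts are invented, and this is an illustration of the idea rather than O2’s or TfL’s actual code.

```python
MIN_PHONES = 10  # areas with fewer phones than this are dropped


def suppress_small_counts(counts, threshold=MIN_PHONES):
    """Return only the areas whose phone count meets the threshold."""
    return {area: n for area, n in counts.items() if n >= threshold}


# Hypothetical phone counts for three imaginary areas.
counts = {"Area 1": 250, "Area 2": 9, "Area 3": 10}

published = suppress_small_counts(counts)
print(published)
```

Here “Area 2” disappears from the output because nine phones is below the threshold, while “Area 3”, with exactly ten, is kept.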
So to be absolutely clear, there’s no big scandal here8. In fact, using this sort of data is increasingly routine for local authorities and others9, to the extent that O2 even has a brand name for this line of its business – “O2 Motion”.
But that doesn’t mean what’s going on isn’t interesting. In fact, I’m willing to bet that most people outside of the mobile industry are completely unaware their movement data is being used in this way.
What TfL knows
Now let’s get to the good stuff. What does all of this data do for TfL, and what data do they have to play with?
Because of the aforementioned privacy restrictions, they don’t simply get dots on a map showing them where everyone was. Instead, the data is broken down into hundreds of “Middle Layer Super Output Areas (MSOAs)” – this is a statistical standard that divides up the country into groups of between 2,000 and 6,000 homes.
Here’s a map showing London’s MSOAs:
Looking at this, you can see why data at this level might be useful.
Using the aggregated data from O2, TfL can see which areas of London people are travelling from and where they’re travelling to – which is exactly the sort of information you might want if you were, for example, planning where to run buses or impose an Ultra-Low Emissions Zone that disincentivises car use.
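Data about where people travel from and to is usually organised as an origin-destination table: one count for each pair of areas. As a hedged sketch of the idea (the format of the data TfL actually receives isn’t described here), the Python below counts trips between pairs of areas. The area labels and the trips are made up.

```python
from collections import Counter


def origin_destination_counts(trips):
    """Count how many trips were made between each (origin, destination) pair."""
    return Counter((origin, destination) for origin, destination in trips)


# Hypothetical trips between three imaginary areas.
trips = [
    ("Area A", "Area B"),
    ("Area A", "Area B"),
    ("Area A", "Area C"),
    ("Area B", "Area A"),
]

od_counts = origin_destination_counts(trips)

# The busiest flow is a natural place to start when deciding where to run a bus.
busiest_pair, busiest_count = od_counts.most_common(1)[0]
print(busiest_pair, busiest_count)
```

With hundreds of MSOAs, the real table would have a very large number of possible pairs, which is one reason aggregated counts like this are useful to planners even without any information about individuals.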