London Underground Wifi Tracking: Here's Everything We Learned From TfL's Official Report

By James O'Malley

Earlier this year, Gizmodo UK scored a scoop when we exclusively revealed some of the findings from last year’s wifi tracking trial, in which Transport for London analysed wifi data picked up from our phones as we travelled on the London Underground - and was able to track our movements across the tube network.

Many months on and TfL has decided to share more details on what it learned during the trial in an official capacity - and today marks the publication of a report that has been led by TfL’s Chief Data Officer, Lauren Sager Weinstein.

Why analyse wifi data in this way? “There are a lot of questions we've been mulling over for some time about how we can run a transport network as efficiently as we can, how [we] can plan for the future and how can we give customers more information about travel”, she says.

She told me that the idea of analysing wifi data to gain customer insights has its origins in the original Oyster card system. In 2005, she first started to scrutinise the Oyster data using data science - at a time when TfL was solely reliant on paper surveys and on stopping passengers at stations to ask them about their journeys.

“As the world of data has exploded and as computer power has [increased] we've built up a practice of looking at customer patterns and movements through the ticketing data, and [we] said there's a gap here, right?”

The gap she noticed was that Oyster data didn’t paint a complete picture. “You touch in and you touch out... but that meant there was a big question mark about travel within central London in particular when you have multiple different ways of travelling around the network”, she explains. “So we thought: Is there some potential here to use this as a data source when we have the wifi on the tube to take patterns and look at the patterns from this dataset as well?”

Imagine trying to make sense of 500m lines like this.

As it turns out, using this wifi data TfL has been able to learn an awful lot. In the month in which the trial took place last year, it logged more than 500m (anonymised) wifi connection requests from around 5.6m devices. That’s a lot of data! So what did they learn? Read on to find out more.

Tracking Journeys Around The Network

(Insert ultra-niche Unown reference here.)

Just as we revealed earlier in the year, one of the most eye-catching aspects was that it enables TfL to essentially fill in the gaps - and figure out how we got between A and B. To show just how useful this is, Lauren explains that they took one of the most complex routes: Kings Cross to Waterloo. There’s a myriad of different ways to get between the two - and using the wifi data they can now tell that 32% of passengers travel via Oxford Circus, and 26.7% go via Green Park. Perhaps most bewilderingly, apparently 1.2% of passengers during the trial chose to go to Baker Street, take the Bakerloo to Oxford Circus, then the Victoria to Green Park… before finally taking the Jubilee to Waterloo. If that’s you then please do write in as I’d like to know what is going on inside your head.

And yes, just in case you’re wondering, as an expert tube traveller Lauren says that she would take the route via Oxford Circus - especially because she knows that there is a relatively short interchange there.
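For the curious, route percentages like the ones above boil down to counting how often each sequence of interchange stations turns up. Here is a minimal sketch, assuming each journey has already been reduced to an ordered list of stations; the `route_shares` function and the four sample journeys are made up for illustration and are not TfL's method or data.

```python
from collections import Counter


def route_shares(journeys):
    """Return the percentage of journeys that followed each distinct route.

    `journeys` is a list of station sequences (hypothetical input format).
    """
    counts = Counter(tuple(journey) for journey in journeys)
    total = sum(counts.values())
    return {route: 100 * n / total for route, n in counts.items()}


# Invented sample: four journeys from Kings Cross to Waterloo.
sample = [
    ["Kings Cross", "Oxford Circus", "Waterloo"],
    ["Kings Cross", "Oxford Circus", "Waterloo"],
    ["Kings Cross", "Oxford Circus", "Waterloo"],
    ["Kings Cross", "Green Park", "Waterloo"],
]

for route, share in route_shares(sample).items():
    print(" > ".join(route), f"{share:.1f}%")
```

The hard part in practice is the step this sketch skips: turning hundreds of millions of raw connection requests into clean station sequences in the first place.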

Seeing How Delays Affect Demand

Knowing how people behave is one thing: But what about when something goes wrong? When the Waterloo and City Line went down on the morning of 9th December, thousands of commuters were suddenly shaken out of the sort of robotic trance you go into when you repeat a journey so many times. Instead, they had to get to work by alternative means - but which route did they choose?

The above diagram shows how adjacent routes were impacted - and how the increased demand rippled out across the tube network as a whole. In terms of actual numbers, this meant that 4000 people decided to take the Jubilee, while 3000 took the Bakerloo to Embankment. By the time everyone got to Embankment, the tube network had to cope with an extra 6000 people taking the Circle and District lines eastbound to Monument. The TfL report reckons that translates to approximately 150 extra people on board each train arriving. At what was already rush hour. Yikes.
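As a back-of-envelope check, 6000 extra passengers at roughly 150 per train works out to about 40 trains sharing the load. The sketch below just does that division; the function name is hypothetical, and the even spread across trains is an assumption rather than something the report states.

```python
def implied_train_count(extra_passengers, extra_per_train):
    """Trains needed to absorb the extra demand, assuming an even spread."""
    if extra_per_train <= 0:
        raise ValueError("extra_per_train must be positive")
    return extra_passengers / extra_per_train


# Figures quoted from the report: 6000 extra people, about 150 per train.
print(implied_train_count(6000, 150))  # 40.0
```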

It’s easy to imagine how, if the wifi tracking system was live, it could be used to monitor demand in real time, and could instruct frontline staff in stations on the best places to redirect passengers to.

“You're really able to see with the data itself what was otherwise reliant on your eyes and operational teams feeding back”, Lauren explains. Taking this data-driven approach, in other words, is going to be much more effective.

Moving Around Stations

We also revealed earlier this year that wifi data could be used to make heatmaps of which parts of stations are particularly busy. TfL’s official report makes the same point - and also points to how the data is so fine-grained that it is possible to see the routes people take between different platforms and so on.

The above example, showing Euston (just as our FOI’d images did), shows two possible routes between the northbound Northern Line platform and southbound Victoria. Apparently in this case, 68% of passengers take the shortest possible route - up a set of stairs - which takes between 1 and 3 minutes, whereas others are lazy and go all the way up the escalator to the next level, and then take the other escalator back down again. (Have a look at this excellent 3D map from StationMaster if you can’t quite picture it.)

TfL was also able to see how disruptions impacted stations: Apparently when mega-congested, the walk times increased from 3 minutes to more than 10 minutes. Which creates a whole array of second-order problems for the poor staff on the ground trying to squeeze everyone in.

The wifi data also enables TfL to generate more accurate data on crowding in stations. The above graph compares the number of Oyster touch-ins with wireless device detections over the course of the day.

Previously, how busy a station was could only be measured using Oyster touch-in data, but there’s a big flaw in using this: There’s a hard limit on how many people can use a set of ticket barriers at any one time. So measuring it by touch-ins doesn’t account for hundreds or thousands of grumpy commuters in the queue.

The wifi data, by contrast, accounts for these people as it still picks up their phones - and comparing it with Oyster touch-ins gives a really immediate and stark example of when it is particularly crowded, such as at Oxford Circus during morning and evening peak.
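One simple way to picture that comparison is as a ratio of device detections to touch-ins for each time slot, with a spike suggesting a queue building up behind the barriers. The sketch below is an illustration only: the function names, the 1.5 threshold and the sample counts are all invented, and the report does not say TfL does it this way.

```python
def crowding_ratio(touch_ins, wifi_detections):
    """Wifi detections divided by Oyster touch-ins, per time slot."""
    ratios = {}
    for slot, detections in wifi_detections.items():
        taps = touch_ins.get(slot, 0)
        # No touch-ins at all: treat any detections as unbounded crowding.
        ratios[slot] = detections / taps if taps else float("inf")
    return ratios


def flag_crowded(ratios, threshold=1.5):
    """Time slots where detections outstrip touch-ins by the given factor."""
    return sorted(slot for slot, ratio in ratios.items() if ratio > threshold)


# Invented counts for three time slots at a single station.
touch_ins = {"07:00": 400, "08:30": 900, "12:00": 300}
wifi_detections = {"07:00": 420, "08:30": 1800, "12:00": 310}

print(flag_crowded(crowding_ratio(touch_ins, wifi_detections)))  # ['08:30']
```

Since not every passenger carries a wifi-enabled phone, a real version would presumably need to scale the detection counts before comparing them.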

Crowding On Trains

Crowding data doesn’t just work for stations - it could work on trains too. Earlier this year we revealed that TfL had figured out how to use the data to work backwards and track a specific user on a specific train as they travelled through many stations. Using this data, TfL could better inform passengers about how busy their trains are likely to be - and help shape passenger behaviour by, say, encouraging them to travel at different times, or not to all crowd on to the first train that arrives if the one just behind is significantly less busy.

The graphic above shows just how detailed the data can get - taking one set of peak trains from the southbound Victoria line at Euston station on the 9th December 2016. Even during rush hour, it is clear that there is some fairly significant variation in just how rammed each train is. Below is an example of how the data could then be presented on screens to passengers.

Better Journey Planners

And finally, what about giving commuters more control in the journey planner? Rather than simply present the options of the different routes available, by using wifi data the planner could also offer details on which route will mean the most pleasant journey. Sure, you could take the Victoria Line to Victoria, as shown in the mock-up above - but maybe it’d be less sweaty if you took the District Line instead?

Seeing this did make me wonder, though: What if we’re all using these apps? If everyone using CityMapper and Google Maps took the quiet routes, wouldn’t the alternatives… not be so quiet? Isn’t there a risk of creating a feedback loop?

“We need to think about how we effectively communicate going forward. [...] We want people to have information at their fingertips”, Lauren says, but she admits that it isn’t easy as there can always be feedback loops. “People will go one way and that will have an effect, and a secondary effect... so it’s complicated. Just to do the analysis requires some thinking about how you analyse what is going on on the network and how you take all of these movement patterns and create an overall pattern.”

“I'm sure that we'll have to think about all that, but it’s a great challenge to have”, she explains.

The other big challenge ultimately is to figure out how best to use the data and the models that the data creates. And this gives Lauren tonnes of questions for her team to answer: “What do front-end visuals that use this data look like? How would it be used to provide customer information, and what format would customers want to consume it [in]? How can we work with our operations team so they can have it as a tool in their arsenal for when they're running our network? How do we feed this into the models and plans for thinking about the transport network? Can we model far into the future?”

So what’s next? Will TfL be switching on the wifi trackers full time? Following the publication of the report, TfL is now recommending that it switches on wifi tracking for “continued use”. But don’t expect them to switch it on for Monday morning: the project now needs to go from proof of concept to something more robust, and the agency is clearly trying to be responsible, keeping the public informed along the way about what it wants to do and the potential benefits, given the obvious privacy concerns.

“I don't have a specific date but we're keen to move forward on this because there is real value in it”, says Lauren.

So who knows? Perhaps one day in the future you might be travelling from Kings Cross to Waterloo - but before you board your train a notification might pop up to warn you about crowding. So you might decide there is only one sensible option: Going via Baker Street and Green Park.

James O'Malley is Interim Editor of Gizmodo UK and tweets as @Psythor.
