Felipe Massa, Williams, Sepang, 2017

Big data: How one F1 team creates more information than it can handle

2018 F1 season


Formula One cars are packed with sensors which allow teams to scrutinise every aspect of their performance in search of lap time. But the vast quantity of data generated creates its own problems.

As Williams’ chief technical officer Graeme Hackland admits, his team is already producing more data than it can afford to store.

Lance Stroll, Yas Marina, Williams, 2017
Williams generates 200GB of data per race weekend
“Back at base we’ve got terabytes of data,” he explains. “In fact we have to flush a fair amount of data that they generate, say in the wind tunnel, because we simply can’t keep it all, we can’t afford the storage to keep it.”

“It’s not the petabytes that some of the financial [companies] are running but it’s significant enough, and we’re moving it around the world so we’ve got to protect it at rest and in motion as well.”

Speaking at the launch of a new technical partnership with data specialists Acronis, Hackland described how the team is planning to upgrade its storage facilities.

“We’re about to remove tape, that’s one of the most important things we’re going to achieve together,” he said. Some of Williams’ tape archives are “three, four, five years or, dare I say, even older than that,” he admits. “We’ve got tapes on this site, no one even remembers how old those tapes are.”

Being able to quickly access information from previous seasons is essential for developing new technologies and solving problems. Finding the correct tape wastes valuable time, and if it has degraded, the information on it may no longer be accessible. Williams has identified these as threats to its potential performance.

Another challenge the team faces is the growing volume of data it generates on-site at races, and how this can be quickly and securely shared with the factory back home.

“The pure car telemetry data is not that big,” Hackland explains. “Over the course of a race weekend we’re between 80 and 100 [gigabytes] of data.”


“For us that’s a significant amount of telemetry data that we then combine with weather, tyre and video especially. Video analytics of our own car and our competitors’ cars is becoming more and more important. You’ve got all the voice data that we get as well. So combining all of that you end up at 160 to 200 [gigabytes] of data per race weekend.”

The vulnerability of this data is another worry. “We’re concerned that, all of the equipment that’s sitting at the track, there’s a period of time where the only place [the data] exists is at the track. If something goes wrong we could lose data. We never want to lose data.”

The team’s partnership with Acronis is aimed at tackling these problems using new data backup, storage and synchronisation technologies. They will face their first real-world test soon: the maiden run of the FW41 is less than two weeks away, and the team is about to generate another huge load of vital information.


Author information

Keith Collantine
Lifelong motor sport fan Keith set up RaceFans in 2005 - when it was originally called F1 Fanatic. Having previously worked as a motoring...


21 comments on “Big data: How one F1 team creates more information than it can handle”

  1. This is an interesting subject and one I have never considered.

    Video analytics of our own car and our competitors’ cars is becoming more and more important.

    Huh! I’d like to know more.

    1. This is an interesting subject and one I have never considered.

      Same here.

    2. @damon They should be able to work out the acceleration of the cars from the audio and video feeds. Combined with the GPS telemetry, they can pretty accurately model the competitors’ cars.

      A decade ago they were doing this already. It must have gotten a lot more advanced by now.

      You can even hear the drivers talking about it like it’s the most normal thing that they know the downforce levels, engine performance and such about their competitors’ cars.

    3. Same here. Although it’s a bit of a case of work meets leisure for me 😁

  2. Interesting information regarding this particular topic.

  3. This article is extremely surprising to me. 200-ish GB of data is almost nothing and really shouldn’t be that hard to duplicate several times over at the track and transfer back to base.

    Further, terabytes of archival data isn’t that much either. Sure, petabytes are more difficult to deal with, but still not super expensive for a team with as big a budget as Williams.

    What I don’t know is how fast the connections are at each track. I suspect some are better than others, which makes off-site duplication not as straightforward. Still, at the numbers they’re talking about, it sounds more like they 1) don’t have the right people dealing with it, and/or 2) don’t have sufficient budget assigned to IT.

    1. More like their IT infrastructure is really old (by IT standards) and the board isn’t savvy enough to approve an upgrade. I’m more surprised they still use tapes; while tape still has a higher density than hard drives, 200GB is easily handled by modern hard drives, and if they don’t deal with petabytes I don’t think tape is necessary. Assuming 20 races each year, it amounts to 4TB of data, which can easily be fitted onto one hard drive plus another for backups. A 6TB drive should be enough for race data plus testing, and I think it will be easier to index by year. If they’re afraid of hard drive reliability when moving around, 256GB flash disks are already on the market. Just invest in a few of those for redundancy and keep the drives in the server room.

      1. Actually I’m being silly. Just use SSD for races. No moving parts, almost zero risk of losing data.

    2. They don’t have the right people dealing with

      Biznitch, pleasneetch.

    3. There is no shame in using (old) reliable, proven technology. Williams certainly has a tight budget and had no need to replace the storage system until now. The problem with tape/DAT is that data access is local and linear. But lots of companies still use DAT storage.

      Big data is a concept of machine learning algorithms using very large amounts of data and computation power. All this became possible even for end users with cloud storage and virtual machines that can be deployed on demand.

      In this respect Williams would have to put its data into the cloud. Big data can provide real-time analysis too, so it probably needs to change its whole data storage chain. It’s not a matter of data quantity but of high availability and real-time computation.

  4. When it comes to data analysis, correct interpretation is much more important than acquiring huge amounts of data. Hence, fundamental flaws in the design are very rarely discovered by tons of data. Meaning: can’t see the forest for the trees…

  5. I’m surprised at how small the data numbers are.
    My home network has 7 machines on it. My main server has two 3TB disks, my backup machine has two 6TB disks and each of the workstations has a 2TB work disk.
    A 6TB spinning disk can be bought, off the shelf, for under £200, a 4TB SSD can be had for about £1300.
    This is peanuts for a team running on a budget in excess of £100 million per year.
    Somebody, somewhere in the team doesn’t really know what they are doing if these small amounts of data are causing them problems.
    As for data at the track, simply back it up to the cloud on an hourly basis.

    1. @ceevee 100GB of text data generated in a few days is already a lot of data, actually. I’m more surprised by the small 100GB of video/audio data.
      As for the cloud, it would even allow real-time backup and big data analysis to assist decisions during the races.

  6. Wow, 200GB, is that all? I’m really surprised that they are struggling with that much data! I’m a data wrangler for video productions. Even on really low-budget shoots I have a pretty basic setup in the boot of my car for when my clients are shooting remotely (away from any power source), and I can wrangle over 2TB worth of footage in one day, which usually has to be backed up to two extra drives, sometimes three: the master copy on a local RAID, the first drive travels in another car away from the master, the second drive is couriered to the production base. For bigger productions I even transcode and upload to the cloud the same day for review by directors and producers. Mad to think they are struggling with just 200GB? That can’t be right… Saying that, I’ve been to production houses that are still running Windows 95…

    1. Agree. God and Pay Drivers help them if they need to wrangle multiple 4k video streams for video analytics.

    2. I’d say it’s probably TB rather than GB. I don’t think they’d need more. I believe speed of acquisition and analysis is much more important than the size/detail of the data acquired.

  7. I agree with the previous comments. Perhaps there is a mistake in the order of magnitude and they are handling 200TB per race; then I could see why they worry. With 200GB or less per race, the whole season would fit on a 4TB hard disk that even ordinary people buy for acceptable money – I have 8 such disks at home for keeping my favourite movies and music (4 are copies of the other four and it has already paid off :-) ) and I’m a classic middle-class family.

    1. 100GB of text data is a nice amount of data already.
      100GB of video/audio is not that much for all cameras/all cars/all voices though.

      1. The contents of the data aren’t surprising. It’s that they’re struggling with gigabytes of data a weekend, and only terabytes of archival data overall. This total amount isn’t at all troublesome even for individuals.

        As mentioned before, fully redundant network storage devices are available for just a few thousand USD. Even with a modest F1 budget, they should be able to buy 10 of these or even more expensive enterprise solutions without much of a hassle.

      2. @spoutnik Yes. This isn’t about data storage but data processing.

        100GB of raw Blu-ray Game of Thrones only covers 8 episodes. Anyone can binge-watch it in one night.
        100GB of A Song of Ice and Fire ebooks is like digesting 20,000 books of incest and betrayal.

  8. a) they are talking about a sponsor/partner so clearly want to emphasize their value
    b) if you have crappy internet and limited cargo space (as well as different levels of power supply systems around the world) this isn’t as simple as in a stationary environment
    c) most of that data is likely still coming from sensors that are just dumping information at a constant rate – coordinating and synchronizing data from what are likely dozens of sensors on the car with one another is not a trivial task. Not all the data is coming via radio either, as far as I know – only the more important sensors actually transmit their readings to the box. It’s all more of a logistics and moving-pieces challenge than a technical one.

Comments are closed.