Tips for Normalizing Foot Traffic Data

February 2, 2021
by
Eugene Chong

Foot traffic data is incredibly useful for understanding consumer behavior. With fields for number of visitors, number of visits, dwell times, origin, and more, SafeGraph Patterns data empowers organizations to derive actionable insights related to how people interact with points of interest. 

One of the most common ways data scientists use SafeGraph Patterns data is to measure foot traffic over time. This can reveal interesting trends related to seasonality, brand affinity, and proximity to other businesses. An accurate and detailed analysis of foot traffic over time can be used to inform site selection, investment, and advertising decisions, among many other use cases spanning different industries. But to do this effectively, some technical considerations need to be factored in.

Reasons to normalize

Like most frequently updated datasets, raw foot traffic data can fluctuate rapidly from one period to the next, and that short-term noise can obscure the trends you care about. Applying a moving average smooths the series while preserving the important trends for analysis.
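As a minimal sketch of that smoothing step, the snippet below computes a trailing 7-day moving average over a short list of daily visit counts. The numbers are made up for illustration, and the function name and window size are assumptions rather than anything prescribed by SafeGraph.

```python
from statistics import mean


def moving_average(values, window=7):
    """Trailing moving average.

    Positions that do not yet have `window` observations are returned
    as None instead of being averaged over a shorter span.
    """
    if window < 1:
        raise ValueError("window must be >= 1")
    smoothed = []
    for i in range(len(values)):
        if i + 1 < window:
            smoothed.append(None)
        else:
            smoothed.append(mean(values[i + 1 - window : i + 1]))
    return smoothed


# Illustrative (made-up) daily visit counts with a weekend spike.
daily_visits = [120, 80, 95, 130, 210, 260, 90, 125, 85, 100]
smoothed = moving_average(daily_visits, window=7)
```

A 7-day window is a natural starting point for daily data because it averages over one full weekly cycle, but the right window depends on the seasonality in your own series.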

Granularity also matters: you need to balance privacy requirements against the specificity your analysis needs. Choosing an appropriate level, for example CBG-level (census block group) or POI-level foot traffic, helps you account for the natural variance of mobile location pings while staying compliant with the privacy agreements that apply to your use case.

Panel bias can arise when data is collected disproportionately from some sub-groups of the population. If a Walmart POI's typical visitor profile is 10% Hispanic or Latino, but that group makes up only 5% of the sample panel, the results from analyzing the raw data will not be as accurate as they could be if corrected for bias. Any other distortion that does not qualify as panel bias is treated as an outlier and should also be filtered out.

Adjusting for sample bias to represent the true demographic profile of a POI improves the accuracy of results.
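One common way to make this kind of adjustment is post-stratification weighting, where each group is weighted by the ratio of its true share to its panel share. The sketch below applies that idea to the 10% versus 5% example above. It is an illustration under stated assumptions, not a description of SafeGraph's own correction: the group labels, visit counts, and function name are all hypothetical.

```python
def poststratification_weights(population_share, panel_share):
    """Weight per group = true share / share observed in the panel."""
    weights = {}
    for group, true_share in population_share.items():
        observed = panel_share[group]
        if observed <= 0:
            raise ValueError(f"group {group!r} has no panel coverage")
        weights[group] = true_share / observed
    return weights


# Shares from the example in the text; the "other" group is the remainder.
population_share = {"hispanic_or_latino": 0.10, "other": 0.90}
panel_share = {"hispanic_or_latino": 0.05, "other": 0.95}
weights = poststratification_weights(population_share, panel_share)

# Hypothetical raw panel visits by group for one POI.
panel_visits = {"hispanic_or_latino": 50, "other": 950}
weighted_visits = {g: panel_visits[g] * weights[g] for g in panel_visits}

total = sum(weighted_visits.values())
adjusted_share = {g: v / total for g, v in weighted_visits.items()}
```

Here the under-represented group receives a weight of 2.0, so its adjusted share of visits moves from 5% back to roughly 10%.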

SafeGraph Patterns is aggregated from a panel of millions of mobile devices in the US and Canada. As such, there are going to be biases and outliers in the data. However, when normalized correctly, Patterns can provide the foot traffic insights needed to truly transform a business strategy.

Simple ways to normalize patterns

Working with Patterns time series data requires some nuance. This is because, while fluctuations in foot traffic measurements are primarily driven by changes in real-world movement behavior (i.e., true signal), they can also be influenced by differences in the size and composition of SafeGraph's mobile device panel (i.e., noise and/or sampling bias).

In our full technical guide, we provide recommendations for filtering the signal from the noise when analyzing SafeGraph Patterns data over time. This notebook demonstrates methods for:

  1. Normalizing raw visit numbers to provide comparable measurements over time
  2. Indexing the normalized numbers to a baseline value to measure change
  3. Identifying and filtering outlier POIs when comparing to benchmark data
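The three steps above can be sketched in a few lines of plain Python. This is a simplified illustration, not the notebook's implementation: it assumes normalization means dividing raw visits by the number of devices in the panel for the same period, it indexes to the first period as 100, and it flags outliers with a median absolute deviation (MAD) rule. All function names, thresholds, and figures are hypothetical.

```python
from statistics import median


def normalize_by_panel(raw_visits, panel_devices):
    """Step 1: visits per panel device for each period."""
    return [v / d for v, d in zip(raw_visits, panel_devices)]


def index_to_baseline(series, baseline_position=0):
    """Step 2: rescale so the baseline period equals 100."""
    baseline = series[baseline_position]
    if baseline == 0:
        raise ValueError("baseline value must be non-zero")
    return [100 * v / baseline for v in series]


def filter_outliers(value_by_poi, threshold=3.0):
    """Step 3: keep POIs within `threshold` MADs of the median value."""
    values = list(value_by_poi.values())
    center = median(values)
    mad = median(abs(v - center) for v in values)
    if mad == 0:
        return dict(value_by_poi)  # no spread, nothing to filter
    return {
        poi: v
        for poi, v in value_by_poi.items()
        if abs(v - center) / mad <= threshold
    }


# Made-up monthly figures: raw visits rise 32%, but the panel grew too.
raw_visits = [1000, 1100, 1320]
panel_devices = [10000, 11000, 12000]
normalized = normalize_by_panel(raw_visits, panel_devices)
indexed = index_to_baseline(normalized)  # roughly [100, 100, 110]

# Made-up ratios of Patterns visits to a benchmark, one per POI.
ratio_by_poi = {"poi_a": 0.98, "poi_b": 1.02, "poi_c": 1.00,
                "poi_d": 1.05, "poi_e": 4.00}
kept = filter_outliers(ratio_by_poi)  # poi_e is dropped
```

In this toy example the raw counts suggest 32% growth, while the normalized index shows about 10%, because part of the raw increase comes from a larger panel rather than more real-world visits.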

We do so by applying the above methods to Patterns data and then comparing the results to external "ground truth" datasets. We selected datasets that serve as proxies for foot traffic at a specific POI or a group of POIs, similar to prototypical Patterns use cases (although most Patterns users compare against proprietary first-party data rather than the publicly available data used here).

  • Monthly airport security checkpoint throughput at ~50 US airports as reported by the TSA from January 2019 to February 2021.
  • Quarterly reported attendance/revenue from 2019-2020 at SeaWorld (total attendance across parks), Six Flags (total attendance across parks), Lowe's (company-wide revenues), and Planet Fitness (company-wide revenues).
  • Daily Major League Baseball game attendance figures from 2019 and early 2021 (because games were played without fans in attendance during 2020). Data provided by Bill Petti.

We selected datasets reported at different levels of aggregation (daily, monthly, and quarterly) to demonstrate how to aggregate Patterns data to different time periods.
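To illustrate what aggregating to different time periods can look like, here is a small sketch that rolls daily visit counts up to monthly and quarterly totals. The daily series is synthetic (a constant 10 visits per day), and the helper names are assumptions made for this example.

```python
from collections import defaultdict
from datetime import date, timedelta


def aggregate(daily_visits, period_key):
    """Sum daily visits into buckets chosen by `period_key(day)`."""
    totals = defaultdict(float)
    for day, visits in daily_visits.items():
        totals[period_key(day)] += visits
    return dict(totals)


def month_key(day):
    return (day.year, day.month)


def quarter_key(day):
    # Months 1-3 map to quarter 1, 4-6 to quarter 2, and so on.
    return (day.year, (day.month - 1) // 3 + 1)


# Synthetic data: 10 visits per day from 2019-01-01 through 2019-06-30.
start, end = date(2019, 1, 1), date(2019, 6, 30)
daily_visits = {}
day = start
while day <= end:
    daily_visits[day] = 10
    day += timedelta(days=1)

monthly = aggregate(daily_visits, month_key)
quarterly = aggregate(daily_visits, quarter_key)
```

Summing is appropriate for raw or panel-normalized visit counts. If you work with an index instead, it is usually safer to aggregate the underlying counts first and index the aggregated series afterwards.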

You may find that a particular method of normalization (or not normalizing at all) performs better in some use cases than it does in others.
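One simple way to check which approach works better for your use case is to correlate each candidate series with the benchmark. The sketch below does this with a hand-written Pearson correlation on made-up numbers. The candidate names and all values are hypothetical, and correlation is only one of several reasonable comparison metrics.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Made-up benchmark (e.g. reported attendance) and Patterns-style inputs.
benchmark = [100, 100, 110, 90]
raw_visits = [1000, 1100, 1320, 1170]
panel_devices = [10000, 11000, 12000, 13000]

candidates = {
    "raw": raw_visits,
    "normalized": [v / d for v, d in zip(raw_visits, panel_devices)],
}
scores = {name: pearson(series, benchmark)
          for name, series in candidates.items()}
best_method = max(scores, key=scores.get)
```

In this toy data the normalized series tracks the benchmark much more closely than the raw counts, but with real data the ranking can go either way, which is why it is worth testing.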

Putting it all together

Once you’ve normalized your data, you can start running analyses and delivering reliable results to inform your business strategy. Understanding how foot traffic relates to your other data, whether credit card transactions, offer redemptions, or product sales, can help you uncover connections that would otherwise stay hidden. By factoring in seasonality, holidays, and other known sources of variance, you can make strategic decisions with more confidence.