Questions about normalized_visits_by_state_scaling

Hi, I would like to compare overall POI visit counts among US counties. The Patterns dataset includes normalized_visits_by_state_scaling, which is described as being scaled using the mobile device sampling rate for the state in which the POI is located. Is this calculated as raw visits divided by the state’s sampling rate? For comparing county-level data, I think normalization should account for county population, and sampling rates can be considered as well. So I feel the complete normalization could be raw_visits / (population * sampling_rate). Is this sound logic given your data? Also, is there any way I can access or compute normalized visits with county-level scaling myself?

Yes, that’s how that column is computed. See this blog post or Approach 1 here.

You are free to try to use the county sampling rates as well. Follow the colab notebook above but replace the state-level aggregation with the county-level aggregation (both are contained within the poi_cbg column). If you do this, I’d be interested in seeing a comparison of the upscaled data after using state- vs county-level data!
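As a rough sketch of what swapping the aggregation level could look like: a census block group ID is 12 digits, with the first 2 digits giving the state FIPS code and the first 5 giving the county FIPS code. The snippet below assumes poi_cbg has been read as a string (so leading zeros survive) and that visits are in a raw_visit_counts column; the helper name and the toy rows are made up for illustration and are not taken from the notebook.

```python
import pandas as pd


def add_geo_keys(patterns: pd.DataFrame) -> pd.DataFrame:
    """Add state and county FIPS columns derived from poi_cbg (hypothetical helper)."""
    out = patterns.copy()
    # Pad to 12 digits in case leading zeros were dropped when the IDs were parsed as integers.
    cbg = out["poi_cbg"].astype(str).str.zfill(12)
    out["state_fips"] = cbg.str[:2]    # first 2 digits: state
    out["county_fips"] = cbg.str[:5]   # first 5 digits: state + county
    return out


# Toy rows, invented for illustration only.
patterns = pd.DataFrame({
    "poi_cbg": ["010010201001", "010010201002", "010030101001", "060372011101"],
    "raw_visit_counts": [10, 20, 5, 40],
})

keyed = add_geo_keys(patterns)
visits_by_state = keyed.groupby("state_fips")["raw_visit_counts"].sum()
visits_by_county = keyed.groupby("county_fips")["raw_visit_counts"].sum()
```

The only change between the two aggregations is the length of the prefix taken from poi_cbg, so the rest of a state-level workflow can stay the same.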

Hi Jeff, thanks for your reply. Do you mean the second link, “Approach 1 here”, is the colab notebook? I could not find concrete data for county/state sampling rates.

I mean the heading “Approach 1” inside the colab notebook.

It shows you how to calculate the device sampling rates per geography, with state as the geography used in the example.
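For a sense of the calculation, here is a minimal sketch, not the notebook's actual code. It assumes the sampling rate for a geography is the number of panel devices residing there divided by its census population, and that upscaled visits are raw visits divided by that rate. The column names census_block_group, number_devices_residing and population, the function name, and all numbers are assumptions for illustration.

```python
import pandas as pd


def sampling_rate_by_geo(panel: pd.DataFrame, census: pd.DataFrame, key_len: int) -> pd.Series:
    """Panel devices divided by census population, per geography.

    key_len=2 groups block groups by state, key_len=5 by county.
    """
    devices = (
        panel.assign(geo=panel["census_block_group"].str[:key_len])
        .groupby("geo")["number_devices_residing"]
        .sum()
    )
    population = (
        census.assign(geo=census["census_block_group"].str[:key_len])
        .groupby("geo")["population"]
        .sum()
    )
    # Division aligns the two Series on the shared "geo" index.
    return (devices / population).rename("sampling_rate")


# Toy inputs, invented for illustration only.
panel = pd.DataFrame({
    "census_block_group": ["010010201001", "010010201002", "010030101001"],
    "number_devices_residing": [50, 50, 20],
})
census = pd.DataFrame({
    "census_block_group": ["010010201001", "010010201002", "010030101001"],
    "population": [1000, 1000, 1000],
})

state_rate = sampling_rate_by_geo(panel, census, key_len=2)
county_rate = sampling_rate_by_geo(panel, census, key_len=5)

# Upscale the same raw visit count with each rate to compare the two approaches.
raw_visits = 10
scaled_by_state = raw_visits / state_rate["01"]
scaled_by_county = raw_visits / county_rate["01001"]
```

In this toy case the county's rate is higher than its state's, so county-level scaling gives a smaller upscaled count than state-level scaling for the same raw visits.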

Hi Jeff, yes. Can you please also direct me to the files with the county/state sampling rates used by the SafeGraph Patterns dataset?

We don’t provide files containing those sampling rates; you need to calculate them yourself using the notebook provided.

Okay thank you.