Demystifying the SafeGraph Facts

May 3, 2022
by Auren Hoffman

SafeGraph sells facts about places. We strive to be a super transparent company and have always made our full data schema available online. Our mission is to empower data scientists working on humanity’s hardest problems.

The SafeGraph Places dataset includes where businesses are located, when they’re open, and what neighboring businesses surround them. Every place in our datasets is easily searchable online. It’s all out there. We even list our bugs and errors every month.

SafeGraph just sells facts.

Most of the organizations that use our facts want to know about things like the store hours of the local cafe. Store hours actually change a lot (during peak COVID they were changing weekly), and a lot of people want that information.

We also have data about the geometry of physical places, like the shape of your local gym or the outlines of all the parking lots in your metro area.

SafeGraph also has a Patterns dataset that shows how groups of people interact with a place (fully aggregated and anonymized). SafeGraph has always been committed to the highest level of privacy practices, ensuring that individual privacy is NEVER compromised. We use differential privacy to ensure anonymity.
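For readers unfamiliar with differential privacy, here is a minimal sketch of one textbook approach, the Laplace mechanism, applied to an aggregated visitor count. This is an illustration only and not a description of SafeGraph's actual pipeline; the function names, the `epsilon` value, and the assumption that each person changes a count by at most 1 (`sensitivity=1.0`) are all hypothetical.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponential draws with mean `scale`
    # follows a Laplace distribution centered at 0 with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(true_count, epsilon, sensitivity=1.0, rng=None):
    # Laplace mechanism: add noise with scale = sensitivity / epsilon.
    # Smaller epsilon means more noise and stronger privacy.
    if rng is None:
        rng = random.Random()
    return true_count + laplace_noise(sensitivity / epsilon, rng)


# Illustrative only: a made-up count of 100 visitors, with a fixed seed
# so the run is reproducible.
rng = random.Random(42)
samples = [noisy_count(100, epsilon=0.5, rng=rng) for _ in range(20000)]
mean_estimate = sum(samples) / len(samples)
```

Any single released value is perturbed, so it reveals little about one individual's presence, while the average over many draws stays close to the true count, which is why aggregate analysis remains useful.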

SafeGraph only focuses on the truth.  So here are a few truths:

We only sell data about physical places (not individuals). Our data is available for anyone to buy. Our schema is public. There is nothing hidden. Even VICE (an online news organization) bought our data, and we would never prevent them from getting it. We also give the SafeGraph data away for free to 15,000+ researchers and academics who use it in amazing ways. More than 100 reporters from some of the nation’s best press institutions use the data too. We encourage you to get the data and see it for yourself.

We serve tons of different use cases. Here are the main ones for the SafeGraph data:

  • Research – 15,000+ researchers and academics use the SafeGraph data resulting in hundreds of major research papers.  
  • Logistics – we have a lot of data about warehouses, train depots, ports, and more, which makes the data very helpful for logistics. This is especially important today when there are so many supply bottlenecks.
  • Local search – one of the big use cases for SafeGraph data is helping put places on a map.  Our data can be very helpful in a search for “Italian restaurants near me.”
  • Real estate planning – many of the largest retailers use the SafeGraph data to figure out where to put their stores. Some of the big real estate buyers (like large PE funds) also use SafeGraph data to figure out what to buy.
  • Adtech – most of our advertising customers are in the Out Of Home category, where they are trying to figure out where to deploy new ad assets. They also use our data for compliance (like making sure there are no alcohol ads near schools).
  • Government – our government customers include the CDC (for health policy), the Federal Reserve (to help understand the economy), and many local and state governments (mainly for things like urban development, understanding food deserts, and transportation planning). We have great data on parking lots. Contrary to what some believe, we don’t have any law enforcement customers.
  • Healthcare – healthcare organizations use the data for real estate planning and for logistics decisions about how to deliver better care.

We build facts about physical places and that’s all we do. We have competitors that also sell this type of data, but we think we’ve been successful because we have focused on the veracity of the data.

Part of democratizing access to data means making it available in a self-serve way.  But of course, making data convenient and accessible also has drawbacks.  It means we aren’t able to fully control who buys the data.  But we’ve never tried to censor or hide anything. 

To our knowledge, nobody has ever used SafeGraph Patterns data for malicious purposes. In fact, the Patterns dataset has mostly been used by 15,000+ researchers and academics, along with local governments, to combat COVID-19. Some of the brightest data scientists in the world have used Patterns extensively for social good and have never encountered issues with data privacy since we released the data three years ago.

But there are always extreme hypothetical corner cases, and in some cases these are worth actively preventing.  

In light of potential federal changes in family planning access, we're removing Patterns data for locations classified as NAICS code 621410 (‘Family Planning Centers’) from our self-serve “shop” and API to curtail any potential misuse of this data.
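For data users who want to apply the same exclusion to a copy of Patterns they already hold, a filter along these lines would do it. This is a rough sketch, not our production code: the row layout and the field names `naics_code`, `location_name`, and `visit_count` are assumptions for illustration, and the sample rows are made up.

```python
# NAICS code for 'Family Planning Centers', as named in this post.
FAMILY_PLANNING_NAICS = "621410"


def drop_restricted_patterns(rows, restricted_codes=frozenset({FAMILY_PLANNING_NAICS})):
    """Return only the rows whose NAICS code is not in the restricted set."""
    return [
        row for row in rows
        # Compare as strings so integer and string codes behave the same.
        if str(row.get("naics_code")) not in restricted_codes
    ]


# Made-up example rows; real Patterns records have many more fields.
sample_rows = [
    {"location_name": "Example Cafe", "naics_code": "722515", "visit_count": 120},
    {"location_name": "Example Clinic", "naics_code": "621410", "visit_count": 45},
    {"location_name": "Example Gym", "naics_code": 713940, "visit_count": 300},
]

filtered = drop_restricted_patterns(sample_rows)
```

The filter returns a new list and leaves the input untouched, so the original rows can still be used for Places and Geometry fields that remain available.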

We don’t have any indication that this data has ever been used for bad purposes. Many academics have used this type of data for really good purposes. Taking away this data will impact many academics who want to study this topic (like understanding the impact of legislation on family planning visits). We acknowledge that our decision to take down Patterns for family planning centers could negatively impact this valuable research, but we think this is the right decision given the current climate.

We will still have Places and Geometry data about Family Planning Centers (like their locations and operating hours). Family planning centers like Planned Parenthood make their location data public because they want to serve their constituents.

These decisions are never easy and there will certainly be more conflicting situations in the future.  SafeGraph is committed to remaining dynamic and advancing our mission of democratizing data in a privacy-safe way.