Once you understand what kind of geospatial data your organization might need, you next need to know where to find it. There are many places where you can get geospatial data, but not all of them will provide the type(s) of data that you’re looking for.
It’s important to know where to go to get the right kinds of data for building your organization’s geospatial data ecosystem. We’ll help you do that here by listing and explaining some common, reliable geospatial data sources for the following categories: points of interest, property, mobility, demographic, address, boundary, environmental, street, and imagery data.
Let’s begin our data safari.
Not every person or organization is going to need the same kinds of geospatial data. So for the sake of speed and efficiency, it helps to know ahead of time where to look.
Our list here will break down sources of geospatial data by the types of data you’ll likely be after. Note that some sources may provide more than one kind of data, so you don’t necessarily need to shop all over the place to get what you need.
Sources of points of interest (POI) data have information on pretty much all kinds of non-residential buildings and properties. Basically, anywhere people congregate and hang out, other than a private dwelling, can be considered a point of interest. Usually that’s a commercial building like a store or restaurant, but not always. POI data also often contains attributes of the places it describes, though these vary slightly between providers.
SafeGraph has extensive points of interest data known as Places. This includes information on millions of global places, with optional data for businesses that have permanently closed. You can find out a place’s name, address, geographic coordinates, business type, associated brand, and more.
CAP Locations data focuses specifically on commercial points of interest. It contains records on over 40,000 retail stores, restaurants, and malls across the US and Canada (with plans to expand this dataset to the UK). Its records range from addresses and business categories to attributes like parking and tenant capacity, or the year a building opened or was remodeled.
Sources of property data have sets of polygons and other geometric shapes to represent the physical boundaries of buildings and other properties on Earth. Mostly, these datasets are on 2-dimensional planes of length and width, but some include height to become 3-dimensional. Many of these datasets also include additional attributes of the properties they define, much like POI data.
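To make the idea of a footprint polygon concrete, here is a minimal sketch of how a 2-dimensional footprint yields an area, and how adding a height attribute turns it into a rough 3-dimensional volume. The coordinates, the function names, and the assumption that vertices are already in meters on a flat projected plane are all illustrative; no provider's actual schema is implied.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) in meters on a flat, projected plane (assumption)


def footprint_area(vertices: List[Point]) -> float:
    """Area of a simple (non-self-intersecting) polygon via the shoelace formula."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]  # wrap around to close the ring
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def building_volume(vertices: List[Point], height_m: float) -> float:
    """Rough volume: footprint area extruded straight up by a height attribute."""
    return footprint_area(vertices) * height_m


# Hypothetical 20 m x 10 m rectangular footprint for a 6 m tall building
footprint = [(0.0, 0.0), (20.0, 0.0), (20.0, 10.0), (0.0, 10.0)]
area = footprint_area(footprint)
volume = building_volume(footprint, 6.0)
print(area, volume)
```

Real footprints usually arrive as latitude/longitude pairs, so they would need to be projected to a planar coordinate system before an area calculation like this is meaningful.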
SafeGraph’s property dataset is called Geometry. It provides building footprint information for millions of POIs in the US, UK, and Canada. It also contains spatial hierarchy metadata, which shows spaces or rooms within other buildings.
Born from a merger between Landgrid and Loveland Technologies, Regrid has data on over 150 million parcels of property across the US. The download also includes standardized building footprints for over 155 million structures around the country.
BA45 maintains a database of over 125 million properties across the US, including information on who owns them. Each property carries over 40 different attributes, from how much it last sold for to how many stories it has to whether it has a garage, pool, and/or central HVAC system.
Sources of mobility data use anonymized GPS signals sent out by people’s cell phones to provide a general picture of where people are throughout the day, and when. They do not track individual users or their activity. Instead, they measure activity around points of interest and neighborhoods (typically census block groups, or CBGs) to see how often people visit, and in what volumes.
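As a rough illustration of that kind of aggregation, the sketch below counts visits per area per hour from a handful of made-up pings. The record layout, the `CBG-A`/`CBG-B` identifiers, and the function name are hypothetical; note that the records carry no device identifier, so only aggregate counts come out.

```python
from collections import Counter
from datetime import datetime

# Hypothetical, already-anonymized pings: (ISO timestamp, census block group ID).
# No device or user identifier is kept, so only aggregate counts can be produced.
pings = [
    ("2021-05-01T09:05:00", "CBG-A"),
    ("2021-05-01T09:40:00", "CBG-A"),
    ("2021-05-01T10:15:00", "CBG-A"),
    ("2021-05-01T09:20:00", "CBG-B"),
]


def visits_by_area_and_hour(records):
    """Count pings per (area, hour-of-day) bucket."""
    counts = Counter()
    for timestamp, area_id in records:
        hour = datetime.fromisoformat(timestamp).hour
        counts[(area_id, hour)] += 1
    return counts


hourly = visits_by_area_and_hour(pings)
for (area_id, hour), n in sorted(hourly.items()):
    print(f"{area_id} {hour:02d}:00 visits={n}")
```

Commercial mobility datasets apply far more processing than this (deduplication, dwell-time estimation, privacy thresholds), but the output has the same general shape: counts per place per time window.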
SafeGraph’s main mobility dataset is called Patterns and contains aggregated data regarding visitors to POIs across the US and Canada. Other information provided includes how often people visit a POI, how long they stay, and where they go before or afterwards. SafeGraph also has a Weekly Patterns dataset for POIs that is updated more frequently, as well as a Neighborhood Patterns dataset that focuses on activity between census block groups.
Veraset has a basic mobility dataset called Movement. This offers anonymized information on movement patterns around POIs located in over 150 countries around the world. This includes timestamps and geographic coordinates. Veraset also has a Visits dataset, which combines mobility and property data to determine whether or not someone actually visited a specific location, and when.
Locomizer runs a geospatial data platform that measures mobility data around specific points of interest. They are mainly called upon to help their clients put together targeted advertising campaigns.
Sources of demographic data contain information about the people who live in a certain geographic area. This includes bulk population counts, but these are also often segmented by attributes such as sex/gender, age, median income, and average housing costs.
In many cases, demographic data is open geospatial data because it’s made publicly available by government agencies. For example, SafeGraph has aggregated and cleaned data from the US Census Bureau and its American Community Survey to cover demographic approximations across the US from 2016 to 2019.
Esri has ready-to-use demographic data in various forms for over 130 countries around the world. It’s segmented by over 15,000 variables, including ones related to income, families/households, health, education, employment, age, gender, ethnicity, and more.
Spatial.ai does demographic data differently. Their data is sourced from social media and segmented into over 70 categories that go beyond basic attributes to model and predict how people actually behave. These range from people’s hobbies, lifestyles, and relationships to what they eat and drink, what they do for fun, and what they believe in.
Address sources provide data on the locations of properties to power geocoding and routing. Information is usually in the form of postal address details or geographic coordinates, but may include other attributes as well.
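At its simplest, geocoding is a lookup from a cleaned-up address string to a pair of coordinates. The sketch below shows that idea with a one-row reference table; the address, the coordinates, and the `normalize` and `geocode` helpers are invented for illustration and are not any provider's method.

```python
import re
from typing import Dict, Optional, Tuple

Coordinates = Tuple[float, float]  # (latitude, longitude)


def normalize(address: str) -> str:
    """Uppercase, strip punctuation, and collapse repeated whitespace."""
    cleaned = re.sub(r"[^\w\s]", "", address.upper())
    return " ".join(cleaned.split())


# Hypothetical reference table: normalized address -> coordinates
reference: Dict[str, Coordinates] = {
    normalize("100 Main St., Springfield"): (39.80, -89.65),
}


def geocode(address: str, table: Dict[str, Coordinates]) -> Optional[Coordinates]:
    """Return coordinates for an address, or None if it is not in the table."""
    return table.get(normalize(address))


hit = geocode("100  main st, springfield", reference)
miss = geocode("1 Unknown Rd", reference)
print(hit, miss)
```

Production geocoders also handle abbreviations, misspellings, and partial matches, which is a large part of why a comprehensive, well-maintained address file is valuable.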
Among Infutor’s geospatial datasets is its National Spatial Reference File, which compiles over 360 million addresses across the US. These include precise latitude and longitude coordinates, as well as all addresses that are on file for US postal organizations (and even some that aren’t).
The USDOT has partnered with state and local governments across the US to create a National Address Database (NAD). Its goal is to have accurate, up-to-date, and free geospatial data on addresses for use in transportation safety, emergency response, and many other government services. As of May 2021, it contains over 60.5 million records from over half the states in the US.
Sources of boundary data provide information about the political divisions in geography. These include borders between countries, but can also include the boundaries of smaller administrative jurisdictions such as states, provinces, territories, counties, and regions. They can also include small civic boundaries such as school districts.
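A common use of boundary data is working out which jurisdiction a given location falls in. Here is a minimal sketch using the standard ray-casting point-in-polygon test; the two square "districts" and the function names are made up for illustration, and coordinates are treated as flat x/y values.

```python
from typing import Dict, List, Optional, Tuple

Ring = List[Tuple[float, float]]  # polygon vertices as (x, y), e.g. (lon, lat)


def point_in_polygon(x: float, y: float, ring: Ring) -> bool:
    """Ray casting: count how many edges a ray going right from the point crosses."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point can be crossed
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical boundary set: two adjacent square districts
districts: Dict[str, Ring] = {
    "District 1": [(0, 0), (4, 0), (4, 4), (0, 4)],
    "District 2": [(4, 0), (8, 0), (8, 4), (4, 4)],
}


def find_district(x: float, y: float, boundaries: Dict[str, Ring]) -> Optional[str]:
    """Return the name of the first boundary containing the point, if any."""
    for name, ring in boundaries.items():
        if point_in_polygon(x, y, ring):
            return name
    return None


print(find_district(1, 1, districts), find_district(6, 2, districts))
```

Points that sit exactly on a shared edge are an edge case this sketch does not try to resolve; geospatial libraries define explicit rules for them.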
CARTO has data on several different kinds of boundary segmentation. These include general regional borders, but also include things like census block groups (CBGs), school districts, postal code catchment areas, and more.
Esri provides an amalgamation of several different categories of geospatial data. One is administrative boundaries, such as postal code catchment areas. This geospatial data download also includes information on POIs and properties.
Mapbox Boundaries has data on over 5 million boundary sets from countries around the world. These include administrative, legislative, local, postal, and statistical boundaries such as states/provinces/counties, electoral districts, major metropolitan areas, and census block groups (CBGs).
Sources for environmental data gather information on what’s going on in the natural geographic world. Some may be government agencies, while others may be private corporations or non-profit conservation groups. They track things like temperature, weather patterns, wildlife migration, and seismic activity.
ClimateCheck is an environmental data service that synthesizes over 25 internationally-recognized models on climate change. Built for the insurance and real estate industries, it assesses the risk of climate-related damage (fires, floods, heat, storms, etc.) to over 140 million individual properties in the US over the next 30 years.
Tomorrow.io offers complete, accurate, and customizable historical weather information from around the world. Their platform is built to help minimize the impact of inclement weather on businesses. It does so through services such as monitoring and forecasting conditions, making actionable recommendations, providing team-wide alerts, and streamlining communication channels to speed up response to weather-related incidents.
CustomWeather aggregates weather data from over 80,000 locations worldwide, with the aim of offering comprehensive global weather coverage. By providing daily, monthly, and year-over-year weather comparisons for specific places on the planet, CustomWeather supplies climate intelligence for workers in broadcast media, agriculture, insurance, renewable energy, and more. Notable weather attributes they track include min/max/average temperature, precipitation, humidity, and atmospheric pressure.
Sources of street data map out the myriad road transportation networks around the world. They may also provide metadata on these networks, such as where and when roads get the most traffic and what potential obstructions might slow commuters down. Their goal is to get people where they want to go safely and efficiently.
Mapbox’s Traffic Data contains information about over 30 billion road segments (each taking an average of 5 minutes or less to traverse) around the world, consistently updated by over 600 million monthly active users worldwide. It works with other major geospatial data solutions such as OpenStreetMap, HERE, and TomTom to provide the data necessary for route planning and traffic analysis.
Google’s Roads API accepts up to 100 sets of GPS coordinates per request and maps those points to the geometry of known roads to determine the most likely route a vehicle took. It includes features for interpolating coordinates to better fit the actual shapes of roads, and can also provide other metadata about those road segments (such as speed limits).
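For a sense of what such a request looks like, the sketch below builds a snap-to-roads URL and splits a longer GPS trace into batches of 100 points. It reflects the `snapToRoads` endpoint and its `path`, `interpolate`, and `key` parameters as described in Google's public documentation at the time of writing, so check the current documentation before relying on it. The helper names, sample coordinates, and `YOUR_API_KEY` placeholder are assumptions, and no network call is made.

```python
from typing import Iterator, List, Tuple
from urllib.parse import urlencode

LatLng = Tuple[float, float]

SNAP_TO_ROADS_URL = "https://roads.googleapis.com/v1/snapToRoads"
MAX_POINTS_PER_REQUEST = 100  # the per-request limit mentioned in the text


def batches(points: List[LatLng], size: int = MAX_POINTS_PER_REQUEST) -> Iterator[List[LatLng]]:
    """Split a long GPS trace into chunks small enough for one request each."""
    for start in range(0, len(points), size):
        yield points[start:start + size]


def snap_to_roads_url(points: List[LatLng], api_key: str, interpolate: bool = True) -> str:
    """Build (but do not send) a snap-to-roads request URL."""
    if not 0 < len(points) <= MAX_POINTS_PER_REQUEST:
        raise ValueError("each request needs between 1 and 100 points")
    # The path is a pipe-separated list of "lat,lng" pairs
    path = "|".join(f"{lat},{lng}" for lat, lng in points)
    query = urlencode({
        "path": path,
        "interpolate": str(interpolate).lower(),
        "key": api_key,  # placeholder; a real key is required to call the API
    })
    return f"{SNAP_TO_ROADS_URL}?{query}"


# Hypothetical two-point trace
trace = [(-35.27801, 149.12958), (-35.28032, 149.12907)]
url = snap_to_roads_url(trace, api_key="YOUR_API_KEY")
print(url)
```

When a trace is split into batches, consecutive batches are snapped independently, so overlapping a few points between batches is a common way to keep the matched route continuous.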
There are many sources of images out there, but the best ones for geospatial purposes are those that show what the physical world actually looks like, as a visual frame of reference for other geospatial data. It also helps if they include other geospatial metadata, such as what points of interest can be seen (and perhaps information about them), what time of day it is, and what other signs and signifiers might be nearby for the sake of navigation and safety.
Many search engines, such as Microsoft Bing, now offer mapping services that incorporate aerial photography as a map layer option. Some, Bing included, also offer “street view” services that allow users to view particular road segments as if they were actually traveling on them.
Nexar’s imagery data focuses on roadways. They use special dashboard cameras on cars to not only capture what roads and their surroundings look like, but also to detect road signs and other factors that may affect traffic. This allows navigation companies, insurance firms, governments, and others to assess the safety and driveability of roads more accurately.
These are a few of the many places out there where you can find reliable geospatial data. Now that you know where to get the data you need to build your geospatial data ecosystem, the next chapter will give you some ideas on what you can do with it all.
If you’re ready to learn more, check out the next chapter, “Top 10 Uses of Geospatial Data + Where to Get It”. If you want to learn more about geospatial data types, check out “Geospatial Data Types and How You Can Use Them”.