Address - SafeGraph

Challenges with Address Data Coverage and Quality

Global address data is fragmented. Many regions lack structured, geocoded addresses, limiting the accuracy of geospatial analysis and logistics operations.

Sparse or incomplete address data

Inconsistent address formatting

SafeGraph addresses these gaps with a structured geocoded address database designed for hard-to-source markets.

Built for Advanced Analysis

Every record is parsed into discrete address fields and paired with precise geographic coordinates, ready for ingestion into your data warehouse.

Parsed address components (street, city, region, postal code)
Accurate address data with latitude and longitude
Structured and verified address records designed for reliable geocoding
Multi-script support for international addresses
Flat-file delivery for warehouse ingestion
Consistent schema across countries

Core Capabilities of the SafeGraph Address Dataset

Improve Routing and Location Accuracy

Power routing and spatial analysis with precise address coordinates.

Integrate with Analytics Pipelines

Load address data directly into warehouses for large-scale geospatial analysis.

Validate Address Coverage at Scale

Identify missing addresses and perform bulk address validation across markets.

Support Global Address Workflows

Handle international address formats with multi-script support.

Address Data Across Hard-to-Source Markets

SafeGraph’s geocoded address dataset covers 35+ countries, with particular depth in markets where reliable address data is traditionally sparse or fragmented.

Coverage includes:

How Organizations Use SafeGraph Address Data

Teams across logistics, mapping, and geospatial analytics use SafeGraph geocoded address data to improve operational accuracy and strengthen geocoding systems.

Logistics and Delivery

Perform bulk address validation
Identify missing addresses in service zones
Improve routing precision with verified coordinates
Validate address coverage before expanding service areas
Reduce failed deliveries caused by incomplete address data

Expanding candidate datasets for geocoding APIs
Supporting batch geocoding workflows
Enriching internal geocoded address databases
Providing ground-truth coordinates for validation
Improving search, autocomplete, and routing accuracy in mapping systems
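
One way to approach the coverage checks listed above is to bucket address coordinates into a coarse grid and flag the cells of a service zone that contain no records. The following is a minimal sketch, not SafeGraph tooling: the function names, the bounding-box format, and the cell size are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def cell_of(lat, lon, size_deg):
    """Map a coordinate to the (row, col) index of its grid cell."""
    return (math.floor(lat / size_deg), math.floor(lon / size_deg))


def empty_cells(points, bbox, size_deg=1.0):
    """Return grid cells inside bbox that contain no address points.

    points: iterable of (latitude, longitude) pairs from the dataset.
    bbox:   (min_lat, min_lon, max_lat, max_lon), a hypothetical service zone.
    """
    min_lat, min_lon, max_lat, max_lon = bbox
    counts = Counter(cell_of(lat, lon, size_deg) for lat, lon in points)
    lo_row, lo_col = cell_of(min_lat, min_lon, size_deg)
    hi_row, hi_col = cell_of(max_lat, max_lon, size_deg)
    return [
        (row, col)
        for row in range(lo_row, hi_row + 1)
        for col in range(lo_col, hi_col + 1)
        if counts[(row, col)] == 0
    ]


# Placeholder coordinates, not real dataset values.
points = [(0.2, 0.3), (1.1, 0.4)]
gaps = empty_cells(points, bbox=(0.0, 0.0, 1.5, 1.5), size_deg=1.0)
print(gaps)
```

An empty cell is only a prompt for review: it may be open land or water rather than a gap in address coverage, and a smaller cell size trades fewer false alarms for more cells to inspect.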

Address Schema Overview

The dataset follows a consistent, flat schema designed for direct ingestion into analytical warehouses. Each field is typed and documented.

Column Name	Description	Type	Example
primary_number	A JSON string with alphabet as key. Value: A primary numeric identifier for the building.	JSON	{ "latin": "4700" }
sub_building	A JSON string with alphabet as key. Value: Combined secondary designators (Ste, Unit, Bldg, Block).	JSON	{ "latin": "Unit 1105" }
building_name	A JSON string with alphabet as key. Value: The name of the building.	JSON	{ "latin": "Eaton" }
street	A JSON string with alphabet as key. Value: All street components combined.	JSON	{ "latin": "Main Street" }
intermediate_locality	Additional details associated with locality like subdivision, neighborhood, or village.	JSON	{ "latin": "Hohenlimburg" }
locality	Subdivision or district within a city.	JSON	{ "latin": "Repto Robles" }
city	The city of the point of interest.	JSON	{ "latin": "Clearwater" }
sub_region	Second largest administrative division in a country.	JSON	{ "latin": "Tom" }
region	State, province, or county of the location.	JSON	{ "latin": "Pinellas" }
postal_code	The postal code of the location.	JSON	{ "latin": "12235" }
full_address	The full unparsed address from the source.	JSON	{ "latin": "1680 Campbell Ln Bowling Green KY 42104-1062" }
iso_country_code	2-letter ISO country code.	String	US
latitude	Latitude coordinate of the address.	Float	36.714767
longitude	Longitude coordinate of the address.	Float	121.662912
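
Because most columns are JSON strings keyed by alphabet, a loader typically decodes each cell before use. The sketch below shows one way to do that in Python. It assumes each row arrives as a dictionary of raw strings (for example from `csv.DictReader`); the helper names and the fallback rule for records without a `"latin"` key are illustrative assumptions, not documented SafeGraph behavior. The sample values are taken from the Example column above and do not describe a single real record.

```python
import json

# Columns the schema table lists with type JSON.
JSON_COLUMNS = (
    "primary_number", "sub_building", "building_name", "street",
    "intermediate_locality", "locality", "city", "sub_region",
    "region", "postal_code", "full_address",
)


def pick_script(raw, preferred="latin"):
    """Decode one JSON cell and return the value for the preferred key.

    Falls back to the first available key (an assumption) and returns
    None for empty or missing cells.
    """
    if not raw:
        return None
    values = json.loads(raw)
    if not values:
        return None
    if preferred in values:
        return values[preferred]
    return next(iter(values.values()))


def parse_row(row, preferred="latin"):
    """Turn a raw row (dict of strings) into plain Python values."""
    parsed = {col: pick_script(row.get(col), preferred) for col in JSON_COLUMNS}
    parsed["iso_country_code"] = row["iso_country_code"]
    parsed["latitude"] = float(row["latitude"])
    parsed["longitude"] = float(row["longitude"])
    return parsed


sample = {
    "primary_number": '{ "latin": "4700" }',
    "street": '{ "latin": "Main Street" }',
    "city": '{ "latin": "Clearwater" }',
    "iso_country_code": "US",
    "latitude": "36.714767",
    "longitude": "121.662912",
}
record = parse_row(sample)
print(record["primary_number"], record["street"], record["city"])
```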

Address Data Designed for Modern Data Infrastructure

SafeGraph address data integrates easily with modern analytics and geospatial infrastructure.

FAQs

What is a geocoded address dataset?

A geocoded address dataset contains structured address records paired with geographic coordinates such as latitude and longitude.

How is this different from a geocoding API?

Geocoding APIs resolve addresses one at a time through a live request. That works for individual lookups but can become expensive at scale. SafeGraph delivers a flat file you load directly into your warehouse, so you can run bulk address validation and analysis without rate limits or live dependencies.

Can the dataset support batch geocoding?

SafeGraph’s flat file format makes it well-suited for batch workflows. You can validate large address lists, fill in missing coordinates, and run coverage analysis across entire service areas directly from your warehouse, without managing API calls or hitting rate limits.
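
As a rough illustration of filling in missing coordinates from a local copy of the data, the sketch below matches an address list against reference records by a normalized address string. It is a simplified exact-match approach under stated assumptions: the function names are hypothetical, `full_address` is assumed to be already decoded to a plain string, and the coordinates shown are placeholders rather than real values. Production matching usually needs more tolerant comparison, such as abbreviation handling or field-by-field matching.

```python
import re


def normalize(address):
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    cleaned = re.sub(r"[^\w\s]", " ", address.lower())
    return " ".join(cleaned.split())


def build_index(reference_rows):
    """Map normalized full_address to (latitude, longitude)."""
    return {
        normalize(row["full_address"]): (row["latitude"], row["longitude"])
        for row in reference_rows
    }


def fill_missing(addresses, index):
    """Split addresses into those with matched coordinates and those without."""
    matched, unmatched = {}, []
    for address in addresses:
        coords = index.get(normalize(address))
        if coords is None:
            unmatched.append(address)
        else:
            matched[address] = coords
    return matched, unmatched


# Placeholder coordinates for illustration only.
reference = [
    {
        "full_address": "1680 Campbell Ln Bowling Green KY 42104-1062",
        "latitude": 1.0,
        "longitude": 2.0,
    },
]
index = build_index(reference)
matched, unmatched = fill_missing(
    ["1680 Campbell Ln, Bowling Green, KY 42104-1062", "1 Unknown Rd"],
    index,
)
print(matched, unmatched)
```

The unmatched list doubles as a simple coverage report: addresses that fail to match can be reviewed for formatting problems or for gaps in the reference data.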

Does the dataset include latitude and longitude?

Yes. Every record pairs its parsed address fields with geographic coordinates. The schema includes latitude and longitude columns, both stored as floats.

How is the address dataset delivered?

The dataset is delivered as a flat file with a consistent schema across countries, designed for direct ingestion into your data warehouse.
