Working with Locations Inside Other Locations

Introduction

What do you do when a single piece of land can be rightfully claimed by two different locations?

It’s not that unusual for one place to be two places. Take the Claire’s in your nearby mall. Is that spot in the Claire’s? Is it in the mall? Really, it’s in both.

SafeGraph addresses this problem using the concept of parent and child locations. Some overarching location that contains many other locations within it is a parent. Common types of parents include malls, airports, and shopping centers. You can see the full list of location types that can be parents, and some more information about parent/child relationships, here. The location that’s inside of some sort of parent is a child.

Dealing with parent and child locations can be an important part of working with SafeGraph data, even if you’re not interested in distinguishing them, since if you’re not careful you could end up double-counting foot traffic: once for the parent, and once for the child.

So some important questions:

  1. How can we look in our data and figure out where we have parent and child locations?
  2. How can we work with data that has parent and child locations?
  3. How can we think about the spatial orientation of parent and child locations?

Let’s start by opening up our data and loading it in.
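The loading code itself isn’t shown here, so this is only a minimal sketch of what it might look like, assuming the sample arrives as a CSV and that you’re working in pandas. The inline text is a stand-in for a real file and reproduces just two rows of the sample, so swap in the path to your own export.

```python
import io

import pandas as pd

# Stand-in for a real export; with real data, pass the file path to read_csv instead.
csv_text = """customer_placekey,customer_parent_placekey,customer_location_name,placekey
zzw-222@64h-vr7-ysq,222-226@64h-vr7-ysq,Sharps Barbershop,222-226@64h-vr7-ysq
222-226@64h-vr7-ysq,,Pleasant Valley Marketplace,222-226@64h-vr7-ysq
"""

# Read every column as text; empty parent cells still come through as NaN.
df = pd.read_csv(io.StringIO(csv_text), dtype=str)
print(df.head())
```

Reading the Placekey columns as strings is a cautious choice here, since they are identifiers rather than numbers.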

    customer_placekey    customer_parent_placekey  customer_location_name         placekey
1   zzw-222@64h-vr7-ysq  222-226@64h-vr7-ysq       Sharps Barbershop              222-226@64h-vr7-ysq
2   zzy-222@64h-vr7-ysq  222-226@64h-vr7-ysq       Cajun Seafood & Wings          222-226@64h-vr7-ysq
3   222-226@64h-vr7-ysq  NaN                       Pleasant Valley Marketplace    222-226@64h-vr7-ysq
4   zzw-223@64h-vr7-ysq  222-226@64h-vr7-ysq       Mamma Mia Pizzeria             222-226@64h-vr7-ysq
5   22p-222@64h-vr7-y35  222-226@64h-vr7-ysq       Kingdom World Outreach Center  222-226@64h-vr7-ysq
6   22f-222@64h-vr7-yvz  222-226@64h-vr7-ysq       Saigon 1                       222-226@64h-vr7-ysq
7   22g-222@64h-vr7-yvz  222-226@64h-vr7-ysq       DHL                            222-226@64h-vr7-ysq
8   zzw-224@64h-vr7-y35  222-226@64h-vr7-ysq       Sally’s Bakery & Grocery       222-226@64h-vr7-ysq
9   22r-222@64h-vr7-9j9  222-226@64h-vr7-ysq       Smoke Shack                    222-226@64h-vr7-ysq

This sample contains a set of customer_ columns holding the child data that was used for matching, followed by a second set of columns holding the match data, i.e. the parent.

Store data that was acquired in a different way, for example getting all POIs in a certain city, might be structured slightly differently, so the next steps might not be necessary for you.

So, question 1 as above:

1. How can we look in our data and figure out where we have parent and child locations?

We’ve got the data. Here the match data has been separated out, but for any set of POIs that contains both parents and children, we can figure out which is which by looking at the parent_placekey column. This column is missing for any location without a parent. For child locations, it tells you which location is the parent.

Children are the rows with a non-missing parent_placekey.

And parents are the rows whose placekey shows up as the parent_placekey of some other row.

Let’s find this in our data here:

    customer_placekey    customer_parent_placekey  customer_location_name         placekey
1   zzw-222@64h-vr7-ysq  222-226@64h-vr7-ysq       Sharps Barbershop              222-226@64h-vr7-ysq
2   zzy-222@64h-vr7-ysq  222-226@64h-vr7-ysq       Cajun Seafood & Wings          222-226@64h-vr7-ysq
3   222-226@64h-vr7-ysq  NaN                       Pleasant Valley Marketplace    222-226@64h-vr7-ysq
4   zzw-223@64h-vr7-ysq  222-226@64h-vr7-ysq       Mamma Mia Pizzeria             222-226@64h-vr7-ysq
5   22p-222@64h-vr7-y35  222-226@64h-vr7-ysq       Kingdom World Outreach Center  222-226@64h-vr7-ysq
6   22f-222@64h-vr7-yvz  222-226@64h-vr7-ysq       Saigon 1                       222-226@64h-vr7-ysq
7   22g-222@64h-vr7-yvz  222-226@64h-vr7-ysq       DHL                            222-226@64h-vr7-ysq
8   zzw-224@64h-vr7-y35  222-226@64h-vr7-ysq       Sally’s Bakery & Grocery       222-226@64h-vr7-ysq
9   22r-222@64h-vr7-9j9  222-226@64h-vr7-ysq       Smoke Shack                    222-226@64h-vr7-ysq
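
The two rules for spotting children and parents can be sketched in pandas roughly like this. It assumes columns named placekey and parent_placekey (in the matched sample they carry a customer_ prefix), and the three rows are copied from the sample purely for illustration.

```python
import pandas as pd

# Three rows copied from the sample; column names assume no customer_ prefix.
pois = pd.DataFrame({
    "placekey": ["zzw-222@64h-vr7-ysq", "zzy-222@64h-vr7-ysq", "222-226@64h-vr7-ysq"],
    "parent_placekey": ["222-226@64h-vr7-ysq", "222-226@64h-vr7-ysq", None],
    "location_name": ["Sharps Barbershop", "Cajun Seafood & Wings", "Pleasant Valley Marketplace"],
})

# Children: rows with a non-missing parent_placekey.
is_child = pois["parent_placekey"].notna()

# Parents: rows whose placekey appears as some other row's parent_placekey.
is_parent = pois["placekey"].isin(pois["parent_placekey"].dropna())

print(pois.loc[is_child, "location_name"].tolist())
print(pois.loc[is_parent, "location_name"].tolist())
```

Note that a parent only shows up this way if its own row is in your data. If you pulled children without their parent, the parent_placekey values will point at a row you don’t have.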

Now it just so happens that the data we’ve taken for this demonstration includes exactly one parent: the Pleasant Valley Marketplace in Virginia Beach, and all of its children.

With our data loaded, and the parents and children identified, we can move on to our next question:

2. How can we work with data that has parent and child locations?

This requires us to think about what kind of parent/child relationship we have. The most important distinction is whether or not the child is enclosed within its parent.

The enclosed column tells you whether a child is enclosed within its parent. This is something like that Claire’s in the mall. It’s really inside that mall, as opposed to a burger joint in an outdoor strip mall, which is in that strip mall but maybe also sort of its own space. When a location like Claire’s is enclosed, it can be difficult to tell the difference between a device being in Claire’s and being near Claire’s inside the mall.

The way we deal with parent and child locations differs considerably based on whether the children are enclosed or not.

In the case of Pleasant Valley Marketplace, the children are enclosed (enclosed == True).

Children that are enclosed (enclosed == True) are basically not distinguished from their parents. They do not have their own separate foot traffic data. Sometimes they have their own polygons, but sometimes they’re just a part of the parent polygon, although they may have their own latitude/longitude data.

Children that are not enclosed (enclosed == False) have parents but also act as independent locations. We track visitor data like visits_per_day separately for those locations, and they have their own polygon data in the polygon_wkt column.

The Pleasant Valley Marketplace is full of enclosed children, so we’ll talk about how to handle that first.
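
Before going further, here is a minimal sketch of how the two kinds of children might be separated. The rows and keys are made up, and it assumes the enclosed column is a true/false value with no gaps; if yours comes through as text or has missing values, you’d need to clean it first.

```python
import pandas as pd

# Hypothetical rows; names, keys, and enclosed values are invented for illustration.
pois = pd.DataFrame({
    "location_name": ["Mall", "Claire's", "Strip Mall", "Burger Joint"],
    "parent_placekey": [None, "mall-key", None, "strip-key"],
    "enclosed": [False, True, False, False],
})

children = pois[pois["parent_placekey"].notna()]

# Enclosed children share the parent's foot traffic; the rest have their own.
enclosed_children = children[children["enclosed"]]
open_children = children[~children["enclosed"]]

print(enclosed_children["location_name"].tolist())
print(open_children["location_name"].tolist())
```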

Working with Enclosed Children

How can we handle data from enclosed children? Well, we can ignore the children’s foot traffic data, since they don’t really have any. We can take the parent’s foot traffic data instead, which covers the entire region.

    customer_placekey    customer_parent_placekey  customer_location_name         placekey
1   zzw-222@64h-vr7-ysq  222-226@64h-vr7-ysq       Sharps Barbershop              222-226@64h-vr7-ysq
2   zzy-222@64h-vr7-ysq  222-226@64h-vr7-ysq       Cajun Seafood & Wings          222-226@64h-vr7-ysq
3   222-226@64h-vr7-ysq  NaN                       Pleasant Valley Marketplace    222-226@64h-vr7-ysq
4   zzw-223@64h-vr7-ysq  222-226@64h-vr7-ysq       Mamma Mia Pizzeria             222-226@64h-vr7-ysq
5   22p-222@64h-vr7-y35  222-226@64h-vr7-ysq       Kingdom World Outreach Center  222-226@64h-vr7-ysq
6   22f-222@64h-vr7-yvz  222-226@64h-vr7-ysq       Saigon 1                       222-226@64h-vr7-ysq
7   22g-222@64h-vr7-yvz  222-226@64h-vr7-ysq       DHL                            222-226@64h-vr7-ysq
8   zzw-224@64h-vr7-y35  222-226@64h-vr7-ysq       Sally’s Bakery & Grocery       222-226@64h-vr7-ysq
9   22r-222@64h-vr7-9j9  222-226@64h-vr7-ysq       Smoke Shack                    222-226@64h-vr7-ysq

However, while we don’t have traffic data for the children, we do have plenty of other information about them from the core information columns.

    placekey             parent_placekey      location_name
0   zzw-223@64h-vr7-y35  222-226@64h-vr7-ysq  Western Union
1   zzw-222@64h-vr7-ysq  222-226@64h-vr7-ysq  Sharps Barbershop
2   zzy-222@64h-vr7-ysq  222-226@64h-vr7-ysq  Cajun Seafood & Wings
4   zzw-223@64h-vr7-ysq  222-226@64h-vr7-ysq  Mamma Mia Pizzeria
5   22p-222@64h-vr7-y35  222-226@64h-vr7-ysq  Kingdom World Outreach Center
6   22f-222@64h-vr7-yvz  222-226@64h-vr7-ysq  Saigon 1
7   22g-222@64h-vr7-yvz  222-226@64h-vr7-ysq  DHL
8   zzw-224@64h-vr7-y35  222-226@64h-vr7-ysq  Sally’s Bakery & Grocery
9   22r-222@64h-vr7-9j9  222-226@64h-vr7-ysq  Smoke Shack
10  zzw-222@64h-vr7-y35  222-226@64h-vr7-ysq  Dolphin Laundromat
11  zzy-222@64h-vr7-yvz  222-226@64h-vr7-ysq  State Farm
12  222-222@64h-vr7-ysq  222-226@64h-vr7-ysq  Krossroads Cafe and Tavern
13  228-222@64h-vr7-yvz  222-226@64h-vr7-ysq  Allstate Insurance
14  22b-222@64h-vr7-ysq  222-226@64h-vr7-ysq  Iglesia Cristiana Rios de Agua Viva de Virginia Beach
15  zzw-222@64h-vr7-yvz  222-226@64h-vr7-ysq  Family Dollar Stores
16  222-224@64h-vr7-ysq  222-226@64h-vr7-ysq  Tung Hoi Chinese Restaurant
17  zzw-226@64h-vr7-y35  222-226@64h-vr7-ysq  Fj Beauty Studios
18  22s-222@64h-vr7-y35  222-226@64h-vr7-ysq  Adamo’s New York Pizzeria
19  222-223@64h-vr7-ysq  222-226@64h-vr7-ysq  Food Lion
20  22f-222@64h-vr7-ysq  222-226@64h-vr7-ysq  Tokyo Express

For example, maybe we’re interested in the kinds of businesses and locations that are inside the shopping center.

top_category                                                            sub_category                                                Number
Activities Related to Credit Intermediation                             Other Activities Related to Credit Intermediation           1
Agencies, Brokerages, and Other Insurance Related Activities            Insurance Agencies and Brokerages                           2
Bakeries and Tortilla Manufacturing                                     Retail Bakeries                                             1
Couriers and Express Delivery Services                                  Couriers and Express Delivery Services                      1
Drycleaning and Laundry Services                                        Drycleaning and Laundry Services (except Coin-Operated)     1
General Merchandise Stores, including Warehouse Clubs and Supercenters  All Other General Merchandise Stores                        1
Grocery Stores                                                          Supermarkets and Other Grocery (except Convenience) Stores  1
Other Miscellaneous Store Retailers                                     Tobacco Stores                                              1
Personal Care Services                                                  Barber Shops                                                1
                                                                        Beauty Salons                                               1
Religious Organizations                                                 Religious Organizations                                     2
Restaurants and Other Eating Places                                     Full-Service Restaurants                                    7
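
A count like this one can be produced with a groupby on the two category columns. The sketch below is illustrative only: the four businesses are from the shopping center, but which category each one falls in is a guess based on its name, not something read from the data.

```python
import pandas as pd

INSURANCE = "Agencies, Brokerages, and Other Insurance Related Activities"

# Category assignments below are guesses from the business names, for illustration.
children = pd.DataFrame({
    "location_name": ["Sharps Barbershop", "Fj Beauty Studios", "State Farm", "Allstate Insurance"],
    "top_category": ["Personal Care Services", "Personal Care Services", INSURANCE, INSURANCE],
    "sub_category": ["Barber Shops", "Beauty Salons",
                     "Insurance Agencies and Brokerages", "Insurance Agencies and Brokerages"],
})

# Count POIs in each (top_category, sub_category) pair.
counts = children.groupby(["top_category", "sub_category"]).size().rename("Number")
print(counts)
```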

3. How can we think about the spatial orientation of parent and child locations?

In the case of enclosed-children data where the children have their own polygons, you can handle spatial orientation as normal. Simply look at the polygons!

But what if they don’t, as in this data? At this point, you’re stuck with just the parent polygon. But you can still do a little something with the children, because they will have latitude and longitude data that you can work with.

So we’ll start by mapping out the parent polygon. The polygon_wkt column holds the POI’s polygon in WKT (well-known text) format.

We can add the child locations on as points (see this guide).
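
If you’d like to check the geometry without drawing a map, here is a rough sketch that reads a simple WKT polygon and tests whether a child’s longitude/latitude falls inside it. It is plain Python so it runs anywhere; in practice a geometry library such as shapely would normally do this job. The rectangle and the two points are invented, not the real Pleasant Valley coordinates, and the parser only handles a single outer ring with no holes.

```python
def parse_wkt_polygon(wkt):
    """Turn 'POLYGON ((x y, x y, ...))' into a list of (lon, lat) tuples.

    Only a single outer ring is supported; polygons with holes will fail.
    """
    inner = wkt.strip()[len("POLYGON"):].strip().strip("()")
    return [tuple(float(v) for v in pair.split()) for pair in inner.split(",")]


def point_in_polygon(lon, lat, ring):
    """Ray casting: count the edges crossed by a ray running east from the point."""
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        if (y1 > lat) != (y2 > lat):  # edge spans the point's latitude
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


# Made-up parent polygon and child points, for illustration only.
parent_wkt = "POLYGON ((-76.10 36.80, -76.09 36.80, -76.09 36.81, -76.10 36.81, -76.10 36.80))"
child_points = {"Store A": (-76.095, 36.805), "Store B": (-76.08, 36.805)}

ring = parse_wkt_polygon(parent_wkt)
inside_parent = {name: point_in_polygon(lon, lat, ring)
                 for name, (lon, lat) in child_points.items()}
print(inside_parent)
```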

This looks like the kind of place where there’s a long row of stores down the center, surrounded by parking. Let’s make sure that makes sense. First of all… does the polygon include parking? We can check in the includes_parking_lot column.

Yep! That’s a parking lot. Knowing that we have a parking lot is important for interpreting foot traffic data. For example, foot traffic to a McDonald’s means something very different depending on whether or not we pick up the drive-thru.

What is notable is that Pleasant Valley turns out to be an outdoor shopping center, which goes to show that sometimes these kinds of locations can also be “enclosed.”

Working with Non-Enclosed Children

Let’s look at another part of the data, picking out a parent location that has non-enclosed children. Unlike the first data set, this one was created by pulling all locations in a certain zip code. This means we can see, and understand how to import, this alternate structure of data.

This time we’ll be working with the Shoppes at Lac de Ville, which is in Rochester, New York. Its Placekey is 222-224@665-8rv-vs5, so to get both the parent and child data, we can look for that code in either the parent_placekey column or the placekey column.
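
As a minimal sketch of that filter, assuming the zip-code pull has been loaded into a pandas DataFrame: the first three rows below come from the data shown next, and the last row is an invented, unrelated POI to show that it gets filtered out.

```python
import pandas as pd

PARENT_KEY = "222-224@665-8rv-vs5"  # Shoppes at Lac de Ville

pois = pd.DataFrame({
    "placekey": ["222-224@665-8rv-vs5", "zzw-22b@665-8rv-sdv",
                 "222-225@665-8rv-vs5", "made-up-key"],
    "parent_placekey": [None, "222-224@665-8rv-vs5", "222-224@665-8rv-vs5", None],
    "location_name": ["Shoppes At Lac De Ville", "Allstate Insurance",
                      "Citizens Bank", "Unrelated POI (invented)"],
})

# Keep the parent row itself plus every row that names it as its parent.
family = pois[(pois["placekey"] == PARENT_KEY) | (pois["parent_placekey"] == PARENT_KEY)]
print(family["location_name"].tolist())
```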

     placekey             parent_placekey      location_name                                 bucketed_dwell_times
43   222-224@665-8rv-vs5  NaN                  Shoppes At Lac De Ville                       {"240":177}
56   zzw-22b@665-8rv-sdv  222-224@665-8rv-vs5  Allstate Insurance                            NaN
59   224-222@665-8rv-vmk  222-224@665-8rv-vs5  Parker Robt E III DDS                         {"240":0}
97   zzw-223@665-8rv-sdv  222-224@665-8rv-vs5  Silk                                          {"240":0}
153  222-225@665-8rv-vs5  222-224@665-8rv-vs5  Citizens Bank                                 {"240":0}
187  222-222@665-8rv-s89  222-224@665-8rv-vs5  Ritz Stacey M Od                              {"240":5}
224  224-222@665-8rv-vs5  222-224@665-8rv-vs5  Mobile Notary Service                         NaN
227  229-222@665-8rv-vmk  222-224@665-8rv-vs5  Joseph I Mann MD Greater Rochester Neurology  {"240":0}
229  zzw-228@665-8rv-sdv  222-224@665-8rv-vs5  Project Leannation                            {"240":0}
283  223-222@665-8rv-vj9  222-224@665-8rv-vs5  Visionary Eye Associates                      {"240":0}
286  222-228@665-8rv-vs5  222-224@665-8rv-vs5  Julian’s Dry Cleaners                         NaN
296  222-224@665-8rv-s89  222-224@665-8rv-vs5  Rochester Eye Associates                      NaN
301  zzw-227@665-8rv-sdv  222-224@665-8rv-vs5  Mesquite Grill                                {"240":11}
395  zzw-222@665-8rv-sdv  222-224@665-8rv-vs5  Dollar General                                {"240":28}
396  222-229@665-8rv-vs5  222-224@665-8rv-vs5  Bolsa Nails                                   {"240":0}
426  zzw-229@665-8rv-sdv  222-224@665-8rv-vs5  Thimble Tailoring & Clothier                  NaN
439  223-222@665-8rv-x5z  222-224@665-8rv-vs5  M&T Bank                                      {"240":1}
463  222-227@665-8rv-vs5  222-224@665-8rv-vs5  Liberty Wine & Liquor                         {"240":0}
471  222-223@665-8rv-vs5  222-224@665-8rv-vs5  Feet First Shoes and Pedorthics               {"240":0}
514  222-226@665-8rv-vs5  222-224@665-8rv-vs5  CVS                                           {"240":0}
558  222-22c@665-8rv-vs5  222-224@665-8rv-vs5  Boomtown Cafe                                 {"240":4}
580  zzw-222@665-8rv-vj9  222-224@665-8rv-vs5  Evangelisti Reconstructive & Plastic Surgery  {"240":0}
618  222-222@665-8rv-vs5  222-224@665-8rv-vs5  Oreck                                         {"240":0}
625  222-223@665-8rv-s89  222-224@665-8rv-vs5  Dupont David OD                               {"240":0}
779  zzy-223@665-8rv-sdv  222-224@665-8rv-vs5  Amaya Indian Cuisine                          {"240":18}
785  zzw-223@665-8rv-vj9  222-224@665-8rv-vs5  Stephen Evangelisti                           {"240":1}
835  222-222@665-8rv-x89  222-224@665-8rv-vs5  MacGregor’s Grill & Tap                       {"240":1}
854  zzy-222@665-8rv-sdv  222-224@665-8rv-vs5  Paislee Boutique                              {"240":0}
857  zzw-222@665-8rv-vpv  222-224@665-8rv-vs5  Tops Friendly Markets                         {"240":45}
868  226-222@665-8rv-vmk  222-224@665-8rv-vs5  Brighton Towne Dental                         {"240":26}
889  zzw-225@665-8rv-sdv  222-224@665-8rv-vs5  United States Postal Service (USPS)           NaN
900  zzw-224@665-8rv-sdv  222-224@665-8rv-vs5  Rita’s Italian Ice                            {"240":35}
902  zzy-222@665-8rv-vj9  222-224@665-8rv-vs5  CaminoByTheWay                                {"240":2}
And are these actually non-enclosed child locations? Let’s make sure.

All false! That’s what we were expecting.

The first thing to be aware of when dealing with non-enclosed data is that unless we’re careful, we’ll double-count foot traffic. Foot traffic that shows up for a child will also show up for its parent. So we will want to drop one or the other if we’re going to be aggregating things up and don’t want to double-count.
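
Here is a small sketch of the two ways to drop one side before aggregating. Everything in it is made up, including the visits column name, and it assumes enclosed is a clean true/false column.

```python
import pandas as pd

# Invented totals; the parent's 30 visits already include its children's 10 + 20.
pois = pd.DataFrame({
    "placekey": ["parent-1", "child-1", "child-2", "standalone-1"],
    "parent_placekey": [None, "parent-1", "parent-1", None],
    "enclosed": [False, False, False, False],
    "visits": [30, 10, 20, 5],  # hypothetical column name
})

naive_total = pois["visits"].sum()  # counts the children's visits twice

# Option A: keep parents, drop their non-enclosed children.
is_open_child = pois["parent_placekey"].notna() & ~pois["enclosed"]
total_without_children = pois.loc[~is_open_child, "visits"].sum()

# Option B: keep children, drop every row that acts as a parent.
is_parent = pois["placekey"].isin(pois["parent_placekey"].dropna())
total_without_parents = pois.loc[~is_parent, "visits"].sum()

print(naive_total, total_without_children, total_without_parents)
```

The two options agree here only because every parent visit also belongs to a child. When a parent records visits that no child accounts for, Option B leaves those out, so which side to drop depends on the question you’re asking.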

How can we tell if we have double-counting going on? Well, it should be going on any time you have a non-enclosed child location. But it’s especially easy to see if the parent location doesn’t have any visitors outside of its children, as is the case here. We can add up the daily visits for the parent, and for all the children, and should get the exact same values.

    parent_visits  child_visits
0   25             25.0
1   28             28.0
2   28             28.0
3   54             54.0
4   77             77.0
5   55             55.0
6   54             54.0
7   58             58.0
8   29             29.0
9   30             30.0
10  67             67.0
11  73             73.0
12  61             61.0
13  55             55.0
14  58             58.0
15  33             33.0
16  22             22.0
17  43             43.0
18  83             83.0
19  51             51.0
20  50             50.0
21  66             66.0
22  50             50.0
23  25             25.0
24  56             56.0
25  58             58.0
26  62             62.0
27  53             53.0
28  65             65.0
29  36             36.0
30  21             21.0
They’re exactly the same! Clearly if we want to work with foot traffic data, we’ll need to only use one or the other.
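
A comparison like the one above could be built along these lines. This is a sketch under assumptions: it supposes the daily visits sit in a column, called visits_by_day here, holding a JSON-style list as text, and the keys and per-child numbers are invented (the totals mirror the first three days of the table).

```python
import json

import pandas as pd

# Invented rows; visits_by_day is assumed to be a JSON list stored as text.
pois = pd.DataFrame({
    "placekey": ["parent-1", "child-1", "child-2"],
    "parent_placekey": [None, "parent-1", "parent-1"],
    "visits_by_day": ["[25, 28, 28]", "[10, 20, 8]", "[15, 8, 20]"],
})

daily = pois["visits_by_day"].apply(json.loads)

# One list for the parent, one list per child.
parent_visits = pd.Series(daily[pois["placekey"] == "parent-1"].iloc[0])
child_lists = daily[pois["parent_placekey"] == "parent-1"]

# Stack the child lists into rows and add them up day by day.
child_visits = pd.DataFrame(child_lists.tolist()).sum(axis=0)

comparison = pd.DataFrame({"parent_visits": parent_visits, "child_visits": child_visits})
print(comparison)
```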

Next we can ask how to deal with the spatial arrangement of our data. This time, we have data where each place of interest has its own polygon in the polygon_wkt column.

     polygon_wkt
56   POLYGON ((-77.59417363840089 43.119868567405916, -77.59414413410173 43.119943943129186, -77.59391480523095 43.119885208807354, -77.5939416273211 43.11981472754674, -77.59417363840089 43.119868567405916))
59   POLYGON ((-77.59143161215052 43.120892209242385, -77.5912596878754 43.120816922141515, -77.591356 43.120596, -77.591542 43.120639, -77.59143161215052 43.120892209242385))
97   POLYGON ((-77.59414341246315 43.11994394312916, -77.59407635723778 43.120070221730195, -77.59401466643044 43.120098609906954, -77.593844346158 43.12004868586318, -77.59391274248787 43.119885208807375, -77.59414341246315 43.11994394312916))
153  POLYGON ((-77.59200493412783 43.120714745000875, -77.59228924828341 43.12079207746439, -77.59237105565836 43.120632273247, -77.59284446554949 43.120730896725995, -77.59273985939791 43.12101428539153, -77.59253094468227 43.12150846710204, -77.59250412259212 43.12151042486173, -77.5924638894569 43.12157307313886, -77.59240219864955 43.121608312766554, -77.59197036299815 43.12150455158246, -77.5919971850883 43.12144777652045, -77.59196231637111 43.12144777652045, -77.59193549428096 43.12150259382258, -77.59167800221553 43.12143211442511, -77.59185502801051 43.121030771864206, -77.59177456174007 43.121015109662125, -77.59183088812938 43.120928967478925, -77.59188453230968 43.12094854525849, -77.59200493412783 43.120714745000875))
187  POLYGON ((-77.5918583166116 43.120005479690604, -77.59162764663633 43.119952618856665, -77.59174566383298 43.11967656709296, -77.59197901601728 43.11972747034871, -77.5918583166116 43.120005479690604))
224  POLYGON ((-77.59200493412783 43.120714745000875, -77.59228924828341 43.12079207746439, -77.59237105565836 43.120632273247, -77.59284446554949 43.120730896725995, -77.59273985939791 43.12101428539153, -77.59253094468227 43.12150846710204, -77.59250412259212 43.12151042486173, -77.5924638894569 43.12157307313886, -77.59240219864955 43.121608312766554, -77.59197036299815 43.12150455158246, -77.5919971850883 43.12144777652045, -77.59196231637111 43.12144777652045, -77.59193549428096 43.12150259382258, -77.59167800221553 43.12143211442511, -77.59185502801051 43.121030771864206, -77.59177456174007 43.121015109662125, -77.59183088812938 43.120928967478925, -77.59188453230968 43.12094854525849, -77.59200493412783 43.120714745000875))
227  POLYGON ((-77.59164308521562 43.12040763170624, -77.59147215835709 43.120331919138046, -77.59159 43.120058, -77.591775 43.120101, -77.59164308521562 43.12040763170624))
229  POLYGON ((-77.59423801141725 43.11974620402106, -77.59417363840089 43.11987052521808, -77.59394430953012 43.11981276973279, -77.59399124818788 43.1196972585986, -77.59423801141725 43.11974620402106))
283  POLYGON ((-77.591946 43.121896, -77.591944 43.121862, -77.591803 43.121865, -77.591801 43.121798, -77.591748 43.121784, -77.591736 43.121808, -77.591592 43.12177, -77.591574 43.121808, -77.591446 43.121774, -77.591542 43.121581, -77.591813 43.121654, -77.591804 43.121671, -77.591936 43.121668, -77.591937 43.121699, -77.592063 43.121696, -77.592065 43.121762, -77.592282 43.121757, -77.592288 43.121889, -77.591946 43.121896))

Each POI having its own geometry is going to be the case whenever we have non-enclosed children, but keep in mind it will also sometimes be the case with enclosed children. Just be sure to check whether it’s there!

When it comes to polygons in close proximity like this, sometimes we can be certain of how well we have the shape down, and other times we can’t. For this we’d want to look at the polygon_class column. Ideally we want this to be an OWNED_POLYGON indicating that we can map the location to a specific polygon. Otherwise, there might be a little uncertainty. What do we have here?

We have a single parent location with an OWNED_POLYGON, as well as 13 children with OWNED_POLYGONs. In addition, 19 children have SHARED_POLYGONs, which means that multiple POIs have ended up sharing the same polygon. Either these POIs can’t be distinguished, or they may literally share the same space. More detail here.
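
A tally like this, plus a quick check for which POIs share a shape, might look like the sketch below. Citizens Bank and Mobile Notary Service really do carry identical polygons in the table above, but the polygon_class values and the shortened WKT strings here are placeholders I’ve assumed for illustration.

```python
import pandas as pd

# polygon_class values and the toy WKT strings are assumed placeholders.
pois = pd.DataFrame({
    "location_name": ["Citizens Bank", "Mobile Notary Service", "Ritz Stacey M Od"],
    "polygon_class": ["SHARED_POLYGON", "SHARED_POLYGON", "OWNED_POLYGON"],
    "polygon_wkt": [
        "POLYGON ((0 0, 1 0, 1 1, 0 0))",
        "POLYGON ((0 0, 1 0, 1 1, 0 0))",
        "POLYGON ((2 2, 3 2, 3 3, 2 2))",
    ],
})

# How many POIs fall in each polygon class.
class_counts = pois["polygon_class"].value_counts()

# Flag every row whose polygon text is repeated by another row.
shares_polygon = pois["polygon_wkt"].duplicated(keep=False)

print(class_counts)
print(pois.loc[shares_polygon, "location_name"].tolist())
```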

As you might expect, the child polygons sit inside of the parent polygon. Using the same methods as before, this time we can actually get the internal structure of the location.

We can see the exact structure taken up by the children. It doesn’t fill the whole parent space! And yet, every single visit to the parent was accounted for by a child. What gives? Well, all that blank space is parking lot.

And while the parent location includes the parking lot…

The children don’t…

And so what’s happening? SafeGraph is willing to count visits to parent POIs that aren’t to any children, and there are areas here that are part of the parent but not the children. But in this case we can see that we aren’t counting any visits from the parking lot to the parent POI (or perhaps there weren’t any, but that seems unlikely). Good to know!
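
To put a rough number on how much of a parent its children cover, one option is to compare polygon areas. The sketch below uses the shoelace formula on invented rectangles; it treats coordinates as flat x/y values, so with longitude and latitude it is only good for a ratio over a small area, and it would overcount if child polygons overlap or are shared.

```python
def ring_area(ring):
    """Planar shoelace area of a closed ring of (x, y) pairs."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# Invented shapes: a 10 x 6 parent lot with two store footprints inside it.
parent_ring = [(0, 0), (10, 0), (10, 6), (0, 6), (0, 0)]
child_rings = [
    [(1, 4), (4, 4), (4, 5), (1, 5), (1, 4)],
    [(5, 4), (9, 4), (9, 5), (5, 5), (5, 4)],
]

# Share of the parent polygon covered by its children.
covered = sum(ring_area(r) for r in child_rings) / ring_area(parent_ring)
print(round(covered, 3))
```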

Wrapping Up

So there we have it! Some reminders:

  • Parent locations like malls and airports have child locations inside them
  • Locations can include or exclude parking lots
  • Children can be enclosed or non-enclosed
  • Enclosed children don’t get their own foot traffic data
  • Non-enclosed children do, and if you’re aggregating up you want to drop either the parents or the non-enclosed children or else you’ll double-count
  • Some enclosed children don’t have their own polygons
  • Some enclosed children, and all non-enclosed children, should have their own polygons
  • The parent polygon can be bigger than the combined area of its children, but sometimes this additional area doesn’t record visits (sometimes it does, though)
