# SafeGraph

> Contact: press@safegraph.com

### Posts

#### 10 Non-Obvious Rules to Raising a Series B

SafeGraph recently raised a $45M Series B and I have gotten a ton of inbound from entrepreneurs in similar stages asking for best practices. This piece is for other entrepreneurs but also for venture capitalists who want to better understand what crazy founders are thinking.

If there is one thing common across humanity, it is that we all just want to be loved and to feel loved.

If you are looking for a spouse, my best advice is that your number one criterion for a partner should be someone who truly loves you. And if you are picking a venture capitalist to be your financial partner, find the firm that most loves your business.

So, without further ado, here are the ten non-obvious rules to raising a Series B -- both for founders and VCs...

1. Get Ready to Work REALLY Hard … Because It Will Be Hard

SafeGraph raised $45 million at a $370 million valuation. We started pitching firms on Jan 5 and we selected a term sheet on Feb 11.

We had a great outcome but we need to remember that this is probably the best time in history to raise money. Anyone who tells you that raising money is easy is either completely high or lying. It was a crazy amount of work and super intense. Prior to the fundraise, we spent HUNDREDS of hours getting ready. It took two months of intense work to get ready to start fundraising. And during the fundraise (an intense six weeks) I was often doing 7 meetings a day and it was completely consuming.

2. Don’t Let the Fundraise Slow Down the Business -- That’s Your Priority as CEO

Unfortunately, I’ve seen a lot of CEOs use fundraising as another opportunity to delegate work to other employees. If there is ever a time NOT to do that, it is during a fundraise.
One of your main priorities during a fundraise should be to make sure your business doesn’t slow down during this time, and for that to be true you’re going to need to do a lot of the work yourself.

You should pick a small number of people internally (three max!) to help you jump headfirst into the fundraising effort. Ideally these folks are not from teams responsible for important growth initiatives, like product, sales, or marketing. In my case, I was able to rely heavily on our VP of Operations, VP of Finance, and my executive assistant. Although this created significantly more work for the three of us, it allowed the rest of the organization to stay 100% focused on growing the business.

As you’ll soon find out, the fundraising process can be long and drawn out. It shouldn’t pull a bunch of key employees away from their main priorities for that whole time. One advantage here is that if you can accelerate the business during this time, you can use it to get better terms and have more leverage to push for a fast close (more on that later).

I did send out a weekly detailed update (with a full funnel analysis) to the SafeGraph Executive Team and to our board of directors. That kept them in the loop and also helped them keep me accountable for my goals.

Maybe surprisingly, a way to measure how well a fundraise went is by how few people internally were involved (or even knew about it!).

3. Your Pitch Gets Better Over Time -- So Pitch Some of the Best Firms Last

None of the firms we talked with in the first 3 weeks of pitching gave us term sheets … and most passed after just one or two meetings.

In contrast, we got four term sheets from firms we pitched in the following three weeks and over a dozen of those firms went into deep due diligence (and many ultimately dropped out because they could not move fast enough due to a compressed timeline).

Our pitch in week 4 was markedly better than in week 1.
Luckily, many of the VCs we most wanted to talk to were later in the process (including Sapphire, our eventual partner, which was one of the top three firms we were hoping for) … but I flubbed meetings with many storied VCs early until I had my pitch down.

Your first full pitch shouldn’t even be with a VC. I did over 8 practice runs before pitching VCs but, in retrospect, I should have done more.

Special thank you to Tod Sacerdoti, Alex Rosen, Brendan Baker, Travis May, Jeff Lu, Ravi Patel, Nicole Berger, Ross Epstein, and many others for their incredible help.

4. ALWAYS Do What You Say You Are Going to Do

If you tell a VC you will give them access to the data room by the end of the day, then give them access to the data room by the end of the day. If the data room is not ready, tell the VC it isn’t ready and let them know when they will get it.

We were mostly good about setting expectations with the VCs. But in a few cases we promised a document (like a deeper dive) and forgot to send it, and that eroded trust.

And trust goes both ways. For VCs: if you tell a CEO you will get back to her by Monday night, then you should get back to her by Monday night. Don’t wait until Tuesday to get back to her. You lose credibility every time you don’t do what you say you will do.

During our process, a few VCs promised “we will review the materials and get you our thoughts on next steps by Thursday” and then Thursday came and went without hearing from them. I would hear from them a few days later (“sorry about that, something came up. We are super interested -- can we set up a meeting with my partners?”). Great VCs manage their time better … and when they need more time, they just send an email before the deadline letting the founders know a few more days are needed.

Great companies have no desire to have someone on their board who cannot keep their promises (or manage themselves).
Board members are expected to do a lot of work and companies want a VC to prove during the process that they are going to be a good board member.

Luckily, most VCs were very good at keeping their promises and being on top of things. I found a VC’s responsiveness was extremely correlated with the reputation of the firm. VCs with very high reputations (like Sapphire) were extremely responsive (even with a quick “no”). VCs that are considered third-tier were much less likely to be on top of things. I came into the process thinking it would be the opposite -- that third-tier VCs would work harder because they need to build their reputation … but it turns out that high-reputation VCs have high reputations for a good reason.

5. Never Ever Be Oversubscribed

Too many of my entrepreneur friends have oversubscribed rounds. They have eight different options at the same price and they are taking the time to pick between those options. This makes no sense.

They even brag that they are oversubscribed.

Bragging that you are oversubscribed is like bragging about the “pop” on your IPO -- it means you have under-optimized the round and left money on the table (and hurt your existing shareholders).

If you are truly oversubscribed, that means you had way more demand than supply and you could adjust your price. One company I am an angel investor in recently raised $40M in new money and said they were oversubscribed by $300M. That certainly means they could have gotten a much better deal.

The demand for your company should be reflected in the price. If you find yourself having to turn away lots of people or cut back on allocations, then you should have the power to change the terms and make them more favorable to current shareholders.

Lots of VCs (famously, Bill Gurley at Benchmark) complain that companies under-optimize their IPOs and that complaint rings true. But companies seem to massively under-optimize every financing round … not just the public financing.

6. Minimize the Time Between Term Sheet and Close

Usually when a company signs a term sheet, it also agrees to 30 days of exclusivity. It is in both sides’ best interest to close much faster than 30 days, but most VCs do not realize this. Most VCs, even ones that run a great deal process, completely delegate the close process to their lawyers.

That is a mistake.

The time between term sheet and close is precious.

In addition, the closer the time gets to 30 days, the LESS leverage the VC has. VCs should try to close in 10 days and push the company to move faster. Instead, in almost every deal, the company is pushing the VC to move faster.

As you get closer and closer to the 30-day mark, the exclusivity is closer and closer to expiring. Once the exclusivity expires, the company can pick up its head and re-engage all those other investors that put in term sheets … and can engage them with docs that are likely 99.9% done. So the company has all the leverage.

In addition, nowadays companies are growing so fast that they usually grow significantly in 30 days -- so they are ACTUALLY worth more if you apply the same multiple logic.

I was once involved in a merger transaction that was worth hundreds of millions of dollars. The acquiring company decided to renegotiate the deal at the very end of the process to save $5 million (which was less than 2% of the deal at the time). The problem was, the acquiring company decided to negotiate in the 11th hour, exactly when their leverage was disappearing. They seemed to think they had MORE leverage at the end of the exclusivity period for some reason. Because of this, the 30-day window lapsed and the target company had the ability to shop for more offers. In the end, the deal closed a week later at a 30% HIGHER price … the acquiring company tried to save 2% but ended up losing 30%.
The lesson is to always understand when you have leverage and when you don’t.

The closer the deal comes to the 30-day mark, the more risk the VC takes that a company might shop the VC deal. I know that does not happen much, but in today’s hot market, it is not a risk a VC should take. Smart VCs should try to get the entire deal closed in 10 days (and it is very possible to do that), not 30.

7. VCs Should Not Optimize for Small Terms

There are just three main drivers that affect VC returns: (1) the company they invest in, (2) the price, and (3) the amount they invest. I’d wager that those three factors alone account for over 98% of the returns of a fund.

So everything else that is worked on between term sheet and close (the docs can number many hundreds of pages and take hundreds of collective hours to draft and negotiate) makes up less than 2% of all returns a VC might get (see great Twitter thread by Villi Iltchev and another by Jamie Goldstein).

Everything between term sheet and close is about protecting a VC’s downside … but the power-law returns of most VC firms show that downside protection is a very small driver of overall return.

And while 2% is not insignificant, the time this takes keeps the VC from spending more time helping their portfolio companies and from looking for new deals. In addition, the closing process takes a LOT of time from the CEO and other key employees of the company -- time which could be better spent making the company better.

There is usually an extra $50k in legal fees (sometimes a LOT more) that is needlessly spent just to negotiate downside protections on extremely low-probability events.

VCs would make almost an identical return just investing in common shares (which is what public markets investors do). The big reason not to invest in common is that it hurts a company’s 409A valuation and thus means option pricing would be higher.

A smart VC would advertise super vanilla Series B docs.
They would not have any of the stuff that takes up all the time during close. This would:

- Allow the smart VC to win more deals
- Enable deals to close much faster (most deals today take at least 30 days from term sheet to close)
- Allow the smart VC to spend more time helping their portfolio companies and searching for new deals
- Immediately put the smart VC and the founders on the same team (and reduce the adversarial time between term sheet and close)

Of course, this only makes sense for VCs that make power-law returns. It likely does not make sense for traditional private equity firms.

8. VCs Should Not Use GLG for Customer Calls

A LOT of the VCs we engaged with spun up customer calls through GLG. In theory, that’s fine. In practice, it makes less sense.

For GLG, it is REALLY hard to find a user of a product at a large company. Let’s say your customers are Pepsi, Starbucks, American Express, Walmart, Amazon, and a few random start-ups. You can be sure that everyone GLG introduces you to will be at those few random start-ups and none of them will be at the big companies. At a start-up of 30 people, it is really easy to find the person that uses a company’s product. At a huge company with 10,000 employees, it is virtually impossible to find the right person.

So most VCs spend their time doing customer references with the company’s least valuable customers.

Moreover, MANY of the people the VCs talk with are actually just posing as customers (they are people tangentially related to the company) because these “experts” just want the quick money from the call. So then the VCs just get random data.

In SafeGraph’s case, one well-known VC shared the four calls they did with SafeGraph “customers” through GLG. Two happened to be our two smallest customers. And two were random guys who were tangentially related to our market but definitely not customers. This was a waste of time for the VC and gave them a false impression of SafeGraph.

9. Bad News by Email, Good News Also by Email

I MUCH prefer bad news via email. If a VC passes on a company, they should send a quick email: “we really like the company but we are passing. Want to set up a call tomorrow so we can walk through our reasons?”

Instead, many VCs send a cryptic email to the CEO -- “you free for a quick call tomorrow?” The founder then moves around three other meetings to make time, only to get a pass.

Not only should VCs give bad news by email, they should also give good news by email. For example, email “we really want to discuss a term sheet -- you free to talk tomorrow?” A few times I got cryptic emails from VCs and I assumed it was bad news and so I put off scheduling with them … but it was actually good news. My advice to VCs: get good news into the hands of founders ASAP.

One of the things Sapphire (and some of the other top VC firms) did well is they would email me with an update every 1-2 days (“here is where we are at, here is what we did, here is what we are planning on doing tomorrow”) … those updates put the founder at ease and make running the process much easier.

10. Send Break-up Emails

When a VC doesn’t meet your expectations, send them a break-up email.

During the SafeGraph Series B process, I sent over a dozen break-up emails to VCs letting them know that we were not going to take the next step with them. The break-up may have been because (1) they were not the right stage for us; (2) they were not going to add enough value; (3) we thought there was a conflict with one of their other investments; or (4) the VCs missed a deadline and were not moving fast enough.

Many times the break-up email prompted a conversation from the VC (“really sorry we were not moving fast enough. If I can promise to get everything done in two days, can we get back into the deal?”).

Remember, you are choosing the VC as much as they are choosing you. You might be “married” to this partner for the next ten years -- so you need to choose wisely.
No need to waste your time with VCs that are not going to be good fits. This is especially true in today’s more frothy market where you don’t need to collect term sheets just to optimize valuation.

One of the things I am proud of in the process we ran at SafeGraph is that we would have been happy with ALL of the term sheets we got. All of the terms were great and all from fantastic partners. We ended up picking Sapphire but it was a really hard choice (all the firms that gave us term sheets were super high-quality and the individual partners all would have been value-adds).

Assuming you have a quality company, you don’t need to settle for a firm or partner that you don’t think will meet your needs.

If you are a founder in a Series B process, feel free to reach out to me if I can be helpful in any way -- https://twitter.com/auren

#### 3 Methods of Calculating Catchment Areas & Where to Get the Data

Trade area analysis is a complicated process that requires significant effort and consideration to get right. To ensure that you are relying on accurate data visualizations, you need to make sure that you are calculating your catchment areas correctly for your needs.

Catchment areas can be calculated from simple buffer zones, from walking or driving time to the location, and even from mobility pattern data, painting a vivid picture of where your customers visit your business from.

To help you calculate catchment areas, we cover the following:

- 3 methods of calculating catchment areas for best results
- Where to get catchment area data for your analysis

Let’s dive into the three methods for calculating catchment areas and help you decide which to use depending on the analysis you are trying to perform.
You can also check out our ultimate guide to trade area analysis to learn more about trade areas and how to analyze them.

3 methods of calculating catchment areas for best results

There isn’t a single way of calculating a catchment area, as there are a variety of values you can use to measure trade areas, such as distance, travel time, and mobility patterns. Calculating the trade area is not about following a set formula, but about deciding which method you will use to define your catchment area.

Below, we cover the top three methods for calculating catchment areas:

Method 1. Buffer trade areas

Buffer trade areas create a buffer area around the locations that are of interest to you. These are most commonly based on a distance from the POI, but they can be adjusted in a number of ways.

Creating a buffer trade area map showing your existing locations and all competitor locations within those regions can be a great tool. To do this, you’ll need to:

- Create or obtain a file of your existing store locations for analysis
- Create or obtain a file of existing competitor locations using SafeGraph Places
- Create a buffer around the locations (whether yours, your competitors’, or both)
- Join the two files or create a visualization to show your trade areas (keeping an eye out for potential cannibalization) and competitors with trade areas that intersect with yours

This will give you a catchment map of your store locations, their buffer zones, all competitors in those buffer zones, and their trade areas. This makes it easier to visually compare your locations to competitors, whether you look at each location individually, at the locations in a set region, or at all of them as a whole.

Method 2. Walk and drive time trade areas

Walk and drive time trade areas map the area around the POI based on the time it takes to walk or drive there. Essentially, this involves pinpointing the location you want to analyze and creating an isochrone map displaying areas surrounding the POI based on walking or driving time.
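For comparison with the time-based areas just introduced, the simpler distance buffer from Method 1 can be sketched in a few lines of Python. This is a minimal illustration rather than a SafeGraph workflow: the store and competitor records, the 1 km radius, and the function names are all hypothetical, and distance is straight-line (great-circle), not travel distance.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points, in kilometers."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def competitors_in_buffer(store, competitors, radius_km):
    """Names of competitors located inside the store's circular buffer."""
    return [
        c["name"]
        for c in competitors
        if haversine_km(store["lat"], store["lon"], c["lat"], c["lon"]) <= radius_km
    ]


def buffers_overlap(store_a, store_b, radius_km):
    """Two equal-radius buffers intersect when the centers are under 2 * radius apart."""
    d = haversine_km(store_a["lat"], store_a["lon"], store_b["lat"], store_b["lon"])
    return d < 2 * radius_km


# Hypothetical locations, for illustration only.
stores = [
    {"name": "Store A", "lat": 37.7749, "lon": -122.4194},
    {"name": "Store B", "lat": 37.7849, "lon": -122.4094},
]
competitors = [
    {"name": "Rival 1", "lat": 37.7790, "lon": -122.4150},
    {"name": "Rival 2", "lat": 37.8715, "lon": -122.2730},
]
RADIUS_KM = 1.0  # assumed buffer size

for store in stores:
    print(store["name"], competitors_in_buffer(store, competitors, RADIUS_KM))
print("Possible cannibalization:", buffers_overlap(stores[0], stores[1], RADIUS_KM))
```

A walk or drive time area follows the same join logic, but replaces the straight-line distance test with travel times from a routing or GIS tool.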
The map can be set up to display specific intervals of time, depending on what is important.

For a Starbucks, walk and drive times of 3, 5, and 10 minutes may all be useful. However, for larger, less common stores like Ikea where visitors will require a vehicle to transport the items they purchase, drive times of 10, 30, and 60 minutes may be more useful.

- Create or obtain a file of your existing store locations and any competitors you want to include for analysis
- Upload the data into a BI or GIS program that includes drive and walk time tools
- Define your time parameters and run the tool
- Join the two files or create a visualization to show how the trade areas relate to each other and may impact your business

You can use these to map trade areas for multiple locations, helping you see where there are gaps and overlaps in your trade area. As you can see from the example of the catchment areas above, mapping multiple locations shows you where the closest location is to your visitors, helping you identify where to expand or close down locations, how to avoid cannibalization, and what the competitive landscape looks like.

Method 3. Mobility trade areas

Mobility trade areas determine their catchment according to how consumers move. Rather than relying on a simple store radius based on distance, the catchment area is defined based on mobility data showing store visits with origin census block groups (CBGs) and other brands visited that day/week. This not only creates a more dynamic area, but also enables demographic analysis and brand affinity insights.

This allows you to create a variety of data visualizations of trade areas based on different variables related to customer mobility, such as demographics, other stores visited, dwell times, and more.
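One common way to turn origin CBG visit counts into a trade area is to keep the smallest set of CBGs that together account for a chosen share of visits. The sketch below assumes that approach; the 70% coverage threshold, the CBG identifiers, the visit counts, and the function name are all hypothetical and are not taken from any SafeGraph product.

```python
def mobility_trade_area(visits_by_cbg, coverage=0.7):
    """Return the fewest CBGs, taken in descending visit order, whose visits
    add up to at least `coverage` of all visits to the store."""
    if not 0 < coverage <= 1:
        raise ValueError("coverage must be in (0, 1]")
    total = sum(visits_by_cbg.values())
    if total == 0:
        return []
    selected, running = [], 0
    # Sort by visits (largest first); break ties by CBG id for stable output.
    for cbg, count in sorted(visits_by_cbg.items(), key=lambda kv: (-kv[1], kv[0])):
        selected.append(cbg)
        running += count
        if running >= coverage * total:
            break
    return selected


# Hypothetical visit counts by visitor home CBG for one store.
visits = {
    "060750101001": 120,
    "060750102002": 60,
    "060750103003": 15,
    "060750104001": 5,
}

print(mobility_trade_area(visits))                # default 70% coverage
print(mobility_trade_area(visits, coverage=1.0))  # every CBG with any visits
```

The resulting CBG list can then be joined to demographic tables keyed by CBG, which is what makes the demographic analysis described above possible.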
Creating multiple (or interactive) catchment maps based on different mobility factors allows you to get a clear picture of your catchment area.

- Create or obtain a file of your existing store locations and any competitors you want to include for analysis
- Layer in mobility data
- Join mobility data to your store location data to see how and where people move throughout the day
- Add in any other enriching data, like demographics or consumer transaction data

This will give you a catchment area based on actual human movement instead of inferred geographic relationships. There are multiple catchment maps that can be created this way, factoring in demographics such as age, income, and more.

Where to get catchment area data for your analysis

It’s important to get store location data for yourself and any competitors or complementary businesses you want to analyze, as well as mobility data for those points and any enrichment information you can, such as demographics data.

SafeGraph’s point of interest (POI) data provides business listings with geographic coordinates and brand details that can form a solid base for catchment area analysis. SafeGraph also provides a free, cleaned census demographic dataset to further enrich trade area analysis.

#### 4 Benefits of Trade Area Analysis in Scouting a New Store Location

Key Takeaways

- Trade area analysis helps businesses identify locations that align with customer demand and purchasing power.
- Understanding competitor presence reduces the risk of entering oversaturated markets.
- Demographic and behavioral insights enable more targeted marketing and merchandising decisions.
- Trade area analysis supports smarter inventory and supply chain planning based on local demand patterns.

To be successful in business, you need to know more than just what your customers want. You also need to know where to reach them.
This has led companies to analyze geographic areas for market potential by considering things like available transportation, competitor locations, population demographics and lifestyles, and even when and where people in an area move based on location data from mobile devices.

All of this falls under the umbrella of trade area analysis. But what is that, and how exactly does it help your business get a leg up? We’ll answer those questions here by covering:

- What is trade area analysis?
- Top 4 benefits of trade area analysis

We’ll start with defining what trade area analysis is, and then get into the benefits of trade area analysis.

What Is Trade Area Analysis?

Trade area analysis is studying and understanding trade activity within a given geographical area. This includes things like what types of businesses are there (and how many), along with how many potential customers are in the area, where they are coming from (or going), and what they are buying.

For more information, see our Ultimate Guide to Trade Area Analysis.

Top 4 Benefits of Trade Area Analysis: Choosing the Right Location

So why perform trade area analysis? Well, as we alluded to, there are many different factors that determine whether a business succeeds or fails in a given area. These include considerations like how much demand there is for their products or services, how easy they are to access, and how many competitors are nearby (and how well-established they are).

In short, trade area analysis provides vital information for making smart decisions on where businesses need to go and how they need to adapt. It doesn’t matter whether you’re part of an individual company’s management group or on a city planning committee. You need trade area analysis in order to give you these four key edges:

1. Informed Site Selection

Choosing a location for a store is a critical decision for your business.
It can determine not only how many customers you get, but also how much purchasing power your customers will have based on where they come from. It also affects who you’ll be in competition with. Without trade area analysis, you might as well be throwing a dart at a map while blindfolded. Let us explain.

Trade area analysis is one of the best ways to determine whether a particular site would be a good fit for a certain business. For example, you might want to set up a shop in an area that has high demand for your industry’s goods or services but is currently underserved. Of course, you’re also going to want to make sure that the purchasing power of your potential customers outstrips your operating costs.

Another important factor is how accessible your business will be for both local residents and out-of-towners. That includes access via public transportation as well as private vehicles, so be sure to have ample parking space. Consider how vital this is for businesses essential to people’s health and wellness, such as grocery stores.

2. Competitive Intelligence

While you may see business opportunities in an area, remember that other companies in the same industry as you, i.e. your competitors, may also be seeing those opportunities. In fact, they may have already beaten you to the punch.

Trade area analysis can give you an overview of where your competitors are and who they’re likely serving. That way, you can avoid spots where competitors are already meeting people’s needs and focus on sites where the population is underserved. You can also weigh the opportunities and risks of setting up shop somewhere where you may be competing with another company over the same customers.

3. Personalized Marketing and Advertising

When selecting a trade area to do business in, it’s not enough to know just who might be buying from you. You also have to keep in mind what specifically they’ll likely want to buy.
After all, a prime location doesn’t do you any good if you aren’t selling what the nearby demographics are looking for. Even worse is if you have what they want, but they have no idea that you carry it.

This is where trade area analysis comes in. By looking closely at people’s purchasing behaviors, you can get an idea of the kinds of lifestyles the people in an area have. Based on that information, you can adjust things like your store layout, inventory, and marketing accordingly.

For instance, say that your research reveals a significant portion of the population near your store location is into home improvement. You can make the section for that type of merchandise in your store bigger, or put it closer to the entrance so it’s easy to access. Or, when you send out promotional materials, you can feature construction tools in a larger section near the front or top so that they’re the first things your customers see.

4. Supply Chain and Inventory Planning

With trade area analysis, you can delve even deeper into ways to manage your business. For instance, you may notice the days or even times of day when customers in the area are most likely to be out shopping. You can also factor in the types of products that they usually shop for on those days and/or at those times. This can help you narrow down who your prime customers will likely be, rather than assuming it will be everyone within a certain distance of your store.

Using this information can also help you plan your supply and inventory systems, even down to each individual product you carry. You can plan to have shipments come in before days and times where your stores are busy and shoppers will likely be buying specific products. You can also estimate how much of each product you should order so that you’ll have most things in stock, even when they’re in high demand.

This even applies to special dates and other holidays.
During these times, demand for certain types of products may be higher than normal, or high when it’s otherwise nonexistent the rest of the year. With trade area analysis, you’ll know what to stock, how much of it to stock, and when to stock it by.In summary, trade area analysis is important for four main reasons. First, it helps you locate your stores at sites that meet the needs and demands of nearby customers. Second, it steers you away from locations where competition from similar stores will too heavily impact your business. Third, it informs your marketing strategy, inventory, and store layout based on what your likely customers are interested in buying. And finally, it helps you plan your supply chain around what products will likely be in demand, and when. FAQs 1. What is trade area analysis? Trade area analysis is the study of customer activity, demographics, and competition within a defined geographic area to assess market potential. 2. Who should use trade area analysis? Retailers, restaurant chains, real estate teams, marketers, and urban planners commonly use trade area analysis when making location-based decisions. 3. What data is used in trade area analysis? It often combines demographic data, points of interest, mobility data, consumer behavior, and competitive location data. 4. Is trade area analysis only useful for new store locations? No. It is also useful for optimizing existing locations, planning marketing strategies, and managing inventory and supply chains. Trade area analysis is the study of customer activity, demographics, and competition within a defined geographic area to assess market potential.Retailers, restaurant chains, real estate teams, marketers, and urban planners commonly use trade area analysis when making location-based decisions.It often combines demographic data, points of interest, mobility data, consumer behavior, and competitive location data.No. 
It is also useful for optimizing existing locations, planning marketing strategies, and managing inventory and supply chains. #### 5 Technical Differences Between US and UK POIs Key Takeaways US vs UK POI differences have direct implications for location intelligence and geospatial analysis. Postal code granularity in the UK affects how models interpret language, density, and spatial boundaries. UK POI data is denser and more granular, which impacts clustering, catchment analysis, and global modelling. Understanding geospatial data differences helps teams adapt workflows across regions. Before diving into the specifics of the UK Places data launch, it’s important to understand why US vs UK POI differences matter in practice. Differences in language, postal systems, density, and spatial geometry directly affect how points of interest data are interpreted in location intelligence, global modelling, and cross-market analysis. Treating POI data as structurally uniform across geographies can lead to misclassification, skewed density metrics, and integration challenges when scaling analytics beyond a single country. This month we launched our long-awaited, highly anticipated UK Places data. While we’ve been curating points of interest (POI) data in the US for the past few years, our customers have been asking for the same in the UK. Our UK data offering provides our Places and Geometry datasets for England, Scotland, and Wales, covering over 1.3M places and more than 500 brands. We have big aspirations at SafeGraph. We aim to one day provide places data for the entire world. Our expansion into Great Britain taught us a lot about how places, and as a result places data, differ by geography. We encountered five main technical differences between US and UK POIs.1. 
American English vs British English

American jargon does not always translate to British jargon very well, and our machine learning models required some lessons in the Queen’s English to make use of the metadata attached to UK data sources. For example, humans may know that pubs and inns in Great Britain are the same as bars in the US, or high streets are the same as shopping strips, but computers need to learn these things. This difference becomes especially important when comparing POI data between the US and UK, as category-level inconsistencies can propagate errors through classification models, filters, and downstream analytics.

Best practice: Normalize category labels and naming conventions across regions before training models or running comparative analyses.

2. Postal Code Formats

Postal codes across the pond are much more granular than their counterparts in the US, so much so that some high-rises comprise several postal codes. Our US POIs represent 37,000 distinct postal codes, while our UK POIs represent a whopping 609,000. These differences in points of interest data have a direct impact on spatial joins, aggregation logic, and geographic rollups, particularly for users accustomed to ZIP code–level analysis in the US.

Best practice: Treat UK postal codes as high-resolution spatial identifiers and adjust aggregation logic accordingly rather than mapping them one-to-one with US ZIP codes.

3. POI Density

Great Britain is crowded. POIs are more densely co-located in the UK than in the US. Great Britain has 14.9 POIs per square mile, while the US only has 1.7 POIs per square mile. Of course, the US has a much larger area than Great Britain, but the resulting increase in POI density was a new challenge for us as we built out the UK Places data. This geospatial data difference affects proximity analysis, clustering, and trade area modelling, especially when applying US-calibrated assumptions to UK environments.
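As a rough illustration of what the density gap means for distance thresholds, the sketch below rescales a US-calibrated search radius so that it covers about the same expected number of POIs in Great Britain. It is only a starting point under a strong assumption that is ours, not SafeGraph's: POIs are treated as uniformly spread, so the expected count in a circle is density × π × radius². The function name `rescale_radius` and the 1-mile example radius are hypothetical.

```python
import math

# POI densities quoted in the text (POIs per square mile).
US_POI_DENSITY = 1.7
GB_POI_DENSITY = 14.9


def rescale_radius(radius, source_density, target_density):
    """Scale a search radius so it covers roughly the same number of POIs.

    Assumption (not from the article): POIs are spread roughly uniformly, so
    the expected count inside a circle is density * pi * radius**2. Holding
    that count fixed gives radius_target = radius_source * sqrt(source / target).
    """
    if radius <= 0 or source_density <= 0 or target_density <= 0:
        raise ValueError("radius and densities must be positive")
    return radius * math.sqrt(source_density / target_density)


us_radius_miles = 1.0  # hypothetical US-calibrated clustering radius
gb_radius_miles = rescale_radius(us_radius_miles, US_POI_DENSITY, GB_POI_DENSITY)
print(f"{gb_radius_miles:.2f}")  # 0.34
```

Under this assumption a 1-mile US radius shrinks to roughly a third of a mile in Great Britain. Real POIs cluster heavily in towns and cities, so a value like this is best treated as a first guess to validate against local data.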
Best practice: Recalibrate distance thresholds and clustering parameters to account for the higher POI density in the UK compared with the US.

4. Polygon Size

With these crowded, co-located POIs come smaller spaces. The average polygon size for branded, “OWNED_POLYGON” POIs in Great Britain is 934 sq. meters. It’s roughly double that in the US, at 1,917 sq. meters. Smaller polygon sizes can influence visit attribution, overlap calculations, and spatial weighting, particularly when comparing activity across markets.

Best practice: Normalize polygon-based metrics when comparing performance or behaviour across US and UK datasets.

5. Building Footprint Delineation

Great Britain’s architecture is much older than America’s (some buildings date back to 3000 BCE), making distinct buildings much harder to delineate. We still strive for world-class polygons that define even the most obscure demarcations between adjacent buildings, but it’s a much taller order, and there will inevitably be more “SHARED_POLYGONS” in the UK data as a result. Users can expect our “% OWNED” polygon metric to increase over time as we continue to learn more about this. These structural constraints introduce additional complexity for footprint-level attribution and ownership-based analysis.

Best practice: Use ownership metrics and supporting attributes alongside geometry when analysing shared-building environments.

Building places data in the UK has been a fascinating lesson in geography, history, and culture, and we are excited to see what other regional differences we encounter as we continue to expand.

Conclusion

US vs UK POI differences extend beyond simple formatting or terminology changes. They shape how geospatial data differences influence real analytical use cases, from location intelligence and market analysis to global modelling and data integration.
Accounting for these points of interest data differences allows teams to build workflows that scale across regions while maintaining analytical accuracy and consistency.

FAQs

1. Why are US vs UK POI differences important for analytics? They affect how data is categorized, aggregated, and interpreted across regions, which can impact modelling accuracy.
2. How does POI density UK vs US affect analysis? Higher density in the UK requires different clustering and proximity assumptions to avoid distorted insights.
3. Are UK postal codes comparable to US ZIP codes? No. UK postal codes are far more granular and should be handled differently in spatial workflows.
4. Do polygon sizes affect location intelligence outcomes? Yes. Smaller polygons can influence visit attribution, overlap analysis, and spatial metrics.
5. What causes more shared polygons in the UK? Older architecture and tightly packed buildings make footprint delineation more complex.
6. How should teams adapt global POI workflows? By normalizing language, recalibrating spatial thresholds, and accounting for regional geometry differences.
7. Is POI data comparison between US and UK suitable for benchmarking? Yes, but only when workflows are adjusted to reflect structural differences in the data.

#### 7 Big Data Use Cases in Financial Services and Benefits of Data Science

Data science has improved financial services by speeding up processes that would have usually taken a long time. For example, SafeGraph helped one of their financial services clients by providing them with data to assess whether or not customers would walk into a bank during the COVID-19 pandemic. This helped the client make an accurate assessment of how the pandemic would affect that particular bank, and aided the bank in making the right business decisions moving forward.

In this article, we’ll explore other ways big data can be used in financial services. We’ll also suggest some ways you might be able to make use of it, depending on your niche or your needs. We’ll cover:

- 4 benefits of using data science in the financial industry
- 7 top big data use cases in financial services

It’s important to look at the 7 top big data use cases in financial services to understand the specific impacts of these technological changes on the banking world. And aside from these big data use cases, the financial sector has reaped other rewards from advancements in data science that may not be immediately obvious.

4 benefits of using data science in the financial industry

It’s important to note the most significant benefits that data science has brought to the financial industry as a whole. These small changes have made huge differences to people’s livelihoods, including the ways they conduct work.

1. Forecasting financial trends: Predicting financial trends before they occur can have a massive impact on how management handles future actions. Being proactive can curb the damage that a negative financial trend can do, as opposed to being caught off guard and simply reacting to it. This can also give you a leg up on the competition.
Forecasting supply, demand, and other key financial indicators equips firms with the necessary information to make decisions about their products, services, and investments. It also helps them advise their clients on how to make smart financial decisions based on predictive models.

2. Analyzing risk: Machine learning algorithms can predict whether or not a person or company will be a risky investment (using their credit score and financial transactions). This will dictate whether or not this person or company can be trusted with a loan, or if they should be denied because of their bad credit history.

3. Automating tasks: Automating tasks increases productivity and makes the work of financial services analysts, managers, and associates easier. For example, online applications and algorithms make it a lot faster to determine whether or not a particular customer is a financial liability. This makes it easier for bank workers to figure out whether or not to provide a service to that particular customer. Additionally, customers have benefited because they do not physically need to walk into a bank to apply for products and services. They can also autofill most of their applications online at home if they have their browser set up to save commonly-entered information like address, cell phone number, name, etc. Automation increases customer satisfaction by making it easier for consumers to engage with brands, while also increasing productivity of financial services workers themselves.

4. Fostering inclusivity: When algorithms are used, financial institutions can apply the same criteria to every individual regardless of race, sexual orientation, or gender. This is because the decision process is based on the financial activity of the customer. This creates more transparency for customers in terms of their ability to qualify for products.
It also reduces the risk of discrimination, which can unfortunately happen with more subjective application processes.

7 top big data use cases in financial services

The financial services industry is constantly evolving, so data science use cases for financial firms are, too. For example, risk assessment and management is something that is incredibly important in the financial industry. Banks must be very careful about whom they lend to or invest in. As such, the ability to assess and manage risk faster and more efficiently with data makes the lives of bankers much easier.

1. Consumer analytics and insights for insurance companies

Insurance risk, whether related to a property or a person, is largely dependent on how people interact with a particular space. Data science models can shed light on how consumers move throughout a community, including which businesses they go to and when they go. This can inform general liability risk, as a location that gets more visitors has a higher risk of someone getting hurt there. SafeGraph explains general liability using geospatial data, exhibiting that insurance risk can be more accurately calculated using alternative data.

Insurance risk is vital information for insurance companies to be able to determine whom they will and will not accept as clients.

2. Real-time analytics and marketing

Data science has made it possible to analyze data in real time, as opposed to waiting for data to be processed and made available. This means financial services firms can respond to trends quickly and make decisions that push them ahead of the competition. Geofencing, geotargeting, and beaconing are all applications of real-time analytics.

Real-time analytics are especially useful for understanding and responding to consumer behavior. In an ultra-competitive market, financial firms and banks need to know what their consumers want and need, and when they want and need it. Real-time analytics makes it possible to uncover this crucial information.
This lets businesses develop targeted marketing campaigns to meet consumer demand and win market share.

3. Risk assessment and management

Businesses are beginning to move their decision-making processes away from individuals and towards machine learning algorithms. These reduce human error while also increasing efficiency. Leveraging models that predict risk associated with particular investments or ventures enables financial services firms to make decisions quickly. At the same time, they don’t need to sacrifice the quality of their due diligence and investment research.

At a more consumer-facing level, financial planners assess whether or not a person is in a position to take out a mortgage based on their lending and credit history. Before machine learning, this was a very manual and individual research process. With sophisticated pre-built data models, approvals are much quicker and more reliable.

4. Fraud detection and prevention

Financial institutions are using machine learning tools to identify unusual consumer spending patterns or behaviors in real time. This helps banks act quickly and effectively to reduce losses for both businesses and consumers. For instance, in response to the rise of cybercrime, many banks have algorithms in place to prevent further spending if they detect a credit card displaying unusual activity.

5. Customer segmentation and targeted marketing

In order to effectively understand and reach customers, it is important to segment them into categories based on their likes, dislikes, needs, socio-economic status, etc. Financial services firms can then develop products and services designed especially for each segment. For a parallel in a retail environment, a business might split their clientele into higher and lower gross income segments. These would depend on the customers’ demographics and how much disposable income they are expected to have. The more disposable income they have, the more they are expected to spend in the store.

6.
Predictive analytics and future planning

Data science allows for the instant analysis of many different data sets from the past and present. This makes it easier to predict the direction(s) in which the market will go, and which investments will be more or less feasible based on those trends. This simplifies decision-making for financial institutions.

For example, an investment company is likely to use statistics to decide which stocks to invest in over a long period of time. They can then use their expected investment profit to offer products to their clients and set their rates.

7. Financial market and investment analysis

Financial market analysis results are used by financial firms to choose whether or not to invest in a stock, company, or commodity. Data science can automate and expedite this process, while also producing more reliable scientific results. For example, algorithmic trading can be used to choose which stocks to invest in. This is where advanced mathematical formulas guide bankers in choosing the best stocks to invest in, as well as the best long-term strategy for managing these investments.

The financial services industry has made significant strides in providing innovative solutions for predictive analytics, risk modeling, and customer engagement – all thanks to data science.

#### A Guide for Real Estate Site Selection using Trade Area Analysis

Location, location, location. You’ve likely heard this refrain many times when it comes to whether a property – be it for shopping, living, or working – succeeds or doesn’t. But picking a plot of land to develop or move into isn’t as simple as it may first seem.
There are a number of variables to take into account, both geographic and demographic.

To elaborate, we’ll walk you through a site selection model driven by trade area analysis via the following sections:

- What is real estate site selection & what does it involve?
- How is retail site development related to trade area analysis?
- 4 key stages to completing proper retail site selection
- Commercial real estate site selection checklist: quick reference

We’ll start with a quick overview of what choosing a site entails for a business, and end with a commercial real estate site selection checklist that will give you some quick pointers to inform your strategy.

What is real estate site selection & what does it involve?

Real estate site selection is the process of picking a plot of land or existing structure for developing a new facility. It involves identifying the best real estate opportunities for new locations, ensuring that you choose a good location for your target audience and main demographics.

Any business or organization looking to expand undergoes some sort of site selection process, but it is particularly popular in retail and real estate. Commercial real estate is also a major site selection industry; oftentimes, developers have retailers as clients, or are developing commercial properties for investments (such as malls and shopping centers).

A related process is site deselection. This involves choosing store locations to move out of and/or shut down operations at. This may be necessary if changing geographic and demographic conditions around a store make it no longer attractive or profitable, or if scaling back your business or real estate investment as a whole is the only way to balance out an economic shortfall.

How is retail site development related to trade area analysis?

Trade area analysis provides a lot of context for your real estate site selection criteria.
This helps you make an informed decision on whether a particular site will be economically viable for your business. Here are some things it can tell you:

- Demographics – If you’re going to sell products or services in an area, you need to know if the people nearby actually want them and can afford them. Gathering data on their lifestyles and incomes (seeing where else they shop is one way to do this) will let you know.
- Accessibility – Your store’s more likely to get customers if they can easily get there by walking, biking, driving, or taking public transit along nearby transportation routes. This is especially important for convenience stores and other types of businesses where the customer wants to be in and out quickly, so they will likely pick the most efficient option.
- Competition – You also need to see if there are similar stores in the area that could take your business. You should stick to areas where there aren’t enough competitors to meet the demand for your type of business, and you can offer some sort of decisive advantage (bigger store, better inventory, lower prices) that your competitors don’t.

Also, remember that your own locations can be thought of as competitors in this sense. So make sure your stores are not located in such a way as to compete with each other over the same customers.

For even more upsides, see our article on the benefits of trade area analysis.

4 key stages to completing proper retail site selection

Now you have an idea of the work you need to do – and the data you need to do it – when selecting a site for your business. So it’s time to develop a plan. Every business is different, so the details may vary, but here are four general steps you’ll need to take when choosing a site for a retail business or real estate opportunity.

Stage 1: Decision to expand/close down

Retail real estate site selection usually starts with one of two goals. One is to build stores in new areas to tap into additional market share.
The other is to close down underperforming stores that may not be as profitable as they once were. Sometimes, you may even want to just move a store to a vacant location and accomplish both of these goals at once. In real estate, this means identifying areas with the best investment opportunities, and avoiding areas with the greatest risk.

Whether you are selecting or deselecting sites (or both), you’ll need to consider where the investment and business opportunities in your area are and aren’t. We’ll discuss how to do that in the next step.

Stage 2: Determine methodology

There are a number of factors and methods involved in trade area analysis. Which ones you prioritize will depend on the type of business you’re running. Are you setting up shop in an area where the people living there actually want or need your business? Can the people you’ll likely be selling to afford your products or services, to the point where you can turn a profit? How accessible will your store be via local transportation routes? What other stores are in the area, and how likely will your store be competing with them over the same customers? These are some of the questions you should be considering during the real estate site selection process.

Another helpful tool is to find a successful store (usually your own, but sometimes someone else’s) and then note and analyze its geographic and demographic characteristics. Then you can look for locations with similar parameters and feel reasonably confident that a new store will succeed there.

If the property is an investment, you’ll want to consider the above as well. What markets will the location serve, and which businesses will be best suited for that area? The answers will be extremely important in helping you find tenants for your real estate investments and gauge the potential profitability of each. You can also mitigate risk by selecting locations that could serve a number of different industries and demographics.
For example, office and retail real estate properties have very different designs and locations, which will dictate the opportunities you have for your investment.

Stage 3: Perform analysis

Once you choose the metrics you want to measure, it’s time to get to work. Get the sales numbers. Do customer surveys. Use GIS software to calculate walk and drive times to your possible future store locations, and to your current store locations for comparison. Or buy pre-prepared location data packages from SafeGraph if you want to skip a lot of the heavy lifting.

After you’ve gathered the data, you need to decide on how you’re going to model it and put it into action. There are several different theories and techniques you can use, and many of them can be mapped out using real estate site selection software. But again, your choice will come down to the order in which you prioritize your metrics.

To use a previous example, walk/drive time will likely be more important to a convenience store, fast food restaurant, or other type of business where the customer will likely pick the easiest one to get to. As a real estate investment, walk/drive time, demographic, and mobility models are extremely useful to get an idea of the potential industries and populations your investment location can serve.

Stage 4: Execute on strategy

When you’ve made your final decision on a location, all that should be left is going through the motions. Purchase/lease/rent the land or building, get a building permit, do any necessary construction, and then start moving in your operations. From there, keep an eye on your new location to see if your strategy pays off.

If you’re closing down instead, make the necessary arrangements for paperwork, moving out your inventory, and deconstruction.
With any luck, you’ll see some savings in your company’s next financial report.

Commercial real estate site selection checklist: quick reference

Commercial real estate site selection is a complex process with many different factors that you need to consider. To help make it a bit simpler, we’ve broken it down into seven essential steps:

1. Map out your current locations – Plot out your existing stores or properties on a map. See where there are places available to expand to, and also try to find areas where your stores may be too close together and competing for each other's business.
2. Consider competing and complementary businesses – Identify other businesses located near your stores. Think about how their market niches and performances may be impacting the success of your own stores. Some may be fighting over customers with you, while others may actually be helping to funnel more customers to you.
3. Know who your likely customers are – It’s important for location’s sake to know where people who shop at your stores live or come from in the surrounding area. But you should also pay attention to their lifestyles: where they shop and what other activities they like to do. Ask yourself: are your stores accessible to your likely customers, and are you selling what they actually want?
4. Add more context – Brainstorm some other factors that may affect your stores’ performances. For example, look at how close they are to nearby transportation routes, and how this might grow or shrink your customer base. Other aspects of surrounding geography may help or hinder your business, too. For example, placing a gardening store near a rural or suburban area, where people tend to have larger properties, will likely get you more business vs. locating in an urban area with limited open space.
5. Compare store performance in context – Look for patterns between the information you’ve collected and how your current stores are doing. Determine which factors are helping your businesses and which are hurting them.
6. Aim to replicate what works – Once you’ve identified which surrounding geographic and demographic characteristics are contributing to your stores’ success, look at your possible new locations for similar parameters. Even if some aspects are different, you can use your new location to test how well a store will do in a setting with those aspects.
7. Plan the size and shape of your store – How your stores are set up can be as important as where they’re located. Go back and look at the size and layout of your current stores, and think about these metrics in relation to your clientele. Stores should be big enough to accommodate your inventory and customers, but small enough that you’re not wasting space and possibly disorienting shoppers. You should also arrange your sections to make what visitors typically buy easily accessible.

Now you have a general idea of what’s involved in selecting a site for a real estate investment, an office space, retail store, or other commercial property. You’ve also seen how trade area analysis can play a critical role in the process. And your trade area analysis is only as good as the datasets you use. If you want the most accurate and comprehensive information on points of interest around the globe, check out SafeGraph’s datasets.

#### A Non-Technical Introduction to Machine Learning

Key Takeaways

- Machine learning combines computer science and statistics to create statistical models, which are then used to make predictions about the world or to infer patterns in your data.
- Statistical models are really just mathematical functions (e.g. Y = m × X + b). They are determined by their parameters (e.g. m and b), which are learned via the training process. Models also have hyperparameters (e.g. the K in KNN), which are tuned by trying out many possibilities.
- ML models learn from training data, which captures our knowledge about the past.
- There are, in general, two types of supervised models: those used for regression (like simple linear regression) and those used for classification (like KNN).
- Regardless of whether or not you’re building these models yourself, there are a handful of important ethical questions that require deep thought before you take action on the results of an ML model.

Why should I read this, and what will I learn?

Machine learning is a field that threatens to both augment and undermine exactly what it means to be human, and it’s becoming increasingly important that you (yes, you) actually understand it. I don’t think you should need to have a technical background to know what machine learning is or how it’s done. Too much of the discussion about this field is either too technical or too uninformed, and, through this blog, I hope to level the playing field.

This is for smart, ambitious people who want to know more about machine learning but who don’t care about the esoteric statistical and computational details underlying the field. You don’t need to know any math, statistics, or computer science to read and understand it.

By the end of this post, you’ll:

- Understand the basic logical framework of machine learning (ML).
- Be able to define important relevant terms and concepts that anyone interested in this field should know. These terms are highlighted in boldface.
- Know which high-level decisions go into building statistical models, and understand some of the implications of these decisions.
- Be able to better analyze the question of when we should use the results of ML to make big decisions, such as determining public policy.

This overview is in no way comprehensive. Huge portions of the field are left out, either because they are too rare to merit study by non-technical decision makers, because they’re difficult to explain, or both.

What Is Machine Learning?
The field itself: ML is a field of study which harnesses principles of computer science and statistics to create statistical models. These models are generally used to do two things: Prediction: make predictions about the future based on data about the past Inference: discover patterns in data Difference between ML and AI: There is no universally agreed upon distinction between ML and artificial intelligence (AI). AI usually concentrates on programming computers to make decisions (based on ML models and sets of logical rules), whereas ML focuses more on making predictions about the future. They are highly interconnected fields, and, for most non-technical purposes, they are the same. What’s a statistical model? Models: Teaching a computer to make predictions involves feeding data into machine learning models, which are representations of how the world supposedly works. If I tell a statistical model that the world works a certain way (say, for example, that taller people make more money than shorter people), then this model can then tell me who it thinks will make more money, between Cathy, who is 5’2”, and Jill, who is 5’9”. What does a model actually look like? Surely the concept of a model makes sense in the abstract, but knowing this is just half the battle. You should also know how it’s represented inside of a computer, or what it would look like if you wrote it down on paper. A model is just a mathematical function, which is merely a relationship between a set of inputs and a set of outputs. Here’s an example: f(x) = x² This is a function that takes as input a number and returns that number squared. So, f(1) = 1, f(2) = 4, f(3) = 9. Let’s briefly return to the example of the model that predicts income from height. I may believe, based on what I’ve seen in the corporate world, that a given human’s annual income is, on average, equal to her height (in inches) times 1,000. 
So, if you’re 60 inches tall (5 feet), then I’ll guess that you probably make $60,000 a year. If you’re a foot taller, I think you’ll make $72,000 a year. This model can be represented mathematically as follows: Income = Height × $1,000 In other words, income is a function of height. Here’s the main point: Machine learning refers to a set of techniques for estimating functions (like the one involving income) based on datasets (pairs of heights and their associated incomes). These functions, which are called models, can then be used for predictions of future data. Algorithms: These functions are estimated using algorithms. In this context, an algorithm is a predefined set of steps that takes as input a bunch of data and then transforms it through mathematical operations. You can think of an algorithm like a recipe — first do this, then do that, then do this. Done. Machine learning of all types uses models and algorithms as its building blocks to make predictions and inferences about the world. Now I’ll show you how models actually work by breaking them apart, component by component. This next part is important. A Framework for Understanding Machine Learning: Inputs: Statistical models learn from the past, formatted as structured tables of data (called training data). These datasets — such as those you might find in Excel sheets — tend to be formatted in a very structured, easy-to-understand way: each row in the dataset represents an individual observation, also called a datum or measurement, and each column represents a different feature, also called a predictor, of an observation. For example, you might imagine a dataset about people, in which each row represents a different person, and each column represents a different feature about that person: height, weight, age, income, etc. Most traditional models accept data formatted in the way I’ve just described. We call this structured data. 
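For readers who do want to see it in code, the income example can be written in a few lines of Python. This is only an illustrative sketch: the function names and the four training rows are made up for this example, and the "learning" step is ordinary least squares for a line through the origin, which is one simple way (among many) of estimating the number that multiplies height.

```python
def predict_income(height_inches, dollars_per_inch=1_000):
    """The article's toy model: Income = Height x $1,000."""
    return height_inches * dollars_per_inch


def fit_slope_through_origin(heights, incomes):
    """Estimate m in Income = m * Height by least squares (no intercept).

    Minimising sum((income - m * height)**2) gives
    m = sum(height * income) / sum(height**2).
    """
    numerator = sum(h * y for h, y in zip(heights, incomes))
    denominator = sum(h * h for h in heights)
    return numerator / denominator


# Hypothetical training data, made up for illustration only.
# Each position is one observation: a person's height and income.
heights = [60, 64, 68, 72]
incomes = [61_000, 63_000, 69_000, 71_000]

learned_slope = fit_slope_through_origin(heights, incomes)

print(predict_income(60))  # 60000
print(predict_income(72))  # 72000
print(round(learned_slope, 2))
```

The first function is the model written down by hand; the second is the "recipe" (algorithm) that recovers the model's parameter from data instead of relying on a guess.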
Because one common goal of ML is to make predictions (for example, about someone’s income), training data also includes a column containing the data you want to predict. This feature is called the response variable (or output variable, or dependent variable) and looks just like any other feature in the table. Most common statistical models are constructed using a technique called supervised learning, which uses data that includes a response variable to make predictions or do inference. There is also a branch of ML called unsupervised learning, which doesn’t require a response variable and which is generally used just to find interesting patterns between variables (this pattern-finding process is known as inference). It is just as important as supervised learning, but it is usually much harder to understand and also less common in practice. This document won’t talk much about the latter subfield. The takeaway from this paragraph is simply that there are two “types” of learning, and that supervised learning is more common. Continuing the people dataset example, we might try to predict someone’s income based on her name, age, and height, so our training dataset would look like this: Example of structured data. Each row is a person (a unique observation we’ve made), and each column is a different measurement (called a feature or predictor) of that person. Is this problem suitable for prediction? Now that we have a dataset, we can begin building a statistical model. Why should we do this? Well, we assume that there’s some relationship between our predictors and our response — that income is somehow based on your age and your height, or a combination of the two. It’s reasonable to assume that you make more money as you get older, for instance, and that your height subtly influences your job prospects. 
We’ll leave the task of figuring out the exact details of the relationship (for instance, the precise role that age plays in determining your income) to the machine learning algorithm. Model selection: We have our data, and we’ve decided that there’s probably a relationship between our predictors and our response. We’re ready to make predictions. As an aside, we don’t actually need to know if there’s a relationship between these variables. We could, in fact, just throw all of our data into an algorithm and see if the resulting model is able to make valid predictions. Now we need to pick which model to use. Naturally, there are many different types of models which explain how the data actually works, and we’d like to choose the one that most accurately describes the relationship between the predictors and the response variable. Models generally fall into one of two categories: Regression models, which are used when the response variable (i.e. the variable that you’re predicting) is continuous. For example, height, age, and income are all continuous. That is, they can be placed and ordered on a number line. Classification models, which are used for categorical data — that is, data that doesn’t have a numerical ordering. For example, you may want to predict, based on an image of a flower, the species of that flower. Or you may want to predict whether a student is a psychology major or a math major. The first step in picking a model is deciding whether or not your response variable is quantitative or categorical. Putting on the brakes: Why is model selection an important concept for non-technical people? Well, if a model is chosen poorly, then its predictions will be inaccurate, leading to tangible actions and policies that are uninformed. As a high-level decision maker, it’s important for you to question why certain models are being employed so that you don’t end up making even poorer decisions down the road. 
Below, I’ll walk you through an example of a popular, powerful, simple model from each category that can be used for prediction. A common regression model: simple linear regression Simple linear regression: Chances are, you’ve constructed many simple linear regression models in the past. Again, a model is just a mathematical function, which describes a relationship between an input and an output. A (simple) linear regression model is no different. We can describe the general case as follows: y = m × X + b The equation above relates the y value of a point to its X value. Specifically, if we know the X value, we can get the y value by multiplying X by m (called the slope) and then adding b (called the y intercept). If you think that there exists a linear relationship between your features and your response — between, say, height and income — then you might be inclined to use a linear regression model. In other words, this model is a good choice if you think you can describe the relationship between height and income with a straight line. So, let’s take this from the start. You find yourself with some data… Structured training data for our simple linear regression model. …and you decide that you’d like to build a machine learning model to predict someone’s income based on their height. First, you’d plot the two variables on an XY-plane to visually observe the relationship. Perhaps you’d see something like this: A scatterplot of height vs. annual income. Each blue dot is an individual observation. It looks like drawing a straight line somewhere through the blue dots would accurately characterize the visual relationship we’re seeing. This straight line, of course, can be described by the function I wrote out above: y = m × X + b Now I’d like to build a linear regression model. To do this, I train my model on my dataset: the linear regression algorithm will estimate the values of m and b, giving us an equation for a line that we can then use for prediction. 
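The training step can be sketched in a few lines of Python. This is a minimal illustration using ordinary least squares, which is the standard way to estimate m and b for simple linear regression; the post does not say which implementation was used, so treat that choice as an assumption. The function name `fit_line` and the four data points are made up (they are chosen to lie exactly on a line so the result is easy to check by hand).

```python
def fit_line(xs, ys):
    """Estimate slope m and intercept b for y = m * x + b by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How x and y vary together, and how much x varies on its own.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    b = mean_y - m * mean_x
    return m, b


# Made-up training data: heights in inches, annual incomes in dollars.
heights = [60, 64, 68, 72]
incomes = [90_000, 94_000, 98_000, 102_000]

m, b = fit_line(heights, incomes)
print(f"Income = {m:.0f} x Height + {b:.0f}")

# Once m and b are learned, prediction is just plugging in a new height.
predicted = m * 66 + b
```

Real data will not sit perfectly on a line, so the learned m and b will be the values that make the line as close as possible to the points overall.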
Putting on the brakes: We know that a model learns by looking at the training data and applying an algorithm. But what can go wrong if the training data is unrepresentative, biased, or incorrect? Models don’t know whether or not training data is “good”; that’s up to the human who trains the model. Unrepresentative training data can cause biased predictions which perpetuate social inequality. For example, imagine a situation in which a model used in the criminal justice system is trained on data which is biased against African Americans. This model’s predictions, unsurprisingly, will then mirror that bias. The details of the linear regression algorithm aren’t totally important, but you should know that the algorithm isn’t random. It takes your training data, applies some fancy linear algebra to it, and comes back with the “best” possible line. Here’s what that line might look like: The linear regression line (red) plotted against the true data (blue). The linear regression model, then, can be written as follows: Income = $989 × Height + $30,687 This model implies that for every additional inch of height, you can expect to make $989 more annually. Putting on the brakes: It’s unbelievably important to interpret these model parameters correctly. What we’ve found here is a correlation between height and income, not causation. Being taller doesn’t directly cause your income to increase, of course. Rather, being tall usually means that people treat you with more respect (at least in Western culture), which can give you a leg up in job interviews and salary negotiations, thereby increasing your income. This example teaches us a valuable lesson about ML: it’s not magic. It won’t discover fundamental truths about the world around us. Rather, it will tell us how certain variables, like height and income, correlate with one another.
Concluding from the model above that height causes increased income is incorrect, and failing to understand that these patterns are merely correlational might cause you to make inappropriate business or policy decisions down the road. We now know enough to truly understand what the learning in machine learning means. Many models, including linear regression, have parameters, which are variables (e.g. the m and the b in y = m × X + b) that the algorithm estimates (read: learns) by examining the training data. This process is called training. To drive this point home, the simple linear model used above has two parameters that needed learning: m and b. By running the linear regression algorithm, we estimated m to be $989 and b to be $30,687. Evaluation function: How do we know that this line is best? This is a subtle question, and there’s no universally agreed upon way to answer it. Best, of course, is a relative term, and, in this case, we’ve defined the best line to be the one that maximizes R² (pronounced R-squared, also called the coefficient of determination) on the testing data, which we can calculate by applying an evaluation function to our results. An evaluation function is merely a way of determining just how good our predictions were. The way this is calculated in practice is to train a model on training data, but to reserve a handful of data (called testing data) for evaluation purposes. Once your model is trained, you can hide the response variable in the testing set and run the rest of the test data through your model, resulting in a series of predictions. Then, you can compare your predictions to the true values of the response variable in the testing set, and use the evaluation function to give you an exact determination of how well you did. R² is a very common evaluation function for regression models, though its explanation can get quite technical.
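The evaluation procedure just described can be sketched as follows. This is a minimal Python illustration, assuming the standard definition of R² (one minus the ratio of the squared prediction errors to the squared deviations from the mean). The function names and the three test rows are made up for illustration; only the slope and intercept come from the fitted model above.

```python
def r_squared(actual, predicted):
    """Standard R^2: 1 - (sum of squared errors / total sum of squares)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


def model(height_inches):
    """The fitted line from the text: Income = $989 x Height + $30,687."""
    return 989 * height_inches + 30_687


# Made-up testing data that the model never saw during training.
test_heights = [62, 66, 70]
test_incomes = [92_000, 96_500, 99_000]

# "Hide" the true incomes, predict from height alone, then compare.
predictions = [model(h) for h in test_heights]
score = r_squared(test_incomes, predictions)
print(round(score, 3))
```

Perfect predictions give a score of exactly 1, and always predicting the average income gives exactly 0. The number this returns is R².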
It essentially describes how “much” (measured as statistical variance) of our response (e.g. income) we can explain with our predictors (e.g. height). If R² is 1, then we can explain all of the variation in income, just by looking at height. That means height is a perfect predictor of income in the test set. In other words, we can make perfectly accurate predictions in the test set about someone’s income, just using knowledge about their height. Putting on the brakes: Does an R² of 1 imply that we can always perfectly predict someone’s income given their height? No, and thinking so can have dangerous consequences. We received an R² of 1 by training our model on the training data and then predicting the response variable for the test data. We then compared our predictions to the true values of the response variable. What does this say about the validity of machine learning models? It tells us that machine learning models learn from the past, and they implicitly assume that the future is going to behave similarly. But sometimes the future changes — the relationship between height and income goes away when managers go through bias training — thus invalidating the model. What’s even more common is training on biased data in the first place, which might not accurately represent the phenomenon that we’re trying to measure. In building a training dataset for a model to predict income using height, I might be collecting data just from Western cultures. Is this model appropriate for Eastern cultures, for which I don’t actually have training data? Probably not. This has huge implications in learning systems that are employed in the criminal justice system, for instance, which use models trained on data from the past. Certainly we should remember to ask how recently the model was updated with fresh training data. We should also ask if the training data is comprehensive enough — what if our models never included training points from those under 60 inches? 
Is it fair to make predictions about people who are shorter than anyone else in our training set? Is extrapolation just as fair as interpolation? If R² is zero, then we can’t explain any of the variation in income by looking at height. This means that height tells us nothing about income in the test set; it’s a worthless predictor. This all means, of course, that a higher R² is better than a lower one. In machine learning jargon, a model with a higher R² is more predictive than a model with a lower R². We could have also evaluated our model using a metric other than R², which would have caused our red line to look different. In the medical domain, for instance, we may care more about eliminating under-predictions (of, say, cholesterol level) than about finding a line with the highest R². A common classification model: KNN K-Nearest Neighbors (KNN): KNN is a powerful classification algorithm that is conceptually intuitive. It is exactly the type of model that a 4-year-old might design, despite having never taken a math class. We’ll start with an example. Let’s imagine that we’re contracting for the US government, working on a team that wants to build and maintain an accurate dataset of which plots of land in the US are vegetative (i.e. fertile) and which plots are not. Having this dataset will help policymakers decide how to manage precious natural resources. A member of our team has already hand-labeled a few hundred latitude-longitude coordinates from a satellite image as being either vegetative or non-vegetative (see the plot below). Our goal is to train a classification model that can take in a latitude-longitude pair (called a test datapoint) and predict whether or not it’s vegetative. A hand-labeled scatterplot. This example is inspired by CS109A, a course offered at Harvard College in the Fall of 2016. One of the most intuitive ways to solve this problem is to merely look at the test point’s closest neighbors.
Take, for example, a test point at (0.35, 0.4), colored black below: We want to predict whether this coordinate represents a vegetative or a non-vegetative region. To do this, we can select, say, the 3 nearest dots. These three dots are all labeled as non-vegetative, so we’ll classify the test point as non-vegetative as well. (When the neighbors disagree, the test point takes whichever label is most common among them.) Now we know why this model’s name contains the phrase nearest neighbors: we classify a test point based on that point’s already-classified neighbors. What’s the K for? Simple: K is the number of neighbors we look at when determining how to classify a test point. If K is really small (say K = 1), we’ll only look at the single closest neighbor when classifying our test point. If K is really large (say K = 100), we’ll look at a huge range of neighbors before we make a classification. The K value that you choose when making classifications depends entirely on the problem at hand and is often not easily intuited. Instead, data scientists will try plugging in many different values for K and seeing which value leads to the most accurate classifications. Note that K is a parameter of the model, but it’s not a parameter that the model learns through training. Rather, it’s a special parameter that the data scientist explicitly sets herself. We call these parameters hyperparameters, and you should know that a data scientist often decides which hyperparameter value to use by evaluating a bunch of different possibilities for that hyperparameter (say, K = 1, 5, 10, and 50). The process of evaluating these possible hyperparameters is called tuning. Putting on the brakes: Why are hyperparameters an important consideration for non-technical decision makers? It has to do with the way that most common software packages are implemented. Most ML code libraries use default hyperparameters in the event that the data scientist forgets to explicitly set them herself.
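The nearest-neighbor vote described above is short enough to write out in full. This is a minimal Python sketch, not a production implementation: the function name `knn_classify` and the five labeled coordinates are made up for illustration, and distance is measured as a straight line between points, which is an assumption the post does not state.

```python
import math
from collections import Counter


def knn_classify(test_point, labeled_points, k):
    """Label test_point with the most common label among its k nearest neighbors."""
    # Sort the already-labeled points from closest to farthest.
    by_distance = sorted(
        labeled_points,
        key=lambda item: math.dist(item[0], test_point),
    )
    # Keep the labels of the k closest points and take a majority vote.
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Made-up hand-labeled coordinates: ((x, y), label).
labeled_points = [
    ((0.30, 0.45), "non-vegetative"),
    ((0.40, 0.35), "non-vegetative"),
    ((0.33, 0.38), "non-vegetative"),
    ((0.80, 0.80), "vegetative"),
    ((0.75, 0.90), "vegetative"),
]

prediction = knn_classify((0.35, 0.40), labeled_points, k=3)
print(prediction)
```

Tuning amounts to calling a function like this with several values of `k` and keeping the one that classifies held-out points most accurately. In this sketch `k` must always be passed explicitly, which is not how every library behaves.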
For example, if I forget to choose a value for K when running my KNN model in code, the software will assume I wanted K to be, say, 5. But the value of K is important and can have big implications for the results of the model. Should software packages be allowed to use default hyperparameters? I’m not sure. Regardless, the important takeaway here is to always ask whether or not a model has been tuned properly; otherwise, it might not be doing its job correctly. Lastly, a (very) brief discussion about artificial neural nets… It is not the goal of this post to describe in detail how specific ML models work. That said, neural nets are especially important for non-technical readers because they’re incredibly powerful and are becoming ubiquitous in nearly any technology that merits the use of ML. A toy neural net with 9 artificial neurons. Borrowed from https://en.wikipedia.org/wiki/Artificial_neural_network Neural nets are biologically inspired models, wherein collections of interconnected “neurons” (often called units or nodes) work together to transform input data to output data. Each node applies a simple mathematical transformation to the data it receives; it then passes its result to the other nodes in its path. Just as our brain contains billions of biological neurons, neural nets typically contain thousands or millions of these artificial neurons. Each connection between nodes represents a different parameter of the model. This means, of course, that neural nets with millions of nodes have potentially billions of parameters associated with them. A common criticism of neural nets is that they are too difficult to interpret, that it’s too hard to understand “what’s going on” inside of them. This criticism springs from the fact that neural nets simply have too many parameters, most or all of which are determined via complex combinations of mathematical transformations of the training data.
While a simple linear regression has just two parameters (m and b), a neural network can have millions or even billions, and it can be hard to figure out how they were calculated. Putting on the brakes: We often run into a tradeoff between a model’s interpretability and its predictive power. Neural nets are hugely powerful, but are almost impossible to interpret. On the other hand, simple linear regression is considerably easier to understand, but it’s generally not as powerful. Should we trust the results of complicated neural nets if we can’t interpret their parameters? Why might it be a good idea to trust neural nets used in criminal justice situations to determine sentence length? Why might it be a bad idea? The answers to these questions are critically important. As a general rule of thumb, machine learning models have to balance a tradeoff between being easy to understand and being powerful. Really powerful models like neural nets are often called black boxes because we’re not quite sure what’s inside of them, but these models can be extremely predictive. When building models (or when telling others to build them, as a CEO or president or professor might do), we must decide what balance we’d like to strike between building models that are predictive and building models that are easy to understand. This balance almost always depends on the domain of the problem, of course. Having a neural net predict who we should let out of jail is probably a poor idea, given how hard it is to understand exactly why the neural net is making certain predictions. On the other hand, using a simple linear regression is also inappropriate because its predictions will likely be poor. A parting message: People tend to view much of ML as magic, and I hope I’ve shown you that this assumption is far from the truth. Assuming that models (like neural nets) are omniscient and infallible could potentially cause us to put too much faith in their output.
Knowing how many different components go into building a model should open your eyes to the many opportunities for bias and error in a model’s output, which should be fiercely scrutinized before being trusted. Join SafeGraph: We’re bringing together a world-class team, see open positions. FAQ’s 1. What is machine learning in simple terms? Machine learning is a way of using data to estimate relationships between inputs and outputs so that predictions or patterns can be identified. 2. How is machine learning different from artificial intelligence? Machine learning focuses on building predictive models from data, while artificial intelligence usually refers to systems that use those models to make decisions. For most non-technical discussions, the distinction is not critical. 3. What Is a Statistical Model? A statistical model is a mathematical function that describes how one set of variables relates to another, such as predicting income based on height and age. 4. Why is training data so important? Models learn entirely from training data. If the data is biased, incomplete, or unrepresentative, the model’s predictions will reflect those flaws. 5. What is the difference between regression and classification? Regression predicts continuous values such as income or age, while classification predicts categories such as types, labels, or classes. 6. Are machine learning models objective or neutral? No. Models reflect human choices about data, assumptions, and evaluation methods, which makes scrutiny and ethical judgment essential.
#### A Startup’s Guide to Remote-First Onboarding Key Takeaways Remote-first onboarding requires deliberate structure to replace informal, in-office learning. Clear tools, paced live-sessions, and early relationship-building improve retention and productivity. Strong onboarding reinforces company culture by showing how teams work, not just stating values. The onboarding experience can make or break a new hire’s trajectory. Onboarding a new employee should be a full introduction to how your company communicates and functions. Now that the world is turning into a remote-first workforce, remote onboarding tools are becoming more and more essential. Without in-person interaction or the opportunities for new hires to physically observe and immerse themselves in your business, it is vital to get your remote onboarding right. On top of this, your onboarding process can impact recruitment. Top talent will naturally gravitate towards leading remote-first teams. SafeGraph is an entirely remote company, with employees spread across North America. SafeGraph has always been a remote-first company.
We’ve learned a thing or two about growing a remote-first team, and in this guide we aim to share our knowledge. 5 Reasons Why a Strong Onboarding Process is Essential A good remote onboarding process is crucial for a variety of reasons, including: Employee morale - No new hire wants to feel like they are alone or without guidance in their new position. In a remote-first environment, it is even more important to go above and beyond by personalizing each new employee’s experience. Camaraderie - Trust is critical to the success of any team. Remote-first companies can find it hard to build relationships, especially in a new hire’s early days. The right onboarding tools can facilitate camaraderie by building trust. Employee retention - A study by The Wynhurst Group showed that 22% of staff turnover happens within the first 45 days after signing an offer letter. Fail your employees with a poor onboarding process and you risk losing them. Employees who had a positive onboarding experience were 58% more likely to stay with the company after 3 years. Maximizing productivity - Onboarding new hires effectively will allow them to be better at their jobs, sooner. Company culture - Growing a culture within a remote-first team is difficult, but crucial. A standout remote onboarding process will not only tell new hires what the company's values are, but also show them by leading by example. When it comes to company culture, show, don’t tell. 4 Tools for Successful Remote Onboarding There are a ton of tools out there that are worth considering once you hit 50-100 employees, but many of these tools require a lot of time to build out and an internal IT admin to assist. Below are tools that will help you optimize your remote-first onboarding process, especially for companies under 100 employees. They take little to no time to set up and require no IT admin or knowledge.
Fleetsmith - Remote Computer Management - Most companies do not hire an internal IT admin until they have reached around 300 employees. IT support is really expensive and in a nimble startup, there often aren’t enough IT tickets to help justify paying for a full-time employee. Finding tools that require no IT background to set up is hard; however, for remote-first startups, it is worth it. Fleetsmith is one of the diamonds in the rough. The tool took only one hour to set up and distribute across our entire company. We can now manage all computers and remotely wipe computers when needed. No IT background or IT knowledge is needed to set up, manage, distribute, or use Fleetsmith. Notion - Internal Company Wiki - Notion is a user-friendly tool to use as a company wiki (amongst many other uses). In a future post, we will dive into how to create and maintain transparent communication through a company wiki or intranet. For now, we’ll share how SafeGraph uses Notion for onboarding. A company wiki holds all internal information and processes that help new employees learn how their team and company function. It also helps them to stay up-to-date on new processes as they form. We chose Notion for many reasons, one of which is its UI. Making it easier to edit, read, and search is extremely important because the process of writing information down is already a daunting task. Notion requires no training to find, access, and input data. Getting users to data dump into a company wiki is hard enough - why make it harder by picking a tool that requires training to use? SafeGraph’s wiki scavenger hunt. A week before the new hire starts, we send out a fun interactive wiki scavenger hunt. This scavenger hunt prepares the new hire for their first week. The goal of the scavenger hunt is that new employees end up with a good grasp on how to utilize the wiki, how their department functions, and how their company works as a whole.
Each scavenger hunt contains the most common questions across the entire company, as well as the frequently asked questions from the new hire’s department. To encourage people to complete it, we offer a prize if they submit the answers before their first day - so far we have only had one new hire who hasn’t submitted their answers. Donut Onboarding - Automate Onboarding Introductions - Donut allows you to automate onboarding intros and send reminders directly through Slack. Most companies use the Slack-integrated tool Donut to boost morale and encourage employee bonding, but a lot of teams don't know it has an onboarding feature as well. With Donut Onboarding, you can automatically assign new hires to their manager, onboarding buddy, and more. You can make use of customizable templates that trigger on specific dates for specific roles. You no longer need to manually tell people what to do and when to do it - Donut Onboarding does it all for you. On a new hire's first day, you can sit back and relax knowing the new hire is connected to the right people. Asana - Tracking Tool for Admins and New Hires - Asana can be utilized quickly and easily as a pre-onboarding tracking tool for both admins and new hires. When onboarding remotely, it is important to stay organized and create a straightforward process for your new hires. All new employees need to see an overview of the tasks at hand. This ensures that they do not feel overwhelmed when they have 20 different invites in their inbox to documents they have to sign or tools they have to accept. With Asana, you can create a to-do list where both the admin and new hire can track what needs to be done and when. This makes everything far more manageable for both the employees tasked with managing onboarding and, most importantly, the new hire.
3 Tips for Live Onboarding Sessions In the age of remote work, new hires expect some form of back-to-back virtual onboarding sessions to learn about the company, team processes, and meet their coworkers. Onboarding meetings should not be tiring, long, and boring. No one wants to sit on Zoom meetings all day. With that in mind, here are our guidelines for onboarding meetings: Limit to a maximum of 5 hours of onboarding meetings per day - This prevents new hires from ending their day burnt out and in a bad mood. Any more than 5 hours and we’ve found Zoom fatigue is likely to set in. Only one of the 5 hours per day should be spent in onboarding meetings outside of the new hire’s department - This leaves room for the new hire to connect with their direct department by leaving space for team members to add spontaneous calls or social events to their calendar. No more than 1.5 hours of back-to-back meetings before a meeting break - This allows all new hires to process what they just learned in their meetings and prep for the next round. 4 Golden Rules of Onboarding As well as our thoughts on the right onboarding tools and process, there are plenty of onboarding tips we can share. We will go through these in more detail in a future post, but for now, these simple tips can help you to implement your newfound tools in the most effective and efficient way: Don’t overload new hires - There is no denying that most new employees have a lot to learn. If you dump everything on them at once, it can quickly become very intimidating. On top of that, they’re more likely to skim through new information in a hurry to move onto the next. It is far better to break it down into bite-sized chunks, and spread out over time as much as possible. Team up - A “buddy program” might sound corny, but that doesn’t mean it isn’t effective. There are certain aspects of company culture that can’t really be conveyed in writing, and a buddy can help. 
On top of that, a familiar face or voice is comforting for a newcomer. Set up regular one-to-one meetings with anyone required - If your new hire is going to have to check in with a manager regularly, get them accustomed to this right when they start. Get feedback on your onboarding process - You won’t get it perfect the first time. A remote onboarding process is always a work in progress, and new tools might crop up that lead to you altering your processes. Always ask for feedback from people once they’ve completed their onboarding. Ask what worked and what didn’t, what they enjoyed, and what they think could’ve been better or different. SafeGraph's completely remote team maintains strong connections through team events even after onboarding. For some companies, becoming remote-first has been a huge transition. There aren’t a lot of resources out there telling you what to do and how to do it. We hope that sharing this knowledge helps startups that are transitioning into becoming a remote-first workforce. The good news is that remote onboarding doesn’t have to be too painful or complex, and getting it right can save you time, hassle, and money in the future.
#### Accurate Places Data: The Elusive Pursuit of Perfection Key Takeaways Data quality is multi-dimensional and cannot be reduced to any single accuracy metric. SafeGraph evaluates data using three core measures: row precision, row recall (coverage), and column precision. Radical transparency around location data quality helps customers understand tradeoffs and trust improvements over time. Improvements in POI data accuracy directly impact real-world decisions across mapping, real estate, advertising, and analytics. Fast forward to the 2024 Super Bowl. Patrick Mahomes is trying to become the first QB since Tom Brady to repeat and he’s up against the Dallas Cowboys, who miraculously didn’t choke in the playoffs.
A giant watch party is scheduled at Jerry World (the Cowboys’ home stadium, Placekey: zzw-222@5qw-vxp-h3q). You run a sports betting side hustle and are modeling each team’s expected points to decide whether to bet on or against the spread. Your method is bottom-up: take every play from every game, assign it a situational success rating, and predict points scored based on the aggregated ratings.

Before you start modeling, though, you need to distill each game the Chiefs and Cowboys played that season into a CSV. Then you will assess the data quality. What if you were missing the play where the Cowboys’ third-string corner had a pick-six against Jalen Hurts in the NFC Championship? In your data files, each row is a play and each column contains details about the play: down, distance, time, personnel, run or pass, yards gained, and description. How would you assess the “quality” of this dataset?

Let’s explore: Do we have rows for all the plays that happened? Are there extra rows from other games or seasons that snuck in during processing? What should we do about duplicate plays? Or the down just being wrong for a certain percent of rows? But whoops, one game has the distance column totally null throughout the entire file. And our data provider didn’t include the time column until Week 7 - who knows what happened there, probably an intern. Ready to place your bets?

Clearly, data “quality” is a multi-faceted and hard problem. As a data-only company, we are well suited to tackle the problem and care immensely about addressing it. So much so that we have spent the last few months thinking about how to bucket different quality problems, which fixes will have the biggest impact for our customers, and how to communicate improvements to the market. We have narrowed in on a three-pronged framework and started in the US.

Row Precision assesses whether each POI (row) in our data truly exists and is marked open when it should be. We call this the Real Open rate.
If we have a Real Open Rate of 80%, that means we think that 80% of the POIs in our dataset exist in real life and are currently open. The other 20% are Unconfirmed, Closed, or Duplicates.

Row Recall assesses how much of the universe we have in our dataset, often referred to as “Coverage.” Coverage questions come up from prospects and customers all the time. We have elected to benchmark ourselves against the industry leader: Google. Each month, we take a specific geography and compare our Real Open POIs versus Google’s Real Open POIs. The result is a percent - if our Coverage Rate is 80%, that means our count of Real Open POIs in a geographic zone is 80% of Google’s. We also break this out by NAICS code on the Summary Stats page.

Column Precision assesses the fill rates for select columns and their correctness. For example, if our websites column has a fill rate of 70%, and we compare that to our websites truth set and find that 90% of the entries are correct, the websites column precision is 70% * 90% = 63%. These metrics will be published in future releases and all these truth sets are created manually.

This radical transparency about data quality is unprecedented in the POI industry. And while you can expect improvements in future releases, we are proud of where we stand today. Below we will discuss row precision and row recall in detail, including our process, tradeoffs we considered, and what to expect in future releases.

Row Precision

You’d think that determining if a POI is real would be simple. If we had Santa’s ubiquity, we would check each one individually in just a single night. But alas, our reindeer are sleeping, and we need to use the internet and scalable methods for verifying each of our tens of millions of US POIs. A SafeGraph POI can fall into one of four mutually exclusive categories: Real Open, Unconfirmed, Closed, or Duplicates.
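To make the three measures concrete, here is a minimal Python sketch of the arithmetic described above. It is only an illustration under stated assumptions: the function names, the category keys, and every sample count are hypothetical, and this is not SafeGraph's actual pipeline.

```python
# Hypothetical sketch of the three quality measures described in the article.
# All names and sample numbers are assumptions made for illustration only.

def real_open_rate(category_counts):
    """Row precision: Real Open rows divided by total rows."""
    total = sum(category_counts.values())
    return category_counts["real_open"] / total


def coverage_rate(our_real_open, benchmark_real_open):
    """Row recall: our Real Open POI count over the benchmark's count in one zone."""
    return our_real_open / benchmark_real_open


def column_precision(total_rows, filled_rows, checked_rows, correct_rows):
    """Fill rate multiplied by correctness measured against a truth set."""
    fill_rate = filled_rows / total_rows        # share of rows with a value
    correctness = correct_rows / checked_rows   # share of checked values that are right
    return fill_rate * correctness


# Made-up split across the four mutually exclusive categories.
counts = {"real_open": 80, "unconfirmed": 12, "closed": 5, "duplicate": 3}
print(f"Real Open rate:   {real_open_rate(counts):.0%}")

# Made-up zone where we hold 80 Real Open POIs and the benchmark holds 100.
print(f"Coverage rate:    {coverage_rate(80, 100):.0%}")

# The article's websites example: 70% fill rate, 90% correct in the truth set.
print(f"Column precision: {column_precision(1000, 700, 200, 180):.0%}")
```

Keeping fill rate and correctness as separate inputs makes it easy to see which of the two is dragging a column's precision down.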
Our north star metric for this prong is Real Open divided by Total Rows.

First let’s consider a Real Open POI: Open Range Steakhouse located at 241 E Main Street in Bozeman, MT. A quick search reveals that it has a first party website, an active Facebook page, several Google reviews from the past month, and recent Yelp reviews. There’s also an article about country music star Gavin DeGraw and his brother buying the restaurant. This one is easy to classify as Real Open.

Next we will look at a now-removed Unconfirmed row: a POI named “Phantom of the Opera”. Whoops. That’s a play. And even though it may have a claimed Yelp page with a decent amount of reviews or its own 1st party website with an embedded map widget, it's not a place. The Booth Theater where the play is being performed? Absolutely a place. And also already in our dataset (Placekey: 227-224@627-wbv-kcq). But the line for the play itself? That’s got to go and now gets filtered out as Unconfirmed.

Then there’s a Closed POI: a nursing home called Mountain View Care Center located at 205 N Tracy Ave in Bozeman. This facility used to exist, but according to several articles it closed in 2020.

Duplicates should be fairly self-explanatory. If we find a POI represented twice, we’ll…wait for it…remove one.

Between the clear cut cases, there are obviously significant gray areas. We addressed these ambiguities one at a time with many SafeGraphers manually verifying thousands of POIs. After cross-checking our work and sharing strategies, looser guidelines coalesced into the consistent rules outlined here. The TLDR is that when we manually classify a POI as Real Open, we look for either 1) a first-party website where data originates, 2) several recent reviews, or 3) they pick up the phone.

Our March 2023 release had a Real Open Rate of 60%, which means 40% were Unconfirmed, Closed, or Duplicates. At face value, not an ideal result!
But even the behemoth Google was at the same level as us and other competitors had even lower rates. While we were pleased to be at parity with Google, we knew our customers would want more and set out to identify which of our US rows should be filtered out.

We prioritized identifying and filtering Unconfirmed POIs. After manually classifying around 30k POIs to generate a truth set, we trained a machine learning algorithm that used attributes and metadata like websites, category, region, reviews, and sources to predict the likelihood that a POI is Unconfirmed or Real. Internally, we dubbed this concept the “IRL factor.” Once all our POIs were assessed, we set category-specific thresholds, making sure to balance precision and recall tradeoffs. As the graphic below shows, more aggressive thresholds mean that more Unconfirmed POIs are filtered out (true positives), but they also increase the chance that Real POIs are incorrectly flagged (false positives). In the July 2023 release, we stayed conservative and prioritized keeping real POIs over removing every single Unconfirmed row.

Thanks to this model, our Real Open Rate improved to 66% in the July 2023 release and we know how we can keep it growing. Each month, we publish our Real Open rate on the Accuracy Metrics page. In Q2, we mostly focused on identifying Unconfirmed rows, but we know we still have work to do for Closed and Duplicates. And we will always strive for further improvement.

Row Recall

Next, let’s talk about a slightly easier problem: row recall. Colloquially, many people refer to this as “coverage” and it is simply how many total POIs we cover in a geographic zone. As mentioned above, in the US, we gauge ourselves against Google. When doing these analyses, row precision is the first step. This means that first we eliminate Unconfirmed, Closed, and Duplicate rows so we can only compare Real Open rows between vendors.
This elimination is manual and follows the guidelines outlined in the prior section: broadly we verify via 1) existence of a first party website or 2) recent reviews.

Once we have the Real Open rows for SafeGraph and Google, we manually match the rows. The end result is a number of Real Open POIs each vendor has in a specific zone. For the July 2023 release, we kept it close to home and looked at zip codes 94103 (San Francisco, CA; the zip code of SafeGraph’s first office) and 98110 (Bainbridge Island, WA; the lowest population density zip code that a SafeGrapher calls home). Our north star metric for this prong is what we call the “Coverage” rate, which is SafeGraph’s Real Open POIs over Google’s. As of the July 2023 release, we are at 79%.

We know we have work to close the gap with Google, but we are proud of where we stand today and of the rapid progress we make every month.

Naturally, there are nuances in deciding if a place is a POI. Should every real estate agent who works at a real estate firm be its own POI? What about each individual lawyer at a law firm? Or a counselor who is a sole proprietor and has an office in a large office building? The ambiguities are endless. Kiosks, ATMs, transit stops, parks, rivers…trees?? Just kidding, trees are probably going too far. But we did put a lot of thought into what should be a place within our scope, and it is always subject to change pending market needs.

We have documented guidelines here. While in the past we have focused on places where people spend money - restaurants, bars, retail stores, gas stations - recently we have expanded to also include places of leisure, work and travel. Improving recall is simple: adding sources. Each month we add hundreds of sources to our pipeline. These sources are cleaned, joined, and deduped to result in a single source of truth for each POI.
We have counts by NAICS code and country listed on the Summary Stats page and if you are interested in something we don’t have yet, please don’t hesitate to Contact Us.

Why Does This Matter?

Row precision, row recall, all this detailed and technical work, why bother? The cursory reason is that we are a data company and pride ourselves on selling high veracity data. But more importantly, better SafeGraph data improves our customers’ bottom lines.

Imagine you are a Product Manager at a company that makes a mapping application - think Apple Maps, Bing Maps, or Mapbox - and users use your app to find and route themselves to nearby places, aka “local search.” If SafeGraph has a higher Real Open rate, that means fewer bad arrivals for your users. If we have more Coverage, more places exist when users query in the search bar. A better user experience leads to increased usage and higher revenue per user.

Imagine you are the Head of Real Estate at a QSR - think Subway, Domino’s, or Chipotle - and you are responsible for choosing new locations. Real estate teams use complex Huff or gravity models to estimate foot traffic and revenue at future sites, but we all know that in models, garbage in = garbage out. When SafeGraph has cleaner data, your team can be more confident in the model output and ensure that the million plus dollar decision you are making is the correct one.

Imagine you are a Product Manager for a site selection or real estate software company - think Kalibrate, Buxton, or Crexi. You embed POIs in your site-selection software and/or model so that users can analyze different trade areas for new store development. When your customers see inaccuracies like closed or missing POIs, they doubt the validity of your platform; ‘bad data’ may impact model output.
When SafeGraph has more accurate data, aka a higher Real Open rate and more Coverage, your customers can make better decisions, feel more confident in the platform, and renew at a higher rate.

Imagine you are the Chief Data Officer at an advertising firm that does visit attribution or OOH campaign planning and measurement - think Clear Channel, Vistar, or Billups. You take billions of mobile GPS pings, cluster them into groups, and see which POIs people actually visited to create audiences for your customers. Or, your customer McDonald’s wants to advertise specifically on billboards near Wendy’s locations. Better SafeGraph data means more accurate audiences or improved campaign planning, enabling your customers to derive more revenue from their advertising efforts.

It is difficult to enumerate all the potential use cases for POI data. We sell into many verticals, each with slightly different product requirements. But, we are confident that in all use cases, across all industries, more accurate data allows our customers to make better decisions that improve their bottom lines. If you’ve worked with SafeGraph before, you know that we stop at nothing to make life easier for our customers and this effort was no exception.

FAQs

1. What does “row precision” mean in place data?
Row precision measures whether each POI in the dataset truly exists and is currently open, expressed through the Real Open rate.

2. What is row recall or coverage?
Row recall, often called coverage, measures how much of the real-world universe of POIs SafeGraph captures, benchmarked against Google.

3. Why does SafeGraph compare its coverage to Google?
Google is widely considered the industry leader in POI coverage, making it a practical benchmark for comparison.

4. Why is perfect POI data so difficult to achieve?
Places constantly open, close, relocate, and change attributes, making verification at scale complex and subject to tradeoffs.

5. How does better POI data affect business outcomes?
More accurate places data improves user experience, modeling confidence, campaign performance, and decision-making across industries.

#### Adapt or Be Left Behind: The Shift to Data-Driven Real Estate

Key Takeaways

Real estate has become more data-driven, but data adoption remains uneven across the industry.

Institutional players lead in real estate analytics, while smaller firms lag.

Competitive advantage comes from applying data effectively, not merely accessing it.

Faster deal cycles reward firms with strong data confidence and execution speed.

Data quality is paramount as the industry moves toward AI and automation.

Why Data Adoption Remains Uneven in Real Estate

The real estate industry has rapidly shifted toward data-driven decision-making over the last decade, reshaping how asset managers and tenants make business decisions. But there is a critical issue: real estate data adoption is not evenly distributed across the industry. In this context, the data revolution refers to the shift from intuition-driven and consultant-led decision-making to the systematic use of alternative data, analytics, and data science models across underwriting, site selection, investment, and operations.
It is not just about having more data, but about integrating raw datasets into repeatable, decision-ready workflows that allow firms to move faster and with greater confidence. As you can see in the graph above, some areas of real estate, like logistics companies and big chain retailers, are very sophisticated in their use of alternative data. These organizations have invested heavily in data science teams that build advanced models using raw datasets to improve decision-making. Regional brokers, investors, and retailers, on the other hand, are far behind. On top of that, while some areas of real estate are forward-thinking and advancing in their data adoption, as a category, real estate still has a long way to go. Compared to other industries, it remains near the bottom of the data maturity curve. The good news is that the number of data scientists in the real estate industry has grown tenfold in the last four years. The bad news is that this growth is coming from a very small base. The market is evolving – adapt or get left behind The days of deal teams spending months on diligence and working with a long list of consultants are long gone. Leading retailers and large investment managers recognized this shift years ago and adjusted their processes accordingly. The real estate world is changing fast, and there are three major real estate analytics trends to watch: Bigger data science teams: Real estate investment and brokerage firms are hiring data scientists at a faster rate than ever before, and the pace continues to accelerate. Deal cycles are shrinking: Speed has become a competitive advantage, and confidence in data-backed models enables faster conviction. ‍Data options are growing: The alternative data market is expected to grow at a CAGR of 39 percent through 2025. Same data, different use cases The real estate industry is fairly broad, with many different types of organizations optimizing for various use cases. 
Investment firms, developers, operators, and brokers often rely on the same core datasets while applying them to very different decisions. There are many use cases across many verticals. What differentiates outcomes is not access to data, but how effectively it is applied through real estate analytics. Data is your alpha. Use it or be outperformed Underwriting is probably the most important aspect of investing. There are two dimensions of underwriting: depth and speed. You want to be able to go deep on an asset while moving quickly. Going deep means doing significant research and truly understanding the quality of an asset. It requires going beyond surface-level materials and widely available market studies. It means finding sources of information that create information asymmetry between you and your competitors. That advantage often comes from data-driven real estate insights. For example, a large commercial real estate investment firm that invests in retail properties uses a variety of datasets to power its underwriting models. They use Point of Interest (POI) data to understand the surrounding landscape and competitive context. They use lease data to stay ahead of shifts in leasing comps while underwriting. They also leverage consumer transaction data to quickly evaluate trends at competitive and adjacent locations to estimate stabilized top-line performance. The second component of underwriting is speed. When underwriting new assets, you generally have three options: Do manual research on the asset and source information points Outsource the research and have someone else do this work Use pre-built models that update by pulling in raw data feeds instantaneously Most companies rely on some combination of the first two options. The third requires upfront investment. Building internal data capabilities depends on trusted data sources, specialized talent, and refined models. While these require time and effort upfront, they save many hours over the long term. 
In competitive markets, the ability to move faster often determines who wins the deal. Location, location, location Picking the right location is one of the most critical decisions in data-driven real estate. For retail businesses, it can be a matter of survival. Consider chain coffee shop operators. When entering a new market, they need to deeply understand neighborhood quality, the competitive landscape within each area, and local demographics. Most competitors rely on reports from research companies and consultants, but far fewer work directly with location intelligence. Here’s how retail businesses like coffee shop operators can use data to get a faster view of a new market: POI data: Locate competing coffee shops and understand their proximity to potential sites Transaction data: Analyse transaction patterns across coffee shops to inform site selection ‍Census data: Identify neighbourhoods with income levels aligned to the target customer base The bigger the investment size, the more data you need The larger the investment size, the greater the risk. As deal sizes increase, the cost of being wrong compounds quickly. Hotel operators and multifamily developers are good examples of organizations that combine multiple alternative data sources when acquiring or developing high-capital assets. One of the most powerful analyses they use is neighbourhood scoring, which helps assess sub-market quality. By using POI data, they can quickly evaluate proximity to transportation hubs, restaurants, bars, and event venues. Hotel operators need to understand access to airports and convention centers. Multifamily investors often look for proximity to grocery stores, gyms, and high-quality schools. POI data helps address all of these considerations. Neighbourhood scoring alone is not enough. Investors also rely on property-level data to contextualize construction quality, asset history, and comparable buildings. 
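As a rough illustration of the neighbourhood scoring idea, the Python sketch below counts POIs by category within a radius of a candidate site and combines the counts into a weighted score. Everything in it is an assumption made for the example: the coordinates, the category names, the weights, and the scoring rule are hypothetical, not a method described by any particular operator or data provider.

```python
import math
from collections import Counter

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/lon points, in metres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def nearby_counts(site, pois, radius_m):
    """Count POIs per category that fall within radius_m of the site."""
    counts = Counter()
    for poi in pois:
        if haversine_m(site[0], site[1], poi["lat"], poi["lon"]) <= radius_m:
            counts[poi["category"]] += 1
    return counts


def neighbourhood_score(counts, weights):
    """Weighted sum of nearby POI counts; the weights are a modelling choice."""
    return sum(weights.get(category, 0) * n for category, n in counts.items())


# Hypothetical candidate site and POIs (made-up coordinates and categories).
site = (45.6795, -111.0370)
pois = [
    {"category": "grocery", "lat": 45.6800, "lon": -111.0370},  # tens of metres away
    {"category": "gym", "lat": 45.6830, "lon": -111.0400},      # a few hundred metres away
    {"category": "school", "lat": 45.7300, "lon": -111.0370},   # several kilometres away
]
weights = {"grocery": 3, "gym": 2, "school": 4}

counts = nearby_counts(site, pois, radius_m=1000)
score = neighbourhood_score(counts, weights)
print(dict(counts), score)
```

A hotel operator and a multifamily investor could reuse the same counting step and differ only in the weights, for example favouring airports and convention centers in one case and grocery stores, gyms, and schools in the other.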
There are many other data types that support smarter investment decisions. The takeaway is simple: data becomes more valuable as the stakes rise. It’s not all about commercial assets The value of data extends beyond commercial real estate. There is significant opportunity to generate alpha in residential real estate by using better data. The rise of iBuying has reshaped the residential market. Companies like Opendoor and Zillow brought speed and scale to home buying. iBuyers operate as both investors and market makers. Their advantage lies in deploying capital quickly while compressing traditional buying timelines. However, diligence should not be an afterthought. iBuyers can use POI data to understand retail open and close trends in surrounding areas and strengthen forecasts for home price growth. They can also analyse nearby consumer spending patterns, which often serve as strong demand signals. Speed and diligence do not need to be trade-offs. You don’t have to compromise one for the other. Data is useful for more than just investment decisions Data does more than improve investment strategy. It can also enhance marketing effectiveness and operational efficiency. Developers and brokers invest significant effort in marketing assets coming to market. Building properties is only the first step. Leasing them can be just as challenging. When opening a new multifamily residential building, operators and brokers can run a radius analysis of nearby POIs. Using APIs to identify bars, restaurants, grocery stores, and gyms near a property enables clearer amenity positioning and more efficient leasing. On the operations side, consider big box retailers. Large retailers can combine alternative data with first-party data to optimize hours of operation. By analysing POI data, they can review operating hours of nearby retailers and complementary venues. This insight helps retailers adjust staffing, improve efficiency, and maintain operational discipline. 
Transaction data adds another layer. By evaluating hourly transaction patterns, retailers can quickly identify peak demand periods and adjust accordingly. Go beyond just data about places One of the most successful real estate investors is Blackstone. The firm moves quickly, conducts deep diligence, and has generated strong returns across real estate cycles. Tyler Henritze, Head of Strategic Investments at Blackstone Real Estate, discussed how Blackstone uses alternative data on the World of DaaS Podcast. In the 2000s, Blackstone acquired homes across the southeastern United States after identifying migration trends toward states like Florida. The signal came from U-Haul data. If you follow the U-Hauls, you often find the opportunity. Blackstone also looks beyond location-based datasets, including Truck traffic data to assess logistics routes, LinkedIn job postings to gauge future demand, and Employee count trends to evaluate tenant health. These datasets reflect broader real estate analytics strategies beyond location alone. All this is great… but how do you pick a good data provider? Choosing the right data provider can be difficult. Many data companies do not publish their schemas or pricing. Evaluating datasets often requires multiple reviews over several months, and even then, quality differences can be hard to assess. A few heuristics can help: Accuracy and veracity: Poor-quality data can materially impact returns. Metadata such as confidence scores and coverage rates matters. Accessibility: APIs and raw data access are essential for scalable workflows and timely insights. Cost: Pricing should align with dataset size and intended use case. These heuristics are simple, but they are powerful in practice. Conclusion Real estate has made meaningful progress in real estate data adoption, but it still lags behind many industries on the data maturity curve. The shift to data-driven real estate is no longer optional. 
Reliable data reduces uncertainty, accelerates decisions, and creates durable competitive advantages. Firms that invest in data capabilities will move faster, operate more efficiently, and win better deals. Those that do not risk being left behind.

FAQs

1. Is real estate behind other industries when it comes to data?
Real estate is not behind in awareness or intent, but it remains behind structurally compared to industries like fintech and advertising technology.

2. What is the real estate data revolution?
It is the shift toward widespread use of data, analytics, and automation in real estate decision-making.

3. Why does data quality matter more now than before?
As firms adopt AI and automation, poor data leads to amplified errors, unreliable outputs, and reduced trust. AI systems are only as good as the data they are trained on.

4. What are current real estate analytics trends?
Larger data teams, faster deal cycles, and increased use of alternative data.

5. How does POI data help real estate decisions?
It provides insight into competition, amenities, neighbourhood dynamics, and property valuation.

6. What is brand attribution in real estate analytics?
Brand attribution measures how specific brands or tenants influence foot traffic, demand, and asset performance.
#### AI Boom: How Data Center Investments Are Shaping the Future   Key Takeaways AI’s rapid growth is driving unprecedented investment in specialized data center infrastructure. AI data centre investment spans applications, models, data services, hardware, and physical infrastructure. Modern AI workloads require data centers with advanced power, cooling, scalability, and strategic location planning. Location intelligence plays a critical role in identifying optimal regions for AI-enabled data center expansion. Accurate places data helps investors, tech companies, and planners understand where AI infrastructure is scaling and why. Artificial intelligence (AI) is transforming industries worldwide—from improving customer service to advancing healthcare. But behind the scenes, supporting AI’s rapid growth requires massive investments in infrastructure, especially in data centers. Data centers house powerful computers that allow AI models to operate efficiently, and their role is essential as AI demand grows. At SafeGraph, we gather data on places, including data centers, and can offer insights into how these facilities are evolving to support AI. Here’s what we’re seeing in this growing space and what it means for the future of technology. Where Investment in AI is Happening The AI revolution is unfolding across four main areas: AI Applications: Many startups, including those supported by Y Combinator, are building specialized AI tools for industries like healthcare, finance, legal, and supply chain. We are personally excited about customer service chatbots, domain-specific AI agents, and consumer applications that can leverage SafeGraph Places data (i.e. travel use cases). Large Language Models (LLMs): At the core are foundational AI models like OpenAI’s ChatGPT, Anthropic’s Claude, and Meta’s Llama. LLMs provide substantial value through their ability to understand, generate, and interact with natural language. 
These models require a lot of computing power, advanced algorithms, and clean data to achieve their remarkable accuracy and speed. Data-as-a-Service (DaaS): Data is critical to the performance of foundational AI models. LLMs operate by analyzing and processing massive datasets to learn patterns, structures, and nuances to human language. Data companies like SafeGraph exist to democratize access to data to power these models with an emphasis on data accuracy, trust, and transparency. AI Hardware: Investment in assets like semiconductors is critical to meet AI’s computational demands. Core hardware components like CPUs, GPUs, and TPUs are crucial in enabling AI. Companies like Nvidia are at the forefront of this, driving a surge in semiconductor development with AI-optimized chips to handle compute-intensive models. Supporting the above technology stack requires robust AI-enabled infrastructure, particularly through specialized data centers designed to handle the unique demands of machine learning. Traditional data centers are often insufficient for AI workloads which need enhanced power, cooling, and storage capabilities. Data Centers: The Backbone of AI Data centers are essentially the powerhouses that keep AI running. These facilities host vast networks of computers and storage that allow AI companies to process enormous amounts of data in real-time. Companies like Equinix and Digital Realty, which manage large portfolios of data centers, have become essential for keeping AI running smoothly. Investment in data centers is rapidly increasing to meet AI’s needs. Sequoia Capital stated that AI’s infrastructure demands are now pushing data center investments to historic highs, with major cloud providers like Microsoft making substantial capital commitments to support AI workloads. Hyperscalers such as Amazon (AWS) have also announced big plans to spend $100 billion on AI data center development. 
This surge in spending reflects the growing need for advanced data center capabilities to meet AI’s intensive computational requirements. Building out this infrastructure involves more than installing high-speed GPUs and advanced networking; it also requires efficient cooling systems, sustainable energy resources, and strategic locations. These facilities must be resilient and scalable to meet the ever-growing demand of AI applications. This is why location intelligence is crucial in selecting the best locations for new AI-enabled data centers. Insights from SafeGraph’s Data Center Data SafeGraph’s data on US data centers offers valuable insights into evolving infrastructure to support AI’s demand: Data center growth: New data centers are emerging in technology-driven states like California, Texas, and Virginia. These regions are well-equipped with robust tech ecosystems, advanced connectivity, and renewable energy resources, making them ideal for supporting AI’s intensive needs. Trends in investment: High investment concentration is evident in urban areas and tech hubs such as Silicon Valley, Dallas-Fort Worth, and Northern Virginia. These areas are becoming prime locations as they meet AI’s computational and infrastructure requirements, driven by ongoing demand for data processing capabilities. Building footprints: SafeGraph’s Geometry data includes metadata like polygon_wkt (well-known text) and wkt_area_sq_meters, offering insights into the physical size of data center locations. This can highlight not only growth but the scale of individual centers, showing how infrastructure size aligns with regional AI demand. Proximity to nearby places: Analyzing data center locations in relation to nearby places provides additional context for accessibility and connectivity, showing how proximity to key areas supports efficient operations and connectivity in these high-demand regions. 
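As a small illustration of the building-footprint point above, the Python sketch below totals `wkt_area_sq_meters` by state to compare the count and scale of data centers across regions. The column name `wkt_area_sq_meters` comes from the text; the `region` and `location_name` fields and all row values are assumptions made up for this example, not real SafeGraph records.

```python
from collections import defaultdict

# Hypothetical rows; every name and number here is made up for illustration.
rows = [
    {"location_name": "Example DC A", "region": "VA", "wkt_area_sq_meters": 42000.0},
    {"location_name": "Example DC B", "region": "VA", "wkt_area_sq_meters": 18000.0},
    {"location_name": "Example DC C", "region": "TX", "wkt_area_sq_meters": 30000.0},
    {"location_name": "Example DC D", "region": "CA", "wkt_area_sq_meters": None},  # no polygon
]


def footprint_by_region(rows):
    """Summarise data center count and footprint (square metres) per region."""
    totals = defaultdict(lambda: {"count": 0, "total_sq_m": 0.0})
    for row in rows:
        area = row.get("wkt_area_sq_meters")
        if area is None:
            continue  # skip rows without a usable footprint
        bucket = totals[row["region"]]
        bucket["count"] += 1
        bucket["total_sq_m"] += float(area)
    return {
        region: {**bucket, "mean_sq_m": bucket["total_sq_m"] / bucket["count"]}
        for region, bucket in totals.items()
    }


summary = footprint_by_region(rows)
for region, stats in sorted(summary.items()):
    print(region, stats)
```

Reporting both the count and the mean footprint helps separate regions with many small facilities from regions with a few very large ones.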
What This Means for Businesses and Communities The rise of AI and data center investment has big implications: Investors: Data centers are becoming a smart investment opportunity. With SafeGraph Places, investors can better understand which regions are seeing the most growth. Tech companies: As AI continues to grow, tech companies are looking for the best places to set up data centers. Using data like SafeGraph’s can help these companies choose strategic locations. Local communities: Data centers often bring economic growth and job opportunities to communities. Urban planners can use SafeGraph’s data to understand where growth is happening and what it means for future development. The Road Ahead for Data Centers As AI continues to drive demand for data centers, understanding where these facilities are emerging and how they are evolving is crucial. SafeGraph’s data provides unique insights into the scale, location, and connectivity of data centers across the US, helping stakeholders make informed decisions in this rapidly changing landscape. From tracking new developments to analyzing proximity to other essential infrastructure, SafeGraph’s data equips investors, tech companies, and communities with the information needed to strategically support and benefit from AI’s growing infrastructure needs. ‍ To learn more about SafeGraph’s data and how it can provide insights into the evolving landscape of data centers, reach out to us for more information. ‍ FAQ’s 1. What is AI data centre investment?AI data centre investment refers to capital spent on building and expanding data centers designed to support AI workloads, including compute-intensive models, storage, and networking. 2. Why are data centers critical to AI growth?AI models require massive computational power, real-time data processing, and specialized hardware, all of which are housed and supported by data centers. 3. 
Where is AI data center investment concentrated in the US?Investment is heavily concentrated in technology hubs such as California, Texas, and Northern Virginia, where connectivity, energy access, and tech ecosystems are strongest. 4. How does location intelligence support data center development?Location data helps assess proximity to infrastructure, energy resources, workforce, and connectivity, enabling better site selection and long-term scalability. 5. Who benefits from insights into AI data center growth?Investors, technology companies, urban planners, and local communities all benefit from understanding where AI infrastructure is expanding and how it affects economic developments. #### Alternative Data Providers: Where to Get Alternative Data for Unique Insights   Key Takeaways Alternative data is increasingly used by financial institutions to gain early and differentiated market insights. Choosing the right alternative data provider depends on scope, accuracy, freshness, cost, and interoperability. High-quality alternative data should be analysis-ready and easy to join with other datasets. 
Different providers specialize in different data types, making provider selection highly use-case dependent. Geospatial and behavioral data play a growing role in investment research and risk assessment. Financial institutions are turning more and more to alternative data to predict how the investment market will change from day to day. This data can be all kinds of things: weather, foot traffic, social media trends, news analysis, online transactions, mobile app use, and more. But where do you find all of this information? Or, to put it more specifically: where do you find this information in a form that’s ready to analyze and pull insights from without needing a bunch of preparatory organization work? That’s what we’ll attempt to answer here by pointing you towards some of the top alternative data providers in business today. Here’s what’s inside: What to look for in an alternative data provider 8 top alternative data providers: leveraging data for deeper insights Before we get into the best places to get alternative data from, we’ll start with some questions and considerations to keep in mind when evaluating an alternative data source. What to look for in an alternative data provider Not all alternative data companies are equal, at least not for your specific investing strategy. There are a number of factors you should consider when choosing who to source your data from, including the following: Scope – You want to make sure the data provider you choose has a large enough sample size. If your data pool is too small (e.g. constrained to too specific a transaction type or too short a time period), you could accidentally identify what appears to be a unique trend, when a more wide-angle look would tell you that it’s simply an anomaly. ‍Cost – On the other hand, you don’t want to go too wide and incorporate data that isn’t conceivably relevant to the sectors you’re thinking of investing in. 
You want your data to generate more worth than what you invested in it, so be sure to buy only the data you need. Otherwise, you risk going on a wild goose chase for insights that don’t actually exist, or that have no relevance to your strategy. And that can cost you a lot of wasted time and money.‍ Accuracy – Inaccurate data can lead to costly mistakes, especially in the financial sector where the stakes are often very high. Buy-side and sell-side analysts both need to make detailed reports with recommendations on whether to buy or sell investments. So the data they use to make those reports needs to be correct.‍ Freshness – Even if data is accurate, it may not be as valuable if it doesn’t reflect current real-world conditions. Up-to-date data gives analysts an edge when making their reports, as it ensures they are making financial decisions with the most recent information available.‍ Interoperability – Alternative data is almost always more powerful when connected with other alternative datasets. This is because investment decisions often need to be evaluated from multiple angles, and with various factors taken into account. Datasets that can be easily joined or related to each other make analyzing alternative data that much easier. For example, Placekey provides a standard for identifying places on Earth while avoiding the problem of having to match addresses that use different formats and conventions.‍ Detailed attribution – The more information analysts have at their fingertips, the more precise and comprehensive their recommendations can be. Alternative datasets with many detailed attributes give analysts more to work with when generating their models and reports. Here are a few other specific questions you may want to ask alternative data suppliers (or at least yourself) before you buy: Is the supplier the primary source of the data? If not, have they processed or filtered the data in any way, or offered it as-is? How might this affect how you interpret it? 
Is the particular type of data you’re looking for available from multiple sources, or only a few? What makes a certain provider unique? What does a particular dataset actually represent, and how does that specifically relate to a problem you’re trying to solve or a question you want answered? What relationships or patterns can a certain dataset show you, either explicitly or implicitly? What assumptions might you be making about this? Based on how a dataset is presented, how much time and processing work will be required to convert it into a usable state? Based on what you determine a dataset can (and can’t) tell you, can you think of some other creative uses for it? 8 top alternative data providers: leveraging data for deeper insights There’s a lot to think about when trying to get the alternative data that’s right for you. But we don’t want you to get too overwhelmed with considerations and conditions that you end up with “paralysis by analysis”. To give you an idea of what’s out there so you can get started on finding the alternative data you need, here’s a list of some of the top alternative data companies. 1. SafeGraph Cost: charged on a per-dataset basis Major data types: points of interest, building footprints Key use cases: retail investment, consumer insights, risk assessment, real estate investment SafeGraph is one of the top alternative data providers for points of interest. Our datasets include detailed information and accurate spatial representations of millions of commercial buildings, historic monuments, and other landmarks globally. Use our data to build consumer profiles, monitor the performance of (and relationships between) stores and brands, assess liability for insurance purposes, and more. 2. 
HARNESS Data Cost: $0.005/record; charged on a per-dataset basis Major data types: internal documents and communication, address, property, points of interest Major use cases: real estate investment, insurance risk assessment, logistics planning, fraud prevention Harness Data provides three distinct services. First, their PDFx tool allows for extracting actionable data points out of PDF files. This can include elements such as images, contact information, organization names, tables & schedules, and more. Next, their AddressM tool provides address matching services. Mostly, this helps you see if differently-formatted addresses refer to the same place. You can also use it to find out if a place pointed to by an address even exists. Finally, their “Addressable” dataset is the most comprehensive database on commercial properties in the UK. As a sample, they also offer a free database of price per square meter (PPSM) data for over 17 million properties across England and Wales. All of this information is ideal for investing in real estate, assessing property liability, planning precise logistics, rooting out fraud, and more. 3. Veraset Cost: contact for pricing Major data types: foot traffic, visit attribution Key use cases: retail investment, real estate investment Veraset provides two sets of alternative data: Movements and Visits. Movements uses multiple sources to get an estimate of human traffic around points of interest in over 150 countries around the world. Meanwhile, its Visits dataset combines foot traffic data with polygons of over 6 million points of interest in the US to show how much patronage a business is getting. Both of these datasets can be compared against company financial information to form predictive models about how they will do in the future, long before it becomes news. 4. 
Transparent Cost: $0.05/record; charged on a per-dataset basis Major data types: vacation rental properties Key use cases: real estate investment, hotel competitor research, tourism marketing The short-term rental market has exploded since the introduction of rental property booking websites like Airbnb and HomeAway. With that in mind, Transparent was created to provide a granular breakdown of what’s on the rental housing market. They have data on over 35 million listings worldwide across all the major property booking companies, including over 50 attributes like address, property type, number of bedrooms, amenities, pricing, and more. This data is great for those looking to invest in real estate, but it’s also helpful for those keeping an eye on the tourism industry. It can even be used to see what kind of competition hotels in a particular area have. 5. Vertical Knowledge Cost: $800-$7,000/month (average is $2,500-$3,000/month) Major data types: automotive transactions, online transactions, home rentals, company metrics Key use cases: real estate investment, automotive investment, equity research, employment trends Vertical Knowledge stands out among alternative data vendors by specializing in the privacy-compliant collection of publicly-available information on the internet. They also provide a platform on which to filter and process this data so you can gain actionable financial insights. Their datasets include things like best-selling books, car rentals and purchases, air and sea travel, and short-term home rentals. These kinds of datasets could be useful if you’re investing in the automotive, aviation, nautical, or real estate industries. Or maybe you just want to get a read on public sentiment about what’s hot and what’s not. 6. 
Greenwich.HR Cost: $0.05/record; charged on a per-dataset basis Major data types: financial, employment Key use cases: workforce analytics, talent acquisition and management In addition to having standard financial data, Greenwich.HR is one of a handful of alternative data firms that has data on corporate hiring practices. Their comprehensive database has information on jobs from over 5 million companies in over 200 countries worldwide, spread out over 85,000 different job attributes (including ~80% completion rate on pay data). It’s another way to get a faster and different perspective on companies’ financial outlooks by seeing how many people they’re hiring, what positions they’re hiring for, how much they’re paying their employees, and much more. Of course, you can also use this data in an inwards-facing capacity to make sure that your own organization remains competitive in the labor market. 7. Infutor Cost: $8,000-$10,000/month Major data types: real estate, automotive transactions, consumer demographics, email Key use cases: equity research, automotive investment, consumer insights, real estate investment If you’re looking to get a read on how people in the US are spending their money, look no further than Infutor. They’re a leader in US consumer identity management and resolution, combining precise property profiles with numerous customer demographic attributes. Their alternative data gives an accurate view of who US consumers are, where they’re shopping, and what they’re buying. Their property data can also be useful for real estate prospectors who want to see who’s in the market and where they’re moving. And Infutor also has automotive transaction data for those looking to invest in car and truck manufacturers. Watch our webinar with Infutor to see how alternative data can improve property analytics. 8. 
ClimateCheck Cost: $0.05/record, charged on a per-dataset basis Major data types: environment, weather, real estate Key use cases: real estate investment, risk assessment ClimateCheck is a unique entry in our list of alternative data providers. It combines US property data with historical weather and climate data, processed through over 25 aggregated international climate change models. The end result is a sophisticated dataset on how vulnerable different areas of the US are to natural disasters caused by climate change. These include heat waves, wildfires, violent storms, droughts, and floods. That’s useful information for risk management when investing in real estate or writing insurance policies. --- Valuable alternative data is out there; you just need to know where to look to find enough of it nicely packaged and ready to go. On that note, if you’re wondering if geospatial data on points of interest and foot traffic could be useful to your financial analytics, check out SafeGraph to try out some samples. FAQ’s 1. What is cross shopping behavior?Cross shopping behavior refers to the overlap in where consumers spend money across different brands within the same time period. 2. Does cross shopping behavior include online spending?Yes. The data includes affinities with online merchants, delivery services, streaming platforms, and payment services. 3. How can businesses use cross shopping insights?They can be used for competitive intelligence, understanding customer preferences, and evaluating site selection opportunities. 
#### Analyzing Spending Behavior at Multiple Levels of Granularity   Key Takeaways Consumer spending analysis at the transaction level exposes patterns that brand-wide averages often conceal. Regional and store-level variations show that consumer spending cannot be reliably inferred from national trends alone. Separating spend per transaction from transaction volume reveals distinct drivers behind changes in consumer behavior. Granular consumer spending analysis explains regional outliers that appear anomalous in aggregated reporting. Linking Spend data with POI attributes strengthens location-specific insights and improves interpretation of spending trends. ICYMI, SafeGraph recently launched Spend data. This dataset contains aggregated and anonymized information on debit and credit card transactions by point of interest (POI). Other transaction data providers typically offer only brand-level insights, not store-level ones. One strong use case for the data is breaking out sales revenue more precisely. Sure, you can look at a company’s annual report to get an idea of what its revenues are overall, but what if you want to know how that company is doing regionally? Or whether there are any standout stores? Or maybe you’re thinking about opening a new location and wondering where to put it. There are meaningful differences between national and regional performance, and Spend data can help you uncover them. Analyzing Spending Behavior by Brand Let’s take Walmart as an example and look at data on spend per transaction over the first nine months of 2021. This is all same-store data; in other words, it uses only the Walmarts present in Spend data in both January 2021 and September 2021. Then, matching the way that Walmart does same-store analysis, we also drop any stores that opened in the most recent year, using the SafeGraph opened_on field. Let’s start things off by looking at the national picture in the United States. 
How has Walmart fared throughout the first three quarters of 2021? Walmart national spend per transaction, 2021 We can see a few things in this. First, and most obvious, is that there’s a pretty big spike in late March and early April, with spend per transaction jumping from about $36 to about $40, but it’s only temporary and we’re back down to $36 by May. Then there seems to be a slow and slight decline, ending up with per-transaction spend values of more like $34 by the end of September. Notably, this is all based on spend per transaction. We can (and will) use this metric to look for Walmarts where people tend to spend a lot in each transaction, but this doesn’t necessarily mean that Walmart’s fortunes have looked worse over the year. The declining spend per transaction is accompanied by more transactions, to the extent that if we look at overall national Walmart spend, it’s up by 8.5% from January to September. This is nearly identical to the Walmart 8.4% same-store growth from January through October reported by the company. But beyond the aggregate, Spend lets us break things down. In particular, let’s see whether spend-per-transaction seems to be evolving similarly over the year in different states. Analyzing Spending Behavior by State Walmart spend per transaction by state, 2021 We can already see how much detail we have at our fingertips. And we can also see where some interesting trends seem to be popping out. North Dakota, for example, has an interesting dip in its spend per transaction appearing throughout June that doesn’t seem to be there for the other states. Let’s focus in a little more closely on North Dakota and its 14 different Walmart locations in the data. Walmart spend per transaction in North Dakota, 2021 Let’s take a look at the individual stores in North Dakota and see what might be driving this dip - is it widespread across all the North Dakota Walmarts? Is one store really taking a dive and the others are fine? 
Walmart spend per transaction by North Dakota store We see some interesting results like the enormous one-day spike in Jamestown, which might be an interesting story for another day. But for now we’re looking for a dip in July. Where did it come from? Not every store, it looks like. But both Fargo locations, Dickinson, Minot, and Wahpeton all seem to have this same sort of July-dip pattern. Analyzing Spending Behavior by Store Location We’ve pinned down three specific locations that seem to be driving the odd North Dakota behavior. What’s going on at these stores? Is this a regional thing? We can look at where the dip stores are relative to the others. North Dakota Walmart locations The dips appear to be spread throughout the state, but a lot of the activity is concentrated around Fargo in the southeast of the state, which includes both Fargo store locations and the Wahpeton store. Let’s focus on those three then. What’s driving the change in spend per transaction? Less spend overall? More small transactions? Both? Let’s focus in a bit more narrowly around the July dip and separate these two things out. We see two different stories here (and a third trail of bread crumbs we could follow on another day - what shopping bonanza is going on in Fargo 1 in May? Looking at finer grained data does tend to raise an infinite number of questions we’d love to follow up on). In Wahpeton, the dip seems powered by a drop in actual spend for a short period in July. But in Fargo, total spend seems pretty consistent, while overall transactions went up. People in Fargo were making lots of little purchases in July. If we’re willing to extrapolate a little further, we might be saying something interesting here about the effectiveness of Walmart’s 4th of July sales in Fargo as opposed to in other regions. 
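The Wahpeton-versus-Fargo contrast comes down to simple arithmetic: spend per transaction is total spend divided by transaction count, so a dip can come from either term. Here is a minimal Python sketch of that decomposition. The figures are invented (they are not actual Spend data), and `classify_dip` with its 5% tolerance is a hypothetical helper, not part of any SafeGraph tooling.

```python
def spend_per_transaction(total_spend, transactions):
    """Average ticket size: total spend divided by number of transactions."""
    return total_spend / transactions

def classify_dip(before, after, tolerance=0.05):
    """Label what moved between two periods.

    before/after are (total_spend, transactions) tuples. tolerance is an
    arbitrary threshold (5%) below which a change is treated as flat.
    """
    spend_change = after[0] / before[0] - 1
    txn_change = after[1] / before[1] - 1
    drivers = []
    if spend_change < -tolerance:
        drivers.append("lower total spend")
    if txn_change > tolerance:
        drivers.append("more transactions")
    return drivers or ["no clear driver"]

# Invented figures for two stores: (total_spend, transactions) per period.
store_a = {"before": (100_000.0, 2_800), "after": (85_000.0, 2_750)}
store_b = {"before": (150_000.0, 4_000), "after": (151_000.0, 4_600)}

for name, store in (("store_a", store_a), ("store_b", store_b)):
    spt_before = spend_per_transaction(*store["before"])
    spt_after = spend_per_transaction(*store["after"])
    print(name, round(spt_before, 2), "->", round(spt_after, 2),
          classify_dip(store["before"], store["after"]))
```

Both invented stores show a lower spend per transaction in the second period, but for different reasons: one from a drop in total spend, the other from a rise in transaction count with spend roughly flat.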
Data-Driven Consumer Spending Insights Getting national trends and statistics for revenues for different brands often isn’t too difficult (depending on which kind of brand you’re looking at). But often the questions we might be really interested in, and the real opportunities, are at finer levels of detail. Looking at Spend data lets us distinguish how spending patterns differ across different regions, or types of location. What do we mean by types of location? That could be anything to do with the different kinds of POIs. POIs differ by geography, as we saw on the map above. They also differ by all sorts of other stuff. How long they’ve been open, or whether they’re in a mall - three of our Walmart locations in North Dakota are a part of larger shopping centers, and none of these three had a July spend-per-transaction dip. Interesting! There are a lot of possibilities, not just for broader trends, but information about individual locations, much of which could be explored directly by linking in SafeGraph Places, like we did here with the latitude and longitude of each of the Walmart locations for our map. Schedule a demo to learn more about SafeGraph Spend. FAQ’s 1. What is SafeGraph Spend data? Spend data is an aggregated and anonymized dataset of debit and credit card transactions mapped to points of interest. 2. How is Spend data different from brand-level transaction data? Unlike brand-only datasets, Spend data allows analysis at the individual store level, enabling more precise regional insights. 3. Why analyze spend per transaction instead of total revenue? Spend per transaction helps distinguish between changes driven by purchase size versus changes driven by transaction volume. 4. How does granular spend analysis help businesses? It supports decisions around site selection, regional strategy, promotions, and understanding local consumer behavior. 5. Can Spend data be combined with other datasets? Yes. 
Linking Spend data with POI attributes and location data provides additional context for interpreting spending patterns. #### Announcing SafeGraph Global Places Key Takeaways SafeGraph Global Places provides standardized POI data for brands across countries worldwide. A single global schema reduces data cleaning and preparation for international analysis. The dataset supports use cases such as site selection, competitive intelligence, and investment research. SafeGraph maintains its focus on data quality, transparency, and frequent updates at a global scale. POI Data for Any Brand, Anywhere in the WorldFor the past five years, we’ve worked tirelessly to build points of interest (POI) data for places that data scientists care about. This started as places where people spent time or money in the US, and has since expanded to include industrial locations, apartment complexes, and ATMs in the US, Canada, and UK. But this month, we are proud to announce SafeGraph is launching a massive effort to expand its global POI dataset.Introducing: SafeGraph Global Places. Global Places provides every column and attribute from SafeGraph Places, for any brand anywhere in the world. As of September 2021, Global Places includes over 380 brands in 188 countries - and we are only getting started. 
Why We Built Global PlacesWe first built Places data in the US because we tried to build a different dataset that required POI data, and we couldn’t find a reliable source. Once we fulfilled that need in the US, Canada, and UK, we decided to set our sights on the entire world.The world is continuing to globalize and more and more organizations need to make strategic decisions about where to invest their time, money and resources both domestically in the US, as well as abroad. For example, if you are a large retailer or restaurant chain, you are investing in many different geographies globally. And it’s important to have a global strategy backed by reliable data.However, POI data can vary greatly from country to country with respect to its availability and level of quality. Each individual country may have a local vendor for POI data, but aggregating or stitching them together can be a very difficult and time consuming task for a data science team. Add in powering models with differing data schemas, and analyzing global places becomes a major endeavor.Global Places data enables users to easily compare and contrast markets with confidence. With one data schema for all worldwide POIs, Global Places drastically reduces time spent preparing and cleaning data for analytics and ingestion into data science models. SafeGraph’s proven methodology for building POI data now delivers the accuracy, precision, and freshness required for reliable analytics at a global scale.“We are thrilled to launch the Global Places dataset and bring these highly sought-after POIs to the market. 
Data scientists from large corporations, small organizations, and academic institutions alike have been asking for a truly global POI dataset, and we’re committed to continuously increasing our coverage to help power their analytics.” - Stu Kendall, Head of Product MarketingWhat’s Included in Global Places?Global Places data includes attributes on each branded POI including:Location name, address, lat/longBrand and category informationPhone number and open hoursWhen the location opened or if/when it closedStock market symbolAnd more - view the full schema here.How Can Global Places Be Used?There are endless use cases for POI data, but we tend to see Global POIs used for site selection, competitive intelligence, and investment research, mainly by:Tracking open/close over time and the expansion/contraction of major brand footprintsUnderstanding the geographic distribution of brands and/or POI categoriesExplore Global Places coverage here: The SafeGraph PromiseWhile we’ve expanded globally, we’ve maintained our fierce commitment to quality and transparency. SafeGraph data has a 95+% brand recall verified with first-party data, and we always share our schema publicly at docs.safegraph.com.Get Started with Global Places DataWant to dive deeper? Reach out for a free consultation with our data experts to see how Global Places data can transform your organization. FAQs 1. What are SafeGraph Global Places? SafeGraph Global Places is a global POI dataset that includes the same attributes as SafeGraph Places for brands across multiple countries. 2. How is Global Places different from regional POI datasets? It uses a single, consistent schema worldwide, making it easier to compare markets and analyze brands across geographies. 3. What types of use cases does Global Places support? Common use cases include tracking brand expansion, site selection, competitive analysis, and global investment research. 4. How does SafeGraph ensure data quality globally? 
SafeGraph applies the same proven methodology used in its regional datasets, with a strong focus on accuracy, freshness, and transparency. #### Are US Inflation Trends Reflected in SafeGraph Spend? Inflation is a complex topic, and with inflation rising to its highest in the past 40 years in February, it’s a topic being discussed a lot. Inflation essentially means the US dollar does not hold as much value as it used to because the prices of goods and services are rising. For example, Dollar Tree raised its prices 25% back in November of 2021. Although the reasons and calculations which lead to inflation are often expertly debated, there’s one thing we know for certain - it impacts each and every consumer. The recent conversations around inflation led SafeGraphers to ask: Is inflation being reflected in SafeGraph’s Spend data? How is this impacting consumer behavior? To find out if we could answer these questions, we used SafeGraph Spend. SafeGraph Spend is an anonymized, permissioned, and aggregated transaction dataset that allows data scientists to uncover insights on how spending at individual locations changes over time. In this article, we compare consumer spending across fast food, fuel, and furniture industries from December 2020 to February 2022 to glean whether SafeGraph Spend is signaling for inflation and how the data shows consumers are reacting to it. 
Our Analysis Fast food prices are going up That’s right, not even your favorite dollar menu item or late night meal special is safe from rising prices. But even with the increase in prices, does that mean consumers have stopped going to grab their staple quick bite? It appears not; their bills just got a little larger. According to SafeGraph Spend data, the average median spend per transaction at six different fast food restaurants increased by roughly 5% or more since January 2021. The lowest increase among the six brands was at Chipotle Mexican Grill, which showed a 4.8% increase. This increase tracks rising prices over the past year, and it held even despite possible substitutions to cheaper menu items. These findings showed us that consumers were not changing spending habits at fast food restaurants, but their dollars also weren’t as valuable as they used to be for one transaction - a reflection of inflation. Spending on fuel is increasing across the US Similar to fast food, fuel is a staple good for many consumers and it is difficult, if not impossible, to ignore the soaring gas prices. However, even as prices increase, consumers are often left with no other choice than to fuel up since personal vehicles have become a modern day necessity for many. Across counties nationwide, SafeGraph Spend shows the average per-location median spend per transaction at gas stations is increasing. 30% of counties in January 2022, and 40% of counties in February 2022, had an average increase in median transaction price of 25% or more compared to January 2021. Now, you may be wondering, what are consumers not spending more on as inflation rises? Fast food and fuel are both examples of inelastic goods; in other words, their price does not strongly impact their demand. More money spent on those goods means less money spent elsewhere, but where? Well, let’s take a look at an elastic good - think luxury items. 
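The county figure for fuel is a share-above-threshold statistic: for each county, compare the average per-location median spend per transaction with its January 2021 baseline, then count the counties that rose 25% or more. A sketch of that calculation in Python follows. The county values are invented, and the dictionary layout and function names are assumptions for illustration rather than SafeGraph’s schema or method.

```python
def pct_change(baseline, current):
    """Fractional change from baseline to current (0.25 means +25%)."""
    return (current - baseline) / baseline

def share_at_or_above(baseline_by_county, current_by_county, threshold=0.25):
    """Share of counties whose value rose by at least `threshold`.

    Only counties present in both periods are compared.
    """
    shared = baseline_by_county.keys() & current_by_county.keys()
    if not shared:
        return 0.0
    hits = sum(
        1 for county in shared
        if pct_change(baseline_by_county[county], current_by_county[county]) >= threshold
    )
    return hits / len(shared)

# Invented values: average per-location median spend per transaction, in dollars.
jan_2021 = {"county_1": 20.0, "county_2": 24.0, "county_3": 30.0, "county_4": 18.0}
feb_2022 = {"county_1": 26.0, "county_2": 27.0, "county_3": 37.5, "county_5": 22.0}

share = share_at_or_above(jan_2021, feb_2022)
print(f"{share:.0%} of comparable counties rose 25% or more")
```

Restricting the comparison to counties present in both periods mirrors the same-store logic used elsewhere: a county that appears or disappears between periods would otherwise distort the share.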
##### Consumers are changing their behavior

Unlike fast food or fuel, furniture can be considered a luxury good, and therefore we would expect to see a stronger demand reaction to price increases. Luxury goods have higher price elasticity of demand. As furniture and home goods prices have been going up over the past year, we’d expect consumers to be putting off furniture purchases as their disposable income decreases with inflation.

Using SafeGraph Spend, we found that US furniture store locations had fewer customers in January and February 2022 than in the same months of the previous year. In percentage terms, the median location had a year-over-year decrease in customers of 40% in January 2022 and 13% in February 2022 compared to the same month in 2021. There was even a 15% decrease in customers at the median furniture store in December 2021, as inflation was beginning to reach record highs, as compared to December 2020.

But how can we be sure this is signaling that consumers are changing behavior as a result of inflation? To check, we completed the same analysis using a density graph for the fast food industry.

At fast food restaurants, we don’t see the same extreme decrease in customers despite the increase in prices. We see a small decrease in customers in January, and less so in February, but these decreases are far smaller than those at furniture stores. In percentage terms, the median fast food location had a year-over-year decrease in customers of 19% in January 2022 and 4% in February 2022 as compared to the same month in 2021. Looking at December 2021, we see more or less the same number of customers at fast food restaurants as compared to December 2020. This is in line with our expectations for how consumer spending behavior shifts due to inflation.

##### Our Conclusion

Did our analysis show anything groundbreakingly new?
Maybe not, but it did show us that SafeGraph Spend is capturing inflation’s impact based on how we’d expect consumer spending to change as a result of this period of high inflation. For a group of data nerds, we thought this was pretty cool and hope you did too.

Download a free sample of Spend data to test out the data yourself. If you think SafeGraph Spend may be valuable for your company, reach out to our data experts to learn more and receive a personalized consultation around your use case.

#### Bad Data is Bad for Business

##### Key Takeaways

- The impact of bad data on business is that it leads to flawed decisions, wasted time, and significant financial loss.
- Much of the cost of bad data comes from hidden effort spent cleaning, correcting, and validating datasets.
- Location-based data is especially vulnerable to inaccuracies due to lack of standardization.
- Even small errors in POI or geometry data can have outsized business consequences.
- Investing in clean, accurate data upfront reduces risk and improves long-term decision-making.

##### Don’t Let Bad Data Hurt Your Business

The data you’re relying on to make important business decisions could actually be doing a lot more harm to your organization than good.

We know, everybody loves data. There’s definitely no shortage of it these days. Unfortunately, as we’ve said before, not all data is created equal. That’s why we talk about the importance of data standards so often and even created a handy data evaluation checklist to help you make smarter and more informed data choices.

As with many things in this world, there is a form and function to data, too. We talk about its form through the lens of how to make a dataset usable, including the flurry of technical problems that can result from attempting to use dysfunctional datasets. Data’s function, however, is a bit more of a moving target. With the right data in hand, you can do amazing things.
With bad or inaccurate data, you can quickly go down the wrong “rabbit hole” and make ill-informed decisions that do more harm than good.

Sadly, bad data gets used all the time, often without organizations even realizing it. And it can lead businesses to draw inaccurate conclusions and make costly long-term errors. This is why, especially at a time when data is basically everywhere, it’s so important to source the right and most accurate data at all times. Not doing so simply isn’t worth the consequences.

##### Where does bad data come from?

To answer this question, we first need to take a step back and look at the industry’s big picture. In 2016, the big data market was estimated to be worth $136 billion per year. IBM also found that, in the same year, using bad data could cost the U.S. economy around $3.1 trillion per year, if not more. Just imagine what this number is globally today.

While this clearly paints a horrible picture of negative ROI, that’s not what should concern you. There are two important things at play here: 1) a lot of bad data is circulating around, and 2) too much bad data is being used, consuming valuable resources and leading to poor decisions.

Most bad data is a byproduct of either human error or a lack of data expertise by the people using data to draw insights. As Thomas Redman in the Harvard Business Review puts it:

“The reason bad data costs so much is that decision-makers, managers, knowledge workers, data scientists, and others must accommodate it in their everyday work. And doing so is both time-consuming and expensive. The data they need has plenty of errors, and in the face of a critical deadline, many individuals simply make corrections themselves to complete the task at hand. They don’t think to reach out to the data creator, explain their requirements, and help eliminate root causes.”

Think about how many people in one organization alone are doing things like this. Then multiply that by every organization in the world.
The reality of that is truly bleak. But it should come as no surprise that, with so many cooks in the data kitchen, errors often slip through the cracks.

This is what Redman refers to as “hidden data factories,” which illustrates an important point about the economic impact of poor quality data. It’s not merely about the decisions being made that are informed by bad data. Rather, a lot of the waste, financially speaking, stems from the enormous amount of time spent by data professionals cleaning and organizing data, spotting and fixing errors, and confirming sources. If the data were clean, this would be unnecessary.

For knowledge workers, this kind of “quality control” work can consume up to 50% of their time. For data scientists, that number easily climbs to 60%. In both cases, this kind of manual labor is simply not a good use of their time and an incredible waste of highly valuable resources.

What about location-based data, more specifically? One study found that 59% of location data is inaccurate, while in another study, 25% of respondents said there wasn’t enough clarity around the sources of location data collected.

Figure: Placekey simplifies address matching for data scientists.

Truth be told, location-based data, whether around points of interest (POI) or building footprints (Geometry), is quite complicated and not always easy to work with. One of the biggest problems with these datasets is that, until Placekey came along, building addresses and specific geographic locations had not been standardized. This makes it easier to miss important and timely details, such as whether a business is open or permanently closed. This has been a major sticking point for anyone working with location data during the pandemic.
Taking it a step further with building geometry: if the polygons aren’t drawn to accurate dimensions, it can create a domino effect of inaccuracies and inconsistencies that, when coupled with flawed POI data, make it impossible to measure foot traffic accurately.

##### How can bad location-based data negatively impact businesses?

Different businesses use location-based data in different ways. Retailers use it in trade area analysis to make revenue-impacting decisions around site selection and de-selection. Financial analysts use it in investment research or company valuations. Marketers across all sectors rely on it heavily to inform how, when, and where they place digital ads, OOH ads, and mobile ads. Unfortunately, one kink in the chain can spell disaster and flush a lot of money down the drain.

Let’s take retailers as an example. Bad data used to inform site selection may have told you that a brick-and-mortar location under consideration was near a complementary business, one that could create a beneficial organic foot traffic “halo effect” for your business. But the data was not up to date and failed to reveal that this business, potentially the factor that tipped your decision towards this location, had actually been closed for three months, and you don’t realize it until you start setting up shop. The revenue impact from the loss of shared foot traffic alone could turn a once smart investment into a total “lemon.”

Figure: Drive times can be useful for determining trade areas, but only if the POIs they are created from are correct.

As another example, let’s look at how this affects marketing. You might decide to launch a mobile ad campaign that triggers promotional messaging to consumers when they’re within a pre-determined geofenced area. But what if the geofence was created using bad data? Well, for starters, it could trigger notifications prematurely, when consumers are not within ideal proximity to your business.
This creates an incongruent and confusing customer experience.

The list of examples is endless, but the key takeaway here is simple. Even the slightest inaccuracies in location-based data can cause costly errors that can’t be recouped.

##### Why do businesses keep using bad data?

Continuing with the marketing example, one finding is that 62% of organizations use marketing data that is up to 40% inaccurate to plan their advertising and communications campaigns. They do this even though 94% of businesses have said that they suspect their customer data is inaccurate.

It starts to make you wonder: If so many people know that they’re using bad or suspicious data, why do they keep on using it?

Not that this is a good excuse for doing so, but letting bad data dictate business decisions is simply the path of least resistance. Sometimes businesses just need data and insights in a pinch and don’t have the time or resources to ensure that it’s 100% clean and accurate. Of course, that’s bound to happen from time to time. But this kind of oversight should be the exception and never the rule.

Using bad data like this creates a slippery slope around everything you do, from marketing campaigns, customer acquisition efforts, and resource management to business expansion, investment choices, and pretty much anything else that data can inform. And the errors attributed to bad data can waste valuable marketing dollars, reduce conversions, increase customer acquisition costs, tank profits, and more.

Simply put, taking action on poor quality data is akin to expecting someone to make good on empty promises. If the data isn’t clean and accurate from the start, you can’t go into a business decision or expect a specific outcome with a high degree of confidence. If anything, just anticipate the worst and, if all goes well in the end, be pleasantly surprised by a positive end result.
But why leave it up to chance when you could just get it right the first time?

##### Don’t ever take the risk of using bad data

At SafeGraph, our mission is to make it easier than ever to access clean and accurate data. Our team never compromises on quality, so that your business or organization can glean meaningful, relevant, and actionable insights, based on quantitative truth, to help make more informed decisions, allocate budgets and resources wisely, and even spark new innovations.

Our entire SafeGraph Places dataset is expertly curated by our team every month so that the data is always up to date and immediately usable on your end. That means less time (and money) spent by your internal resources on cleaning the data, which in turn gives you more opportunities to improve your marketing, sales, and business analytics over time.

To see what it’s like to work with truly clean and accurate data, and why it makes all the difference, schedule a demo with a SafeGraph expert.

##### FAQs

1. What is considered bad data? Bad data includes inaccurate, outdated, incomplete, duplicated, or poorly structured information that leads to unreliable insights.
2. How does bad data affect business decisions? It can result in poor site selection, ineffective marketing campaigns, misallocated budgets, and incorrect investment decisions.
3. Why is location-based data especially prone to errors? Location data lacks universal standards, making address matching, POI status, and geometry accuracy difficult to maintain without rigorous validation.
4. Why do companies continue using bad data despite knowing the risks? Time pressure, limited resources, and short-term needs often push teams to rely on imperfect data rather than validate it thoroughly.
5. How can businesses reduce the impact of bad data? By sourcing data from reliable providers, prioritizing accuracy and freshness, and minimizing internal data-cleaning overhead.
#### Best Data Science Podcasts: Analytics, Management, Visualization, and More

##### Key Takeaways

- Data science podcasts cover technical, business, and industry topics across AI, analytics, and geospatial data.
- The best podcast for you depends on your experience level and learning goals.
- Many leading data and cloud companies host shows featuring real-world practitioners.
- Podcasts offer a convenient way to stay current on trends without formal coursework.

Podcasts have become a great way to learn new information at your own convenience, whether you’re on the go or in the middle of something else. If you're looking to learn more about how big data analysis is reshaping our businesses and our lives, the field of data science has a number of podcasts dedicated to the subject.

Data science is a broad and burgeoning field, and there are many more podcasts on related topics than you may think. So how do you find one that might interest you?
We’ll give you a hand by covering some things to consider when looking for a good data science podcast and provide suggestions of some of the more popular shows.

Here’s a rundown of the program:

- What to look for in a data science podcast
- 20 best data science podcasts to listen to

Let’s turn on, tune in, and get going.

##### What to Look for in a Data Science Podcast

When you’re trying to find a podcast on data science (or any kind of podcast or infotainment program, really) that you’ll enjoy, there are plenty of factors you’ll want to consider. Here are some questions to ask when you come across a podcast you might find interesting.

- Target audience: Who are the subject matter and tone directed towards? Are they meant for non-experts and beginners, or are they more technical and geared towards specific applications and business roles?
- Host: Who is the person (or people) hosting the show? Do they have a background in the field(s) they’re covering, or are they perhaps an author or journalist who specializes in covering data and technology? Are they an engaging speaker to listen to? Do they seem to know their stuff?
- Guests: What kinds of people are brought in as guests on the show? Are they exciting, high-authority people in relevant fields whom you would want to hear from? Are they reputable people you can trust?
- Length: How long does an average episode of the podcast last? How much time do you have to devote to listening intently to the content?
- Frequency: How often does the podcast release new episodes? How long do you have to wait to get your next data science fix?
- Availability: How many platforms is the podcast available on? Do you have various places where you can look for new episodes or listen when another service is unavailable? Or do you only have limited options to work with?

##### 20 Best Data Science Podcasts to Listen to

If you don’t know where to start and would just like some suggestions on the best big data podcast to listen to, we’ll give you a hand.
We’ve scoured the Internet for recommendations, and found these 20 programs rank among the most popular.

###### 1. World of DaaS

- Listen: Apple | Spotify | YouTube | Podbay.fm
- Average length: 40-60 minutes
- Average frequency: 1 podcast/week

World of DaaS (Data as a Service) is a big data analytics podcast hosted by SafeGraph’s CEO, Auren Hoffman, covering a range of data-related topics like big data analytics, data visualization, data curation, and more. In it, Auren talks to leaders in business and technology to gain insights into what makes big data companies tick: where they get their data from, how they process and store their data, and what breakthroughs they’re planning to use their data to make next.

Best episode: Jack Dangermond: Building Esri

Auren talks to Jack Dangermond, CEO of Esri, about how he founded the world’s leading geospatial company through a combination of niche software pioneering and innovative management practices.

###### 2. Esri & The Science of Where

- Listen: Stitcher | Apple | Spotify | Google | TuneIn | Blubrry
- Average length: 20-25 minutes
- Average frequency: 2-3 podcasts/month

This is a big data podcast that, as the name suggests, focuses on geospatial information. Leading geographic information systems (GIS) company Esri talks to leaders in the technology and business industries about how they are using location data to power new “smart” technologies that are changing the digital landscape.

Best episode: How UPS Strengthens Customer Connection with Spatial Analytics

The Esri team talks to Jack Levis, Senior Director of Process Management at UPS, about how the courier service uses geospatial data analysis to streamline its operations and improve customer satisfaction.

###### 3. The Official AWS Podcast

- Listen: Apple | Spotify | iHeartRadio | Stitcher | Amazon Music
- Average length: 15-30 minutes
- Average frequency: 6-7 podcasts/month

The official podcast for Amazon Web Services (AWS) is for those looking for the latest news and trends in data storage, application security, hardware/software infrastructures, and more. Whether you’re a web developer or other digital technology professional, you’ll find topics of interest on the AWS Podcast, from machine learning to cloud solutions to open source platforms and beyond.

Best episode: Are You Well-Architected?

Philip ‘Fitz’ Fitzsimons talks about how AWS’s “Well-Architected” cloud development tool has been made available to the public, and how it can help businesses with the building, monitoring, and collective improvement of cloud-based IT infrastructures.

###### 4. Identity Revolution

- Listen: Apple | Spotify | Google | Amazon Music
- Average length: 15-30 minutes
- Average frequency: 2 podcasts/month

Identity Revolution is a data management podcast from Infutor, a leader in consumer identity management. The podcast discusses how big data technology and consumer analytics drive today’s marketing efforts.

Best episode: Exploring the MarTech, AdTech Data Ecosystem with Auren Hoffman of SafeGraph

SafeGraph CEO Auren Hoffman discusses his company, the address data standard it developed (Placekey), and how he predicts data companies will continue to expand their offerings of services and tools, with an emphasis on breadth of data over accuracy, to stay competitive within the industry.

###### 5. Rise of the Data Cloud

- Listen: Podtail | Apple | Spotify | YouTube
- Average length: 20-30 minutes
- Average frequency: 2-3 podcasts/month

Award-winning author and journalist Steve Hamm hosts this data governance podcast, brought to you by cloud data management platform provider Snowflake. Steve talks to figures in leading businesses about how they’ve leveraged big data to take their companies to the next level.

Best episode: Unlocking the Power of AI with Dan Wright, President and COO of DataRobot

Dan Wright talks about his experience in getting more value out of machine learning, neural networks, and artificial intelligence systems to successfully run disruptive technology companies such as People.ai, LogDNA, AppDynamics, and DataRobot.

###### 6. Data Brew

- Listen: Google | Apple | Spotify | Amazon Music | Stitcher | Podcast Addict | Castbox | Podchaser
- Average length: 25-40 minutes
- Average frequency: 1-3 podcast(s)/month

From data and AI platform Databricks comes this machine learning and data engineering podcast. Denny Lee and Brooke Wenig interview leaders in both the corporate and academic worlds about various topics in data science, with each season covering a different theme.

Best episode: BI on Data Lakes: Making It Real for Retail

Denny and Brooke talk to Lara Minor from Columbia Sportswear about how transitioning from data warehouses to data lakes has helped the company better manage purchases, analyze reviews, and forecast consumer demand for their apparel.

###### 7. DeepMacro: Future of Finance

- Listen: Apple | Spotify | Breaker | Castbox | Google | Pocket Casts | RadioPublic | Anchor.fm | YouTube
- Average length: 20-40 minutes
- Average frequency: 1-2 podcast(s)/month

This podcast on data analytics examines the intersections between big data science, artificial intelligence, and marketing. A unique feature of this podcast is that it often focuses on developments in these fields from Asia, which DeepMacro feels do not get enough attention in the US.

Best episode: Geolocation Data with Auren Hoffman, CEO of SafeGraph

The DeepMacro team talks with Auren Hoffman, CEO of SafeGraph, about mobile device GPS data and how it has been used during the COVID-19 pandemic to monitor the spread of the virus and determine which places are safer to visit than others.

###### 8. Data Skeptic

- Listen: Apple | Spotify | Google | Castbox | Player.fm | Pocket Casts | Podcast Addict | TuneIn | Amazon Music | Stitcher
- Average length: 25-35 minutes
- Average frequency: 3-5 podcasts/month

The Data Skeptic podcast features discussions and interviews with experts in statistics, data science, machine learning, artificial intelligence, and related fields. Host Kyle Polich approaches each conversation from critical and scientific angles, separating fact from fiction and judging what works and what doesn’t.

Best episode: Quantum Computing

Kyle talks with Scott Aaronson, a professor at the University of Texas at Austin, about what a quantum computer is, including its capabilities, potential applications, and limitations.

###### 9. Data Crunch

- Listen: Apple | Google | Podtail | Podchaser | Repod.io
- Average length: 20-30 minutes
- Average frequency: 1-4 podcast(s)/month

The Data Crunch podcast focuses on practical applications of new technologies: data science, machine learning, deep learning, artificial intelligence, and more. Its hosts talk to entrepreneurs and tech industry experts about how they work and what they’re working on to make the world a better place, whether it actually works or not. Its accessible and entertaining approach to topics makes this a good data science podcast for beginners.

Best episode: Getting into Data Science

Ginette and Curtis from Data Crunch talk to three people who recently became data scientists to get their takes on how to get hired as one, from learning the subject matter to navigating the job application process.

###### 10. Software Engineering Daily

- Listen: Apple | Spotify
- Average length: 45-60 minutes
- Average frequency: 5 podcasts/week

Jeff Meyerson hosts this software-focused podcast to discuss technical topics with leaders in technology industries. His goal is to immerse his listeners in the technical aspects of how software works so they can get a little bit more knowledgeable each day. That makes it a decent podcast to learn data science from, too.

Best episode: DaaS with Auren Hoffman

Jeff talks with SafeGraph CEO Auren Hoffman about his company and why it’s one of the few successful ‘data as a service’ businesses.

###### 11. Not So Standard Deviations

- Listen: Apple | Spotify | Podbay.fm | Google
- Average length: 50-70 minutes
- Average frequency: 2-3 podcasts/month

Roger Peng and Hilary Parker host this data analytics podcast that spotlights the latest news in data science from the worlds of academia and business.

Best episode: This Week in Algorithmic Bias

Hilary and Roger finish their conversations from the previous episode on learning R as a first programming language and licensing open-source software, then talk about how algorithms could be used to discriminate against vulnerable populations in areas such as housing and healthcare.

###### 12. Data Stories

- Listen: Apple | Spotify | Stitcher | Podchaser | Podcast Addict | Podbean | Google
- Average length: 40-60 minutes
- Average frequency: 1-2 podcast(s)/month

This is primarily a data visualization and data analysis podcast. Hosted by Enrico Bertini and Moritz Stefaner, it aims to make the ways data is affecting our everyday lives understandable at a non-technical level.

Best episode: Big Data Skepticism with Kate Crawford

Kate Crawford, Principal Researcher at Microsoft Research, joins the Data Stories podcast to discuss whether having more data always makes for better and smarter decisions, or gets humanity any closer to objective truth.

###### 13. SaaStr Podcasts

- Listen: RadioPublic | Apple | Spotify | Player.fm | Podyssey.fm | Repod.io
- Average length: 20-30 minutes
- Average frequency: 2 podcasts/week

This is a data analyst podcast with a focus on software as a service (SaaS). SaaStr brings in leading minds in building, managing, and investing in commercial software to teach listeners their strategies for establishing and growing their companies.

Best episode: SafeGraph CEO Auren Hoffman: How to Build a Unicorn in 8 Simple Steps

The SaaStr team sits down with CEO Auren Hoffman of SafeGraph to discuss how to mimic the strategies of the most successful software companies to build a business model that can skyrocket your company’s growth.

###### 14. O’Reilly Data Show Podcast

- Listen: Apple | Stitcher | Google | Podchaser
- Average length: 30-45 minutes
- Average frequency: 2-3 podcasts/month

The O’Reilly Data Show Podcast, from education company O’Reilly Media, explores topics such as big data, data science, and AI. It looks at the processes behind these technologies, as well as the opportunities they present for future applications and innovations.

Best episode: Trends in Data, Machine Learning, and AI

This 2018 year-end holiday edition of the show features O’Reilly managing editor Jenn Webb talking with host Ben Lorica about trends he saw in big data, machine learning, and AI over the previous year, as well as what he expects to be big in these fields during 2019.

###### 15. Making Data Simple

- Listen: Apple | Spotify | Stitcher | Google
- Average length: 30-45 minutes
- Average frequency: 1 podcast/week

This is the most popular IBM big data podcast, hosted by IBM’s Vice President of Data & AI Development Al Martin. Al brings in a range of experts to talk about the latest ideas and developments in big data and AI, along with their potential impacts on businesses and people alike.

Best episode: 100th Episode Special: Uncovering a Mad Scientist with John Cohn, Part 1

For the 100th episode of the Making Data Simple podcast, Al talks to IBM engineer John Cohn about his career at IBM so far, his role on the Discovery Channel TV show The Colony, and what he plans to work on next.

###### 16. The MapScaping Podcast

- Listen: Podbean | Amazon Music | Apple | Spotify | Audible | Google
- Average length: 30-40 minutes
- Average frequency: 1 podcast/week

This is a data visualization podcast dedicated to cartographers and mapmakers. MapScaping sits down with experts on GIS to discuss the future of collecting, storing, organizing, and presenting geospatial data.

Best episode: Building Geospatial Truth Sets

This episode takes a look inside SafeGraph’s processes as it works to collect and validate location information on every point of interest in the US and Canada.

###### 17. Data Gurus

- Listen: Apple | Spotify | Google | Stitcher | Podchaser | The Podcast App | iHeart
- Average length: 20-30 minutes
- Average frequency: 1 podcast/week

Market research entrepreneur and Infinity Squared CEO Sima Vasa hosts the Data Gurus podcast. She discusses how big data is changing our everyday lives, and talks to successful businesspeople and company representatives about how they keep up with such a rapidly changing industry.

Best episode: Encore Episode: Unpacking Social Intelligence

A repeat of a December 2019 interview Sima did with Menaka Gopinath, Lead of Social Intelligence and Communities at Ipsos, about how companies are using data from social media and other social channels to inform their business decisions.

###### 18. Towards Data Science

- Listen: Apple | Spotify | Podtail | Breaker | Google | Podbean | RadioPublic | Anchor.fm
- Average length: 45-65 minutes
- Average frequency: 1 podcast/week

Hosted by Jeremie Harris, the Towards Data Science podcast dissects some of the biggest issues surrounding data science and AI with the help of leading researchers and businesspeople in those fields.

Best episode: Ben Lorica: Trends in Data Science with O'Reilly Media's Chief Data Scientist

The TDS team sits down with Ben Lorica, Chief Data Scientist at O’Reilly Media (and host of the O’Reilly data science podcast), to talk about recent trends in the evolution of data science, including shifts towards deploying data models and integrating software engineering practices.

###### 19. Something Ventured

- Listen: Audacy | Apple | Spotify
- Average length: 20-30 minutes
- Average frequency: 2-4 podcasts/month

In this podcast, entrepreneurial veteran Kent Lindstrom sits down with the biggest players in Silicon Valley. He discusses their insights not only into the inner workings of California’s legendary digital technology hub, but also into how innovations coming out of it will shape the Internet and the world beyond.

Best episode: Auren Hoffman: Silicon Valley’s Hyper-Connected Founder, CEO and Investor Shares His Wisdom

This episode spotlights CEO Auren Hoffman of SafeGraph, discussing his journey to becoming one of Silicon Valley’s most well-known entrepreneurs/investors and his sharing of business wisdom (including why it’s often in the form of sketches on napkins).

###### 20. DataFramed

- Listen: Apple | Spotify | Google | Soundcloud | Podbean
- Average length: 50-60 minutes
- Average frequency: 2-3 podcast(s)/month

DataFramed is a podcast brought to you by DataCamp, a data science education company.
Host Adel Nehme talks to thought leaders in academia and business about various topics regarding data science: how to create and lead data-oriented teams, why it’s important for more people to become data-literate, and how to increase what data can contribute to an organization.

Best episode: Getting Your First Data Science Job

In this edition of the DataFramed podcast, former host Hugo Bowne-Anderson talks to Chris Albon of Devoted Health about what the data scientist hiring process is like, and how prospective data scientists should go about getting their first jobs.

Now you’ve got some starting points for finding a podcast that can teach you more about the world of data science. Of course, another great place to learn about data science is here at SafeGraph, because data is the only thing we do. Be sure to check out the rest of our blog for more great informative articles.

#### Best Practices for Working with Large Quantities of Geospatial Data

##### Key Takeaways

- Modern geospatial analysis requires cloud-based infrastructure rather than traditional tools alone.
- A scalable tech stack typically includes cloud storage, data lakes, processing engines, and orchestration tools.
- Spatial data offers unique analytical value by revealing relationships between places, brands, and consumer behavior.
- Organizations benefit from balancing preprocessing with flexible, on-demand analysis to control costs.
- Clear communication with non-technical stakeholders is essential to align expectations and build trust in data-driven insights.

Our clients often ask us for best practices when setting up data infrastructures or tooling for analyzing large quantities of geospatial data. To help answer these questions, we teamed up with Rayne Gaisford, Head of Data Strategy and Equity Research at Jefferies, and Felix Cheung, SafeGraph VP of Engineering. You can watch the full webinar here, or read up on what these data experts have to say in the text below.
Managing today’s data infrastructure landscape

The data infrastructure landscape has changed completely over the last few years. Most notably, there has been a shift from “bare minimum solutions” such as Excel and SQL to a cloud environment that offers more processing capabilities. Managing and operating these tools not only requires time and effort, but also indicates a shift from a system management approach to a more data-centered approach through the adoption of programming libraries and languages. Second, the availability of new data infrastructure has changed how different stakeholders interact with each other. For example, the adoption of data science by the business community means that data providers now expose work that used to be reviewed only by data scientists to non-technical users, and help those users draw conclusions for decision-making.

Recommended key tech stack parts

Many users ask what the main stack parts are for working with large quantities of geospatial data. Both Felix Cheung and Rayne Gaisford recommend cloud storage such as Amazon S3. Such platforms offer reliability, convenience, consistency, and speed. A data lake such as Delta Lake or Apache Iceberg running on top of that storage offers central versioning and snapshot protection, and lays the foundation for future data processing, for example using Apache Spark. All of this needs to be complemented with data storage, a compute platform for processing, a scheduling service such as Amazon Managed Workflows for Apache Airflow (MWAA), and pipeline tooling that simplifies writing a data processing pipeline. But there’s more than just hardware and tools: to orchestrate multiple teams in large organizations, it is recommended to have an active committee that structures an overall data catalogue. The catalogue needs to be centralized so that different teams know what data exists outside their own data catalogue.
Connecting different datasets with each other requires more than just data integration: because different brands, products, and companies are tied to different data catalogues, it also means finding the best way to convey the resulting insights to a decision maker.

What makes spatial data different?

Even though spatial data does not require any different tooling than other datasets, it’s good to know what it represents and which insights can be gained from it. GIS data is more than just data representing a physical shop location: it also tells us something about the relationships between multiple businesses or the consumer behavior between multiple brands. For example, spatial data lets us see whether a shopper who shops at place A has a higher propensity to shop at brand B. That propensity question is not often found outside of the world of spatial data analysis.

Getting started with geospatial big data analysis

To get started with analyzing large GIS datasets, it’s possible to start using existing data analysis tools before investing in anything new. This could mean using Excel or connecting to a database through an Open Database Connectivity (ODBC) interface, which allows data access in database management systems using SQL. However, this would only accommodate a small database, so it is not a solution that scales over time. Cheung and Gaisford say a better option would be to adopt a cloud data platform in combination with running geospatial queries through a spatial database. Besides being a very affordable option, this solution can be extended to create one’s own data lake to parse the data. An added benefit of starting right on the edge of what an organization is used to is that, over time, different teams will want to take ownership of the data and move it up the stack instead of holding on to whatever solution they are used to.
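The propensity question raised under “What makes spatial data different?” can be made concrete with a small calculation. The sketch below is only an illustration: it assumes a hypothetical visit log of `(visitor_id, brand)` pairs, and the function name `brand_lift` and the choice of “lift” as the measure are assumptions made here, not a method described in the webinar.

```python
from collections import defaultdict


def brand_lift(visits, brand_a, brand_b):
    """Return P(visits B | visits A) / P(visits B) across all visitors.

    A value above 1.0 suggests visitors of brand_a are more likely than
    the average visitor to also visit brand_b. Returns None when the
    ratio is undefined (nobody visited A, or nobody visited B).
    """
    # Collect the set of brands each (hypothetical) visitor was seen at.
    brands_by_visitor = defaultdict(set)
    for visitor_id, brand in visits:
        brands_by_visitor[visitor_id].add(brand)

    total = len(brands_by_visitor)
    a_visitors = [b for b in brands_by_visitor.values() if brand_a in b]
    b_count = sum(1 for b in brands_by_visitor.values() if brand_b in b)
    if not a_visitors or b_count == 0:
        return None

    p_b_given_a = sum(1 for b in a_visitors if brand_b in b) / len(a_visitors)
    p_b = b_count / total
    return p_b_given_a / p_b


# Made-up sample data, for illustration only.
sample_visits = [
    ("v1", "A"), ("v1", "B"),
    ("v2", "A"), ("v2", "B"),
    ("v3", "A"),
    ("v4", "B"),
    ("v5", "C"),
    ("v6", "C"),
]
lift = brand_lift(sample_visits, "A", "B")
```

In a real pipeline the same aggregation would more likely run as a SQL or Spark job over visit data joined to POI polygons; the plain-Python version only shows the shape of the question.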
Optimizing an organization’s data approach for added cost efficiency

It can be tough to balance how much an organization wants to preprocess against how much it wants to optimize as it goes. To have it both ways, an organization’s data analysis infrastructure can be split in order to offer both flexibility and cost-efficiency. For example, instead of running 12 hours of ‘pre-canned’ cloud jobs, it might be preferable to run a 3-hour job and make a data processing tool a little slower. This creates the opportunity to run a tool incrementally to answer a one-off query if one comes up.

Communicating the limits of tools and data with stakeholders

Long before any data analysis takes place, it’s important to communicate with stakeholders about their expectations in terms of the insights they’re looking for and what data they could benefit from. Having such a conversation ahead of everything else lays the foundation for a strategy to find, connect, and integrate the data at a later point in time. Although building such an approach from scratch can take a lot of time, it’s not necessary to go through the process for each client. When having this conversation with the client, the first task of a programmer is to build trust with a non-technical team so that they are seen as allies. Once it is clear to both the technical and non-technical parties what question needs to be answered with the data, it becomes easier to manage client expectations and to develop a common language for answering more questions in the future, when it is not possible to answer all of a client’s questions the first time around. Having an overview in a large organization with multiple teams has the added benefit that it often becomes possible to answer the questions of one team using the data catalogue from another team.
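Returning to the preprocessing trade-off described under “Optimizing an organization’s data approach for added cost efficiency”, one way to picture the split is a small batch step that precomputes a broadly useful aggregate, plus an on-demand step that answers individual questions only when they come up and caches the result. The sketch below is a minimal illustration under that assumption; the data, the names `precompute_region_totals` and `brand_share`, and the use of an in-memory cache are all hypothetical stand-ins for real batch jobs and query services.

```python
from functools import lru_cache

# Hypothetical visit counts per (region, brand); stands in for a large dataset.
VISITS = {
    ("north", "A"): 120,
    ("north", "B"): 80,
    ("south", "A"): 40,
    ("south", "B"): 60,
}


def precompute_region_totals(visits):
    """Batch step: a cheap, broadly useful aggregate computed ahead of time."""
    totals = {}
    for (region, _brand), count in visits.items():
        totals[region] = totals.get(region, 0) + count
    return totals


# Run once, like a short scheduled job, instead of precomputing every answer.
REGION_TOTALS = precompute_region_totals(VISITS)


@lru_cache(maxsize=128)
def brand_share(region, brand):
    """On-demand step: computed only when someone asks, then cached."""
    total = REGION_TOTALS.get(region, 0)
    if total == 0:
        return 0.0
    return VISITS.get((region, brand), 0) / total
```

The first call for a given region and brand pays the compute cost; repeated calls are served from the cache, which mirrors the idea of keeping the scheduled job short and accepting slightly slower one-off queries.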
Watch the full webinar from Jefferies and SafeGraph to learn more about best practices for working with large quantities of geospatial data.

FAQs

1. What makes geospatial data different from other types of data?
Geospatial data captures relationships between places and movement patterns, enabling insights that are not possible with non-spatial datasets.

2. What kind of infrastructure is best for large geospatial datasets?
Cloud-based storage combined with data lakes, scalable compute platforms, and spatial databases is generally recommended.

3. Do organizations need specialized tools to analyze spatial data?
Spatial data can use standard data tooling, but spatial databases and geospatial query support greatly improve scalability and performance.

4. How can organizations manage costs when working with large datasets?
By splitting preprocessing and query workloads, organizations can balance efficiency with flexibility and avoid unnecessary compute costs.

5. Why is stakeholder communication important in geospatial analysis projects?
Aligning expectations early helps ensure the data supports real decision-making needs and prevents misinterpretation of results.

#### Best Practices in Applying Geospatial Data to CPG Strategy

90% of worldwide purchases are predicted to take place in-store versus online in 2025.
Source: Statista via CARTO

This may sound surprising. With brands like Instacart and Amazon, you could easily get everything online with minimal effort. But despite COVID and the increase in online shopping, CPG is still mostly an in-store experience. When it comes to CPG companies leveraging geospatial data to make informed decisions for site selection and identifying areas of opportunity, mobility data has always been seen as a silver bullet - and that makes sense, right? Understanding how people move is essential to developing a successful CPG in-store strategy. It’s an essential factor in assessing where people are spending their time and money. The problem is that if CPGs are not measuring what mobility data is in relation to, they could be missing out on a lot of other factors that impact the “full picture.”

CPGs are using data - but is it the right data?

Source: CARTO

Many CPG brands overlook the importance of other geospatial data, which is ultimately detrimental to their strategy. Foot traffic is not the only thing CPGs need to be data-driven and successful. In fact, they will be unsuccessful if that’s all they use. CPG companies want to know:

Where are we selling our products?
Where are we not selling?
Which locations meet my criteria for expansion?
What does the competition look like?

Analyzing POI data is the best way to answer these crucial CPG questions.

SafeGraph recently presented at CARTO’s Spatial Data Science for CPG and Retail event, exploring how CPG companies should be using and thinking about geospatial data in order to be successful. Here’s a recap.

Mapping the competitive landscape and measuring TAM

CPG brands need to visualize store locations to understand how their products can be sold across different regions.

Total addressable market (TAM) is the most common metric CPG companies use to reference revenue opportunity and best prioritize potential opportunities.
For example, some may consider how many stores can theoretically carry a product across a specific city or region. If I own my distribution, I want to know where to put my product while ensuring I am not cannibalizing my other products or bleeding any revenue that I wasn’t aware of. This is where the quality of points of interest (POIs) becomes important. With reliable POI data, CPG companies can map their current distribution and competitive landscape to identify areas of opportunity or risk. Geometry for those POIs is also important, particularly for geofencing and advertising. If polygons are not accurate, mobile pings may be associated with something totally irrelevant, resulting in wasted ad spend and a negative customer experience.

Stay up-to-date with changing store landscapes

POIs and building footprints are constantly changing. We’ve seen in our data that store locations and brands ultimately have an expiration date. Whether it is 6 months or 100 years from now, there’s no permanence. However, it can be easy to see a store or park as being permanent. If you have a map with POIs of stores that sold a product in January of 2020, the landscape now looks completely different. SafeGraph Places data provides dynamic representations of how the world looks on a monthly cadence, to provide CPG brands with an up-to-date view of their markets. To accurately depict these changes in landscape, we’ve added columns to our dataset that represent when a store or POI opened and when it closed. We track this through our ingestion pipeline, using different sources and data science algorithms to maintain the highest quality data possible. Within SafeGraph data, if you’ve noticed a new store or POI consistently appearing, that probably means a new brand has launched.
If you see a POI consistently disappearing, that can give you a hint that a brand went out of business.

SafeGraph's Places dataset includes detailed attribution for CPG analytics, like open/close dates and brand affiliation.

Geometry data, including building footprints with spatial hierarchy, helps CPG brands identify child and parent polygons - for example, a CVS within a greater strip mall. By pinpointing exactly where that is, CPG companies can assess the competitive landscape, gauge how accessible the location is to consumers, accurately geofence to deploy mobile marketing, and finally - bringing in mobility data again - correctly attribute visits to that specific store.

A building footprint is much more accurate than a centroid radius for targeted CPG geofencing.

Mobility data is useless without accurate POI and building footprint data to provide context. When all three datasets are combined, CPGs can understand the full picture when building and executing a successful strategy. To get started with POI and building footprint data for CPG analytics, browse and preview our data through CARTO’s Spatial Data Catalog.

#### Best Tech Podcasts for News, Developers, Entrepreneurs, and More

Key Takeaways

The best tech podcasts cater to different audiences, from developers and entrepreneurs to general tech news followers.
Strong podcasts combine subject-matter expertise, consistent publishing schedules, and high production quality.
Listening to a range of tech podcasts offers both timely industry updates and deeper perspective on how technology shapes business and culture.

Podcasts have quickly risen in popularity over the past several years as “radio on demand” programs about various topics. And with all the advancements being made in technology, it should come as little surprise that there is no shortage of tech podcasts out there on the Internet. So which ones are worth listening to, and why? Find out as we cover:

What makes a great technology podcast?
Top 10 tech podcasts for news, analysis, and storytelling

We’ll start by listing some of the qualities that you’ll want to look for in a podcast before you become a serial listener.

What makes a great technology podcast?

Making a podcast may become easier as technology improves, but making one that stands out remains far more complicated. A blend of different factors goes into crafting a show that attracts enough followers (and sponsors) to keep going strong for years. The best tech podcasts tend to have most or all of the following:

Consistent (and unique) theme: The best podcasts are based around a particular idea, and they stick to it as much as possible. This is especially effective if they discuss niche subjects that few other podcasts address. They know what their audiences want, and they don’t stray overly far from it.

Regular schedule: Audiences like it when podcasts are predictable with respect to when new episodes will be available, similar to TV and radio programs before them. Top podcasts adhere as strictly as they can to standardized publishing schedules, whether that’s daily, semi-weekly, weekly, or bi-weekly. Many will publish on a particular day (or days) of the week, and some may even publish at a specific time.

Well-structured: The best podcasts have an explicit agenda that they follow with every show, clearly delineating when each topic will be discussed. This helps to keep both participants and the audience on track while minimizing interruptions and avoiding having the show run excessively long.

Informed: Hosts and guests on a podcast should come across like they know and have researched what they’re talking about. They can back up their information and opinions by referencing news stories and other credible sources.

Authentic: The greatest podcasts have participants – especially the host(s) – who strive to form relationships not only with each other, but also with the audience. They build a culture of honesty and intimacy around a program.
This not only helps them play off each other, but more importantly, it also lets them establish trust with the audience. It lets them invite listeners into the world they’ve created as if they’re truly a part of it. Good production values: Many of the most popular podcasts make sure to invest enough in recording equipment, production software, and even acoustic spaces. Even the most intriguing content or amicable host can be difficult to listen to if the audio has too much background noise, extreme/inconsistent volumes, or other problems. Top 10 tech podcasts for news, analysis, and storytelling Technology podcasts come in all shapes and sizes. Some help you keep on top of the latest tech news. Others offer analysis and opinions on tech from specific perspectives, like software development or business management. And still others look to tell unique stories about humanity’s relationship with technology. Here are 10 that have kept new listeners coming in – and fans coming back – year after year. 1. World of DaaS Where to listen: Apple | Spotify | YouTube | Podbay.fm Average frequency: 1 podcast/week Host: Auren Hoffman World of DaaS features SafeGraph CEO Auren Hoffman talking with a cast of forward-thinking business leaders who are finding new ways to build or utilize big data products. You’ll hear about the latest methods, standards, and use cases for working with massive quantities of data – from collection to analysis to visualization and beyond. This is one of the top tech podcasts for those looking to use new big data applications in their organization, or for entrepreneurs who want to start a business focused on providing data science as a service. Recommended listening: Jack Dangermond: Building Esri Hilary Mason: The Rise of Data Science Will Lansing: Power of Data Standards 2. 
Darknet Diaries Where to listen: Apple | Spotify | Google | iHeart | Stitcher | TuneIn | Castbox | RadioPublic Average frequency: 1 podcast every 2 weeks Host: Jack Rhysider One of the best technology podcasts is, ironically, about a side of technology that not many see. Darknet Diaries host Jack Rhysider takes listeners on a journey through true stories in the often hidden – but still frighteningly relevant – ongoing war between computer hackers and cybersecurity experts. You’ll hear tales about data breaches, digital robberies, government cyber-operations, hacktivism, and all sorts of other shady things that happen on the dark side of the Internet. Recommended listening: The Beirut Bank Job Black Duck Eggs Project Raven 3. Accidental Tech Podcast Where to listen: Apple | Spotify | Google | Podbay | Podchaser | Acast | Podbean | iHeart Average frequency: 1 podcast/week Hosts: Marco Arment, Casey Liss, John Siracusa Three software developers (and long-time friends) attempted to make a podcast about cars. They ended up creating Accidental Tech Podcast instead. It’s one of the best tech podcasts for developers, and one of the most popular tech podcasts in general. The show discusses topics in computer programming, software development, and other tech trends. The hosts also throw in some pop culture discussion, as well as respond to listener questions and comments. Marco, Casey, and John are also particularly knowledgeable about Apple products. So Accidental Tech Podcast is one of the best Apple tech podcasts as well. Recommended listening: A Bomb on Your Home Screen A Squirmy Soup of Rectangles You Are a Computer Athlete 4. Daily Tech News Show Where to listen: Apple | Spotify | Google | Acast | Player.fm | Podchaser | Podbay Average frequency: 1 podcast/day Hosts: Tom Merritt, Sarah Lane Daily Tech News Show is often cited as the best daily tech news podcast. 
Tom Merritt and Sarah Lane, along with regular contributors and guests, offer independent discussion on the current top stories in tech. And they do it all in about half an hour, so you won’t be strapped for time listening to this podcast. This is also one of the best tech podcasts for beginners, fitting the show’s motto of “helping each other understand.” The hosts and guests present tech news in an accessible style that’s as insightful as it is fun. Recommended listening: What Happens after You Brag about NFTs 3-D Printers in Spaaaaaaace! There Can Be Only One Delivery 5. Reply All Where to listen: Apple | Spotify | Google | Stitcher | Podbay | Podchaser | Acast Average frequency: 1 podcast every 2 weeks Hosts: Alex Goldman, Emmanuel Dzotsi Despite being one of the best tech podcasts on Spotify, the focus of Reply All isn’t really on technology. The show is more about stories of people trying to find their way in the modern world, and how technology has helped (or hindered) them in unique and interesting ways. It’s a fusion of tech, popular culture, and human interest that has been a winning formula since 2014. Recommended listening: Happiness Calculator vs. Alex Goldman Summer Hotline Autumn 6. This Week in Tech Where to listen: Apple | Spotify | Google | iHeart | Stitcher | RadioPublic | TuneIn | Spreaker Average frequency: 1 podcast/week Host: Leo Laporte Another of the best tech news podcasts, This Week in Tech, has a motto that says “your first podcast of the week is the last word in tech.” This long-running podcast, which started in 2005, is now the flagship show for a whole network of podcasts about tech. Hosted by Leo Laporte, it features a roundtable of journalists and tech experts who analyze and give opinions on all the latest news stories in technology. There’s a lot of material to cover, and discussions can sometimes get deep or be broken up by anecdotes and bantering. For that reason, most shows are fairly lengthy – usually just over 2 hours. 
Recommended listening: Doomtown Rats Leo, Not Leopoldo Disco Mannequin 7. The a16z Podcast Where to listen: Apple | Spotify | Stitcher | Google | iHeart | RadioPublic | Deezer Average frequency: 1 podcast every 2 weeks Host: Sonal Chokshi The a16z Podcast is produced by Andreessen Horowitz, a venture capital firm operating out of California’s famed “Silicon Valley” technology hub. Despite its Silicon Valley roots, though, it doesn’t just have a lot to say about tech trends, innovations, and culture. It covers all of these things from a decidedly business-oriented angle, explaining how what happens in tech has a ripple effect on how companies in other industries are being built and run. The a16z Podcast is, therefore, one of the best podcasts for tech entrepreneurs. But it’s still an interesting listen for anyone who’s curious about technology and how it shapes the future. Recommended listening: The Stories and Code of Culture Change The Machine that Made the Vaccine: Company, Platform, Innovation The Environment, Capitalism, Technology 8. How I Built This Where to listen: Apple | Spotify | Google | Stitcher | Podbay Average frequency: 1 podcast/week Host: Guy Raz Every business starts with an idea: a product to sell, a service to provide, or a need to fill. Guy Raz, the host of How I Built This, sits down with founders and leaders of some of the most famous companies around the globe to help them tell their stories. You’ll hear about the inspiration, struggle, and pivoting that it took to build world-famous brands – or steer them in radical new directions. This is one of the best tech startup podcasts for entrepreneurs who have big ideas, but may need some guidance on how to put them into action. Recommended listening: Serial Entrepreneur: Gary Vaynerchuk ARRAY: Filmmaker Ava DuVernay Siete Family Foods: Miguel and Veronica Garza 9. 
Pivot Where to listen: Apple | Spotify | Breaker | Castbox | Google | RadioPublic | Stitcher | TuneIn Average frequency: 2 podcasts/week Hosts: Kara Swisher, Scott Galloway Pivot isn’t your average tech news podcast. Renowned technology journalist Kara Swisher teams up with Scott Galloway, a professor of marketing at New York University, to give hot takes on the latest news in technology, business, and politics. They definitely aren’t afraid to be bold, critical, or even controversial. Get ready for a pair who love to debate pressing issues in modern technology as much as they love exchanging barbs and witty banter. Recommended listening: The Future of Travel and a Friend of Pivot on Telehealth California Is Regulating Consumer Data Privacy, the Gig Economy Revisited, etc. Airbnb’s IPO, Amazon Breaks into the Pharmacy Business, etc. 10. Note to Self Where to listen: Apple | Spotify | Google | TuneIn | Stitcher Average frequency: 1 podcast every 1-2 weeks Host: Manoush Zomorodi Note to Self is a look at how technology is impacting our health, psychology, work, play, routines, and everything else that makes us human. You’ll hear discussions about the effects of online privacy issues on anxiety, how digital communications are reshaping our habits, and how relevant a person’s virtual identity is to who they are in the real world. This podcast from New York Public Radio now only has new episodes on Luminary. However, its archival episodes are still popular and noteworthy for the subject matter they tackle. Recommended listening: Introducing: The Privacy Paradox Is the Opioid Epidemic a Tech Problem? Ghosting, Simmering, and Icing with Esther Perel Those are some of the best podcasts for technology available to listen to right now. If you’re into the data science niche, also check out our list of best data science podcasts. 
#### Building a Billion-Dollar Data Company: World of DaaS Interview with ZoomInfo CEO Henry Schuck

Key Takeaways

Successful growth strategies for a data company start with a simple, high-value proposition that customers immediately understand.
Prioritizing data quality over volume helps new data companies establish trust and dominate niche markets before scaling.
Strategic acquisitions, often funded through debt rather than equity, can accelerate growth while minimizing dilution.
Building a contributory data network works best when trust is established through low-risk data sharing first.

New podcast with Henry Schuck, CEO of ZoomInfo (NASDAQ:ZI). Our conversation is available everywhere (Apple Podcasts, Spotify, YouTube, etc.). Please subscribe, follow, and review.

I am a really big fan of ZoomInfo. SafeGraph is a customer. It’s a fantastic product, and it’s one of the only billion-dollar data companies that have been created in the last 20 years. I was super excited to dive into and share what made ZoomInfo so successful. Here are some highlights from my conversation with Henry Schuck.

ZoomInfo’s value proposition is super simple.

ZoomInfo’s sales cycle is less than 30 days. Sometimes they’re same-day deals. How? ZoomInfo provides salespeople with high quality contact information for their prospects. It's a simple tool. Customers instantly know the value that they're going to be able to get with the data. When it works, they gladly pay for it because it will make their sales process easier.

ZoomInfo first focused on high quality, but a small amount of data.

At ZoomInfo’s start, there were a lot of companies selling contact information for B2B leads. A lot of it was stale and outdated. Quality was ZoomInfo’s differentiator -- they promised to deliver data that is 95% accurate. They started in a niche market by collecting information on IT decision makers. By narrowing their focus, they were able to maintain the highest quality data.
ZoomInfo started as a data company but now is also an application company.

Yesterday, ZoomInfo announced it will buy sales intelligence tool Chorus.AI for $575 million. ZoomInfo has a long history of acquiring companies. Henry’s company DiscoverOrg actually acquired ZoomInfo in 2019. Henry believes there's going to be a decent amount of consolidation in the marketing software landscape. If a company is working on something ZoomInfo might have ambitions to be a part of, they try not to do a BD deal. Instead they embrace acquisitions.

ZoomInfo takes every opportunity to make an acquisition with debt.

ZoomInfo has a long history of acquiring companies. And when possible they’ve used debt instead of equity because it dilutes less. ZoomInfo has been a profitable company acquiring other profitable companies. When looking to raise debt, they demonstrated how they would increase margins and continue to grow sales with the combined businesses.

As a data business, you can have a winner-take-most market.

Solutions and applications rarely gain 50% market share. But as a data company, you can serve the majority of a market. ZoomInfo is the dominant leader for business contacts.

ZoomInfo’s secret to starting a contributory network: start with exhaust data.

ZoomInfo first asked their customers to share bounced and confirmatory emails. Their customers said yes because nobody cares about this email-related data, and ZoomInfo explained it would make the whole community better. After building trust and improving the network, they moved on to more sensitive information. They also acknowledged not everybody had to become a contributor to make the network really valuable.

Hope you enjoy this episode of World of DaaS. I would really appreciate it if you subscribe and review (Apple Podcasts, Spotify, YouTube, etc.).
#### Building a Data Company in Space: World of DaaS interview with Planet CEO, Will Marshall

Key Takeaways

Planet succeeds by positioning itself as a data company, using satellites only as a proprietary means to generate valuable datasets.
Applying software development principles like agile iteration to aerospace enables rapid innovation and continuous improvement.
High-frequency, globally consistent data collection makes large-scale analytics and machine learning more reliable.
Investing in analytics and accessibility expands the market for complex datasets beyond highly specialized users.

New podcast with Will Marshall, CEO of Planet. Our conversation is available everywhere (Apple Podcasts, Spotify, YouTube, etc.). Please subscribe, follow, and review.

I first met Will when he cold emailed me about The DaaS Bible. He’s a fellow student of data companies, and he has built a really cool company that builds a super valuable dataset of amazing satellite imagery.

Planet recently announced a SPAC deal and plans to go public later this year. This will make Planet one of the few data companies, like ZoomInfo, to go public with a multi-billion dollar valuation. I was excited to dive into what made Planet successful as a data company.

Here are some highlights from my conversation with Will Marshall.

Planet is a Data Company, not a Space Company

Planet (formerly called “Planet Labs”) creates data. Their data consists of persistent images of the Earth. They also design, build, and launch satellites. But they are first and foremost a data company. Why? They deliver data to their clients.

Similar to how SafeGraph provides its clients with just data on physical places, Planet just provides their clients with satellite imagery of Earth. Their clients don't buy satellites.
Satellites are merely a proprietary way of creating their data asset.

Planet Operates a Fleet of 200 Orbiting Satellites

Planet’s fleet collects images from the entire world every day. They launch satellites every 3 months, averaging 22 satellites per launch. Their largest launch included 88, setting a world record. They have ground stations all around the world, and they created their own technology.

Planet Brought Software Development Practices to Space

Planet pioneered “agile aerospace.” They’ve taken the same concepts from agile software that lead to fast development and applied them to aerospace. They are constantly iterating. In each launch, Planet includes and tests the next generation sensor, radio, or hard drive. Every week Planet uploads new software, making the satellites more efficient, both in terms of the operations and the processing of the imagery up there.

Planet Captures 25 Terabytes of Imagery Daily

Planet’s fleet scans the whole world at 11am local time. This allows all of their images to have consistent shadow angles, which makes machine learning and analytics easier. They also have a SkySat system that lets Planet zoom in on a particular location up to 10 times per day, which is useful if you need more rapid revisit. This all adds up to over 3 million images a day from their satellites. Today Planet makes this data available to clients, such as geospatial experts, via massive data feeds.

More and More Companies will have the Ability to Work with Planet’s Data

Planet is investing in analytics to make this data valuable to organizations that cannot handle its firehose of data. We’re also generally seeing data scientists become more and more lethal because they have more and more tools around them.
As these tools become more powerful, data scientists who cannot use Planet’s data today might be able to in the future.

##### Planet has some Insanely Crazy Use Cases

- Local governments use Planet data to enable permit enforcement.
- Google automatically triggers Planet satellites with lat-long requests to update their maps to identify new roads or buildings.
- Planet can literally tell crop type and crop yield for every 3x3 meter grid of every farm. Their infrared spectral band picks up chlorophyll, which lets them measure biomass. Over time, they can quickly tell the crop type. Agriculture represents 25% of the landmass of the earth. Planet can fly over all of that every day and enable farmers to do more precision agriculture.

Hope you enjoy this episode of World of DaaS. I would really appreciate it if you subscribe and review (Apple Podcasts, Spotify, YouTube, etc.).

#### Building a Data Maturity Model + The Four Stages of Data Maturity

Data maturity is dependent on data governance, data management, data literacy, and other data analytics capabilities.

If you’re reading this, you probably know a thing or two about data. Data is one of the most rapidly growing resources in our world, with an estimated 2.5 quintillion bytes created every day. It seems like data is everywhere, but in reality, we’re just getting started with it. Over 90% of data existing today was created within the last five years. But something changed in the last year that made us want to reconsider what data maturity means today: data usage. The number of businesses using data to guide decision making has skyrocketed in the last year. The pandemic forced businesses to redefine baselines, find new customers, and make hard decisions about where to close operations. These tough decisions forged more data mature businesses. While innovative businesses have gotten more mature, there are still many organizations that are just beginning their data journey.
This model speaks to all stages of the data maturity journey.SafeGraph is a true data company; we provide CSVs with information about physical places so our clients can perform analytics and grow their businesses. We provide some of the ingredients for data analytics, but we are not a one-stop-shop for solutions. Our laser-focus on data has exposed us to organizations throughout the spectrum of data maturity, and we are fascinated by the different ways organizations interact with data.What is Data Maturity?Data maturity is a measurement of the extent to which an organization is utilizing their data. To achieve a high level of data maturity, data must be deeply ingrained in the organization, and be fully incorporated into all decision making and practices. Data maturity is often measured in stages.Sisense tells us that “data maturity is a measurement of how advanced a company’s data analysis is.” Seems like a reasonable definition, but what does that really mean?Data maturity is not just about the role that data plays within an organization’s day-to-day operations as much as it is about how it can enable organizations, of all shapes and sizes, to do something in the future that it couldn’t have done in the past without using data. When looking at data maturity from this angle, it becomes a question of empowerment: how can data be leveraged in a powerful way to unlock new insights and innovations that can eventually turn ideas into reality?For us, data maturity is a journey of exploration—where organizations not only get more acquainted with the data sources they have to work with but also learn how to leverage it in oftentimes surprising, eye-opening, and unexpected ways. 
In fact, once organizations take the big step from seeing data as merely a source of information and, over time, begin to understand its real potential as an influencer—or even disruptor—of decision-making, an organization’s desire to become more data mature will likely (and immediately) increase tenfold.Data Maturity Model: Defining the Varying Levels of Experience Companies Have With DataTo better understand these differences, we researched and developed a data maturity model that details the different levels of experience companies have with data and how that experience manifests in different areas of the business. Our data maturity model isn’t wildly revolutionary. Others came before and helped guide us in the right direction. But we hope this sheds light on data maturity in 2021, and gives companies direction as they start to become more data mature.Most data mature businesses are way more advanced than before, and this plays out across the categories of data maturity. They’re working with a wide variety of data, whether they be structured or unstructured datasets, they’re searching out companies that might have exhaust data to help them further differentiate their business, and they’ve built their infrastructure to handle all this data and to uncover insights fast, very fast. We added a new category for data maturity - procurement and onboarding, it’s not flashy, but the process is required for speed and scale. The companies that are the most data mature have a team dedicated to procuring and onboarding new datasets so that the data can be leveraged anywhere in the business that decisions are made. The 4 Stages of Data Maturity: Where Does Your Organization Fall?Many companies and industries have developed data maturity curves or models to illustrate how data can be integrated into business processes. Data maturity curves and models exist for very specific topics, such as customer data, as well as more universal themes like data governance. 
The SafeGraph data maturity model is designed to be generally applicable across all organizations, regardless of the specific type of data they use.To create our data maturity model, we looked at six aspects of a business: strategy, data, culture, architecture, data governance, and procurement/onboarding. We used the different levels of sophistication across each of these aspects to then develop four unique stages in data maturity.The Four Stages of Data Maturity‍‍Stage 1. Explorer‍Organizations that are just getting started with data generally do not have a defined strategy for incorporating data into their business. While they may use data for reporting purposes, it is on an ad-hoc basis. They do not source data for these reports and only use internally-collected data.‍Stage 2. User‍User organizations are aware of how important data quality is for success. They make it a standard to use data internally across the organization with the addition of ad-hoc datasets to assist with amplifying internal data sources. Their reactive use of data is convenient for making insightful business decisions. ‍Stage 3. Leader‍Similar to Users, Leaders centrally use data for decision making within their organization. However, they also use data for competitive intelligence. In order to accomplish organizational missions and business success, Leaders use third party datasets in addition to their own data. ‍Stage 4. Innovator‍Data is used for more than just analysis and observation. In fact, organizations that are Innovators are using data to create algorithms and predict how they can stay ahead of the game. With data governance being a part of the entire organizational business strategy, Innovators must constantly utilize data in new ways to adapt to the uncertainty of the future. 
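To make the model above more concrete, here is a minimal sketch of how the four stages and six aspects could be represented in code. The names `Stage`, `ASPECTS`, and `overall_stage` are hypothetical, and the aggregation rule (taking the lower median of the per-aspect levels) is purely an assumption for illustration; the article does not define how aspect-level assessments combine into a single stage.

```python
from enum import IntEnum
from statistics import median_low


class Stage(IntEnum):
    """The four stages named in the model, ordered from least to most mature."""
    EXPLORER = 1
    USER = 2
    LEADER = 3
    INNOVATOR = 4


# The six aspects of a business that the model looks at.
ASPECTS = (
    "strategy",
    "data",
    "culture",
    "architecture",
    "data_governance",
    "procurement_onboarding",
)


def overall_stage(levels: dict[str, Stage]) -> Stage:
    """Combine per-aspect levels into one overall stage.

    Assumption: the overall stage is the lower median of the six
    per-aspect levels. This rule is illustrative only.
    """
    missing = [a for a in ASPECTS if a not in levels]
    if missing:
        raise ValueError(f"missing aspects: {missing}")
    return Stage(median_low(int(levels[a]) for a in ASPECTS))


# Hypothetical self-assessment for an example organization.
example = {
    "strategy": Stage.USER,
    "data": Stage.LEADER,
    "culture": Stage.USER,
    "architecture": Stage.LEADER,
    "data_governance": Stage.EXPLORER,
    "procurement_onboarding": Stage.EXPLORER,
}
print(overall_stage(example).name)
```

A median keeps a single strong or weak aspect from dominating the result. A stricter organization might instead take the minimum, so that the weakest aspect caps the overall stage.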
Top 3 Data Maturity Assessment Questionnaires to Benchmark Your Organization's Experience LevelMost organizations today are using data in some capacity, but those that have reached the Innovator stage, where data is at the center of their strategy and operations, are truly leveraging it to the fullest potential. How data mature is your organization? Here's a round-up of self-assessment tools you can take to find out, and contribute to industry research on the topic.1. SafeGraph's Data Maturity SurveyWe developed a survey aimed at establishing industry benchmarks of data maturity. By taking our survey, you'll reflect on your own organization's use of data and help us dive deeper into our research. Results are anonymous and will be shared in a white paper and webinar in the coming months. 2. TDWI's Big Data Maturity Model and Assessment ToolTDWI developed this assessment to help determine the maturity of your organization's big data initiatives in an objective way when compared with other companies. You can complete the assessment and receive a set of scores indicating your big data maturity across five dimensions that are key to deriving value from big data analytics: organization, infrastructure, data management, analytics, and governance.3. The University of Chicago's Center for Data Science and Public Policy Data Maturity Framework QuestionnaireThe Center for Data Science and Public Policy at the University of Chicago created a data maturity framework for non-profits and government organizations based on organizational, data, and technology readiness. Their matrix and assessment questionnaire are designed to help benchmark non-profit and government organizations' ability to start data-driven social impact projects.Data Maturity is Continuously EvolvingData is one of the most valuable assets available to any organization today. Unfortunately, many simply don’t know how to use data to its fullest. 
So if your organization falls into this category, don’t worry—it just means that you are on the start of your own data maturity journey. And as the amount of data available continues to grow and become more accessible, the ways in which organizations can use data - and be data mature - will continue to evolve.The good news for you: There are a lot of ways to become a data mature business. It’s not always a linear path nor is it going to happen over night. But when you make the important decision to put data at the heart of your organization—to fuel business strategy, inform decision-making, and uncover competitive intelligence like never before—you are taking the first step in bringing your organization into the data age. At SafeGraph, we know a thing or two about data. So let us guide you along your own data maturity journey. Contact sales to get started and see what it’ll take to get your organization from where it is today to where you want it to be in the future. #### Building a Fairer Future of Work: World of DaaS Interview with Checkr CEO Daniel Yanisse Key Takeaways Checkr modernized background checks by replacing manual, fragmented workflows with API-driven, standardized data infrastructure. The rise of the gig economy exposed the limitations of legacy background check systems, creating demand for faster, scalable solutions. Prioritizing product integration over short-term revenue helped Checkr build durable channel partnerships and long-term adoption. Separating the data business from the software business enabled Checkr to scale data acquisition, pursue M&A, and monetize data independently. Data, when structured and contextualized correctly, can be used not only for efficiency but also to support fairer hiring practices. New podcast with Daniel Yanisse, CEO of Checkr. Our conversation is available everywhere (Apple Podcasts, Spotify, YouTube, etc.). 
Please subscribe, follow, and review.

Checkr has created a really impressive business doing criminal background checks. They pull in fragmented, nonstandard, sensitive data from thousands of sources. Joining and categorizing this information was a really difficult task. They overhauled the process for background checks and saw tremendous success early on. They’re now using this data to encourage fairer hiring practices and working opportunities. I enjoyed learning what made this possible.

Here are some highlights from my conversation with Daniel Yanisse.

##### Background Checks Used to Run on an Incredibly Manual, Error-Prone Process

Checkr invented the modern API for background checks. Legacy competitors did all of this work manually: reviewing, filtering, matching. It was slow and error prone. The data was fragmented and incomplete, and there were no standards. Identity resolution was also a massive challenge. Criminal records and DMV records aren’t tied to social security numbers. So you would get a ton of false positives and negatives when multiple people share the same name.

##### The Gig Economy Started to Break the Old Way of Running Background Checks

Running a background check wasn’t a trivial task. It could take a week and it was very costly for everyone. But the staffing needs for the gig economy were an order of magnitude larger and at least an order of magnitude faster. The old process wasn’t going to work for anyone trying to hire tens of thousands of contractors within a short period of time. And the gig economy was taking off back in 2014 when Daniel first founded Checkr.

##### Checkr Prioritized Product Over Revenue when Building Channel Partnerships

Channel partnerships can really scale your business, but they are a long term investment. You might not see the benefits for years. In the beginning, Checkr decided to prioritize adoption instead of revenue.
If there’s no product component, you’re likely not giving the end customer a reason to buy the two products together. You want to make sure that your partnership increases the value of your partner’s product and improves the end customer’s experience.Checkr is a B2B Company Becoming a B2C CompanyConsumers have to receive a copy of their background checks. This is a great situation for Checkr. Their customers have to introduce them to consumers. While they only monetize solutions today for businesses, this could change over time. Longer term, they could build out products that drive value for the customer. For example, FICO is primarily a B2B company that initially rolled out free services for consumers. As they expanded their consumer product offering, they found new ways to monetize their consumer business.Checkr has a Separate Data CompanyCheckr runs their software and data business as two separate entities. The data business is run separately, with its own CEO and P&L. Their data company sells data to Checkr as well as other companies -- credit and identity companies. They’ve done a lot of M&A on the data side because acquiring data companies can be far simpler than acquiring another SaaS company.Checkr Raised $250 Million on a $4.6 Billion Valuation Last WeekIncredible!Note: if you enjoy this episode of World of DaaS, be sure to follow Daniel Yanisse on Twitter.Hope you enjoy this episode of World of DaaS — would really appreciate it if you subscribe and review Apple Podcasts, Spotify, YouTube, etc. #### Building a Geospatial Data Ecosystem By 2025, the geospatial analytics market is estimated to be valued at $96.3 billion. That’s a 12.9% CAGR over the forecast period beginning in 2020. Esri co-founder and CEO, Jack Dangermond, believes the future of the geospatial industry lies in the proliferation of location data and the ability to connect vast quantities of data for deeper insights about a physical place. 
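As a rough sanity check on the market figures quoted above, the compound annual growth rate (CAGR) relationship can be inverted to see what market size the forecast implies for its starting year. The sketch below assumes five compounding years (2020 to 2025); the source does not state the exact length of the forecast period, and the function name is illustrative.

```python
def implied_start_value(end_value: float, cagr: float, years: int) -> float:
    """Invert end = start * (1 + cagr) ** years to recover the starting value."""
    return end_value / (1.0 + cagr) ** years


# Figures quoted above: $96.3 billion by 2025 at a 12.9% CAGR.
# Assumption: five compounding years, 2020 through 2025.
start_2020 = implied_start_value(end_value=96.3, cagr=0.129, years=5)
print(f"Implied 2020 market size: ${start_2020:.1f}B")
```

If the forecast period actually covers a different number of years, changing `years` shows how sensitive the implied baseline is to that assumption.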
Given these forecasts, the industry can expect to see a rapid increase in organizations incorporating geospatial data into their workflows and architectures as they become more data mature. This will vary by use case, with some aspects of a location being more important to one industry than to another. For example, a retailer may rely more on points of interest (POI) data to evaluate the competitive landscape, while an insurer may require detailed environmental data to underwrite policies. Whatever the industry, everything has a location, and geospatial data makes location actionable. With the multitudes of location data existing today and the rapid growth expected in years to come, what does the geospatial data ecosystem look like? How do datasets relate to one another, and how do organizations determine how to leverage location to make better decisions? SafeGraph is committed to democratizing access to data. To help organizations navigate the changing data landscape, we’ve mapped out a geospatial data ecosystem. Methodology: SafeGraph conducted market research to compile a list of geospatial data categories. Categories are thematically defined, and align with how users search for data and apply it in an analysis. In most cases, this means datasets within a category are typically the same type (raster, point, line, or polygon), as is the case with Boundaries data (predominantly made up of polygonal data). However, other categories are comprised of multiple types of datasets, such as Environment data, which can contain lines for rivers or rasters for vegetation. 
Each category is defined as the following for this ecosystem map: Address: Point datasets providing address strings or geographic coordinates Boundaries: Polygonal datasets for geographies encompassing more than one address or property Streets: Road network and traffic data Points of Interest: Point datasets with detailed attribution for non-residential places Property: Attribution pertaining to a specific address, structure, or parcel of land Mobility: Aggregated population movement data Environment: Climate, weather, and other natural data Demographics: Aggregated population count, segmentation, or characteristic data Imagery: Aerial imagery and basemaps While we include anonymized mobility data, we have chosen not to include individual-level consumer data or other personally identifiable information (PII). Other categories of geospatial data exist, such as those that include PII. This ecosystem maps the most common geospatial data categories that do not include PII. Every day, more data is created with a location component. The creation of unique identifiers, such as Placekey, makes it easier than ever to link data together for deeper insights and more efficient analysis. While it would be impossible to fully capture the multitudes of location data within one graphic, we hope this ecosystem map provides context for the geospatial data industry and sparks a healthy debate about what constitutes the given categories. As a US-based company, we recognize there are many international geospatial data providers out there we might have missed. Did we leave something out? Let us know. Important note: The scope of this market research did not extend to open source data providers. It should be noted that government entities, such as the U.S. Census Bureau and other organizations, do provide geospatial data across the various categories, but they are not listed here. 
This list also does not include geospatial software providers, although some companies listed do provide both software and data. For purposes of this study, SafeGraph is strictly defining ecosystem as the network of datasets that can be used in a geospatial analysis. We were strategic in our placement of each category in the ecosystem map: Addresses are placed at the bottom as geocoding forms the basis of any geospatial analysis. Moving upward, Boundaries, Streets, Points of Interest, and Property data are grouped together in the Location Data section, as these categories provide information about what is present at a specific geography. The top section, Enrichment Data, is comprised of Mobility, Environment, and Demographics, which all provide contextual detail to the underlying layers. Running along the side and underpinning all of these datasets is Imagery. Imagery can provide reference data, detailed rasters, and a helpful backdrop for visualizing all the other types of data in the ecosystem. The geospatial analytics industry’s rapid development is bound to have an effect on data ecosystems in years to come. New categories of data could develop, new companies will inevitably join the market, and how these datasets operate together will likely change. The SafeGraph team is thrilled to be part of the growing geospatial industry. As the geospatial market continues to grow, so will this map. Did we leave something out? Let us know. #### Building a Killer Go-to-Market Strategy   Key Takeaways Spatial analytics platforms strengthen go-to-market strategies by helping customers understand location context for site selection and revenue modeling. Many data and SaaS companies are increasingly expanding their market strategies beyond software to include data products and professional services to accelerate customer value. Partnerships enable step-function growth by allowing companies to meet customers where they already operate across infrastructure, data, and tooling. 
Product-led growth works best when free trials or self-serve models are paired with genuinely strong, differentiated products. Being a “one-stop shop” can simplify a go-to-market strategy by reducing customer friction when internal resources or expertise are limited. A lot of SafeGraph customers use GIS platforms like CARTO. We recently spoke with Javier de la Torre, Founder and Chief Strategy Officer at CARTO, on our “World of DaaS” podcast about the world of spatial analytics.‍ Spatial analytics platforms can seriously uplevel site selection and revenue modeling‍ Analytics tools like CARTO provide a rich cartographic user experience that helps customers understand the context of the physical locations of their businesses. This enables them to understand the conditions around these places and improve how they approach site selections. Whether it’s a new competitor opening near an existing customer location or entering a new market, understanding the dynamics of places in the neighborhood is critical.‍ SaaS companies are beginning to sell data and services‍ Although companies like CARTO generate most of their revenue from software, they also provide professional services and data to customers who lack the resources. In CARTO’s case, they offer Data Observatory, a product that aggregates and cleans data from different sources. Being a one-stop shop can be seriously powerful for application companies looking to deliver value to customers fast.‍ Partnerships can drive step-function growth‍ Application products recognize that their offering is just a piece of the larger puzzle, the same reality that data companies face. To go where the users are and meet their needs, it’s important to recognize which pieces of technology customers will need, which could be anything from cloud infrastructure, ETL, as well as low code applications, data models and data. 
By using partners to offer everything your customer needs, you can seriously up-level your go-to-market strategy.‍ Product-led growth is mostly driven by having killer products‍ CARTO has always offered a 14-day trial that allows users to download their software and start using it immediately. This free trial model is quite popular and currently has more than 250,000 users. Companies like Tableau have demonstrated that a self-serve model with free trials or freemium can be a killer tactic to product-led growth. But at the core of it, the main driver of product-led growth is simply having really really great products. Listen to the full episode on Spotify, Apple Podcasts, or wherever you get your podcasts. #### Building Footprints: Essential Data for Accurate Geospatial Analysis  Key Takeaways Building footprints use polygons and geofences to capture the true physical shape and boundaries of structures, enabling more accurate geospatial analysis than centroid-based geocodes. Centroid and radius-based approaches rely on approximations that can introduce overlap, misattribution, and distorted spatial insights. Accurate building polygons are critical for applications such as store visit attribution, trade area analysis, insurance risk modeling, and urban planning. Geometry data enhances the precision of foot traffic measurement by aligning visits to the actual occupied space of a POI. High-quality building footprint data must be accurate, current, and enriched with spatial hierarchy to deliver reliable analytical value. A Closer Look at Why Building Polygons and Geofences are more Effective than Centroid-Based Geocodes for Geospatial Analysis.‍As a starting point, satellite imagery can help us visualize, at a quick glance, where buildings exist at any point in the world. 
But it, too, has its own limitations: satellite imagery is unable to capture buildings obstructed by landscapes or other overhangs just as it cannot reveal the presence of multiple sub-stores within a given building. It’s merely a starting point.To go a level deeper, you need building footprints—also known as polygons, geometry, or geofences—to fill in the blanks where satellite imagery and other geocoding solutions fall short. Here, we’ll review what they are, where they can be put to use, and how SafeGraph helps businesses, urban planners, local governments, and other organizations perform accurate geospatial analysis that leads to more effective decision-making.What Is a Building Footprint?A building footprint is a form of geofencing capturing a building’s precise perimeters, excluding adjacent property features like parking lots and landscaping.You can think of these polygons as the ground-level “outlines” of any given structure. These outlines can provide useful metadata and other spatial characteristics—such as location, shape, distribution, and relationship to surrounding structures—to aid in geospatial analysis.Building footprints are composed of polygons that demarcate the boundaries of structures, from government buildings to offices to single-family homes to farms (and beyond).Unfortunately, a lot of today’s geocoding solutions rely heavily on approximations. While this may be a decent starting point, this approach does not typically yield the most accurate data or results. 
This is just one of many reasons why geometries have quickly become the de facto choice for geospatial analysis: they provide us with precise data that can enable a deeper and more accurate understanding of any given point of interest (POI).It’s important to note, however, that for polygons to provide meaningful analytical value, the data associated with them must come from accurate and up-to-date sources.Why Use Polygons?Articulating the actual dimensions that represent the true shape of a POI—in the form of a building polygon or geofence—provides us with useful data to better understand and visualize the true nature of POIs in ways that traditional geocoding data simply can’t.For example, address point and street-level geocodes only provide approximate measures of occupied space based on the center point of a building to its distance from either an adjoining parking lot or a nearby street curb. This creates a radius that is then used to explain, in broad strokes, a building’s general footprint.Building footprint or rooftop geocodes, on the other hand, can precisely identify the actual space a POI occupies by using real rooftop specifications (versus center points) to capture a building’s true footprint. This helps avoid overlaps and erroneous radiuses from being accounted for in the data—one of the biggest downsides of using centroid-based data—which can quickly undermine the overall accuracy and effectiveness of geospatial analysis.Building Footprints for Store Visit AttributionA great example of how geometry data is used regularly is store visit attribution. Simply put, store visit attribution is a method for predicting or measuring where foot traffic to brick-and-mortar POIs will (most likely) come from. This type of analysis typically relies on a combination of Census Block Group (CBG) data and other foot traffic data to paint a picture of how, when, and where consumers travel to and from throughout the day. 
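The difference between the radius approach and a true footprint can be sketched in a few lines. The example below is a simplified illustration, not SafeGraph's attribution method: the two stores, their coordinates, and the 18-meter radius are all made up, and coordinates are treated as meters on a flat local grid (real latitude/longitude data would need to be projected first).

```python
from math import hypot

Point = tuple[float, float]


def in_polygon(p: Point, polygon: list[Point]) -> bool:
    """Ray-casting test: count the polygon edges crossed by a ray going right from p."""
    x, y = p
    inside = False
    n = len(polygon)
    for i in range(n):
        (x1, y1), (x2, y2) = polygon[i], polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through p can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def in_radius(p: Point, center: Point, radius: float) -> bool:
    """Centroid-based test: is p within a fixed distance of the center point?"""
    return hypot(p[0] - center[0], p[1] - center[1]) <= radius


# Hypothetical adjacent stores sharing a wall at x = 20.
stores = {
    "store_a": {"polygon": [(0, 0), (20, 0), (20, 30), (0, 30)], "centroid": (10, 15)},
    "store_b": {"polygon": [(20, 0), (30, 0), (30, 30), (20, 30)], "centroid": (25, 15)},
}
RADIUS_M = 18.0  # the same radius applied to every store (an assumption)

visit = (22.0, 15.0)  # a visit located just inside store_b

by_radius = [name for name, s in stores.items() if in_radius(visit, s["centroid"], RADIUS_M)]
by_polygon = [name for name, s in stores.items() if in_polygon(visit, s["polygon"])]

print("radius approach: ", by_radius)
print("polygon approach:", by_polygon)
```

With these made-up numbers, the radius test credits the visit to both stores because their circles overlap, while the polygon test credits only the store whose footprint actually contains the point.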
Having specific data pertaining to the actual space a specific POI occupies is critical for accurate store visit attribution.

Using store location centroids (also known as the “radius approach”) tends to produce geocodes that are over-representative of sub-stores or under-representative of larger stores. Both errors can lead to an inaccurate understanding of actual foot traffic to those locations.

To learn more about why building polygon data provides a better and more accurate way of measuring foot traffic, be sure to download our Store Visit Attribution whitepaper today.

Figure: Example of the potential for radius overlap “errors” when using centroids for building footprints.

##### Top Use Cases for Building Footprints

Geometry data can add value to every industry across a variety of different applications and use cases. Accurate polygons enable you to:

- Identify the exact location of a building
- Determine the number of buildings within a given area
- Account for buildings hidden in aerial images by trees and other obstructions
- Spot potential risks and hazards to a building
- Measure the square footage of a POI

The information derived from building polygons can support:

- Mobile Marketing: Targeting, reaching, and engaging location-based audiences, based on their proximity to a property or structure. Want to see this in action? Learn how Media Storm used SafeGraph Places data to improve its mobile targeting.
- City Planning: Identifying development constraints, informing landscape design, assessing property sale value, and estimating urban growth potential.
- Risk Evaluation: Streamlining risk analysis by identifying surrounding elements that could potentially cause harm to a building. This information can also inform disaster planning and emergency response time (should a disaster ever occur).
- Navigation: Creating highly accurate navigation maps with precise road geometry.
- Insurance Risk Assessment: Leveraging risk analysis to estimate more accurate and reasonable insurance premiums and deductibles.
- Retail Site Selection and Trade Area Analysis: Gaining a deeper understanding of a store’s surroundings to both manage and optimize potential and ongoing foot traffic.

These are only a handful of the ways that polygon data can be put to work. Here are a few more examples of how geometry and POI data can be used in the real estate, consulting, and financial services industries.

In all of the use cases above, one thing is clear: having access to precise data about a building’s actual footprint can lead to more accurate and effective geocoding. This drives better development and planning, more accurate data visualizations, stronger connections to other data sources, and, most importantly, more informed decision-making.

Figure: SafeGraph data specifies sub-stores in relation to their parent store geometry.

##### Why Use SafeGraph Building Footprint Data?

At SafeGraph, our mission is to be the source of truth about all physical places in the world. Highly accurate building polygon data is at the heart of our open-source data.

For example, SafeGraph’s Geometry dataset includes POI footprints with spatial hierarchy metadata that depicts when two tenants share the same polygon, available for over 8 million POIs in the US, Canada, and UK.

To ensure our data is accurate and up-to-date at all times, we leverage thousands of sources, from satellite imagery to municipal records, to help generate the most accurate reference for a store’s or building’s actual footprint. Then, we supplement the geometry derived from satellite imagery with hand-drawn polygons to ensure our dataset’s utmost precision.
This helps eliminate the “noise” and inaccuracy caused by centroid approximations.

Additionally, we make it a priority to specify sub-stores within malls, stadiums, airports, and other similar building structures in relation to their parent store geometry. For this reason, SafeGraph’s store location geofences are the most precise in the market.

##### Get Started with SafeGraph Today

There’s one thing we know to be explicitly true: the physical world is constantly changing.

Businesses, non-profits, academic institutions, researchers, and local, state, and federal government organizations choose SafeGraph data for its industry-leading accuracy and precision. Our polygons are built using sophisticated machine learning, computer vision, and satellite imagery, allowing us to not only develop more detailed and accurate building geometry but also to provide the cleanest and most useful datasets.

Ready to begin using SafeGraph data? Our team is ready to help. Just schedule a call to get the ball rolling.

##### FAQs

1. What is a building footprint in geospatial analysis? A building footprint is a polygon or geofence that represents the exact ground-level boundaries of a structure, excluding surrounding features like parking lots or landscaping.
2. Why are building footprints more accurate than centroid-based geocodes? Centroid-based geocodes rely on approximate center points and radii, which can overlap or misrepresent a building’s true size and shape. Building footprints use real boundaries for higher precision.
3. How do building footprints improve store visit attribution? By matching foot traffic to the exact area a store occupies, building footprints reduce misattribution caused by oversized radii or overlapping geofences.
4. Which industries benefit most from building footprint data? Retail, insurance, urban planning, mobile marketing, navigation, real estate, and public-sector planning all rely on accurate building polygons.
5. What makes building footprint data reliable for analysis? Reliability depends on accuracy, freshness, precise polygon construction, and spatial hierarchy that accounts for sub-stores within larger structures.

#### Building Footprints: Examples & Where to Get the Data

##### Key Takeaways

- A building footprint is a polygon-based representation of a building’s shape, size, and spatial context, offering deeper insight than point-based location data.
- Building footprint data becomes most powerful when combined with metadata such as spatial hierarchy, addresses, and POI attributes.
- Industries including retail, mapping, telecommunications, insurance, and urban planning rely on building footprints for accurate modeling and decision-making.
- Accurate building footprints improve analyses such as visit attribution, site selection, risk assessment, and infrastructure planning.
- Pre-processed building footprint datasets save significant time and resources compared to building polygons manually.

A building footprint may seem like just a polygon (or set of polygons) on a map, but it’s so much more than that.
It’s an important data point for geospatial analysis that can be used in fields like real estate, urban planning, and even insurance.

So exactly what is a building footprint? What can it be used for? And where can you get accurate data on building footprints? We’ll answer all of these questions in the following sections:

- What is a building footprint and why is it useful?
- 5 industry use cases for building footprints
- Where to get building footprint data

Let’s start with a building footprint definition to give you a better understanding of what a building footprint actually is.

##### What Is a Building Footprint and Why Is It Useful?

A building footprint is a polygon, or set of polygons, representing a specific building in the physical world. It provides a ground-centered visual representation of a building’s location, shape, dimensions, and area. It may also include other geospatial information.

This information can include:

- Address: address strings that include attributes such as street number and name, city, state, and ZIP code.
- Latitude/longitude: geographic coordinates that allow locations to be geocoded and mapped.
- Place: a categorical attribute that describes a building’s general purpose (e.g. residential, commercial, or industrial) and/or its specific use case (e.g. an electronics store).
- Spatial hierarchy: metadata that provides information about individual units within buildings (e.g. apartments, stores in malls, or offices in business complexes), and how they are spatially related to both each other and the building that contains them.

Building footprints are useful because they provide detailed delineations of structures or parts of properties. This offers more insight than simple point of interest (POI) data. For example, when comparing building footprint vs. building area, the former lets you visualize what a building is shaped like and how much space it takes up relative to its surroundings.
With the right metadata, it can also tell you whether a building is its own entity or a unit inside another building (i.e. spatial hierarchy). The latter can tell you how large a building is, but not how that area is distributed.

Building footprint data is a great tool on its own, but can become even more useful when combined with other types of geospatial data. We’ll demonstrate how in the next section.

##### 5 Industry Use Cases for Building Footprints

So what is a building footprint actually used for in a business context? As we mentioned, it’s usually combined with other types of data for various applications. Here are 5 common ones.

1. Retail: advertising, store layout planning, and site selection

Businesses with brick-and-mortar stores can combine polygon data with human mobility data to perform visit attribution. This allows them to see how customers interact with a specific store location. They can measure things like how close people get to the store, how many people actually enter the store, where people go in the store, and how long people stay in the store.

They can use this information in a number of ways. For example, they can plan where to advertise in the surrounding area so they can direct more customers to the store. Also, based on where customers typically go in the store, they can rearrange the layout of the store (or future stores) to make the popular departments or products more accessible. And they can more accurately track how much foot traffic their store gets, which can be a factor in a decision to close down a store and move somewhere else.

2. Mapping: creating accurate maps of assets, amenities, etc. for both industry and consumer use

When mapping a building, it’s important to use the footprint of the building to understand how it interacts with other nearby spaces. For example, some buildings may be irregularly shaped to meet the constraints of the surrounding terrain, or because of other design choices.
And if a building has multiple levels, it is possible that not all of them have the same profile as the main floor.

Another consideration is spatial hierarchy. Some basic address data, combined with basic POI or property data, can help trace a route to a particular building or point of interest. But it may not be much help when dealing with a building that has multiple smaller units inside it, such as a mall, apartment complex, or office tower. It may also not be precise enough for a place such as a college or university campus, which is made up of multiple related buildings but can sometimes be classified as a singular point of interest.

In these cases, the building footprint area and shape must be calculated and drawn for each unit in the building and/or each building in the complex. Only then will these places have accurate mapping information, not only for navigation, but also for analyzing with other geospatial data (like specific unit or building addresses).

3. Telecommunications: planning network infrastructure

Telecommunications companies are expanding their services into more sparsely populated areas in an effort to connect more people. So it’s useful for them to know how to calculate a building footprint in order to properly position cellular towers (and other infrastructure).

To illustrate, they need to compare the tower plans against the area of land available for development, as well as the range of a tower’s signal. They will also need to take into account the footprints of other buildings in the area, as their heights and/or locations could interfere with signal transmissions. This will help the company get the maximum amount of coverage with a minimum number of towers.

4. Insurance: co-tenancy and adjacency risk assessment

Insurance companies require accurate and precise polygon data to evaluate the risk factors of buildings and other properties.
As a start, they need to be able to model a building or area to identify potential hazard spots where an accident is more likely to occur. But there is much more they can do with building footprints.

For one thing, by measuring the area of a building footprint on a plot of land, insurers can look at how vulnerable a building is to damage caused by the terrain or other nearby objects. For example, a building that’s in a low-lying area, and/or close to a river or other body of water, is at greater risk of flooding or other water-related damage. Or a building in a crowded urban area close to other buildings may be more vulnerable to damage from a vehicle accident or another building collapsing.

Insurers can also use spatial hierarchy metadata to assess risk. For instance, an apartment complex with a greater number of individual units can potentially house a greater number of people who could either cause or experience an accident. So that building’s risk profile goes up. Switching to a retail example, a store that sells clothing might have a relatively low base risk profile. However, that store may have an elevated risk profile anyway if it’s inside a mall and near a type of store with more hazards, such as a restaurant with stoves and grills.

5. Urban planning

Of course, understanding building footprint examples can also be helpful for government agencies trying to parcel out the use of city land. Knowing the dimensions and area a building takes up on a plot of land can help civil engineers plan infrastructure around it. That includes plumbing, electricity, sidewalks (and other transportation infrastructure), and constructs to aid with accessibility (like wheelchair ramps).

Another part of this is making sure that buildings and the infrastructure around them don’t take up too much space. Communities still need land for things like yards, public parks, and nature conservation areas.
Analyzing the total area of building footprints relative to the area of available land helps city planners maintain this balance.

Studying building footprint GIS data also gives context to other geospatial data that urban planners might use. Chief among those is mobility data. Planners can look at the flow of population movements within a city on a daily or weekly basis, and assess how the layout of buildings may be affecting that. They can also identify which buildings people are congregating at, and where people are coming from to reach those buildings. All of this may suggest the need for improved transportation networks to make important or popular destinations more accessible. Or it may mean a city should build critical facilities closer to places where people are already active.

##### Where to Get Building Footprint Data

Building footprint calculation is tedious and time-consuming to do on your own. That’s why it’s much easier to purchase pre-processed data from a company like SafeGraph that specializes in collecting, cleaning, and organizing data. This allows you to start your analysis quicker without having to spend your own time and money on prep work.

SafeGraph’s Geometry dataset offers building polygon data with 15 different attributes. Our building footprint database includes spatial hierarchy metadata, as well as detailed information about each building as a point of interest.

Building footprints are versatile geospatial data that can be valuable in many different industries. But they’re only valuable insofar as they’re precise. That’s why we pride ourselves on the coverage, accuracy, and ability to compute spatial hierarchies of our polygon data.

##### FAQs

1. What is the difference between a building footprint and a point of interest (POI)? A building footprint represents the physical shape and area of a structure using polygons, while a POI is typically a single point that marks a location without showing its physical dimensions.
2. Why are building footprints more accurate than building area alone? Building area provides size but not shape or spatial context. Footprints show how a building occupies land, interacts with nearby structures, and fits within its environment.
3. Can building footprint data include multiple units within a building? Yes. With spatial hierarchy metadata, building footprints can represent individual units such as apartments, offices, or stores within a larger structure.
4. Which industries benefit most from building footprint data? Retail, mapping and navigation, telecommunications, insurance, and urban planning commonly use building footprint data for analysis and modeling.
5. Why should businesses buy building footprint data instead of creating it themselves? Manually creating accurate building footprints is time-intensive and error-prone. Purchasing curated datasets allows teams to focus on analysis rather than data preparation.
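The footprint-versus-area distinction made in this post can be illustrated with a small sketch. It assumes two hypothetical footprints given as planar coordinates in metres (real footprint data would usually arrive as WKT or GeoJSON in latitude/longitude and need projecting first), and the geocode point and radius are invented for illustration. The two buildings have the same area, yet a visitor location falls inside one and outside the other, while a centroid-plus-radius check would count it for both.

```python
from math import hypot

Point = tuple[float, float]


def polygon_area(ring: list[Point]) -> float:
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def contains(ring: list[Point], p: Point) -> bool:
    """Ray-casting test: is point p strictly inside the polygon?"""
    x, y = p
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical footprints on a local planar grid (metres).
square = [(0, 0), (20, 0), (20, 20), (0, 20)]
l_shape = [(0, 0), (30, 0), (30, 10), (10, 10), (10, 20), (0, 20)]

visitor = (18, 15)            # hypothetical device location
geocode, radius = (15, 10), 15  # hypothetical centroid-style geofence

print(polygon_area(square), polygon_area(l_shape))  # same area, different shape
print(contains(square, visitor), contains(l_shape, visitor))
print(hypot(visitor[0] - geocode[0], visitor[1] - geocode[1]) <= radius)
```

Area alone cannot separate the two buildings; only the polygon test does. A production workflow would more likely rely on a geometry library than on hand-written tests like these.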
#### Canadian Market Analysis with POI Data From the AWS Data Exchange

##### Key Takeaways

- SafeGraph’s Canada Places (Essential Columns) dataset is available for free through AWS Data Exchange and can be used for market analysis.
- Amazon Athena enables scalable, serverless SQL querying of POI data stored in Amazon S3.
- Geospatial functions in Athena allow users to perform proximity-based market analysis, such as identifying nearby competitors or complementary businesses.
- Amazon QuickSight can be used to visualize POI data on interactive maps for exploratory analysis and dashboards.
- Combining POI data with cloud-native analytics tools simplifies market analysis without requiring complex infrastructure setup.

SafeGraph's Canada Places (Essential Columns) dataset is available for free in the AWS Data Exchange. To use this POI data to conduct a market analysis, follow the steps below.

##### Getting Started

1. Sign up for an AWS account if you don’t have one already.
2. Set up two Amazon S3 buckets. You can create S3 buckets through your AWS Console. Since SafeGraph Places - Entire Canada (Essential Columns) is currently hosted in AWS Region us-east-1, we recommend creating your S3 buckets in the same region to avoid unnecessary data transfer costs. You can leave the other default settings unchanged. The first S3 bucket will receive the copy of the SafeGraph Places - Entire Canada (Essential Columns) dataset. The second S3 bucket will hold the Athena query output. Athena is a serverless query engine that charges per query execution, and it stores its query results in an S3 bucket.
3. Request a subscription to Free POI Data: SafeGraph Places - Entire Canada (Essential Columns) through your AWS Data Exchange Console.
4. Wait until your subscription is approved. You will receive an email notification and can then access the dataset from the AWS Data Exchange Console under “my subscriptions” > “entitled data” in the left navigation panel.

##### Accessing SafeGraph Data

5. To start working with your dataset, first export it to the S3 bucket you created previously. You can do this by choosing Free POI Data: SafeGraph Places - Entire Canada (Essential Columns) > Data set: SafeGraph Places - Entire Canada (Essential Columns) > (any available revision). Since we are interested in a single file, we can select “export selected assets to Amazon S3.”
6. In the dialog box, choose the first S3 bucket you created in the previous steps, and confirm it by clicking the “export” button.
7. You can view the progress and completion of the export job in the “jobs” table list at the bottom of the page. After the state turns into “completed”, you can start working with the dataset.

##### Query with Amazon Athena

8. To easily query your SafeGraph Core Places - Entire Canada (Essential Columns) data, you can use Amazon Athena, which gives you a SQL interface to query the dataset.
9. You can use the SQL query editor and view the result directly from your AWS Console - Amazon Athena.
10. Set up an Amazon Athena output S3 bucket by clicking on the “settings” navigation at the top right corner, and selecting the second S3 bucket previously created.
11. The geospatial accessor function great_circle_distance(), which returns the distance between two points on Earth’s surface in kilometers, is available in Amazon Athena engine version 2. Enable this engine version in your Athena workgroup using the following steps.
11a. View the Amazon Athena workgroup setting by clicking “workgroup: primary” in the top left navigation links.
11b. Select the “primary” workgroup and click “view details.”
11c. Click on “edit workgroup.”
11d. In the “query engine version” section, choose Athena engine version 2.
12. Go back to the query editor and execute the following query to create an Athena table based on your S3 bucket files. The following SQL uses core_poi_canada as the table name and refers to export-location-s3-bucket-sample as the S3 bucket. You should change this to your S3 bucket name under “location.”

Link to downloadable code.

13. After the table is successfully created, you will be asked to load the table partition. You can do so by executing this query in the query editor pane.

Link to downloadable code.

14. You can execute a sample query for determining places where top_category equals “Restaurants and Other Eating Places” within 10 km of a Tim Hortons location at 1750 Finch Avenue East, Toronto, ON (Placekey zzw-227@665-ztc-nh5).

Link to downloadable code.

15. You could also run another sample query that counts the number of Tim Hortons locations within 1 km of a particular Starbucks location.

Link to downloadable code.

##### Using an Athena Table as an Amazon QuickSight Data Source

With this data, you can leverage Amazon QuickSight to provide a feature-rich data visualization.

16. Subscribe to QuickSight. For this solution you only need QuickSight Standard Edition.
17. Create a new dataset from your Athena table. Click on “datasets” in the left navigation panel, and then the “new dataset” button on the top right of the page.
18. Choose “Athena” as your data source, and specify a data source name, which can differ from your Athena table name. You will choose the associated Athena table after you click “create data source.”
19. When you see “finish dataset creation”, choose “edit/preview data.”

##### Visualization Using Amazon QuickSight

20. In the dataset preview/edit screen, you can create a field hierarchy for your geospatial visualization.
20a. Click on the three dots on the right side of the “latitude” field, then choose “add to coordinates.”
20b. Choose “create a new geospatial coordinates.”
20c. Choose latitude and longitude for each field.
20d. Click “save and visualize” from the top navigation panel.
21. From “visual types”, choose “points on map.”
22. Click and drag the following fields to the “field wells”: latitude/longitude to “geospatial”, location_name to “size”, and top_category to “color.”
23. After you finish setting up the visualization you can publish a dashboard by clicking “share” > “publish dashboard.”
24. Specify the name you want for your dashboard, expand “advanced publish options”, and select “enable ad hoc filtering.”
25. You can interact with the dashboard by adding filters or zooming in and out on the map.

##### Removing the Data

If you decide to remove all related AWS resources to prevent costs in the future, you can do the following steps. Note: the deletion/removal steps cannot be undone.

1. Unsubscribe from Amazon QuickSight. This deletes all QuickSight-related content from the account.
1a. Go to your Amazon QuickSight console and click on your profile in the top right corner. Choose “manage QuickSight.”
1b. In the left navigation links, choose “account settings”, and click “unsubscribe.”
2. Delete the Amazon Athena table by running this query in your Amazon Athena query editor:
3. Delete your Amazon S3 bucket by selecting the bucket name from the list in your AWS Console - Amazon S3. First click on “empty” then “delete.” You will be asked to confirm several times.

##### FAQs

1. What is SafeGraph Canada Places data? It is a point-of-interest dataset covering businesses and locations across Canada, including essential attributes such as location, category, and coordinates.
2. Why use AWS Data Exchange for POI data? AWS Data Exchange allows users to easily discover, subscribe to, and access third-party datasets directly within the AWS ecosystem.
3. How is Amazon Athena used in market analysis? Athena provides a serverless SQL interface to query large POI datasets stored in S3, including support for geospatial distance calculations.
4. Can this data be visualized on maps? Yes. Amazon QuickSight can connect to Athena tables and render POI data as interactive geospatial visualizations.
5. Do I need advanced GIS tools to perform this analysis? No. The workflow relies on standard AWS services and SQL-based queries, making it accessible to analysts without specialized GIS software.

#### Coming Soon: UK Places Data

##### Key Takeaways

- SafeGraph announced the launch of Places and Geometry data for the UK, covering England, Scotland, and Wales.
- The initial beta release included over 800,000 POIs across more than 500 brands.
- UK POI coverage focused on key non-residential categories such as retail, restaurants, healthcare, education, and transportation.

This article contains outdated information; SafeGraph has since expanded our offering globally.

SafeGraph is committed to democratizing access to data for all. As the source of truth for physical places, we are continuously updating our database of non-residential places. Now that we’ve delivered over 7M points of interest (POIs) for the US and Canada, we are setting our sights a little farther away from our Denver HQ.

This spring, SafeGraph is launching Places and Geometry data for England, Scotland, and Wales.

We’re thrilled to be expanding to a new market, filling a need for reliable UK places data many of our customers have identified over the past few years. Our beta release in April will include over 800,000 POIs and their geometry for over 500 brands. And we’re just getting started.
As we do for our US and Canada datasets, we plan to continuously source more places data in the UK to increase our coverage even more.

Our initial UK offering will include POIs in England, Scotland, and Wales from the following categories:

- Restaurants
- Retailers
- Shopping Malls
- Airports
- Stadiums
- Hospitals
- Parks
- Casinos
- Nursing Homes
- Golf Courses
- Schools, Colleges, Universities

Check out our UK Sample Data dashboard, showing coverage of five of the brands included in the beta release.

To learn more about our international expansion and how to download a sample, reach out to one of our data experts.

#### Comparing SafeGraph Aggregate Spend to the Census Monthly Retail Trade Survey

The Census Monthly Retail Trade Survey (MRTS) provides estimates of year-over-year (YOY) percent change in retail spending, by region and eleven 3-digit NAICS categories. This analysis examines the relationship between Census MRTS YOY percent change estimates and SafeGraph aggregate Spend YOY percent change for the years 2021 & 2020.

SafeGraph Spend is our newest product, detailing anonymized consumer transactions at many POIs from our Places data. It's a great way of measuring economic activity and understanding how consumers interact with businesses. We will be looking at Region and NAICS aggregated Sum Spend, sub-selected from branded POIs in our Spend product, to help understand where we find agreement and disagreement with the Census Monthly Retail Trade Survey.

Observations for both Census MRTS and SafeGraph are across 11 NAICS codes and 51 US states (including Washington DC). Each observation represents the year-over-year change from 2020 to 2021, for both Census and SafeGraph Spend, for a single NAICS by Region by Month combination.

##### Histogram of Raw Data

At an initial glance, SafeGraph Spend data includes a greater number of negative YOY percent change observations.
SafeGraph data also has a greater number of observations greater than 700%.

In short, there are strong seasonal trends in outlier observations for both the SafeGraph and Census samples. Since COVID-19 and related lockdown policies upended much of American life starting in 2020, we are analyzing a time when mobility and spending experienced large and swift changes. This can be clearly seen across both data sources. We will go deeper into details on these outliers in the 'Seasonal Trends in Year-Over-Year Outliers' section of the blog.

To move forward with this analysis, the data needed to be trimmed to remove outliers. We removed anything from either source that has a YOY change estimate outside of the mean ± 2 standard deviations. There are also some especially extreme values in both samples that could unnecessarily skew the mean and standard deviation statistics, so we first trimmed off observations where either estimate is greater than or equal to 700%. While 700% itself is somewhat arbitrary, it represents an area far out on the tail of both distributions. This gave us more workable mean and standard deviation values to trim by.

##### Overall Accuracy Measurements

Now let's look at how well the two datasets match. We measure accuracy in two ways:

- Directional Accuracy: When the Census data show positive YOY growth (2021 retail trade increased over 2020), how often does the SafeGraph data show a positive value? (And vice versa for negative values.)
- Relative Ranking: What is the rank correlation between the SafeGraph and the Census data, and is it statistically significant?

SafeGraph Aggregate Spend has the same directional YOY percent change as the Census MRTS about 65% of the time. The most common disagreement, at 31% of observations, is where SafeGraph Spend data indicates negative growth while Census MRTS indicates positive growth.

Across observations, when SafeGraph aggregate Spend predicts YOY growth in a Region by NAICS, Census MRTS also predicts YOY growth almost 95% of the time.
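The trimming rule and the two accuracy measures described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used for the analysis: the input is a hypothetical list of (SafeGraph YOY %, Census YOY %) pairs, the sample standard deviation is assumed, a value of exactly zero is treated as non-positive, and the numbers are invented. It does not compute statistical significance.

```python
from statistics import mean, stdev


def trim(pairs, cap=700.0, k=2.0):
    """Drop rows at or above the cap, then rows outside mean ± k*std per source."""
    capped = [p for p in pairs if p[0] < cap and p[1] < cap]
    bounds = []
    for i in (0, 1):  # 0 = SafeGraph, 1 = Census
        vals = [p[i] for p in capped]
        m, s = mean(vals), stdev(vals)  # assumption: sample standard deviation
        bounds.append((m - k * s, m + k * s))
    return [p for p in capped
            if all(lo <= p[i] <= hi for i, (lo, hi) in enumerate(bounds))]


def directional_agreement(pairs):
    """Share of observations where both sources agree on the sign of growth."""
    return sum((sg > 0) == (census > 0) for sg, census in pairs) / len(pairs)


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out


def spearman(pairs):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    xs = ranks([p[0] for p in pairs])
    ys = ranks([p[1] for p in pairs])
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Invented observations: (SafeGraph YOY %, Census YOY %)
observations = [(-12.0, 5.0), (8.0, 10.0), (35.0, 40.0),
                (-3.0, -1.0), (900.0, 60.0), (15.0, 12.0)]
trimmed = trim(observations)
print(len(trimmed), directional_agreement(trimmed), spearman(trimmed))
```

Applying the cap before computing the mean and standard deviation mirrors the order described in the post, so that a handful of extreme values do not widen the ± 2 standard deviation band.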
While SafeGraph Spend is more likely to predict economic contraction in general, Census directional results almost always agree when we see positive growth in Spend.

Overall, Census MRTS and SafeGraph Spend share a correlation of 0.39 - a good start. However, as we will see, this overall number is hiding some important nuances in the underlying relationship. There are several places where SafeGraph Spend and Census MRTS are showing different levels, but similar ranks of NAICS category growth within a given region and month. In fact, they share a 0.53 Spearman rank-order correlation. When it comes to the highest and lowest growth sectors of a state, we are very commonly seeing the same top-performing business categories.

##### Correlation by Region

Looking at the average correlation by region, there are strong correlations across the United States. 39 out of 51 regions have an average correlation across NAICS categories greater than 0.5, punctuated by correlations of 0.8 in CO, GA, and MD.

There are also some outlier regions with low correlations. Idaho and Iowa both have negative average correlations. This is likely because we sub-selected Spend at branded POIs for this analysis, and those states have a greater proportion of retail trade measured at non-branded POIs.

Overall, there is an easily noticeable relationship between SafeGraph and Census estimates across the United States.

##### Correlation by NAICS Category

The strongest NAICS correlations are among clothing, electronics & appliance, and furniture stores. All have correlations greater than 0.45. While there is still room for improvement for some, most NAICS are showing evidence of a relationship between samples.

Some NAICS categories are ambiguously defined, and there is a valid disagreement between some brands and stores.
This may be partially responsible for the lack of agreement between Census MRTS and SafeGraph in NAICS categories such as 'General Merchandise Stores', where the correlation is low.

##### Virginia YOY Change by NAICS

How does the comparison look for a particular state?

Zooming in on a single region, both SafeGraph Spend and Census MRTS estimate high YOY growth across most included NAICS categories. In particular, both show YOY spend growth greater than 35% at gas stations. It's clear that driving behavior changed dramatically in this area from 2020 to 2021.

For many regions of interest, there is strong alignment between SafeGraph and Census MRTS, and the weakest agreement tends to be in NAICS categories that are the most poorly defined ("Miscellaneous Store Retailers", "General Merchandise Stores", "Motor Vehicle and Parts Dealers", and "Building Materials and Supplies Dealers").

##### Seasonal Trends in Year-Over-Year Outliers

Recall that each observation analyzed here is itself a YOY change measurement relative to the year prior. We would therefore expect some impact due to volatility in the reference month's numbers. For reference months with particularly low retail trade or Spend numbers, this could lead to greater variability when calculating YOY change.

We explore here seasonality in the YOY growth numbers which were removed as outliers.

Most of the outliers that were removed were in March, April, and May.

Remember that we are looking at year-over-year changes from 2020 to 2021. 2020 was an odd year, especially due to the global spread of COVID-19 and resulting lockdowns. Most observations that were trimmed, or omitted by the Census, were during the spring, precisely when the United States was undergoing massive social change, as the COVID-19 lockdown policy was in full effect.
Going forward, as we get further from the beginning of the pandemic, the impact of this will likely fade from both sources.Key Findings and TakeawaysThere is a lot that can be explored with SafeGraph Spend and the Census Monthly Retail Trade Survey. Broadly, the Census and SafeGraph are showing a shifting economy. Strong indicators point to major shifts in consumer purchasing habits over 2020 - 2021 in tandem with major disruptions in American life.Overall,SafeGraph Spend and Census MRTS share an average correlation greater than 0.5 in 39 out of 51 US regionsClothing, electronics & appliance, and furniture stores show the strongest relationship over this time period. Each NAICS category has an average correlation greater than 0.45When SafeGraph Spend estimates positive YOY growth, Census MRTS predicts positive growth almost 95% of the timeSafeGraph Spend and the Census MRTS share a rank-order correlation of 0.53 for NAICS within each region-month, demonstrating alignment of the top and bottom growth NAICS each month within each stateThe majority of outlier estimates, for both SafeGraph and Census MRTS, are in March, April, or May - yet another example of the significant disruption that occurred over this period in 2020Given the granularity of SafeGraph Spend data, all the way down to our POIs, there are nearly limitless details that can be derived and analyzed. Looking through this macro lens shows promising alignment to another source of truth, and there is more to discover under the hood. #### Comparing SafeGraph and OpenStreetMap: The Hidden Cost of Free Data Key Takeaways Free POI data from OpenStreetMap carries hidden costs in time, effort, and data quality. SafeGraph provided 100% coverage of Dollar General locations in the study area, compared to 26% coverage from OSM. Data completeness differed. significantly, with SafeGraph achieving a 95.6% fill rate versus 39.8% for OSM. 
SafeGraph’s standardized schema and documentation reduced data preparation and analysis time. Analytical workflows such as hotspot, proximity, and trade area analysis produced more reliable results using curated data.

When sourcing points of interest (POI) data, many organizations first look at free, open source options like OpenStreetMap (OSM) before choosing to pay a provider. We compared a sample of OSM data to the same query of SafeGraph data to assess differences in accessibility, coverage, completion, and usability. A sample of data from each provider was obtained for Dollar General stores in Little Rock, Arkansas to mimic the user experience of someone analyzing that specific market and brand.

TL;DR

Acquiring the data for Dollar General store locations in Little Rock from SafeGraph was more streamlined than searching multiple tools for a full dataset from OSM. However, OSM data is free and SafeGraph data does come at a monetary cost.

SafeGraph data contained 100% of the locations reported on Dollar General's website for Little Rock, while the best query tool we could find for OSM data provided only 26% of store locations and in differing formats.

While SafeGraph and OpenStreetMap provide a similar number of attribute columns about each place (28 and 24 respectively), SafeGraph's fill rate for the given sample was 95.6% while OSM's was 39.8%. SafeGraph also provided transparent and open documentation about each column and potential fill rates, while OSM data did not come with accompanying documentation.

Conducting common POI workflows using each dataset to reflect the user experience, we found that SafeGraph data produced more comprehensive and trustworthy results based on the higher coverage and completion rates.
Analysis with SafeGraph data also took less time due to the availability of transparent documentation outlining each field, and the easier acquisition process when compared to OSM.Read about our analysis in depth below.Comparing SafeGraph and OpenStreetMapOrganizations large and small are increasingly realizing the importance of geospatial data. Whether they are building a consumer-facing application guiding people on where to go, or developing an internal analytics tool that informs strategic decisions, product managers and engineers are turning to POI data as a key ingredient in their solutions.But these builders are tasked with more than just populating maps with points. They are responsible for powering a positive user experience and delivering accurate information for their end users to glean insights from. The ability to provide trustworthy data in apps, platforms, and tools is the most critical goal of a product builder, regardless of if the end user is a consumer navigating to a store or a real estate developer choosing a new store location. It doesn’t matter how quickly or inexpensively a product is built if the end users can’t rely on it to solve their challenges.That’s why product builders are frequently turning away from open source data. While open source data appears to be free, it does come at a cost when considering the negative effect poor data quality has on the user experience - not to mention the time and resources required to get open source data in a usable format.To better understand the differences in open source vs curated geospatial data, we conducted a study on SafeGraph and OpenStreetMap data. 
The goal of our study was to explore, quantify, and articulate both the technical differences between the two data sources and the impact using one over the other would have on common POI data workflows.Our study focuses on Dollar General Stores in Little Rock, Arkansas and uses the Dollar General website’s store locator as a source of truth to compare each dataset to. We assessed the following aspects of data quality in our research:Accessibility: The length of time and steps involved needed to work with each data source, from acquiring the data needed to deriving results.Coverage: The data’s ability to reflect real-world truth and the impact that ability has on performing common POI workflows and analysis.Completeness: The level of attribution associated to individual POIs in each dataset and the impact that attribution has on adding context for visualizations and analytics.Usability: The overall quality of each dataset as an input into location-based products, taking into consideration the need to clean, manipulate, or augment the data to be fit-for-purpose.OpenStreetMap vs. SafeGraph: Data AccessTo begin our analysis of each data source, we examine how accessible SafeGraph and OpenStreetMap POI datasets are.Accessing SafeGraph dataFrom the SafeGraph website, anyone can get in contact with a data expert that will help guide them through the data procurement process. Data can be delivered through a CSV or directly to a warehouse or analytics environment such as AWS, Snowflake, or CARTO.We worked with a CSV file of Dollar General locations in Little Rock, Arkansas from the SafeGraph July 2022 release. Records were filtered from the full release using the "brands," "city," and "region," columns. 
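The column filter described above can be sketched in a few lines. This is only an illustration, not the workflow used in the study: the column names "brands", "city", and "region" come from the text, while the `location_name` column, the sample rows, and the `filter_pois` helper are made up for the example.

```python
import csv
import io

# Tiny made-up stand-in for a places CSV. Only the "brands", "city", and
# "region" column names come from the article; the rows are invented.
SAMPLE_CSV = """location_name,brands,city,region
Dollar General,Dollar General,Little Rock,AR
Dollar General,Dollar General,North Little Rock,AR
Kroger,Kroger,Little Rock,AR
Dollar General,Dollar General,Little Rock,CA
"""


def filter_pois(csv_file, brand, city, region):
    """Return rows whose brands, city, and region values match exactly."""
    reader = csv.DictReader(csv_file)
    return [
        row
        for row in reader
        if row["brands"] == brand
        and row["city"] == city
        and row["region"] == region
    ]


matches = filter_pois(io.StringIO(SAMPLE_CSV), "Dollar General", "Little Rock", "AR")
print(len(matches))
```

Note that an exact match on the city string, as used here, would skip a store whose address carries a different city name, which is the kind of discrepancy the coverage comparison later has to investigate.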
Seventeen POIs were identified with a "brands" value of Dollar General, "city" value of Little Rock, and "region" value of AR.SafeGraph data for Dollar General stores in Little Rock, AR; July 2022; visualized in QGIS.Accessing OpenStreetMap DataThere are multiple programs available for extracting OSM data, and several of them were explored as options for this report.First to be tested was the Humanitarian OpenStreetMap Team (HOT OSM) export tool. By identifying an area of interest and setting the configurations to extract commercial shops, a geopackage was produced and visualized in QGIS. The results, both point and polygon, were filtered down to include only Dollar General stores; the results were one point and three polygons. Neither the point nor the polygon file had significant attribution, so we decided to look elsewhere.BBBike is an alternate tool for extracting OSM data. This tool extracts all OSM data present for a given area, so the results included buildings, places, waterways, railroads, roads, and natural features. The "buildings" shapefile proved to be the most useful, but the file only included two attributes (“name” and “type”), much of which were incomplete. A filter for Dollar General in the “name” field resulted in three polygons.Ultimately, the best tool for extracting the appropriate data proved to be Overpass Turbo. Once the area of Little Rock, AR was selected, the query wizard was used to generate statements to extract nodes (points), ways (lines), and relations tagged with the appropriate attribution (“Dollar General”). Results yielded two Dollar General point locations in the area of interest and three polygons, for a total of five distinct Dollar General locations. The field list included details like shop type, street address, business hours, website, and phone number.OpenStreetMap data for Dollar General stores in Little Rock, AR; July 2022; visualized in the Overpass Turbo user interface.OSM vs. 
SafeGraph Data Access Comparison ResultsIn terms of acquisition, the SafeGraph process is much more streamlined than the OSM extraction workflow; multiple different tools were used to extract OSM data in order to identify the best source, and each one has a bit of a learning curve for new users. The price to acquire the data from SafeGraph is marginal, especially when considering time spent utilizing Overpass Turbo, HOTOSM, and/or BBBike.It is also notable that the SafeGraph data can be easily visualized as both points and polygons, because the attribution for each point includes geometry (delivered in a column as well known text or WKT). Both point and polygon data can be exported from OSM as well, but the results of the two do not necessarily align and contain varying levels of completeness, requiring more time to be spent searching for and cleaning the data needed for the final output.In terms of accessibility, SafeGraph data proved to be easier and more efficient to acquire than OSM POI data. While SafeGraph data did come at a monetary cost, it did not require the same amount of time needed to search for and download the POIs as did the OSM data, which required a significant amount of time to obtain the data (data that was less complete at that).OpenStreetMap vs SafeGraph: Data CoverageTo measure and compare the coverage of POI data from SafeGraph and OSM, we use Dollar General’s online store locator as a source of truth. When searching the store locator for "Little Rock, Arkansas" using their default geographic filter of 10 miles, 27 POIs appear on their map.Dollar General store locator results for Little Rock, AR; July 2022.Because the SafeGraph and OSM data is being filtered and acquired using the city name of Little Rock and not a radius like the Dollar General website, we then filtered the store locator records by city name. 
Fifteen POIs were identified with an address string including Little Rock as the city name within the default 10 mile radius.SafeGraph Data CoverageComparing the SafeGraph data for Dollar General stores in Little Rock, we were able to easily match 13 of the 17 SafeGraph POIs to locations on Dollar General’s website. For the remaining four Dollar General POIs cited in the SafeGraph data, we did some digging to identify any discrepancies.Four additional POIs identified in SafeGraph’s data that were not initially matched via Dollar General’s store locator.After researching further, we were able to see that all 17 of the SafeGraph Dollar General POIs were indeed listed as operational stores on Dollar General’s store locator. Three did not originally show up because they fall outside of the default 10 mile radius, and one had a different city name listed on Dollar General’s site.Additionally, two store locations with a city name of Little Rock did appear on Dollar General’s website that were not included in the SafeGraph dataset. We did another investigation to see why and found that similar discrepancies in how the city name is attributed were the reason. The SafeGraph database does include all Dollar General POIs in Little Rock as listed on the site’s store locator, but our data acquisition filtering methodology required us to investigate a little further.Two additional POIs identified with Dollar General's store locator were not in the initial SafeGraph data download due to differences in address strings.OpenStreetMap Data CoverageOf the OSM data acquisition methods used, Overpass Turbo provided the most Dollar General locations, with the results including two points and three polygons for a total of five stores. 
All five stores were present on Dollar General’s store locator site.

Comparing these results to the 19 Dollar General locations identified on the company website (the 17 identified by SafeGraph plus the remaining two not included in our original SafeGraph query), we can see that OpenStreetMap’s data covers only 26% (5 of 19) of the Dollar General POIs in Little Rock.

OSM vs SafeGraph Data Coverage Comparison Results

SafeGraph data was found to be much more comprehensive than OSM for Dollar General stores in Little Rock. Not only are all store locations represented in the SafeGraph dataset compared to only 26% in OSM, but the SafeGraph data includes both a point and a polygon for each POI. The OSM data is a mix of the two geometry types, which is not ideal for users looking to develop uniform visualizations or analytics tools.

Explore the differences between the SafeGraph and OSM point data for Dollar General stores in Little Rock: zoom in to see differences in polygon coverage between SafeGraph and OSM. For OSM features where queries only returned polygons, we generated points for an easier visualization of the differences between the two providers.

OpenStreetMap vs SafeGraph: Data Completion

While locating points on a map is a critical function, much of the value from POI data lies in the attributes or columns associated with each location or row. Now that we have acquired the Dollar General POI data and measured its coverage of store locations, we will assess the level of detail associated with each location through the provided data attributes.

SafeGraph data fill rate

SafeGraph provides 28 columns of attributes for each location record or row. For the 17 Dollar General POIs identified by the original query, only 21 of the 476 attribute values (17 POIs × 28 columns) were incomplete (meaning the data was 95.6% complete).
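The coverage and fill-rate figures quoted so far follow from simple ratios. The sketch below reproduces them from the counts given in the text (5 of 19 stores found in OSM; 21 incomplete values out of 17 rows × 28 columns for SafeGraph). The function names are illustrative and not part of any SafeGraph or OSM tooling.

```python
def coverage_rate(found, expected):
    """Share of expected locations that appear in a dataset."""
    return found / expected


def fill_rate(rows, columns, incomplete):
    """Share of attribute values that are populated."""
    total = rows * columns
    return (total - incomplete) / total


# Counts taken from the article text.
osm_coverage = coverage_rate(found=5, expected=19)
safegraph_fill = fill_rate(rows=17, columns=28, incomplete=21)

print(f"OSM coverage: {osm_coverage:.0%}")          # 26%
print(f"SafeGraph fill rate: {safegraph_fill:.1%}")  # 95.6%
```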
Some did contain "NULL" values, but those are accounted for in the product documentation (for example, a value of "NULL" in the “closed_on” field indicates that the business has not yet closed). The attribution for all Dollar General stores was almost entirely complete and thorough, including not just location information, but phone numbers, business hours, open dates, category, and other relevant details about that place. SafeGraph also provides clear reasons why some fields are not populated, for example erring on the side of caution rather than sharing false information about a place.SafeGraph data attributes for Dollar General stores in Little Rock, AR; July 2022; visualized in QGIS.OpenStreetMap Data Fill RateIn terms of completion, the OSM data included a total of 24 fields, and none of the five features were entirely complete. Among the five of them, over half the fields were left empty, for a completion rate of 39.8%. The attribution for these features is sporadically complete; some features included addresses, websites, phone numbers, and open hours, but most were nearly void of attribution. OSM data does not provide documentation to explain why some of these fields contain "NULL" values.OSM data attributes for Dollar General stores in Little Rock, AR; July 2022; visualized in QGIS.OSM vs SafeGraph Data Fill Rate Comparison ResultsOverall, the number of columns in both datasets is comparable, in the sense that both SafeGraph and OSM have fields for details of the business: name, phone, website, open hours, and street address. However, because OSM is a crowd-sourced database, it is left up to chance whether or not these fields are filled in. Some features were complete with much of this attribution, but most of them had significant gaps. In contrast, SafeGraph had a 95.6% completion rate, providing more context about each location than the OSM data. 
The open accessibility and transparency of SafeGraph's documentation also made it much easier to identify why certain fields were incomplete, whereas no context around data completion is available for OSM data.OpenStreetMap vs SafeGraph: UsabilityOnce we compared the level of detail provided by each dataset, we set out to determine which one was more fit-for-purpose for common POI-related workflows. Using both the SafeGraph and OSM datasets, we conducted hot spot, proximity, and trade area analyses. These methods are often baked into tools in location-based analytics platforms that perform site selection, competitive intelligence, visit attribution, and other key functions. The right quality of data baked into a tool can drastically impact the output of these functions, so we decided to see the difference using SafeGraph data vs OSM had on the results.The following technical analyses were performed to determine the value of each dataset to common workflows:Hot spot analysis was conducted using kernel density estimation in QGIS, which statistically calculates areas with concentrations of a specific feature.Proximity analysis was performed using an aggregated distance matrix generated from both point and hexagonal grids overlaid on the area of interest in QGIS. Each hexagon was then symbolized by color based on proximity to a Dollar General location.Trade area analysis was conducted using Voronoi cells in QGIS, which generates polygons that are sized based on the spatial distribution of features.SafeGraph vs. OpenStreetMap Data UsabilityHot spot analysisThe hot spot analysis conducted on the 17 SafeGraph features originally queried for identified one significant cluster in the center of the city of Little Rock. Deriving this information from a business intelligence tool, the company could decide to construct or eliminate Dollar General stores based on existing hot spots. 
Similarly, a competitor could choose to open a store in an underserved area.

In order to conduct the hot spot analysis with the OSM data, the polygon features needed to be converted into points and merged with the existing point data. However, even then, there were not enough features in the dataset to conduct a reliable analysis using kernel density estimation, so a manual hot spot analysis had to be conducted instead. The results identified no significant hot spots that could be used for decision making.

Kernel density estimation analysis with SafeGraph data vs OSM data; July 2022; visualized in QGIS.

Proximity analysis

To measure the proximity of Dollar General stores to various parts of Little Rock, statistics were calculated on hexagons that define distance values. According to the SafeGraph data, there is nowhere in the city of Little Rock that is more than 5.9 miles away from a Dollar General store location; on average, a Dollar General store is within 1.9 miles of any point in Little Rock.

Using the same proximity analysis methodology with the OSM data, we found that, on average, a Dollar General store is within 3.51 miles of any given point within the city of Little Rock. Additionally, the furthest an individual can be from a Dollar General store is 9.97 miles. These distances differ sharply from those generated using SafeGraph data, and reflect how the data used for inputs into a product or tool can drastically impact the outcome of the results (and therefore the overall user experience).

Aggregated distance matrix analysis with SafeGraph data vs OSM data; July 2022; visualized in QGIS.

Trade area analysis

The final analysis generated service areas for each Dollar General location with Voronoi polygons.
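Both the proximity statistics and the Voronoi service areas rest on the same rule: each location belongs to its nearest store. The study itself used QGIS tools; the following is only a rough stdlib sketch of that rule, with hypothetical store and grid coordinates and a haversine distance standing in for whatever distance measure QGIS applied.

```python
import math

EARTH_RADIUS_MILES = 3958.8


def haversine_miles(a, b):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(h))


def nearest_store(point, stores):
    """Return (store_id, distance_miles) for the store closest to point."""
    return min(
        ((store_id, haversine_miles(point, loc)) for store_id, loc in stores.items()),
        key=lambda pair: pair[1],
    )


# Hypothetical coordinates, not real store locations.
stores = {"store_a": (34.75, -92.29), "store_b": (34.68, -92.35)}
# Hypothetical sample points standing in for hexagon centroids.
grid_points = [(34.74, -92.28), (34.69, -92.36), (34.72, -92.32)]

assignments = [nearest_store(p, stores) for p in grid_points]
distances = [d for _, d in assignments]
mean_distance = sum(distances) / len(distances)  # average distance to nearest store
max_distance = max(distances)                    # farthest any point is from a store
```

Grouping the sample points by their assigned store gives a discrete approximation of each store's Voronoi cell. With fewer stores in the input, every point's nearest-store distance can only stay the same or grow, which is consistent with the larger distances reported for the sparser dataset.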
Using the SafeGraph data originally queried for, 17 service areas were generated for Dollar General that extend even beyond the Little Rock city limits.Because of the limited number of POIs in the OSM data, the resulting service areas generated with that data do not cover the entirety of the city of Little Rock. Additionally, because the Dollar General store locations from this dataset were more dispersed, without the presence of hot spots, the range of sizes of the Voronoi cells were much smaller. A user relying on the trade areas generated from the OSM data would risk misallocating resources based on false information, and not taking into consideration cannibalization from other Dollar General stores.Voronoi polygon analysis with SafeGraph data vs OSM data; July 2022; visualized in QGIS.OSM vs. SafeGraph Data Quality Comparison ResultsOverall, SafeGraph data appears to be very nearly complete and accurate, accounting for 100% of the stores Dollar General reports itself on its website. The attribution for all features was 95.6% complete and thorough, including not just location information, but phone numbers, business hours, and opening dates. It was sufficient to conduct the three chosen analyses, and according to all three analytical outputs, there is a cluster of Dollar General stores in the center of the city.Approximately 75% of the Dollar General locations in the city were missing from the OSM dataset, and the acquisition process was lengthy and disjointed. Additionally, the attribution for these features is sporadically complete; some features included addresses, websites, phone numbers, and open hours, but most were nearly void of attribution entirely. The results of the analyses did not identify any hot spots or discrepancies in service region size. However, because of these data gaps, very little confidence can be placed in the results of the analyses. 
Products built with the OSM data would require supplemental POI sources to ensure the output of their users' analyses are reliable and trustworthy.While this study only compares SafeGraph and OSM data for one brand in one geographic location, the results can be used to infer what a larger analysis and the subsequent results would look like. This research was intended to recreate the user experience of someone performing specific analyses on a particular brand in a defined geographic region, but future comparisons could explore different markets, brands, or place categories to see how the two providers differ in other scenarios.Our overall recommendation? If you are looking for free data, OpenStreetMap is available and will allow you to put points on a map. But if you are looking to cost effectively reflect real world truth and embed reliable data into your products, SafeGraph is the right provider for you.Open source data may be free, but our analysis of its accessibility, accuracy, completeness, and overall usability uncovers that there is true hidden cost. Ready to get started with clean, accurate, and up-to-date data? Get in touch with the SafeGraph team. We’re here to help. FAQ’s 1. Is OpenStreetMap reliable for POI analysis? OpenStreetMap can be useful for basic mapping, but coverage and attribute completeness vary widely due to its crowd-sourced nature. 2. Why does free POI data have a hidden cost? While there is no licensing fee, free data often requires significant time for extraction, cleaning, validation, and supplementation, increasing operational costs. 3. How does data completeness affect geospatial analysis? Incomplete or inconsistent attributes can lead to inaccurate visualizations, flawed trade area models, and unreliable proximity or hotspot analysis. 4. Can OSM data be used for enterprise analytics? It can be used, but typically requires additional data sources and preprocessing to meet enterprise reliability and accuracy standards. 5. 
When should organizations consider paying for POI data? When data accuracy, consistency, documentation, and scalability are critical to user experience or decision-making, curated datasets provide better long-term value. #### Complete Guide to Understanding Spatial Analysis Key Takeaways Spatial analysis uses location-based data to reveal patterns that traditional analysis often misses. Geographic context helps explain why outcomes differ across regions, neighborhoods, or trade areas. Mapping and spatial modeling support clearer, more confident decision-making across industries. Accurate and regularly updated places data is essential for producing reliable spatial insights. Most data tell you what happened.Spatial analysis tells you where it happened and why that location matters.From how cities grow to how businesses expand, many real-world decisions are shaped by geography. Spatial analysis helps uncover those patterns by examining data through the lens of location, distance, and spatial relationships. 
As the world continues to evolve with advances in technology and the rise of big data, spatial analysis has become foundational to modern analytics and decision-making.In this guide, we will explore what spatial analysis is, its importance, and how it fits into real-world decision-making.What is Spatial Analysis?Spatial analysis is the practice of analyzing location-based data that has a geographic component in order to identify patterns, relationships, and trends tied to location.Rather than treating data points as isolated values, spatial analysis considers how those points relate to one another in physical space. Location, proximity, boundaries, and movements all become part of the analysis.In simple terms, spatial analysis answers questions such as:How do behavioral patterns vary across regions?What role does location play in business performance and demand-supply relationships?Why do similar outcomes cluster in certain areas only?By adding geographic context, spatial analysis turns raw data into actionable insights for better decision-making grounded in the real world. Why Spatial Analysis matters today Location has always influenced outcomes but what has changed today, is the scale and availability of location-based data. Global businesses now operate across multiple regions, markets shift faster, and consumer behaviour varies widely even within the same city. Traditional analysis often smooths over these differences. Spatial analysis makes them visible.Today retail teams want to understand why performance differs across regions. Commercial real estate analysts need to assess how neighborhood dynamics affect site visibility. Marketing and media teams look to measure how audiences move through physical spaces, not just how they behave online. In each case, location is a core variable.This is where spatial analysis becomes essential. 
By combining geographic context with behavioral and market data, spatial analysis helps organizations move beyond surface-level metrics towards location-aware insights. Instead of asking what happened, teams can understand where it happened and why that place matters.As location intelligence becomes more central to modern analytics, spatial analysis plays a key role in supporting data-driven decisions. It enables organizations to identify regional patterns, compare markets more accurately, and respond to changes in the physical world with greater confidence.For industries that operate across locations, spatial analysis is no longer a niche capability. It is a practical tool for understanding real-world dynamics and making informed decisions at scale.How Does Spatial Analysis Work?Spatial analysis follows a structured process that turns location-based data into actionable insights. While tools may vary, the workflow generally moves through data collection, analysis and interpretation.Data CollectionThe process begins with gathering data that includes geographic context, such as coordinates, boundaries, or place identifiers. This data can come from spatial datasets, remote sensing technologies, or location-based business data that captures how the physical world is organized.Spatial Analysis and ModelingOnce the data is collected, it is analyzed to understand how locations relate to one another. This includes examining distance, proximity, clustering, and connectivity between two or more places. Analytical models can be applied to detect patterns, compare regions, or identify trends that would not be visible without a spatial lens.Visualization and InterpretationVisualization plays a key role in interpreting spatial data. Maps often surface patterns that are difficult to detect in spreadsheets alone. 
By viewing data in geographic contexts, analysts can translate findings into insights that support planning, forecasting and improved decision-making.Types of Spatial AnalysisDifferent questions require different spatial approaches. Rather than relying on a single method, spatial analysis draws from a set of techniques designed to explore how location influences patterns, relationships, and outcomes.Some analysis focuses on proximity, examining how closeness to other locations affects behavior, access, or performance. Others look for clusters or hotspots, identifying areas where activity is concentrated or changing rapidly.Network-based analysis explores movement and connectivity, such as transportation routes, delivery paths, or commuter flows. Surface-based analysis examines how values vary continuously across space, such as population density, elevation, or accessibility.In more advanced cases, spatial analysis incorporates statistical and modeling techniques to measure spatial dependence, compare regions, or understand how patterns shift over time. Overlay methods allow multiple spatial layers to be combined, revealing relationships that are not visible in isolation.Each approach offers a different lens for understanding how geography shapes real-world patterns. The choice of technique depends on the question being asked, the type of data available, and the level of insight required.Real-world examples of Spatial AnalysisA common real-world application of spatial analysis is retail expansion and consolidation planning.Consider a retail brand deciding where to open new stores or close underperforming locations. 
By combining location-based data such as existing store locations, competitor presence, population density, and consumer movement patterns, spatial analysis helps identify areas of opportunity and risk.

Instead of relying solely on historical sales data, teams can visualize how customer activity flows across regions, how trade areas overlap, and where demand may shift over time. This allows decision-makers to act proactively, not reactively.

The same analytical approach applies beyond retail, from infrastructure planning to logistics optimization. What changes are the questions being asked. The underlying principle remains the same: using location-aware data to understand patterns that would otherwise remain hidden.

Data Required for Effective Spatial Analysis

Spatial analysis is only as strong as the data behind it.

Accurate coordinates, consistent place definitions, and well-maintained boundaries are essential for understanding how activity unfolds across locations. Just as important is freshness. Because the physical world changes constantly, outdated points of interest can quickly undermine analysis.

This is where working with a modern data partner matters. Reliable, regularly updated POI data ensures spatial insights reflect what exists on the ground today, not what existed months ago.

High-quality places data, including precise footprints, detailed attributes, and comprehensive coverage, provides the foundation needed to support meaningful spatial analysis across industries.

Getting started with Spatial Analysis

In today’s data-driven environment, spatial analysis has become essential for understanding patterns, relationships, and decisions that are inherently tied to location.

Getting started doesn’t require understanding every tool in depth; it only requires clarity on the questions you are trying to answer and the data you rely on. Because spatial analysis relies on accurate, regularly updated data, choosing a trustworthy data partner is a critical first step.
Explore SafeGraph to learn more about how reliable places data can support spatial analysis from the start. FAQ’s 1. How does spatial analysis show insights that regular data analysis cannot? By adding geographic context, spatial analysis reveals how proximity, distance, and location influence patterns that remain invisible in non-spatial datasets. 2. How do maps improve the interpretation of spatial data? Maps allow analysts to visually detect relationships, trends, and anomalies that are difficult to identify through tables or spreadsheets alone. 3. Why does data quality matter more in spatial analysis? Since spatial insights depend on real-world conditions, inaccurate or outdated location data can quickly lead to misleading conclusions. 4. What kind of data is needed for spatial analysis? Spatial analysis requires accurate, regularly updated data that includes geographic details such as coordinates, boundaries, or place identifiers. 5. What is the first step in getting started with spatial analysis? Begin by identifying your goals and gathering accurate location data before applying any spatial analysis techniques. 
#### Connecting Alternative Data Sources for Investment Research

Key Takeaways

Alternative data delivers the most value when multiple datasets are connected, not analyzed in isolation.
Credit card transaction data reveals spending behavior but lacks geographic and demographic context on its own.
Geospatial data adds spatial intelligence, showing where transactions occur and how performance varies by location.
Linking consumer spending, POI, and demographic data enables deeper competitive and investment analysis.
Tools like Placekey reduce friction when joining disparate alternative datasets for faster insights.

Geospatial data, like SafeGraph Places, is one of the fastest-growing areas of alternative data, with the industry expected to be valued at $96.3 billion by 2025. Credit card data is also rapidly growing in popularity and is currently the highest-grossing category of alternative data.

The financial services industry is catching on. According to a study by EY, 28% of hedge funds use geolocation data, while 38% use consumer spending data. However, Rayne Gaisford, Head of Data Strategy and Equity Research at Jefferies, says the financial services industry is just starting to scratch the surface when it comes to leveraging alternative data. Many firms are still adjusting to using alternative data as their primary source of information instead of as a supplemental afterthought.

Credit card data is fundamental to financial analysis. Understanding how consumers spend their money is one of the strongest indicators of economic performance, and hedge funds, private equity firms, and investment banks frequently apply it in their analysis of an industry or brand as they make strategic decisions.

For example, a private equity firm choosing a grocery chain to invest in might look at credit card transaction data for a few brands to see which has the stronger performance. But credit card data alone does not provide all the answers needed for a thorough analysis.
Where are those transactions coming from? Who is making them?

Source: CARTO

When combined, consumer spending and demographic datasets can reveal much more than they can independently. Connections can be made between high-performing stores or products and demographic profiles that provide valuable information for competitive analysis and investment research. Alternative data works best when connected to other alternative datasets to paint a complete picture.

Consumer spending data gives investment researchers a detailed look at key economic indicators, such as wallet share, activity segments, and cardholder behavior. Geospatial data allows those indicators to be analyzed spatially. Using Placekey, credit card transaction data can be easily joined to other datasets. This streamlines the process of combining consumer spending and geospatial data, getting to the results faster. By connecting credit card and geospatial data, insights that might have otherwise gone unnoticed can be factored into an analysis.

When combined with geospatial data like POIs and building footprints, credit card data tells a story. This has never been more important than it is today, as the global economy is changing due to COVID-19 and former predictive models become irrelevant. Chief U.S. Economist at Goldman Sachs, David Mericle, says that 2020 was a breakthrough year in his firm’s use of alternative data as they adapted to the COVID economy. More than ever, financial services firms are feeling the pressure to gain a competitive edge from alternative data.

Source: Goldman Sachs

Points of interest (POIs) can provide hedge funds, private equity firms, investment banks, and other financial organizations with crucial information about where businesses are located and where there are areas of opportunity. When combined with credit card data, these geospatial datasets reveal how transactions vary across space and time and enable researchers to see the full picture before making a strategic decision.
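The join described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the rows, the column names (`placekey`, `brand`, `city`, `total_spend`), and the helpers `join_on_placekey` and `spend_by` are hypothetical, and the key values are made-up placeholders rather than real Placekeys. The point is simply that once both datasets carry the same place identifier, spending can be attached to a location and rolled up by brand or geography.

```python
from collections import defaultdict

# Hypothetical POI rows: one record per place, keyed by a shared identifier.
poi_rows = [
    {"placekey": "pk-001", "brand": "Grocer A", "city": "Columbus"},
    {"placekey": "pk-002", "brand": "Grocer A", "city": "Dayton"},
    {"placekey": "pk-003", "brand": "Grocer B", "city": "Columbus"},
]

# Hypothetical aggregated spend rows carrying the same identifier.
spend_rows = [
    {"placekey": "pk-001", "month": "2021-01", "total_spend": 1200.0},
    {"placekey": "pk-002", "month": "2021-01", "total_spend": 800.0},
    {"placekey": "pk-003", "month": "2021-01", "total_spend": 1500.0},
    {"placekey": "pk-999", "month": "2021-01", "total_spend": 50.0},  # no matching POI
]


def join_on_placekey(spend, pois):
    """Inner join: keep spend rows whose placekey matches a known POI."""
    poi_by_key = {p["placekey"]: p for p in pois}
    joined = []
    for row in spend:
        poi = poi_by_key.get(row["placekey"])
        if poi is not None:
            joined.append({**poi, **row})
    return joined


def spend_by(joined, *fields):
    """Sum total_spend grouped by the given fields (keys are tuples)."""
    totals = defaultdict(float)
    for row in joined:
        totals[tuple(row[f] for f in fields)] += row["total_spend"]
    return dict(totals)


joined = join_on_placekey(spend_rows, poi_rows)
by_brand_city = spend_by(joined, "brand", "city")
by_brand = spend_by(joined, "brand")
print(by_brand_city)
print(by_brand)
```

In the grocery-chain example, a roll-up like `by_brand_city` is what lets an analyst see not only which brand performs better overall but where that performance comes from.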
FAQs

1. Why isn’t credit card data sufficient on its own for investment research? Credit card data shows what consumers spend, but not where transactions originate or who is making them, limiting contextual analysis.
2. How does geospatial data enhance alternative data analysis? Geospatial data adds location-based context, allowing analysts to evaluate performance across regions, store locations, and demographic segments.
3. What types of alternative data are most effective when combined? Consumer spending, geospatial POI data, building footprints, and demographic datasets are especially powerful when analyzed together.
4. How does connecting datasets improve investment decision-making? Integrated datasets reveal patterns and relationships that are invisible in siloed data, supporting more confident strategic decisions.
5. What role does Placekey play in connecting alternative datasets? Placekey provides a standardized way to join transaction, demographic, and geospatial data, reducing data preparation time.
#### Creating a Winning Map: Mapscaping Contest Winner Interview Mapscaping and SafeGraph Map Contest: Q&A with GIS Manager, Sallie Vaughn ‍ ‍ We love seeing our users put data to work—and we’re always game to shine a spotlight on new and innovative ways people are using SafeGraph Places data to solve big business problems. That’s exactly why we launch the Mapscaping and SafeGraph Map Contest. So today, we decided to chat with contest runner-up and GIS Manager, Sallie Vaughn to learn more about how she built a stunning visual model to analyze and understand the dynamics of the fitness center market in Brooklyn, New York. Then, to wrap up our Q&A, Sallie offered up a few useful tips for creating clear and simple presentations that make a huge impact. Before diving in, however, it’s worth noting that Brooklyn is home to nearly 750 fitness and recreation centers and a total population of 2.6 million people. This means there is approximately one fitness center for every 3,500 people. Sallie’s density map (see below) was able to pinpoint with incredible accuracy where the market is oversaturated as well as where it has significant room to grow—great insights for anyone looking to open up a new fitness center and/or compete in the Brooklyn market. This is also helpful for anyone who works in economic development; using data to develop visual representations of real world market dynamics can fuel innovation and provide insights that impact decision-making in a big way. If you can’t tell, we loved everything about Sallie’s map creation. But don’t just take our word for it. Let’s take a moment to hear from Sallie herself. Tell us a little about yourself. “I’m Sallie Vaughn, GIS Manager for Person County, North Carolina. As a GIS professional with 13 years of experience in location analytics, cartography, application design, and database management, I’m passionate about presenting information in a clear and compelling way. 
I’m always looking for new challenges and love solving problems with the right map. When I’m not working, you can find me playing the euphonium, biking, or hiking with my husband, Evan.” What inspired you to enter the SafeGraph Map Contest? “I love getting the chance to showcase information in new and interesting ways. We don’t get to do this all that often in local government; we tend to generate maps that aren’t as visually impressive as they could be. We mostly recreate maps based on printed documents from decades ago, so we don’t have much leverage to be creative. That’s why working on this project was so much fun. I wanted to dig into a complex question about the fitness industry and create a map depicting the areas of market saturation and potential growth for fitness centers in Brooklyn, New York. Fitness is an important part of people’s lives, whether for people wanting to stay healthy or for businesses looking for ways to serve their needs. I really enjoyed being able to take a closer look into the numbers to reveal some useful insights about the market as a whole.” ‍ Sallie Vaughn’s runner-up winning map of the fitness center market in Brooklyn, NY. ‍What drew you to fitness centers? “As a GIS Manager in local government, I regularly work with our economic development staff to analyze and depict areas of saturation and potential growth for various sectors of business and industry. Naturally, most of my work focuses on Person County, but I’m also fascinated by how to find and use data—especially point-of-interest (POI) data—on other locations. In this case, focusing on fitness centers in Brooklyn was a natural extension of these interests. I was looking for an enjoyable challenge that would also give me a good reason to work on my cartography skills. Plus, this was a great opportunity to use a little creativity.” Talk us through your process. “For data exploration, I started by importing SafeGraph data to ArcGIS Online. 
This provides the easiest and quickest way to symbolize and filter data based on attributes and then examine patterns of market oversaturation. ArcGIS Online allows you to quickly overlay population data with market data, which is useful with a complex analysis like this. Personally, I’ve always noticed that once one gym pops up somewhere, several other gyms will pop up right next to it. This makes me wonder how marketable and sustainable those businesses really are. To answer this question, I chose the NAICS (North American Industry Classification System) Code for fitness centers and spent some time digging into it to see what I could find. Like a lot of GIS folks, I can get lost for hours in a good dataset! So, once I chose a subset, I decided to stick with it. What you see on the map is the end result of all that digging.” How did you build the map with that data? “ArcGIS Online is great for initial data exploration, but I needed something different for making the map itself. So, I decided to import the data into an ESRI geodatabase and then develop the map with ESRI’s ArcGIS Desktop v. 10.6. After downloading raw population data for New York from the U.S. Census Bureau, I calculated the average number of fitness centers per person and determined the standard deviation. Then, I ran several different kernel densities to model the density of fitness centers. Once I had this information, I could begin depicting areas of market saturation and areas of potential growth.” How did you land on particular design features? “When it comes to building maps, I always let the subject guide the design aesthetic. The shape of Brooklyn lends itself well to a portrait orientation, so I created the map as an 18-inch by 24-inch document. I also added details, such as text boxes and legends, in all the right places to enhance the look and feel of the map without overshadowing the most important information. One of the biggest challenges was finalizing the map colors. 
Typically, I’m only trying to depict either oversaturation or growth potential on a map—not both at the same time. This required me to choose colors that could convey the information in a simple and visually-appealing way. I actually went through a lot of iterations before choosing the final colors. I am thrilled with the end result. I really feel this challenge gave me an opportunity to offer clear insights about a subject that is both complex and personally interesting to me.” Telling great stories requires the right data Trends and patterns, like economic development in urban areas, can be tricky to turn into clear and compelling stories. Not to mention, it can be really easy to get lost in the most minute details when working with complex sets of information. The good news: SafeGraph data and state-of-the-art GIS technology combined make it possible for anyone to make powerful and informative maps on pretty much any subject imaginable. In Sallie’s case, she relied on clean and accurate POI data to identify the parts of Brooklyn currently underserved by fitness centers and to offer up clear insights about where there’s the most room for potential future growth. This is a great example of how data can be used to understand, anticipate, and address a local population’s growing needs. A big thanks to Sallie for letting us share both her story and her fantastic map! Make your own map! If you work in economic development, urban planning, or any area involving data-driven decision making, you know just how important it is to have access to good, clean data. SafeGraph Places data is packaged up monthly in a well-organized CSV file (with release notes) to ensure you always have access to the most accurate and updated information available. Plus, as an added perk, it’s super easy to use with ESRI, the world’s leading GIS technology platform. ‍ Has this story inspired you? Ready to make your own map? Let SafeGraph get you moving in the right direction. 
Contact us to learn more! #### Cross Shopping Behaviour: See Where Else Consumers Spend Money Key Takeaways Cross shopping behavior shows where consumers spend money beyond a single brand or location. New Spend columns surface brand affinities across physical stores, online merchants, and services. Cross shopping insights add an omnichannel view to consumer spending analysis. These insights support competitive analysis, consumer profiling, and site selection. In case you missed it, this year we launched SafeGraph Spend. Spend is the first transaction dataset to associate anonymized and aggregated consumer spending behavior to hyper-accurate points of interest (POIs). Built upon SafeGraph Places, a POI dataset updated each month to reflect when businesses open, close, relocate, or change name, Spend includes information about the raw spend, volume of transactions, median spend per transaction, spend by day of the week, and offline vs online spend at each POI.With the June 2022 release of SafeGraph Spend, data scientists can derive even more insights about consumer spending behavior. This release includes new cross shopping columns that detail what other brands consumers spend money at in a given month. For example, cross shopping can show that 49% of consumers who spend money at the McDonald’s on 123 Main Street in January also made purchases at Target that same month. 
The addition of these new columns means the Spend dataset not only shows geographic patterns in consumer spending behavior, but also brand affinities.There are ten new columns included in Spend to highlight these brand affinities, all reporting cross shopping behavior by percentage for easy comparison and analysis.New cross shopping columns include:Related physical brands: What other brands with physical stores do people who spend money at this POI also spend money at?Related online merchants: What percent of people who spend money at this POI make purchases with online brands like Amazon?Related same-category brand: What percent of people who spend money at this POI also spend money at a similar business or competitor brand?Related local brand: What other brands with locations in this zip code do people who spend money at this POI also spend money at?Related delivery service: What is the affinity of people who spend money at this POI with online delivery services like DoorDash or UberEats?Related wireless carrier: What percent of people who spend money at this POI are Verizon vs AT&T customers?Related rideshare service: What percent of people who spend money at this POI spend money on Uber vs Lyft?Related buy now/pay later service: What percent of people who spend money at this POI also use buy now/pay later services like Klarna or Affirm?Related streaming service: What percent of people who spend money at this POI subscribe to streaming services like Netflix or Hulu?Related payment platform: What percent of people who spend money at this POI use services like PayPal or Cash App?Cross shopping insights provide a new dimension to the behavior of consumers in and around physical places, but also online. 
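To illustrate what a cross shopping percentage means, here is a small sketch that computes, for customers of one anchor brand, the share who also spent at each other brand in the same month. It is an assumption-laden toy: SafeGraph Spend is anonymized and aggregated, so the customer-level rows, the customer IDs, and the function `cross_shopping_pct` are hypothetical and say nothing about how the actual columns are produced.

```python
from collections import defaultdict

# Hypothetical (customer_id, brand) pairs for a single month.
transactions = [
    ("c1", "McDonald's"), ("c1", "Target"),
    ("c2", "McDonald's"), ("c2", "Target"), ("c2", "Amazon"),
    ("c3", "McDonald's"),
    ("c4", "McDonald's"), ("c4", "Amazon"),
    ("c5", "Target"),
]


def cross_shopping_pct(rows, anchor_brand):
    """Percent of the anchor brand's customers who also spent at each other brand."""
    brands_by_customer = defaultdict(set)
    for customer, brand in rows:
        brands_by_customer[customer].add(brand)

    anchor_customers = [c for c, brands in brands_by_customer.items()
                        if anchor_brand in brands]
    if not anchor_customers:
        return {}

    counts = defaultdict(int)
    for customer in anchor_customers:
        for brand in brands_by_customer[customer] - {anchor_brand}:
            counts[brand] += 1

    return {brand: 100.0 * n / len(anchor_customers)
            for brand, n in counts.items()}


affinities = cross_shopping_pct(transactions, "McDonald's")
print(affinities)
```

Reporting the result as a percentage of the anchor's customers, rather than as a raw count, is what makes affinities comparable across POIs of very different sizes.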
Discovering brand affinities between brick and mortar stores and ecommerce sites helps show how consumers engage in an omnichannel market, and can be used to build stronger products, services, and offers for customers.

Some top use cases for cross shopping insights include:

Competitive intelligence: Are your customers also shopping at competitive stores? Or are they purchasing goods online that they could be getting in your store?
Consumer insights: Do your target customers frequent specific stores or brands? Can this help you better understand what they want and provide that to them?
Site selection: Where are the best locations for my new store? How do surrounding stores promote positive consumer spending behavior?

The flexibility in geographic analysis offered by SafeGraph Spend enables data scientists to uncover consumer spending patterns and brand affinities at the level of granularity they need.

To see more brand affinity insights derived from SafeGraph Spend’s new cross shopping columns, check out our retail scorecard.

Interested in exploring cross shopping columns for yourself? Schedule a demo with one of our experts.

FAQs

1. What is cross shopping behavior? Cross shopping behavior refers to the overlap in where consumers spend money across different brands within the same time period.
2. Does cross shopping behavior include online spending? Yes. The data includes affinities with online merchants, delivery services, streaming platforms, and payment services.
3. How can businesses use cross shopping insights? They can be used for competitive intelligence, understanding customer preferences, and evaluating site selection opportunities.

#### Data Can Transform Society for the Better

Key Takeaways

Responsible access to large datasets can unlock breakthroughs in economics, healthcare, and public policy.
Longitudinal tax data has reshaped our understanding of income mobility in the United States.
Healthcare data, when securely analyzed, can improve cancer treatment and accelerate life-saving drug development.
Credit card and spending data reveal real-time patterns that support public health and economic planning.
The central challenge is balancing innovation with privacy safeguards, not choosing one over the other.

Recently “data” has been an ugly word. Cue up news on Cambridge Analytica or GDPR. But like many technical advancements, data can be used both for the benefit of society and by bad actors to attack it.

Data is currently being used to advance our understanding of society

Data about people, when used correctly, can significantly increase our understanding of society and has the potential to better the lives of everyone in the world. Better understanding of data can lead to more effective cancer treatments, reduction of traffic in cities, greater understanding of income inequality, a deeper appreciation of why people vote, and much much more.

Chetty’s IRS partnership leads to insights in income mobility

Professor Raj Chetty (of Stanford University) has been publishing some of the most interesting economic papers this decade because of his unique access to IRS data. Chetty has access to almost 30 years of tax records.
Through this data, Chetty (and his colleagues) have been able to study income mobility in the United States in a way that was never possible in the past.

Of course, the IRS data is extremely privacy-sensitive (it includes the tax returns of everyone in America) and must be handled with care. Chetty has to take many precautions when using the data (including accessing it only in IRS clean-rooms). The precautions are necessary because of the sensitivity of the data.

The collaboration between Chetty and the IRS has had a significant impact on our understanding of society. For example, Chetty found that the share of US children making more than their parents dropped from around 90% in the 1940s to just over 50% in the 1980s.

Saving lives in cancer treatments using data

Like IRS data, mobility data has the potential to be extremely sensitive. Analyzing this data can have incredible benefits to society that make it worthwhile to do the hard work to aggregate it and enable the positive benefits while protecting against possible abuse of the data.

While IRS and mobility data are sensitive, health care data is even more personal. And while there are a lot of protections of this data (like HIPAA), many institutions have deep medical data tied to exact-person identities. If used properly, healthcare data can save lives and massively enhance quality of life. But the data also has the potential to be used to take advantage of patients … so protection and controls are paramount.

IBM Watson has made a lot of strides in analyzing oncology data for good. IBM partnered with Sloan Kettering, one of the top medical institutions in the world, to analyze cancer treatments and outcomes. The hope is that the data will help doctors optimize treatments for individual cases to lead to the very best outcomes.

Flatiron Health (recently acquired by Roche for $1.9 billion) has also made significant strides in getting better cancer treatments.
In 2014 the company raised $130 million and used the bulk of that to acquire an oncology CRM software company that also develops electronic medical records systems. They are using the proprietary data to positively impact the lives of potentially millions of people.In other areas of the healthcare field, Datavant is working to use health data to help life-saving drugs come to market faster. (conflict note: I am an investor in Datavant).Credit card data can really help understand societySome of the most interesting data is individual credit card data. There are about a half-dozen companies that sell credit card data (many at the individual level). This data, of course, is very private and we would never want an individual’s data to become public.Credit card data offers significant insight into society. We can even learn a great deal from the mundane, like Second Measure’s study that more people buy flowers on Mother’s Day than Valentine’s Day. (conflict note: I am an investor in Second Measure). More interesting, we can predict flu and epidemic patterns in specific areas based on spending data. This could allow for better deployment of public health resources and lead to significant life savings.We must re-envision credit card data as a powerful tool to improve society (and not just to optimize the betting algorithms of quant funds on Wall Street).Using big data for policy problemsProfessor Susan Athey (of Stanford University) wrote (in 2017) a seminal piece in Science on how societies can better take advantage of data to help people. More access to data has amazing potential to transform the efficiency of cities, provide better health care, and help policy makers to grow the economy. 
Large, deep data sets about people have the potential to unlock some of society’s greatest secrets (like how to have a successful marriage, more effectively raise children, increase happiness, and increase altruism).

The opportunities for benefits are massive, but to reap those benefits, we also need to build protections. We can develop smart policies that protect people’s privacy while enabling innovation. We need open information to promote innovation while simultaneously building protections for user privacy.

The benefits of accessing and aggregating deep data about places and people are too large to ignore; so the great challenge is to figure out how to enable the use of data for good while also protecting individual privacy and defending against bad actors.

The ultimate goal in dealing with data is to enable society to take advantage of all this data while simultaneously protecting individual privacy. This goal will take some of the world’s most innovative minds to achieve. But it might be the most important goal to work on in our lifetime … so I have hope that we will make a lot of progress over the next decade.

FAQs

1. How can data improve society? When used responsibly, data enables better public policy, improved healthcare outcomes, smarter urban planning, and deeper economic research.
2. How did IRS data help research income mobility? Long-term tax records allowed researchers to measure intergenerational income trends with far greater accuracy than surveys.
3. Can healthcare data really improve cancer treatment? Yes. Large clinical datasets help identify treatment patterns, optimize care plans, and speed up drug development.
4. What insights can credit card data provide? Spending data can reveal consumer trends, detect early signs of epidemics, and inform economic and policy decisions.
5. How can society balance data innovation with privacy? Through strict safeguards, anonymization, secure access environments, and clear regulatory frameworks that protect individuals while enabling research.

#### Data Demos: SafeGraph Places Python Starter Notebook

Working with SafeGraph Places Data in Python? We want you to hit the ground running.

Here is an example Google Colab notebook pre-loaded with some data from SafeGraph Places to show how you can interact with and analyze some of the columns in SafeGraph Places.

You can open this notebook in your browser and work interactively with the data and the notebook in real time! This is an ephemeral copy that only you can see and edit. Please tinker with it!

This notebook shows you how to answer questions like:

How many restaurants are there in Ohio?
What is the ratio of quick-service to full-service restaurants in Ohio?
What are the most common words used in restaurant names?
Which cities in Ohio have the fewest restaurants per capita?
Which chain restaurants have the largest number of locations that are open 24–7?
Which chain restaurants are open the most hours / week?

And shows example analyses with many of the columns in SafeGraph Places:

safegraph_place_id
naics_code, top_category, and sub_category
location_name
brands
city
open_hours

To open an interactive ephemeral copy of the notebook, click here.
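For a feel of what the first few questions look like in code, here is a minimal sketch in plain Python. It is not taken from the notebook: the sample rows are invented, and the `region` column used to hold the state is an assumption, since the column list above does not name a state field. The category strings follow standard NAICS titles, but check them against your own copy of the data before relying on them.

```python
import csv
import io
from collections import Counter

# Hypothetical sample shaped like a Places extract (rows are made up).
SAMPLE_CSV = """location_name,top_category,sub_category,city,region
Burger Barn,Restaurants and Other Eating Places,Limited-Service Restaurants,Columbus,OH
Taco Stop,Restaurants and Other Eating Places,Limited-Service Restaurants,Dayton,OH
Main Street Bistro,Restaurants and Other Eating Places,Full-Service Restaurants,Columbus,OH
Burger Palace,Restaurants and Other Eating Places,Limited-Service Restaurants,Akron,OH
Lakeside Grill,Restaurants and Other Eating Places,Full-Service Restaurants,Erie,PA
Corner Hardware,Building Material and Supplies Dealers,Hardware Stores,Toledo,OH
"""

RESTAURANTS = "Restaurants and Other Eating Places"

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

# How many restaurants are there in Ohio?
ohio_restaurants = [r for r in rows
                    if r["region"] == "OH" and r["top_category"] == RESTAURANTS]
restaurant_count = len(ohio_restaurants)

# Ratio of quick-service (limited-service) to full-service restaurants.
# Assumes at least one full-service row; guard against zero on real data.
by_sub_category = Counter(r["sub_category"] for r in ohio_restaurants)
quick_to_full = (by_sub_category["Limited-Service Restaurants"]
                 / by_sub_category["Full-Service Restaurants"])

# Most common words used in restaurant names.
name_words = Counter(word.lower()
                     for r in ohio_restaurants
                     for word in r["location_name"].split())

print(restaurant_count, quick_to_full, name_words.most_common(3))
```

With a real Places file you would swap the inline sample for the CSV on disk and keep the same three steps: filter, count by category, and tally words.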
#### Data Science Interview Preparation: Top Tips for How To Nail It

Key Takeaways

Preparing for a data scientist interview requires both technical review and business context.
Understanding the specific role helps you tailor your preparation effectively.
Practicing scenario-based questions improves clarity and confidence.
Honest communication about your skills builds trust with interviewers.
Researching salary benchmarks and preparing thoughtful questions strengthens your position.

You’ve refined your resume, sent out an application, and got the call back - now you just need to nail your interview. To put your best foot forward, you’ll want to properly prepare for the interview so you can make a great first impression.

To help you do just that, we outline how to prepare for your data science interview by covering the following:

The main topics covered in a data science interview
Data scientist roles and responsibilities: what to expect
How to prepare for your data science interview
Common questions you’ll be asked in a data science job interview

Before we dive into the best strategies and things to consider when preparing for your interview, we’ll cover some of the things that are likely to come up in your interview.

The Main Topics Covered in a Data Science Interview

Despite the wide variety of unique roles in the field of data science, there are still essentials that are important to know (and that will likely come up in your interview). You’ll want to be able to show you have this foundational knowledge and experience.

While this is by no means exhaustive, below are some of the topics that you can almost guarantee will be covered in any data science interview:

Coding and programming: Experience with programming languages (whether or not it’s the specific one you’ll be using on the job) is a must for any data science job.
Experience in one language can show that you have the proficiency to learn others as required, but concrete experience with programming languages is always a valuable asset.

Product sense and business applications: Technical knowledge and skills have little value without the ability to translate them into product development and analytics that drive better business and product decisions. You’ll need to have some sense of how to apply this knowledge for success in your industry and market.

Statistics and probability: No matter what distinct data science job you’re applying for, statistics and probability are pillars that will be important. Be sure to have a basic sense of how these will factor into the role and how your knowledge and skills in this area will add value to their company and team.

Data modelling techniques: You will likely be asked about different methods of modelling data, depending on the situation, sample size, needs, and more. Ultimately, being able to discuss the method you’d use as well as the reasoning behind it is important.

Data Scientist Roles and Responsibilities: What to Expect

Data science is an ever-evolving field, with many specific roles that can vary widely depending on your industry, company, and discipline. A major component of preparing for the interview will be understanding the actual role you are applying for, including a detailed job description and an understanding of what your responsibilities and requirements will be.

From the small sample list of related job titles below, you can see how varied and wide-ranging the roles and responsibilities may be:

Data Scientist
Data Engineer
Business Analyst
Data Analyst
Data Visualization
Statistician
Data Architect
Data Science Project Manager
Machine Learning Engineer

The more you know about the job you’re applying for, the better you’ll be able to prepare for the interview.
Do as much research as possible so you know what type of data science job you’re applying to, so you understand - and can speak to - your fit in that role. This will also save you a lot of time applying to jobs that don’t fit your interests or experience.

How to Prepare for Your Data Science Interview

So you’ve landed the interview for the data science job and you want to know how to properly prepare. While interviews can be intimidating, the best way to combat this is to be prepared going into the interview.

Below are the top tips to make sure you’re ready for your upcoming data science interview:

1. Research the role and identify your fit

Read the entire job description thoroughly, and consider what responsibilities and tasks you’ll be performing. From there, you can gauge the soft and technical skills that you’ll need for the job. To really nail the interview and prepare properly, you’ll need to have a clear idea of what the role is and what the requirements will be.

Look up what the interviewer does at the company; in most cases, the main interviewer will be an immediate, or close, supervisor for the position you are applying to. Researching them and their role, and thinking critically about how your roles will interact, will be helpful during the interview (while also giving you a chance to showcase your interpersonal skills).

With a clear idea of what the role and job description are, you’ll be able to predict which topics will be covered in the interview, and better determine which topics to focus on when preparing. If you haven’t performed a relevant task since you left school, you may want to brush up on it before the interview so you know how to discuss it with confidence.

It’s also important to research industry, company, and technical terminology so you sound informed, can follow along, and can engage throughout the interview.

2.
Get an idea of what the interviewer is looking forSome interviewers are looking for someone with the hard, technical skills required to start working right away. Others are looking for someone with the soft skills and critical thinking to learn quickly, knowing that they can train them on specific software tools they use as they go. If you can get an idea of what the interviewer is looking for, you can tailor your responses to cater toward either the technical skills or soft skills and critical thinking ability.It’s also important to brush up on your previous experience, whether that be at a job, on personal projects, or challenging (but rewarding) school assignments. Being able to speak to tangible projects or experiences where you overcame challenges or produced a specific result can greatly help your prospects during an interview.If you’re given a ‘scenario-based’ question, ask as many useful, information-gathering questions as you can to better frame your response. Many interviewees feel like they need to answer a question with the information given, when on the job, you will often need to ask clarifying questions to better meet your objectives and ensure you understand your assignment. Asking clarifying questions may be something the interviewer is expecting, and at the very least it will show them that you are critically thinking about the issue they presented.If you have any ideas about solutions, mention them. Even if you don’t fully know how to implement the solution, or are missing certain components of the process. Once onboarded, these wrinkles would be ironed out, and showing that you can think of quality, innovative solutions to problems being presented will go a long way. It’s also important to factor in ethical considerations, even in made-up scenarios, as you’re showing the interviewer how you’d conduct yourself on the job.3. 
Be honest about your technical skills and software experienceWhile you do want to ‘sell’ yourself and make yourself sound appealing, don’t lie about or overly embellish your technical skills or software experience. If you don’t have SQL experience outside of the classroom, don’t pretend you do. Never blindly say “yes” to every skill they ask you about, especially if it’s a specific technical skill. The worst case situation is for them to ask you if you’re familiar with something like regression, and then be unable to answer a direct question about linear regression.Be honest and upfront about the skills you have and the skills you don’t; you’d be surprised how far this honesty and confidence can go. For many employers, they are looking at your character and soft skills as much as experience with specific data science software and tools. Technical skills can be learned, but honesty, integrity, and dedication can’t. Show them these, and you’ll convince them you’re worth training.Show an interest in the solutions they mention, and make note of them so you can research them after the interview. If they do call you for a follow up, you can impress them with what you’ve picked up since the interview, especially if you know there will be multiple rounds of interviews during the process.4. Ask about the team that you will be working withAs you start your data science career, it’s important to surround yourself with people that you can learn from. School, educational courses, and training only go so far; real-time experience is the best teacher, and you’ll want to make sure you are in a role where you can consistently develop and grow as a data scientist - no matter what your specific job is.Ask about the team that you’ll be a part of, including your supervisor and the peers you will be working with. Finding a job that will challenge you, push your boundaries, and give you opportunities to grow and develop is extremely important for advancing your career.5. 
Be prepared to discuss salaryIf you find salary discussions awkward or discomforting, you’ll want to practice your responses, or at the very least have a firm idea of what your expectations are. It’s common for salary expectations to come up in an interview, and you should be ready for this to come up at any time; sometimes they will come up in the first interview, and other times it won’t come up until the final interview.It is best to use a salary range as opposed to a single number, and you should have a salary in mind going into it. This shouldn’t just be an arbitrary amount that you expect, but a value that you can justify based on the requirements and responsibilities of the role, and the expertise and experience you bring to it. This means that your salary range will likely — and should — change depending on the role you’re interviewing for.There are a number of services that are helpful in identifying a reasonable salary range for different jobs in various industries.Glassdoor Salary CalculatorPayScaleSalary.comIn some cases, you won’t have enough information or won’t feel comfortable listing a salary range. If you don’t want to, it’s okay to tell them you don’t feel confident listing a salary. This is especially true if you don’t have a lot of information about the requirements of the role, such as the weekly hours, vacation time, benefits, and more. The base salary doesn’t always tell the whole story, so make sure to ask questions when appropriate.6. Have questions ready for your employer (and write some down during the interview)It’s a good idea to come to the interview with notes, and a pen and paper to record information throughout. Leave an area for you to jot down questions you think of that you don’t want to ask immediately. 
At the end of the interview, you can ask these, showing how well you listened and retained information (which is itself showcasing your soft skills), as well as showing how well you understand the role.While researching the role, write down questions you want to ask the interviewer if they aren’t covered in the interview. These can be a great way to better understand the role, as well as show off how much you’ve researched the company and how interested you are in joining them. You can always cross out or ignore questions that have been answered by the end of the interview.Common questions to ask your interviewer include:When do you want to hire for this position?Is this a new position or will I be replacing another person? (Will that person be providing training?)What is your preferred method of communication for follow-up?What would my typical work day look like?You can also ask some more specific questions, even turning some of the questions you got back on them as an employer. This can help them consider the points you’ve made and allow you to speak to skills and experience you may not have had the opportunity to mention.Some examples of these questions include:What are the 2 - 3 most important qualities you are looking for in an applicant?What is the worst or best quality to have in a teammate? Why?Common Questions You’ll Be Asked in a Data Science Job InterviewAlthough you can’t predict all the questions you’ll be asked in an interview, you should still try to think about what will likely be asked of you. 
Going over practice questions and technical refreshers can be extremely helpful when preparing for your interview.

Below, we list some of the best resources for finding questions you are likely to be asked:

- The Data Science Interview Study Guide | KDnuggets
- 109 Data Science Interview Questions and Answers | Springboard
- Top 50 Data Science Interview Questions and Answers for 2021 | simplilearn
- 20 Data Science Interview Questions for a Beginner | Analytics Vidhya
- Data Science Interview Guide - Questions from 80 Different Companies | stratascratch
- 9 Common Data Science Interview Questions | indeed career guide
- Ace The Data Science Interview | Nick Singh
- Datalemur | Nick Singh

Now that you know what to expect - and have all these tools at your disposal - you should be able to nail your data science interview, and get the job.

FAQs

1. How do I prepare for a data scientist interview?
Review core concepts such as statistics, probability, and modeling, practice coding problems, study the job description carefully, and research the company.

2. What technical topics should I study before a data scientist interview?
Focus on programming, data modeling techniques, statistics, probability, and real-world business applications of data science.

3. Are scenario-based questions common in data scientist interviews?
Yes. Employers often present business problems to assess your analytical thinking, communication skills, and structured approach.

4. How important are soft skills in a data scientist interview?
Very important. Interviewers look for clarity in communication, ethical judgment, and the ability to explain technical work to non-technical stakeholders.

5. Should I discuss salary expectations in a data scientist interview?
Yes. Be prepared with a justified salary range based on your experience and current market data.
#### Data-as-a-Service Bible: Everything You Wanted to Know About Running DaaS Companies Key Takeaways DaaS is not SaaS. Data companies sell structured truth, not workflows or predictions. Successful DaaS companies master three pillars: acquisition, transformation, and delivery. Data markets often become winner-takes-most, driven by accuracy, standards, and scale. Data becomes exponentially more valuable when it can be easily joined with other datasets. Early margins can look weak, but strong data businesses generate powerful long-term economics. Learn about the power of standards in our DaaS Bible 2.0, and about running a data business on our new podcast, World of DaaS.A Treatise on Data BusinessesData businesses are generally misunderstood. (That is an understatement).I’ve spent the last 13 years running data companies (previously LiveRamp (NYSE:RAMP) and now SafeGraph), investing in dozens of data companies, meeting with CEOs of hundreds of data companies, and reading histories of data businesses. I’m sharing my knowledge about data businesses here — written primarily for people that either invest or operate data businesses. I put this together because there is so much information on SaaS companies and so little information on DaaS companies.
Please reach out to me with new information, new ideas, challenges to this piece, corrections, etc. And please let me know if this is helpful to you. (this is written in mid-2019).DaaS is not really SaaS … and it is not Compute eitherData businesses have some similarities to SaaS businesses but also some significant differences. While there has been a lot written about SaaS businesses (how they operate, how they get leverage, what metrics to watch, etc.), there has been surprisingly little written about data businesses. This piece serves as a core overview of what a 21st-century data business should look like, what to look for (as an investor or potential employee), and an operational manual for executives.In the end, great data companies look like the ugly child of a SaaS company (like Salesforce) and a compute service (like AWS). Data companies have their own unique lineage, lingo, operational cadence, and more. They are an odd duck in the tech pond. That makes it harder to evaluate if they are a good business or not.1/ Data businesses are generally misunderstood. DaaS has different metrics than SaaS.While there has been a lot written about SaaS businesses (how they operate, what metrics to watch, etc.), there has been surprisingly little written about data businesses.— Auren 𝐇𝐨𝐟𝐟𝐦𝐚𝐧 (@auren) June 18, 2019Everything today is a service — data companies are no exceptionAlmost all new companies are set up as a service. Software-as-a-Service (like Salesforce, Slack, Google apps, etc.) has been on the rise for the last twenty years. Compute-as-a-service (like AWS, Google Cloud, Microsoft Azure, etc.) has become the dominant means to get access to servers in the last decade. There are now amazing API services (like Twilio, Checkr, Stripe, etc.). 
And data companies are also becoming services (with the gawky acronym “DaaS” for “Data-as-a-Service”).data business are like the babies of one SaaS parent (like Salesforce), one API parent (like Twilio), and one compute parent (like AWS).yes -- data businesses have three parents ... they are that weird.— Auren 𝐇𝐨𝐟𝐟𝐦𝐚𝐧 (@auren) June 9, 2019Data is ultimately a winner-takes-most marketLong term (with the caveat that the markets work well and the competitors are rational), a niche for data can be dominated by 1 or 2 players. That dominance does not give these players pricing power. In fact, they actually might have negative pricing power (one of the ways a company may continue to dominate a data market is by lowering its price to make it harder for rivals to compete).As a data company starts to dominate its niche, it can lower its price and gain more market share and use those resources to invest more in the data … thereby gaining more market share (and the cycle continues). Because data companies have no UI and are not predicting the future (see more in the paragraphs below), the data company can dominate by just having the correct facts and having an easy way to deliver those facts (APIs, queryability, self-serve, and integrations become very important).Of course, some data markets have no dominant player and are hyper-competitive. These are generally bad businesses. But even in these businesses with “commodity” data, one can potentially get to 50%+ market share by using price and marketing as a lever. (By contrast, it is very hard to make a competitive SaaS category less competitive … we go into why later in this post).Data is a growing businessOne of the biggest themes in the last ten years has been products that help companies use first-party data better. If you invested in that trend, you had an amazing decade. 
Those companies include core tools (Databricks, Cloudera), middleware (LiveRamp, Plaid), BI (Tableau, Looker), data processing (Snowflake), log processing (Splunk), and many, many, many more. (note: as a reminder about the power of these tools … while I was writing this post, both Tableau and Looker were acquired for a total price for almost $20 billion!)These products help companies manage their own data better.The amount of collected first-party data is growing exponentially due to better tools, internet usage, sensors (like wifi routers), etc. Companies are getting better and better about managing this first-party data. At the same time, compute costs continue to fall dramatically every year — so it is cheaper and cheaper to process the data.More and more people are comfortable working with data. “Data Science” is one of the fastest growing professions and more people are moving into the field. People are getting more technical (aided by many tools) and communities of data scientists are growing fast — KDNuggets reports “in June 2017, the Kaggle community crossed 1 million members, and Kaggle email on Sep 19, 2018 says they surpassed 2 million members in August 2018.” IBM estimates that the number of people in data science is growing faster than 20% per year.First party data is not enoughBut unless your company is Google, Facebook, Apple, Amazon, Tencent, or 12 other companies … even analyzing all your data perfectly will only tell you about 0.01% of the world. If you want to see beyond your company’s pinhole, you will need external data.Even five years ago, very few companies were equipped to leverage external data. Most companies still did not analyze their own data! But as companies get better and better at finding insights in their internal data, they will look externally for data more and more.At least, that is the bet.There are an order of magnitude more data buyers today than there were five years ago. 
IAB reports that even buying marketing audience data (which is traditionally the least accurate of all data) is a massive business and growing.Nonetheless, there are still very few data buyers today. Most companies want applications (answers), not data (which is essentially a collection of facts).The only reason to start (or invest in) a data business today is if you believe the number of data buyers will go up another order of magnitude in the next five years.Even five years ago, very few companies were equipped to leverage external data. Most companies still did not analyze their own data!But as companies get better and better at finding insights in their internal data, they will look externally for data more and more.— Auren 𝐇𝐨𝐟𝐟𝐦𝐚𝐧 (@auren) June 10, 2019Data companies are backward lookingData companies are ultimately about selling verifiable facts. So data companies collect and manufacture facts about things. For instance, you could start a data company about the Eiffel Tower, compiling historical facts about the type of steel it is made of, all the changes over the years, the height of the tower, how it responds to wind and other conditions, the biography of Gus Eiffel, and millions of photos of the tower taken from every angle and every hour of the day.Data companies are about truth. They are about what happened in the past. So much so that SafeGraph’s motto is “we predict the past.” Of course, being accurate (even about verifiable facts) is really hard (more on that later in this post). It is grueling work to get to a point that is even close to being true. And there is no possible way to get to 100% true.While data companies are about truth, prediction companies (like predicting fraud, predicting credit-worthiness, predicting elections, etc.) are about religion. 
One framework to think about data companies is truth versus religion and data versus application.Truth companies focus on facts that happened and Religion companies use those facts to help predict the future. Data companies focus more on selling the raw data while Application companies take the raw data and create some sort of work-flow around it.One way to think about the market is that Religion companies often buy from Truth companies … and Application companies often buy from Data companies. For instance, SafeGraph (a Truth Data company) has a lot of customers that are applications or religions.It is really important that data is trueOne of the weird things about data companies historically is they often failed on one core value: veracity.There is a huge trade-off between precision (accuracy) and recall (coverage). In the past, most data vendors were prioritizing coverage over accuracy. This has been especially true about “people” data (see discussion on “People Data” below) for marketing. The more entities one has data about (and the more information about each entity), the less likely that any one data element is correct.Not too long ago, much of the best data was actually compiled by hand. Some of the biggest data companies still have 3000-person call centers calling and collecting data. As data becomes easier to collect and merge programmatically, we should see more companies with accurate data reach scale.As companies rely more and more on data (and build their machine learning models on data), truth is going to be even more important. If you are using data to make predictions about the future, then the data that represents the past needs to be highly accurate. Of course, no dataset is 100% true … but good data companies strive for the truth.One thing to look for in a data company is its rate of improvement. Some data companies actually publish their change logs on how the data is improving over time. 
The faster the data improves (and the more the company is committed to truth), the more likely the data company will win its market. And there are massive gains to winning a market.

One of the weird things about data companies historically is they often failed on one core value: veracity.
As companies rely more and more on data (and build their machine learning models on data), truth is going to be even more important.
Auren 𝐇𝐨𝐟𝐟𝐦𝐚𝐧 (@auren), June 12, 2019

The Three Pillars of Data Businesses: Acquisition, Transformation, and Delivery

SafeGraph’s President and co-founder Brent Perez always likes to remind me that there are only three core things a data company does:

1) Data Acquisition
2) Data Transformation
3) Data Delivery

First Pillar: Data Acquisition is about bringing in raw materials

There are lots of ways for companies to acquire data, and every data company needs to be very good at at least one of them. Some of the ways to acquire data are:

Data co-op: getting your customers to send you the data (usually for free) in return for analytics on the data. Verisk is a good example of this. Clearbit has a great data co-op for customer contacts. Windfall Data has a great data co-op of people that spend money. Bombora has a co-op on B2B purchase intent.

BD deals: creating strong long-term business development deals to get data. These often take a long time to negotiate and can be costly. While this is a fixed cost, accounting rules often require companies to put these costs in COGS (see “Margins for most data businesses initially look very bad” below). Datalogix (which sold to Oracle in 2015) did a great job of acquiring auto data through a long-term agreement they made with Polk. BD deals tend to be very difficult because most companies over-value their data … they don’t realize the intense work it takes to actually make the data useful and to bring it to market.
That said, there are many companies that do monetize data as a byproduct of their current business model.

Public data: Search engines (like Google) are one example of companies that compile great data. They do not pay for the data directly; they instead crawl the web (which can be super costly). In this case, the costs of acquiring the data go below the line.

Second Pillar: Data Transformation

Your data acquisition might come from thousands of sources. You need to fuse the data together and make it more useful.

Even if you get your data from a few BD deals, you eventually want to graph the datasets together to ask questions across the data. This is where the real magic (and dare I say, “synergies”) happens. Once you marry weather data with the attendance of Disneyland, you can start to ask really interesting join questions. The more datasets you join, the more interesting questions you can ask. (more on this below)

Some transformations might be simple (like local time to UTC) and some might be extremely complicated. Data scientists spend 90% of their job munging data, not building models, when it really should be the other way around. So simply filtering/deduping data is in itself a valuable transformation.

Questions you want to ask include: How do you join all of the data sets together? What is your “key” (primary or secondary)? How do you ensure you are assigning the right data to the right entity (business, person, etc.)? How do you measure that efficacy/accuracy? What is the downstream impact on making the data more valuable?

If the data company is employing machine learning (and most good data companies are), this is the step where all the ML magic happens. For instance, SafeGraph uses computer vision and ML to align, register, and connect satellite imagery with street addresses and points of interest.

Data transformation is very difficult.
As Arup Banerjee, CEO of Windfall Data, reminded me: “You can’t just fix a bug with a simple fix — you can certainly do ‘post-processing’ and remove certain data points, but it isn’t as easy as A/B testing where to put the home button — you need to have a high degree of fidelity and confidence.”

Third Pillar: Data Delivery is about how the customer gets access to the data

Is it an enterprise solution where they get a big batch file (via an S3 bucket or SFTP)? Is it an API? Is there a self-serve UI? What integrations with existing platforms (e.g. SFDC, Shopify, etc.) do you have?

Does the data come streaming in real-time? Or is the data compiled monthly? Is it reliable and timely or variable and unpredictable?

Is the data well documented and well defined? Or does it contain inscrutable columns and poor data dictionaries?

Does the data document its assumptions and transformations? Are there “hidden” filters and assumptions?

Is the data organized into schemas and ontologies that make sense and are useful? Is it easy to join with internal data or other external datasets? Or does the customer have to build their own ingestion ETL pipelines to be able to work with the data?

Great Data Companies Unify on a Central Theme

Data companies need to get leverage and so the data should ultimately fit together with a common key. You need a data model where you can tie data across disparate elements, so having some sort of guiding theme is really important. For database nerds, think of a theme as a unifying primary key with a series of foreign keys across the dataset.
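For readers who think in schemas, the "primary key with foreign keys" idea can be sketched as follows. This is only an illustration using Python's built-in `sqlite3` module: the `places` theme, the `visits` and `weather` tables, and every value in them are hypothetical and do not describe any real product schema.

```python
import sqlite3

# Hypothetical schema: "places" is the central theme, and every other
# dataset points back to it through a place_id foreign key.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE places (
        place_id TEXT PRIMARY KEY,
        name     TEXT NOT NULL
    );
    CREATE TABLE visits (
        place_id   TEXT NOT NULL REFERENCES places(place_id),
        visit_date TEXT NOT NULL,
        visitors   INTEGER NOT NULL
    );
    CREATE TABLE weather (
        place_id    TEXT NOT NULL REFERENCES places(place_id),
        obs_date    TEXT NOT NULL,
        high_temp_c REAL NOT NULL
    );
""")

# Made-up sample rows, for illustration only.
conn.execute("INSERT INTO places VALUES ('p1', 'Example Theme Park')")
conn.executemany(
    "INSERT INTO visits VALUES (?, ?, ?)",
    [("p1", "2019-06-01", 1200), ("p1", "2019-06-02", 800)],
)
conn.executemany(
    "INSERT INTO weather VALUES (?, ?, ?)",
    [("p1", "2019-06-01", 24.0), ("p1", "2019-06-02", 15.5)],
)

# Because both datasets share the theme key (plus a date), they can be
# joined to ask a cross-dataset question: attendance next to temperature.
rows = conn.execute("""
    SELECT p.name, v.visit_date, v.visitors, w.high_temp_c
    FROM places AS p
    JOIN visits  AS v ON v.place_id = p.place_id
    JOIN weather AS w ON w.place_id = p.place_id
                     AND w.obs_date = v.visit_date
    ORDER BY v.visit_date
""").fetchall()

for row in rows:
    print(row)
```

Each new dataset that carries the same `place_id` becomes joinable with everything already keyed to it, which is the leverage a central theme provides.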
A central theme matters not only for data companies; great middleware companies should also have one to stitch all the data together.

Of course, the best themes are ones that everyone understands, are big enough to collect lots of interesting data, and can be internationalized.

The biggest themes of data businesses are core concepts that make up our world:

- People
- Products
- Places
- Companies
- Procedures

(we dive into each of these “themes” in the appendix at the end)

Tying static data to time

Data on these static dimensions (people, products, companies, places, etc.) becomes more valuable when it is temporal and changes with time. You can charge more for the data (and align with a subscription model) if it is changing a lot and, more importantly, you can retain customers because the data is not just a one-time use.

For instance, real-time traffic data can sometimes be more valuable than street maps. That is an example of using time with the physical world.

Another example of time crossed with the physical world is weather data: it changes all the time and is vital for many consumers and industries. In a place like San Francisco that has hundreds of micro-climates, the weather data itself can vary every hour and every 100 square meters.

One of the classic temporal datasets is price per stock ticker per time. That dataset is vital to any public market investor. The data goes back over 100 years (the “tick” a hundred years ago might be a day while the “tick” today might be a tenth of a second).

In fact, much of the most valuable data is tied to pricing over time. Examples include commodity pricing, rental pricing, prices of goods on Amazon, the Economist’s Big Mac Index, etc.

Linking datasets together makes the data much more valuable

Data by itself is not very useful.
Yes, it is good to know that the American Declaration of Independence was ratified on July 4, 1776 — that allows you to prove you are a smart person and helps you more enjoy your hot dog on Independence Day. But it does not have a lot of use in isolation.One of the big ways that data becomes useful is when it is tied to other data. The more data can be joined, the more useful it is. The reason for this is simple: data is only as useful as the questions it can help answer. Joining, linking, and graphing datasets together allows one to ask more and different kinds of questions.One of the big ways that data becomes useful is when it is tied to other data.The reason for this is simple: data is only as useful as the questions it can help answer. Joining, linking, and graphing datasets together allows one to ask more and different kinds of questions.— Auren 𝐇𝐨𝐟𝐟𝐦𝐚𝐧 (@auren) June 14, 2019One great join key is time. Nowadays, time is mostly pretty standard (that was not true a few centuries ago). And we even have a UTC Time that standardizes time zones so that an event that takes place at the same exact time in Japan and Argentina is represented as such.Another join key is location (like a postal code).The more join keys (and joined data sets) you can find, the more valuable those data become.Let’s consider a simple example. Let’s get data on a ticker and also find out where all the company’s operations are (geographically). Then let’s get a representation of the percentage that each postal code affects sales of the company over time. 
Then we can join that data (via time and geography) to historical weather to see if the weather had any correlation to the individual sites of operation and ticker price historically.As you keep joining data, the number of questions you can ask grows exponentially.As the amount of data grows, the number of questions you can answer grows exponentially.Which means that if the value of Dataset A is X and the value of Dataset B is Y, the value of joining the two datasets is a lot more than X+Y. Because the market for data is still very small, the value isn’t X*Y yet … but it is possible it will approach that in the future.Building Keys into Your Data So That They Are Easier to Join: Make It SIMPLEYour data will be much more valuable if you enable it to be joined with other datasets (even if you make no money off the other datasets). This is the #1 thing that most people who work at data companies do not understand.Most people think that they need to hoard the data. But the data increases in value if it can be combined with other interesting datasets. So you should do everything you can to help your customers combine your data with other data.One way to make data easy to combine is to purposely think about linking it — essentially creating a foreign key for other datasets.The SIMPLE acronym for data companies — ID or foreign key.Storable. You should be able to store the ID offline. For instance, I know my SSN and my payroll system stores my SSN.Immutable. It should not change over time. An SSN on a person is usually the same from birth until death (except if you enter the witness protection program).Meticulous (high precision). The same entity in two different systems should resolve to the same ID. It should be very difficult for someone to claim they have a different SSN.Portable. I can easily move my SSN from one payroll system to another.Low-cost. The ID needs to be cheap (or even free). 
If it is too expensive, the transaction costs will make it hard to use in many situations. The SSN itself has no cost.Established (high recall). It needs to cover almost all of its subjects. An SSN covers basically every American taxpayer (and more).Creating a SIMPLE key to combine your data to other datasets is the most important thing you can do to build a truly valuable data company. Unless you are planning on cornering all the data in the world, your data needs to be graphed to other datasets and the best way to do that is SIMPLE.I’d like to see a world where organizations are actively encouraged to share data as more data sharing will lead to a world of much more open information.Economics of Data Companies Are Not What They Appear to BeMargins for most data businesses initially look very bad.Data companies generally have a lot of trouble attracting Series A and Series B investors because the margins often look very bad in the beginning. Data companies often have a fixed cost of purchasing the core raw materials and for some odd accounting reason, those fixed costs sit in COGS. So the margins initially can look really bad (and sometimes can even be negative in the first year).But these “COGS” do not scale with revenue. In fact, they are just step function costs as companies go to new markets. Michael Meltz, EVP of Experian, likes to remind me that “incremental margins eventually become extremely attractive at successful data businesses.”Here is an example of a company’s numbers:Imagine for a second if you were a Series B investor looking at the business at the end of 2013. Someone with little experience investing in data companies (which is 95%+ of SaaS investors) would look at this company and think this is a long-term 50% margin business.The reality is that data costs are often a long-term asset and they only sit in COGS because of an odd accounting rule. 
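To make the fixed-cost point concrete, here is a minimal sketch with hypothetical numbers (they are not the figures from the company example above). It assumes the only COGS item is a data cost that stays flat as revenue grows; the function name and dollar amounts are invented for illustration.

```python
def gross_margin(revenue: float, fixed_data_cost: float) -> float:
    """Gross margin when the only COGS item is a fixed data-acquisition cost."""
    return (revenue - fixed_data_cost) / revenue


# Hypothetical figures: the data costs $1M a year no matter how much is sold.
FIXED_DATA_COST = 1_000_000
yearly_revenue = [2_000_000, 4_000_000, 10_000_000]

margins = [gross_margin(r, FIXED_DATA_COST) for r in yearly_revenue]

for revenue, margin in zip(yearly_revenue, margins):
    print(f"revenue ${revenue:,}: gross margin {margin:.0%}")
```

Under these assumptions, an investor who looks only at the first year sees a 50% margin business, even though each additional dollar of revenue carries no additional data cost.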
Data is a fast-depreciating asset (because much of its value is temporal), but even the historical data can have a lot of value. And it is buy-once, sell-as-many-times-as-you-can. Gathering the data itself is a significant asset: just the act of compiling the data leads to a “learning curve” moat.

SaaS companies, by contrast, spend gigantic sums on sales, marketing, and customer success. Most of those costs technically go “below-the-line” so the SaaS margins look good. In some cases, those costs really should be below the line and are just really high because the companies are mismanaged (Vista Equity has had massive success in bringing down these costs when it acquires companies). But many of these costs are hidden COGS, and the true margins of these SaaS companies are actually not as good as advertised because their markets are so hyper-competitive.

In DaaS companies, CACs (Customer Acquisition Costs) tend to decline over time (for the same customer types). In some of the best SaaS companies, CACs eventually stabilize but rarely drop significantly (Vista Equity companies seem to be the exception).

One way to see this is ARR (annual recurring revenue) per employee. Another thing to look at is net revenue per employee. Is that metric getting better over time or is it getting worse? Once the company gets to some size (say $20 million ARR), that metric should get better each year unless there is some core strategic investment reason for it to decline. If the ARR/employee is getting better, the business is likely a good one. Companies like Google and Facebook have incredibly high net revenue per employee (over $1 million per employee). But a lot of the best SaaS companies have between $100k and $200k per employee. The more net revenue per employee, the better.

A good analogy is Netflix, which aggregates consumers worldwide to justify spending money on content. Netflix spends a lot of money on content, but that cost can be amortized across all subscribers.
Of course, the analogy breaks down a bit because while data is expensive, it is nowhere near the cost of creating quality video content.

There are some data businesses that look much more like Spotify (which has to pay a percentage of revenue to the content creators). The “margins” in those businesses are much more legitimate and more permanent.

Example: “Priviconix”

Of course, there are a lot of ways to do data acquisition and they have different cost structures with different accounting rules. Let’s analyze Priviconix, a fictional company that sells data about privacy policies. It parses the privacy policies for the top 100,000 companies and offers analysis on those policies.

(By the way, this is a fictional example, but someone should start a company like this. I am happy to fund this.)

There might be a vendor that has already crawled the top 100,000 company web sites and can send you a daily file of their privacy policies. Let’s say that costs you $40,000 a year to buy. That cost sits in COGS (above the line).

Let’s say instead that you decide to do the crawling yourself. Let’s say it costs you $55,000 in salaries a year to maintain the crawl. Those costs (if you can even calculate them) go below the line.

A CEO might be tempted to go with the more expensive $55k option because it will make her margins look better. But the reality is that the data is the same. Many investors do not appreciate the distinction.

Of course, this depends on the model for sourcing the data. BD deals are really costly, but a co-op makes the margins incredibly high (often right from the beginning). Public data is hit or miss depending on the structure, accuracy, and consistency of what you can crawl.

Getting to Dominant Market Share (and Leveraging Acquisitions)

Once you have a flywheel going for a data company, you need to get to market share dominance in your niche. The goal should be to get to over 50% market share.
LiveRamp, for instance, has over 70% market share in its niche.One way to get to 50% market share is to go after a very small niche and relentlessly focus on it. Of course, you will eventually need to move to adjacent niches.Another way to dominate market share is via aggressive pricing. In the world of SaaS, this is usually not possible because CACs are too high — so lowering LTVs, even temporarily, is usually not a smart option. But with DaaS companies, CACs can be low and one can find ways to get them lower over time. If that’s the case, then there is a case of being aggressive on price using the Bezos “your margin is our opportunity” strategy.Once you have traction, a third lever to get to market share dominance is via acquisitions. SaaS companies have lots of trouble acquiring their competitors. That’s because SaaS companies have a UI — so merging those workflows is incredibly hard to do (and almost never done right). When SaaS companies acquire, they tend to acquire other products in adjacent spaces so they have more products to sell their current customers (to increase LTVs per customer). This has been an incredibly successful strategy for Oracle, Salesforce, and others. Of course, data companies can also acquire new products to sell into their customers.But DaaS companies have an additional opportunity to acquire direct competitors. These DaaS acquisitions have the potential to be much easier to be successful (and model) because they can just acquire the customer contracts (this is especially true if they already have the superior product). For instance, if there are two companies selling pricing data on stock tickers, combining those offerings is pretty simple — it is basically just a matter of buying the customer relationships and the ongoing associated revenues.The goal of getting to market share dominance is not to increase prices on your customers. On the contrary. The goal is to lower your CACs so that you can LOWER prices for your customers. 
CACs go down because there is one dominant player. LTVs go down too because prices drop. But LTV/CAC ratios don’t go down (they usually go way up). Great DaaS companies act like compute companies (think AWS) — they lower dollar per datum prices every month. So customers get more value for the money and that value compounds over time. (At SafeGraph we aim for minimum 5% monthly compounding benefit for customers — meaning the dollar per data element drops by a minimum of 5% each month).Compounding is really key for data companies. Data companies build an asset that becomes more and more important over time. But it is really hard to see the compounding in the early days so people often give up. Of course, many super profitable data companies stop innovating and just milk the tedious past work put into compiling the data (that was sometimes done decades ago).Commoditizing your complementLike all businesses, data companies want to understand their complements and their substitutes. The core complements to data business are cloud compute platforms (like Amazon Web Services (AWS), Microsoft Azure, Google Cloud, etc.) and software tools to process the data (many of which are open-source) and make sense of the data (like many machine learning platforms). The more powerful the tools and the more power available for compute, the more likely a customer will be able to buy and use data.In fact, if you are in the data-selling business, you can easily qualify your customers by finding out what other tools they are using. A customer spending lots of money on Snowflake and Looker might be much more likely to buy your data.Another thing to think about is how to commoditize your complement for data. There might be core data that makes your high-priced data more useful. In this case, you want to make sure the customers get access to that data (even if you do not sell it). One way to do that is potentially work to open-source datasets that align with your data. 
Another way is to join your data with data that is already free (like government data). At SafeGraph, we realized that many of our customers wanted to marry our data with U.S. Census data, but that data was extremely hard to download and use. So during a hack-day we created a much simpler and free download of Census Block Group data. To learn more about the commoditize-your-complement strategy, check out the detailed posts by Joel Spolsky and Gwern.

Vertical Versus Horizontal, Number of Data Buyers, and Growth of the DaaS Market

Generally, most great SaaS companies sell into a specific industry. On the other hand, DaaS tends to be more horizontal than SaaS.

Data tends to be more horizontal than software. Compute, too, is horizontal. So are many API services.

This is because data is just a piece of the solution. It is just a component. It is an ingredient, like selling high-quality truffles to a chef.

SaaS (software) is the solution. SaaS companies solve problems. So they usually need to get down into the specific issues. While SaaS companies might not be the Executive Chef, they at least position themselves to be the Sous-Chef.

Many DaaS companies sell their data not to the end user but directly to software companies. Most end customers are not yet sophisticated buyers of data, so DaaS companies go for the low-hanging fruit (which are other technology companies). Of course, this is not always the case: Windfall Data has been extremely successful selling its data to non-profits and universities (which are decidedly low-tech).

One interesting thing about the data market is that it historically has been a really bad market to sell into. Very few companies historically have had the ability to buy large amounts of outside data and make use of it. In fact, many companies struggle just to make use of their own data.

Example: hedge funds

Just five years ago, only about 20 of the 11,000 hedge funds were making use of large amounts of alternative data.
Today (2019 as of this writing) it is still only about 100 funds. But hundreds of funds are currently making the investment to get better at managing, ingesting, and using this data. So five years from now, it might be 500 funds. 500 is still only a fraction of the 11,000 funds … but it is a significant increase just in recent history.

Because the hedge fund industry is such a competitive and consolidating industry, incremental data points that could produce alpha signals are treated as scarce resources that should not be shared (once other participants know the signal, the alpha shrinks until it is gone). For a time, there was a practice of buying exclusive rights to datasets, which limited the availability of data to other hedge funds and drove up the prices of comparable datasets. Data acquisition at some of the best funds became a battleground for competitive advantage.

Hedge funds always knew the power of alternative data. Today, that industry finds itself in a more democratized state when it comes to acquiring alternative data and transforming it into insights. Computing power is cheaper, there are more (and cheaper) vendors that provide comparable datasets, and there are more qualified data scientists and engineers who can be hired and who can do their jobs better than they could five years ago.

And this is not just a trend in hedge funds. The growth in data consumption looks the same in every industry.

Partially this growth is because people are recognizing the power of data. But most of it is due to the growth in the power of the tools to manage and process the data. We use Apache Spark at SafeGraph to manage our datasets. Spark is an incredibly powerful tool and it is significantly more powerful and easier to use than the Hadoop stack (which is what we used ten years ago at LiveRamp).

SafeGraph’s customers benefit from Snowflake, Alteryx, ElasticSearch, and many other super powerful tools. New ML tools make finding insights from data easier than ever before.
These tools do a really important thing for DaaS: they increase the market for companies willing and able to buy and ingest data.It used to be that only companies with the very best back-end engineers could glean insights from large amounts of data. The best software engineers only want to work for the top technology companies — they are likely not going to want to work for a QSR like Starbucks. But now Starbucks can pay for Snowflake and have the power that only the very best technology companies had five years ago.Operational Cadence of a Data-as-a-Service (DaaS) CompanyRunning a good data evaluation processAlmost every potential customer of every data company will want to evaluate the data before making a large buying decision. Making that evaluation process easy for your customers is essential to any data company. You also want to make it easy for your salespeople (as data companies tend to have a lot of tire-kickers).One way to accelerate data buying and evaluation is having either a freemium model or some sort of self-serve model (or both). Once companies already use some data, they are pre-qualified (like a PQL — Product Qualified Lead).Upsells are important long-termIf you are a data company and your customers are benefiting from your services (and they have assessed the data and seen it to be true), then you are in position to upsell new data elements to those customers. It is generally important for data companies to be able to upsell additional data products or services over time. Often they start by selling one data product and then upsell customers with an additional catalog of data products over time.The really important thing is that you maintain quality as you add SKUs. This is hard to do so better to go slow than to dilute your brand. Most of the large data companies today have SKUs of varying quality which really hurts their brand. 
They would be better off selling fewer SKUs (or selling their competitors’ SKUs).Data agreements and how data is actually soldData can be sold in many dimensions. By volume, usage rights, SLA, and more.One thing that all data agreements have in them is specific rights for the buyer. These rights outline what the buyer can do with the data. For instance, many data agreements are time-bound — which means the data needs to be deleted after the agreement is terminated or expires. Most agreements do not let the end buyer resell the data but some might have limited resell rights or discuss what can and cannot be done with derivative data. These data rights can be extremely complicated so it is generally good for your organization to standardize them and not have lots of different data rights for each customer.Fraud, watermarking, and moreOne of the problems with “data” is that it is easily copied. For centuries, mapmakers had to contend with their maps being copied and stolen. Starting 500 years ago, many cartographers added fake data to their maps (like fake streets or even countries). Then if they saw that reproduced, they knew it was theirs.Today many data companies add watermarks to their data. Essentially they will mix in tiny bits of fake data into the real data so they can track it. Super sophisticated data companies will have different watermarks for each customer — so they can trace a data breach to a specific customer.The per-seat model to using dataMany data companies do not actually sell data downloads (“data by the kilogram”) but instead sell SaaS-like per-seat licenses to a tool that allows users to download the data and make use of it. Innovative companies like CoStar, Reonomy, Clearbit, Second Measure, Esri, Verisk, etc. have some version of this. The per-seat subscription model makes pricing simpler and also can make it much easier to protect data. 
But the per-seat model also means that your company will need to build a user-interface, analytics, and more. So that likely means you are quickly going to be competitive with lots of other solutions (you will not be able to sell your data to your analytics competitors).Getting data into a workflow can be really powerful. Alex MacCaw, CEO of Clearbit, often reminds me that “the data isn’t useful unless it’s in the place it needs to be. Thus building great integrations and workflows is a key edge for companies competing.”Your business model and approach will vary greatly depending on your dataset, partners, vertical, and competition.Software versus data.Right now, most companies spend way more on software than they do on data. They also usually have more than 20 times the number of software vendors as they do data vendors. Alexander Rosen, GP at Ridge Ventures, mentioned “Will this be different in twenty years? I think it will.”It is hard for data companies to get started because they are just data and not the full solution. It is also hard because so much of the data out there is of poor quality — so you need to get above the noise to get any customers.The good news is that as software (like Snowflake, etc.) gets more powerful, evaluating the data (during the buying process) will get easier.Data Companies Are the Unsexy ArchivistsWorking at a data company is like being an archivist at the Library of Congress. You know your job is important but you also know it is a supporting role that helps other people shine. Your job is to help and support innovators.There are very few monuments to archivists. They don’t win Nobel Prizes. They don’t write the Constitution, they only preserve it. Being an archivist means being extremely humble. You are an unsung hero. Your job is to help the innovators innovate. 
You are not the race car driver, you are the pit crew (or maybe just the person who built the wrench).

Some people are naturally excited about the role of being an archivist. They are excited to be in the background and have the intrinsic self-worth of playing the core supporting role, like the lighting engineers in a Broadway play. But not everyone is suited to be behind the scenes, and those people should not start or work at data companies.

(Note: if you are excited about the mission to be an archivist, join us in a career at SafeGraph.)

> #DaaS will be at the center of data marketplaces & #data cleaning products/services. Here's what I see as the critical elements of successful DaaS #startups. https://t.co/f1oqKCyXUn
> -- Alexander Rosen (@Rosenrosen), November 30, 2018

Appendix: Data Themes

“People” is a very common theme

People: data around a person. Data can be tied together with an email address, social security number, phone number, advertiser ID, cookie, name and address, and many other ties. Data companies that focus on data about people include Experian, Clearbit, People Data Labs, FullContact, and Windfall Data. Middleware companies can also base their data model on people (LiveRamp does that). Almost every company that uses the word “identity” to describe its services is likely based on a “person” theme.

Privacy of People Data

One of the problems with having a “People” theme is the great responsibility of protecting people’s privacy. None of the other data themes (Organizations, Products, Places, Procedures, etc.) have big privacy issues, but for People Data, privacy is THE ISSUE. This is especially true in today’s world of GDPR, CCPA, pending federal regulation, calls for privacy by (and greater scrutiny of) Apple, Google, Facebook, etc. Protecting data on consumers becomes paramount.
Even though the consumer is usually not the data company’s direct customer, one needs to do everything to make sure that she benefits from the end use of that data.As you get more data on people, you also open yourself to attacks from the outside (because data about people can be used to steal money from people) … so security becomes really, really important.One of the good things about data on people is it’s hard to access and not widely available (or requires a partner network to access). Often the privacy problem can be a feature (instead of a bug), which creates defensibility and a moat around anyone who can aggregate it.Truth is hard to assessOf course, a HUGE problem with data about people is that it is very difficult for a customer to check if it is true. So most customers discount the data and assume it is really bad (which means it is hard to charge a premium for better data).The overwhelming winds in the people data business have been moving in a direction that may make it increasingly difficult to have a third-party people data business (that does not have a direct relationship with the consumer). New regulations (like GDPR) can put a lot of burdens on people data companies … but these regulations also create a ton of opportunities for those that do it right and really aim to protect consumers.“Products” themeAnother great theme is one on products (or SKUs). You can aim to cover all products (like the barcode) or a subset of products.Most of your electronics (like your smartphone, laptop, TV, etc.) carry a serial number that uniquely identifies that device. One could start an entire data company around understanding these serial numbers or other identifiers of SKUs.One example is R.L. Polk (now part of IHS Markit) which has traditionally collected data about cars. Their aim was to be the best data source about a car. And not just the make and model of a car … but about the actual individual car. 
So they use foreign keys like the license plate number and the individual Vehicle Identification Number (VIN).Products are really important and they can be really niche. For instance, you can build a great wine intelligence business selling information on wine bottles. Another nice thing about products is that they have no privacy concerns… you can collect whatever you want on them as long as you do not connect them to a human.“Companies” has been a good businessAnother good theme historically has been selling data about companies (or organizations).Dun & Bradstreet runs the DUNS number to uniquely identify a company. DUNS is used by many organizations (including the U.S. government, the UK government, the United Nations, and more). It has been around since 1963 but only became more of a standard in the last 20 years. Dun & Bradstreet signed a contract with the U.S. federal government which was instrumental in making DUNS a standard. For example, companies must register for a DUNS number in order to work with the federal government or file certain documents with the federal government.Not only do many governments and organizations use the DUNS as a standard, but it is also often required to get certain services (like many bank loans). Because DUNS is a standard, disparate organizations can share information easily on a company. For instance, if a bank wants to repackage its loan to a business, it uses the DUNS number to describe that business so that all other parties can better evaluate the loan (because they have information they trust on the businesses).Another example of data tied to a company is the stock ticker (and all the financial data that joins on it).“Places” is how you think about the physical worldOne of the oldest forms of data is information about a place.Maps have been with us for millennia. Since maps of countries and cities don’t change that much, there has always been a worry from the cartographers that their work would be copied. 
So starting in the 1500s mapmakers began inserting fake places into their maps, including fake streets.

SafeGraph (where I work) focuses on information about Places. As of this writing (June 2019), SafeGraph focuses on places where one can spend money (like paying by cash or credit card for something) or spend time (like parks). The data includes things like store hours, address, category of place, geometry (e.g. building polygon), IP address of the place, and more. SafeGraph publishes its full schema. As you can see there, everything is connected to a place (via the SafeGraph Place ID).

There are many other super successful place businesses.

One amazing places business is CoStar (their market cap is over $20 billion as of this writing). They have detailed information on commercial real estate rentals (like price per square foot, lease length, and more). Originally they collected the data by calling brokers (today they get much of the data directly from the big landlords in a big data co-op).

CoreLogic sells data about residential properties (like last transaction price of a home, number of bedrooms, square footage, etc.). Many of the B2C web sites that have home value data get it from a place like CoreLogic.

“Procedures” is a bit different: it is instructions on how things are done

A “procedure” is data about a particular action. These are most common in the medical field. For instance, “Lasik surgery” is a procedure, which might have certain expertise, time length, equipment, and price tied to it.

Procedures tend to be more complex data elements than people, places, companies, or products because they often combine many people/products/places into one action.
But procedures still have their own IDs, own codes, etc.

Special thank you to Brent Perez (President of SafeGraph), Michael Meltz (EVP Strategy at Experian), Ryan Fox Squire (Product at SafeGraph), Nick Singh (Marketing at SafeGraph), Alex MacCaw (CEO of Clearbit), Alex Rosen (General Partner at Ridge Ventures), Arup Banerjee (CEO of Windfall Data), Erik Matlick (CEO of Bombora), Scott Howe (CEO of LiveRamp), Andrew Steinerman (Equity Research at JP Morgan), Sean Thorne (CEO of People Data Labs), Joel Myers (CEO of Accuweather), Jeff Lu (General Partner at Flex Capital), Mike Babineau (CEO of Second Measure), Will Lansing (CEO of FICO), and many others for their help in putting this together.

If you want to work at a data company, we encourage you to join us at SafeGraph. If you are an entrepreneurial-oriented superstar, please also email me directly.

FAQs

1. What is Data-as-a-Service (DaaS)? DaaS is a model where companies collect, standardize, and deliver structured datasets that customers use for analytics, applications, and decision-making.
2. How is DaaS different from SaaS? SaaS delivers software workflows. DaaS delivers factual datasets that power those workflows.
3. What are the three pillars of a DaaS company? Data acquisition, data transformation, and data delivery.
4. Why do DaaS margins initially look low? Data acquisition costs often sit in COGS, making early margins appear weaker than they are.
5. Why are standards important in data businesses? Standards and consistent join keys make datasets easier to combine, increasing their long-term value.

#### DataChronicle Architecture: Achieving Traceability for Large-scale Data Processing in SafeGraph

Key Takeaways

- Traceability ensures Spark jobs produce reproducible results by versioning data, models, and configurations.
- DataChronicle provides a unified abstraction layer over Apache Iceberg, Delta Lake, and MLflow to simplify versioned data operations.
- An automated auditing system links Spark applications to exact input versions, improving debugging and accountability.
- The ingestion framework converts dynamic data sources into versioned tables with optional Change Data Capture support.
- Rollout tools like the SparkSQL Extension Shim enable large-scale adoption without requiring extensive code changes.

SafeGraph is a leading geospatial data company that specializes in curating and managing over 45 million global points of interest (POI). Our comprehensive dataset includes detailed attributes such as brand affiliation, advanced category tagging, open hours, and precise polygons. To produce and deliver this high-precision data product to our customers, we build our data processing stack on top of Apache Spark and run 1,000+ Spark applications daily.

Traceability in the context of data processing refers to the ability to trace data changes and consistently recreate the same results from data processing workflows. Traceability is an important factor in delivering accurate data products and in strengthening our internal ability to debug and fix data issues. Like many other companies, we encountered the significant and prevalent challenge of achieving traceability.
Some typical examples include:

- Without any source code change, rerunning an “unchanged” Spark application did not produce the same results as its original run, and we were not able to pinpoint the factor leading to the inconsistent results.
- A table in the data warehouse suddenly exhibits an unexpected data distribution, and we cannot locate which Spark application wrote the unexpected data to the table.

Specifically, we recognize several significant challenges in striving for traceable data processing:

Complexity of tracing the inputs: Data processing involves a multitude of inputs that are constantly changing, intricate to monitor, and pivotal to the outcome of processing jobs. For instance, the content of AWS RDS tables is subject to real-time modifications. Machine learning models might be updated in production without downstream users being informed, since notifying every user is a complex and sometimes unattainable task. Versioning and tracing all these inputs requires careful consideration, ranging from choosing the right data formats to handling intricate cases such as data schema evolution and concurrency control. A comprehensive solution that can track all these diverse inputs remains elusive in the existing market.

Conversion overhead from dynamic to versioned inputs: Our data infrastructure contains a diverse array of data sources, ranging from AWS RDS instances to assorted S3 buckets or vendor-specific software housing critical data. These sources, unfortunately, do not offer a well-defined means of versioning data to facilitate traceability. Requiring users to build data ingestion pipelines for every type and number of data sources would impose an immense burden in both development effort and ongoing maintenance.

Final stretch to traceability: Even assuming the successful development of the required technologies, a final challenge emerges: implementing these solutions across thousands of data processing jobs.
The shift to versioned table formats like Delta Lake or Apache Iceberg entails recalibrating configurations across a multitude of data processing jobs. This undertaking can be daunting, entailing not only configuration modifications but also the introduction of novel concepts and terminology that users need time to absorb. Thus, the need for a rollout mechanism becomes evident: one that eases the user's burden by simplifying configuration and the learning of new concepts. Additionally, we must devise a mechanism for users to trace and query the inputs associated with a particular Spark application. This linkage enables effective debugging and tracking at the level of individual data jobs. To realize this, we need an efficient, comprehensive auditing mechanism that minimizes dependency on engineers manually adding audit records through specific APIs.

To address these challenges, SafeGraph has developed an architecture called DataChronicle. In this blog post, we delve into the design and implementation of DataChronicle and highlight how it operates in SafeGraph's production environment, serving as a robust foundation for building high-quality data products.

DataChronicle Architecture

The above diagram depicts the DataChronicle architecture. DataChronicle consists of the following components:

Data Warehouse & Model Registry: The data warehouse and model registry are the components that version the inputs for users’ Spark applications. Our data warehouse is built on the Apache Iceberg format [1]. This choice supports not only large-scale data storage but also versioning at its core. Apache Iceberg also integrates Change Data Capture (CDC) functionality, which highlights the differences between successive versions of a data table.
We build our machine learning model registry on top of MLFlow which provides the functionality ranging from versioning ML models to recorded metrics for each ML experimental iteration.Ingestion and User Spark applications: Within SafeGraph, we have ingeniously developed an ingestion framework based on Apache Spark. This empowers users to seamlessly configure and initiate Spark applications, which in turn retrieve data from dynamic sources like AWS RDS, S3 buckets, and other vendor-provided solutions. The extracted data is then stored within the Data Warehouse, preserving its fidelity. Furthermore, the User Spark applications read the tables produced by the ingestion framework and create new tables in the data warehouse. The harmonious integration of these applications with the DataChronicle library is pivotal. This library takes charge of generating and tracking versions for an array of inputs utilized by data processing tasks—ranging from fundamental data inputs to intricate machine learning models.Audit Logging: A vital facet of our architecture encompasses an automated mechanism dedicated to auditing the input versioning pertinent to each Spark application. These audit records chronicle the versions of data/model/configuration that are associated with each Spark application. This repository of historical versions is securely stored within the audit log storage, forming a comprehensive record of the data processing jobs.Users have the ability to retrieve the data versions linked to each Spark application which read/write this data within the audit storage. These version numbers can then be utilized as parameters when initiating Spark applications, with the intention of replicating specific results. 
Moreover, the CDC (Change Data Capture) tables in the data warehouse help users pinpoint anomalous changes in the data.

In the next sections we give details of each component in DataChronicle.

DataChronicle Library

The DataChronicle library is central to the DataChronicle architecture. It acts as the interface connecting data processing jobs with the data warehouse and model registry. Through its APIs, jobs can read/write versioned tables in the data warehouse and access models from the registry.

The above figure shows how Spark applications interact with the data warehouse and model registry through the DataChronicle APIs. The DataChronicle library serves two purposes:

- It provides an abstraction over the underlying table formats (e.g. Delta Lake and Apache Iceberg) as well as the model registry implementation (e.g. MLFlow).
- It provides consistent configuration across all Spark applications in SafeGraph, for easier management and to avoid confusion.

Unified Abstraction

Instead of exposing the APIs from Apache Iceberg and MLFlow to users directly, DataChronicle masks the technical details and provides unified APIs like read/insert/upsert/delete for the data warehouse and uploadModel/fetchModelByStage/fetchModelByVersion for the model registry.

This design significantly simplifies our data platform engineering work and makes it future-proof.

As illustrated in the above figure, when users create a new table managed by the DataChronicle library, they can choose either Delta Lake or Apache Iceberg as the underlying format. Once the format is selected, users can use the DataChronicle library APIs to generate versioned tables and perform a wide range of operations, from basic tasks like reading and writing to more advanced tasks like incremental reading and upserting.
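To make the idea of a format-agnostic API concrete, here is a minimal sketch in plain Python. It is not the DataChronicle library: the class and function names (`TableBackend`, `FakeDeltaBackend`, `FakeIcebergBackend`, `create_table`) are hypothetical, and the two backends are in-memory stand-ins that do not talk to Delta Lake or Apache Iceberg. It only illustrates how job code can stay the same while the format behind the interface changes.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Optional

Row = Dict[str, object]


class TableBackend(ABC):
    """Operations every table format must offer behind the unified API."""

    @abstractmethod
    def insert(self, rows: List[Row]) -> int: ...

    @abstractmethod
    def upsert(self, rows: List[Row], key: str) -> int: ...

    @abstractmethod
    def read(self, version: Optional[int] = None) -> List[Row]: ...


class _SnapshotBackend(TableBackend):
    """In-memory stand-in: every commit appends a full snapshot."""

    def __init__(self) -> None:
        self._snapshots: List[List[Row]] = [[]]  # version 0 is the empty table

    def _commit(self, rows: List[Row]) -> int:
        self._snapshots.append(rows)
        return len(self._snapshots) - 1  # new version number

    def insert(self, rows: List[Row]) -> int:
        return self._commit(self._snapshots[-1] + list(rows))

    def upsert(self, rows: List[Row], key: str) -> int:
        merged = {r[key]: r for r in self._snapshots[-1]}
        merged.update({r[key]: r for r in rows})  # replace matches, add the rest
        return self._commit(list(merged.values()))

    def read(self, version: Optional[int] = None) -> List[Row]:
        index = -1 if version is None else version
        return list(self._snapshots[index])


class FakeDeltaBackend(_SnapshotBackend):
    """Placeholder for a Delta Lake implementation (hypothetical)."""


class FakeIcebergBackend(_SnapshotBackend):
    """Placeholder for an Apache Iceberg implementation (hypothetical)."""


_BACKENDS = {"delta": FakeDeltaBackend, "iceberg": FakeIcebergBackend}


def create_table(fmt: str) -> TableBackend:
    """The table format is chosen once, at creation time."""
    return _BACKENDS[fmt]()
```

In this sketch, switching a job from one format to the other only changes the argument passed to `create_table`; every later `insert`, `upsert`, or `read(version=...)` call is untouched.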
There are 2 major reasons leading us to design an abstraction on top of Delta Lake and Apache Iceberg:Uncertainties of open source projects: During the initial phase of our endeavor to establish a foundation for traceability in data processing jobs over two years ago, we encountered notable uncertainties surrounding the open-source projects Delta Lake and Apache Iceberg. In particular, Apache Iceberg was still in its early stages of development with various missing features and bugs. Concerning Delta Lake, we encountered challenges when relying on the open-source version of Delta Lake, as it significantly differed from the private version used by Databricks.Complexities in Delta Lake/Apache Iceberg: Another key reason driving the development of an API layer on top of Delta Lake and Apache Iceberg is the intention to shield users from certain complexities inherent in these two formats. For instance, with Delta Lake, the absence of dynamic partition overwrite led to using replaceWhere [2][3]. Educating a growing team on its use became challenging, prompting the creation of DataChronicle library APIs to generate replaceWhere statements automatically. Similarly, earlier Apache Iceberg versions posed schema “time-traveling” difficulties [4], addressed by APIs that abstracted these intricacies.The unified abstraction offered by the DataChronicle library allows us to swiftly establish the groundwork for traceable data processing while ensuring adaptability to future advancements in open-source technology. We began with Delta Lake due to its robustness and gradually transitioned to Apache Iceberg, which evolved to be more mature and gained more community traction. By encapsulating Delta Lake APIs within DataChronicle library APIs and calling them in data processing jobs, the shift to Apache Iceberg required minimal adjustments. 
Users are not burdened with learning new concepts or SQL commands exclusive to Apache Iceberg; they continue with the familiar DataChronicle APIs. Migration mainly involves specifying Apache Iceberg instead of Delta Lake as the format when creating versioned tables.

Consistency Across Data Processing Jobs

Handling common scenarios consistently is pivotal to avoiding confusion and unexpected behavior across data processing jobs.

One scenario we address is a schema change between two versions of data. A silent schema change upstream is one of the most common root causes of cascading failures downstream. Inconsistency between data processing jobs on this point, e.g. some jobs silently allowing schema changes while others explicitly reject them, brings even more challenges. Within the DataChronicle library, we adopt a conservative approach of disallowing schema changes by default. Nevertheless, we offer users the flexibility to enable schema changes and configure notification mechanisms, including Slack integration.

Another example is concurrency control. This addresses situations where multiple Spark applications, or multiple threads within a single Spark application, use the same version of an Apache Iceberg table as input, generate different outputs via different data transformations, and then concurrently commit those outputs as new versions. A uniform strategy for this scenario simplifies user tasks and boosts job reliability without failing conflicting commits too aggressively.

For instance, we enable up to 1000 retries for append-only concurrent commits from the same Spark application. Additionally, we use a distributed lock based on DynamoDB to coordinate commits across different Spark applications, sparing users from intricate distributed system challenges.
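The retry policy for append-only commits can be pictured with a small optimistic-concurrency sketch. This is an illustration under stated assumptions, not the DataChronicle or Apache Iceberg implementation: `VersionedTable`, `CommitConflict`, and `append_with_retries` are hypothetical names, the table lives in memory, and the cross-application DynamoDB lock is not modeled at all.

```python
import threading
from typing import List


class CommitConflict(Exception):
    """Raised when the table moved past the version a writer started from."""


class VersionedTable:
    """In-memory stand-in for a table with atomic, versioned commits."""

    def __init__(self) -> None:
        self._guard = threading.Lock()  # makes the compare-and-commit atomic
        self.version = 0
        self.rows: List[object] = []

    def try_commit(self, expected_version: int, new_rows: List[object]) -> int:
        with self._guard:
            if self.version != expected_version:
                raise CommitConflict(
                    f"expected v{expected_version}, table is at v{self.version}"
                )
            self.rows.extend(new_rows)
            self.version += 1
            return self.version


def append_with_retries(
    table: VersionedTable, new_rows: List[object], max_retries: int = 1000
) -> int:
    """Retry an append against the latest version until it commits.

    Appends do not depend on existing rows, so re-basing on a newer
    version is always safe; that is why retrying is acceptable here.
    """
    for _ in range(max_retries + 1):
        base_version = table.version  # re-read the latest version each time
        try:
            return table.try_commit(base_version, new_rows)
        except CommitConflict:
            continue  # another writer won the race; try again
    raise CommitConflict(f"gave up after {max_retries} retries")
```

Several threads calling `append_with_retries` on the same table will all eventually commit, each producing its own version number, instead of failing on the first conflict.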
The consistent distributed locking strategy ensures consistent performance during concurrent commits, without users needing to tackle extra complexities.Ingestion FrameworkThe purpose of the ingestion framework is to enable users to easily pull data in dynamic sources like RDS, Label Studio and S3 buckets, etc. and save them in Apache Iceberg format in the data warehouse.As shown in the below figure, we build our ingestion framework on top of Apache Airflow and Spark. Airflow serves as the orchestration engine to periodically trigger the job and Apache Spark is the main processing engine responsible for data reading and writing.‍Dynamic Ingestion DAG GenerationUsers in SafeGraph can easily define their ingestion Spark applications using a YAML file, which is hosted on our config server [5]. Below is a YAML configuration that demonstrates setting up a daily job for ingesting data from an S3 bucket and an hourly job for ingesting data from an RDS source.‍‍Tailoring our approach to diverse data sources, we grant users distinct configuration capabilities. As an example, users can define configuration entries for Spark applications to effectively read CSV files from S3. For instance, they can specify the data source type with source_type (s3 or aws rds), they can define the s3 inputs format with source_s3_format, e.g. csv, they could also regulate the behavior of Spark applications behavior like whether to automatically change the ingested table schema with merge_schema. These configurations extend to source-specific Spark configurations like specifying whether the CSV file contains headers, etc.For the streamlined scheduling of data ingestion Spark applications via Airflow, we've engineered a DAG generator within our Airflow environment. This dynamic generator loads the aforementioned YAML file from the configuration server, crafting a unique DAG for each designated scheduling interval. 
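As a rough sketch of the grouping step the DAG generator performs, the snippet below takes an already-parsed configuration and plans one DAG per scheduling interval with one task per data source. It deliberately avoids importing Airflow or a YAML parser. The keys `source_type`, `source_s3_format`, and `merge_schema` come from the description above; the `name` and `schedule` keys, the DAG and task naming scheme, and the function `plan_dags` are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, List

# Stand-in for the parsed YAML file. `name` and `schedule` are hypothetical keys.
INGESTION_CONFIG = [
    {"name": "events_csv", "schedule": "@daily", "source_type": "s3",
     "source_s3_format": "csv", "merge_schema": True},
    {"name": "orders", "schedule": "@hourly", "source_type": "aws rds",
     "merge_schema": False},
    {"name": "labels", "schedule": "@hourly", "source_type": "s3",
     "source_s3_format": "csv", "merge_schema": False},
]


def plan_dags(config: List[dict]) -> Dict[str, List[str]]:
    """Group sources by schedule: one DAG per interval, one task per source."""
    tasks_by_schedule = defaultdict(list)
    for source in config:
        tasks_by_schedule[source["schedule"]].append(f"ingest_{source['name']}")
    return {
        f"ingestion_{schedule.lstrip('@')}": sorted(tasks)
        for schedule, tasks in tasks_by_schedule.items()
    }


if __name__ == "__main__":
    for dag_id, task_ids in plan_dags(INGESTION_CONFIG).items():
        print(dag_id, task_ids)
```

A real generator would turn each planned entry into an Airflow DAG object whose tasks submit the ingestion Spark application; the plan above only shows how sources map to DAGs and tasks.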
Within each DAG, individual airflow tasks are automatically instantiated, and each task corresponds to the specific data sources outlined in the configuration.This design elegantly abstracts the intricacies of data ingestion into a data warehouse, shielding users from unnecessary complexities. Their sole requirement is to draft a YAML file and commit it to our dedicated configuration server. Our ingestion framework seamlessly manages all aspects, encompassing tasks spanning data reading and writing, as well as the orchestrated execution of Spark applications.Changed Data CapturingWe empower users not only to generate standard ingested data tables, but also to create Changed Data Capturing (CDC) tables that explicitly highlight additions, deletions, and updates in the data.To understand how CDC helps to capture changes in the table, take the following as the example. The below table contains 2 columns, v1 and v2. Two rows exist in the table (v1=1, v2=1) and (v1=2, v2=2)‍After we UPSERT (v1=1, v2=2), (v1=2, v2=3) and (v1=4, v2=4) based on column v1. The table has evolved into the belowHowever, with only the above table we cannot tell the origin of each row as there are multiple possible operations which could generate this table.Below is an instance of a CDC table generated using Apache Iceberg.The ensuing CDC table showcases captured data changes following the UPSERT operation. By showing (v1=1, v2=1) and (v1=2, v2=2) have been deleted and the corresponding new rows (v1=1, v2=2) and (v1=2, v2=3) being INSERTED, we can easily build the change lineage. With this table, users can easily track the data changes between two ingestions. 
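The UPSERT example above can be reproduced with a small pure-Python diff between two table snapshots. This is only a sketch of the idea: it does not use Apache Iceberg's changelog functionality, and the function name `changelog` and the `("DELETE" | "INSERT", row)` record shape are assumptions chosen for illustration.

```python
from typing import Dict, List, Tuple

Row = Dict[str, int]
Change = Tuple[str, Row]


def changelog(before: List[Row], after: List[Row], key: str) -> List[Change]:
    """Describe how `before` became `after` as DELETE/INSERT records.

    An updated row is reported as a DELETE of the old row followed by
    an INSERT of the new row, mirroring the CDC table described above.
    """
    old = {row[key]: row for row in before}
    new = {row[key]: row for row in after}
    changes: List[Change] = []
    for k in sorted(old.keys() | new.keys()):
        if k in old and old[k] != new.get(k):
            changes.append(("DELETE", old[k]))
        if k in new and new[k] != old.get(k):
            changes.append(("INSERT", new[k]))
    return changes


before = [{"v1": 1, "v2": 1}, {"v1": 2, "v2": 2}]
after = [{"v1": 1, "v2": 2}, {"v1": 2, "v2": 3}, {"v1": 4, "v2": 4}]

for operation, row in changelog(before, after, key="v1"):
    print(operation, row)
# DELETE {'v1': 1, 'v2': 1}
# INSERT {'v1': 1, 'v2': 2}
# DELETE {'v1': 2, 'v2': 2}
# INSERT {'v1': 2, 'v2': 3}
# INSERT {'v1': 4, 'v2': 4}
```

A downstream job that wants incremental processing could keep only the `INSERT` records, which are exactly the rows that are new or changed since the previous ingestion.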
Additionally, they can parse the table contents and implement incremental processing by consuming only the updated rows.

Final Stretch to Traceability

As emphasized at the beginning of this blog post, achieving traceability in large-scale data processing requires the right rollout strategy in addition to technical solutions like the DataChronicle library. To finish this final stretch, we focus on two problems:

- Automatic auditing to associate Spark applications with the versions of their inputs
- Rolling out changes to Spark applications to leverage Apache Iceberg

Automatic Auditing Mechanism

While the DataChronicle library and ingestion framework make data inputs traceable for data processing applications, the automatic auditing mechanism puts this traceability into practice by linking input versions to their corresponding Spark applications.

The cornerstone of the automatic auditing mechanism is the audit log, whose value is illustrated by the examples at the start of this blog post:

- Disparate Outputs on Re-run: When a Spark application produces divergent outputs upon re-execution, having a record of the data version it initially read helps determine whether the output discrepancy stems from differing inputs.
- Unforeseen Data Distribution: If a table within the data warehouse unexpectedly shows an irregular data distribution, knowing which Spark application generated a specific table version helps pinpoint the origin of the anomaly and streamlines subsequent debugging.

Requiring users to explicitly make a record for every read/write in Spark application code is neither reliable nor sustainable. Therefore, we implement the automatic auditing functionality in the DataChronicle library by wrapping the main functionality with code extracting necessary information like version number, Spark application ID, etc.
and sending them to the audit logging storage system.

In our automatic audit logging mechanism, we handle a typical race condition on audit data. A Spark application can have multiple threads running read/write Spark jobs, and similarly, multiple Spark applications can read/write the same table at the same time. Consider the following scenario:

1. Spark application A writes a new version of the table, version X.
2. Spark application B writes a new version of the table, version Y.
3. Spark application A fetches the latest version number of the table and makes an audit log entry in the audit storage system.
4. Spark application B fetches the latest version number of the table and makes an audit log entry in the audit storage system.

In this case we get a wrong audit log entry due to the race condition: version Y is associated with Spark application A.

To resolve this challenge, we attach a version tag to the audit log. The version tag consists of the Spark application id, the thread id, and the epoch timestamp at which the audit log entry is generated. Correspondingly, the DataChronicle library provides functionality for searching for a version by its version tag. In the above scenario, even if a new version written by Spark application B becomes the latest version of the table, Spark application A can still find the latest version it wrote itself by searching for versions whose tag carries its own Spark application id, thread id, etc., and eventually generate the correct audit log entry.

With automatic auditing built into the DataChronicle architecture, users need neither remember to record every read/write nor deal with the complexities of guaranteeing the correctness of audit log entries.

SparkSQL Extension Shim

One of the biggest challenges in leveraging Apache Iceberg to version inputs is rolling out changes to 1000+ Spark applications.

One of the critical changes involves SparkSQL's Catalog concept, introduced in Spark 3.x and used by Apache Iceberg to manage tables. Instead of referring to a table as “DatabaseName.TableName”, users need to refer to Apache Iceberg tables as “CatalogName.DatabaseName.TableName”. We would need to teach this new concept to a growing team. Additionally, users could configure inconsistent catalog names for Apache Iceberg across different Spark applications, leading to inconsistency and non-reusable source code.

To resolve this issue, we have developed a module called SparkSQL Extension Shim in the DataChronicle library. As Apache Iceberg registers the catalog name via the SparkSQL extension [5], the SparkSQL Extension Shim calls private APIs in Spark to invisibly inject CatalogName configurations into the SparkSQL extension. In each Spark application, when users call any DataChronicle library API for the first time, the relevant configuration is automatically injected into the SparkSQL extension. Therefore users do not need to make any code change to add configurations.

The SparkSQL Extension Shim not only integrates Apache Iceberg with Spark applications but also hides internal concepts like Catalog from users and eliminates concerns about inconsistent catalog names across Spark applications.

Summary

In this blog post, we unveiled DataChronicle, the foundation of SafeGraph's architecture for extensive data processing traceability and for shipping precise data products to customers.
Central to this is the DataChronicle library, offering versioned input read/write functionalities while maintaining unified abstraction and consistency across Spark applications. Our streamlined ingestion framework simplifies the conversion of dynamic inputs into traceable, versioned forms. Finally, we achieve seamless tracing through automated auditing and integration with SparkSQL, minimizing overhead.

FAQs

1. What is data traceability in large-scale data processing? Data traceability is the ability to track input versions, code execution, and outputs so results can be reproduced and audited consistently.
2. How does DataChronicle improve Spark job reproducibility? It versions data and machine learning models, logs input versions automatically, and allows jobs to rerun using specific historical versions.
3. Why does SafeGraph use Apache Iceberg and Delta Lake? Both formats support versioned tables and time travel. DataChronicle abstracts their complexity while preserving traceability features.
4. What is Change Data Capture (CDC) and why is it important? CDC records inserts, updates, and deletes between table versions, enabling incremental processing and precise change tracking.
5. How does automated auditing help debug data issues? It connects each Spark application to the exact data and model versions it used, making it easier to identify the source of inconsistencies.

#### Demystifying the SafeGraph Facts

Key Takeaways

- SafeGraph sells factual data about physical places, including location, store hours, and geometry.
- The company does not sell data about individuals and maintains a publicly available data schema.
- SafeGraph supports diverse use cases across research, logistics, local search, real estate, advertising, government, and healthcare.
- More than 15,000 researchers and hundreds of journalists use SafeGraph data, with broad self-serve access.
- The company emphasizes transparency, data accuracy, and privacy-safe democratization of access.

SafeGraph sells facts about places. We strive to be a super transparent company and have always made our full data schema available online. Our mission is to empower data scientists working on humanity’s hardest problems.

The SafeGraph Places dataset includes where businesses are located, when they’re open, and what neighboring businesses surround them. Any and all of the places in our datasets are easily searchable online. It’s all out there. We even list our bugs and errors every month.

SafeGraph just sells facts.

Most of the organizations that use our facts want to know about things like the store hours of the local cafe. Store hours actually change a lot (during peak COVID they were changing weekly) and a lot of people want to know that information.

We also have data about the geometry of a physical place, like the shape of your local gym or all the parking lots in your metro area.

SafeGraph only focuses on the truth. So here are a few truths: We only sell data about physical places (not individuals).
Our data is available to anyone to buy. Our schema is public. There is nothing hidden. Even VICE (an online news organization) bought our data and we would never prevent them from getting the data. We also give the SafeGraph data away for free to 15,000+ researchers and academics who use the data in amazing ways. We also have over 100 reporters from some of the nation’s best press institutions use the data. We encourage you to get the data and see it yourself.We service tons of different use cases. Here are the main use cases of the SafeGraph data:Research – 15,000+ researchers and academics use the SafeGraph data resulting in hundreds of major research papers.Logistics – we have a lot of data about warehouses, train depots, ports, and more. So the data is very helpful for logistics. This is especially important today when there are so many supply bottlenecks.Local search – one of the big use cases for SafeGraph data is helping put places on a map. Our data can be very helpful in a search for “Italian restaurants near me.”Real estate planning – many of the largest retailers use the SafeGraph data to figure out where they should put their store. Also some of the big real estate buyers (like large PE funds) use SafeGraph data to figure out what to buy.Adtech - most of our advertising customers are in the Out Of Home category where they are trying to figure out where they should deploy new ad assets and they also use our data for compliance (like to make sure there are not alcohol ads near schools).Government – our government customers include the CDC (for health policy), Federal Reserve (to help understand the economy), and many local and state governments (mainly for things like urban development, understanding food deserts, and transportation planning). We have great data on parking lots. 
Contrary to some belief, we don’t have any law enforcement customers.Healthcare - for real estate planning and logistics about how to deliver better care.We build facts about physical places and that’s all we do. We have competitors that also sell this type of data … but we think we’ve been successful because we have focused on the veracity of the data.Part of democratizing access to data means making it available in a self-serve way. But of course, making data convenient and accessible also has drawbacks. It means we aren’t able to fully control who buys the data. But we’ve never tried to censor or hide anything.But there are always extreme hypothetical corner cases, and in some cases these are worth actively preventing.We will still have Places and Geometry data about Family Planning Centers (like their locations and operating hours). Family Planning centers like Planned Parenthood make their location data public because they want to serve their constituents.These decisions are never easy and there will certainly be more conflicting situations in the future. SafeGraph is committed to remaining dynamic and advancing our mission of democratizing data in a privacy-safe way. FAQ’s 1. What kind of data does SafeGraph sell? SafeGraph sells factual data about physical places, such as business locations, hours of operation, categories, and spatial geometry. 2. Does SafeGraph collect or sell personal data? No. SafeGraph focuses only on data about places, not individuals. 3. Who uses SafeGraph data? Researchers, logistics providers, real estate firms, advertising companies, healthcare organizations, and government agencies use the data for planning and analysis. 4. Is SafeGraph’s data publicly documented? Yes. The full data schema is publicly available, and the company publishes known issues and updates regularly. 5. Is SafeGraph data available for academic research? Yes. 
SafeGraph provides free access to thousands of researchers and academics for non-commercial research purposes.

#### Developing a Site Deselection Strategy with Regional Trends in Business Closures

Key Takeaways

- Site deselection strategy helps retailers proactively reduce risk before locations begin to underperform.
- Retail analytics insights reveal regional and industry-specific closure patterns that inform smarter decisions.
- Brick-and-mortar closure trends vary significantly by metro area, even within the same state.
- Economic downturn retail data can signal early warning signs for vulnerable trade areas.

Retailers and commercial real estate analysts are no strangers to site selection. The success of a brick-and-mortar store is often dependent on its location and the market conditions of the surrounding area. For example, a quick service restaurant that has popular lunch options has a higher chance of success if located near office buildings than it does if located in a rural, sparsely populated area. Identifying the right spot for a new brick-and-mortar location often comes down to identifying success factors for other stores and finding lookalike markets to expand into. For years, retail analytics insights have increasingly incorporated local market data to give users an edge in site selection.

But what about when a business is looking to close locations?
Sure, they can evaluate store performance and close locations that are losing money. However, this method relies on a lagging indicator rather than taking a proactive approach to cutting costs. Retailers who truly stay ahead not only have a site selection strategy, but also a site deselection strategy. When a brand is struggling or anticipating economic uncertainty, retail analytics strategies become even more important as they reveal insights that can make or break a business.

What is site deselection?

As with choosing new locations to open, deciding where to close an existing location requires thorough investigation and analysis, and is often conducted using analytics platforms powered by location data.

Site deselection involves looking at trends in how consumers are interacting with a brand’s stores, their competitors, and their complementary brands to anticipate which locations will underperform in the future, and then acting on that. Combined with the latest market landscape data, including which businesses have opened and closed nearby and how they intersect with trade areas, these insights enable brands to make strategic decisions before a location truly starts to underperform. In many ways, site deselection follows the same steps and requires the same inputs as site selection but focuses on risk indicators instead of opportunities.

Site deselection in a recession economy

Having a solid site deselection strategy is critical for any brand in any economy, but even more so during (or ahead of anticipated) times of economic downturn. For example, the pandemic economy drastically altered the quick service restaurant industry, resulting in some brands experiencing increases in demand and others closing shop due to a lack of customers.

So with some economists warning we may be headed for a recession, how can brands best prepare to weather the storm?
While a recession economy is different from a pandemic economy, there are learnings brands can glean from their COVID-19 experience. Insight into where competitors and complementary businesses are opening and closing can indicate how trade areas are changing, while consumer behaviour data offers clues into how demand may shift. Leveraging economic downturn retail data through analytics platforms enables retailers to build a proactive approach to site deselection. Q2 2022 Economic Insights:  Regional Closure Trends by MSA To see how businesses in the US are faring in these uncertain economic times, we looked at regional trends in place closings for the month of April 2022 using SafeGraph Places data. Percent of Places Closed in April 2022 by State- MAP First, we look at overall place closures by state. States shaded blue saw fewer closures, while states shaded orange saw more. Hawaii had the most closures, with 0.36% of all places open in March being closed in April. However, at the state level, businesses appeared relatively stable. Percent of Places Closed in April 2022 by Metropolitan Area- MAP When drilling down to a more granular level, we can see more distinct brick-and-mortar closure trends. Auburn–Opelika, Alabama and Michigan City–La Porte, Indiana had the highest percentage of places closed in April, at 0.56% and 0.54% respectively. More variation is visible at the MSA level than at the state level. For example, while California appeared high in closures at the state level, MSA-level data revealed substantial regional differences. Percent of Restaurants Closed in April 2022 by Metropolitan Area- MAP Looking at restaurant closures by MSA shows that the industry is stable in some regions and changing rapidly in others. The Casper, Wyoming metro area saw 2.23% of restaurants close in April. However, the same metro area saw only 0.23% of retail stores close during the same timeframe. 
How Retail Analytics Platforms Inform Site Deselection

Percent of Retail Stores Closed in April 2022 by Metropolitan Area- MAP

With these regional and industry insights, retailers can identify which markets are likely to expand or contract in coming months. This enables brands to develop proactive site deselection strategies that help them stay ahead of the competition. Each month, SafeGraph pulls similar insights for the retail and restaurant industries to provide a snapshot of how markets are performing.

How does SafeGraph track open and close information month over month?

The SafeGraph team sources our Places data from a variety of sources, including publicly available store locators that many brands offer online. When a brand updates their store locator to reflect a closure, that change is ingested into our pipeline. Learn more about our opened_on and closed_on columns.

Resources for site deselection

Ready to learn more about the data needed for effective site deselection? Here are some resources to help you get started:

- How to turn a bunch of data into a site deselection strategy webinar
- Retail site selection checklist
- A guide for real estate site selection

Conclusion

A proactive site deselection strategy is no longer optional in an uncertain economic environment. By combining retail analytics insights, brick-and-mortar closure trends, and economic downturn retail data, brands can move beyond reactive cost-cutting and make informed, forward-looking decisions. Regional analysis and timely market intelligence enable retailers to protect performance, reduce risk, and respond strategically to changing consumer behaviour.

Schedule a Free Demo to See How SafeGraph Data Powers Site Deselection

FAQs

1. What is site deselection? Site deselection is the process of identifying underperforming locations and proactively deciding which sites to close based on data-driven risk indicators.
2. How does a site deselection strategy differ from site selection? Site deselection follows the same steps and requires the same inputs as site selection, but focuses on risk indicators instead of opportunities.
3. Why are regional closure trends important? Brick-and-mortar closure trends vary significantly by metro area, making regional analysis essential for accurate decision-making.
4. How do retail analytics support site deselection? Retail analytics insights reveal consumer behaviour, competitive dynamics, and market shifts that signal future performance risks.
5. How does economic downturn data help retailers? Economic downturn retail data helps brands anticipate demand changes and adjust their physical footprint accordingly.
6. What data is required for effective site deselection? Location data, consumer behaviour data, competitive intelligence, and market closure trends are all key inputs.
7. Where can I find resources for site deselection? SafeGraph offers multiple resources for site deselection, including webinars, guides, and analytics tools to support decision-making.

#### Do Republican Counties Drink More Than Democrat Voting Counties?

Key Takeaways

- County-level alcohol spending shows only a weak correlation with Democratic vote share.
- Republican vote share has an even weaker relationship with beer preference.
- Per capita normalization is essential for meaningful comparison.
- Most of the work involved cleaning and linking mismatched public datasets.
- Reliable dataset joins are critical for meaningful cross-domain analysis.

Divided by Politics, United by Beer

We’ll explore whether alcohol spend at the county level predicts voting behavior by analyzing open government datasets. Through this process, we’ll also demonstrate the difficulties data scientists face when joining and linking open datasets (and SafeGraph’s upcoming solution to this problem).

First, we’ll start with a proxy for the amount spent on alcohol in different counties. Fortunately, under the Texas Tax code, alcohol permittees must report how much they made on alcohol.
So, we’ll use that data to obtain the total amounts spent on liquor, wine, and beer in each county per day: a straightforward aggregation, by the county where each store is located, of the amount reported divided by the number of days the store reported.

In order to find out which party people are affiliated with, we’ll use the results of the 2016 US General Election as a proxy, assuming that people don’t switch parties most of the time and that voters’ affiliation is representative of the population of the county. Remembering that not everyone is registered to vote, we’ll also get a population estimate by county so that we can normalize each county’s drinking per capita, assuming that Texas counties didn’t substantially grow or shrink during the reporting period.

Already we find ourselves facing a couple of problems:

- We needed to go find these three datasets.
- They don’t really join, since
  (a) the Texas tax data is keyed off Texas County Number, an internal numbering system;
  (b) the Census data is keyed off the county’s FIPS code;
  (c) the US General Election results are reported by the county’s name.
- And finally, because of that previous problem, we’re going to have to build ourselves a path from each dataset to the others.

So now we’ll have to go get something that links a county’s name to its Texas County Number and its FIPS code. Usually, this is where we’ll also run into spelling differences and minor things like some data sources suffixing their county names with the word “County” and others not doing so.

But now that we have all of this data, we can join it together and look for, say, a linear correlation.

As it so happens, our vast adventure in acquiring and cleaning all this data led to yet another null hypothesis that we failed to reject.
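The linking step described above can be sketched in a few lines of Python. This is only an illustration, not the code behind the analysis: the county names, Texas County Numbers, FIPS-like codes, and all figures below are made up, and the helper names (`normalize_county`, `link_counties`, `pearson_r`) are hypothetical. It shows a crosswalk table acting as the bridge between three datasets that each use a different key, followed by per capita normalization and a Pearson correlation.

```python
import math


def normalize_county(name: str) -> str:
    """Lower-case a county name, collapse whitespace, drop a trailing 'County'."""
    cleaned = " ".join(name.strip().lower().split())
    if cleaned.endswith(" county"):
        cleaned = cleaned[: -len(" county")]
    return cleaned


def link_counties(crosswalk, spend_by_tx_number, population_by_fips, dem_share_by_name):
    """Join three datasets keyed by county number, FIPS code, and name."""
    votes = {normalize_county(n): s for n, s in dem_share_by_name.items()}
    linked = []
    for row in crosswalk:
        key = normalize_county(row["name"])
        spend = spend_by_tx_number.get(row["tx_number"])
        people = population_by_fips.get(row["fips"])
        share = votes.get(key)
        if spend is None or people is None or share is None:
            continue  # skip counties missing from any one source
        linked.append(
            {"county": key, "spend_per_capita": spend / people, "dem_share": share}
        )
    return linked


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Invented data for illustration only; none of these values are real.
crosswalk = [
    {"name": "Alpha County", "tx_number": 901, "fips": "99001"},
    {"name": "Bravo County", "tx_number": 902, "fips": "99003"},
    {"name": "Charlie County", "tx_number": 903, "fips": "99005"},
    {"name": "Delta County", "tx_number": 904, "fips": "99007"},
]
spend_by_tx_number = {901: 1000.0, 902: 2000.0, 903: 3000.0, 904: 4000.0}
population_by_fips = {"99001": 1000, "99003": 1000, "99005": 1000, "99007": 1000}
# Election results keyed by name, with the inconsistent spellings described above.
dem_share_by_name = {"Alpha": 0.2, "bravo county": 0.1, "CHARLIE COUNTY": 0.4, "Delta ": 0.3}

linked = link_counties(crosswalk, spend_by_tx_number, population_by_fips, dem_share_by_name)
r = pearson_r(
    [row["spend_per_capita"] for row in linked],
    [row["dem_share"] for row in linked],
)
print(f"r = {r:.2f}, r^2 = {r * r:.2f}")
```

Silently skipping counties that are missing from any source keeps the sketch short; in a real analysis it would be worth counting and inspecting those dropped rows, since they usually point to exactly the spelling mismatches mentioned above.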
It looks like there’s the weakest of correlations (about r=0.25, for an r²≈0.06) between how much a county votes Democrat and how much money it spends on alcohol.

And our search for interesting correlations must continue elsewhere, since there’s only a weak correlation (r=0.17, r²≈0.03) between how much a county votes Republican and how much of a preference it has for spending money on beer over wine or liquor. On the bright side, it’s quite heartening to know that no matter where we are on the political divide, the great commonality we have is how much we like to spend on our booze.

The analysis here is fairly simplistic, and this isn’t a particularly statistically robust result, but it illustrates that a large portion of our time is often spent identifying which datasets contain the statistics we’re interested in, cleaning up the data in them, and identifying a way to join the datasets together, even when all the data is open access. At SafeGraph, we’re now working on a tool to find, structure, and join disparate open access datasets. When common keys exist for datasets, the ability to join them opens up vastly more opportunities to study relations.

#### Do you vote like a Toyota? Or a Ford?

Key Takeaways

- Brand distribution across congressional districts often aligns with voting patterns.
- American car dealerships overindex in Republican-leaning districts, while foreign and luxury brands are more concentrated in Democratic districts.
- Coffee and bookstore brands show milder but noticeable partisan skews.
- Regional banking presence, particularly Regions Bank, strongly aligns with Republican districts.
- Retail geography reflects broader demographic and political divides.

How Voting Preferences Are Tied to Local Brands

We know there are differences between Republicans and Democrats on political issues.
But what about the brands they love?

We analyzed SafeGraph Places, a dataset of 6 million points of interest in the US & Canada, to determine whether your voting habits are related to the cars you drive or the places you buy your books. We proxied how popular a brand is by how many of its stores are present in a particular congressional district. It turns out that the popularity of some products is quite an indicator of how we’d vote.

Car Preference by Political Party

Car owner stereotypes are common and well-known, and while it’s no surprise that Democrat-leaning congressional districts (which tend to be urban and have higher incomes) have more luxury cars, it’s also true that partisan districts have quite large differences in the distributions of makes even for non-luxury cars.

Of the 435 congressional districts in the 115th Congress, the ones that most voted for Trump had dealerships that were 25% more likely to be Ford, Chevrolet, and GMC than the US average. Dealerships in districts that went strongly for Clinton were twice as likely to be BMWs as average. A higher percentage of dealerships in these districts were foreign makes (Volkswagen, Toyota, Subaru, for instance, all over-represented by 25–50% over the US average) and a lower percentage of American makes. These results even look similar with California (which supplies most of the Democrat-partisan districts) removed.

[Figure: Where Fords are popular]

[Figure: Where Volkswagens are popular]

The top districts that went for Clinton have far more dealerships that sell luxury cars (notably, three times as many Lamborghinis or Teslas as average) and fewer dealerships in total. Since congressional districts have populations roughly the same by law (the largest is less than twice the size of the smallest), this means strong Clinton districts have fewer dealerships (about a third the number present in strong Trump districts) for the people present.
That’s not particularly surprising, since urban areas (where there is less room for dealerships) more frequently voted for Clinton.

Coffee Preference by Political Party

Drinks and food are great unifiers, with Americans of all stripes choosing to get their coffee at Starbucks, their doughnuts at Dunkin’ Donuts, and their ice cream at Baskin Robbins. The top Trump and Clinton quintiles (and even deciles) go to the same top stores for their favorites, though there are far fewer Dunkin’ Donuts locations in the former.

Here are the results by brand for top Trump districts vs. top Clinton districts:

- Starbucks: Dead even, 107% of the US average for both sets of partisans
- Dunkin’ Donuts: 62% of the US average for Trump partisans, 89% for Clinton partisans
- Baskin Robbins: 121% of the average for Trump vs. 130% for Clinton. We all do love our ice cream.
- Panera Bread: 133% of the average for Trump vs. 64% for Clinton!

Bookstore Preference by Political Party

More books are sold online today than any other way, but as far as brick-and-mortar shops go, most Americans have a Barnes and Noble nearby they can go to, irrespective of where in the country they are or how their district votes. The notable distinctions between top Clinton districts and top Trump districts are that the former don’t have as many Books-A-Million locations, and that before Christian retailer LifeWay’s planned shutdown of its physical locations, LifeWay stores were most common in Trump-leaning districts while being nearly completely absent in top Clinton districts.

[Figure: Barnes and Noble as a percentage of bookstores across the country]

[Figure: LifeWay Christian Stores as a percentage of bookstores]

Bank Preference by Political Party

Most Americans have a Chase, Bank of America, US Bank, or Wells Fargo near them, but Trump partisans usually live in Regions Bank territory. If you know you’re in a partisan congressional district and you can’t find out which side the district falls on, look around for a Regions Bank.
If you find one close to you easily, chances are that your district votes for Trump preferentially.

FAQs

1. What data was used in the analysis? SafeGraph Places data combined with congressional district voting results from the 115th Congress.

2. How was brand popularity measured? By the number of brand locations within each congressional district compared to the national average.

3. Does brand presence cause voting behavior? No. Both reflect underlying demographic and geographic factors.

4. Which category showed the strongest divide? Luxury and foreign car dealerships showed the clearest partisan differences.

5. Are some brands politically neutral? Yes. National chains like Starbucks and major banks are widely distributed across districts.

#### Esri User Conference 2025 Recap

Key Takeaways

- POI data is now foundational infrastructure powering logistics, supply chains, and real-world operations.
- The “conflation tax” is a measurable drag on enterprise efficiency and data reliability.
- Hybrid public and commercial data strategies are becoming the new standard in government analytics.
- Enterprise GIS is expanding into a cross-functional platform, despite slow modernization cycles.
- AI systems increasingly depend on precise, richly structured location data to produce reliable outcomes.
Every July, San Diego becomes the global hub for all things geospatial, and this year’s Esri User Conference proved once again why it’s one of the most important events in the mapping and location intelligence world.

While our days were packed with meetings, sessions, and walking the expo floor, the real value of Esri UC lies in connecting with people across a wide array of organizations. Here are a few key themes we noticed about where the industry is headed:

📍 POI is No Longer Just a Map Layer - It’s Operational Infrastructure

Multiple sessions and demos highlighted POI data as more than a visual aid. Places data is mission-critical across many industries. From logistics to public health to disaster response, organizations are relying on rich, fresh, and accurate points of interest data to power internal systems, decision-making tools, and customer experiences.

One surprising insight: International address geocodes are key to global initiatives for several companies. One hardware technology firm presented their work creating a customer-to-warehouse-to-technician service supply network. The foundation of this advanced analysis was address data. In countries where address data is difficult to source, it is difficult for these companies to provide support services and expand their market footprint.

🧠 The Conflation Tax Is Real

A recurring phrase this year was the “conflation tax”: the hidden cost companies pay to reconcile inconsistent datasets. Whether it’s open data like OSM or commercial sources, the industry is collectively feeling the burden of duplicate records, mismatched formats, and manual cleanup work.

We saw increasing momentum behind standards efforts like GERS (Global Entity Reference System) and growing appetite for interoperable identifiers and data matching services like Placekey.
In an era where more data is not always better, data that is easily joinable has a competitive edge.⚙️ Government + Commercial = Closer Than EverPresentations from public sector leaders emphasized how government and commercial data ecosystems are blending. Many agencies are adopting hybrid strategies: starting with authoritative government sources while also integrating commercial location datasets to get faster insights and fill gaps. This work is especially critical in areas like infrastructure resilience, supply chain visibility, and emergency response.🏭 Enterprise GIS Is Evolving, But Still SlowOrganizations like KPMG, CVS, and Stripe presented how GIS is evolving from a static mapping tool into a cross-functional data platform. We saw examples of tax boundary mapping, site selection, and clinical risk modeling, all powered by Esri. But enterprise GIS modernization still moves at a glacial pace. Many teams are still transitioning their data and analytics stack to take advantage of new cloud capabilities, and change management remains a real hurdle.👁️‍🗨️ AI & Spatial: The Coming CollisionWhile not as dominant as at pure AI conferences, Esri UC did hint at a coming convergence: Spatial intelligence is becoming inseparable from AI. From grounding large language models in the real world, to powering map-centric agents, to enriching user-generated content with real-world context; location data is emerging as the connective tissue.But not just any data will do. AI models need location datasets that are precise, current, and detailed. Rich metadata capturing not just where places are, but what they are, how they relate to one another, and how they evolve over time keeps AI outputs from being generic, incomplete, or flat-out wrong.As AI moves from demos to mission-critical applications, the demand for focused, trustworthy location datasets is only going to accelerate.Final ThoughtsEsri UC remains a barometer for the spatial data industry. 
The conversations this year weren’t just about maps; they were about operational excellence, data standardization, and solving real-world challenges. Whether it’s a franchise planning a new location or a government agency preparing for natural disasters, the foundation is the same: accurate and connected data about the physical world.

Until next year, San Diego. 🌍

#### Everything You Need to Know About Industrial POIs

At SafeGraph, our sole focus is curating the most accurate and precise points of interest (POI) data for businesses, non-profits, and academics alike to use in their analytics. This broad range of use cases for POI data means our customers require a wide variety of POI types to work with. Over the past few years, we’ve been continuously adding new categories of POIs to our datasets. One of the most popular POI types we’ve seen a need for is industrial POIs. “Industrial” is a blanket term used to cover a range of non-consumer POIs. Our coverage in the June Places release can be broken into the following categories:

- General Warehousing and Storage (NAICS code = 493110): ~2k POIs
- Lessors of Miniwarehouses and Self-Storage Units (NAICS code = 531130): ~20k POIs
- Refrigerated Warehousing and Storage (NAICS code = 493120): 39 POIs
- Industrial Equipment Wholesalers (several NAICS codes): ~33k POIs
- Manufacturing Facilities (several NAICS codes): ~3k POIs
- Data Centers (NAICS code = 518210): ~2k POIs

In total, these categories of industrial POIs amount to over 60K data points.

You can see a full breakdown of our industrial POI coverage here.

What are industrial POIs used for?

There are just about endless possibilities for using POIs in an analysis. While industrial POIs are very different from consumer-based POIs, the use cases for them are actually very similar.
At SafeGraph, we’ve seen a great deal of interest in industrial POIs for demand forecasting, supply chain management, and commercial insurance purposes.Demand ForecastingFinancial services firms and brands themselves are increasingly turning to data science as inputs into their demand models and forecasts. While transaction data is fundamental to financial analysis, geospatial data - particularly POIs and their foot traffic - is quickly becoming a must-have for firms.Openings and closings of industrial POIs are powerful indicators of future supply and demand of products manufactured there.Analyzing the location and foot traffic of industrial POIs enables firms to take a general pulse on macro-economic activity. If a particular brand or type of industrial location is increasing or decreasing its coverage (or has a change in foot traffic), this could mean that whatever goods it’s producing or distributing will soon see a change in market presence. Firms can then use this data to make more informed decisions about their investment portfolios.‍Similarly, measuring foot traffic at ports or industrial wholesaler locations can be a better proxy for demand than at the industrial plants themselves. The ability to analyze multiple levels of the industrial supply chain gives firms an edge in their modeling and forecasting.Supply ChainMapping and analyzing the location of various points in a supply chain is essential for not only demand forecasting, but also logistics planning itself. As more and more consumers are purchasing goods online, brands have to find new and innovative ways to meet the growing demand for delivery and high customer expectations. Industrial POI data provides a way to factor delivery infrastructure into logistics planning and execution. With an accurate and up-to-date account of all locations involved in the supply chain, a brand can efficiently coordinate delivery routes to keep up with consumer demand. 
Industrial POI footprints are the most precise geofences for brands to use when monitoring the supply chain. A centroid radius can indicate a potentially incorrect delivery point.POI footprints are also an important ingredient in supply chain and logistics planning. Accurate and precise building footprints can be deployed as geofences to confirm drop-off or pick-up of goods at any points in the supply chain. With the reliability of good POI and geometry data, brands can be confident in their supply chain and ability to meet increasing consumer demands.Commercial InsuranceIndustrial complexes are much more complicated than your average consumer shop or restaurant. For one, they often have multiple buildings associated with a single address spread out over a much larger physical footprint. Consider the number of buildings at a large fulfillment and distribution center, as well as the surrounding parking lot and roadways.For the commercial insurance companies who generate the policies that cover these buildings, it's important to have strong, accurate data about the entire parcel as well as the surrounding area. Some variant of this information may be provided by the customer who is being insured but it's key to vet those details from an independent source for increased accuracy and context.There are many uses for industrial POI data, ranging anywhere from operating a manufacturing plant to anticipating demand and everything in between. As we continue to add more categories of POIs to our data, that list will only grow. #### Fast Food Brands: The Rise and Fall 2021 and Fast Food2021 was a turbulent year for a lot of reasons and for a lot of people. In particular, let’s take a look at fast food and the different major brands. What was changing over the year for different brands? Who was closing locations? 2021 was when we realized we might be in for delivery rather than eating in restaurants for a good while longer. 
Some places - pizza chains, for instance - are a little better set up to weather that storm than places that depend on you eating your french fries no more than fifteen minutes after they leave the fryer.

We’ll be focusing here on the SafeGraph opened_on and closed_on columns, which show us when different locations, well, opened or closed. This lets us see not just how many locations each fast-food chain brand has (Subway still reigns supreme, by the way), but how that number evolves over time, on a monthly level.

Let’s start by making a simple graph of what’s going on with locations in 2021:

Alright, so no major shakeups. Nobody’s dropping out of the market all of a sudden. But we can already see some differences in which brands are trending up and which are trending down. What happens if we look just at the changes? [1]

A lot of McDonald’s woes appear to be related to the shutdown of their Walmart locations. But the 327 net closures in the data go beyond the 200 or so Walmart locations that got closed. Seems like people really aren’t into those delivery fries!

Where are these closures occurring? Let’s take a look.

We see quite a lot going on here. The Midwest and Mountain parts of the country seem largely unaffected by McDonald’s shutdowns in 2021. But Florida, the Southwest, and the West are looking like trouble for the arches. Three of Alaska’s 28 locations closed down, more than 10%. Nevada is also above 10%, with 13 of 125 locations shutting down.

The one that’s really got to sting is Texas, though, with 73 out of 1,113 locations shutting down, or 6.6% in a big state. Ouch!

Still, things seem not to be so bad for McDonald’s in the Northeast. To further dive into these regional trends, schedule a demo with one of our experts.

[1] Important to note with this data is that Wendy’s, Dairy Queen, Sonic, and Arby’s only started getting tracked in 2021, so some openings we see may have just been re-openings from temporary closures that started at the beginning of the pandemic.
Openings may be a bit overstated for those four.

#### Fast Food Chains With The Most Locations Per State (#2 Most Popular Restaurant Will Surprise You!)

We found & visualized the 1st, 2nd, and 3rd most popular fast-food restaurant chain in each state using SafeGraph’s point of interest data.

It’s Subway’s World: You’re Just Living In It

When asked, “What’s the chain with the most restaurants in the US?”, the majority of Americans would guess McDonald’s. Surprisingly, the correct answer is actually Subway!

Subway is the most popular fast-food chain (most restaurant locations) in 49 out of 50 states. Delaware isn’t as interested in eating fresh.

In the US, there are 24,568 Subways compared to 13,793 McDonald’s locations. Delaware is the only state where there are fewer Subways than McDonald’s, with only 32 Subways compared to 37 McDonald’s locations.

Subway’s lax real-estate requirements & franchising model have allowed for fast expansion since the brand’s founding in 1965.

Subway locations have fewer real estate requirements than McDonald’s, since McDonald’s locations usually need to be in free-standing buildings & need a drive-through. Additionally, the average Subway store format is smaller than McDonald’s, which further reduces costs. Lastly, Subway boasts some of the lowest startup costs when it comes to equipment and franchise fees compared to other quick-service restaurant chains.

The flexible site-selection criteria & friendly franchising model, combined with America’s love of footlongs, have allowed Subway to expand to more locations than McDonald’s.

However, the gap between the brands is closing. Subway closed 1,100 restaurants in 2018, leading to the lowest number of store locations since 2011.

McDonald’s: The Fast-Food Chain With The 2nd Most Locations In 40 States

The 2nd most popular restaurant chain is McDonald’s in 40 states.
Dairy Queen, Pizza Hut, & Sonic are popular in the Midwest.In 40 out of 50 states, McDonald’s has the 2nd most locations out of any fast-food chain. Even when competing for the #2 position, Mickey D’s wasn’t able to be as dominant on the map as Subway was for position #1.From the map, it’s obvious McDonalds has had a harder time in the middle of the country.Compared to McDonald’s, Dairy Queen and Pizza Hut have more store locations in the Midwest. Dairy Queen has strong roots in the Midwest, with its first store opening in Joliet, Illinois in 1940 and its corporate headquarters based in Edina, Minnesota. Pizza Hut was founded in 1958, by two Wichita State University students and brothers in Wichita, Kansas and has historical ties with the Midwest. Sonic, America’s drive-in, was started in Oklahoma and has maintained its home state advantage against McDonald's.3rd Place: A Hodgepodge Of BrandsNo single brand dominates for 3rd place. Instead, we start to see strong regional preferences in this fragmented map.The 3rd most popular restaurant chain per state features a wide array of brands.Many regional preferences are due to the initial restaurant being started in or near a state. For example, burger chain Jack In The Box was founded in San Diego and is ubiquitous on the West Coast. Waffle House, the breakfast food chain, was founded in 1955 in Georgia. It remains popular in its home state along with South Carolina.Interestingly, Burger King is dominant in New England, even though it was started far away in Florida.Curious About How SafeGraph Creates Its Datasets?Get insight on how we create our geospatial datasets and handle the weird edge cases that come up when dealing with places data such as the corner case below.Weird Edge Case #29356: A single store location that’s a KFC, Pizza Hut, & Taco Bell at the same time. #### Felix Cheung Joins SafeGraph as VP Engineering We are really excited that Felix Cheung has joined SafeGraph as our VP Engineering. 
He is leading all of engineering at SafeGraph, which is focused primarily on data, infrastructure, machine learning, and more. Felix has deep expertise in Apache Spark and machine learning infrastructure and is a leader in open source.

Felix most recently was Senior Engineering Manager at Uber, where he managed engineering teams across 4 different infrastructure/platform areas including Spark as a Service, Machine Learning framework, Data Security & Compliance platform, and the Data Observability platform. Previously he was the area/tech lead at Uber for Spark.

Felix is active in open source. He is a Committer and PMC member for Apache Spark and Apache Zeppelin (check out his Github profile). He is also a mentor of 4 projects as a part of the Apache Incubator and a Member of The Apache Software Foundation. At Uber, he served on the Technical Steering Committee for the Uber Open Source Program, and led the Data Open Source Working Group as Chair, which provided technical oversight around open sourcing software and contributing to open source software projects.

Before Uber, he was a software engineer at Avvo (a start-up) and Automattic (well-known for WordPress), and spent over 15 years as a software engineer and engineering manager at Microsoft. Felix holds a degree in Electrical and Computer Engineering from the University of British Columbia 🇨🇦.

We are excited for Felix to bring his expertise to SafeGraph. We use Spark, machine learning, and big data every day to ingest data from thousands of sources, then clean, impute, merge, organize, structure, and deliver world-class data products to our customers.

SafeGraph Continues to Build a World-Class, Distributed Team

Although SafeGraph was started in San Francisco, today half of our team members live outside of the San Francisco Bay Area (but we all live in North America). Felix lives in Seattle.
That means he will be the third member of the SafeGraph executive team based outside of San Francisco (and we have a fourth that we will be announcing soon) -- continuing our path to becoming a distributed company. Felix, Kara and Roshan hanging out at a recent SafeGraph team event in San Francisco. We regularly fly all of our team members to San Francisco for team events. Really exciting to have Felix on the SafeGraph team. Come work at SafeGraph If you are interested in working at SafeGraph (with Felix!), we are hiring amazing software engineers and machine learning engineers, as well as sales and business roles (you can be located anywhere in North America). SafeGraph is building geospatial datasets to power innovation across industries. Our goal is to be the definitive source of information about physical places. Read more about our Vision and Values [Focus, Judgment, Humility, Leverage, Don’t-be-a-Bottleneck, and Growth]. Join SafeGraph and help us open access to information and be the data utility to all.   #### Forget ML — 4 Weird Edge Cases Which Confuse Even Humans When It Comes To Places Data ‍Perplexing Edge Cases SafeGraph Encounters On Our Journey Building The Source of Truth About Physical PlacesSafeGraph aims to be the source of truth for physical places. No fake news — just the facts.But the dynamic, evolving, complex world we live in poses a real challenge for us SafeGraph-ers. By aiming for 100% accuracy, we know we are undertaking a Sisyphean task.At the top of the hill lies the truth. We’ll never get there, but we’ll keep trying.Capturing the full complexity of the world and encapsulating it neatly into one clean CSV dataset is impossible. 
One reason our job is so difficult is that all data sources are noisy, which causes our algorithms to make errors and mistakes.Our ML algorithms discovered Atlantis (amongst other mistakes)Our machine learning team fuses data from many, many sources including satellite imagery, first-party data, municipal and government data, web searches, and more. This has enabled SafeGraph to maintain a very accurate understanding of almost everywhere people spend money.But operating at the scale we do, it’s not surprising that SafeGraph’s algorithms make mistakes.Once, we put a point of interest squarely in the middle of a big lake. That point of interest was NOT Atlantis. It was a Burger King. Clearly, a mistake.We found out about it because one of our customers was giving driving directions to a person. Luckily, that person wisely decided not to drive into the water.Thankfully SafeGraph data had nothing to do with this unfortunate accident.Our mistakes have huge real-world consequences because the largest mobile carriers, search engines, and satellite companies rely on SafeGraph’s data.But this blog post isn’t about how our algorithms mess up and how we fix our obvious mistakes. This blog post is about the weird edge cases where even after multiple humans look at the data, we still don’t know what the right answer is.Forget algorithms. Even humans are confused about how to handle these edge cases.This blog post is about the cases where we struggle to translate the complexity and nuance of the real world into simple rules and heuristics which our algorithms can then follow.We don’t have all the answers yet, but we want to shine some light on some of the challenges we face every day. If you have any suggestions, please let us know (or come work with us!).Opening up about open hoursKnowing when a place is open for business or not seems easy enough. 
But how would you handle the open hours for this urgent care center?

Do we report Office Hours or InstaCare Hours? If both, how can we cleanly represent that in our schema?

Open hours become even more challenging when you account for points of interest that are seasonal, like water parks open only in the summer, or malls which have extended shopping hours during the holiday season.

Last Christmas Day, our clients recommended that people go to hundreds of closed businesses … all because we couldn’t get the store hours straight. Again, our mistakes have serious real-world consequences (but luckily all the toy stores were open … so little Tony still got his truck).

Getting a name for a place is easy. Getting the right name(s) for a place is really hard.

Here’s a great article on the falsehoods programmers believe about people’s names. You can imagine that when it comes to physical places, which have fewer social conventions for naming than people, there is even greater complexity to understanding and representing place names accurately.

Take for example the Broncos Stadium at Mile High. Or whatever they decided to name it this year.

Broncos Stadium at Mile High: previously known as Invesco Field at Mile High and Sports Authority Field at Mile High, and commonly known as Mile High, New Mile High or Mile High Stadium.

Names of places change. Often.

And some places might go by two names, both equally valid. This makes determining the best name for a place challenging even for humans, let alone algorithms.

Between the rebrands, mergers, and acquisitions, brands are in constant flux.

Businesses are continually merging and acquiring other businesses. Sometimes these businesses undergo rebrands. Sometimes they don’t.
Sometimes they create new special regional co-branding.We’ve reached 99% recall and also 99% precision when it comes to the top 3,000 brands in the U.S. But for smaller chains and brands, it’s difficult to organize and keep track of this information without extensive research and local familiarity. Take for example Daphne’s.Daphne’s Greek Cafe? Or Daphne’s California Greek? Or Daphne’s Mediterranean?Without local familiarity, it’s not easy to know how many distinct restaurants and brands are contained in the above search results and news stories.Capital One Cafe category confusionNAICS codes are an industry standard system for categorizing a type of business. Some example sub-categories in the NAICS system are “Commercial Banks”, & “Snack and Nonalcoholic Beverage Bars”, & “Lessors of Nonresidential Buildings (except Miniwarehouses)”.As much as we love working out of Capital One Cafes, we hate them when it comes time to categorize these points of interest.It’s a bank! It’s a cafe! It’s Capital One CafeWhat’s the best category for a bank which is also a cafe and also a co-working space? And another question: should Capital One Cafes be a separate brand from Capital One?As you can see, the real world is tricky, and it’s hard to cleanly represent what’s happening in a simple CSV with a clear taxonomy.We’re steadily dealing with every edge caseWe need to get this right because we believe that truth data is fundamental to innovation in the Machine-Learning driven future.So, until we reach the impossible goal of 100% accuracy, we’ll keep fixing our errors and handling these edge cases.All models are wrong… but we are trying to make SafeGraph’s models of the physical world the most accurate and useful.But we still make tons of mistakes. Many of the mistakes make us cringe. 
Our commitment to our (very demanding) customers is that we significantly improve the data every month and that we will be a bit more true every month.

You can track our progress on this journey by following our release notes, which are published with every monthly update of the data. We feature the bugs and edge cases we’ve handled, and articulate known problems that are not solved (yet!).

#### Geofencing Marketing: What It Is and How to Advertise With It

##### Key Takeaways

- Geofencing marketing uses virtual geographic boundaries to deliver location-based ads.
- It allows businesses to target customers based on real-world proximity.
- Different geofence types include radius-based, travel-time, and building footprint methods.
- Accurate spatial and POI data directly impact campaign performance.
- Campaign success should be measured using impressions, reach, visit attribution, and conversions.

As advertising becomes more expensive and consumer groups become more fragmented, some marketing firms are using geospatial data to create geofences in specific areas. This allows them to advertise to audiences who are more likely to become customers while avoiding wasting money marketing to those who likely will not buy from them.

So what does geofencing mean? What is geofencing marketing, and how does geofencing work? What are the specific reasons advertising firms are utilizing it? And how can you take advantage of it for your own marketing efforts? We will answer all these questions and more in the blog.

We will start by explaining in more detail what it means to advertise using geofences, including why it is becoming a popular marketing tool.

##### What Is Geofencing Marketing and Why Is It Important?

Geofencing marketing involves setting up virtual boundaries around a point or area that track whenever someone with a mobile device crosses them.
When this happens, it triggers a notification that advertises a nearby store, brand, service, or product to that person’s mobile device.This is sometimes also referred to as geofencing advertising. If you are wondering what geofencing is in marketing, it is simply the use of location-based technology to deliver ads based on where someone is physically located.To understand why this matters, think about how traditional advertising works. A company sends out ads through mass media channels such as TV, radio, newspapers, magazines, and flyers that attempt to appeal to as broad an audience as possible. This is a rather random approach, as the people reached may be just as likely to have no interest in what the company has to offer as they are to actually want the advertised products or services.In contrast, geofencing allows marketers to define specific spatial boundaries within which their ads will be deployed. This enables them to target nearby people who are more likely to shop at a particular business because of proximity, demographics, brand affinity, and other attributes. Compared to wide-ranging and expensive mass advertising, geofencing marketing is typically more focused and cost-effective.Geofencing vs. GeotargetingAlthough the two terms are sometimes used interchangeably, geofencing and geotargeting are not the same.Geofencing creates a defined virtual boundary around a specific physical location and triggers a marketing action when someone enters or exits that boundary.Geotargeting delivers ads to users within broader geographic areas such as cities or ZIP codes and does not rely on real-time boundary crossing.If your goal is proximity-based engagement and visit attribution, geofencing marketing is often the better option. For broader awareness campaigns, geotargeting may be more appropriate.How to Implement Geofencing Into Your Marketing StrategyThere are three main ways to create geofences. 
Understanding how geofencing works in practice can help you implement it more effectively in your marketing strategy.

The first is with a centroid radius. This involves locating the center point of a building or property and then calculating a certain distance away from that point, or radius, in every direction. This creates a general proximity zone that will send out ads whenever someone with a mobile device gets sufficiently close to a point of interest.

The second is with a walk or drive time, also known as an isochrone. This involves calculating the amount of time it takes to get to a particular point from any other location using available transportation methods. A company can then send out ads only to people who can travel to their business location within a specified amount of time. This is a more specific type of geofence that allows for marketing based on how easily people can get to a business, rather than simply how close they are.

The third is with building footprints. This method uses measured polygons to represent the exact physical boundaries of a point of interest, whether that is an entire building, a park, or a store unit in a mall. A business can then send notifications only to people who actually set foot inside the building or on the property. This more precisely targets people who, by visiting a business, may already be signaling an intent to buy.

Accurate geofences depend on reliable spatial data. Explore SafeGraph Geometry Data to build precise geofencing marketing campaigns.

##### 4 Benefits of Geofencing Marketing

We’ve already talked a bit about how geofence-based advertising is different from traditional marketing. Now let’s take a deeper dive into four specific advantages it provides over doing things the old-fashioned way.

###### 1. Timely

One reason that geofencing ads are often superior to traditional ads is that they’re delivered almost instantaneously in response to a potential customer’s actions.
As a business owner, you don’t have to wait for someone to read a newspaper or magazine, or be near a TV or radio at a specific time, to be exposed to your ad. As soon as they cross your geofence, you can start marketing to them immediately.This is also a more relevant time to reach customers, as they are right near where they are able to buy from you. In other words, you don’t have to count on them remembering your advertisement the next time they decide to go shopping in the area near your store.2. TargetedAnother advantage of advertising with geofences is that they cover specific geographic areas. This allows marketers to deploy them in places where they expect likely customers to be. Contrast that with mass marketing, which is deployed throughout large geographic areas such as cities, regions, or countries. That makes it much more random in its ability to reach interested consumers.3. EconomicalThe fact that geofence marketing campaigns can be deployed in precise locations also makes them cost-effective. Mass advertising is expensive because it takes a lot of resources to disseminate a message across a broad geographic area. It is also based on the belief that more exposure is always better.When marketing with geofences, however, you only have to pay for ads in the specific areas in which you deploy them. If deployed correctly, they can reach a much higher concentration of people who are more likely to become customers.4. FlexibleSince geofences can be created in a number of different ways, they are also adaptable to your marketing strategy. You can set them up to cover a proximity around your business, or just the grounds of the business itself, so you only advertise to people already inside your store.You can also set them up for areas within a certain transit time from your business, or at nearby places that get high foot traffic, so people know your business is an accessible option. 
You can even set up geofences near competitors so that consumers will know your business is an alternative if they do not find what they are looking for at the right price.How to implement geofencing into your marketing strategyMarketing is an industry that is increasingly becoming all about personalization, so you can’t just set up geofences randomly around your business and expect to make tons of conversions. You need to understand who your customers are, where they are, where they go, and when they go there. Here are four tips for getting the most out of your marketing geofences.1. Find out if nearby people are likely to become customers or notA big issue with mass marketing is that it often blindly reaches out to people who have absolutely no interest in the products or services being offered. But marketing geofences can suffer from the same problem if they’re set up with little regard for who they’re going to target.That’s why it’s important to research the demographics of nearby populations before you set up your business’s geofences. This will give you hints as to which census block groups are most likely to yield customers, so you can position your geofences accordingly.2. Base how you construct your geofences on your strategy and scenarioDepending on your marketing strategy and your business’s geographic situation, certain methods of creating geofences may work better for you than others. For instance, let’s say your store is surrounded by competitors you want to take market share away from, or complementary businesses you want to cross-promote with. In this case, you may want to create geofences based on building footprints. This lets you target those exact locations while avoiding other unrelated businesses.In another scenario, you may want to take advantage of your business being close to one or more popular tourist attractions. Here, you might want to set up radius-based geofences with help from point-of-interest data and property data. 
Or, if your business is accessible to other nearby areas where people tend to hang out, you may want to create isochrone geofences based on mobility and transportation data.3. Pay attention to time, tooJust as important as knowing who is close to your business and where they go is knowing when they’re going to be there. You need to calibrate your geofences to deliver appropriate marketing messages for certain times of day. For example, if you run a restaurant, using geofences to advertise your breakfast special when it’s already 8:00 PM is not likely to result in many conversions.Also keep in mind that people from particular demographics may be more likely to visit specific places at certain times of day or on certain days of the week. So be sure to adjust the messaging your geofences are sending out to target the right people at the right time. Mobility data can help you determine when people visit specific stores or neighborhoods.4. Make sure your foundational data is accurate and preciseWhether you’re using POI, property, mobility, demographic, or transportation data to inform the building of your advertising geofences, make sure you get it from reliable sources. If your calculations, polygons, or other data are incorrect, you could end up with some very big problems.For example, you might overextend your geofences’ reach and thus overrepresent your audience. Or you might position your geofences near census block groups that aren’t your target audience. Either way, you will end up overpaying for marketing that is reaching people who are likely not going to become customers anyway.On the other hand, incorrect data may cause you to set up geofences that are too small or otherwise fail to capture at least some of your target demographics. This may make it look like your advertising campaign was unsuccessful. 
But the reality is that it simply underperformed because bad data prevented it from reaching the people you most wanted to connect with.

##### How to Track the Success of Your Geofencing Marketing Campaigns

Whether it uses geofences or not, you’re going to want to measure how effective an advertising campaign is in driving sales or other goals for your business. But how you do that when using geofences for marketing may be a bit different from what you’re used to. After all, you’re tracking stats that are based on physical places, but some of them may be digital in nature.

Here are some top metrics to consider:

- Impressions: This is a basic count of the number of times someone with a mobile device crossed into the boundaries of one (or more) of your geofences. In other words, it counts how many times your ads were sent out, regardless of whether or not they were acted upon.
- Reach: This counts the number of unique mobile devices that crossed into the boundaries of one (or more) of your geofences. It is similar to impressions, except that it does not count multiple visits from the same person.
- Visit attribution: This is a measure of how many unique mobile devices crossed a geofence over a certain period of time, and how long each one stayed within the geofence. It’s especially useful if you’re using geofences based on your stores’ building footprints, as you can see how many people actually entered your stores and at least considered making a purchase.
- Conversions: This describes the number of times consumers took an action that your ads prompted them to take. Usually, that will involve buying something from your business, but it can be measured in other ways as well. For example, it could include going inside one of your stores (visit attribution) or signing up for a rewards program or newsletter email list.
- Cost per acquisition: This measures how efficient your marketing campaign was at prompting consumers to become customers of your business. It is calculated by taking the total cost of the campaign and dividing it by the number of conversions you received over the course of the campaign. The lower this number is, the better.

##### Brands effectively using geofencing marketing

So what does advertising using geofences look like in action? Here are a couple of geofencing advertising examples: companies that have honed the precision of their marketing campaigns by using geospatial data.

###### Billups

Billups was trying to accurately measure impressions and conversions on its outdoor ads. Part of the solution involved cross-referencing anonymized mobile device GPS data with POI data. This allowed them to estimate the route someone took and which billboards they encountered along the way.

Tracking online conversions related to these ad impressions is relatively easy due to web technologies such as cookies. However, it’s more difficult for physical stores because anonymized mobility data can’t always accurately reveal whether a person visited a specific store. The person may have just parked in the parking lot or visited an adjacent store, for example.

That’s why Billups turned to polygon-based property data. They used it to build geofences that matched the building footprints of the stores for which they wanted to measure conversions. This allowed them to determine more accurately whether a person who saw one of their ads visited a particular store.

###### Media Storm

Media Storm was running advertising campaigns for its clients to re-engage customers who had simply visited stores. The goal was to turn these visitors into purchasers and, ideally, keep them coming back. They began by tracking anonymized GPS locations from mobile devices, but this was not very helpful in determining whether a person had visited a client’s store or even a competitor’s store.
The reason was that they did not have accurate information about the precise locations of their clients’ stores or those of their competitors.

Centroid-based proximity models were ineffective as well, especially in crowded environments such as malls or downtown city neighborhoods. They ran the risk of including people unrelated to the conversions of their clients or competitors, including visitors to nearby buildings or pedestrians on the street.

The solution was to use property data based on polygons that matched the building footprints of client and competitor stores to construct geofences. When Media Storm combined this with the anonymized mobility data it had licensed, it was able to determine much more accurately how many people had visited a client’s or competitor’s store while ignoring people who were nearby but were not actual visitors.

There you have it: an introduction to why and how to use geofences to power advertising campaigns. Of course, you’re going to need accurate geospatial data to build them, and a great place to start is with SafeGraph’s Geometry data.

Ready to strengthen your geofencing marketing strategy? Book a Data Demo to see how precise location data can improve campaign performance.

##### FAQs

1. What is geofencing marketing? Geofencing marketing is a location-based advertising strategy that triggers marketing actions when a mobile device enters or exits a defined geographic boundary.
2. How does geofencing work? Geofencing works by using GPS, Wi-Fi, cellular, or Bluetooth signals to detect when a device crosses a virtual boundary and then triggering a pre-set marketing action.
3. What does geofencing mean in marketing? In marketing, geofencing means creating virtual perimeters around physical locations to deliver targeted ads based on proximity.
4. Which industries use geofencing marketing? Retail, restaurants, automotive, real estate, entertainment, and outdoor advertising commonly use geofencing marketing.
5. What is the difference between geofencing and geotargeting? Geofencing relies on real-time boundary crossing, while geotargeting delivers ads to broader geographic areas without requiring entry into a defined perimeter.

#### Geometry Data: The Anchor of SafeGraph Places

If you’ve heard of SafeGraph, you’re most likely familiar with our POI (point of interest) data for COVID-19 response. What if I told you that SafeGraph's Geometry data is the crux that makes it all possible? You know - the shapes depicting the places we care so much about? Without these obscure little polygons, SafeGraph products would not hold the same value as they do today. Below, we’ll discuss where these polygons come from and why they’re so useful.

Unfortunately, geometry data doesn't grow on trees - nor does it grow on maps, apps, or S3 buckets for that matter, and this makes our mission to provide a “best fitting” polygon for each record in the SafeGraph Places dataset an incredibly tall order.
To further complicate matters, the definition of a “best fitting” polygon varies by POI type and can range from a building footprint (or even a slice of a building footprint) to a massive shape containing parking lots, land, and several buildings within its bounds - like a college campus for example. We obviously don’t tackle all of this alone (we’re ambitious - not crazy) and are fortunate to be in business with some amazing partners who specialize in curating geometry data of varying criteria. In most cases, we prefer to have polygons extracted from aerial imagery using the latest methods in object recognition and AI. We recognize that this is the future of geometry data sourcing and has the best chance of scaling rapidly. In other cases, and especially for places with complex requirements, we prefer to have polygons hand drawn. This is still the most sure-fire way to source an accurate polygon, and that fact is unlikely to change in the short term. For some, geometry data is already useful in raw form. Polygons offer a robust visual representation of places and can aid in use cases ranging from square footage calculations to site selection. But for others, geometry (and the metadata inferred from it) really proves its value when used to derive additional geospatial products. Externally, our customers also use SafeGraph Geometry data as a “blueprint” of sorts to derive their own foot traffic insights. ‍ So, in the spirit of transparency, we’d like to walk through the metadata we build into our geometry as well as our best practices for putting that metadata to work.‍ For every place, we always want to answer three key questions: 1) Does this place encompass other places? ‍ ‍2) Is this place completely enclosed inside of a larger place? ‍ 3) How many places belong to this polygon?‍ Let’s take these one at a time...‍ 1) Geometry Data & Spatial Hierarchy: Does This Place Contain Other Places? 
The real world is full of places that contain other places, and these relationships exist in many forms. Some places are massive and represented by a broad, expansive boundary, and these places encompass several, if not hundreds, of smaller places within their borders. An outdoor shopping mall, for example, encompasses many POIs within its footprint, and so do hospitals, college campuses, ski resorts, stadiums, casinos, etc. In other cases, a single building may represent the footprint of a POI, but it still might contain other POIs within. A Walmart containing a Subway is a canonical example of this, and we are also interested in understanding these relationships. In any case, we identify spatial relationships (what we refer to as “spatial hierarchy”) by measuring polygon overlap. For each pair of overlapping polygons, if the larger polygon contains at least 80% of the smaller polygon, and if the larger polygon is also of a particular POI category, then we mark it as the “parent” of the smaller polygon. It’s important to restrict parent POI candidates to a specific set of categories or brands so that we’re not solely reliant on polygon precision to determine spatial hierarchy. For example, we want airports to be parents when overlapping other POIs, but we generally don’t want cafes to be parents if overlap exists and the cafe happens to be the larger of the two polygons. See our Places Manual for a complete list of POI categories that are eligible parents. We flag these relationships in our geometry data by setting the “parent_placekey” of the smaller POI equal to the “placekey” of the larger, encompassing POI. We colloquially refer to the larger, containing POI as the "parent" and the smaller POI as the "child." Blue: Polygon for Presidio Park in San Francisco (parent). Red: “child” POIs within Presidio Park. 
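The parent/child rule above can be sketched in a few lines of Python. This is only an illustration of the stated rule (at least 80% of the smaller polygon inside a larger polygon whose category is an eligible parent), not SafeGraph's actual pipeline. Several things here are assumptions: the category names in `ELIGIBLE_PARENT_CATEGORIES`, the `Place` class, the made-up identifiers (not real Placekeys), planar coordinates, and convex parent polygons, which the simple Sutherland-Hodgman clipping used here requires. When more than one parent qualifies, the sketch keeps the first match, since the text does not say how ties are resolved.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float]

# Hypothetical subset; the Places Manual holds the real list of eligible categories.
ELIGIBLE_PARENT_CATEGORIES = {"Airport", "Mall", "Park"}
MIN_CONTAINMENT = 0.80  # the 80% overlap threshold described above


@dataclass
class Place:
    placekey: str  # illustrative id, not a real Placekey
    category: str
    polygon: List[Point]
    parent_placekey: Optional[str] = None


def signed_area(poly: List[Point]) -> float:
    """Shoelace formula; positive for counter-clockwise vertex order."""
    pairs = zip(poly, poly[1:] + poly[:1])
    return sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in pairs) / 2.0


def clip(subject: List[Point], clipper: List[Point]) -> List[Point]:
    """Sutherland-Hodgman: the part of `subject` inside a convex `clipper`."""
    if signed_area(clipper) < 0:
        clipper = clipper[::-1]  # normalise to counter-clockwise
    output = list(subject)
    for a, b in zip(clipper, clipper[1:] + clipper[:1]):
        points, output = output, []

        def side(p: Point) -> float:
            # Positive when p lies to the left of edge a->b (inside the clipper).
            return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

        for p, q in zip(points, points[1:] + points[:1]):
            sp, sq = side(p), side(q)
            if sp >= 0:
                output.append(p)
            if sp * sq < 0:  # segment p->q crosses the clip edge
                t = sp / (sp - sq)
                output.append((p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1])))
    return output


def assign_parents(places: List[Place]) -> None:
    """Set parent_placekey on each place that an eligible, larger polygon contains."""
    for child in places:
        child_area = abs(signed_area(child.polygon))
        if child_area == 0:
            continue
        for parent in places:
            if parent is child or parent.category not in ELIGIBLE_PARENT_CATEGORIES:
                continue
            if abs(signed_area(parent.polygon)) <= child_area:
                continue  # only the larger polygon of a pair can be the parent
            overlap = abs(signed_area(clip(child.polygon, parent.polygon)))
            if overlap / child_area >= MIN_CONTAINMENT:
                child.parent_placekey = parent.placekey
                break  # assumption: first qualifying parent wins


places = [
    Place("park", "Park", [(0, 0), (10, 0), (10, 10), (0, 10)]),
    Place("cafe-in-park", "Cafe", [(1, 1), (3, 1), (3, 3), (1, 3)]),      # fully inside
    Place("pier", "Marina", [(5, 0), (11, 0), (11, 2), (5, 2)]),          # about 83% inside
    Place("corner-shop", "Shop", [(9, 4), (12, 4), (12, 6), (9, 6)]),     # about 33% inside
    Place("big-cafe", "Cafe", [(20, 0), (30, 0), (30, 10), (20, 10)]),    # not an eligible parent
    Place("kiosk", "Shop", [(22, 2), (23, 2), (23, 3), (22, 3)]),         # inside big-cafe only
]
assign_parents(places)
parents = {p.placekey: p.parent_placekey for p in places}
```

In this toy data, the cafe and the pier end up with the park as parent, the corner shop does not reach the 80% threshold, and the kiosk stays parentless because a cafe is not an eligible parent category even though it fully contains the kiosk. For real-world footprints (concave shapes, geographic coordinates), a geometry library would be a better fit than this hand-rolled clipping.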
‍‍ 2) Enclosed Polygons: Is This Place Completely Enclosed Inside of a Larger Place?‍ Within spatial hierarchy, we are interested in further classifying parent/child relationships. In general, we want to know when a parent POI encompasses its children completely indoors vs. on open air grounds. For example, a ski resort boundary may enclose a restaurant midway up the mountain, but the ski resort boundary itself is not an indoor enclosing structure. On the other hand, an airport containing a Starbucks completely encloses that Starbucks indoors. As a general guideline, if you must enter another structure to arrive at a POI, we want to be aware of that fact, and we set the “enclosed” column in our geometry data to “true” wherever that exists. Similar to determining eligible parent POIs, we rely on categories to distinguish enclosing vs. non-enclosing spatial hierarchy relationships. See the enclosed section of our Places Manual for a complete breakdown of the spatial hierarchy relationships we treat as “enclosing.” Blue: Footprint of Flatirons Crossing Mall (parent). Red: enclosed = true POIs within Flatirons Crossing (children). 3) Polygon Class: How Many Places Belong to this Polygon?‍ It’s important to distinguish when geometry data reflects the shape and size of a POI’s real world footprint and when it does not. In most cases, each polygon represents the unique footprint of a single POI, but in some cases, a precise polygon for a POI does not exist (or is not discernible through our sourcing methods), so the only polygon available may be too large and could represent several POIs. When a polygon reflects the true shape and size of a unique POI, we give it an “OWNED_POLYGON” value in the “polygon_class” column. This means the polygon represents that unique POI, but there could be child POIs within its borders attached to the same polygon. 
In other words, if a single POI maps to a distinct polygon (excluding that POI's children), then polygon_class = "OWNED_POLYGON;" otherwise, polygon_class = “SHARED_POLYGON.” We exclude children from influencing their parent POI's polygon_class because there are cases where a unique polygon does not exist for each child POI, and the child POIs most likely share the same polygon as their parent. In these cases, it does not mean that the polygon is a bad representation of the parent itself. A canonical example of this is a Nike store inside of a shopping mall. If we don't have a good polygon for the Nike store, then the Nike store likely shares the same polygon as the mall. Despite the fact that multiple POIs are attached to this polygon, the polygon is still representative of the mall's shape and size, so the polygon_class for the mall POI = “OWNED_POLYGON” and the polygon_class for the Nike store POI = “SHARED_POLYGON.” Read more about polygon_class in the Places Manual. Blue: OWNED_POLYGON with a single POI. RED: SHARED_POLYGON housing 2 POIs. At SafeGraph, we focus on a deep understanding of the physical landscape and we hope to share this context with our partners who set out to do the same. What details are we missing? What are we getting wrong? What other metadata would be useful for you? Check out our docs site to learn more. #### Geospatial Data 101: What It Is and How to Use It Eugene Chong is a product analyst at SafeGraph. 
With direct experience using this data, he covers the benefits of using geospatial data and shows you how you can use SafeGraph data in a variety of ways.

As the first presentation in SafeGraph’s Knowledge Series, Eugene sets the bar for educational seminars that teach you about how geospatial data can be used, as well as guidance on how to use SafeGraph data effectively for analysis.

##### What is spatial data?

First, Eugene defines what spatial data is, explaining that it’s data containing spatial components such as coordinates, geometry, or addresses.

SafeGraph’s data is geospatial in nature, as it refers to places on Earth, and relationships people have with places and spaces. This data is valuable for analytics in business, transportation, crime, economics, politics, and much more.

Spatial data uses vectors (points, lines, and polygons) to define the shapes and structures of real-world objects and places on a map. For example, buildings are represented with precise polygons that show the shape, or footprint, of those buildings.

Other use cases, such as weather mapping, are more suited to raster grids than vector data. These determine averages within a set pixel size, and are particularly useful for representing data that is continuous (like land cover or elevation) rather than specific elements (like a POI or building footprint).

##### What makes spatial data powerful?

Data visualization is the main element that makes geospatial data so valuable, as it provides a simple, optimal way of analyzing data. Maps are extremely easy to analyze, and are a preferred method to use for analyzing geographic areas. Looking at data on a map reveals relationships and insights that cannot be seen in a table or a graph.

Eugene covers issues with accurately mapping data, and how data visualization can be manipulated to display data in certain ways.
For example, gerrymandering is used as an example of how districts (or geographic areas) can be drawn in ways that suit specific needs, rather than accurately reflecting the true demographics of an area.What is GIS and what are the main GIS tools?To help you use geospatial data for yourself, Eugene explains what geographic information systems (GIS) are, and covers some of the best tools on the market by their type, including desktop, code-based, database, and web-based solutions. He then outlines the tools that he typically uses and why, touching on why you’d choose between different GIS tools for what you need.He also outlines a general GIS workflow for evaluating new data. This covers ingesting data, exploring data, joining layers, and analyzing data. This helps walk you through the process of how to use a GIS solution to draw valuable insights from your data.SafeGraph’s Places Data offers information on points of interest (POI) and building footprint data to places where people spend time or money, or business operations take place. #### Global Port POI and Geometry Data for Improved Supply Chain Analysis Since fall of 2021 SafeGraph has been steadily growing the number of geographies in which we provide Places (points of interest) data and currently are past 200 nations (which, for those geo-nerds among you, means we’ve surpassed the 193 UN members count by also adding support for a number of sub-national administrative units and autonomous regions).Places will always be the foundation for our other datasets, such as Spend, which append additional columns of information to each row (location) for further context. But we also recognize that organizations need to contextualize these points by adding in their associated geometries reflecting the surface area of particular places. That’s why we have prioritized expanding our Geometry datasets worldwide, including unique categories of places that are relevant for our customers. 
Above all, one category of locations has been requested more than others: ports.

By understanding where ports are globally and their surface area extent, analysts can get right to asking critical questions about the locations themselves, well before joining other datasets for added detail.

As of writing this blog post, we’ve cast a wide net (maritime pun intended) by releasing approximately 5,400 ports across 172 countries.

By intersecting existing mobility data with these POIs and polygons, customers will be able to ask macroeconomic questions around the overall state of trade in a given region or drill down to specific ports and make comparisons of relative performance. With the addition of a wkt_area_sq_meters column, analysts can also explore relationships between the size of ports and their import/export volumes.

Beyond direct analysis of the ports themselves, this large addition to SafeGraph’s offering only strengthens the existing opportunities for site selection analyses by allowing customers to examine which POIs are (or are not) concentrated around ports and consider the business ramifications of their presence. Understanding the types of POIs that thrive or flounder (maritime pun #2) in the vicinity of ports helps uncover potential gaps and opportunities in their respective markets.

While the problems analysts are trying to solve with ports are quite varied, we at SafeGraph have observed that one of the most common goals is to develop a clearer measure of trade activity in lieu of transactional information or some other formal data source from respective port authorities. This is particularly prevalent in emerging markets where there may be little to no data publicly or commercially available detailing import/export figures per port.
Compounding this problem is the introduction of COVID-19 into the equation: averages from previous years and historical data trends are proving less valuable as we see supply chain constraints so drastic that container ships can be seen stuck in queue for miles outside many of the world’s largest ports.

Like most spatial analysis, the best solutions in this regard involve layering multiple types of data. We’ve seen a trend over the last several years of organizations using satellite imagery to identify the number of ships coming in over time to a particular series of locations. This can be more reliable than one might expect if averaged over a long enough period, even with the shortcomings of satellite imagery in mind, such as cloud cover or a limited image inventory in a given week. Still, even the best satellite imagery comes at 30cm resolution, and purchasing such high-resolution imagery is extremely expensive, especially when buying many exposures to measure change over time. Analysts often rely on coarser resolutions such as 50cm or revert to medium resolution in the 1m - 5m range. This resolution will still capture objects such as container ships, land vehicles, and containers themselves but will likely fail to delineate port boundaries, which are hard to make out from above.

With accurate geometry information, analysts can intersect their imagery with reliable polygons to ensure their counts are realistic and do not accidentally include vehicles or other measured objects outside of the port boundary (for example, workers’ vehicles or nearby industrial facilities not tied to port operations). An added benefit is that satellite imagery is usually bought by submitting a polygon to the provider, who then prices the quote based on the surface area of the polygon submitted. Thus, less accurate polygons can lead to buying coverage that’s not needed.
Over many months, this could bleed tens of thousands in unnecessary costs.

With SafeGraph’s ports coverage, clients can know ahead of time that the boundaries will be accurate regardless of how remote a market may be. They’ll also save the countless hours required to digitize port polygons in-house, and avoid the likely errors that would come with such a manual effort.

#### How Commercial Data Sharing Will Develop in 2024

It’s clear that the way in which companies and individuals share their data with other entities is changing. What’s more complex is that there are many factors catalyzing this technological and economic shift. Industry experts have observed multiple causes for commercial data sharing’s development in recent months, from increased concerns about compliance to the global surge in demand for external data, for which we have generative AI to thank.

In this article, we’ll predict five ways that commercial data sharing will change in 2024 and examine the reasons for that change. Whether you’re a data analyst sharing data between tools and departments, a chief data officer tasked with exchanging data securely, or a data provider who shares data in order to monetize it, make sure you’re prepared for a commercial data sharing ecosystem with ever more SaaS solutions, regulatory scrutiny, and commercial considerations.

##### 1. Generative AI will continue to cause a data market boom

In both the technology industry and general news outlets globally, it’s been emphasized that without data, there’s no AI. Generative AI has catalyzed a data market boom, with companies requiring masses of external data to train their models.

More pressingly, many cases have shown that a generative AI model is only as trustworthy as the data it’s fed. Numerous instances of bias and obscenity in chatbot outputs have been a result of flawed datasets used to train the AI system.
As such, there’s not just a scramble for masses of data for AI; there’s a scramble for masses of accurate, high-quality data. AI has demonstrated more compellingly than ever that it’s imperative to work with trusted data providers. Commercial data providers selling data for AI training should always be able to prove that they take the required steps to remove inaccuracies, empty fields, and outdated information from their datasets. Indeed, the preliminary step before commercial data sharing - that is, sharing a data sample free of charge as part of the data evaluation process - has become even more important, because it lets the end user test that the data can train a reliable generative AI model.

And the easiest way for providers to share said data sample? That brings us to our next prediction for commercial data sharing in 2024: there’ll be increased adoption of cloud-agnostic sharing technology.

##### 2. Increased adoption of cloud-agnostic sharing technology

Cloud-agnostic data sharing technologies have made it easier for organizations to share data across diverse cloud environments. In 2024, it’s likely that the demand for cloud-agnostic data sharing technologies will surge. They simplify data integration across diverse cloud ecosystems and contribute to greater data accessibility and collaboration. For that reason, it’s likely that investment in data sharing tech will also increase, as Bobsled’s recent $17M Series A round attests. On the whole, both will contribute to data sharing that’s faster and more cost-effective.

Bobsled, for instance, facilitates data sharing by offering a standardized framework that operates independently of specific cloud providers, enabling organizations to transfer and process data with flexibility and efficiency.

Flatfile, another player in this space, allows users to integrate and cleanse data regardless of the underlying cloud infrastructure, fostering interoperability and reducing friction in data sharing processes.
Additionally, Weld offers a data integration platform that seamlessly connects disparate data sources across multiple clouds, ensuring a cohesive and unified approach to data management.

Following the success of these three example SaaS companies, we could see more cloud-agnostic data sharing companies being created in 2024. There’s certainly demand for the solution they offer. However, it’ll be interesting to see how new players in the field differentiate themselves.

The exchange of data is the final part of the end-to-end commercial data sharing process. Before this fulfillment step, there’s the question of how the data deal is initiated. There are software solutions for that too, and we predict that there’ll be more.

##### 3. New software solutions for commercializing data

Where there’s a user pain, there’s a software solution (or several). The same is the case for commercial data sharing. Data providers and buyers have long complained about many bottlenecks when it comes to monetizing and purchasing data. In response, we’ve seen new SaaS solutions being developed by companies large and small, aimed at making commercial sharing easier.

For example, data providers have long complained about the shortcomings of data marketplaces. To begin, there’s the struggle of integrating with a data marketplace in order to publish data products on it. Such integrations usually require a lot of time and engineering heavy lifting, sometimes months to get listed on just one marketplace. When providers are finally published on various data marketplaces, there’s the overhead of managing business across these disparate channels.

Software like Data Commerce Cloud (DCC) emerged to relieve data providers of both pains. Following Shopify’s value proposition as an omni-channel commerce solution, DCC enables providers to sync their data products to multiple data marketplaces with a click, sparing the engineering effort. Leads from these data marketplaces land centrally in the provider’s DCC inbox.
For providers, DCC is a SaaS tool that makes commercial data sharing easier. For buyers, there’s a greater variety of data providers to choose from, whichever data marketplace they’re using.

Commercial data sharing will become an even smoother process with the emergence of more SaaS solutions facilitating it. That matters because the stakes are higher: regulation and compliance are an ever-growing consideration for commercial data sharing.

##### 4. More concerns about security and regulation = more cautious sharing

Heightened concerns surrounding security and regulation have impacted commercial data sharing between companies. A Lowenstein survey published in January 2024 found that 57% of financial firms were concerned about data breaches, with 20-25% concerned about the increased compliance burden and privacy issues surrounding personally identifiable information (PII). Companies are more apprehensive about sharing sensitive information, fearing the potential repercussions for both their reputation and the security of their customers. This has led to a paradigm shift in the dynamics of data sharing, with organizations becoming more vigilant in choosing their partners.

Moreover, data protection regulations have intensified this caution. Stringent measures such as the General Data Protection Regulation (GDPR) and other global data privacy laws have compelled companies to reevaluate their data-sharing practices to ensure compliance. Failure to adhere to these regulations can result in substantial fines and legal consequences, creating a deterrent for companies engaged in data sharing.
This regulatory environment has not only affected current practices but is also influencing the trajectory of future data-sharing initiatives, with companies now prioritizing robust data governance frameworks to navigate the regulatory landscape.

Looking ahead, commercial data sharing in 2024 and beyond will likely be shaped by an ongoing tug-of-war between the necessity for collaboration and the imperative to safeguard data. Companies will need to strike a balance between the benefits of shared data for innovation and the need to implement robust security measures and adhere to evolving regulations. As new regulations - and new security threats - emerge, data sharing will continue to evolve, prompting organizations to adopt adaptive strategies that prioritize security and compliance in an increasingly interconnected business environment.

##### 5. Slower growth in data budgets

Year over year, companies have been allocating more budget towards investing in external data. This changed in 2023, when a Lowenstein survey found just 28% of respondents believed their budgets would increase by more than 25%, down from 65% in the previous year. As such, the general economic downturn has put - and will likely continue to put - some limitations on commercial data spending. With tightened budgets, organizations may find themselves constrained in their ability to acquire high-quality external datasets, hindering the depth and breadth of information available for sharing. This financial constraint could limit the scope of data collaborations and partnerships, especially for smaller companies that rely heavily on external data to complement their internal analytics.

Moreover, the decreased investment in external data may lead to a more conservative approach when it comes to sharing existing datasets. Companies facing financial pressures may become hesitant to engage in data-sharing agreements or collaborations with external entities, fearing the costs of compliance, security, and governance.
As a result, data sharing may experience a slowdown, as companies prioritize internal cost-cutting measures over external investments, potentially stalling the growth and innovation that arise from collaborative data initiatives. Striking a balance between cost considerations and the strategic value of external data will be crucial for organizations aiming to capitalize on data sharing within constrained budgets and tough economic conditions.

##### About the author

Lucy Kelly is a researcher at Datarade, the company facilitating the exchange of Big Data. She writes about the various use cases for external data, leading data providers, and developments in the tech industry, with a focus on data monetization trends.

#### How Data Sharing Transformed Insurance: World of DaaS interview with Verisk CEO, Scott Stephenson

New podcast with Scott Stephenson, CEO of Verisk. Our conversation is available everywhere (Apple Podcasts, Spotify, YouTube, etc.). Please subscribe, follow, and review.

I am a huge fanboy of Verisk. They have one of the most powerful data co-ops and have built a $30 billion market cap behemoth. I really enjoyed diving into Verisk’s unique history and learning how it transformed insurance through the power of data sharing. Some learnings from our conversation:

##### Verisk was originally a non-profit born out of regulation.

Verisk was originally called the Insurance Services Office, or ISO, and was founded as a nonprofit in 1971 to serve a consortium of 280 or so insurance companies. Previously, insurance companies paid organizations in every state to act as an intermediary in handling, formatting, and QAing their data before it was handed off to the regulators. ISO was created as a national organization to handle this more efficiently.

##### Verisk built nearly duplicative datasets to solve for different use cases.

When all this data started coming together for insurance companies, it became obvious that it could solve a lot of problems.
Think back to how fraud used to work: a fraudster might submit a similar claim to six different insurance companies. A central organization with all this data would be able to see similar claims from the same person and flag them. So a cooperative of insurance companies would have really great downstream effects. But this didn’t happen immediately. Verisk originally collected data for the purpose of understanding underwriting practices. Due to government regulation around data privacy, they had to build a second dataset, nearly identical to their original asset, for the purpose of trying to root out fraud in the claims flows.

##### Data contributory models are really hard to achieve.

It’s really hard to get going in B2B because businesses see their data as an asset. The US property and casualty insurance industry has the most established data cooperative. Outside of insurance, there are very few industries with strong examples. Verisk had the benefit of regulatory requirements.

Another way to jumpstart a data coop -- ZoomInfo used exhaust data. Check out my conversation with ZoomInfo CEO, Henry Schuck.

##### When successful, data cooperatives are really economically powerful.

Data cooperatives are incredibly powerful. Verisk’s customers greatly benefit from collective information. A single customer would not have anywhere close to the data asset that Verisk created. And it’s much cheaper for them to participate in Verisk’s contributory model than to build a solution on their own.

Hope you enjoy this episode of World of DaaS. I would really appreciate it if you subscribe and review (Apple Podcasts, Spotify, YouTube, etc.).

#### How Our Definition of a “Place” Has Evolved

##### Key Takeaways

- SafeGraph’s evolution offers clear examples of geospatial technology applied at global scale.
- The expansion of Places data reflects the broader location intelligence revolution.
- Improved POI data supports greater location intelligence across industries.

##### A brief story about SafeGraph’s rapid evolution as a Data-as-a-Service company

Only seven years ago, SafeGraph opened its doors as one of the few data-only companies in existence, allowing us to position ourselves uniquely as a Data-as-a-Service (DaaS) provider. While our mission has always been to democratize access to clean, accurate, and highly granular data about physical places (also known as location-based data or geospatial data), at the time we captured only a few million places across the U.S.

Fast-forward to the present day. SafeGraph’s Places database now includes rich attributes for over 80 million locations across more than 200 countries and territories worldwide. While expanding the scope and depth of our data has been crucial to our growth, it has not come without obstacles. With each period of expansion, we have continuously reevaluated, and at times challenged, our own understanding of what qualifies as a “place” in order to meet evolving customer needs.

As we continue to progress on our mission, we believe it is important to share some of the insights we have gained along the way and highlight the learnings that have shaped how we define places today.

##### Scaling Our Data Alongside Ever-Growing Customer Expectations

When we first entered the market, we filled a gap that desperately needed filling. Accessing good, clean, and accurate geospatial data in the U.S. often required hours of manual effort just to make datasets usable. Our initial focus was therefore on ensuring that our data could be used immediately upon delivery.

##### A Tireless Quest to Scale and Grow Our Dataset

Once customers experienced how refreshing it was to work with clean, accurate, and regularly updated datasets, they began asking how we could expand our offering to support more use cases. Each question became a catalyst for exploring new ways to scale our dataset, both by adding more physical places and by appending richer metadata to existing records.
For example, we initially focused on places where people spent money, such as retailers and restaurants, within the U.S. We later expanded to places where people spend time but not necessarily money, including parks, schools, offices, warehouses, and manufacturing facilities. We then incorporated small-footprint POIs like electric vehicle charging stations and ATMs. Finally, we began appending new attributes, such as “store ID” (to support integration with transaction data) and “category_tags” (to help isolate places using narrower text descriptors), and expanded our coverage globally.

It was important for us to move fast and position SafeGraph as the trusted source of comprehensive and accurate geospatial data that captures the full complexity of real-world places. We also knew that if we couldn’t provide our customers with the data they needed, they would look elsewhere. As a result, we became relentless in expanding and refining our datasets to ensure we consistently met our customers’ expectations. This steady expansion mirrors the data transformation in location intelligence, where richer context increasingly drives better decision-making.

##### Staying laser-focused on data accuracy

Scaling coverage is only one side of the equation. As the dataset grows, it becomes exponentially more complex and must be balanced by rigorous quality assurance to uphold our promise as a high-quality data provider. This reality led us to ask an important question: “What is a real place?”

Because businesses open and close every day, we constantly verify that places in our dataset are truly open and operating in the real world. Adding new data sources also introduces lower-quality records that may not represent legitimate physical places, such as online-only stores or event names mistakenly treated as venues.

The long-term precision of our datasets and our customers’ trust in our ability to deliver high-quality data go hand in hand. We cannot have one without the other.
We also consider the potential impact, or “cost”, that false positives can create for businesses that rely on our data as a source of truth. For us, those kinds of errors aren’t acceptable. While no dataset can be perfectly accurate at all times, our team consistently goes above and beyond to minimize errors and deliver dependable geospatial data.

##### How the SafeGraph Places dataset is built

If you’ve ever wondered how we build the SafeGraph Places dataset, here is a brief overview of the process, one of many examples of geospatial technology in practice:

First, we capture baseline information about physical places, including names and addresses, from open and accessible sources known for accuracy. A strong example is the Starbucks Store Locator, which must remain current to serve customers effectively. This represents a high-veracity data source. Once a POI is confirmed to exist, we layer in additional raw metadata to add depth. This includes geographic coordinates, operating hours, POI descriptions, and store footprint geometry. To maintain accuracy, we refresh tens of thousands of data sources each month.

Here’s the “fun” part. Knowing that our customers come to us to access crystal-clean data, we process raw digital signals in a Spark pipeline using geographic heuristics and machine learning to clean them up. We do this, first and foremost, to de-duplicate POIs that appear across various data sources and, secondarily, to attach only the most accurate and precise metadata to each record in our Places schema.

As part of this quality assurance process, we use several layers of geocoding to ensure the geographical coordinates (latitude and longitude) are based on a POI’s rooftop, standardize POI names and addresses, and apply machine learning to infer the most appropriate category description based on the associated metadata.
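
To make the de-duplication step more concrete, here is a minimal sketch of one possible geographic heuristic, written in plain Python rather than Spark. It is an illustration only, not SafeGraph’s actual pipeline: the function names (`normalize_name`, `haversine_m`, `dedupe_pois`), the record fields, the exact-match rule on normalized names, and the 50-meter threshold are all assumptions made for this example.

```python
import math
import re

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, in meters


def normalize_name(name):
    """Lowercase a name and collapse punctuation/whitespace runs to single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", name.lower()).strip()


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def dedupe_pois(records, max_distance_m=50.0):
    """Greedy de-duplication (hypothetical heuristic).

    A record is merged into the first kept POI that has the same normalized
    name and lies within max_distance_m; otherwise it starts a new POI.
    """
    kept = []
    for rec in records:
        key = normalize_name(rec["name"])
        for poi in kept:
            close = haversine_m(rec["lat"], rec["lon"], poi["lat"], poi["lon"]) <= max_distance_m
            if poi["key"] == key and close:
                poi["sources"].append(rec["source"])  # same place seen in another source
                break
        else:
            kept.append({"key": key, "name": rec["name"], "lat": rec["lat"],
                         "lon": rec["lon"], "sources": [rec["source"]]})
    return kept


if __name__ == "__main__":
    sample = [
        {"name": "Starbucks", "lat": 37.7750, "lon": -122.4190, "source": "a"},
        {"name": "STARBUCKS.", "lat": 37.7751, "lon": -122.4191, "source": "b"},
        {"name": "Starbucks", "lat": 37.7850, "lon": -122.4190, "source": "c"},
    ]
    for poi in dedupe_pois(sample):
        print(poi["name"], poi["sources"])
```

The pairwise loop is quadratic, so a sketch like this only suits small inputs; at larger scale one would typically group candidates by a spatial key first and use fuzzier name matching than exact equality.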
Of course, there is a significant amount of work that happens between each of these steps to ensure the final product is exactly what our customers want and need. But if there is one takeaway, it is this: we make it a priority to go above and beyond to deliver the most accurate and comprehensive geospatial data available.

##### No one does “places” better than SafeGraph

So, what is a “place”? As this journey shows, the answer has evolved, and will continue to evolve, as customer expectations change. What once seemed like a simple geospatial concept has grown into something far more expansive, reflecting both current needs and future possibilities.

As the need for accurate location intelligence accelerates, our definition of a place will continue to adapt. What will not change is our commitment to accuracy, completeness, and trust.

What’s your definition of a place? Share your thoughts with us. Excited about using our updated POI dataset to enhance your business insights? Request a demo now.

##### FAQs

1. What is location intelligence? Location intelligence is the use of highly accurate and comprehensive geospatial data about physical places to drive business insights and decision making.
2. How does geospatial data support the real estate data revolution? The real estate data revolution is driven by richer, more accurate geospatial datasets that enable better market analysis, site evaluation, and operational decisions at scale.
3. How does geospatial data support location intelligence? Geospatial data provides the spatial foundation for location intelligence by defining where places exist and how they relate to one another. It enables analysis of movement, proximity, and patterns in the physical world.
4. Why does data quality matter in GIS and geospatial analysis? Data quality matters because even small inaccuracies or outdated records can distort analysis, leading to flawed insights and poor decision-making at scale.
5. What role do POIs play in understanding places? POIs provide real-world context by defining what exists at a location, how it functions, and how it relates to surrounding places.
6. Why does the definition of a “place” continue to evolve? As use cases expand and real estate analytics trends advance, the definition of a place must adapt to ensure geospatial data remains accurate, relevant, and decision ready.

#### How SafeGraph built a reliable, efficient, and user-friendly Apache Spark platform with Amazon EMR on Amazon EKS

At SafeGraph, we rely on Apache Spark, one of the most widely used large-scale data processing frameworks, to generate our global POI dataset, which includes detailed attributes such as brand affiliation, advanced category tagging, and open hours.
We run hundreds to thousands of Spark applications each day for data transformation, machine learning model inference, and operational tasks.

Managing the reliability and efficiency of these Spark applications, along with the iteration speed of the engineers authoring them, presents a major challenge for our platform engineering team. However, by choosing the right Spark service provider, we can create a strong foundation for our Spark infrastructure that addresses these issues. In a recent blog post, coauthored by Nan Zhu, the Tech Lead Manager of our Platform Engineering team at SafeGraph, and Sr. Solution Architect Dave Thibault from AWS, we shared our journey of building our latest Spark platform on top of Amazon EMR on Amazon EKS. By doing so, we were able to create a robust and efficient foundation that meets our needs and even led to a 50% reduction in costs compared to our previous Spark managed service vendor.

Read the full post here on the AWS Big Data Blog.

#### How SafeGraph Powers Location-Based Audiences

In The Ultimate Guide to Location-Based Audiences, we explored how location data is revolutionizing advertising by enabling more precise targeting for marketers. Now, we’ll dive deeper into how SafeGraph’s places data integrates into audience-building workflows, providing adtech leaders with the tools they need to create more accurate and context-driven location-based audiences.

##### Accurate POI Data: The Key Ingredient in Building Location-Based Audiences

Creating visits for advertising audiences is both complex and critical for developing effective location-based ad products. While machine learning, high-fidelity GPS data, and sophisticated pipelines are necessary to map visits accurately, all of this technology relies on a key foundational element: accurate data about the underlying physical locations.
SafeGraph Places provides this foundation with comprehensive POI data and accurate building footprint polygons, including detailed 'tenant-split' polygons. This detailed information makes it easier to distinguish between closely located businesses or identify individual tenants within multi-unit buildings. Without precise polygon data and rich POI metadata, even the most advanced technical solutions would struggle to deliver accurate results.

##### Building Audiences with SafeGraph Places

With SafeGraph data, RainBarrel was able to rapidly scale the development of audiences by 10x.

Although SafeGraph doesn’t provide MAIDs (Mobile Advertising IDs), geolocation data, or other personal information in its data about physical locations, our data plays a critical role in building location-based audiences. Here's how advertisers can combine MAIDs with SafeGraph’s POI and polygon data to build effective audience segments:

1. Collect MAIDs from mobile pings through a third-party source.
2. Map the pings using SafeGraph’s POI and polygon data to determine where these pings occurred.
3. Model visit patterns by using SafeGraph data features in combination with mobility data:
   - DistanceToPlaceCentroid: Data users can calculate the distance between the cluster centroid and the POI centroid.
   - DistanceToPolygonWkt: Data users can calculate the distance between the cluster and the nearest point on the POI polygon.
   - NAICS x Hour: Data users can combine the first four digits of the NAICS code with the hour of day, enhancing targeting by time of visit (e.g., visiting a bar at 11 p.m. vs. Walmart at 11 a.m.).

Once you’ve modeled the visits, you can create advertising audience segments like frequent Starbucks visitors.
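
The three features in step 3 can be sketched in a few lines of standard-library Python. This is a rough illustration under stated assumptions, not SafeGraph’s or any customer’s implementation: the function names are hypothetical, the polygon is assumed to be a simple WKT `POLYGON` whose outer ring is closed and given in longitude/latitude order, and distances to the polygon use a local flat-earth approximation that is only reasonable at building scale.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, in meters


def distance_to_place_centroid_m(cluster_lat, cluster_lon, poi_lat, poi_lon):
    """Haversine distance in meters between a ping-cluster centroid and a POI centroid."""
    phi1, phi2 = math.radians(cluster_lat), math.radians(poi_lat)
    dphi = phi2 - phi1
    dlmb = math.radians(poi_lon - cluster_lon)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def parse_polygon_wkt(wkt):
    """Return the outer ring of 'POLYGON ((lon lat, ...))' as (lon, lat) tuples."""
    inner = wkt[wkt.index("((") + 2:wkt.index(")")]
    return [tuple(float(v) for v in pair.split()) for pair in inner.split(",")]


def _to_local_m(lon, lat, lon0, lat0):
    """Project to meters on a flat plane centered on (lon0, lat0)."""
    x = math.radians(lon - lon0) * EARTH_RADIUS_M * math.cos(math.radians(lat0))
    y = math.radians(lat - lat0) * EARTH_RADIUS_M
    return x, y


def _origin_to_segment_m(ax, ay, bx, by):
    """Distance from the origin to the segment (ax, ay)-(bx, by)."""
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    t = 0.0 if length_sq == 0 else max(0.0, min(1.0, -(ax * dx + ay * dy) / length_sq))
    return math.hypot(ax + t * dx, ay + t * dy)


def _origin_inside(ring):
    """Ray-casting test: is the origin inside the closed ring?"""
    inside = False
    for (ax, ay), (bx, by) in zip(ring, ring[1:]):
        if (ay > 0) != (by > 0) and 0 < ax - ay * (bx - ax) / (by - ay):
            inside = not inside
    return inside


def distance_to_polygon_wkt_m(cluster_lat, cluster_lon, polygon_wkt):
    """Distance in meters from the cluster centroid to the polygon (0 if inside)."""
    ring = [_to_local_m(lon, lat, cluster_lon, cluster_lat)
            for lon, lat in parse_polygon_wkt(polygon_wkt)]
    if _origin_inside(ring):
        return 0.0
    return min(_origin_to_segment_m(ax, ay, bx, by)
               for (ax, ay), (bx, by) in zip(ring, ring[1:]))


def naics_hour_feature(naics_code, hour):
    """Combine the first four NAICS digits with the hour of day, e.g. '7224_23'."""
    return f"{str(naics_code)[:4]}_{hour:02d}"
```

A visit model could then take these three values as inputs for each candidate POI near a cluster of pings. Production code would more likely rely on a geometry library and a proper projection than on the flat-plane shortcut used here.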
##### Other Ways SafeGraph Data Enhances Location-Based Audiences

SafeGraph’s data can be used in various ways to refine audience building:

Example 1: Identifying POIs by Category Tags or NAICS Codes

Advertisers can use SafeGraph’s category tags or NAICS codes to identify specific types of businesses operating at a POI, like gyms or restaurants, and build segments around frequent visitors to these locations.

```sql
SELECT placekey, location_name, naics_code, category_tags
FROM safegraph_places
WHERE category_tags ILIKE '%gym%'  -- ILIKE is already case-insensitive
   OR naics_code IN (713940);      -- NAICS code for fitness centers
```

Example 2: Querying POI Counts in Specific Geographies

Advertisers can analyze POI concentrations by region to identify high-competition or untapped markets.

```sql
SELECT region, COUNT(placekey) AS poi_count
FROM safegraph_places
WHERE category_tags ILIKE '%cafe%'
GROUP BY region
ORDER BY poi_count DESC;
```

Example 3: Checking if a Mobile Ping Falls Within a Specific Polygon

Accurate audience segmentation depends on determining whether a mobile device ping collected by an advertiser or third-party geolocation data company occurred inside a particular POI’s boundaries, which supports more precise advertising strategies.

```sql
WITH ping_data AS (
    SELECT device_id, ping_latitude, ping_longitude
    FROM mobile_ping_table
)
SELECT p.device_id, s.placekey, s.location_name
FROM ping_data p
JOIN safegraph_geometry s
  -- ST_MakePoint takes (x, y), i.e. (longitude, latitude)
  ON ST_Within(ST_MakePoint(p.ping_longitude, p.ping_latitude), s.polygon_geom);
```

##### SafeGraph’s Role in Location-Based Audiences

Building successful location-based audiences requires access to accurate, high-quality data. SafeGraph delivers the essential POI and polygon data advertisers need to create precise, contextually rich audience segments to drive stronger campaign performance. With monthly data refreshes, SafeGraph ensures advertisers have up-to-date information on businesses and other attractions at different POIs to scale their audience-targeting efforts effectively.
To learn more about how SafeGraph’s data can enhance your location-based marketing strategies, reach out to us for a demo or to explore how our data can fit into your existing workflows.

#### How SafeGraph Raised Its Unconventional Series A Financing

Since we published the news of SafeGraph’s $16 Million Series A Round last month, we have received a ton of questions from people about our (admittedly) unconventional raise. As a service to current and future entrepreneurs, we organized the inbound questions and answers in one place.

We hope this acts as a resource and proves helpful to you.

##### What was the thesis behind getting so many (over 100) individual investors in the round?

Our belief is that even if someone invests a small amount (our smallest investor put in $10,000), that person has skin in the game and is much more willing to help and promote the company. So we wanted to bring in as many people as possible.

Of course, we only approached people that we thought could help SafeGraph and add positive value to the company.

Because SafeGraph is selling into so many diverse industries, no one person has all the experience we need. We wanted a group of backers with broad experience (including government experience) to provide us with long-term help, insights and advice.

Note: raising from lots of people is a fairly large time commitment and not for everyone. There is a lot of chasing people down for signatures and wires … and a ton of email back-and-forth. I’d only recommend doing something like this if you are the type of entrepreneur that considers herself organized and on top of email.

##### SafeGraph has over 100 individual investors -- that’s insane. What tools did you use to manage all those people?

I just published “SafeGraph Raises $16 Million Series A” https://t.co/XPYmE7WJTm (Auren Hoffman, @auren, April 19, 2017)

The main reason companies do not bring in lots of investors is the preconception that doing so is hard to manage and clutters the cap table.

We used a few different tools to help us make the fundraise easier.

First is that we used e-signatures for everything. Besides being a startup CEO, I’m also an active angel investor, and I’m always surprised by the number of PDFs I get rather than e-signature documents. (I even get a lot of hard-copy stock certificates in the mail.) E-signatures make the process much easier. I’m not sure it matters what e-signature tool you use -- there are probably at least 5 good ones.

We also used DocSend to send out the deck to VCs. It was really helpful to see which VCs opened it and who else in their firm reviewed it.

Second, we’re really big fans of eShares. eShares makes managing a cap table much easier. Investors and stakeholders can accept certs, request documents, etc. In my opinion, eShares is worth using for any venture-backed company with more than 3 shareholders.

Most importantly, we set up a no-carry SPV (special purpose vehicle) to collect money from our individual investors. The SPV has the same terms as our venture investors but only takes up one entry on the cap table because it is one LLC. We actually were not smart enough to do this at the beginning of the process (we wish we were) and only moved to the SPV midway through the process.

To make sure our SPV investors were treated the same way as our venture investors, SafeGraph covered the SPV fees (about $10,000). The process was fairly easy but had some kinks … mainly because this is still a new thing to do.
It will be easier over time as these SPVs become more common (as I expect they will over the next few years as they simplify everything).

##### I heard SafeGraph got multiple term sheets, why did you choose IDG Ventures?

We went with Alexander Rosen at IDG Ventures for four reasons:

1. They really understood our business. SafeGraph is a data company. And while data companies look a lot like SaaS companies, there are some very important differences that most VCs do not understand.
2. We have a long relationship with Alex (I have been friends with him for 15 years) and have always found him a very thoughtful and helpful person.
3. They were patient with us. Some of the VCs made high-pressure plays to try to get us to decide on a term sheet quickly. IDG allowed us to take the time to make sure we were making the best long-term decision.
4. They were willing to allow us to bring in a bunch of individual investors. Many other VCs had more onerous ownership requirements which would not have allowed us to bring in so many individuals.

##### Why involve a venture firm at all? Why not just raise it all from individuals?

We wanted to have an active board member whose job it is to spend significant time helping SafeGraph. All the individual investors in SafeGraph have day jobs and do not have the time to proactively help the companies they invest in (but they are awesome at reactively helping). The day job of a venture capitalist is to help their portfolio companies.

##### How long did it take you to raise the financing?

It took us a total of 2.5 months from start to close. That included reaching out to all the individuals, setting up the SPV docs, preparing for the venture raise, meeting with the venture firms, selecting a term sheet, and closing the final docs.

##### I heard you invested your own money in the round, is that true?

Yes. I invested $2 million of the $16 million we raised. I thought it was important to put my money where my mouth (and energies) is.
I’m also not taking a salary from SafeGraph.

Taking no salary or investing a lot of money in one’s company is not a path that most venture-backed founders can or should take. But it can be a good idea if you were fortunate to have a big exit in the past and now you want to swing for the fences. Even if you did not have a huge exit in the past, my advice to founders is to take the minimum salary you can get by on so you can telegraph to your shareholders that your financial goal is to make the shares very valuable.

##### How many of the 100 individuals did you know personally before you approached them for an investment?

Either Brent Perez (SafeGraph’s cofounder) or I knew almost every investor beforehand. Some of the investors we have known for over 10 years. We optimized the list for people we knew and trusted. We did bring in a few extraordinary people that we did not know beforehand, but all of those came recommended by people we trust.

Join SafeGraph: We’re bringing together a world-class team, see open positions.

#### How to Apply for Data Science Jobs: The Trick to Landing the Job

Data science is a broad discipline with a range of applications and unique roles. When applying to jobs, it’s important to do proper research about what the role entails and the industry you’ll be working in. It’s also important to consider what employers are looking for in a data scientist. These factors can have a huge impact on what you do each day, and on your overall satisfaction (and success) in a role.

To help you apply for data science jobs, we’ll cover the following:

- Do you need a degree to be a data scientist?
- How to find a data science job
- How to apply for data science jobs: 7 tips

Before we get into tips for the application process, we’ll briefly discuss whether or not you need a degree and where to look for data science jobs.

##### Do you need a degree to be a data scientist?

No, a degree is not required to be a data scientist.
However, most data scientists have post-secondary education in related fields. In fact, many data scientists have advanced degrees, such as a master’s or PhD in their field. Data scientists rely mostly on their hard, technical skills to get a job in their field.

Bachelor’s or advanced degrees in the following fields are typically a good foundation for data science roles:

- Statistics
- Engineering
- Math
- Computer Science
- Physics

When applying for roles as a data scientist, you will rely mostly on technical skills, such as your knowledge of programming languages, data analysis solutions, statistical methods, and how to apply this knowledge for practical results.

##### How to find a data science job

When applying for data science jobs, you first need to know where to look. Below, we cover the top places to look for data science jobs.

###### 1. Browse online job boards

Job listing sites are some of the first resources you think of for finding a job in your field and the industry that you want. However, since these are the first places most people go, they are very popular, competition is high, and the boards can often be saturated with listings of mixed quality.

There are also a limited number of opportunities on these online job boards, as you’re only able to apply for what people list here. There will be many opportunities that are never listed on these job boards that you will miss out on entirely.

That being said, there are a number of opportunities and if you can sift through the listings, you can find good quality opportunities.

Check out some examples:

- Indeed: As one of the leading job board sites, Indeed is a great place to find data science jobs. You can even apply directly within Indeed to make it extremely simple.
- Glassdoor: You can find a wide range of job postings on Glassdoor, easily filtered by industry, role, and more.

There are also a number of job boards specific to the data science industry.
As a specific marketplace for these industries, you’ll eliminate a lot of clutter you’d find on more generic sites.

- KDnuggets: As a reliable data science hub on the internet, KDnuggets is a great resource for finding data science job listings.
- Outer Join: Outer Join is a job board focused on remote jobs in data science. If you’re looking for data science jobs you can do from home, this is a great place to look.
- StatsJobs: StatsJobs is a job board centered on jobs related to statistics. This is a great place for data science jobs with a heavy focus on statistics.
- DataJobs: As the name suggests, DataJobs is a job board designed for finding data science and data analytics jobs.

###### 2. Network on social media platforms and email

Networking is a great way to find data science jobs, especially ones that are related to the industry and niche that you want to work in. By connecting with the right people and companies, you can find opportunities closely related to your interests and future career goals.

While this can work effectively, these services are saturated with users looking for jobs. Because of this, some networking services are more valuable than others in terms of providing access to good, quality jobs. Aside from good old fashioned email, we’ve listed some of the top platforms for networking below.

Check out some examples:

- LinkedIn: As a site specifically designed for professional networking, LinkedIn is a great place to network for business purposes. They even have job listings and job search options to help you find jobs.
- Facebook: While Facebook is commonly thought of as a personal networking site, many people have professional Facebook pages, and businesses run their own pages as well.
Using your Facebook as a way to network - when done properly - can be a great way of finding available job opportunities.
- Instagram: While it isn’t the most traditional method of finding jobs or networking, plenty of people create professional Instagram profiles for their business or professional persona. This can be a great option for connecting with people you’d otherwise never find on traditional networking sites.
- Twitter: Although you have a limited character count, Twitter can be a great place to network with peers, businesses, and the like.

###### 3. Search company websites

One of the best ways to find a data science job that is ideal for you is to go directly to the source. Find a company that fits your industry and the unique data scientist role you want, and then reach out to them to connect, or follow their job postings to apply when an opportunity comes up.

Finding companies this way can give you a leg up on competition - especially competition that found them via a job board - as you’ve shown a direct interest in their company. This effort shows that you’ve done some research about their business and have considered how you’d fit in with their team and in this role.

Check out some examples:

- Careers page: Some businesses will list jobs to the public on a careers page. This is a great place to review postings, and a great place to apply to jobs directly.
- Contact page: Even if you can’t find job listings on a company website, you can still reach out via their contact page. If you are interested in a certain type of role, do some research into what they do and speak to how you would fit and offer value in your message.
- About page: A company’s about page is a great place to learn about what they do and their culture, allowing you to determine if you’d be a good fit with their company. Not only can you understand what your role would look like, but you can sometimes even see the team and learn about the work environment and values.

###### 4. Meet at conferences and events

Despite the ability to network online easily, in-person connections go further, enabling you to create a deeper, longer lasting connection. Meeting with someone face-to-face makes it easier to remember that person, and allows you to create a more personal connection. When they are considering hiring a new data scientist, you are more likely to come to mind first.

Conferences and other in-person events are ideal networking opportunities. Be sure to treat both people and businesses as equal opportunities. You may meet an incredible manager at a company that doesn’t particularly interest you. That personal relationship could still lead to a data scientist role down the road if they change companies or their team expands.

Check out some examples:

- SafeGraph events: The SafeGraph Community runs in-person and virtual events, helping you learn about data science applications, use cases, and more, all working towards building your experience and knowledge of data science.
- Industry conferences: Attending conferences in your specific field will help you connect with professionals and businesses in your industry. They are also great places to learn about market standards, best practices, and changes.

###### 5. Join a community of data scientists

Joining a private or public community forum of data scientists is a great way to network and find jobs. Share your experiences, research, and publications, and challenge each other to gain inspiration and fresh ideas.

While you may not be able to apply directly for jobs in these communities, they are an ideal method of networking, as you know members of the community are interested in and have experience in your field, industry, niche, or even the specific job you’re interested in.
These are ideal connections for networking, and there are often members of other organizations there as well, helping you gain roles outside of the organization running the community.

Check out some examples:

- IBM Data Science Community: Join discussions based on the type of information you want, including AI skills and hands-on learning.
- Kaggle: More commonly recognized as a data science tool, Kaggle has grown a large community and cultivated a high-quality resource pool.
- Open Data Science: Gain access to data science and AI news, and connect via their Slack community.

##### How to apply for data science jobs: 7 tips

Applying to data science jobs can be challenging; some companies ask for very specific skills while others loosely list technical requirements. With such a wide range of applications, use cases, and industries, data science is a broad field with many roles. Therefore, a major part of applying to jobs and finding success with this process is making sure you are applying for the right roles for your goals, experience, and expertise.

Below are the top things to consider and leverage when applying to data science jobs for a better chance at landing the job.

###### 1. Format your resume to be read by applicant tracking systems (ATSs)

Many places - especially data science companies - use ATSs (applicant tracking systems) to automatically parse resumes and identify top candidates for interviews based on specific fields. If your resume isn’t properly formatted to be read by an ATS, you could miss out on opportunities without ever being seriously considered - or manually reviewed.

One of our team members used LaTeX, a typesetting system for high-quality technical papers, for their resume. Their resume looked awesome, but they weren’t getting any traction. When they put their resume through an online ATS checker, they got a horrible grade back, indicating that the ATS could not read the resume well.
In fact, it couldn’t even read bullet points correctly, likely resulting in the resume being overlooked.

Ensure that your resume follows an easily readable format, and that it’s designed so that an ATS can easily parse your information.

###### 2. Each data scientist role is unique

Within the world of data science, there are a variety of types of jobs and specific roles. Broadly speaking, there are a few areas of focus, such as statistics, visualization, data management, data architecture, and machine learning. However, there are in fact many more than this.

A data scientist role can focus on any, or all, of these areas. Because of this, it’s important to understand the role you’re applying for and the responsibilities that come with it. Depending on your experience and interest, you may be a great fit for one data scientist role, while completely uninterested in another.

It’s also important to remember that data scientist, data analyst, and similar variations often refer to similar roles and responsibilities. There can be significant overlap between these roles and sometimes very similar roles will be labelled very differently. You may even find project management or business operations jobs closely related to data that suit your interest best.

###### 3. Data science roles vary greatly across industries

More than just differences in data scientist roles, there are many differences in the requirements, responsibilities, and opportunities across industries and sectors.

A data scientist at a large company like Apple may have an extremely niche role at any given time, but they may also have increased opportunity to move around to many niche roles throughout their time with the company. Alternatively, a data scientist at a startup may be expected to wear many hats as projects demand. Further still, some roles may require domain expertise, such as in energy, biology, or even customer service.

###### 4. Understand where you are and plan for where you want to be

With so many unique data scientist roles, there is a wide range of opportunities in your field. When applying to jobs, it’s important to have a clear understanding of where you are and where you want to be in your future career.

Do you want to get a solid understanding of the various areas of data science with the goal of moving up in the organization? Do you want to specialize as an expert in a specific niche area of data science, such as machine learning or geospatial data?

Planning for your future outcomes will help you determine the best role for you, as well as help you ensure you steadily work towards your end goal throughout your career.

###### 5. Curate (and leverage) your online presence

Your online presence - whether employers admit it or not - is a key component of the hiring process. Interviewers will likely look you up online to see what your online presence is like, and the type of content you engage with. How you represent yourself online can speak volumes about who you are as a person and employee.

When applying to data science jobs, be sure to curate your online presence accordingly. Keep profiles on job-finding apps like LinkedIn up-to-date, and be conscious of how you represent yourself on personal accounts as well. If you have an online data science portfolio showcasing your work, make it easy for employers to access.

###### 6. Understand yourself and the role you want

With such a variety of data science roles, to find the best fit for you, you need to understand yourself, the type of industry you want to work in, and the type of work you want to do regularly. When you have decided you’d make a good fit and want to apply for a role, be sure to show them how you fit with your resume, cover letter, and any other communication you have.
An email message with an attached resume or cover letter is a great place to write something that may not fit in your formal docs, but could make you stand out.

To find a data scientist job that suits you (and that you will want to stay at), you’ll want to have a clear understanding of the things you like to do in the field of data science, and the industries that interest you the most. With such a wide spectrum of options, finding the right fit is the best way to ensure that you apply for - and get - the right job for you.

###### 7. Feature your work in an online portfolio

Having a data science portfolio that can be easily accessed by employers is a great way to showcase your work. Whether you have years of experience or you simply have projects from school that you can use as examples of your quality of work, a portfolio speaks volumes about you and how you present yourself.

Include an easy-to-use link to your portfolio on your resume so your interviewer can easily pull it up and review it. Make sure that it’s presented cleanly and that it works, or this could cost you a job right away.

Given the wide range of roles in data science, applying to data science jobs can be a challenging process. You’ll need to identify which roles fit with your experience, background, and interests, which industries are most appealing to you, and what skills are most applicable to the jobs you want before you can even apply.

Once you’ve identified the roles that interest you the most, you can develop a targeted resume and cover letter, which will increase your chances of success.

Learn why data scientists choose SafeGraph data for their analysis, and practice using these datasets yourself to brush up on your skills.

#### How to Build a Catchment Area Map for Enhanced Trade Area Analysis

##### Key Takeaways

- A catchment area map visualizes where customers come from and supports better trade area analysis.
- Catchment maps help identify coverage gaps, overlaps, and competitive pressure across locations.
- Businesses use catchment maps for site selection, market penetration analysis, and targeted marketing.
- Catchment areas can be defined using distance buffers, drive or walk times, or mobility data.
- High-quality POI data and the right GIS or BI tools are essential for accurate catchment mapping.

To understand customer acquisition and choose the best new sites for your stores, you need to have a clear understanding of where your customers are coming from. Catchment area maps (or trade area maps) help you visualize this information so you can draw better insights.

We’ve already covered everything about catchment areas in the ultimate guide to catchment areas, so this post will focus on the value of a catchment map and how to build it. To do that, we’ll cover the following:

- What is a catchment map?
- Why is a catchment map useful?
- How to build a catchment map to conduct better trade area analysis

Before we dive into how to build a catchment map, we’ll explain exactly what it is and why it’s so useful for your analysis.

##### What Is a Catchment Map?

A catchment map is a data visualization of one or more catchment areas, enabling analysis of customer acquisition. Catchment areas are geographic locations from which a business, organization, or service attracts visitors. These catchment areas are determined by the region under analysis.

Catchment areas are determined based on a variety of factors, such as the distance to the location, travel time to the location, population within an area, and even geographic boundaries. For example, a retail location will have a catchment area surrounding the building to analyze foot traffic. However, there may be multiple catchment areas for analyzing inbound traffic to your retail location, such as billboards, bus stops, and more.

##### Why Is a Catchment Map Useful?

Catchment area maps allow you to visualize customer behavior including drive times and more.
A map is the best way to present this data, as it enables you to analyze your store locations and get a good idea of how they perform. You can draw better, deeper insights from this type of data when it’s displayed as a map.

- Identify coverage gaps and overlaps - By mapping all of your locations and analyzing catchment, you will be able to easily visualize gaps and overlaps in your coverage. This can help you expand your business to areas that are looking for your business and reduce the risk of cannibalization.
- Site selection - Being able to easily visualize the trade area that your business locations cover helps you easily identify the best options for new site locations. When trying to identify new sites, you can use catchment maps to view your existing coverage, identify your gaps, and consider competitor threats, helping you choose the best site locations.
- Competitor analysis - Catchment analysis allows you to plot competitor locations in relation to your existing or proposed business locations, which is extremely useful for determining your performance in relation to competitors. This can help you choose new site locations and help you better compete in the market.
- Analyze market potential and penetration - Depending on what you include in your catchment area analysis, you can gain a clear understanding of the market potential in your area and possible market penetration.
- Focus your marketing on your core audience - Understanding your catchment area can save you money on wasted marketing, as you can stop sending marketing messages to people outside of your trade area, focusing on customers that are more likely to visit your physical location.

Now that you know why having a catchment area map is so important, you’ll want to know how to build one.
We show you how to build a catchment map below for your trade area analysis.

##### How to Build a Catchment Map to Conduct Better Trade Area Analysis

A catchment area map is an extremely useful asset for your trade area analysis, helping you easily visualize data and make better (and faster) decisions.

You can build your trade areas with or without considering data analytics, but it is recommended that you use as much point of interest (POI) analytics as you can to inform your catchment area mapping. Below, we walk you through the steps to build a catchment map to enable trade area analysis.

###### Step 1. Identify store location points

Identify the stores or locations you want to include in your trade area or catchment analysis and obtain the relevant datasets. Plot these locations on a map. This will form the basis of your catchment area map, and will determine the trade area(s) that you want to analyze. Be sure to include not just your store locations, but also those of competitors and complementary businesses that may impact your strategy.

###### Step 2. Identify methodology

Based on your needs, choose the methodology that is right for you: calculating catchment areas by buffer trade areas, walk/drive time trade areas, or mobility trade areas. Each method draws catchment area boundaries differently, allowing you to analyze your trade area based on a variety of factors.

See more about the methods for calculating catchment area to help you decide the best method for you.

###### Step 3. Load the data into the geospatial processing tool of your choice

After choosing the methodology, you can load data into the geospatial tool you want to use to process the data. Select a GIS or BI tool with capabilities that match the methodology you want to use, and set the parameters in the geoprocessing tool.

###### Step 4. Perform the geoprocessing necessary for your methodology

Generate a distance-based radius (buffer) or a drive-time or walk-time isochrone, which represents travel time for a given method of transportation.
You can even load mobility data to see where people are coming from and map those census block groups (CBGs) and POIs with a symbology that reflects the different volumes from each place.

###### Step 5. Publish the map with necessary map elements

When the data reflects trade areas in the way you want, create a map that is easy to read, has appropriate labels and a legend, and is drawn at an appropriate scale for your intended audience. These catchment area maps are the best method of visualizing trade area and customer acquisition data, painting a vivid picture of your catchment area.

Building a catchment area map is extremely important for visualizing consumer behavior around your business and understanding where your customers come from. Using a catchment map can help you analyze your trade area, making it easier for you to target your customers and drive more traffic. When building your catchment area maps, be sure to rely on high-quality point of interest data from SafeGraph.

##### FAQs

1. What is a catchment area map? A catchment area map is a visual representation of one or more catchment areas that shows where customers or visitors originate.
2. How is a catchment area map different from a trade area map? They are often used interchangeably. Both visualize geographic areas from which a location draws customers, though “catchment map” emphasizes customer origin.
3. What data is needed to build a catchment area map? You typically need store or location points, POI data, demographic or mobility data, and a GIS or BI tool to process and visualize the analysis.
4. What is the best method for defining a catchment area on a map? Mobility-based catchment areas are the most realistic, as they reflect actual customer movement patterns rather than assumed distance or travel time.
5. Who should use catchment area maps? Retailers, marketers, real estate teams, urban planners, and analysts involved in site selection and location strategy benefit most from catchment maps.
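As a rough illustration of the buffer method described in the steps above, the sketch below assigns customer origin points to every store whose distance buffer contains them, then reports coverage gaps and overlaps. It is a minimal example under stated assumptions, not SafeGraph’s method or a substitute for a GIS tool: the store IDs, origin IDs, coordinates, and the 8 km radius are all hypothetical, and straight-line (haversine) distance stands in for a true buffer geometry.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/lon points, in kilometers."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def buffer_catchments(stores, origins, radius_km):
    """Map each origin ID to the store IDs whose buffer contains it."""
    return {
        origin_id: [
            store_id
            for store_id, (slat, slon) in stores.items()
            if haversine_km(slat, slon, olat, olon) <= radius_km
        ]
        for origin_id, (olat, olon) in origins.items()
    }


def gaps_and_overlaps(covered_by):
    """Origins covered by no store (gaps) and by several stores (overlaps)."""
    gaps = sorted(o for o, s in covered_by.items() if not s)
    overlaps = sorted(o for o, s in covered_by.items() if len(s) > 1)
    return gaps, overlaps


# Hypothetical inputs: two stores and three customer origin points
# (for example, CBG centroids), each given as (latitude, longitude).
stores = {"store_a": (37.7749, -122.4194), "store_b": (37.8044, -122.2712)}
origins = {
    "cbg_1": (37.7793, -122.4193),  # very close to store_a
    "cbg_2": (37.7900, -122.3450),  # roughly midway between the stores
    "cbg_3": (37.3382, -121.8863),  # far from both stores
}

covered_by = buffer_catchments(stores, origins, radius_km=8.0)
gaps, overlaps = gaps_and_overlaps(covered_by)
print("coverage:", covered_by)
print("gaps:", gaps)
print("overlaps:", overlaps)
```

Origins that fall inside more than one buffer point to possible cannibalization, while origins inside none point to coverage gaps. Drive-time and mobility-based catchments need road-network or visit data and are better handled in a dedicated geospatial tool.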
#### How to Decide Whether to Build vs. Buy a POI Database

As point of interest (POI) information grows in importance for businesses – for both outward-facing applications and inward-facing dashboards – a key question many companies have to grapple with is whether to build vs. buy the data. Building a POI database affords greater customization and data control, but comes with many expenses and considerations – both initially and moving forward.

Your organization can mitigate some of these upfront costs by instead buying or licensing a pre-existing database as a starting point. However, you will still need to spend time, money, and engineering and data resources to correct any inaccuracies, duplications, and omissions. And you will still need to maintain the database over time to ensure the data stays fresh and precise.

There’s a happy medium to this dilemma, though: buy or license a database where the provider is committed to doing all the setup and maintenance work for you. That’s what we do here at SafeGraph. To show you why we think we’re the best option, we’re going to explore the pros and cons of each choice in the buy vs. build decision through the following sections:

- The Build vs. Buy Decision: Understanding Your Options
- Option 1: Build a POI Database Yourself
- Option 2: Buy a Partial POI Database and Clean/Improve It Yourself
- Option 3: Buy a Complete POI Database, like SafeGraph

Before getting into the specifics of each option, we’ll first talk a bit about why build vs. buy is such an important decision to get right in terms of acquiring a POI database.

##### The Build vs. Buy Decision: Understanding Your Options

Whether your organization needs POI data to power a consumer-facing map application, analyze a trade area, create a market forecast model, or fulfill some other use case, one thing is constant: the quality of your outputs is only as good as that of your inputs.

Let’s take the consumer-facing map application as an example. If a consumer goes to find a place using your company’s map, but your POI data hasn’t been updated since last quarter, it’s possible that place may have closed down or been turned into a different place entirely.

The pace at which POIs open, close, move, or change is faster than you might think. That’s why it’s very difficult to maintain a highly-accurate, large-scale POI database. So for companies needing to do so, there are three general options:

1. Build a POI database from scratch by sourcing high quality data and merging multiple datasets.
2. Buy a partially-complete POI database, and then clean/upgrade it with additional data and ongoing maintenance.
3. Buy a complete, externally-managed database like SafeGraph Places.

The decision to build vs. buy a database of POI information requires your business to consider a number of factors. The following sections will outline many of these aspects for each of the three main choices, so you’ll know what’s in store before you make your final selection.

##### Option 1: Build a POI Database Yourself

Choosing to build vs. buy a data pipeline of POI information has a few core benefits. First, your organization can customize the system to fit its exact use cases.
Second, your company doesn’t have to pay or rely on an outside company to continually manage and update the database. And finally – on a related note – being able to own the data (or at least get relaxed licensing terms on it) gives your business greater freedom in terms of what it can use the data for.

However, your organization needs to make a lot of upfront commitments and decisions if it opts to build rather than buy POI data. Here are some aspects that will need to be taken into account.

Setting up the Infrastructure

To start, your business needs to decide what kinds of use cases it will likely use POI data for. Based on these, it can determine what capabilities it wants the POI database to have. We mentioned this customization is an advantage of choosing to build rather than buy, but it’s not without its tradeoffs.

The database will need one or more software engineers dedicated to setting it up and managing it, and they aren’t cheap to hire. You’ll likely need upwards of $150,000 for each. Then there are the computing costs to source, process, clean, and maintain the data infrastructure. How much your business needs will depend on the scope of the data you plan to cover (e.g. maybe just a specific country or a limited number of data attributes). But you should expect costs in the $100,000 to $600,000 range, for starters.

Acquiring Base Data

After setting the database up, your company will need to start populating it with data. One inexpensive option is to copy from open datasets, such as OpenStreetMap. But there are issues with relying solely on free POI data; often, it’s incomplete, not updated regularly, difficult to download, or lacking documentation to explain how it works. Also be careful to read the licensing terms of any dataset your organization downloads.
The provider may not always allow the data to be used commercially, or for specific use cases.

Another method is to build a web crawling tool to scrape publicly available POI information. This is a faster way of doing things, but is still subject to many of the problems with using free data listed above. It’s also an added cost for your business, usually in a starting range of $100,000 to $600,000 depending on the breadth and depth of data you want to collect (e.g. a territory or country vs. a full continent).

Building and Cleaning the Database

Once you have some entry-level POI data, you’ll need to branch out into acquiring specialized datasets for your company’s specific needs. However, this involves much more work than simply fitting datasets together like puzzle pieces. There are all sorts of challenges with processes like matching entries between datasets that refer to the same place (and, by extension, verifying that place actually exists), as well as cleaning the data to eliminate duplicate entries within the same dataset. Varying attributes and data formats – especially for addresses – make these processes all the more difficult. That’s why they often require advanced machine learning techniques to do well.

To illustrate, once a place is verified to exist by checking it across multiple datasets, it has to be geocoded, classified, and assigned attribute data. This can be tricky and time-consuming to do manually (e.g. which attributes should be prioritized? Are a place’s coordinates at an exact address or at the centroid of a building?). Using a geocoding program or advanced machine learning algorithm can save your organization time here too, but you’ll have to weigh it against how much it will impact your bottom line.

In any event, not cleaning and merging data properly can cost your company far beyond what it paid to purchase or license the data. If using it internally, incorrect data can throw off your organization’s analysis and modeling.
And if using it in a consumer-facing capacity, customers may quickly leave your product and not recommend it to others, citing its unreliability.

Opportunity Costs

One other cost to keep in mind when your company is mulling over build vs. buy decision criteria for a POI database is how many resources would otherwise be spent in other areas of your business. If your organization has geospatial data and applications at the core of its operations (or is planning to make this shift), then it might make sense to build. If not, then you need to ask a couple of key questions.

The first one is how long it will take before your business starts seeing a return on the investment of building rather than buying a database. The process takes longer than other options – possibly 6 months, if not more – so your company needs to be able to cover the expenses until the project gets off the ground. Also remember that your company will be responsible for maintaining and updating the database going forward. So be sure to factor that into the time and money spent as well.

The other, related, question is what your organization would otherwise be doing with what it spends on building. That includes getting the database up and running, as well as maintaining it over time. Again, this comes back to what your company’s key value proposition is. If that isn’t geospatial data and applications, then it’s usually a better idea to start with a premade database instead of trying to reinvent the wheel. This frees up your organization to focus its time, money, and engineering resources on doing what it does best.

Option 2: Buy a Partial POI Database and Clean/Improve It Yourself

If your company is considering buying instead, it will need to find a dataset to download or license as a starting point. This is still a difficult and time-consuming process, not only to find potential providers, but also to evaluate their data quality.
What scope of geographies does the data cover? When was the data last updated, and how often is (or was) it reviewed? What contextual information does the data have, and how many records actually have this information filled in? These are all questions your company needs to ask.

A potential option, if your company’s a larger enterprise, is to use a database built by a competitor you acquire or merge with. This, of course, requires many other considerations, as there are a lot of costs, negotiations, and management involved with acquisitions and mergers. Another option is to use a free open-source database, such as OpenStreetMap. Again, though, there are numerous pitfalls with using geospatial data that’s contributed and updated primarily by volunteers, just to save on upfront costs.

In any case, there will still be work to do in terms of setting up the database’s infrastructure. That includes not only processing and storing the data, but also doing any necessary cleanup work. Depending on which provider you license from, the data may be missing entries, or have entries with incorrect attribution or that are duplicates referring to the same place. The longer it has been since the data was last updated (and the less frequently it was updated), the more likely it is your organization will encounter these errors and omissions. Again, not dealing with these issues in a timely manner can cost your business down the road, in terms of inaccurate analysis/modeling or loss of consumer trust.

Your organization will require the help of data scientists and software engineers to do all of this work, both initially and going forward. As we discussed for the build option, this represents more than a monetary cost on top of what your organization pays to license the data.
It’s also an opportunity cost in terms of what your business would otherwise be devoting time and engineering / data science resources to.

Option 3: Buy a Complete POI Database, Like SafeGraph

If your organization is set on buying but doesn’t want to deal with all the extra data scrubbing and maintenance work, there is a third option. It can buy or license a POI database from a third party that has already taken care of all the merging, cleaning, verifying, and updating tasks – namely, us here at SafeGraph.

This option is much less expensive – in terms of time, money, and engineering/data resources – than building a POI database from scratch. Additionally, the effort SafeGraph puts into making our Places dataset as complete, accurate, and fresh as possible ultimately saves your company resources over alternatives that may have lower upfront costs. Because our Places data is so precise and well-documented, your business can use it right out of the box. There’s no need to take up your engineers’ or data scientists’ time verifying whether or not a place exists, plotting exactly where it is on a map, deciding which classifications and attributes apply to it, or erasing/merging any duplicate entries.

All of this doesn’t just free up your organization to invest its time and human resources into the things it really wants to be doing. It also saves your company money because it eliminates the costs of software and talent acquisition associated with setting up and maintaining an in-house POI database. Meanwhile, your business can be confident that its internal analyses and models – and its consumer-facing applications – are powered by accurate data that allows both stakeholders and customers to make well-advised decisions.

As an example, the commercial real estate consultants at Avison Young previously used POI datasets that were often messy and limited to information about big brands.
This made it difficult for them to deliver answers to their clients’ site selection questions in a timely manner, as analysts were spending up to 40% of a project just cleaning data and adding in missing information from open sources. Switching to the SafeGraph Places database has given them a much faster way to get a comprehensive overview of the business mixes in trade areas. This has allowed them to get insights to their clients sooner, so these businesses can act upon the advice before they lose the opportunity.

Learn How You Can Save by Buying Accurate, Robust POI Data

It’s important to consider the cost of building vs. buying a database of POI information, should you decide your organization needs one. And remember this cost is not just monetary; it also includes time, human resources, and opportunity.

That’s why you should do as much research as you can before you make your final decision. For example, check out a sample of SafeGraph’s POI data from our Places dataset. Or if you want to schedule a demo to get an in-depth look at how Places can work for your organization, get in touch with our sales team.

#### How to Do Effective Catchment Marketing for Your Local Business

Key Takeaways

- Catchment marketing is built on understanding where customers come from and how they behave around store locations.
- Catchment analysis enables more precise, cost-effective local marketing by reducing wasted outreach.
- Demographic and mobility data help personalize messaging, offers, and campaign targeting.
- Catchment insights can be used to target competitor customers and underserved areas.
- Integrating catchment analysis into marketing strategy improves regional coverage and customer retention.

Catchment analysis is centered on understanding how the local community interacts with your store locations, allowing you to better understand, and sell to, your customers.
Below, we cover how to leverage catchment analysis to perfect your marketing efforts, helping you acquire, retain, and engage your customers.

To get the most from catchment analysis, you’ll need to rely on high-quality point of interest (POI) data. We’ll cover what it is and the best ways to use it for your local marketing efforts.

- What is catchment analysis?
- What is catchment marketing?
- 6 ways catchment analysis improves local marketing strategy
- How to implement catchment analysis into your marketing strategy

To start, let’s cover catchment analysis, and explore why it’s so useful for marketing.

What Is Catchment Analysis?

Catchment analysis is the process of examining catchment areas for your store locations. Catchment areas are the areas around your business from which you draw customers, and help you understand customer behavior. In catchment analysis, you will analyze trade areas for your store locations.

The end goal of catchment analysis is to have a clear picture of where your customers come from and who they are. With demographics information such as home addresses, age, income, education levels, and more, you can get a good idea of your main audience, allowing you to better serve them, as well as target new locations.

Catchment analysis will show you the coverage of your store locations, revealing any gaps and overlaps. You can use this to inform where to open new locations, close existing locations in underperforming areas, and expand locations to better serve the most successful trade areas.

What Is Catchment Marketing?

Catchment marketing is the process of determining and carrying out your marketing strategy based on the insights you gather from your catchment analysis. This allows you to build a local marketing strategy rooted in customer behavior around your store locations.

When done properly, it allows you to more effectively market to your customers and save effort and money on wasted marketing materials.
The more you know about your customers, the better you can target your audience with marketing, ads, offers, and messaging, helping you drive more traffic and improve conversions!

6 Ways Catchment Analysis Improves Local Marketing Strategy

Catchment analysis, by nature, is a method of local analysis and marketing, as it centers on specific store locations. Since catchment area analysis lets you understand where store visitors come from and who they are, it’s specific to the trade area of your store.

In all cases, trade area analysis will allow you to more effectively market to and serve your customers. With more information on demographics and where your customers come from, you can save on marketing resources and better cater your marketing campaigns to those customers.

Here are the best ways to use catchment analysis to improve your local marketing:

- Increase personalized marketing: Tailor your marketing activities and wording to who is visiting your store. Use mobility data, such as visitors’ origin census block groups (CBGs) and the other places they travel to, to ascertain demographic and brand affinity information.
- Define your marketing distribution area: Knowing where your customers come from allows you to focus your marketing materials, saving on wasted marketing for customers that don’t fall in your catchment area.
- Market to competitors’ customers more effectively: Tailor your marketing activities to target competitor visitors to drive them to your brand instead. Use catchment maps to see where competitor locations are, where your trade areas overlap, and then how to target these customers and bring them over to your brand.
- Understand the market environment: Demographics data allows you to characterize the catchment area, helping you understand if it’s a residential or commercial area, and the average age, income, education level, and home address of potential customers in the area.
- Improve regional coverage: Perfect local market coverage by identifying gaps and overlaps in your trade areas for your own store locations. You can then expand locations to better serve the areas that are showing growth and success, close locations that are underperforming, and identify the best places for new store locations.
- Leverage geofences efficiently: Understanding your catchment area allows you to deploy geofences in the right areas for greatest impact. These can automate mobile push notifications to people in your catchment area or in places you want to add to your catchment area.

How to Implement Catchment Analysis into Your Marketing Strategy

To gain the full benefits of catchment area analysis, you need to effectively integrate it into your overall marketing strategy and process. Be sure to consider how you will implement your analysis into your marketing efforts, including how to develop and distribute content using what you’ve learned about your customers.

Below, we help you integrate catchment analysis into your marketing strategy to gain traction and conversions.

Step 1. See who is currently coming to your store/in your catchment area

The main benefit is the information you gather about your customers, including key demographics and brand affinity characteristics. These will be big drivers in determining your marketing preferences and activities, allowing you to leverage your efforts for best results.
This can be used not only to create better marketing materials, but also to avoid overproducing marketing materials and distributing them to areas where they are unlikely to gain traction.

Ultimately, this lets you maximize the value of the marketing materials that you produce, allowing you to gain the most growth and conversions.

Step 2. Identify other populations you would like to reach (competitor customers, areas you want to expand to)

Beyond serving your existing customers better, good catchment analysis allows you to find new customers that you were missing. Identify key demographic and brand affinity characteristics that can influence their marketing preferences, and use this to better target your outreach to potential customers, helping you steal customers from your competitors.

Step 3. Develop targeted, personalized marketing based on demographic and brand affinity characteristics

Using the insights from your catchment analysis, you can develop a personalized marketing approach. Create tailored ads, offers, coupons, notifications, and messaging to better target your known customers based on personas. This will garner more traction, allowing you to gain new customers and retain existing customers more effectively.

Now that you know what catchment marketing is and how to use catchment analysis to inform your marketing efforts, you can start using these strategies to gain new customers, grow your business, and increase sales. For this to work, you need to use high-quality, reliable point of interest (POI) data for the trade area under analysis.

SafeGraph’s Places provides insights about competitors, including business listings and proximity. With data enriched by demographics data, you’ll be able to analyze customer behavior and improve your coverage and outreach.

FAQs

1. What is catchment marketing?
Catchment marketing is the practice of designing marketing strategies based on insights from catchment area analysis, focusing on where customers originate and how they interact with a location.

2. How does catchment analysis improve local marketing? It helps businesses target customers more accurately, personalize messaging, reduce wasted spend, and focus marketing efforts on high-value trade areas.

3. What data is needed for catchment marketing? Catchment marketing relies on high-quality POI data, demographic data, mobility patterns, and competitor location information.

4. Can catchment marketing help target competitor customers? Yes. By identifying overlapping trade areas, businesses can design campaigns to attract customers who currently visit competing locations.

5. Is catchment marketing useful for small local businesses? Yes. Catchment marketing is especially valuable for local businesses because it focuses on nearby customer behavior and efficient use of limited marketing budgets.

#### How to Use Geocoding API to Drive Business Growth

Key Takeaways

- Geocoding converts written location descriptions to geographic coordinates for systems to analyze.
- A Geocoding API automates location processing at scale across systems and workflows.
- Businesses use geocoding to overcome fragmented, inconsistent location data.
- Geocoding supports better decisions in operations, planning and long-term growth.

Location sits quietly behind many everyday business decisions. From showing nearby stores to estimating delivery times, companies depend on knowing where something is located and what surrounds it. A Geocoding API plays a central role in making this possible. It translates addresses and place names into geographic coordinates that a data system can better utilize to provide meaningful business insights and actions.

Before looking at how a Geocoding API supports businesses, let us first cover the basics of geocoding.

The Basics of Geocoding

Geocoding is the conversion of written descriptions of a place into geographic coordinates. People might know an address, a historic landmark, or even an area by name, but for data systems, these descriptions are just text. Geocoding converts this text into latitude and longitude, creating a precise and more consistent reference for a physical location.

The result of this is what we call a geocode. A geocode is not an address. It is the coordinate-based identifier of a place, using latitude and longitude. While addresses can change, vary in format, or contain errors, geographic coordinates remain fixed. This is why geocodes are far more reliable for analysis, comparison, and automation.

A Geocoding API is the interface that makes this process usable at scale. It allows applications and data systems to send location inputs such as addresses or place names and receive standardized coordinate outputs in return. Instead of handling geocoding manually, businesses rely on APIs to apply the same logic consistently across volumes of location data.

At this foundational level, geocoding is about structure.
It turns descriptive locations into data that systems can store, reuse, and analyze.

What a Geocoding API does in practice

In practice, a Geocoding API handles how location data moves through real workflows. This is where the mechanics of geocoding become operational.

Geocoding usually works in two directions. Forward geocoding (typically known just as ‘geocoding’) begins with an address or place name and transforms it into geographic coordinates. This is often used when businesses want to map customer addresses, store networks, service areas, or delivery destinations. After conversion, these locations can easily be visualized and analyzed using distance measures and other techniques that coordinates make possible. Reverse geocoding works in the opposite manner. It takes geographic coordinates and translates them back into a human-readable location. This process is often used to contextualize movements – turning location pings from deliveries or users into real-world locations.

A Geocoding API applies both directions automatically and consistently. It ensures that new location data entering a system is processed the same way as existing data. Over time, this creates a reliable location layer that different tools and teams can depend on without repeated cleanup or interpretation. Rather than being a one-time conversion step, the API becomes part of ongoing operations, quietly standardizing location data as it flows through the business.

Why businesses use a Geocoding API

Businesses often need to start using a Geocoding API when location data becomes a constraint rather than an asset.

As organizations grow, they collect location information from many different sources. Customer records, transaction logs, delivery addresses, partner systems, and third-party data might all describe the same location using different address strings and place names. These inputs are often incomplete, inconsistent, and difficult to compare.
Over time, this fragmentation makes it harder to run location-based analysis.

Addresses alone also limit what businesses can measure. Text descriptions cannot answer questions about distance, proximity, or coverage. Teams may know where customers are in theory, but struggle to understand how locations relate to one another in practice. This creates blind spots in planning, operations, and strategy.

Finally, location data does not scale well when handled manually. As volume increases, so do errors, delays, and operational overhead. Without a standardized way to process locations, businesses find it difficult to maintain accuracy across systems and teams.

These challenges are what push organizations toward a Geocoding API in the first place.

How Geocoding APIs Drive Business Growth

Geocoding APIs address these challenges by turning location into structured, usable data.

When addresses are converted into geographic coordinates, location stops being descriptive and becomes measurable. Businesses can analyze distance, identify clusters, and compare regions using a consistent reference. This shift enables clearer insights into where demand is concentrated and where performance varies.

Operationally, geocoded data supports more efficient workflows. Routes can be optimized, service areas refined, and coverage gaps identified. Over time, these efficiencies reduce costs and improve customer experience, particularly as scale increases.

Expansion planning no longer rests on gut instinct or speculation; it is data-driven. New locations, markets, or service zones can be assessed for availability, proximity, and existing demand patterns before the business commits to a major decision.

In this way, Geocoding APIs improve the quality of decisions that in turn drive growth.
Better location data leads to better planning, execution, and long-term outcomes.

Real-World Business Use Cases of Geocoding APIs

Geocoding APIs are used wherever location influences cost, experience, or decision-making. While the applications vary by industry, the underlying value remains the same: turning location descriptions into something measurable, comparable and actionable.

Customer and Market Analysis

Businesses geocode customer addresses to understand where demand is concentrated. When customer locations are mapped and analyzed, patterns emerge that are not visible in spreadsheets. Teams can identify high-performing regions, underserved areas, and geographic differences in behavior. These insights support more focused marketing, pricing, and regional planning.

Logistics, Routing and Service Coverage

Delivery, transport, and field operations depend heavily on precise location information. Geocoding APIs let businesses convert destinations into coordinates, which makes it easier to plan routes and calculate distances. This helps lower delivery times and fuel consumption, and helps identify sensible service areas as the business scales.

Address Validation and Data Quality

Incomplete or inconsistent addresses are a common source of operational friction for e-commerce operators and other user-facing applications. Geocoding APIs are therefore used to validate user-input addresses and to improve accuracy across customer records, orders and internal databases. Cleaner location data leads to fewer failed shipments and more satisfied customers.

Expansion and Site Planning

Geocoding gives companies an analytical basis for decisions about where to locate new stores, warehouses or service zones. By taking factors such as customer proximity, access and coverage into account, teams can evaluate future locations before making long-term investments.
This reduces risk and improves the odds of successful growth.

Choosing the Right API

Choosing a geocoding API depends on how location data is used within the business.

Accuracy is essential. Location data must be reliable, especially in regions where addresses vary in structure. Inconsistent results can weaken analysis and lead to poor decision-making.

Coverage matters as well. Businesses operating across multiple regions need consistent performance at different geographic levels, and gaps in coverage can create blind spots. To ensure strong accuracy and coverage, the geocoding API should rely on a high-quality address database, such as SafeGraph’s Address.

Scalability and latency are other key factors. As data volumes grow, the API should be able to handle larger datasets without delays. Finally, geocoded data should integrate easily with existing maps, analytics tools, and operational systems.

How to Get Started with Geocoding

Most businesses already collect address data. The first step is identifying where location affects decisions or operations.

A common starting point is geocoding a single dataset, such as customer or delivery addresses, to assess accuracy and consistency. From there, geocoding can be applied to additional data as needed. Over time, geocoding becomes part of regular data workflows. New location data is processed as it enters the system, keeping the records aligned and ready for analysis.

FAQs

1. What is Geocoding in simple terms? Geocoding is the translation of an address or a place name into geographic coordinates, such as latitude and longitude.

2. What is the difference between geocoding and a Geocoding API? Geocoding is the process of converting a text description of a place into coordinates. A Geocoding API automates this process at scale.

3. Why are geocodes more reliable than addresses? Addresses can vary in format, contain errors, or change over time. Coordinates keep geographic positions constant, which allows for more uniform analysis and automation.

4. 
Is geocoding only used for maps? While maps are a frequent output, geocoding powers analytics, routing, planning, and operational decision-making as well.

5. When should a business adopt a Geocoding API? When location starts to have cost, efficiency, customer experience, or growth implications, geocoding is a must.

#### How to Use POI Data for Catchment Area Analysis

Tips for working with SafeGraph Places data to improve retail site selection

Choosing the right location for a retail business or other commercial property is, in many ways, the key to its long-term success. A “bad” location (for example, a storefront in a secluded area without any surrounding businesses or one that’s not easily accessible from main roads or highways) can spell doom and gloom right from the start. This is why businesses and real estate planners do catchment area analysis, also referred to as trade area analysis, before making any investment in the purchase of property or land. There are a number of different approaches for running catchment area analysis for site selection.
That’s why we thought we’d let you in on a little secret about how you can use POI (point of interest) data to not only fuel your efforts around catchment area analysis but also glean actionable insights around how to engage and retain customers as well as create better overall customer experiences that local consumers actually want and need. So if you’ve never considered using POI data to inform your retail site selection decisions, in this blog we’ll show you how you can do this effectively with the help of SafeGraph Places data.

Why POI data makes sense for retail site selection

Before we get into the nitty-gritty of using SafeGraph Places data for catchment area analysis, it’s important to first set some context around what retail market planning teams care about as they make decisions about either opening new locations or closing underperforming stores. The primary reason a market planning team will do catchment area analysis for site selection is to ensure a new store location can perform at or above expectations, typically in terms of store-level EBITDA or other key financial metrics. To do this, they may assess factors such as:

- Coverage of POIs in a particular category and region
- Potential market share within the catchment area or region as of day one
- The likelihood of ongoing consumer demand within the trade area or region
- The volume of key competitors already doing business in the surrounding area
- Ease of accessibility to the store location (including parking)

In short, before a business makes an investment in a new store location, it’s critical to understand the risk-to-reward ratio of doing so. Obviously, no business comes without its risks; however, by leveraging accurate and up-to-date data to create an educated hypothesis around a store’s potential for long-term success, it’s much easier to minimize any potential risks head-on. This is where using POI data for running catchment area analysis can really come to the rescue.
##### Using SafeGraph Places data for catchment area analysis

For the purposes of this exercise, say you’re considering opening a new retail location, either a restaurant or a clothing boutique, in one of two fast-growing Austin neighborhoods: one just south of the city center (South Congress) and the other in the city’s northern suburbs (Domain). By plotting the SafeGraph Places dataset in a mapping tool like CARTO, you’ll get a visual representation of the two trade areas, including buffer areas across the neighborhoods as well as the approximate walking distance between places in those areas. In the example map below, you can see via the category tags that the neighborhood in North Austin has a high density of clothing and shoe stores whereas the South Congress area is more restaurant-centric. So if you’re looking for a location for a new restaurant, you might have a lot more success if it’s in the heart of a retail district that caters to a constant influx of shoppers and that just so happens to have less competition from nearby restaurants. On the other hand, if you’re looking to open a new retail boutique, you might want to consider the South Congress area because there are a lot of eateries around that can amplify the shopping experience for consumers. Then again, in both scenarios, you’ll want to pay close attention to what kind of restaurants or retail boutiques are already in the area because, in spite of a higher concentration of certain POIs, there might not actually be a high concentration of the same type of restaurant or retail shop you’re looking to open. That in and of itself could create a unique opportunity for success.
In any case, the big takeaway here is that POI data can shed a tremendous amount of light on not only how to address the wants, needs, and expectations of the target demographics (aka, people) who live in or frequently travel to a specific retail trade area but also how to use your new restaurant or retail location to provide added (and differentiated) value in those areas.

Now, let’s take a closer look at what kind of information you can find on this map. By hovering over a pin for each POI shown here, you get access to a wealth of metadata, such as:

- POI type (full-service restaurant, limited-service restaurant, clothing store, shoe store)
- POI name
- Street address
- Category and sub-category
- “Opened on” date
- Geolocation (latitude and longitude)
- The number of similar POI types within a 0.5-mile radius

This data allows you to see at a quick glance the key differences between the two trade areas, which can also help answer important questions and make site selection decisions a lot easier:

- Are there any coverage gaps?
- Does coverage overlap in any way?
- Where are competitors located in relation to the proposed location?
- What is the concentration of similar POI types in a given retail trade area?
- How far is the location from other shops, restaurants, public transportation, etc.?
- How many new stores have opened in the past year (area growth rate)?

Answering these questions (and more) allows you to assess the potential of one trade area over another, from a variety of different angles and key considerations, in order to home in on a new store location that will have the highest probability of success from day one.
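As a rough illustration of the "number of similar POI types within a 0.5-mile radius" figure mentioned above, the count can be derived from latitude and longitude alone. The sketch below is a minimal example under stated assumptions: the record fields (`name`, `category`, `latitude`, `longitude`), the sample coordinates, and the helper names are hypothetical and are not taken from the SafeGraph schema or from the map shown in this post.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def count_similar_nearby(site, pois, category, radius_miles=0.5):
    """Count POIs of the given category within radius_miles of the site."""
    site_lat, site_lon = site
    return sum(
        1
        for poi in pois
        if poi["category"] == category
        and haversine_miles(site_lat, site_lon,
                            poi["latitude"], poi["longitude"]) <= radius_miles
    )


# Hypothetical candidate site and POI records (illustrative values only).
candidate_site = (30.2500, -97.7500)
sample_pois = [
    {"name": "Taqueria A", "category": "restaurant", "latitude": 30.2510, "longitude": -97.7500},
    {"name": "Bistro B", "category": "restaurant", "latitude": 30.2530, "longitude": -97.7500},
    {"name": "Boutique C", "category": "clothing", "latitude": 30.2505, "longitude": -97.7500},
    {"name": "Diner D", "category": "restaurant", "latitude": 30.2700, "longitude": -97.7500},
]

restaurants_nearby = count_similar_nearby(candidate_site, sample_pois, "restaurant")
boutiques_nearby = count_similar_nearby(candidate_site, sample_pois, "clothing")
print(restaurants_nearby, boutiques_nearby)
```

Running the same count for each candidate site and each category gives a simple side-by-side view of competitor density versus complementary businesses in the two trade areas.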
For many market planners, the most important factor is to settle on a location that caters squarely to a business’s target audience, where there are fewer similar store or restaurant types (including direct competitors) in the immediate vicinity while also being in close proximity to other complementary shops, restaurants, and commercial or public buildings to benefit from the organic residual foot traffic that exists within a given neighborhood. Finding that sweet spot is not always easy. However, by visualizing SafeGraph Places data in this way, it becomes a lot easier to understand the “big picture” market dynamics of a trade area.

##### Enhance retail site selection with SafeGraph Places data

While there are certainly a number of factors beyond POI data alone that can go into the catchment area analysis process, it goes without saying that using POI data can enhance market planning and drive better, more informed retail site selection decisions in significant ways. This is made even more effective by tapping into the power, accuracy, and granularity of the SafeGraph Places dataset, which now covers 41M+ POIs, 10K+ brands, and 400+ categories in more than 220 countries and territories. Not to mention, every POI in our dataset includes detailed metadata, including precise geocodes, store IDs, category tags, open hours, open and close dates, nearby public transportation, and spatial hierarchy. So whether you are doing catchment area analysis or want to leverage location data for retail and real estate analytics, the SafeGraph Places dataset has got you covered (no pun intended!). The moral of this story is fairly simple: Long gone are the days of relying solely on old, traditional methods for doing catchment area analysis successfully. Now, you’ve got a new and, dare we say, more effective and incredibly comprehensive way to fuel retail site selection decision-making with a higher degree of confidence, thanks to SafeGraph Places data.
#### ICSC 2025 Recap: The Data Arms Race Is On

The most recent ICSC conference in Las Vegas brought together thousands of leaders across retail, commercial real estate, and technology. The conversations this year centered on a few key questions: How do we grow intelligently, build resilience, and better understand consumers at the local level?

We joined the event to connect with customers, partners, and innovators who are reshaping the future of physical retail and real estate.

##### Key Takeaways from ICSC 2025

1. Growth is happening, but with more focus

Retailers are expanding again, but with a much more strategic mindset. Growth plans now revolve around optimizing portfolios, analyzing market potential, and making decisions backed by data. This includes knowing which stores to invest in, which markets to enter, and how to manage risk along the way.

2. Experience is the new anchor tenant

Retail is no longer just about convenience. Brands are doubling down on creating experiences that drive loyalty and foot traffic. Whether it’s in-store events, new service models, or destination retail, understanding how people interact with physical spaces is critical.

3. Real estate is becoming more creative and flexible

Landlords, brokers, and developers are increasingly looking at adaptive reuse and non-traditional tenants. Vacant big-box stores are being transformed into health clinics, co-working hubs, and entertainment venues. Flexibility and creativity are top of mind.

4. Everyone is exploring AI, but it starts with clean data

AI was a recurring theme at ICSC (not surprisingly), but one message stood out. Without clean, normalized data, even the most advanced AI tools will fall short.
Whether it’s powering internal analytics or external platforms, data quality is everything.

##### How SafeGraph Supports Retail and Real Estate Leaders

If there was one unifying theme at ICSC this year, it was the need for clarity about the market, the customer, and the environment around each store. SafeGraph provides that clarity by delivering granular, reliable data on physical places, how people interact with them, and what’s changing on the ground.

Whether you’re opening your 5th location or optimizing your 500th, SafeGraph can help:

- Validate site selection decisions with rich Places data, including accurate POI attributes like brand affiliation, category, NAICS code, open hours, and spatial hierarchy. You can confidently assess what’s actually present at a location, not just what’s on paper.
- Analyze competitive landscapes with precise POI metadata and spatial geometry. Understand nearby brand presence, retail saturation, and business mix to identify underserved areas or high-density competitors.
- Understand how local context impacts performance, from co-tenancy insights to proximity analysis. Use the data to evaluate what surrounds each potential location, whether it’s complementary businesses, anchors, or points of friction.
- Fuel predictive models with clean, machine-readable data, including persistent, normalized schemas, and accurate POIs across millions of locations in the US and globally.

##### Let’s keep the conversation going

We left ICSC energized by the momentum in this space and the openness to smarter, data-first decision-making. If you're interested in learning how SafeGraph can support your team, we’d love to connect.

#### Improving Location Intelligence with Richer Category and Amenity Data

Understanding a place goes beyond its name and address. Businesses, marketers, and data teams rely on places data to power personalization, enhance search, ensure compliance, and drive strategic decisions.
Yet, many datasets provide only high-level industry codes or vague classifications, making it difficult to extract meaningful insights.

SafeGraph’s March 2025 release expands category_tags and introduces new amenity columns, offering richer metadata on places in the Accommodation and Food Services industry. This update improves how businesses identify relevant locations, enabling more precise search, filtering, and analysis.

##### More Granular, Useful Category Tags

SafeGraph’s category tags are now better defined to describe a place’s granular category, cuisine, and common dish/good. This structured approach makes it easier to filter locations, power recommendation engines, and improve search accuracy.

###### New Definition of Category Tags

Category tags now strictly answer the question: "What words or phrases best describe this place or what type of food/goods does it sell?"

Old category tags mixed high-level business types with operational details like "Delivery" or "Reservations." These operational details are now captured separately in new amenity columns, leaving category tags focused on what a business is or sells.

Examples of New Category Tags:

- Granular category: Dive Bar, RV Park
- Cuisine: Mexican Food, Oaxacan Food
- Common dish/good: Bagels, Coffee, Pastries

###### Expanded Coverage and Hierarchy

- 100+ new category tag values added based on common search queries.
- Better fill rate: more places now have category tags for improved coverage.
- Hierarchical structure: category tags now reflect parent-child relationships. For example, Asian Restaurant includes Japanese Food, Korean Food, etc., and Japanese Food includes Sushi.

##### New Amenity Columns for Better Search and Filtering

Many attributes don’t belong in a category field (for example, whether a place has delivery, reservations, or a drive-through).
SafeGraph now captures these details in dedicated amenity columns, making it easier to filter and analyze businesses.

###### New Amenity Columns & Definitions

Each amenity category is designed to answer a specific question:

- Accessibility (“How can I get there and am I able to get around?”) Example: Parking, Wheelchair Accessible Restroom
- Activities (“What can I do while I’m there?”) Example: Karaoke, Pool and Billiards, Trivia
- Amenities (“What things do they offer to use while I’m there?”) Example: Bar On-site, Restroom, WiFi
- Owner Demographic (“What type of people run this place?”) Example: Black Owned, Women Owned, Veteran Owned
- Payment Options (“How can I pay for things there?”) Example: Cash, Credit Card, Debit Card
- Service Options (“What kind of meals do they serve, and how can I get them?”) Example: Accepts Reservations, Breakfast, Delivery, Happy Hour
- Setting (“How will it look, feel, and sound while I’m there?”) Example: Family Friendly, Kid Friendly, Moderate Noise, Touristy, Upscale

##### How Businesses Use This Data

Different industries leverage this data in unique ways. Some key examples:

- Marketing & Personalization: Refine cash-back offers, improve targeted promotions, and personalize recommendations by identifying high-value locations more accurately.
- Search & Discovery: Filter restaurants, hotels, and retail locations by cuisine, service options, or amenities, making searches more relevant and precise for map users.
- Out-of-Home (OOH) Advertising Compliance: Ensure compliance by avoiding restricted locations, while brands can prevent ads from appearing near sensitive businesses or schools.
- Retail & Real Estate: Make smarter expansion and investment decisions by analyzing location-specific details such as demographics, business type, and service offerings.

These updates make SafeGraph’s data more actionable across industries, improving search relevance, reducing manual work, and enabling more precise decision-making.

Want to see how this data can work for you?
Get in touch with SafeGraph to explore the latest updates.

#### Insurance Risk Assessment, Enhance with Accurate POI Data

Some of the brightest minds in actuarial modeling, data science, and machine learning work in insurance. That’s because insurance is deeply dependent on using measurable truths about the world (data) to predict the future. Better predictions about the future lead to more accurate risk predictions. More accurate risk predictions mean more precisely written PIFs (policies in force). More precise PIFs mean fewer high-risk exposures and more profitable policies, which means a more profitable insurance business. There is a reason self-proclaimed data-nerds had been working in insurance for hundreds of years before “data science” was cool, and that’s because in the world of insurance, data is king.

Insuring businesses against fire damage is one of many factors in commercial insurance risk assessment.

##### Insurance risk assessment is a three-dimensional, geospatial problem

There are a lot of factors to consider when insuring a commercial business, and many of those are geospatial. If you are selling flood insurance, for example, knowing whether a business is located near a river with a history of flooding as well as how likely that river is to flood again is critical to writing a risk-balanced policy. Similarly, if you are selling fire insurance, you need to know if a business shares a wall with an open-flame commercial kitchen and bakery. But where does this relevant geospatial data come from?

##### Roof-top geocodes are not enough to assess co-tenancy risk

Sharing a wall with an open-flame commercial kitchen is a great example of co-tenancy risk. Unfortunately, co-tenancy risk is particularly difficult to assess. Unlike primary data about the business (like where is the business located), the policy-holder may not be fully aware of the relevant co-tenancy information (like what are the surrounding businesses).
Most natural disaster-related geographic data (such as knowing whether a given business is located in a flood zone and how much it rains at this location) is collected by governments, changes infrequently, and is available via many GIS solutions. In contrast, point of interest (POI) co-tenancy data changes frequently as businesses open and close or as malls and retail parks are rebuilt or expanded. Accurate, complete, and timely POI co-tenancy data is rarely available in existing GIS solutions. Imagine that you need to assess the co-tenancy fire risk for a business. Assuming you have access to accurate POI co-tenancy data, the first thing you need to verify is whether the insured business is near another business that has a high risk of fire. This requires knowing:

- Accurate category data about all nearby POI in order to identify potential high-risk neighboring businesses; and
- Accurate roof-top geocodes, also known as “building centroids,” for every business in order to calculate the distance between the insured business and the high-risk POI and then assign it a particular risk value.

But roof-top geocodes are not enough. Being across the street from a high-risk POI does not carry the same risk as sharing a wall or being in the same building, even if the distances between roof-top geocodes are the same. Co-tenancy data is, therefore, about understanding the geospatial relationships and hierarchies between businesses. Is this business located inside of another business, such as a Starbucks inside a Kroger grocery store? Is this business within its own stand-alone building or does it share a parent structure (parent building) with other businesses, as is typically the case for indoor or outdoor malls? The risk of a fire spreading from one business to another is much greater when businesses share a wall or a building.
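To make the "shared parent structure" idea concrete, here is a minimal sketch of how co-tenants might be looked up once each record carries a pointer to its enclosing structure. It assumes a `parent_placekey` field in the spirit of the SafeGraph schema; the sample records, the `high_risk` flag, and the function name are hypothetical and exist only for illustration.

```python
def co_tenants(pois, placekey):
    """Return the POIs that share a parent structure with the given POI.

    A POI with no parent (a stand-alone building) has no co-tenants.
    """
    by_key = {poi["placekey"]: poi for poi in pois}
    parent = by_key[placekey].get("parent_placekey")
    if parent is None:
        return []
    return [
        poi for poi in pois
        if poi.get("parent_placekey") == parent and poi["placekey"] != placekey
    ]


# Hypothetical records: one strip mall with three tenants and one stand-alone cafe.
sample_pois = [
    {"placekey": "mall-1", "parent_placekey": None, "high_risk": False},
    {"placekey": "gym-1", "parent_placekey": "mall-1", "high_risk": False},
    {"placekey": "pizza-1", "parent_placekey": "mall-1", "high_risk": True},
    {"placekey": "shoes-1", "parent_placekey": "mall-1", "high_risk": False},
    {"placekey": "cafe-1", "parent_placekey": None, "high_risk": True},
]

gym_neighbors = co_tenants(sample_pois, "gym-1")
high_risk_neighbors = [poi["placekey"] for poi in gym_neighbors if poi["high_risk"]]
print(high_risk_neighbors)
```

The stand-alone cafe is high risk but is not a co-tenant of the gym, which is exactly the distinction that distance between roof-top geocodes alone cannot capture.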
##### SafeGraph Places contains rich geospatial co-tenancy data

SafeGraph Places is the most comprehensive and accurate dataset about commercial points of interest, covering 6MM+ POI in the United States and Canada. SafeGraph Places was built as a geospatial POI dataset from the very start. It goes beyond simply providing essential metadata (e.g., address, category, corporate branding) to give you access to rich geospatial data, including precise roof-top geocodes, polygon building footprints (2-D shapes), as well as co-tenancy information, such as business and building "parent-child" relationships (i.e., is this POI inside another POI and/or does this building structure contain multiple businesses?).

The key fields related to co-tenancy data are the parent_placekey and the polygon_class columns.

Let’s apply this to the real world and look at how you can use this data effectively to assess the co-tenancy fire risk for a chain of fitness centers in California.

##### Case Study: What are high-risk locations for fire damage?

Let’s consider the portfolio of Anytime Fitness (SG_BRAND_6daa255524fe5ac244c3bed9cfbde479) locations in California. At the time of first publication (February 2020), Anytime Fitness had 124 locations across California. Now, imagine we are tasked with evaluating a commercial insurance policy for these 124 business locations, with the risk of fire damage as a key consideration factor for this policy. It’s important to note that the risk of any business sustaining fire damage is low (fires are rare), and, moreover, not all POI are equally at risk of fire damage. A clothing store is much less likely to experience an accidental fire than a commercial kitchen or bakery, because the latter have open flames and hot ovens burning all day every day. We need to know what other businesses are in close proximity to each Anytime Fitness location in order to accurately assess the risk of fire damage.
Similarly, as a mitigating factor to minimize the potential risk of fire damage, we want to know how close our locations are to fire stations. After all, in the unfortunate event of a fire, the number of seconds and minutes until a fire truck arrives has a significant impact on how much fire damage is incurred. Therefore, locations that are closer to fire stations have less risk of fire damage than locations far away from fire stations.

For our model, we defined high-risk POI for fires as any POI belonging to the following NAICS codes:

- 722511 - Full-Service Restaurants
- 722513 - Limited-Service Restaurants
- 311811 - Bakeries and Tortilla Manufacturing (Retail Bakeries)

Our simple fire risk model consists of three components:

- log_num_co_tentant_hr_poi is the (natural log of the) number of high-risk co-tenants. We use natural log to capture the diminishing additional risk of N+1 high-risk co-tenants as N gets large: 0 vs 1 high-risk co-tenant has a greater impact than 10 vs 11 high-risk co-tenants.
- dist_to_nearest_hr_poi is the distance to the nearest high-risk POI (in meters).
- dist_to_nearest_fire_station is the distance to the nearest fire station (in meters).

We combined these features into a simple model to calculate a risk score:

[Equation image. Caption: When undergoing an insurance risk assessment, use this simple model to calculate fire risk.]

Where:

- w1 is the weight given to having high-risk co-tenants.
- w2 is the weight given to the drop-off of risk based on proximity to a high-risk POI. We use the reciprocal function (1/x) to model the diminishing relevance of distance as distance gets large. A distance of 10 meters vs 20 meters from a high-risk POI has a greater impact than the difference between 1010 meters and 1020 meters.
- w3 is the weight given to the drop-off of benefit as you move farther away from a fire station.
Again we use the reciprocal function for the same reason described for w2.

The larger the risk score, the higher the risk.

Remember, this is only a simple model for assessing fire risk. In practice, a risk model may account for many other factors, including building materials, weather data, appliance data, foot-traffic data, etc. The weight given to each of these variables can be determined by fitting a model on many years of historical claims data. Nonetheless, despite its simplicity, our model reveals interesting insights about fire risk when applied to the Anytime Fitness locations in California.

Figure 1. This histogram shows the distribution of insurance risk assessment scores based on our model. Most locations are clustered around a neutral risk assessment score of 0. Large positive risk assessment scores (skewed far right) represent unusually high-risk locations. Large negative risk assessment scores (skewed far left) represent unusually low-risk places. (see Figure 2)

Figure 1 shows a surprisingly symmetric histogram of risk scores for the 124 Anytime Fitness locations. Most locations are clustered around a neutral risk score of 0. However, a handful of Anytime Fitness locations are on the far left tail of this distribution, which means these locations have the lowest risk for potential fire damage. Let’s take a closer look at an example of what this means in a real world scenario.

Figure 2. The green polygon indicates an Anytime Fitness location (Placekey: 224-222@5v6-gng-d35) with very low risk of burning to the ground in a fire, according to our model. The red arrow points to the nearest fire station, which dramatically lowers the risk score.

Figure 2 shows an Anytime Fitness with one of the lowest risk scores (-134) in California. What makes it so low risk? First, it has zero same-building co-tenants, which also means it has zero high-risk co-tenants.
Second, it’s not near any high-risk POI; it shares a parking lot with a dentist office and a child-care center (both of which are low-risk POI for fire). The nearest high-risk POI is a restaurant located about 72 meters away and across the street. Being that far from a high-risk POI is unusual for Anytime Fitness locations (median = 51 meters). Finally, the red arrow indicates the nearest fire station, which is very close. Therefore, considering its close proximity to a fire station and the absence of nearby or co-tenant high-risk POI, our model identified this as a very low risk location.

What about the other end of the spectrum? Figure 3 shows the Anytime Fitness location with the highest risk score (+153) in California. This is because it’s located within a strip mall that contains six other POI that are high risk for fire. These are all considered co-tenants because the strip mall is one contiguous structure (in the SafeGraph schema, they all share the same parent_placekey). To throw gas on the fire (pun intended), the nearest fire station is 2,800 meters away, much farther than the typical distance (median = 1,124 meters).

Figure 3. The green polygon indicates the Anytime Fitness location (Placekey: 222-225@5z5-s5v-p5f) with the highest risk of fire damage in California, according to our model. It is high-risk because it is located in a strip mall that contains six high-risk POI within the same contiguous structure (i.e., all share the same parent_placekey). Moreover, this location is over 2,800 meters away from the nearest fire station, which is much farther than typical.

##### Better data leads to better insurance risk assessment

This article demonstrates how SafeGraph Places point-of-interest (POI) data makes it possible to build more accurate commercial insurance risk models.
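The three-component score described in the case study can be sketched in a few lines of code. This is an illustrative reconstruction from the prose description only, not the formula from the original figure or notebook. The assumptions are: the co-tenant term uses `log(1 + n)` so that zero co-tenants is well defined, the fire-station term is subtracted because proximity to a station lowers risk, and the weights default to placeholder values of 1.0 rather than fitted values. It therefore will not reproduce the published scores of -134 and +153, and the distances marked hypothetical below are not given in the article.

```python
import math


def fire_risk_score(num_hr_co_tenants, dist_to_nearest_hr_poi,
                    dist_to_nearest_fire_station, w1=1.0, w2=1.0, w3=1.0):
    """Illustrative fire risk score; larger means riskier.

    Distances are in meters and must be positive. Weights are placeholders.
    """
    if dist_to_nearest_hr_poi <= 0 or dist_to_nearest_fire_station <= 0:
        raise ValueError("distances must be positive")
    # Assumption: log(1 + n) keeps the term finite when there are no co-tenants.
    co_tenant_term = w1 * math.log1p(num_hr_co_tenants)
    # Reciprocal terms: the effect of distance fades as distance grows.
    proximity_term = w2 * (1.0 / dist_to_nearest_hr_poi)
    # Assumption: being near a fire station reduces risk, so this term is subtracted.
    fire_station_term = w3 * (1.0 / dist_to_nearest_fire_station)
    return co_tenant_term + proximity_term - fire_station_term


# Low-risk example: no high-risk co-tenants, nearest high-risk POI 72 m away,
# fire station "very close" (50 m is a hypothetical value).
low_score = fire_risk_score(0, 72.0, 50.0)

# High-risk example: six high-risk co-tenants, fire station 2,800 m away;
# 5 m to the nearest high-risk POI is a hypothetical value for a shared building.
high_score = fire_risk_score(6, 5.0, 2800.0)

print(low_score < high_score)
```

In practice the weights would be fitted to historical claims data, as the article notes, and the features would likely be standardized before being combined.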
Our simple model of fire risk requires the following:

- Roof-top geocodes (building centroids): Knowing exactly where businesses are located in order to calculate accurate distances.
- Spatial hierarchy: Knowing whether a POI is inside of a parent structure and/or sharing a building with other POI.
- Category information: Definitively knowing which businesses are high-risk POI.
- Completeness (high recall): We must have data on every high-risk POI; missing data will create inaccuracies in risk assessment scores.

Traditionally, it has been hard to find accurate data on these attributes for all places. We are changing that at SafeGraph. Our number one focus is to make this data as complete, accurate, and accessible as possible, helping you build better models and more accurate predictions.

Risk modeling is complex and has many variables to consider. Here are some other ideas for how SafeGraph data can enhance your risk-modeling:

- Proximity to POI associated with increased crime or unsavory places.
- Proximity to police stations.
- Square-footage (as a fraction of total enclosing structure).
- Proximity to schools or other types of protected institutions.
- Use polygon_class and geospatial joins to get even more detailed on co-tenancy.

##### Try It For Yourself

Want to try this model out on more locations? Plug your data into the accompanying Google Colab Notebook to replicate the results of this blog post.

#### Insurance Risk Modeling with Places Data

Property risk profiles are increasingly data-driven. With the abundance of data available, insurers are able to develop more accurate property risk models than ever before. Critical elements to risk modeling, like environmental characteristics, proximity to other structures, and historical risk events, all contribute to a property’s insurance profile. But to stay competitive in an increasingly saturated insurance market, insurers must constantly improve their risk models and policy pricing to be as accurate and precise as possible.
To give them an edge, industry leaders are turning to places data. Places data, like points of interest (POIs) and building footprints, is fundamental to insurance risk modeling for a few reasons. In this blog post, we break down what insurers need to know in today’s competitive market.

Ready to dive into data-driven insurance risk assessment? Check out our notebook for measuring co-tenant risk.

##### Pinpointing the Property

Accurate POI data fuels insurance risk models with up-to-date information on which businesses are near a specific property.

POI data provides a source of truth for a property’s location. An accurate and precise data point for a property’s physical location is essential for measuring the environmental (like flood or wildfire risk) and commercial (such as the type of business being conducted at that location) factors that may impact its insurance profile. But POI data is also crucial for conducting proximity analysis to identify nearby places that may pose additional risk to a property.

SafeGraph Places data includes the geographic location and business detail attribution required to fully analyze a particular property.

##### Measuring Proximity and Co-Tenancy

Building footprints are a key part of insurance risk modeling. While POI data shows where a property is, truly understanding its location and what is nearby requires an accurate polygonal representation of that property’s structure. When looking only at point data, a property may appear to be outside of a flood zone. However, when analyzing that POI’s geometry, you may discover that in reality a part of the building extends into the zone. Similarly, a shoe store may appear to be located in a regular shopping center, but actually share a wall with a fireworks store next door. These discoveries will impact insurance risk models and policy pricing for those properties.
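The flood-zone example above comes down to the difference between testing a single point and testing a footprint. The sketch below illustrates that difference using axis-aligned rectangles as a deliberate simplification: real building footprints and flood zones are arbitrary polygons and would normally be handled with a geometry library. The `Box` class, its coordinates, and the variable names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle used as a stand-in for a polygon."""
    min_x: float
    min_y: float
    max_x: float
    max_y: float

    def centroid(self):
        """Center point of the rectangle."""
        return ((self.min_x + self.max_x) / 2, (self.min_y + self.max_y) / 2)

    def contains_point(self, x, y):
        """True if the point lies inside or on the edge of the rectangle."""
        return self.min_x <= x <= self.max_x and self.min_y <= y <= self.max_y

    def intersects(self, other):
        """True if the two rectangles overlap or touch."""
        return (self.min_x <= other.max_x and other.min_x <= self.max_x
                and self.min_y <= other.max_y and other.min_y <= self.max_y)


# Hypothetical flood zone and building footprint in arbitrary planar units.
flood_zone = Box(0.0, 0.0, 10.0, 10.0)
building = Box(8.0, 2.0, 16.0, 6.0)

# Point-only check: the building's centroid falls outside the zone.
point_in_zone = flood_zone.contains_point(*building.centroid())

# Footprint check: part of the building extends into the zone.
footprint_in_zone = flood_zone.intersects(building)

print(point_in_zone, footprint_in_zone)
```

A model that only sees the centroid would treat this property as outside the zone, while the footprint check flags the exposure.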
Shopping centers have complex geometry that insurers need to understand completely before writing a policy.

For non-residential properties in particular, accurate and precise building footprints can make the difference between an effective risk model and one that actually hurts the insurer. With an abundance of malls, office complexes, and shopping centers, co-tenancy analysis is quickly becoming one of the most critical components of insurance risk modeling. Co-tenancy analysis enables insurers and risk modelers to see which businesses are located within the same parent polygon. SafeGraph Geometry data provides parent and child POI polygons, along with attribution for identifying the spatial relationship between them. The polygon_class and enclosed fields simplify this process, making it easy to understand the spatial hierarchy of multiple POIs.

##### Precision, Accuracy, and Transparency

SafeGraph’s sole focus is data. That means we do not build analytics or reporting tools. Instead we focus on curating and delivering high-precision facts about the world so our customers can make informed decisions based on accurate information. Our rigorous machine learning processes make us expert de-dupers so data is delivered clean and accurate for insurers to ingest into their risk models. With monthly updates, our data is a source of truth for physical places that insurers can rely on. Frequently updated places data prevents risk models from becoming outdated and inaccurate, which is especially important in today’s pandemic when many businesses are moving or closing at unprecedented rates.

But SafeGraph’s main differentiator is our transparency. We communicate everything, even our faults, with our customers. Our technical docs site is always up-to-date and open to provide the public with schemas and statistics for our datasets. Every month, we publish our release notes that openly communicate any bugs and our plans for fixing them.
Even within the data itself, we indicate how we created a particular record, so you can decide how to factor that into your models. Reliable data for risk modeling is critical for insurers looking to move faster without sacrificing accuracy and precision. Under- or over-priced policies can damage an insurer, either by opening them up to potential financial loss or pushing their customers towards the competition. But with SafeGraph Places data, insurance risk models can accurately and precisely measure how locations impact a property’s risk exposure, so policies can be priced accordingly.

#### Introducing Expanded SafeGraph Category Tags: More Granular Than NAICS Codes

You might already be familiar with the Category Tags column in SafeGraph Places, which for the past few years has provided more detailed info than NAICS codes about what is located at a specific POI. But until now, Category Tags has been largely limited to covering restaurant types. For example, while we classify restaurant POIs under “Full-Service Restaurant” or NAICS 722511, we also tag each location with other relevant information, like “Pizza” or “Italian Food.” Data scientists use this extra detail to develop more accurate models, create more informative consumer-facing mapping applications, and better understand market landscapes. But the value in those details is not unique to restaurants. And that’s why, in our April 2022 release, we’ve expanded Category Tags beyond restaurants to cover other types of POIs that have many subcategories falling under the same NAICS code.

SafeGraph users have long relied on restaurant Category Tags to differentiate between different cuisine types. Now we've expanded our Category Tag coverage to more NAICS codes to provide the same level of granularity in categorization.

##### What are Category Tags?
NAICS codes are an essential industry standard that must be included in POI data to make it valuable across use cases, but we have increasingly heard of a need for greater granularity in classifying places within NAICS categories. It’s important to remember that NAICS codes were created with government entities in mind as the end users, so they were developed with specific methodology and classifications mainly used by government organizations. Today, data scientists exist across industries, and those who use POI data often need to identify businesses or other places at a more granular level than NAICS codes allow. We initially built Category Tags because of these shortcomings in standardized category classification systems, like NAICS codes. With these larger standards, POIs must fit into a single box, and can only be so specific. While still necessary for larger-scale analyses, NAICS codes are not always updated to reflect new business types, and do not offer much flexibility in granularity. Additionally, NAICS codes often have obscure descriptions that align to government classification needs, and are not very intuitive for consumer-facing search or mapping applications. Using the restaurant example again, consumers are more likely to search for “Mexican food” than “Full-Service Restaurant” in a mapping app as they look for a place to eat dinner.

##### Updates to Category Tags in the April 2022 SafeGraph Release

Restaurants are not the only broad NAICS category, and we’ve had feedback from data scientists that Category Tags would be useful for POIs in other industries as well. In response, we’ve now expanded our Category Tags to other NAICS codes that often require more granularity to analyze properly, such as healthcare and retail. For example, healthcare POIs can be classified by medical specialty, such as cardiology or dermatology.
These more granular categories are not accessible using NAICS codes like “Offices of All Other Miscellaneous Health Practitioners” (621399). You can see a full list of available Category Tags by NAICS code here. We'll continue to improve these tags in the months to come and plan to apply them to more and more brands. Because we want to expand the use of existing tags across more NAICS codes, please share your feedback early and often; we are eager to hear how these new tags are being put to use. Explore a sample of Category Tags in this interactive map.

##### Top Use Cases for Category Tags

Category Tags are valuable anytime POI data is being used to inform strategies, develop models, or populate mapping applications. Here are some of the most popular use cases we’ve heard for the granular classification and analysis enabled by Category Tags:

- Mapping: A consumer-facing mapping application that wants to provide intuitive search when users type in a place category (e.g., coffee, pizza)
- Site selection: A retailer performing site selection who wants to see what types of goods different stores sell and how that impacts the competitive landscape (e.g., party supplies, outdoor gear)
- Trade area analysis: A healthcare organization that is building a catchment model to understand access to different types of medical care (e.g., audiology or pediatrics)

Category Tags provide additional metadata for POIs that is not available from NAICS codes. Schedule a demo to see how Category Tags enable more granular analytics and informative mapping.

#### Introducing Point POIs - Data for ATMs, Transit Stops, & More

Since we first launched point POIs, our customers have begged for more categories and geographies. Now, we offer hundreds of thousands of new point POIs to facilitate your investment research, site selection, trade area analysis, and more. Getting data on non-traditional places (locations without defined polygon boundaries) used to be near impossible.
To address this gap, in July 2021 we introduced point-only POIs. Point-only POIs are places without any associated geometry that are still relevant to organizations and their analytics, such as ATMs, kiosks, and transit stops. Our July release included 182,000 point POIs, and since then we’ve continued to build our database. Our September release of Core Places includes 746,953 POIs in the US, Canada, and Great Britain, providing users with a more complete view of the world around them and enhancing their critical analyses.

Transit stops are strategically important POIs that often lack traditional geometry. Point POI data provides coordinates for these and other non-polygonal locations to enrich analysis with every detail needed.

##### What are point POIs?

Non-traditional POI locations such as ATMs, electric vehicle charging stations, kiosks, transit stops, and vending machines may lack structural boundaries, but are mapped using x and y coordinates in the point POI dataset. Learn more about them in our data schema.

##### How can I get point POIs?

SafeGraph’s new point POIs are an extension of SafeGraph Places data, which provides baseline information for every record in the product suite, including location name, address, lat/long, category, brand, and more. This means that customers receive every column in SafeGraph's base schema for point-only POI rows, which can easily be appended to Places in order to enhance use cases including investment research, site selection, trade area analysis, urban planning, and network planning. This is just a fraction of the potential use cases for point POIs; the opportunities to use them for spatial analysis are endless.

Interested in adding point POIs to your current POI dataset? Contact your SafeGraph Customer Success Manager, or reach out to get a quote.
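Because point-only rows share the base columns with the rest of Places, appending them can be as simple as concatenating two tables. The sketch below illustrates that idea in plain Python. The column names, the `has_polygon` flag, the `append_point_pois` helper, and the sample rows are all hypothetical and chosen for illustration; they are not SafeGraph's actual schema or tooling.

```python
# Illustrative sketch only: column names and sample rows are hypothetical,
# not SafeGraph's actual schema.
BASE_COLUMNS = ("location_name", "street_address", "latitude", "longitude",
                "category", "brand")


def append_point_pois(places, point_pois):
    """Combine polygon-backed places with point-only POIs into one list.

    Each output row carries a `has_polygon` flag so downstream analysis can
    tell the two kinds of record apart.
    """
    combined = []
    for rows, has_polygon in ((places, True), (point_pois, False)):
        for row in rows:
            # Both inputs are expected to share the same base columns.
            missing = [col for col in BASE_COLUMNS if col not in row]
            if missing:
                raise ValueError(f"row is missing base columns: {missing}")
            combined.append({**row, "has_polygon": has_polygon})
    return combined


places = [
    {"location_name": "Example Coffee", "street_address": "1 Main St",
     "latitude": 37.7749, "longitude": -122.4194,
     "category": "Coffee Shop", "brand": None},
]
point_pois = [
    {"location_name": "Example Bank ATM", "street_address": "3 Main St",
     "latitude": 37.7751, "longitude": -122.4190,
     "category": "ATM", "brand": "Example Bank"},
    {"location_name": "Main St & 1st Ave", "street_address": "Main St & 1st Ave",
     "latitude": 37.7753, "longitude": -122.4188,
     "category": "Transit Stop", "brand": None},
]

combined = append_point_pois(places, point_pois)
point_only = [row["location_name"] for row in combined if not row["has_polygon"]]
print(point_only)
```

Keeping an explicit flag on each row is one possible design choice: it lets a trade area or network planning analysis include ATMs and transit stops while still excluding them from any step that needs a polygon.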
#### Introducing SafeGraph Places: The Source of Truth about Physical Places

##### Our goal is to be the source of truth about physical places in the world

It turns out that getting basic truthful information about a place is really hard. It is hard to even find a good source of which stores are where. We know … because when we started SafeGraph we tried to buy it. We evaluated over 20 vendors and none of them were high quality. High quality places data needs to:

1. Have accurate polygons (over random centroids)
2. Eliminate noise (e.g., PO Boxes)
3. Delete duplicates and inaccurate (or outdated) places

Enter SafeGraph Places. In our current version (v 1.1), SafeGraph Places consists of almost every place in the US where one can spend money. We’re working on having every place in the US where you can spend time (including office buildings, homes, parks, schools, etc.). And eventually, our goal is to be able to describe every place in the world.

SafeGraph Places 1.1 is curated:

- 5MM+ points of interest (POI) covering all places someone can spend money (including all key brands as well as “mom & pop” shops)
- Accurate polygons for every place
- Additional critical information such as major brands (McDonald’s, Starbucks, etc.), name (Tampa Marriott Westshore, Dominique Ansel Bakery, etc.), hours of operation, street address, and category data

Big companies are basing their business on SafeGraph Places. One of our early customers is one of the largest mobile carriers in the US. They run their location stack on top of SafeGraph Places. Before choosing SafeGraph, they evaluated a dozen vendors over a period of many months, going through the data programmatically and by hand.

##### SafeGraph algorithmically combines the best of all sources

There are thousands of sources of data about a place. Our challenge is to merge this data together and use the best attributes of each source.
SafeGraph ingests data from thousands of diverse sources that together represent billions of discrete pieces of information about places of interest. Our system programmatically ingests, compares, validates, and merges data, and draws precise polygons. We leverage unique, advanced truth data to continually improve the accuracy of Places, ultimately resulting in a map of places of interest that best represents truth.

##### Getting to the truth requires crazy machine learning

Simple things, like merging data from different sources, turn out to not be so simple.

Semantic brand detection and hierarchy is important. We distinguish true brands from POIs with merely similar names (e.g. Lee’s Sandwiches vs. Lee’s Deli). Understanding hierarchical relationships like native substores (e.g. Walmart Vision Center) vs. foreign substores (e.g. CVS inside Target) enables us to better filter or keep POIs.

Spatial transformation and interpolation is important (and really hard). We intelligently partition an overall building shape into substores (think of a strip mall). We also strive to understand spatial relationships of substores within malls, stadiums, airports, and more.

##### You cannot have great location data without great polygons

##### Getting precise maps is really, really hard

To really understand a place, you need to know its dimensions or shape (in geospatial parlance, that’s called a “polygon”). Essentially, it is a map that describes a place. SafeGraph has detailed polygons for all 5+ million places we currently track (places in the US where you can spend money).

##### Centroids & radiuses have baked-in errors

Relying on centroids can significantly reduce accuracy. They overlap, have different radiuses that are hard to calculate, and the “centroid” is usually not at the center.

…but polygons can represent the truth

##### Focus on making sure we have ALL the stores

Traditional POI vendors usually only have 80% of a given brand’s stores.
We ensure that we have close to 100% of every brand’s stores in SafeGraph Places. We’ve also focused on the long tail of “Mom and Pop” stores. In addition, we go to great lengths to keep our POI data fresh as stores open, move, and close.

##### SafeGraph Places has been our secret … until now

While we have great customers for SafeGraph Places, we are only now marketing it broadly. Our initial customers include some of the most advanced geospatial companies (and they helped us make the product better). Now we are opening up SafeGraph Places to a wider set of customers.

Introducing SafeGraph Places: The Source of Truth about Physical Places. Our goal is to be the source of truth about physical places in the world. https://t.co/tByUls8p0k - Auren Hoffman (@auren), July 25, 2018

#### It’s Our Moral Obligation to Make Data More Accessible

##### Key Takeaways

- Unlocking institutional data can significantly accelerate research, healthcare innovation, and economic progress.
- Longitudinal, real-world data produces stronger insights than survey-based research.
- Privacy-safe technologies make secure data access possible without exposing individuals.
- The real value of data emerges when multiple datasets are securely connected.
- Expanding responsible data access is both an innovation opportunity and a societal responsibility.

Most of the world’s data is sitting on a shelf, being used in a very narrow domain. This data, if properly activated, could solve some of the world’s biggest problems and lead to more health, happiness, and love for society. We could use this data to uncover some of society’s biggest secrets.

The data is there. We just need to use it.

We need the courage to harness the world’s data for good. We have a MORAL OBLIGATION to get this data into the hands of millions of innovators. Not doing so is a true failing of society.
This data can save hundreds of millions of lives and help all of humanity … which means not using it hastens the death of hundreds of millions of people.

But there are hundreds of special interest groups fighting against it. They have good intentions. They know this data can make people’s lives better. But these special interest groups fight against making data more accessible to either enhance their profit, enhance their power, or just protect the status quo.

Like Marc Andreessen’s piece, It’s Time to Build, this piece is a full-throated argument to massively increase the accessibility of data. And we need to do it now.

##### The Deep Truths of Humanity are at Our Fingertips

For decades, there have been very powerful, sensitive datasets completely unavailable to the research and business communities because the institutions who own them have been unwilling to share or sell. This unwillingness has been largely driven by a concern for people’s privacy, which in a vacuum makes sense.

But what if we could have our cake and eat it too?

Large societal institutions like the government and big tech companies have tons and tons of data… and 99.999% of it isn’t accessible to the millions of brilliant researchers, engineers, and entrepreneurs out there. We’re talking about data that can fundamentally change the trajectory of society. And because they have a monopoly on the data, they monopolize innovation and slow down technological progress.

There’s an obesity epidemic, economic inequality is still very high, wages are stagnant, we’re in the middle of an opioid crisis, average lifespan is not increasing, and public policy is responding very incrementally. The human condition is not improving fast enough, yet we’ve somehow convinced ourselves to become risk-averse at a time when we need to be daring.

We need courage.

But the solution is right in front of us. It’s time for a step-function change in progress and it starts with making data more accessible.
These institutions aren’t inherently wrong for being cautious with these datasets: people’s privacy is at stake and that’s important.

But the game is not zero sum. Protecting personal privacy and developing next-generation technology and research are both essential, and they are not mutually exclusive.

It doesn’t have to be one or the other – we can choose to make data more accessible and protect people’s privacy.

Before we dive in, let’s clarify that making data accessible is not the same as making it free. It’s okay to charge for data (we do that at SafeGraph), but it’s not okay to let it go to waste.

We as a society have a moral obligation to release data (for free or at a reasonable price) in a privacy-safe way.

We need open information to power innovation. Data should be an open platform, not a trade secret. - Auren Hoffman (@auren), November 21, 2017

##### Research Is Artificially Constrained

##### Tax data lives in the hands of a few

The IRS has income data on hundreds of millions of people over decades – including the incomes of people’s parents and grandparents. It is one of the largest and most comprehensive longitudinal studies in history.

However, only a select few researchers have access to the data.

The IRS is rightly concerned about people’s privacy. This is super sensitive data. But what if we could give out access to the data while completely protecting our privacy? It is possible (read on). We can allow people to ask questions of the data without seeing the underlying sensitive data. We can do it. We just need the courage to work on it. And we can give every researcher in the world access to one of the most important longitudinal studies the world has ever seen.

Source: xkcd

Raj Chetty is famous. He’s a Professor of Economics at Harvard. He won the John Bates Clark Medal. His studies have been cited by thousands of articles. He’s amazing.
He is one of roughly four researchers who have access to the IRS data.

By analyzing the tax returns, Chetty and his colleagues were able to publish many monumental longitudinal studies. In one example, Chetty and his colleagues analyzed upward mobility across generations throughout the U.S. They found that upward mobility was heavily influenced by where one grew up.

His finding: upward mobility exists – it’s just not evenly distributed.

Other amazing research Chetty has been able to conduct by having access to de-identified administrative data includes:

- Income Segregation and Intergenerational Mobility Across Colleges in the United States: statistics on parents’ incomes and students’ earnings outcomes for each college in the U.S. using de-identified data from tax records
- Race and Economic Opportunity in the United States: An Intergenerational Perspective: a study of the sources of racial disparities in income using anonymized longitudinal data covering nearly the entire U.S. population from 1989-2015
- The Fading American Dream: Trends in Absolute Income Mobility Since 1940: estimates of rates of “absolute income mobility” – the fraction of children who earn more than their parents – by combining historical data from Census and CPS cross-sections with panel data for recent birth cohorts from de-identified tax records

This type of work has a huge impact on public policy, but the data is only available to a number of people you can count on one hand. This doesn't make any sense.

But how did Chetty get access to this data? He had to apply through a rigorous RFP by the IRS. I’m sure it also helps that he is an esteemed academic from prestigious institutions. Therein lies the problem. You shouldn’t have to have a John Bates Clark Medal to get access to this data. We should make this data available to EVERY innovator.

Imagine if there were a million other researchers working with the same data. Society would benefit enormously.
We could better understand what types of social programs are working, where to best allocate resources and how to help humanity. Data accessibility is the cornerstone of this innovation in data and data-as-a-service.

So let’s open up access to this data in a privacy-safe way.

By the way, this data doesn’t have to be free. I’m sure there are lots of costs with administering a dataset this size in a privacy compliant way. It’s totally okay for the IRS to charge money to recoup those costs. There are still hundreds of thousands of researchers that could afford a reasonable data access fee.

##### We need access to real data

Currently, most researchers work with survey data, which is not accurate, consistent, or large enough. The real datasets are over 1,000 times the size of survey data. And the real data produces studies that are longitudinal: you can follow the progression of individuals over the years, whereas survey data is usually about a moment in time. Real data reveals what actually happened. Survey data reveals only what people remember happened.

I recently discussed with Susan Athey on World of DaaS how Raj Chetty had a famous paper on government experiments in the 1980s. The government moved low income families to a higher income area and paid for their housing. The findings did not show any improvement to the people’s life situation so the experiment was labeled a failure. But when Chetty ran the numbers years later, he found that their kids actually did improve greatly from it.
Young children who were part of this relocation program had a higher rate of college attendance and higher overall earnings.

This is going to sound obvious but it must be said: longitudinal studies built on real data with high response rates and low attrition produce much better results than studies built on survey data.

##### Healthcare data can seriously change the game

The Centers for Medicare & Medicaid Services (CMS) and the Department of Veterans Affairs (VA) have a LOT of data about people’s physical wellbeing. In fact, there are thousands of healthcare datasets at the federal, state, and local governments. Almost all these datasets are not accessible, again for privacy reasons.

Before we proceed further, we must acknowledge that the CMS does share a lot of statistics about people. The IRS does this as well. That’s not the main point here. In order to progress society, researchers don’t just need statistics. But they don’t even need full access to the underlying data either. They do need to be able to ask questions of the data about the care people receive over long periods of time.

The same is true for healthcare providers and insurance companies. Lots of data in very few hands.

And it makes sense. There are lots of regulations to make sure nobody’s health situation can be identified. HIPAA violations exist for a reason. Health data is extremely private (and it should be). But like tax data, there is a way to make asking questions of the data available while still protecting people’s privacy. We don’t need to make a choice between progress and privacy – we CAN DO BOTH.

And if there’s a privacy-safe way to make it accessible (more on this later), then why don’t we do it today? The ramifications are infinite.
Here are a few obvious ones:

- We could more cost effectively provide care while reducing medical errors
- Data driven diagnostics and analytics could result in the best treatments prescribed quickly
- Research analysts could identify regional patterns in public health, healthcare costs, and quality of care
- We could measure the efficacy of specific drugs when mapped against certain health problems
- We could even solve some of the most incurable diseases (like cancer)

Here’s a chart from the AEI that’s made the rounds on the interwebs over the past 5 years – healthcare costs have been outpacing inflation significantly:

If opening up access to micro-healthcare data creates an opportunity to reduce our healthcare costs while upleveling the quality of care, then shouldn’t we strive to pursue it?

Isn’t it our duty?

It is our obligation to make data accessible. In fact, it is a moral obligation that we should not shun.

##### Startup innovation is at stake

Big tech companies like Google, Amazon, and Apple also have a LOT of data about us. No surprise there. Pretty much all of it stays within their ecosystem.

These companies have some of the smartest people working on some of the hardest problems we have today. But at the same time, there are millions of other smart people who could solve very challenging problems if they had access to this data.

By hoarding the data, these tech companies significantly slow innovation. Not selling (or sharing) their consumer data is morally wrong. We should build a world where access to data, to knowledge and history, is made available to all potential innovators.

The world already has democratized access to compute power. Today that’s available to anyone. Open access to compute power (via AWS, Microsoft Azure, Google Compute, and more) has massively accelerated innovation. And no, it’s not free. But it is available to anyone that wants to pay for it.

That’s the future we need for data. It doesn’t all have to be free, but it should be accessible to all.
How many companies (and frankly, industries) exist today because compute became accessible to all? Well, expect 10x that impact if data became accessible.

Imagine the innovation.

##### Joining datasets makes them more valuable

Inherently, data has no value. It’s the information that can be derived from data that is valuable, and that ultimately dictates the value of the data. Combining datasets opens up new types of information, thereby making each dataset more valuable.

One of the big ways that data becomes useful is when it is tied to other data. The reason for this is simple: data is only as useful as the questions it can help answer. Joining, linking, and graphing datasets together allows one to ask more and different kinds of questions. - Auren Hoffman (@auren), June 14, 2019

We won’t solve most of society’s problems by only unlocking one or two datasets (although it will help a lot). We need a movement to make all datasets accessible, and to then enable us to join data from different datasets to draw deeper insights. Making just the IRS data or Google’s data accessible will help, but will not enable all the insights we need. Joining multiple datasets is where the power lies.

Travis May, CEO of Datavant, wrote a piece on how healthcare data is mostly fragmented (full disclosure: I am an investor in Datavant). But when you combine data about prescriptions, doctor’s visits, hospital check-ins, and lab tests, you get a clearer picture.

“All of these disparate data points have limited utility when analyzed individually – it is when they are brought together that these data points form a full picture of the patient’s health. Each additional piece of data that can be linked together has the potential to exponentially increase the value of the data set for understanding key public health questions.” - Travis May

What if we could combine pharmaceutical data with people’s physical records from their doctors, their hospital visits data, and the wellness data from their Apple watches?
The leaps in pharmacology and physiology would be huge!

Imagine if we could combine anonymized IRS data with the Medicaid and Medicare data. By empirically tying people’s financial wellbeing to their physical wellbeing, we could see all sorts of new programs. We could fund programs to direct public health initiatives right to the people who need them most.

Advancement in public health and policy alone would be mind boggling. There won’t be a question of how to best allocate resources. The data will all be there.

There should also be an easy way to join these datasets. Similar to how the Placekey is a common identifier for every physical place, we need to have encrypted identifiers for people data as well. Identifiers should be SIMPLE:

- Storable. You should be able to store the ID offline. For instance, I know my SSN and my payroll system stores my SSN.
- Immutable. It should not change over time. An SSN on a person is usually the same from birth until death (except if you enter the witness protection program).
- Meticulous (high precision). The same entity in two different systems should resolve to the same ID. It should be very difficult for someone to claim they have a different SSN.
- Portable. I can easily move my SSN from one payroll system to another.
- Low-cost. The ID needs to be cheap (or even free). If it is too expensive, the transaction costs will make it hard to use in many situations. The SSN itself has no cost.
- Established (high recall). It needs to cover almost all of its subjects. An SSN covers basically every American taxpayer (and more).

It sounds scary to combine this data. It sounds like something that could hurt privacy. But what if these datasets could be joined without anyone having access to the underlying data? Each dataset would still be stored in a decentralized way, but questions could be asked across dozens of datasets. That’s actually possible.
We just need the courage to build it (and to fight the special interests that want to protect the status quo).

##### We’ve Seen It Work in the Past

##### Government institutions have done it before…

There are many examples throughout history of making data more accessible leading to innovation and societal good.

Government is actually great at sharing specific types of data. For example, local, state, and federal governments share an abundance of data about property, mortgages, and real estate transactions. It’s messy and the structure varies from one locality to another but it’s out there.

This resulted in companies like First American, CoreLogic, and Zillow that ingest, clean up, and package this data for sale. Their datasets are then utilized by governments themselves for urban planning and economic development. This is a great example of how opening up access to data can transform society for the better.

Weather data is another example. The National Weather Service and NASA make their data accessible, resulting in businesses like AccuWeather. There are also lots of companies that help industries like agriculture innovate by helping make sense of this data.

We take this for granted. But progress comes from building on top of data.

All this innovation was only possible because institutions chose to make their data more accessible.

##### Data Privacy Is Paramount

The biggest challenge in opening up access to data boils down to protecting people’s privacy. It’s incredibly important to protect individual privacy and that’s not really up for debate. But we have ways to solve for this.

##### Innovation in privacy technology

There have been huge advancements in privacy technology over the past decade, ensuring personally identifiable information is kept private and safe. But people are still making decisions as if we only had the tools of the 1980s.
Some of this is because the entrenched special interests are powerful but some is just because people are unaware of all the advances in protecting people’s privacy.

After reading this piece, the entrenched special interests will still be powerful. But at least the reader will have a better overview of how society can promote innovation AND still protect privacy.

Let’s start with Differential Privacy, which is probably the most commonly used measure to ensure data privacy. To boil it down to simple terms: Differential Privacy adds “noise” or slight modifications to processes that ingest sensitive data.

How much noise it adds depends on the process. The idea is that even if you remove some data points from the dataset, you arrive at basically the same end product. So if researchers are asking questions of a dataset and trying to run analysis on it (the process), complying with Differential Privacy can allow them to arrive at effectively the same answers, without compromising data quality, even if select data points are removed.

Why is this important? Because it ensures that the end user can’t deduce who is in the dataset (any one person in a dataset can theoretically be added or removed and the answer would still be the same). Lots of organizations use Differential Privacy today including Google, Microsoft, Apple, Facebook, JP Morgan, and even the US Census Bureau. Differential Privacy, when done well, makes it virtually impossible to reconstruct an underlying dataset or identify any one individual.

Here are some tactical methods of implementing privacy:

##### Synthetic Data

If direct access to data is not available, it is possible to create synthetic data. By randomly altering each data point in a dataset, it’s possible to create a new dataset with the same statistical properties as the original dataset, but no actual data point is the same. By using an algorithm to randomly modify data points, nobody’s privacy is at risk.
But the resulting dataset can be just as useful as the original one.

Source: Data in Government Blog

##### Homomorphic Encryption

All data tied to people can be homomorphically encrypted, which allows end users to perform computations on data without ever decrypting it and seeing the underlying data. The results of those computations are also encrypted and must be decrypted before they can be read.

So if our fictional character John wants to figure out (X+Y), he can submit a computation to add X and Y and will receive an encrypted solution. He’ll then have to decrypt that solution. If the solution to X+Y is 10, he will not receive the answer 10; rather, the answer will be encrypted, waiting to be decrypted.

So why doesn’t everyone use homomorphic encryption today? Well, it is slow and expensive. But it’s getting better and faster every day. Nudging Homomorphic Encryption forward is one of the most important things we can do.

##### Functional Encryption

A method of encryption in which unique decryption keys allow end users to perform functions on the data. This means that if you have access to a decryption key, you can run specific analyses on private data without ever accessing the data itself. This allows us to ask questions of the data and see results but nothing else.

How is this different from Homomorphic Encryption? Functional Encryption requires access to a specific key which corresponds to a specific function on the data itself. The output is also not encrypted in Functional Encryption, unlike Homomorphic Encryption. The one drawback of Functional Encryption is that the generation of decryption keys to perform functions on the data can be a bottleneck for widespread use.

Going back to the example with John, he can submit the computation of (X+Y) and will receive a decrypted solution (e.g. if the answer is 10, he will receive that).

##### Multi-Party Compute

Two or more parties jointly compute a function over a dataset without any party revealing its raw inputs.
The input data is masked (not encrypted), meaning that the underlying data is obfuscated or modified. The output is shared amongst the parties computing the function. The benefit of this methodology is that it is very hard to leak data since there are multiple parties computing the function together, rather than a single point of failure. The drawback is that it does require multiple parties to compute a single function.

Travis May from Datavant recently stated that we don’t have to trade privacy for data utility and both can be achieved. He went on to describe that the tradeoff will only need to be made once we’ve reached the efficiency frontier, which we’re nowhere close to. He uses the graph below to visualize this:

The combination of all this new technology now means it is entirely possible to join different types of data about people without ever uncovering who they are.

##### We Can Have Our Cake and Eat It Too

If we make data that exists today across large public and private institutions more accessible, it’ll be a huge step forward for humanity. Doing so will result in unparalleled economic and policy innovation.

It doesn’t have to be free, but it also can’t be egregiously expensive.

Is the problem really privacy? On the face of it, it would seem so. But if you dig deeper, it’s a very solvable problem. The real problem is having the courage to work on the very hard task of making data privacy compliant.

Our goal shouldn’t be to hide the data; it should be to make it safe and securely accessible.

It starts with a collaboration between large institutions (access) and people (consent). The technology is there to make sure it’s executed safely and privacy is protected. And by the way, willingness to open up access will most definitely result in advances in privacy technology.

We should ask the institutions to meet us halfway.
If you make the data accessible, we promise you the world will rise to make sure it’s used for innovation in a safe way.

All we need is courage.

You can find me on Twitter @auren

##### FAQs

1. Why is data accessibility important? It enables better research, smarter policy decisions, and faster innovation.
2. Can sensitive data be shared safely? Yes. Techniques like differential privacy and encryption allow analysis without revealing personal information.
3. Does accessible data mean free data? No. Data can be paid for, as long as it is responsibly available.
4. Why should we combine datasets? Connecting datasets creates deeper insights than analyzing them in isolation.

#### Leverage joint demand and location intelligence for stronger occasion-based marketing

This blog was reposted with permission from PredictHQ | Author: Valerie Williams | Original Source

##### Use joint data sets to make data-driven business decisions

Joint datasets help to improve the accuracy of forecasting models by providing a larger and more diverse set of data for the model to work with. One example of a powerful joint data set is the combination of demand intelligence and location intelligence, made possible by a partnership between PredictHQ and SafeGraph, a provider of granular point of interest (POI) data which businesses use to better understand the physical world. Today, we’ll explore how to leverage demand intelligence powered by geospatial data to build stronger marketing campaigns, specifically marketing campaigns based on occasions.
Leverage SafeGraph + PredictHQ data for stronger targeted marketing campaigns

With access to accurate POI data and intelligent event data, organizations can gain even greater insight into events tied to a unique location, which can be used for more efficient targeted marketing campaigns. A better understanding of demand-driving events tied to a specific location can also help businesses to identify opportunities for growth and expansion, such as by launching new products, entering new markets, or what we’ll be exploring today, where and how to market around a specific event. Occasion-based marketing, also known as event-based or situational marketing, involves targeting marketing efforts around specific occasions or events. For example:

- Retailers offering special promotions or discounts for holidays, such as Black Friday, Cyber Monday, and Christmas
- Restaurants offering special menus or deals for Valentine's Day, Mother's Day, and Father's Day
- Sports venues offering special ticket deals or promotions for specific games or events, such as opening day or a rivalry game
- Accommodation and travel companies offering special deals or packages for seasonal events and school holidays, such as spring break or the summer vacation season

Event-based marketing has proven to be particularly effective because it aligns the marketing message with the consumer's current mindset and priorities, which increases the relevance and appeal of the offer, especially when it’s paired with local knowledge of impactful occasions.

Occasion-based marketing also creates a sense of urgency and scarcity, which can motivate consumers to make a purchase. For example, a special Valentine's Day menu would create a sense of urgency because the offer is only available for a limited time.
How the joint dataset powers occasion-based marketing

A solid understanding of impactful events in various geographic areas supports better decision-making around marketing, including targeted and occasion-based marketing campaigns. When you can easily identify high-demand areas and windows of time, you can spend more time and effort tailoring messaging and tactics to align with a particular occasion, as well as the unique needs and preferences of consumers in those areas. For example, a business might use location intelligence to identify areas with a high concentration of potential customers, and then use demand intelligence to understand the factors that are driving demand in those areas, such as a public holiday, festival, or community event that sparks an influx of foot traffic.

Combine this insight with current consumer preferences and market trends to speak directly to what customers love most about a specific occasion, and how they can achieve this with your product or service. There are countless examples of retailers who have struck gold by providing consumers with relevant, timely, and personalized offers and messages specifically tailored to occasions and events they care about.

How can your business leverage local demand insights for occasion-based marketing campaigns?

If you're a marketer, you know that successful marketing depends on reaching the right people, at the right time, with the right message. Joint location and demand intelligence can help you do just that.

Insights powered by location and demand intelligence give companies across industries a better understanding of their customers' behavior and preferences.
This information can help you fine-tune your marketing messages and offers to specific occasions such as holidays, sporting events, or special sales.For example, if you know that a large number of your customers visit a particular location at a specific time of day, you can use your knowledge of location-based demand to determine when to send targeted offers or promotions. This can help you increase foot traffic and boost sales. It can also help you identify trends and patterns in customer behavior, which reveals valuable insights into what your customers want, and how you can better serve them. #### Leveraging Places Data to Make Informed Investment Decisions The Challenge of Getting the Complete Picture For data scientists conducting due diligence for deal sourcing teams, gaining a comprehensive view of a company's performance relative to its industry peers is incredibly challenging. This task is further complicated when evaluating businesses with a physical presence or those providing services to brick-and-mortar establishments. In these scenarios, up-to-date and accurate location data is crucial. However, obtaining this data can be difficult, as open source data often lacks precision, and web-crawling methods can be time consuming and unreliable if not done properly. In this blog, we'll explore why using point of interest (POI) data is essential for accurate market analysis and illustrate how it was effectively leveraged by a private equity firm to assess a potential acquisition. Why Use POI Data for Investment Diligence Before diving into our specific use of SafeGraph Places data, it's important to contextualize why POI data is invaluable for investment research and due diligence. Coverage of POIs in specific categories and regions helps assess the saturation of brands and potential market share. Competitive analysis examines the volume of key competitors in the vicinity, a crucial factor for any new location. 
Accessibility of the location, including ease of entry and availability of parking, directly impacts customer footfall. By leveraging accurate and up-to-date POI data, businesses can minimize risks and create educated hypotheses about a store’s potential for success. Case Study: Dunkin’ and Starbucks in Broward County, Florida In collaboration with a prominent private equity firm who was looking to acquire a coffee shop chain, SafeGraph provided Places data to perform an in-depth market map analysis of the quick-service restaurant (QSR) chains, Starbucks and Dunkin’ in Broward County, Florida. This data allowed the firm to assess total addressable market (TAM), evaluate market performance by chain, identify submarket gaps, understand opportunities for chain expansion, and cannibalization. TAM, Saturation, and Competitive Analysis ‍Using SafeGraph Places data, the firm mapped out all Dunkin' and Starbucks locations within Broward County, Florida in Esri ArcGIS. They found that both chains had a significant presence, with multiple outlets scattered throughout the county. Accessibility ‍They analyzed the accessibility of these locations by considering factors such as ease of entry, parking availability, and proximity to major roads and commercial centers. Their findings indicated that both Starbucks and Dunkin’ generally had good accessibility due to their positioning near high traffic areas and shopping centers. However, this also suggests that future expansion could be more challenging due to higher barriers to entry in these prime locations. Insights and Recommendations ‍By leveraging SafeGraph Places data, we provided the private equity firm with high quality data to generate actionable insights: Market Expansion Opportunities: Identified specific areas in Broward County with gaps in coffee shop presence, highlighting potential locations for new outlets. 
Competitive Positioning: Analyzed the locations of all major competitors to determine the most viable markets for expansion and competitive advantage. Customer Engagement: Assessed site accessibility and provided recommendations for improving location convenience, aimed at increasing customer foot traffic and satisfaction. By tapping into the granular and accurate data provided by SafeGraph Places, coupled with the powerful visualization capabilities of Esri, we offered the private equity firm a more holistic market picture for their market analysis. This approach not only supported strategic decision making for new investments but also enhanced the firm’s confidence in selecting an acquisition that was more likely to succeed. ‍ For a limited time at no cost, download the Starbucks and Dunkin' POI dataset to see how SafeGraph Places data can help with informed investment decision making. ‍ #### Maintaining a Source of Truth in a Dynamically Changing World At SafeGraph, we are entirely focused on curating the highest quality location data to serve as a source of truth for what is happening in the real world. We ingest raw data from a variety of sources, then clean, de-dupe, and standardize it so data scientists can spend their time on actual analysis. We curate our Places data monthly to ensure it is fresh and an accurate representation of the real world. We include open/close dates to provide context on when a place began operations, or stopped them. This empowers data scientists to perform market analysis to see exactly how certain brands, business categories, or regions are expanding or contracting. 
But in such a dynamically changing world, where businesses close, change names, and various other changes happen at millions of places around the world, we must constantly be vetting our data to make sure it is a true representation of reality.

We’re not the only provider of POI data out there, so we often hear of organizations comparing our data to similar datasets to assess coverage and quality. We decided to perform a similar analysis comparing the quality and coverage of SafeGraph Places and another leading POI dataset in the UK. TL;DR: We found that 17% of the other provider’s POIs were invalid.

Comparing SafeGraph Data to Other Providers

We first matched SafeGraph Places data to the other provider’s. For POIs that did not match, we began a thorough investigation to identify if we were indeed missing these places (no one is perfect), or if the other provider’s data was inaccurate.

Some of the real world factors that can affect POI counts include business closures or acquisitions. To determine if the businesses showing up in the other dataset but not SafeGraph fall into either of these categories, we looked for relevant news articles and checked website domains.

For example, we found that Debenhams department store purchased the Principles brand, which was not properly reflected in the other provider’s dataset and so produced inaccurate brand counts. Similarly, the brand MK One was still present in the other dataset even though it has been sold and rebranded multiple times.

There are also some discrepancies between how SafeGraph classifies brands and the methodology of other providers. To be a valid brand in SafeGraph data, a brand must have more than one location and its own storefront or dedicated space in a store (e.g., a Sunglass Hut inside Macy's is still a valid brand, but a Clinique makeup counter is not). We classify brands this way based on feedback from data scientists about how they think of brands and separate store locations.
For example, if a kiosk inside a store has its own phone number and hours of operation, it is considered its own brand location.

Many brands included in the other provider’s data do not meet this standard of accuracy. For instance, the other dataset includes POI data for Envy, an online-only shoe store, as well as Carlton Sports, a brand that is only sold in other stores and does not have its own store locations. Another example is Connections, an educational consulting service for boarding schools that does not actually own or operate the schools themselves (the other dataset associated the brand with the school locations).

Accurate, Fresh, and Reliable POI Data

Businesses open and close frequently, and to stay on top of these dynamic changes, data providers need to be thorough in their research and data curation. SafeGraph ensures data accuracy and freshness by thoroughly researching and vetting the raw data we ingest. Our sole focus is curating location data of the highest quality so data scientists can spend their time analyzing, not cleaning, their data. We deliver monthly updates of our data to provide an up-to-date view of the real world, unlike other providers whose quarterly- or annually-updated data quickly becomes stale.
#### Moore's Law Strikes the Satellite Industry

At SafeGraph, we’re big fans of satellite companies because at their core, they’re data companies. We recently spoke with Peter Platzer, CEO of Spire, on the “World of DaaS” podcast about the satellite industry and what most people don’t understand about satellites. Spire has launched and operates the world’s largest multipurpose satellite constellation that uses radio frequency to collect data.

Moore’s Law Strikes Again! Satellite Performance/Cost Grows Exponentially

Over the past few decades, the cost per kilogram for satellites has been steadily decreasing while performance per kilogram has been exponentially increasing – growing 10x every 5 years!

Not All Satellites Are Created Equal

There are similar devices in the transportation industry that have windows and passengers and seats and captains and engines and steering wheels and we end up calling them planes, ships and trucks. And everyone understands the difference between a plane, a ship and a truck, right? Unfortunately, we still think of different types of satellites as… satellites.
They are just as different from each other as a plane, a truck and a ship.

So let’s dive into the different types of satellites:

- Listening satellites: Small satellites in lower orbit working off radio frequency signals (Spire) – these satellites are “software-defined,” meaning you can change what data they collect even after launch.
- Looking satellites: Large satellites with cameras that collect imagery (Planet Labs) – these satellites are not “software-defined,” so there are limits on changing which type of data they collect post-launch.
- Talking satellites: Telecommunication satellites that provide internet and TV services – these are what most people think of when they hear the word satellite.

Radio Frequency Data Enables Data Processing Onboard the Satellite

Spire satellites focus on radio frequency data and have an advantage in that they can work day and night, rain or shine. Using edge computing, data is processed onboard the satellite, which reduces overhead and lowers the latency of the data. This is both faster and more energy efficient than processing data on the ground. The really cool part? Spire satellites disintegrate after 2-3 years, faster than a plastic bag from your grocery store, preventing any unwanted space debris.

Repurposing Existing Technology for New Uses

Marine vessels and airplanes have identification technology onboard for safety reasons. The Automatic Identification System (AIS) for marine vessels was developed for safety reasons, and its signals were meant to be picked up by other vessels in the vicinity. Spire developed a technology that searches for and listens to AIS signals from ships all the way from space. The information they collect is very rich and contains over 20 different data points, including location, speed, origin and how deep the vessel sits in the water.

Radio Frequencies Captured From Satellites Are Used to Predict the Weather

Weather prediction is generally driven by data from space.
95% of the world’s population lives on less than 10% of the world’s land surface and over half of the world’s population lives on 1% of land. Ground sensors are great for areas that are inhabited but for the majority of Earth’s surface area which is remote, we need to rely on space technology.

Listen to the full episode here

#### More Data Doesn't Always Mean Better Data

When working with any data source, quality is always better than quantity. Here are 4 telling signs that you might be going slightly astray.

Whoever originally said “less is more” probably wasn’t thinking about data. But little did they know that, in today’s data-driven age, this simple concept would hold so much water. There’s no question that data now plays a critical role in propelling businesses, governments, non-profit organizations, and academic institutions forward. It can fuel new innovations and new insights that can conceivably change the world for the better. In a business context, specifically, it can provide a competitive edge for driving long-term growth and success. But we’ve said it before, and we’ll say it again: Not all data is created equal. When working with a new data source, you must evaluate it to ensure that it can actually do whatever you’re trying to accomplish. In this sense, being choosy about what data sources you use is essential to avoid getting pummeled by a data avalanche that can actually cause more harm than good.

The problem with having more access to more data

If you’re trying to answer specific questions or (dis)prove certain hypotheses, it’s easy to first go down the rabbit hole of collecting as much data as you possibly can to support your objective. After all, having more data at your fingertips feels more complete and, therefore, can lead you to believe that it will give you a competitive edge by default.

Unfortunately, that’s rarely ever true.
If you’re not working with the right data, it’s just more data—and more data, on its own, can actually get in the way of uncovering actionable insights. We’ve seen this play out in a variety of ways during the COVID-19 pandemic. On the more positive end of the spectrum, we’ve seen how many local, state, and federal governments—as well as the businesses and people in the communities they serve—have relied heavily on SafeGraph’s accurate data to mitigate the ongoing health and financial impacts of this crisis on the ground level. We’ve covered this at length via the SafeGraph blog—be sure to take a look. But while data can do a lot of good, misusing it can have serious consequences for a business. Most people who use data incorrectly have no idea they’re doing it, and then inadvertently go on to make misguided decisions with a false sense of confidence. ‍Bad data can affect businesses in different ways: Retailers with dirty customer data may open a store in the wrong location or insurance underwriters with inaccurate geocodes may significantly underestimate a property’s flood risk. Regardless, what all industries have in common is the high cost of making important decisions based on bad data.So, why does this happen? For starters, it has a lot to do with data simply being more accessible and affordable than ever before. While this has, on the upside, led to a rise in the tools and resources available for collecting, cleaning, processing, and analyzing data more effectively, it still doesn’t negate the fact that the vast majority of data sources today are inherently flawed. As a result, many trained data experts still have to do a delicate tight-rope walk to balance the right amount of data with the right level of data accuracy before leaning into any insights drawn from that data. This is one of the primary reasons why we are such big believers in data standards. 
Approaching and analyzing all data sources via an objectively critical lens is the only way to drive outcomes that can lead to positive change and minimize potential harm.

4 signs that you’ve fallen into the “more is better” data trap

When you’re wading in lots of data, it can be hard to single out the good data from the bad. However, there are a few ways to ‘sanity check’ yourself to ensure that you stay focused on quality at all times. Here are four things to look out for:

1. You’ve added more data than you need

Having a wide selection of data to choose from is crucial for sourcing the right solutions, but it doesn't mean you need to use it all. While data providers should focus on selection to ensure they offer what users need, users should only concern themselves with the data that's actually going to solve their business problems. Adding more data to the equation, simply because it's there, can lead to over-complication and incorrect results. As Aaron Lipeles in Towards Data Science so aptly puts it:

"Making a dataset wider, by adding a lot of extra fields, increases the odds that something somewhere will look like it’s correlated when it’s not. The only way to mitigate that risk is to make the dataset deeper by adding more examples."

Here’s another way of looking at this predicament. Having more columns in your dataset is great. It indicates that you’ve got a lot of information about a particular entity at your fingertips. But if some of those columns are wholly irrelevant to the kind of analysis you’re doing, they could easily lead you to draw conclusions or make correlations that are completely inaccurate. As a rule of thumb, you should have a clear understanding of what you’re trying to get out of the data before you actually start working with it. That’s the only way to ensure that you stay focused on the parts of the dataset you need and avoid distractions that could eventually lead you down the wrong path.

2. 
You can’t see the forest for the trees

When you’re faced with a sea of data to cull through, it’s all too easy to go on a ‘wild goose chase’ until you find the hidden gems in there. But the problem here is that, if the data is all bad to begin with, you’re going to spend a lot of time trying to make sense of it, in addition to cleaning and sorting it, only to draw bad conclusions that lead to bad decisions.

See how quickly this becomes an undesirable domino effect? Massive amounts of data can quickly lure you into a false sense of confidence that makes you want to force the data to be usable. Unfortunately, that won’t do you, or anyone relying on your insights, any good. So avoid the temptation to hide behind an endless flow of data because, let’s face it, it’s not likely to end well.

3. You don’t have a real strategy in place

At this point, you might think that we’re a broken record, but that’s only because this is a really important part of the data analysis equation. If you don’t go into this process knowing what problem you need to solve, you can’t work backward to identify what elements you need (including the right datasets) to get you there. As Michael Grogan points out in “Why More Data Isn’t Always Better” in Towards Data Science, “More data is not better if much of that data is irrelevant to what you are trying to predict. Even machine learning models can be misled if the training set is not representative of reality." This further underscores just how important it is to first know what you’re trying to accomplish and then find the data sources to support it. A lot of organizations these days tend to acquire data just for the sake of acquiring it, perhaps thinking that it’ll come in handy one day.
Or maybe it’s purely out of a deeply rooted fear of ‘feast or famine.’ Either way, before getting too far ahead of yourself, be sure to confirm that you have an actual use for the data in front of you and then cross-check it for accuracy to ensure that it won’t ruin anything good that you already have in place. Today's most data-mature organizations carefully align their data strategy and procurement processes.

4. You can’t connect the data together

Depending on the problem you’re trying to solve, there’s a very good chance that you’ll need to work with multiple datasets to get to the answer you’re looking for. But if the datasets can’t connect together well, with or without some serious cleaning, you’ve got an uphill battle on your hands. Acquiring all that data is really only helpful if you’re able to join datasets together. Not being able to do so often signals a deeper problem with the data, and in that case you should avoid it altogether. While the industry has made strides in recent years with advancements like Placekey, not all data is easily joinable, and it may end up costing more time and resources to work with than it is really worth.

Don’t confuse “more” with “better”

The moral of the story: When it comes to data, more isn’t always a good thing. In fact, working with more bad data will actually make your life a lot harder and, even worse, lead you to draw bad conclusions that then inform bad decision-making. Trust us, none of that is worth it. That’s why, at SafeGraph, we’ve made a point to own our niche and focus squarely on data related to physical places. For us, keeping a narrow focus gives us a leg up in prioritizing quality over quantity just as much as it enables us to provide consistently high-quality data (that can easily be connected to other datasets) at a meaningful scale (and growing!). But we've also made it easier than ever to connect our places data with other datasets when needed.
We’ve been through the trenches and have worked with datasets of all shapes and sizes. We know what good data looks like and can spot a bad dataset from a mile away. In delivering on our mission to democratize access to data for all, we will never compromise on quality for the sake of quantity. That simply wouldn’t do anyone any good.

We’re laser-focused on being really good at one thing: places data. Schedule a demo to see how quality data can make a difference in your analytics.

#### Now Available: SafeGraph UK Data

SafeGraph UK Data is Now Available

SafeGraph UK data is available in England, Scotland, and Wales.

We’re excited to announce that we’ve expanded our global footprint for Places data into the United Kingdom. Effective immediately, SafeGraph now offers the Core Places and Geometry datasets for over 1.3M places in the UK, covering over 500 brands.

“Our expansion into the United Kingdom is a huge win for our organization’s ability to provide value to our current customers who have a global footprint, as well as future customers who have been asking for this data for the last year or so.” - Lauren Spiegel, VP of Product, SafeGraph

Over the last few months, we’ve teased the announcement with many prospects and have had an overwhelming amount of interest. In particular, Core Places and Geometry data are often used to help boost advertising effectiveness. Many brands, retailers, and advertising technology companies that do business in the UK have shown a large need for SafeGraph data as a critical ingredient in their campaign planning.

SafeGraph recently partnered with a well known location and audience targeting platform that is using SafeGraph UK data to improve their ability to create and see ROI on out-of-home advertising campaigns.

An executive from their revenue team said, “We're excited to work with SafeGraph to help boost our OOH campaign planning process in the UK.
SafeGraph Core Places data is a critical ingredient to helping ensure we're accurately creating audiences and measuring the ROI on our campaigns effectively."

Outside of advertising, SafeGraph has seen a high volume of need for UK places data in mapping, real estate, consumer retail planning, and more. You can read about these common use cases here. SafeGraph’s partner network is also excited about the geographic expansion. CARTO, a global leader in location intelligence and spatial data science, works with many UK organizations who have been eagerly awaiting the UK launch of SafeGraph data.

“We’re thrilled that our partner SafeGraph will be launching in the UK. A growing number of our clients in real estate, retail, and CPG in the UK have asked for this in the past - so we’re excited to see how they start using the data to navigate the new normal we’re seeing on the UK high street and shopping centers.” - Florence Broderick, VP Marketing, CARTO

What’s in SafeGraph UK Data?

Both the Core Places and Geometry datasets are now available in England, Scotland, and Wales. SafeGraph Core Places for the UK represents 1.3M points of interest (POI) and covers base information such as location name, address, category, and brand association.

SafeGraph Geometry in the UK represents POI building footprints with spatial hierarchy metadata depicting where child polygons are contained by parents, or when two tenants share the same polygon, for all 1.3M points of interest found in Core Places.

View the entire schema for SafeGraph Places here.

To get started with SafeGraph UK data, contact our sales team.

#### Open Census Data Update: Now Including ACS Data From 2016-2019

Open Census Data is Integral to Geospatial Analysis

As places data becomes increasingly part of business analytics across industries, so does combining it with other data sources and types for deeper insights.
The geospatial data ecosystem is continually evolving to include new types of data related to a physical place, and as organizations become more data mature, they are finding new ways to join and leverage multiple datasets. Demographic data is fundamental to most geospatial analyses. Accurate representations of populations as they relate to physical space can reveal consumer behavior, identify areas of need, and inform strategic plans. Whether joined to points of interest (POI), communities boundaries, or any other type of geospatial data, demographics provide critical information to analysts in every industry. ‍Collecting and Accessing Open Census Data Detailed demographic information is increasingly used for data enrichment. The most well-known source of demographic data is often a country’s census. In the US, the Census Bureau collects and delivers over 7,500 demographic attributes related to gender, age, income, ethnicity, and other key socioeconomic traits through the annual American Community Survey (ACS). While the census occurs every ten years (you probably remember filling out the 2020 census pretty recently), ACS data is collected every year. Census and ACS data can be aggregated at various geographic levels of granularity, enabling analysis at the scale required by a particular project. But this aggregation is not always easy or accessible for everyone who would like to use the data. While US Census Bureau and ACS data is free to download on government websites, many users find it cumbersome and time consuming, especially for the increasing number of non-data scientist users. US Census and ACS data is free, but requires multiple downloads for all data variables at a national level. To make census data more accessible and easily joinable to other geospatial datasets, like our places data, SafeGraph downloads and joins it to census block group (CBG) geometry. 
Our data download page is updated with ACS 5-year estimates from 2016-2019, along with associated CBG geometries, to provide free and easy access to essential demographic data. Working with Open Census Data‍ SafeGraph provides an easy, consolidated download for all ACS attributes and national CBG geometry. SafeGraph offers open access to census data, specially designed to enable users, regardless of their level of expertise working with geospatial data, to easily access and incorporate key demographic data into their analysis. Our updated site gives users multiple options for downloading census and ACS data in GeoJSON format. On our census data page, users can download: 2016-2019 ACS data (separate download files for each year to help keep things organized) CBG geometry data applicable to all 2016-2019 ACS files Getting Started with Open Census Data‍ With census and ACS data, users with varying levels of data science experience across industries can analyze how places relate to people, and how people interact with places. The analytics opportunities are endless, but at SafeGraph we often see places data joined to demographic data for improved: Site selection Trade area analysis Location-based marketing Investment research To help you get started, we’ve created a new data science notebook that lets you enrich your POI data with census data. Be sure to check out our other census data resources: Open Census Data: Everything you need to know to get started Open Census Data technical docs Join point of interest data to census block group geometries Questions about working with open census data? Don’t hesitate to reach out - we’re here to help. #### Open Census Data: Everything You Need to Know to Get Started   Key Takeaways Open census data provides granular demographic insights that support location-based and population analysis. Census block groups (CBGs) are the most detailed geographic level commonly used for analysis. 
FIPS codes act as the primary key for joining census data with other datasets. Census attributes are encoded using table IDs, which must be selected based on analytical needs. Linking open census data to places typically requires geocoding and a point-in-polygon spatial join. There are thousands of reasons to bring census data into your workflow. Whether you are a retailer considering opening a store at a new location, an out-of-home advertiser considering investing in new billboards, a researcher from the CDC trying to identify risk-factors for cancer across a national patient population; all of these use cases (and many more) have one thing in common: you need to know demographic information about a place. This is a beginners guide to working with Census Data. Using these instructions, you'll be analyzing census data in 15 minutes or less. Step 1: Download Open Census Data The Census tracks a staggering amount of data, and all of this data is open and available to the public. Unfortunately, getting the data is not trivial. We’ve written about this problem before, and as a labor of love for the data science community SafeGraph painstakingly organized all of the data into a FREE and convenient single download of CSVs. SafeGraph didn’t add any frills or whistles or abstractions, and it doesn’t cost anything, we kept it simple. You should use this. Once you have the data in hand, there are fundamentally 2 things you need to know to get going: what are the rows and what are the columns? What Are The Rows? Census data is organized geographically, and all of the Census and American Community Survey data can be keyed on the FIPS (Federal Information Processing Standards) geography code. The US government divides the USA into different nested boundaries: States > Counties > Census Tracts > Census Block Groups > Census Blocks. 
Image credit & more info: Esri

The census block group (CBG) is the highest granularity for which the Census reports most of its data, and so each CBG is a unique row in the data. The FIPS code encodes all of this geographic information for each census block group.

[Image: example FIPS code. Not a real FIPS.]

So, each row of the data is a unique census block group (CBG), encoded with a 12-digit FIPS code. And in Census data, the FIPS code (i.e., the primary key of the entire dataset) is listed in the column named `census_block_group`.

What are the columns?

Each “column” in the data is a particular census attribute estimated by the US Government (remember that each “row” is a unique census block group, or CBG). For example, one attribute is Sex By Age By Veteran Status For The Civilian Population 18 Years And Over which is a population estimate for civilians (non-veterans) over the age of 18. Since that is a mouthful, the government assigns each attribute a unique table_id code. In this example the table_id is B21001e7. SafeGraph's Open Census Data file includes over 7,500 table_ids (columns) for the 220,000+ census block groups in the US. It’s easy to get overwhelmed by Census data, but you probably do not need to analyze all of the Census data at once (or, you know, ever). So it is up to you to decide which Census attributes matter for your question. To help wrap your head around the possibilities, let’s look closely at how the table_id code is organized. The code itself encodes what kind of attribute it identifies. For example the code B01001e19 encodes that this attribute is about Age and Sex, specifically the population estimate for Males aged 62 to 64 years.
Here is how that table_id code breaks down: B01001e19 (encoded as B-01-001-e-19)

- B = Table Type: This is a Base Table (as opposed to C for Collapsed Table)
- 01 = The Subject: Age; Sex
- 001 = The Specific Table: Sex by Age (002 is Median Age)
- e = estimate (as opposed to m for margin of error)
- 19 = Specific Population: Males 62 - 64

You can read more about the meaning of the table_id code. However, for most purposes you can generally consider these arbitrary shorthand codes for long variable names. The metadata for each table_id is located in /metadata/cbg_field_descriptions.csv. In addition to the full long variable names, this file also includes the subject and specific table broken out for each table_id. This is a very useful reference for identifying exactly which variables you want to analyze.

How do I pick which specific variables I want?

Rather than scrolling through 7,500 table_ids, it’s useful to review all possible top-level subjects (i.e. the first 2 digits after the B or C):

- 01 Age; Sex
- 02 Race
- 03 Hispanic or Latino Origin
- 04 Ancestry *
- 05 Citizenship Status; Year of Entry; Foreign Born Place of Birth *
- 06 Place of Birth *
- 07 Migration/Residence 1 Year Ago
- 08 Commuting (Journey to Work); Place of Work
- 09 Relationship to Householder
- 10 Grandparents and Grandchildren Characteristics *
- 11 Household Type; Family Type; Subfamilies
- 12 Marital Status; Marital History
- 13 Fertility *
- 17 Poverty Status
- 18 Disability Status *
- 19 Income
- 20 Earnings
- 21 Veteran Status; Period of Military Service
- 22 Food Stamps/Supplemental Nutrition Assistance Program (SNAP)
- 23 Employment Status; Work Status Last Year
- 24 Industry, Occupation, and Class of Worker
- 25 Housing Characteristics
- 26 Group Quarters *
- 27 Health Insurance Coverage
- 28 Computer and Internet Use *
- 29 Citizen Voting-Age Population *
- 98 Quality Measures *
- 99 Allocation Table for Any Subject

\* Not included in SafeGraph's Open Census Data

Cheat Sheet: Here are some of the most commonly referenced attributes in the census:
Population sizes:

- Total Population: B01001e1
- Population by Age and Sex: B01001*
- Population by Household Incomes: B19001*
- Population by Education Level: B15003*
- Population by Race: B02001*
- Population by Hispanic Ethnicity: B03003*
- Population by Race & Hispanic Ethnicity: B03002*
- Population by Household Type: B11001e*

\* wildcard character

Summary Statistics:

- Median Household Income: B19013e1
- Aggregate Household Income: B19025e1
- Per Capita Income: B19301e1
- Median Age: B01002e1

Once you know which table_ids you want to include, the documentation explains exactly in which files to find them (hint: if you want the variable B01002e1 it is in the data file cbg_b01.csv).

Step 2: Join Open Census Data to your Data

The best way to join your data to the Census data is on the FIPS code (the primary key of Census data). For one-offs you can manually look up the FIPS code for a census block group using an address or a point; we recommend this tool, but there are many resources. A programmatic workflow generally looks like this:

List of places → List of lat,long coordinates → Geospatial join with CBG geometries (point-in-polygon) to link to FIPS

If you are starting with SafeGraph Places, then you already have latitude, longitude coordinates for every place. If not, you will need to geocode the data using a tool like the ArcGIS geocoder or the Google Maps API, or match the data to a safegraph_place_id to get geospatial coordinates. To complete the geospatial join you will need the CBG geometry boundary data. The CBG geometries are available as GeoJSON or shapefiles from the US government. Good luck googling for them. Don’t worry, we have you covered: SafeGraph includes this data (as GeoJSON) in our Open Census Data, and we highly recommend using it. FYI, here is a working notebook that shows exactly how to do the geospatial join in Python. You’re done. It’s that easy.
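To make the pieces above concrete, here is a minimal, dependency-free sketch of decoding a table_id, splitting a 12-digit FIPS code, and linking places to CBGs with a point-in-polygon test. It is an illustration under stated assumptions, not the notebook's implementation: the function names, the square "CBG" polygons, and the sample coordinates are invented for the example, the file-name rule is generalized from the single `cbg_b01.csv` hint above, and a real workflow would more likely load the GeoJSON and use a spatial library such as geopandas.

```python
import re

# Pattern follows the B-01-001-e-19 breakdown described above.
TABLE_ID = re.compile(r"^([BC])(\d{2})(\d{3})([em])(\d+)$")


def parse_table_id(table_id):
    """Split a table_id such as 'B01001e19' into its encoded parts."""
    match = TABLE_ID.match(table_id)
    if match is None:
        raise ValueError(f"unrecognized table_id: {table_id!r}")
    table_type, subject, table, measure, column = match.groups()
    return {
        "table_type": "base" if table_type == "B" else "collapsed",
        "subject": subject,
        "table": table,
        "measure": "estimate" if measure == "e" else "margin_of_error",
        "column": int(column),
        # Assumption: generalizes the one hint in the text (B01002e1 -> cbg_b01.csv).
        "file": f"cbg_{table_type.lower()}{subject}.csv",
    }


def split_fips(cbg_fips):
    """Split a 12-digit CBG FIPS code: state(2) + county(3) + tract(6) + block group(1)."""
    if len(cbg_fips) != 12 or not cbg_fips.isdigit():
        raise ValueError("expected a 12-digit string; keep leading zeros")
    return {
        "state": cbg_fips[:2],
        "county": cbg_fips[2:5],
        "tract": cbg_fips[5:11],
        "block_group": cbg_fips[11],
    }


def point_in_ring(lon, lat, ring):
    """Ray-casting test: is (lon, lat) inside the polygon ring [(lon, lat), ...]?"""
    inside = False
    j = len(ring) - 1
    for i in range(len(ring)):
        xi, yi = ring[i]
        xj, yj = ring[j]
        if (yi > lat) != (yj > lat):  # edge straddles the horizontal ray
            x_cross = xi + (lat - yi) * (xj - xi) / (yj - yi)
            if lon < x_cross:
                inside = not inside
        j = i
    return inside


def join_places_to_cbg(places, cbgs):
    """Attach a census_block_group FIPS (or None) to each place dict."""
    joined = []
    for place in places:
        fips = next(
            (code for code, ring in cbgs
             if point_in_ring(place["longitude"], place["latitude"], ring)),
            None,
        )
        joined.append({**place, "census_block_group": fips})
    return joined


# Illustrative values only: two unit squares standing in for CBG geometries.
cbgs = [
    ("010010201001", [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]),
    ("010010201002", [(1, 0), (2, 0), (2, 1), (1, 1), (1, 0)]),
]
places = [
    {"name": "Shop A", "longitude": 0.5, "latitude": 0.5},
    {"name": "Shop B", "longitude": 1.5, "latitude": 0.5},
    {"name": "Shop C", "longitude": 5.0, "latitude": 5.0},
]
joined = join_places_to_cbg(places, cbgs)
```

Keeping the FIPS code as a string matters: read as an integer, codes for states such as 01 lose their leading zero and no longer match the `census_block_group` key. The nested loop is fine for a handful of places; at national scale a spatial index (which libraries like geopandas provide) is the usual choice.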
Now you have rich Census data joined to your original places data and you can answer your burning demographic questions. Does this new candidate retail site cater to my target audience? Do my ideal customers live near these billboards? Do certain demographic factors correlate with cancer diagnoses? For a full, interactive, coded and working example of using Open Census Data to answer a simple demographics question about a place, see Beginners Guide to Analyzing Census Data: A Python Notebook.

More Resources:

- For a full, end-to-end, working example of using Census data to answer a simple demographics question about a place, see Beginners Guide to Analyzing Census Data: A Python Notebook.
- For a working example of geospatial joins using geopandas in Python, see: Point-in-Polygon Geospatial Join: A Python Notebook.

FAQs

1. What is open census data?
Open census data refers to publicly available demographic and socioeconomic data released by the U.S. Census Bureau and the American Community Survey.

2. What is a census block group (CBG)?
A census block group is a geographic unit that represents the highest level of granularity for most census estimates.

3. What is a FIPS code and why is it important?
A FIPS code uniquely identifies geographic areas and is used as the primary key to join census data with other datasets.

4. What are table IDs in census data?
Table IDs are shorthand codes that represent specific census variables, such as age, income, or education levels.

5. How do you join census data with location data?
Typically by geocoding places to latitude and longitude coordinates and performing a point-in-polygon spatial join to map locations to census block groups.

#### Outlining the SafeGraph Recruiting Process

How we hire at SafeGraph

We started SafeGraph to graph datasets together to solve humanity’s biggest secrets. We consider ourselves historians focused on veracity and truth.

We need to attract extraordinary people to help us do extraordinary things.

This is the high-level recruiting process for roles at SafeGraph. If you are a candidate, we hope this transparency will help guide you through this process. If you are another company, we hope this will help encourage a discussion of how to make the recruiting process better. If you are reading this 300 years from now, we hope this gives you a lens into world-changing start-ups in 2017.

Our goal is to go from initial contact to offer in 10 days.

That’s moving very fast … and we need everyone aligned with that goal. Candidates put themselves out there when they talk to us and we think we owe them a quick answer. If the answer is to come work at SafeGraph, great. If the answer is that SafeGraph is not the right fit, the candidate deserves to know that quickly too so they can move on to other amazing companies. We also are trying to take as little of your time as possible to make a good decision … if you get all the way to the offer stage, it should be 5–10 hours total of your time in the process.

We are looking for people that match our culture and values.

We telegraph our culture to everyone we interview.
We want to give people the ability to opt-out early in the process if they do not think SafeGraph is the right fit for them.If you love these values and they resonate with you, then this is going to be a great process. If you do not feel you are a fit with the culture, let us know and we are happy to refer you to other technology companies where you can be successful.SafeGraph Team Values:We optimize for growth. Everyone in the company wants to grow super fast. Everyone. We want teammates who are optimizing for growth over all things. SafeGraph will accelerate your growth with the following commitment: (a) you will work on really hard things; (b) you will have a high chance of failure; and (c) you will only work with amazingly talented people.Feel the need for speed and focus. Yes, we are a start-up. Which means we need to move at lightning speed. That means we need to maximize our output, so we constantly push ourselves to improve over the long hours we work. We also know that focus is essential. We move fast with the hiring process to optimize for start-up oriented people who can make decisions quickly, with incomplete information. Plus we know that we can only do so many things at once; we want to focus on you intently, make a decision as efficiently as possible, and either bring you on or move on to the next challenge.Coaching over managing. SafeGraphers are self-starters that do their best work when given a high level of autonomy. Every team member needs to clearly understand the company strategy and execute. Instead of hands-on, active management, we coach our team members to take their work (and the goals of the company) into their own hands.Deeply curious. SafeGraph’s mission is to understand the truth about the past so we can answer some of the world’s most vexing questions. So we want teammates who have opinions about the world, diverse interests and backgrounds, and lots of curiosity. We like dinner parties and board games over party parties. 
We know a LOT about processing big data and machine learning. But we also know a lot about foreign policy, economics, healthcare, sociology, psychology, transportation, investing, energy, neuroscience and more. Our ultimate goal is to help answer questions like: Is increasing the minimum wage good for the economy? Is wine good for you? What is the best treatment for leg cramps? Where was the origin of Zika virus? What is the actual unemployment number? What lifestyle factors help prevent Alzheimer’s? What factors predict successful marriages? Is it worth it to send my child to private school? Is a public company committing fraud?A-Players Only. Some startups seem to think demonstrating “hockey-stick” growth in employee headcount is what telegraphs success. We think differently. At SafeGraph, our goal is to only hire a small number of truly amazing employees (and rely on vendors for all other tasks). In order to attract and keep such employees, we know we must do 3 things: 1) constantly focus on employee growth, 2) do everything possible to promote from within, and 3) ensure our employees are always being compensated at top of market rates.Enter the interview process …Once we identify someone with the right background who is interested in talking with SafeGraph, we set up the below process (which may be modified depending on your situation).Stage Zero: Written InterviewWe love your resume so we set up your Initial Phone Interview. For some jobs (especially non-engineering jobs), we have a written interview prior to the first phone screen. The written interview is going to be a core skills check on your role: if you are a sales person, we may ask you to describe a sales strategy. We may also ask you some general questions like how you deal with cognitive biases and your strategies for growth. The SafeGraph Written Interview will take 15–25 minutes to complete, depending on the position. 
We’ll go over these responses during the phone interview.Stage One: Initial Phone InterviewThe first round interview is generally with the leader of the group you are interviewing with. This is an initial call and will be, at most, 25 minutes. This is to learn a bit more about you and what you are looking for (and to help you learn more about SafeGraph). We’ll go over your written interview if you completed one. For engineers, this is usually followed by a more technical screen where you remotely work out a few coding problems with a member of our engineering team (<45 minutes).Stage Two: 2-minute check-inHere we do a 2-minute check-in with you to get your information, collect any additional questions you have (we write them down and make sure they get answered), understand your compensation requirements and get you our NDA so we can go deep on the business with you. Also: we might love you but this is an opportunity for you to drop-out if you don’t love us (and yes, we will cry … but that is ok).Stage Three: The on-siteYou’ll come to our office (we’re on Market and 9th in San Francisco), and meet the team in-person. This will be max two-hours of your time. Like all our interviews, we work on respecting your time; we keep things quick and give you the option of meeting after-hours or weekends — we can meet 8am to 10pm. At some point in the process, you will definitely meet with the CEO (we think that hiring and developing talent is one of the 3 most important things a CEO does). The goals of these conversations are two-fold: 1) Give you an opportunity to get to know the kind of team we are building at SafeGraph; and 2) Let our team understand what it would be like to work, collaborate and communicate with you.Optional Stage Four: Project/ReferencesDepending on your background and the interviews thus far, there may be a project. 
And we may ask you for references … and we are likely doing off-list references at this stage (we will be very respectful if you are currently employed). This will take 2–4 hours of your time and another 1–2 hours of our time.

Penultimate Stage: Preparing you for an offer

If you reach this point, we almost certainly want to give you an offer. But you might not yet have enough information to accept the offer. You’ll have a call with us and go over any questions you have about the company, culture, role, and more: everything you need to know to make a decision. The goal of this stage is to get you ready to receive an offer so that, assuming our offer exceeds your compensation needs, you will accept on the spot.

Final Stage: Acceptance

You get the formal offer. Our goal is to be in the 90th percentile when it comes to total market compensation. We hope you accept (you’ll have a short time to accept after you receive the offer so you need to be prepared to make a decision before we give it to you). We’ll discuss your start date (which can be the next day or many months later). And we’ll get your preferences for things like external monitor, keyboard, standing desk, etc.

The Transition:

Everyone at SafeGraph is super excited you’ve accepted your offer. As your start date approaches, we will send you relevant reading materials and begin discussing some potential projects to begin working on for your first week. We already know you are a self-starter, but we also know that providing the right context about SafeGraph’s vision and strategy is essential for you to hit the ground running. Of course, before you start we will get you all the paperwork, health benefits, payroll, laptop, and more so you can spend your first day at SafeGraph doing real work and immediately contributing to the team.

Your First Week

Once you start at the company, it is important you are doing meaningful work.
If you are an engineer, you’ll be pushing code on an important project your first week.Is SafeGraph the place for you?If this is exciting to you, come work with us! #### Owning the Work Marketplace: World of DaaS interview with Upwork CEO Hayden Brown New podcast with Hayden Brown, CEO of Upwork (NASDAQ: UPWK). Our conversation is available everywhere (Apple Podcasts, Spotify, YouTube, etc.). Please subscribe, follow, and review.‍ I am a massive fan of Upwork. One of SafeGraph’s values is to respect our own time — to get leverage — and Upwork helps us achieve this goal. SafeGraph has hired hundreds of freelancers on Upwork. So I was particularly excited to chat with Hayden about the freelance market. Here are some highlights from my conversation with Hayden Brown. Friction is a feature at Upwork ‍ Hiring freelancers has less friction than hiring a full time employee, but it’s not frictionless. Upwork includes higher friction experiences when they want to collect more information. When they onboard a new freelancer, they want to really understand them and their skill sets so they can effectively match them with opportunities. If they create a blank profile that won’t do anyone any good. Upwork doesn’t run any advertising for their supply side Hayden strongly believes that maintaining an equilibrium between the demand and supply side is essential to their marketplace’s success. But she also believes that one side will be more constrained than the other. For Upwork, that’s the demand side. Upwork is so well established as the dominant freelance marketplace, they do not have to run any paid marketing for their supply. That’s really interesting because most well-known marketplaces are more constrained by their supply than by their demand. Strong brands have the great benefit of significantly reducing the need for paid marketing as we’ve seen with the CEOs of Verisk and FICO on past World of DaaS episodes. 
Full-time employees should spend the majority of their time on highest-order thinking

In order to let your employees focus on your hardest problems, you need to automate repeatable work. When building a leverage stack, a company will start with low-code tools (for instance, I’m a huge fan of Airtable), then move on to APIs, then middleware solutions like Zapier, then freelancers, and lastly other employees. Smart companies do everything they can before hiring employees. Some make sure to exhaust work marketplaces, like Upwork. It’s completely agile and flexible. You can turn on and off the resources as your business needs change throughout the year, quarter, etc.

You don’t need someone with specific expertise to be your FTE

I don’t have a full-time doctor nor a full-time plumber. There's no reason to have someone full time who has this very, very specific knowledge. You can rent expertise … even deep expertise. Start-ups can REALLY benefit from renting experts and focusing their hiring on general athletes rather than position players.

Upwork has a lot of deep macroeconomic data. Upwork is basically running a $2.5 billion economy with all of this risk insight around labor pricing. Their Chief Economist ... interestingly an FTE ... publishes reports and is prolific on social media about their insights and connecting that to other studies. They publish through research studies and an annual freelance report. They can do what Redfin does for real estate.

Note: if you enjoy this episode of World of DaaS, be sure to follow Hayden Brown on Twitter. Hope you enjoy this episode of World of DaaS. I would really appreciate it if you subscribe and review (Apple Podcasts, Spotify, YouTube, etc.).

#### POI Data for Lead Generation: Automating CRM Enrichment and Territory Planning

Key Takeaways

POI data helps automate lead generation by keeping business data accurate and current. CRM records can be filled and updated automatically using location data.
Sales teams can spot new opportunities as businesses open, close, or change. Territory planning improves when real-world location details guide coverage. Data-driven lead generation reduces manual work and supports steady growth. The Supercharge Your Revenue with Places Data blog touched on why marketing and sales teams use POI (point of interest) data to identify ideal customer profiles (ICPs) and optimize the “top of funnel” pipeline. In this post, we will explore how marketing and sales teams leverage high quality POI data to automate lead generation through integrations that support CRM enrichment and more effective territory planning, helping scale the next steps in the sales process.  Automating Lead Generation with POI Data Lead generation depends on accurate and detailed business information. A high-quality POI dataset ensures your team spends less time on manual research and more time engaging prospects. CRM Integration Most CRM (Customer Relationship Management) tools require sales representatives to input a long list of details for each prospect: physical address, company website, phone number, business category, etc. A POI dataset containing this information can autofill these fields when integrated with CRMs — saving time and reducing data entry errors. Example: A company selling point-of-sale systems to fine dining restaurants in the Phoenix metro area might gain “The House Brasserie” as a qualified lead. SafeGraph Places Data can automatically populate the following fields: Location_name: The House Brasserie Street_address / city / region / postal_code: 6936 E Main St, Scottsdale, AZ 85251 Naics_code / sub_category / category_tags / service_options: 722511 / Full-Service Restaurants / Fine Dining, American Food, Steakhouse / Accepts Reservations, Offers Catering, Private Dining Phone_number: +1-480-634-1600 Website: https://thehousebrasserie.com Want to see how POI data can automate lead generation and CRM enrichment? 
Request a demo of SafeGraph Places Data to explore how accurate location data can support smarter prospecting and sales workflows. Request a Demo Sales Workflow Automation The best sales representatives look for ways to automate manual research so they can spend more time progressing deals. A high-veracity POI dataset with monthly updates enables teams to receive alerts about important account changes, supporting lead generation automation without manual effort. Example: A salesperson covering Home Depot doesn’t need to manually track openings and closings. With SafeGraph data, a newly opened store can be flagged as a new opportunity: Placekey: 222-224@3x5-6kn-9vf Brand: The Home Depot Street_address / city / region / postal_code: 4515 S Regal St, Spokane, WA 99223 Opened_on: 2025-06 With this automation in place, sales teams can act on new opportunities as soon as they appear. Common POI Data Inputs for Territory Optimization Category_tags, amenities, service_options: Differentiate similar businesses for more precise, data-driven lead generation, such as identifying sushi restaurants with outdoor seating versus burger restaurants that offer delivery. Opened_on / closed_on dates: Identify whether a market is expanding or contracting based on business openings and closures, helping teams prioritize high-growth territories. Wkt_area_sq_meters (polygon footprint): Estimate revenue potential per square foot or filter businesses by physical size to align territories with deal value. Latitude, longitude, and street_address: Accurately locate businesses to cluster opportunities, streamline travel, and design efficient sales territories. By incorporating these POI data inputs, sales leaders can build balanced, high-performing territories that support sales workflow automation, reduce wasted time, and maximize market coverage. Conclusion  Automating lead generation is most effective when it is based on accurate, real-world business data. 
POI data supports data-driven lead generation by keeping CRM information up to date, highlighting new opportunities as they appear, and making sales workflows and territory planning more efficient. When sales and marketing teams work with location data that reflects real market changes, they spend less time researching and more time connecting with the right prospects. This leads to a scalable and practical approach to lead generation that is easier to manage and delivers more consistent results. FAQ’s 1. What is POI data in lead generation? POI data provides structured, location-based business information that helps sales teams identify and target the right prospects faster. 2. How does POI data automate lead generation? By integrating with CRMs and workflows, POI data automatically fills business details and flags new opportunities without manual research. 3. How does POI data improve CRM accuracy? It keeps CRM records current by auto-updating addresses, categories, websites, and phone numbers, reducing manual errors.  4. How often is POI data updated? High-quality POI datasets are refreshed regularly, often monthly, so sales teams can act on real market changes. 5. How does POI data support territory planning? It helps sales leaders design balanced territories using business locations, openings, closures, and market density. 6. Is POI data useful for data-driven lead generation? Yes. POI data enables data-driven lead generation by linking sales decisions to real-world business activity. 7. Who benefits most from POI-based lead generation automation? Both small teams and large enterprises benefit by saving research time and scaling consistent, automated prospecting. 
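As a rough illustration of the CRM autofill and new-opening alert described in this post, the sketch below fills blank CRM fields from a POI row and filters a monthly POI delivery for recent openings of one brand. The function names, the dict-based record shape, and the lowercase keys (`brand`, `opened_on`, and so on) are assumptions made for the example; a real CRM integration would go through that CRM's own API, which is not shown here.

```python
# Fields taken from the examples in this post; the lowercase keys are an assumption.
POI_FIELDS = (
    "location_name", "street_address", "city", "region",
    "postal_code", "naics_code", "phone_number", "website",
)


def enrich_crm_record(crm_record, poi_row):
    """Return a copy of the CRM record with blank fields filled from the POI row."""
    enriched = dict(crm_record)
    for field in POI_FIELDS:
        # Only fill gaps, so values a rep entered by hand are never overwritten.
        if not enriched.get(field) and poi_row.get(field):
            enriched[field] = poi_row[field]
    return enriched


def new_openings(poi_rows, brand, since):
    """Rows for `brand` whose opened_on (a 'YYYY-MM' string) is on or after `since`."""
    return [
        row for row in poi_rows
        if row.get("brand") == brand and (row.get("opened_on") or "") >= since
    ]


poi_row = {
    "location_name": "The House Brasserie",
    "street_address": "6936 E Main St",
    "city": "Scottsdale",
    "region": "AZ",
    "postal_code": "85251",
    "naics_code": "722511",
    "phone_number": "+1-480-634-1600",
    "website": "https://thehousebrasserie.com",
}
crm_record = {"location_name": "The House Brasserie", "phone_number": "", "owner": "rep-17"}
enriched = enrich_crm_record(crm_record, poi_row)

monthly_rows = [
    {"placekey": "222-224@3x5-6kn-9vf", "brand": "The Home Depot", "opened_on": "2025-06"},
    {"placekey": "hypothetical-older-store", "brand": "The Home Depot", "opened_on": "2019-03"},
]
alerts = new_openings(monthly_rows, "The Home Depot", since="2025-06")
```

Filling only empty fields is a deliberate, conservative choice for the sketch; a team that trusts the POI feed more than manual entry might instead overwrite and log the change. Comparing `opened_on` as strings works only because the `YYYY-MM` format sorts chronologically.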
#### Points of Interest Data: 5 Most Common Use Cases

Key Takeaways

- Points of interest data forms the foundation of meaningful POI use cases across industries such as mapping, real estate, finance, CPG, and public health.
- Many POI use cases depend on accurate and frequently updated data, especially when paired with mobility datasets.
- POIs are dynamic, not static, especially in commercial and urban environments.
- Reliable POI data supports decision-making in mapping, real estate, CPG expansion, finance, and healthcare.
- Regular update cadence is critical to avoid outdated or misleading spatial insights.
- Strong POI use cases rely on consistent, up-to-date location data.

Humanity has known “where things are” since the dawn of cartography, with clay tablets from as far back as ancient Babylon detailing the precise locations of various sites around cities. In recent years, with the widespread proliferation of smart devices, we can now augment this geographic analysis with human movement at varying scales and over time. But the new availability of mobility data doesn’t mean points of interest are any less important.
In fact, the growing number of things we can do with geospatial data makes having reliable and accurate POI data even more critical.

If we stop to think about points of interest and the ways in which they’re collected, we’ll come to find that they are most valuable when combined with other data. Overlaying places with people, over space and over time, can have many applications, but without a solid foundation of places, there is not much to analyze.

Context Matters

Yes, mobility data is important. Its importance, however, is limited to revealing the spatial behavior of human beings (or cars or trains or critically endangered Russian eagles) in relation to physical places. But it’s important to note that if you don’t know where places are located, the mobility data is rendered largely useless.

Places Change

Another nuance is what we’ll call the fallacy of staticism (arguably not a word, though we found a document from Nietzsche employing it and feel we’ve got a green light on this one). By staticism we mean the idea that points of interest represent static, near-monolithic locations. This couldn’t be farther from the truth. Especially when it comes to commercial businesses, their physical locations are anything but static.
Look no further than this past year of tumult to validate our position that POIs are dynamic data points.

Once we acknowledge that businesses come and go, that points of interest change constantly, and the world as a whole lacks permanence, the next question becomes, ‘at what cadence are changes being observed and reported in the data?’ If you have mobility data detailing movement patterns throughout Manhattan with a daily update cadence, it would be largely useless if the POIs against which the data is intersected are more than a year old.

At SafeGraph, we recognize that the world is always changing, so we update our Places data every month to ensure that your decisions are based on facts about the physical world, not rusty pings from times long past.

Because everything has a geography, we learn of new uses for points of interest data every day. But five use cases for places data do come up more often than others. While far from exhaustive, here’s a breakdown of the primary roles that POIs play in driving key analyses across five sectors of business.

Points of Interest for Mapping

At the risk of using a convoluted term here, we’ll refer to companies with a direct focus on mapping products in the geospatial industry. Think of the mapping applications you use on your phone to find a gas station or coffee shop. The goal is to “know where things are” and update information about those things. If a store name changes or a place goes out of business, clients of mapping companies, whether individual users or enterprise clients, will want to know about it. Analyses are often performed on the data but the main goal always begins with knowing the location of as many places on Earth as possible.

Points of Interest for Real Estate

Most Starbucks in this part of Chicago fall into an “Uptown Individuals” Esri Tapestry lifestyle segment.
Real estate analysts may expect areas surrounding Starbucks stores to attract people from that lifestyle segment in the future, or decide to open a new store in a similar area.

Real estate firms want to know how patterns in people’s movement and the presence (or lack thereof) of certain businesses can act as proxies for growth in a given region. The classic example here is following certain brands’ physical expansion into new neighborhoods, the logic being that as popular brand A arrives, its target demographic inevitably follows. These sorts of analyses can be performed with points of interest data in both positive and negative scenarios, allowing real estate professionals to also identify areas to hold off on investing in for the time being.

##### Points of Interest for CPG

Without POI data, CPG brands would not be able to analyze purchasing power related to a specific location and how that could impact their sales.

The consumer packaged goods space covers everything from cookies to paper towels and beyond. Among CPG brands we often see a desire to more accurately identify and predict their total addressable market. This is the hypothetical denominator of all businesses that could carry a given product. Reaching this figure and expressing it spatially allows companies to better focus their expansion efforts at scale. CPG firms can also use points of interest data at a much more granular level to plan out expansion within a given city; for example, a candy company looking to sell into convenience stores in Chicago starts by identifying clusters of corner stores from neighborhood to neighborhood.

##### Points of Interest for Financial Institutions

POIs help financial institutions understand where to invest based on proximity to competitors or key socioeconomic factors.

The term ‘financial institutions’ is intentionally vague and refers to every arm of finance and capital markets, as well as the ancillary offerings that serve them.
Think investment banks, retail banks, PE firms, hedge funds, and management consultants. The applications of points of interest data for these organizations are as varied as the organizations themselves, but generally speaking, the focus is on understanding how variations in locations across space and time can be used to predict implications for investments. For example, tracking the openings and closures of a particular segment of business across an entire country can better equip a team to decide whether to expand its investment in public equities of that sector. These same analyses can be run at a much more granular level, sometimes even down to the block.

##### Points of Interest for Healthcare or Public Health

Visualizing concentrations of healthcare facilities helps researchers identify underserved areas and populations.

When we think about health in the context of location data, the more common association in discussion is with mobility data. It makes sense: understanding how people move around and where they do and do not spend time can shed light on myriad public health inquiries. Yet mobility data cannot fully shine unless it can be related to a given subset of POIs. The value of points of interest data in a health context, by contrast, is in no way dependent on using it with mobility data. Many use cases require none at all. Consider the measurement of access to primary health care in a given part of the country. The real inputs we’d hope to have on hand are the places themselves, then demographic data on what kinds of people and how many live within a catchment area or other radius around each location. We might also look at survey data detailing what modes of transportation are available and preferred by the local population to further inform the parameters used to create the catchment areas.

##### A Geospatial Foundation

Points of interest are critical for geospatial analysis in any industry.
Without reliable places data, mapping, real estate property management, financial analysis, CPG planning, and public health initiatives would lack essential information for identifying areas of opportunity and risk. It’s a quirky trend in the world of location analytics that information on how people move about often occupies more of the spotlight than information on what places lie where. But it’s important to remember that without POI data, mobility data would likely lead to stale insights without context.

##### FAQs

1. What is points of interest (POI) data?
   POI data represents physical locations such as businesses, public facilities, landmarks, and services, along with attributes like category, address, and operational status.
2. Why is POI data important for geospatial analysis?
   POI data provides context to spatial patterns. Without accurate location data for places, movement, demographic, or economic analyses lose meaning.
3. How is POI data used alongside mobility data?
   Mobility data shows how people move, while POI data explains where they are going and why those destinations matter.
4. Which industries rely most on POI data?
   Mapping services, real estate, consumer packaged goods, financial institutions, and public health organizations are among the most frequent users.
5. Why does POI data need frequent updates?
   Businesses open, close, and relocate constantly. Outdated POI data can lead to incorrect conclusions and poor strategic decisions.

#### Polygon Data: Top 3 Use Cases

Any geographer knows points, lines, and polygons form the basis of vector data analysis. These three data types enable data scientists to effectively map, visualize, and model physical attributes on Earth’s surface. When it comes to physical places, polygon data is essential for getting a true representation of a property or building. While point data for places has its uses in competitive analysis, creating trade areas, and other key business intelligence operations, there are many mission-critical use cases that require building footprints or geometry.

Geometry reveals more detailed spatial relationships than point data alone.

##### Precision Matters for Polygon Data

Esri defines a polygon as a GIS object that stores its geographic representation (a series of x and y coordinate pairs that enclose an area) as one of its properties (or fields) in the row in the database. While there are many ways to represent a point, such as a building centroid, street-level geocode, or address geocode, polygons by definition must represent the physical boundaries of a property or building. There is no “close enough” for polygons. If it’s inaccurate, it’s wrong.

The main reason why precision and accuracy are so fundamental to polygons is that the most common use cases for geometry or building footprints require a truthful representation.

##### Top 3 Use Cases for Polygon Data

###### 1. Mapping

This one may be obvious, but should not be overlooked.
While all geospatial data is intended to be mapped in some capacity, mapping polygons almost always means creating an accurate visualization of a property or building’s boundaries. When creating a map, if the decision is made to use polygons over points to represent places, there is most likely a desire to represent how properties relate to one another in the most detail possible. While point data is helpful for many use cases, geometry provides a more accurate representation of what a place really looks like.

Mapping building footprints or geometry reveals important relationships between physical places. With detailed spatial hierarchy information, polygon data can be mapped to show which places are located within another place. Accurate and precise geometry also enables visualization of co-tenancy or adjacency, which can inform site selection or risk assessment strategies. Mapping polygons also provides information related to the accessibility of a place, such as where an entrance is located or where the closest parking lot is for that location.

###### 2. Visit Attribution

Polygon data can also be more informative of consumer interaction with a place than point data alone. This is particularly true for advertisers or retailers deriving store visit attribution. Mobility data makes it possible to see how consumers move throughout the day, but without contextual location information, that movement is meaningless. When combined with polygons, mobility data shows which places people visit and how long they stay there.

Visit attribution with accurate building footprints is more precise than using a point or centroid radius.

But as with mapping, this information is only valuable if it is accurate and precise. Accurate polygons are critical for correctly attributing visits to specific places.
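A small Python sketch can make the difference between the two attribution methods concrete. Everything in it is an illustrative assumption: the two adjacent storefront polygons, the 12-meter radius, and the ping coordinates are made up (a flat local grid in meters, not real SafeGraph data), and the ray-casting test is a textbook point-in-polygon check, not a description of how SafeGraph attributes visits.

```python
from math import hypot

def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon (list of (x, y) vertices)."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray going right from (x, y) cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def centroid(polygon):
    """Vertex average; adequate for the simple rectangles used here."""
    xs, ys = zip(*polygon)
    return sum(xs) / len(xs), sum(ys) / len(ys)

# Hypothetical adjacent storefronts in one strip mall (meters on a local grid).
places = {
    "nail_salon": [(0, 0), (10, 0), (10, 20), (0, 20)],
    "grocery": [(10, 0), (40, 0), (40, 20), (10, 20)],
}

def attribute_by_polygon(x, y):
    """Places whose footprint contains the ping."""
    return [name for name, poly in places.items() if point_in_polygon(x, y, poly)]

def attribute_by_radius(x, y, radius=12.0):
    """Places whose centroid is within `radius` meters of the ping."""
    hits = []
    for name, poly in places.items():
        cx, cy = centroid(poly)
        if hypot(x - cx, y - cy) <= radius:
            hits.append(name)
    return hits

# Three made-up pings: just inside the grocery, in a far corner of the grocery,
# and inside the nail salon.
for ping in [(12, 10), (38, 2), (9, 10)]:
    print(ping, "polygon:", attribute_by_polygon(*ping), "radius:", attribute_by_radius(*ping))
```

In this toy setup the first ping sits inside the grocery but is credited to the nail salon by the radius method, and the second ping is inside the grocery but matches no centroid radius at all. Only the third ping is attributed the same way by both methods.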
Visit attribution that relies on a centroid radius is likely to both under- and over-attribute visits to places, particularly when places are located close together or within the same structure. Spatial hierarchy and accurate polygons ensure visits are correctly attributed to places, which boosts the efficacy of location-based marketing and retail analytics.

###### 3. Insurance Risk Assessment

Polygon data’s ability to showcase co-tenancy and adjacency with precision is fundamental to insurance risk assessment. Measuring and modeling property risk is dependent on what is going on at or near a specific location. For example, a nail salon that opens in a strip mall next door to a fireworks store will have a higher risk profile than one that opens across the street, or next to a grocery store.

Geometry data is the best way to assess co-tenancy and adjacency risk.

In an increasingly competitive and data-driven insurance market, insurers must stay on the cutting edge of risk assessment and modeling to win and retain customers. There is no room for error, or data that is “good enough.” Under-assessing a property’s risk can lead to increased exposure for the insurer, while over-assessing can lead to customer churn and dissatisfaction. Polygons give insurers the precision they need to ingest into their models to assess risk with confidence.

##### SafeGraph Polygon Data is Accurate and Precise

Our sole focus is places data. From understanding where POIs are located, to what their boundaries look like and how they relate to places around them, and even how consumers interact with them, SafeGraph is the expert in physical places. This maniacal focus enables us to curate and deliver polygons of the highest accuracy and precision to power mapping, visit attribution, risk assessment, and more.

SafeGraph Geometry data provides building footprints or geometry for POIs in the US, Canada, and UK, along with essential metadata.
We include spatial hierarchy information to help users understand the relationship between places within the same polygon, as well as brand and parking lot details. Our metadata also includes a field that indicates whether the polygon was generated from machine learning or hand-drawn, providing full transparency into our data curation process.

More about polygon data:

- SafeGraph Places Story Map
- Building Footprints: Essential Data for Accurate Geospatial Analysis
- Geometry: The Anchor of SafeGraph Places
- Data schema

#### Product Spotlight: Enhanced Filtering with SafeGraph’s Category Tags & Amenity Data

Isolating businesses by location-specific characteristics is a long-standing industry need that is riddled with data access challenges. Knowing the general category of a business is a start, but it leaves much to be desired when thousands of businesses share a similar category. A narrower set of point of interest (POI) features is required to truly explain differences in business outcomes, customer profiles, etc. Does it offer delivery? Is it family-friendly? Can I pay in cash? These types of questions are critical across user search, real estate site selection, audience creation, and compliance.

To make these details more accessible, SafeGraph revamped its Category Tags feature and introduced Amenity Columns: a structured way to query key attributes about physical places. These updates complement our broader effort to improve location intelligence with richer category and amenity data, making it easier to filter places based on the nuances that matter.
They are available for all bar, restaurant, cafe, hotel, and retail trade places globally.

##### Why This Matters

Traditionally, many business characteristics such as “accepts reservations,” “has Wifi,” or “cocktail bar” were buried in free-text fields or inferred from inconsistent metadata, which made them difficult to analyze at scale.

SafeGraph’s POI Category Tags and Amenity Columns solve this by:

- Structuring attributes into logical columns
- Adhering to a normalized set of “reader friendly” string values
- Linking them to all other contextual attributes like location name, brand, address and coordinate info, open hours, website, phone number, category, etc.

##### What’s Included

We’ve organized these rich attributes into eight intuitive columns:

| Column | Answers the question... | Example Values |
| --- | --- | --- |
| category_tags | What words or phrases best describe this place, or what type of products does it sell? | Latin American Food, Mexican Food, Oaxaca Food, Tacos, Sports Bar, Cocktail Lounge, Auto Parts & Accessories |
| accessibility | Can I get there and get around? | Parking, Wheelchair Accessible Restroom |
| activities | What can I do there? | Karaoke, Pool and Billiards, Art Classes |
| amenities | What resources are available? | High Chairs, Wifi, TV, Spa, Outdoor Seating |
| owner_demographic | Who runs the business? | Women-owned, Veteran-owned |
| payment_options | How can I pay? | Accepts Cash, Accepts Credit Cards, Accepts Apple Pay |
| service_options | What kind of service is offered and how? | Accepts Reservations, Breakfast, Delivery, Drive Through, Vegan Options |
| setting | What’s the environment like? | Family Friendly, Moderate Noise, Touristy, Upscale |

All values are consistently formatted and optimized for large-scale querying.

##### How to Use the Data

These new features aren't just easier to interpret; they’re designed for direct integration into your existing analytics workflow.

Whether you're running SQL queries, training ML models, or enriching location search experiences, actionable insights are easy to extract.
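As a rough illustration of this kind of filtering outside SQL, here is a minimal Python sketch. The sample rows are made up, and several things are assumptions rather than documented behavior: that each attribute column arrives as a list of strings, and that the place name and state live in fields called `location_name` and `region`. The attribute column names and example values are taken from the table above.

```python
# Made-up sample rows. `location_name` and `region` are assumed field names,
# and list-of-strings columns are an assumed delivery format.
places = [
    {
        "location_name": "Example Sports Bar",
        "region": "NY",
        "category_tags": ["Sports Bar"],
        "amenities": ["TV", "Wifi"],
        "accessibility": ["Parking"],
        "service_options": ["Accepts Reservations"],
    },
    {
        "location_name": "Example Dive Bar",
        "region": "NY",
        "category_tags": ["Sports Bar"],
        "amenities": ["TV"],  # no Wifi, so it should be filtered out
        "accessibility": ["Parking"],
        "service_options": [],
    },
    {
        "location_name": "Example Cafe",
        "region": "CA",
        "category_tags": ["Tacos"],
        "amenities": ["Outdoor Seating", "Wifi"],
        "service_options": ["Delivery"],
        "setting": ["Family Friendly"],
    },
]

def has_all(values, required):
    """True if every required value appears in the list (case-insensitive)."""
    present = {v.lower() for v in values or []}
    return all(r.lower() in present for r in required)

def filter_places(rows, region=None, **required_by_column):
    """Keep rows in `region` whose columns contain all the required values."""
    matches = []
    for row in rows:
        if region is not None and row.get("region") != region:
            continue
        if all(has_all(row.get(col), req) for col, req in required_by_column.items()):
            matches.append(row)
    return matches

# Sports bars in New York with a TV, Wifi, parking, and reservations.
sports_bars = filter_places(
    places,
    region="NY",
    category_tags=["Sports Bar"],
    amenities=["TV", "Wifi"],
    accessibility=["Parking"],
    service_options=["Accepts Reservations"],
)
print([p["location_name"] for p in sports_bars])
```

Because every attribute is a normalized string in a known column, each extra requirement is just one more keyword argument; a missing column on a row is treated as "no values" rather than an error.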
The following real-world scenarios show the kinds of SQL queries that combine Category Tags and Amenity Columns to narrow in on specific sets of locations:

1. Find sports bars in New York that have a TV, wifi, parking, happy hour, burgers, and accept reservations
2. Identify cafes in California that offer bagels, delivery, have outdoor seating, and are family friendly with a casual vibe
3. Identify Department Stores in Texas that sell perfume

##### Real World POI Use Cases

###### Marketing & Advertising

- Build more accurate, custom location-based audience segments
- Optimize asset selection for OOH campaigns with more precision
- Filter out sensitive locations like ‘Kid Friendly’ or ‘LGBTQ Friendly’

###### Maps and Generative AI Search & Discovery

- Surface relevant results from narrow, place-based search queries (e.g., ‘Quiet’ cafes with ‘Outdoor Seating’ and ‘Wifi’)
- Tailor results by user preferences
- Enhance AI-search capabilities with local intent and better contextual recommendations

###### Retail & Real Estate

- Prioritize expansion into areas with specific demand attributes (e.g., restaurants with ‘Drive Through’ and ‘Delivery’)
- Understand competitor footprint by absence or presence of specific store attributes
- Identify gaps in service (e.g., ‘Live Music’ venues in high-density zones)

##### Build with Better Context

With SafeGraph’s POI Category Tags & Amenity Columns, you no longer need to guess what a location offers. You can know.

1. Explore Category Tags & Amenity Columns in our Docs Site
2. Talk to our team to learn more

#### Product Spotlight: Enriching Transaction Data with SafeGraph Places

Enriching transaction data with structured place information is critical for accurate analytics, fraud detection, and AI workflows. SafeGraph Places provides verified store-level context to help fintechs, payment processors, and data platforms clean and standardize messy merchant strings at scale.

In transaction data, the same physical store can appear under dozens of messy, inconsistent names.
These discrepancies aren’t just cosmetic: they introduce real friction in analytics, UX, and risk analysis. In this blog, we’ll walk through a few real-world examples of how messy merchant strings create friction, and how structured enrichment solves it.

##### Real-World Matching Problems and Use Cases

1. Messy Strings for the Same Store. A single Whole Foods location might appear as:
   - WHLFDS #1615
   - Whole Foods Market ATL
   - WholeFoods 1615
2. Brand Confusion Across Sources. "Ashley Home Store" vs. "Ashley Furniture" vs. "Ashley Outlet": all the same brand, but difficult to reconcile without identifiers.
3. Hidden Geographic Impossibilities. A card used in Atlanta (WF MKT 1615) is used again 20 minutes later in Chicago (WholeFoods #2921). Without place context, fraud like this could be missed.

These examples highlight a common challenge: raw transaction strings are messy and inconsistent, making it hard to extract accurate merchant-level insights. That’s where enrichment comes in.

Here’s how teams are already using SafeGraph’s enriched POI data to solve this:

###### Payment Processors

- Improve statement accuracy and merchant-level reporting
- Reduce error rates in merchant ID mapping
- Power loyalty programs linked to merchant identity

###### FinTech Platforms

- Drive PFM features like budgeting and spend tracking
- Normalize merchant names for improved UX

###### Merchant Intelligence Teams

- Benchmark brand or store performance by region
- Enable accurate parent company roll-ups

###### AI & ML Teams

- AI models like LLMs, fraud classifiers, and recommendation engines are only as strong as the data behind them
- Noisy merchant strings and mismatched locations reduce accuracy and add entropy
- SafeGraph’s structured POI data improves signal-to-noise ratio, powering smarter, more context-aware AI systems

###### Spend Analytics / Consumer Research

- Categorize merchants cleanly by industry or vertical
- Tie real-world behavior (visits) to spend behavior

📌 Case Study: Plaid, a leading fintech infrastructure provider, uses SafeGraph Places data to improve the
precision of their transaction enrichment. By integrating store_id and other SafeGraph attributes, Plaid connects roughly 50% of card-present transactions to verified merchant locations, significantly improving match accuracy while reducing manual cleanup cycles. (Read the case study)

##### How SafeGraph Helps

SafeGraph’s store_id, name_aliases, mcc, and naics_code columns anchor ambiguous transaction strings to a single, verified POI.

The result:

- ✅ Fewer false positives
- ✅ Cleaner attribution
- ✅ Stronger fraud signals
- ✅ Better analytics
- ✅ More usable user experiences

##### 🔑 Key Columns for Transaction Enrichment

###### store_id

Definition: The unique ID associated with the store as provided and maintained by the store/brand itself.

Most store_ids are alphanumeric and can be found directly on official store locators. Some are displayed in plain sight next to the store name while others are embedded within the store locator URL or other non-obvious places. For example, the store_id for a Dunkin’ store is “352872” (appended at the end of the URL).

store_id is especially useful as a join key when working with transaction data. For example, “TJ256Y8” may be the only location-specific information within a transaction dataset. A Places dataset that also contains "TJ256Y8" as a store_id enables a join to contextualize transaction data (or other internal, store-level data) with SafeGraph places information.

###### name_aliases (*Beta release available for testing and feedback!)

Definition: An array of alternative names for the place. These can include common colloquial names, registered business names, store-specific names within a brand, and/or parent company names. Values are ordered by string length from smallest to largest.

Do you call your local coffee shop by its old name while transplants call it the updated name under new ownership? Is it "Ashley Furniture" or "Ashley Home Store," and are both correct?
These sorts of name differences permeate transaction data, making joining to places data especially challenging.

We’re excited to bridge this gap through name_aliases, which can accelerate training entity resolution models or building embeddings that support semantic search and categorization for fintech use cases.

###### mcc

Definition: An array of Merchant Category Codes describing the business.

Merchant Category Codes (MCCs) classify businesses based on the type of goods or services they offer. These codes are widely used across payment ecosystems to categorize, track, and analyze transactions.

The mcc column in SafeGraph contains an array of all known MCCs associated with the business, ordered from most to least commonly used at the point of sale. Example: [5942, 5814, 5945].

###### naics_code

Definition: SafeGraph uses the 2017 North American Industry Classification System (NAICS) developed by the US Census Bureau.

NAICS codes are hierarchical:

- 72 = Accommodation and Food Services
- 722 = Food Services and Drinking Places
- 7225 = Restaurants and Other Eating Places
- 722513 = Limited-Service Restaurants

While developed for US industry classification, NAICS has proven effective internationally and will continue to be our core classification system until further notice.

##### 📊 About the Columns

| Column | Purpose | Example Values |
| --- | --- | --- |
| store_id | Unique brand-assigned ID for the location | 11615, TJ256Y8, 0021 |
| name_aliases | Common alt names for the same business | "Ashley HomeStore", "Ashley Outlet" |
| mcc | POS merchant type classification | [5942, 5814, 5945] |
| naics_code | Industry classification (US Census standard) | 722513, 445110, 452311 |

##### 🧪 How to Use the Data

These fields are designed for use directly in enrichment workflows. Typical steps for joining and filtering your transaction logs:

1. Join transaction logs to Places using store_id
2. Match merchant names against known aliases
3. Filter enriched transactions to Whole Foods locations in Texas

##### Turn Raw Transactions into Reliable Insights

With store_id and name_aliases, SafeGraph provides scalable tools to turn noisy merchant descriptors into precise POI matches. As AI continues to reshape fintech and commerce, data cleanliness becomes a competitive advantage. Enriched POI data from SafeGraph helps ensure your models, agents, and analytics operate on trusted, contextualized merchant signals.

#### Product Spotlight: How Geocoded Address Data Powers Global Location Intelligence

Expanding into global markets takes more than ambition. It requires a solid foundation of reliable infrastructure, and at the core of that is accurate address data.

In many regions around the world, clean, standardized, and geocoded addresses are difficult to find. Across LATAM, MENA, Southeast Asia, and Eastern Europe, public records are often incomplete. Address formats vary not just from country to country, but often within countries themselves. Some regions rely on informal or local naming conventions, while others use entirely different addressing systems that apply only to specific areas. These localization challenges make it difficult to verify or normalize addresses at scale, especially when trying to support global operations with a single system.

Without a trusted source of address data, global operations run into delivery failures, compliance gaps, and poor user experiences. In this post, we’ll show how SafeGraph solves these challenges with structured, geocoded address data, and walk through a real example from French Polynesia.

##### The Problem: Inconsistent Address Data Creates Real-World Risk

For many companies, expanding internationally means stitching together fragmented local data sources, manually verifying addresses, or relying on customers to input clean data, none of which scale.
Without consistent, verified address data:

- Mapping tools fail to locate addresses accurately
- Deliveries go to the wrong location or fail altogether
- Onboarding systems reject valid users due to formatting issues
- Compliance teams lack confidence in user location

The result is higher operational costs and a worse user experience in the very regions where growth matters most.

##### The SafeGraph Solution: Verified, Structured, and Scalable Address Data

SafeGraph provides a single source of high-quality, geocoded addresses, especially in countries where traditional providers fall short. Each address in our dataset goes through a multi-layered verification process:

- AI-powered validation: We use machine learning to detect and correct anomalies, match address strings to precise coordinates, and assess consistency across sources.
- Local signal confirmation: When applicable, we incorporate on-the-ground confirmations and various signals to verify that an address actually exists.
- Standardized schema: All address records follow the same structure, including fields like primary_number, street, city, region, postal_code, country, latitude, and longitude.

We continuously expand our global coverage based on customer demand, and offer custom sourcing for countries not yet supported.

##### Available in 25+ Countries

Our global address dataset includes support for dozens of hard-to-source countries, including:

- Turkey
- Morocco
- Israel
- Bulgaria
- Bosnia
- Georgia
- Albania
- El Salvador

We recently added support for French Polynesia, a remote and complex region that illustrates the quality and flexibility of our sourcing, and we are always adding more countries based on customer demand.

##### French Polynesia: A Real-World Example

French Polynesia spans over 100 islands, many of which have no formal address systems. As a result, large parts of the country have no addresses to capture.
SafeGraph provides strong coverage in the populated areas where address systems do exist, offering verified, high-precision data where it matters most.

##### Use Cases We Support

Teams across industries use SafeGraph’s global address data to power:

- Mapping & Navigation: Search, autocomplete, and routing work best when addresses are both accurate and widely available. Poor coverage limits what users can find. By expanding address coverage in hard-to-source regions, SafeGraph increases the success rate of location-based services.
- Rideshare & Delivery: Reduce failed pickups and missed deliveries by routing drivers to exact building centroids rather than vague street-level estimates. Expanding address coverage is just as critical: if an address isn’t in the database, drivers can’t be routed to it at all.
- Address Verification: Normalize and validate user-submitted addresses worldwide in real time, reducing form errors and fraud risk. Broader coverage improves the success rate of verification requests, especially in regions where address data is typically incomplete or unavailable.
- Shipping & Logistics: Improve delivery accuracy, reduce customer service escalations, and cut down on re-delivery costs. More precise address data also leads to better shipping quotes by enabling accurate distance and cost estimations upfront.
- Fraud Detection & Market Expansion: Confirm the real-world existence of locations to fight synthetic identity fraud and assess market viability. Expanding address coverage in emerging markets makes it easier to evaluate fraud risk and identify real opportunities with greater confidence.
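Two basic operations sit behind most of these use cases: finding addresses within a radius of a point, and checking whether a given address exists. A minimal Python sketch of both follows. The records and coordinates are invented placeholders, the field names follow the standardized schema fields listed earlier, and the string normalization is deliberately naive (real address verification needs locale-aware parsing), so treat it as an illustration under those assumptions and not as SafeGraph's implementation.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two latitude/longitude points, in kilometers."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

# Invented records; street names and coordinates are placeholders.
addresses = [
    {"primary_number": "12", "street": "Rue Exemple", "city": "Papeete",
     "country": "PF", "latitude": -17.5350, "longitude": -149.5690},
    {"primary_number": "3", "street": "Avenue Fictive", "city": "Papeete",
     "country": "PF", "latitude": -17.5400, "longitude": -149.5600},
    {"primary_number": "8", "street": "Route Imaginaire", "city": "Vaitape",
     "country": "PF", "latitude": -16.5070, "longitude": -151.7500},
]

def addresses_within(rows, lat, lon, radius_km):
    """Return (distance_km, row) pairs inside the radius, nearest first."""
    hits = []
    for row in rows:
        d = haversine_km(lat, lon, row["latitude"], row["longitude"])
        if d <= radius_km:
            hits.append((d, row))
    return sorted(hits, key=lambda pair: pair[0])

def normalize(number, street, city):
    """Naive comparison key: lowercase and collapse whitespace."""
    return " ".join(f"{number} {street} {city}".lower().split())

def address_exists(rows, number, street, city):
    """True if a record matches after naive normalization."""
    wanted = normalize(number, street, city)
    return any(normalize(r["primary_number"], r["street"], r["city"]) == wanted for r in rows)

nearby = addresses_within(addresses, -17.5350, -149.5690, radius_km=2.0)
print([(round(d, 2), r["street"]) for d, r in nearby])
print(address_exists(addresses, "12", "rue  exemple", "PAPEETE"))
```

A linear scan like this is fine for a sample file; at production scale the same logic would normally run inside a database or spatial index.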
##### Example Queries

- Example 1: Search for nearby addresses within a radius
- Example 2: Verify if a specific address exists

##### The Outcome: Launch Faster and Operate with Confidence

By using SafeGraph as your global address provider, you can:

- ✅ Consolidate vendors and simplify sourcing
- ✅ Free up internal data science and ops resources
- ✅ Speed up market entry and product launches
- ✅ Improve user experience and reduce errors
- ✅ Minimize fraud and compliance risk in new markets

##### 📥 Download the Free French Polynesia Dataset

We’re offering a free sample of SafeGraph’s address data in French Polynesia. See the structure, precision, and verification quality firsthand, and test it in your own systems.

Download the dataset

Need coverage in a different country? Let us know and we’ll prioritize it for sourcing.

#### Reasons Why Data Scientists Choose SafeGraph Data

##### Key Takeaways

- Choosing a data partner is as important as choosing the data itself, especially in location intelligence.
- Data quality, accuracy, and joinability matter more than sheer data volume.
- SafeGraph focuses exclusively on building clean, reliable, and up-to-date data on physical places.
- High-precision brand identification and POI polygons improve analytical accuracy.
- Frequent updates and transparent documentation help data scientists trust and operationalize the data faster.

SafeGraph’s laser focus on clean, accurate, and up-to-date data related to physical places makes it a trusted leader in the growing data market.

Just as not all data sources are created equal, neither are all data companies. So if you are currently looking to partner with a data company, you need to be crystal clear about what your data-related goals are and what you hope to achieve by working with specific datasets.

As you begin shopping around for a data partner, it may seem like getting access to more data would be the better option.
A word of caution: More data isn’t always better; if the quality and accuracy aren’t there, then no amount of data is going to do your organization any good.

In fact, if the data supplied is fundamentally flawed or can’t easily be joined to your own datasets, you risk wasting hours cleaning it to make it useful. And in the worst-case scenario, bad data can lead you to make bad decisions that can be both costly and potentially harmful, both to your organization and to the people impacted by those decisions.

Long story short: Choosing the right data partner is not something that should be done haphazardly. Especially in the location-data space.

##### Make SafeGraph your go-to for location-based data

As you already know, at SafeGraph, we are 100% focused on data. That’s all we do, and our goal is to do it better than any other data company out there.

All of our resources are dedicated 24/7 to the production and delivery of the highest quality, most accurate location-based data available to our customers in the U.S., Canada, and now, the UK, too. We don’t let ourselves get distracted by an array of ancillary “services,” like consulting or designing fancy graphics, that many other data companies offer. It’s simply a distraction that takes us away from our mission of democratizing access to clean and accurate data for all.

We often like to think of ourselves as an ingredient in a recipe. When you put the highest quality ingredients into what you’re making for dinner, you’re going to have a higher quality meal overall. The same applies to our business. The higher quality the data we provide to our customers and partners, the higher quality solutions, insights, decisions, and innovations they’ll be able to drive from it.
It’s really just as simple as that.

And if you didn’t hear the big news, we just closed a round of Series B funding, which will open up even more opportunities for us to double down on what we do best: data.

But aside from our core foundations and values, there are a few primary reasons why we’ve quickly become the go-to choice for data scientists and their location-based data needs:

###### 1. Accurate brand identification

The SafeGraph Places dataset captures core location information, spatial hierarchy metadata, and a lot more for over 6,200 brands. In fact, our data is so accurate that it can identify 95% of a brand’s actual locations (when compared to first-party data from that brand). That level of accuracy is truly unprecedented in the location-based data space. And because our data has delivered this level of accuracy consistently, over and over again, we’ve built an incredible amount of trust with our customers. They come to us because they know what our data is capable of helping them to achieve.

The visual below is a great example of this in action. It shows how our data can accurately identify the 14,884 Starbucks locations across the U.S.

SafeGraph Places data identifies nearly 15,000 Starbucks locations in the US.

###### 2. High-quality polygons

Every single point of interest (POI) in our Places dataset is precisely matched with its best-fitting polygon. More often than not, this aligns with the POI’s building footprint. However, there are a number of outliers that we have to look at in a lot more detail to ensure this information is being accurately represented. This includes things like strip malls or shopping centers where multiple businesses often live under a single roof.

The truth is, this is one of the most technical and cumbersome aspects of the location-data puzzle. High-quality polygons are extremely difficult to both engineer and produce. But when done right, it makes all the difference.
This is also why a big chunk of our engineering team focuses squarely on this work. For us, it really is that important to get right.A glimpse into SafeGraph’s approach at building accurate POI polygons.‍3. Constant updatesWe release a fresh, updated, and ever-growing batch of SafeGraph Places data every month. For us, this is absolutely critical. In fact, at the beginning of the COVID-19 pandemic, the need for accurate location-based data was more important than ever, so we updated our dataset weekly to ensure that governments, academic institutions, non-profit organizations, and businesses had the most up-to-date data to work with at all times.As a point of comparison, most other data companies update their data only once per quarter or even once per year. We don’t think that’s enough. Especially at a time like this when, unfortunately, we’ve seen a lot of volatility in the market, with so many businesses opening and closing as a result of the pandemic.The key takeaway here is simple: With so much changing at an ever-increasing pace, everyone needs access to the most accurate data in order to drive the most actionable insights.The SafeGraph team is proactive in notifying users of the latest Places dataset releases.‍4. Documentation transparencyOur goal is to make our data as easy to use as possible right from the get-go. So, our product team has gone the extra mile by putting everything that a data scientist—or anyone visiting our site—could ever need right on our SafeGraph Places Documentation page. This includes our schema, summary, statistics, data science resources, API documentation, and more.And because we are laser-focused on ensuring that our customers get access to only the most accurate data at all times, we proactively alert customers whenever we find an error in the data to both make them aware of it and to let them know that we’re on a fix. 
We understand the impact that even the seemingly smallest errors can make, so we’ve made it a priority to maintain the highest level of transparency around all matters pertaining to our data.

A brief snapshot of SafeGraph Places documentation.

Choose SafeGraph for your location-based data needs

While all of the reasons above explain why you should partner with SafeGraph for your location-based data needs, the thing that should stand out in your mind (forgive us for saying this on repeat) is this: SafeGraph provides the most comprehensive, accurate, and up-to-date dataset on physical places in the market today.

We get it: seeing is believing. Schedule a demo today.

FAQs

1. Why do data scientists prioritize data quality over data volume?
Large datasets offer little value if they are inaccurate, outdated, or difficult to join with existing data. Poor-quality data increases cleaning time and risks flawed analysis.

2. What makes SafeGraph’s Places data reliable?
SafeGraph emphasizes accurate brand identification, precise POI polygons, and frequent updates to reflect real-world changes.

3. How often is SafeGraph Places data updated?
The dataset is updated monthly, with more frequent updates introduced during periods of rapid change, such as the COVID-19 pandemic.

4. Why are POI polygons important in location-based analysis?
High-quality polygons allow analysts to understand how people interact with physical spaces more precisely than point-only location data.

5. How does documentation transparency benefit data scientists?
Clear documentation reduces onboarding time, improves usability, and builds trust by openly communicating schema details, limitations, and corrections.

#### Reimagining Democracy with Quadratic Funding and Quadratic Voting: World of DaaS interview with Microsoft OCTOPEST, Glen Weyl

New podcast with Glen Weyl, Microsoft’s OCTOPEST. Our conversation is available everywhere (Apple Podcasts, Spotify, YouTube, etc.). Please subscribe, follow, and review.

Glen is Microsoft’s Office of the Chief Technology Officer Political Economist and Social Technologist (OCTOPEST), where he advises Microsoft’s senior leaders on macroeconomics, geopolitics and the future of technology. He is also the author of Radical Markets, one of the more thought-provoking economics books. Here are some highlights from my conversation with Glen Weyl.

Quadratic voting wants to understand how much you care

Traditional democratic voting systems only ask us to share our favorite option. In San Francisco, we have preferential voting, which asks us to rank the options from our favorite to least favorite. But this still misses how much we care. Quadratic voting, however, draws out exactly how much you care. People receive a pool of credits to allocate, but each additional vote on the same issue is quadratically more expensive: one vote costs one credit, two votes cost four credits, three votes cost nine credits, and so on. You are incentivized to have a little bit of influence on a lot of things.
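The cost rule described above (one vote for one credit, two votes for four, three for nine) can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the 100-credit budget and the issue names are made-up assumptions, not details from the interview.

```python
import math


def vote_cost(votes):
    """Credits needed to cast `votes` votes on a single issue (votes squared)."""
    if votes < 0:
        raise ValueError("votes must be non-negative")
    return votes * votes


def max_votes(credits):
    """Largest number of votes on one issue that `credits` can pay for."""
    return math.isqrt(credits)


def total_cost(allocation):
    """Total credits spent across all issues in an allocation."""
    return sum(vote_cost(v) for v in allocation.values())


# Hypothetical voter with a budget of 100 credits (issue names are invented).
spread = {"parks": 5, "transit": 5, "schools": 5, "housing": 5}
concentrated = {"parks": 10}

# Both allocations use the full budget, but spreading buys 20 votes in total
# while concentrating buys only 10 votes on the one issue.
spread_votes = sum(spread.values())
concentrated_votes = sum(concentrated.values())
```

Under these assumptions the same 100 credits buy twice as many total votes when spread across four issues, which is the incentive toward broad, moderate influence that the mechanism is designed to create.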
But if you care a ton about certain things, then it makes sense to spend your credits on those things.

Millions of people already use quadratic voting across the globe

While quadratic voting is new and very different, it has already had a fair amount of success. In Taiwan, quadratic voting is used to rank hackathon projects. Judges use these votes to signal what is really important to them, and that's how they give out the prizes from local governments. The Colorado State Government uses quadratic voting to allocate the state budget and make a lot of executive branch decisions. Civilization VI, a popular strategy game, uses quadratic voting as its voting mechanic to determine global policy.

Quadratic funding wants to address our “free-rider” problem

Today, we have little incentive to make small, individual contributions. We assume that these contributions won’t make a difference, so we’re left with a few large contributors. Kickstarter claimed to create a more democratic system, but funding still lies in the hands of a few. Quadratic funding challenges this by matching every dollar that you give in inverse proportion to the share of the community you represent. It favors projects that have many individual contributors over ones that have few, encouraging people to make contributions, no matter how small.

Corporations can use quadratic funding to allocate capital

Most companies have divisions that don’t cooperate with one another. But it’s in the interest of the company to encourage cooperation, especially when it comes to cross-cutting infrastructure. Quadratic funding can incentivize cross-division funding by creating a matching fund where headquarters will match contributions made by multiple teams.

Treating data as labour can improve data quality

We put up with really low-quality data because people aren't engaged with the process of designing and collecting it. Most people do not understand where and how their data is used.
So even when they have perfect information and could make really easy fixes to their data, it just doesn’t happen. Helping people participate in data collection can massively improve data quality.

Glen shares an example of where this works. In Taiwan, a bunch of people who were worried about pollution and have IoT devices in their houses are now active participants in data creation. They invest in making sure these devices are working and in improving the quality, and in exchange, they ask the government to place IoT devices in certain places and monitor the air pollution.

ZoomInfo is another example. They contact everyone in their B2B contact database, explain how their data is used, and provide them the ability to correct any of the information collected. In return, users know that they’re making the entire community better and more valuable.

Note: if you enjoy this episode of World of DaaS, be sure to follow Glen Weyl on Twitter.

Hope you enjoy this episode of World of DaaS. I would really appreciate it if you subscribe and review (Apple Podcasts, Spotify, YouTube, etc.).

#### Retail Site Selection Checklist: 7 Steps for Choosing a New Location

Retail site selection is the process of choosing a location for your retail stores. While these decisions were previously based on intuition and experience, they now rely heavily on rich analytics and robust data.

To help you choose the best site for your retail store, we have a retail site selection checklist outlining all of the things you need to consider when choosing a retail store location:

1. Map the current store landscape
2. Understand how other brands impact your stores
3. Identify and locate your target demographic
4. Enrich with more contextual data
5. Analyze current store performance
6. Identify lookalike locations
7. Determine desired retail space size and layout

Generally speaking, we’ve tried to list them in the order you would want to do them. However, depending on your desired outcome, the process may look slightly different.
Feel free to rearrange these items as they work best for you and your site selection process. Regardless, these serve as a great starting point for essential things to check for when performing a site selection.Retail site selection checklist: what you need to know before choosing a store locationThere are a number of things you want to consider when choosing a new retail site, including your business performance, demographic data for your existing locations, and much more, but there are a few key nearly universal things to address when selecting a new retail site.It’s always important to think of site selection as not only choosing new locations, but also deselecting underperforming and failing locations. While much of our strategies listed are about finding what is working, honing in on what it is that is driving this success, and then replicating it, understanding where your retail stores’ biggest shortcomings are can inform your site selection — and deselection — just as much.Map the current store landscapeWhat you need to do: Map and visualize current stores for your brand so you can see what you are dealing with.‍Why you need to do this: This will help you identify areas where you don't have any stores, or where you have more stores than typically normal for your brand, giving you a baseline so you can dig into why that is and determine if those are good locations. Some stores may be located too close to each other and cannibalize business - which you can identify and then fix.Understand how other brands impact your storesWhat you need to do: Map and visualize competitive and complementary businesses.‍Why you need to do this: Doing this can show you where there are areas of opportunity or risk. Maybe your stores located closest to competitors are the most successful, or vice versa, so having an understanding of that will help your site selection. 
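As a rough illustration of the two mapping steps above, the sketch below flags pairs of your own stores that sit close enough to risk cannibalization and measures each store's distance to its nearest competitor. The coordinates, store names, and the 1 km threshold are hypothetical, and the great-circle (haversine) distance is just one reasonable choice of metric.

```python
import math
from itertools import combinations

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def close_pairs(stores, threshold_km):
    """Pairs of own stores closer than threshold_km (possible cannibalization)."""
    return [
        (name1, name2)
        for (name1, pos1), (name2, pos2) in combinations(stores.items(), 2)
        if haversine_km(pos1, pos2) < threshold_km
    ]


def nearest_competitor_km(stores, competitors):
    """Distance from each own store to its nearest competitor location."""
    return {
        name: min(haversine_km(pos, c) for c in competitors.values())
        for name, pos in stores.items()
    }


# Hypothetical example data: (latitude, longitude) in degrees.
stores = {
    "store_a": (39.7392, -104.9903),
    "store_b": (39.7400, -104.9900),
    "store_c": (39.9000, -105.1000),
}
competitors = {"rival_1": (39.7395, -104.9850)}

overlapping = close_pairs(stores, threshold_km=1.0)
competitor_distance = nearest_competitor_km(stores, competitors)
```

Joining `competitor_distance` to per-store revenue or foot traffic is then a simple way to check whether proximity to competitors helps or hurts for your brand.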
Similarly, understanding where complementary brands are (think a juice bar located near a yoga studio), could help you find locations where your target customers may already be.Identify and locate your target demographicWhat you need to do: Overlay demographic data with your store locations to see who lives closest to your stores. You can also use mobility data to see which census block groups (CBGs) your customers are coming from when they visit a particular POI.‍Why you need to do this: Demographic data indicates the types of people (age, gender, income, race, education level, etc.) that are either near your store or visit your store/nearby. This information will help you determine lookalike markets and also better tailor your customer experience. This will help you market more effectively, as well as choose the type of store that is best suited for a specific location (drive thru vs traditional dine-in restaurant, etc.).Enrich with more contextual dataWhat you need to do: Add other relevant data to your analysis, like proximity to highways or public transportation, building types, and more.‍Why you need to do this: This can help provide more context on why a store is successful or not. Enriching with more information provides analysts a way to differentiate store locations and really understand why one is better than another. Different brands may have different needs so the contextual data will vary. For example, a gardening store might want to overlay with residential property information so they know how many people near the store actually would need their products versus people who live in cities with no land.Analyze current store performanceWhat you need to do: Look at key metrics like revenue and foot traffic for each store location.‍Why you need to do this: All of the previous steps lead up to this one. This will indicate which locations are most successful so you can derive why they are successful and find look alike locations. 
For example, if locations nearby universities are the most successful, you can target locations with similar demographics. The characteristics of successful stores uncovered in the previous steps (how close they are to your other stores, competitors, complementary businesses, target customers, etc.) will help you define what characteristics you should look for in the next store location. You can also use this information to see which stores are not performing well and decide where to close - site selection works both ways.Identify lookalike locationsWhat you need to do: Using the insights derived from store performance analysis, find potential new locations that have similar characteristics to what your most successful stores have. You can also use this to model new opportunities by testing a location in a new area, using that as a lookalike for your future locations.‍Why you need to do this: With data backing up why a store is successful, you can then use data to identify stores matching that profile and are thus set up for success, since you know that strategy works. Demographics and foot traffic data for census block groups (CBGs) can also help you identify where your target customers live and interact with places like your business the most.You can then use this to help model expansions to areas and demographics you haven’t tested. If your locations are all in suburban areas and are trying to set up in a rural area, you can test that location. 
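One simple way to put the lookalike idea into practice is to describe each location by a few numeric characteristics and rank candidate sites by how close they are to the average profile of your best stores. The sketch below assumes three made-up features and invented values, and uses min-max scaling with Euclidean distance. It is a starting point under those assumptions, not a prescribed method.

```python
import math

# Hypothetical features; real analyses would use whatever the earlier steps surfaced.
FEATURES = ("median_income", "weekly_foot_traffic", "km_to_nearest_competitor")


def min_max_scale(rows):
    """Scale every feature to [0, 1] across all rows so no feature dominates."""
    lows = {f: min(r[f] for r in rows) for f in FEATURES}
    highs = {f: max(r[f] for r in rows) for f in FEATURES}
    return [
        [
            (r[f] - lows[f]) / (highs[f] - lows[f]) if highs[f] > lows[f] else 0.0
            for f in FEATURES
        ]
        for r in rows
    ]


def rank_lookalikes(top_stores, candidates):
    """Return candidate names ordered from most to least similar to top stores."""
    scaled = min_max_scale(top_stores + candidates)
    top_vecs = scaled[: len(top_stores)]
    cand_vecs = scaled[len(top_stores):]
    # Average profile of the best-performing stores.
    centroid = [sum(col) / len(top_vecs) for col in zip(*top_vecs)]
    scored = [
        (math.dist(vec, centroid), cand["name"])
        for vec, cand in zip(cand_vecs, candidates)
    ]
    return [name for _, name in sorted(scored)]


# Invented example values.
top_stores = [
    {"name": "store_a", "median_income": 60000, "weekly_foot_traffic": 900, "km_to_nearest_competitor": 0.5},
    {"name": "store_b", "median_income": 64000, "weekly_foot_traffic": 1100, "km_to_nearest_competitor": 0.7},
]
candidates = [
    {"name": "site_x", "median_income": 62000, "weekly_foot_traffic": 1000, "km_to_nearest_competitor": 0.6},
    {"name": "site_y", "median_income": 30000, "weekly_foot_traffic": 200, "km_to_nearest_competitor": 5.0},
]

ranking = rank_lookalikes(top_stores, candidates)
```

Here `site_x` ranks first because its profile sits between the two top stores on every feature, while `site_y` differs sharply on all three.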
Then use that location as a lookalike for other locations you may want to set up, and as an indication of how you can do in the new area or with the new demographics.

Determine desired retail space size and layout

What you need to do: Use existing location performance to determine the ideal retail space size and layout for the new location.

Why you need to do this: To find the best location for your business, you’ll need to compare the retail spaces available to you, including not just their location, but their layout and size. Use the previous performance of your existing locations, along with information about the potential site, to determine the best location size and layout.

Site selection is just one component of a successful retail strategy, but it’s essential for proper resource management and audience targeting. Make sure you address all of the things above before choosing a site for your new retail location, or before closing a location.

#### SafeGraph 2022 Year in Review

Key Takeaways

- Ended the year with 41.4M POIs (+275% YoY)
- Increased the brands we cover to 9,993 (+20%)
- Expanded our polygon coverage to 15.9M POIs (+72%)
- Increased our POI coverage to 220 countries and territories (+83%)

2022 was an exciting year for the data industry as a whole. We want to thank all of our customers, partners, and employees for their support and for helping us to get to where we are. Today, 100+ customers depend on SafeGraph data to power their applications and insights. We are incredibly excited to carry this momentum into 2023 and continue our mission to open the world’s data for innovation.

Thank you to everyone who has supported us in our mission to empower modern builders to create world-class location-based applications and analytics tools.

Happy holidays from the SafeGraph team!

#### SafeGraph 2024 Year in Review

As 2024 comes to a close, we’re reflecting on a year filled with milestones, exciting growth, and some unforgettable moments.
From major product enhancements to new leadership and team adventures, it’s been a year that’s truly shown the strength of the SafeGraph team and community. Product Enhancements That Transformed 2024 We’ve been hard at work enhancing our product offerings to better meet the needs of our partners. Some of this year’s highlights include: Global Coverage Expansion: This year, we expanded our global coverage based on customer demand, adding POIs across multiple countries and categories. Some highlights include +1M POIs in Italy (March release); +225k POIs in France (August release); +500k POIs in Western Europe (October release); and +1.2M POIs in the US, Canada, Mexico, and Brazil (December release). New Rich Attributes: We introduced 3 new rich attribute columns that provide more granular details about malls in the US and Canada. This information helps our customers better assess real estate opportunities, perform retail site selection, and measure customer behavior across different mall layouts. Deduplication Model Update: In May, we released a significant update to our deduplication model which removed 325k duplicate places globally. This effort especially improved data precision across services, retail, food, and hotel related categories. Revamped Category Model: In November, we bootstrapped years worth of customer feedback as training data and leveraged the latest advancements in LLMs to upgrade our category (naics_code) model. This effort increased the accuracy of our naics_code inferences for millions of non-branded POIs globally. Geometry Improvements: Throughout the year, we made various enhancements to our Geometry product to help customers attribute real world visits to places more reliably. Notably, in April, we ingested new building footprint datasets in the US and Spain which sharply reduced the rate of places with synthetic polygons. 
Rich Attribute Improvements: Many of our rich attributes are built for specific use cases that enable narrower search queries, aggregation queries, or enable unique joins to other datasets. As customers increasingly find utility, we did our part to ensure these columns improve over time. Notably, in September, we increased the category_tag fill rate in the US by 4 percentage points, corrected store_id values for 30+ brands, and implemented additional parsing and QA logic to handle open_hours reported in various formats. By the Numbers: SafeGraph’s 2024 Impact As the source of truth for data on physical places, SafeGraph has enabled product builders to spend less time cleaning and curating data, unlocking valuable resources for application and model development. Over 120 customers are leveraging SafeGraph POI data to drive innovation across Adtech, Mapping, CPG, Retail, and Financial Services. From optimizing advertising campaigns to enhancing mapping products and enabling smarter retail site selection, our data empowers customers to make data-driven decisions and unlock new opportunities for growth. New Leadership: Jason Richman Steps In as CEO One of the biggest moments this year was welcoming Jason Richman as SafeGraph's new CEO. A 7-year veteran of SafeGraph, Jason previously led the revenue function and brings a clear vision to maintain SafeGraph’s position as the market leader in data for physical places. Meanwhile, our founder, Auren Hoffman, has transitioned to Chairman, where he continues to provide strategic insights. Together, this leadership team is paving the way for an even brighter future. SafeGraph Culture: Fun, Growth, and Connection This year wasn’t just about work – it was also about coming together as a team. In August, we held our annual retreat in Denver, and it was nothing short of fun and inspiring. 
From brainstorming sessions to team dinners and cornhole competitions, the retreat embodied the spirit of SafeGraph: hard work paired with unforgettable memories. Looking Ahead to 2025 We’re already gearing up for another big year. With exciting new product launches, partnerships, and industry initiatives on the horizon, we can’t wait to see what the future holds. To our customers, partners, and team: Thank you for being part of the SafeGraph journey. Here’s to making 2025 even better! Happy New Year, The SafeGraph Team #### SafeGraph Data Used to Support Turkey Disaster Response SafeGraph provides POI data to the OpenStreetMap community in Turkey to support rescue and recovery efforts following the Turkey earthquake SafeGraph was founded, first and foremost, to democratize access to clean, accurate, and comprehensive geospatial data on physical places. But our mission truly goes beyond access alone. As data experts and enthusiasts ourselves, we know the power that location data can have in transforming and creating a much deeper understanding of the world around us. That’s why, in the past, we’ve often made our data readily available for academic institutions and non-profit organizations—especially during times of crisis—because we know that having access to the very best location data can lead to new innovations and sometimes life-changing solutions that have the potential to make a meaningful impact on the world. For this reason, we’ve always believed that using location data has incredible power to support disaster response. In fact, using SafeGraph data to find real-time solutions during times of crisis has been at the heart of our DNA from the very beginning. We saw this really come to fruition during the COVID-19 pandemic and, since then, have seen it being used to solve big socio-economic questions around access to healthcare, access to healthy food, and beyond. 
That’s why we didn’t think twice about making SafeGraph data free and accessible to support on-the-ground organizations in Turkey in recent disaster response efforts following the catastrophic earthquake in Turkey and Syria—particularly, to help map out the most affected areas where thousands of buildings have collapsed and countless lives have been lost. Putting SafeGraph data into action in Turkey Yer Çizenler is an Istanbul-based NGO founded in 2017. As the organization’s name in English, “Mapping for Everyone” might suggest, the organization’s sole focus is to support the production, management, and sharing of publicly-accessible geospatial data—via free and open tools and software—to support various humanitarian efforts. Following the recent earthquake in Turkey and Syria, Yer Çizenler quickly organized a disaster relief project, in collaboration with Humanitarian OpenStreetMap (OSM) Organization and other partners, for mapping out the most affected areas as a way to support humanitarian work in the field. However, because location data in the region is still relatively underdeveloped, Yer Çizenler needed to lean into the entire mapping community to help close the gaps. This is where SafeGraph comes into the picture: We donated Turkey-based POI (point of interest) data to be ingested by OSM to prioritize response efforts based on urgency, region, or proximity to specific POIs, like pharmacies, via a powerful disaster response dashboard. ‍ A great example of how SafeGraph POI data was used to help pinpoint the location of pharmacies in a region of Turkey affected by the earthquake. Check out the full map here. ‍ Additionally, working closely with Said Turksever—a local lead from Yer Cizenler as well as a project manager at Meta—we’ve provided up-to-date POI data to help verify before and after (earthquake) street-level imagery to aid in response efforts. 
All in all, while a long road to recovery is to be expected in this situation, it’s clear that when industries like ours—and well beyond—come together to provide essential resources to support life-or-death humanitarian efforts, like those currently happening in Turkey and Syria, it can make an incredibly positive impact. Even in the aftermath of genuinely catastrophic events. The power of location data to help overcome major crises Our recent collaboration with various organizations in Turkey is yet another example of why the entire team at SafeGraph is not only proud but also committed to continuing to make our accurate and up-to-date geospatial data available whenever disaster strikes—as a way of doing our part to help on-the-ground organizations amplify their ability to do good and, more importantly, make a difference in the lives of potentially millions of people. As the crisis in Turkey and Syria continues to unfold, we will monitor how SafeGraph data is being used to support rescue teams and identify any new and innovative use cases for our data that we can immediately put into action in the future. For now, though, be sure to follow Yer Çizenler to get real-time updates on how their active mapping work is continuing to make a positive impact on disaster response efforts in the region. #### SafeGraph Expands Its Footprint Into the United Kingdom A strategic decision that furthers our mission of democratizing access to high-quality, reliable, and accurate data for allYou might have recently seen the news that we’ve made an important decision to expand SafeGraph’s footprint into our first international destination outside of North America.We’re excited to say that this spring our Places and Geometry datasets will launch in England, Scotland, and Wales for the very first time. If you’d like to take a sneak peek, we’ve created a sample SafeGraph Places UK dashboard to give you a sense of what’s to come.So, needless to say, our team has been quite busy lately. 
Not to mention, we moved our company headquarters to Denver in December, which was an exciting change, in and of itself.But with all this incredible change happening, I thought there was no better time than now to give you a little insight into the strategy underlying these big decisions as well as the big bets we’re placing today to make SafeGraph the most trusted source of data worldwide tomorrow.‍Democratizing access to high-quality data is our missionOur mission hasn’t changed one bit since SafeGraph was created. We still firmly believe that amazing things can happen—whether it be in the form of disruptive insights or new innovations changing the world for the better—when we make it possible for anyone or any organization to be able to access high-quality, reliable, and accurate data. In fact, as we actively chart the course towards our company’s future, it’s becoming even clearer to us just how important our mission of “democratizing access to data for all” truly is.There’s no shortage of data today. It’s virtually everywhere. But just because it’s everywhere doesn’t mean all of it is good or provides meaningful or actionable value. That’s why, for us, we don’t look at data through the lens of ‘quantity versus quality’ but rather as a combination of quantity and quality.But this is also one of the biggest reasons why we’ve been such a big proponent of setting essential data standards from day one. We know that data is powerful. We also know that it can add so much more value when it’s packaged up in a way that people can actually use without having to spend hours scrubbing and cleaning it just to join it to other data sources.So, when thinking more “big picture” about our decision to expand our footprint into the UK, it’s merely just another opportunity for our team to put clean and useful data in the hands of more organizations and more people around the world. 
Plus, we know that many of our current customers and partners have international footprints of their own. Our goal is to continue doing whatever we can to scale our business in a way that adapts to our customers’ global data needs.4 answers to your questions about our expansion strategyWhenever big changes happen or important decisions are made, people tend to have a lot of questions. So we’ve decided to take a stab at answering some of those questions that we think are probably top-of-mind. If there’s something else on your mind, just let us know.1. What drove our decision to make the UK our next international market?As mentioned above, a lot of the decisions we make are based on what our customers tell us they need. In spite of COVID-19, the world is still more interconnected today than ever before. Our customers, many of whom have business operations in multiple countries, have big problems that they need to solve (with data) that stretch beyond the US market alone.That’s one of the reasons why we first expanded our international footprint into our neighbor in the north. Knowing that Canadian businesses and consumers share a number of operational and cultural similarities to their American counterparts, bringing SafeGraph to Canada made complete sense as an extension of what we were already doing in the US. More importantly, it was an opportunity to create a holistic and insights-driven view of the many dynamics shaping a big portion of North America.Similarly, when looking out across the pond, it goes without saying that a very large share of our American and Canadian customers have a presence in the UK as well. We feel that it’s an important stepping stone for eventually replicating what we do across the rest of the world.2. Will the data in the UK be the same as it is in the US and Canada?The availability of underlying data sources is different in every country. 
This will undoubtedly have an impact on how we go about acquiring our data and, in all honesty, could also pose some unique challenges that we haven’t yet faced in the US or Canada. Our product management and engineering teams are working tirelessly to understand the ins and outs of the UK data market, so that we can scale our operations effectively while still providing our customers with the same clean, reliable, and accurate data that they’ve grown accustomed to by working with us thus far.3. Will Placekey play a role in this at all?Our partnership with Placekey is a critical component for our expansion plans into the UK and eventually into other markets in the near future. Placekey’s recent expansion into the Netherlands is aligned with our own international aspirations, and we hope to continue to expand alongside Placekey in the future. Because of the different ways in which different countries collect data, we’re going to start working with datasets that we haven’t necessarily worked with before in the same capacity. So, we will most definitely rely heavily on Placekey, as a universal identifier for any physical place in the world, to unlock the power and potential of the new location-based datasets we’ll be working with. Here’s another way to look at it: Placekey will be the glue that makes it possible to join multiple datasets together in a universal way. It’s a true game-changer and something that will allow us to scale and grow our business more effectively in the future.4. What about data quality?Expanding into another country doesn’t mean we suddenly throw our data standards out the window. The quality, reliability, and accuracy of SafeGraph data, as you know it today, will not change one bit. As mentioned above, we will have to overcome some new country-level challenges in order to provide a final product that meets—and hopefully, surpasses—your expectations. Quality is not negotiable for us. 
We take pride in our data because we know that it’s the best on the market. Our teams will continue to deliver the same level of quality in every market we expand into down the road, beginning with the UK. And we’ll work with our customers every step of the way to ensure that nothing has slipped through the cracks.

We’re just getting started

This is an exciting chapter in SafeGraph’s history, one that is poised to shape our business for years to come by setting an important foundation for extending the same quality, reliability, and accuracy of SafeGraph data to businesses, non-profits, governments, academic institutions, and other organizations around the world. We believe in the democratization of access to good, clean data—and we will work tirelessly to see that mission through in everything we do. This is just the beginning of many more exciting things to come.

#### SafeGraph Featured on Mapscaping: A Podcast for the Geospatial Community

The challenges of building geospatial truthsets.

SafeGraph was the featured guest of Mapscaping on November 20, 2019.

We were humbled to have SafeGraph’s Ryan Fox Squire as the featured guest on Mapscaping. We are huge fans of Mapscaping. If you aren’t a listener, you should be!

Checkout @MapScaping's latest podcast - SafeGraph Data Scientist & PM @RyanFoxSquire talks about how we're building our geospatial datasets. https://t.co/6nZJ8IuerZ — SafeGraph (@SafeGraph) November 20, 2019

You can listen to the podcast here or wherever you get your podcasts.

From Mapscaping:

Collecting and validating geospatial data for every commercial location in the USA and Canada is not an easy task. It requires aggregation of data from multiple sources and formats. This data then needs to be validated and decisions need to be made about which data sources represent the truth in the case of conflicting data.
SafeGraph does this by weighing datasets based on certain criteria and using a voting system.

Data is scraped and curated from multiple sources, but in some cases it is also necessary to create data. Think of the use case of a shopping center: a collection of geometries where the entire shopping center is the parent geometry and the individual businesses in the shopping center are child objects. In this situation SafeGraph has had to digitize entire shopping centers manually in order to properly represent the parent/child geometry relationships.

Thanks for reading! If you found this useful or interesting please upvote and share with a friend.

#### SafeGraph Joins Placekey Initiative as a Founding Partner to Build a Standard Identifier For Physical Places

On October 7th, 2020, Placekey officially launched with the help of 10 founding partners, including SafeGraph, Esri, Carto, Veraset, and more. Additionally, over 500 organizations signed up to be contributing partners to show their support. Auren Hoffman, SafeGraph’s Founder & CEO, has long been a believer in and activist for data standards. You can read more about his point of view on why data standards matter and how they increase data flow and benefit everyone. “Data is most powerful when it’s standardized,” says Auren. Placekey does not benefit a single organization, but helps to level up the entire industry so that innovation can happen faster.

What is Placekey?

Placekey solves a major problem for organizations that leverage data assets to drive geospatial innovation. Without being able to access a common identifier, data analysts must dedicate a significant amount of time cleaning, normalizing, and organizing datasets in order to join them together. “Placekey is a standardization the entire industry can all agree on,” said Matt Shaw, VP of Engineering at Fiddlehead.
“I’m excited to discover what I can do with all the time freed up from data cleansing and normalization.”

More importantly, Placekey will provide organizations with a free, common, universally-accepted industry standard for identifying physical places. “To take the next step in unleashing global innovation around the power of location data and information, we need an open, commonly-used designation for place,” said Keith Masback, Principal Consultant of Plum Run, LLC. “Placekey unlocks that potential.”

There have been a number of attempts to create an industry standard for identifying physical places in the past. Unfortunately, none have ever been broadly adopted. Previous identifiers were built and designed specifically for mapping data but not necessarily to support joining datasets together. Grid system identifiers, for example, are great at identifying physical locations on a map but often fail to provide context for what is at that location. Placekey, on the other hand, takes a unique, places-centric approach that can be applied consistently and uniformly across disparate datasets.

And in an effort to encourage continued GIS- and Placekey-related innovations, the SafeGraph Community was formed, creating a space where over 5,000 Placekey users can collaborate, ask questions, and share insights and innovations with each other. This Slack community is free to join and open to geospatial experts using Placekey.

SafeGraph has taken the necessary steps to adopt Placekey within its own dataset. If you preview or buy data from SafeGraph, you will now see the associated Placekey column for every single POI. This allows SafeGraph data to be easily joinable to other datasets using Placekey. More details about Placekey and the Placekey Community can be found at placekey.io.

How Does Placekey Work?

The “What” and “Where” of any physical place.

When both parts of a Placekey come together, the final result reads as What@Where.
This is a unique way of shedding light on both the descriptive element of a place as well as its geospatial position in the physical world via a single identifier.

The first three characters of the What Part refer to the Address Encoding, creating a unique identifier for a given address. An address at “555 Main Street Suite 105” will have a different Address Encoding than “555 Main Street Suite 106.” However, "444 Second Street, Suite 4" will have the same Address Encoding as "444 2nd St. #4" to adjust for common address formats.

The second set of three characters in the What Part refers to the POI Encoding. If a specific place has a location name (like "Central Park") and is already included in the Placekey reference datasets, these characters will be present. The benefit of the POI Encoding is that it can point to a specific point-of-interest that may have existed at a certain address at a given point in time.

The Where Part, on the other hand, is made up of three unique character sequences, built upon Uber’s open source H3 grid system. This information in the Where Part is based on the centroid of that place. In other words, we take the latitude and longitude of a specific place and then use a conversion function to determine a hexagon in the physical world, representing about 15,000 sq. meters, containing the centroid of that place. The Where Part of the Placekey is, therefore, the full encoding of that hexagon.

Create A Placekey For Free Today

Placekey officially launched on October 7th with SafeGraph participating as a founding partner. You can watch all of the launch videos on the Placekey Sessions & Seminars page. With the launch, Placekey made it possible for anyone to create a Placekey using its free tool. Developers can also request an API key to create Placekeys at scale with their own datasets. Visit Placekey.io to learn more.
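To make the What@Where structure described above concrete, here is a minimal Python sketch that splits an identifier into its parts and uses it as a join key. It is not the official Placekey library: the names `split_placekey` and `join_on_placekey` are hypothetical, the sample strings are made up for illustration and are not real Placekeys, and the sketch only checks the overall shape rather than validating the encodings. Allowing an empty What Part is also an assumption.

```python
from typing import Dict, List, NamedTuple, Optional


class PlacekeyParts(NamedTuple):
    address_encoding: Optional[str]  # first group of the What Part
    poi_encoding: Optional[str]      # second group of the What Part, if present
    where: str                       # full Where Part (the hexagon encoding)


def split_placekey(placekey: str) -> PlacekeyParts:
    """Split a What@Where string into its components (shape check only)."""
    what, sep, where = placekey.partition("@")
    if not sep or not where:
        raise ValueError(f"expected What@Where, got {placekey!r}")
    # The text describes the Where Part as three character sequences.
    if len(where.split("-")) != 3:
        raise ValueError(f"Where Part should have three sequences: {where!r}")
    # Assumption: the What Part may be empty, hold only an Address Encoding,
    # or hold an Address Encoding followed by a POI Encoding.
    groups = what.split("-") if what else []
    if len(groups) > 2:
        raise ValueError(f"What Part has too many groups: {what!r}")
    address = groups[0] if groups else None
    poi = groups[1] if len(groups) == 2 else None
    return PlacekeyParts(address, poi, where)


def join_on_placekey(left: List[Dict], right: List[Dict]) -> List[Dict]:
    """Inner-join two lists of rows on their 'placekey' field."""
    index = {row["placekey"]: row for row in right}
    return [
        {**row, **index[row["placekey"]]}
        for row in left
        if row["placekey"] in index
    ]


if __name__ == "__main__":
    # Made-up identifiers, for illustration only.
    print(split_placekey("abc-def@123-456-789"))
    pois = [{"placekey": "abc-def@123-456-789", "name": "Example Cafe"}]
    visits = [{"placekey": "abc-def@123-456-789", "visits": 42}]
    print(join_on_placekey(pois, visits))
```

Because both datasets carry the same identifier, the join is a plain dictionary lookup with no address cleaning or normalization step, which is the time saving the post describes.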
#### SafeGraph Named Databricks’ Data Sharing Partner of the Year Here at SafeGraph, we are honored to be awarded Databricks’ Data Sharing Partner of the Year thanks to our support and adoption of Databricks’ Delta Sharing. We are excited by the opportunity to continue to grow our partnership with Databricks and use this momentum to further the Delta Sharing initiative."We are excited to be Databricks’ Data Sharing Partner of the Year and to help build up Delta Sharing from the beginning. We like it as the open protocol for data sharing, and our customers appreciate getting our data with little work and the ability to integrate easily with their workflows. As a data company, SafeGraph is focused on democratizing data for all - Delta Sharing helps us to bring value to our customers from day one," said Felix Cheung, Senior Vice President of Engineering at SafeGraph. Our partnership with Databricks promotes SafeGraph’s key value of making data more accessible. Through Delta Sharing, our customers are able to access the world’s most accurate places data within minutes. Delta Sharing is significant for data providers like us because it removes the pipeline hurdles involved in users evaluating or receiving our data, and allows data users to immediately start processing the data how they prefer.Data users are able to securely exchange SafeGraph data with customers, partners, and suppliers to better collaborate and unlock value from the data through Delta Sharing. This process improvement allows data users to collaborate and unlock value from the data quickly.“We are honored to announce SafeGraph as Databricks’ Data Partner of the Year,” noted Jay Bhankharia, Senior Director of Data Partnerships at Databricks, “SafeGraph’s thoughtfully curated data enhances customer value and provides rich insights for Databricks customers. 
We are excited to continue to deepen the partnership and build more value with SafeGraph in the Lakehouse ecosystem.”

Learn more about the SafeGraph and Databricks partnership by viewing any of the following resources:

- Accelerating Business Value with Delta Sharing Webinar
- SafeGraph Partners with Databricks on Open Data Sharing Blog
- Building Reliable Data Pipelines for Machine Learning Webinar

#### SafeGraph Partners With AWS and Databricks To Launch Industry’s First Full-Stack Location Solution

SafeGraph is thrilled to announce an exciting partnership with AWS and Databricks to make insights about the physical world easier than ever.

Today Amazon launches AWS Data Exchange, a new platform for sharing data. SafeGraph is honored to be a founding data partner for the AWS Data Exchange, launching today with over 20 powerful datasets available for free or for purchase (you need to sign in to your AWS account to see the listings).

Also, if you want to learn more about using SafeGraph data in Databricks, register for our upcoming webinar.

SafeGraph is the source of truth for points of interest (POI) data.

SafeGraph is just a data company, that’s all we do.

SafeGraph has two primary datasets:

- Places: Base information about a point of interest (POI) such as location name, address and brand association for top ~5,500 national brands. Available for ~6.1MM POI.
- Geometry: Geometry information for commercial POIs that includes the polygon of the POI and spatial hierarchy metadata defining whether the polygon is contained within another POI. Available for ~6.1MM POI.

AWS is one of the most important cloud services companies in the world.
Making SafeGraph data available in the AWS Data Exchange is 100% aligned with the SafeGraph mission to democratize access to data.

The AWS Data Exchange is now hosting more than 20 datasets from SafeGraph, including:

- SafeGraph Core Places — Entire USA (5.3MM records)
- SafeGraph Core Places — USA Gas Stations and Convenience Stores (135k records)

How do I work with SafeGraph data from AWS? Answer: Databricks.

Databricks is a unified analytics platform that enables data science, data engineering and business analytics teams to derive value from data at scale and with ease of use in a collaborative manner.

At its core, the Databricks platform is powered by Apache Spark and Delta Lake in a cloud native architecture, which gives users virtually unlimited horsepower to acquire, clean, transform, combine and analyze data sets within minutes from a notebook interface, with popular languages of choice (Python, Scala, SQL, R). Because Databricks is a managed platform, customers do not have to become big data devops gurus to power their analytical needs, which reduces administrative burden, costs and risks of their data-driven projects.

How do we load SafeGraph data from AWS Exchange into Databricks Data Lake?

To demonstrate the power of SafeGraph data inside Databricks, we are highlighting two datasets from SafeGraph currently available for free inside AWS Data Exchange.

- SafeGraph Places — Starbucks in the USA (Free)
- SafeGraph Open Census Data (Free)

Getting your data running in Databricks is just a few clicks away. We’ve published full step-by-step instructions for loading SafeGraph data into Databricks from AWS Data Exchange on the Databricks blog.

SafeGraph + AWS + Databricks

Reading SafeGraph data from AWS Data Exchange into Databricks is quick and easy.

Combining these technologies and datasets enables you to answer powerful and precise questions about consumer behavior.

Want to get more SafeGraph data?

There are over 20 datasets available for free or for purchase in AWS Data Exchange.
Check them out!Special thanks to Andrew Hutchinson and Prasad Kona from Databricks and Ryan Fox Squire from SafeGraph for help developing the demonstration notebook and content of this blog post. #### SafeGraph Partners with Databricks on Open Data Sharing Data sharing has become important in the digital economy as enterprises wish to easily and securely exchange data with their customers, partners, and suppliers to better collaborate and unlock value from that data. But to date, a lack of an open, standards-based data sharing protocol has resulted in data sharing solutions tied to a single vendor or commercial product introducing vendor-lock in risks. As part of SafeGraph’s commitment to democratizing access to data for all, we are excited to participate in the launch of the Delta Sharing project with Databricks and other leading data organizations to bring a free and open data sharing protocol to consumers. Now, Delta Sharing enables anyone who wishes to securely and efficiently exchange data with another organization to send and receive data through a free and open standard protocol, regardless of which computing platforms are involved. SafeGraph is thrilled to support Delta Sharing. We’re excited to work more efficiently with our customers and partners with an open, cost efficient and scalable protocol, agnostic of computing platforms or clouds. With Delta Sharing, SafeGraph Places data is now more accessible than ever.Like Placekey, the universal standard identifier for a physical place, Delta Sharing eliminates vendor lock-in for data providers and consumers. As an open standard usable by any vendor, Delta Sharing is easy to add into products that read Apache Parquet. Part of the widely adopted open source project Delta Lake, Delta Sharing will benefit from a vendor-neutral governance model and a vibrant ecosystem across all clouds. 
Learn more about Delta Sharing and review the technical documentation to get started.

#### SafeGraph Partners with Dewey to Democratize Access to Data for Academics

SafeGraph is committed to democratizing access to data. That’s why over the past few years we have partnered closely with universities to provide data for research across a wide range of topics, from the spread of COVID-19 to the economic impacts of natural disasters and more. Today, SafeGraph is excited to announce a new partnership with Dewey that furthers our ability to provide quality data to academics.

Dewey is a brand new platform that unlocks access to data from multiple vendors for use in academic research and curricula. SafeGraph’s high veracity and up-to-date point of interest data, including our rich contextual attributes about places, is now available via the Dewey platform for use in academia. With the SafeGraph and Dewey partnership, it’s never been easier for academics to browse and download the data they need for research or curriculum development.

Accessing SafeGraph data through Dewey

Anyone affiliated with a university that subscribes to the Dewey platform can now access SafeGraph data for free. Dewey makes it easy to search for and download the exact data you need to conduct research or develop a new curriculum. If your university is not subscribed to Dewey, you can browse and purchase SafeGraph’s datasets on an ad-hoc basis. We offer a number of free samples on Dewey so you can get your hands on the data before making a subscription decision. If you want to work with Dewey to get your university subscribed, you can get in touch with them here.

SafeGraph is the first data company to partner with Dewey, which will soon offer similarly streamlined access to data from other providers as well. To stay in the loop and receive updates about new Dewey partners, you can subscribe here.
The SafeGraph Community is now part of Dewey For over two years, the SafeGraph Community has been a valuable resource for academics looking to engage with others conducting research with data. Moving forward, the community will be run by Dewey in order to foster similar relationships and discussions among academics about all sorts of data, not just datasets provided by SafeGraph. Over the next few weeks, you’ll see the SafeGraph Community transition over to Dewey - stay tuned for more information. Continue to share your research with SafeGraph We love to see the research academics have conducted with SafeGraph data. If you are working with SafeGraph data via Dewey, please be sure to share your results with us so we can highlight your research on our publications page. #### SafeGraph Raises $16 Million Series A Announcing our round led by Ridge Ventures and over 100 well-known individuals including Peter Thiel, Adam D’Angelo, Romesh Wadhwani, Eric Cantor, KT zu Guttenberg, Jack Dangermond, Barry Sternlicht, Pete Briger, Naval Ravikant, and Nicolas BerggruenToday, SafeGraph is excited to announce our first financing: $16 million from an incredible group of investors and supporters.SafeGraph’s goal is to help answer humanity’s biggest questions by providing access to the largest ground truth dataset in the world. We predict the past by focusing on veracity and truth. Our belief is at least one iconic company will be built on democratizing access to data … and we aspire to be that company.SafeGraph’s first product is geospatial data and our goal is to be the most accurate record of truth for understanding physical places. In the next year, our goal is to be the best company for urban planners, retailers, academic researchers, marketers, investors, and many more to understand how people and places relate to each other.After leaving LiveRamp, I teamed up with Brent Perez to figure out what to do next. 
Our goal was to (1) build an iconic business that has the potential for $100 billion market cap; (2) work on super hard technical problems that get us excited to come to work every day; (3) work with amazing people that we love being around; and, most importantly (4) do something that truly adds value to society.Enter SafeGraph.Investors: Ridge + SafeGraph + 100 IndividualsTo accomplish this, we wanted to team up with the most helpful investors.Alex Rosen at RidgeWe picked Alexander Rosen and Ridge Ventures (formerly named IDG Ventures USA) because of their deep understanding of data, enterprise start-ups, and alignment with our mission. Alex has already proven to be incredibly helpful to SafeGraph with introductions to investors, partners, and recruits, and I would personally recommend Ridge Ventures to other entrepreneurs that have big visions and want supportive partners.One of the great things about Ridge Ventures is that they understand the power of having lots of brilliant people rooting for the company. Because of this, we were able to bring over 100 additional amazing individuals as investors.The best minds in AI investing in SafeGraphWe’re lucky enough to add some of the best minds who are thinking deeply about the future of AI: Peter Thiel (my long-time mentor, friend, and first backer of LiveRamp), Adam D’Angelo (CEO of Quora and fmr CTO of Facebook), Romesh Wadhwani (CEO of Symphony Technology Group), and Bryan Johnson (fmr CEO of Braintree).Assembled the deepest policy thinkersSince our #1 goal at SafeGraph is to be truly value-add to society we enlisted the help of people with deep experience in government and academia: Prince Turki Al Faisal Al Saud (fmr head of Saudi Intelligence); Nicolas Berggruen (Chair of Berggruen Institute); Eric Cantor (fmr U.S. House Majority Leader); acclaimed authors Niall Ferguson and Sam Harris; Meghan O’Sullivan (ran Iraq and Afghanistan policy under George W. 
Bush); Mona Sutphen (fmr Deputy Chief of Staff to President Obama); Karl-Theodor zu Guttenberg (fmr German Minister of Defense); Kotaro Tamura (fmr Japan Senator); and other amazing folks like Lenny Mendonca, Pierpaolo Barbieri, and Tewodros Ashenafi.Enter pioneers in geospatial technologyBecause our goal is to be the best company in the world at understanding physical places, we wanted great geospatial and real-estate thinkers as SafeGraph backers: Jack Dangermond (CEO of Esri and one of my heroes); Barry Sternlicht (CEO of Starwood); Joseph Meyer (CEO of Observer Media).Finance innovators back SafeGraphBecause finance is omnipresent (and Brent and I have little experience in it), we assembled some true innovators in finance: Naval Ravikant (AngelList CEO); Pete Briger (Fortress Chairman); Ash Gupta (President of American Express); Charles Songhurst (fmr Chief Strategy Officer at Microsoft and newest board member of SafeGraph); Chris Farmer at SignalFire — the VC firm that most understands the world of data; and Feroz Dewan, Steve Drobny, Marcelo Hallack, John Pfeffer, Alexander Tamas, Dan Benton, Jonathan Sands, and Elizabeth Weymouth.Most of the top marketing technology execs back SafeGraphThe long list includes: Mike Baker (CEO of DataXu); Michael Barrett (CEO of Rubicon); Tim Cadogan (CEO of OpenX); Tom Chavez (CEO of Krux); Mike Derezin (LinkedIn); bruce falck (fmr CEO of Turn); Adam Foroughi (CEO of AppLovin); Raj Gajwani (Google); Rajeev Goel and Amar Goel (founders of Pubmatic); Jonah Goodhart and Aniq Rahman (CEO and President of Moat); Yaz Iida (President at Rakuten USA); Matt Keiser (CEO of LiveIntent); Greg Murtagh (fmr CEO of Triad Retail); Brian O'Kelley (CEO of AppNexus); Kim Reed Perell (CEO of Amobee); Bob Pittman (CEO, iHeartMedia. CEO, Clear Channel Outdoor. 
Fmr CEO, AOL); donn rappaport (CEO of ALC); David Rodnitzky (CEO of 3Q Digital); Eric Roza and Chris Scoggins (CEO and EVP at Datalogix); Tod Sacerdoti (CEO of BrightRoll); Dipanshu D Sharma and Stephen McCarthy (CEO and CFO of xAd); Kamakshi Sivaramakrishnan (CEO of Drawbridge); Omar Tawakol and Grant Ries (founders of BlueKai); Are Traasdahl (CEO of Tapad); Div Turakhia (CEO of Media.net); Joe Zawadzki (CEO of MediaMath).

We also brought on some super-supportive entrepreneurs and angels including: Jack Abraham; Jonathan Abrams; Adrian Aoun; Arup Banerjee; Ethan Beard; Jean-Jacques Bienaime; Andrew Bursten; Saran Chari; Doug Chertok; Sara Clemens; Matthew Cowan; Matias De Tezanos; Dan Engel; Jeff Epstein; Scott Faber; Russell Fradin; David Friedberg; Jared Friedman; Ric Fulop; Shruti Gandhi; Brad Garlinghouse; George Garrick; Mark Goldstein; Mike Greenfield; Fabrice Grinda; Will Harbin; Joel Hornstein; David Hunt; Brett Hurt; Mark Jacobstein; Tomer Kagan; David Kidder; Jaya Kumar; Ariel Lebowits; Yishai Lerner and Mihir Shah; Paul Levine; Josh Levy; Lee Linden; Patrick McKenna; christopher michel; Allen Nance; Adam Nash; Itamar Novick; Mark Organ; Mark Pincus; Julia Popowitz; MR Rangaswami; Eric Ries; Sumon Sadhu; Ken Sawyer; Evangelos Simoudis; Michael Stoppelman; Bart Swanson; Jonathan Swanson; Greg Tseng; Jose Vargas; Tabreez Verjee; Sam Yagan; and Adam Zeplain.

Over 30 former colleagues from LiveRamp invested in SafeGraph

I’m most proud of bringing in current and former LiveRampers — the company I spent over 9 years at as CEO. These are some of the most talented people Brent and I have ever met and we are so blessed to have them as friends.

Almost the entire executive team at LiveRamp invested, including my cofounder (and CTO) Jeremy Lizt; LiveRamp CEO Travis May; COO James Arra; CPO Anneka Gupta; VPs Joel Jewitt, Ari Jacoby, Dan Scudder, and David Yaffe.
Additionally — all five corporate officers of Acxiom invested: CEO Scott Howe, CFO Warren Jenson, Data President Rick Erwin, EVP Jerry Jones (and I already mentioned LiveRamp CEO Travis May). We also want to thank these incredibly talented current and former LiveRampers for investing: Sean Carr, Eric Chernoff, Ken Dreifach, Bryan Duxbury, Michael Feldman, Abhishek Jain, Andy Johnson, Piotr Kozikowski, Ian Meyers, Bryan Morris, Chris Mullins, Diego Panama, Ben Podgursky, Mike Safai, Armaan Sarkar, Justin Schuster, Manish Shah, Daniel Stevens, Chris Taylor, Michel Tricot, Porter Westling, and Takashi Yonebayashi.

An amazing team of sixteen employees

We built an amazing team at LiveRamp, and Brent and I are seeking amazing and talented people to help build SafeGraph. I’m super proud of the team we’ve already attracted. Our engineers are some of the most talented big data and machine-learning developers in the world. Our BD people are creative and insatiable. Our product people understand our customers better than they understand themselves.

An unconventional fundraising for an unconventional company.

Most companies don’t raise $16 million for the first round of financing. And pretty much no company does it by bringing on over 100 individual investors in addition to a top tier VC. But we’re not a conventional company.

#### SafeGraph Vision Strategy and Values Driving Data Innovation

SafeGraph Vision and Strategy

We need open information to power innovation. Information should not be hoarded so that only a few can innovate. We need as many organizations as possible working to solve the challenges facing humanity. SafeGraph’s mission is to make the world’s data open for innovation while protecting individuals’ privacy.

Be the world’s archivists. To provide truth sets to the world’s innovators, we need to meticulously record and archive the world’s facts and events. Our job is to find out what happened, when it happened, and where it happened.
We leave the application of the knowledge (why it happened and what will happen in the future) to our customers. We’re a 21st century news organization (without the op-ed department) whose product is data. Seek the truth about the world. Data that isn’t true has little use. SafeGraph needs to independently verify the veracity of every data element. Of course, data can never be 100% true … but we should strive to make it 100% true. Be the data utility to all. SafeGraph will service all applications, not just a select few, which means that we cannot become an application because we’d compete with our customers. We enter markets carefully and methodically — only if we can have the very best data and become the market leader. Security, safety, compliance, and privacy are paramount. Our partners entrust us with their core data assets, which means we have a duty to protect them. To be good stewards of any sensitive data, we should only enable uses that are a net-good to society. And we should always pass the “mom test.” SafeGraph Values to Achieve Our Vision Do fewer things but be great at them. It is 100x better to be the best than it is to be “just” really good. But it’s very difficult to be the best at many things. Every team member strives to do as few things as possible, and the company strives to do as few things as possible — so that we can be the best at what we do. This is also why we strive to only do one new thing at a time — series beats parallel. Judgement is the x-factor. It is essential that every team member at SafeGraph makes key decisions autonomously, so that we move fast and limit bureaucracy. 
But as Voltaire (sometimes attributed to Spider-Man’s Uncle) said, “with great power comes great responsibility.” To make great and effective decisions at all levels of the company, we need to (1) clearly communicate the company’s strategy to all team members; (2) hire super smart teammates that work hard; (3) only hire people who have demonstrated sound judgment and are deserving of our trust. We are the enablers, not the solvers. As a company, it is important we have the humility to accept that our clients will ultimately be the ones to make the world a better place and solve humanity’s greatest challenges … we are just an enabler. This humility should always color everything we do. Respect our own time — get leverage. Because we hire only the most talented people, SafeGraph team members must constantly seek leverage. We put an extremely high value on our own time. A team member rarely does a repetitive or mundane task more than a few times before she automates it (through engineering, outsourcing, selecting a vendor, etc.). SafeGraphers should spend over 75% of their time doing things that are really hard and that only they can do. We know that the more we can leverage ourselves, our teams, and our organization, the bigger SafeGraph can scale. Respect others’ time — don’t be a bottleneck. Humility means respecting the time of our coworkers, partners, clients, recruits, etc. We never want to be a bottleneck as a company or as an individual. Bottlenecks cause frustration and cost us customers and revenue. They lower morale and create uncomfortable conversations like “did you get that email I sent Tuesday?” We return all emails and calls within 12 hours, even if it’s just to say “I’ll get back to you tomorrow.” We strive to never be bottlenecks. Focus on growth. Great team members continue to improve and grow. The only way to do that is to actively solicit feedback on how to get better and find ways to work on one’s strengths. 
Great team members also focus on making those around them better, and they give feedback often. Giving constructive feedback (suggestions of how to improve flaws) is helpful but giving specific, positive feedback can lead to even faster growth and higher leverage of people’s strengths. We have extremely high expectations of ourselves and of our team members. Join Us: We’re bringing together a world-class team, see open positions. #### SafeGraph's Data Sourcing Process Some of the most common questions we receive are focused on how we build and source our datasets. SafeGraph creates our products from a combination of machine learning, web crawling, and third-party licensing. In this blog post, we break down the sourcing process for each of our datasets. 1. Places SafeGraph’s Places dataset provides points of interest (POI) data and detailed attribution for non-residential places. Along with the geospatial coordinates and address of the POI, we provide information like brand affiliation, open/close time, and NAICS codes for deeper context and increased analytics possibilities. SafeGraph sources POI data in a variety of ways: Crawling open store locators on the web (ex. crawling a brand’s website that lists where all of its stores are) Using publicly available APIs and crawling open web domains that provide updated locations for a specific category of POIs (ex. websites that list where all airports are) Processing and modeling to infer additional attributes (ex. inferring what category a POI is) Licensing third-party data to fill in the gaps Once we ingest all of this sourced data, we go through a rigorous de-duping and merging process to make sure the Core Places dataset is clean and ready for use. We also identify spatial hierarchy relationships so end-users can understand how POIs relate to one another. 2. Geometry Building upon this POI data, SafeGraph produces our Geometry dataset, which provides building footprint polygons for POIs. 
As with our Places dataset, we source reliable third-party data and use machine learning to infer the shape of buildings from satellite imagery. Along with the geospatial coordinates, address, and brand affiliation of the POI, Geometry data provides the shape of the place of interest, formatted as Well-Known Text (WKT) for easy mapping and analysis. SafeGraph’s Geometry dataset also includes additional attribution, such as the presence of a parking lot in the provided polygon, building height, and spatial hierarchy information. We disclose whether or not each polygon is synthetic, which indicates whether or not the polygon is inferred from machine learning. ‍ 3. Spend SafeGraph Spend provides aggregated and anonymized debit and credit card transactions at individual points of interest. We build this dataset by partnering with one of the largest transaction data providers, used by the world’s top financial institutions. The data we source from our partner is consumer-permissioned and not tied to individuals. Rather than providing individual transaction amounts with timestamps, Spend data delivers aggregations of transactions taking place on specific dates at specific places. This enables users to analyze how spending is changing at different locations over time, and at the level of geographic granularity needed for their particular use case. Have a specific question? Let us know - we are here to help. #### SafeGraph’s Geometry Dataset now includes Parking Lots SafeGraph is excited to announce a collection of premium geometry rows depicting the shape and size of surface parking lots in the US. Parking Lots data shows the relationship between 6M US places and the surrounding parking lot(s) likely serving those POIs. Looking at parking lots as an extension of the places they serve, this data provides additional insight into the places our customers care about. 
In this blog we will explore the methodology used in creating SafeGraph Parking Lots.

How We Created Parking Lots

SafeGraph Parking Lots provides a 2D polygon that follows the boundaries of parking lots as precisely as possible. Creating these parking lot polygons was a two-step process: first, with the aid of a trained dataset, AI algorithms were able to recognize and delineate individual parking lots from satellite imagery covering the entire US. Next, we refined these results by cleaning up the data and managing edge cases. This was necessary as the results from the AI varied widely. Let’s have a look at some of these challenges and edge cases below.

Edge cases: managing very small and very large parking lots

While AI is able to “see” parking lots as 2D polygons and delineate them as such from satellite imagery, it is not good at matching the edges of drawn boundaries to where they are in reality. This would result in a single parking lot being split up into multiple polygons that in reality should have been merged together as a single geometry. We solved this problem by filtering the data, using a minimum threshold value of 100 square meters for a single parking lot, which can host a maximum of six cars. This approach filters out many small, individual parking lots that got counted as separate parking lots but are in fact part of a larger geometry. Also, it turned out that some large parking lot features consisted of many smaller parking lots that were joined together incorrectly. Using a similar method, these were also filtered out of the dataset.

Dealing with “broken” polygons

Similar to incorrectly delineating the edges of a parking lot, the AI would incorrectly create parking lot features with holes inside them, resulting in “donut” polygons. Examples are polygons drawn around parked cars on the parking lot itself or where a tree forms a shadow on the satellite imagery. This is obviously bad data that needs to be filtered out of the dataset.
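As a rough illustration of the minimum-area filter from the edge-case discussion above, here is a minimal sketch in Python. It is not SafeGraph's pipeline: the function names (`ring_area`, `filter_small_lots`) are hypothetical, polygons are reduced to a single outer ring, and coordinates are assumed to be planar and expressed in meters so that the 100 square meter threshold applies directly.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]

# Threshold quoted in the post; everything else here is illustrative.
MIN_LOT_AREA_M2 = 100.0


def ring_area(ring: Sequence[Point]) -> float:
    """Unsigned area of a closed ring, computed with the shoelace formula.

    Assumes planar coordinates in meters. The ring may or may not repeat
    its first point at the end; a repeated point contributes zero area.
    """
    total = 0.0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def filter_small_lots(
    lots: List[Sequence[Point]],
    min_area: float = MIN_LOT_AREA_M2,
) -> List[Sequence[Point]]:
    """Keep only the rings whose area meets the minimum threshold."""
    return [ring for ring in lots if ring_area(ring) >= min_area]
```

With real longitude/latitude data, the coordinates would first need to be projected to a metric coordinate system before an area threshold in square meters is meaningful.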
To do this, we calculated a statistical metric for each parking lot feature that measures the size of a parking lot relative to its perimeter. Low values could indicate bad data, but not always. Because the AI would often incorrectly mark smaller holes but correctly mark large ones, we decided to measure the size of each hole inside a parking lot polygon and eliminate the ones that didn’t meet a specific threshold value, so that spatially correct holes would be kept: for example, holes indicating an apartment building structure with a parking lot around it.

(Left) AI incorrectly marking small holes outlined in yellow. (Right) Result after applying our threshold value.

Simplifying parking lot geometries

Finally, drawing straight lines around the edges of a parking lot turned out to be a problem for the AI. While you only need two edge points to draw a straight line on a map to delineate one side of a 2D feature, the AI would instead draw jagged lines between extra, unnecessary coordinates. Simplifying these geometries resulted in nicer-looking polygons and fewer data points, which makes the Parking Lots data easier for customers to consume, process, transfer, and visualize.

(Left) Jagged lines created by AI without simplification.
(Right) Cleaner lines after applying simplification.

How can I use SafeGraph Parking Lots?

Here are a few examples of ways data on parking lots can provide greater context to specific use cases.

- Urban Planning: Determine if there is enough surface parking to support consumer traffic and POIs in a given area.
- Site Selection: Understand how the presence, or lack thereof, of parking lots impacts the suitability of individual locations.
- Mapping: Expand upon neighborhood maps by including surface parking polygons and highlighting accessibility to POIs.
- Risk Assessment: Identify the density of impervious surfaces in an area and how it can contribute to flood risk and other natural hazards.

As with all of our datasets, we will continue to enhance our parking lot coverage and grow it to meet people’s use cases. Interested in checking out the data yourself? Download a free sample of SafeGraph Parking Lots.

#### SafeGraph’s Response to Congressional Inquiry on User Privacy

In May 2022, several lawmakers requested information about how SafeGraph handles data regarding physical locations, including family planning clinics. As part of its commitment to cooperate with lawmakers, SafeGraph voluntarily provided the lawmakers with a comprehensive response to answer their questions and to correct inaccuracies about SafeGraph’s data stemming from previous reporting. As the response makes clear, SafeGraph is committed to data privacy and remains at the forefront of privacy innovation by utilizing practices and techniques that are at the cutting edge of industry privacy standards. SafeGraph does not sell data that identifies individuals, nor can our data be “de-anonymized” using any known method of re-identification. Our data products provide historical insights about places, not the individual people who visit them. In the spirit of transparency, a core value of SafeGraph, we are choosing to publicly share our formal response to the lawmakers’ questions.
We are grateful for the opportunity to correct the public record on these important matters. You can read our entire response here. #### Scaling Data As a Service (DaaS) with Platform Engineering Key Takeaways Platform engineering helps data companies scale DaaS by abstracting infrastructure complexity from product teams. SafeGraph uses platform engineering to enable faster data delivery, higher reliability, and better developer experience. Abstraction layers for data lakes, Spark workloads, and Kubernetes reduce operational friction as data products grow. A strong platform engineering approach allows startups to scale data services without sacrificing accuracy or performance. SafeGraph is a geospatial data company that curates high-precision data on millions of places around the globe. Our datasets provide detailed, accurate, and up-to-date information on points of interest and how people interact with those locations. To scale SafeGraph’s Data as a Service (DaaS) for a rapidly growing user base, we built a platform engineering team that serves as an enabler for other teams. In this blog post, we share our approach to platform engineering, how it is implemented at SafeGraph, and how it enables product development teams to deliver data products more efficiently. Why Platform Engineering Is Essential for Scaling Data Startups Why do startups need a platform engineering team? This is not an easy question to answer, especially for companies like SafeGraph. Large organizations such as Google, Facebook, Netflix, and Uber rely heavily on platform teams to build and operate their own infrastructure stacks, as they face challenges related to massive traffic volumes, complex use cases, and scale. These challenges may not appear to apply to smaller startups. Cloud vendors and open-source technologies have reduced, if not eliminated, the need to build infrastructure from scratch. 
However, as highlighted in prior discussions, platform engineering is still essential to integrate vendor solutions into a cohesive, cloud-based infrastructure that supports business needs. In addition, SafeGraph faced several challenges that platform engineering helped address:

- Product developers were occasionally blocked by errors or issues in tools such as Apache Spark and Kubernetes for data services, where vendors could not always provide adequate support due to limited domain context.
- Introducing innovative technologies often creates friction by requiring changes to established workflows and working habits.
- Selecting among multiple solutions for specific use cases involved complex trade-offs, including concerns around vendor lock-in.

The Role of the Platform Team in Solving Key Challenges

The platform team serves as a domain expert responsible for unblocking product development teams from issues related to development tools and infrastructure. In addition to addressing immediate challenges, the platform team builds long-term infrastructure solutions that support the rapid growth of engineering teams as the business scales. Beyond technical enablement, the platform team also plays a key role in fostering a strong engineering culture by reducing operational overhead and minimizing interruptions to product development workflows.

Ready to build on high-precision data? Get a free sample of SafeGraph’s datasets and see how accurate, regularly refreshed data can support analytics, modeling, and decision making at scale.

Platform Engineering at SafeGraph: Scaling Infrastructure to Support Data Growth

In this section, we explain how SafeGraph fulfills the core missions of platform engineering.

Resolving Infrastructure Bottlenecks

Unblocking other teams from immediate issues related to infrastructure and development tools is one of the most critical responsibilities of platform engineers.
Many startups build their infrastructure on top of open-source technologies such as Kubernetes and Apache Spark. While open source offers flexibility, it does not always mean low cost and can introduce significant operational overhead or become a bottleneck in the product development pipeline. As a result, the platform engineering team must act as the domain expert for these open-source systems. Success in this role has a twofold impact on the platform team and the broader organization. Delivering high-quality products to customers in a timely manner is always a top priority for startups. In-house experts who deeply understand the technologies used in product development are especially valuable in challenging situations. Although platform engineering often involves short-term costs, such as temporary workflow interruptions or resource diversion, solving immediate problems builds trust across engineering teams and offsets these costs by reducing the time and effort required to resolve issues without sufficient domain expertise. One example of how the platform team at SafeGraph addressed immediate product development challenges involved improving the performance of our Spark-based data processing stack. A Spark job that processed a small volume of data could run for up to 24 hours, only to fail frequently and stall downstream consumers.  This work is a practical example of Apache Spark optimization in production, where deep understanding of execution internals directly improved reliability and performance at scale.  After investigation, the platform team identified two root causes: The single-threaded task serialization mechanism in Spark’s DAG Scheduler was overwhelmed by a multi-threaded job submission approach, leaving executors idle. The Spark job implementation relied on Scala’s parallel collections, which imposed an expensive hash code calculation in the default fork-join pool. 
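To make the submission problem concrete, the sketch below shows one possible way to throttle job submission: group jobs into small batches and run each batch on a dedicated, bounded thread pool instead of fanning everything out at once through a shared default pool. This is a Python stand-in with hypothetical names (`batched`, `submit_in_batches`, `run_job`); the post does not show the actual Scala/Spark code or say how batching was implemented, so treat this as an illustration of the general pattern only.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def batched(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be >= 1")
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def submit_in_batches(
    jobs: Iterable[T],
    run_job: Callable[[T], R],
    batch_size: int,
    max_workers: int,
) -> List[R]:
    """Run jobs batch by batch on a dedicated, bounded thread pool.

    Each batch completes before the next one is submitted, so the number
    of in-flight submissions never exceeds `batch_size`. Results are
    returned in the same order as the input jobs.
    """
    results: List[R] = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for batch in batched(jobs, batch_size):
            # pool.map preserves input order; extend() waits for the batch.
            results.extend(pool.map(run_job, batch))
    return results
```

In a real Spark driver, `run_job` would be whatever function triggers a Spark action; here it is just a placeholder callable.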
By batching job submissions and avoiding the default fork-join pool, the team reduced execution time from 24 hours to approximately 3 hours, meeting reliability expectations and restoring downstream stability. We have many other examples of how platform teams can resolve immediate and unexpected issues to unblock product development teams and maintain focus on business priorities.

Long-term Infrastructure Solution

Another key mission of the platform engineering team is to build long-term infrastructure solutions that support the company’s growth. For startups, building infrastructure often involves introducing the right technologies for specific purposes, such as managing service and data job configurations or simplifying service deployment.

Introducing New Technology

Introducing innovative technology also brings several challenges. New tools often require significant changes to existing workflows that engineers are already accustomed to, creating friction between short-term product delivery and long-term infrastructure benefits. This trade-off is a common challenge for many companies. Evaluating whether technology can serve long-term needs is equally difficult. This challenge is evident across areas such as data warehousing, stream processing, and messaging queues, where a wide range of technologies exist. These options differ significantly in design and capabilities and can impose high switching costs when business requirements change. To address these challenges, the platform engineering team focuses on building sustainable and efficient infrastructure solutions by:

- Minimizing operational overhead and easing the adoption of new technologies
- Enabling flexibility across different solutions while keeping switch-over costs low

Both objectives can be achieved through effective infrastructure abstraction.
Abstraction for Easier Adoption One example of addressing complexity through infrastructure abstraction is SafeGraph’s machine learning (ML) model management, deployment, and versioning system. This system effectively functions as an MLOps abstraction layer, shielding ML engineers from tooling complexity while preserving robust model governance.  We have built multiple ML models to support the delivery of high-quality data products. As our customer base and user requirements have grown, the number of ML models has increased, creating challenges in managing them effectively. MLflow emerged as a promising solution to address these challenges. However, introducing MLflow at SafeGraph came with significant costs. Adoption required adding configuration files for each ML project and modifying existing workflows to include additional manual steps, such as running commands before committing models. One example of the low return on investment from this change was the effort required to display Git commit hashes in the MLflow Run UI. To leverage MLflow’s built-in functionality for this purpose, ML engineers were required to complete multiple steps, including: Adding a project description file to their project directory   Changing their existing workflows, such as local Python runners or Jupyter notebooks, to use the MLflow command line to run projects  These requirements introduced friction, especially given that the underlying goal was simply to surface metadata in the UI. Similar challenges arose when displaying training data versions in the MLflow UI, where engineers had to manually log parameters using the MLflow API, often repeating the same steps across projects. Overall, while MLflow is a powerful MLOps tool, it introduced significant distractions for ML engineers, diverting attention from their primary goal of applying state-of-the-art techniques to improve data product quality. 
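One common way to remove this kind of repetition is a thin wrapper that records the Git commit and the training data version automatically at the start of every run. The sketch below is hypothetical and is not SafeGraph's library: `TrackedRun` and `current_git_commit` are made-up names, and the logging backend is injected as a plain `log_param(key, value)` callable so the example runs without any tracking server. In practice that callable could be a function such as `mlflow.log_param`.

```python
import subprocess
from typing import Callable, Optional

# Any function that records a key/value pair for the current run.
LogFn = Callable[[str, str], None]


def current_git_commit() -> str:
    """Return the HEAD commit hash, or "unknown" if Git is unavailable."""
    try:
        result = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout.strip()
    except (OSError, subprocess.CalledProcessError):
        return "unknown"


class TrackedRun:
    """Context manager that logs run metadata without manual steps."""

    def __init__(
        self,
        log_param: LogFn,
        data_version: str,
        git_commit: Optional[str] = None,
    ) -> None:
        self._log_param = log_param
        self._data_version = data_version
        self._git_commit = git_commit

    def __enter__(self) -> "TrackedRun":
        # Capture metadata once, up front, so engineers never repeat it.
        commit = self._git_commit
        if commit is None:
            commit = current_git_commit()
        self._log_param("git_commit", commit)
        self._log_param("training_data_version", self._data_version)
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        return False  # never swallow exceptions from the training code

    def log_param(self, key: str, value: object) -> None:
        """Log an ordinary parameter through the same backend."""
        self._log_param(key, str(value))
```

A training script would then wrap its existing code in `with TrackedRun(...) as run:` and keep using its usual local runner or notebook.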
Rather than adopting MLflow strictly according to official documentation or vendor guidance, we built an internal library that exposes APIs for logging parameters and metrics, uploading models, and integrating with Git. These APIs automate previously manual steps and automatically capture metadata such as artifact versions and data read/write versions. As a result, ML engineers can focus on their core work while benefiting from robust MLOps capabilities without disrupting their existing workflows. Abstraction for Flexibility  Another benefit of abstraction is maintaining SafeGraph’s flexibility in an uncertain and evolving technical landscape. This approach is an example of data lake format abstraction, allowing teams to work with versioned datasets without coupling to a specific storage technology. When we began building SafeGraph’s data lake, several technologies in the market, including Delta Lake, Apache Iceberg, and Apache Hudi, could serve as its foundation and provide key capabilities such as data versioning and time travel. We narrowed the decision to Delta Lake and Apache Iceberg, and the choice proved challenging for several reasons. Delta Lake is primarily developed by Databricks. Although it is open source, Databricks also offers versions with proprietary features within the Databricks Spark Platform. As a result, adopting Delta Lake can implicitly lock computing and storage layers to the Databricks ecosystem. Apache Iceberg, despite its highly active open-source community, was still relatively early at the time, and we encountered several usage constraints and bugs during evaluation. Additionally, differences in how Delta Lake and Iceberg implement similar functionality can result in high switching costs if a format change is required in the future. To address this dilemma, we built an internal library that provides APIs for common data lake operations, such as reading and writing versioned datasets and viewing dataset history. 
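As a minimal sketch of what such a format-agnostic interface could look like, the Python example below defines the three operations named above (write, read, history) as an abstract class and provides an in-memory backend. All names (`VersionedDatasetStore`, `InMemoryStore`) are hypothetical and the in-memory backend is only a stand-in; a real implementation would delegate to a table format such as Delta Lake or Apache Iceberg behind the same methods.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List, Optional

Rows = List[Dict[str, Any]]


class VersionedDatasetStore(ABC):
    """Format-agnostic interface that product code depends on."""

    @abstractmethod
    def write(self, name: str, rows: Rows) -> int:
        """Store a new snapshot and return its version number."""

    @abstractmethod
    def read(self, name: str, version: Optional[int] = None) -> Rows:
        """Return the latest snapshot, or a specific older version."""

    @abstractmethod
    def history(self, name: str) -> List[int]:
        """List the available version numbers, oldest first."""


class InMemoryStore(VersionedDatasetStore):
    """Stand-in backend used here only to keep the sketch runnable."""

    def __init__(self) -> None:
        self._snapshots: Dict[str, List[Rows]] = {}

    def write(self, name: str, rows: Rows) -> int:
        versions = self._snapshots.setdefault(name, [])
        versions.append([dict(row) for row in rows])  # copy to freeze
        return len(versions) - 1

    def read(self, name: str, version: Optional[int] = None) -> Rows:
        versions = self._snapshots[name]
        index = len(versions) - 1 if version is None else version
        if not 0 <= index < len(versions):
            raise KeyError(f"{name} has no version {version}")
        return [dict(row) for row in versions[index]]

    def history(self, name: str) -> List[int]:
        return list(range(len(self._snapshots.get(name, []))))
```

Because callers only see `VersionedDatasetStore`, swapping the backend means writing one new subclass rather than touching product code.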
While these operations are implemented using Delta Lake or Apache Iceberg, product development teams do not need to be aware of the underlying format. This abstraction ensures that switching to alternative formats in the future would require minimal or no code changes. Engineering Culture Enabler A well-established and healthy engineering culture, which is essential for any strong technical organization, does not come without cost. These costs arise from changes in mindset and behavior and from unavoidable tooling overhead, even when there is alignment on adopting new practices. The platform team serves as an enabler of engineering culture by reducing these associated costs. At SafeGraph, we aim to build an engineering culture that values operational excellence in services. Operational excellence is driven by comprehensive monitoring, timely alerting, and other capabilities that help engineers improve service SLAs and debug issues efficiently. These capabilities are best built and maintained as part of platform engineering, rather than being delegated to individual product teams and requiring them to allocate limited resources to develop tools from scratch. The platform team is also well positioned to promote the desired engineering culture. Because the “products” delivered by the platform team are used across teams, cultural practices that lead to success become highly visible, and their benefits are easily shared and reinforced organization-wide. For example, the platform team at SafeGraph built solutions to minimize or eliminate manual steps in using Terraform, a tool that is widely known for its steep learning curve. As the user experience improved progressively across teams, the benefits became widely shared and reinforced a culture focused on minimizing unnecessary human intervention in processes. 
By resolving immediate challenges, building cost-effective long-term infrastructure, maintaining a future-oriented approach, and enabling strong engineering culture, platform engineering plays a critical role in optimizing engineering organizations for efficiency and sustainability. FAQ’s 1. What is platform engineering in data companies? Platform engineering is the practice of building internal platforms that abstract infrastructure complexity, allowing data and product teams to focus on delivering reliable data products instead of managing systems. 2. How does platform engineering support DaaS scalability? By standardizing data pipelines, compute workloads, and deployment processes, platform engineering enables data-as-a-service platforms to scale efficiently without increasing operational overhead. 3. Why is platform engineering important for startups offering data services? Startups benefit from platform engineering because it reduces technical debt early, improves system reliability, and supports rapid growth without constant re-architecture. 4. How does SafeGraph use platform engineering? SafeGraph uses platform engineering to manage large-scale location data pipelines, optimize Spark workloads, and ensure consistent, high-quality data delivery to customers. 5. What technologies are commonly used in platform engineering for data platforms? Common technologies include Kubernetes, Apache Spark, data lake abstraction layers, and MLOps tooling to support scalable and resilient data services.  
#### Setting up SafeGraph for a Prosperous Future

We made the difficult yet important decision to lower the cash burn at SafeGraph. The hardest part about the decision was to reduce the size of the SafeGraph team by approximately 25%.

The cost of hard decisions

When reducing expenses, there are a lot of easy decisions you can make before facing the tough ones. You negotiate harder on software expenses. You decrease your paid advertising. You cut contractors. But ultimately you need to make some really hard decisions.

You need to cut some high-potential investments. Even if you cut in all those places, you likely still need to reduce the size of the team (which is the largest expense for most tech companies).

This is really hard. Especially when the people you work with are so incredibly talented and passionate. That is the case with SafeGraph. These are super talented people. People I definitely would love to work with again. They’re great people we hope to hire again. They are genuinely kind, enthusiastic and fun people I care deeply about.
They brought their 100% to work every day and now they are being told that they are not going to participate in the next step of the journey.

It is incredibly challenging and I’m fully responsible.

These are people you care about, who gave it their all and now they are being asked to leave. It is really hard. Yes, you meet with each person individually and treat them with dignity. Yes, you give them a good severance. Yes, you help them find new jobs. Yes, you give them support. But it’s still an incredibly tough situation.

Cash is the best offense

Taking your own advice is hard. As an investor in companies, you get to dole out advice to founders. The cost of the advice is cheap and you don’t have to actually make the hard choices yourself.

For the last six months, I’ve been giving advice to founders that they should conserve their cash. But I was not taking my own advice.

In today’s environment, cash is becoming more and more valuable (and it will be very hard to raise more). Strangely enough, cash tomorrow will be much more valuable than cash today. Usually the opposite is true. Usually cash today is much more valuable than cash tomorrow. It’s going to be more important than ever to allocate capital efficiently and effectively. So if you have a lot of money in 12-18 months, you are going to have the ability to proactively act on growth opportunities. But you can only do that if you have a lot of cold hard cash on hand.

It is a bit weird because although we are in an inflationary macro environment (where cash is less valuable every day), the tech environment is getting very deflationary. Tech salaries are actually flat. Software costs are going down (much easier to negotiate). And asset prices (for acquisitions) are going way down (cut by 50-70%). So for tech companies, expect expenses to fall in the next 18 months.
(note: this is being written in June 2022).

That means that the cost of an “investment” is higher and you need a much higher expected return to make investments in today’s environment. We see it already happening in the Venture Capital world where the rate of capital deployment is slowing down drastically. Since capital isn’t flowing in, the return on every investment a startup makes needs to be much, much higher. That means for any internal investment you are making, it should have either a much higher chance of paying off or a much higher overall return if it does pay off. So the more you save today, the more you have available for the future. Conserving cash means that you have more ability to go on offense in the future. If you think cash will be much more valuable in the future, conserve cash today.

Now back to SafeGraph

As CEO of SafeGraph, I had not been taking my own advice. It is common to think you are different. And I fell for it. I did not listen to my own advice in Jan, Feb, March, April, and May. Now it is time to listen. We made this decision even though we had two years of cash in the bank before the cuts. Because in this environment, two years is not enough. We will do everything to help our former teammates land in even better roles than they’re leaving. They will forever be SafeGraph alums and it’s my personal goal to make sure every SafeGraph alum is highly successful throughout the rest of their careers.

If you’re hiring high-quality talent, please email me at ah@safegraph.com. I’d happily serve as a reference for any of the employees impacted by my decision.

#### Should Your Organization Build or Buy a POI Database?

Key Takeaways Building a POI database offers control and customization but requires significant upfront and ongoing investment. Buying a partial POI dataset still demands internal resources for cleaning, validation, and maintenance. POI data changes continuously, making long-term upkeep one of the biggest hidden costs.
Data quality is critical for enterprise analytics and products, and organizations must consider the opportunity cost of trying to maintain POI data quality at scale. For many organizations, buying a fully managed POI database offers the fastest and most scalable path to value.

Explore Your POI Data Options

As point of interest (POI) data grows in importance for businesses, both for outward-facing applications and inward-facing dashboards, organizations face a fundamental question: should they build a POI database internally or buy one from a dedicated provider? POI data supports a wide range of use cases, including consumer-facing map applications, trade area analysis, market forecasting, site selection, and investment research. Because the quality of these outputs depends directly on the quality of the underlying data, the build vs. buy decision is not a short-term technical choice. It is a long-term strategic investment. At first glance, building a POI database may seem attractive due to the promise of customization and control. Buying POI data, on the other hand, may appear faster but potentially limiting. The decision involves deeper trade-offs related to cost, data quality, maintenance, and objectivity. Understanding these factors clearly helps organizations choose an approach that aligns with their business goals and technical capacity.

Understanding the Build vs. Buy POI Data Decision

Whether your organization uses POI data to power a map, analyze foot traffic, or support internal analytics, one principle remains constant: the accuracy of your insights is only as good as the accuracy of your data. POIs open, close, move, and change attributes more frequently than many teams expect. Maintaining a large-scale, highly accurate POI database therefore requires continuous effort. Most organizations evaluating POI data strategies tend to fall into one of three approaches: Build a POI database from scratch.
Buy a partial POI database and clean or improve it internally. Buy a complete, externally managed POI database. Each option comes with its own costs and trade-offs. Option 1: Build a POI Database from Scratch Building a POI database internally can make sense for organizations whose core business depends heavily on geospatial data. The primary benefits include customization, ownership, and full control over how the data is structured and used. However, these benefits come with significant responsibilities. Infrastructure and Cost Building a POI database requires substantial upfront investment in infrastructure, tooling, and talent. Organizations must design and deploy data pipelines for sourcing, cleaning, merging, and storing POI data. This typically involves hiring experienced software engineers and data scientists, often costing well over $150,000 per role annually, in addition to cloud computing and storage expenses that can quickly reach six figures. These costs do not end after launch. As coverage expands or new attributes are added, infrastructure must scale accordingly. Acquiring Base Data Once infrastructure is in place, organizations must source POI data. Open datasets such as OpenStreetMap may seem appealing due to their low cost, but they often suffer from incomplete coverage, inconsistent updates, limited documentation, and licensing restrictions that may prohibit commercial use. Web scraping publicly available POI information is another option, but it introduces legal considerations, additional engineering complexity, and costs that can rival licensed datasets. Cleaning, Matching, and Validation Raw POI data is rarely usable without extensive processing. Entries must be deduplicated, verified, geocoded, classified, and enriched with attributes. Variations in naming, addresses, and coordinates across datasets make matching records that refer to the same place particularly difficult. 
Advanced machine learning techniques are often required to perform this work at scale. Poorly cleaned data can lead to flawed internal analysis and unreliable consumer-facing products, eroding trust among stakeholders and customers. Opportunity Cost Beyond direct expenses, building a POI database diverts engineering and data science resources away from an organization’s core products. For companies where geospatial data is not the primary value proposition, this opportunity cost can slow innovation and growth elsewhere in the business. Option 2: Buy a Partial POI Database and Improve It Internally A middle-ground approach is to buy or license a partially complete POI dataset and then clean, enrich, and maintain it internally. This can reduce some upfront data collection costs while still allowing customization. However, this option still requires significant internal effort. Organizations must carefully evaluate data quality, including geographic coverage, update frequency, attribute completeness, and documentation. Older or infrequently updated datasets are more likely to contain errors, omissions, and outdated information. Even after licensing data, teams must invest in infrastructure and skilled personnel to correct inaccuracies, resolve duplicates, and maintain freshness over time. As scale increases, these ongoing costs can approach those of building a database from scratch.  Option 3: Buy a Fully Managed POI Database The third option is to buy or license a complete POI database from a provider that handles data collection, cleaning, verification, and ongoing maintenance. This approach significantly reduces time-to-value. Because the data is delivered in a ready-to-use format, organizations can focus their internal resources on analysis, modelling, and application development rather than data engineering. A fully managed database also reduces long-term maintenance risk. 
Dedicated providers invest continuously in improving accuracy, coverage, and freshness, spreading these costs across many customers rather than placing the burden on a single organization. In practice, organizations that switch from fragmented or poorly maintained POI datasets often see dramatic reductions in the time spent cleaning data, allowing teams to deliver insights faster and act on opportunities sooner. Core Factors to Consider When Choosing to Build or Buy POI Data Regardless of which option an organization is considering, several core factors should guide the decision. Cost Cost includes more than licensing fees or salaries. It also encompasses infrastructure, cloud resources, tooling, and the long-term expense of keeping POI data accurate and current. Quality High-quality POI data requires continuous validation and enrichment. Dedicated providers build specialized teams and processes to maintain this quality at scale, which can be difficult to replicate internally. Maintenance POI data is not static. Without continuous updates, even a well-built database quickly becomes outdated. Maintenance capacity is often the deciding factor between success and failure. Objectivity POI data that is created and consumed internally may introduce unintentional bias. Using third-party data provides a level of separation that helps maintain objectivity across industries and use cases. Conclusion The decision to build or buy a POI database is ultimately a strategic one. While building offers control and customization, it demands sustained investment in infrastructure, talent, and ongoing maintenance. Partial solutions reduce some barriers but still require significant internal effort. Request a Demo of SafeGraph Places Data FAQs 1. What does it mean to buy POI data? Buying POI data typically involves licensing a dataset from a third-party provider that collects, cleans, and maintains the data. 2. Is building a POI database ever the right choice?
Yes, particularly for organizations whose core business depends heavily on custom geospatial data pipelines and who have the resources to maintain them long term. 3. How often does POI data need to be updated? POI data changes continuously, so frequent updates are essential to maintain accuracy. 4. What are the risks of using open-source POI data? Open-source data may be incomplete, outdated, inconsistently maintained, or restricted by licensing terms. 5. How does POI data quality affect analytics? Poor-quality POI data can lead to incorrect models, flawed insights, and loss of trust. 6. Does buying POI data eliminate all internal work? No, but it significantly reduces the engineering and maintenance burden compared to building or partially managing a dataset. 7. How should organizations evaluate POI data providers? Key criteria include coverage, update frequency, attribute completeness, documentation, and long-term maintenance commitment. #### Solving Social Media Market Failures: World of DaaS interview with MIT Professor, Sinan Aral New podcast with Sinan Aral, MIT Professor and Author of “Hype Machine”.
Our conversation is available everywhere (Apple Podcasts, Spotify, YouTube, etc.). Please subscribe, follow, and review. Sinan Aral was one of the first people to seriously pay attention to and study social media. I am super interested in the mechanics of social media. Some might say I have an unhealthy obsession with exploring virality and the Wall Street Bets movement. I had a lot of fun diving in with Sinan. Here are some highlights from my conversation with Sinan Aral. Human social networks were already homophilous. Social media algorithms turbocharged this. People tend to make friends with people who are like them -- this is old news. They move to communities where everyone belongs to the same political party. They subscribe to niche news that they want to hear. But we’re now seeing this amplified by social media algorithms. These algorithms are optimized to recommend connections that are even more like you than you would run into in everyday life. This creates bubbles where you are completely blocked off from other perspectives -- your entire stream of information becomes tailored to a very specific worldview. Things spread on social media because they’re “novel” Sinan published a 10 year study in Science on the spread of false news on Twitter. The people spreading false news had fewer followers, followed fewer people, were less often verified, and had been on Twitter for less time. So what drove these viral posts? They were novel. Novelty isn’t a well defined concept, but you can think of it as a classic man bites dog situation. These posts inspired shock and awe -- they were different from what people had seen otherwise. Still it’s really hard to predict what will go viral While most viral posts are “novel”, they also tend to have qualities that are unique to them. These unknown quantities are key to why they go viral. So this makes it incredibly hard to create reliable models that predict virality. 
Breaking up a company won’t create competition Trying to break up companies is slow and laborious. It’s not clear how to break up all monopolies -- many of the biggest tech companies are very intertwined. And simply breaking up a company doesn't prevent the next company from doing that same thing. To create sustainable competition, we would need regulation that applies not just to the current market leader, but anybody that comes after that market leader. The Wall Street Bets movement is getting bigger and stronger Wall Street Bets showed us that online crowds could successfully coordinate collective behavior. And institutional investors are now part of the conversation too. They’re monitoring and trading against it or adding fuel to the fire. When a big hedge fund goes down, there are other hedge funds helping the hedge funds go down. They’re helping the crowds. It's hard to predict where we will see the next successful group go. But it will likely evolve, and we will likely see movement toward new sub-reddits and social platforms. Note: if you enjoy this episode of World of DaaS, be sure to follow Sinan Aral on Twitter. Hope you enjoy this episode of World of DaaS. I would really appreciate it if you subscribe and review (Apple Podcasts, Spotify, YouTube, etc.). #### Supercharge Your Revenue with Places Data For sales teams, generating high-quality leads involves more than just finding potential customers; it requires a deep understanding of the market landscape. This is especially true for businesses with physical locations, such as retail chains and local stores. In this blog, we'll dive into how point of interest (POI) data can transform lead generation efforts and boost revenue, particularly by providing detailed insights into retail brands and chains in Miami, Florida (download the dataset at no cost to follow along). We will also explore how the categorization of locations can further refine your lead targeting and sales strategies.
The Importance of POI Data for Lead Generation POI data provides detailed information about specific physical locations, such as retail stores, restaurants, and other commercial establishments. This data is crucial for sales teams aiming to enhance their lead generation processes through better market mapping, competitive analysis, and targeted outreach. POI data is invaluable for lead generation by way of: Market Mapping: Visualizing the distribution of businesses within a specific area helps identify market saturation and potential opportunities. Lead Scoring: By examining the location and characteristics of potential leads, sales teams can prioritize high-value prospects. Competitive Analysis: Understanding the density and proximity of competitors allows businesses to identify less competitive areas for better market penetration. Leveraging Categories to reach your ICP An ideal customer profile (ICP) represents the perfect customer for a company's products or services. Creating an ICP allows sales and marketing teams to craft more effective strategies that lead to higher conversion rates. SafeGraph Places data leverages high precision NAICS/category codes that enable sales and marketing teams to segment leads by ICP. Real-World Application: Targeting Retail Brands for Supply Chain Solutions A major logistics company was looking to identify potential retail brands that could benefit from their cross-dock solutions. Cross-docking involves moving products directly from incoming to outgoing transportation with minimal storage time. The goal was to find clusters of retail stores that could use this service to streamline their supply chains. Identifying Opportunities Using POI data, the logistics company mapped out retail locations for key brands in Miami. They focused on finding areas with high concentrations of stores like Ross, TJ Maxx, and HomeGoods. This clustering analysis helped identify regions where a cross-dock solution would be most beneficial. 
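The clustering step described above can be sketched in a few lines of Python. This is a minimal illustration, not the logistics company's actual method: the store records, the candidate site names, the coordinates, and the 5 km radius are all made up, and the field names `location_name`, `latitude`, and `longitude` are assumed for the example. It counts how many target-brand stores fall within a radius of each candidate cross-dock site and ranks the sites by that count.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two latitude/longitude points, in km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def stores_within_radius(center, stores, radius_km):
    """Return the store records that lie within radius_km of center (lat, lon)."""
    lat0, lon0 = center
    return [
        s for s in stores
        if haversine_km(lat0, lon0, s["latitude"], s["longitude"]) <= radius_km
    ]


def rank_candidate_sites(candidates, stores, radius_km=5.0):
    """Rank candidate sites by how many stores sit inside the radius."""
    scored = [
        (name, len(stores_within_radius(loc, stores, radius_km)))
        for name, loc in candidates.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical records; coordinates are illustrative, not real store locations.
stores = [
    {"location_name": "Ross", "latitude": 25.77, "longitude": -80.19},
    {"location_name": "TJ Maxx", "latitude": 25.78, "longitude": -80.20},
    {"location_name": "HomeGoods", "latitude": 25.95, "longitude": -80.14},
]
candidates = {"downtown": (25.775, -80.195), "north": (25.95, -80.15)}

ranking = rank_candidate_sites(candidates, stores, radius_km=5.0)
print(ranking)
```

A simple radius count is only one way to express "high concentration". Drive-time areas or a density-based clustering method could be swapped in without changing the overall workflow.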
Prioritizing Leads The team then prioritized leads based on the density of stores within a specific radius. By targeting areas with a high concentration of potential customers, they optimized their outreach efforts. They used additional data points like store size and proximity to major transportation routes to refine their lead scoring. ‍ Enhancing Sales and Marketing Efforts With detailed insights from POI data, the logistics company tailored their sales pitches to highlight how their cross-dock solution could address specific pain points for each target customer. They demonstrated how their service could improve efficiency, reduce costs, and enhance the overall supply chain performance. Transforming Lead Generation with POI Data Incorporating POI data into lead generation processes can significantly enhance the effectiveness of sales teams. Detailed location data provides a comprehensive view of the market, helping businesses identify new opportunities, prioritize high-value leads, and stay ahead of competitors. By leveraging categories within POI data, sales teams can tailor their approaches to specific niches, leading to more precise targeting and better sales outcomes. Integrating this data into sales intelligence tools and CRM systems offers the insights needed to boost revenue and achieve sales goals. Whether targeting retail brands in Miami or other markets, accurate and comprehensive POI data can revolutionize your lead generation efforts, driving success through informed decision-making and strategic prospecting. Ready to transform your lead generation strategy? Download our comprehensive Retail Brands in Miami POI dataset for free and discover how SafeGraph Places data can help your sales team drive revenue growth. #### The Importance of Reliable, Accurate, and Timely Open and Close Metadata for POIs The world is dynamically changing and so are businesses. By some estimates, 20% of new businesses close within one year of opening. 
Even before the pandemic accelerated business closures, the Small Business Administration reported in 2018 that since 1990, about 7-9% of all businesses (not just new ones) close each year. At the same time, the number of new business applications has continued to increase year over year, adding more complexity to staying on top of what businesses are in operation and where. Source: Oberlo When working with points of interest (POI) data, organizations are often using those POIs as a source of truth for what is happening in the real world. This truth set is then used to base critical business decisions off of. But with such a rapidly changing business landscape, POI data is often stale. Many providers only update their POI databases annually or quarterly, and some do not thoroughly vet their inputs. This bad information can result in serious consequences for business strategies. But there is a solution. POI data that includes open and close metadata reflects the real-world dynamic of businesses opening and closing over time. Within a POI dataset, a date is marked for when the business is first opened, and the same if it closes. Why open and close metadata matters in POI databases POI users need reliable data in order to be able to analyze it and make the right strategic decisions. Here are some examples of organizations and the business decisions they make based on when places open and close: Business owners looking for new market opportunities or performing site selection need the latest and most accurate POI data to zero-in on the ideal place to operate - or where to avoid. They may want to see where complementary businesses have recently opened (think juice bars looking for new yoga studios), or where competitors have recently closed. Open and close metadata reveals where others have had success or failure, and can be an indicator of how a new venture might do in that place. 
Analysts doing investment research rely on open and close metadata in a geographic area to monitor larger economic trends that can impact their decisions. POIs that reflect recently closed businesses in one or more areas can point to bad economic performance on a larger scale, but also how different regions are performing compared to one another. For example, certain areas might be more or less affected than others in an economic crisis, or showing signs of recovery sooner than others. Governments use open and close metadata for urban planning. By monitoring where businesses open and close, they can create maps showing service areas for different community amenities, such as gyms, healthcare centers, public transportation, and educational resources. This allows urban planning departments to identify underserved areas and better allocate resources. How SafeGraph curates accurate, timely, and reliable POI data In order to accurately report open and close dates in our datasets, SafeGraph has been tracking openings and closings for both branded and non-branded places since July 2019. If a POI from an existing source repeatedly disappears in our build pipeline, it is flagged as closed_on during the month in which it first disappears. As of January 2022, SafeGraph’s Core Places data contains closed_on dates for 892K+ global places, spanning 100 countries and more than 4,000 brands. This includes 718K+ unbranded POIs representing smaller stores that are harder to report on. ‍ If a new POI from an existing source repeatedly appears in our build pipeline, it is flagged as opened_on during the month in which it first appears. As of January 2022, there are 125K+ POIs globally with an opened_on date, spanning 113 countries and more than 4,000 brands. On average, more than 1,000 brands are flagged with a store opening or closure each calendar month. These flags are added to the Core Places product permitting final QA checks and overall data hygiene. 
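The flagging rules described above can be sketched in Python. This is a minimal illustration and not SafeGraph's build pipeline: the monthly build snapshots, the POI ids, the function name `flag_open_close`, and the `min_repeats` threshold (how many consecutive builds count as "repeatedly") are all assumptions made for the example.

```python
def flag_open_close(builds, min_repeats=2):
    """Assign opened_on / closed_on months from chronologically ordered builds.

    builds: list of (month, set_of_poi_ids) tuples, oldest first.
    min_repeats: assumed number of consecutive builds a POI must be
        present (or absent) before the change is treated as real.
    """
    months = [month for month, _ in builds]
    all_ids = set()
    for _, ids in builds:
        all_ids |= ids

    flags = {}
    for poi in sorted(all_ids):
        present = [poi in ids for _, ids in builds]
        entry = {}
        for i in range(1, len(builds) - min_repeats + 1):
            run = present[i:i + min_repeats]
            # Opening: never seen before month i, then seen in every build of the run.
            if "opened_on" not in entry and not any(present[:i]) and all(run):
                entry["opened_on"] = months[i]
            # Closure: seen in the previous build, then missing for the whole run.
            if "closed_on" not in entry and present[i - 1] and not any(run):
                entry["closed_on"] = months[i]
        if entry:
            flags[poi] = entry
    return flags


# Hypothetical build history: "b" disappears in December, "c" first appears in November.
builds = [
    ("2021-10", {"a", "b"}),
    ("2021-11", {"a", "b", "c"}),
    ("2021-12", {"a", "c"}),
    ("2022-01", {"a", "c"}),
]
print(flag_open_close(builds))
```

Requiring the change to persist across several builds keeps a POI that drops out of a single build from being marked closed, while the recorded date is still the month in which the change was first observed.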
An analysis comparing company-reported counts with our open/close data showed that our methodology works and produces reliable, up-to-date, and accurate POI data. Schedule a demo to learn more. #### The SafeGraph Standard: Core Attributes of SafeGraph Employees Hiring is really hard. It’s one of the toughest jobs that leadership has at any company. At SafeGraph, we try to codify and communicate what we look for in a candidate. That keeps us focused on what matters and attracts (in theory) only those candidates who will thrive in our unique culture. If you want to know what we look for in future SafeGraphers, and if you might be a good fit for our team, read on! SafeGraphers in pre-COVID days Identifying & Hiring A-Players Our goal is to look for and hire A-Players. These are people who bring ten times the value to their company as compared to their peers. These are also people who, if you asked everyone they’ve ever worked with, would be placed in the top 10 percent of employees. A-Players are endlessly resourceful, creative, responsive (get back to people quickly), have likely built something of their own, use leverage to work smarter, and push their boss to get more done (not the other way around). Like all high-performing companies, we avoid perennial low-performers: people who will never be amazing in any environment. They are generally easy to spot and don’t get very far in the interview process. Everyone is looking for A-Players. We are not the only company looking for A-Players. Other companies are too, and because it’s easy to spot an A-Player who is amazing under all circumstances (these people are known entities), there’s intense competition for their services. This makes it generally difficult to recruit them - everyone wants them, and few companies get them. A small percentage of people are perennial high-performers -- they will be A-Players in any environment.
A small percentage of people are perennial low-performers -- they will be C-Players in any environment. Most people aren't clearly A-Players or C-Players. A-Players in some environments might be B- or even C-Players in other circumstances. Most people can be exceptional in specific cultures, even if they have not excelled in a prior work environment. Thriving in different organizations takes different skills and personalities. Therefore, we look for people who will be A-Players in the SafeGraph environment. When hiring at SafeGraph, our task is to find the people that will be exceptional within SafeGraph (but not necessarily in other places). For instance, someone that excels at McKinsey might not necessarily be someone that does great within a fast-paced start-up. We filter for people who are already aligned (or ready to align) with our values. We talk a lot about the SafeGraph values. Because the SafeGraph values are aspirational, candidates do not need to embody all our values on their first day. In fact, I often stray from the SafeGraph values myself (I constantly find myself doing too many things and doing things in parallel rather than in series). The super important thing is that the candidate believes in our values and wants to help the company achieve them. While some values are aspirational, other values are non-negotiable. One of the reasons that hiring is so hard is that there are so many important criteria in a coworker. We could screen for 100 great attributes, but of course, no one particular candidate would fit into all 100. The important thing is to have a few core things you measure everyone on. Here are the three core attributes of every SafeGrapher: - Judgment - Growth - Pace Judgment is a non-negotiable value. We must hire people with great judgment from day one. This can be summed up as always acting in the best long-term interest of the company.
There is no specific set of rules that will tell you what’s in the company’s best long-term interest at any given juncture, so you must use judgment to make decisions quickly. It’s my job, as CEO, to make sure every employee clearly understands our strategy so they can make judgments about what’s best for the business. To move fast we require each employee to make hundreds of decisions a week without direct input from me or their manager. That’s why when it comes to becoming an A-Player at SafeGraph, we say Judgment is the X-Factor. Growth mindset is a non-negotiable value. Everyone at SafeGraph is focused on growth, and every new person should be focused on growth. That means everyone we hire needs to have the potential to take on much more responsibility and grow substantially. We cannot hire people who have maxed out their potential because their growth curve is limited. Here are some simple questions we ask to evaluate growth potential: Is this person significantly better today than they were three years ago? If the answer is no, we should not hire this person because it might mean they are not growth-oriented. Is this person a voracious learner? Are they a self-learner? Red flags are people who do not have a command of their last company. For instance, a finance person who does not know how revenue was recognized at their company or a software engineer who only understands one piece of the pipeline are red flags. We look to hire people who take it upon themselves to learn. These people regularly seek out answers and proactively ask the right people the right questions. Would this person immediately be the best person at SafeGraph at something important? In the next six months, will this person be substantially better than their manager at most of what they do? Pace, Tempo, Speed, and OODA loops are non-negotiable values. The only way for start-ups to win is pace. 
The number of actions per person per week at a start-up needs to be WAY higher than at a larger company. Some of the smartest people in the world prefer to move slowly and deliberately -- those people are brilliant but will likely not be A-Players at SafeGraph. Here are some simple questions we can ask to filter for pace: Will this person ever be a bottleneck? The answer cannot be “yes”. Will this person push their manager to move faster? SafeGraphers should be self-managing and pushing their manager to move faster. Their managers should not need to push them for deadlines. Is this an optimistic “yes, and” person? It is much easier to move fast when a company is filled to the brim with glass-half-full people. It is often easier to identify all the problems with an idea and reasons not to do something. We want to be surrounded by people who find and act on solutions. Does this person have high expectations of themselves AND their teammates? For a start-up to succeed, everyone needs to have high expectations of themselves. But that alone is not enough -- because if they do not have high expectations for others in the organization, they will take too much on their shoulders. Also, having high expectations for yourself is not the same thing as having a high ego. High expectations means holding yourself to a high standard and pushing for greatness. High ego means feeling entitled to personal respect. We want the former. Judgment, growth, and pace – these are the non-negotiables we look for at SafeGraph. A candidate who embodies these values has a high chance of thriving in our culture and becoming an A-Player on the SafeGraph team. SafeGraphers can be a little quirky too! #### The Secret to Using Location Data Marketing Perfectly Location-based marketing provides customers with the exact information that they need on a business’s products, events, and more. It not only analyzes customer needs, but also relates them to their location.
In this fast-paced age of automation, it is necessary to find a location data marketing strategy that can keep up with demand and apply advertising in the most effective way. We’ll explain how by covering the following topics: what location data marketing is and how it is used; 5 benefits of using location data marketing; types of location-based data marketing; how to get started with location data marketing; and privacy concerns with location-based data marketing. While there are many different ways to handle client information, location-based marketing has been used successfully by many companies around the world to do so in a very specific way. Let’s start with understanding exactly what location data marketing is, and how it utilizes location data to help marketers advise merchants on how to increase their revenue through advertising. What is location data marketing and how is it used? Location-based marketing (LBM) is a marketing tactic that uses geospatial information to profile customers based on their address. This allows marketers to more precisely tailor ads to customers in order to encourage them to purchase relevant products and services in their area. Marketers have used location data as a tactic only for about the last twenty or thirty years. But as a concept, this kind of demographic profiling has been around for centuries. A store owner or marketplace seller from the 1500s would get friendly with customers and ask them more about their lives. They would then use this information to suggest merchandise that they believed would fit the customer’s likes and dislikes. Today, this type of salesmanship is much more difficult to practice. Many stores have thousands of customers in a day, and they are unable to dedicate the time to make these kinds of individual connections. So they rely on strategies such as LBM to understand their clients’ needs.
5 benefits of using location data marketing There are many benefits of using location-based data in marketing, but here are five that tend to provide the highest ROI. 1. Close the online-to-offline attribution gap When someone steps into your store, it’s important to know what influenced that decision. If you have information that a certain online campaign has brought more customers into the store, then you can create that positive link between online campaigns and in-store interactions. This allows you to directly measure the effectiveness of converting online traffic to offline sales. A great way to do this is with gift cards for an online account. Giving a customer a physical card gives them an incentive to not only visit the store in person, but also to choose your store when shopping online. If a customer clicks on an advertisement for a product you’re pushing strongly and then later purchases that product in-store, creating that direct link is ideal for studying the influence of online advertising on customers. For example, a customer may be interested in a specific brand of purses, and their browsing history on the company website indicates sustained attention on certain products. So if they drive by later, they might quickly step in to check out the brand that they had browsed earlier. LBM makes it possible for brands and businesses to capture people’s attention and influence their behaviors in real-time, when they’re simply going about their daily lives and are near stores they have visited online recently. 2. Deliver ads to the right people, at the right time, in the right location Anticipating the needs of the customer is a tricky business. But with advancing technology, it's becoming easier to predict when customers will be more likely to appreciate advertisements shown to them. LBM promises to deliver highly-personalized marketing by using geospatial services that will appropriate foot traffic data to potential customers.
This helps businesses understand the right times or places to deploy an ad, based on passerby behavior. It is crucial to not send ads to the wrong people, at the wrong time, in the wrong location, because it could have adverse effects for the brand. Annoyed consumers do not make for happy customers. 3. Improve user experience and customer satisfaction While not irritating customers is crucial, so is meeting customer needs in a personalized way. Not every brand appeals to each customer in the same way. LBM can predict that someone who purchases lipstick from a makeup company often is likely to be interested in new lipstick being released by that company in the future. However, that same person would be less likely to be interested in a completely different product being sold by this same makeup company. Why? Because they have no history of buying or searching for it. LBM delivers relevant ads and messages to consumers that will make them feel more connected to a brand. At the same time, it avoids over-advertising to them and causing irritation. 4. Win market share The problem with marketing is that the company’s or brand’s competitors are doing it too. LBM allows brands and companies to specifically target high-potential customers. The closer they are drawn towards the company or brand, the less likely they are to be drawn away by competition. 5. Increase revenue In any business, the bottom line is profit. LBM is able to directly drive foot traffic to stores, increase conversion of online traffic to in-store purchases, and substantially drive sales up. While LBM is an investment, it pays off in the end to have a highly strategic, technical, and data-driven marketing and advertising approach. Types of location-based data marketing There are several different approaches to location-based marketing. It is not simply a question of knowing where consumers are, but meeting them there with the ads, offers, and services they need at that moment.
This level of detail means location data marketing requires more than just location information, as businesses need to be able to send information that pertains to customers and is more conducive to their needs. Geofencing Geofencing uses virtual boundaries to enclose specific areas for analytics or event triggering. In LBM, geofences are used to send ads to devices when they cross a virtual boundary. They are also used to determine how suitable an area is for advertising by measuring how many people pass through it. Main benefit(s): By using a geofence, a business can measure how long and how often a customer spends time in the location near their store. Then they can offer the customer incentives to visit the store. Real-world example(s): A bubble tea shop notices that a customer spends 8 hours near their store, 5 days a week. This probably means that they work nearby. The company realizes that if they were to be able to entice this individual into the store, then the person could develop a habit of buying a refreshing drink before or after work, or during their lunch break. They decide to send a virtual coupon to the customer every time they enter the area. Another example is when a fast food company notices that their competition is taking away a lot of its customers. They decide to create geofences around their competitors’ locations, and offer customers who enter the area virtual coupons to influence them to patronize their business instead. Geotargeting Geotargeting is the practice of delivering different advertisements to people based on where they are situated. Geofences are often used to deploy geotargeted ads. Main benefit(s): Meeting consumers where they are with personalized ads, offers, and services is a strong tactic for converting them into customers.
Geotargeting can be used near your store locations (or those of your competitors) to encourage consumers to buy your product by keeping your brand front and center in their minds. Real-world example: A restaurant with its location in downtown Chicago could geotarget ads to anyone using the internet in the city’s urban areas. They would not advertise to people in outer suburbs, because the chances that they will make the trip to visit that restaurant are smaller. Beaconing This is a marketing technique to engage the customer while they are in the store. It monitors where customers are walking or standing in the store in real time. Then it sends the customer advertisements of items that are nearby and that they might be interested in. Main benefit(s): Beaconing engages customers so that they will buy more products that interest them. This could directly increase sales, as the more items that customers find that they are interested in, the higher the total sale when they check out could be. Real-world example: A big box retailer can see where consumers are moving throughout their store, and send them ads for nearby products. This increases the likelihood that shoppers will make extra purchases. How to get started with location data marketing Location data marketing may be new and highly technical, but it is very easy to get started with. The steps below will provide anyone looking to get started with a solid plan. They lay out a method that not only allows marketers to use location data, but also makes sure that the client understands how and why the data is being collected. Identify the problem you are solving: You cannot comprehend how to solve a problem until you understand exactly what it is. For example, are you having a hard time attracting customers, or are you actively trying to build the brand in new directions? Define a goal: This is directly related to the direction that the store wants to go in to solve this problem.
Do you want to entice more customers, or do you want to increase brand awareness?Determine methodology for solving: Once you know what you’re trying to do and what issue it’s solving, decide which exact method will be used for fixing this problem (geotargeting, geofencing, etc.). Determine the inputs – such as the geographic boundaries or customer data – for this method, and make sure you acquire the appropriate type(s) of data. Do you have privacy concerns about collecting data? How can you resolve these privacy concerns?Source data to solve: Depending on your methodology, purchase, download, or collect data to include in the analysis.Run analysis: Analyze the resulting data and determine whether this method was successful in attaining its original objectives. Does the output fulfill the goal?For example, if a shoe merchant does not have enough foot traffic coming into the store:Identify the problem you are trying to solve: Not enough foot traffic; need to increase customer base. Define a goal: Use location-based marketing to measure the success of a marketing campaign. Determine the methodology for solving: Use geofencing to measure all the customers in the area who visit the shoe merchant’s store, and geotarget them by sending out virtual coupons via email. Source data to solve: Measure how effective this virtual coupon campaign was. Compare how many items customers bought during the campaign to sales from this same exact time period last year, and determine if sales have increased. Run analysis: Determine if the goal was met and if the location-based marketing method used successfully increased ROI. Privacy concerns with location-based data marketingLocation-based data marketing can raise red flags because of the sensitive nature of the data. It is the responsibility of the marketer to ensure that they are ethical when working with location-based data and follow guidelines to stay within their limits. 
LBM makes a marketer’s workload lighter by providing data that can engage customers in many different ways, which improves the odds of a successful campaign.

#### The Techniques in Spatial Data Analysis and How SafeGraph Can Help

Key Takeaways

Spatial data analysis looks at geographic relationships, not just discrete nodes.
Spatial autocorrelation makes geographic data fundamentally different from traditional datasets.
Network-based accessibility often produces more realistic insights than straight-line proximity measures.
Data quality is directly related to the reliability of clustering, regression, and predictive spatial models.
SafeGraph supports advanced geospatial modeling through verified place data, mobility insights, and integration with cloud and GIS platforms.

Location is not simply another column in a dataset. It is the structure that defines how observations relate to one another. In spatial analysis, this underlying arrangement of relationships is often referred to as spatial structure. Analysts working with spatial data are not just analyzing values across observations. They are examining how outcomes are shaped by proximity, clustering, accessibility, and movement.

Spatial data analysis formalizes this process. It supplies the statistical and computational framework to detect geographic patterns, measure spatial dependence, and model location-driven behavior. As companies increasingly use geospatial intelligence for retail expansion, infrastructure planning, mobility research, and risk assessment, the sophistication of spatial analysis, and with it the development of spatial techniques, has increased. But the most crucial concern for any model is the quality and consistency of the data underlying it. Understanding these techniques is therefore necessary.
It is equally important to ensure reliable inputs.

What Spatial Data Analysis Really Entails

Spatial data analysis refers to the analytical treatment of data that contains geographic coordinates or spatial geometry. Unlike conventional statistical analysis, it does not assume independence between observations.

Spatial datasets possess an underlying spatial structure, meaning that relationships are shaped by location and arrangement in space. Nearby entities often influence one another. This phenomenon is called spatial autocorrelation, and it requires specialized methods that explicitly account for spatial relationships.

Spatial data are usually represented in two forms. Vector data are discrete elements such as points of interest, road networks, or administrative boundaries. Raster data are continuous surfaces such as elevation, temperature, or satellite imagery. A thorough spatial analysis combines geometric data with attribute data to find patterns that are invisible in non-spatial data.

Why Spatial Structure Changes Decision-Making

The role of spatial data analysis is to reveal relationships determined by geography. For example, retail performance may vary from neighborhood to neighborhood, not solely because of demographics, but because of accessibility, competitive density, or consumer mobility patterns. Public health outcomes may cluster in certain districts because of environmental exposure or infrastructure constraints. Logistics efficiency is less a function of straight-line distance than of network connectivity and travel time.

These trends are not coincidental. They are structural. Ignoring spatial relationships leads to biased judgments and poor strategic decisions. Accounting for them yields more accurate predictions, better allocation of resources, and stronger predictive models.
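To make spatial autocorrelation concrete, here is a minimal sketch of Moran's I, a common statistic for measuring it. The four-site chain, the binary adjacency weights, and the sample values are all hypothetical and chosen only for illustration; this is not a description of how any particular product computes spatial dependence.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and a square spatial weights matrix.

    weights[i][j] > 0 means sites i and j are treated as neighbors.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]

    # Sum of all spatial weights (normalizing constant).
    total_weight = sum(sum(row) for row in weights)

    # Weighted cross-products of deviations between neighboring sites.
    cross = sum(
        weights[i][j] * dev[i] * dev[j]
        for i in range(n)
        for j in range(n)
    )

    # Sum of squared deviations (unnormalized variance).
    squared = sum(d * d for d in dev)

    return (n / total_weight) * (cross / squared)


# Hypothetical example: four sites in a row, each adjacent to its neighbors.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

clustered = morans_i([10, 10, 0, 0], chain)    # similar values sit together
alternating = morans_i([10, 0, 10, 0], chain)  # neighbors always differ

print(f"clustered:   {clustered:.3f}")
print(f"alternating: {alternating:.3f}")
```

A positive result indicates that similar values tend to sit next to each other, while a negative result indicates that neighbors tend to differ. In practice the weights matrix would be built from real geometry (shared borders, distance bands, or travel time) rather than typed by hand.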
Core Techniques in Spatial Data Analysis

A set of structured analytical methods underpins the professional geospatial workflow. Each tackles a distinct aspect of spatial structure.

Spatial Descriptive Statistics

The first level of insight is provided by spatial descriptive statistics. Measures such as mean center, standard distance, and spatial autocorrelation quantify the extent to which points cluster, disperse, or follow a definite geographic pattern. In retail, this might mean examining whether store locations are concentrated in high-density corridors or dispersed across a metropolitan region. These baseline metrics inform every modeling choice.

Buffering and Proximity Analysis

Buffering and proximity analysis moves from distribution to influence. A buffer is a zone around a geographic feature at a given distance. Analysts use buffers to model service areas, assess competitive overlap, or evaluate infrastructure access. In practice, a trade area might be defined by a five-mile radius or, more plausibly, by a ten-minute drive-time isochrone. This is where precise coordinates and up-to-date location data matter. Even small geocoding errors can materially distort competitive assessments.

Spatial Entropy and Distribution Analysis

Diversity and concentration within geographic boundaries can be analyzed using spatial entropy and distribution analysis. In commercial settings, entropy can measure the mix of business categories in a district, showing whether a single segment dominates a retail corridor or categories are evenly distributed. Meaningful comparisons across regions depend on uniform classification standards.

Hotspot Analysis

Hotspot analysis detects statistically significant high-value or low-value clusters. Statistics such as the Getis-Ord Gi* help identify regions where activity is greater than a random distribution would predict.
In practice, this could mean finding high-demand retail locales, centers of mobility aggregation, or hotspots of risk. Accurate cluster detection depends on granular, deduplicated data with exact spatial representation.

Geostatistical Modeling

Geostatistical modeling extends spatial analysis to prediction. Methods like kriging and variogram modeling estimate values at unobserved locations while quantifying uncertainty. Such strategies are common in environmental science, resource modeling, and demand prediction. Well-structured, low-noise datasets reduce instability in interpolation models and tighten confidence intervals.

Spatial Regression Modeling

Spatial regression includes geography as an explicit element of predictive models. Classical regression assumes independent observations, an assumption that spatial data often violates. Spatial regression addresses this by accounting for geographic spillover effects. For instance, retail sales in one neighborhood may be affected by commercial presence in adjacent areas. By including spatial lag or spatial error terms, analysts gain a more dependable explanatory model.

To apply these techniques effectively, the underlying dataset must be structured, current, and analytically reliable. Explore how SafeGraph’s global POI datasets are designed to support clustering, trade area modeling, and network analysis at scale.

Network Analysis

Network analysis addresses connectivity, not distance. Real-world movement happens along transportation networks, not in straight lines. Network models can calculate shortest paths, service areas, and flow optimization over any given road network, transit network, or logistics corridor. In site selection and supply chain optimization, network-based accessibility provides a closer approximation of market accessibility than simple radial distance.

How SafeGraph Strengthens Spatial Analysis

SafeGraph provides a comprehensive global database of points of interest designed for geospatial analytics.
The dataset includes detailed attributes such as place type, operational status, and consumer interaction metrics. Machine-generated processes combined with human verification support consistency and reliability across markets.

In spatial descriptive statistics, normalized and deduplicated POI data improves clustering accuracy. In proximity and network analysis, precise geocoding enhances trade area realism. In hotspot detection, granular visitation patterns refine demand clustering. In regression modeling, consistent categorization strengthens explanatory variables.

SafeGraph data integrates into established workflows through cloud marketplaces and GIS environments, including AWS-based infrastructure and leading geospatial platforms. This allows analytics teams to incorporate high-quality POI data without restructuring their technical stack.

The dataset has also supported research examining mobility patterns and evaluating bias in location panels. Transparency in representativeness and spatiotemporal dynamics is increasingly important, particularly in public health and policy research where spatial inference must withstand scrutiny.

Conclusion

Spatial data analysis turns geographic coordinates into organized intelligence. Its techniques range from descriptive clustering and buffering to advanced geostatistics and network modeling. But even the most sophisticated method cannot compensate for inconsistent or incomplete data.

As businesses, researchers, and public institutions rely more heavily on location intelligence, the quality of spatial inputs becomes a strategic variable. Strong datasets don't just serve analysis. They determine its credibility. When analysis is paired with structured, high-integrity spatial data, location stops being a static attribute and becomes a quantifiable source of insight.
If your team is applying spatial regression, trade area modeling, or hotspot detection in production environments, the quality of your location data becomes a strategic variable. See how SafeGraph’s structured POI and mobility datasets integrate into existing GIS and cloud workflows.

FAQs

1. What is spatial data analysis in simple terms?
Spatial data analysis is the process of analyzing data that includes geographic information in order to uncover patterns, associations, and location-driven effects.

2. How does spatial analysis differ from conventional statistical analysis?
It accounts for spatial dependence. Adjacent sites may influence each other, which demands specialized statistical techniques.

3. What does data quality have to do with spatial modeling?
Inaccurate coordinates, duplicates, or inconsistent classification can degrade clustering, regression, and accessibility models. Clean data makes them more reliable.

4. How does SafeGraph support spatial analysis?
SafeGraph brings structured global POI data, rich attributes, and integration options to improve clustering, trade area modeling, and predictive workflows.

5. Can SafeGraph data be integrated into existing geospatial tools?
Yes, SafeGraph datasets are easily accessible and can be integrated into GIS platforms without major infrastructure changes.

#### The Ultimate Guide to Local Search Applications & POI Data

What is local search?

Whether searching for dinner places nearby on a mobile phone, tagging a social post to a location, or selecting a suggested location as the destination in a rideshare app, today’s consumers are constantly using local search functionality. Also known as “discover near me” search, this capability has become an integral part of daily life, helping individuals discover and navigate to nearby places of interest, from restaurants to flower shops to hospitals.

At its core, local search helps consumers perform daily tasks more efficiently with the assistance of location-based applications, enhancing the user experience by providing relevant and reliable information about nearby establishments or places. But building a reliable local search application is not always straightforward; the data powering both the geographic and categorical elements of local search is complex to create and maintain in a dynamically changing world.

Local search helps consumers discover places by both location and attributes, and is a functionality that requires complete, precise, and fresh input data to be valuable.

To help product managers and application developers understand these complexities, we’ve created this ultimate guide to local search and the points of interest (POI) data needed to make a complete, accurate, and up-to-date tool that users can trust. The information in this guide will leave you feeling like an expert in local search and empower you to choose a data provider with clear evaluation criteria in mind.
In this guide, we’ll cover: The value of UX in local search POI data evaluation criteria for local search Example local search applications & data Where to get data to power local search 1. The value of UX in local search It’s important to note that the success of local search relies heavily on the accuracy of the information provided. A high level of accuracy contributes to a seamless user experience (UX), allowing individuals to make informed decisions about where to dine, shop, or visit. On the other hand, low accuracy can lead to a disappointing experience that erodes user trust - like recommending dissolved businesses or providing directions to a place that never existed in the first place. Maintaining accuracy is not just a matter of user satisfaction; it is essential to building brand credibility and ensuring success in an increasingly competitive market. No matter the ultimate use case, accurate local search is critical to the user experience. Imagine a person arriving in New York City for the first time. They exit Grand Central Station after a long day of travel and want their favorite Chipotle meal, so they open their rideshare app to call a car to Chipotle. The intuitive expectation is that when searching for a Chipotle destination, the prepopulated POIs listed by the app are closest in proximity to their current location. This user likely also expects the locations listed are open. However, when they are on their ride they pass two open Chipotle stores, and find that the one they eventually arrive at has already closed. Inaccurate data used as an input for local search can result in a poor UX, and ultimately cause user churn. This poor UX not only disrupts the consumer’s day, but also reflects poorly on the rideshare app and its functionality. In today’s hyper-connected world, users expect reliable local search tools that do not disappoint, and this experience would be enough to lose this user to a competitive app. 
To avoid such user churn, it’s important to choose a robust and precise POI dataset to power a local search engine. 2. POI data evaluation criteria for local search Accuracy in local search presents unique challenges for product managers or application developers. It involves trust in capturing real places, ensuring completeness, and validating that attributes such as store hours, web address, phone number, street address, and coordinates reflect the latest available information. The importance of these metrics vary based on the specific use case for local search. For example, navigation companies prioritize precision, ensuring coordinates are correct for accurate directions. On the other hand, search providers optimize for recall, developing comprehensive listings with accurate attributes like phone numbers and websites that enhance user engagement, increasing the likelihood of consumers using the service again. In reality, the best local search functions have just the right mix of precision and recall. However, pursuing both precision and recall can sometimes be counterproductive, so it’s important to recognize the natural trade-off between quantity and quality. By addressing and understanding these complexities, local search providers can confidently prioritize the right metric for their use case. In doing so, they can deliver reliable and positive experiences that specifically meet their users’ expectations for exactly what they are trying to accomplish. Understanding precision requirements for local search In the simplest terms, precision refers to whether or not something is true. For a data product, that means making sure the entries included in the dataset are as correct as possible. In local search, precision can be examined from two dimensions: row precision (the existence of a place) and column precision (the facts about places that exist). Row and column precision are both important to consider when evaluating POI data for local search. 
Row precision Row precision means having confidence in the authenticity and accuracy of each entry in the dataset representing a place in the physical world with relevance to users. Fundamentally, row precision addresses the question of whether a record in a dataset should even exist. It helps determine if the information presented relates to an actual place that users can interact with in the physical world, or if it will contribute to a poor UX. While this may seem straightforward, the neverending volume of digital information means that data that was once accurate may no longer be correct or relevant. Additionally, the age of user-generated data, online reviews, and e-commerce has introduced the challenge of distinguishing between real and fake information on a scale never seen before. This is particularly an issue for crowdsourced or open source data. For instance, let's consider a boutique jewelry business. While it may have a website and a phone number, it could actually be a small business operating on Etsy and run from a private residence. In this case, it doesn't function as a physical storefront where visitors are welcomed and therefore shouldn't be considered a local boutique in a given area. It also should not be displayed as a potential destination for a navigation engine, or a potential competitive location for another brand performing site selection analysis. Row precision is critical to ensuring local search applications only direct users to real places. Ensuring that each row in a POI dataset corresponds to a real and currently operational place lays the foundation for accurate and predictable local search experiences. Once row precision has been established, additional accuracy metrics can be evaluated - like column precision. Column precision Column precision refers to the accuracy of specific attributes in a given row. 
This involves evaluating the truth of information that allows users to engage with a place, such as phone numbers, open hours, and website URLs. However, it’s important to note that column precision builds upon the foundation of row precision. A row must be identified as a real and current place before assessing the accuracy of the data included for that record. It's also essential not to mistake a dataset’s fill rate as a measure of column precision. For example, merely populating every phone number field with a brand’s corporate headquarters’ number may achieve a 100% fill rate, but does not guarantee accuracy or provide any value for users. Column precision is needed to provide detailed attributes that help users discover and interact with places. Ensuring high column precision increases the reliability of local search results, encompassing not only the comprehensive list of prospective places, but also the associated metadata that enhances the UX. This enables users to confidently rely on accurate phone numbers when they call a business to book an appointment, trust that a store will be open upon their rideshare’s arrival, and seamlessly follow a website link to learn more about their potential destination. The reliability of these interactions enhances the user’s overall engagement while exploring the places they seek, and thus translates into increased satisfaction with the product or application being used to perform local search. Understanding recall requirements for local search Local search creators must also factor in recall. Recall is the ability to find all relevant records within a dataset. In relation to POI data, recall refers to the confidence that all possible places meeting a certain criteria are represented in the dataset. For example, a POI dataset with perfect recall would include all possible restaurants, hospitals, national parks, etc. in a given geography. 
Let’s say a consumer wants to know all of their options for selecting a top restaurant nearby. How often is it that a local search function will return ALL possible restaurants - even the one that just opened yesterday? This example helps highlight the difficulty in attaining perfect recall. Points of interest open and close every week, making perfect recall a constantly moving target. Although it is difficult to attain, users expect high recall when searching for places nearby. To meet these expectations and deliver a strong UX, local search products require a frequently updated POI dataset as an input. Even with up-to-date POI data, recall is difficult to measure. It requires a truth set to compare against, and in practice, none exists. It’s virtually impossible to know, at any given moment, every single place that exists in the world. But innovations in data science and curation enable local search providers to get as close to a truth set as possible when building their products and applications. Due to these difficulties in measuring recall, it is important to beware of overreaching when sourcing or creating a POI dataset. Just like it is possible to have less than 100% of possible POI in a given dataset, it is also possible to have over 100%. Extraneous entries only dirty the dataset, leading to inaccurate local search results and a negative UX. It is important to consider the trade-offs that occur when more value is put on recall over precision, and vice versa. Trade-offs between precision and recall For the purpose of local search, precision can be thought of as quality, whereas recall represents quantity. As with anything with a delicate balance, by greatly improving one metric, substantial pressure is put on the other. For example, the more data you have, the harder it is to ensure all of the data is precise. This is true of both row and column precision in relation to recall. 
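The trade-off described above can be sketched with a few lines of Python. This is an illustration only: the place identifiers and the "truth set" below are hypothetical, and as noted earlier, a complete truth set does not exist in practice, so a real evaluation would rely on a manually verified sample instead.

```python
def precision_recall(dataset_ids, truth_ids):
    """Return (row precision, recall) of a POI dataset against a truth sample.

    precision = share of dataset records that are real places
    recall    = share of real places that appear in the dataset
    """
    dataset, truth = set(dataset_ids), set(truth_ids)
    matched = dataset & truth
    precision = len(matched) / len(dataset) if dataset else 0.0
    recall = len(matched) / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical verified sample of real, open restaurants in one neighborhood.
truth = {"cafe_a", "cafe_b", "diner_c", "bistro_d"}

# A conservative dataset: every record is real, but half the places are missing.
small_clean = {"cafe_a", "cafe_b"}

# An aggressive dataset: every real place is present, plus two extraneous records
# (for example, an online-only shop and a closed location).
large_noisy = {
    "cafe_a", "cafe_b", "diner_c", "bistro_d",
    "online_only_shop_e", "closed_grill_f",
}

for name, data in [("small_clean", small_clean), ("large_noisy", large_noisy)]:
    p, r = precision_recall(data, truth)
    print(f"{name}: precision={p:.2f}, recall={r:.2f}")
```

The conservative dataset scores perfectly on precision but misses half the places, while the aggressive dataset captures every place at the cost of records that should not exist. Which profile is preferable depends on the use case.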
It is critical to consider whether recall or precision will provide the most value for a particular local search use case. To understand this, consider whether it is more important to ensure all possible results are returned in a search, or that all results returned in a search are real and up-to-date. Alternatively, determine which bad UX (poor precision or poor recall) would be worse. For example, would it be worse for a user to navigate to a closed POI, or not find one in the first place?

Any good dataset attempts to strike a balance between quality and quantity. But deciding which holds higher value for your use case will determine the priority of recall versus precision as you evaluate data providers.

Example local search applications & data

Technology companies in the consumer mapping, social discovery, and navigation spaces rely on local search to power their platforms and improve their user experience. Each of these groups uses local search for slightly different purposes, which we explore below.

Consumer mapping companies

Consumer mapping companies make it easy for people to find points of interest by enriching user GPS locations with highly precise POI data.

Consumer mapping products enable users to search for and navigate to places based on both location and specific characteristics.

Today’s most commonly used consumer mapping applications and products include:

Google Maps: Google Maps provides users detailed maps and navigation services. They use their own Places API to power their search engine and provide detailed recommendations to users on points of interest, such as the most popular restaurants nearby.

Apple Maps: Apple’s mapping service is similar to Google Maps in that it offers various services to help users navigate to specific destinations, and offers recommendations for a wide range of points of interest near the user.

Bing Maps: Bing is Microsoft’s search engine.
As part of that service, they offer a mapping function similar to Google and Apple that provides imagery of most parts of the world, information on road networks, and details about points of interest. Social discovery companies Companies with social discovery applications and products have also begun to invest in local search to create highly personalized and interactive features for their users. Although they may not be focused on local search as their main offering, it has become an influential feature on many platforms. ‍ Social discovery companies leverage local search to help consumers share reviews and experiences at specific places. Examples of social discovery brands leveraging local search include: Yelp: Yelp leverages local search features so that users can discover nearby restaurants, shops, and services, as well as read or leave reviews for specific points of interest. Foursquare: Similarly, Foursquare offers local search functionality by providing recommendations for restaurants and various attractions based on user preferences and check-ins. Facebook: Facebook has various location-based features. Users can search for local businesses, check-in at places, and review recommendations from friends. Facebook also owns Instagram, which offers similar location-based search and tagging features. Snapchat: Snapchat uses local search in its "Snap Map" feature. Users can see snaps posted by others at specific locations, discover local events, and find nearby points of interest. TikTok: Similar to Snapchat, Instagram, and other social media applications, TikTok offers a location-based content discovery feature that allows users to tag or search for videos at a specific place. Navigation companies Local search is also an integral part of products and applications that help consumers navigate around the world. Many of the key players in today’s navigation space are taxi and rideshare apps, or companies that develop technology to power them. 
Today’s navigation companies use local search to help users get from one destination to another by using POI waypoints. Popular examples of modern navigation companies who leverage local search functionality include: Uber: Uber’s platform has a local search function to allow users to set specific pickup and dropoff locations based on POIs, which can be more recognizable than addresses alone. Lyft: Lyft also simplifies and enhances their user experience with POI data, making it easier for riders and drivers to find both each other and the end destination. Telenav: Telenav develops connected car solutions such as in-car navigation systems with local search to augment GPS routing systems. TomTom: TomTom also creates GPS solutions that use POI data to power local search and enhance the user experience for navigation products. Where to get data to power local search To meet this demand for data to power these products and applications, companies that specialize in mapping and geospatial data creation often develop POI datasets specifically for local search. You can see a full list of POI data providers here, but some popular examples include: Google Places API: As mentioned above, one of the products Google provides is the Places API. While this data is considered pretty globally comprehensive, there are strict usage terms which can be prohibitive for many applications or products. These API calls can also quickly add up, making it an expensive option for powering local search with many users. HERE Technologies: HERE is a mapping and geospatial data company offering POI data that can be used for local search capabilities. The company is focused more on providing POI coordinates for navigation solutions than detailed place information. OpenStreetMap (OSM): OSM’s free and crowdsourced map data can be used for geospatial local search applications, and is sometimes the strongest POI data source in data-scarce areas of the world. 
SafeGraph: SafeGraph is singularly focused on curating places data. We have an industry-leading global POI dataset, Places, that is well-known for its high precision and recall. Places data is used to fuel various local search engines, helping people find and navigate to what they're looking for with accuracy and efficiency. At SafeGraph, we understand the unique challenges faced by providers wanting to deliver accurate local search results. We are committed to being the trusted data partner for product managers and application developers, providing reliable and up-to-date information about our dynamically changing physical world. To ensure we provide as close to a truth set as possible to our clients, we maintain a maniacal focus on both precision and recall through a mix of the following: Continuous data assessment SafeGraph performs regular reviews and updates of our Places data, constantly recompiling information to ensure the ongoing accuracy of our monthly releases. This process crosschecks numerous sources and helps identify any outliers or outdated information that may have crept into the database. Manual sampling To validate data accuracy, SafeGraph conducts manual reviews of sampled data. This hands-on approach helps identify potential errors or inconsistencies that automated processes might overlook. These manual validations then fuel our row and column precision models, and we consistently increase our precision as the training data continues to accumulate. Tracking openings and closings SafeGraph keeps a close eye on business openings and closings, ensuring that our database reflects the most up-to-date information. By promptly updating our data, we ensure that local search functions can minimize the risk of recommending closed establishments to users. For more information, check out our blog on the importance of timely open and close metadata. SafeGraph openly shares our product updates with our users, including growth metrics and bug fixes. 
###### Transparency

POI data is inherently messy, and any provider that pretends it isn’t is not being honest with their customers. SafeGraph is committed to being transparent with our users about the Places data we create. Each month, we publish release notes and accuracy metrics that help product managers and developers understand both precision and recall so they can make informed decisions for local search.

###### Customer input and feedback

We value customer input and feedback, and actively engage with local search providers to address any concerns, suggestions, or requests. This customer-centric approach allows us to improve data accuracy based on real-world usage and feedback, and it also helps narrow our focus on improving what is most meaningful to our customers’ customers.

##### Get started with SafeGraph data for local search

Precision and recall are not just “nice to haves,” but fundamental requirements for local search capabilities. As a data partner, SafeGraph recognizes the importance of striking the right balance between quantity and quality. Get in touch with our team today to see how the right POI data can enhance your local search product, elevate the user experience, and accelerate growth.

#### The Ultimate Guide to Location-Based Audiences

Location data can differentiate companies looking to stand out in a very crowded advertising space.

The use of consumer data in advertising has grown exponentially in the last decade. Data on demographics, online activity, purchasing trends, geography, and various other factors have revolutionized how companies get closer to their target audiences and make more strategic bets for driving and improving advertising performance.

To stay competitive, brands are leveraging additional data in their audience analysis to ensure they are targeting the right people at the right time. Places data is a particularly powerful supplemental dataset in this effort.
Enriching privacy-compliant consumer data with the context of physical places provides marketers a new level of sophistication in planning, buying, optimizing, and measuring programmatic campaigns, especially in online-to-offline attribution.

We wrote about the rise of location-based marketing previously. Here, we elaborate on the importance of location for building audiences. The information in this guide will help marketers and ad specialists use Places data to enhance audiences, and enable you to increase ROI by incorporating geographic data in your ad strategy.

In this guide, we will cover:

- The value of location-based audiences
- Use cases for location-based audiences
- Best practices for using point of interest data to build audiences
- Where to get data to power location-based audiences

##### The value of location-based audiences

To stay competitive in today’s market, brands need to carefully curate the end-to-end customer journey. There are many ingredients that can be used to optimize this experience, all relating to what the consumer prefers, how they behave, where they go, and other details that enable personalization. However, consumer data without offline behavior is extremely limited in providing this necessary insight.

Census block group (CBG) level data and generalized household demographics may adequately describe who a target consumer is, but they fall short in providing the reasons, locations, and relevancy required to target them effectively in a manner that is well received by that consumer. Mobility data combined with places data blends well with digital signals to give a more well-rounded picture of the target consumer and improves return on ad spend (ROAS).

For example, consider car enthusiasts who routinely browse websites without actual purchase intent. They often look at car models they never plan to buy.
Places data can be used to cross-reference online signals with their physical movements, identifying whether they have recently toured an auto dealership (a much more likely signal of intent to purchase) or better yet, entered the actual showroom (an even higher converting signal). By understanding whether these consumers are “window-shoppers” or potential buyers, a whole audience segment can be ruled out of ad targeting, saving a lot of money and increasing ROAS.

##### Successful examples of location-informed audiences

Here are just a few examples of companies leveraging places data really well to provide advertising value to their customers:

- RainBarrel builds a proprietary audience graph based on commercially available geospatial data that allows advertisers to target their messages to the right audience.
- Mobsta’s visual ad planning platform uses POI data to help agencies more accurately pinpoint the optimal ad units in the right locations for reaching target audiences.
- Media Storm uses polygon-based POI geofences to accurately attribute visits from MAID pings, resulting in stronger, more efficient ad campaign performance.
- InMarket’s location-enriched audience creation methodology enables programmatic advertisers to reach consumers where they are most likely to convert.
- Viant’s data platform, which links together disparate data sources - including POIs and other geospatial datasets - provides an accurate understanding of ROAS.

Each of these companies shares a competitive edge that cannot be achieved without the use of clean, accurate places-based insights.

##### Best practices for using places data to build audiences

###### 1. Leverage a mix of data sources

In today’s omnichannel world, consumers are interacting with brands both online and offline. It’s important to bridge both the physical and digital world by using a mix of data sources for a more holistic view of audience profiles.
Combining data such as foot traffic, browsing behavior, and purchase history provides better insight on how an audience interacts with brands across different channels. These insights explain why individuals are engaging with a brand in a certain way, ultimately helping derive consumer affinity for brands and products. This enables marketing teams to create more effective campaigns that can reach audiences wherever they are.

Diverse data sources also simplify the ability to personalize messaging to audience segments. Personalization hones in on specific audience needs and can lead to higher conversion rates and better brand engagement, plus happier customers. For example, research by McKinsey shows that personalizing the customer experience results in a 20% higher customer-satisfaction rate and a 10-15% increase in sales-conversion rates.

###### 2. Use accurate POI data

Points of interest (POI) or Places data is important because it offers marketers a geospatial perspective for understanding their target audiences and where they might visit. Bad data will only lead to audience inaccuracies and misleading indicators. The physical world undergoes constant transformations with places opening, closing, and operational hours shifting. It is crucial not to settle for POI data sources that update annually or even quarterly. Instead, investing in POI datasets that refresh monthly, like SafeGraph Places, will ensure access to the most reliable information.

###### 3. Better yet, use accurate POI polygons

POI polygons add an extra layer of detail to enrich your audience profile. Polygon data takes POI data a step further by providing the building footprints, estimated sizes, and where possible, actual tenant splits so you can create more specific location-informed audiences.

Incorporating POI polygons aids in discerning whether someone actually visited a POI - and is therefore a part of a target audience - or if someone simply walked past it.
This level of specificity allows you to precisely identify who in an audience has visited a POI, and offers a more reliable means to confirm campaign performance than relying on estimates or projections.

###### 4. Layer location and behavior context into your analysis

Context is critical when utilizing mobility data for audience profiling. Simply clustering individuals based on their proximity to certain POIs is not enough. To gain deeper insights, you must derive context on how these places relate to behavior. For instance, it’s important to recognize that a consumer may act drastically differently in two different cafe settings (say, one serving strictly espresso versus one offering wine late into the evening). Accounting for this will help you develop stronger, more targeted audiences. Make sure to consider what hours are typically most popular and what times places open and close for business. This way your algorithm can make sure to attribute those late night pings to the bar that is still open and not the bagel shop next door that closed at noon.

Cross-referencing visits to other POIs can also unlock patterns and preferences of certain audiences. For example, it can show if an individual frequents Target and a specific restaurant, say Chipotle, in the same trip. Without factoring in contextual nuances from POI data, mobility data alone may yield incomplete or even misleading insights into audience behavior and preferences.

##### Get started with SafeGraph data to build location-informed audiences

The significance of POI data in crafting precise and strategic audience profiles is undeniable, playing a pivotal role in optimizing the return on investment for advertising by understanding your audiences at a deeper level. At SafeGraph, we understand the importance of data quantity and quality. That’s why we publicly track our accuracy efforts.
Our team can help you uncover how leveraging accurate POI and polygon data will elevate your audience-building strategies, supercharge your campaigns, and ultimately grow your return on ad spend. Get in touch here.

#### Three Reasons Why Brand Attribution is Essential to POI Data

##### Key Takeaways

- Understanding brand attribution is critical for accurate POI analysis.
- Strong brand attribution in POI data enables clearer brand hierarchies and relationships.
- Brand-level insights improve market analysis, targeting, and site selection.
- Incomplete attribution can skew analytics and decision-making.
- Accurate brand attribution for location data strengthens competitive intelligence.

Points of interest (POI data) are foundational to a geospatial data ecosystem. As a key pillar of location data, POIs provide the geographic coordinates of non-residential places for mapping and analytics. While the latitude and longitude of POIs are core to locating these non-residential places, what many data scientists find most valuable when analyzing this data is detailed brand attribution for location data.

##### What is brand attribution?

In the context of POI data, brand attribution refers to the process of accurately identifying, classifying, and linking physical locations to the brands (or chains) and parent companies that operate them.

SafeGraph Places data includes a brand information CSV file with detailed brand attribution POI data for stores with multiple locations. This file includes a unique and persistent SafeGraph Brand ID, enabling data scientists to easily join POIs belonging to the same brand. It also includes brand name and parent brand ID fields, making it possible to identify relationships between parent and child brands. Finally, the brand information CSV provides NAICS codes and categories, as well as stock tickers and exchange details, allowing users to roll up brands into specific industries and market segments more reliably.

##### Why is brand attribution important?
Why brand attribution matters in POI analysis becomes clear when POIs are used for strategic decision-making. When done correctly, brand attribution allows POIs to be analyzed with greater precision and confidence. At SafeGraph, many customers use brand information for the following reasons:

###### 1. Identifying parent and child brand relationships

One of the most compelling reasons for detailed brand attribution in POI data is the ability to identify parent and child brands accurately. A parent brand typically refers to a corporate entity that oversees multiple subsidiary or child brands. A good example is Kroger, which is the parent brand for grocery chains such as Fry’s, Fred Meyer, Harris Teeter, and Smith’s.

Most POI datasets provide only parent or child brands, which limits the types of analyses that can be conducted. Others include inconsistent mixes of both, which can lead to inaccurate calculations that underpin larger decision-making processes. When both parent and child brands are included consistently, data scientists can roll up or drill down to the appropriate level depending on their use case.

Analyzing parent and child brand relationships allows data scientists across organizations to better understand the market landscape. Sysco, a global leader in food distribution, relies on parent-child brand attribution in POI data to analyze market dynamics at both levels, identifying brands experiencing spikes or declines in demand. Without curated parent and child brand attribution, Sysco would need to dedicate significant internal resources to manipulate data manually or risk missing critical insights. With accurate attribution, they can monitor market trends, evaluate growth opportunities, and decide where to focus efforts on a weekly basis.

###### 2. Understanding brand affinities

The right brand attribution in POI data enables companies to understand consumer behavior in relation to specific brands, helping them build stronger customer affinity profiles.
Brand affinity analysis is particularly useful for advertisers seeking to deliver targeted and personalized messaging to consumers who frequent specific brand locations.

Media Storm, the second-largest independent full-service media agency in the U.S., incorporates brand attribution for location data into its location-based marketing workflows. Using SafeGraph’s brand information and NAICS codes, Media Storm can identify client store locations, as well as competitor and complementary brands, allowing advertising efforts to focus on the places most visited by the target audience.

###### 3. Mapping market penetration

Brand attribution is critical for mapping and understanding brand presence across markets. Competitive analysis, site selection, and trade area creation are all more effective when POI data includes comprehensive and accurate brand attribution.

If a POI is attributed incorrectly or the brand information is missing, areas of opportunity may be overlooked or overstated. Inconsistent parent and child brand data can cause POIs to be missed or mis-weighted in analysis. For example, a market analyst planning an aggressive ad campaign to gain share from a competitor needs to know where that brand has a regional stronghold. If brand attribution is incomplete or incorrect, results will be skewed and decisions compromised.

To address this, SafeGraph provides extensive brand coverage for places in over 200 countries & territories globally. Users can explore this data through the brand dashboard, which offers visibility into top brands at national and state levels.

##### Conclusion

Brand attribution in POI data is not a nice-to-have feature; it is a foundational requirement for accurate analysis. From identifying parent and child brand relationships to understanding brand affinities and mapping market penetration, strong brand attribution enables clearer insights and more confident decision-making.
Organizations relying on POI data without comprehensive brand attribution risk incomplete analysis and missed opportunities. With accurate brand attribution for location data, businesses can unlock deeper market understanding and drive more effective analytics.

To explore how brand attribution in POI data supports more accurate location analysis, see SafeGraph’s Places dataset in action. Request a Demo

##### FAQs

1. What is brand attribution? Brand attribution refers to linking physical locations to the brands and parent companies that operate them, enabling accurate analysis across POI datasets.
2. Why is brand attribution important in POI data? Brand attribution matters in POI data as it is tied to accuracy. It ensures reliable competitive analysis, site selection, and market measurement.
3. How does brand attribution improve POI analytics? Brand attribution in POI data enables rollups across parent and child brands, improves affinity analysis, and supports cleaner market segmentation.
4. What are brand attribution POI essentials? Brand attribution POI essentials include unique brand IDs, parent-child relationships, industry classifications, and consistent naming conventions.
5. How does brand attribution support location data analysis? Brand attribution for location data allows analysts to connect consumer behaviour, market trends, and brand presence to physical locations.
6. What happens when brand attribution is incomplete? Incomplete brand attribution can lead to missed locations, skewed results, and poor strategic decisions.
7. Where can I access reliable POI data with brand attribution? You can explore POI data with comprehensive brand attribution through SafeGraph’s Places dataset.
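
The parent and child brand roll-up described in this post can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the column names (`poi_id`, `brand_id`, `brand_name`, `parent_brand_id`), the brand IDs, and the sample rows are hypothetical stand-ins, not the actual SafeGraph schema or data. It counts locations per child brand, then follows parent links to count the same locations per parent brand.

```python
import csv
import io
from collections import Counter

# Hypothetical brand-info rows. Column names and IDs are illustrative
# assumptions, not the real SafeGraph brand information file.
BRAND_INFO_CSV = """brand_id,brand_name,parent_brand_id
b_kroger,Kroger,
b_frys,Fry's,b_kroger
b_fredmeyer,Fred Meyer,b_kroger
b_chipotle,Chipotle,
"""

# Hypothetical POI rows: one row per physical location.
POI_CSV = """poi_id,brand_id
p1,b_frys
p2,b_frys
p3,b_fredmeyer
p4,b_kroger
p5,b_chipotle
"""


def load_rows(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def top_level_brand(brand_id, parent_of):
    """Follow parent links until a brand with no parent is reached."""
    seen = set()  # guards against accidental cycles in the parent links
    while parent_of.get(brand_id) and brand_id not in seen:
        seen.add(brand_id)
        brand_id = parent_of[brand_id]
    return brand_id


def count_locations(pois, brands, roll_up):
    """Count POIs per brand name, optionally rolled up to the parent brand."""
    parent_of = {b["brand_id"]: b["parent_brand_id"] for b in brands}
    name_of = {b["brand_id"]: b["brand_name"] for b in brands}
    counts = Counter()
    for poi in pois:
        brand_id = poi["brand_id"]
        if roll_up:
            brand_id = top_level_brand(brand_id, parent_of)
        counts[name_of[brand_id]] += 1
    return dict(counts)


brands = load_rows(BRAND_INFO_CSV)
pois = load_rows(POI_CSV)

child_counts = count_locations(pois, brands, roll_up=False)
parent_counts = count_locations(pois, brands, roll_up=True)

print("Per child brand:", child_counts)
print("Per parent brand:", parent_counts)
```

With these sample rows, the child-level view keeps Fry's and Fred Meyer separate, while the parent-level view groups them under Kroger. Having both a brand ID and a parent brand ID on each brand record is what lets one POI table serve either level of analysis.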
#### Top 10 Use Cases of Geographic Information Systems

##### Key Takeaways

- GIS connects data with location to reveal patterns not visible in traditional analysis.
- It supports decision-making across public, private, and environmental sectors.
- Use cases extend far beyond mapping into strategy, operations, and planning.
- Spatial insights improve efficiency, resilience, and long-term outcomes.
- GIS adoption continues to grow as geospatial data becomes more accessible.

GIS has become a foundational layer in how organizations understand and manage the physical world. By connecting data to location, GIS helps teams see patterns, relationships, and constraints that are not visible in traditional spreadsheets or reports.

Today GIS is used far beyond mapping. It supports decisions in business, government, healthcare, infrastructure, and environmental management. This article explores 10 of the most widely used GIS applications across industries, showing how location-based insights translate into real-world impact.

##### 1. Urban Planning and Smart City Development

GIS plays a central role in urban planning by helping cities manage growth, land use, and infrastructure. Planners use GIS to analyze population density, zoning regulations, transportation networks, and public services in a single spatial view.

By visualizing how people move and how resources are distributed, cities can plan roads, housing, utilities, and green spaces more effectively. GIS also supports long-term sustainability by helping planners anticipate future needs rather than reacting to congestion or service gaps.

##### 2. Retail Site Selection and Market Analysis

Location is one of the most important factors in retail performance. GIS allows businesses to evaluate potential store locations by combining demographic data, competitor presence, accessibility, and surrounding activity.

Rather than relying only on historical sales or intuition, retail teams use GIS to compare markets objectively, define trade areas, and understand how customers interact with physical spaces. This leads to better expansion decisions and reduced risks.

##### 3. Transportation and Logistics Optimization

GIS is widely used to design and optimize transportation networks. Logistics teams use spatial data to plan routes, reduce delivery times, and manage fleets more efficiently.

By analyzing distance, traffic patterns, and connectivity, organizations can identify bottlenecks and improve network reliability. GIS also supports public transportation planning by helping agencies align routes with population demand.

##### 4. Environmental Monitoring and Sustainability

Environmental organizations and governments rely on GIS to monitor land use, climate patterns, and natural resources. GIS enables the tracking of deforestation, water availability, air quality, and ecosystem changes over time.

These insights support conservation planning, regulatory compliance, and climate resilience efforts.
GIS helps decision-makers understand not just what is changing but also where those changes are occurring and why they matter.

##### 5. Disaster Management & Emergency Response

GIS plays a critical role in disaster preparedness and response. Emergency teams use GIS to identify high-risk zones, plan evacuation routes, and allocate resources by location during floods, earthquakes, or health emergencies.

During active events, spatial data helps responders understand the affected areas in real time. GIS also helps in assessing damage and planning recovery strategies, improving readiness for future events.

##### 6. Public Health and Healthcare Planning

In healthcare, GIS helps visualize how health outcomes vary across regions. Public health agencies use spatial analysis to track how diseases spread, identify underserved populations, and plan better healthcare facilities.

By linking health data with geographical context, GIS supports targeted interventions, equitable resource allocation, and more effective emergency responses during outbreaks, as seen during the COVID-19 pandemic.

##### 7. Utilities and Infrastructure Management

Utilities use GIS to manage complex networks such as power lines, water systems, pipelines, and telecommunications infrastructure. GIS helps track assets, plan maintenance, and respond to outages more quickly.

Accurate spatial data allows utilities to understand dependencies across networks and minimize service disruptions, especially during extreme weather events.

##### 8. Agriculture and Precision Farming

GIS supports modern agriculture by helping farmers and agribusinesses manage land, crops, and resources more efficiently. Spatial analysis is used to assess soil conditions, monitor crop health, and plan irrigation.

By understanding how environmental factors vary across fields, GIS enables more precise farming practices that improve yields while reducing waste and environmental impact.

##### 9. Business Strategy and Enterprise Decision-Making

Beyond individual departments, GIS increasingly supports enterprise-level decision-making. Organizations use GIS to align operations, workforce deployment, and asset placement with geographic demand.

By integrating GIS with other business systems, companies gain a spatial view of performance, risk, and opportunity, supporting more informed strategic planning.

##### 10. Government Administration and Policy Planning

Governments use GIS to support census analysis, land records, public safety, and policy development. GIS enables agencies to visualize how policies affect different regions and populations.

This spatial perspective helps improve transparency, service delivery, and long-term planning at local, regional, and national levels.

##### Why GIS Use Cases Continue to Expand

The growing availability of geospatial data, combined with better analytical tools, has made GIS more accessible and more valuable than ever. As organizations face challenges where location plays an ever-greater role, GIS provides a practical way to analyze complex spatial relationships while preserving the nuance in the data.

Across industries, the common thread is connecting data with places. GIS enables that connection, turning geographic context into actionable insights.

##### How GIS Supports Decision-Making

GIS is no longer a specialized technology used by a few technical teams. Today, it has become a core capability for organizations that operate in the physical world.

From planning cities and managing infrastructure to improving healthcare access and guiding business strategy, GIS supports better decision-making by grounding analysis in location. As data continues to grow in volume and complexity, GIS will remain essential for understanding where things happen and why those places matter.

##### FAQs

1. What is Geocoding in simple terms?
Geocoding is the translation of an address or a place name into geographic coordinates, such as latitude and longitude.
2. What is the difference between geocoding and a Geocoding API? Geocoding is the process itself of converting a text name to a coordinate. A Geocoding API automates this process at scale.
3. Why are geocodes more reliable than addresses? Addresses can be variable, have errors, or change over time. The coordinates keep geographic positions constant, which allows for more uniform analysis and automation.
4. Is geocoding used only for maps? While maps are a frequent output, geocoding powers analytics, routing, planning, and operational decision-making as well.
5. When should a business adopt a Geocoding API? When location starts to have cost, efficiency, customer experience, or growth implications, geocoding is a must.

#### Top 8 Alternative Data Use Cases for Making Smarter Financial Decisions

##### Key Takeaways

- Alternative data helps investors move faster and gain perspectives beyond traditional financial reporting.
- Buy-side and sell-side analysts use alternative data differently, based on their roles and risk tolerance.
- Geospatial, foot traffic, transaction, and online activity data are especially valuable for financial analysis.
- Alternative data supports decision-making across the full investment lifecycle, from research to portfolio management.
- Combining multiple alternative data sources often produces stronger and more reliable insights.

Traditionally, investment firms and financial advisors have had to rely on information from sources like press releases, news stories, quarterly financial reports, and stock offerings to decide which companies and assets to recommend or invest in. But as both people and information move faster due to the rapid advance of technology, investors are looking for more immediate ways to gauge what’s going on in the market. So they are increasingly turning to alternative sources of data for quicker insights and different angles than what they would get from traditional financial information sources alone.

But what is this “alternative data”, and how are investors using it to make financial decisions faster and with greater accuracy? We’ll explain by way of some key definitions, as well as a selection of alternative data use cases to demonstrate financial functions that can benefit from using alternative data. Here’s what’s in store:

- What is alternative data, and why is it so useful?
- How alternative data is used: buy-side vs. sell-side analysts
- Top 8 alternative data use cases: making the most of alternative data

We’ll start with an explanation of what alternative data is, and why organizations are increasingly utilizing it.

##### What is alternative data, and why is it so useful?

“Alternative data” is a term used in the financial services industry to describe data collected from non-traditional sources. This includes any data that an organization can’t either generate from its own operations or collect from official public sources (press releases, government agencies, etc.).

Alternative data can be any type of data, but one type that’s being increasingly used is geospatial data.
This is because it’s important for investment firms on both the buy-side and the sell-side to understand the geospatial components of consumer behavior, as well as the insights they can provide. For example, seeing that foot traffic to a specific store has decreased over a certain time period may be valuable for analysts looking to make a recommendation on which brands to invest in (or not).

##### How alternative data is used: buy-side vs. sell-side analysts

“Buy-side” and “sell-side” don’t refer specifically to buying and selling assets. Instead, these terms refer to different roles within the investment industry. “Buy-side” companies and jobs have more to do with directly making investment decisions for specific funds. “Sell-side” companies and jobs, on the other hand, are more about providing services and information to help out multiple investors at once. Here, we’ll explain in more detail what an analyst in each type of role does, and why they might use alternative data.

###### Sell-side analysts

Sell-side analysts typically work for brokerage firms or investment banks. Their job is to provide and/or sell financial services to the firm’s clientele. These include useful market information and recommendations on what securities to buy, sell, or hold onto. Their goal is to keep clients doing business with the firm so that the firm gets paid a commission whenever a client completes a transaction. They are typically under pressure to provide the newest, most relevant, and most accurate financial information to clients before their competitors do. However, researching companies and collecting data to build reports can be very expensive and time-consuming. That’s why they may rely on alternative data sources to broaden and speed up their analyses.

###### Buy-side analysts

Buy-side analysts typically work for firms that, on behalf of clients, directly invest in other companies by buying and selling assets (e.g. stocks).
Their job is to survey one or more business sectors for good investment opportunities, then make recommendations exclusively to fund managers (as opposed to all of their company’s clientele or the public) based on a fund’s investment strategy. Because buy-side analysts usually have to monitor more of the overall market at once, they often rely on getting credible information from sell-side analysts. However, since buy-side analysts also have a very small margin for error, they will often do their own research to check it against the recommendations of sell-side analysts. Again, this is where using alternative data can help broaden and speed up traditional research that often takes a lot of time and money.

So those are the kinds of people in finance who are likely to use alternative data. But what exactly can they use it for? We’ll cover some possible scenarios in the next section.

##### Top 8 alternative data use cases: making the most of alternative data

Alternative data has several applications for companies and positions on both the buy-side and sell-side of the financial markets. Here are a few examples of what alternative data can be used for in investing, and what type of role would most typically make use of it.

###### 1. Predictive modeling

Role type: Sell-side

A big part of sell-side analysts’ work is making educated guesses about how a stock’s price or a company’s financial standing will change. Alternative data helps them factor in many different variables that could affect this change, such as supply of goods/materials, consumer demand, and economic trends. In some cases, alternative data can be used to predict financial performance in a new market based on modeling done in a previous market.

###### 2. Demand forecasting

Role type: Sell-side

Sell-side analysts can also use alternative data to anticipate future increases or decreases in consumer demand for certain products or services. This can be critical for predicting how particular investments will perform.
Foot traffic and transaction data are especially useful for this, because they show how consumers interact with specific stores and brands. Oftentimes, financial analysts will develop demand forecasting models by looking at foot traffic or transaction data in two areas with a similar geography, or the same area in two comparable time periods. POI data with open and close attribution can also be useful for this analysis, because it indicates how many stores are opening or closing in an area. This can be used to measure the relative health of a business or industry. 3. Investment research and deal sourcing Role type: Buy-side On the buy-side, analysts from hedge funds, private equity firms, and the like leverage alternative data in their research. After all, their initial investment research is critical for finding a profitable deal that avoids unnecessary risk. So the more details they have about an investment, and the more accurate those details are, the better their reporting and recommendations will be. For example, buy-side analysts can enrich their financial analysis with POI, property, and mobility data to see how consumers interact with stores at specific locations. What other points of interest are in the area, and do they help or hurt the business? How much foot traffic does the area get, and when? Do people actually enter the store or just walk past it? Do customers tend to buy particular brands at a specific store? All of these factors can signal whether a potential investment is worth seriously considering or not. 4. Due diligence Role type: Buy-side Once buy-side analysts have narrowed down the assets, funds, or companies they want to seriously recommend their fund managers invest in, they must perform rigorous due diligence. They must learn as much as possible about the potential risks and benefits associated with the investments they’re considering. 
There is potentially a lot of money on the line, so analysts need to be extremely confident in their reporting and recommendations. Alternative data allows analysts to dive extremely deep into their due diligence research and consider all the angles. They might be able to uncover unforeseen anomalies that point to an investment having more risks than its potential rewards warrant. On the other hand, they may also discover some hidden opportunities that make an investment more lucrative than it was first believed to be. 5. Portfolio management Role type: Buy-side Even after a firm purchases shares in a company or other financial assets, they still have to maintain their investment portfolios. Past the initial deal sourcing and due diligence, private equity firms and hedge funds need to monitor how well each investment is doing and plan its future with their company. Alternative data can help monitor changing dynamics related to each business or asset, and thereby help firms make decisions about whether or not to maintain specific investments. 6. Competitive advantage Role type: Sell-side Sell-side analysts tend to have a wider margin for error than buy-side analysts, so they’ll usually prefer speed to accuracy when collecting and disseminating financial information. This is where alternative data is useful, because it’s often more immediately available or accessible than traditional financial metrics. As an illustration, official sources of information such as press releases and quarterly financial reports are usually accurate signals of a company’s performance. But they are only sporadically accessible, as they are usually only released on the company’s timetable or in reaction to a major event. In contrast, companies tend to update their social media feeds much more often in an effort to stay engaged with their customers. 
By paying attention to these latter channels, analysts can get a more immediate sense of what’s happening at a company, and how consumers are interacting with the brand. This can give them signals that allow them to provide predictive investment advice faster than some competitors, who may merely react to financial news once it becomes official. Read our blog about up-to-date store open/close data to learn how alternative data is used to stay on top of a rapidly changing business landscape. 7. Brand or industry relationships Role type: Both Another use of alternative data is in observing relationships between stores, brands, or business sectors, especially over specific geographic areas. For example, some companies may be doing better or worse in certain places because of other businesses in close proximity or in the general area. So performance trends may be dependent on competing or complementary businesses moving into or out of the vicinity. To illustrate, a health food store or restaurant may be doing well because nearby gyms attract fitness-conscious people to the area. But it may see a sudden downturn if those types of complementary businesses close down or move away. Analysts may want to take relationships like these into account when choosing which companies to invest in, and when. POI and mobility data can also point to potential future partnerships, mergers, or acquisitions. As an extension of the example above, a business moving into a new market may be able to establish itself quickly if it cross-promotes with other nearby companies that cater to similar lifestyles but don’t directly compete. On a more granular level, by looking at publicly-available data on corporate transportation, analysts can find patterns in where a company’s executives have been traveling lately. 
Based on which similar businesses are at the destination(s), it might allow analysts to anticipate an acquisition, merger, or other partnership in the works before it becomes widely known. Read our ultimate guide to trade area analysis to see how geographic patterns can inform investment decisions. 8. Online activity insights Role type: Both Research, marketing, entertainment, and business transactions are increasingly moving to the online world. So it makes sense that the Internet can provide a wealth of data that could be turned into financial insights. This includes not just people’s online buying, bidding, and selling behaviors, but also their other online activity as a reflection of how they might spend money in the real world. As an illustration, if people visit a website or use an app associated with a specific brand, it could indicate that they intend to spend money on it. For example, if someone visits a musician’s website or social media feed, it may signal that they’re a fan and will buy that artist’s music or concert tickets (at least at some point in the near future). Or if someone downloads an application to track a sports league, it’s reasonable to think that they may eventually buy game tickets or merchandise (if only for a specific team). Analysts can use web traffic and app usage to see which companies are hot or not, though they may need to dig a bit deeper to determine how much of that attention is positive or negative (as bad publicity can signal an impending downturn). Taking this a bit further, online shopping has become increasingly common due to the convenience of being able to buy, sell, and bid from the safety and comfort of one’s own home. So foot traffic at brick-and-mortar stores might not tell the whole story of how well a business is doing. 
Financial analysts may need to look at things like online payments (through credit cards or payment processing services such as PayPal or Venmo), or activity surrounding logistics companies or warehouses. This data, when combined with information regarding consumer-facing storefronts, can give analysts a more complete picture of a company’s overall sales and performance. Learn more about connecting multiple alternative data sources for deeper financial insights. In short, alternative data allows investors and analysts to look at companies and assets through several unique lenses. This can enable them to anticipate market trends before they happen, or to see investment opportunities or pitfalls that they might not if using only traditional financial data. For more information on how alternative data is shaping the financial markets of today and tomorrow, check out SafeGraph’s financial services use cases for geospatial data. FAQ’s 1. What is alternative data in finance?Alternative data refers to non-traditional datasets such as geospatial data, foot traffic, transactions, online activity, and social signals used to inform investment decisions. 2. How does alternative data differ from traditional financial data?Traditional data is usually structured, periodic, and officially reported, while alternative data is often real-time, behavior-based, and sourced from everyday activities. 3. Who uses alternative data: buy-side or sell-side analysts?Both. Sell-side analysts often use it for faster insights and market signals, while buy-side analysts rely on it for deeper research, due diligence, and risk assessment. 4. Why is geospatial data important in alternative data analysis?Geospatial data provides context on consumer behavior, store performance, and regional trends that are difficult to infer from financial statements alone. 5. 
Is alternative data reliable on its own? Alternative data is most effective when combined with traditional financial data and evaluated with a clear analytical objective. #### Unacast Announces Strategic Partnership with SafeGraph to Bring Enhancements to the Unacast Insights Platform This blog was reposted with permission from Unacast | Author: Unacast | Original Source The Unacast Insights Platform Levels Up With SafeGraph’s Premium Point of Interest Data Today, Unacast announced a strategic partnership to incorporate SafeGraph Places into its location intelligence product suite. SafeGraph’s point of interest (POI) data is now paired with Unacast’s machine learning-powered foot traffic data and accessible through the Unacast Insights platform. These updates will further modernize the way businesses utilize location insights, providing instant access to millions of new locations across a range of categories. This partnership allows Unacast to offer expanded and up-to-date insights into consumer behavior and trends with the most precise and reliable POI data available. 
SafeGraph’s high quality Places data contains a robust set of geospatial attributes to provide deep context about physical locations, including address string, geographic coordinates, brand affiliation, open/close date, and NAICS/category codes. The updated venue closure and opening statuses provide essential information to help businesses adjust their strategies quickly and efficiently. The enhanced interactive map on the Unacast Insights Platform is now even more user-friendly and visually appealing. With new polygon displays, icons, and labels, users can easily navigate and interpret the data to make informed decisions. The interactive map feature is perfect for visual learners who prefer a more intuitive way of digesting information. Additionally, Unacast’s popular ranking and benchmarking feature is now enriched with premium category and brand information, simplifying the process of staying ahead of the competition.The updated metrics on the Unacast Insights Platform offer a deeper level of analysis to help businesses understand their customers better. A new visit length analysis feature breaks down visits into short, quick, or extended stays, giving users a more nuanced view of consumer behavior. Additionally, the enhanced popular times feature shows hourly activity levels, allowing businesses to optimize their operations and marketing efforts for maximum impact.“We are thrilled to join forces with SafeGraph in bringing location intelligence to life in both our data sets as well as in our Insights platform. 
We’re already seeing their state-of-the-art approach to POI data along with Unacast’s proprietary visitation data help customers uncover new revenue opportunities and better assess their strategic investments in brick and mortar locations,” said Jonathon Schuster, Chief Product Officer at Unacast.The combination of Unacast’s mobility analytics and SafeGraph’s POI data ensures businesses now have access to even more valuable insights to inform their decision-making. The Unacast Insights Platform is now an even more powerful tool in helping retailers, real estate companies, municipalities and more unlock new opportunities and drive success.About SafeGraphSafeGraph is a pure-play data company that provides the highest quality data on global points of interest (POI), enabling innovators to create world-class analytics and applications, Learn more at www.safegraph.com.About UnacastUnacast is a global location intelligence and insights company shortening the time and resources it takes to get from data to action. Using technical expertise, state-of-the-art machine learning, and ethical responsibility, Unacast extracts the valuable information from location data, delivering trustworthy, reliable, and privacy-friendly location intelligence. Companies across industries, at every stage of growth, rely on Unacast to make more informed decisions that better align with the world around them. Learn more at www.unacast.com #### Using Connected Vehicle Data and Parking Lot Polygons to Attribute POI Visits Key Takeaways Connected vehicle GPS data often stops in parking lots, limiting direct POI visit attribution. Parking lot polygons provide the missing spatial link between vehicles and the POIs they serve. Combining connected car data with parking lot and POI geometry enables more accurate visitation models. Parking lot–to-POI relationships are especially critical for malls, strip centers, and multi-tenant locations. 
Polygon-based attribution reduces guesswork compared to radius- or centroid-based methods. According to data from Statista, there were about 84 million connected cars on the roads in the United States in 2021. That number is expected to surpass 305 million by 2035, making the United States the biggest market for connected vehicles. More connected cars on the road means that connected car and managed fleet data are becoming more abundant. Businesses have already started using data from connected vehicles to optimize their fleet logistics, analyze traffic patterns, and understand consumer and fleet visitation behavior at POIs (visit attribution).Using connected vehicle data alone for visit attribution presents a unique challenge: understanding where consumers have gone after they have parked their cars. While location data from mobile devices can typically be seen within a POI’s geofence (since people carry their phones with them), GPS data from connected vehicles usually ends well outside the POI in a nearby parking lot. Without accurate data on parking lots and the POIs those parking lots are serving, visit attribution with connected car data is guesswork. However, by combining connected car data with precise parking lot polygons and POI data, data scientists can create robust POI visitation models.SafeGraph now offers precise polygon data on parking lots for this use case as a part of our Geometry dataset. SafeGraph Parking Lots matches parking lot polygons with the POIs they are serving, helping you understand what POIs a consumer visited based on where they parked their car.SafeGraph Parking LotsSafeGraph Parking Lots currently contains the parking lot(s) serving over 6M Places. These are available as a collection of premium geometry rows depicting the shape and size of surface parking lots across the US.When combined with connected vehicle data, the SafeGraph Parking Lots and Places datasets can help you understand visitation behavior at POIs. 
For example, a small strip mall may consist of a number of mall tenants and a parking lot. These tenants and their parking lot would be linked in our data, so you would know that any connected vehicle that stopped in the parking lot was a patron at the strip mall.Precise polygon data on parking lots helps complete the picture of a driver’s journey. SafeGraph’s Parking Lot dataset provides context on the relationship between a parking lot and the POIs it is serving. Combined with connected car data, parking lot polygons make it possible to broaden visit attribution models to specific places.Learn more about SafeGraph Parking Lots by visiting our site and download a free sample to experiment with the data yourself. FAQ’s 1. Why is connected vehicle data alone insufficient for POI visit attribution? Connected vehicle GPS data typically ends when a vehicle parks, often outside a POI’s geofence, making it unclear which place was actually visited. 2. How do parking lot polygons improve attribution accuracy? Parking lot polygons capture where vehicles stop and link those locations to the POIs they serve, enabling more reliable inference of visits. 3. How is this different from mobile device location data? Mobile device data often enters a POI’s geofence because people carry phones indoors, whereas vehicle GPS data usually does not. 4. Which POI types benefit most from parking lot–based attribution? Multi-tenant locations such as strip malls, shopping centers, and large retail complexes benefit the most. 5. Can parking lot polygons be used with other mobility datasets? Yes. Parking lot geometry can complement multiple mobility sources where final destination ambiguity exists. 
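The parking lot-to-POI link described in this post can be sketched in a few lines of Python. This is a minimal illustration, not SafeGraph's implementation: the function names, the `polygon` and `poi_ids` keys, and the sample coordinates are all hypothetical, and the geometry treats longitude/latitude as planar, which is only a reasonable approximation at the scale of a single parking lot.

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]  # (x, y), e.g. (longitude, latitude)


def point_in_polygon(point: Point, polygon: Sequence[Point]) -> bool:
    """Ray-casting test: cast a horizontal ray from the point and
    toggle 'inside' each time the ray crosses a polygon edge."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def attribute_stop(stop: Point, lots: List[Dict]) -> List[str]:
    """Return the POI ids served by every parking lot that contains the stop.
    Each lot is a dict with hypothetical keys 'polygon' and 'poi_ids'."""
    served: List[str] = []
    for lot in lots:
        if point_in_polygon(stop, lot["polygon"]):
            served.extend(lot["poi_ids"])
    return served


# Hypothetical strip mall: one square lot linked to two tenants.
strip_mall_lots = [
    {
        "polygon": [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)],
        "poi_ids": ["coffee_shop", "gym"],
    }
]

parked_inside = attribute_stop((5.0, 5.0), strip_mall_lots)
parked_elsewhere = attribute_stop((15.0, 5.0), strip_mall_lots)
```

A vehicle stop inside the lot is attributed to every tenant the lot serves, which matches the strip mall example above. Narrowing the visit to a single tenant would need extra signals (dwell time, time of day) that this sketch does not model.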
#### Using Geospatial Data to Understand General Liability General liability is directly related to the number of people exposed to a potential peril People eat more ice cream in the summertime. As statements go, that one isn’t likely to make an anthology of insightful wisdom anytime soon. It turns out, however, that it actually has significant implications for how we assess general liability (GL) risk. The risk of most GL perils like slip, trip, and fall (ST&F) is directly related to the number of people who are exposed to the peril. The more people that walk over a patch of ice, the greater the chance that someone will slip on it. Despite the fact that customer visits is a key driver for GL exposure, however, insurers almost never consider it in pricing models. The reason for this is pretty simple – it’s not cost effective to sit outside someone’s business with a notepad and count how many people go in every day. As an industry, therefore, insurers rely on revenue as a proxy for customer visits. This works pretty well. There’s a strong correlation between the two, and revenue is way easier to get. There are several problems with it, however. Revenue data is not enough Let’s consider an ice cream parlor and a soup shop. Both are classified with NAICS codes as “Snack and Nonalcoholic Beverage Bars”, and both have comparable revenue. 
The timing of that revenue is radically different, however. Anywhere north of the Mason-Dixon line ice cream parlors do the bulk of their business in the summer, while soup shops have higher volume in the winter. The result is that soup shops have far higher exposure to ST&F due to icy conditions. An annual revenue number will never show that, however. Pricing a GL policy simply on revenue and NAICS codes will generate the same premium for both businesses, when the actual risk may be different by a factor of three or more.Another issue with timing is lag. Since revenue is a lagging indicator, relying on it as a proxy for GL exposure means the data is always out of date. In a steady-state business this may not be a huge factor, but as we are attempting to understand the long-term effects of COVID on our economy, working on old data is like trying to drive a car by looking in the rearview mirror.In addition to timing, bias is also another problem with revenue data. Unless a business is publicly traded, insurers are relying on the business to provide them with their own revenue numbers. This is the same business that is attempting to minimize their insurance premium. Insurers can mitigate this conflict of interest by invoking audit clauses, but that brings us to the final problem with revenue – ease of access.Insurers use revenue because it has been easier to get than visit data. But easier does not equal easy. Actually verifying revenue data requires a human being to manually review tax returns, receipts, or other documents. This, in turn, undermines any attempts at straight-through risk assessment for small commercial policies. When it comes to revenue data insurers can have easy or we can have trustworthy, but they can’t have both.Geospatial data informs the most accurate modelsAll that said - there is a better way for insurers to use data to model GL and ST&F. 
Anonymized high-precision building footprints and detailed points of interest provide insurers with a source of truth for how people are actually interacting with the physical places they are insuring. Collectively, these datasets give insurers a new option for estimating customer visits. We know what and where businesses are, and how consumers move in and out of those businesses. This gives insurers a new metric that is highly correlated to total visits and addresses the key problems with using revenue highlighted above. SafeGraph’s sole focus is curating geospatial data that businesses can depend on. With high quality POI and geometry data, insurers can have confidence in their GL and ST&F analysis. SafeGraph data enables insurers to understand (and price for) seasonal patterns with historical data, and to remove the conflict of interest in pricing based on self-reported revenue data. #### Using SafeGraph Polygons to Estimate Point-Of-Interest Square Footage Areas inferred from SafeGraph correlate with government datasets, with coefficients as high as 0.89. For a working demonstration showing how to calculate square footage from SafeGraph Geometry polygons, see this Python Jupyter Co-Lab notebook. We analyzed the public records (food-license permits) from 4 different major American cities and compared their data to SafeGraph Geometry. Shown here are 3 examples of SafeGraph Geometry polygons (green) superimposed on satellite imagery to verify their accuracy. If you could know one fact about a retail location to understand the business, would it be square footage? Knowing the square footage of a retail Point of Interest (POI) is an informative proxy for all sorts of details about that location. Is the POI located in a dense urban environment with a high premium on square footage or a suburban strip mall where retail centers can afford larger buildings? 
Coffee shops have very different square footage compared to big-box retailers (which in turn are different than major corporate chain restaurants, etc.).You can calculate square-feet from SafeGraph polygonsAlthough square-feet is not an explicit attribute reported in SafeGraph Places, the SafeGraph Geometry product features detailed geospatial polygons of the building footprint (see pictures above). It’s very easy to calculate square-footage from these polygons.We have a working demo showing how to calculate square feet in this Python Jupyter Co-Lab notebook.Want to re-create this chart yourself and dig into the details? Curious about those outliers? Want to analyze the correlation by category? Read the full post and customize the ready-to-run code at the Python Jupyter Co-Lab notebook (it runs in your browser!).When SafeGraph and the Government disagree, who is right?Although the overall correlation is very high, there are still notable outliers. When SafeGraph and the government data disagree, who is right and who is wrong? We investigated the top ten largest discrepancies one-by-one. Of the top ten largest differences, 5/10 of the errors were clearly in the government dataset and 5/10 of the errors were in the SafeGraph data. Room for improvement on both sides! And you better believe SafeGraph is working hard on it! Nonetheless, the overwhelming result is that SafeGraph data correlates strongly with data reported by the government on food-permit licenses.To see a detailed analysis (with pictures!) of the top ten largest disagreements between SafeGraph and the government data, check out the Python Jupyter Co-Lab notebook. #### Validating Spend Data for Brands Against Company Reporting‍ Historically, data about visit patterns has enabled hundreds of organizations - from the CDC to Sysco and everything in between - to measure and analyze how many people visit physical places, when they go, and where they come from. 
But many are still searching for a location-based way to analyze spending behavior.‍ Introducing: SafeGraph Spend‍ To empower our customers with places-based spending data, we developed a brand new product offering: SafeGraph Spend. Spend data makes it easier than ever to understand customer behavior and perform competitive analyses. While other data providers offer transaction data, no one else delivers it at the same precise, POI level as SafeGraph Spend. Spend data is updated monthly, making it much more fresh and accurate than quarterly earnings reports. Many organizations rely on company reporting for indicators of a brand’s sales performance, which is often published quarterly. SafeGraph Spend provides aggregated and anonymized transactions for specific brand locations at a monthly cadence, so performance can be analyzed more frequently and more accurately. The rich metadata included with Spend also makes it possible to understand spending patterns by day, geography, and medium (online vs offline). ‍ Validating Spend Data for Brands Against Company Reporting‍ To test the efficacy of Spend data, we decided to compare it to company reporting, which many organizations currently use to analyze a brand’s sales performance. First, we aggregated raw_total_spend by brand and compared it against publicly reported quarterly earnings figures. Some brands, such as Chipotle, differentiate between in-store vs online or delivery transactions in their quarterly reporting. SafeGraph Spend also differentiates between purchase types. In addition to raw_total_spend, Spend has additional columns such as online_spend and spend_by_transaction_intermediary which indicate the platform used in the transaction. For example, here we can see Chipotle transactions via delivery services using the online_spend column. The Spend dataset facilitates analyzing seasonal or event-based changes in spend by location. 
When looking at McDonald’s, we can see changes in spending behavior due to COVID-19 in Q2 2020. If we drill down to the state or individual store level, we can observe the COVID recovery rate more closely. For example, in early 2021, Colorado restaurants were required to have a certification to serve diners indoors. We can observe that change in SafeGraph Spend data for that timeframe. A similar methodology can be used to analyze the effect of new promotions, loyalty programs, or the addition of a restaurant to new delivery platforms. Watch our on-demand webinar to learn more about how this places-based transaction dataset enables you to uncover insights on spending changes over time at specific POIs, while comparing spending trends between locations and across regions. #### Validating Store Counts for Brands Against Company Reporting Maintaining timely, accurate, and reliable open and close data on POIs is crucial for numerous customer use cases, such as retail site selection, competitive analysis, and analyzing COVID-19 recovery trends. During 2020, many brands closed stores. While some closures were temporary due to quarantine requirements, some large retail chains, such as Gap and Bed Bath & Beyond, signaled significant changes in their overall store footprint. To keep up to date with these changes, we improved our QA processes to ensure we could deliver timely and consistent reporting on closed_on dates. Other factors, such as industry trends, have also accelerated the pace of store closures; Chase is one notable brand. Conversely, as the COVID-19 recovery continues in 2021, brands such as Chipotle are reporting plans to increase store counts. The rising trend of digital sales and food delivery may reshape how these new stores operate. The steps below can be used to extend this analysis for other brands: 1. Obtain company reported store counts from quarterly earnings reports found in SEC Filings or in the Investor Relations section of the company’s website (Chipotle). 
Take note of the countries included in the store count (US only or US + CA); SafeGraph currently doesn’t have POI counts for global store counts. 2. To measure the SafeGraph open POI count at the report date, use the most recent Core Places Release, filtering by brand and country if necessary. Take the total POI count, minus POI with a closed_on date at or before the report date, minus POI where open_on > report date. 3. Compare the company reported store counts against the SafeGraph open POI count by plotting a graph and calculating the correlation coefficient. Note: some company reported store counts may include child brands, whereas SafeGraph has each brand individually (Albertsons includes 19 child brands such as Safeway, Jewel-Osco and Vons). Determining if a store is temporarily or permanently closed can be challenging, as brands can change their store locator features without warning. In the case below, Bed Bath & Beyond closed some stores but they remained in the store locator, with trading hours as “Closed” every day and “CLOSED” in the name. Several brands use similar methods to indicate a store is temporarily closed; to avoid incorrectly marking these as permanently closed, we wouldn’t apply a closed_on date. To resolve the discrepancy here, we can update the logic for this specific brand and the new POI count can be reflected in the next release without impacting the logic and POI closures for other brands. In addition to QA checks throughout our pipeline, SafeGraph also uses customer feedback to continuously improve our POI data. Errors in open/close dates or other attributes can be reported directly to the Product Team via our Feedback Tool. To see the analysis in action, check out this spreadsheet with both working and raw data, then schedule a demo to get high quality POI data for your team. 
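Steps 2 and 3 above can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not SafeGraph's pipeline: the POI records are hypothetical dicts keyed by `open_on` and `closed_on` (mirroring the field names used in the steps), a POI is treated as closed when its `closed_on` date is on or before the report date, and the store counts in the example are made-up numbers.

```python
from datetime import date
from math import sqrt
from typing import Dict, List, Optional, Sequence


def open_poi_count(pois: List[Dict[str, Optional[date]]], report_date: date) -> int:
    """Total POI count, minus POI already closed at the report date,
    minus POI that had not yet opened at the report date."""
    total = len(pois)
    # Assumption: closed_on on or before the report date means closed.
    closed = sum(
        1 for p in pois
        if p["closed_on"] is not None and p["closed_on"] <= report_date
    )
    not_yet_open = sum(
        1 for p in pois
        if p["open_on"] is not None and p["open_on"] > report_date
    )
    return total - closed - not_yet_open


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical POI rows for one brand (None means the date is unknown).
sample_pois = [
    {"open_on": None, "closed_on": None},
    {"open_on": date(2019, 5, 1), "closed_on": None},
    {"open_on": None, "closed_on": date(2020, 6, 1)},   # closed before the report
    {"open_on": date(2021, 3, 1), "closed_on": None},   # opened after the report
]
count_at_report = open_poi_count(sample_pois, date(2020, 12, 31))

# Made-up quarterly series: company-reported counts vs. open POI counts.
reported_counts = [100, 110, 120, 130]
safegraph_counts = [98, 109, 118, 131]
r = pearson(reported_counts, safegraph_counts)
```

Repeating `open_poi_count` for each quarterly report date yields the series to plot against the company-reported counts. For brands whose reporting rolls up child brands, the child brands' POI would need to be summed before comparing.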
#### We Can Use Machine Learning to Make Better Decisions: World of DaaS interview with Stanford professor of economics, Susan Athey New podcast with Susan Athey, The Economics of Technology Professor at Stanford's Graduate Business School. Our conversation is available everywhere (Apple Podcasts, Spotify, YouTube, etc.). Please subscribe, follow, and review.‍ Susan Athey is one of the top 10 smartest people I have ever met. And I’ve met a lot of people. I’d put her in the same league as Tyler Cowen and Peter Thiel. She’s a genius. Susan is the Economics of Technology Professor at Stanford's Graduate Business School. She won the John Bates Clark Medal -- an award given to the American economist under the age of 40 who made the greatest contribution to economic thought and knowledge. She basically invented the role of Tech Economist when she served as Chief Economist for Microsoft. She sits on several boards. And so much more. Susan and I dive into what tech companies are doing wrong and how we can use machine learning to make better decisions. Here are some highlights from my conversation with Susan Athey. Machine learning requires feedback, so you need to pick the right time horizon Most algorithms on the web are really good at optimizing for clicks because there is a very tight feedback loop -- and feedback is really important to ML systems. So the tighter the feedback loop, the faster the algorithm will improve. The problem is that the things with the tighter feedback loops are not always good proxies for what you want to measure. Measuring clicks has a lot of downsides. Irrelevant clicks will never lead to a purchase. We really want to measure long-term value to the customer -- but that could take a year and that would be too long of a feedback loop. So we need to be creative in what we measure and identify the trade-offs between time and long-term value. 
##### A/B testing is often constructed to not get the right answer

Almost all companies have this idea that in order to have data-driven innovation, you want to do a lot of A/B tests. To run a lot of tests, they need to be short-term. If they're short-term, then you’re going to overlook long-term feedback, which is super important for understanding if your algorithm is doing what you originally intended. Most people understand this tradeoff. But A/B tests let you ship algorithms often, and engineers are typically rewarded when their algorithm is shipped.

##### Figuring out the right KPIs is really hard

You have to recognize that there’s a tradeoff. You want to optimize for the long-run objective, but there’s a lot of noise, innovation can slow down, and you may not get a signal. If you only optimize for a short-term objective, then you're probably going in the wrong direction. But figuring out the right KPIs to optimize for is one of the most powerful things you can do.

##### You don’t want to have self-learning systems for the big questions

When you put in a reinforcement learning algorithm, you have to commit to metrics. Self-learning works well when your metrics capture everything you care about. The system will optimize without you stopping it. But if you have bad metrics, you’ll optimize for the wrong thing.

##### You want to ensure your algorithms don’t have unintended consequences

At a smaller company, you can ensure that engineers pay more attention to it. Ship a new algorithm, have a holdout set, and evaluate a few months later. This works well when you have a team of 20 people where everyone understands all of the algorithms and decisions being made. At a larger organization, it’s harder. You don’t want to stop decentralized innovation. You want to make hundreds of product decisions in a week. So you can have peer reviews, where you have to explain the effects of your work and the metrics that capture this information. Most people know what’s qualitatively going wrong.
So you can also identify when A/B tests are the right fit. And sometimes you just have to change the metrics that you’re monitoring.

##### Long-term experiments are worth the decrease in income

Setting up a two-year holdout group can help you figure out the right long-term metrics to optimize for. But this will require passing on some revenue. It might be expensive for a small firm. It’s easiest to set up before you have revenue. It's hard to claw back and give up revenue once you have investors to report to.

##### Data quality is really, really important in machine learning

Machine learning allows us to more quickly see interesting things in the world because we can maybe run a million experiments, whereas before we could just run one. But we also risk learning things from experiments that aren't true. High-quality data from natural experiments is key to avoiding this. Of course, this is music to our ears at SafeGraph as our core focus is high-quality data for data scientists and machine learning engineers.

##### But even with amazing data, there’s a lot left unknown

We stopped teaching a new generation of people working on machine learning how to think about drawing inferences from observational data. And we stopped teaching them that there are situations where you just can't. Regardless of the data or the AI, there are some theorems that say you literally cannot answer that question.

Hope you enjoy this episode of World of DaaS. I would really appreciate it if you subscribe and review (Apple Podcasts, Spotify, YouTube, etc.).

#### We Need Open Information to Power Innovation

What powers machine learning?

What is the oil that machine learning engines run on?

Knowledge. Machine learning engines run on knowledge.

History. Machine learning engines run on history.

The crass word for “knowledge” and “history” is … data.

Machine learning runs on data.

The more data, the better the data, the more innovation.
Great data beats great algorithms.

Great innovations will come from machine learning that mines great data. But who will bring about these innovations?

That’s a very important question.

There are two potential futures.

One future is a world where all the important data -- consumer behavior, health outcomes, economic data, and the like -- resides with a very small number of giant companies. Companies like Google, Facebook, Tencent, and Amazon. These companies jealously guard their data because it’s what gives them their competitive edge.

And because they have a monopoly on the data, they monopolize innovation. Yes, some innovation will still happen within start-ups. But the bulk of the world’s innovation could be concentrated in the hands of the few companies that control the knowledge. And the monetary gains from that innovation will go to an increasingly narrow group of people.

This is not necessarily an evil world. But monopolies slow down technological progress. Monopolies promote scarcity despite a reality of abundance. Monopolies concentrate and stifle economic activity instead of distributing and promoting it.

Is that a world that you want to live in?

There is a second version of the future, though. This is a world where access to data -- to knowledge and history -- is made available to all potential innovators.

In version 2.0 of the world, data is democratized. And innovators everywhere can build on that knowledge.

Open access to new technologies and core infrastructure drives innovation. The same is true for data.

##### “It doesn’t matter, we have all the data”

One recent catalyst in machine learning has been the invention of TensorFlow, a machine learning library that makes it especially easy to build deep learning models. In my opinion, it is one of the top ten most important innovations in the last decade. It has already helped create very large gains in deep learning.
It is amazing.

TensorFlow is an open-source ML framework by Google.

Not long after Google announced the project, I caught up with a core engineer on TensorFlow. I asked him why Google would open-source such an amazing innovation. Isn’t machine learning the key to Google’s future? He calmly replied, “it doesn’t matter, we have all the data.”

Back to our two futures.

In the first future, a small number of companies have all the world’s data. In that world, even the most advanced models and machine learning frameworks don’t matter. Here, a few companies will always win by brute force because core truth data matters more for model performance than do fancy algorithms. In this future, innovation slows down due to lack of competition, and a very small number of companies command most of the innovation rents.

But in the second future -- the one where all innovators can get access to the high-quality data that they need -- innovation becomes the competitive advantage. Because all innovators have access to truth about the world, the barrier is not access to the past, but imagination for the future. Here we value creativity. Here we value real innovation. Here we focus on the future rather than the past.

The second future is not such a far-off world. We already have democratized access to compute power. Today that’s available to anyone. Open access to compute power (via AWS, Microsoft Azure, Google Compute, and more) has massively accelerated innovation. And no, it’s not free. But it is available to anyone that wants to pay for it. And more and more innovation is happening around cloud compute, like the advent of Docker containers.
So the price of the compute power declines every month.

The good news is that there are a bunch of companies and nonprofit initiatives working on democratizing access to different types of data: SafeGraph (my company), Data.World, data.gov, Siftery, OpenStreetMap, weather services (like AccuWeather and The Weather Company), Dun & Bradstreet, SecondMeasure, and many more. There are also data marketplaces and initiatives within non-data companies to make data more accessible, like AWS public datasets, Oracle Data Cloud, Salesforce.com data marketplace, LiveRamp (my old company) data store, Quandl, ESRI data marketplace, and more. But there is still a long way to go to challenge the data monopolies currently accumulating inside the walls of a few giants.

Of course, we need to open up data responsibly. Unlike compute power, data can reveal private matters in a person’s life. The promise of progress (like coming up with the best cancer treatment for each individual) does not need to come with the cost of all of us giving up our privacy.

Data can be very personal, and people should have the ultimate say in whether or not their data is used for analysis. And for those that allow their data to be used, companies need to take extreme care to ensure data privacy. This involves implementing strict safeguards, like ultimately building tools that allow companies to analyze datasets and train models without actually seeing the underlying data, as well as ensuring that any data that companies do see is k-anonymous for some large k.

##### We need open information to power innovation

Data should be an open platform, not a trade secret. Information should not be hoarded so that only a few can innovate.
We need as many organizations as possible working to solve the challenges facing humanity, and that requires that everyone has access to powerful datasets.

“We need open information to power innovation. Data should be an open platform, not a trade secret.” - Auren Hoffman (@auren), November 21, 2017

When we open the world’s most powerful data -- when small startups, new initiatives inside larger companies, and massive companies can all access the same datasets -- then competition will increase, innovation will proliferate, and societal growth will accelerate exponentially. More people will benefit from technology, faster. This is the future that I want to live in.

Special thank you to Ryan Fox Squire and Noah Yonack for helping draft this piece.

Join Us: We’re bringing together a world-class team, see open positions.

#### What Is a Catchment Area? + Methods & Tools for Your Analysis

##### Key Takeaways

- A catchment area defines the geographic region from which a business or service attracts its customers.
- Catchment area analysis helps businesses understand customer origins, market potential, competition, and store performance.
- Catchment concepts are used beyond retail, including geography, healthcare access, and school district planning.
- Businesses commonly calculate catchment areas using buffer zones, drive or walk times, and mobility-based methods.
- High-quality POI and mobility data, combined with GIS and BI tools, are essential for accurate catchment analysis.

To grow your business, improve the customer experience, and retain and engage your customers, you need to understand who they are and where they come from. Catchment areas, or trade areas, allow businesses to understand where their visitors are coming from, and gain deeper insights into who they are and their behavior.

To help you learn how to perform catchment area analysis yourself, we’ll cover the following:

- What is a catchment area?
- What are catchment zones?
- What is a catchment area analysis & why is it critical you do one?
- 3 methods of calculating catchment areas
- Top tools you’ll need to identify your catchment area

Before we explain the best methods for calculating catchment area and the best tools to use, we’ll explain what a catchment area is and why catchment area analysis is so important.

##### What Is a Catchment Area?

A catchment area, or trade area, is the geographic area that a business, service, or organization attracts its customers from. Catchment areas can be defined by distance, by travel time, and by mobility patterns, allowing you to analyze where foot traffic comes from in a variety of ways. Catchment areas are often used to better analyze foot traffic and store visit rates. To learn more about how to perform a catchment area analysis, check out our ultimate guide to trade area analysis.

###### What does this mean for businesses?

For businesses, catchment areas show you where your customers come from and indicate the level of customer engagement in different areas near your business. Catchment areas can be used to analyze trade areas of your business, painting a picture of the coverage of your store locations, the competition in your area, and the overall pull of your store locations. This makes it much easier to compare your own store locations against each other and against competitor locations, assess market saturation, and more.

###### What is a catchment in geography?

In geography, a catchment area is an area of land that collects water after rainfall, typically bounded by hills. Water flows down into these areas and collects into rivers and streams. These areas are useful for analyzing a geographic region because they show how rainfall collects and flows through it. Since water flow impacts much of the region's geography, foliage, and ecosystem, this is an important lens for analyzing an area. This helps inform development of drainage basins and water flow.
Trade area analysis - or catchment analysis - for businesses is based on this same concept.

###### What does catchment mean for health care & schools?

Catchment areas are often used by cities and government organizations to determine boundaries, such as school districts and the coverage of hospitals and health care facilities. Communities determine who can attend schools by defining school districts. To do this, they use catchment areas based on distance and travel time. Similarly, hospitals and other public safety institutions can ensure that people have access to the services close to them.

##### What Are Catchment Zones?

Catchment zones are simply catchment areas, but this term is commonly used to refer to boundaries used in school or government catchment area analysis. They are used to define zones or boundaries for people’s reference - such as which school district they belong to, which hospital is within an ambulance's service area, or the areas served by specific subway stations.

While catchment areas in trade area analysis will often change depending on what you want to analyze, catchment zones are often fixed boundaries of analysis. For a business, you may want to view catchment boundaries by different travel times. For school zones, you want defined boundaries that remain relatively fixed. Catchment zones are also useful when examining multiple trade areas that need to be identified separately but analyzed together.

##### What Is a Catchment Area Analysis & Why Is It Critical You Do One?

Catchment area analysis is the process of analyzing where your customers are coming from and who they are. It involves identifying the trade area of your business and digging into demographics, POI, and other data to understand your customers and their behavior. This includes identifying competitor locations.
It’s extremely important to do a catchment area analysis for your store locations for a number of reasons:

- Understand where your customers are coming from: Map and visualize where your customers visit your store locations from, helping you understand your main service area.
- Understand the market potential and penetration: Gain a sense of what the market is like in the region where your stores are located, and determine how likely you are to be successful.
- Understand your store location coverage: Paint a clear picture of the coverage of your store locations, including gaps and overlaps. Improve your own store performance and compare against competitor businesses.
- Select new sites for opening and existing ones for closing: Identify the best sites for a new potential store and which ones would be the best to close, based on actual store performance.
- Determine which locations are performing the best: Use analytics to improve the performance of individual stores and your entire network. Identify the best stores to expand or remove and how to better reach out to customers.
- Understand your customers’ travel network: Leverage this to know exactly where your customers travel in your trade area, and enhance your customer experience for them in the process.
- Improve marketing outreach and customer targeting campaigns: The more you know about your customers, the better you can market to them, increasing your conversion rate.
- Analyze competitor locations: Learn how you perform against your competitors, develop strategies to better serve competitor customers, and gain their business for yourself.
- Map customer behavior and mobility patterns: Catchment area analysis enables you to create extremely useful catchment maps, helping you perform faster, more effective catchment analysis.

Trade area analysis should be a regular practice for your business, and is especially important when making big decisions about where to open new stores or close existing locations.
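As a concrete illustration of a catchment area defined by distance, the sketch below computes the share of visitors who live within a fixed radius of a store. It is a minimal example under stated assumptions: the store and visitor coordinates, the 5 km radius, and the function names are all hypothetical, and distances use the haversine great-circle formula rather than a GIS tool's buffer operation. A real analysis would typically run on POI and mobility datasets inside a GIS or BI tool.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def share_within_buffer(store, visitor_homes, radius_km):
    """Fraction of visitor home points inside a circular buffer around the store."""
    inside = sum(
        1 for lat, lon in visitor_homes
        if haversine_km(store[0], store[1], lat, lon) <= radius_km
    )
    return inside / len(visitor_homes)


# Hypothetical store and visitor home coordinates (illustrative only).
store = (37.7749, -122.4194)
visitor_homes = [
    (37.7793, -122.4193),  # well under 5 km away
    (37.8044, -122.2712),  # more than 10 km away
    (37.7599, -122.4148),  # under 5 km away
    (37.3382, -121.8863),  # tens of km away
]
share = share_within_buffer(store, visitor_homes, radius_km=5.0)
```

In this sample, two of the four visitor homes fall inside the 5 km buffer, so the buffer captures half of the visitors. Repeating the calculation for several radii shows how quickly the catchment's coverage grows with distance.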
##### 3 Methods of Calculating Catchment Areas

When it comes to defining your trade areas, there are a number of ways to do this depending on what you are trying to learn. Below, we briefly explain what these three methods are:

- Buffer trade areas - Trade areas defined by a distance around the location(s) under analysis.
- Walk/drive time trade areas - Trade areas defined by the walk or drive time to the location(s) under analysis.
- Mobility trade areas - Trade areas defined by the mobility patterns of the location(s) under analysis. This is made possible by pairing point of interest (POI) and mobility pattern data appended to your store locations.

For a detailed breakdown of each of these methods, see our full explanation of the 3 methods of calculating catchment areas for trade area analysis.

##### Top Tools You’ll Need to Identify Your Catchment Area

The most important tools for identifying your catchment area are the datasets you’ll use. This information has to be reliable and accurate for your catchment area analysis to be of the most value to you. Below, we cover where to get data for catchment area analysis and how you can use this data for catchment analysis.

###### 1. Where to get catchment data: SafeGraph

What this is: SafeGraph offers point of interest (POI) data perfect for catchment analysis. With detailed information on a variety of locations, SafeGraph data can be used to gain information on not just your store performance, but that of your competitors. Even better, you can leverage this data to visualize your catchment analysis with a trade area analysis map.

Why you need this for a catchment analysis: Your catchment area analysis is only as valuable as the data you use for these analyses. To create the best catchment area maps and ensure you are performing the best trade area analysis possible, be sure to use high-quality, reliable data that is easy to work with.

###### 2. How to perform a buffer analysis

What this is: Buffer trade analysis involves creating a buffer zone (by distance) around your store locations, indicating the area for analysis. GIS and BI tools such as CARTO, Esri, Tableau, Domo, and AWS all facilitate this type of analysis, making it easy to define buffer trade areas for your store locations.

Why you need this for a catchment analysis: Buffer trade areas are a simple way of determining where your customers are coming from, and of initiating this analysis. Based entirely on distance from your store locations, these buffer zones are the most basic form of catchment area analysis.

###### 3. How to perform drivetime/walktime analysis

What this is: Drive and walk time catchment analysis involves analyzing trade areas based on the travel time it takes to get to the store location by either walking or driving. Business Intelligence and GIS solutions enable analysis by turning raw data into visualizations that are easy to interpret and gain insights from.

Why you need this for a catchment analysis: Drivetime and walktime trade area analyses are extremely important for understanding how easy it is for customers to get to your store. This paints a clear picture of how accessible your store is to your customers and is especially important if your business sells items that require people to have a vehicle (e.g., big items like furniture).

###### 4. How to perform mobility analysis

What this is: Mobility catchment area analysis involves analyzing trade areas based on mobility patterns data. This allows you to see your trade area based on the mobility patterns of customers - and potential customers - in the areas around your business. You can leverage this data to better understand your customers, their demographics, and their behavior.
Data visualization is made possible by using BI and GIS solutions.

Why you need this for a catchment analysis: Mobility trade analysis gives you deeper analytics than a standard buffer analysis, as it shows where people are actually going. By using mobility data, you can also derive demographics data such as age, income, education levels, and more. All of this gives you deeper insights to draw from for your decision-making process.

Understanding and analyzing your business's catchment area(s) is essential to harnessing the full potential of your location data, giving you an edge over your competitors, and allowing you to make better decisions about your store locations. You can then use this information to more effectively market to your customers, saving the money and effort of marketing to customers who are outside of your local area and won’t visit your stores.

SafeGraph has a variety of datasets with enriched point of interest (POI) data, empowering you to understand where your customers come from and how you stack up against competitors, and to manage your store locations effectively.

##### FAQs

1. What is the difference between a catchment area and a trade area?
   They are often used interchangeably in business analysis. Both refer to the geographic area from which a location draws customers.
2. How are catchment areas used in business?
   Businesses use catchment areas to analyze customer origins, assess market coverage, evaluate competition, improve site selection, and optimize marketing strategies.
3. What data is needed for catchment area analysis?
   Accurate POI data, mobility or foot traffic data, demographic data, and geographic boundaries are commonly used.
4. What is the most accurate method for defining a catchment area?
   Mobility-based catchment analysis is generally the most accurate because it reflects real customer movement patterns rather than assumed distance or travel time.
5. Are catchment areas fixed or flexible?
   They are flexible for business analysis and can change depending on distance, travel time, or behavioral patterns. In contrast, government or school catchment zones are often fixed.

#### What is GIS data and How Does SafeGraph Top the Chart?

##### Key Takeaways

- GIS data combines geographic coordinates with descriptive attributes to model the real world digitally.
- Reliable GIS analysis depends on accuracy, freshness, consistent schemas, and broad coverage.
- Many organizations struggle with outdated or incomplete GIS datasets that require heavy cleaning.
- SafeGraph focuses exclusively on high-quality places data designed for integration with GIS tools.
- Monthly updates, human validation, and Placekey support make SafeGraph well suited for large-scale GIS workflows.

Geographic Information Systems (GIS) data forms the backbone of modern location-based decision-making. From mapping and navigation to urban planning, retail strategy, and logistics, GIS data helps organizations understand what exists in the physical world and how locations relate to one another.

As businesses increasingly rely on location intelligence, the quality and structure of GIS data matter as much as the analysis itself.
In this blog, we will explain what GIS data is, how it is used, and why SafeGraph has become one of the trusted sources for high-quality places data that supports GIS workflows at scale.

##### What is GIS Data?

GIS data refers to information that is tied to specific geographic locations along with attributes that describe them. It combines geographic coordinates with information such as boundaries, categories, and identifiers, all enabling spatial visualization and analysis.

At its core, GIS data allows users to model the real world digitally and understand where something is located, what exists there, and how it relates to other places around it. Locations are represented as points, lines, and polygons, making it possible to map places, measure distance, analyze proximity, and better understand spatial relationships at scale.

The combination of spatial and descriptive information is what makes GIS data so powerful. It enables organizations to move beyond static maps to analysis that supports planning, forecasting, and real-world decision-making.

###### Core components of GIS Data

Most GIS datasets are built from a few foundational components.

Spatial data defines where something is located. This may appear as points, lines, or polygons that represent locations, roads, or physical boundaries.

Attribute data provides context about those spatial features. For example, a point on a map may represent a store, while the attribute data describes its brand, category, hours of operation, or opening date.

Data structure and format also matter. GIS data is typically organized as vector data, such as points or polygons, or raster data, such as gridded surfaces used for imagery or terrain.
Consistent schemas and well-defined geometry are essential to ensure that data layers align correctly and can be combined without extensive preprocessing.

Together, these components allow GIS systems to model the physical world in a way that supports analysis, visualization, and integration with other datasets.

##### How GIS Data Is Used Across Industries

Organizations use GIS data to support a wide range of applications, from mapping consumer access to services to evaluating competitive landscapes and infrastructure planning.

Retail and commercial real estate teams use GIS data to understand trade areas and site potential. Media and advertising teams rely on it to contextualize foot traffic and improve location-based planning and targeting. Governments and city planners also use GIS data to assess accessibility, zoning, and coverage of services.

One common requirement across all these use cases is reliable geographic content that accurately reflects the real world.

##### What Makes GIS Data Reliable for Better Decision-Making

Not all GIS data is equally reliable or useful. Accuracy, freshness, and consistency determine whether location-based insights can be trusted.

Outdated records, misaligned polygons, or incomplete attributes can undermine analysis and lead to poor-quality decision-making. Reliable GIS data requires precise geographic coordinates, clearly defined spatial boundaries, and consistent representation of places across regions. Freshness is equally important. The physical world changes constantly as businesses open, close, or rebrand, and outdated data can quickly undermine analysis.

Many organizations struggle with GIS datasets that look complete on the surface but require extensive cleaning, deduplication, or validation before they can be trusted.
When GIS data lacks accuracy or timeliness, downstream decisions, from mapping outputs to strategic planning, are put at risk.

This is where specialized data providers play a critical role.

##### How SafeGraph Supports Modern GIS Workflows

Modern GIS workflows depend on data that integrates cleanly into existing systems. Analysts and product teams need datasets that are structured, interoperable, and designed for scale.

Rather than building visualization tools or analytics platforms, SafeGraph focuses exclusively on sourcing, validating, and maintaining places data that integrates seamlessly into existing GIS environments. SafeGraph data is used by organizations working in mapping, real estate, media, retail, and other fields where location accuracy is non-negotiable.

This operational focus matters in practice. As Andy Stevens, Chief Data Officer, Clear Channel Europe, notes:

“We never expected the datasets to be turnkey with our existing systems and datasets from day one - there's always a bit of work to get it to where you need it to be. We have been blown away by SafeGraph’s proactiveness in helping us work through any issues we’ve encountered.”

SafeGraph data is designed to fit directly into GIS workflows, supporting mapping, geocoding, spatial hierarchy analysis, and integration with other location-based datasets.

This approach supports GIS teams working with complex pipelines, not just idealized use cases.

##### What Sets SafeGraph Apart in GIS Data

SafeGraph’s differentiation comes from focus and execution.

Many data providers attempt to balance places data with platforms, visualization tools, or professional services. SafeGraph does not.
Its sole focus is maintaining a trusted source of truth for physical places.

This focus enables SafeGraph to deliver:

- Monthly updates that reflect real-world changes
- Machine-learning-cleaned and human-verified POIs
- Detailed attributes and spatial hierarchy through Placekey
- Coverage that includes non-traditional locations often missed by other datasets

This commitment to data quality is consistently recognized by customers. As Nic Babb, VP of Product at Adomni, notes:

“We pored through spreadsheets to isolate categories and look for issues in the data, and SafeGraph was the clear winner. There was just so much weird, junky stuff in the other datasets.”

By prioritizing accuracy, transparency, and consistency, SafeGraph enables GIS teams to spend less time cleaning data and more time building with it.

##### Conclusion

GIS systems are only as effective as the data they rely on. Understanding what GIS data is and how it is structured is the first step, but choosing the right data partner matters as much as choosing the right tools.

For teams building GIS workflows that depend on reliable representations of the physical world, working with high-quality places data can significantly reduce operational overhead and improve confidence in decision-making.

Want to see what comprehensive, regularly refreshed POI data looks like in practice? Explore the full scope of SafeGraph Places and review the attributes available for your GIS workflows.

##### FAQs

1. What is GIS data used for?
   GIS data is used for mapping, navigation, urban planning, retail analysis, media planning, infrastructure assessments, and many other location-based applications.
2. What are the main components of GIS data?
   The core components include spatial data, attribute data, vector and raster formats, and structured schemas that allow layers to work together.
3. Why is data freshness important in GIS?
   The physical world changes constantly. Outdated GIS data can misrepresent locations, leading to flawed analysis and poor decisions.
4. How does SafeGraph differ from other GIS data providers?
   SafeGraph focuses solely on places data, offering monthly updates, human validation, consistent schemas, and coverage that includes hard-to-find locations.
5. Can SafeGraph data integrate with existing GIS tools?
   Yes. SafeGraph data integrates directly with platforms like Snowflake, Esri, and CARTO to support mapping, geocoding, and spatial analysis at scale.

#### What should your interview pass rate be?

What should the “pass rate” be when you interview someone for a job? The pass rate is just a simple measure of how often you move any one candidate to the next round. Let’s say you interview someone for a software engineering job, and you can give the candidate a “thumbs up” or a “thumbs down”. What should the “thumbs up” rate be?

The short answer is: it’s complicated.

If your pass rate is 1%, you will be spending all your time interviewing. You rarely think the person you interview would be a good eventual hire, which leads to more interviews with other candidates. It will take you forever to fill the position.
However, if your pass rate is 99%, you are likely being too selective about who gets an interview in the first place. You are ruling out people who could be great hires for your organization. And if your pass rate is 99%, what is the use in interviewing people in the first place? Almost every “candidate” is making it to the next round. Imagine a sports league where all the teams that competed against each other in the regular season make the playoffs. What was the point of the regular season? It was a waste of time. So is having a pass rate of 99%.

Pass rates should depend on how many people you’ve hired for a particular position in the past.

Imagine you’re a super software engineer, and you have hired hundreds of software engineers in your career. You should have a reasonably good idea of what you are looking for in a new hire. During the interview process, your rate of giving someone a “thumbs up” should be 30-50%.

If your pass rate is below 30-50%, you need to screen your candidates better.

If it is below that, you (or the colleagues before you) did not vet the candidates enough before you interviewed them. You may not have gone through their resumes closely enough, examined their portfolios in enough detail, etc. With a pass rate below 30-50%, interviewing is not a good use of your time. It is possible to screen for candidates who are right for the job, but you or your team are not doing it.

If your pass rate is above 50%, you need to take more risks.

If your pass rate is above 50%, you need to bring in more “risky” candidates: those who may not look exactly the part on paper, but who may bring a different perspective. This will simultaneously lower your pass rate and expose you to outliers who can change the dynamic of your company. This is important. The best candidate may not be the one who looks like a perfect fit on paper right now. But A-Players grow at a tremendous speed.
You’re better off finding an A-Player who can grow into the position (and eventually surpass it) than hiring a C-Player who is a perfect fit right now. With a pass rate of more than 50%, you are not taking enough risks within your interview process.

A simple rule is that the more familiar you are with the position, the higher the pass rate.

That makes sense. The more you hire for a position, the better you should be at identifying the right background in candidates. And if you’re bringing in the right candidates, more of them should make it to the second round, giving you a higher pass rate. You should also already have systems in place for this kind of hire, be it skill-testing questions, programming questions, etc. The more familiar you are with the process, the more you know about the type of candidate you want in the room.

But now, let’s say you are interviewing for a position that you don’t have a lot of experience hiring for. You are a new founder and have a lot of experience hiring engineers, but now you need to hire salespeople. This is trickier. In this case, you might want to broaden your search (be less selective) and talk with a broader set of people. The irony is that by interviewing a full spectrum of people, you’ll learn a lot about the specific kind of salesperson you want on your team. You’ll also learn some stuff about sales you probably didn’t know before. This wide-net approach means you will have a significantly lower pass rate (maybe it gets as low as 10%) because you’ll be meeting all different kinds of salespeople with diverse backgrounds, and many of them won’t be the right fit.

It’s kind of like eating at a buffet. If you don’t know exactly what you want to eat, you pile a bunch of different things onto your plate and then really decide what you want once you’re back at your table and have nibbled on a few.

So the less familiar you are with the position, the lower the pass rate.
CEOs spend a lot of time recruiting for positions they have little experience in.‍ One of the reasons CEOs spend so much time interviewing is that almost anytime they are recruiting someone who reports to them, it is more of a one-off hire. It’s not a repeatable process that can be optimized. You hire only one CTO while you might hire 100 engineers. You hire only one CRO while you might hire 100 salespeople. So, CEOs need to see lots and lots of candidates early in every process. The more people they see, the more they’ll learn about precisely the kind of person they want for the “one-off” (and incredibly important) hire. For these interviews, their pass rate should be very low because they see a TON of candidates and are not sure what they want. It also helps to have a support system (excellent management team, industry peers, etc.) who can provide some level of feedback on what your ideal hire looks like. It’s hard operating and hiring in a vacuum. Try and get as much feedback as you can from people you trust.‍ So, what should the “pass rate” be when you interview someone for a job? ‍ It depends. The more you’ve hired for a role in the past, the higher the pass rate (30-50%), and the less you’ve hired for a position in the past, the lower the pass rate (10-30%). If you feel something is “off” in your interview process, the pass rate can be a good indicator of where the problem is. Track it, adjust it, and play with it depending on the role. Like every company, you’re only as good as the people you hire. SafeGraph team is hiring! ‍ If you are interested in a career at SafeGraph, please join us! Special thanks to Thomas Waschenfelder for his help and edits. #### Where Should Machines Go To Learn? If we want to massively accelerate artificial intelligence and improve human lives, we need to democratize access to data.This post was co-authored by Auren Hoffman and Ryan Fox Squire.Past civilizations built grand libraries to organize the world’s knowledge. 
These repositories of information focused on cataloging, aggregating, organizing, and making information accessible so that others could focus on learning and creating new knowledge.AI and machine learning systems also need repositories of information from which to learn — and right now everyone is building their own. If different groups of people focus on organizing data versus building AI, the progress of intelligent computers will massively accelerate.Computers are not learning fast enough.Despite all the progress in machine learning (ML), most of our computers (and their applications) leave much to be desired. In her essay Rise of the Data Natives Data Scientist Monica Rogati explains:There are frustrating times ahead…autocomplete and autocorrect are everywhere — and we make fun of them when they don’t work. We’re frustrated when our GPS…shows us a restaurant 1,000 miles away…If we haven’t yet fixed the small things, how can we be trusted with the innovations that would really enhance all our lives?When it comes to AI improving human lives, the pace of progress has been super slow.Big Data is MUCH better.Many state of the art AI and ML applications would be dramatically improved with more training data. This is hugely important. Google is one of the best AI and ML companies in the world. Why? Peter Norvig, Research Director at Google, famously stated that “We don’t have better algorithms. We just have more data.”The accuracy got way better as they increased the training data set.Michele Banko and Eric Brill at Microsoft Research wrote a now famous paper in support of this idea. They found that very large data sets can improve even the worst machine learning algorithms. The worst algorithm became the best algorithm once they increased the amount of data by multiple orders of magnitude.Machine learning and artificial intelligence is fueled by data.Good, high quality data serve as truth sets about the world from which our machines can learn. 
And most AI and ML is likely under performing because getting access to great truth sets is very hard.The Problem: Organizing the past at scale is hard.Organizing the past at the scale required by AI and ML is hard. Before a company can actually work on their AI or ML they need to solve four key challenges:Source: xkcd1) Acquire the data2) Host the data3) Prepare the data4) Understand data privacyTo understand the difficulty, let’s briefly discuss what each of these challenges entail.Challenge #1: Acquiring Data - companies need access to great truth setsThis is where companies like Google and Facebook have a huge advantage — their businesses generate treasure troves of data.Nobody taught Google what a cat was, but with tons of data it taught itselfFor example, to build a neural network that could recognize human faces and cat faces, Google used 10 million YouTube videos. That is a data set to which literally no one else in the world had access. Four years later (just a few months ago in the Fall of 2016) Google released a large-scale dataset of labeled photos and videos to help the machine learning community — because they recognize how valuable this dataset is for everyone else.But if you are a start-up doing image classification or a new self-driving car company (like Oliver Cameron), or a new search/query technology (see Daniel Tunkelang), or trying to revolutionize health care (see Jeremy Howard), or any other valuable AI application … even if you have $800 million in funding … you have very little data.An under appreciated fact: @google is doing what no major AI company is doing -- sharing massive datasets, and models pretrained on them.— Delip Rao (@deliprao) September 30, 2016 You might need to contact hundreds of companies and negotiate major business development (BD) deals to try to license data. It will take a huge effort — sometimes many years — and a lot of money. 
You might need to spend tons of engineering time and person hours and technological innovations to aggregate and organize and label data from open sources.‍For Uber to work on self-driving cars it announced a $500 million initiative to make maps (organizing the past). $500 million!!!If companies can’t figure out how to get truth sets, they can’t build smart machines.Challenge #2: Host the DataLet’s assume you have access to a great truth set — you also need to stand it up (host it in some way that your data scientists and engineers can work with it). Because the best datasets are very large, you need a cloud infrastructure and distributed processing technologies and your share of hot-shot, high-priced, back-end data engineers to make the data queryable and actionable.Your best engineers spend most of their time just managing your big data infrastructure, pipelines, and query layers.Challenge #3 Prepare the DataThen, even the best sources — even data you generate yourself — will be dirty. Your data will have errors, typos, mislabels, holes and need tons of cleaning. Your ML engineers and data scientists will spend most of their time just getting your data ready to use. It’s a cliche: data scientists spend 80% of their time just preparing the data. And this is the least enjoyable part of their job.data scientists spend most of their time massaging rather than mining or modeling data https://t.co/yAF5qA4Dzr— Auren 𝐇𝐨𝐟𝐟𝐦𝐚𝐧 (@auren) November 26, 2016 The size of this problem has led some to joke that the most impactful applications of AI would be to help data scientists clean their data faster.I'm starting to think that the best, first application of AI should be: data munging.— Michael E. 
Driscoll (@medriscoll) May 22, 2015 If the process of building smart computers itself creates new human problems for smart computers to solve — then we can see how progress will be slow.Challenge #4 Understand Data PrivacyIf you are trying to solve the most important questions of society, you probably are working with data about people. That means you need to become an expert on privacy and protecting consumer data. This requires significant ethical and legal sophistication.You must understand the evolving regulatory landscape of data privacy. This includes the implications of FCC rulings in the United States and the significance of GDPR in the EU and much more. It requires understanding evolving definitions of PII (personally identifiable information).Protecting personal privacy and developing next-generation AI are essential and mutually inclusive — but without the right expertise you can fail at both.The smartest people in data are spending too much time organizing the past.Our point is this: Almost all the super-smart data people want to focus on building AI and machine learning applications to improve human lives. They want to use data to make decisions and predictions about the future and power incredible new technologies — like self-driving cars or super-human medical diagnoses or global economic forecasts. This is great. But instead, they are spending tons of time organizing the past: acquiring data, hosting data, preparing data, and navigating data privacy.Why does every company that wants to work on AI or ML (or simply wants to incorporate AI and ML into their products) need to reinvent the wheel and develop time-consuming expertise that has little to do with their core business?An alternative: focus on your strengths and rent your data.David Ricardo’s classic economic theory of comparative advantage boils down to this: focus on your strengths, trade with others, and everybody wins. 
Organizing the past and predicting the future are different kinds of expertise — you shouldn’t have to master the former to contribute to the latter. In fact, requiring everyone to master all of these domains is bad for AI and ML as a field because progress will be rate-limited to the handful of big players with the resources to do it.Just like internet companies use Amazon Web Services to rent access to hardware, ML and AI companies should rent access to data. Innovators should focus on applying AI and ML to their domains of expertise (cancer, robotics, self-driving cars, economics, etc.). They should rely on other companies with different kinds of expertise to acquire the data, to build appropriate infrastructure, to clean the data, make it easy to work with, and to protect consumers’ privacy.Democratize access to data.If people (and companies) focus on their strengths, then some will organize the past and others will predict the future. The barrier to start working on AI and ML will be dramatically lowered. Access to data will be democratized. If we focus on our strengths, the pace of innovation in AI and ML will massively accelerate.This piece was authored by Auren Hoffman and Ryan Fox Squire. Auren is CEO of SafeGraph and former CEO of LiveRamp. Ryan is Product Manager at SafeGraph and former Data Scientist at Lumos Labs.If you found this valuable — please recommend and share this post.Special thanks to inspirations: Oliver Cameron, Michael E. Driscoll, Anthony Goldbloom, Brett Hurt, John Lilly, Hilary Mason, dj patil, Delip Rao, Joseph SmarrJoin SafeGraph: We’re bringing together a world-class team, see open positions. #### Why “Exit Transparency” Can Make Companies Stronger Exit Transparency is a deal all companies and employees should make … and live up to‍ Eventually, all great things must come to an end. Your best-performing employee will eventually leave, either to start their own business or work for another company. 
Since the great recession, the number of employees voluntarily leaving their jobs has risen steadily. Source: https://www.wsj.com/articles/in-this-economy-quitters-are-winning-1530702001 These moments are riddled with emotion. Some might even cry when a team member breaks free. But what’s critical is that you treat them well on the way out regardless of who is making the change. One of the best things one can do to manage exits is to have a blanket, well-defined “Exit Transparency” with all their direct reports. Exit Transparency is a deal made between an employer and the employee at the time of hiring. The deal is built on a bedrock of honesty on the way in and honesty on the way out. In practical terms, it means that an employee promises to tell their employer before they start actively looking for a job. They agree to not sneak around when doing a career search, but to be upfront about it instead. In return, the employer agrees to allow the employee to be employed during the time they are making the transition and to actively support them during the interview process. If you’ve ever changed jobs, having the support of your employer as you make the transition is a massive deal. The employee can drop all pretense and obfuscation, and simply do the best they can for the company in their final weeks. The employee can also better optimize their search for a new role. The employer also agrees to never go behind the employee’s back to look for a replacement. For example, if the company is looking to hire a CFO, it would tell the current Director of Finance what it’s doing and why it’s doing it. It would even tell them if they were no longer needed after the CFO is hired, being very clear about the reasons why. ‍ Two-way respect‍ We’d all like to believe that these simple acts of integrity are already carved into American business. They are not. 
Exit Transparency builds the infrastructure to support what we should all be doing anyway: treating our employees with the respect we expect in return. We should not treat employment like a marriage. Most marriages actually do last a lifetime (the U.S. divorce rate is dropping fast and already is less than 30%). But 99% of employment relationships eventually end and we should be OK talking about that and make sure that they end on good terms. Photo by Adeolu Eletu on Unsplash ‍Transparency in actions and building trust. ‍ Exit Transparency builds trust between the employer and employee from day one. When a company outlines the parameters of their Exit Transparency, it immediately shows the employee three things: The company recognizes reality. It has a long term outlook. It respects the employee and their agency to find other opportunities. It may seem counterintuitive that outlining the procedures for a new employee’s exit on the first day builds trust, but it does. The transparency encourages reciprocation. If the employee senses that the employer is being honest and direct in their exit strategy at the time of hiring, as well as during the exit, they are likely to reciprocate this attitude. This is something we implement at SafeGraph and I’ve seen it work. It’s not always perfect and we are always working to make it better. But in many cases, it has eased the transition for both the employee and the company. Exit Transparency is a positive-sum agreement. ‍ Exit Transparency is mutually beneficial to the employer and the employee. It creates the path to a positive-sum outcome that will occur under difficult circumstances. It’s harder now than it has been in the past to build positive-sum agreements between employers and employees in tech. There are no life-long employment contracts anymore. There are no guarantees. The technology space is filled with disruptors, and young companies either iterate or fail. 
The world where a company lists employee security as a core value has long passed. The uncomfortable truth is that most employees can be let go at any time for almost any reason. At the same time, it’s increasingly hard for companies to keep and nurture their 10x employees. These people are connected to the world in a way that provides them almost unlimited opportunities should they choose to pursue them. Exit Transparency is a critical way to recognize and plan for this. And there’s a genius in it because the deal is made under positive and transparent circumstances, during the employee’s first few days. This is when the relationship is filled with hope and opportunity. It’s the ideal time to make a positive-sum rainy-day agreement grounded in reality. “Technology is nothing. What’s important is that you have a faith in people, that they’re basically good and smart, and if you give them tools, they’ll do wonderful things with them.” - Steve Jobs ‍Your company has alumni: the relationship isn’t over once the employee leaves the company. The LiveRamp Mafia (and company alumni networks) https://t.co/4pCi05lmMa pic.twitter.com/nKkf00LmsZ — Auren 𝐇𝐨𝐟𝐟𝐦𝐚𝐧 (@auren) January 6, 2019   Exit Transparency views employees as future alumni. While the company handles the employer/employee relationship during the course of employment, Exit Transparency helps to structure the employer/alumni relationship (see the LiveRamp Mafia). Reid Hoffman (no relation), Ben Casnocha, and Chris Yeh write in their book, The Alliance, about this ideal lifetime relationship between employer and alumni. They argue that very few companies have put into place a strategy for maintaining (and leveraging) the company’s alumni network. 
They write: “Establishing a corporate alumni network, which requires relatively little investment, is the next logical step in maintaining a relationship of mutual trust, mutual investment, and mutual benefit in an era where lifetime employment is no longer the norm.” Exit Transparency begins to build this positive-sum network. In an article called “Tours of Duty: The New Employer-Employee Compact,” Hoffman, Casnocha, and Yeh write that the exit interview is an underutilized opportunity to build this post-employment relationship with entrepreneurial employees. In addition to gathering a personal email address, Twitter handle, LinkedIn profile, phone number, etc. to add to an internal database, employers should use this time to build trust. The authors write: “The exit interview is also a trust-building opportunity. Many employees have sat through grimly polite or even resentful parting talks. You can make your company stand out by emphasizing the ongoing nature of the relationship.” One of my personal goals at SafeGraph is that we do a better job of exit interviews and supporting our alumni in 2020. Even though we are a small company, we already have some extremely talented alumni and we can do more to support them. Photo by Cytonn Photography on Unsplash After the employee leaves on amicable terms, the employer can also invest purposefully in the alumni network. They can provide alumni benefits, host alumni events, as well as provide quality references and career support. In return, the authors in Alliance argue that the company will be able to leverage the alumni network in 4 different ways: candidate referrals (hiring) network intelligence customer referrals brand ambassadors Companies like McKinsey have done a great job promoting their alumni network. With the speed at which start-ups innovate, a small advantage in these areas can compound into a large one over time. 
Exit Transparency is a competitive advantage.‍ This agreement sets the stage for a company 3, 5, and 10 years down the line. It allows it to leverage talented, entrepreneurial alumni even though they’re no longer on the payroll. And that’s a competitive advantage - especially in an event that often leaves behind a sour taste. If those ex-employees become assets and not liabilities, the company is in a better position for it. With Exit Transparency, what looks like an exit is actually an entrance into a new, positive-sum relationship. And over the very long term, for a company on the very edge, that can be the difference between shipping the product or closing up shop. Special thanks to Thomas Waschenfelder for his help and edits.   #### Why Airline Loyalty Programs Are Really Just FinTech Companies: World of DaaS interview with Author of The Diff, Byrne Hobart New podcast with Byrne Hobart, Author of The Diff. Our conversation is available everywhere (Apple Podcasts, Spotify, YouTube, etc.). Please subscribe, follow, and review.‍ Byrne Hobart is the author of The Diff, a substack newsletter covering inflection points in finance and technology. I'm a huge reader and fan myself. Last year, Byrne wrote an article about airline loyalty points, which I found really fascinating. Byrne explains how airline loyalty programs have saved the airline industry and drive the majority of their value. Here are some highlights from my conversation with Byrne Hobart. Airline loyalty programs are more valuable than the actual airline‍ Loyalty programs are mostly recession proof. If you look at how much money the loyalty programs make, and you look at what kind of valuation would traditionally be ascribed to a company that grows at X percent a year, is pretty recession resistant and has cash operating cash flow Y, you get to a valuation that is actually in excess of the market cap of the airline. 
The actual airlines are loss leaders to help the FinTech side, FKA loyalty programs

FinTech companies grow when they have preferential access to really valuable customers. Airlines take a hit on their services to help their loyalty programs grow. The entire business of actually flying people really fast from place to place and allowing humans to transcend our earthly bounds actually has negative value and destroys more than 100% of the equity capital you put into it.

Loyalty programs were the airlines’ lifeline during the pandemic

During the pandemic, there was a significant drop in travel. At that time, the airlines pointed to their loyalty programs as proof that they had a solid business with loyal customers, one they could borrow against. This helped them lower their interest rates and stay afloat.

Airlines are both monopolies and commoditized businesses

At one level, airlines are a network effects business that has these monopolistic characteristics. At another level, it is a totally commoditized business with brutal competition. It cycles back and forth through these states over time. It also cycles back and forth depending on location. It is more commoditized in a city that is served by a lot of different airlines. In hubs with limited options, you’ll see more of a monopoly. So individual airlines are always trying to extend to parts of the map where they can have a lot of pricing power.

Signing up people for credit cards is really difficult, but also incredibly lucrative

Growing a financial services company is brutally difficult. The growth constraint is dominated by customer acquisition costs. How do we know this? It's literally worth buying planes, flying people around the country, having a heavily unionized workforce and all these safety regulations... to cost-effectively sign people up for credit cards.
Hope you enjoy this episode of World of DaaS. I would really appreciate it if you subscribe and review (Apple Podcasts, Spotify, YouTube, etc.).

#### Why Data Standards Matter

Key Takeaways

- Data becomes significantly more valuable when it can be linked to other datasets through shared join keys.
- Standards succeed through adoption, not perfection. “Good enough” and widely used beats technically flawless.
- Effective standards are SIMPLE: storable, immutable, meticulous, portable, low-cost, and established.
- Open, low-cost standards lower barriers to adoption and accelerate ecosystem growth.
- Lasting standards act as platforms that unlock value for entire industries, not just their creators.

Learn more about running a data business on our new podcast, World of DaaS.

DaaS Bible 2.0: How Standards Increase Data Flow and Benefit Everyone

Note: Will Lansing is the CEO of FICO and Auren Hoffman is the CEO of SafeGraph. This piece is a follow-up to Auren’s 2019 piece: The Data-As-A-Service Bible -- the most widely read piece on the business of selling data.

Data is all the rage these days. Yes, we’ve heard that it is the new oil.

But a single dataset on its own has limited value. The real value from data comes from connecting it across multiple disparate datasets.
And to accelerate the connecting of data, it is really helpful if data producers and data consumers agree on a common standard.

In this piece, we will dive into:

- How to make data more valuable
- What makes a good standard
- What standards have worked well in the past
- How new standards in the future can accelerate collaboration around data

If you want to stop reading right now, the tl;dr is:

- Linking data to other data makes all the data more valuable
- Standards (also known as join keys) are the most valuable ways to link data together
- Good standards are platforms that create value for everyone (because everyone uses the standard)
- Successful standards have some common traits both in product design and go-to-market execution
- Perfect is the enemy of the standard -- it is better to focus on something that is good enough
- Metcalfe's Law also applies to standards: the value of a standard grows roughly with the square of its adoption
- Non-openness and collecting rents impede the success of a standard, because they impede adoption
- Standards should be SIMPLE

The easiest way to increase data’s value is by linking it together.

Metcalfe’s Law shows that the value of a network grows in proportion to the square of the number of nodes in the network. We all understand that intuitively: a telephone is not very valuable if you can only call yourself. The reason messaging systems like WhatsApp are super useful is that a lot of other people also use WhatsApp.

Source: Metcalfe's Law - Wikipedia

What most people don’t realize is that Metcalfe’s Law applies to data too. The more connected a dataset is to other data elements, the more valuable it is. And the easier it is to link your data, the more valuable it becomes. From the DaaS Bible:

The reason for this is simple: data is only as useful as the questions it can help answer. Joining, linking, and graphing datasets together allows one to ask more and different kinds of questions.

No one company or organization has a monopoly on data.
Even mighty Amazon knows less than 0.1% of what is happening in the economy. Even the Internal Revenue Service and the Federal Reserve have limited insights into the world. And even Google, which has access to more data about people than any other company in the world, still has mostly incomplete information.

So, to truly understand something, you need to bring together data from as many different sources as you can. The days of doing a psychology study on 18 undergrads and attempting to generalize to the broader world are long gone. Large data sets are out there, and to make them valuable, you need to link them together with join keys.

Join keys are the secret to connecting datasets.

Join keys are really valuable. They are just simple connectors that make it super easy to take many different datasets and bring them together.

If you are an investor and you are trying to value a dataset, the easiest thing you can do is first count how many join keys the dataset contains that allow end-users to bring in additional data.

By definition, join keys are derived. They’re also fairly simple and, as a consequence, imperfect. Join keys get their power not from solving every problem or working on every use case, but from the fact that they are used by many other organizations. Remember Metcalfe's Law? The more organizations that use the join key, the more valuable it is.

Data Is Most Powerful When It’s Standardized.

Case study: Unix time as a standard.

One great join key is time. Unix time (like other standards such as UTC) takes time zones out of the picture, so that events taking place at the exact same moment in Lagos, Moscow, and Sydney are represented by the same value.

Unix time is a standard convention around time, but it’s not perfect. Unix time might say it is Tuesday when it is actually Monday in San Francisco -- so things can be confusing.

Unix time is represented by a simple integer that is the number of seconds since January 1, 1970.
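To make this concrete, here is a minimal Python sketch of Unix time acting as a join key. The city offsets are fixed UTC offsets picked for illustration (real time zones shift with daylight saving), and the two small datasets, `store_visits` and `card_spend`, are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# One instant, expressed as Unix time (seconds since 1970-01-01 00:00:00 UTC).
ts = 1_700_000_000

# Fixed UTC offsets, assumed for illustration only.
offsets = {"Lagos": 1, "Moscow": 3, "Sydney": 11, "San Francisco": -8}

# The same instant rendered on four different wall clocks.
local_times = {
    city: datetime.fromtimestamp(ts, tz=timezone(timedelta(hours=hours)))
    for city, hours in offsets.items()
}

for city, dt in local_times.items():
    # The weekday and hour differ by city; the Unix timestamp does not.
    print(f"{city:14} {dt:%A %Y-%m-%d %H:%M}  unix={int(dt.timestamp())}")

# Two hypothetical datasets from different sources, both keyed by Unix time.
store_visits = {1_700_000_000: 42, 1_700_003_600: 57}
card_spend = {1_700_000_000: 1310.50, 1_700_003_600: 1720.25}

# Joining is a lookup on the shared integer key.
joined = {
    t: {"visits": store_visits[t], "spend": card_spend[t]}
    for t in store_visits.keys() & card_spend.keys()
}

# Arithmetic on timestamps is plain subtraction of seconds.
elapsed_seconds = 1_700_003_600 - 1_700_000_000

# Instants before 1970 become negative numbers.
moon_landing = datetime(1969, 7, 20, 20, 17, tzinfo=timezone.utc).timestamp()
```

Every city prints a different local day and hour but the same `unix` value, which is why the integer works as a join key across sources that record local time differently.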
Forgive us, but weren’t there significant events before 1970? (one of the co-authors of this piece was actually born before 1970 … we will let you guess which one). Do we really want a standard that represents everything before 1970 in hard-to-use negative numbers? The answer is yes -- because the perfect is the enemy of the standard.Unix time’s main power is that it is accepted as the convention to measure time. This means applications and computers all over the world can easily share and receive information about time. And Unix time is only a small improvement over the previous standards (like Greenwich Mean Time or GMT).One of the nice things about Unix time is that it can be represented as a string of numbers -- which means it can very easily be stored in a database and running calculations on it is simple math of adding or subtracting seconds.To reiterate, the power of Unix time is that everyone else uses it. Yes, it is clever. Yes, it is simple. But its widespread adoption as the standard is what makes it useful.Case study: the Meter and measuring distanceA long time ago, one measured how big their farm was by taking steps. This was obviously an imperfect way of representing the size of one’s lot or the distance between two cities … but it was accepted and it (mostly) worked.Today we have the meter as a standard.Developed in post-revolutionary France in the 1790s, the meter has conquered the world (at least everywhere except the United States) as the standard for measurement.Like all standards, the meter isn’t perfect. Why should the meter be the length it is? Would it be more practical for a “meter” to be larger or smaller? Yes, of course. But remember, the perfect is the enemy of the standard.The meter is clever. It can be easily subdivided (centimeters, etc) and expanded (kilometers, etc.) 
which is why the metric system has taken over science from the English system (we still can never remember how many feet are in a mile).But the cleverness of the meter is only a small part of its success. Its main reason for success is that everyone else has adopted it. If you are a business selling and buying materials across the world, it is really helpful that everyone uses the same system for measuring length. The more organizations who use the same standard, the more useful that standard becomes. Again, it’s Metcalfe’s Law in action.The meter is successful because enough people think it will be successful. All great standards are recursive. (of course, it can’t hurt to have the reigning emperor of Europe, Monsieur Bonaparte, as your cheerleader)To standardize a data set, it helps to be Free, Open, and Usable.One of the advantages of Unix time and the meter is that the standards are free and open. In fact, it is MUCH easier for something to become a standard if it is free and open because the barrier to adoption is low.It is doubtful the meter would have taken over the world if Napoleon decided to charge a small tax every time the meter was used.It is also easier for a standard to be adopted if it is locally storable under a simple license.Some data might seem open but there can be a hidden tax that can impede wide adoption. Data licenses like ODbL (Open Data Commons Open Database License) force people using the data to contribute back to the community. While that is great in many cases, many commercial entities will be wary of mixing in ODbL data with their proprietary data. Imagine if the meter was only offered in an ODbL license -- every time your doctor wanted to record your height, she’d have to also send it to a central “meter foundation.”A better open-source license for standards is the MIT license which allows commercial and non-commercial entities to use, store, and develop on the project without contributing back to the initiative. 
Of course, contributions are very much appreciated … but making contributions mandatory impedes the success of a standard. Again, the perfect is the enemy of the standard.Case study: FICO® Score as standardized data.The FICO® Score has become a standard to measure the overall likelihood that someone will repay a loan. The FICO Score is used by over 90% of top US lenders when making lending decisions.Typically, the higher your score, the lower the risk and the more likely creditors are to lend to you.The nice thing about the FICO Score is that it is simple and storable. It is a three-digit number and it is easy for both a human and a computer to understand. Someone with a 550 score is higher risk to lenders than a person with 760.Of course, the FICO Score is far from perfect. Two people with the exact same score may end up having differences in loan repayment.Another way the FICO Score isn’t perfect is that it is not free. Lenders need to pay to get the FICO Score. Charging for a standard can often impede the chance that the measurement becomes a standard (because to be a standard it must be widely adopted). The flip side of charging for the standard is that one can use that revenue to continually update it and make it better.Unix time has gotten a wee bit better over the last 40 years (a few leap seconds have been added here and there). In contrast, the FICO Score has hundreds of people working on improving it every year.So like all standards, the FICO Score isn’t perfect. But remember, perfect is the enemy of the standard. The FICO Score is a very good predictor of a person’s ability to repay a loan. The fact that almost every consumer, bank and credit institution understands and uses the FICO Score makes it much easier for the economy to function, because every party on all sides of every contract are speaking the same language. Re-enter Metcalfe’s Law.Standards unlock massive value for the networks that use them. 
Standards are really important because they create a common language to foster communication. If everyone spoke a different language, we’d never get anything done. Standards are both the glue that connects datasets together and the grease that makes data flow between organizations. One of the advantages of Unix time is that it is both a standard and also a useful join key. Let’s say we want to join the stock prices of Tencent and Microsoft and see how correlated the two stocks are immediately after news breaks. Unix time allows us to join the stock data together based on time even though the two stocks are traded in different jurisdictions, adding huge value to both jurisdictions. Another standardized join key that is super useful is the U.S. dollar. While prices on different exchanges around the world are often listed in the national currency, they can be easily compared by converting them to the U.S. dollar. Yes, one could use a different currency (or even the price of gold or bitcoin at the time), but a standard is just a convention that we all agree to use. The dollar is not necessarily a better measure than another currency or store of value, but it is the agreed-upon standard that we all use. As we have stated before: the perfect is always the enemy of the standard. Language itself can be standardized as well. The World Economic Forum brings together leaders from all over the world every January in Davos. These leaders all speak different languages, but the Davos gatherings are in English. 
And no, that’s not perfect -- not everyone who attends speaks or understands English … but it is an agreed-upon standard because it is good enough and unlocks value for most attendees.Standards unlock value in data in three key ways:Enables understanding -- use of standards promote common and clear meanings for dataDemocratizes access and availability -- standards make the exchange, interpretation and integration of data easier and more efficientIncreases use --> which drives access --> which in turn drives more use/reuse of data; more the data is used, the more valuable it becomesStandards accelerate collaboration around dataThe easier it is to join data, the more data will be transacted, moved, and used.Because it is so easy to join data on price (the dollar is a common-enough measure), it becomes easier and easier to join data about prices.But let’s say there is a world where people get paid in Bitcoin but they buy homes with platinum … you’d want to make sure these measures of value were joined before you did your analysis about how correlated home prices were to income. The join key (right now we use dollars) becomes really important for any type of correlation or relationship across datasets.As we said earlier:The more connected a dataset is to other data elements, the more valuable it is. And the easier it is to link your data, the more valuable it becomes. The reason for this is simple: data is only as useful as the questions it can help answer. Joining, linking, and graphing datasets together allows one to ask more and different kinds of questions.Even the simplest questions may have very complicated operations to answer. For example, let’s say we wanted to understand the global price consumers spent on milk over time. We will have to use multiple join keys just for this one elementary analysis. First we’d need to join on a measure of price (like the dollar). 
Then we have to choose what version of the dollar we are joining on (like the inflation adjusted dollar on Jan 1, 2010). Then we have to understand the rate of inflation we use and if we want to change it in each country. Then we have to understand the measure of milk that we are using (like the U.S. uses gallons but much of the rest of the world uses liters). Then we need to understand what kind of milk we are talking about (like in many countries, milk is not pasteurized and might not last as long). And we will need to have an understanding of what type of consumer we are looking at (we might want to discount the Brooklyn hipster that only buys the artisanal fully organic milk where the farmer reads daily bedtime stories to the cows).The easier it is to join the data, the more the data will be joined … and the more the data will eventually be used. One of the biggest reasons that most academic papers are written with only 1 or 2 datasets isn’t because of the difficulty in acquiring data (though that is certainly one of the issues), it is that it is so incredibly difficult to join disparate slices of data.What makes a great standard?The very best standards act as join keys that unlock data in multiple datasets.From the DaaS Bible:If the value of Dataset A is X and the value of Dataset B is Y, the value of joining the two datasets is a lot more than X+Y. Because the market for data is still very small, the value isn’t X*Y yet … but it is possible it will approach that in the future.Data becomes much more valuable the more additional datasets it can be joined to. And no, data owners don’t need to make money off of those other datasets -- those other datasets make your data better. As stated in the DaaS Bible:This is the #1 thing that most people who work at data companies do not understand. Most people think that they need to hoard the data. But the data increases in value if it can be combined with other interesting datasets. 
So you should do everything you can to help your customers combine your data with other data. One way to make data easy to combine is to purposely think about linking it — essentially creating a foreign key for other datasets.Joining your data to other datasets is what makes your data more valuable … and it makes sense to spend a lot of time investing in join keys.The best join key standards are SIMPLE.The SIMPLE acronym for data companies helps guide the creation of a universal identifier that is:Storable. You should be able to store the ID offline. For instance, I know my SSN and my payroll system stores my SSN.Immutable. It should not change over time. An SSN on a person is usually the same from birth until death (except if you enter the witness protection program).Meticulous (high precision). The same entity in two different systems should resolve to the same ID. It should be very difficult for someone to claim they have a different SSN.Portable. I can easily move my SSN from one payroll system to another.Low-cost. The ID needs to be cheap (or even free). If it is too expensive, the transaction costs will make it hard to use in many situations. The SSN itself has no cost.Established (high recall). It needs to cover almost all of its subjects. An SSN covers basically every American taxpayer (and more).6/ Your data will be much more valuable if you enable it to be joined with other datasets (even if you make no money off the other datasets). This is the #1 thing that most people who work at data companies do not understand.Build join keys into the data:— Auren 𝐇𝐨𝐟𝐟𝐦𝐚𝐧 (@auren) June 18, 2019One example: the Placekey is a join key that has a common identifier for all physical places. Prior to the Placekey, it took a very sophisticated engineering team to join data on a postal address. The Placekey is a simple string that can easily be joined. It is SIMPLE, free, and open. 
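As a rough illustration of why a shared identifier beats a postal address as a join key, here is a small Python sketch. The records, field names, and the `place_id` values are all invented for this example; the IDs are placeholders and do not follow the real Placekey format.

```python
from typing import Dict, List

Row = Dict[str, object]

# Hypothetical data from two providers describing the same two places.
# Addresses are spelled differently; the shared ID is identical.
foot_traffic: List[Row] = [
    {"place_id": "id-001", "address": "1 Main St.", "weekly_visits": 420},
    {"place_id": "id-002", "address": "22 Oak Avenue", "weekly_visits": 95},
]
store_revenue: List[Row] = [
    {"place_id": "id-001", "address": "1 MAIN STREET", "weekly_revenue": 18000},
    {"place_id": "id-002", "address": "22 Oak Ave", "weekly_revenue": 4100},
]


def inner_join(left: List[Row], right: List[Row], key: str) -> List[Row]:
    """Join two lists of rows on an exact match of `key` (an inner join)."""
    index = {row[key]: row for row in right}
    return [{**row, **index[row[key]]} for row in left if row[key] in index]


# Joining on raw address strings finds nothing, because no two spellings match.
by_address = inner_join(foot_traffic, store_revenue, "address")

# Joining on the shared identifier matches every place with one exact comparison.
by_id = inner_join(foot_traffic, store_revenue, "place_id")
```

Making the address join work would require normalizing abbreviations, casing, and punctuation first, which is the engineering effort a simple, storable ID is meant to remove.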
All companies that sell geospatial data, like SafeGraph, benefit when the data is easier to consume. All companies that consume geospatial data (like Esri, Carto, Mapbox, Unfolded, Apple, Twitter, Microsoft, etc.) benefit when data is easier to access.Some ideas on creating a standard in your industry.If a standard does not already exist in your industry, it might be a good idea to help create one. Here are some ideas around building a standard:Your standard should lift all boats.The definition of a standard is that it lifts all boats. Remember how the U.S. dollar added value to all the other national currencies by creating a standardized comparison tool for all of them? That’s the goal of any standard you create. It should help everyone in the community.Even companies can be standards: if they put their customers firstInsurance Services Office (ISO) (which is now a division of Verisk) was started in the 1970s to help insurance companies better underwrite and combat fraud. It was a data co-op that benefitted all insurance companies and very quickly made the entire industry more streamlined and profitable. While ISO is for-profit and charged for its services, it lifted all the boats in the insurance industry by creating a common standard.Visa is another example of this. After Visa was created (spun out of Bank of America in 1970), it operated as a not-for-profit organization for many decades. Dee Hock, Visa’s trailblazing CEO, passionately promoted Visa’s neutral status that allowed it to become a payments standard and help its thousands of partner banks. Today the Visa standard of payment rails powers trillions of dollars in transactions … and that would never have happened if it did not put its customers first.Open-source first companies are also an example of creating the product as a standard. Red Hat was one of the true pioneers in creating a for-profit company around an open-sourced standard (in this case, LINUX). 
Other notable companies include Databricks, Cloudera, Confluent, and many others.Your standard should be low-cost.One of the best ways to FAIL at creating a standard is to try to take too many rents or make it proprietary. Yes, there are amazing examples of proprietary standards, but they are generally the exceptions. A standard is a public good (which is why so many of the most well known standards have been created by or mandated by governments).Your standard needs the support of industry competitors, regulatory bodies, etc. Let’s say you run a company FoodDataGraph that has data on what people eat. Collecting this data and joining this data is an incredible problem. How do you categorize each thing someone eats? Is a hamburger its own entity or does it get split into meat, bun, lettuce, and tomato? How does that get joined to other datasets (like if you want to figure out calories, nutritional information, food source data, food vendor data, prices, etc.). How do you know a menu item from one restaurant is roughly the same as a menu item in another?It is not clear. But one thing is for sure, if you want to create a standard you cannot do it yourself. You are going to need a lot of other companies to adopt it.Huge food delivery companies (like Sysco and U.S. Foods -- and also Doordash, Grubhub, and UberEats) might be a good start. Big groceries (WalMart, Safeway, etc.), restaurant chains (like McDonald’s), and restaurant associations would also be good to get buy-in from. And eventually you might want to rope in the FDA to help bring the standard over the line.Adoption will be much faster if everyone (including your direct competitors) have open access to the standard. Remember, a great standard lifts all boats -- not just yours.A history of standards in one chart— Auren 𝐇𝐨𝐟𝐟𝐦𝐚𝐧 (@auren) Sep 12, 2020When standards disappear, entire ecosystems can die.Once standards get going, it is important that they last. Often billions of dollars are relying on the standard. 
If they do go away, it is important there is a reasonable replacement. Like if meters were abolished in science, we could use the English measurement system. It’s not ideal, but it will work. One could switch from the US Dollar to the Euro or just back to gold. It is not ideal but the switch can happen.We often want a standard for an industry (like Verisk in insurance) but we are afraid to bet on one because there is no assurance that the standard will be around for decades. The good news is that standards, once in place, often last a lot longer than anyone would have expected. Even when the standard itself is suboptimal, like the QWERTY keyboard, standards often persist well after their initial perceived expiration date.There is power in a standard. But as Voltaire (sometimes attributed to Spiderman’s Uncle) said: “with great power comes great responsibility.”Adding standards is hard … and humblingThe first thing you’ll find when working to start a standard is that there are 20 other projects that have started to solve the same problem -- some of those will be alive and some will be long dead.It is humbling to start a standard because the chance of success is low. The more the standard is SIMPLE, the higher chance it has of success. But that does not mean it will be easy to achieve ubiquity.Your standard should be built to exist foreverThe great paradox of standards-building is that organizations and individuals who may benefit from that standard will be nervous to adopt a new standard without knowing others are already using it. Utilizing and operationalizing a standard within organizations takes time - whether it’s the engineering team that needs to change their data pipelines or the sales team that needs to communicate how their data product can more easily integrate with this new standard.Creating a standard is really hard. The chicken-egg problem exists tenfold when one is developing a standard. 
One thing to help kick-start any standard is to show it will be around for a very long time.Your standard should be thought of and built so that it will last forever. There are various ways to support something forever, whether it be through open source software, easily computable, supported by a coalition of members, supported by a government entity (although this can also be impermanent), supported by a foundation, etc. It is up to you to figure out the best way to ensure your standard can last forever - but there are many different ways you can solve for the “forever problem.”Your standard may need to continually adaptSome standards are like the meter -- you set it and forget it.Some standards need to change and evolve over time -- like the FICO Score -- or like an OS like LINUX.The more your standard is in the “set it and forget it” camp -- the more you need to get it correct. Or at least need to keep building on top of it over time (the QWERTY keyboard has become an imperfect standard that we will likely keep for a very long time).The more your standard needs to evolve, the more it looks like a company or ongoing project. In these cases, you don’t need to get everything right up-front but you will need to adapt and change quickly. These standards need to continually improve or they will dieThink of your standard as a platform. Bill Gates’ famous definition: “a platform is when the economic value of everybody that uses it, exceeds the value of the company that creates it. Then it's a platform” also applies to standards. Standards are a platform. A standard join key is the ultimate (and original) platform.A standard is the OG of platforms.Make your standard SIMPLE.If you want to create a standard, try to keep it as SIMPLE as possible. The closer it is to SIMPLE, the more likely it will both be adopted and be enduring. Remember, SIMPLE means:Storable.Immutable.Meticulous.Portable.Low-cost.Established.And heed our advice, don’t try to make the standard perfect. 
Don’t try to please everyone. That will never, ever happen. All standards are incredibly flawed. The most important thing about a standard is that it is good enough to get adopted, not that it is perfect. The perfect is the enemy of the standard. Thank you for reading this. We’d love your comments, ideas, and critiques. We also would love to hear about standards in your industry and what you learned from them. FAQ’s 1. What is a join key in data? A join key is a shared identifier that allows different datasets to be connected. It enables analysts to merge information across sources and answer more complex questions. 2. Why are data standards important? Data standards create a common language across systems. They make it easier to exchange, compare, and integrate information between organizations. 3. What makes a good data standard? A strong standard is widely adopted, easy to store and use, stable over time, low-cost, and simple enough to integrate into existing systems. 4. Why is adoption more important than perfection in standards? A technically perfect system has little value if few people use it. A widely adopted standard unlocks more value because it connects more participants and datasets. 5. How do open standards benefit data ecosystems? Open standards reduce friction, increase participation, and encourage innovation. The easier a standard is to use, the faster it spreads across industries. 
#### Why humility is core to being a successful DaaS (Data-as-a-Service) business Last year we wrote the DaaS Bible (which is now the most widely read post about the operations of data businesses). This year we are creating some follow-up pieces on how one should think about running, managing, and expanding data businesses. So, what is a data business really? It is an ingredient. Many people think of data as a commodity which is not a bad metaphor. But a slightly better way to think about data is an ingredient that is sold to the best chefs. Data is like high quality butter. Imagine instead of data you were selling high-quality butter and your clients are the very best pastry chefs. Your customers take your amazing butter and create delicious croissants, cookies, and other fantastic pastries with it. Maybe they even create a cronut. The end customer that eats these delicious treats will never know the butter came from your farm. They may never appreciate the value of the butter. But every great chef knows and understands that the butter is one of the essential ingredients in a pastry. They will gladly pay a bit more for the best butter. Like the butter, data is just an ingredient. It also sells to the best innovators (usually data scientists, machine learning teams, product managers, and top analysts). Humility is key in data businesses. The pastry chef knows the value of the high-quality butter. But it is important for the butter merchant to remember that the butter is just one of many important ingredients that makes a great croissant. There are many things (including the labor of the pastry chef) that go into making a delicious breakfast treat. Butter is just one of the things. 
Like the butter merchant, data companies need a lot of humility. Data is just an ingredient -- it is not the end solution. Being humble is so important for data companies that SafeGraph has called it out as one of our six values:‍ We are the enablers, not the solvers‍ As a company, it is important we have the humility to accept that our clients and partners will ultimately be the ones to make the world a better place and solve humanity’s greatest challenges … we are just an enabler. This humility should always color everything we do.     As we wrote in the DaaS Bible last year:‍ Data Companies are the unsexy archivists‍ Working at a data company is like being an archivist at the Library of Congress. You know your job is important but you also know it is a supporting role that helps other people shine. Your job is to help and support innovators. There are very few monuments to archivists. They don’t win Nobel Prizes. They don’t write the Constitution; they only preserve it. Being an archivist means being extremely humble. You are an unsung hero. Your job is to help the innovators innovate. You are not the race car driver, you are the pit crew (or maybe just the person who built the wrench). Some people are naturally excited about the role of being an archivist. They are excited to be in the background and have the intrinsic self-worth of playing the core supporting role. Like the lighting engineers in a Broadway play. But not everyone is suited to be behind-the-scenes and those people should not start or work at data companies. ‍ Note: if you are excited about the mission to be an archivist, join us in a career at SafeGraph.   #### Why joint data sets are infinitely more valuable This blog was reposted with permission from PredictHQ | Author: Valerie Williams | Original Source More data = more perspective Why are joint datasets more valuable? 
Because they contain information from multiple sources, which gives companies a more comprehensive and nuanced view of the data. For example, a quick service restaurant (QSR) can combine its own historical sales data with public transport information and census block data to better understand the customer trends and patterns that affect its transaction volumes. By combining these different perspectives, joint datasets provide a more complete picture of a given phenomenon, which is useful for research and decision-making in multiple ways: Joint datasets help to improve the accuracy of forecasting models by providing a larger and more diverse set of data for the model to work with. More data often leads to lower estimation variance, resulting in better predictive performance. More data also increases the chance that it contains useful information. Joint datasets provide a more complete view of the data, which can help to identify trends and patterns that might not be apparent when looking at the data from a single source. Joint datasets also help to identify relationships and connections between different variables, which can be useful for developing more sophisticated forecasting models that take these relationships into account. Data professionals across many industries have been using the concept of ‘Joins’ in SQL for many years. As more tools, programming languages, and methods become available, the concept of the join needs to evolve so that complementary data sets can be easily combined. This will allow data professionals to spend more time analyzing the data and building insights, rather than figuring out how to mesh data sets together. Let’s take a look at how companies across key industries are using one joint data set in particular to enhance forecasting and drive sales, and how you can too. 
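For readers less familiar with SQL joins, the sketch below shows the basic idea using Python's built-in `sqlite3` module. The table names, column names, and rows are all invented for illustration (loosely following the sales-plus-public-transport example above); they do not come from PredictHQ or SafeGraph data.

```python
import sqlite3

# An in-memory database with two small, made-up tables.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE store_sales (block_id TEXT, sale_date TEXT, revenue REAL);
    CREATE TABLE transit_ridership (block_id TEXT, ride_date TEXT, riders INTEGER);

    INSERT INTO store_sales VALUES
        ('block-A', '2023-01-07', 1200.0),
        ('block-A', '2023-01-08', 800.0),
        ('block-B', '2023-01-07', 950.0);

    INSERT INTO transit_ridership VALUES
        ('block-A', '2023-01-07', 5400),
        ('block-A', '2023-01-08', 3100),
        ('block-C', '2023-01-07', 700);
    """
)

# Join the two sources on a shared location key and a shared date.
# Only rows present in BOTH tables survive an inner join.
rows = conn.execute(
    """
    SELECT s.block_id, s.sale_date, s.revenue, t.riders
    FROM store_sales AS s
    JOIN transit_ridership AS t
      ON s.block_id = t.block_id
     AND s.sale_date = t.ride_date
    ORDER BY s.block_id, s.sale_date
    """
).fetchall()

conn.close()
```

Here `rows` holds the two `block-A` days, each with revenue and ridership side by side. `block-B` and `block-C` drop out because each appears in only one source, which is why agreeing on the join key up front matters so much.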
How joint demand and location intelligence powers deeper business insights One particularly powerful data combination is demand intelligence and location intelligence, made possible by a partnership between PredictHQ and SafeGraph – a data company that specializes in providing granular location data and insights including point of interest (POI) data and building footprints which businesses use to better understand the physical world. Leaders in advertising, mapping applications, retail, travel, real estate, and more rely on SafeGraph POI data to reveal the spatial behavior of human beings in relation to physical locations. For example, real estate companies leverage POI data to identify indicators of growth in a specific area based on the presence of certain businesses such as large restaurant chains. These same industries and many others that PredictHQ works with, including quick service restaurants, accommodation companies, parking companies and more can leverage POI and intelligent event data to get a more holistic picture of locations and their demand. We join these datasets together using Placekey, a free, universal standard identifier for any physical place. The combined data sets provide even further granularity and precision by clarifying exactly what is driving demand at a location. Joining these datasets together unlocks local demand insights about holidays, concerts, sports, festivals, and more – each of which impact demand in different ways. With access to accurate POI data and intelligent event data, a variety of industries are gaining greater insight into events tied to unique locations. How to leverage events + POI data to make data-driven business decisions Data-savvy companies are boosting customer engagement by aligning their brands with local events, causes, and trends customers care about. When you have insight into all events taking place within walking distance of your stores, you can choose which ones to build campaigns or promotions around. 
For example, you might get advance notice of a community breast cancer walk that ends right by one of your stores, restaurants, or parking garages. PredictHQ provides predicted attendance, exact start times, and accurately predicted end times for attended events, paired with granular location intelligence powered by SafeGraph for unmatched accuracy and detail. Let’s look at a couple of examples of how joint event and POI data powers actionable business insights: Consumer Packaged Goods (CPGs) CPG brands often want to know their total addressable market: the total number of businesses that could potentially carry their product. By determining this number and mapping it out geographically, companies can more effectively focus their expansion efforts and target specific areas for growth. “As the pace of change in shopper demand patterns, market conditions, and supply chain constraints accelerates, demand forecasting AI that uses state-of-the-art models reduces guesswork and gives CPGs a more strategic view,” said PredictHQ CEO Campbell Brown. For example, consider a beverage manufacturer that produces a variety of sports drinks. The company can use location data to analyze where its products are currently being sold, as well as where they are not being sold but have the potential to be successful. Or it could use location-based event data to track local sporting events, such as football games or marathons, where its drinks are likely to be in high demand. It can then reach out to distributors within walking distance of these demand-driving events. The company could also use event data to identify when and where these events are taking place, and then use this information to reach people in the surrounding area with targeted advertisements and promotions for its sports drinks. By leveraging joint data, the company can boost brand expansion, gain business momentum, and deliver better-than-anticipated earnings year over year. 
Out of home advertising Out of home advertising has been around for decades. Many companies discounted it as a channel for years, but it is now experiencing a resurgence. A recent KPMG report pointed out that outdoor advertising has seen an 11% annual growth rate over the past five years, and we expect the upward trend to continue in 2023. Out of home advertising businesses can use the joint dataset for campaign planning through audience segmentation, and to measure advertising effectiveness based on specific POIs or events that take place. They are using the data to know where and how to interact with certain audiences. This is a shift from previous strategies of marketing to a person to marketing around an occasion, like a sports game. Occasion-based marketing allows OOH advertising companies to meet their target audiences' expectations with context from events and from the venues or POIs where those events are held, which builds relationships and loyalty. Mapping products in the geospatial industry Companies that provide mapping applications, such as the ones you use on your phone to find an EV charging station or coffee shop, use this joint data to better understand where things are and to update these details so they stay accurate to the real world. For example, as the names of local stores and venues change, mapping companies need these details to provide the most up-to-date information for end users. Beyond having up-to-date details about a business, it’s important to depict an accurate view of a place’s building footprint, which encompasses a building’s precise parameters or “outlines” of a given structure. 
Polygons represent the true shape of a POI, and can help visualize the exact location of a building, the number of buildings, and even buildings hidden in aerial images by trees. For example, POI building footprints can provide context into the restaurants that live within a sports arena and the surrounding surface parking lots. By combining fresh POI and events data, mapping companies can provide their users an accurate depiction of the physical world. Retail Leading retailers are optimizing their site selection with event and POI data by choosing locations based on demand from nearby venues and areas that have many events – such as severe weather, concerts, sports events, school holidays, and more. Events drive people movement and companies are using event data to determine how many people will be coming to specific locations for events, to better understand the level of competition in the area and the potential growth. This insight helps retailers make more informed decisions about site selection and develop strategies for attracting and retaining customers in the chosen location. #### Why SafeGraph Does Written Interviews ✍️ (and Why Your Company Should Do Them Too) Most jobs at SafeGraph require a written interview. We’ve found these written interviews to be extremely valuable, and including them in our hiring process leads to better results. This post outlines our thinking about written interviews. What is a written interview? A written interview is not a test…and it is different from a project or presentation. It is essentially the same thing as a live interview except it is communicated in written form so candidates can take their time to compose their answers. We simply send candidates a link to a Google Doc with 4–8 questions. We ask them to get us responses within 3 days…so they should have ample time to think things through. 
Usually, candidates take 20–60 minutes to complete a SafeGraph written interview, and we try to have the courtesy to respond to the candidate within 12 hours of submitting the interview with any feedback or next steps. Written interviews can augment the interview process. While we are not arguing that companies should stop doing live interviews, we think replacing one of those live interviews with a written interview will significantly increase the interview experience (for both companies and candidates) and ultimately lead to better outcomes. Written interviews give everyone in the company a baseline about the candidate. Before doing a live interview of a candidate, every interviewer always reads the candidate’s resume and hopefully thinks about some good questions. At SafeGraph we also ask interviewers to do the extra step of reading the written interview before conducting their live interview. That means the candidate does not have to answer the same questions to everyone. Everyone at SafeGraph who interviews the candidate starts with a deeper understanding of the candidate, and interviewers can ask follow-up questions about the written interview. Written interviews help reduce the biases in favor of people that think fast on their feet that are inherent in live interviews. Not everyone does their best thinking on-the-spot (I certainly don’t). Some people (myself included) need to take some time to think about a problem before having a decent answer. A candidate should be prepared to talk about some topics live and on-the-spot (like their work experience), but other areas require thought (and research) to deliver a good answer. Written interviews can also change the dynamic of the interview process. With written interviews, you provide candidates an opportunity to comfortably showcase their creativity and critical thinking skills in a way that is abstracted from questions normally asked in an in-person interview. 
While all of the people hired at SafeGraph did well in both written and live interviews (otherwise they would not have gotten an offer), some of the best performed significantly better in the written interviews. In our small but rapidly growing start-up, we would have overlooked at least two great people if we did not get a chance to see their extraordinary written interviews. What stage of the interview process is the best time to do a written interview? After doing a resume review, we schedule a quick (under 30 minutes) live phone/video call to discuss the candidate, role, etc. We prefer the specific hiring manager for that role to do the first phone interview (when possible). When we have a good first live interview (usually a phone or video call), then we send a written interview to the candidate. We let them know why we are doing the written interview and walk them through the process. What are some good questions to ask in a written interview? Our written interview is currently six questions. There are three types of well-written interview questions: Expectations Questions. Research Questions. Thought Questions. Expectations Questions are usually ones that give the candidate a chance to opt-out. For instance, if a job requires a massive amount of overseas travel, you might want to ask “this job will require you to be on the road 6–9 days a month. Are you ok with that?” These questions do not require a lot of thought or a lengthy response, but we find that one gets a more truthful answer in a written interview (and it allows candidates to gracefully drop out of the process if they do not feel they can fit the criteria). Expectations questions are not a test, rather they are a confirmation that SafeGraph and the candidate are aligned on some fundamental expectations of the role. Research Questions are ones that require the candidate to do some research and get back to you. For instance, you might ask a marketing candidate: “Evaluate our website. 
What do you like and what can we do better?” Research Questions are hard to give in a live interview because they take time. But it is a shame to omit them because they give candidates a real opportunity to shine. Candidates also make choices on how to present the information (e.g., graphically, in a Gantt Chart, via a presentation, a recorded video, organized bullets, prose, etc.) that is helpful to understand how a candidate would communicate in the real workplace. Thought Questions are more open-ended questions that take consideration to provide meaningful answers. As an interviewer, my goal is not to hear the first answer…I prefer to hear the candidate’s best answer. For example, the classic Peter Thiel question is better to ask in a written interview than live. Our version is “what is something important that you believe that most people at SafeGraph would disagree with?” Even in written form, with plenty of time to collect your thoughts, this is an incredibly hard question (in fact, we usually get answers that most people at SafeGraph do very much agree with). Asking this live does not give the candidate a chance to shine. Because I am a big reader, another written question we like to ask is “what is a great non-fiction book or article that you would recommend we read?” Many of my best readings have been recommended to me from candidates. Asking this question live might not get the best answer (one may just get a sub-optimal recommendation that is top of mind). Written interviews should not entirely replace live interviews. One thing that rarely gets asked is why do companies do interviews live (in real-time)? One reason is to ask follow-up questions. If an interview was done asynchronously (like over email), it would be really hard to ask follow-up questions. The back-and-forth might take weeks. So having a set time (like Wednesday at 10:30 am) to have a live conversation can be really helpful for a quality back-and-forth. 
Another reason to have a live interview is to assess how fast someone thinks on their feet. For some roles, like sales, this is a very important skill. But if you rely too heavily on live interviews then you will be biased towards hiring people that think quickly on their feet. That is not a core skill a company should optimize for…at least for most roles. Another benefit to having live interviews is to allow the candidate to ask the interviewer questions. “What is it like to work at your company?” “What is your personal story about how you joined the company?” Live interviews allow the candidate to ask follow-ups on their questions. Live interviews also give you a sense of the candidate’s personality. How does this person verbally communicate? Would they be a good fit in my company’s culture? And they allow the candidate to get cultural cues about the company (which is especially true when visiting the office). But these cultural cues can also lead to implicit biases (Bob may be overly biased to hold favorable opinions of people that look and act like Bob) so they need to be tempered with other data. Can candidates give written interviews to companies? In live interviews, it is common to allow the candidate to ask the company questions and hear the company answer. While it is not common, we have seen candidates send us written interviews to answer. This usually happens when the process is more mature (near the offer stage). Candidates might have a list of questions about the company that they want to know and asking them in written form might be both efficient and ensure the best answers (we want to be able to provide our best answers, and not just our first answers, too!). Want to try out a written interview? We’re hiring. We invite you to explore a career at SafeGraph. #### Why SafeGraph Moved its HQ from San Francisco to Denver Where should a company’s official headquarters be located? 
As many companies become more distributed - or, like SafeGraph, fully distributed - it’s a question many are wrestling with. To be clear, there isn’t a one-size-fits-all answer. Every company will have different things they’re optimizing for, and different employee makeups that may dictate one option over another. Up until mid-2020 SafeGraph was legally headquartered in San Francisco. That made sense because it’s where we were founded and started in 2016, and over 80% of our early employees were located there. We had a large physical office, and held monthly “remoter’s weeks” in that office where all non-SF-based employees would come so we could all be together. There was no strong argument SafeGraph could make to be based anywhere else. Fast forward to 2020, and SafeGraph has morphed in a way many companies have over the past year. We decided to lean into a remote work environment a year or so prior to the pandemic, which has unsurprisingly only accelerated that change. Even a completely remote workforce can have a team Halloween party. As we pass the 50-employee mark today, we have less than a quarter of our team working or living in the Bay Area. Many that were previously there have moved - to Miami, southern California, Colorado, Seattle, and Texas, just to name a few - and even more that have recently joined are based out of other locations. We anticipate the percentage of SF employees will likely be under 10% soon. SafeGraph's remote working environment allows employees to live and work across North America. Although our CEO still lives in San Francisco, we don’t have the ties to the city that we once did. While we love San Francisco - its vibrancy and its beauty - the city has long been too expensive and offered too few services for the cost. SF is also increasingly unfriendly to businesses, making it hard to make long-term plans to be there. Crime has also gone up - a few of our employees have been mugged or assaulted in the middle of the day. 
San Francisco no longer makes sense as our corporate headquarters. So, why Denver? There are several reasons we decided on the Mile-High City. In no particular order: We already have a good number of employees located in and around Denver, including two VPs, and anticipate that number growing over time. Though it’s not necessarily a legal requirement - depends on who you ask - we thought it was smart to choose a location where at least a few employees were located. Texas was another option we considered, but we only have one (newly relocated) employee there and it didn’t feel as “defensible” of a choice. Colorado is considered a business-friendly locale. San Francisco, and California more generally, is the opposite. Business taxes are more manageable in Denver, and the government is much friendlier and more receptive to businesses based there. This will be very valuable if SafeGraph is profitable in the future; we were profitable in 2019 and it is likely we will be again. This may also open up tax grants or other similar opportunities that would not have been available without being legally based in Denver. It’s very possible that our employee benefits costs will go down. As a distributed company that offers health insurance to all our employees, we are priced based on our headquarters location. This means an employee who lives in, say, Chicago is paying for their health insurance as if they lived in San Francisco. While the Bay Area isn’t the most expensive in the US, there are much more competitive markets (mainly due to the high average cost of medical care in San Francisco). Denver is typically more affordable, and we’re eager to see how much savings we can gain here. Health insurance is an ever-changing industry, though, so we won’t know what this means as far as savings until our next open enrollment period. Denver is known as being a popular place for remote employees to live. 
The reasons for this are many, including the more affordable cost of living, great climate, outdoorsy nature, great airport, etc. While we don’t plan to focus SafeGraph’s hiring in any given city, we do like that our headquarters is in a place with a longstanding connection to remote-friendly work. It’s a beautiful, fun, thriving, growing place! We’re proud to say we’re “Headquartered in Denver”, and think it’ll be a great location to all get together as a company when times return to normal. Our Denver-based employees enjoy exploring the area. Washington Park is a favorite spot. There are likely even more reasons for our HQ move that we haven’t yet discovered. We look forward to finding out what those are! #### Why SafeGraph Raised a $45M Series B SafeGraph is a fully remote data company, with over 70 employees across North America. Read more about SafeGraph's Series B in TechCrunch, Insider, and Street Fight. We raised a $45M Series B to get us one step closer towards making our vision (to democratize access to data) a reality. Raising a Series B allows us to grow faster and more quickly achieve our goal of being the destination for all data about physical places (geospatial points of interest). But raising a Series B doesn’t change our mission at all -- SafeGraph is (and will always be) just a data company. Our only focus is data. No software, analytics, visualization. Just data. Just facts. Now is a great time to build a data company. SafeGraph’s Series B will help us continue to build our pure-play data business as demand for high-quality, accessible data grows. Welcome Cathy Gao at Sapphire Ventures Our Series B is led by Sapphire Ventures, and we’re incredibly excited to work with Cathy Gao, Nino Marakovic, and the full Sapphire team. Cathy brings a wealth of knowledge to the table that will be invaluable while navigating and discovering new markets of data buyers. 
And Sapphire has deep experience in investing in data businesses. Their team understands what it takes to build category-defining companies. Before entering the fundraise, we ranked Sapphire as the most desired long-term capital partner for SafeGraph. We felt the firm had the best understanding of data businesses and also has an amazing history of backing mega-successes. We also want to thank the handful of other firms we met with during the process and the four firms that gave us very competitive term sheets. All of these investors helped improve the SafeGraph business. One of the great things about the fundraising process is the large number of extremely smart people you get to interact with in a very concentrated time. Our Series B also includes investors from our previous round of funding -- thank you to Alex Rosen at Ridge Ventures, Mitch Kitamura at DNX Ventures, Will Snellings, Pete Briger, Dan Benton, Jack Dangermond, Jeff Epstein, Tod Sacerdoti, and Peter Thiel. They have been influential in our growth over the past four years, and we are excited to continue working with them. We raised the funds on a $370 million valuation. More importantly, we raised them from a great partner and we did it on super-clean terms. Meeting the growing market demand A lot has happened at SafeGraph and in the world since our last round of funding. The geospatial industry has grown tremendously, with more businesses relying on location data to solve increasingly complex problems. With our laser-focus on places data, SafeGraph is just a piece of the larger geospatial puzzle, and we’re excited to see the continued growth in the industry led by our users and partners. As a company, we’ve grown to over 70 employees (as of March 2021), more than doubled our year-over-year revenue, and built a highly-scalable data business. 
We’re proud of what we’ve accomplished, but we have big plans for the future and are excited to move into this next phase of growth. As we’ve hired more employees, we’ve expanded across North America. From Toronto to Miami to San Diego and everything in between, SafeGraph is gaining notoriety as a top tech employer. One of the most important metrics we measure is revenue per employee. As we’ve grown our team substantially in the past few years, this has been incredibly important. Our adherence to business efficiency ensured we didn’t need Series B funding to keep operating as we are today. But in order to meet the rapidly increasing demand for reliable and accurate places data, we elected to grow even faster. Why even raise money? Just six months ago, we were not even sure that we’d ever raise additional money. SafeGraph is growing very quickly and we’re one of the most efficient high-growth tech companies (we burned just $3M in cash in the last 2 years). Despite more than tripling our team, SafeGraph remains one of the most efficient high-growth tech companies by closely watching revenue per employee and other key metrics. Most companies raise money because either (1) they need to; or (2) the money is too hard to say no to. We did not want to raise money just because that is something expected of a fast-growing start-up. We did not want to raise money just because now is a great time to raise capital. If we raised more money, we wanted to do it because it would have a massive impact on the SafeGraph vision. Raising money (selling equity) has a cost. It takes a LOT of time to raise the money -- and that time is primarily borne by the CEO and a few other key personnel. The opportunity cost of that time is massive. Raising money is also dilutive to shareholders -- some of the best companies could have had a much higher shareholder return if they raised less money. Prior to the Series B, SafeGraph had raised less than $21M and most of that was in the bank. 
We prized our efficiency and definitely did not need to raise more money. We also prize our focus on shareholder value -- the price per share is more important for the long-term wealth creation of our shareholders than market cap. Our obsession with shareholder value and our belief in focus guided us in spinning out a separate eight-figure ARR business last year. But we did ultimately decide to make the conventional choice and raise a Series B. We did that because we have a massive opportunity to democratize access to data (in the long term) and to have data about every physical place in the world (in the mid-term). Extra capital will significantly accelerate growing our selection from 7 million places to 7 billion places -- from aggressively expanding internationally to adding more types of places to making more acquisitions. And speaking of acquisitions … expect SafeGraph to be acquiring a few world-class companies this year -- both with the capital we have and the large pools of additional capital we also have access to. Navigating the pandemic (that helped us grow) Like most businesses, the event that had the greatest impact on us over the last year was the pandemic. COVID-19 greatly accelerated the adoption of geospatial analytics, particularly with places data, within every industry. Understanding deep structural information about physical places is even more important when there is an external event that accelerates the trajectory of change in the data. Like most companies, SafeGraph and its employees have adapted to a Zoom environment over the past year. SafeGraph serves some of the most important companies and organizations in the world: from logistics, GIS, public health, financial services, retail, academia, and everything in between. 
A special thank you to the data scientists at our amazing customers including: Sysco, Esri, Goldman Sachs, Volta Charging, Verizon, Mapbox, Ares, Tripadvisor, Choice Hotels, Sheetz, Jefferies, Centers for Disease Control, US Foods, Axtria, Tate & Lyle, City of Los Angeles, Sandia Labs, Xcel Energy, Viant, AT&T, KeyBank, and many many more. SafeGraph data powers numerous COVID-19 response applications, such as the US COVID Atlas. We’ve been working tirelessly to deliver our data to those who need it in these challenging times. To that end, we created a community of over 7,000 data scientists who collaborate on geospatial projects, both related to and separate from COVID-19 response. Their work has led to over 300 published academic papers (in the last 9 months) citing SafeGraph data and inspired us to keep innovating and building truth sets for the physical world. Finding our place in the geospatial ecosystem Teaming up with other geospatial data science leaders, we launched Placekey, the standard identifier for a physical place. Placekey makes location data more accessible and usable for all. We also learned a lot about how data scientists work with geospatial data and decided to tackle one of their main challenges: the join key. Data is the most powerful when it can be connected to other data, but that isn’t always easy to do. To empower our users, we helped develop the standard identifier for a physical place, Placekey. With Placekey, SafeGraph data is now easily joinable to a growing network of datasets, helping more people access location data. Building our partner network and global footprint Partnerships are key to our business, particularly as we focus on self-serve. We’re focused on places data -- and being really, really good at building it. 
But geospatial data is so much more than points of interest and building footprints. Through our partnerships, we are able to expand access to SafeGraph data, providing it within platforms data scientists are already using for analysis. We’ve been growing our partner network to include technology and services providers, like Snowflake and Databricks. These partnerships help us integrate with key technologies our customers use and provide expert professional services when needed. Four orders of magnitude in two years We’re also going to increase the amount of data that is available to our customers by 100x in the next year. And then we are going to 100x that the year after. That’s right, we are increasing our coverage by two orders of magnitude a year! (For you math geniuses, that’s four orders of magnitude in two years.) We are going to do this by: aggressively expanding internationally (we launch UK in a few weeks); working with other fantastic data partners; acquiring companies; and increasing SafeGraph’s infrastructure. A team of passionate superstars As of this writing (mid-March 2021) there are over 70 employees of SafeGraph. We started 2020 with 22 employees -- so we’ve grown our team by over 3x in less than 15 months. If the definition of being a “veteran” is that over half the team has been hired after you, then people that joined SafeGraph six months ago are now veterans. While the company has grown rapidly, SafeGraph’s vision and values have remained remarkably consistent. And we are hiring. A lot. For all roles. Anywhere in the U.S. and Canada (and others soon). Come join us if: you are passionate about SafeGraph’s mission to democratize access to data; at least 80% of people who have worked with you put you in the top 10% of the people they have worked with; and you believe that life is too short to work with B-players. Looking ahead In our blog that announced our Series A funding, we noted that we predict the past by focusing on veracity and truth. 
While many little things in our business have changed, the core things about SafeGraph remain consistent. We still focus on selling high-quality facts. Five years from now, you will be able to get data on every place in the world. And 100 years from now, SafeGraph will continue to democratize access to data and still just sell facts to innovators. #### Why the famous Peter Thiel interview question is so predictive Most people think their views are heretical when they are in fact mainstream. You learn this when you ask Peter Thiel’s favorite interview question to job candidates: “What important truth do very few people agree with you on?” The Thiel question (which can be described as “what is a heretical view you have?”) is a great one because it reveals that most people hold conventional opinions and call them “heretical.” It’s a great question because almost no one can come up with an answer that most people they know would disagree with. Heretical in an interview context could mean “an idea that most people at SafeGraph disagree with,” or if the candidate works at a large company (like Google), “what most people at Google would think is crazy.” As Auren Hoffman (@auren) tweeted on December 15, 2019: “The Thiel question ("what is a heretical view you have") is such a great question because almost everyone answers it with an opinion that most people agree with. People think their views are heretical when they are very mainstream.” What makes for a good answer? A good answer is one that sparks a unique view of the world and shows the candidate doesn’t think like their peers. This signals that the candidate 1) thinks differently; 2) is open-minded; and 3) is brave enough to discuss an unpopular opinion. An employee who thinks differently will identify opportunities the larger group may not see. This employee will see a rabbit-shaped cloud in the sky when everyone else sees a turtle. 
There’s value in that (especially if it is indeed a rabbit). An open mind leads to more “yes, ands” and fewer “no, buts.” If you’re looking to drive growth at a small start-up, “yes, ands” are far more valuable. If you work at a large company, “no, but” is likely more valuable to protect the company from significant downside risk. So someone that performs well on the Thiel question may ultimately be better at a smaller fast-moving company than a larger slower-moving organization. One would prefer answers that do not just restate the conventional wisdom of a recent TED Talk. It’s okay if the answer is wrong. It’s not okay if everyone agrees with it. Of course, the best answers are both contrarian and right. Being right is extremely hard (and might take years to prove). But one should be able to immediately know if the answer is contrarian. What does a bad answer predict? There are three potential “flags” if someone answers the heretical view question during an interview with a conventional view: 1. They are not self-aware enough to know a view is widely held. 2. They don’t have many original thoughts. 3. They are afraid to sell people their original thoughts. As Auren Hoffman (@auren) tweeted on December 15, 2019: “There are three potential "flags" if someone answers the heretical view question with a conventional view: 1. they are not self-aware enough to know a view is well-held 2. they don't have many original thoughts 3. they are afraid to sell ppl their original thoughts” A Lack of Self-Awareness If someone answers the question with a mainstream view, it’s possible that they just aren’t self-aware enough to know a view is widely held. To be self-aware means you ask questions of yourself, such as “how many other people also have this view that I think I alone hold?” People who aren’t self-aware never question (❓) the belief or investigate whether their view is actually heretical. They don’t search it out online or read blogs on the topic. 
They don’t talk to their friends or peers about their belief. So they end up holding mainstream beliefs that they think are rare. No Original Thoughts A second reason for a mainstream answer to the question could be that the candidate doesn’t have many original thoughts. Original thoughts are difficult to have because you are a product of your environment. If surrounded by people who hold a specific view, over time, you will likely come to hold that view. Usually, the incentive is to stay a member of the group -- to keep your friends, your job, or your peers. We all eventually become the average of our friends. This is one reason why writers often prefer to write during long periods of isolation. Developing original ideas and heretical points of view is a precarious process, and the writer must protect those ideas from mainstream contamination. Isolation helps with this until the original ideas can stand on their own. The rest of us do not have that luxury so we are constantly at risk of losing our originality as we are pushed to conform to our surrounding norm. If someone doesn’t do this work to recognize why it’s difficult to have original thoughts, they’re not likely to have many of them. A Fear of Judgment Thirdly, some people have very original thoughts but they are afraid to tell others about them. They are afraid of being socially ostracized (or worse). Everyone is afraid of this. So we will only feel comfortable telling people the truth if we feel confident that they will not penalize us for this. We want to know that the messenger will not be shot. One of the advantages that Peter Thiel has is that he is a well-known contrarian that is known to appreciate various truths and is a promoter of free speech. So most people would be confident that even if Thiel does not like their idea, he will not personally attack them. 
Many people would be comfortable telling Peter thoughts they would not dare to publish on Twitter. Other leaders that want interviewees to feel comfortable proposing contrarian ideas have to provide a no-judgment space for people (so that they can feel safe). Part of the reason we are writing this is that we want people interviewing at SafeGraph to feel more assured that their “crazy” ideas will not be used against them. People that do have original thoughts need to feel safe enough to sell people their ideas. Because so few people have deeply innovative ideas, it would be very bad for society if these ideas just sat on a shelf and were never heard. It takes courage to go against a mainstream view in a group dynamic … but that courage does not need to be carried alone. As a leader, you should cultivate an environment where employees feel safe to share heretical views, especially if they solve a problem or lead to a growth opportunity. And you can do the same in an interview process. To put a candidate at ease when asking this question, clearly signal that you tolerate weird or unpopular ideas. That may help them to give an honest answer, even if they think their opinion is so outrageous it could destroy their chances of getting the job. Tell them that you won’t get offended by the answer, and hopefully, they’ll share a true heretical view. Risking unpopularity One learns quickly in life that it is never fun to be unpopular. Everyone has felt it at some point in their life. Most people never want to feel it again and so they never go against the grain. People that are willing to say unpopular things are ok risking unpopularity for the sake of truth. These people prize curiosity and knowledge over being the life of the party. Of course, some people are just jerks. One of the hard parts is finding sweet contrarians -- people who think differently yet are still kind, nice, and considerate. 
You can think differently while still having tact and heart.

As Thiel says in his book, Zero to One: “Brilliant thinking is rare, but courage is in even shorter supply than genius.”

Hiring people who cannot come up with any heretical views is precarious. The person is either ultra-conventional, political, or lacking in self-awareness.

You do not want those people in your organization, especially in leadership positions.

"in any case, hiring people who cannot come up with any heretical views is precarious. either the person is ultra-conventional, political, or lacks self-awareness ... and you do not want those people in your organization ... especially the leadership" (Auren 𝐇𝐨𝐟𝐟𝐦𝐚𝐧, @auren, December 15, 2019)

Of course, it is not enough to just have a contrarian sentence. The interviewer wants to probe why someone thinks that way. Just like regurgitating a TED Talk is not sufficient, neither is restating a well-known contrarian argument without something underlying the thesis. For instance, “there should be a free market where people should be allowed to sell their kidneys” has been discussed in every good freshman dorm. Giving that answer to the Thiel question without a lot of new information or arguments is just as bad as not being contrarian at all.

The Thiel question can help you find and hire 10xers.

The goal is to try to identify and hire 10xers for your company. These are people who not only identify solutions to problems but focus on finding new opportunities for growth (often with heretical ideas) and have the courage to pursue them. They have a bias for action while maintaining a glass-half-full view of the world.

These 10xers are very rare. That’s why Thiel’s question is a great one: it helps you tease out a few of these characteristics in an interview.

The question isn’t sufficient on its own.

The question isn’t perfect though. It does not stand alone in an interview process and must be combined with other ways of assessing candidates.
It’s just one tool in a larger toolset.

Written interviews or advance notice.

It’s also difficult to give a good answer to this question during an in-person interview. The question requires a lot of thought. It’s an involved question so it’s best to only ask it in written interviews (or to give a candidate warning that you will ask the question). This gives the candidate a few days to think about their answer.

A great answer (as rare as that is) does not guarantee better performance. It only hints at the ability to cultivate original thoughts, along with the courage to share them.

You could be interviewing someone with a great answer who is not going to be a great employee. They could be a great entrepreneur or a great thought leader, but not a great employee. Original thought is important, but so is being able to work well in a team towards a common goal.

"I used to be worried about asking the Thiel question to candidates because I thought they would give answers that were completely crazy. but it turns out that crazy answers rarely happen -- the fear now is of the conventional answer." (Auren 𝐇𝐨𝐟𝐟𝐦𝐚𝐧, @auren, December 15, 2019)

I used to be worried about asking the Peter Thiel question to candidates because I thought they would give answers that were completely crazy. But it turns out that crazy answers rarely happen; the fear now is of the conventional answer.

If someone does give you a heretical view that you’ve never heard, pay attention. This could be someone who pushes your organization forward by 10x. It just doesn’t guarantee it.

Special thanks to Thomas Waschenfelder for his help and edits.

Want to learn more about SafeGraph? Check out our career opportunities (you can be based anywhere in North America).

#### Why Transparency Matters: Becoming the Most Transparent Data Company

Key Takeaways

Transparency is the foundation of data privacy for location data and a prerequisite for trustworthy analytics.
SafeGraph focuses on place-based intelligence, does not share data on individual consumers, and does not offer any mobility data whatsoever. Open schemas, documented sourcing methodologies, and published limitations allow customers to evaluate data quality with confidence. This approach enables mission-critical location analytics built on trust, accuracy, and long-term reliability.

Note: SafeGraph does not offer mobility or foot traffic data.

Data privacy for location data is fundamental to building trust in geospatial intelligence. As location data increasingly informs decisions across enterprises, research institutions, and public organizations, transparency around how that data is sourced, validated, and protected becomes essential.

SafeGraph’s goal is to be a source of truth for data on physical places. To support this goal, we focus on curating accurate, precise, and up-to-date geospatial datasets that power location analytics for large corporations, small businesses, and academic institutions. We also recognize that truth in a rapidly changing physical world is aspirational. While we continuously strive for the highest possible accuracy, we are equally committed to being transparent about where our data comes from and how it is evaluated. This commitment includes free access to our data sourcing methodologies, schemas, fill rates, and known limitations.

How Transparency Is Core to Our Mission of Becoming the Source of Truth for Physical Places

From the beginning, SafeGraph has been committed to data privacy for location data and to providing access to high-quality places data without compromising consumer trust. Our focus has always been on data tied to latitude and longitude coordinates, never to individual people. We are equally transparent about the limitations of our datasets because trust is built not only through accuracy, but through openness.
Our Data Journey

Understanding physical locations also requires understanding how people interact with them. To support location intelligence use cases while maintaining strong data ethics, we partner with trusted data providers to enrich our places data with anonymized foot traffic data and aggregated transaction signals. All such data is privacy-safe, aggregated, and designed to prevent individual identification.

Our earliest datasets consisted of aggregated and anonymized mobile location pings sourced from applications where consumers explicitly opted in to share their location. At the time, this data included latitude, longitude, and timestamp information, but lacked context about what existed at those locations. Customer feedback made this limitation clear. Without knowing what was present at a location, the data was difficult to interpret or use effectively.

To address this gap, we began sourcing points of interest (POI) data to add meaningful context to mobility signals. However, we quickly discovered that high-quality POI data was difficult to obtain. Existing providers updated their data infrequently, relied on inaccurate geocoding, and lacked the level of detail required for reliable location analytics. As a result, we made a strategic decision to build our own POI database.

In 2018, SafeGraph shifted its focus from device-based location data to place-based intelligence. This shift strengthened our commitment to POI data sourcing methodology, transparency, and data quality. Even as we built a more accurate and comprehensive database than what was available in the market, we recognized that complete accuracy is impossible in a constantly changing physical world. To maintain transparency, we continue to publish known issues, data limitations, and monthly fixes. This approach allows customers to understand not only the strengths of our data, but also its boundaries.
You can learn more about how we structure and maintain place intelligence in our POI Data documentation. The Importance of Transparency for Mission-Critical Location Analytics Organizations often use location intelligence for mission-critical analytics, which means they need to trust both the data itself and the company that produces it. This is especially true when working with sensitive datasets such as anonymized foot traffic data and other privacy-aware location signals. When there is uncertainty around data quality or POI data sourcing methodology, the analytics built on top of that data are immediately called into question. Whether insights are used to make important business decisions, advise clients, or inform consumers, the outcome must be trustworthy to be valuable. Transparency around what the data represents, how it is sourced, and where its limitations exist forms the foundation of that trust. The same principle applies to data providers. If a company is not trusted, neither is the product or service it delivers. Transparency about bugs is one part of building trust, but so is openness around how data is collected, anonymized, and governed. This is particularly important in data ethics within location intelligence, where responsible handling and clear documentation directly influence credibility. Industry research and policy guidance consistently emphasize that transparency and anonymization are critical safeguards in the use of location data, particularly as geospatial insights play a growing role in enterprise and public-sector decision-making. For SafeGraph, transparency is especially important because our data often serves as one component within a larger solution. If the integrity of that data is questioned, the reliability of the end solution is also at risk. By maintaining openness around sourcing practices, data quality, and known limitations, we work to protect both the integrity of our datasets and the outcomes our customers depend on. 
How Is SafeGraph Transparent?

SafeGraph does not hide anything about its data. We publish our data schema publicly, along with bug fixes and release notes, as part of an open data schema for geospatial intelligence. Our datasets are refreshed monthly to reflect a dynamically changing world. While we strive to create the most accurate places data possible, we recognize that no dataset is ever perfect. We actively encourage user feedback and make it easy to report errors so they can be addressed quickly.

SafeGraph also makes data access straightforward. We provide data free to academics for use in research and education and remain transparent about what data we offer and how it is built. Our goal is to ensure that high-quality location data is accessible to those who need it.

The integrity of our company and data curation practices is a core value across SafeGraph. In location intelligence, privacy is foundational to transparency. That is why we are open about what data we build, how it is built, and the safeguards we use to protect consumer privacy.

Ready to use trustworthy location data? Get a free SafeGraph data sample and explore how transparent, privacy-safe data can support your analytics.

Other companies may violate privacy regulations, but what differentiates SafeGraph is that we don't offer any consumer data. Because our focus is entirely on places, we never work with individual-level consumer data. Instead, we curate aggregated and anonymized foot traffic data to provide insights into visit volume and frequency at specific locations.

Industry research and policy guidance consistently emphasize that transparency and anonymization are critical safeguards in the ethical use of location data, particularly as geospatial insights play an increasing role in enterprise and public-sector decision-making.
Across our datasets, we apply a consumer protection methodology to ensure that insights reflect trends at locations rather than individual behaviour, reinforcing trust and responsible data use. What Kind of Data Does SafeGraph Build?Part of being transparent is clearly defining the data we provide. SafeGraph offers data about physical places. Our POI database (Point of Interest) includes attributes such as latitude and longitude coordinates, open and close dates, and NAICS codes. Geometry data defines structural boundaries and building relationships to support accurate proximity analysis and geofencing. We also offer SafeGraph Spend, which provides anonymized and aggregated debit and credit card transaction data at the business level.  Does SafeGraph Collect Data?SafeGraph creates its datasets using a combination of machine learning, web crawling, and licensed third-party data. Places and Geometry datasets are built from open store locators, publicly available APIs, and licensed sources, supplemented by proprietary machine learning models to infer additional attributes and spatial boundaries. SafeGraph Spend is created using licensed third-party transaction data that is aggregated and anonymized at the store level. The purpose of this dataset is not to analyse individual spending behaviour, but to understand how aggregated transactions relate to physical locations, regions, and categories of places. You can learn more about our full data sourcing process here. Does SafeGraph Have an SDK? SafeGraph does not provide an SDK or consumer-facing software available in app stores. SafeGraph Provides Open Access to Data One of SafeGraph’s core values is ensuring that data is not hoarded by a small number of organizations. We believe data should be accessible to those who need it to support research, innovation, and informed decision-making. This commitment to openness has been central to SafeGraph since before our product strategy shift in 2018. 
We aim to foster an environment where data scientists, academics, and businesses can collaborate using location data to generate meaningful insights. SafeGraph’s Commitment to Transparency SafeGraph recognizes that our data is often one component within larger analytical systems. To protect the integrity of those systems and the decisions built on them, our priority is to be a trusted and transparent data partner. By remaining open about our practices, limitations, and safeguards, we work to increase the usability, accessibility, and reliability of geospatial data for critical problem-solving. If you’d like to learn more about SafeGraph data and our approach to transparent, privacy-safe location intelligence, explore our free datasets or schedule a demo with one of our experts. FAQ’s 1. Where does SafeGraph’s location data come from? SafeGraph builds its datasets using a combination of publicly available sources, licensed third-party data, web crawling, and proprietary machine learning. Places and Geometry data are sourced from open store locators, public APIs, and licensed datasets, with additional attributes inferred using internal models. 2. How does SafeGraph protect consumer privacy in location data? SafeGraph does not collect or sell individual-level consumer data, thereby protecting consumer privacy. 3. What makes SafeGraph different from other location data providers? SafeGraph focuses exclusively on physical places rather than people. It differentiates itself through transparent data sourcing, publicly available schemas, published bug fixes, frequent updates, and a privacy-first approach to location intelligence.  4. How often is SafeGraph’s data updated and corrected? SafeGraph refreshes its datasets monthly to reflect real-world changes. Updates include new locations, closures, corrections, and known issues, all documented through public release notes to maintain transparency. 5. Is SafeGraph data free, open, or auditable? 
SafeGraph provides free data access for academic research and education. Its open data schema and public documentation allow users to understand how the data is structured and sourced, making it easier to evaluate and audit for specific analytical use cases.

#### Working with Locations Inside Other Locations

Introduction

What do you do when a single piece of land can be rightfully claimed by two different locations? It's not that unusual for one place to be two places. For example, the Claire's in your nearby mall. Is that in the Claire's? Is that in the mall? Really, it's in both. SafeGraph addresses this problem using the concept of parent and child locations. Some overarching location that contains many other locations within it is a parent. Common types of parents include malls, airports, and shopping centers.
You can see the full list of location types that can be parents, and some more information about parent/child relationships, here. The location that's inside of some sort of parent is a child.

Dealing with parent and child locations can be an important part of working with SafeGraph data, even if you're not interested in distinguishing them, since if you're not careful you could end up double-counting foot traffic: once for the parent, and once for the child.

So some important questions:

1. How can we look in our data and figure out where we have parent and child locations?
2. How can we work with data that has parent and child locations?
3. How can we think about the spatial orientation of parent and child locations?

Let's start by opening up our data and loading it in.

| | customer_placekey | customer_parent_placekey | customer_location_name | placekey |
|---|---|---|---|---|
| 1 | zzw-222@64h-vr7-ysq | 222-226@64h-vr7-ysq | Sharps Barbershop | 222-226@64h-vr7-ysq |
| 2 | zzy-222@64h-vr7-ysq | 222-226@64h-vr7-ysq | Cajun Seafood & Wings | 222-226@64h-vr7-ysq |
| 3 | 222-226@64h-vr7-ysq | NaN | Pleasant Valley Marketplace | 222-226@64h-vr7-ysq |
| 4 | zzw-223@64h-vr7-ysq | 222-226@64h-vr7-ysq | Mamma Mia Pizzeria | 222-226@64h-vr7-ysq |
| 5 | 22p-222@64h-vr7-y35 | 222-226@64h-vr7-ysq | Kingdom World Outreach Center | 222-226@64h-vr7-ysq |
| 6 | 22f-222@64h-vr7-yvz | 222-226@64h-vr7-ysq | Saigon 1 | 222-226@64h-vr7-ysq |
| 7 | 22g-222@64h-vr7-yvz | 222-226@64h-vr7-ysq | DHL | 222-226@64h-vr7-ysq |
| 8 | zzw-224@64h-vr7-y35 | 222-226@64h-vr7-ysq | Sally's Bakery & Grocery | 222-226@64h-vr7-ysq |
| 9 | 22r-222@64h-vr7-9j9 | 222-226@64h-vr7-ysq | Smoke Shack | 222-226@64h-vr7-ysq |

This sample contains a set of customer_ columns of the child data used to match, and then another set of columns containing the match data, i.e. the parent. Store data that was acquired in a different way, for example getting all POIs in a certain city, might be structured slightly differently, so the next steps might not be necessary for you. So, question 1 as above: 1.
How can we look in our data and figure out where we have parent and child locations?

We've got the data. We separated out all the match data, but for a given set of POIs that contains both parents and children, we can figure out where we have parent and child locations by looking at the parent_placekey column. This column is missing for any location without a parent. For child locations, it tells you which location is the parent. A child is any row with a non-missing parent_placekey, and a parent is any row whose placekey appears as the parent_placekey of some other row. Let's find this in our data here:

| | customer_placekey | customer_parent_placekey | customer_location_name | placekey |
|---|---|---|---|---|
| 1 | zzw-222@64h-vr7-ysq | 222-226@64h-vr7-ysq | Sharps Barbershop | 222-226@64h-vr7-ysq |
| 2 | zzy-222@64h-vr7-ysq | 222-226@64h-vr7-ysq | Cajun Seafood & Wings | 222-226@64h-vr7-ysq |
| 3 | 222-226@64h-vr7-ysq | NaN | Pleasant Valley Marketplace | 222-226@64h-vr7-ysq |
| 4 | zzw-223@64h-vr7-ysq | 222-226@64h-vr7-ysq | Mamma Mia Pizzeria | 222-226@64h-vr7-ysq |
| 5 | 22p-222@64h-vr7-y35 | 222-226@64h-vr7-ysq | Kingdom World Outreach Center | 222-226@64h-vr7-ysq |
| 6 | 22f-222@64h-vr7-yvz | 222-226@64h-vr7-ysq | Saigon 1 | 222-226@64h-vr7-ysq |
| 7 | 22g-222@64h-vr7-yvz | 222-226@64h-vr7-ysq | DHL | 222-226@64h-vr7-ysq |
| 8 | zzw-224@64h-vr7-y35 | 222-226@64h-vr7-ysq | Sally's Bakery & Grocery | 222-226@64h-vr7-ysq |
| 9 | 22r-222@64h-vr7-9j9 | 222-226@64h-vr7-ysq | Smoke Shack | 222-226@64h-vr7-ysq |

Now it just so happens that the data we've taken for this demonstration includes exactly one parent: the Pleasant Valley Marketplace in Virginia Beach, and all of its children. With our data loaded, and the parents and children identified, we can move on to our next question:

2. How can we work with data that has parent and child locations?

This requires us to think about what kind of parent/child relationship we have. The most important distinction is whether or not the child is enclosed within its parent.
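Stepping back to question 1 for a moment, the parent/child identification rule can be sketched in a few lines. This is a dependency-free illustration, not the code used to produce the tables in this post: it assumes each POI is a plain dictionary with `placekey` and `parent_placekey` keys, and the helper names `is_missing` and `split_parents_children` are hypothetical.

```python
def is_missing(value):
    """True for None, empty strings, and NaN (NaN is the only value unequal to itself)."""
    return value is None or value == "" or value != value


def split_parents_children(rows):
    """Return (parents, children) from a list of POI dictionaries."""
    # Children: rows with a non-missing parent_placekey.
    children = [r for r in rows if not is_missing(r.get("parent_placekey"))]
    # Parents: rows whose placekey is named as a parent by some other row.
    parent_keys = {r["parent_placekey"] for r in children}
    parents = [r for r in rows if r["placekey"] in parent_keys]
    return parents, children


# Three rows copied from the sample table (column names without the customer_ prefix).
rows = [
    {"placekey": "zzw-222@64h-vr7-ysq", "parent_placekey": "222-226@64h-vr7-ysq",
     "location_name": "Sharps Barbershop"},
    {"placekey": "222-226@64h-vr7-ysq", "parent_placekey": float("nan"),
     "location_name": "Pleasant Valley Marketplace"},
    {"placekey": "22g-222@64h-vr7-yvz", "parent_placekey": "222-226@64h-vr7-ysq",
     "location_name": "DHL"},
]

parents, children = split_parents_children(rows)
print([p["location_name"] for p in parents])
print([c["location_name"] for c in children])
```

The NaN check matters because the tables here show missing parents as NaN, which a plain truthiness test would treat as present.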
The enclosed column tells you whether a child is enclosed within its parent. This is something like that Claire's in the mall. It's really inside that mall, as opposed to a burger joint in an outdoor strip mall, which is in that strip mall but maybe also sort of its own space. When a location like Claire's is enclosed, it can be difficult to tell the difference between a device being in Claire's and being near Claire's inside the mall. The way we deal with parent and child locations differs considerably based on whether the children are enclosed or not. In the case of Pleasant Valley Marketplace, the children are enclosed (enclosed == True). Children that are enclosed (enclosed == True) are basically not distinguished from their parents. They do not have their own separate foot traffic data. Sometimes they have their own polygons, but sometimes they're just a part of the parent polygon, although they may have their own latitude/longitude data. Children that are not enclosed (enclosed == False) have parents but also act as independent locations. We track visitor data like visits_per_day separately for those locations, and they have their own polygon data in the polygon_wkt column. The Pleasant Valley Marketplace is full of enclosed children, so we'll talk about how to handle that first. Working with Enclosed Children How can we handle data from enclosed children? Well, we can ignore the children's foot traffic data, since it doesn't really have any. We can take the parent foot traffic data we see, and that covers the entire region. 
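One way to act on the enclosed/non-enclosed distinction is to split the children into two groups before doing anything with foot traffic. The sketch below is illustrative only: it assumes plain dictionaries with `parent_placekey` (None when absent) and a boolean `enclosed` key, and the function name `partition_children` is hypothetical.

```python
def partition_children(rows):
    """Split child rows into (enclosed, independent) lists; parents are skipped."""
    enclosed, independent = [], []
    for row in rows:
        if row.get("parent_placekey") is None:
            continue  # no parent, so this row is not a child
        if row["enclosed"]:
            enclosed.append(row)      # no separate foot traffic: rely on the parent
        else:
            independent.append(row)   # has its own foot traffic and polygon
    return enclosed, independent


# Rows named in this post; the enclosed flags follow what the text reports.
rows = [
    {"location_name": "Pleasant Valley Marketplace", "parent_placekey": None, "enclosed": False},
    {"location_name": "Sharps Barbershop", "parent_placekey": "222-226@64h-vr7-ysq", "enclosed": True},
    {"location_name": "Dollar General", "parent_placekey": "222-224@665-8rv-vs5", "enclosed": False},
]

enclosed, independent = partition_children(rows)
print([r["location_name"] for r in enclosed])
print([r["location_name"] for r in independent])
```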
| | customer_placekey | customer_parent_placekey | customer_location_name | placekey |
|---|---|---|---|---|
| 1 | zzw-222@64h-vr7-ysq | 222-226@64h-vr7-ysq | Sharps Barbershop | 222-226@64h-vr7-ysq |
| 2 | zzy-222@64h-vr7-ysq | 222-226@64h-vr7-ysq | Cajun Seafood & Wings | 222-226@64h-vr7-ysq |
| 3 | 222-226@64h-vr7-ysq | NaN | Pleasant Valley Marketplace | 222-226@64h-vr7-ysq |
| 4 | zzw-223@64h-vr7-ysq | 222-226@64h-vr7-ysq | Mamma Mia Pizzeria | 222-226@64h-vr7-ysq |
| 5 | 22p-222@64h-vr7-y35 | 222-226@64h-vr7-ysq | Kingdom World Outreach Center | 222-226@64h-vr7-ysq |
| 6 | 22f-222@64h-vr7-yvz | 222-226@64h-vr7-ysq | Saigon 1 | 222-226@64h-vr7-ysq |
| 7 | 22g-222@64h-vr7-yvz | 222-226@64h-vr7-ysq | DHL | 222-226@64h-vr7-ysq |
| 8 | zzw-224@64h-vr7-y35 | 222-226@64h-vr7-ysq | Sally's Bakery & Grocery | 222-226@64h-vr7-ysq |
| 9 | 22r-222@64h-vr7-9j9 | 222-226@64h-vr7-ysq | Smoke Shack | 222-226@64h-vr7-ysq |

However, while we don't have traffic data for the children, we do have plenty of other information about them from the core information columns.

| | placekey | parent_placekey | location_name |
|---|---|---|---|
| 0 | zzw-223@64h-vr7-y35 | 222-226@64h-vr7-ysq | Western Union |
| 1 | zzw-222@64h-vr7-ysq | 222-226@64h-vr7-ysq | Sharps Barbershop |
| 2 | zzy-222@64h-vr7-ysq | 222-226@64h-vr7-ysq | Cajun Seafood & Wings |
| 4 | zzw-223@64h-vr7-ysq | 222-226@64h-vr7-ysq | Mamma Mia Pizzeria |
| 5 | 22p-222@64h-vr7-y35 | 222-226@64h-vr7-ysq | Kingdom World Outreach Center |
| 6 | 22f-222@64h-vr7-yvz | 222-226@64h-vr7-ysq | Saigon 1 |
| 7 | 22g-222@64h-vr7-yvz | 222-226@64h-vr7-ysq | DHL |
| 8 | zzw-224@64h-vr7-y35 | 222-226@64h-vr7-ysq | Sally's Bakery & Grocery |
| 9 | 22r-222@64h-vr7-9j9 | 222-226@64h-vr7-ysq | Smoke Shack |
| 10 | zzw-222@64h-vr7-y35 | 222-226@64h-vr7-ysq | Dolphin Laundromat |
| 11 | zzy-222@64h-vr7-yvz | 222-226@64h-vr7-ysq | State Farm |
| 12 | 222-222@64h-vr7-ysq | 222-226@64h-vr7-ysq | Krossroads Cafe and Tavern |
| 13 | 228-222@64h-vr7-yvz | 222-226@64h-vr7-ysq | Allstate Insurance |
| 14 | 22b-222@64h-vr7-ysq | 222-226@64h-vr7-ysq | Iglesia Cristiana Rios de Agua Viva de Virginia Beach |
| 15 | zzw-222@64h-vr7-yvz | 222-226@64h-vr7-ysq | Family Dollar Stores |
| 16 | 222-224@64h-vr7-ysq | 222-226@64h-vr7-ysq | Tung Hoi Chinese Restaurant |
| 17 | zzw-226@64h-vr7-y35 | 222-226@64h-vr7-ysq | Fj Beauty Studios |
| 18 | 22s-222@64h-vr7-y35 | 222-226@64h-vr7-ysq | Adamo's New York Pizzeria |
| 19 | 222-223@64h-vr7-ysq | 222-226@64h-vr7-ysq | Food Lion |
| 20 | 22f-222@64h-vr7-ysq | 222-226@64h-vr7-ysq | Tokyo Express |

For example, maybe we're interested in the kinds of businesses and locations that are inside the shopping center.

| top_category | sub_category | Number |
|---|---|---|
| Activities Related to Credit Intermediation | Other Activities Related to Credit Intermediation | 1 |
| Agencies, Brokerages, and Other Insurance Related Activities | Insurance Agencies and Brokerages | 2 |
| Bakeries and Tortilla Manufacturing | Retail Bakeries | 1 |
| Couriers and Express Delivery Services | Couriers and Express Delivery Services | 1 |
| Drycleaning and Laundry Services | Drycleaning and Laundry Services (except Coin-Operated) | 1 |
| General Merchandise Stores, including Warehouse Clubs and Supercenters | All Other General Merchandise Stores | 1 |
| Grocery Stores | Supermarkets and Other Grocery (except Convenience) Stores | 1 |
| Other Miscellaneous Store Retailers | Tobacco Stores | 1 |
| Personal Care Services | Barber Shops | 1 |
| Personal Care Services | Beauty Salons | 1 |
| Religious Organizations | Religious Organizations | 2 |
| Restaurants and Other Eating Places | Full-Service Restaurants | 7 |

3. How can we think about the spatial orientation of parent and child locations?

In the case of enclosed-children data where the children have their own polygons, you can handle spatial orientation as normal. Simply look at the polygons! But what if they don't, as in this data? At this point, you're stuck with just the parent polygon. But you can still do a little something with the children, because they will have latitude and longitude data that you can work with.

So we'll start by mapping out the parent polygon. The polygon_wkt column is information about the POI's polygon in WKT format. We can add the child locations on as points (see this guide).

This looks like the kind of place where there's a long row of stores down the center, surrounded by parking.
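The map itself is not reproduced here, but the underlying idea (read the parent's `polygon_wkt`, then place child latitude/longitude points against it) can be sketched without any mapping library. This is a simplified illustration under stated assumptions: it handles only a single-ring `POLYGON`, the helper names are hypothetical, and the square and points below are made-up toy values rather than SafeGraph data. A real workflow would more likely use a geospatial library.

```python
import re


def parse_wkt_polygon(wkt):
    """Parse 'POLYGON ((x y, x y, ...))' into a list of (lon, lat) tuples (outer ring only)."""
    match = re.match(r"\s*POLYGON\s*\(\(([^()]*)\)", wkt)
    if not match:
        raise ValueError("not a simple WKT POLYGON")
    ring = []
    for pair in match.group(1).split(","):
        lon, lat = pair.split()
        ring.append((float(lon), float(lat)))
    return ring


def point_in_ring(lon, lat, ring):
    """Ray-casting test: count how many ring edges a ray to the east of the point crosses."""
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        if (y1 > lat) != (y2 > lat):  # edge straddles the point's latitude
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


# Toy parent polygon and two toy child points (not real coordinates).
parent_ring = parse_wkt_polygon("POLYGON ((0 0, 4 0, 4 4, 0 4, 0 0))")
child_inside = point_in_ring(2.0, 2.0, parent_ring)
child_outside = point_in_ring(5.0, 2.0, parent_ring)
print(child_inside, child_outside)
```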
Let's make sure that makes sense. First of all... does the polygon include parking? We can check in the includes_parking_lot column. Yep! That's a parking lot. Knowing that we have a parking lot is important for interpreting foot traffic data-for example, foot traffic to a McDonald's means something very different depending on whether or not we pick up the drive-thru. What is notable is that Pleasant Valley turns out to be an outdoor shopping center, which goes to show that sometimes these kinds of locations can also be "enclosed." Working with Non-Enclosed Children Let's look at another part of the data, picking out a parent location that has non-enclosed children. Unlike the first data set, this one was created by pulling all locations in a certain zip code. This means we can see, and understand how to import, this alternate structure of data. This time we'll be working with the Shoppes at Lac de Ville, which is in Rochester, New York. This is the Placekey ID 222-224@665-8rv-vs5, and so to get both the parent and child data, we can look for that code in either the parent_placekey column or the placekey column. 
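A minimal sketch of that lookup, assuming the POIs are plain dictionaries with `placekey` and `parent_placekey` keys (None when there is no parent). The function name `family_of` is hypothetical, and the third row is a made-up unrelated POI added only to show that it is filtered out.

```python
def family_of(rows, parent_key):
    """Return the parent row plus every row that names it as its parent."""
    return [
        r for r in rows
        if r["placekey"] == parent_key or r.get("parent_placekey") == parent_key
    ]


rows = [
    {"placekey": "222-224@665-8rv-vs5", "parent_placekey": None,
     "location_name": "Shoppes At Lac De Ville"},
    {"placekey": "zzw-222@665-8rv-sdv", "parent_placekey": "222-224@665-8rv-vs5",
     "location_name": "Dollar General"},
    # Hypothetical unrelated POI in the same zip code.
    {"placekey": "example-unrelated-key", "parent_placekey": None,
     "location_name": "Some Other Place"},
]

family = family_of(rows, "222-224@665-8rv-vs5")
print([r["location_name"] for r in family])
```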
placekey parent_placekey location_name bucketed_dwell_times 43 222-224@665-8rv-vs5 NaN Shoppes At Lac De Ville {"240":177} 56 zzw-22b@665-8rv-sdv 222-224@665-8rv-vs5 Allstate Insurance NaN 59 224-222@665-8rv-vmk 222-224@665-8rv-vs5 Parker Robt E III DDS {"240":0} 97 zzw-223@665-8rv-sdv 222-224@665-8rv-vs5 Silk {"240":0} 153 222-225@665-8rv-vs5 222-224@665-8rv-vs5 Citizens Bank {"240":0} 187 222-222@665-8rv-s89 222-224@665-8rv-vs5 Ritz Stacey M Od {"240":5} 224 224-222@665-8rv-vs5 222-224@665-8rv-vs5 Mobile Notary Service NaN 227 229-222@665-8rv-vmk 222-224@665-8rv-vs5 Joseph I Mann MD Greater Rochester Neurology {"240":0} 229 zzw-228@665-8rv-sdv 222-224@665-8rv-vs5 Project Leannation {"240":0} 283 223-222@665-8rv-vj9 222-224@665-8rv-vs5 Visionary Eye Associates {"240":0} 286 222-228@665-8rv-vs5 222-224@665-8rv-vs5 Julian's Dry Cleaners NaN 296 222-224@665-8rv-s89 222-224@665-8rv-vs5 Rochester Eye Associates NaN 301 zzw-227@665-8rv-sdv 222-224@665-8rv-vs5 Mesquite Grill {"240":11} 395 zzw-222@665-8rv-sdv 222-224@665-8rv-vs5 Dollar General {"240":28} 396 222-229@665-8rv-vs5 222-224@665-8rv-vs5 Bolsa Nails {"240":0} 426 zzw-229@665-8rv-sdv 222-224@665-8rv-vs5 Thimble Tailoring & Clothier NaN 439 223-222@665-8rv-x5z 222-224@665-8rv-vs5 M&T Bank {"240":1} 463 222-227@665-8rv-vs5 222-224@665-8rv-vs5 Liberty Wine & Liquor {"240":0} 471 222-223@665-8rv-vs5 222-224@665-8rv-vs5 Feet First Shoes and Pedorthics {"240":0} 514 222-226@665-8rv-vs5 222-224@665-8rv-vs5 CVS {"240":0} 558 222-22c@665-8rv-vs5 222-224@665-8rv-vs5 Boomtown Cafe {"240":4} 580 zzw-222@665-8rv-vj9 222-224@665-8rv-vs5 Evangelisti Reconstructive & Plastic Surgery {"240":0} 618 222-222@665-8rv-vs5 222-224@665-8rv-vs5 Oreck {"240":0} 625 222-223@665-8rv-s89 222-224@665-8rv-vs5 Dupont David OD {"240":0} 779 zzy-223@665-8rv-sdv 222-224@665-8rv-vs5 Amaya Indian Cuisine {"240":18} 785 zzw-223@665-8rv-vj9 222-224@665-8rv-vs5 Stephen Evangelisti {"240":1} 835 222-222@665-8rv-x89 222-224@665-8rv-vs5 MacGregor's Grill 
& Tap {"240":1} 854 zzy-222@665-8rv-sdv 222-224@665-8rv-vs5 Paislee Boutique {"240":0} 857 zzw-222@665-8rv-vpv 222-224@665-8rv-vs5 Tops Friendly Markets {"240":45} 868 226-222@665-8rv-vmk 222-224@665-8rv-vs5 Brighton Towne Dental {"240":26} 889 zzw-225@665-8rv-sdv 222-224@665-8rv-vs5 United States Postal Service (USPS) NaN 900 zzw-224@665-8rv-sdv 222-224@665-8rv-vs5 Rita's Italian Ice {"240":35} 902 zzy-222@665-8rv-vj9 222-224@665-8rv-vs5 CaminoByTheWay {"240":2} And are these actually non-enclosed child locations? Let's make sure. All false! That's what we were expecting. The first thing to be aware of when dealing with non-enclosed data is that unless we're careful, we'll double-count foot traffic. Foot traffic that shows up for a child will also show up for its parent. So we will want to drop one or the other if we're going to be aggregating things up and don't want to double-count. How can we tell if we have double-counting going on? Well, it should be going on any time you have a non-enclosed child location. But it's especially easy to see if the parent location doesn't have any visitors outside of its children, as is the case here. We can add up the daily visits for the parent, and for all the children, and should get the exact same values. parent_visits child_visits 0 25 25.0 1 28 28.0 2 28 28.0 3 54 54.0 4 77 77.0 5 55 55.0 6 54 54.0 7 58 58.0 8 29 29.0 9 30 30.0 10 67 67.0 11 73 73.0 12 61 61.0 13 55 55.0 14 58 58.0 15 33 33.0 16 22 22.0 17 43 43.0 18 83 83.0 19 51 51.0 20 50 50.0 21 66 66.0 22 50 50.0 23 25 25.0 24 56 56.0 25 58 58.0 26 62 62.0 27 53 53.0 28 65 65.0 29 36 36.0 30 21 21.0 They're exactly the same! Clearly if we want to work with foot traffic data, we'll need to only use one or the other. Next we can ask how to deal with the spatial arrangement of our data. This time, we have data where each place of interest has its own polygon in the polygon_wkt column. 
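Before turning to the polygons, the double-counting check described above can be sketched as follows. This is an illustration only: the key `daily_visits` and the function name are hypothetical, the parent values are the first three days from the comparison table, and the split across two children is invented so that the sums line up.

```python
def summed_child_visits(children):
    """Element-wise sum of each child's list of daily visits."""
    return [sum(day) for day in zip(*(c["daily_visits"] for c in children))]


parent = {"location_name": "Parent", "daily_visits": [25, 28, 28]}
children = [
    {"location_name": "Child A", "daily_visits": [10, 20, 8]},   # invented split
    {"location_name": "Child B", "daily_visits": [15, 8, 20]},   # invented split
]

child_total = summed_child_visits(children)

# If the children add up to the parent, the parent's visits are just its children's.
double_counted = child_total == parent["daily_visits"]

# Aggregate with the parent only (or the children only), never both.
area_total = sum(parent["daily_visits"])
naive_total = area_total + sum(child_total)  # twice the true figure

print(child_total, double_counted, area_total, naive_total)
```

With double-counting handled, here are the polygon_wkt values for the first few POIs: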
polygon_wkt 56 POLYGON ((-77.59417363840089 43.119868567405916, -77.59414413410173 43.119943943129186, -77.59391480523095 43.119885208807354, -77.5939416273211 43.11981472754674, -77.59417363840089 43.119868567405916)) 59 POLYGON ((-77.59143161215052 43.120892209242385, -77.5912596878754 43.120816922141515, -77.591356 43.120596, -77.591542 43.120639, -77.59143161215052 43.120892209242385)) 97 POLYGON ((-77.59414341246315 43.11994394312916, -77.59407635723778 43.120070221730195, -77.59401466643044 43.120098609906954, -77.593844346158 43.12004868586318, -77.59391274248787 43.119885208807375, -77.59414341246315 43.11994394312916)) 153 POLYGON ((-77.59200493412783 43.120714745000875, -77.59228924828341 43.12079207746439, -77.59237105565836 43.120632273247, -77.59284446554949 43.120730896725995, -77.59273985939791 43.12101428539153, -77.59253094468227 43.12150846710204, -77.59250412259212 43.12151042486173, -77.5924638894569 43.12157307313886, -77.59240219864955 43.121608312766554, -77.59197036299815 43.12150455158246, -77.5919971850883 43.12144777652045, -77.59196231637111 43.12144777652045, -77.59193549428096 43.12150259382258, -77.59167800221553 43.12143211442511, -77.59185502801051 43.121030771864206, -77.59177456174007 43.121015109662125, -77.59183088812938 43.120928967478925, -77.59188453230968 43.12094854525849, -77.59200493412783 43.120714745000875)) 187 POLYGON ((-77.5918583166116 43.120005479690604, -77.59162764663633 43.119952618856665, -77.59174566383298 43.11967656709296, -77.59197901601728 43.11972747034871, -77.5918583166116 43.120005479690604)) 224 POLYGON ((-77.59200493412783 43.120714745000875, -77.59228924828341 43.12079207746439, -77.59237105565836 43.120632273247, -77.59284446554949 43.120730896725995, -77.59273985939791 43.12101428539153, -77.59253094468227 43.12150846710204, -77.59250412259212 43.12151042486173, -77.5924638894569 43.12157307313886, -77.59240219864955 43.121608312766554, -77.59197036299815 43.12150455158246, -77.5919971850883 
43.12144777652045, -77.59196231637111 43.12144777652045, -77.59193549428096 43.12150259382258, -77.59167800221553 43.12143211442511, -77.59185502801051 43.121030771864206, -77.59177456174007 43.121015109662125, -77.59183088812938 43.120928967478925, -77.59188453230968 43.12094854525849, -77.59200493412783 43.120714745000875)) 227 POLYGON ((-77.59164308521562 43.12040763170624, -77.59147215835709 43.120331919138046, -77.59159 43.120058, -77.591775 43.120101, -77.59164308521562 43.12040763170624)) 229 POLYGON ((-77.59423801141725 43.11974620402106, -77.59417363840089 43.11987052521808, -77.59394430953012 43.11981276973279, -77.59399124818788 43.1196972585986, -77.59423801141725 43.11974620402106)) 283 POLYGON ((-77.591946 43.121896, -77.591944 43.121862, -77.591803 43.121865, -77.591801 43.121798, -77.591748 43.121784, -77.591736 43.121808, -77.591592 43.12177, -77.591574 43.121808, -77.591446 43.121774, -77.591542 43.121581, -77.591813 43.121654, -77.591804 43.121671, -77.591936 43.121668, -77.591937 43.121699, -77.592063 43.121696, -77.592065 43.121762, -77.592282 43.121757, -77.592288 43.121889, -77.591946 43.121896)) Each POI having its own geometry is going to be the case whenever we have non-enclosed children, but keep in mind it will also sometimes be the case with enclosed children. Just be sure to look if it's there! When it comes to polygons in close proximity like this, sometimes we can be certain of how well we have the shape down, and other times we can't. For this we'd want to look at the polygon_class column. Ideally we want this to be an OWNED_POLYGON indicating that we can map the location to a specific polygon. Otherwise, there might be a little uncertainty. What do we have here? We have a single parent location that is an OWNED_POLYGON, as well as 13 children OWNED_POLYGONs. 
In addition, we have 19 children SHARED_POLYGONs, which means that multiple POIs have ended up sharing the same polygon - these POIs can't be distinguished, or they may literally share the same space. More detail here.

As you might expect, the child polygons sit inside of the parent polygon. Using the same methods as before, this time we can actually get the internal structure of the location.

We can see the exact structure taken up by the children. It doesn't fill the whole parent space! And yet, every single visit to the parent was accounted for by a child. What gives?

Well, all that blank space is parking lot. And while the parent location includes the parking lot...

The children don't...

And so what's happening? SafeGraph is willing to count visits to parent POIs that aren't to any children, and there are areas here that are part of the parent but not the children. But in this case we can see that we aren't counting any visits from the parking lot to the parent POI (or perhaps there weren't any, but that seems unlikely). Good to know!

Wrapping Up

So there we have it! Some reminders:

- Parent locations like malls and airports have child locations inside of them
- Locations can include or exclude parking lots
- Children can be enclosed or non-enclosed
- Enclosed children don't get their own foot traffic data
- Non-enclosed children do, and if you're aggregating up you want to drop either the parents or the non-enclosed children or else you'll double-count
- Some enclosed children don't have their own polygons
- Some enclosed children, and all non-enclosed children, should have their own polygons
- The parent polygon can be bigger than the full list of its children, but sometimes this additional area doesn't record visits (sometimes it does, though)

Ready to get started? Schedule a demo with our experts.

#### Your Favorite Local Grocery Store Chain? Never Heard Of It!
InfoGraphics Exploring The Top Chains In Each State

We explored SafeGraph’s point of interest data to discover the top chains for grocery stores, pharmacies, gas stations, & hotels in each state. We were surprised to find that some business categories had just a few dominant players across the US, but in other categories, the landscape was much more fragmented & diverse.

Where does America get its groceries?

Top Grocery Store Chains In Each State In Terms Of The Number of Store Locations

The grocery store chain that’s so ubiquitous in your state? It’s likely your friends on the other side of the country have never heard of it! When looking at the supermarket chain with the most locations in each state, 29 different brands were state leaders.

However, when we look at the data from a parent company perspective, the landscape turns out to be a lot less fragmented. Kroger owns brands like Fry’s, Fred Meyer, Harris Teeter, & Smith’s. Ahold Delhaize, a Dutch retail company, operates Food Lion, Stop & Shop, Giant, & Hannaford. Albertsons owns its namesake and Safeway.

The grocery store market hasn’t always been this fragmented. Back in the 1930s, A&P (Great Atlantic & Pacific Tea Co.) was the leader in the supermarket sector, with nearly 16,000 stores at its peak. Since then, Walmart has emerged as the 800-pound gorilla, taking in about 30% of nationwide grocery revenue.

Former Kroger CEO David B. Dillon explained that maintaining the local branding and store layouts of regional grocery chains serves as a useful differentiator against Walmart’s more uniform look and feel.

Where does America get its prescriptions filled?

Top Pharmacy & Drug Store Chains In Each State In Terms Of The Number of Store Locations

Pharmacy chains are much more consolidated than grocery chains. CVS & Walgreens each lead in 20 states. Rite Aid, the leader in terms of store locations in 7 states, came in a distant third. In 2018, Walgreens purchased almost 1,900 Rite Aid stores, so the pharmacy landscape is more of a duopoly than the map initially lets on.

As a brand, CVS hasn’t always been so dominant. In the 1990s and 2000s, CVS acquired Peoples Drug, Revco, Arbor Drugs, & Eckerd and re-branded all these chains to CVS, which allowed it to become the dominant player it is today.

Where does America fill up their gas tanks?

Top Gas Stations In Each State In Terms Of The Number of Locations

Many of today’s most popular gas station brands can trace their roots back to Standard Oil. Broken up in 1911 under the Sherman Antitrust Act, Standard Oil was split into 34 ‘Baby Standards’.

Standard Oil of California turned into Chevron, Standard Oil of New Jersey turned into Exxon, and Standard Oil of New York turned into Mobil.

Where does America stay when traveling?

Top Hotel Chains In Each State In Terms Of The Number of Unique Hotel Locations

The hotel landscape faces much more consolidation than gas stations and grocery stores. Hampton Inn & Suites is the top hotel chain in terms of the number of locations in 22 states. Originally started as a budget hotel brand by the Holiday Corporation (parent company to Holiday Inn), Hampton Inn came under the ownership of Hilton Worldwide in 1999 through a series of mergers and acquisitions.

Super 8 comes in as the 2nd most popular chain, with a stronghold in 10 Midwestern states. Best Western is a close third, with the most hotels in 9 western states.

Hungry for more cool infographics or data on these points of interest?

Check out the blog post Top 3 Most Popular Fast Food Chains By State. Surprisingly enough, the #1 most popular quick-service restaurant in terms of the number of locations isn’t McDonald’s!

### Pages

#### 2025 Spatial Analysis Trends

Free eBook

Spatial Analysis in 2025: Key Trends

Examine the rapid evolution of the geospatial industry and the latest advances in technology, infrastructure, and regulation.
Download the eBook

What you'll learn in this ebook:

- How AI and machine learning will revolutionize how we work with spatial data.
- How advances in high-res data and imaging will transform hyper-local analyses.
- Regulatory frameworks for data privacy and data governance
- New tools that are democratizing geospatial data and analysis

#### About

Building the Most Accurate Data on the Physical World.

SafeGraph is narrowly focused on building the most accurate global places dataset available, empowering modern builders to create world-class location-based applications and analytics tools.

Our Approach to Building Trusted Data

We Strive to Be the Source of Truth About the Physical World

SafeGraph’s Places dataset contains comprehensive, precise information about global places--updated monthly. We're dedicated to providing the best veracity and coverage so that our partners and their users can depend on our data.

Our Mission Is to Democratize Data

We believe open access to data is crucial to innovation and technological advancement. Our mission is to make high-quality, ethically and transparently sourced physical places data easily accessible to all. Learn more about our vision and values.

A Modern Data Partner

We understand the goals, challenges, and pace of modern business and support our partners in delivering great user experiences and competitive insights. From regular check-ins to sourcing data by request, we make sure you have what you need to build the future.

Process, Technology, Machine Learning, and People

We hire smart people and use advanced engineering to ingest, draw, merge, classify, and verify our data. The outcome of our dedication to hard work and quality is a fresh, clean, accurate dataset. Read our documentation.
Learn About the Latest Data Industry Trends, Best Practices, and Innovations

SafeGraph Blog

Check out the latest in thought leadership, technical deep dives, and knowledge sharing: everything you need to know to create great data-based products.

Read Now

Deliver Better Products With Premium POI Data

Get in Touch

#### Address

Geocoded Address Data Powering Global Location Intelligence

Structured geocoded address data with precise coordinates. Built for analytics, routing, and geospatial applications.

Schedule a Demo

Download Geocoded Address Sample

Challenges with Address Data Coverage and Quality

Global address data is fragmented. Many regions lack structured, geocoded addresses, limiting the accuracy of geospatial analysis and logistics operations.

- Sparse or incomplete address data
- Inconsistent address formatting

SafeGraph addresses these gaps with a structured geocoded address database designed for hard-to-source markets.

Built for Advanced Analysis

Every record is parsed into discrete address fields and paired with precise geographic coordinates, ready for ingestion into your data warehouse.

- Parsed address components (street, city, region, postal code)
- Accurate address data with latitude and longitude
- Structured and verified address records designed for reliable geocoding
- Multi-script support for international addresses
- Flat-file delivery for warehouse ingestion
- Consistent schema across countries

Core Capabilities of the SafeGraph Address Dataset

- Improve Routing and Location Accuracy: Power routing and spatial analysis with precise address coordinates.
- Integrate with Analytics Pipelines: Load address data directly into warehouses for large-scale geospatial analysis.
- Validate Address Coverage at Scale: Identify missing addresses and perform bulk address validation across markets.
- Support Global Address Workflows: Handle international address formats with multi-script support.
Address Data Across Hard-to-Source Markets

SafeGraph’s geocoded address dataset covers 35+ countries, with particular depth in markets where reliable address data is traditionally sparse or fragmented.

Coverage includes:

- Balkan countries, including Bulgaria
- Eastern Europe and Black Sea corridor, including Turkey
- Mediterranean: Albania
- MENA: Morocco
- Other geographies where mainstream address sources have limited coverage

How Organizations Use SafeGraph Address Data

Teams across logistics, mapping, and geospatial analytics use SafeGraph geocoded address data to improve operational accuracy and strengthen geocoding systems.

Logistics and Delivery

SafeGraph address data helps logistics teams:

- Perform bulk address validation
- Identify missing addresses in service zones
- Improve routing precision with verified coordinates
- Validate address coverage before expanding service areas
- Reduce failed deliveries caused by incomplete address data

Pricing Based on Data Scope

Common uses include:

- Expanding candidate datasets for geocoding APIs
- Supporting batch geocoding workflows
- Enriching internal geocoded address databases
- Providing ground-truth coordinates for validation
- Improving search, autocomplete, and routing accuracy in mapping systems

Address Schema Overview

The dataset follows a consistent, flat schema designed for direct ingestion into analytical warehouses. Each field is typed and documented.

| Column Name | Description | Type | Example |
|---|---|---|---|
| primary_number | A JSON string with alphabet as key. Value: A primary numeric identifier for the building. | JSON | { "latin": "4700" } |
| sub_building | A JSON string with alphabet as key. Value: Combined secondary designators (Ste, Unit, Bldg, Block). | JSON | { "latin": "Unit 1105" } |
| building_name | A JSON string with alphabet as key. Value: The name of the building. | JSON | { "latin": "Eaton" } |
| street | A JSON string with alphabet as key. Value: All street components combined. | JSON | { "latin": "Main Street" } |
| intermediate_locality | Additional details associated with locality like subdivision, neighborhood, or village. | JSON | { "latin": "Hohenlimburg" } |
| locality | Subdivision or district within a city. | JSON | { "latin": "Repto Robles" } |
| city | The city of the point of interest. | JSON | { "latin": "Clearwater" } |
| sub_region | Second largest administrative division in a country. | JSON | { "latin": "Tom" } |
| region | State, province, or county of the location. | JSON | { "latin": "Pinellas" } |
| postal_code | The postal code of the location. | JSON | { "latin": "12235" } |
| full_address | The full unparsed address from the source. | JSON | { "latin": "1680 Campbell Ln Bowling Green KY 42104-1062" } |
| iso_country_code | 2-letter ISO country code. | String | US |
| latitude | Latitude coordinate of the address. | Float | 36.714767 |
| longitude | Longitude coordinate of the address. | Float | 121.662912 |

View Geocoded Address Schema

Address Data Designed for Modern Data Infrastructure

SafeGraph address data integrates easily with modern analytics and geospatial infrastructure.

Learn More About Bulk Data Delivery

FAQs

What is a geocoded address dataset?

A geocoded address dataset contains structured address records paired with geographic coordinates such as latitude and longitude.

How is this different from a geocoding API?

Geocoding APIs resolve addresses one at a time through a live request. That works for individual lookups but can become expensive at scale. SafeGraph delivers a flat file you load directly into your warehouse, so you can run bulk address validation and analysis without rate limits or live dependencies.

Can the dataset support batch geocoding?

SafeGraph’s flat file format makes it well-suited for batch workflows. You can validate large address lists, fill in missing coordinates, and run coverage analysis across entire service areas directly from your warehouse, without managing API calls or hitting rate limits.

Does the dataset include latitude and longitude?
Yes, the SafeGraph Address dataset includes latitude and longitude for every record. Unlike a standard geocoding API that processes lookups one by one, this dataset provides these coordinates in a structured flat-file format.

How is the dataset delivered?

SafeGraph delivers its geocoded address data as a flat file (typically CSV or Parquet) designed for direct ingestion into modern data infrastructure.

Have Questions About Our Address Dataset?

Download Geocoded Address Sample

Precise, Verified Address Data for Better Location Intelligence

#### Attributes

Rich POI attributes for better products and insights

Stop cleaning data and start building. Access market-leading POI accuracy to power your product or model.

Schedule a Demo

Places data attributes that add clarity and context

Quality is your priority, so it's ours too.
We rigorously verify our data to power your products with the highest quality ingredients.

- Placekey: Universal placekeys are unique IDs that allow Places data to be easily joined with other datasets.
- Geographic Coordinates: Exact latitude and longitude for each POI location.
- Industries & Categories: Granular tags that provide context for detailed POI data.
- Opened/Closed Dates: Track business lifecycle data for accurate timelines.
- Brands: Brand names and IDs for precise chain-level analysis.
- Store IDs: Store-level identifiers for clean joins and data enrichment.
- Polygons: Precise POI footprints as part of geospatial places attributes.
- ...and more: Open hours, phone numbers, website URLs, and other location dataset attributes. Address details, and more: create a custom dataset to suit your unique application.

Explore all of our attributes

Clear pricing. No surprises.

Simple, predictable pricing without open-ended usage costs.

Flexible monthly delivery

Receive a clean, ready-to-use file every month, designed for easy integration into your workflows.

Schedule a demo

Pricing based on data scope

Select the number of locations and the detailed POI data fields you need, and scale as your use case evolves.

Learn more about our pricing

Build better products with premium POI data

Explore datasets

Get in touch

#### Blog

URL: https://www.safegraph.com/blog/

#### Careers

We’re Building a World-Class Team, Fully Remote

We're a lean, high-ownership team building the highest-quality data on physical places. Every person here makes a real impact.

Don’t see your perfect fit below? Email us.

Open Positions

Senior Software Engineer - Machine Learning (Full-time)

You’ll be a generalist responsible for building and running large-scale data, machine learning, and agentic systems. The focus is operational ML/AI, including agentic systems and geospatial data pipelines. You should be comfortable owning the full lifecycle: from data ingestion and distributed processing to model development, deployment, and monitoring. This role requires the ability to iterate quickly from initial concept to a robust, production-ready solution.

Apply Now

Full-Stack Developers (Full-time)

Due to growing workload, we are looking for experienced and talented Full-Stack Developers to join our fast-paced Engineering team. You will work closely with Product, Design and Marketing to analyze, develop, debug, test, roll out and support new and existing product features.

Apply Now

Vision and Values

- Do Fewer Things but Be Great at Them
- Judgement Is the X-Factor
- We Are the Enablers, Not the Solvers
- Respect Our Own Time
- Get Leverage
- Respect Others’ Time
- Don’t Be a Bottleneck
- Focus on Growth

Thrive in a 100% Remote Environment

Creating community in a remote-friendly environment is a challenge a lot of companies face. Here at SafeGraph, we know that community is important in all aspects of life -- including work.
We host a variety of remote friendly team events that focus on team-bonding to help make that transition to a new team or new city easier. Supporting You at Every Stage of Your Career Healthcare Benefits We care about your health! We offer a variety of medical healthcare plans, as well as dental and vision options. Work-From-Home Friendly Have total autonomy to work from home or anywhere in North America. Vacation Flexibility Take that vacation you've always wanted—when you want—with our flexible vacation days. Personal & Professional Development Grow faster here. Take a course or attend a conference that will help you grow professionally, on us. AI-Augmented Team We build with AI, not just talk about it. From internal agents to daily use of tools, we work smarter so a small team can move like a big one. Retirement Have more control of your future with our 401k retirement benefits. #### CARTO SafeGraph Integration Partner Access SafeGraph Data Directly in CARTO Skip the two-step process - access SafeGraph data directly through CARTO via CARTO’s Spatial Data Catalog. Get Started Power Spatial Analysis CARTO users can access SafeGraph data directly through CARTO’s Spatial Data Catalog for precise spatial analysis. Market-leading ANALYSES Enrich CARTO’s spatial analytics seamlessly with SafeGraph data Leverage SafeGraph Places for accurate point of interest data. Gain detailed context around location intelligence that can inform site selection, market analysis, consumer behavior, community planning, and more. CARTO and SafeGraph on increasing consumer demand SafeGraph and CARTO experts team up to help OOH advertisers with campaign planning and attribution using spatial analytics and places data. Watch the webinar POI data that keeps up with the changing world Work with us to build a custom POI dataset that will drive your tools and business forward. 
Explore places

Get in touch

#### CCPA

Privacy Policy Addendum for California Residents: California Privacy Right

Effective: January 1, 2024

NOTICE TO CALIFORNIA RESIDENTS [CONSUMERS] - CALIFORNIA CONSUMER PRIVACY PROTECTION ACT

The California Consumer Privacy Act of 2018 (“CCPA”) provides certain rights to residents of California. To the extent that SafeGraph is subject to the CCPA, this section of the Privacy Policy articulates the policies for natural persons who are residents of California (“California Consumer”) and whose Information SafeGraph may collect. This Addendum supplements the information in the Privacy Policy. However, this Addendum is intended solely for, and is applicable only as to, California Consumers. If you are not a California Consumer (or a resident of California), this does not apply to you and you should not rely on it. The term “Information” as used in this Addendum has the same meaning as “personal information” under the CCPA.

In the below tables and sections, we describe (as required by the CCPA):

- Notice at Collection – the categories of Information that we collect directly from you, the purpose of the collection, the categories of sources we collect it from, and the period for which we retain that Information.
- The Information Collected, Sold, Shared, or Disclosed in the past twelve months.
- Your California Privacy Rights and Choices – what rights you have under the CCPA, for instance, to request that we “opt out” your information from our marketing database (also called “do not sell” rights), or to request categories and personal information that we may have collected about you.

| Category | Sold or Shared |
|---|---|
| Identifiers. E.g., IP addresses; the names, addresses, phone numbers, etc. of our customers and prospective customers, business partners or employment candidates. | No. |
| Personal information categories listed in the California Customer Records statute (Cal. Civ. Code § 1798.80(e)). | No. |
| Characteristics of protected classifications under California or federal law (e.g., gender of potential customers in marketing surveys). | No. |
| Commercial information, including records of personal property, products or services purchased, obtained, or considered, or other purchasing or consuming histories or tendencies. | No. |
| Internet or other electronic network activity information. | No. |
| Professional or employment-related information. | No. |

Purposes: We collect this Information for the following purposes:

1. Marketing, analytics, and other research, for example:
   - to respond to communications and customer service inquiries;
   - to market to you and provide you information about new products and events;
   - for our internal and operational purposes, such as to consider or make internal service improvements or quality checking, or for our own sales and marketing purposes.
2. Processing job applications and employee management.
3. Other internal purposes, for example: auditing, detecting security incidents, debugging, short-term and transient use, quality control, and legal compliance.

We use Site Data, from other “business to business” interactions (such as at trade shows) or from data compilers for the above, as well as for our own marketing, business and employee recruitment purposes.

Retention: We retain your Information only for as long as is reasonably necessary to fulfill the purpose for which it was collected. However, if necessary, we may retain your Information for longer periods of time, until set retention periods and deadlines expire, for instance where we are required to do so in accordance with legal, tax and/or accounting requirements set by a legislature, regulator or other government authority.

2. Notice of Collection, Sale, Sharing, and Disclosure of Information in Past 12 Months

During the past twelve months we have collected the following Information for the following business and commercial purposes from the following categories of sources:

#### ChainXY vs SafeGraph | Business Listing & POI Data Providers

ChainXY vs SafeGraph

Which business listing & POI data provider should you go with - ChainXY or SafeGraph? See a comparison of data quality, freshness & cost to make the best choice.

Get in Touch

SAFEGRAPH CHAINXY Update frequency Monthly Quarterly Price $$ $ Major brand POIs Instant download Independent stores Non-commercial places Precise geocodes

See the full picture - not just major brands

Make sure you’re working with the most comprehensive POI data out there, not just coverage of major brands. SafeGraph curates data for all the places you need to build a quality product, including EV charging stations, parks, independent stores, and more.

Stop worrying about data quality & coverage

The world changes fast, and POI data can quickly become stale. At best, ChainXY updates their major brand POIs every quarter, but some chains and geographies are even more out-of-date. At SafeGraph, we update our places database monthly to make sure you have the freshest, most accurate data to embed in your products.

Trusted by Leading Innovators

“When the dataset is good, we can trust it to make business decisions. Working with SafeGraph data has made our jobs much easier in this regard.” Julian Adams, Director of Data Science, Avison Young

“We pored through spreadsheets to isolate categories and look for issues in the data. And SafeGraph was the clear winner. There was just so much weird, junky stuff in the other datasets, it just didn’t pass basic data quality.
So kudos to SafeGraph for a solid product.” Nic Babb, VP of Product, Adomni

“From the beginning of our data sourcing process, SafeGraph provided the most comprehensive and actionable POI dataset. Their coverage of the top 1,000 restaurants is unmatched and invaluable.” Ben Anderson, Senior Manager of Market, Customer, and Competitive Intelligence, Sysco

Talk to one of our data experts.

#### Consumer Data Access Request Form

Access information about the categories and specific pieces of personal data associated with you.

California residents also have the right to request that we disclose what categories of your personal information we collect, use, or sell and to correct or obtain copies of non-sensitive data that we have collected about a consumer, if any. As a California resident, you may request access to the specific pieces of personal information that we have collected from you.

To request to know what categories of your personal information we have collected, please complete this form.
We may request additional information to confirm your identity and process your request. We will only use any information submitted to process your request.

Requests must be made individually. Inputting your email in the form below allows us to confirm your identity and to communicate with you.

Request Access to Your Personal Data

#### CoreLogic SafeGraph Partnership

SafeGraph Integration Partner

Mitigate complications in matching various sources of address & property data with CoreLogic and Placekey

CoreLogic and Placekey’s goal is to solve the data-joining problem most organizations face today by connecting unique location identifiers in one place.

Get Started

Receive a corresponding CoreLogic Integrated Property Number, or CLIP®, with each Placekey via the free Placekey API

Geospatial data usage is artificially constrained by the laborious process of matching and combining datasets. With this integration, you can now easily share and access datasets across organizations by seamlessly merging Placekey and CLIP® to a unique piece of real estate.

CLIP® is the master key that unlocks and connects any property database

Use CLIP® to connect property data from CoreLogic’s databases to your proprietary data and internal portfolios, as well as to third-party data sources. Clients can gain detailed clarity on individual properties and take portfolio analysis to new levels by integrating multiple, previously disconnected datasets.

Placekey is a free and open universal standard identifier for any physical place

Placekey creates a common industry standard for identifying any physical place. Placekey users seek geospatial data that can be easily joined to other datasets because they know real answers come from combining data from many different sources. Placekey stems from the philosophy that data should be easy to use and access, not hoarded.
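
As a rough illustration of the data-joining idea described above, the sketch below links a POI table keyed by Placekey to a property table keyed by CLIP through a crosswalk. It does not call the Placekey API or any CoreLogic service. The crosswalk dictionary, the field names, and every identifier value are made-up placeholders chosen for the example, and the function name `join_on_crosswalk` is hypothetical.

```python
from typing import Dict, List

# Hypothetical crosswalk from Placekey to CLIP.
# All identifier values here are invented placeholders, not real IDs.
PLACEKEY_TO_CLIP: Dict[str, str] = {
    "aaa-bbb@ccc-ddd-eee": "1000000001",
    "fff-ggg@hhh-iii-jjj": "1000000002",
}

# Hypothetical POI rows keyed by Placekey.
POI_ROWS: List[dict] = [
    {"placekey": "aaa-bbb@ccc-ddd-eee", "name": "Example Cafe"},
    {"placekey": "fff-ggg@hhh-iii-jjj", "name": "Example Pharmacy"},
    {"placekey": "zzz-zzz@zzz-zzz-zzz", "name": "Unmatched Place"},
]

# Hypothetical property rows keyed by CLIP.
PROPERTY_ROWS: List[dict] = [
    {"clip": "1000000001", "year_built": 1998},
    {"clip": "1000000002", "year_built": 2011},
]


def join_on_crosswalk(
    pois: List[dict], properties: List[dict], crosswalk: Dict[str, str]
) -> List[dict]:
    """Inner-join POI rows to property rows via a Placekey -> CLIP crosswalk."""
    # Index the property table by CLIP for constant-time lookups.
    by_clip = {row["clip"]: row for row in properties}
    joined = []
    for poi in pois:
        clip = crosswalk.get(poi["placekey"])
        prop = by_clip.get(clip) if clip is not None else None
        if prop is None:
            # No crosswalk entry or no property record: skip (inner join).
            continue
        joined.append({**poi, **prop})
    return joined


JOINED = join_on_crosswalk(POI_ROWS, PROPERTY_ROWS, PLACEKEY_TO_CLIP)
for row in JOINED:
    print(row["placekey"], row["clip"], row["name"], row["year_built"])
```

The point of the sketch is that once both datasets carry a shared identifier, combining them reduces to a key lookup rather than fuzzy address matching. Rows without a crosswalk entry are simply dropped here; a real pipeline might keep them for review instead.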
#### Data for Innovators

Actionable Location Intelligence at Your Fingertips

SafeGraph delivers market-leading POI data accuracy so you can focus on building what's next.

Schedule a Demo

A Trusted Source of Truth for Global POIs

Improve Address Accuracy at Scale
Reliable, geocoded address data helps you validate, standardize, and expand coverage across global markets. Learn More

Create Devoted Users
Clean, accurate, comprehensive POI data delivers consistently excellent product experiences that users will rely on and rave about. Learn More

Lead a Happy Productive Team
Data is rigorously curated, tagged, categorized, and delivered ready to use so your team can focus on product building, not data processing. Learn More

Trusted by Leading Innovators

“The time spent munging the information is time lost to providing real value to our clients. With SafeGraph, however, we’ve not only improved the efficiency and effectiveness of our analysis but also have been able to increase our speed to value - now our analysts can answer our clients’ questions and deliver actionable insights faster than ever before.” Julian Adams, Director of Data Science at Avison Young

Develop Apps and Platforms Users Love
Build quickly without sacrificing functionality or performance. Always-fresh, accurate, attribute-rich Places data supports product innovation with location data at every stage, from MVP to scale. See RADAR in Action

Fuel Insight Tools That Give You an Edge
Powerful analysis platforms are only as relevant as the data they use. SafeGraph’s Places data helps teams turn geospatial data into insights that support better products, smarter decisions, and long-term innovation. Explore Use Cases

Work With a Modern Data Partner
We’re focused on sourcing high-quality POI data so startups and enterprises can innovate with location data without managing complex data pipelines.
Learn About Our Partnerships

POI Data Built for a Changing Physical World

Explore Datasets

Get in Touch

#### Do Not Sell My Info

Do Not Sell Request

Consumer Opt-Out / Do-Not-Sell Request

EFFECTIVE: JANUARY 1, 2023

California residents may opt-out of the “sale” of their personal information. California law broadly defines what constitutes a “sale” – including in the definition making available a wide variety of information in exchange for “valuable consideration.”

To request an opt-out, please complete the form below. The information you supply on this form will only be used to process your request.

PLEASE INPUT YOUR EMAIL HERE

#### EPC 2023

ESRI PARTNER CONFERENCE

Join us at Esri Partner Conference 2023

Join SafeGraph along with hundreds of Esri partners and professionals from around the globe to learn, reconnect, and discover the latest advances in geographic information system (GIS) technology.

Where you’ll find us

Join SafeGraph in Palm Springs from March 4-6 to learn about the latest innovations in geospatial data.

Access POI and building footprint datasets in Esri ArcGIS

SafeGraph's global places dataset enriches maps and analytics with location intelligence and relevant context. Users can easily identify brand relationships, understand spatial hierarchy, and model consumer interactions with specific locations.

Plan on attending? Let us know.

#### Esri User Conference 2023

ESRI USER CONFERENCE 2023

Join us at Esri User Conference

Join SafeGraph at the world’s largest GIS conference and find innovation, inspiration, and motivation as you explore all things GIS with technical experts, peers, and exhibitors.

#### Evaluation Terms of Service

PLEASE READ THESE TERMS OF SERVICE CAREFULLY. THIS IS A BINDING CONTRACT.

Introduction; Agreement to these Terms of Service. SafeGraph, Inc., a Delaware corporation, (“Licensor”) has compiled anonymized information and is willing to make available the data set described herein (the “Evaluation Data”) to permit prospective customers to evaluate and test the Evaluation Data in accordance with the terms and conditions of this Limited Data Evaluation License Agreement (“License” or “Agreement”). Other products and services may be offered by Licensor subject to separate terms.
The prospective customer (“Licensee”) wishes to evaluate and test the Evaluation Data in connection with Licensee’s products or services in accordance with the terms and conditions of this Agreement.

BY CLICKING AGREE WHERE THE OPTION IS PRESENTED, YOU REPRESENT AND WARRANT THAT: (I) YOU ARE AN AUTHORIZED REPRESENTATIVE OF LICENSEE WITH FULL LEGAL AUTHORITY TO ACCEPT AND BIND LICENSEE TO THE LICENSE; (II) LICENSEE HAS READ AND UNDERSTANDS THE LICENSE; AND (III) LICENSEE AGREES TO THE LICENSE. EACH TIME LICENSEE ACCESSES OR USES THE EVALUATION DATA, LICENSEE ACCEPTS THIS LICENSE. IF YOU LACK THE LEGAL AUTHORITY TO BIND LICENSEE, PLEASE DO NOT CLICK AGREE OR USE THE DATA. IF LICENSEE DOES NOT AGREE TO THE LICENSE, LICENSOR IS UNWILLING TO GRANT LICENSEE THE RIGHT TO USE THE EVALUATION DATA, AND LICENSEE MUST CEASE USE OF THE EVALUATION DATA IMMEDIATELY. Upon agreement by Licensee by and through its authorized representative, in consideration of the mutual promises, agreements, and conditions stated herein, the License constitutes a binding legal agreement between Licensee and Licensor (individually, a “Party;” collectively, the “Parties”).

Limited License. Subject to the terms and conditions of this Agreement, Licensor hereby grants Licensee a temporary, limited, royalty-free, non-exclusive, non-transferable, non-sublicensable, revocable, beta-test license to the Evaluation Data during the Trial Period solely for the purpose of internal evaluation in a test environment and otherwise in accordance with the terms and conditions of this Agreement. The Evaluation Data shall consist of a random sample of point of interest attributes across geographies drawn from Licensor’s Places and Geometry products at Licensor’s sole discretion. The Evaluation Data is provided for internal review only and Licensee may not itself use, nor authorize another to use, the Evaluation Data for any commercial, resale, distribution or other purpose.
For further clarity, Licensee shall not, nor enable any third party to: (i) sell, rent, lease, sublicense, distribute, transfer or otherwise provide the Evaluation Data or any portions or copies thereof to any third party; (ii) use the Evaluation Data to create or host any commercially available mailing list, point of interest database or business listings database; (iii) use the Evaluation Data in any manner or for any purpose that infringes, misappropriates, or otherwise violates any intellectual property right or other right of any person, or that violates any applicable law; or (iv) use the Evaluation Data to attempt to identify behavior of a known individual for any reason. In addition, other than expressly authorized herein, Licensee shall not copy, adapt, translate, reverse engineer, or create derivative works therefrom. LICENSEE AGREES AND UNDERSTANDS THAT IT IS NOT AUTHORIZED TO DISTRIBUTE OR OTHERWISE USE THE EVALUATION DATA.

Further Obligations. If Licensee creates any written reports or generates performance analytics with respect to the Evaluation Data (“Test Analytics”) such Test Analytics must be restricted under these terms of license and kept confidential in accordance with Section 5 below. Licensee agrees to provide a summary of all such Test Analytics to Licensor promptly upon the end of the Trial Period. Licensee agrees that it is responsible for any acts or omissions of its agents or permitted subcontractors that access or use any of the Evaluation Data and Licensee will ensure that such agents and permitted subcontractors comply with the terms of this Agreement.

Ownership. Licensor shall own and retain all right, title and interest in and to the Evaluation Data, together with all intellectual property rights therein and thereto. Licensor reserves all rights not expressly granted hereunder.
Nothing contained in this Agreement shall be construed as transferring any right, title, or interest in the Evaluation Data except as expressly set forth herein.

Confidentiality. Evaluation Data shall constitute confidential information belonging to Licensor, and accordingly, Licensee shall not disclose the Evaluation Data to any third party, except with Licensor's prior written consent and as permitted under the next sentence. Licensee may disclose the Evaluation Data to its employees, consultants or other agents who have a bona fide need to know the Evaluation Data for evaluation under the limited license rights herein, provided, that each such employee, consultant or agent is bound by confidentiality obligations at least as protective as those set forth herein. Licensee shall protect the confidentiality of the Evaluation Data in the same manner that it protects the confidentiality of its own confidential information of like kind (but in no event using less than reasonable care). Licensee shall promptly notify Licensor if it becomes aware of any actual or suspected breach of confidentiality of the Evaluation Data. If Licensee is compelled by law or legal process to disclose the Evaluation Data, it shall provide Licensor with prompt prior notice of such compelled disclosure (to the extent legally permitted) and provide reasonable assistance, at Licensor’s expense, if Licensor wishes to contest the proposed disclosure. Licensee acknowledges and agrees that any disclosure or use or breach of the Data would result in irreparable injury to Licensor for which money damages would be inadequate and in such event Licensor shall have the right, in addition to other remedies available at law and in equity, to seek immediate injunctive relief. Upon any termination of this Agreement, to the extent that any Evaluation Data is retained, Licensee shall continue to maintain the confidentiality of the Data.

Term and Termination.
The Effective Date of the Agreement is the date on which the Licensee manifests acceptance of this Agreement by clicking Agree where presented. The license rights in Section 2 are limited in duration to a time period starting from the Effective Date and continuing for the period of 14 days (the “Trial Period”), unless terminated herein. Licensee and Licensor may terminate this Agreement at any time by notifying the other. Upon expiration or termination of this Agreement, the license rights stated in Section 2 shall terminate and Licensee shall immediately discontinue all use of the Evaluation Data and remove or destroy all copies of the Evaluation Data from Licensee’s (including its employees’) hardware. Licensee shall not disclose, retain or use the Evaluation Data or Test Analytics after the expiration or termination of this Agreement.

DISCLAIMERS. TO THE FULLEST EXTENT PERMISSIBLE PURSUANT TO APPLICABLE LAW, LICENSOR MAKES NO WARRANTIES OR REPRESENTATIONS, EXPRESS, IMPLIED, ORAL, WRITTEN, OR OTHERWISE, AND LICENSOR EXPRESSLY DISCLAIMS (I) ANY IMPLIED WARRANTY OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE OR NONINFRINGEMENT, (II) ANY WARRANTY REGARDING CORRECTNESS, QUANTITY, QUALITY, ACCURACY, COMPLETENESS, RELIABILITY, PERFORMANCE, TIMELINESS OR CONTINUED AVAILABILITY OF THE DATA. UNDER NO CIRCUMSTANCES SHALL LICENSOR BE LIABLE FOR ANY INDIRECT, PUNITIVE, INCIDENTAL, SPECIAL, CONSEQUENTIAL OR EXEMPLARY DAMAGES, INCLUDING WITHOUT LIMITATION DAMAGES FOR LOSS OF PROFITS, GOODWILL, USE, OR OTHER INTANGIBLE LOSSES THAT RESULT FROM THE USE OF OR INABILITY TO USE THE EVALUATION DATA. TO THE MAXIMUM EXTENT PERMITTED BY APPLICABLE LAW, LICENSOR ASSUMES NO LIABILITY OR RESPONSIBILITY FOR (I) ANY PERSONAL INJURY OR PROPERTY DAMAGE, OF ANY NATURE WHATSOEVER, RESULTING FROM LICENSEE’S ACCESS TO AND USE OF THE EVALUATION DATA; (II) ANY ERRORS OR OMISSIONS IN, OR ANY LOSS OR DAMAGE INCURRED AS A RESULT OF THE USE OF THE EVALUATION DATA.
IN NO EVENT SHALL LICENSOR, ITS DIRECTORS, EMPLOYEES, AFFILIATES OR LICENSORS BE LIABLE TO LICENSEE FOR ANY CLAIMS, PROCEEDINGS, LIABILITIES, OBLIGATIONS, DAMAGES, LOSSES OR COSTS ARISING UNDER OR RELATING TO THIS AGREEMENT FOR MORE THAN $1,000. THIS LIMITATION OF LIABILITY APPLIES WHETHER THE ALLEGED LIABILITY IS BASED ON CONTRACT, TORT, NEGLIGENCE, STRICT LIABILITY, OR ANY OTHER BASIS, EVEN IF LICENSOR HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

General. This Agreement shall be governed by the laws of Colorado, except for its conflicts of laws principles. All disputes arising under or relating to this Agreement shall be within the exclusive jurisdiction of the state or federal courts located in Denver, Colorado and each Party hereby consents to such exclusive jurisdiction and venue. Neither Party may assign this Agreement to any third party without the prior written consent of the other party. Nothing in this Agreement is intended to confer any rights or remedies on any person or entity that is not a Party to this Agreement. No modification of this Agreement or waiver of the terms and conditions hereof shall be binding upon the Parties unless approved in writing by each of the Parties. Except as otherwise provided herein, the failure of either Party to enforce at any time any provision of this Agreement shall not be construed to be a present or future waiver of such provision, nor in any way affect the ability of either Party to enforce each and every such provision thereafter. If any provision of this Agreement is held invalid or unenforceable at law, such provision will be deemed stricken from this Agreement and the remainder of this Agreement will continue in effect and be valid and enforceable to the fullest extent permitted by law.
This Agreement represents the entire agreement between the Parties and supersedes any and all prior understanding, agreements, or representations by or among the parties, written or oral, related to the subject matter hereof.

#### Geocoded Addresses

Introducing SafeGraph’s Geocoded Address Dataset

Access high quality address data in hard to source countries for better market insights and operational precision.

Get early access to geocoded address data

Overcome Address Challenges in Tough Markets

SafeGraph’s Geocoded Address product is tailored for mapping & geocoding companies navigating challenging markets. Gain access to accurate and comprehensive address data in countries where sourcing is notoriously complex.

Key benefits include:

- Streamlined Sourcing: Source multiple address datasets through a single, reliable supplier, reducing dependency on fragmented sources.
- Focused Coverage: Target data for tough to access markets like Turkey, Bulgaria, Israel, and El Salvador.
- Mapping Specific Applications: Improve navigation, search, and address validation with precise geocoded addresses.

Built for Precision: Schema and Coverage

SafeGraph's geocoded addresses deliver detailed schema elements to support the most advanced mapping applications.

Schema features include: Primary Number, Sub-Building, Building Name, Street Name & Suffix, Intermediate Locality, Locality, City, and Region, Postal Code & ISO Country Code, Latitude & Longitude

Countries covered: Our geocoded data focuses on hard to source locations, including Bahrain, Bulgaria, Hungary, Ireland (incl.
Eircode), Israel, El Salvador, Turkey

#### Geometry

Accurate Building Footprint Polygon Data

The definitive spatial foundation for building world-class products and location intelligence

Schedule a Demo

Download Sample

Precise Polygon Data for Better Location Accuracy

Understand the true shape of POIs and how places relate to each other using reliable geometry data.

Robust polygon geometry data includes:

- Place shapes as Well-Known Text (WKT)
- Spatial hierarchy between polygons
- Attributes for parking areas and polygon quality

Maintaining Your Own Database of Polygons Is Challenging

You need data you can trust and rely on. We apply complex engineering and human verification to ensure accuracy of all POIs in our datasets.

Learn More About SafeGraph Geometry

Power Precise Location Analytics with Telemetry Ready Polygons

Our precise polygons are built to be the foundation of your mobility analysis. By joining SafeGraph Geometry with telemetry data, you can move beyond simple 'proximity' and achieve true attribution - distinguishing a customer inside a store from a pedestrian on the sidewalk.

Read Our Guide to Visit Attribution

Our World Is Complex - We Make Analyzing It Easy

We give you metadata so you can easily distinguish the spatial hierarchy and relationship between different places. With clear polygon hierarchies that have parent-child relationships, you’ll gain detailed insights about the POIs you’re analyzing.

Explore the Data Schema

Download a Free Sample of Geometry Data

Get Your Free Geometry Data Sample

Everything You Need to Get Started

Access data specs and delivery information

Geometry Technical Docs

Learn how the geometry datasets work and what they include.

- Geometry Data Schema: Understand every attribute available in the geometry polygon database.
- Release Notes: Track latest product releases and stay updated about the datasets you use.
- Summary Statistics: Get a quick overview of coverage, depth, and available polygon data.

Delivery

Get your data easily in any of the following 3 ways:

- Bulk Download: Set up an S3 bucket to receive scheduled monthly deliveries of Geometry data. Ideal for teams that need full datasets for internal processing.
- Snowflake: Query SafeGraph Geometry directly in Snowflake. Integrate data into existing workflows without managing file transfers.
- Request a Sample: Explore sample Geometry data before committing. Review available columns and assess how it fits your current datasets.

FAQs

1. What is building footprint polygon data?
   Building footprint polygon data represents the actual shape and boundaries of a place. Instead of a single coordinate, it maps the full area a location occupies. This helps you understand how places exist in real space and how they relate to nearby locations.
2. How is polygon data different from point-based location data?
   Point data uses one latitude and longitude to represent a place, which can miss the true boundaries. Polygon data captures the full geometry, making it easier to analyze proximity, overlaps, and spatial relationships, especially in dense areas.
3. What does the geometry dataset include?
   The dataset includes polygons in Well-Known Text (WKT) format along with useful metadata. This covers spatial hierarchies between places, attributes for parking areas, and indicators that help you assess polygon quality and reliability.
4. How does polygon data improve visit attribution?
   By using precise boundaries, polygon data helps determine whether a device is actually inside a location rather than just nearby. This improves the accuracy of visit attribution and reduces confusion between closely located places.
5. Why is it difficult to maintain polygon data internally?
   Building a reliable polygon dataset requires constant updates, data cleaning, and validation. Real-world locations change often, and maintaining accuracy at scale demands both engineering effort and ongoing quality checks.

Resources

- BLOG: Top 3 Polygon Data Use Cases for Geospatial Insights
- GUIDE: Determining Points of Interest Visits From Location Data
- BLOG: Geometry Data: The Anchor of SafeGraph Places
- VIDEO: SafeGraph Geometry: Precise POI Footprint Data

Explore High-Precision POI Geometry for Better Location Insights

#### Homepage

The Source of Truth for Physical Places

Clean, high-quality places data - so your analytics, models, and location products start with truth.

Explore Places Data

Trusted by the World’s Leading Innovators and Builders

Develop Reliable Products and Insights With Accurate Places Data

Contextualize Locations
Add detail to points of interest (POI) with extensive attributes related to place type, open/closed status and more.

Power Reliable User Experiences With Accurate Places Data
Provide consistent, high-quality POI data – including categories, geocodes, and open or closed status – so your applications return results users can trust.

Understand Evolving Markets
Use frequently updated places data to track store openings and closures across geographies for market analysis, competitive benchmarking, and risk assessment.

Why Teams Choose SafeGraph Data

Production ready from day-1
Data arrives clean, structured, and normalised. No heavy preprocessing required.

Less time fixing bad data
Reduce engineering and analyst time spent sorting, cleaning, and reconciling inconsistent location data.

Built for real-world change
Frequent updates capture store openings, closures, and hours of operations across geographies.

Location Intelligence for Real-World Industries

Retail
Site selection, trade area analysis, and competitive intelligence across physical footprints.

Software Providers
Location-based analytics and AI workflows embedded into SaaS products.

Financial Services
Location-based signals for investment research, risk modeling, and transaction enrichment.

Consumer Packaged Goods
Location intelligence for distribution planning and sales execution across retail channels.

See the Difference in Places Data Quality: other data providers vs. SafeGraph data

The Industry’s Most Trusted Places Data

- Places: Comprehensive, frequently updated point of interest data covering millions of real-world locations worldwide. Learn More
- Geometry: Precise polygons built from real-world place footprints for accurate spatial analysis. Learn More
- Addresses: Structured geocoded address data with global coverage, designed for accurate mapping and location intelligence. Learn More

Why Teams Trust SafeGraph Data

“When the dataset is good, we can trust it to make business decisions. Working with SafeGraph data has made our jobs much easier in this regard.” Andy Stevens, Chief Data Officer, Clear Channel Europe
Fueling More Targeted, Higher Performing OOH Advertising Campaigns. Read the case study

“With our previous data provider, the polygons didn’t always align to where stores really were. It took us sometimes a month or two to import and clean the data before we could even use it.” Matt Taaffe, VP of Product, Olvin
Building Accurate Store Visit Attribution Tools with Precise POIs and Place Footprints. Read the case study

“We expect nothing less than the gold standard in data…which is precisely why we decided to partner with SafeGraph.” James Sexton-Barrow, Head of Planning at Mobsta
Using SafeGraph Data to Give Ad Agencies An Effective Way to Plan Geo-Contextual Campaigns. Read the case study

“With SafeGraph, we’ve not only improved the efficiency and effectiveness of our analysis but also have been able to increase our speed to value - now our analysts can answer our clients’ questions and deliver actionable insights faster than ever before.” Julian Adams, Director of Data Science, Avison Young
Powering Avison Young's Commercial Real Estate Site Selection Tool. Read the case study

“SafeGraph takes data quality very seriously - which is why, if a POI is included in the Places dataset, we can always trust it’s a real location.” Oban MacTavish, CEO at Spade
Cleaning and enriching transaction data for the card industry using SafeGraph Places. Read the case study

Integrate SafeGraph Data Into Your Existing Workflows and GIS Environments.

Explore Integrations

- Get SafeGraph data delivered directly to Snowflake, or download from the...
- Browse in one place - access SafeGraph data directly through the AWS Marketplace
- Skip the two-step process - access SafeGraph data directly through CARTO via…
- Experiment with POI Data - access SafeGraph Places data directly through ArcGIS…
- Access SafeGraph POI data directly within the Databricks Data Intelligence Platform...
Browse in one place - access SafeGraph data directly through the AWS Marketplace Skip the two-step process - access SafeGraph data directly through CARTO via… Experiment with POI Data - access SafeGraph Places data directly through ArcGIS… Access SafeGraph POI data directly within the Databricks Data Intelligence Platform... Explore Integrations Explore SafeGraph Datasets Explore Datasets Talk to a Data Expert #### Influencer Bios LTK for Food Influencers Add new revenue from your restaurant content Generate commission from your personalized recommendations page Learn more Earn affiliate revenue for reservations made by your followers Layer a passive monetization strategy onto your existing effortsEasy link-in-bio setup that works for all of your favorite restaurants Easily create your personalized recommendation page Curate your favorite restaurants by dining categoryReduce your audience’s questions with self-serve lists Affiliate links for all major reservation platforms Generate commission across OpenTable, Resy, Tock, SevenRooms, and more! #### Integrations Integrate SafeGraph Data Into Your Existing Workflows We partner with leading GIS, cloud computing, and data storage companies to make SafeGraph's high-quality location data easily accessible. Explore our partner integrations below. Get Started Integration Partners CARTO Skip the two-step process - access SafeGraph data directly through CARTO via… Snowflake Get SafeGraph data delivered directly to Snowflake, or download from the Snowflake… Esri Experiment with POI Data - access SafeGraph Places data directly through ArcGIS… AWS Browse in one place - access SafeGraph data directly through the AWS Marketplace. 
PredictHQ SafeGraph and PredictHQ enable organizations with detailed and accurate… CoreLogic CoreLogic and Placekey’s goal is to solve the data-joining problem most… DOMO Take your analysis to the next level with detailed POI data from SafeGraph directly in… Spectus Ingest clean, accurate POI data in Spectus’ secure geospatial platform to make… Join the data revolution.
#### Introducing: SafeGraph's Early Closure Signal for POIs
Unlock actionable insights to stay ahead of market changes and competitor closures. Get early access to future retail closure data Stay Ahead with Early Closure Insights Make informed, strategic decisions with early signals identifying retail spaces at risk of closure. These insights empower you to act swiftly, unlocking opportunities in marketing, hiring, and real estate development. Leverage actionable data to: Identify pending competitor closures. Pinpoint opportunities in markets experiencing shifting consumer dynamics. Target locations primed for development due to upcoming vacancies. *All 5 rows were marked closed in the March 2025 SafeGraph Places delivery Powered by Advanced LLMs for Precision We apply LLMs throughout the flagging process to ensure only POIs with strong closure signals are surfaced. Delivered as small data, this insight is ready to use, with no engineering required. Recent results: Flagged 2,500 closures last year with 2 months' lead time on average. ~4,500 POIs currently flagged with medium to high confidence of imminent closure.
#### Join SafeGraph at SDSC London
SDSC LONDON 2023 Join SafeGraph at the Spatial Data Science Conference 2023 in London this May. Network and learn from other data science, GIS, and Analytics professionals. Let us know if you're attending Stop by our booth Pick up some cool data swag and connect with geospatial data experts at SafeGraph at our booth on May 18.
Then, watch our keynote presentation with Clear Channel Europe about how OOH media companies can use POI data to combat the challenges of a constantly changing world. "What is a Place?" On stage, hear from Bryan Bonack, Director of Product Management at SafeGraph, and Aaron Martin, Data Innovation Manager at Clear Channel Europe, talk through how OOH media companies must consider the challenges of ensuring only real, valid places meet the refined definition to enable advertisers with better campaign planning insights. #### Learn More The Source of Truth for Physical Places Thanks for meeting with us. Learn more about SafeGraph's datasets, check out our resources, and try some data out for yourself. Start Exploring Real-World Data View Our Data Schema Surface best-in-class Places data within your platform. Learn More Explore Our Resources Check out our whitepapers, data visualizations, blogs, and more. Read Our Blogs Download Data Instantly Explore data about places from SafeGraph instantly. 
Learn More Our Data Maintaining Quality Places and POI Data (and Why It Matters) Read Now Introducing Expanded SafeGraph Category Tags: More Granular Than NAICS Codes Read Now Why Data Scientists Choose SafeGraph for Location-Based Data Read Now A Technical Guide to SafeGraph Places Data Read Now More Content The Ultimate Guide to Geofencing for Marketing and Beyond Read Now Geospatial Data: A Comprehensive Guide Read Now Geospatial Data Sources: Where to Get the Data, You Need Read Now Geometry Data: The Anchor of SafeGraph Places Read Now How to Use POI Data for Catchment Area Analysis Read Now Comparing SafeGraph and OpenStreetMap: The Hidden Cost of Free Data Read Now Clear Channel Europe Uses SafeGraph Data to Fuel More Targeted, Higher Performing OOH Advertising Campaigns Read Now #### Local Business Appointments API Product Launch Local Business Appointments API SafeGraph announces a centralized appointments API supporting reservations for local businesses through multiple platforms. Be the first to know about launch updates Enable your users to book reservations and appointments with local businesses Elevate user engagement by incorporating appointment scheduling seamlessly within your application and website. Enhance the user experience by enabling bookings with restaurants, hair salons, and other appointment-based businesses. This integration adds value to your platform while increasing DAUs. Streamline integration efforts and costs Maximize efficiency by integrating just once with SafeGraph’s API, eliminating the need for numerous individual integrations. Our solution empowers you to effortlessly connect with a myriad of platforms, saving valuable BD and engineering time. #### Modern Data Partner A Data Partner Built for Modern Enterprises. High-quality places data, delivered with the flexibility & transparency to build applications that thrive in the real world. 
Explore SafeGraph What Sets SafeGraph Apart Premium Places Data. That’s All. SafeGraph focuses on one thing: delivering high-quality global places data that supports accurate analysis, reliable location-based features, and long-term growth. Explore Our Rich Attributes Flexible Delivery and Pricing Ready-to-use data delivered to the environment of your choice. Flexible pricing and usage terms give teams control without added complexity. See Our Integration Partners Transparent Documentation Monthly updates clearly outline what data is collected, cleaned, and processed, so accuracy and coverage are never in question. Read Our Technical Documentation Schedule a Demo Trusted by Leading Innovators "We never expected the dataset to be turnkey with our existing systems and datasets from day one—there’s always a bit of work to get it to where you need it to be, we’ve been blown away by SafeGraph’s proactiveness in helping us work through any issues we’ve encountered." Andy Stevens, Chief Data Officer, Clear Channel Europe We’re Committed to Solving Your Data Challenges Our team collaborates with you to configure a places data license tailored to your use case.
We can source data by request to serve your evolving needs and enhance our dataset.
A dedicated team member provides support, overseeing onboarding, troubleshooting, and regular check-ins. Our Partnership Model Trusted by Leading Innovators “When vetting our options, we had two choices. Either we built this capability in-house from scratch or we worked with a partner like SafeGraph who could tick all of the boxes.” Travis Riedlhuber, Managing Director at RainBarrel We Specialize in POI Data So You Don’t Have To Our places data is meticulously sourced, cleaned, tagged, and delivered ready to use. Teams can integrate it with confidence and focus on building better user experiences, insights, and location-driven products. As a geospatial data partner and location data platform for business, SafeGraph helps teams move faster without managing data complexity. Explore Places Data Leading Companies Trust SafeGraph Data Power Your Tools with SafeGraph Data Explore Datasets
#### Newsletter
Get a Monthly Roundup of SafeGraph News and Industry Updates Stay in the loop about SafeGraph Places data, recent blog posts, upcoming events, podcast episodes, and industry trends. Join the newsletter All of Our Best Content in One Email, Monthly New Product Initiatives Monthly release notes and info on new product launches. Upcoming Events Industry focused webinars, product demos, and more. New Blogs Industry trends, technical deep dives, and more. Thought Leadership Discover how our leadership team sees the data market evolving over time. Why Teams Trust SafeGraph Data Clear Channel Olvin Mobsta Avison Young Spade “When the dataset is good, we can trust it to make business decisions.
Working with SafeGraph data has made our jobs much easier in this regard.” Andy Stevens, Chief Data Officer, Clear Channel Europe Fueling More Targeted, Higher Performing OOH Advertising Campaigns Read the case study “With our previous data provider, the polygons didn’t always align to where stores really were. It took us sometimes a month or two to import and clean the data before we could even use it.” Matt Taaffe, VP of Product, Olvin Building Accurate Store Visit Attribution Tools with Precise POIs and Place Footprints Read the case study “We expect nothing less than the gold standard in data…which is precisely why we decided to partner with SafeGraph.” James Sexton-Barrow, Head of Planning at Mobsta Using SafeGraph Data to Give Ad Agencies An Effective Way to Plan Geo-Contextual Campaigns Read the case study “With SafeGraph, we’ve not only improved the efficiency and effectiveness of our analysis but also have been able to increase our speed to value—now our analysts can answer our clients’ questions and deliver actionable insights faster than ever before.” Julian Adams, Director of Data Science, Avison Young Powering Avison Young's Commercial Real Estate Site Selection Tool Read the case study “SafeGraph takes data quality very seriously—which is why, if a POI is included in the Places dataset, we can always trust it’s a real location.” Oban MacTavish, CEO at Spade Cleaning and enriching transaction data for the card industry using SafeGraph Places Read the case study Excited About Using Our Updated POI Dataset To Enhance Your Business Insights. See How It Works in Practice. Schedule a Demo
#### Places
High Precision POI Data Built to Power Modern Applications The world changes fast – stay ahead of the curve with the most precise, reliable global points of interest data available. Schedule a Demo Deliver Premium User Experiences and Market Leading Insights Never worry about stale or inadequate data. Built on a globally maintained POI dataset, you can build more informative tools and stay ahead of the competition with up-to-date, accurate places data. Continuous data collection and monthly updates deliver the most current, reliable global POI dataset available. Rigorous sourcing and verification ensure POIs are geographically precise and accurately detailed. Rich POI attributes provide the context needed for superior functionality and analyses.
Places Data Attributes That Add Clarity and Context Quality is your priority, so it's ours too. We rigorously verify our data to power your products with the highest quality ingredients. Geographic Coordinates Exact latitude and longitude for each POI location. Learn More Industries & Categories Granular tags that provide context for detailed POI data. Learn More Opened/Closed Dates Track business lifecycle data for accurate timelines. Learn More Brands Brand names and IDs for precise chain-level analysis. Learn More Store IDs Store-level identifiers for clean joins and data enrichment. Learn More ...and more Open hours, phone numbers, website URLs, and other location dataset attributes. Address details, and more—create a custom dataset to suit your unique application. Explore All of Our Attributes Leverage Comprehensive POIs and Place Attributes for Deeper Context Precise Footprints Understand the shapes and spatial hierarchy of places within our POI dataset, using machine-generated, human-verified place polygons with rich metadata. Learn More Parking Lots Expand the concept of a place within the POI dataset to include its surrounding parking lots. Learn More Transparent Documentation Monthly updates clearly outline what data is collected, cleaned, and processed, so accuracy and coverage are never in question. Learn More Enhance your product with POI attributes like opened/closed dates, business hours, precise geographical coordinates, and more. Explore Places Attributes Don’t Waste Time Collecting and Cleaning Messy POI Data SafeGraph focuses on sourcing and cleaning POI data, so you can spend time on what matters most - developing the best products for your users. Build with confidence, knowing your places data is a true representation of the physical world. POI datasets are delivered cleaned, deduped, and ready to embed.
Flexible delivery options fit into your existing workflow Simple pricing tailored for your specific use case puts you in control of your budget The Industry’s Most Trusted Places Data
See How Leading Organizations Use SafeGraph’s POI Dataset in Real World Applications Avison Young Uses SafeGraph Data to Offer Ground-Level Insights for Commercial Real Estate Site Selection Read Now Clear Channel Europe Uses SafeGraph Data to Fuel More Targeted, Higher Performing OOH Advertising Campaigns Read Now Dosh’s Card-Linked Offers Platform Is Powered by SafeGraph Places Data Read Now FAQs What is POI data & what does it include in Places? Point of interest (POI) data is information about geographical places that exist in the real world, such as the location of the physical place, ownership, open and close history, and more. POI data is extremely useful for business and competitive intelligence analysis and is used by investors, retail businesses, and other location-driven industries. With SafeGraph Places POI datasets, you can access information on the physical location of a place, such as latitude and longitude, street address, city, region, and postal code. In addition, our Places datasets include brand affiliation, hours of operation, historical data on when businesses open and close at a location, and building footprint information. Together, these attributes further contextualize each location and provide a reliable representation of the physical world. How much does the Places dataset cost? Like all of our data, the cost of the Places POI dataset depends on the number of rows, columns, and the frequency of delivery you request. Contact our sales team to learn more about enterprise pricing options. How often is the dataset updated? SafeGraph issues updates to the Places POI dataset once per month, which is more frequent than many other POI data providers that may update every three to six months. This is possible because we work with a wide range of data sources and efficiently combine updates from those sources.
Each month, a subset of our sources sends updated information, which we onboard and integrate quickly. This process allows us to reflect store openings and closures in the Places dataset in a timely manner. The time between a store opening or closing and its appearance in the dataset depends on when the change is detected by our sources and when it is processed by SafeGraph. In most cases, updates are reflected within the same month, which is significantly faster than many alternative providers. How do you determine when a POI is opened or closed? Opened and closed dates are determined using metadata at the source level. If a new point of interest from an existing source repeatedly appears in our build pipeline, it is flagged as “opened_on” during the month in which it first appears. Similarly, if a POI from an existing source repeatedly disappears in our pipeline, it is flagged as “closed_on” during the month in which it first disappears. These flags are applied to the Places product following final QA checks and overall data hygiene processes. Temporary closures are not captured in open and close tracking. During the onset of COVID-19, it became difficult to distinguish permanent closures from temporary ones, which resulted in a lower number of POIs marked as closed between March and June 2020. If a POI has not been sourced consistently enough to determine a closure date, the tracking_closed_since field will remain null. In general, the SafeGraph Places dataset tracks opened and closed dates from July 2019 onward. How does SafeGraph assign NAICS codes to points of interest (POI)? SafeGraph Places uses the North American Industry Classification System (NAICS) developed by the US Census Bureau. This system assigns a numeric code of up to six digits to classify points of interest by industry. Although NAICS was developed in the United States, it has proven effective for categorizing points of interest data in other countries as well. 
The classification is hierarchical, with the first two digits representing a broad category and additional digits providing increasingly specific classifications.
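To make the NAICS hierarchy concrete, here is a minimal Python sketch that splits a code into its prefix levels. The function name `naics_prefixes` is hypothetical and is not part of any SafeGraph tooling; the level names follow the standard NAICS structure (2-digit sector through 6-digit national industry).

```python
# Standard NAICS levels, keyed by the number of leading digits.
NAICS_LEVELS = {
    2: "sector",
    3: "subsector",
    4: "industry_group",
    5: "naics_industry",
    6: "national_industry",
}


def naics_prefixes(code: str) -> dict[str, str]:
    """Return every hierarchy level that a 2- to 6-digit NAICS code covers."""
    code = str(code).strip()
    if not code.isdigit() or not 2 <= len(code) <= 6:
        raise ValueError(f"expected a 2- to 6-digit NAICS code, got {code!r}")
    # Each level is simply a prefix of the full code.
    return {
        name: code[:digits]
        for digits, name in NAICS_LEVELS.items()
        if digits <= len(code)
    }


if __name__ == "__main__":
    # 722511 is used here purely as an example six-digit code.
    for level, prefix in naics_prefixes("722511").items():
        print(f"{level:18s} {prefix}")
```

Grouping POIs by a shorter prefix gives a broader category, while the full six digits give the most specific one. Note that a few NAICS sectors span a range of two-digit codes (for example 44-45 for retail trade), so a sector-level rollup may need to merge those prefixes.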
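The opened/closed flagging described in the FAQ above can be illustrated with a short Python sketch. This is not SafeGraph's pipeline: the function name `infer_open_close` is hypothetical, and because the text does not say how many months count as "repeatedly", the `min_repeats` threshold is an assumption. The input is one POI's month-by-month presence in a single source.

```python
from typing import Optional


def infer_open_close(
    presence: list[tuple[str, bool]],
    min_repeats: int = 2,  # assumption: the source does not quantify "repeatedly"
) -> tuple[Optional[str], Optional[str]]:
    """Infer (opened_on, closed_on) from ordered (month, seen_in_source) pairs.

    A POI is flagged as opened in the first month of a run of `min_repeats`
    consecutive appearances, provided it was never seen before. It is flagged
    as closed in the first month of a run of `min_repeats` consecutive
    absences that follows a month in which it was present.
    """
    opened_on: Optional[str] = None
    closed_on: Optional[str] = None
    # A POI already present in the first tracked month has no known open date.
    ever_seen = presence[0][1] if presence else False

    for i in range(1, len(presence) - min_repeats + 1):
        month = presence[i][0]
        window = [seen for _, seen in presence[i:i + min_repeats]]
        if not ever_seen and all(window):
            opened_on = month
        elif closed_on is None and presence[i - 1][1] and not any(window):
            closed_on = month
        ever_seen = ever_seen or presence[i][1]

    return opened_on, closed_on


if __name__ == "__main__":
    history = [
        ("2023-01", False), ("2023-02", True), ("2023-03", True),
        ("2023-04", True), ("2023-05", False), ("2023-06", False),
    ]
    print(infer_open_close(history))
```

Requiring a run of repeated appearances or absences keeps a single missed month from being read as a closure, which matches the statement that temporary closures are not captured. Dates that cannot be determined stay `None`, similar in spirit to the null tracking field the FAQ mentions.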
Explore Guides and Case Studies That Show How Teams Use Our POI Dataset and Places Data in Practice BLOG: How to Use POI Data for Catchment Area Analysis Read Now BLOG: Comparing SafeGraph and OpenStreetMap: The Hidden Cost of Free Data Read Now ANALYSIS RUN DEC 2022: Gas & EV Charging Stations Read Now GUIDE: A Technical Guide to SafeGraph Places Data Read Now Learn More About Accurate and Precise Global POI Data Have Questions About Our POI Dataset? Explore Datasets Get in Touch
#### POI Data for AI Trip Apps
Places Data for AI Trip Apps Build Smarter Travel Experiences with High-Quality Places Data SafeGraph provides clean, accurate POI data on millions of hotels, restaurants, and attractions to supercharge your recommendations and planning tools. Get in Touch POI Data Built for AI Travel Apps SafeGraph curates comprehensive and accurate data on millions of global places like hotels, restaurants, landmarks, museums, and more - so your AI models can deliver personalized, context-aware travel planning. Whether you're building an itinerary generator, a local discovery engine, or a recommendation system, our data helps you serve the right spot at the right moment. Precision, Scale, and Freshness SafeGraph's places data is rigorously cleaned, normalized, and updated monthly. We include category tags, amenity data, open/close status, hours, and building footprints so your app’s results aren’t just fast, they’re actually useful.
#### Preference Center
You're now unsubscribed. We're sorry to see you go! You can resubscribe at any time.
#### Pricing
Pricing Built Around Your Data Requirements Access high-quality POI and geospatial data with flexible licensing. Choose the datasets, regions, and formats that fit your workflow. Request Pricing Comprehensive POI Data Simple Pricing No pay-per-use policies or restrictive usage terms. Get a custom dataset with flexible delivery and support for a straightforward fee. Here's how it works: Identify the location rows you want to purchase.
Add premium attribute columns for more detailed context. Define the usage rights and delivery terms that fit your needs. Pay a simple annual fee based on your custom mix of rows, columns, and usage rights. Every Places Customer Receives Monthly data delivery to your environment of choice. See our integrations.Proactive support and strategic guidance to ensure you're getting the most value out of every record.POI data you and your users can trust to be fresh, accurate, and transparently sourced. Learn More About Our Data We Power Modern Builders Learn more about how we help today’s most innovative companies solve their data challenges. Explore Our Customer Case Studies Go Deeper With Our Docs Get in-depth info about our data schema, curation methods, delivery integrations, monthly updates, and more. Explore Our Docs FAQ’s Do you offer prepackaged Places datasets? We work with our customers to create custom Places packages tailored to their specific needs. But we can definitely provide starting points for commonly requested bundles. How often is Places data updated? Our Places data is updated monthly, by the 7th day of each month. I don’t see the integration I want—do you support custom integrations? We’re very willing to explore additional integrations or methods of delivery. What technical support does SafeGraph provide? We believe in transparency, so our technical documentation is publicly available, along with our Summary Statistics. In addition, each Places customer is assigned a Customer Success Manager to ensure they get the most out of their data and always have a direct contact available to answer questions and provide guidance. Can I try Places data before I commit to a paid contract? Of course! We can cut custom samples that allow you to evaluate our data and get a clear idea of what you’d be purchasing. Do you source data by request? Happily. 
We constantly seek to enhance our data with new places our customers find valuable, so customer requests make up a large portion of our pipeline. Do you offer special pricing for organizations like nonprofits and educational institutions? We partner with companies like Dewey who serve these organizations specifically. If you’d like an introduction, just let us know. Let’s Design Your Custom Places Dataset Get in Touch
#### Privacy Policy
Effective: January 1, 2023
Introduction. SafeGraph LLC (“SafeGraph”) values and respects the privacy of users, and we work hard to do just that. This Privacy Policy includes substantial revisions from past versions to better reflect our current practices and products, which focus on data about physical locations - not identifiable individuals.
Like many companies, however, SafeGraph collects certain personal information from people interested in our products and services, through our website, or from third-party entities in connection with SafeGraph’s direct-to-consumer services.  This Privacy Policy applies to Information collected by SafeGraph and sets forth SafeGraph’s commitment to you to protect the privacy of that Information. We encourage you to read this Privacy Policy to understand the Information we collect and how we collect it, why we collect it, how we use it, and the choices available to you to manage your privacy.  We periodically revise this Privacy Policy to reflect our practices and account for new and evolving laws that govern privacy. This Privacy Policy is not a contract and does not create any legal rights or obligations. Who We Are. SafeGraph is a data-as-a-service company. Our sole mission is to provide the most trusted data about physical places around the world. We believe open access to high-veracity, ethically, and transparently sourced data about physical places is crucial to innovation and technological advancement.  Our Offerings. When we use the term “SafeGraph Services”, we are referring to the datasets that we offer (detailed explanations of each data product are available on our public website: https://docs.safegraph.com/docs). We do not collect Information directly from natural persons to develop SafeGraph Services. SafeGraph aggregates data collected from third party data sources to develop the SafeGraph Services, such as products reflecting trends in consumer transactions at a certain retail store over or at a given time. Our customers – a variety of companies and organizations – use SafeGraph Services for commercial and research purposes, including retail site selection (e.g., determining where to open a new restaurant) and market research (e.g., analyzing general consumer purchasing trends).  
Although we may license Information from third parties (e.g., advertising identifiers, such as Apple IDFA or Google Advertising ID) to develop SafeGraph Services, we do not correlate that data with other personal data such as name, phone number, or email address and none of the products we sell include Information. We use filtering mechanisms and other privacy enhancing measures to remove any Information from our products and require our customers to commit to not reidentifying anonymized or aggregated data.  Covered Information.  When we use the term “Information” in this Privacy Policy, we mean information that identifies, relates to, describes, is reasonably capable of being associated with, or could reasonably be linked, directly or indirectly, to an individual or household. It does not include aggregated or de-identified information that is maintained in a form that is not capable of being associated with or reasonably linked to an individual.Information We Collect and How We Collect ItWe obtain Information in three main ways: (1) Information you choose to provide directly to us; (2) Information we collect automatically through technology when you visit our website; and (3) Information we collect from other sources.‍1. Information You Choose to Provide Directly to UsWhen you use our website or otherwise interact with us, you may provide certain Information to us. For example, we may collect Information that you provide when you visit our corporate website (including the website on which this Privacy Policy is posted), such as to express interest in the SafeGraph Services, sign up for our newsletters or similar informative services, or to apply for a job. 
Some of the most common categories of Information we may collect directly from you include: Contact Information, such as name, address, telephone number, email address; Customer service interaction Information, such as messages submitted to us through online forms or email, and summaries or voice or video recordings of interactions with our customer service personnel; Information you provide in connection with surveys, promotional activities, and other outreach, such as your interests, preferences, demographic data (e.g., gender), and information needed for you to participate and for us to fulfill your participation incentives (if any). Employment application Information. If you are applying for a job with SafeGraph, we may collect information such as contact information, background information, and employment qualifications and history.
2. Information We Collect Automatically
As is true of most websites, we or third-party platforms with whom we work may automatically collect Information when you visit our website from your desktop computer, laptop, mobile phone, tablet, or other consumer electronic device (“Devices”) that you use to access the website. This Information may include pseudonymized information, such as a unique browser identifier, technical information about your device (such as the device type, operating system, settings and system configurations, IP address, other unique device identifiers, and mobile network information) and your activity on our website, as well as data about the webpages you access, traffic to and from websites, the dates and times associated with transactions, and web log data. We refer to this as “Site Data.” SafeGraph does not incorporate or sell Site Data as part of our SafeGraph Services. We use Site Data for our internal business purposes, such as marketing, analytics, research, and improving our website and services. Site Data may be correlated with other personal information, such as name or email address. Our Use of Cookies.
We (or other companies we work with) may collect or associate Site Data using “cookies.” Cookies are small data files that have unique identifiers and reside, among other places, on your Devices, in emails we send to you, and on the web pages of the SafeGraph website. Among other things, cookies help us improve and personalize your use of the SafeGraph website. We may permit certain third parties to place cookies through the SafeGraph website to provide us with better insights into the use of the SafeGraph website and user demographics and to provide relevant advertising for SafeGraph Services to you. These third parties may collect Information related to your online activities over time and across different websites when you use the SafeGraph website.

Our Use of Web Beacons. Web beacons are electronic images that may be used on the SafeGraph website or in emails we send to you. We may use web beacons to deliver cookies, count visits to our website, understand usage and effectiveness of offers, and tell whether you open an email and act upon it.

By accessing or using the SafeGraph website, you consent to the placement of cookies on your browsers and devices and the use of web beacons as described in this Privacy Policy. You may block, delete, or disable cookies or web beacons to the extent permitted by your browsers, applications, and Devices as described in more detail below. However, if you do so, certain features of the SafeGraph website may be limited or may not work at all.

3. Information From Other Sources

We also receive Information from third parties. Sometimes, we receive or license Information (such as email addresses) from vendors or others in our industry, reflecting the contact or other details of potential customers, business partners or prospective employees; we also sometimes collect this information at trade shows and events.
We only use this Information for our internal business purposes and do not incorporate it into SafeGraph Services.

To develop SafeGraph Services we license data from third parties, including Data Suppliers/Brokers who license to us commercially available Information and data. Examples of such commercially available Information may include data about interactions with third party retail locations, websites, apps, and device data (including mobile device type and advertising identifiers, such as Apple IDFA or Google Advertising ID, and location data, but we do not use or include device data or other Information in SafeGraph Services).

Why We Collect Information

To the extent relevant to applicable law, any of several legal bases may underlie our collection of Information, including:

Contract Performance. When the processing is necessary to perform our contract with you, which may include providing you with SafeGraph Services or tailoring SafeGraph Services to your preferences.
Legitimate Interests. When the processing is necessary for our legitimate business interests (or those of our enterprise customers), including but not limited to, providing, developing, improving or marketing the SafeGraph Services.
Consent. When you have given us your consent to process your Information.
Legal Obligation.
When we have a legal obligation to do so, as described in this Privacy Policy.

How We Use and Share the Information We Collect

We use and share Information:

to provide, develop, operate, and improve SafeGraph Services;
to respond to communications and customer services inquiries;
to market to you and provide you information about new products and events;
for our internal and operational purposes, such as to consider or make internal service improvements or quality checking, or for our own sales and marketing purposes; and
for legal, auditing and accounting purposes, such as (a) to protect or enforce our rights or those of others, (b) to evaluate or enhance the security or quality of our Information, or (c) to investigate potential wrongdoing.

We may disclose Information that we collect or you provide as described in this Privacy Policy:

to our subsidiaries and affiliates;
to service providers, contractors, and other third parties we use to support our business or perform functions on our behalf;
to a potential or actual buyer, assignee, or other successor (including its related advisors and agents) in the event of a merger, divestiture, restructuring, reorganization, dissolution, or other sale or transfer of some or all of our assets, whether as a going concern or as part of bankruptcy, liquidation, or similar proceeding, in which Information held by us about you is among the assets that may be or are actually transferred;
to fulfill the purpose for which you provide it;
for any other purpose disclosed by us when you provide the Information;
with your consent.

We may also disclose Information:

to comply with any court order, law or legal process, including to respond to any law enforcement, government or regulatory request;
to enforce or apply the engagement agreement and other agreements, including for billing and collection purposes;
if we believe disclosure is necessary or appropriate to protect the rights, property, or safety of us, our customers, or others.

Your Choices

We may
choose or be required by law to provide different or additional disclosures relating to the processing of Information about residents of certain countries, regions or states. Depending on your state of residency and subject to certain legal limitations and exceptions, you may be able to exercise some or all of the following rights:

The Right to Know. The right to confirm whether we are processing your Information and to access that Information.
The Right of Portability. The right to obtain a copy of the Information in a portable and, to the extent technically feasible, readily usable format that allows you to transmit the data to another entity without hindrance, subject to certain exceptions as set forth in applicable law.
The Right to Correction. The right to correct inaccuracies in your Information, taking into account the nature of the Information and the purposes of the processing of the Information.
The Right to Deletion. The right to have us delete the Information we maintain about you, subject to certain exceptions as set forth in applicable law.
The Right to Opt-Out of Certain Processing of Information. The right to opt out of processing of your Information for purposes of (i) targeted advertising, (ii) the sale of personal data, or (iii) in some states, profiling in furtherance of decisions that produce legal or similarly significant effects concerning the consumer.
The Right to Non-Discrimination. Depending on your state of residency, you may also have the right to not receive retaliatory or discriminatory treatment in connection with a request to exercise the above rights.

Submitting Privacy Rights Requests. To submit a request to exercise one of the privacy rights identified above, please email us at privacy@safegraph.com. We may need to verify your identity before processing your request.
In certain circumstances, we may decline a request to exercise the rights described above, particularly where we are unable to verify your identity or locate your information in our systems. We will only use personal data provided in connection with a Rights Request to review and comply with the request.

Children’s Privacy

We do not intend to collect Information from children under the age of 16. If you believe we have collected Information from a person under the age of 16, please contact us at privacy@safegraph.com.

Information Security

SafeGraph uses reasonable measures to protect Information in our possession against unauthorized access, disclosure, alteration, or destruction. We regularly review our information security, storage, and processing to ensure compliance with industry best practices. However, as no physical or technological safeguards are 100 percent secure, we do not guarantee the security of any particular elements of data that we hold.

Information Retention

We retain your Information only for as long as is reasonably necessary to fulfill the purpose for which it was collected.
However, if necessary, we may retain your Information for longer periods of time, for instance where we are required to do so in accordance with legal, tax and/or accounting requirements set by a legislature, regulator or other government authority.

To determine the appropriate duration of the retention of Information, we consider the amount, nature and sensitivity of the Information, the potential risk of harm from unauthorized use or disclosure of Information and if we can attain our objectives by other means, as well as our legal, regulatory, tax, accounting and other applicable obligations.

Once retention of your Information is no longer necessary for the purposes outlined above, we will either delete or de-identify the Information or, if this is not possible (for example, because Information has been stored in backup archives), then we will securely store your Information and isolate it from further processing until deletion or de-identification is possible.

The GDPR (European and UK Law)

SafeGraph is not established in, and does not offer products or services involving the collection or sale of “personal data” in, European Economic Area (“EEA”) countries or the United Kingdom. We likewise seek not to collect such personal data from our data suppliers. Consequently, the European Union General Data Protection Regulation (the “GDPR”) and the applicable data privacy laws of the United Kingdom (“UK GDPR”) are not applicable. Should any of the foregoing change, we will update this section of our Privacy Policy. We may sometimes receive “personal data” from potential customers or business partners located in the EEA or the United Kingdom. If you wish to exercise any of your rights under the GDPR or UK GDPR with respect to this limited set of data (for instance, if you provided us with your business information in the past), please contact us at privacy@safegraph.com.
Contact Us

If you have any questions regarding this Privacy Policy, please contact us at privacy@safegraph.com or by writing to us at:

Attention: Office of General Counsel
1580 N Logan St, Ste 660, #53755
Denver, CO 80203

Updates to This Privacy Policy

We will update this Privacy Policy from time to time. When we make changes to this Privacy Policy, we will change the “Effective” date at the beginning of this Privacy Policy. We recommend that you check the Privacy Policy frequently so that you are informed of any changes. All changes shall be effective from the date of publication unless otherwise provided in the notification.

Additional Information Disclosures

Sensitive Information

In some instances, we may receive information considered to be “sensitive” under certain privacy laws, including geolocation data. However, we do not collect this information directly from natural persons to develop SafeGraph Services, and we require our data suppliers to validate that they have obtained all data licensed to us in a lawful manner. SafeGraph does not sell sensitive information or products derived from geolocation data that can identify a person’s location, and we do not process or otherwise share sensitive information for the purpose of targeted advertising.

Further, we do not use any sensitive information to infer characteristics about a consumer. We use this sensitive information for the purposes set forth in the “How We Use and Share the Information We Collect” section of our Privacy Policy, to enter into and perform a contract with you, to comply with legal and regulatory requirements, to protect the life or physical safety of you or others, or as otherwise permissible for our internal business purposes consistent with applicable laws.

Deidentified Data

We may at times receive or process Information to create de-identified data that can no longer reasonably be used to infer information about, or otherwise be linked to, a particular individual or household.
Where we maintain deidentified data, we will maintain and use the data in deidentified form and not attempt to reidentify the data except as required or permitted by law.

#### Products
URL: https://www.safegraph.com/products/

#### Publications
Our data in the real world. Stay current with the latest media coverage, academic papers and industry research.

Academic Research (June 14, 2023): Points-of-Interest from Mapillary Street-level Imagery: A Dataset For Neighborhood Analytics
Academic Research (November 7, 2022): Probabilistic Program Inference in Network-Based Epidemiological Simulations
Industry Research (May 9, 2022): Interactive graph shows Sacramento museum and park foot traffic after latest COVID surge
Press (March 15, 2022): State of New Jersey Using Tyler Technologies’ Solution to Understand Economic Data
Press (February 9, 2022): Peloton and the fate of the fitness industry
Academic Research (December 24, 2021): The role of alcohol outlet visits derived from mobile phone location data in enhancing domestic violence prediction at the neighborhood level
Academic Research (December 17, 2021): Human mobility data and machine learning reveal geographic differences in alcohol sales and alcohol outlet visits across U.S. states during COVID-19
Academic Research (November 24, 2021): Predicting Stages in Omnichannel Path to Purchase: A Deep Learning Model

Learn how the geometry datasets work and what it includes.

Understand every attribute available in the geometry polygon database. Geometry data schema
Track latest product releases and stay updated about the datasets you use. Release notes
Get a quick overview of coverage, depth, and available polygon data. Summary statistics

Get your data easily in any of the following 3 ways:

Set up an S3 bucket to receive scheduled monthly deliveries of Geometry data. Ideal for teams that need full datasets for internal processing. Bulk download
Query SafeGraph Geometry directly in Snowflake. Integrate data into existing workflows without managing file transfers. Snowflake
Explore sample Geometry data before committing. Review available columns and assess how it fits your current datasets. Request a sample

#### Publisher Reservations
Add new revenue from your restaurant content

SafeGraph provides affiliate restaurant links across all major reservation platforms.
Receive affiliate revenue for reservations that your audience makes
Layer a passive monetization strategy onto your existing monetization efforts. Easy to set up for evergreen revenue, and works across any restaurant that takes reservations.

Improve audience engagement
Provide additional value to your audience with reservation links to sites like OpenTable. Plus, many influencers use these links to create quick responses to follower FAQs and restaurant recommendations.

Compatible with your IG stories, link in bio, and website
Affiliate links work within your existing set-up. Add links to your IG stories, your favorite link in bio tool, or your website.

#### Reservation Service Policies Guidelines

Content Guidelines

We do not allow Publishers with the following types of content to use the Service, nor may Publishers use the Service in connection with sites that display any of the following:

Pornographic content
Libelous, defamatory, or obscene content
Violent or hateful content, including content that advocates or promotes discrimination on the basis of race, ethnicity, gender, religion, sexual orientation, age or disability
Content of a religious nature
Content of a political nature
Content that promotes illegal activity
Content that promotes or involves the use or sale of firearms, illegal substances, financial services or advice, or gambling
Content that specifically targets children aged 13 and below
Duplicated content from other websites

It is a breach of our Program Policies to:

Include personal information (such as license plate numbers, names, e-mail addresses or street addresses) or personal health information in an Affiliate Site.
Incorporate third-party intellectual property on an Affiliate Site.
Publish an Affiliate Site that contains any viruses, Trojan horses, worms, bots, backdoors, and/or other computer programming routines that may potentially damage, interfere with, intercept, disable, deactivate, or expropriate any personal information or third-party intellectual property.
Use search engine marketing in order to generate affiliate revenues through SafeGraph.
Alter through redirection or other means the HTTP referrer.
Use or register a domain name containing merchant or other entities’ names, brands or trademarks, or misspellings thereof.
Fail to comply with all applicable United States Federal Trade Commission, UK Advertising Standards Authority or other applicable guidelines, including those concerning sponsored content and how implied endorsements and testimonials like affiliate marketing must be disclosed to consumers.
Engage in cookie stuffing or include pop-ups or false or misleading links on Affiliate Sites.
Mask, obscure or otherwise deidentify the referring URL information (i.e., the page from which a click originates).
Use redirects to bounce a click off of a domain from which the click did not originate in order to give the appearance that it came from such domain.
Intersperse any content or enable any additional pop-up between an affiliate link and a Booking Platform.
Directly or indirectly access, launch, and/or activate links through or from, or otherwise incorporate links in, any software application, website, or other means except as expressly authorized by SafeGraph in the Agreement.
“Crawl”, “spider”, index or in any non-transitory manner store or cache information obtained from any links, or any part, copy, or derivative thereof.
Fail to comply with Booking Platform and/or Merchant pass-through terms.
Fail to include a clear and concise disclosure statement within any and all Affiliate Sites (including, without limitation, all pages, blog posts, social media posts or emails) where affiliate links are posted as an endorsement or review, and where it is not clear that any such link is a paid advertisement.
Create the impression that your website is the website of a merchant or other entity, including, without limitation, framing or copying of a website in any manner or creating banners or advertisements that mimic a merchant or other entity’s website’s search, display, or social ads in any manner.
Purchase advertisements that direct to your site(s) that could be considered as competing with a merchant’s ads.
Use SafeGraph affiliate tracking tags outside of SafeGraph technologies.
Be an entity registered in a country that is subject to economic sanctions and export control laws and regulations of the United States, EU, UK and, as applicable, other jurisdictions.

SafeGraph reserves the right to exclude from its publisher network any publisher suspected of engaging in the above activities or in any other activity prohibited under the Agreement. In the case of a publisher already accepted into the SafeGraph publisher network, should they be suspected of engaging in prohibited activities, SafeGraph reserves the right to suspend or terminate their account at any time, without compensation and in its sole discretion.

Traffic Geography

SafeGraph currently monetizes traffic exclusively from North America, Europe and APAC. Due to our current geographical coverage, if your site does not have a significant amount of traffic from these regions, we cannot monetize your content and we may deny your application to ensure you avoid disappointment with our service.
We are unable to work with publishers located in or providing services to the following countries: Cuba, Democratic Republic of Congo, Iraq, Central African Republic, Donetsk, Luhansk, Crimea, Russia, Yemen, Syria, Sudan, Lebanon, Somalia, Haiti, Zimbabwe, Belarus, Myanmar (Burma), Libya, Venezuela, Nicaragua, and North Korea.

#### Reservations Terms of Service

SafeGraph Reservations Terms of Service

1. General
1.1. These Terms of Service, together with the Privacy Terms (available here) and Reservation Service Policies (available here) (together, the “Terms of Use”) set out the terms and conditions on which SafeGraph, Inc. (a Delaware Corporation with corporate address 1580 N Logan St Ste 660 #53755 Denver, CO 80203-1942) (“SafeGraph”) provides the Service to Publishers. The Terms of Use are a contract between you and SafeGraph.
1.2. By submitting your application to SafeGraph and/or your continued use of the Service, you are confirming that either:
1.2.1. you are a Publisher intending to enter into the Agreement with SafeGraph in a personal capacity, and that you agree to comply with the Terms of Use; or
1.2.2. you are an employee, agent or subcontractor of a Publisher and to whom the Publisher has granted all necessary authorizations to agree to comply with the Terms of Use and to enter into the Agreement with SafeGraph, in each case on behalf of Publisher.
If you are unable to provide one of the above confirmations, you must not submit the application, or otherwise use or access the Service.

2. Definitions
2.1. In these Terms of Service, the following expressions have the following meanings:
2.1.1. “Affiliate Sites” means any and all internet websites, mobile websites, and mobile applications controlled or operated by Publisher and/or its corporate affiliates.
2.1.2.
“Agreement” means the agreement between SafeGraph and Publisher which (i) is created on acceptance by SafeGraph of Publisher’s application in accordance with clause 3.1 and (ii) incorporates these Terms of Use.
2.1.3. “Applicable Laws” means all applicable statutes, common law, orders, regulations and regulatory policies, binding codes of practice and guidance notes, directives, notices or requirements of any Governmental Authority, as amended or superseded, including but not limited to:
2.1.3.1. US Federal Trade Commission rules and guidelines regarding collection, use and disclosure of data from or about End Users and/or specific devices;
2.1.3.2. any and all applicable federal, national, state or other privacy and data protection laws as may be amended or superseded from time to time (the “Applicable Data Protection Laws”);
2.1.3.3. economic sanctions and export control laws and regulations of the United States, Canada, and other jurisdictions, as applicable; and
2.1.3.4. any similar rules, guidelines or principles of any applicable jurisdiction.
2.1.4. “Booking Platform” means an entity providing an affiliate reservation service on behalf of multiple Merchants.
2.1.5. “Booking Platform Links” means the text links, graphical hypertext links and other linking code obtained via the Reservation Link Platform which provide direct access to the Booking Platform and which properly record referrals made to such Booking Platform.
2.1.6. “Booking Platform Materials” means the Booking Platform Links and Booking Platform Marks of a Booking Platform.
2.1.7. “Booking Platform Marks” means the website, name, word mark, and any other graphics, logos, designs, scripts, indicia and service names associated with such Booking Platform.
2.1.8. “Chargeback” as defined in Section 9.
2.1.9. “Commission” means a payment made by a Merchant or Booking Platform to SafeGraph resulting from Sales effected by the Service.
2.1.10.
“End User” means a user of an Affiliate Site, including when such user engages a Merchant experience via a Booking Platform or otherwise through the Service.
2.1.11. “Governmental Authority” means (a) any international, foreign, federal, state, county or municipal government, or political subdivision thereof; (b) any governmental or quasi-governmental agency, authority, board, bureau, commission, department, instrumentality or public body; or (c) any court or administrative tribunal of competent jurisdiction.
2.1.12. “Merchant” means a supplier of goods and/or services to End Users.
2.1.13. “Personal Data” means any information that relates to an identified or identifiable individual (and such term shall include, where required by Applicable Data Protection Laws, unique browser and device identifiers).
2.1.14. “Publisher” means a person or other entity who is entitled to access and use the Service under these Terms of Service, or individuals representing such a person or entity.
2.1.15. “Publisher Revenue” means a Publisher’s share of a Commission, less any Chargebacks.
2.1.16. “Reservation Link Platform” means SafeGraph’s proprietary repository of Booking Platform Materials, made available by SafeGraph in whatever form, which Publishers can use to drive End User engagement of a Merchant’s goods or services for purposes of earning Publisher Revenue.
2.1.17. “SafeGraph Code” means computer code that enables the Service on an Affiliate Site.
2.1.18. “Sale” means a Merchant’s provision of an experience, goods, and/or services to an End User for valuable consideration following a referral of such End User from an Affiliate Site to a Booking Platform.
2.1.19. “Service” means authorized access to and use of the Reservation Link Platform, SafeGraph Code, and any other SafeGraph affiliation technologies that SafeGraph may make available for Publishers’ use.
2.1.20. “Usage Data” means information collected or created by the use of the Service.

3. The Service
3.1.
Once SafeGraph approves a Publisher’s application, SafeGraph shall make available SafeGraph Code for Publisher to implement on the Affiliate Sites, which will enable Publisher’s End Users to make bookings with a Merchant through the applicable Booking Platform.
3.2. SafeGraph may approve or reject an application to register for the Service at its entire discretion, without obligation to provide reasoning. Publisher’s registration for the Service is specific to the Affiliate Site(s) set out in Publisher’s application and it may not use the Service in relation to any Affiliate Site other than those for which it has SafeGraph’s prior permission.
3.3. If Publisher wishes to use the Service in relation to any additional or alternative Affiliate Sites then Publisher may submit an additional application requesting that such Affiliate Sites be added to its list of approved affiliate sites (an “Additional Affiliate Site Request”), which SafeGraph may approve or reject at its entire discretion.

4. Booking Platforms
4.1. Publisher acknowledges and agrees that:
4.1.1. SafeGraph may from time to time, and with immediate effect and at its discretion, integrate or exclude any Booking Platform into or from the Service and each Booking Platform may from time to time, and with immediate effect and at its discretion, integrate or exclude any Merchant;
4.1.2. either SafeGraph or a Booking Platform may from time to time, and with immediate effect, vary Commission rates and/or Publisher Revenue as well as the way in which Commissions and/or Publisher Revenue are calculated;
4.1.3. a Booking Platform may from time to time, and with immediate effect, terminate its involvement in the Service in relation to all or some Publishers or request removal of a specific link, brand, product, or trademark from any or all Affiliate Sites;
4.1.4. SafeGraph may notify Publisher of any changes pursuant to clauses 4.1.1, 4.1.2 or 4.1.3 through any means, in its sole discretion.
If SafeGraph notifies Publisher that it has received a notice from a Booking Platform requesting that Publisher remove links or references to the Booking Platform Materials from Affiliate Sites, Publisher agrees to remove the relevant links and/or references as soon as reasonably practicable (and in any event within two (2) business days of receiving the notice from SafeGraph). It is Publisher’s responsibility to check the Agreement to ensure that Publisher is up to date with such changes; and
4.1.5. SafeGraph shall be entitled to share Usage Data collected or received in connection with the performance of the Service, including reporting Publisher’s performance to Booking Platforms.

5. Use of the Service
5.1. In order to use the Service, Publisher:
5.1.1. must be approved by SafeGraph pursuant to Section 3, and use the SafeGraph Code only on Affiliate Site(s) approved by SafeGraph;
5.1.2. acknowledges and agrees that SafeGraph is entitled to monitor Publisher’s use of the Service to ensure it is being used by Publisher in accordance with this Agreement. In the event Publisher uses any third-party marketing link affiliation service directly alongside the Service, interference in the correct operation of the Service is possible, including interference with the calculation of Publisher Revenue, and SafeGraph’s warranties do not apply;
5.1.3. agrees, represents and warrants that it will comply with all Applicable Laws in its performance of this Agreement, including with respect to the use of the Service;
5.1.4. must comply with Reservation Service Policies (as amended by SafeGraph from time to time); and
5.1.5. must comply with the Privacy Terms (as amended by SafeGraph from time to time).
5.2.
Publisher acknowledges and agrees that it and SafeGraph designate as a third-party beneficiary of this Section 5, including without limitation clause 5.1.4, any Booking Platform whose Booking Platform Materials are available through the Service and such Booking Platform shall accordingly have the right to directly enforce this Section 5 against Publisher with respect to its Booking Platform Materials.
5.3. Notwithstanding any other term or condition set out in the Terms of Use, SafeGraph reserves the right at any time in its sole discretion, without notice or liability to Publisher: (i) to refuse to permit Publisher to use the Service or any portion thereof; and (ii) to amend the measures taken to protect against inappropriate use of the Service.

6. Revenue
6.1. Publisher shall be entitled to its proportional share of Commissions, less any Chargebacks (i.e., “Publisher Revenue”), collected by SafeGraph resulting from Publisher’s use of the Services.
6.2. Publisher acknowledges and agrees that:
6.2.1. the calculation of Commissions due, if any, shall be performed by the relevant Booking Platform, each of which may use different methods for calculating Commissions;
6.2.2. a Booking Platform may refuse to pay or later adjust Commissions on a number of grounds in its sole discretion and that neither SafeGraph nor Publisher has any right to challenge or appeal a Booking Platform’s determination of Commissions due;
6.2.3. where a Merchant participates in more than one Booking Platform and both or all Booking Platforms participate in the Service, SafeGraph may in its sole discretion attribute the Commission to the Booking Platform of its choosing.
6.3. SafeGraph shall collect, calculate, and aggregate all Publisher Revenue due in connection with the Service. SafeGraph will take commercially reasonable steps to report collections, allocations, and disbursements of Commissions and Publisher Revenue to Publisher.

7. Payment Terms
7.1.
SafeGraph will pay Publisher the Publisher Revenue for a given month, minus any Chargebacks made in accordance with Section 9 and any pending payments from Merchants and/or Booking Platforms, within 92 days after the end of the month in which such Publisher Revenue was earned or within 30 days of receiving payment from the Merchants and/or Booking Platforms, whichever is later.
7.2. Payments will be made in accordance with the payment method selected by Publisher in its application. While SafeGraph is responsible for the cost of making payments, Publisher is solely responsible for any fees charged by Publisher’s bank or other provider for receiving funds.
7.3. Publisher acknowledges and agrees that it is solely responsible for ensuring that its bank account details and all other necessary payment information on the application (“Payment Details”) are correct and up to date at all times, and that SafeGraph is not required either to verify the Payment Details or to notify Publisher if it discovers that the Payment Details are incorrect.
7.4. If SafeGraph is unable to pay Publisher Revenue to Publisher as a result of the Payment Details being out of date or otherwise incorrect, then Publisher shall be entitled to notify SafeGraph of the correct Payment Details and request payment of such invoice during the period ending on the last day of the calendar year in which the invoice was issued or, if earlier, the last day of the six-month period following the date of the invoice (“Claim Period”). If Publisher has not notified SafeGraph of the correct Payment Details and requested payment within the Claim Period then Publisher hereby unconditionally and irrevocably waives its right to payment of the relevant Publisher Revenue.

8. Taxes
8.1.
Publisher is responsible for all taxes applicable to its use of the Services and performance under the Agreement, provided that taxes may be deducted or withheld from any payments made to Publisher hereunder as SafeGraph determines to be required by Applicable Laws, and payment to Publisher as reduced by such deductions or withholdings will constitute full payment and settlement to Publisher of such payment. Publisher may not charge and SafeGraph will not be liable for any income taxes imposed on Publisher or any other taxes or charges assessed against Publisher or associated with the operation of Publisher’s business. Prior to Publisher receiving any payments hereunder, Publisher will deliver all required tax documentation to SafeGraph in a manner reasonably requested by SafeGraph. Additionally, Publisher will provide SafeGraph with any forms, documents or certifications as SafeGraph may reasonably request.

9. Chargebacks
9.1. A Booking Platform may require SafeGraph to reverse the Commission paid in respect of a Sale (a “Chargeback”) in certain circumstances, including (but not limited to) the following:
9.1.1. the Sale was not a bona fide transaction;
9.1.2. the relevant goods or services were not utilized or returned by the End User; or
9.1.3. it is discovered that the transaction was fraudulent.
9.2. In the event of a Chargeback, the Publisher Revenue will be reduced accordingly. Publisher acknowledges and accepts that information regarding individual Chargebacks is not available and that Publisher has no right to appeal or otherwise challenge a Chargeback.
9.3. Publisher acknowledges and agrees that the Commissions remain subject to Chargebacks, and are therefore conditional, even after they have been paid to SafeGraph or Publisher.

10. Service Availability
10.1.
SafeGraph will use commercially reasonable efforts to ensure that the Service works on Affiliate Sites but gives no warranty that the Service, which is otherwise provided “AS IS” and “AS AVAILABLE”, will achieve any minimum availability or response targets.
10.2. Publisher agrees to notify SafeGraph promptly of any Service availability or performance issues via e-mail to notices@safegraph.com. SafeGraph will use commercially reasonable efforts to correct any reported issues as soon as reasonably practicable.

11. Service Suspension
11.1. If SafeGraph has reasonable grounds to believe that Publisher is not using the Service in accordance with the terms of the Agreement, including in breach of the Reservation Services Policies, SafeGraph may, without limiting any other remedy available at law or in equity and without limiting Clause 5.2 hereof:
11.1.1. request Publisher either to remedy the breach or other default within such time frame as SafeGraph may reasonably require; or
11.1.2. if SafeGraph in its discretion considers that the breach or other default is sufficiently serious, or if Publisher has failed to respond to SafeGraph’s request under clause 11.1.1 to SafeGraph’s reasonable satisfaction, suspend Publisher’s access to the Service, in whole or in part, with immediate effect and without any obligation to provide prior notice (a “Service Suspension”).
11.2. As soon as reasonably practicable following a Service Suspension, SafeGraph will notify Publisher of the reason(s) for the Service Suspension and, where applicable, confirm the steps that Publisher is required to take before SafeGraph may elect to reinstate Publisher’s access to the Service.
11.3. SafeGraph may, but shall have no obligation to, reinstate Publisher’s access to the Service after:
11.3.1. Publisher provides written certification that the breach or grounds giving rise to the Service Suspension have been completely remedied; and
11.3.2.
At its election, SafeGraph has performed tests or otherwise is able to satisfy itself that such breach or default has in fact been adequately remedied.

12. Term and Termination of the Agreement
12.1. The Agreement will commence when SafeGraph notifies Publisher in accordance with these Terms of Service that Publisher’s application has been accepted and, unless earlier terminated in accordance with clause 12.2, will continue until either party gives the other party written notice of termination at any time via email. For the avoidance of doubt, either party may terminate this Agreement with immediate effect at any time for any or no reason without liability to the other party. Notices to SafeGraph shall be sent to notices@safegraph.com. Notices to Publisher will be sent to the email address provided by Publisher in its application or as updated in accordance with this Agreement.
12.2. Either party may terminate this Agreement with immediate effect if:
12.2.1. the other party becomes bankrupt, insolvent or unable to pay its debts in accordance with applicable laws; or
12.2.2. the other party is in breach of any material term of the Agreement and, in the case of a breach capable of remedy, has failed to remedy such breach within three (3) days of having been notified in writing of such breach.
12.3. Upon termination of the Agreement:
12.3.1. Publisher will immediately cease all use of the Service, remove all instances of the Service and Booking Platform Materials from all Affiliate Sites, and promptly return to SafeGraph, or at SafeGraph’s written request, destroy, any and all of its intellectual property rights, information and/or materials, or those of any Booking Platform provided hereunder, in Publisher’s possession; and
12.3.2.
except where (i) the Agreement is terminated by SafeGraph under clause 12.2 (in which case Publisher forfeits all rights to receive any further payments) or (ii) the Publisher Revenue accrued (less any Chargebacks) is less than $65 (USD), SafeGraph shall pay Publisher the Publisher Revenue accrued (less any Chargebacks and pending payments from Merchants or Booking Platforms) not later than the date falling three (3) months after the effective date of termination.

13. Grant of Rights
13.1. SafeGraph grants to Publisher a non-exclusive, non-transferable, non-sublicensable, revocable, worldwide license to use the Service during the term of the Agreement in accordance with the terms and conditions hereof.
13.2. Publisher shall not use the Service or Booking Platform Materials in any way other than as set out in the Agreement. Any attempt to interfere with the operation of the Service (or any part thereof) will constitute a breach of the Agreement.
13.3. All intellectual property rights subsisting in the Service (or any part thereof, including without limitation the SafeGraph Code and Reservation Link Platform), and in any developments, enhancements, data, information and other material relating to, arising out of or derived from the Service, or any part thereof (“Derivative Works”), shall at all times be owned by and vest in SafeGraph, other than the Booking Platform Materials which shall remain at all times the intellectual property of the relevant Booking Platform. Except as expressly set forth in clause 13.1, Publisher agrees that the Agreement does not transfer or grant any right, title or interest in any other party’s intellectual property, including SafeGraph’s intellectual property rights in or to the Service, or the Derivative Works (or any part thereof) to Publisher.
All intellectual property rights subsisting in the Affiliate Sites shall at all times be owned by and vest in Publisher, except to the extent necessary for SafeGraph to exercise its rights or perform its obligations hereunder including, without limitation, those set forth in Section 16.
13.4. Each party owns all data, if any, that such party collects in connection with the Service. As to SafeGraph, such data may include and is not limited to Usage Data. The foregoing shall further include any reports created, compiled, analyzed, or derived by a party with respect to such data. SafeGraph’s data collection practices are reflected in its Privacy Policy, which Publisher should periodically review for updates.
13.5. Publisher is not required to provide any feedback or suggestions to SafeGraph. To the extent Publisher does provide any such feedback or suggestions, Publisher hereby grants to SafeGraph and its affiliates a non-exclusive, perpetual, irrevocable, royalty-free, transferable, worldwide right and license to use, reproduce, disclose, sublicense, distribute, modify, and otherwise exploit all such feedback and suggestions without restriction.

14. Privacy and Data Protection
14.1. Publisher acknowledges and agrees that it will not send or make available to SafeGraph any Personal Data belonging to a third party in connection with the Service. To the extent any Personal Data is collected in connection with the Service, the parties agree to SafeGraph’s Privacy Terms located here, as updated from time to time, and to maintain any such data in compliance with Applicable Laws.
14.2. Each party shall implement appropriate technical and organizational security measures to protect Personal Data collected in connection with the Service from accidental or unlawful destruction, loss, alteration, and unauthorized disclosure or access, consistent with the requirements of Applicable Data Protection Laws.
Each party assumes responsibility for its collection, use, processing, and maintenance of Personal Data, if any.

15. Assignment
15.1. Publisher may not assign or otherwise transfer its rights and/or obligations under the Agreement, whether in whole or in part. SafeGraph may transfer its rights and/or obligations under the Agreement, whether in whole or in part, without Publisher’s consent.

16. Communication
16.1. Any notice under the Agreement shall be in writing and shall be made either via email or certified mail to the other party’s registered office address. Notices sent by email will be deemed effective 24 hours from the time of sending to the other party’s provided email address and notices sent by mail will be deemed effective 48 hours after posting.
16.2. Publisher agrees that SafeGraph may use Publisher’s contact details (including its email and registered address) to notify it about its account with SafeGraph, any issues relating to, and updates to, the Service, and any modifications to the Terms of Use.
16.3. SafeGraph may disclose its relationship with Publisher in its marketing material and in its operational relationship with Booking Platforms and Merchants, including using for such purposes Publisher’s name and trademarks.

17. Modifications
17.1. SafeGraph may modify all or any part of this Agreement, at any time and at its sole discretion, with immediate effect. Publisher’s continued participation in the Service following any such modification to this Agreement will constitute Publisher’s acceptance of the modification.

18. Indemnity
18.1.
Publisher shall indemnify, defend and hold harmless SafeGraph against all losses, liabilities, damages and costs (including legal expenses) sustained, incurred or suffered by SafeGraph as a result of any claim, action or proceeding that: (i) the Affiliate Sites infringe the intellectual property rights of any third party; (ii) Publisher is in breach of its obligations under the Terms of Use; or (iii) any third party claims arising from Publisher’s use of the Service otherwise than in accordance with the Terms of Use.

19. Limitations on Liability
19.1. Except as expressly and specifically provided in the Agreement, all warranties, conditions, representations and other terms of any kind, whether express or implied, are, to the fullest extent permitted by law, excluded from the Agreement. In particular (but without prejudice to the generality of the foregoing), SafeGraph makes no express or implied warranties or representations with respect to the operation or availability of the Service, the participation of any Booking Platform or Merchant, or to the optimization of Commissions. SafeGraph will not be liable for the consequences of any interruptions to or errors in the Service.
19.2. SafeGraph shall not be liable for: loss of profits; loss of business; depletion of goodwill or similar losses; loss of anticipated savings; or loss of goods; or loss of use; or loss or corruption of data or information; or any special, indirect, consequential or pure economic loss (whether or not falling in any of the foregoing categories), costs, damages, charges or expenses.
19.3. In no event will SafeGraph’s total aggregate liability under or in connection with the Agreement, whether for breach of contract, tort (including negligence), misrepresentation or any other legal theory, exceed an amount equal to SafeGraph’s share of Commissions during the twelve-month period preceding the date on which the claim arose.
19.4.
Nothing in the Agreement excludes the liability of either party for any other liability which cannot be excluded under applicable law, including fraud, fraudulent misrepresentation, or death or personal injury caused by either party’s negligence.

20. Force Majeure
20.1. Neither party shall be liable to the other by reason of any event arising which is beyond the reasonable control of the affected party, including any industrial action (save in respect of the affected party’s employees or suppliers), governmental regulations, fire, flood, disaster, civil riot or war.

21. Entire Agreement
21.1. The Agreement constitutes the whole agreement between the parties relating to its subject matter and supersedes any prior drafts, agreements, undertakings, representations, warranties and arrangements of any nature, whether in writing or oral, relating to such subject matter.

22. Governing Law, Jurisdiction and Venue
22.1. The Agreement shall be governed by and construed in accordance with the laws of the State of Delaware, except for its choice-of-law rules that would result in the law of another State or forum being applied. In relation to any legal action or proceedings to enforce the Agreement or arising out of or in connection with the Agreement, any such proceeding shall be brought exclusively in the state or federal courts of the State of Delaware, and each of the parties irrevocably submits to the exclusive jurisdiction thereof.

#### Safegraph Aliases for Emergency Response
Introducing: SafeGraph Aliases
SafeGraph announces first data product supporting alternative names for Common Places, purpose-built for 9-1-1 emergency response.
Be the first to know about launch updates

Enable telecommunicators with a better understanding of common place names for 9-1-1 emergencies
Most of the time, 9-1-1 callers describe their location using the place names they know, for example, “I’m at the CVS on first street” or “I’m across the street from the Walmart”. Telecommunicators require an in-depth understanding of common place names for local areas so they can quickly identify a caller’s location and respond to the emergency. SafeGraph Aliases was built specifically for this: to help telecommunicators understand every alternative name a 9-1-1 caller may reference.

SafeGraph Aliases handles all the ways that people reference the world around them
- Renamed locations: “I’m at the Tappan Zee Bridge” (which has recently been renamed to the Mario Cuomo Bridge)
- Stores that closed and reopened as something else: “I’m at the Rite-Aid” (now a Walgreens)
- Places within venues: “I’m at the Abercrombie” - an “alias” for the Westfield Mall where the Abercrombie is located
- Alternative store references: “I’m at the TJ’s” - this is another way that people reference the brand Trader Joe’s
- Brand names for local stores: “I’m at the Chevy dealership” - the location may actually be called “Joe’s Auto Dealer”
- Local store names for national brand service: “I’m at Bob’s insurance” - this may actually be a “State Farm” where the main insurance agent is Bob

Save precious response time by removing the need to ask callers for specific address information
Providing more “aliases” for the locations that 9-1-1 callers reference allows telecommunicators to determine the caller’s location more quickly, saving precious seconds. Be confident in responding to emergency situations with a comprehensive and regularly updated database of common place names for every local area.

#### SafeGraph Alternatives | Which POI data is right for you?
SafeGraph Places Alternatives
Which POI data provider is right for you?
See a detailed breakdown of accuracy, precision, freshness, and cost as you compare SafeGraph to other places data.
Learn More

| | SafeGraph | Other providers |
| --- | --- | --- |
| Update frequency | Monthly | Quarterly–Annually |
| Price | $$ | $$$$ |
| Major brand POIs | | |
| Independent stores | | |
| Non-commercial places | | |
| Precise geocodes & polygons | | |
| Transparent documentation | | |

Work with a true data partner
SafeGraph curates the most accurate and fresh places data for product builders so they can spend time on what really matters - developing the best experience for their users. Let us handle the heavy lifting of sourcing and cleaning POIs. Just tell us what you need and we'll get it for you, appending detailed contextual information so you have exactly what you need to build a winning product.

Ditch the stale data
Not all data providers can keep up with a dynamically changing world. Most update their POI databases quarterly or annually, and some rely on crowdsourcing to stay up-to-date. SafeGraph updates our POI database every month so you always have the freshest data in your product.

Get all the data you need with a straightforward price and TOS
Unlike other providers, SafeGraph has transparent and flexible usage terms for our POI data. We deliver data to our partners where they want to receive it (Snowflake, AWS, Esri, or a simple CSV, just to name a few), and don't nickel-and-dime based on how often the data is used or what it is used for. Just make one simple purchase of what you need and receive monthly updates to bake into your product.

Trusted by Leading Innovators
“When the dataset is good, we can trust it to make business decisions. Working with SafeGraph data has made our jobs much easier in this regard.” Andy Stevens, Chief Data Officer at Clear Channel Europe
“With tracking, consent, and privacy being such hot-button issues these days, we needed to make sure we had all of our bases covered.
SafeGraph ended up being the perfect combination of data quality, business integrity, and standardized delivery - all wrapped into one” Scott Stoltzman, Director of Data Science, RCLCO
“We pored through spreadsheets to isolate categories and look for issues in the data. And SafeGraph was the clear winner. There was just so much weird, junky stuff in the other datasets, it just didn’t pass basic data quality. So kudos to SafeGraph for a solid product.” Nic Babb, VP of Product, Adomni

Talk to one of our data experts.

#### Safegraph Places for Retail
HIGH QUALITY POI DATA
Fresh, accurate places data built for retailers
Utilize fresh points of interest (POI) to attract high value customers, open stores at the right sites, and make the best strategic decisions.
Get in touch with one of our data experts.

SafeGraph’s Guide to Developing a Winning Retail Strategy with Location Data
POI data can give retail businesses a competitive edge in market analysis, site selection, promotional strategy, and store planning.
Simply knowing how to use this data to your advantage will give you the power to adapt your retail strategies in real time to ever-evolving consumer and market dynamics.
Read the guide

Uncover Complex Market and Brand Insights in a Changing Economy
With the detailed information provided by SafeGraph Places data, Sysco’s market, customer, and competitive intelligence team is able to deliver detailed reports to the leadership team and identify areas of opportunity.
Read the case study

How to use POI data for Catchment Area Analysis
Using comprehensive and up-to-date POI data will not only fuel your efforts around catchment area analysis to improve site selection, but also help you glean actionable insights around how to create better overall customer experiences that local consumers actually want and need.
Read the Blog

#### Schedule a Demo
See What Places Data Can Do for Your Business
Schedule a demo with one of our data experts by filling out the form below.
Trusted by Industry Leaders
Our team is here to answer your questions
Get a detailed introduction to SafeGraph's points of interest (POI) data tailored to your unique use case. Learn about pricing and delivery options.
Looking for something else? For press inquiries, email press@safegraph.com

#### Spend
Enrich POIs with Aggregated Transaction Data
See how and when people are spending their money at specific locations. Combine POI, polygon, and spending data to understand how consumers behave by geography.
Schedule a Demo | Download Sample

Aggregated Permissioned and Anonymized Consumer Spending Data for Places
By layering aggregated transaction data over high-quality point of interest data, you can gain a clear understanding of consumer spending behavior.
You can analyze spending trends over time at a set location, between locations, and even across entire regions.
Spend data provides detailed credit and debit transaction information like:
- Median spend per transaction
- Spend by day
- Spend by customer frequency
- Online vs in-person spend

Coverage
The Most Comprehensive Transaction Data Tied to Places
Spend data is built with the largest source of credit and debit transactions available, plus a proprietary methodology to match transaction data to individual POIs for location-based insights.
Watch the Overview

Data Enrichment
Aggregated and Anonymized Transaction Data for Financial Context
Enriching points of interest (POIs) with transaction data enables you to see the whole picture, whether for competitive intelligence, site profiling, or audience building.
See SafeGraph Spend in Action

Ready to Use
Data Comes Cleaned and Ready to Use
Transaction data is messy, often requiring complex data engineering to be useful. Other providers do aggregation, but at too high a level to be insightful. Leveraging our expertise in the POI space, SafeGraph provides aggregated transaction data at the place-level for the most flexibility in analysis.
Learn More

Download a Free Sample of Spend Data
Explore aggregated transaction data tied to places.
Get Your Free Geometry Data Sample

Everything You Need to Get Started
Access Data Specs and Delivery Information

Spend Technical Docs
Access all documents related to our transaction attributes below.
- Spend Data Schema: See the schema for the Spend dataset so you understand how it pairs with Places and all relevant metadata.
- Technical Notebook: We outline exactly how our data is curated, and explore how to quantify sampling bias.
- Summary Statistics: See how comprehensive the Spend dataset is, detailing the attributes, brands, and regional coverage included.

Delivery
Data can be accessed simply using any of the following 3 methods:
- Bulk Download: Access reliable, places-based transaction data at scale. Configure an S3 bucket for streamlined monthly data deliveries.
- Snowflake: Query and download SafeGraph Spend directly through Snowflake to easily integrate transaction data into your workflows.
- Request a Sample: Reach out and try SafeGraph Spend for yourself. Explore the columns included and see the possibilities for enriching your current data.

FAQs

What is aggregated transaction data & what does it include in Spend?
Aggregated transaction data is data on individual transactions such as payments, purchases, loans, bookings, and more. Transaction data is aggregated when it comes from multiple sources and is compiled into a single usable collection of data. Personal information is removed from transaction data so that individuals cannot be tied directly to their purchases; this allows investors, retailers, financial institutions, and more to use spending data without compromising individual identities. Instead, spending data can be used to understand how groups of people near a specific POI or in a specific area (such as a neighborhood) interact with the places around them.

How does SafeGraph build Spend data?
From raw transaction data, we use a proprietary algorithm to match anonymized transactions to POIs, leveraging the high quality information from our Places dataset to ensure high veracity matches. From there, we aggregate the matched transactions and apply rigorous QA methods to create the final Spend product. We source transactions from a well-known financial data aggregator used by some of the largest financial institutions. We receive consumer-permissioned and de-identified credit and bank card spending data from accounts at thousands of financial institutions in the US, which we then aggregate to our POI dataset.

How often is Spend data updated?
Spend data is aggregated monthly and delivered about two weeks after the final day of the prior month. This delay is to ensure that all transactions appropriately settle and to provide enough processing time.

Does SafeGraph provide historical Spend data?
Currently, we have transaction records going back to January of 2020.

Which geographies does SafeGraph provide Spend data for?
Spend data is currently only available in the US. Interested in another geography? Let us know.

Resources
- VIDEO: SafeGraph Spend: Aggregated & Anonymized Consumer Spending Data for Places
- BLOG: Cross Shopping Behaviour: See Where Else Consumers Spend Money
- BLOG: Are US Inflation Trends Reflected in SafeGraph Spend?
- BLOG: Validating Spend Data for Brands Against Company Reporting
- EVENT: Announcing SafeGraph Spend: Places-Based Transaction Data

Learn More About Accurate and Precise Global POI Data

#### Terms of Services
Terms of Service
Data License Agreement
The dataset (“Data”) that you are licensing is created and owned by SAFEGRAPH, INC., a Delaware corporation (“Licensor” or “SafeGraph”).
You are purchasing the Data through Licensor’s online shop (“Shop”) and your use of the Data is governed by this Data License (“License” or “Agreement”), which constitutes a binding legal agreement between you (“Licensee”) and Licensor.

EACH TIME YOU USE THE DATA, YOU ARE ACCEPTING THIS LICENSE. IF YOU DO NOT AGREE TO THE LICENSE, LICENSOR IS UNWILLING TO GRANT YOU THE RIGHT TO USE THE DATA, AND YOU MUST CEASE USE OF THE DATA IMMEDIATELY. YOU ACCEPT THE LICENSE BY (1) CLICKING TO AGREE OR ACCEPT WHERE THESE OPTIONS ARE PRESENTED TO YOU, (2) ACTUALLY USING THE DATA AND/OR (3) UPLOADING ANY OF YOUR DATA TO THE SHOP. IF YOU ARE ACCEPTING ON BEHALF OF YOUR EMPLOYER OR ANOTHER ENTITY, YOU REPRESENT AND WARRANT THAT: (I) YOU HAVE FULL LEGAL AUTHORITY TO BIND YOUR EMPLOYER OR SUCH ENTITY TO THE LICENSE; (II) YOU HAVE READ AND UNDERSTAND THE LICENSE; AND (III) YOU AGREE, ON BEHALF OF THE PARTY THAT YOU REPRESENT, TO THE LICENSE. IF YOU DON'T HAVE THE LEGAL AUTHORITY TO BIND, PLEASE DO NOT CONFIRM THAT YOU AGREE OR USE THE DATA.

THIS IS A 1-YEAR LICENSE. YOU MUST DELETE THE DATA AFTER 1 YEAR.

1. License.

(a) License Grant. Subject to and conditioned on Licensee’s payment of fees and compliance with all the terms and conditions of this Agreement, Licensor hereby grants Licensee a non-exclusive, non-sublicensable, revocable, and non-transferable license during the Term to use the Data for “Permitted Uses”: (i) internal business or internal research purposes, and/or (ii) the creation of external products, applications, research publications and analyses based on the Data so long as (A) only non-material portions of the Data are exposed to third parties and (B) such products are not competitive with the offering of the Data for sale.

(b) Use Restrictions. Licensee shall only use the Data for the Permitted Uses.
Licensee shall not at any time, directly or indirectly: (i) sell, sublicense, assign, distribute, publish, transfer, disclose or otherwise make available the Data in its current form or substantially similar form, (ii) permit users of any product or service that incorporates the Data to download or export material portions of the Data (where “material portions” means a set of data that could be marketed independently, or reverse-engineered to discover any portion of the Data), (iii) use the Data to create or host any commercially available mailing list, point of interest database or business listings database, (iv) use the Data in any manner or for any purpose that infringes, misappropriates, or otherwise violates any intellectual property right or other right of any person, or that violates any applicable law, (v) use the Data to attempt to identify behavior of a known individual for any reason, (vi) use the Data to do advertising targeting or attribution of individuals based on visits to any health care point of interest, (vii) use the Data to analyze, study, or report on protests or social demonstrations, or (viii) as it solely relates to Data referred to as Spend, use the Data for purposes of investing in financial instruments or in connection with any other financial service or use.

In addition to and without limiting the foregoing, Licensees accessing the Data through SafeGraph’s API shall not: (1) cache the Data in or on any medium for any period of time; (2) implement any measure that might avoid or circumvent SafeGraph’s API usage limitations or interfere with the accuracy of reporting; (3) attempt to circumvent any API limits, including, but not limited to, mass-registration of applications; or (4) append materials or content to API requests or queries, unless approved in advance by SafeGraph.

(c) Reservation of Rights. Licensor reserves all rights not expressly granted to Licensee in this Agreement.
Except for the limited rights and licenses expressly granted under this Agreement, nothing in this Agreement grants, by implication, waiver, estoppel, or otherwise, to Licensee or any third party any intellectual property rights or other right, title, or interest in or to the Data. Without limiting the foregoing, Licensee shall not acquire any proprietary rights of the Data.

(d) Additional Restrictions. Additional third party restrictions are set forth in Section 10.

2. Fees and Delivery.

(a) Fees. Fees for this License will be as set forth in the Shop (the “Fees”). Licensee is responsible for the timely payment of the Fees. If Licensee fails to make any payment when due, in addition to all other remedies that may be available: Licensor may prohibit access to the Data until all past due amounts have been paid, without incurring any obligation or liability to Licensee or any other person by reason of such prohibition of access to the Data.

(b) Taxes. Licensee is responsible for all sales, use and excise taxes, and any other similar taxes, duties and charges of any kind imposed by any federal, state or local governmental or regulatory authority on the Fees, other than any taxes imposed on Licensor’s income.

(c) Delivery. The Data will be delivered to Licensee by Licensor through the Shop (either through flat file delivery or API, as applicable), or as otherwise agreed to. Licensor has no liability due to a delay in delivery or any temporary interruption in service of the Shop or API.

3. Data Security and Licensee Covenants.

(a) Data Security. Licensee shall use all reasonable legal, organizational, physical, administrative and technical measures, and security procedures to safeguard and ensure the security of the Data and to protect the Data from unauthorized access, disclosure, duplication, use, modification, or loss.

(b) Licensee Representations and Covenants.
Licensee represents and warrants that it has the full right, power and authority to enter into this Agreement and to perform its obligations hereunder; and that Licensee’s use of the Data and performance of this Agreement shall not violate, conflict with, or result in a material default under any other agreement, including confidentiality agreements between Licensee and third parties. Licensee covenants to maintain, hold and process the Data in compliance with all applicable laws. Licensee covenants it shall not attempt to reverse engineer, decompile, or otherwise re-identify the Data by using any method, including, but not limited to, merging external data with Data provided by Licensor. Licensee agrees to not (i) circumvent security features used to prevent or restrict access to or use the Shop or API, (ii) create user accounts by automated means or (iii) impersonate any person or entity.4. Intellectual Property Ownership.Licensee acknowledges that, as between Licensee and Licensor, Licensor owns all right, title and interest, including all intellectual property rights, in and to the Data. Licensee further acknowledges that: (a) the Data is an original compilation protected by United States copyright laws; (b) Licensor has dedicated substantial resources to collect, manage and compile the Data; and (c) the Data constitutes trade secrets of Licensor. If Licensee contests any of Licensor’s right, title, or interest in or to the Data, including without limitation, in a judicial proceeding anywhere throughout the world, (a) Licensor may terminate this Agreement without advance notice to Licensee or an opportunity for Licensee to cure and without further obligation or liability and (b) Licensee acknowledges and agrees that it will be in material breach under this Agreement.‌5. Disclaimer of Warranties.THE DATA IS PROVIDED “AS IS” AND LICENSOR HEREBY DISCLAIMS ALL WARRANTIES, WHETHER EXPRESS, IMPLIED, STATUTORY OR OTHERWISE. 
LICENSOR SPECIFICALLY DISCLAIMS ALL IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, TITLE AND NON-INFRINGEMENT, AND ALL WARRANTIES ARISING FROM COURSE OF DEALING, USAGE OR TRADE PRACTICE. LICENSOR MAKES NO WARRANTY OF ANY KIND THAT THE DATA, OR ANY PRODUCTS OR RESULTS OF ITS USE, WILL MEET LICENSEE’S OR ANY OTHER PERSON’S REQUIREMENTS, OPERATE WITHOUT INTERRUPTION, ACHIEVE ANY INTENDED RESULT, BE COMPATIBLE OR WORK WITH ANY SOFTWARE, SYSTEM OR OTHER SERVICES, OR BE SECURE, ACCURATE, COMPLETE, OR ERROR FREE.

6. Indemnification.

Licensee shall indemnify, hold harmless, and, at Licensor’s option, defend Licensor from and against any and all losses, damages, liabilities, or costs (including attorneys’ fees) (“Losses”) resulting from any third-party claim, suit, action, or proceeding (“Third-Party Claim”) based on Licensee’s: (i) negligence or willful misconduct; (ii) breach of representation or warranty hereunder or (iii) use of the Data in a manner not authorized by this Agreement, provided that Licensee may not settle any Third-Party Claim against Licensor unless such settlement completely and forever releases Licensor from all liability with respect to such Third-Party Claim or unless Licensor consents to such settlement, and further provided that Licensor shall have the right, at its option, to defend itself against any such Third-Party Claim or to participate in the defense thereof by counsel of its own choice. Use of Licensed Data by Licensee shall be at its own risk and Licensee will indemnify, defend and hold harmless Licensor’s third-party suppliers from and against all Losses that arise with respect to Licensee’s use of the Data.

7.
Limitations of Liability.

IN NO EVENT WILL LICENSOR BE LIABLE UNDER OR IN CONNECTION WITH THIS AGREEMENT UNDER ANY LEGAL OR EQUITABLE THEORY, INCLUDING BREACH OF CONTRACT, TORT (INCLUDING NEGLIGENCE), STRICT LIABILITY, AND OTHERWISE, FOR ANY (a) CONSEQUENTIAL, INCIDENTAL, INDIRECT, EXEMPLARY, SPECIAL, ENHANCED OR PUNITIVE DAMAGES, (b) INCREASED COSTS, DIMINUTION IN VALUE OR LOST BUSINESS, PRODUCTION, REVENUES, OR PROFITS, (c) LOSS OF GOODWILL OR REPUTATION, (d) USE, INABILITY TO USE, LOSS, INTERRUPTION, DELAY OR RECOVERY OF ANY DATA OR BREACH OF DATA OR SYSTEM SECURITY, OR (e) COST OF REPLACEMENT GOODS OR SERVICES, IN EACH CASE REGARDLESS OF WHETHER LICENSOR WAS ADVISED OF THE POSSIBILITY OF SUCH LOSSES OR DAMAGES OR SUCH LOSSES OR DAMAGES WERE OTHERWISE FORESEEABLE. IN NO EVENT WILL LICENSOR’S AGGREGATE LIABILITY ARISING OUT OF OR RELATED TO THIS AGREEMENT, UNDER ANY LEGAL OR EQUITABLE THEORY, INCLUDING BREACH OF CONTRACT, TORT (INCLUDING NEGLIGENCE), STRICT LIABILITY AND OTHERWISE, EXCEED THE TOTAL FEES PAID BY LICENSEE IN THE YEAR PERIOD PRECEDING THE EVENT GIVING RISE TO THE CLAIM.

8. Term and Termination.

(a) Term. The term of the license set forth in this Agreement begins on the date Licensee purchases or receives (whichever occurs first) any Data via Shop flat file or API request or otherwise and, unless terminated earlier pursuant to any of the Agreement’s express provisions, will continue in effect until the one-year anniversary of such date (the “Term”). Each purchase of Data triggers a new Term with respect to that piece of Data. Licensee and Licensor may agree to extend the Term by separate written agreement.

(b) Termination. In addition to any other express termination right set forth elsewhere in this Agreement, Licensor may terminate this Agreement, effective on written notice to Licensee, if Licensee breaches any of its obligations under this Agreement.

(c) Effect of Expiration or Termination.
Upon expiration or earlier termination of this Agreement, the license granted hereunder will also terminate, and, without limiting Licensee’s obligations under Section 3, Licensee shall within 5 business days cease using and delete all copies of the Data (except that Licensee may retain any Placekeys in perpetuity). Within 15 days of Licensor’s request, Licensee shall certify in writing to the Licensor that the Data has been deleted. No expiration or termination will affect Licensee’s obligation to pay all Fees that may have become due before such expiration or termination, or entitle Licensee to any refund.(d) Survival. This Section 8(d) and Sections 3 (Data Security and Licensee Covenants), 4 (Intellectual Property Ownership), 5 (Disclaimer), 6 (Indemnification), 7 (Limitations of Liability), 9 (Miscellaneous), 10 (Third Party Terms), 11 (Academic Partnership Program) and 12 (Matching Service) survive any termination or expiration of this Agreement. No other provisions of this Agreement survive the expiration or earlier termination of this Agreement.‌9. Miscellaneous.‌(a) Entire Agreement; Interpretation. This Agreement constitutes the sole and entire agreement of the parties with respect to the subject matter of this Agreement and supersedes all prior and contemporaneous understandings, agreements, and representations and warranties, both written and oral, with respect to such subject matter. Nothing in this Agreement shall create any rights in any third party beneficiaries. The parties agree that any principle of construction or rule of law that provides that an agreement shall be construed against the drafter of the agreement in the event of any inconsistency or ambiguity in such agreement shall not apply to the terms and conditions of this Agreement.(b) Notices. All notices, requests, consents, claims, demands, waivers, and other communications hereunder (each, a “Notice”) must be in writing. 
The parties shall deliver Notices by personal delivery, nationally recognized overnight courier (with all fees prepaid), email or certified or registered mail (in each case, return receipt requested, postage prepaid). Except as otherwise provided in this Agreement, a Notice is effective only: (i) upon receipt by the receiving party, and (ii) if the party giving the Notice has complied with the requirements of this Section. Refusal to accept Notice shall be deemed receipt.(c) Force Majeure. In no event shall Licensor be liable to Licensee, or be deemed to have breached this Agreement, for any failure or delay in performing its obligations under this Agreement, if and to the extent such failure or delay is caused by any circumstances beyond Licensor’s reasonable control, including but not limited to acts of God, flood, fire, earthquake, explosion, war, terrorism, invasion, riot or other civil unrest, strikes, labor stoppages or slowdowns or other industrial disturbances, or passage of law or any action taken by a governmental or public authority, including imposing an embargo.(d) Amendment and Modification; Waiver. Licensee’s rights and obligations under this License may be amended or modified from time to time and at any time. If any such amendment or modification is material, Licensor will post notice of it on the Shop or by email to registered users. Your access of the Shop or API and use of the Data following any such amendment or modification shall be deemed your acceptance of such amendment and modification. You agree to review this License periodically to be aware of such amendments and modifications. No waiver by any party of any of the provisions hereof will be effective unless explicitly set forth in writing and signed by the party so waiving. 
Except as otherwise set forth in this Agreement, (i) no failure to exercise, or delay in exercising, any rights, remedy, power or privilege arising from this Agreement will operate or be construed as a waiver thereof and (ii) no single or partial exercise of any right, remedy, power, or privilege hereunder will preclude any other or further exercise thereof or the exercise of any other right, remedy, power, or privilege.(e) Severability. If any provision of this Agreement is invalid, illegal, or unenforceable in any jurisdiction, such invalidity, illegality, or unenforceability will not affect any other term or provision of this Agreement or invalidate or render unenforceable such term or provision in any other jurisdiction. Upon such determination that any term or other provision is invalid, illegal, or unenforceable, the parties hereto shall negotiate in good faith to modify this Agreement so as to effect the original intent of the parties as closely as possible in a mutually acceptable manner in order that the transactions contemplated hereby be consummated as originally contemplated to the greatest extent possible.(f) Governing Law; Submission to Jurisdiction. This Agreement is governed by and construed in accordance with the internal laws of the State of Delaware without giving effect to any choice or conflict of law provision or rule that would require or permit the application of the laws of any jurisdiction other than those of the State of Delaware. Any legal suit, action, or proceeding arising out of or related to this Agreement or the licenses granted hereunder may be instituted exclusively in the federal courts of the United States or the courts of the State of Delaware in each case located in the state of Delaware, and each party irrevocably submits to the exclusive jurisdiction of such courts in any such suit, action, or proceeding.(g) Assignment. 
Licensee may not assign or transfer any of its rights or delegate any of its obligations hereunder, in each case whether voluntarily, involuntarily, by operation of law or otherwise, without the prior written consent of Licensor. Any purported assignment, transfer, or delegation in violation of this Section is null and void. Licensor may assign this Agreement to a successor in connection with the merger, consolidation, or sale of all or substantially all of its assets or that portion of its business to which this Agreement relates without the consent of Licensee. This Agreement is binding upon and inures to the benefit of the parties hereto and their respective permitted successors and assigns.(h) Export Regulation. The Data may be subject to US export control laws, including the US Export Administration Act and its associated regulations. Licensee shall not, directly or indirectly, export, re-export, or release the Data to, or make the Data accessible from, any jurisdiction or country to which export, re-export, or release is prohibited by law, rule, or regulation. Licensee shall comply with all applicable federal laws, regulations, and rules, and complete all required undertakings (including obtaining any necessary export license or other governmental approval), prior to exporting, re-exporting, releasing, or otherwise making the Data available outside the US.‌(i) Equitable Relief. 
Licensee acknowledges and agrees that a breach or threatened breach by Licensee of any of its obligations under Section 1 (Permitted Uses) or Section 3 (Data Security and Licensee Covenants) would cause Licensor irreparable harm for which monetary damages would not be an adequate remedy and agrees that, in the event of such breach or threatened breach, Licensor will be entitled to equitable relief, including a restraining order, an injunction, specific performance, and any other relief that may be available from any court, without any requirement to post a bond or other security, or to prove actual damages or that monetary damages are not an adequate remedy. Such remedies are not exclusive and are in addition to all other remedies that may be available at law, in equity, or otherwise.

10. Third Party Terms.

If at any time during the Term (i) the Data includes certain places in the United Kingdom; (ii) Licensee’s use cases include visualizations that include park locations; or (iii) there is Data in the future that requires attribution or similar additional terms; Licensee agrees to each of the corresponding terms found at Places Attribution.

11. Academic Partnership Program.

If Licensee is licensing the Data through Licensor's Academic Partnership Program, as indicated and agreed to by Licensee prior to being granted access to the Data, then in addition to the other terms and conditions stated in this Agreement, Licensee agrees that:

(a) If any conflict exists between the Agreement and this Clause 11, Clause 11 shall take precedence.

(b) Definition. Licensee represents and warrants that it is the student of, or employed by, an institute of higher education (“Academic”).

(c) Attribution. Licensee must credit SafeGraph if it publishes or creates content using the Data.

(d) Permitted Use.
Licensee may use the Data solely for non-commercial academic research and publication, and Licensee may not authorize another to use the Data for any commercial, resale, distribution or other purpose.(e) Term. Either Party may terminate this Agreement at any time by notifying the other Party. Upon expiration or termination of this Agreement, the license rights granted herein shall immediately terminate and Licensee shall immediately discontinue all use of the Data and within 5 business days remove or destroy all copies of the Data from Licensee’s hardware. Licensee shall not disclose, retain or use the Data after the expiration or termination of this Agreement.12. Matching Service.(a) If Licensee provides SafeGraph with any of Licensee’s internal database of places whether through upload or other means (the "Licensee Data"), to the extent feasible, Licensor will assign each location represented in the Licensee Data to a SafeGraph place identification number (each, a “SGPID” or Placekey). 
Licensor shall perform these matching services free of charge.(b) In consideration for Licensor providing the matching services, Licensee grants to Licensor an irrevocable, non-exclusive, royalty-free, fully paid up, perpetual, worldwide license, with the right to store, use, reproduce, publish, distribute, perform, and create derivative works from the Licensee Data.(c) Licensor shall not be liable or responsible for the deletion, destruction, damage, loss or failure to store any Licensee Data.(d) Licensee represents and warrants that (i) Licensee owns and controls all of the rights to the Licensee Data, or Licensee otherwise has the lawful right to distribute the Licensee Data to Licensor and grant Licensor the license hereunder, (ii) Licensee’s and Licensor's use and/or transmission of such Licensee Data will not violate any rights to any person or entity or violate any applicable laws, (iii) the Licensee Data does not include any personally identifiable information and (iv) the Licensee Data only includes data relating to places. #### Use Case Use Cases Discover unique use cases and resources for how SafeGraph data can be applied to your industry. Explore expert perspectives, technical guidance, and industry analysis on global POI data, geospatial workflows, and location accuracy. 
Popular right now ARITY + SAFEGRAPH Building Trips + Audiences with SafeGraph Places Learn More Global OOH Improve out of home advertising with SafeGraph Places Learn More Lead Generation Improve Lead Generation With High Quality Places Data Learn More Transaction Enrichment Contextualize Transactions with High Quality Places Data Learn More WOOLBRIGHT + SAFEGRAPH Expand redevelopment efforts with SafeGraph Places Learn More ### Partners #### Access SafeGraph Data in Domo URL: https://www.safegraph.com/partners/domo/ #### ArcGIS SafeGraph Integration Partner Access SafeGraph Places Data in ArcGIS Experiment with POI Data - access SafeGraph Places data directly through ArcGIS Marketplace and Business Analyst. Get Started Effortlessly incorporate POI data into workflows and analyses SafeGraph Places for ArcGIS Marketplace makes it easy to perform in-depth market analysis with any product in the Esri ecosystem, while direct access in Business Analyst enables users to quickly incorporate POI data into their BA workflows.  Give your ArcGIS analysis an edge with SafeGraph data ArcGIS users can leverage a subset of SafeGraph Places for accurate point of interest insights that provide an edge when conducting analytics related to site selection, territory planning, cannibalization, and more. Grocery access in the US and Puerto Rico View the accessibility to grocery stores across the US and Puerto Rico using SafeGraph data and Esri. View the Map SafeGraph Data is Designed with GIS in Mind For deeper insights, download SafeGraph data directly through the ArcGIS Marketplace. Explore SafeGraph Data in ArcGIS Clear Channel Olvin Mobsta Avison Young Spade “When the dataset is good, we can trust it to make business decisions. 
Working with SafeGraph data has made our jobs much easier in this regard.” Andy Stevens, Chief Data Officer, Clear Channel Europe. Case study: Fueling More Targeted, Higher Performing OOH Advertising Campaigns

“With our previous data provider, the polygons didn’t always align to where stores really were. It took us sometimes a month or two to import and clean the data before we could even use it.” Matt Taaffe, VP of Product, Olvin. Case study: Building Accurate Store Visit Attribution Tools with Precise POIs and Place Footprints

“We expect nothing less than the gold standard in data…which is precisely why we decided to partner with SafeGraph.” James Sexton-Barrow, Head of Planning at Mobsta. Case study: Using SafeGraph Data to Give Ad Agencies An Effective Way to Plan Geo-Contextual Campaigns

“With SafeGraph, we’ve not only improved the efficiency and effectiveness of our analysis but also have been able to increase our speed to value—now our analysts can answer our clients’ questions and deliver actionable insights faster than ever before.” Julian Adams, Director of Data Science, Avison Young. Case study: Powering Avison Young's Commercial Real Estate Site Selection Tool

“SafeGraph takes data quality very seriously—which is why, if a POI is included in the Places dataset, we can always trust it’s a real location.” Oban MacTavish, CEO at Spade. Case study: Cleaning and enriching transaction data for the card industry using SafeGraph Places

See what places data can do for your business. Schedule a demo with one of our data experts by filling out the form below.

#### AWS SafeGraph Integration Partner

Access SafeGraph Data directly in AWS

Browse in one place - access SafeGraph data directly through the AWS Marketplace.

Extensive POI data available in AWS

Access SafeGraph Places data directly in the AWS Data Exchange for expedited analysis of points of interest and building footprints.

Power business intelligence in AWS with location data from SafeGraph

Leverage accurate SafeGraph data to deepen location intelligence and inform site selection, market analysis, consumer behavior, community planning, and more.

Canadian Market Analysis using AWS Data Exchange

Explore an analysis of the Canadian QSR Market using SafeGraph POI data from the AWS Data Exchange.
SafeGraph Data Fuels Business Intelligence

For deeper insights, download SafeGraph data directly through the AWS Data Exchange.

#### CARTO SafeGraph Integration Partner

Access SafeGraph Data Directly in CARTO

Skip the two-step process - access SafeGraph data directly through CARTO via CARTO’s Spatial Data Catalog.

Power Spatial Analysis

Access SafeGraph Places data directly in the AWS Data Exchange for expedited analysis of points of interest and building footprints.
Enrich CARTO’s spatial analytics seamlessly with SafeGraph data Leverage SafeGraph Places for accurate point of interest data. Gain detailed context around location intelligence that can inform site selection, market analysis, consumer behavior, community planning, and more. CARTO and SafeGraph on increasing consumer demand SafeGraph and CARTO experts team up to help OOH advertisers with campaign planning and attribution using spatial analytics and places data. Watch the webinar SafeGraph Data is Designed for Spatial Analysis For deeper insights, download SafeGraph data directly through CARTO's Spatial Data Catalog. Explore SafeGraph Data in CARTO Clear Channel Olvin Mobsta Avison Young Spade “When the dataset is good, we can trust it to make business decisions. Working with SafeGraph data has made our jobs much easier in this regard.” Andy StevensChief Data Officer, Clear Channel Europe Fueling More Targeted, Higher Performing OOH Advertising Campaigns Read the case study “With our previous data provider, the polygons didn’t always align to where stores really were. 
It took us sometimes a month or two to import and clean the data before we could even use it.” Matt TaaffeVP of Product, Olvin Building Accurate Store Visit Attribution Tools with Precise POIs and Place Footprints Read the case study “We expect nothing less than the gold standard in data…which is precisely why we decided to partner with SafeGraph.” James Sexton-BarrowHead of Planning at Mobsta Using SafeGraph Data to Give Ad Agencies An Effective Way to Plan Geo-Contextual Campaigns Read the case study “With SafeGraph, we’ve not only improved the efficiency and effectiveness of our analysis but also have been able to increase our speed to value—now our analysts can answer our clients’ questions and deliver actionable insights faster than ever before.” Julian AdamsDirector of Data Science, Avison Young Powering Avison Young's Commercial Real Estate Site Selection Tool Read the case study “SafeGraph takes data quality very seriously—which is why, if a POI is included in the Places dataset, we can always trust it’s a real location.” Oban MacTavishCEO at Spade Cleaning and enriching transaction data for the card industry using SafeGraph Places Read the case study See what places data can do for your business. Schedule a demo with one of our data experts by filling out the form below. #### Databricks SafeGraph Integration Partner Access SafeGraph Places Data in Databricks Experiment with large-scale POI data by analyzing SafeGraph Places datasets directly within the Databricks Lakehouse environment. Get Started Effortlessly incorporate location data into analytics workflows SafeGraph Places data can be accessed through platforms such as AWS Data Exchange and analyzed in Databricks, enabling teams to explore millions of locations and integrate location intelligence into existing data pipelines. Give your Databricks analytics an edge with SafeGraph data Databricks users can combine SafeGraph Places with internal datasets to analyze consumer behavior, market activity, and geographic demand patterns. Built for modern data platforms SafeGraph datasets integrate seamlessly with Databricks, allowing teams to access and analyze high-quality location data within their existing analytics and data science workflows. 
SafeGraph Data for the Databricks Lakehouse For deeper insights, access SafeGraph Places data within Databricks and explore location intelligence alongside your analytics workflows. Explore SafeGraph Data in Databricks Clear Channel Olvin Mobsta Avison Young Spade “When the dataset is good, we can trust it to make business decisions. Working with SafeGraph data has made our jobs much easier in this regard.” Andy StevensChief Data Officer, Clear Channel Europe Fueling More Targeted, Higher Performing OOH Advertising Campaigns Read the case study “With our previous data provider, the polygons didn’t always align to where stores really were. It took us sometimes a month or two to import and clean the data before we could even use it.” Matt TaaffeVP of Product, Olvin Building Accurate Store Visit Attribution Tools with Precise POIs and Place Footprints Read the case study “We expect nothing less than the gold standard in data…which is precisely why we decided to partner with SafeGraph.” James Sexton-BarrowHead of Planning at Mobsta Using SafeGraph Data to Give Ad Agencies An Effective Way to Plan Geo-Contextual Campaigns Read the case study “With SafeGraph, we’ve not only improved the efficiency and effectiveness of our analysis but also have been able to increase our speed to value—now our analysts can answer our clients’ questions and deliver actionable insights faster than ever before.” Julian AdamsDirector of Data Science, Avison Young Powering Avison Young's Commercial Real Estate Site Selection Tool Read the case study “SafeGraph takes data quality very seriously—which is why, if a POI is included in the Places dataset, we can always trust it’s a real location.” Oban MacTavishCEO at Spade Cleaning and enriching transaction data for the card industry using SafeGraph Places Read the case study See what places data can do for your business. Schedule a demo with one of our data experts by filling out the form below. #### SafeGraph and CoreLogic Partnership URL: https://www.safegraph.com/partners/corelogic/ #### Snowflake SafeGraph Integration Partner Access SafeGraph Data Directly in Snowflake Get SafeGraph data delivered directly to Snowflake, or download from the Snowflake Data Marketplace. Schedule a Demo Deliver point of interest data directly in Snowflake Discover SafeGraph Places in Snowflake Marketplace. 
Get the data delivered directly through Snowflake to fast-track your way to detailed insights on where places are located, what they are, and who is visiting them. Give your business intelligence an edge with SafeGraph data Snowflake users can leverage precise SafeGraph data to enrich analyses related to site selection, consumer behavior, investment research, and more. Enrich Your Data with SafeGraph Data For deeper insights, get SafeGraph data delivered directly to Snowflake, or download from the Snowflake Data Marketplace. Explore SafeGraph in Snowflake Clear Channel Olvin Mobsta Avison Young Spade “When the dataset is good, we can trust it to make business decisions. Working with SafeGraph data has made our jobs much easier in this regard.” Andy StevensChief Data Officer, Clear Channel Europe Fueling More Targeted, Higher Performing OOH Advertising Campaigns Read the case study “With our previous data provider, the polygons didn’t always align to where stores really were. It took us sometimes a month or two to import and clean the data before we could even use it.” Matt TaaffeVP of Product, Olvin Building Accurate Store Visit Attribution Tools with Precise POIs and Place Footprints Read the case study “We expect nothing less than the gold standard in data…which is precisely why we decided to partner with SafeGraph.” James Sexton-BarrowHead of Planning at Mobsta Using SafeGraph Data to Give Ad Agencies An Effective Way to Plan Geo-Contextual Campaigns Read the case study “With SafeGraph, we’ve not only improved the efficiency and effectiveness of our analysis but also have been able to increase our speed to value—now our analysts can answer our clients’ questions and deliver actionable insights faster than ever before.” Julian AdamsDirector of Data Science, Avison Young Powering Avison Young's Commercial Real Estate Site Selection Tool Read the case study “SafeGraph takes data quality very seriously—which is why, if a POI is included in the Places dataset, we can 
always trust it’s a real location.” Oban MacTavishCEO at Spade Cleaning and enriching transaction data for the card industry using SafeGraph Places Read the case study See what places data can do for your business. Schedule a demo with one of our data experts by filling out the form below. 
### Case Studies #### Adomni Uses SafeGraph Data to Empower Brands to Plan Highly-Targeted DOOH Campaigns The Problem: How to add more granularity to digital out-of-home campaign planning As the out-of-home advertising ecosystem has evolved from billboard ads to digital screens, brands have started to see the medium in a different light. And now that they’ve started to see the tremendous value in being able to tap into a wealth of data and insights to plan, target, buy, execute, and measure out-of-home advertising campaigns, they want to get even more granular with their planning. Over the last few years, Adomni has been getting more requests from brands and advertisers to reach highly targeted digital out-of-home (DOOH) audiences using location-based data. The only problem is that getting access to clean, accurate, and up-to-date POI data that can fuel this kind of targeting is not always a quick and easy task. In many cases, Adomni’s team had to rely on the advertisers themselves to supply whatever POI data they had to work with or try to build it themselves via public data sources.  Either way, sourcing this data became a major sticking point for a number of reasons. First, the datasets were oftentimes inaccurate or incomplete—especially at the brand- or category-level. Second, sometimes it took too long to get access to the right datasets, which slowed down campaign planning significantly. And lastly, when an advertiser wanted to do competitive conquesting, in many cases there was simply no reliable data to work with at all. So it soon became abundantly clear that in order to address this growing demand from brands and advertisers to plan DOOH campaigns with a POI-based strategy, Adomni needed access to high-quality geospatial data at nationwide scale. 
The Problem-Solver: Adomni
Adomni was founded to give brands and advertisers a new way to buy ad space on digital out-of-home screens through an easy-to-use online marketplace with transparent pricing, advanced audience targeting capabilities, screen-level metrics for detailed forecasting and budgeting, and most importantly, the ability to launch campaigns in a matter of minutes.

Today, they’ve more than delivered on this mission. At the time of this case study, their cutting-edge and highly visual DOOH platform aggregates 700k+ screens across 40+ countries, 40 venue types (e.g. bars, restaurants, airports, and malls), and 350 publishers, delivering 70B impressions monthly (and growing).

What sets Adomni apart from other multichannel programmatic platforms is their core focus on innovating the digital out-of-home advertising space. In addition, their world-class managed services team includes some of the industry’s leading DOOH experts, who help brands and advertisers tap into the full potential of the digital out-of-home advertising space.

#### Avison Young Uses SafeGraph Data to Offer Ground-Level Insights for Commercial Real Estate Site Selection
The Problem: How to help businesses make better, more informed real estate decisions
The question that Avison Young’s clients ask regularly is, “What’s the best location for my office, (industrial) business, or retail location?” But in asking this question, they really want to know more. Is a location easily accessible by highways or public transportation? Are there plenty of lunch options in the area? For retail locations, will the store or restaurant be well-positioned to reach the largest number of target consumers?

In other words, these businesses come to Avison Young not only to make better and more informed commercial real estate site selection decisions but also to get deeper real-time insights into what’s actually happening at ground level around those locations. 
And with local market dynamics (including property availability) changing faster than ever before, having access to the right location data at the right time is critical for faster decision-making.

The Problem-Solver: Avison Young
Avison Young is a data-driven commercial real estate company fueled by a common purpose: To create real economic, social, and environmental value, powered by people. They rally around the idea that anyone engaged in real estate today needs data, technology, and consultative solutions to achieve their unique and rapidly transforming goals. They believe that commercial real estate plays a vital role in creating healthy and productive workplaces, prosperous cities, and other built spaces that can benefit the economy, environment, and local community.

#### Billups OOH Ad Attribution Solution Powered By SafeGraph POI Data
The Problem: Out-of-Home Ad Attribution
How can Billups accurately measure if a consumer was exposed to an out-of-home ad placement?

The Problem-Solver: Billups
Billups is an advertising technology company servicing the out-of-home and digital out-of-home marketplace, translating the physical world into data-driven solutions with the power of the leading outdoor advertising platform. With 18 offices across the US & Canada, Billups is creating audience-centric connections for brands and agencies by building, executing, and measuring intelligent campaigns at scale.

The Backstory: Challenges In Determining Store Visits
The outdoor advertising industry has historically lacked good data upon which to select placements that reach ideal audiences. Without quality data, the industry also struggles to help brands measure campaign effectiveness. Billups pioneered a data-driven solution to the outdoor media placement & measurement problems by using anonymous location data derived from mobile phones.

Billups can determine each anonymized customer’s daily journey, along with which billboards that consumer was exposed to. 
This is done by mapping the journey each device takes with anonymous GPS data that is then cross-referenced with outdoor inventory locations. This process works well for digital brands, where outdoor ad exposure leads to online behavior that can be measured with online pixels and cookies. But for traditional brick-and-mortar retailers, measuring outdoor advertising campaign performance remained challenging. In these cases, Billups needed to understand whether outdoor ad exposures led to store visits.

To do this, Billups needed to determine which businesses customers visit during their daily journey. However, using anonymized geolocation data to determine if a consumer entered a specific business (and not the parking lot or the bordering business) is a complex undertaking.

Enter SafeGraph's POI Data & Geofences
Billups used SafeGraph Places to solve their store visit attribution problem. To turn anonymized location data into contextualized store visits, Billups relied heavily on SafeGraph’s POI polygons. These POI building footprints define the exact location, shape, and size of a store.

When joined with GPS data, the precise geofences detect store visits more accurately than store centroids or geocoded street addresses do.

Billups chose SafeGraph Places due to its scale (10 million places globally, including almost every location for the top 8,500+ brands). This allows Billups to efficiently work with new clients on campaigns without waiting on engineering resources to put together relevant POI & geofence data. Since SafeGraph data is updated monthly, Billups no longer has to tediously maintain its internal POI dataset to account for store openings and closings.

SafeGraph Places enables Billups to recommend to its clients the ideal spot to place outdoor media based on the types of stores and shoppers that pass by the inventory location, leading to superior results for Billups’ clients. 
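To make the geofence idea concrete, here is a minimal sketch of attributing location pings to store footprints with a point-in-polygon test. It is an illustration only, not Billups' or SafeGraph's actual pipeline: the function names, the `footprints` dictionary, and the sample coordinates are all hypothetical, and the coordinates are treated as a flat plane for simplicity rather than as real latitude/longitude.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (x, y) in a flat, made-up coordinate system


def point_in_polygon(point: Point, polygon: List[Point]) -> bool:
    """Ray-casting test: cast a ray east from the point and count edge crossings."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point can be crossed.
        if (y1 > y) != (y2 > y):
            # x-coordinate where this edge meets that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def attribute_visits(pings: List[Point], footprints: Dict[str, List[Point]]) -> Dict[str, int]:
    """Count pings per footprint; pings outside every footprint are ignored."""
    counts = {place_id: 0 for place_id in footprints}
    for ping in pings:
        for place_id, polygon in footprints.items():
            if point_in_polygon(ping, polygon):
                counts[place_id] += 1
                break  # a ping is attributed to at most one place
    return counts


# Hypothetical footprints: two adjacent stores that share a wall.
footprints = {
    "store_a": [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)],
    "store_b": [(10.0, 0.0), (20.0, 0.0), (20.0, 10.0), (10.0, 10.0)],
}
# Hypothetical pings: one in store A, two in store B, one in the parking lot.
pings = [(5.0, 5.0), (15.0, 5.0), (12.0, 2.0), (5.0, -3.0)]

visit_counts = attribute_visits(pings, footprints)
print(visit_counts)
```

The parking-lot ping falls outside both polygons and is not counted, which is the distinction a centroid-plus-radius approach tends to blur. A production system would more likely use a geospatial library with a spatial index instead of this nested loop.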
Future Plans: Ad Placements Based on POI Proximity
Billups plans to bring SafeGraph data into its demand-side platform, in order to help clients select ad placements based on the available inventory’s proximity to different points of interest. It also plans to use SafeGraph data to derive insights on visitors to large & important POIs such as airports and stadiums.

In addition, Billups is now able to select specific POIs within any market and rank OOH placements for the highest exposure to audiences visiting those POIs, making the buys smarter than ever before.

#### Clear Channel Europe Uses SafeGraph Data to Fuel More Targeted, Higher Performing OOH Advertising Campaigns
The Problem: How to help advertisers plan and buy OOH ad campaigns more effectively
When advertisers come to Clear Channel Europe, they’re often looking for creative ways to broadcast their marketing messages and engage with potential customers via the out-of-home channel in the most targeted, relevant, and cost-effective ways possible.

A big part of this requires having a deep understanding of the exact location as well as the surrounding area of every OOH ad unit within the company’s inventory. What are the demographics? What other stores are nearby? How much do consumers spend in that trade area? What are the customer profiles for those that live in the area as well as passers-by? And the list of questions that advertisers ask when planning their OOH ad campaigns goes on.

To address these advertiser needs, Clear Channel Europe launched a proprietary mobile data and campaign planning platform called ‘Clear Channel RADAR’, which taps into powerful digital insights around audience groups (including demographics, location and/or point-of-interest (POI) visits, brand affinity, and online browsing behaviors) to promote more effective, flexible and data-driven campaign planning. 
This technology provides invaluable data to help brands better understand how consumer behaviors change in real-time, thereby allowing them to build more nuanced and effective OOH campaigns that drive stronger consumer engagement. When it comes to POI data, however, most available data sources are quite disparate and tend to exist at only the local level. There is rarely ever a global “source of truth” to tap into. While publicly available postal code data was sufficient in the past, it wasn’t updated regularly and often had taxonomical inconsistencies. Additionally, it did not provide detailed information about specific locations to help answer questions about proximity, driving time, etc. Even high-quality datasets were difficult to work with because the data itself wasn’t structured in the way the team needed in order to provide customers with actionable analytics and insights. The Problem-Solver: Clear Channel Europe Clear Channel Europe is one of Europe’s leading OOH media and infrastructure companies with 280,000 advertising sites across 17 European markets. On a bold mission to “create the future of media,” Clear Channel Europe is committed to striking the right balance between delivering effective and targeted advertising solutions for brands and businesses while also being a platform for good that aims to make a positive impact on the environment and within the communities in which Clear Channel operates. The Solution: SafeGraph Places “We have a lot of experience working with open source and competitor datasets as part of our ‘Clear Channel RADAR’ platform,” explained Andy Stevens, Chief Data Officer for Clear Channel Europe’s European operations. “Over time, we realized we needed a better data source solution to help our customers run more targeted, brand-safe, and high-performing out-of-home advertising campaigns.” This prompted Clear Channel to start working with SafeGraph data. 
With the launch of the SafeGraph Places global dataset, SafeGraph has become one of the only companies to offer a fresh, comprehensive, and accurate (updated monthly) POI dataset in a unified schema for almost any place worldwide. This powerful dataset captures accurate location-based data for thousands of brands and categories—from large corporations to mom-and-pop stores and industrial locations. It is the only dataset to combine proprietary machine learning techniques with real-world verification (by humans) in order to maintain high accuracy at all times. Powered by SafeGraph Places data, Clear Channel Europe enables advertisers to perform proximity analysis for planning and buying OOH ads with the most up-to-date POIs. “After testing a sample of the SafeGraph Places dataset, we found it to be far more accurate than other data sources we’ve used in the past,” continued Stevens. “Not to mention, the way the data was packaged was user friendly; its taxonomy simply made sense.”  While having access to accurate and precise POI data was the top priority for the Clear Channel Europe team when evaluating potential new data providers, what stood out most was SafeGraph’s high-touch approach to customer success. “We never expected the dataset to be turnkey with our existing systems and datasets from day one—there’s always a bit of work to get it to where you need it to be,” reiterated Stevens. “But we’ve been blown away by SafeGraph’s proactiveness in helping us work through any issues we’ve encountered.” The Result: Equipping advertisers with better campaign planning insights “Working with SafeGraph, we now have a more accurate and consistent dataset for our planning platform, Clear Channel RADAR, which empowers our campaign planners to help our customers execute more effective out-of-home advertising campaigns,” emphasized Stevens. 
Clear Channel RADAR is equipped with many different filters such as POI proximity (powered by SafeGraph) and near-real-time behavioral profiles, allowing the company’s campaign planners to pinpoint the correct OOH ad placements based on their customers’ marketing goals. Within the platform, selecting individual ad units opens up instant access to all sorts of actionable insights, helping advertisers make more informed media buying decisions. Clear Channel’s RADAR platform embeds global SafeGraph POIs to provide advertisers with real-time out-of-home advertising insights across European markets.

The Future: Weaving SafeGraph data into other parts of the business
“The success we’ve seen by adding SafeGraph data into Clear Channel RADAR has given us ideas about moving other departments and markets to a single source of information,” said Stevens.

One immediate use case would be for the teams in charge of acquiring or building new OOH ad units. It will have a direct impact on making smarter investment decisions by allowing teams to go into pitches with a deeper understanding of local market dynamics, including customer demographics, potential reach, and foot traffic.

“At the end of the day, we are moving towards an audience-based approach in everything we do as a business,” underscored Stevens. “The more we can make all aspects of the out-of-home buying experience like digital, the more we can grow our customers’ businesses, and ours as well. We just need to harmonize all of the data sources first.” SafeGraph data has helped Clear Channel Europe build an increasingly audience-based approach to out-of-home advertising by enhancing the granularity of insights driven by their proprietary RADAR platform.

#### Dosh’s Card-Linked Offers Platform Is Powered By SafeGraph Places Data
About Dosh
Dosh is a card-linked offer platform connecting brands to consumers with automatic cash back when they make purchases, ultimately driving higher spend and brand loyalty. 
Directly integrated into major card networks, Dosh enables brands to track advertising spend to every transaction, providing measurable attribution both online and offline.

The Challenge: Accurate Business Data For Major Retail Brands
To provide consumers with purchase-based rewards, the card-linked offer industry requires accurate location data for each and every brand venue. Reward platforms need store location data in order to match card network transactions with a user’s relevant purchases and award their cash back.

Store location data sourced directly from the card networks often contains out-of-date or incomplete address data that doesn’t meet Dosh’s needs. For large retailers, 10-20% of store locations would be missing from the card network data. This resulted in higher contact rates for customer service to sort out missed transactions.

Sourcing the venue data directly from the brands themselves is also challenging. Due to silos within many large retail organizations, marketers often don’t have the most up-to-date information about their own store locations. This can force platforms to wait multiple weeks to get the most recent data from the right group within the retailer, which slows down the merchant onboarding process needed to start the rewards campaign.

The Solution: SafeGraph Places Data
To overcome these industry challenges, Dosh’s data science team evaluated multiple location data providers before choosing SafeGraph.

Dosh compared SafeGraph’s Places data for a national coffee/quick-serve restaurant with over 9,000 venues to data sourced from the card networks and several other data providers. They found that SafeGraph data had 90 store locations that other providers were missing. 
Additionally, Dosh found 100 locations that other providers listed as venues of the brand but that were actually closed or were unrelated businesses of a different brand.

Dosh chose to use SafeGraph Places not only for its high precision and recall, but also due to SafeGraph’s simpler commercial term structure and flexible data licensing rights when compared to other POI data providers.

By integrating SafeGraph Places data, Dosh was able to match store locations more accurately to the correct card network IDs. This allowed Dosh to speed up the merchant onboarding process from weeks to days. The accuracy of SafeGraph’s data also reduced customer service contacts, and the data is used in the Dosh app’s map to help guide users to nearby stores eligible for Dosh cash back. Overall, by integrating SafeGraph data into its onboarding process, Dosh was able to improve its go-to-market workflow, deliver a better customer experience to its users, and realize revenue faster from its partner retailers.

#### How the Retail Banking Industry Uses SafeGraph Data to Make More Informed Site Selection Decisions
The Context: A rapidly evolving retail banking industry
Although brick-and-mortar banks (aka, retail banks) played a vital role during the COVID-19 pandemic as “essential services,” a massive behavioral shift, in which consumers leaned into digital across all industries more than ever before, became the new normal for most.

This came at a time when many retail banks had already undergone 10+ years of much-needed innovation, shifting their focus from “financial products” to “customer experiences.” This gave rise to new tech platforms and enhanced digital services that solved a number of customer pain points, including making it easier for customers to manage their finances directly from their mobile phones. Fortunately, when the world came to a halt at the start of the pandemic, retail banking customers were prepared to manage everything from the palm of their hands. 
But this digital transformation didn’t happen just to improve the customer experience alone. Over the last few years, many new digital-only players, often called neo-banks, have entered the market in growing numbers. Compared to traditional retail banks, they claim to offer a purely digital, hassle-free experience without any unnecessary fees. With these new competitors disrupting the world of financial services, retail banking had to evolve (quickly) to stay relevant. Therefore, it’s safe to say that the retail banking industry was already at its transformation tipping point well before the pandemic began. The pandemic simply was a catalyst for accelerating it. In fact, McKinsey found that 15-20% of banking customers in the US were likely to increase their use of digital or mobile banking channels even once the pandemic subsided.  This has forced the retail banking industry to double down on a number of fronts, most notably around improved digital banking services, stronger customer relationships (i.e. moving beyond transactions), and increased customer personalization (in-store and online).  If you ask anyone in the retail banking industry, they’ll tell you that it’s far from obsolete. There’s a high-touch, interpersonal element to the industry that, on one hand, can be enhanced by digital services but, on the other hand, can’t be effectively replicated or replaced by them. However, they will also tell you that retail banking is incredibly saturated because today’s customers have more choices about where to keep their money safe and secure. The Problem: How to remain competitive and accessible in a rapidly changing market In the face of growing competition, retail banks must be increasingly strategic about where they place their brick-and-mortar locations. 
Not only do these bank locations need to be easily accessible to their target customers, but they also must be competitively positioned to fill voids where other bank brands (and branches) don’t have a strong presence. The question then becomes: How can a retail bank best optimize its footprint in any given market across the US? Answering that question comes down to data. Unfortunately, simply relying on publicly available data sources, like Census Block Group (CBG) data, to suss out a bank’s real market opportunity is what we could call a “bare minimum” approach to retail site selection. To paint a complete picture that drives informed decision-making, you must know:

- What businesses are currently present in a given trade area
- What CBGs feed into that trade area
- What the demographic make-up of those CBGs is

#### Just One Way the Public Sector Uses SafeGraph Data to Understand the ‘Walkability’ of Urban Areas

The Context: Leveraging the power of data to answer questions that solve specific problems

Unlike many of the other industries we work with, the public sector has incredible breadth and depth that can’t really be encapsulated in a one-size-fits-all way. So much of the data-driven work happening daily in non-profit organizations, academic institutions, and government agencies is focused on answering highly specific questions that aim to address issues occurring within the public sphere. The ultimate goal: Informing the development of new policies that can positively impact people's lives at the national, state, and local levels.

Therefore, it should come as no surprise that the need for high-quality data within the public sector really came to a fever pitch during the first wave of the COVID-19 pandemic.
Many researchers and analysts within various public sector organizations were frantically trying to wrap their heads around how the pandemic would impact things like access to healthcare services, long-term mental health, shifts in human behavior (i.e. how daily mobility changed), compliance with stay-at-home orders, the resilience of the economy, evolving purchase behaviors, school closures, and the list goes on.  At a time when the world felt like it got turned upside down, researchers within the public sector turned to location-based data to understand not only what was happening in real-time but also to identify ways of making the best out of a challenging and unprecedented situation. In many ways, the pandemic shed new light on the importance of comprehensive places data as the key to driving immediately actionable insights.  The Problem: How to determine the accessibility and walkability of urban areas  One of the biggest questions that came up soon after the world came to a screeching halt was around the overall accessibility of urban areas. More specifically, one public sector organization we’ve worked with wanted to get a deeper understanding of which urban areas in the United States could provide people with essentially everything they needed to survive—all within a walking distance of their homes. This includes things like hospitals and clinics, pharmacies, grocery stores, banks, public transportation, and other essential services.  Doing this analysis would, therefore, make it possible for them to pinpoint the urban areas that were inherently less walkable and then use that knowledge to inform new policies or serve as a catalyst for infrastructure development aiming to put more services within easy reach.  One of the key areas of focus for this analysis centered on grocery stores. 
Not only did this public sector organization want to understand if or how many grocery stores were within a walkable distance for people living in these urban areas but also what kinds of grocery store options existed. For example, was there a big supermarket nearby to fulfill a community’s nutritional needs or only small bodegas or mini-marts with limited food options?  Of course, a big part of this research was aimed at identifying so-called “food deserts,” or areas where healthy food options are severely limited, as that makes it a lot easier to build business cases for investing in infrastructure to fill those gaps. But similarly, this research can also shine a spotlight on urban areas that are more prone to public health crises, like obesity for example, and, as a result, have greater potential to place unnecessary stress on local health services. So, while walkability may have been the primary goal of this analysis, solving the broader accessibility issue can have a cascade effect on other issues pertaining to public health. #### Media Storm's Location-Based Audiences Using SafeGraph POI Data The Problem: Accurate Location-Based Audiences How can Media Storm turn noisy geolocation data into precisely defined location-based audiences? The Problem Solver: Media Storm Media Storm is the 2nd largest independent full-service media agency in the US with deep roots in the entertainment, retail and the travel/tourism industries. Today with over 35 clients across key categories, Media Storm remains an agency centered around national and regional brands that transact at the hyper-local/community level. Media Storm currently works with leading companies such as AMC, Big Lots!, Celebrity Cruises, WGN America and WE tv across services including media planning, audience segmentation, and digital and TV media buying. Through their data science and advanced analytics group, JubaPlus, they deliver true business ROI – not just media results. 
The Challenge: Precise Store Location Data Media Storm, on behalf of its clients, was tasked with running advertising campaigns to re-engage store visitors and turn them into repeat customers. To create these audiences of store visitors, Media Storm licensed a data feed of anonymous GPS locations tied to Mobile Advertising IDs (MAIDs). However, Media Storm found it challenging to determine from geolocation data whether a MAID had visited a store without accurate data on where stores were precisely located.   Working with brands to source the store location data proved to be insufficient due to imprecise client data. For example, one big-box retailer client of Media Storm’s often reported centroids of stores being located more than 50 meters off from their true locations, leading to the creation of inaccurate audiences. Clients also had the challenge of either having limited or no data on where competitor stores were located. This prevented Media Storm from advertising to store visitors of competitor brands for conquesting campaigns. Media Storm found that even perfect store centroid data would fall short in creating accurate audiences in dense environments like strip malls or cities. In these situations, merely drawing a radius around a store centroid would include visitors to nearby stores or pedestrians outside the storefront. Without exact building footprints for the stores, Media Storm’s audiences would include irrelevant MAIDs resulting in wasted ad-spend. Enter SafeGraph's POI Data & Geofences Media Storm realized that in order to create accurate store visitor audiences from anonymous GPS data it needed an in-house dataset of where stores are exactly located. Media Storm chose SafeGraph Places, a dataset of over 10 million points of interest with business listing information, for this purpose.   
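The gap between a radius drawn around a store centroid and an exact building footprint, as described above, can be sketched in a few lines of Python. This is only an illustrative sketch, not Media Storm's or SafeGraph's actual implementation: the store footprint, the MAID labels, and the ping positions are all hypothetical, and coordinates are treated as flat x/y meters rather than latitude/longitude (a real pipeline would likely use a geometry library and projected coordinates).

```python
import math

def in_radius(point, centroid, radius_m):
    """Radius geofence: True if the point lies within radius_m of the centroid."""
    return math.dist(point, centroid) <= radius_m

def in_polygon(point, polygon):
    """Polygon geofence via ray casting: True if the point is inside the footprint."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

# Hypothetical strip-mall store: a 20 m x 40 m footprint, coordinates in meters.
store_footprint = [(0, 0), (20, 0), (20, 40), (0, 40)]
store_centroid = (10, 20)

# Hypothetical anonymized pings keyed by made-up MAID labels.
pings = {
    "maid_a": (5, 10),   # inside the store
    "maid_b": (25, 20),  # in the neighboring unit
    "maid_c": (10, -3),  # on the sidewalk outside the storefront
}

# A 25 m radius is just large enough to cover every corner of the footprint.
radius_audience = sorted(m for m, p in pings.items() if in_radius(p, store_centroid, 25))
polygon_audience = sorted(m for m, p in pings.items() if in_polygon(p, store_footprint))

print(radius_audience)   # all three MAIDs fall inside the circle
print(polygon_audience)  # only the MAID actually inside the footprint
```

In this toy setup the radius that is large enough to cover the whole store also sweeps in the neighboring unit and the sidewalk, while the footprint test keeps only the ping that falls inside the building.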
Using SafeGraph’s brand information and NAICS code (category) for a place, Media Storm was able to quickly identify store locations of its clients and those of its clients’ competitors. For each of these POIs, Media Storm used the polygon (building footprint) to geofence the anonymous GPS data feed and find which MAIDs visited a store. This helped Media Storm capture the actual visitors to the store location while filtering out nearby but irrelevant MAIDs. Media Storm used the more accurate store visitor audiences to drive efficient advertising campaigns, which in turn led to superior performance for Media Storm’s clients.

Future Plans: Store Visit Attribution

Media Storm plans to use SafeGraph Places for store visit attribution. The plan is to create A/B tests of MAIDs exposed and not exposed to their paid media campaigns. Then, from the geolocation feed and SafeGraph polygons, they plan to compare store visitation rates for the test and control groups of MAIDs to measure the effect of the advertising campaign on store visit lift.

#### MikMak Uses SafeGraph Data to Close the Gap Between the Online-to-Offline Shopper Journey

The Problem: How to help multichannel brands take control of the shopper journey

To understand what makes MikMak really tick, you need to first understand how the company’s e-commerce solution works at a basic level. So, let’s start with a quick example.

Let’s say you are browsing a CPG brand’s website and then decide to click on a specific product that you are interested in purchasing. Once you land on the product detail page (PDP), you’ll likely see a number of “Buy Now” buttons scattered about. Clicking on those buttons is where MikMak’s magic really happens. Those buttons operate more or less as a redirect to a choice of either online retailers like Walmart, Instacart, and Amazon or brick-and-mortar retailers (based on a user’s location) where shoppers can buy the goods they’re interested in.
There are a lot of benefits here for multichannel—grocery, alcohol/spirits, and CPG—brands. For starters, it provides deep insight into who shoppers are, what their path to purchase looks like, where they’re located, and what their preferences are at any step along their journey. This helps brands optimize the shopper journey by creating repeatable shopping experiences that are more likely to drive positive outcomes (purchases). It also enables them to be much more efficient with marketing spend, so they can meet shoppers wherever they are on their journey. In short, brands today want to have a greater understanding of how their online investments can drive both online and offline outcomes. While pushing shoppers to purchase through online retailers is a fairly seamless process, moving shoppers from the online to the offline world is more complicated. “Giving consumers the option between digital and brick and mortar shopping experiences will accelerate the path to purchase for brand manufacturers,” explained Christian Trapp, MikMak’s Lead Product Manager. “Having access to the most accurate, global location-based data is a critical ingredient for determining where the best retailers are, in order to accelerate the performance of a brand’s where-to-buy campaigns.”  The Problem-Solver: MikMak Simply put, MikMak enables brands to grow at the speed of commerce. As the leading global eCommerce acceleration platform for multichannel brands, MikMak provides analytics and eCommerce enablement solutions for understanding their consumers' online behavior, determining the best use of marketing dollars, and driving omnichannel sales. By making a brand’s owned-and-operated media inherently shoppable, they are well-positioned to measure the online shopper journey, understand the touchpoints consumers go through to make purchases, and begin to trace how that experience extends into the offline world.   
The Solution: SafeGraph Places

“We looked at different options for location data, and SafeGraph was the only partner that gave us the flexibility and accuracy to add new retailers at the speed we need to operate,” added Trapp. “When you combine the quality and accuracy of the data with the SafeGraph team’s clear dedication to stellar customer support, you’ve got a winning recipe for success.” “In addition to speed, SafeGraph’s dataset also provided us with enhanced data granularity, which we previously had to manage internally,” continued Trapp. This has freed up MikMak’s internal resources to focus on research and development in other areas of need. “Working with SafeGraph has been seamless, and their data has provided tremendous value to MikMak,” emphasized Trapp. “We’re excited to continue exploring other opportunities available through our partnership, as we continue to optimize the shopper journey and create better customer experiences across both the online and offline worlds.”

The Result: Driving great confidence in the online-to-offline shopping experience

“If we had to think about life pre-SafeGraph versus post-SafeGraph, it really comes down to confidence,” summarized Trapp. “Not only do we feel 99.99999% confident about where we’re sending shoppers to in the real world, but we also have up to 20% greater POI coverage than we had before.” This alone has been a real game changer for MikMak, enabling them to go to market with solutions at greater scale and with greater accuracy than ever before.

In addition to extracting even more value from the granularity of the sub-categories built into the SafeGraph Places dataset, as mentioned above, MikMak’s team is also keen to understand the different events that occur from when shoppers see a product online to when they eventually go and purchase it at a brick-and-mortar location.
“What we can see already is where it looks like a consumer wants to shop based on things like clicking on a location’s address and checking out where it sits on a map,” added Trapp. “Now, the real challenge here is building out that attribution model that crystallizes that online-to-offline conversion.”  Part of this attribution model will also explore customer journey extensions like the time of day shoppers are engaging with marketing communications online or the time of day they are thinking about making a purchase decision. “We’re really curious about understanding the thought process behind shoppers and why they make certain choices,” reiterated Trapp. “Honing in on what motivates shoppers to action at different times of day and at various points along their daily journeys—from home to the office and everything in between—can give us and our clients a deeper understanding of a shopper’s psyche. And that’s exciting!” The Future: Taking the MikMak experience across borders  Right now, MikMak’s use of SafeGraph data is only focused on the U.S. and U.K. markets, but they’ve already started to set their sights on adding the same rigor they’ve applied in these two markets to 1,500+ retailers around the world. “We are interested in unlocking this capability globally,” shared Trapp. “Shoppers may seem to be different in different geographies, but they actually share similar behavioral patterns—and by eventually tapping into the SafeGraph Places global dataset, we’ve got a viable opportunity to meet shoppers wherever they are in the world and however they continue to evolve their behaviors.”  MikMak’s journey is just beginning—whether it’s a matter of cracking the code for online-to-offline attribution or finding new ways to make owned-and-operated media inherently shoppable all around the world. 
#### Mobsta Uses SafeGraph Data to Give Ad Agencies An Effective Way to Plan Geo-Contextual Campaigns The Problem: How to help ad agencies plan successful location-based marketing campaigns When the Mobsta team visits ad agencies across the UK, they come to the table with a clear message: We know where your audiences are and we can save you a ton of time and effort finding them. That’s because the Mobsta team is composed of the UK’s leading location experts. “Everything we do is rooted in location,” explained James Sexton-Barrow, Mobsta’s Head of Planning. “We need to have a solid understanding of where places are and how people are moving from one place to another. Without the right data to fuel those insights, we wouldn’t really have any leg to stand on.” Up to this point, Mobsta had already been using multiple high-quality data sources—including mobile pings—to fuel their insights. The only thing missing was accurate, up-to-date, and granular POI data to add context to those insights. Fortunately, their soon-to-launch ‘TraffiQ’ platform—a visual ad planning tool that pulls information from the SafeGraph Places global dataset and is aimed at helping agencies more accurately pinpoint the right ad units in the right locations for reaching the right audiences—gives them a very firm leg to stand on. The Problem-Solver: Mobsta Over the past 10 years, Mobsta has earned its reputation in the UK as a go-to specialist in location-based data marketing and audience targeting. Thanks to its ability to monitor over one billion devices globally, the company has played a major role in empowering ad agencies to plan, buy, and execute highly scalable advertising campaigns across multiple channels in 52 markets, all powered by highly accurate geospatial data.  In fact, their expertise has led them to be consistently ranked as a top location provider in the UK. 
And with the disappearance of cookies just around the corner, their advanced geo-contextual (cookie-less) solutions will give them a leg up on the competition around next-gen behavioral targeting. The Solution: SafeGraph Places A lot of different data sources feed into the TraffiQ platform, many of which come from governmental bodies or highly regarded industry players. “We expect nothing less than the gold standard in data,” underscored Sexton-Barrow. “And when it came to onboarding a location-based data partner, we weren’t going to lower our standards one bit—which is precisely why we decided to partner with SafeGraph.” Mobsta uses the SafeGraph Places global dataset to identify, with utmost accuracy, virtually any POI across the UK instantly.  “Receiving SafeGraph data each month in an S3 bucket means the POI data in our platform can automatically refresh each month with minimal development required,” explained Jack Burton, Mobsta’s Head of Product. “Plus, the scale of the data, including its clear and simple schema, allows us to work with it without having to overcome any learning curves.” Not only does this equate to a lot of time saved with every monthly update but more importantly, it immediately creates new opportunities for Mobsta’s team to provide even greater value to its customers. Burton continued, “We were able to get SafeGraph's data into the platform and working efficiently within a couple of weeks—that’s pretty incredible!” But aside from providing Mobsta with the highly accurate location-based data they needed, they’ve been continually impressed with SafeGraph’s high level of customer service. “We’ve got a great relationship with the SafeGraph team on a personal level,” said Sexton-Barrow. “It’s clear that SafeGraph wants to be a true partner in our success.” The Result: Building greater trust around location-based audience targeting “During an agency visit, we need to be able to give a live demo with speed and accuracy,” continued Sexton-Barrow. 
“The good news is that we can always count on monthly updates to the SafeGraph Places dataset to keep our analytics accurate and up-to-date at all times.” Because SafeGraph data has played such a big part in bringing the TraffiQ platform to life, the precision and accuracy of the location-based insights have helped Mobsta’s sales team create better, more trusted relationships with their agency clients.  As a result, the business has grown. “We let our agency partners play around with the TraffiQ tool on their own because we are confident in the accuracy of the data,” reinforced Sexton-Barrow. “This has taken a tremendous weight off of our sales team’s shoulders because the data visualizations, powered in large part by SafeGraph data, speak for themselves.” In fact, it has helped fuel a new dynamic in which agencies now proactively reach out to Mobsta to plan, buy, and execute campaigns. Mobsta’s TraffiQ platform enables both their sales team and their agency-based clients to visualize and contextualize audiences with rich, location-based information with ease. “The success we’re already seeing with the soft launch of TraffiQ is proof that we’ve got a compelling and competitive offer for the agencies we work with,” followed up Sexton-Barrow. “Now, all we’ve got to do is expand this service offering across all of Europe.” And fortunately, with the SafeGraph Places global dataset by their side, the Mobsta team will be able to create a universal language across European countries in a turnkey way. #### Olvin Uses SafeGraph Data to Build a More Accurate and Reliable Store Visit Attribution Model The Problem: How to Provide More Granular, Store-level Foot Traffic Insights The right data makes all the difference. 
Unfortunately, the data sources that Olvin first started working with, when building out the company’s Almanac platform, were too “noisy” and inaccurate to be able to answer a key customer question: How many people are actually stepping foot into my brick and mortar store? “With our previous data provider, the polygons didn’t always align to where stores really were,” explained Matthew Taaffe, Olvin’s VP of Product. “It took us sometimes a month or two to import and clean the data before we could even use it.” It became clear that better (and cleaner) data was needed to build a stronger, more accurate attribution model that could deliver the granular, store-level insights their customers wanted.

The Problem-Solver: Olvin

Olvin is a retail analytics platform, fueled by its flagship Almanac product, with an objective to “level the playing field against e-commerce and create tools that enable brick and mortar retailers and product owners to get ahead of the curve with predictive analytics.” The company is focused on tapping into the power of artificial intelligence (AI) to predict consumer behavior and demand, so that brick and mortar businesses can make decisions with confidence around labor optimization, assortment planning, and site selection. But at the end of the day, the company’s focus is all about helping marketers, planners, and merchandisers delight their customers once again.

The Challenge: Going From Aggregated Local Area Foot Traffic Insights to Store-level Insights

When Olvin was searching for new data providers, they did a simple “visits validation” test:

1. Find a store location on Google Maps
2. Verify whether the data actually shows the store at the same location

With the company’s original data provider, there was all too often a disconnect. Sometimes the data placed stores in incorrect locations, like stores being shown in the middle of a street, or, in the worst of cases, showed them overlapping each other.
This made it virtually impossible to provide customers with store-level foot traffic insights.

“Because of the inaccuracy of the data we were working with, we couldn’t provide insights with precision,” underscored Taaffe. “We had no choice but to aggregate data to the local area, which made it impossible for us to provide foot traffic insights with any level of granularity.” But granularity is exactly what Olvin’s customers wanted. Making informed decisions around retail site selection or day-to-day store operations requires knowing not only how many customers (may) step foot into a store but also where they’re coming from when they do.

The Solution: SafeGraph Places and Geometry Data

The three primary criteria that Olvin cared about most when choosing a new data provider were:

- Accurate polygons
- Easy to buy and download
- Reasonable cost

SafeGraph ticked all of those boxes, and more. For starters, SafeGraph’s Geometry dataset passed the “visits validation” test and made it possible for Olvin to build an attribution model that could accurately assess foot traffic at the store level. “Not to mention, SafeGraph data adheres to industry standards, like NAICS codes,” confirmed Taaffe. “This makes it a lot easier for us to work with and join to other data sources without having to do a big cleanup effort.”

The Result: Getting a Competitive Edge with Future Customers

Whether it’s for trade area analysis, retail site selection, store visit attribution, store performance, or even demographic insights, getting customers to believe in Olvin’s offering wouldn’t be possible if the company couldn’t provide store-level insights with precision and granularity. “Plus, it allowed us to simplify the user experience and make it a whole lot easier for our customers to take action on the insights we provide,” reiterated Walters. “The ability to offer our customers store-level foot traffic insights has helped us unlock new conversations,” chimed in Taaffe.
“While our competitive edge has always been based on our predictive modeling and forecasting methodologies, fueled by several data sources, it’s now so much easier for our customers to perceive the value of what we offer them.” In other words, all it took was for Olvin to work with the right data sources to now be able to provide “predictive foot traffic insights” versus simply a summary of foot traffic. #### Plaid Uses SafeGraph to Bring Precision to Transaction Enrichment The Problem: Transaction Data Clarity and Improving Operational Efficiency In the fintech world, Plaid plays a pivotal role by providing essential data connectivity that powers numerous financial applications and services. Their clientele, ranging from small start-ups to large financial institutions, depends on Plaid for precise and reliable financial data to make informed decisions, manage personal finances, and offer tailored financial services. However, Plaid faced a significant challenge: the data obtained from financial institutions often contained noisy, unstructured, and ambiguous transaction information, making it difficult for end-users to recognize and understand their financial activities. The core issue revolved around merchant recognition, especially for smaller local merchants, and the accurate pinpointing of transaction locations. The existing data was not only hard to interpret but also lacked the granularity needed for advanced financial analysis, leading to a suboptimal user experience and diminished value in financial insights derived from the data. The Problem-Solver: Plaid's Commitment to Data Connectivity Plaid's mission is to help unlock financial freedom for everyone through technology, offering a platform that connects various financial accounts and provides a unified, detailed view of financial data. They empower developers, innovators, and entrepreneurs to create easy-to-use applications that enable consumers and businesses to lead healthier financial lives. 
Their technology is crucial for enhancing the functionality of financial apps, simplifying payments, and personalizing financial advice, thereby fostering a more inclusive financial ecosystem. Despite their advanced technology, the challenge of cleaning messy transaction data remained. To maintain their industry-leading position and continue providing high-value services, Plaid recognized the need for an external provider that could enhance their Enrich and Transactions API products with accurate merchant location information. Leah Karlins, Product Lead at Plaid states, "Our goal is to transform the complex, often cryptic transaction data into transparent, easy-to-understand information that empowers financial decision-making." The Solution: Leveraging SafeGraph for Enhanced Data Accuracy Plaid embarked on a quest to find a merchant data provider that could meet their high standards for accuracy, comprehensiveness, and reliability.  SafeGraph's Places dataset stood out for its data veracity, rich metadata, and advanced merchant classification attributes. Leah remarked on the decision, "We chose SafeGraph for their unparalleled accuracy in location data and merchant identification, which has enabled us to significantly upgrade our data enrichment process." The Result: Empowering Users with Unprecedented Financial Insights The collaboration between Plaid and SafeGraph has led to improvements in how financial data is presented and utilized. Users of Plaid-powered applications now experience a new level of clarity in their financial transactions, with detailed information on merchant identity and transaction locations. This enhancement has not only boosted user confidence in financial apps but also enabled more sophisticated financial analysis, personalized budgeting tools, and improved fraud detection mechanisms. 
Leah highlighted the transformative impact of this integration, stating, "Our collaboration with SafeGraph has enabled us to elevate our matching capabilities. We’re now able to match approximately half of all card-present transactions to a precise merchant location, which significantly enriches the data we provide." In addition to accuracy improvements, Leah adds, “SafeGraph's data has reduced our need for extensive QA processes, saving considerable time and resources. This efficiency allows our team to concentrate on strategic business objectives, enhancing our product development and innovation." The Future: Expanding Horizons with Data-Driven Financial Solutions Buoyed by the success of their integration with SafeGraph, Plaid is now poised to explore new frontiers in financial technology. They envision leveraging this enhanced data capability to introduce more innovative services, expand into new markets, and continue breaking down barriers in the financial industry. The ongoing partnership with SafeGraph signifies Plaid's unwavering commitment to delivering exceptional value through data, ensuring that they remain at the forefront of the fintech revolution, driving innovation, and empowering users worldwide. Leah expresses optimism about the future, stating, "This is just the beginning. The enhanced data capabilities open up new possibilities for us to innovate and deliver even more value to the thousands of financial innovators we serve." #### RainBarrel Uses SafeGraph Data to Build Digital Advertising Audiences with Geospatial Context The Problem: How to build digital advertising audiences based on contextual intelligence It’s one thing to know where in the world (geographically-speaking) a mobile ping comes from. It’s another thing to be able to uncover the context or, rather, the real world environment surrounding that ping. Is it coming from a restaurant? A gym? A corporate office park? 
And then taking it one step further, where are those anonymized mobile pings traveling to and from?  Unfortunately, most location-based data sources available today—including open source data—simply don’t cut it. While they can provide the exact latitude and longitude, time stamps, device types, and device IDs associated with almost any mobile ping, they can’t tell the full story about where those pings really came from. That doesn’t really help advertisers at all. So, the RainBarrel team tasked itself with developing a cutting-edge way to build digital advertising audiences at scale. They already had their own source for mobile pings but lacked the necessary context around those mobile pings to unlock new value for their customers. They just needed access to high-quality and accurate geospatial data to fill in those contextual gaps. Making matters worse, as a Canada-based startup squarely focused on addressing the targeting needs of advertisers servicing Canadian consumers, the pickings were slim. “When it comes to data, Canada is an underserved market,” explained David Choi, RainBarrel’s Product Manager. “Most data providers are based in the U.S. and tend to treat Canada like an afterthought.”    While getting access to the right data was one hurdle to overcome, it was equally important for that data to be sourced in an ethical way. “For us, it’s an absolute priority to not only be privacy-compliant when building digital advertising audiences but also to be clear about how we put those audiences together,” underscored Travis Riedlhuber, RainBarrel’s Managing Director. “Transparency is the key to building high-quality digital advertising audiences at scale.” The Problem-Solver: RainBarrel RainBarrel is a proprietary Audience Graph based on commercially available geospatial data, allowing advertisers to target their messages to the right audience. 
As opposed to most other audience targeting methods available in the market today—typically defined by online user behaviors—RainBarrel takes precision targeting to an entirely new level by leveraging the power of location-based data to build digital advertising audiences rooted in offline user behaviors.

Say you run a luxury brand and want to target consumers interested in purchasing high-end goods. In the past, you might have built a digital advertising audience around users who regularly visited the websites of brands like Chanel, Louis Vuitton, and Gucci. Those online behaviors would lead advertisers to believe that those consumers make up the “luxury buyer” segment. Unfortunately, these online behaviors are often aspirational at best; just because consumers visit a luxury brand’s website doesn’t necessarily mean they can or will make a high price point purchase. However, a consumer who sets foot in a Nordstrom store at least twice per month is more likely to be an active shopper, based on offline behaviors alone, and is therefore more valuable to a brand from a targeting standpoint. This is precisely how RainBarrel uses geospatial data to close the gap between aspiration and intent.

In other words, they’ve unlocked the untapped potential at the convergence of digital audience segmentation and real-world actions to create a more effective way for advertisers to target, reach, engage, and convert audiences. And they’ve built this around three guiding principles:

- Privacy: All data used to build digital advertising audiences must meet the most stringent privacy standards and be in compliance with current GDPR and CCPA regulations. No personally-identifiable information is ever captured or shared.
- Transparency: All information pertaining to the company’s 2,600+ audiences is made publicly available, including details around data recency, audience size, number of POIs, geographic distribution, methodology, documentation, and more.
- Reach: All audiences are built, first and foremost, to address the evolving needs of RainBarrel’s customers in real-time. This also means making the audiences easy to activate and compatible with today’s leading DSPs and digital advertising platforms.

#### Socially Determined Uses SafeGraph Data to Advance the State of Public Health in a Meaningful Way

The Problem: How to ensure people get the best care and stay as healthy as possible

The traditional healthcare model is fee-for-service. This basically means that people pay—via insurance or out-of-pocket—whenever they visit their general practitioner (GP), go to the emergency room, purchase prescription medications, get lab tests done, etc. This model tends to disproportionately favor customers with insurance, disposable income, and access to various healthcare resources. Conversely, because many of these services often incur exorbitant costs, it puts disadvantaged or uninsured populations at risk of not getting the healthcare services they need in order to stay healthy for the long term.

As a result of these inequities, the industry is beginning a slow migration toward a risk-based reimbursement model wherein resources get funneled into local health providers who can provide the right amount of coverage and better programs to support their patients’ healthcare needs. Because these healthcare providers on the ground are closer to their patients—and their respective health issues—they are in a better position to assume the risk by allocating the resources received in the best way possible to keep their patients healthy.
This begs the question: How can payers, providers, life sciences companies, and other players within the healthcare industry know where to allocate funds and resources to support public health in a more effective, locally-relevant way?  As a starting point, they need to have a clear line of sight into the social determinants of health (SDOH) that keep underserved or disenfranchised populations from staying healthy. This is critical for being able to pinpoint the right allocation of resources at the hyper-local level. Of course, a big part of this requires having access to accurate, real-time location data.  And here’s precisely where Socially Determined steps into the picture.  The Problem-Solver: Socially Determined Socially Determined has a bold vision to be the primary source for delivering meaningful social risk insights that create impact at scale and make equitable healthcare a reality. But what does that mean exactly? In short, the company harnesses the full power of data to help payers, providers, life sciences companies, and others working in the public health sector to not only be able to identify the SDOH playing a role in a person’s healthcare journey but also to pinpoint the best way to funnel resources for addressing public health needs. Their groundbreaking work has the potential to transform the entire healthcare ecosystem for the better, helping to make it more inclusive, affordable, and focused on driving better outcomes. #### Spade Uses SafeGraph Data to Bring Greater Transparency and Trust to the Financial Services Industry The Problem: How to clean and enrich messy transaction data in real-time The data infrastructure underlying the card industry hasn’t changed in decades—and, believe it or not, this isn’t doing banks (aka, card issuers) or consumers any good. This is especially true today when fraud is at an all-time high.  The team at Spade knew that this could no longer be the status quo. 
So, they set out on a mission to bring long overdue disruption to the card industry by giving card issuers better, more granular, and more insight-driven data to work with. After all, relying on incomplete data to monitor and manage transaction health simply couldn’t cut it anymore.

From Spade’s perspective, the key to bringing greater legitimacy, accuracy, and transparency to card transactions—while simultaneously catching instances of fraud early on—is to provide more context and data on transactions, including the exact location where those transactions take place. This additional layer of data can be used to optimize fraud detection models, refine spend control measures, and even streamline consumer rewards programs. In other words, a real game changer for the industry.

Unfortunately, most banks don’t receive granular location information, like a merchant’s address, at the moment of transaction. Without this kind of context, it’s virtually impossible to know exactly where consumers spend their money. Plus, it makes it infinitely more difficult to personalize the consumer experience, incentivize rewards redemption, and nip fraud in the bud.

While there are clearly a number of benefits that come with attaching a verifiable location to card transactions, the truth is, detecting and stopping fraud is what really keeps card issuers up at night. However, given just how outdated the global card data infrastructure is, establishing a location-based layer for all transaction data as a new industry standard has been no easy task. That is, until now, thanks to the work of Spade’s team.

The Problem-Solver: Spade

Spade is a fast-growing startup that’s on a mission to power the future of finance with superior transaction data. But what does that mean? In short, Spade’s transaction enrichment API, backed by real data, helps bring instant clarity and context to card transactions.
The company is quickly revolutionizing what happens at the “transaction moment” by accurately matching merchant, category, and geolocation details—acquired directly from verifiable sources—to all card transactions that pass through their API.

To turn this into a reality, the team at Spade had a big choice to make. Were they going to continue sourcing, scraping, cleaning, and cataloging POI data on their own? Or were they going to look for a data partner to help them fill this gap?

“We had our own data but it was just not as broad as we needed,” explained Oban MacTavish, CEO at Spade. “Scraping POI data is a complicated, expensive, and time-consuming process, and we needed a data partner who could address our POI data needs at scale.”

The Solution: SafeGraph Places

“After vetting a number of data providers, we ultimately chose to work with SafeGraph, first and foremost, because of the sheer breadth and depth of their POI data,” continued MacTavish. “We wanted to work with an expert in location-based data who would serve as a trusted data source for us and hold a high bar for what a location actually is.”

Before working with SafeGraph, Spade was scraping between 500k and 1M locations on their own; after joining forces with SafeGraph and leveraging the Places dataset, they gained access to millions of merchants across the U.S. This enabled them to scale their offering instantly.

Although data coverage helped make this an easy decision for Spade, it was SafeGraph’s dedication to consistently high data quality that really stood out. While surveying other vendors, the Spade team realized that not all vendors maintained the same quality standards. Some even dilute their own POI data with inaccurate merchants. This is a big problem because widespread mistakes in a dataset, like closed or inaccurate locations, can quickly become costly.

But that’s just the technical part of working with SafeGraph.
MacTavish continued, “We appreciated SafeGraph’s flexibility to build an agreement that made sense for us and their responsiveness whenever we needed any kind of support.” From delivering the data as flat files to reacting quickly to feedback, Spade has continued to be impressed with SafeGraph’s ability to come up with win-win solutions for the partnership. “We spend a lot of time with the data—and whenever we spot issues, SafeGraph ensures they get fixed quickly.”  Finally, as the cherry on top, one other detail set SafeGraph apart from the competition. “We’re a highly specialized startup with a very specific use case that not every vendor can support—or is even necessarily willing to make the investment in supporting,” explained MacTavish. “Even though this was a new use case for SafeGraph, too, it was clear from the start that they wanted to innovate with us and help our business succeed.” The Result: Enriching card transactions with location-based data “A big part of what we do, as a merchant data company, is connect transactions to a variety of data sources, with POI data used specifically to verify that businesses are real,” summarized MacTavish. “Although matching locations to transactions is just one piece of the Spade pie, it’s a key piece of our product—our services run on POI data.” While this industry-disrupting startup is still very much in its early stages, Spade has successfully set the foundation to revolutionize an ancient card data infrastructure that relies too heavily on inaccurate and incomplete data. By showing the industry what’s possible when more dimensions are added to transaction data, they will enable credit card issuers to generate more swipes, reduce slippage, offer more (and better) rewards, and of course, stop fraud.  MacTavish concludes, “Our business needs to exist; the credit card industry is overdue for a serious overhaul. 
However, without SafeGraph data, we couldn’t do what we do today at the quality and scale our customers expect from us.”

#### Sysco Uncovers Complex Market and Brand Insights in a Changing Economy Using SafeGraph Places Data

About Sysco

Sysco is the global leader in selling, marketing, and distributing food and non-food products to restaurants, healthcare and educational facilities, lodging establishments, and other customers around the world.

The Challenge: Staying on Top of Trends in a Changing Market

With a global reach across multiple industries, Sysco conducts detailed market analysis to strategically evaluate growth avenues. Even before COVID-19, this was no small task. The data on operators in the foodservice industry is fragmented and not frequently updated, which posed a challenge for this type of analysis.

Sourcing this data was particularly difficult for smaller brands, and keeping up with the constantly changing restaurant industry proved to be a tall order. Once the team had a solid supply of location data, they had to turn around and source it all again to maintain its freshness.

When the global pandemic drastically altered the economy, especially the industries Sysco is focused on, having reliable data for market analysis became even more crucial to the business’ success. Restaurateurs across the country faced disparate local regulations on restrictions, causing many to temporarily or permanently shutter.

Sysco’s market, customer, and competitive intelligence team was seeking more accurate, up-to-date data that could help them navigate these changes quickly and effectively.

The Solution: SafeGraph Places Data

Sysco began using SafeGraph data in its market analysis for lead generation as part of its recent digital transformation before COVID-19 hit. SafeGraph’s POI data empowered Sysco to better understand customers’ and operators’ same-store activity through deeper customer and market intelligence.
The Sysco team developed a multi-pronged approach to its market analysis that uses SafeGraph data to locate businesses and identify relationships, like brand affiliation. This approach enabled Sysco to quickly respond to the changing COVID-19 economy.  With SafeGraph Places data, Sysco has reliable and up-to-date information about their customers and operators, helping them identify on a weekly basis where to focus their efforts. The brand hierarchy information included in SafeGraph Places data has been particularly useful to Sysco, as they can aggregate market data at the brand-level to identify which brands are experiencing spikes or declines in demand. SafeGraph Places data has proven invaluable to Sysco in navigating the many operating changes implemented by restaurants in response to COVID-19. Sysco also uses SafeGraph data to analyze how lockdowns will affect their customers and operators in different geographies and industries. SafeGraph data empowers Sysco to uncover complex market insights in a changing economy and evolve its business strategy with reliable analytics. With the detailed information provided by SafeGraph Places data, Sysco’s market, customer, and competitive intelligence team is able to deliver detailed reports to the leadership team and identify areas of opportunity. #### Talon Uses SafeGraph Data to Add Greater Breadth and Depth to OOH Audience Targeting The Problem: How to add more precision to OOH audience targeting capabilities Talon’s technology was developed with the goal of giving brands and agencies a better, easier, and more effective way to plan, execute, and deploy out-of-home advertising campaigns.  Over time, and especially as the OOH industry has become increasingly data-driven, Talon’s customers have become interested in gaining spatial intelligence about the surrounding areas of the OOH units available for booking—in addition to understanding more about audience behaviors as well as how best to reach target audiences.  
“Proximity has always been a key consideration in how our customers plan and buy OOH ad campaigns today,” shared Sophie Lewis, Talon’s Product Owner. “They are keen to buy ads that are strategically placed to either drive immediate foot traffic or go head-to-head with the competition.” This is impossible without geofencing the ad units.  At the same time, there’s also been an uptick in clients wanting to understand how consumers move around in or between different locations. But in the absence of access to precise, granular, and up-to-date POI data, it has been a challenge for Talon to easily optimize OOH ad campaigns with these insights in a consistent way. Therefore, the opportunity was ripe for the taking to start working with high-quality data that could help Talon address questions around proximity and foot traffic more effectively.   The Problem-Solver: Talon As a leading Global OOH media agency, Talon believes that behind every great outdoor campaign, there’s a lot more than just planning. This led them to build a multi-pronged strategy for taking a brand’s OOH ad campaigns to the next level, based on smart, data-driven planning, powerful in-house technology, and unforgettable creativity. Whether working directly with brands or in partnership with ad agencies, Talon’s team takes pride in its full-service approach—from planning and buying to creative development to production and execution—to ensure that the OOH ad campaigns they facilitate are as effective and successful as they can possibly be. The Solution: SafeGraph Places and Geometry Talon underwent a thorough vetting process to ensure SafeGraph data could help them achieve their goals around proximity as well as understanding how consumers move around locations.  Talon needed to first validate that SafeGraph data covered a wide array of brands and categories—including the brands they work with—and was also distributed evenly across the US (beyond top DMAs alone). 
But they also wanted access to something that most POI data vendors don’t offer: up-to-date business operating hours and location polygons for each POI.

On top of that, they had a few operational challenges to solve as well:

- Data Consistency: When working with various data sources in the past, they found each dataset had its own standards for “cleanliness,” making it difficult to work with until it was fully scrubbed. “Our team had to do a lot of data transformation in order to make the data we were using actually usable,” explained Lewis. “We weren’t able to answer our customers’ questions immediately until the data was cleaned.”
- Segment Creation: “We’ve got fairly sophisticated audience segmentation models in our ‘Ada’ (audience targeting and measurement) platform, but we could never depend on third-party data sources to provide the same level of detail,” confirmed Lewis. The big issue here for Talon was both a lack of visibility and a lack of control over how these third-party audience segments were built. This made it infinitely harder to work with these datasets, including being able to integrate them into the Ada platform seamlessly.

Fortunately, SafeGraph was able to tick all of the boxes. Not only could SafeGraph provide highly granular and up-to-date POI data covering multiple brands and categories across the entire US market, it also solved a key issue for Talon by including geometry attributes, in the form of accurate polygons, for each POI.

“For us, the quality and scale of the data were paramount,” reiterated Lewis.
“Not only was SafeGraph able to deliver the quality and scale we needed to achieve our goals—much less with a monthly update cadence—but the structure of the data is giving us new opportunities to refine our audience targeting capabilities even further.”  To date, Talon has used the ‘top categories’ and ‘sub-categories’ attributes available in the SafeGraph Places dataset to provide their customers with granular audience insights. Looking to the future, they are planning to leverage tags to give users more extensive filtering options in Ada to help find precise audiences with greater granularity. Finally, what made SafeGraph really stand out to Talon’s team was how easy it has been working together. “We always know there’s someone at SafeGraph who can answer our questions, help us work through issues, and be receptive to our feedback,” emphasized Lewis. “This really feels like a partnership; SafeGraph offers a level of service you don’t get with other data vendors.” The Result: Driving efficiencies that fuel smarter, more targeted OOH ad campaigns Since working with SafeGraph, Talon’s team has been able to respond to constantly growing demand faster than ever before. “We’ve been able to significantly improve our response time,” explained Lewis. “Now when we get a brief from a customer, we can get back to them even faster with a well-thought media recommendation.”  What this basically means is that this has enabled Talon to be on par with other channels (especially digital). “Although out-of-home advertising has historically been used to drive reach and brand awareness, we’re now able to drive lower funnel metrics—like consideration, purchase, and intent—with niche behavioral audience targeting,” summed up Lewis.  Integrating SafeGraph data into their Ada platform has, therefore, made it possible for the brands and agencies Talon works with to run more effective OOH ad campaigns, especially in terms of boosting foot traffic and driving increased brand recall. 
And because Talon can work with the data in a way that works best with all of their platforms, they’ve also been able to tailor their response to briefs to align the right audience for achieving a customer’s specific goals. The Future: Weaving SafeGraph data into different parts of the business While Talon is currently using SafeGraph data for the US market only, there’s already talk about expanding the partnership into other markets globally. “The natural next step for us is to replicate what we’re doing in the US into the other markets where we have a presence, including Singapore and the United Arab Emirates,” continued Lewis.  Additionally, Talon recently launched a DSP called ‘Atlas’ that allows them to buy OOH ad units programmatically for their customers. Being that audience targeting is a big part of the out-of-home buying experience, Talon is thinking about how to weave SafeGraph data into the DSP, along with other data sources like weather, sporting events, and more to offer brands and agencies a more holistic and comprehensive media buying experience.  “More broadly speaking, we’re asking ourselves a lot of questions around consumer movement partners and what that might look like in the future,” concluded Lewis. “There’s the potential to combine foot traffic data with POI data in order to gain a deeper understanding of how events or seasonality affect human behaviors that ultimately impact ongoing media effectiveness.” #### Understanding the Economy with INRIX and SafeGraph Data The Problem In today's fast-paced financial landscape, the significance of alternative data for investors cannot be overstated. Traditional financial metrics and indicators only provide part of the picture, often lagging real-time market movements. Alternative data, on the other hand, provides investors with the ability to uncover hidden trends, assess market sentiment, and gain a competitive edge. 
In the world of data-driven decision-making, investors are presented with a plethora of options, each holding its own unique potential. Below we explore two different ways alt data is helping Wall Street make better investment decisions:

- Consumer trends/predictive insights: This involves providing Trip counts to significant points of interest (POI) like theme parks, casinos, hotels, restaurants, and retail shops. By doing so, investors can gain insights into a business's financial health over specific time periods.
- Commercial Vehicle Activity: Examining the movements of commercial fleets around manufacturing, distribution, logistics, and shipping facilities. This data helps correlate supply chain dynamics with changing consumer behaviors.

The primary goal is straightforward: identify datasets that correlate with a company’s revenue or other operational metrics (e.g., inventory levels, attendance, sales, visits, specific revenue streams, etc.). Can these datasets provide signals indicating a company's strengths or weaknesses in current market conditions? Are companies responding effectively to evolving consumer demands and supply chain challenges?

Many investors are now searching for a comprehensive solution that eliminates the need to procure and connect disparate datasets individually. They want partnerships among data providers to simplify access to vital information, streamlining their decision-making processes.

The Problem-Solver: INRIX

Since 2004, INRIX has been a prominent provider of data and insights into global mobility patterns. Initially focused on the automotive and transportation industries, INRIX gained recognition for innovative solutions using real-time parking and traffic data, particularly in facilitating the safe testing of autonomous vehicles. Over time, INRIX expanded into other sectors, notably assisting state and city governments in enhancing road efficiency and safety.
INRIX recently ventured into the financial services industry by offering investors and analysts access to its proprietary Trips dataset. To produce and deliver data-driven insights to the investment community at scale, INRIX needed to enhance its vehicle trips data with precise geospatial information on POIs. This effort aimed to shed light on the details of trips between various locations and their relevance to consumer visitation, retail trends, distribution, manufacturing activities, and more.

According to Phil DeFrancesco, INRIX's Head of Product, Financial Services, "The investment community relies on us to provide straightforward answers to complex questions. We have put in the effort behind the scenes to simplify their ability to act on our insights. We just needed the right data partner to make this happen."

The Solution: Partnership between INRIX Trips Plus and SafeGraph Places

Dealing with diverse datasets and the manual work required to manage them can be challenging for investment managers due to data heterogeneity and vendor fragmentation. With Trips Plus, which includes SafeGraph Places, these challenges are eliminated. Investors can now analyze consumer trends, supply chain efficiency, and manufacturing output across public and private companies seamlessly. This analysis assists in identifying investment opportunities and generating alpha for portfolios.

Trips Plus combines INRIX's high-quality passenger and commercial vehicle Trip data with SafeGraph's attribute-rich POI data. This unique combination of data sources empowers investment professionals to optimize portfolio allocations, manage risk effectively, and uncover market insights well before they appear in earnings reports or become widespread trends.

After evaluating various data sources, INRIX selected SafeGraph as the ideal partner to enhance its Trips dataset with detailed geospatial information.
SafeGraph's comprehensive POI coverage and robust parking lot dataset were key factors in this decision. Recognizing the potential for growth, SafeGraph expanded its Places dataset to include additional POIs related to manufacturing, distribution, logistics, and shipping facilities to meet INRIX's specific requirements.

Ross Epstein, SVP of Product at SafeGraph, expressed excitement about the partnership, stating, "By combining insightful connected vehicle data with detailed and up-to-date points of interest, investors can now spend more time reviewing trade performance and boosting their portfolio."

The Result: Unlocking New Insights with Trips Plus

Historically, investors analyzed passenger and commercial vehicle trip data separately. However, Trips Plus with SafeGraph’s Places provides a significant opportunity to assess whether a brand's manufacturing, logistics, distribution, or shipping activities align with consumer behaviors or diverge from them. For example, the data can detect discrepancies between increased manufacturing and shipping activities and decreased store visits before inventory backlogs occur.

Such information – only available in INRIX Trips Plus – enables the investment community to answer critical questions more easily, leading to informed investment strategies and timely business decisions.

### Guides

#### 12 Methods for Visualizing Geospatial Data on a Map

Mapmaking, or cartography, is the visualization of geospatial data. It’s an art in that it seeks to represent data in a form that can be more easily understood and interpreted by non-technical audiences. But it’s also a science in making sure the visuals accurately conform to the data that they’re based on.

Whether you get your data from SafeGraph or your own research, it’s much more impactful when it’s mapped out to more directly describe the locations it’s referring to.

But maps, like many things, are not all the same.
Based on their advantages and limitations, some styles of maps are better at representing certain types of information than others. So to help you choose the right map for the data you want to illustrate, we’ve compiled a list of 12 common methods for visualizing geospatial data.

12 methods for visualizing geospatial data in better ways

How you represent the geospatial data you acquire can affect what conclusions you draw from it. So it’s important to choose a mapping style that allows you (or your clients) to make sense of the information in ways that best suit your needs.

To demonstrate, here are 12 examples of mapping strategies, with explanations regarding their strengths, weaknesses, and best use cases.

1. Point map

A point map is one of the simplest ways to visualize geospatial data. Basically, you place a point at any location on the map that corresponds to the variable you’re trying to measure (such as a building, e.g. a hospital).

It’s useful for showing distribution and density patterns of things, but it requires you to collect or geocode location data accurately so you can identify each location precisely on the map. The point technique can be difficult to use on maps that cover a large area, as points may overlap each other at certain zoom levels.

2. Proportional symbol map

This is a variation of the point map. It uses a circle or other shape to represent data at a particular location. However, based on the point's size and/or color, it can be used to represent multiple other variables at once (such as population and/or average age).

This makes proportional symbol maps good at conveying several types of information at the same time. They can still suffer from the same issue as point maps, though: trying to cram too many data points onto a map that covers a large area, especially when many points fall within small geographic areas, can lead to overlapping.

3. Cluster map

(Image source: Esri ArcGIS)

This is a proportional symbol map with a twist.
It features a similar concept of using points of varying sizes and colors to represent multiple types of data at a location at once. However, these larger points serve as stand-ins for smaller points, which become visible if you increase the map’s scale. This gets around the main issue of overcrowding in point maps, but requires special geospatial data visualization tools such as GIS software.

4. Choropleth map

A choropleth map is another common type of map. It’s made by separating the area being mapped, such as by geographic or political boundaries, and then filling each resulting section with a different color or shade. Each color or shade represents a different variable, and/or a different value or range for a single variable. This makes choropleth maps useful for visualizing clusters of data across a geographic area while maintaining the context of regional boundaries.

Just be careful using this style with areas where regions differ markedly in size, as the size of a region may not necessarily have any relationship to the data attributed to it. For example, on a map of the United States, states with larger land masses – like California and Texas – tend to draw attention. However, on a choropleth-style map, they may not have a high concentration of a measured variable or have traits that are important to a particular form of analysis, compared to smaller states like Maryland, Delaware, or Rhode Island. If you are trying to point out something in a smaller area that could be dwarfed by larger areas, you may need to include an inset map to make sure the color is called out.

5. Cartogram map

(Image source: Esri ArcGIS)

This variation of the choropleth map is a hybrid of a map and a chart. It involves taking a land area map of a geographic region and dividing it into segments in such a way that sizes and/or distances are proportional to the values of the variable being measured.
Then each segment is given a different color or shade to relate it to its corresponding value. In this way, the data more directly correlates with the land area it’s referring to.

However, trying to line up size and distance proportions with a region’s actual land area often results in distortions that can make it difficult to recognize what location a cartogram map actually represents. For this reason, it can be helpful to include a land area map of the location alongside the cartogram map, for reference’s sake.

###### 6. Hexagonal binning map

(Image source: Mapbox)

Hexagonal bin maps are another choropleth map variant. They divide a geographic area into a grid made of regular hexagons and derivative figures. This easily creates a continuous shape while still covering land area with accuracy. Then each cell in the grid is given a color or shade to represent a value of a variable, much like in a regular choropleth map.

This type of geospatial data visualization provides a good balance of precisely mapping a set of granular data points without losing accuracy through converting discrete data into continuous data. However, it can be difficult to scale up or down without combining or separating cells.

###### 7. Heat map

A heat map is somewhat like a choropleth map in that it uses colors or shades to represent different values or value ranges. However, it presents these values and ranges as a continuous spectrum, rather than as discrete cells constrained by geographical or political boundaries.

In this way, a heat map is useful for more precisely visualizing patterns of high (“hot spots”) and low concentrations of a variable. This can come at the cost of accuracy, however, as it often requires converting discrete data points into a continuous spectrum via algorithms.

###### 8. Topographic map

(Image source: US Geological Survey)

A topographic map is another fairly standard form of geospatial data map.
Often, topographic maps are used to represent physical land features that are spread out over an area. These include terrain elevation (especially mountains, volcanoes, and other high landmarks) and river systems. They can also include man-made things such as roads, railways, or other transportation networks.

###### 9. Flow map

Flow maps, also known as ‘path’ maps, are more specialized versions of line maps. Instead of focusing on physical features of the earth, they are used to represent the movement of things across the earth over time. These can include migrating humans or animals, resources and other goods for trade, vehicle traffic, and weather patterns (especially severe storms such as hurricanes). They are usually constructed as sets or pairs of origin and destination data points.

###### 10. Spider map

(Image source: Esri ArcGIS)

The spider map is a variation of the flow map. Instead of focusing on discrete pairs of origin and destination data points, the spider map looks at the relationships between origin points and multiple destination points – some of which may be held in common.

An example of a spider map may be a route map for buses, streetcars, subways, trains, or other modes of transportation that have series of predetermined stops between multiple vehicles. You could also use a spider map to display how frequently ride-sharing vehicles, like bikes or scooters, are picked up from specific parking stations and dropped off at others.

###### 11. Time-space distribution map

(Image source: Towards Data Science)

This is an advanced form of geospatial data mapping that combines the precision of a point map with the dynamism of a flow map. It seeks to accurately determine the locations of objects at any point in time as they move. Naturally, this is only possible through GIS software and other forms of non-static mapping. The most common usage of this type of map is in monitoring the locations of vehicles or mobile devices through global positioning systems.

###### 12. Data space distribution map

(Image source: Towards Data Science)

This is another variant of the flow map that aims to not only represent the movement of things over time, but also how variables dependent on that movement change over time.

Let’s go back to the example of representing a subway system using a spider map. You could turn that into a data space distribution map by plotting how many people are on a particular train as it moves between stations. You could even plot this variable for multiple times each day to get an idea of when and where the subway is busiest and may need extra staff on duty.

Hopefully you now have a better understanding of the ways geospatial data can be represented, as well as the scenarios in which certain methods may be more appropriate than others. But before you visualize any data, you need to collect it first. If you’re not so keen on the needed legwork and potential accuracy issues that come with gathering data on points of interest manually, SafeGraph has all of your location data needs covered.

If you're ready to learn more, check out the next chapter, "Challenges of Geospatial Data Integrations." If you want to go back to the basics of analytics, our guide titled “Geospatial Data Analytics — What It Is, Benefits, and Top Use Cases” will teach you everything you need to know about this topic.

#### A Technical Guide to SafeGraph Places Data

##### Places Change Every Day - POI Data Should Reflect That

In today’s ever-evolving physical world, accurate and timely points of interest (POI) data proves more important than ever. Businesses, organizations, and research institutions gather and utilize POI data to execute successful operations - from food delivery services to find-my-nearest apps to marketing and advertising campaigns.
However, POI-based applications rely on up-to-date information to provide genuine value to a product or service, and many POI resources fail to deliver adequate or accurate location data due to the dynamic nature of our world.

A key concern of using POI data for business applications is data validation. While many POI data resources use semi-reliable data validation approaches (think manual verification and directory checks), many fail to maintain timely validation methods. In fact, most POI data providers only update their databases every three to six months, which can be problematic depending on the data’s application. For example, when evaluating another POI data provider’s free match service in February of 2022, we found that 17% of their POI records were invalid. We confirmed this by checking website domains and searching for news articles about store locations that recently closed or relocated.

“We can’t be on the ground in every local market we operate in—we need access to data that can be our ‘eyes’ on the ground and give us a more accurate idea of what the local market looks like. But it wasn’t a good use of the team’s time to append partially complete POI data with open-source data to fill in the gaps.” - Julian Adams, Director of Data Science at Avison Young

According to the National Retail Federation, more than 8,100 retail store locations opened in 2021 - and that’s just in the US. Whether these were new brand openings, brand expansions, or store relocations, this stat indicates just how much POIs change every day. At a global scale, these changes are challenging to stay ahead of, and many companies building mapping and location-based platforms or applications struggle to curate an accurate and up-to-date database of places.

When POI data is an integral part of an organization’s operations, the risk associated with such a significant gap in database updates is high.
A company that relies on an up-to-date record of places for its trade area analysis, for example, risks building catchments based on incorrect competitor locations, and thus misallocating resources, if it uses stale data. Similarly, a consumer-facing mapping application built with outdated POI data makes for a poor user experience and creates a high churn rate. These are just two examples among many of the importance of data veracity when representing the dynamic physical world.

Working with stale and inaccurate data is also highly inefficient. According to research by Gartner, poor data quality costs large corporations nearly $15 million per year in losses, both in time and resources. Modern data scientists spend approximately 19% of their time collecting baseline data and 60% of their time cleaning and organizing it. With the majority of time spent remediating ‘dirty data,’ companies can drastically reduce operational data costs by simply obtaining high-quality data from the start.

"We pored through spreadsheets to isolate categories and look for issues in the data and SafeGraph was the clear winner. There was just so much weird, junky stuff in the other datasets, it just didn't pass basic data quality. So kudos to you guys for a solid product." - Nic Babb, VP of Engineering at Adomni

This need for fresh and reliable POI data is why SafeGraph was founded in 2016, and we remain focused on one thing only: being the source of truth for physical places. The SafeGraph Places dataset is curated each month to empower organizations with an up-to-date view of global market landscapes, brand relationships, and how places share physical spaces.

##### What is SafeGraph Places Data?

SafeGraph Places is a comprehensive dataset composed of high-quality POIs, leveraged by thousands of organizations globally who trust the data as their primary source of truth.
It’s a database created to address the most pressing challenges involved with POI data collection and upkeep, providing data scientists, product managers, and analysts with accurate and timely location information to ensure their products, services, analytics, and strategies are built on real-world facts. Places contains a robust set of geospatial attributes to provide deep context about physical locations, including address string, geographic coordinates, brand affiliation, open/close date, and NAICS/category codes.

An advantage of SafeGraph’s Places dataset is the breadth of location types included. While many POI providers only provide traditional commercial places, such as restaurants and retail stores, SafeGraph additionally curates POI data for parks, warehouses, EV charging stations, oil rigs, and other important, non-traditional places. This comprehensive coverage of global places under one unified data schema enables efficient data ingestion, modeling, and analysis - eliminating the need to prep data from multiple sources.

SafeGraph Places provides a comprehensive view of what POIs exist in an area, including non-traditional places like corporate offices, bus stops, and apartment complexes.

SafeGraph’s data curation process ensures the POIs included in Places are geographically precise and contain fresh and accurate attributes about what is actually occurring at that place. In the next section, we’ll dive into our data curation methodology and how we maintain freshness in a changing world.

##### How Does SafeGraph Curate the Places Dataset Each Month?

Each month, SafeGraph creates the Places dataset using machine learning (ML) technology, web crawling, and third-party licensing.
More specifically, SafeGraph curates POIs by:

- Passing POI metadata through machine learning models to assign relevant business categories, deduplicate POIs within and across sources, and cleanly parse addresses.
- Crawling open-source web domains and store locators for accurate and up-to-date place locations.
- Using publicly available APIs to provide updated locations for specific categories of POIs such as airports and government buildings.
- Licensing third-party data to fill open gaps in POI data.

The combination of all these sources results in a ready-to-use, clean, and current dataset that reflects the current state of POIs around the world.

“With other data providers, we would have to spend a lot of time cleaning the data to make it useable. Of course, the quality of the data was important to us, but the ethics of SafeGraph’s methodology really stood out.” - Scott Stoltzman, Director of Data Science at RCLCO

##### The Places Data Schema

Each column in the Places data schema is designed to provide relevant and up-to-date information about global POIs. We describe each column in more detail below:

###### Placekey & Parent Placekey

SafeGraph is a founding member of Placekey, the universal standard unique identifier for places. Placekey was developed out of a need to make location datasets from different sources easily joinable. To make sure our data is interoperable with other location data, SafeGraph appends Placekeys and parent Placekeys to all of our datasets.

Within SafeGraph Places, the Placekey and parent Placekey columns help identify the physical location of a POI and how it is spatially related to other places. When both components of a Placekey come together, it results in the ‘what’ and the ‘where’ of a specific POI and serves as a join-key to simplify bringing multiple location-based datasets together.

Placekey is a unique and persistent ID tied to an individual POI that simplifies joining location-based datasets from multiple sources.
Think of a Starbucks location inside of a shopping mall - that Starbucks will have a unique Placekey because of its geographic location and the type of place it is. Each record in the Places dataset contains a Placekey.

The parent Placekey column, on the other hand, is only populated in rows representing places that are contained by another place. Using the previous example, the Starbucks store inside the mall will have both a Placekey and a parent Placekey, where the Placekey represents the store itself, and the parent Placekey represents the entire shopping mall. This concept of representing how places are related to each other physically is what we call ‘spatial hierarchy.’ Spatial hierarchy metadata appended to SafeGraph Places indicates when a place is standalone, exists within a larger structure, or shares a physical location with another place.

Placekey and Parent Placekey denote spatial hierarchy (how places relate to each other), such as a store within a mall.

###### Location Name, Brands, & SafeGraph Brand ID

To provide the base information about what exists at each geographic location, SafeGraph includes three closely related columns. The location name column delivers the name of each place, such as 7-Eleven. Sometimes this matches the value in the brands column, particularly when a location name is simple, like Walmart, and falls under the Walmart brand. However, the location name column can differentiate between a Walmart and a Walmart Supercenter, while the brand for that location will still be just Walmart.

The brands field is helpful for seeing entire brand footprints regardless of whether individual locations have different naming conventions. SafeGraph brand IDs also help surface brand relationships by serving as a unique and persistent identifier for different brands. Brand IDs remain the same in the event of a brand renaming itself so as not to break any existing models or queries.

SafeGraph brand IDs also detail parent and child brands.
Similar to Placekey denoting spatial hierarchy, SafeGraph brand IDs show brand hierarchies. For example, Yum! Brands owns multiple restaurant brands, so POIs for those restaurant brand locations will contain a brand ID for that restaurant, and a parent brand ID for Yum! Brands. This takes identifying brand footprints and market landscapes a step further to show how some brands are related to each other, and provides another field option for querying and modeling places data.

Every POI in SafeGraph Places includes a location name, but not all records include a brand or brand ID. This is because many places do not belong to a larger brand, such as independent restaurants or local museums. SafeGraph defines a brand as a branded store which has multiple locations all under the same logo or store banner.

Some columns in the Places dataset only apply to certain types of POIs, like Brand and Brand ID. Smaller mom-and-pop store locations or offices will have ‘null’ values in these columns because they do not belong to a larger brand.

While location name, brands, and brand IDs are included in the main file delivered for SafeGraph Places, we include a supplementary brand info file in each delivery to provide the parent brand ID and more brand-specific information. The brand info file is easily joinable to SafeGraph Places through the brand ID column, and includes brand categorization information, stock symbol, stock exchange presence, and lists of which countries the brand currently has opened and closed locations in.

“From the beginning of our data sourcing process, SafeGraph provided the most comprehensive and actionable POI dataset. Their coverage of the top 1,000 restaurants is unmatched and invaluable.” - Ben Anderson, Senior Manager of Market, Customer, and Competitive Intelligence at Sysco

###### Address Elements & Geographic Coordinates

The Places dataset includes separate columns for the latitude and longitude of each POI to make the data easily mappable.
It also has columns for parsed-out address strings, including separate columns for street address, city, region, postal code, and ISO country code. These foundational columns not only locate the POIs in the physical world (as does Placekey) but also power geocoding services in mapping applications and serve as valuable filters for selecting POIs from specific geographic areas.

###### Store ID

Store IDs are unique identifiers within a brand for store locations. The store ID column enables users to easily join with other datasets that include store IDs. Most often, this involves transaction information, financial statements, quarterly reports, and first-party company data.

###### Phone Number, Open Hours, & Website

To provide further foundational context for each POI, the Places dataset includes three columns related to how people can engage with that place: phone number, website, and open hours. These are particularly useful for mapping applications or platforms that surface information to people looking to interact with a place. The open hours column contains specific hours of operation by day in an easily explorable JSON format.

###### Top Category, Sub Category, & NAICS Code

The NAICS code, top category, and sub category columns categorize a POI by what type of place it is. These categorizations were developed by the US Census Bureau to distinguish different place types and are all closely related to each other.

“SafeGraph data adheres to industry standards, like NAICS codes. This makes it a lot easier for us to work with and join to other data sources without having to do a big cleanup effort.” - Matt Taaffe, VP of Product at Olvin

NAICS codes define a POI by a 6-digit code - a taxonomy developed to classify each type of POI numerically. Burger King, a ‘limited-service restaurant,’ contains a NAICS code of 722513. Top category is a string label that defines a POI by its purpose - based on the first 4 digits of a NAICS code.
A Burger King, for example, is labeled ‘restaurants and other eating places.’ Sub category is a string label that defines a POI with a description of its purpose - based on the first 6 digits of a NAICS code. The same Burger King location labeled ‘restaurants and other eating places’ in the top category column is labeled ‘limited-service restaurant’ in the sub category column.

SafeGraph strives to provide 6-digit NAICS codes for most POIs, but for some places our models cannot meaningfully differentiate between two adjacent 6-digit NAICS. In these situations we err on the side of caution so as not to provide false facts, and choose to only assign a 3 or 4 digit description, meaning the sub category column will be null.

###### Category Tags

The category tag column expands on this categorization, providing further flexibility and granularity where the NAICS codes fall short. For example, category tags for a fast food restaurant may include terms like ‘counter service,’ ‘sandwich shop,’ ‘late-night,’ ‘drive-through,’ and more, while the sub category would remain ‘limited-service restaurant’ regardless of the type of food served. Category tags are also helpful in distinguishing between different types of medical offices or retailers. This information is typically used to:

- Power more detailed and specific search queries
- Create more informative customer-facing mapping applications
- Better understand market landscapes
- Develop more accurate models

Category tags allow for more granular filtering and symbolizing, and help distinguish place types from each other within the same NAICS code.

Because each POI can contain multiple category tags, category tags are included as JSON in one column if applicable to a specific place.

###### Opened On, Closed On, & Tracking Closed Since

To indicate the real-world status of a POI and make it clear when places open and close, SafeGraph includes three date-related columns.
The opened on column provides the month and year that POI opened, while the closed on column details the month and year that POI closed, if applicable. If a closed on column value is null, that indicates the POI is still open. If an opened on column value is null, it means we are still acquiring the metadata to confidently report when that place opened, or that it opened before we had rich enough metadata to infer a date. We also include a tracking closed since column to note when we began reporting on that place’s opened or closed status.

The SafeGraph product and engineering teams have developed a detailed and thorough logic for determining if POIs are opened or closed. If a new place from an existing source repeatedly appears in our build pipeline, it is flagged as opened during the month in which it first appears. Similarly, if a POI from an existing source repeatedly disappears from our build pipeline, it is flagged as closed during the month in which it first disappears. These flags are added to the Places product permitting final QA checks and overall data hygiene. SafeGraph does not track temporary closures so as not to mistakenly mark places as permanently closed. You can read more about our open and close logic here.

###### Geometry Type

While SafeGraph Places is ultimately a file of latitude and longitude coordinates for POIs, we do provide detail on whether the location itself exists in the real world as a polygonal space or not. For example, while the record for Golden Gate Park in SafeGraph Places is represented as geospatial coordinates for a single point, the geometry type field indicates that the park actually can be represented as a polygon. Types of places that do not have a polygonal geometry type include bus stops or ATMs, since they often do not have physical extents large enough for a person to traverse. SafeGraph uses the Places dataset to build Geometry data, providing the polygon data for places with geometry.
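
To make the schema described above a little more concrete, here is a minimal Python sketch of how a few of these columns might be used together. It is an illustration under stated assumptions, not SafeGraph's own tooling: the column names (`placekey`, `parent_placekey`, `naics_code`, `open_hours`, `closed_on`), the sample rows, the Placekey-like strings, and the exact layout of the open hours JSON are all hypothetical stand-ins chosen for readability.

```python
import json

# Hypothetical sample rows. Column names, identifiers, and values are
# assumptions for illustration only, not real SafeGraph records.
places = [
    {"placekey": "mall-0001", "parent_placekey": None,
     "location_name": "Example Mall", "brands": None,
     "naics_code": "531120", "open_hours": None, "closed_on": None},
    {"placekey": "store-0002", "parent_placekey": "mall-0001",
     "location_name": "Example Coffee", "brands": "Example Coffee",
     "naics_code": "722515",
     "open_hours": '{"Mon": [["7:00", "19:00"]], "Tue": [["7:00", "19:00"]]}',
     "closed_on": None},
    {"placekey": "store-0003", "parent_placekey": "mall-0001",
     "location_name": "Example Burgers", "brands": "Example Burgers",
     "naics_code": "722513", "open_hours": None, "closed_on": "2022-03"},
]


def children_by_parent(rows):
    """Group child placekeys under their parent placekey (spatial hierarchy)."""
    tree = {}
    for row in rows:
        parent = row["parent_placekey"]
        if parent is not None:  # only populated for places inside another place
            tree.setdefault(parent, []).append(row["placekey"])
    return tree


def is_open(row):
    """Per the guide, a null closed-on value indicates the POI is still open."""
    return row["closed_on"] is None


def naics_prefix(row, digits=4):
    """Return the leading digits of the NAICS code (4 digits ~ top category)."""
    code = row["naics_code"]
    return code[:digits] if code else None


def hours_for_day(row, day):
    """Parse the JSON open-hours string and return the intervals for one day."""
    if not row["open_hours"]:
        return []
    return json.loads(row["open_hours"]).get(day, [])


if __name__ == "__main__":
    print(children_by_parent(places))
    print([r["location_name"] for r in places if is_open(r)])
    print(naics_prefix(places[2]))
    print(hours_for_day(places[1], "Mon"))
```

Keeping each helper tied to a single column mirrors how the schema is described: the parent Placekey carries the spatial hierarchy, the closed on column carries open/closed status, and the NAICS prefix links a record to its broader category. With a real delivery, the same logic would more likely be expressed as joins and filters in a dataframe library or SQL.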
#### Best POI Data Provider for Quality Point of Interest Data

Geospatial data can be used in many different contexts. More and more, businesses and other organizations are finding creative ways to put it to work. One of the most basic types of geospatial data is point of interest (or POI) data, which provides details about physical places on Earth. Beyond mapping the Earth’s surface, POI data is being used in real estate, retail, finance, urban planning, and more to analyze growing neighborhoods and decide where to place facilities or which ventures to invest in.

But data-based work is often only as good as the quality of data that you start with. So how do you know if the POI data you’re getting from a provider is reliable? We’ll answer that question via the following sections:

- How to choose a POI data provider
- What attributes should POI data providers have in their data?
- 7 best POI data providers

First, we’ll discuss how to know if a POI data provider is going to supply you with quality data.

##### How to choose a POI data provider

Not all data is created equal, and so not all geospatial data providers are equal either. To decide which company your business should source POI data from, there are a number of key questions surrounding data quality that your organization should ask:

- Based on how the provider sources it, is the data credible and reliable?
- What information can the data provide, and what gaps does it leave?
- How accessible is the data, and how much processing is needed to make it accessible?
- Why does your business want this data (i.e. what do you plan to do with it)?

Some other, more specific, data quality considerations your organization should think about include the following:

###### Data relevance

Data relevance is related to the question of why your business wants data and what it intends to use that data for. Before looking for data sources, your company should clearly define the questions it wants answered and the objectives it hopes to achieve via data.
This makes it easier for your organization to narrow down which kinds of data will be useful, and which kinds will simply take up space and add noise.

###### Data age and timeliness

The universe is changing all the time. That’s why a dataset’s age is often linked to its accuracy: the more time that has passed since the dataset was created or updated, the more likely it is that the data refers to states of reality that are no longer true. This could be as simple as a business changing its phone number or email address.

To compensate for this, SafeGraph reviews and updates its geospatial data every month. This gives our clients a more frequent data refresh cadence than most competitors, so they can have accurate data sooner and more often.

###### Data scale and completeness

Data needs both volume and comprehensiveness to be useful. If a dataset doesn’t contain many records, then it’s not giving you information about very many items. Likewise, if records don’t have the appropriate details filled in, then you are missing out on important context for the data you do have.

SafeGraph’s POI dataset, Places, contains information on over 40 million points of interest. Not only that, it also contains some of the highest attribute fill rates in the industry for nearly 40 attributes. As an illustration, check out this visualization of our global POI data coverage by country.

###### Data usage rights and terms

Not all data providers allow their data to be used in the same way. Some specify that their data must only be used for an organization’s internal projects. Others allow their data to be used in a consumer-facing capacity, but only in very specific applications. So even if your company finds a POI provider with an abundance of quality data, that data may still not be very useful if there are too many restrictions on how it can be used.

At SafeGraph, we have fairly flexible terms of use for our data.
We deliver it over common GIS and data management platforms, or as a CSV file, so your organization can start using the data in whatever environment it finds easiest to work with it.

##### What attributes should POI data providers have in their data?

POI data is typically organized into one or more tables. The rows in the table denote the number of different points of interest that the dataset covers. The columns, on the other hand, represent attributes: types of information that give additional context to the points of interest being covered.

As we mentioned, quality POI data will have a mix of both. On one hand, just listing a bunch of points of interest isn’t very helpful if there isn’t any supporting information that explains more about each place. On the other hand, complete in-depth details may not be all that useful if they only reference a small number of places.

Here are some common types of attributes that POI data distributors should have in their datasets:

- Location, typically in the form of a street address or latitude and longitude coordinates (or, ideally, both).
- Contact information, which generally includes information such as a place’s name, mailing address, and phone number. Some datasets will also include information regarding a place’s website, social media accounts, or other contact details.
- Categorization, which explains what function a point of interest serves. This is usually represented by its NAICS code, which standardizes the classes of industry that business establishments belong to. Additional category tags may be included to provide further context to a place beyond its type of economic activity (e.g. if a restaurant serves a specific type of cuisine).
- Branding, information related to a company (or subsidiary of a company) that operates at a point of interest, along with others, under consistent logos or trademarks.
Information can include the brand’s name, the name of its parent company, its NAICS classification, stock market designations, and which countries the brand operates in.
- Business information specific to a particular location. This could include the place’s unique ID within the franchise, its hours of operation, the date it first opened to the public, and the date on which it ceased operating (if applicable).

##### The 7 best POI data providers

Now that we’ve gone over some of the things you should seek out in POI data providers, where does your organization start looking? To give you a head start, here are 7 of the top distributors of point of interest data. We’ll tell you a bit about each one, including their strengths and weaknesses.

###### 1. SafeGraph

Best for: High-accuracy and frequently-refreshed POI data, including building footprint polygons

SafeGraph is one of the leading providers of geospatial data because it’s all we focus on. We have accurate data on over 40 million points of interest around the world, with some of the highest fill rates for nearly 40 attributes. In addition, nearly half of our POIs have the added precision of polygon-based building footprints and spatial hierarchy metadata. Our datasets are updated monthly, so your organization gets information that stays fresh.

###### 2. Foursquare

Best for: Global POI coverage reinforced through first-party app verification

Foursquare has a unique way of updating and reinforcing its global POI data: its consumer-facing applications, Foursquare City Guide and Swarm. These apps allow users to “check in” at points of interest, and in doing so, contribute additions or corrections to Foursquare’s data. Of course, this is a voluntary process, so it doesn’t always result in increased precision or completeness in Foursquare’s datasets. Foursquare’s POI data also isn’t backed by polygon-based building footprints.

###### 3. Precisely

Best for: Coverage of many different types of POIs worldwide

Precisely is known more for its data management tools, but it has a portfolio of geospatial data as well. Its POI data has a pretty granular classification system, and is easy to combine with Precisely’s other geospatial datasets due to the PreciselyID standard of identifying locations. However, because of the way this standard works, Precisely’s POI data is somewhat prone to having duplicate entries. And despite having polygon-based building footprints to show exactly where POIs are, Precisely doesn’t have spatial hierarchy metadata to denote when certain distinct places are inside or a part of a larger place (e.g. a store inside a mall). Lastly, Precisely’s geospatial data is rather expensive.

###### 4. HERE Technologies

Best for: Data on a very high number of POIs worldwide that’s easy to search and filter

HERE has a high volume of global POI data – over 120 million points of interest in over 100 countries and territories. But what makes it stand out is the number of options it has for searching, filtering, and sorting that data. For instance, it can find electric vehicle charging stations, a restaurant that serves a specific type of cuisine, or even a place along a planned travel route. The completeness of HERE’s POI data is somewhat lacking, though, as the company is focused more on navigation solutions than place information. The data can also get somewhat expensive to use if your organization goes beyond the free offerings.

###### 5. TomTom

Best for: Very wide global coverage of POIs with lots of descriptive attributes, fairly priced

Considered one of the pioneer companies of commercial GPS equipment, TomTom naturally has a database on points of interest. The database has global coverage and a wide array of well-documented descriptive attributes, all at reasonable prices.
TomTom’s POI data only has limited completeness, though, as – like with HERE – TomTom’s focus is more on mapping and navigation tools than it is on data about specific points of interest. Another weakness in the data is that the latitude and longitude coordinates of places are based on street addresses, not the centroids of the actual buildings. 6. Google Places Best for: Comprehensive global POI coverage, including imagery and social sentiment While it became world-famous because of its web search engine, Google has compiled a vast library of global geospatial data as well. Its Places API is regarded as one of the most complete collections of POI data in the world, including streetside imagery and traveler feedback. However, this data is only available through the API, and there are very strict terms on its use. It is also one of the most expensive options on the market. 7. OpenStreetMap Best for: Free geospatial data, if you have the coding and legal know-how to access and use it OpenStreetMap is likely the most well-known open-source global geospatial database. It has become a popular source for POI data because it does not charge a licensing fee for use of its data. However, OSM’s licensing and non-standardized data formats can make the data somewhat complicated to collect and use. Also, because the data is contributed mainly by a small non-profit (the OpenStreetMap Foundation) and volunteers at non-routine times, its accuracy and consistency can be questionable. Get quality POI data with accuracy, completeness, and timeliness from SafeGraph The best providers of POI data have datasets that: reflect the current reality as closely as possible; give enough relevant information for enough relevant places; are updated frequently to preserve accuracy and limit data decay; and have flexible licensing terms that allow for your organization’s intended use cases. SafeGraph’s Places dataset checks all of those boxes.
It contains over 40 million records, with accuracy backed by over 15 million polygon building footprints and high fill rates for over 40 attributes. We update Places monthly so your business gets newer and more precise data more often. And we deliver the exact data your company needs via CSV file, or through common GIS or data management software (Snowflake, Amazon S3, Databricks Delta Sharing, CARTO) so you’re ready to hit the ground running with it. #### Business Intelligence vs. Competitive Intelligence: A Comparison Business intelligence and competitive intelligence make up two thirds of what is sometimes known as the “strategic intelligence triad” for businesses (the other third is knowledge management). But because they are similar terms, there is sometimes confusion as to whether they have different meanings or instead refer to the same thing. So we’re going to compare and contrast business intelligence vs. competitive intelligence to help you better understand how each one can be used to reinforce your company’s decision-making processes. Here’s a quick rundown of our agenda: Business vs. competitive intelligence: an overview What is business intelligence? What is competitive intelligence? The main differences between business intelligence and competitive intelligence We’ll start by offering summary definitions of the two terms. Then we’ll get into explaining the particulars of each one and how they can influence your business strategies. Business vs. competitive intelligence: an overview The terms “business intelligence” and “competitive intelligence” are sometimes used interchangeably, but they do mean different things. Business intelligence is a broader term that refers to any kind of data-based analysis and decision-making aimed at helping a company operate more efficiently. 
Competitive intelligence is a narrower term that refers specifically to a company gathering data about its competitive position within its market or industry – basically, where and how it’s positioned next to its rivals. This helps the company make moves that take advantage of market conditions and competitor weaknesses, as well as cover for its own shortcomings. Here are a couple of short definitions that sum up the two terms: Business intelligence: Analyzing and using a company’s data, tools, processes, people, and market standing to look for and implement more efficient operations. Competitive intelligence: Studying data about a company’s competitors, as well as other external factors that affect the company, in order to shape its business strategies and contingency plans. The important thing to remember is that while business intelligence and competitive intelligence aren’t exactly the same thing, they do have similar goals. They’re both aimed at making a company run as smoothly and profitably as possible so it can make the best of the good times and stay afloat when times are tough. We’ll also mention that our data at SafeGraph can be useful for both applications. Our data can be used to optimize the locations and layouts of your stores and advertisements, to study foot traffic and demographic patterns so you can identify and profile competitors and analyze their strategies, and more. Having said that, next we’ll take a closer look at what each type of intelligence specifically entails, as well as how it can be used. What is business intelligence? Business intelligence is the process of a business collecting and analyzing large amounts of data to aid in making decisions and solving problems. It is typically conceptualized as being directed at streamlining a business’s internal operations in order to improve efficiency and reduce costs.
It’s difficult for a company to be successful if the cost of operating it outweighs the amount of money it makes selling its goods and services. That’s just a basic principle of business. To that end, business intelligence refers to a broad range of activities aimed at optimizing the different systems that make up how a company runs. The overall goal is usually to speed up how fast goods and services can be produced, minimizing the costs of producing them, or both at the same time. It can also include balancing those objectives with improving employees’ satisfaction with, and devotion to, the company and their positions within it. The latter is important for attracting and retaining talented employees that can help the company reach its targets. Benefits of using business intelligence Again, part of taking a company in the right direction is assessing where it stands now. Business intelligence lets a company contextualize data on its current position with data on how it has grown: what tools and processes worked or didn’t; what targets or goals were met or missed; and what employees liked or disliked about working at the company. This can pay dividends for various parts of the company; for example: More informed management: Business leaders are often focused on the big picture, but may not be aware of everything that goes on in the day-to-day operations of the company. Business intelligence can help them understand the smaller systems that work together to keep the company running, and thus make informed decisions on how to improve them. Smoother product development and testing: Product teams at a company can benefit from analyzing data on what products are purchased and used the most, as well as which features of products customers like or dislike the most. They can also look at what kinds of people use certain products or features the most, so they have a target demographic in mind when developing new products or testing new features. 
Stronger sales strategies: A company’s sales team can use business intelligence to determine the average length of time it takes the company to close deals. Then, they can look at which representatives are getting the most sales and/or closing sales the fastest, and study their approaches. This can be used to improve training for every salesperson on the team, so they can both seal more deals and reduce the amount of time it takes to do so. More precise marketing: Advertisers at a company can improve several parts of their operations through business intelligence. For instance, they can compare impressions for campaigns across multiple platforms (e.g. blogs, social media posts, television, radio) and weigh them against conversions (e.g. website traffic & sales). This can tell them which channels to focus on and what content they should promote. As another example, they can analyze actual sales activities to find out who the company’s main customers are in terms of demographics. They can also evaluate past product launches and large client deals through win/loss analysis to determine which marketing tactics work and which don’t. Better corporate culture: A company’s HR department can look at data on past employees, and poll current employees, to get a sense of what it’s like to work at the company. How much are they paid? How long have they stayed at the company? Why did past employees leave? What company values or initiatives do they like or dislike? Analyzing this data can help the HR team build an atmosphere that makes top talent want to join the company and stay for a long time. What is competitive intelligence? Competitive intelligence is a sub-type of business intelligence. It still involves a company collecting and analyzing mass quantities of data, but most of this information is about a company’s competitors. The goal is to use this information to develop business strategies that outperform rivals. 
Competitive intelligence sometimes gets a bad reputation because it is equated with companies spying on each other. Rest assured, though, that competitive intelligence is not the same thing as industrial espionage, in that it does not involve obtaining information in illegal or unethical ways. When done correctly, competitive intelligence is a perfectly legal and very powerful tool for generating competitive advantage for your company. To reiterate, the point behind competitive intelligence is not just to know things about competing companies. It’s to understand how what you know about your competitors can be used to help your company counteract their strategies or weather market disruptions better than them. Remember, though, that your competitors may be trying to find out the same things about your company. So it can also be important to think about your own company’s weaknesses and how competing businesses may act to exploit them. Benefits of using competitive intelligence Competitive intelligence deals with studying the business maneuvers of competing companies. This helps your business assess where it stands relative to its rivals in the marketplace, and where it should go next. The goal of competitive intelligence is for your business to try to anticipate competitor moves and other external market forces, so that it has plans in place to seize opportunities and defend against threats caused by these shifts. Here are some different ways that can benefit your company: Minimizes risk: Whether your company is developing new product ideas or new business strategies, it can avoid risky decisions by looking at what has or hasn’t worked for competitors and determining why. Improves product R & D: Your company’s product teams should study competing products and what reviewers say they like or dislike about them. Then, they can design different elements of your products to give customers more of what they want and less of what they don’t. 
Supports sales and deal-making: It’s essential for your company’s management and sales teams to know how your company and its products/services compare to the competition. Otherwise, they’ll have a tough time convincing big clients to pick your company over its rivals. Sharpens marketing messages: Your company’s marketing teams should research competitors’ advertisements for things like how they are positioning themselves and their products/services, who their likely target audiences are, and where/how the ads are being deployed. This can help your own marketers target niches that your competitors don’t have a strong hold on, or develop superior strategies that win back market share. Boosts talent acquisition & retention: It’s useful for your company’s human resources teams to keep an eye on competitors’ hiring practices, as well as reviews of their corporate culture. This can help them develop policies and postings that attract great workers – and make them want to stay at your company. The main differences between business intelligence and competitive intelligence The main differences between business and competitive intelligence are their orientation and scope. Business intelligence is inward-facing, aimed at improving a company's internal systems. Competitive intelligence is outward-facing, meant to improve a company’s competitive standing in the market. It can be said that competitive intelligence is a special type of business intelligence. As business intelligence is concerned with improving all facets of a company’s operations, it naturally scrutinizes how prepared the company’s internal systems are to deal with competitors’ strategies and other events that affect the overall industry. That is part of what competitive intelligence involves. The other part is to look outside the company to try and predict what competitors will do, and what other market-affecting events will happen, so the company can hypothesize what exactly it has to be prepared for. 
Having said that, both types of intelligence can make use of data both internal and external to an organization. To illustrate, most facets of business intelligence examine data, tools, systems, and people a company already has on hand. However, it can also involve some third-party external data on what kinds of technology, systems, best practices, and so on are currently successful within the industry and that the company could adopt. Conversely, because a lot of the direct information on competitors is kept hidden and protected by law, competitive intelligence tends to rely more on data from alternative third-party sources. These include news stories, social media posts, websites, job postings, and advertisements. However, it can still be useful for a company to compare data from these sources with data on its own internal workings. This could give a company a more complete picture of how well it’s prepared to attack and defend against competitor moves and industry events, if and when they happen. The following table summarizes some of the key differences between business intelligence and competitive intelligence. We hope you now understand the differences between business intelligence and competitive intelligence, so you know how to use each one towards setting up your business for success. We’ll also remind you that the data we carry at SafeGraph can be helpful in optimizing the location-based facets of your company, or in outmaneuvering competitors’ geospatial business strategies.

#### ChainXY Alternatives for More Complete and Flexible POI Data

A key to market analysis is to know what big brands are present in a particular area, and whether they will compete with or complement your business (or neither). That’s the philosophy behind ChainXY, a company that sells data and analysis tools for keeping tabs on popular retail chains around the world.

So how well does that match up with the approach your company needs to take to scouting trade areas?
To answer that question, we’ll look at ChainXY’s strengths and weaknesses, and then compare those of similar services to help you find the right point of interest data provider. Here’s a summary of our agenda:

- What is ChainXY?
- 8 Things to Consider in a ChainXY Competitor
- 8 ChainXY Competitors for Greater Coverage and More Flexibility

In case you aren’t very familiar with ChainXY, we’ll start by giving a short overview of the company.

What is ChainXY?

ChainXY is a data vendor with a focus on point of interest data about retail stores, restaurants, and other commercial POIs. It also has a suite of analytics tools that can be used to assess proximate POIs, see what types of businesses are lacking in an area, and track store openings and closings.

Furthermore, ChainXY’s analytics platform features a two-way feedback function for users to suggest the addition of new location entries or corrections to existing ones. The company caters mainly to analysts for retail chains or real estate companies who are looking to conduct trade area research.

8 things to consider in a ChainXY competitor

So how do you evaluate whether ChainXY is a good geospatial data provider? Or how it stacks up against similar companies, for that matter? Based on our experience in the industry at SafeGraph, here are eight important factors to keep in mind when exploring your options.

| Criteria | Why it matters |
| --- | --- |
| Price | Your company needs to fit purchasing or licensing data within its budget, and it shouldn’t be paying for more data than it needs. |
| Scope | Depending on your organization’s use cases, it may only need data for certain countries or territories, or for certain types of places. Or it may need a wider breadth of data to look at international trends in different industries. Find datasets that cover the geographies and categories your business is (or could be) interested in. |
| Completeness | Don’t forget to consider data depth, as well. Data with fewer attributes – or lower fill rates for attributes – may not give your company all the information it needs. This can necessitate finding extra data to fill in what’s missing, often needlessly costing your business time and money. |
| Accuracy | A dataset’s accuracy is critical to its value for your organization. POI data that has incorrect attribution, or multiple records referring to the same place, costs your company in terms of having to scrub the data yourself or replace it with cleaner datasets. It can also cost you in terms of inaccurate analysis and loss of customer trust if the problem isn’t fixed. |
| Freshness | POI data can change more often than you might expect, especially if your organization is trying to manage a large-scale database. Check to see how often a vendor updates its datasets; longer update intervals mean a greater chance that data might no longer be correct by the next time it’s reviewed. |
| Documentation | It’s hard to get value from data if you don’t have a solid understanding of what it actually represents. A dataset that has thorough supporting documentation will help your company interpret the data correctly and apply it to cases where it’s actually relevant. |
| Delivery | Data tends to be more useful if you can get it all at once, where you need it, when you need it, so your business can analyze or apply it as a whole. Having to get it piece-by-piece from multiple downloads or API calls generally isn’t all that practical. |
| Licensing | Even the best data in the world won’t be all that helpful for your company if the provider’s terms of use disallow using it for your intended application(s). Be sure to check the licensing agreement on a dataset before your organization buys or otherwise acquires it, as you’ll want to avoid the legal implications of using data in a way that was explicitly not intended. |

As far as ChainXY goes, its data is reasonably priced, and is easy to access because it’s offered in a variety of file formats.
Also, a premium subscription gives full access to the data, as well as a suite of analysis tools, making this option even more cost-effective.

ChainXY’s main weakness is its limited scope. Despite having global POI coverage, it only tracks stores from the most notable brands. That means its data is missing smaller brands and independent businesses, as well as non-commercial POIs. In addition, the data is updated every three months, which leaves a significant amount of time for it to become stale and inaccurate. Finally, ChainXY’s licensing terms only allow its data to be used internally within a company; its data generally can’t be used to power consumer-facing applications.

8 ChainXY competitors for greater coverage and more flexibility

If your organization needs data on points of interest beyond just big-brand stores, there are plenty of options out there. Many of them also have more relaxed terms of use, giving your company greater freedom in how it uses any data it gets from them. Here are 8 examples.

1. SafeGraph

Source: SafeGraph
Pricing: $$
Free trial: Sample data available
Best for: Affordable and accessible data that’s more complete, fresh, and flexible

Like ChainXY’s data, SafeGraph’s Places data is affordable and conveniently accessible on a number of different platforms. However, it covers more types of POIs than just big-brand stores, including independent retailers and “point POIs” (e.g. ATMs, transit stops, EV charging stations, and vending machines). It’s also updated monthly, so it’s significantly less likely to be missing newly-opened places or containing outdated information. Furthermore, our flexible licensing terms allow your organization to use our Places data for a greater number of applications.

2. Google Places

Source: Google Places API
Pricing: $$$$$
Free trial: No
Best for: POI data with decent scope and accuracy that includes images and social sentiment

Already well-known for its search engine, Google has expanded its operations into the geospatial data realm with products like Google Maps, Google My Business, and Google Earth. Google Places is its API that allows companies to tap into its database of point of interest information.

While this database has global reach, and contains elements like crowdsourced reviews and streetside imagery, its licensing terms only allow for a limited range of uses. Two other common complaints are that its data attribution is somewhat shallow, and that the data has become extremely expensive to use as of pricing changes in 2018.

3. Foursquare

Source: Foursquare
Pricing: $$$$
Free trial: No
Best for: Global POI data with optional social sentiment from people who have visited places

Foursquare was formerly a company that made mobile applications, such as Swarm and Foursquare City Guide, for crowdsourcing reviews and travelers’ tips on points of interest. Foursquare now sells enterprise-level geospatial datasets, but still includes this extra user-contributed information from its apps.

This social sentiment can be valuable in some use cases, especially when Foursquare tracks over 100 million points of interest across over 200 countries and territories worldwide. The data is very expensive, though, and many of Foursquare’s featured data attributes cost extra money to access.

4. AggData

Source: AggData
Pricing: $
Free trial: Sample data available
Best for: Affordable data on big-brand store locations in specific countries and territories

AggData is quite comparable to ChainXY: it focuses on stores for major brands; it offers a subscription service for full data access; and its data is relatively inexpensive. Its datasets are organized by country/territory and brand, so it’s possible to focus on specific geographies and chains.

However, the data doesn’t include some useful contextual attributes, such as NAICS codes, hours of operation, or building footprints. Also, AggData’s datasets can have wildly varying freshness – anywhere from a month to three years since one was last updated.

5. Data Axle

Source: Data Axle
Pricing: $$$$$
Free trial: Yes (30 days)
Best for: Very in-depth information on points of interest in the US and Canada

Data Axle’s POI data covers nearly 17 million points of interest, though only in the US and Canada. What Data Axle lacks in data scope, though, it more than makes up for in data depth. Each point of interest has over 400 attributes associated with it, and Data Axle has contact information for over 150 million people across all businesses it covers.

Unfortunately, Data Axle’s data is rather costly, especially considering quite a few POI attributes require upgraded accounts or add-on packages to access.

6. OpenStreetMap

Source: OpenStreetMap
Pricing: Free
Free trial: No
Best for: Geospatial data that’s free to license and use, but requires coding know-how

OpenStreetMap has become a popular solution for POI data because it has no upfront monetary costs. Licensing data simply requires crediting OSM and its contributors, as well as using said data under an identical license.

The tradeoff is that OSM’s data and map making tools can be difficult to access and use without some experience with programming languages. The data is also contributed and maintained on a by-and-large voluntary basis, so its accuracy, attribution, and documentation can be inconsistent. So OSM’s data can still have costs associated with it, in terms of the time and work needed to merge and clean multiple independent datasets.

7. Precisely

Source: Precisely
Pricing: $$$
Free trial: Sample data available
Best for: Data with global coverage and granular attribution that’s easy to access and work with

Precisely is a company that’s more often recognized for its data management solutions. It does have a portfolio of geospatial data, though, tied together by the PreciselyID location identifier for easy interoperability between datasets. The data itself has fairly good scope and completeness, and is available in multiple file formats so it’s easy to access.

Its accuracy is iffy, though, for two major reasons. First, the data contains a fair number of duplicate entries, which will have to be merged or cleaned up to make sure each location your organization is looking at is unique. Second, like with ChainXY, the data is only updated every three months. So there’s a greater chance some places will newly open, change, or close between updates, leaving your company to do the manual work of correcting this out-of-date information.

8. HERE

Source: HERE Technologies
Pricing: $$$
Free trial: No
Best for: Strong search/filter/sort options and data attribution, priced for smaller businesses

HERE makes point of interest data easy to work with because it has deep data management functions and attribution. So if your organization wants to find things like electric vehicle charging stations, all the Mediterranean restaurants in a city, or a place along a logistics route, HERE makes it simple to find those things.

HERE’s POI data spans over 120 million places in 100 countries and territories worldwide, though only about half of those regions are indicated to have complete coverage. HERE gives a generous amount of free credits for accessing its data, but the data is fairly pricey once those credits run out. So it’s a better option for smaller operations, but doesn’t scale up very well.
Get robust, complete datasets with flexibility on licensing

Tracking big brands in a trade area is an important first step in performing market analysis for your organization. But it often doesn’t tell the full story, especially if a region has popular local independent retailers. And not knowing where a region’s non-commercial points of interest are robs your company of valuable contextual information that may help to explain why some businesses are more popular than others in a region. For instance, local transportation systems could be set up in a way that makes certain stores more convenient to access than others.

We believe it’s better to have a POI data solution that covers many different types of places so your company can see the bigger picture. We believe that point of interest data should be revised as frequently as possible, so your business can spend more time putting it to work and less time fixing missing or outdated entries. And we believe data should have flexible licensing terms, because every company is different and thus will want to use the data in different ways. That’s the philosophy behind our Places dataset. If that resonates with you and your organization, get in touch with us today to see how we can help.

#### Challenges of Geospatial Data Integrations

We explained earlier in this guide how geospatial data integration can benefit different types of organizations in many different ways. However, putting geospatial data to work for your company can often be easier said than done.

You may not initially have the right personnel or technology infrastructure to properly work with geospatial data.
And data that is inaccurate or lacks standards can require a lot of time and effort to clean, or else you risk making really bad calls on critical business decisions.

We’ll elaborate on these potential pitfalls of using geospatial data, as well as some work-arounds, in the following sections:

- The need for data integration
- Top 5 challenges of geospatial data integration

We’ll start with a recap of why integrating data, especially geospatial data, into a company’s operations is such a big deal.

The need for data integration

Data allows companies to measure multiple instances of an element or variable to make fact-based decisions. But any single dataset can only give a limited amount of information and be used for a limited number of purposes. That’s why integrating and linking multiple datasets in your organization’s operations is essential for being able to gain more insights and answer more questions.

Geospatial data, in particular, can be a powerful tool because it links information to specific places in the physical world. Sometimes it can also show relationships between different temporal-spatial instances. For example, it can allow a business to calculate how likely people visiting a certain point of interest nearby are to visit their store. Of course, there are many other applications that these data points and the relationships between them can be used for, such as mapmaking, civic planning, and even assessing risk for insurance policies.

Top 5 challenges of geospatial data integration

Unfortunately, integrating geospatial data into your organization’s decision-making is not without its obstacles. Some are common to other data integration processes. Others are unique to geospatial data because of what it describes and how it behaves.

1. Data standardization

Many data scientists and GIS analysts spend up to 90% of their time just cleaning data before using it. The reason for this is a lack of standards.
For instance, timestamps may be from different time zones, or measurements may be taken using different units – sometimes ones that do not neatly convert between each other (think metric vs. imperial).

A standard is also sometimes only as good as its adoption rate, and there can be barriers to this as well. For example, the standard’s creator(s) may charge money, require re-sharing of data, or impose some other obligation that makes people and organizations hesitant to adopt that standard. And remember: a standard doesn’t have to perfectly fit all cases; it just has to fit enough so that a critical mass of people or organizations agree to it and derive value from it.

How to solve this problem:

A good standard should allow your datasets to be understood in the context of as many other datasets as possible. To do this, it should be able to identify data points under a series of guidelines, often summarized as the “S.I.M.P.L.E.” formula:

- Storable – Data point IDs should be able to be stored in places that don’t require Internet access.
- Immutable – Data point IDs shouldn’t change over time, except in extreme circumstances.
- Meticulous – Data points should be uniquely identifiable across all systems they’re in.
- Portable – Standardized IDs should allow data points to smoothly transition from one storage system or dataset to another.
- Low-cost – The standard should be inexpensive, or even free, to use for data transactions.
- Established – The standard needs to cover almost all data points it could be applied to.

2. Address standardization

Addresses are so notorious for causing data standardization problems that they deserve their own section. For starters, there are many different elements to addresses: street name, building unit number, city, region, country, mailing code, and so on. Some databases may not have these pieces of information in a standard order, or may not even have all of them.
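
As a rough sketch of that last point, the Python snippet below takes two hypothetical records that describe the same place but store the address elements under different column names and in a different order, with one of them lacking a mailing code, and maps both onto a single layout. The field names, the alias table, and the helper functions are assumptions made up for illustration; they are not part of any SafeGraph or Placekey tooling.

```python
# Canonical layout for this sketch (an illustrative assumption, not an official standard).
CANONICAL_FIELDS = ("street", "unit", "city", "region", "postal_code", "country")

# Hypothetical aliases that different source databases might use for the same element.
FIELD_ALIASES = {
    "street": "street", "street_address": "street", "addr1": "street",
    "unit": "unit", "suite": "unit",
    "city": "city", "town": "city",
    "region": "region", "state": "region", "province": "region",
    "postal_code": "postal_code", "zip": "postal_code", "postcode": "postal_code",
    "country": "country", "country_code": "country",
}


def canonicalize(record):
    """Map a raw record onto CANONICAL_FIELDS; absent or empty elements stay None."""
    out = {field: None for field in CANONICAL_FIELDS}
    for raw_key, value in record.items():
        field = FIELD_ALIASES.get(raw_key.strip().lower())
        if field is None or value is None or str(value).strip() == "":
            continue  # unknown column or empty value: skip it
        out[field] = " ".join(str(value).split())  # collapse stray whitespace
    return out


def missing_fields(canonical):
    """List the canonical elements a record does not provide."""
    return [field for field in CANONICAL_FIELDS if canonical[field] is None]


def shared_fields_agree(a, b):
    """True if every element present in both records matches, ignoring case."""
    return all(
        a[f].lower() == b[f].lower()
        for f in CANONICAL_FIELDS
        if a[f] is not None and b[f] is not None
    )


# Two made-up records for the same place, as two different databases might store it.
record_a = {"street_address": "123 Main Street", "city": "Springfield",
            "state": "IL", "zip": "62701", "country": "US"}
record_b = {"Country": "US", "Town": "springfield",
            "addr1": "123  Main Street", "province": "IL"}

a, b = canonicalize(record_a), canonicalize(record_b)
print(missing_fields(a))          # elements record_a lacks
print(missing_fields(b))          # elements record_b lacks
print(shared_fields_agree(a, b))  # do the overlapping elements match?
```

Even with both records in one layout, only the overlapping elements can be compared: the second record has no mailing code to check against the first.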
This can make it difficult for a computer program or algorithm to tell if two or more addresses point to the same location.There are other challenges as well. Some place names may be misspelled or have other typos. Even varying use of punctuation, abbreviations, or acronyms can cause problems. Does your data processing platform recognize that “US”, “USA”, “U.S.A.”, “the (United) States”, and even “America” all refer to the same country? Can it tell if the abbreviation “St.” stands for “street” or “saint”, and in which cases either one applies?How to solve this problem:Correcting these issues requires storing address data in a more efficient and less arbitrary way. That’s why Placekey was invented: to provide a free, open, and concise standard for representing information about a specific location. It generates a unique “what @ where” string of encoded characters that first identifies a location’s address, as well as a specific point of interest there (if one exists). It then defines the geographic area that location takes up, based on a hexagon whose centroid is the specific latitude and longitude coordinates of that location.3. Lack of institutional knowledgeTraditionally, geospatial data and geographic information systems (GIS) have been in a class of their own, separate from data science or other engineering fields. So only a small group of people in these latter fields (about 5%) actually know how to work with geospatial data. It doesn’t behave the same way as, say, tabular data, so many organizations struggle to ingest it into their workflows because there is a skills gap.Bridging that skills gap can be difficult as well, and not just because companies have a limited talent pool to draw from. They also have to make sure they hire people with the unique skill sets and experience they need. 
This often causes the recruitment process – from drafting a posting to interviews to technical tests – to take longer than usual, which often clashes with the organization’s desire to move ongoing projects along. This can put immense pressure on hiring managers to hire someone as fast as possible, instead of someone who can actually do the specific job.

How to solve this problem:

Start by looking within your company’s own network. Then get creative if you need to: host webinars, hackathons, or meetups; attend conferences; or hire a specialized recruiting agency to attract contacts with specialized geospatial data know-how.

Ideally, you’re going to want someone with strong programming skills and a background in statistics. They should also know how to make data products, visualizations, workflows, and pipelining routines. Finally, you’ll want someone who’s familiar with machine learning, distributed computing, and (obviously) GIS software.

4. File size/processing times

Like any type of data science analysis, geospatial analytics requires the right systems and infrastructure. That said, you don’t necessarily need anything radically different from other types of data analysis. But basic tools like Excel and ODBC-connected databases queried through SQL might not cut it if you’re looking to work with a large number of datasets, or at least scale up in the future.

You also have to decide how much you want to preprocess data or optimize it as you go (cost-efficiency versus flexibility to answer unique questions). Finally, you need to communicate these decisions with stakeholders so that they understand the limits of how fast or completely you can answer their questions, based on how fast you can process the relevant data.

How to solve this problem:

Data experts recommend using a cloud-based data platform like S3. While it may take more time and expertise to operate and manage, it offers better processing capabilities and scalability over time. 
It also offers room for developing custom parts for the tech stack as they are needed. Your system should also have a data lake, a data storage system, a processing platform, a task scheduler, and a pipeline creation tool.5. Data qualityA lot of bad data exists. Most of it is caused by a lack of expertise in how to collect and process it, or just simple human error. As we’ve already discussed, lack of standardization plays a large part in this, as it can cause analysts to miss critical details. Other inaccuracies in geocoding and digitizing physical places and features can cause a cascade of inconsistencies in their geographic representation. These make it difficult, if not impossible, to accurately measure foot traffic and other variables surrounding a business or other point of interest.Open source geospatial data is great because everyone can check it for mistakes and omissions — at least in theory. In reality, users should still be careful to vet open source data and make sure it is correct and suitable to their needs. The problem is that this process is expensive and time-consuming, so companies will often skip it — especially when they’re on a tight deadline and need insights quickly. But the consequences of making important decisions with inaccurate data can be even more costly.How to solve this problem:Take four steps to check data before using it. First, make sure it comes from reliable sources. Second, evaluate what it’s capable of, including any gaps it may leave and any assumptions you might make about it. Third, determine how much work it will take to get the data ready for use. Finally, based on what you know the data can (and can’t) do, draw up a plan for what specific function(s) it will serve in your operations.If that sounds like a lot to go through, consider cutting down on some of the manual labor by investing in SafeGraph’s datasets. 
They’re checked for accuracy and cleaned every month by SafeGraph’s expert data technicians, so they’re among the most up-to-date and immediately-usable geospatial data sets on the market.In summary, if you’re going to use geospatial data, first make sure you have the right people and infrastructure to work with it properly. Then, make sure the actual data you’re using is as accurate, standardized, and as relevant to your organization’s needs as possible. If you’d like further help, get in touch with SafeGraph. We’re experts in managing geospatial data – because it’s all we do.If you're ready to learn more, check out the next chapter, "Geospatial Data Management Best Practices". If you’re on the integration path and have questions about the process, make sure you check out our guide, “Geospatial Data Integration — Importance + Top 5 Challenges”. #### Data: The Future of Commercial Real Estate Location data can unlock new insights that enable commercial real estate companies—and their tenants—to bounce back in a post-pandemic world.Having Access to Up-to-Date and Accurate Location Data Leads to Stronger Investments.Data has recently become a game-changer for commercial real estate companies. Many have started relying on location data, specifically—including point of interest (POI), building footprint, and foot traffic data—alongside first-party data to conduct advanced market analysis and investment research before making major business decisions. This is critical knowing that many of those decisions come with potential long-term financial implications. Real estate is all about location, location, location—and location data is driving the industry forward.Using location data in this way has, therefore, made it possible for many commercial real estate companies to not only mitigate the potential risks of their investments but also build a more ROI-positive investment strategy from the get go. 
Plus, with the world quickly reopening after a challenging pandemic-ridden year, there is momentum building around an urgent need to invest in highly valuable and attractively-priced commercial real estate opportunities, following last year’s ‘domino effect’ of store closures. Access to the right data will be the key to understanding which commercial real estate opportunities hold the most promise in the post-pandemic world. And it’s up to commercial real estate companies to use these insights to inform their own portfolio management strategies and also influence how they encourage future tenants to lease the spaces available. Unfortunately, not every commercial real estate company knows exactly what to do with all this location data or even how to use it correctly. In this guide, we’ll explore the reasons why commercial real estate companies should put location data at the heart of their investment strategies as a way to drive long-term ROI and mitigate potential financial risks. Key takeaways at a glanceAlthough big data is a relatively new territory for commercial real estate companies, we’ll take a close look at why it’s quickly becoming a game-changer in a predominantly traditional industry: Location data gives commercial real estate companies a competitive edge, in terms of investment and leasing strategies, in order to drive long-term ROI.There are multiple ways commercial real estate companies can use location data, from market analysis to site selection and portfolio management to risk mitigation. SafeGraph Places data can contextualize commercial real estate data to fuel actionable insights that drive smarter and more informed decision-making.Technology is affecting every aspect of our business and the tipping point is here. 
The gap is growing between those who adopt technology and those who don’t.Becoming a Truly Data-Driven Organization in Commercial Real Estate Data Gives Commercial Real Estate Companies a Competitive EdgeEspecially now, with a lot of vacant properties up for grabs, it can be difficult for commercial real estate companies to narrow down the list of all available investment opportunities and then be able to single out only the most potentially profitable properties with a high degree of accuracy. Fortunately, data (of all kinds) can help get a more targeted search underway, making it easier to discover properties that meet specific search criteria or align to established investment goals. Among the many ways that data is already fueling the success of the commercial real estate industry, here are few areas that you may not have considered: Strategy development: Simply having access to any data likely won’t move the needle on its own. You need the right and most accurate data to build reliable strategies and make better investment decisions. After all, if the data you’re working with is flawed, you could end up basing your investment decisions on incomplete or inaccurate insights—which can lead to serious financial implications down the road.‍Cost reduction: Working with the right data can also shorten the timeline between starting a property search and eventually filling an open vacancy. But aside from simply using data to speed up this entire process, it can also drive broader operational efficiencies that can lead to substantial cost- and time-savings over the long-term.Tenant satisfaction: Using location data to do your due diligence around market research or trade area analysis makes it easier to go into leasing conversations with potential tenants equipped with key insights that can close deals quickly. 
By having a more holistic view of the properties in your portfolio before diving into these conversations, you will be able to build more compelling cases for getting the right tenants into the right properties. This is a perfect way to drive greater value across the entire commercial real estate ‘supply chain.’ What about trusting your gut instinct? It should come as no surprise that the most successful commercial real estate agents tell you that a big part of real estate investing, in general, requires trusting your gut. But today, they also know that having the right data to support and validate those instincts is critical for building confidence during key strategic planning and decision-making processes. As Ryan Passe, VP of Operations at Sands Investment Group (SIG) puts it, “The time when brokers could lean on gut feelings to get the best deal is gone. Today, if you’re not paying attention to the numbers, you’re losing money.” That’s why companies like SIG have doubled down on making commercial real estate investment decisions more data-driven. It goes beyond simply making better and quicker “matches” between buyers and sellers. It’s now a matter of pricing deals correctly, having a better grasp of capital flows and inventory levels, mitigating credit risks, and establishing accurate property valuations. “That’s empowering our teams to find opportunity in the market, make quicker decisions, and embrace new technology in a world that is mostly reliant on gut feel,” reiterates Passe. Throwing location data into the mix can, therefore, help answer questions around potential foot traffic, opportunities for revenue generation—including cross-pollination from neighboring businesses—inventory management, resource planning, and more. These are the kinds of insights that can turn a gut instinct into a more viable long-term investment decision. Many real estate firms have long made decisions based on a combination of intuition and traditional, retrospective data. 
Today, a host of new variables make it possible to paint more vivid pictures of a location’s future risks and opportunities.McKinseyWhat Kind of Analysis Can be Done with Location Data?There is a lot that can be done with location data within the commercial real estate industry. Here are just a few of the most common use cases to keep top-of-mind:1. Market analysisMarket analysis is a general best practice that every commercial real estate company should master. It’s an objective way to assess whether a specific business location, zoned to specific business types, will be positioned to succeed and, thus, command a higher lease value.Enriching POI data with other datasets, like purchasing power per capita, allows real estate companies to analyze specific markets and assess a property’s potential ROI.It’s also a great way to reduce potential risks before investing too much money. There are a number of factors that go into commercial real estate market analysis, including:Supply: How many vacancies of a specific property type exist? How many construction projects for that property type are underway or awaiting permitting approval? Demand: Is there pent up demand for a specific property type in a given area? Or is the market already saturated, meaning that potential tenants have a lot of choice? Location: Is the property in an area with high foot traffic? Does it offer easy (and plentiful) parking? Is it easily accessible from the highway or main roads? Pricing Considerations: What are the average rental rates in the area for a specific property type within a given sector? What are the average local occupancy and vacancy rates for that sector? What is the total square footage and overall lot size of the property? Are there any zoning restrictions? Does the property offer any unique or premium features? What are the additional expenses within a market? Does the property need improvement? 
ROI Potential: What kind of income was generated by that property’s tenants in the past? What is the future income-producing potential for that property today? All of these factors play a big role in determining the ultimate ‘rentability’ of a property and need to be taken into consideration by commercial real estate companies upfront when making investment decisions. Failing to do a thorough market analysis can leave you with a dud of a property that could potentially sit vacant and collect dust for months on end.We use a funnel approach for real estate investing. We start by researching markets, asset classes, and cycles...to determine what state an individual market is in and what state a particular asset class is in within a market.Data Driven Real Estate Investment (SMARTCAP)2. Site selection and portfolio managementThere are multiple ways that location data can support commercial real estate site selection and portfolio management. It comes down to the property types being considered for purchase. For example, in retail site selection, location data can help you hone in on the exact spots that will drive the greatest amount of success—in terms of foot traffic and potential revenue generation—for businesses interested in leasing those properties. In many ways, retail site selection for a commercial real estate company is all about thinking a few steps ahead: The ROI driven from that investment will be realized only when a successful tenant sets up shop. Therefore, it’s critical for commercial real estate companies to be able to paint a picture of what success will look like for tenants in order to fill vacancies, fast. When commercial real estate companies invest in office space, they are typically less concerned with foot traffic, per se, and more focused on a property’s proximity to things like cafés, restaurants, hotels, post offices, supermarkets, and other “daily essentials” that office workers would like to have easy access to. 
Taking a property’s surrounding environment into account enables commercial real estate companies to price “convenience” as a premium perk when developing leasing packages. Understanding the implications, both good and bad, of a property’s location is the best way for commercial real estate companies to hedge their bets and build a solid portfolio strategy that drives long-term revenue growth.

Advanced analytics cannot serve as a crystal ball. In most cases, it should only support investment hypotheses, not generate them. (McKinsey)

3. Investment research

According to Deloitte, “Big data can help automate due diligence, as the technical records and current conditions of building components can now be generated in real-time and reliably.” This has quickly become a competitive advantage for many commercial real estate companies when conducting investment research. Data available today allows them not only to predict potential profitability but also to measure performance in real time at a granular level.

The commercial real estate industry tends to lag behind other industries in terms of data maturity.

However, this advanced use of data is not widespread across the commercial real estate industry. The industry, in general, has been slow to get on the data bandwagon. The primary factors stopping commercial real estate companies from embracing data include:

Lack of awareness: Many companies are simply unaware of what datasets are available today and how to work with them to drive actionable insights. “There are no commonly accepted and widely adopted industry standards around data definitions and governance” within the commercial real estate industry. Solutions like Placekey are quickly changing this dynamic. 
However, “managers [still] could spend as much as 80 percent of their time on gathering or manipulating data to make it ready for analysis.” Because this work is often perceived to be inefficient, despite the value that can come from it, many companies have shied away from doing a deep dive into data.

Resource constraints: Unfortunately, many commercial real estate companies don’t have the tools, technology, or talent to source, clean, connect, and analyze the available data and draw meaningful insights from it efficiently. This has become a major hurdle for companies looking to enhance traditional or proprietary data sources with location-based datasets. Because it can take a while to distill insights into actionable outcomes, there’s a perpetual fear that the time required to make that happen could easily amount to a lost investment opportunity.

Risk aversion: Using new datasets to drive insights can oftentimes seem like a risky practice. This is especially the case for more “traditional” industries. Because many commercial real estate investors have trusted (and seen success driven by) a small set of traditional data sources for years to aid in decision-making, they are reluctant to pin any ROI-based conclusions to alternative data sources. As Deloitte puts it, “Investors and managers would have to do due diligence to verify the authenticity of the origin of the data,” to mitigate privacy risks and improve transparency around data collection. Because using data in this way is essentially a new frontier for the commercial real estate industry, it will take time for businesses to become truly data mature.

Too relationship-minded: Much of the real estate industry’s history has been driven by a dedication to both relationship-building and having a solid gut instinct. But data’s growing presence within the industry is changing this model dramatically: “For investors and managers evaluating these new business models, relying on intuition may not be enough. 
Use of data analytics would reduce subjectivity as they continue to follow the traditional decision-making approach.” In other words, a change in mindset is needed for investors to look at—and embrace—data in the same way they trust their gut. Conventional analytical methods and data sources make it challenging to draw clear hypotheses and build robust business cases.McKinsey4. Risk mitigationLayering on location data and other non-traditional data sources to so-called traditional commercial real estate data can drive increased clarity around a property’s nuances. McKinsey provides a clear way of thinking about this: “Two buildings that are seemingly identical when evaluated by traditional metrics can ultimately experience very different growth trajectories. It is easy to imagine how this disparity at the individual building level, when applied across a series of investments, can drive dramatic results at the portfolio level.”Location-based data can add depth to data sources about market performance, property features, and property performance, giving investors new and better ways to evaluate a property’s potential for long-term success. These insights provide additional, and oftentimes necessary, context to both drive sound decision-making and mitigate risk significantly.3 Ways SafeGraph Places Data Can Enrich Investment StrategiesThe SafeGraph Places dataset, updated monthly for utmost accuracy and precision, provides the in-depth POI and building footprint data you need to fuel more informed and accurate decision-making around commercial real estate investments and strategies.Here’s a quick overview of the three types of location-based data within this powerful dataset: POI data reveals where competitive and complementary brands are located, helping commercial real estate decision-makers to understand areas of opportunity and risk.1. 
Points of Interest (POIs) Places includes base information—such as location name, address, category, and brand association—for the places where people spend their time and money. It also sheds light on the relationship existing between adjacent POIs. Therefore, POI data is important because it provides a unique perspective for understanding the types of places target audiences visit throughout the day or week.When applied to the commercial real estate industry, POI data can set a foundation for mapping, inform market analysis, and offer a ‘birds-eye view’ of the environment surrounding any given place (i.e. Is that area already saturated with similar businesses or property types? Is there anything in the vicinity, like a trash dump, for example, that could devalue a property or make it less attractive to lease?). Building footprints with accurate spatial hierarchy metadata provide commercial real estate analysts with actionable details around co-tenancy risk and accessibility.2. Building FootprintsGeometry offers building footprints for POIs derived from spatial hierarchy metadata to allow for geofencing as well as a more precise and accurate understanding of attribution. For commercial real estate companies, more specifically, Geometry can provide precise building attribute details—like building heights and parking lot adjacency—to help map out and truly understand all aspects of a physical property. Check out this great overview of the ins and outs of building footprints for geospatial analysis to learn more.Understanding the origin location of foot traffic to a specific POI enables a deeper understanding of who those customers are as well as why they visit particular locations.3. Foot Traffic PatternsFoot Traffic data measures visits to and from POIs precisely and can be used to determine how often people visit certain POIs, how long they stay, where they came from, where else they go, and more. 
This helps real estate planners and analysts place smarter bets on where to invest, and it can also help them proactively identify real-time consumer trends that may influence a property’s long-term value.

Layering foot traffic data onto the SafeGraph Places dataset adds even more insight into how people move around (and, ultimately, transact) at a macro level in a given trade area. Not only is this information critical when doing initial investment research, but it can also give commercial real estate companies an edge in pricing and contract negotiations with potential tenants. Therefore, by combining all of SafeGraph’s location-based datasets with traditional commercial real estate data, urban planners, real estate investors, and market analysts can visualize catchment analysis, understand market penetration, and make smarter investment decisions.

Only a joint effort among all real estate stakeholders can optimize data to create insights that improve performance and profitability. (Deloitte)

Better Real Estate Decision-Making Starts with Location Data

Location data is revolutionizing the commercial real estate industry for the better. Not only does it add new kinds of value to traditional real estate data sources, around both property and market performance, but it also helps pinpoint the nuances that can make one property a smarter investment than another. The power of location data, therefore, has the potential to transform how commercial real estate companies make better, more profitable investments. However, we understand that using and analyzing location data may seem intimidating at first, especially if you haven’t used it in this way before. But it doesn’t have to be. With the right tools and techniques in place, location data can give commercial real estate companies a competitive edge in any local market, especially at a time when the industry is about to experience a massive boom. 
And if you’re still not quite sure where to start, our team is always here to help!

#### Determining Points of Interest Visits From Location Data: A Technical Guide To Visit Attribution

Introduction

Understanding if a device visited a place, brand, or type of store can be valuable context to have for your business. Companies we currently work with use store visit information to build custom audiences for advertising purposes, to better attribute ad campaign spend, and to send contextual push notifications in real time. Unfortunately, accurately determining if a device visited a place can be a tough engineering problem to solve. Messy GPS data, incomplete business listing information, and limited knowledge of exactly where places are located all make visit attribution a complex problem. However, building a visit attribution solution remains a worthwhile endeavor since it enables you to enrich digital data with physical-world context. Furthermore, building a visit attribution solution in-house allows you to tune the algorithm to your specific input data and specific use case, which results in a better end solution for your customers. Luckily, with SafeGraph Places, it’s become easier than ever to build your own robust visit attribution algorithm in-house. In this white paper, learn about initial approaches taken to do visit attribution as well as potential drawbacks of these approaches. Then, learn about our current state-of-the-art visit attribution algorithm powered by SafeGraph Places.

Initial Approaches and Their Drawbacks

Let’s briefly cover a few simpler approaches and their potential drawbacks before presenting our current production solution.

Closest centroid wins

When SafeGraph first started working on visit attribution, it was very difficult to come across any building footprint (polygon) data, so we relied on points of interest (POI) centroid data. 
The algorithm we started with is as follows: for a given GPS ping, find the closest POI centroid and call it a visit to that POI if the distance is below some threshold. We have very consistently found that this “closest centroid wins” approach is only remotely comparable to other algorithms when you have large standalone stores (e.g., a Walmart surrounded by a large parking lot). For a clear case where centroids aren’t good enough, think about an airport or a golf course, which don’t have an obvious centroid that you would want to use. There’s likely going to be another nearby location whose centroid is closer, especially if you are near the edges of the large POI.

Another good example is if you have large stores next to small stores as shown below. The centroids of the small stores will end up being closer just by virtue of having smaller footprints, which can bias your data. This is why any approach to accurately determining visits needs to take into account the actual building footprint (polygon), and not rely solely on a building’s centroid.

This conclusion is the reason why SafeGraph decided to build out SafeGraph Places with building footprints for millions of global POIs.

Any ping inside a polygon is a visit

SafeGraph then tried the simplest polygon approach: a visit is just a sequence of GPS pings that are all inside of one POI polygon. The biggest issue with this approach is that for most common POIs, drifting GPS signals cause you to miss a lot of visits because the GPS pings will not always actually enter the building polygon. Even worse, they can drift into a neighboring building, which will hurt your precision. Another very common problem when we took this approach was devices that would switch back and forth between two locations, creating dozens of correct and incorrect visits that we would have to clean up after the fact. 
One upside to this approach is that it works well for large places or outdoor places (airports, theme parks, etc.) because any drift or noise in the GPS data is negligible if the POI is big.

Any ping inside a custom geofence is a visit

Another common simple approach is to build custom geofences with padding around the POI that you care about, and then record any sequence of pings in that geofence as a visit. By adding padding around the building polygon in your custom geofence, you can address some of the GPS drift issues. However, you still have a lot of potential precision issues. A device stopped at a red light next to one of your geofenced locations will show up as visiting that location, for example. While you can add some filters to address the potential false positive visits, the new, harder issue that you have to deal with is ambiguous matches. If you have two stores next to each other, either their geofences will overlap or the horizontal accuracy will make it so a device could have been in either place (or worse, both, one after the other). The good news is you can use attributes such as the following to try to predict which POI was most likely visited (if any):

- Time of day
- Duration spent in the geofence
- Distance from the pings to the building polygon
- Distance from the pings to the centroid
- Characteristics of the place (category, open hours, etc.)

This task is set up well for machine learning. You can take into account features like the time of day interacting with the category of place (coffee shops are popular at different points in the day than bars are, for example), whether the location is open, and whether the duration was too short to be a visit.

Overview of SafeGraph’s Visit Attribution Approach

Ultimately, SafeGraph’s solution involves segmenting the task of visit attribution into two distinct subproblems, both of which we feel closely model the processes a human might engage in if asked to create visits from GPS data. 
At a high level, the solution can be decomposed as follows:

1. Clustering GPS points such that every point in one cluster is associated with a visit to a single place
2. Learning a model which can choose between a set of viable places

As we discuss below, our clustering process takes advantage of a modified version of the canonical DBSCAN clustering algorithm, re-tooled to more effectively deal with geospatial data that has a time component. After clustering our GPS data and spatially joining it against our polygons, we’re left with the task of choosing the most likely visit among a set of potential options. We found that this problem can be appropriately modeled by preference learning, and we developed a learning-to-rank model, similar to the technology behind products like Google Search, that accurately learns how to rank a set of nearby places by comparing the feature vectors for those places.

##### Step #1: Cleaning GPS Data

Before we can even think about doing visits, we need to do some general data cleanup. With GPS data, there are three prevalent issues:

- GPS signal drift
- Spiking horizontal accuracies
- “Jumpy” GPS pings (a ping going from point A to B faster than is possible)

The first one we will address as part of our algorithms below, but the other two are easier to address during a dedicated pre-processing step.

Spiking horizontal accuracies can happen for a lot of reasons. If a device loses GPS signal, it can fall back to either WiFi or the nearest cell tower. If WiFi is unavailable, it’s not uncommon for the horizontal accuracy to spike up above 1000m. This can be really problematic since a legitimate visit can be split into multiple components, or we can create incorrect visits in this new area. We spent some amount of time trying to make use of these high horizontal accuracy pings, but ultimately realized that the most accurate strategy is to just filter all horizontal accuracies above a tuned threshold.
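The horizontal accuracy filter can be sketched in a few lines. This is only an illustration: the `Ping` record, its field names, and the 200m cutoff are assumptions, since the text says only that the threshold was tuned.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Ping:
    ts: float                     # Unix timestamp in seconds
    lat: float
    lon: float
    horizontal_accuracy_m: float  # reported uncertainty radius in meters

# Hypothetical cutoff; the source only says the threshold was tuned.
MAX_HORIZONTAL_ACCURACY_M = 200.0

def drop_low_quality_pings(pings, max_ha_m=MAX_HORIZONTAL_ACCURACY_M):
    """Keep only pings whose horizontal accuracy is at or below the cutoff."""
    return [p for p in pings if p.horizontal_accuracy_m <= max_ha_m]

pings = [
    Ping(0, 37.7700, -122.4200, 12.0),
    Ping(60, 37.7700, -122.4200, 1500.0),  # e.g. a cell-tower fallback
    Ping(120, 37.7701, -122.4201, 35.0),
]
kept = drop_low_quality_pings(pings)
```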
There are a lot of reasons for jumpy GPS pings. The simplest explanation is that the data is fraudulent. Another possible explanation is that most phones will approximate your location based on the WiFi information around you. If someone has recently moved, and a router that was previously in a different location is now near you, your phone may think it is wherever the last location of the router was, instead of where you are. In bad cases, this can look like a device moving from state to state rapidly.

Regardless of the reason for jumpy GPS pings, the approach is still the same. For any two points that are close in time, we compute a speed between them and if the speed is too high, we filter out the pings.

Finally, there’s one more filter we apply to the data before we can start clustering: we try to remove any non-stationary data. Since our goal is to create visits, we should be able to throw away any driving data. To detect driving points, we combined two approaches:

1. You are looking for a sequence of points that are somewhat linear. To compute the linearity of a series of points $(P_0, P_1, P_2, \ldots, P_n)$, we first compute the sum of sequential distances between all $P_i$ and $P_{i+1}$: $\sum_{i=0}^{n-1} d(P_i, P_{i+1})$. We then divide this sum by the net distance traveled in the series: $d(P_0, P_n)$. A value close to 1 indicates that the sequence is linear.
2. You can also calculate the speed between any two points pretty easily and filter for ones that are “too fast”. The reason you can’t just use this approach is you need to choose your threshold carefully so that noise or a few higher HA points don’t make it seem like the pings are driving when they aren’t.

Combining these, we essentially remove any sequence of pings that are too linear over a long enough period of time, and those that appear to be traveling too quickly.
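The linearity ratio and the speed check can be sketched as follows. This is an illustrative reading of the description above, not the production filter. The haversine distance, the `(ts, lat, lon)` tuple layout, and every threshold value are assumptions.

```python
import math

EARTH_RADIUS_M = 6_371_000.0

def haversine_m(a, b):
    """Great-circle distance in meters between two (lat, lon) pairs in degrees."""
    lat1, lon1 = math.radians(a[0]), math.radians(a[1])
    lat2, lon2 = math.radians(b[0]), math.radians(b[1])
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))

def linearity_ratio(points):
    """Path length divided by net displacement; close to 1 means a straight line."""
    path = sum(haversine_m(points[i], points[i + 1]) for i in range(len(points) - 1))
    net = haversine_m(points[0], points[-1])
    return math.inf if net == 0 else path / net

def speed_mps(a, b):
    """Speed implied by two (ts, lat, lon) pings; infinite if time does not advance."""
    dt = b[0] - a[0]
    if dt <= 0:
        return math.inf
    return haversine_m(a[1:], b[1:]) / dt

def looks_like_driving(track, max_ratio=1.2, min_duration_s=30.0, max_speed_mps=40.0):
    """True if the track is near-linear for long enough, or any hop is too fast.

    All three thresholds are hypothetical placeholders that would need tuning.
    """
    points = [(lat, lon) for _, lat, lon in track]
    long_enough = track[-1][0] - track[0][0] >= min_duration_s
    too_linear = long_enough and linearity_ratio(points) <= max_ratio
    too_fast = any(speed_mps(a, b) > max_speed_mps for a, b in zip(track, track[1:]))
    return too_linear or too_fast

# A device moving steadily north versus a device jittering around one spot.
straight = [(0, 0.0, 0.0), (10, 0.001, 0.0), (20, 0.002, 0.0), (30, 0.003, 0.0)]
jitter = [(0, 0.0, 0.0), (60, 0.0001, 0.0), (120, 0.0, 0.0001),
          (180, 0.00005, 0.00005), (240, 0.00001, 0.0)]
```

Here `looks_like_driving(straight)` is true because the ratio is about 1, while the jittering track has a large ratio and low speeds, so it is kept as stationary data.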
##### Step #2: Clustering GPS Pings Together

The goal of this phase is to take all of our GPS data and try to turn it into potential visits, without using our places data. The key insight here is if you look at a series of GPS pings on a map with no places, you can generally figure out areas that a device could have visited. You can also think of this as taking all your GPS pings and turning them into potential visits. By specifically solving just this component of the problem, the rest of the pipeline becomes a lot simpler because the ML model will only need to figure out which place is most likely as opposed to figuring out both which pings are relevant for a visit and then which location was the most likely. It also greatly reduces the data scale that you need to deal with.

For our approach, we started off with DBSCAN, a density-based clustering algorithm which clusters points together based on how close they are. In DBSCAN, each ping is considered to either be part of a cluster or it is considered noise. The clusters that DBSCAN produces are essentially areas that the device was at for an extended period of time. You can control exactly what DBSCAN clusters mean by tuning the two hyper-parameters (one is the number of pings that need to be close together before it’s considered a cluster, and the other is the distance between two pings).

This works pretty well, but it’s missing one important element: time. If in the mornings you visit some location and then in the evenings you visit the location next door, we’d want those two to be part of different visits and therefore different clusters. We took inspiration from a few papers written on the topic, and ultimately modified our DBSCAN algorithm to include time. What the modified algorithm essentially says is that a sequence of pings that are within some DIST_THRESHOLD are considered a cluster as long as they never go more than MAX_DIST_THRESHOLD away from the last ping and there are at least MIN_NUM_ELEMENTS in the cluster.
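The original listing for the modified algorithm is not reproduced here, so the following is only one possible reading of the description above, written as a time-ordered sweep. It assumes pings are `(ts, x, y)` tuples already projected to planar meters. The value of `MIN_NUM_ELEMENTS` is a hypothetical placeholder, and the exact joining rule is an assumption.

```python
import math

DIST_THRESHOLD = 80.0       # meters, value quoted in the text
MAX_DIST_THRESHOLD = 100.0  # meters, value quoted in the text
MIN_NUM_ELEMENTS = 3        # hypothetical; the text gives no value

def _dist(a, b):
    """Planar distance between two (ts, x, y) pings."""
    return math.hypot(a[1] - b[1], a[2] - b[2])

def cluster_pings(pings):
    """Sweep pings in time order, growing one open cluster at a time.

    A ping joins the open cluster if it is within DIST_THRESHOLD of some ping
    already in it and within MAX_DIST_THRESHOLD of the cluster's latest ping.
    Otherwise the open cluster is closed (kept only if it is large enough)
    and the ping starts a new one. Time enters only through the ordering.
    """
    clusters, current = [], []
    for ping in sorted(pings, key=lambda p: p[0]):
        if current:
            near_any = any(_dist(ping, q) <= DIST_THRESHOLD for q in current)
            near_last = _dist(ping, current[-1]) <= MAX_DIST_THRESHOLD
            if not (near_any and near_last):
                if len(current) >= MIN_NUM_ELEMENTS:
                    clusters.append(current)
                current = []
        current.append(ping)
    if len(current) >= MIN_NUM_ELEMENTS:
        clusters.append(current)
    return clusters

# Morning stay at one spot, a trip away, then an evening stay 60m from the first.
morning = [(t, 0.0 + i, 0.0) for i, t in enumerate([0, 60, 120, 180])]
away = [(4000, 500.0, 0.0), (8000, 1000.0, 0.0)]
evening = [(t, 60.0 + i, 0.0) for i, t in enumerate([30000, 30060, 30120, 30180])]
clusters = cluster_pings(morning + away + evening)
```

A purely spatial DBSCAN with an 80m radius would merge the morning and evening stays, since they are 60m apart. The time-ordered sweep keeps them as two clusters and discards the two isolated pings as noise.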
You should tune these parameters to best fit your data and use case. We found that a DIST_THRESHOLD of 80m and a MAX_DIST_THRESHOLD of 100m worked well for us.* Refer back to step #1 on data cleanup for more information on making sure your data is clean enough to be put through the clustering process.

*Note this white paper was originally published with values of DIST_THRESHOLD = 100m and MAX_DIST_THRESHOLD = 300m, but these have been updated over time to the values above as the algorithm has undergone further refinement.

This approach generally works really well to cluster pings that are close together in both space and time. However, we found that there were two cases where we could do a little better:

1. The constraint that we have at least MIN_NUM_ELEMENTS in a cluster makes sense if you have no other information. However, if you have any additional information for any pings, you can turn those into their own clusters (or loosen the MIN_NUM_ELEMENTS constraint).
2. Large POIs like airports pose a problem because their footprints are almost always going to be bigger than MAX_DIST_THRESHOLD, so you pretty much always cut up your data.

The first problem is pretty straightforward to fix. For the second problem we ended up bringing back our “Any ping in a polygon is a visit” strategy, but just for POIs with a really large area (which we found it to be really good at).

Bringing this all together, what we ultimately do is:

1. General data cleanup from the previous step (remove high HA, jumpy pings, driving, etc.)
2. Complete a first pass over the data, creating clusters out of consecutive pings that are in large POIs
3. Do a second pass over the data, creating clusters from the remaining blocks of unused pings using our modified DBSCAN
4. Save the clusters, discard all unused pings

##### Step #3: Preparing the Clusters and their Possible Places

Now that we have clusters of potential visits, we need to add in our places information.
We are trying to go from Dataset[Cluster] to Dataset[(Cluster, List[Places])] where the list of places is all the places that the cluster could have been referencing. This step is relatively straightforward: we simply perform a geospatial join between our clusters and our polygons. You should make sure to add a buffer around the cluster to account for any horizontal accuracy uncertainty of the GPS pings.

##### Step #4: Predicting The Best Place For a Given Cluster

###### Motivation for framing problem as a ranking problem

After associating each cluster with a list of possible places, we’re left with the task of choosing between a set of viable options, some of which are more likely than others. This choice depends on a multitude of entangled features (such as the distance between cluster and polygon), lending itself well to machine learning. The goal is to develop a machine learning system which takes as input (1) a cluster and (2) a list of places and which outputs the place associated with the cluster.

You might notice, though, that the problem structure laid out above doesn’t seem to fit nicely into those structures traditionally offered by off-the-shelf classification or prediction models. In prediction models, we input into the model some set of features (for instance, a set of features describing a cluster and an individual place) and output a continuous target variable, but it’s not clear exactly what the target variable should be in this case. In classification, we classify a set of features with a label from a set that we’ve already learned; in this case, it’s not obvious what exactly the classifier is supposed to classify: a single place, one at a time? A set of places all together?

The problem structure becomes clearer, though, when we imagine how a human might perform this task: Imagine being given the diagram to the right, in which a cluster C is surrounded by three potential places -- Target, Walmart, and Bob’s Bar.
Visually, it seems unlikely that cluster C is a visit to Walmart. Importantly, we know this not merely because C is absolutely far from Walmart but because C is relatively far from Walmart, as compared to Target or Bob’s Bar. This thought experiment indicates to us that choosing the correct visit is inherently a comparative exercise between places -- that is, choosing a visit involves looking at the differences between sets of possible places. Thus, our feature set needs to somehow encapsulate these differences between places and not just raw place data itself.

After ruling out Walmart, we’re left with the following situation: Now we’re tasked with choosing between Target and Bob’s Bar. This decision would likely be well informed by category and time-of-day information (for instance, we’re more likely to choose the bar if the cluster occurred at 11 p.m.), which we’ll return to shortly.

In the meantime, the takeaway here is that choosing the right place from an arbitrarily large set of options involves whittling down the option pool by comparing options against one another. And this structure is one that can indeed be formalized in a way that’s more amenable to traditional ML techniques. In fact, this problem is essentially isomorphic to any plain ol’ ranking problem, wherein your goal is to compute a ranking (also called an ordering) of an arbitrarily large set of items. These models, of course, are ubiquitous, powering the rankings in technologies like Google Search, Yelp Restaurants, and Facebook’s People-You-May-Know.

###### Encoding ranking information

The key insight behind learning these models is to structure your feature set in such a way that it encodes ranking information. In the case of determining visits, we start by constructing a set of features for every (cluster, place) pair. If a cluster matches against 8 possible places, this leaves us with 8 rows of features. Now we must manipulate this data to encode ordering information.
To encode ordering information between any two rows, A and B, we merely subtract B’s feature set from A’s. Training the model on these difference vectors helps it learn how to determine relative orderings between places. This is a standard technique in learning-to-rank called preference learning.

We set the label associated with this difference vector to be 1 if A was the true visit in our training data (thus, A should be ordered before B), -1 if B was the true visit in our training data (thus, B should be ordered before A), and 0 otherwise (that is, if neither A nor B was the true visit in our training data). Ultimately, we filter out all 0 labels before training because we only care about finding the highest-ranked possible place and are not concerned with any orderings beyond that.

Our labels come from POI check-in data from a popular consumer app. By combining this check-in data with our GPS data, we’re able to construct clusters associated with each check-in. Because we know the true check-in, we’re aware which of the possible places was the true visit and which were non-visits. These “non-visits” form the negative examples in our training data.

###### Learning-to-rank model summary

To summarize what we’ve learned:

1. For each cluster and possible place, create a set of features (that’s one row of features for each pair)
2. Within this set, create all pairwise difference vectors -- for N pairs, that’s (N * (N - 1)) / 2 difference vectors.
3. For each pairwise difference vector A - B, set the label to 1 if A was the true visit, -1 if B was the true visit, and 0 otherwise.
4. Filter out all 0 labels. This leaves us with (N - 1) rows with a 1 label and (N - 1) rows with a -1 label, giving us 2N - 2 rows in total.
5. Learn a model on this set of features

As far as model choice is concerned, we borrowed inspiration from Microsoft Research and used a solution similar to the canonical LambdaMART architecture -- that is, we built a gradient-boosted forest.
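The difference-vector construction from the summary can be sketched as below. The feature values and function name are hypothetical. One assumption is worth flagging: the summary counts (N * (N - 1)) / 2 unordered pairs, while this sketch emits each pair in both orders (A - B and B - A), which is one way to arrive at the 2N - 2 labeled rows mentioned above.

```python
from itertools import permutations

def pairwise_training_rows(features, true_index):
    """Build (difference_vector, label) rows for one cluster.

    `features` holds one feature vector per candidate place, and `true_index`
    is the position of the checked-in place. The label is 1 when the left-hand
    place is the true visit and -1 when the right-hand place is. Pairs that do
    not involve the true visit would get label 0 and are skipped.
    """
    rows = []
    for a, b in permutations(range(len(features)), 2):
        if a == true_index:
            label = 1
        elif b == true_index:
            label = -1
        else:
            continue  # label 0: filtered out before training
        diff = [fa - fb for fa, fb in zip(features[a], features[b])]
        rows.append((diff, label))
    return rows

# Hypothetical features per place: [distance to centroid (m), distance to polygon (m)]
features = [
    [40.0, 5.0],    # Target (true visit)
    [300.0, 180.0], # Walmart
    [55.0, 20.0],   # Bob's Bar
    [90.0, 60.0],   # a fourth candidate
]
rows = pairwise_training_rows(features, true_index=0)
```

With N = 4 candidates this yields 6 rows, three labeled 1 and three labeled -1, which could then be passed to any tree learner.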
We find these models consistently strike the right balance: they are flexible enough to fit the data appropriately without learning noise, and all off-the-shelf learners have mechanisms in place (typically through hyper-parameters) to control regularization. What’s more, we weren’t deterred by forests’ lack of interpretability because we were able to utilize libraries like ELI5 to pull out rich information about our model, such as feature importances. Though we had an incredible amount of training data, we saw model performance plateau after ~50,000 training examples, which is small enough to feed into any standard tree library (though we used XGBoost).

###### Useful features for the model

We experimented with a large number of intuitive features and ultimately found the following set to be the most robust. Unsurprisingly, distance-related features had the most predictive validity.

- DistanceToPlaceCentroid: Distance between cluster centroid and place centroid
- DistanceToPlaceWkt: Distance between cluster centroid and nearest point on place WKT. We include this feature in addition to DistanceToPlaceCentroid because there are cases (say, for relatively large polygons) in which the cluster is far from the centroid but close to the edge of the WKT, so failing to include this feature makes it seem as if the cluster is far away from the place when, in reality, it’s quite close
- DistanceToPlaceCentroidRank: Rank (e.g. 1st, 2nd, 3rd) of distance between cluster centroid and place centroid. We chose to include rank-based features because we didn’t want our model to solely focus on the scale of the distance features
- DistanceToPlaceWktRank: Same as above, but for place WKT
- NAICS x Hour: We also created a large set of dummified features representing the interaction between the first 4 digits of each place’s NAICS code and the hour of day that the cluster occurred

The intuition behind this last set of features is that you’re more likely to be seen at some types of places depending on the time.
For instance, it’s likely that a cluster visited a nearby bar over a nearby Walmart if it’s 11 p.m. but more likely that a cluster visited the Walmart if it’s 11 a.m.

###### Using the output of the model to choose a visit

Because we’re creating pairwise combinations of feature vectors, we have more work to do after running our model over our features. We think of our model’s output as being similar to a round-robin tournament, in which every possible place “battles” every other possible place. Now we must combine the results from this round-robin tournament in such a way that we can choose a clear winner.

Our approach is simple: using the labels that our model predicted, we create what amounts to a “tournament scorecard” -- that is, a mapping between each possible place and the number of times it “won” when facing off against every other possible POI. The visit we choose is simply the place with the most wins.

You may be wondering why we chose to perform ~N^2 comparisons instead of ~N. Similar to an algorithm that finds the maximum number in a list, our initial approach consisted of randomly choosing a possible place among those matched by a given cluster, calling it the “current champion,” and iteratively updating our understanding of the current champion by comparing it to other possible places. Unfortunately, though parsimonious, this approach isn’t valid because preference learning doesn’t guarantee transitivity. That is, our model may believe that place A should be ordered higher than place B, place B should be ordered higher than place C, and place C should be ordered higher than place A, thereby creating a cycle. Because our model isn’t guaranteed to be transitive, performing ~N comparisons instead of ~N^2 isn’t robust. Thus, we recommend using a scorecard approach.
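A scorecard of this kind can be sketched as follows. The `prefers` callback stands in for the trained model and is an assumption, as are the place names and the tie-breaking rule (the earliest place in the list wins a tie).

```python
from itertools import combinations

def choose_visit(places, prefers):
    """Round-robin scorecard over candidate places.

    `prefers(a, b)` stands in for the model: it returns 1 if `a` should be
    ordered above `b` and -1 otherwise. Every pair is compared once, each
    win is tallied, and the place with the most wins is returned along with
    the scorecard. Ties go to the place that appears first in `places`.
    """
    wins = {place: 0 for place in places}
    for a, b in combinations(places, 2):
        if prefers(a, b) == 1:
            wins[a] += 1
        else:
            wins[b] += 1
    winner = max(places, key=lambda place: wins[place])
    return winner, wins

# Hypothetical model output containing a cycle: A > B, B > C, C > A.
beats = {
    ("A", "B"), ("C", "A"), ("A", "D"), ("A", "E"), ("B", "C"),
    ("D", "B"), ("E", "B"), ("C", "D"), ("E", "C"), ("D", "E"),
}

def prefers(a, b):
    return 1 if (a, b) in beats else -1

winner, wins = choose_visit(["A", "B", "C", "D", "E"], prefers)
```

In this example A wins three face-offs and is chosen even though it loses to C. A single-pass "current champion" search could return a different place depending on the order in which candidates are visited.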
Below is the final set of steps we use in the production ML pipeline to choose a visit:

1. For each cluster, create a row of features for each possible place using the feature set outlined above
2. Create all pairwise difference vectors between this set of raw features
3. Run each difference vector through the model, which will output 1 if it thinks that the left-hand feature vector should be ordered higher than the right-hand vector and -1 if the right-hand should be ordered before the left-hand.
4. Using these labels, create a scorecard that associates each possible place with the number of times it “won” a face-off
5. Choose the final visit to be the one that won the most times

##### Benchmarking performance

Ultimately, we care about our ability to choose the correct visit among a set of potential options, so we benchmarked our performance on this task relative to a set of classifiers that are less sophisticated. In particular, we compared ourselves to classifiers that:

1. Chose a visit at random from the pool of viable options
2. Chose the polygon that was closest to the cluster center

Of course, performance for these strategies varies largely as a function of the number of possible places that were matched to each cluster in the first place (in other words, choosing the correct visit among a set of 2 options is significantly easier than for a set of 10).

From benchmarking, we found our model significantly outperforms both naive classifiers, though the distance-based classifier performs admirably as well.

#### Foot Traffic Data Providers

Foot traffic data is extremely valuable for understanding customer engagement and managing your business locations. This information can inform which stores to close, and where to open new store locations. Demographics data also gives you much needed insight into how your stores perform in different areas and when serving different customers.
- 8 actionable ways to get foot traffic data
- 7 examples of foot traffic data providers with different models

To help you learn how to make the most of mobility patterns data, we’ll tell you the top eight methods of collecting foot traffic data, and then cover the best foot traffic data providers available.

##### 8 actionable ways to get foot traffic data

Analyzing customer mobility patterns is essential for perfecting your retail store design, understanding customer engagement, digging deeper into customer demographics, perfecting your marketing efforts, and determining the best locations to set up new stores.

One of the biggest struggles with footfall data is navigating privacy laws while still gaining access to enough information to produce meaningful insights. Anonymized data can give you demographics data such as age, income, voting patterns, and more, without compromising individual identities. Non-anonymized data connects information directly to individual people, and can contain personal information that makes it unusable by law or by ethics.

To help you leverage mobility data, we’ll explain how to get foot traffic data.

###### 1. Anonymous mobile device tracking data

Anonymized mobile device tracking data offers insights about users with incredible accuracy while avoiding personally identifiable information. Many vendors can offer aggregated data that works better than physical foot traffic solutions such as video cameras and break beams, which can only track foot traffic at exact locations or within specific areas.

Mobility tracking data offers insights about users’ complete mobility patterns, offering not only movement into, out of, and within your retail stores, but all user movement that they’ve opted in to. This gives you the ability to draw even deeper and more meaningful insights from your analytics, and understand customer behavior with greater accuracy and detail. With anonymized tracking, customer data is safe, while offering you valuable aggregate data to leverage.
###### 2. Video camera systems

Video cameras can be used to gather foot traffic data. While this can be done in real-time, it would require manual monitoring, and is therefore more commonly used to collect foot traffic data after the fact. There are a number of shortcomings with counting footfall using video cameras:

- No means of getting demographics data on the foot traffic being monitored, as you can’t link people to their mobility patterns
- Real-time data is difficult to collect and use in a meaningful way
- These systems are expensive to install and maintain, often requiring significant upkeep and repair costs
- Video solutions are time-consuming to install and manage, and are not the fastest method to adopt
- Only valuable for monitoring physical locations (which can be good for retail stores)

###### 3. Artificial Intelligence (AI) and facial recognition technology

Artificial intelligence (AI) and facial recognition technology enable video camera surveillance to be leveraged even further. AI will enable automated analysis of video camera feeds, tracking mobility without requiring manual monitoring. Facial recognition, in certain applications, will enable mobility tracking and deeper demographics data when users opt in to certain use cases.

There are a couple of setbacks with the current state of this technology. There are legal restrictions around facial recognition solutions and their potential use cases, and navigating them requires significant knowledge before you can leverage these solutions.

###### 4. Break beams and infrared sensors

Break beams, or infrared sensors, create an unbroken infrared beam at areas where you’d like to track mobility. These are most commonly employed at entrances, exits, and key spots within a location. These systems count every time the beam is broken, indicating that someone has passed this ‘barrier’. This allows you to track mobility to and from (with entrances and exits) and within your retail locations.
Break beams are best suited for tracking foot traffic to, from, and within a physical location, as they are typically installed in doorways. They have to be set up in a specific place, and can only monitor mobility at the locations where these infrared beams are installed. Similar to video camera systems, these sensors do not provide a way to append demographic data for deeper insight into who crossed the boundary.

###### 5. Thermal sensors

Thermal sensors use heat to detect foot traffic. These solutions can be placed with more flexibility than break beams, as they are simply tracking a thermal signature, and do not need to be set up to facilitate a beam. These solutions use very little battery power and require little maintenance.

By nature, thermal sensors protect the identity of the people they gather foot traffic for, as individual data cannot be gathered through thermal tracking. However, this also means you can’t gain demographic information from your data, and will have more limitations on the insights that you can draw.

###### 6. Pressure mats

Pressure mats serve the same function as thermal sensors and break beams, but instead use weight sensors installed in the floor. The premise is the same: whenever a customer steps on the pressure mat, that movement is tracked, allowing you to gain foot traffic data at key physical locations. This can be used at entranceways, and to monitor in-store movement, but only provides counts for those who engage with the mats and no enriching detail about who those people are.

###### 7. WiFi

Businesses can use their internal WiFi to gather foot traffic by allowing guests to connect. Ultimately, any time a user connects, your network will track this. This can be used to track recurring visits if the visitor sets their mobile device to automatically connect to your network. While this can help you count visits to retail locations, it lacks precision.
It is only able to track those that actually connect to the WiFi network, and therefore does not accurately represent the entire scope of foot traffic at that location. Similarly, if a visitor has their device set to automatically connect to the network of a local coffee shop that they regularly walk past but do not enter, they may connect to the network without actually visiting the location. One advantage of this is that depending on what the mobile user has opted in to, you may be able to derive where they have come from or what other networks they have joined.

###### 8. Manual tally or clicker counter

You can still count foot traffic the old fashioned way: with a pen-and-paper tally, a clicker counter, or even a stone and chisel. This method is extremely outdated and virtually no business (besides your nephew or niece’s lemonade stand) will still be using it. It requires a significant amount of manual input, is prone to human error, and doesn’t allow for meaningful analysis in real-time.

##### 7 examples of foot traffic data providers with different models

Purchasing foot traffic data can be difficult to navigate, as footfall data providers collect and sell data in a variety of ways. Below, we cover different foot traffic data solutions, emphasizing the different methods they use to help you find the best fit for your data needs.

###### 1. Veraset

- Free trial: No
- Methods available: Anonymized mobile device tracking data

Veraset has two main product offerings. Veraset Movement provides anonymized GPS signals that have already been cleansed and validated. Veraset Visits merges raw GPS signal data with places data to analyze which devices visited various POIs, at which times, and for how long.

###### 2. Gravy Analytics

- Free trial: No
- Methods available: Anonymized mobile device tracking data

Gravy Analytics offers data with your privacy in mind. Mobile location data is cleansed, so you don’t have to worry about pre-processing data.
Gain access to verified visits, consumer personas, and market trend data for leading companies.

###### 3. Blix

- Free trial: No
- Methods available: WiFi tracking

Blix is a WiFi analytics solution provider, offering data insights using their CountSmart technology. This solution monitors foot traffic via WiFi networks, tracking those that connect to the network.

###### 4. aspectum

- Free trial: 14-day free trial
- Methods available: Anonymized mobile device tracking data

Aspectum is a geodata and analytics visualization platform that enables data visualization of various datasets and access to data. Their solution lets you gain access to custom datasets and add your own data for analysis.

###### 5. dôr

- Free trial: No (30 day money-back guarantee)
- Methods available: Thermal sensors

dôr specializes in thermal sensor technology for tracking foot traffic. These solutions use little battery power and are easier to set up than break beams, offering similar tracking functionality.

###### 6. Bluefox

- Free trial: No
- Methods available: Video camera systems | Break beams and infrared sensors | WiFi network tracking

Bluefox is an out-of-home advertising solution that offers multiple ways of monitoring foot traffic, including video camera surveillance, break beams and other infrared sensors, and WiFi network tracking systems.

###### 7. GroundTruth

- Free trial: No
- Methods available: Anonymized mobile device tracking data | Point of interest (POI) data

GroundTruth has high-quality data that has been verified by an independent audit. Their flagship product, Blueprints, is a mapping technology designed specifically for contextualizing physical locations.

Alternatively, anonymous mobile device tracking solutions require no upkeep, instead leveraging data from mobile devices.
On top of saving on upkeep and setup costs, these solutions offer deeper insights about the users, such as age, demographics, income, and more. Since this data is still anonymized, you can use it for many applications.

#### Foot Traffic Data: Calculations, Accurate POIs, & Where to Get It

Foot traffic data, also known as mobility data, can reveal consumer trends that are fundamental to strategic decisions in a variety of industries.

For example, measuring foot traffic to a grocery store can indicate when the store is most crowded and should be staffed accordingly. Similarly, comparing the mobility data of two competitor coffee shops can show which will be a stronger investment for a private equity firm.

There are many uses for and types of foot traffic data. This blog outlines the following to help you better understand what mobility data is and how best to use it:

- What is foot traffic data?
- How is foot traffic data collected?
- How to calculate foot traffic for a physical location
- How to get foot traffic data for competitor stores
- Why is accurate points of interest (POI) data needed for proper foot traffic analysis?
- Top use cases for foot traffic data

Before we get into specifics of how you can use foot traffic data for yourself, we’ll cover what foot traffic data is and how it’s collected.

##### What is foot traffic data?

Foot traffic data associates people’s movements with physical places. There are a few different types of foot traffic datasets, each collected differently and with their own pros and cons. Mobility data can be collected via mobile devices, WiFi connections, sensors, and even manually. Some foot traffic data providers aggregate and anonymize the data to purely provide context around the volume and patterns of visits to specific places, while others deliver more personalized information.

One of the biggest struggles with footfall data is navigating privacy laws while still gaining access to enough information to produce meaningful insights.
Anonymized data can be joined to demographics data such as age, income, voting patterns, and more, without compromising individual identities. Non-anonymized data connects information directly to individual people, and can contain personal information that makes its usage highly regulated.

Regardless of the type of foot traffic data you choose, incorporating it into your analytics will give you deeper insight into who is going where, and when. Analyzing customer mobility patterns is essential for perfecting your retail store design, understanding customer engagement, digging deeper into customer demographics, optimizing your marketing efforts, and determining the best locations to set up new stores.

More generally, foot traffic datasets include metrics which answer questions such as:

- How many people visit this place on a daily or monthly basis?
- How long do visitors stay at this place (dwell time)?
- What times of day do people visit this place?
- How many people walk past the establishment vs. into the establishment?

How accurately you can answer these questions will depend on what type of foot traffic data you use, which depends on how that data was initially collected.

##### How is foot traffic data collected?

Foot traffic data can be collected many different ways, ranging from leveraging artificial intelligence (AI) to manually using a clicker counter every time someone enters a place. The proliferation of mobile devices has made foot traffic data collection easy and efficient, enabling data providers to deliver mobile ping locations as well as context around those visits to analysts across industries.

Foot traffic is most accurately collected from mobile devices. Anonymized and aggregated foot traffic data offers insights about where mobile device users travel with incredible accuracy while avoiding personally identifiable information.
Mobility tracking data offers insights about users’ complete mobility patterns, offering not only movement into, out of, and within your retail stores, but all user movement that they’ve opted in to. This gives you the ability to draw even deeper and more meaningful insights from your analytics, and understand customer behavior with greater accuracy and detail. With anonymized tracking, customer data is safe, while offering you valuable aggregate data to leverage.

Because privacy is a key issue in the use of mobility data, foot traffic datasets made from AI and facial recognition technology are less popular. There are legal restrictions around using facial recognition solutions and their potential use cases, so leveraging these solutions requires significant knowledge of those restrictions and care in navigating them.

Some organizations leverage foot traffic data collected from WiFi signals. Businesses can use their internal WiFi to gather foot traffic by allowing guests to connect. Ultimately, any time a user connects, your network will track this. This collection method is less accurate than deriving foot traffic from mobile devices because some people may not connect to WiFi (resulting in under-counting) or may automatically connect to your WiFi when passing by but not entering (resulting in over-counting).

Other common methods of collecting foot traffic data involve hardware such as sensors, pressure mats, video cameras, and clicker counters. These require more manual effort for the collector and also are generally less accurate than mobility data.

To help you learn how to make the most of mobility patterns data, we built a list of the top eight methods of collecting foot traffic data.

##### How to calculate foot traffic for a physical location

Understanding if a device visited a place, brand, or type of store can be valuable context to have for your business.
Companies use store visit information to build custom audiences for advertising purposes, to better attribute ad campaign spend, and to send contextual push notifications in real time. Unfortunately, accurately determining if a device visited a place can be a tough engineering problem to solve.

Dealing with messy GPS data, incomplete business listing information, and limitations in knowing exactly where places are located make visit attribution a complex problem. However, building a visit attribution solution remains a worthwhile endeavor since it enables you to enrich digital data with physical-world context.

Store visit attribution uses GPS location data from mobile phones with POI data to determine if a device visited a place, brand, or type of store. There are two main methods for attributing store visits, but the most accurate way is using precise POI polygons as geofences to truly see which mobile devices passed through a threshold.

The other popular method for store visit attribution is using a centroid radius as the geofence. While this can be easily done with any data point and basic geoprocessing tools, it often contributes to incorrectly attributed visits because a centroid radius is less precise than a building footprint polygon. As a result, GPS pings can be under- or over-counted using this method of visit attribution.

For a technical breakdown of visit attribution and help deciding how to measure footfall, read our guide.

##### How to get foot traffic data for competitor stores

Analyzing foot traffic for competitor stores is just as important as (if not more important than) it is for your own locations. But many methods of attributing store visits require sensors physically at the location, or access to secure information, like the store’s WiFi records. Anonymized and aggregated foot traffic data can be procured for any location, regardless of whether it is your own or not.
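
To make the difference between the two visit attribution methods described above more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not SafeGraph's implementation: the footprint coordinates, the 50-meter radius, the sample pings, and all function names are hypothetical, and the polygon test treats longitude/latitude as flat coordinates, which is only a reasonable approximation at building scale.

```python
import math

def point_in_polygon(lon, lat, polygon):
    """Ray-casting test. `polygon` is a list of (lon, lat) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the ping's latitude can be crossed
        # by a horizontal ray cast eastward from the ping.
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside

def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two lon/lat points."""
    earth_radius_m = 6371000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * earth_radius_m * math.asin(math.sqrt(a))

def within_radius(lon, lat, centroid, radius_m):
    """Centroid-radius geofence: is the ping within radius_m of the centroid?"""
    return haversine_m(lon, lat, centroid[0], centroid[1]) <= radius_m

# Hypothetical building footprint (roughly 40 m x 44 m) and its centroid.
FOOTPRINT = [
    (-77.6526, 43.0566),
    (-77.6521, 43.0566),
    (-77.6521, 43.0570),
    (-77.6526, 43.0570),
]
CENTROID = (-77.65235, 43.0568)

# Hypothetical anonymized pings as (lon, lat).
PINGS = [
    (-77.6523, 43.0568),  # inside the footprint
    (-77.6519, 43.0568),  # just east of the building, e.g. on the sidewalk
    (-77.6510, 43.0568),  # farther down the street
]

polygon_visits = sum(point_in_polygon(lon, lat, FOOTPRINT) for lon, lat in PINGS)
radius_visits = sum(within_radius(lon, lat, CENTROID, 50.0) for lon, lat in PINGS)

print(polygon_visits, radius_visits)  # prints "1 2"
```

In this made-up example, the centroid radius counts a ping that never crossed the footprint, which is the over-counting problem described above. Either check can be run against any location with a known footprint or centroid, your own or a competitor's, as long as anonymized and aggregated pings are available for it.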
This is particularly useful for businesses looking to understand competitors and complementary locations.

##### Why is accurate points of interest (POI) data needed for proper foot traffic analysis?

Context - With latitude and longitude points that represent where people are going, and how long they spend there, you can begin to see trends and patterns. But those trends and patterns don’t mean anything without context around what is happening at a specific location. For example, seeing a cluster of GPS pings at (43.0568076,-77.6523542) isn’t necessarily useful, but understanding that they are at a McDonald’s off the New York State Thruway can indicate this is a popular stop for people on road trips.

Precision and Accuracy - While context is key to truly analyzing GPS pings, that context needs to be accurate. If the POI data used is stale or incorrect, you could be misinterpreting GPS pings. If the coordinates above are incorrectly labeled as a gas station, you may think there is an untapped opportunity to place a fast food restaurant, when in reality McDonald’s is already there.

Detailed Metadata - Spatial analysis, particularly with foot traffic data, is so advanced today that brand name alone is not enough for truly understanding what is happening at a location. Brand relationships, such as parent/child brands, are necessary attributes in any POI dataset so analysts can fully measure brand affinities and footprints. Similarly, spatial hierarchy metadata, like the sets provided in SafeGraph geometry data, provides crucial information about stores located within the same structures. This can make the difference between an incorrectly attributed GPS ping and an accurate representation of visits to that store.

##### 3 top use cases for foot traffic data

Any organization that deals with consumers in some way can benefit from analyzing foot traffic data. Here are the top 3 use cases for mobility data:

1. Site selection

Choosing where to open (or close) a location is an important operation for retailers, healthcare providers, and government agencies alike. With mobility data, any organization can analyze how people move throughout the day and how that can impact the success and accessibility of a new (or existing) location. To learn more check out our retail site selection guide.

2. Trade area analysis

To create a solid business strategy, organizations require details about who their target customers are, where they live and go, and how they can best be reached. Foot traffic data provides these insights, resulting in the most accurate method of trade area analysis that is grounded in actual human activity rather than predictions based on proximity alone.

3. Investment research

Whether private equity firms are researching their next investment, performing due diligence, or managing their portfolio, they need reliable, up-to-date indicators of current and future business performance. Mobility data is often produced more frequently than other common inputs, such as datasets from the federal government, enabling private equity firms to update their research models more often for timelier results.

Foot traffic data is becoming increasingly critical to business operations across industries and use cases.

#### Geospatial Data Analytics: What It Is, Benefits, and Top Use Cases

Key Takeaways

- GIS data combines geographic coordinates with descriptive attributes to model the real world digitally.
- Reliable GIS analysis depends on accuracy, freshness, consistent schemas, and broad coverage.
- Many organizations struggle with outdated or incomplete GIS datasets that require heavy cleaning.
- SafeGraph focuses exclusively on high-quality places data designed for integration with GIS tools.
- Monthly updates, human validation, and Placekey support make SafeGraph well suited for large-scale GIS workflows.

In the previous chapter of this guide, we went over some uses for the different types of geospatial data out there, like polygons and points of interest. But geospatial data in and of itself isn’t that useful unless you know how to read it properly using the right analytical context.

So what is geospatial data analysis, and why are many organizations incorporating it into their analytics and other operations alongside modern geospatial technology? We’ll answer these questions and more as we look at the following:

- What is geospatial data analysis?
- 4 benefits of using geospatial data in analytics
- Top 5 ways geospatial data analysis is used effectively
- The changing geospatial data analytics market & broader geospatial industry

Let’s start with the basics by explaining what geospatial data analysis is.

##### What Is Geospatial Data Analysis?

Geospatial data analysis involves collecting, combining, and visualizing various types of geospatial data using analytical tools and geospatial technologies. It is used to model and represent how people, objects, and phenomena interact within space, as well as to make predictions based on trends in the relationships between places.

Put another way, geospatial data analytics puts data in a more accessible format by introducing elements of space and time through spatial computation. Information that would be difficult to get out of reading line after line in a table or spreadsheet becomes much easier to understand in the context of a visual representation of what the world really looks like.
This allows people to more easily pick up on patterns such as distance, proximity, density of a variable, changes over time, and other relationships that are inherently geographic.

In short, geospatial data analysis is about going beyond determining what happens to understanding not only where and when it happens, but also why it happens at a specific place and/or time by applying geospatial solutions.

##### What Are Geospatial Technologies?

Geospatial technologies are the systems and tools that make geospatial data analysis possible. They support the collection, processing, and interpretation of location-based data and help translate raw spatial inputs into meaningful insights.

These technologies comprise several key components, including:

- Geographic Information Systems (GIS)
- Global Positioning Systems (GPS)
- Remote sensing technologies
- Cartography and spatial analysis tools

These tools enable the mapping and analysis of Earth's features, events, and phenomena by combining location data (coordinates) with attribute data (characteristics of the object or event) and temporal data (time-related information).

##### 4 Benefits of Using Geospatial Data in Analytics

Geospatial big data analytics makes trends regarding space and time more visually obvious than they would be in a massive set of raw data without spatial context. This, in turn, offers many advantages over analyzing datasets without this type of context. To illustrate, here are 4 benefits of using geospatial data in analytics:

- Identifying spatial patterns and trends – Some relationships and connections cannot be understood without factoring in “where” (or “when”) they are occurring within a geographic framework.
- More opportunities for segmentation – When location is added as a component of an analysis, you can begin to segment and filter based on geography, which makes your entire analysis more detailed and actionable.
- Modeling the real world – Everything has a geographical position, so analysis without location is already missing a key component. Geospatial data enables you to model the real world, often in real time using advanced geospatial technology.
- Accurate predictions lead to better decision-making – When you study a phenomenon over time in the context of a particular location, you begin to better understand why it happens where and when it does. This helps you better predict not only what will happen, but also when and where it will happen across different environments. Then you can plan out how you might react to (or even influence) future events.

##### Top 5 Ways Geospatial Data Analysis Is Used Effectively

It shouldn’t be a surprise that geospatial data is increasingly being integrated into several different industries and corporate functions as part of broader geospatial solutions. After all, it provides a lot of extra information and context that most other types of data don’t. Here are just a few business practices that are now leveraging geospatial data analysis methods.

- Visit Attribution – Combine property and mobility data to determine how many people entered your store, as opposed to simply walking past it within a defined geographic boundary.
- Investment Research – Analyze consumer behavior and movement patterns for hints at which businesses are worth investing in based on location-driven performance signals.
- Competitive Intelligence – Find out which nearby businesses and points of interest are hurting (or even helping) your stores, based on their locations and spatial proximity.
- Risk Assessment – Knowing a building’s size, shape, location, purpose, and occupancy helps insurers estimate how vulnerable it or its tenants are to an accident or environmental risks.
- Consumer Insights – Observe patterns in what other stores your customers visit and what brands they buy to strategically plan your business’s locations and inventory with geographic precision.

You can learn more about these (and other) uses for geospatial data in this guide’s chapter on geospatial technology examples and location-based analytics use cases.

##### The Changing Geospatial Data Analytics Market & Industry

The increasing number of use cases for geospatial data is steadily growing the geospatial data analytics market and the wider geospatial industry. Some market analysts estimate that the geospatial data industry will nearly double in size between 2021 and 2026.

The types of fields – both commercial and non-commercial – that geospatial data is being used in are diversifying as well. We already touched briefly on how the retail, private equity, and insurance industries are utilizing geospatial data. But utilities providers can also make use of it to predict where and when service disruptions may occur and thus optimize when and where they should perform maintenance using spatial forecasting. And governments can use it to formulate better emergency response and public information protocols in the event of a natural disaster or other crisis through applied geospatial solutions.

All of this means that geospatial data analysis companies will be more in demand than ever. Another prediction is that, as the fields of machine learning and geospatial data analysis intertwine, we will see the emergence of self-piloting vehicles and maybe even high-definition custom maps on demand enabled by advanced geospatial technology.

Speaking of maps, they are the primary medium for visualizing geospatial data so it can be analyzed. But there are many different types of maps, and which type you use to display your data can sometimes have a big impact on what you get out of analyzing it.
We’ll explain more in our next chapter on methods of visualizing geospatial data.

If you're ready to learn more, check out the next chapter, "12 Methods for Visualizing Geospatial Data on a Map".

Learn more about use cases in our previous chapter, “Top 10 Uses of Geospatial Data + Where to Get It”.

FAQs

1. What is geospatial data analysis? Geospatial data analysis involves collecting, combining, and visualizing data with geographic attributes to understand how people, objects, and events interact across space and time.

2. How does geospatial data analysis make data easier to interpret? By adding spatial and temporal context, it allows patterns such as proximity, density, movement, and change over time to be understood more clearly through visualization.

3. What are the main benefits of using geospatial data in analytics? Key benefits include identifying spatial patterns, improving segmentation, modeling real-world conditions, and supporting more accurate predictions.

4. How is geospatial data analysis used in business decision-making? Businesses use it for applications such as visit attribution, investment research, competitive intelligence, risk assessment, and consumer behavior analysis.

5. What types of industries commonly use geospatial data analytics? Industries such as retail, insurance, investment research, utilities, and government agencies regularly apply geospatial analytics in their operations.

6. How does geospatial data analytics support forecasting and predictions? Analyzing events across specific locations and time periods helps explain why outcomes occur where they do, improving predictive accuracy.

7. What role do maps play in geospatial data analysis? Maps are the primary medium for visualizing geospatial data, though different map types can influence how patterns and relationships are interpreted.

8. How is the geospatial data analytics industry changing?
The industry is expanding as use cases increase across sectors, with growing integration of machine learning and advanced spatial analysis techniques.

#### Geospatial Data Management Best Practices: 5 Steps to a Winning Strategy

Our last topic in this guide will deal with the burning question that most readers likely have on their minds: “How do I make geospatial data work for my organization?” Well, as you might expect, there isn’t a one-size-fits-all answer. Every company is different in its needs, processes, and goals.

With that said, leading geospatial data firms such as SafeGraph and Esri have suggested some geospatial data management best practices that your organization can use as a framework.
We’ve sorted them based on a five-step process developed by Esri to build a winning geospatial data strategy from the ground up:

- Phase 1: Define what your organization needs
- Phase 2: Ensure that your organization’s needs will be met
- Phase 3: Build your geospatial data strategy
- Phase 4: Optimize your geospatial data strategy
- Phase 5: Adapt your geospatial data strategy for the future

As we go through each category, feel free to choose and tweak the practices you want to use in order to create a geospatial data strategy that’s right for your organization.

##### 15 geospatial data management best practices for a successful business strategy

In some ways, the principles of geospatial data management aren’t all that different from those of managing other types of data. For example, you probably won’t need much hardware and software other than the industry standards for data science.

However, as we talked about with the challenges of using geospatial data, it has its own quirks in that it’s inherently related to physical locations (and occasionally times, as well). So there are some extra things it can tell you, but getting the most out of it may take some more specialized knowledge and management practices. Here are 15 recommendations to get you started.

##### Phase 1: Define what your organization needs

1. Remember that geospatial data can do more than you think

Geospatial data is about more than just static locations. It also offers insights into relationships between points of interest, products, brands, and people. With this information, you can ask questions such as: how likely are people from a certain census block group to visit particular businesses? Based on that likelihood, how likely are they to shop for products of one brand over another? Which businesses in an area are likely competing with each other? Which businesses may be complementing each other by allowing people to quickly and easily complete a series of tasks in sequence?
These are questions that geospatial data can answer, whereas other types of data may not be able to.

2. Ask stakeholders what they want

Although geospatial data can do a lot of things, your time will be better spent if you know specifically what your stakeholders want. Sit down with them and have a non-technical conversation about the types of insights they’re looking for. This will inform the types of geospatial data you will need to collect and work with.

3. Focus on the big picture

Geospatial data is a powerful tool, so don’t let it go to waste on side projects. Map out the most pressing issues and challenges your organization is facing, and then think about how you could use geospatial data to solve them. This will help to get higher-ups on board, which will make things easier for your best practices later.

##### Phase 2: Ensure your organization’s needs will be met

4. Turn your organization’s needs into geospatial needs

Take the things stakeholders say they want and map them to common use cases for geospatial data. Are you trying to visualize or map something? Monitor and analyze activity at a particular location? Plan or design buildings or other infrastructure? Support investment decisions? Better engage your customers or constituents? Once you translate your organization’s needs into geospatial data use cases, it becomes clearer how to apply geospatial data to meeting them.

5. Define how geospatial data powers your organization’s overall objectives

Once you’ve matched stakeholders' needs to geospatial data use cases, go a step further. Based on the patterns you identify, go back to stakeholders and discuss with them how the concept of geospatial data management fits into the organization’s overall philosophy. What does your organization ultimately hope to achieve? What beliefs and behaviors do you adhere to in pursuit of that goal? And what milestones will serve to mark progress towards the greater objective?
Using your most likely uses of geospatial data as a basis, develop a broader vision of how geospatial data will work in service of your organization’s mission, principles, and desired accomplishments.

##### Phase 3: Build your geospatial data strategy

6. Get the right parts for your technology stack

You can actually use standard data analysis tools to work with geospatial data. However, you’ll likely only be able to accommodate small-scale data production that will be difficult to expand as your organization’s operations grow. Instead, a cloud-based data platform offers a much faster, more reliable, more convenient, and more scalable alternative. Beyond that, you should also have a data lake, a data storage solution, a processing compute platform, a task scheduler, and a tool that simplifies writing data processing pipelines.

7. Bring in a dedicated team to manage your geospatial data infrastructure

A geospatial data analysis IT infrastructure can be rather resource-intensive. That’s why it’s a good idea to devote a separate IT department to maintaining it. This helps to avoid overtaxing your main IT department, which may leave your other departments fighting over limited IT resources.

8. Lay down some ground rules

You’ll also need to develop geospatial data standards, guidelines, and policies. Use the vision for geospatial data’s place in your organization as a guide. From there, build rulesets that can be backed up by industry-standard procedures, baselines for minimum compliance, and proven methods and practices.

Ideally, your rules should cover the following:

- How your data will be secured
- What your data is allowed to be integrated with, and how this should be done
- How to protect the integrity of your data
- Who has access to your data
- Who owns your data

For a smaller organization, you may only want to focus on problem areas where having documented rules will improve productivity.
Larger organizations may need more comprehensive documentation to ensure the compliance of all stakeholders.

##### Phase 4: Optimize your geospatial data strategy

9. If you’re struggling to balance your data infrastructure’s cost with versatility, split it

One issue you may encounter with regard to your geospatial data infrastructure is how to balance its cost efficiency with its flexibility. Preprocessing data increases your system’s cost efficiency, but reduces its ability to help answer unique queries on demand. You may want to split your infrastructure and have each part focus on one goal or the other.

10. Create a central coordination committee for your organization’s geospatial data

It’s recommended that you set up a committee to build and manage a complete catalogue of all geospatial data your organization has on file. This makes it easier to communicate to stakeholders and clients what you have (or don’t). It’s also important to have so that employees in different departments can ask about the availability of data outside the kinds they normally use. This simplifies things by allowing teams to internally connect their datasets instead of needing to search for the data themselves somewhere else.

11. Have employees be responsible for each dataset

You should also assign one or two employees to act as managers for each individual dataset you have. Having experts on specific datasets will, again, make communicating their capabilities to stakeholders and others within the organization easier.

12. Get your dataset experts to work closely with your analytics team

It’s ideal to have your geospatial dataset managers in close contact with your corporate analysts, and playing under the same (or similar) rules. Doing so will help you acquire and manage the assets, equipment, and insights your teams need more quickly and efficiently.
This also helps to improve your organization’s data quality – leading to better analyses and decisions – by having extra people reporting on it to catch any errors.

##### Phase 5: Adapt your geospatial data strategy for the future

13. Add metadata to make managing your data easier

Once you have an idea of the most common ways geospatial data is used within your organization, you can start adding the appropriate metadata to your datasets. Classify them based on traits such as which departments typically use them, which format(s) their data is in, how often they’re updated, when they were last updated, and which geographical areas they cover. This will help to streamline management of your datasets by allowing you to organize and sort them based on the most often-used and/or up-to-date data.

14. Treat geospatial data as an organization-wide asset

More often than not, geospatial data will have some use in more than just a select few of your organization’s operations. That’s why it’s smart to form a geospatial data technical guidance committee, made up of employees from all of your different departments. This ensures that a few people in every department know how to analyze and interpret geospatial data, and discuss its use with members of other departments.

This makes sure your departments are all on the same page with regard to how each of them is using geospatial data. Employees will also know exactly who in their department to go to if they need help with a geospatial data problem. This helps to cut down on delays from one department needing to ask another for help. This also avoids multiple people in the same department needing to ask for assistance with the same problem from other departments, which can result in duplicated work.

15. Track, review, and revise how geospatial data is serving your organization

Develop a set of performance indicators based on how your organization’s geospatial data management plan relates to its short-term and long-term goals.
Be sure to review them at least once or twice per year to make sure everyone is sticking to your ground rules, and that your organization's use of geospatial data is driving you towards where you want to go. Look not only at how much progress has been made, but also at whether that progress has been in the areas most important to you.

Of course, an organization’s values and priorities can change over time. So don’t be afraid to adjust your geospatial data strategy’s success metrics if you conclude that it’s warranted.

That wraps up our guide to geospatial data: where to get it, how you can use it, and how to get the most out of it. Of course, as to where to find geospatial data, you don’t have to look far to get started. Visit SafeGraph to download some sample data and see how geospatial data can inform a winning strategy for your business. If you’re on the integration path and have questions about the process, make sure to check out “Geospatial Data Integration — Importance + Top 5 Challenges”.

#### Geospatial Data Providers: Services, Companies, and Products

Big data has become a big industry as all manner of organizations look for more accurate and timely ways to make operational decisions.

As part of this process, organizations are discovering the power of geospatial data to help explain why things happen in certain places and not others. But where do they get geospatial data from? And what types of data do geospatial technologies provide?

We’ll answer these questions in the following sections:

- What are geospatial data providers?
- 29 best geospatial data providers for deeper analysis

Let’s start by going into a bit more detail about what kinds of companies provide geospatial data.

##### What are geospatial data providers?

Geospatial data providers are organizations that collect, process, and/or distribute different types of location-based data.
This can include information on points of interest, building footprints, human mobility patterns, population demographics, transportation networks, and more.

These providers can be government agencies, scientific or academic institutions, private corporations, and others. Regardless of who provides it, geospatial data can be used for a number of different purposes that benefit businesses, communities, or even the whole of humanity or Earth.

Due to geospatial data’s growing importance, there are some organizations that have based the bulk of their operations around it. Their main purpose is to gather geospatial data and process it into usable formats, then give or sell it to others who need it. We’ll take a few minutes to talk more about that now.

##### What is geospatial data as a service?

Geospatial data as a service (DaaS) is an industry where companies gather and clean location data for the purpose of selling it to other institutions. Many organizations don’t have the resources to collect and process location data in-house, so they get it from businesses who specialize in doing that.

Geospatial data can be used for a lot more than just mapping. To give some examples, businesses can determine where they should build new stores or close old ones down. Governments can plan where to build transportation, housing, and other critical services. Investors can look at potential performance indicators and make investment decisions without waiting for official financial data to be released. And insurance firms can more accurately assess risk for a location, so they can write liability policies that are better tailored to clients and priced more competitively.

With more business sectors continually finding new applications for geospatial data, it should come as little surprise that DaaS companies are in demand.
We’ll introduce you to a number of them next.

##### 29 best geospatial data providers for deeper analysis

The top geospatial data companies are a mix of public and private organizations. Some specialize in certain types of geospatial data, making them ideal for particular use cases. And some may also offer software platforms and other solutions that make geospatial data analysis easier. You’ll meet all kinds as we go through our list.

1. SafeGraph

Cost: bulk pricing available

Types of data: POI | property

SafeGraph is your source for geospatial truth, providing comprehensive and accurate data on points of interest. Our Places and Geometry datasets provide location information and building footprints for millions of global locations.

2. CAP Locations

Cost: $0.05/record; charged on a per-dataset basis

Types of data: POI | property

CAP Locations provides point of interest and building footprint data for over 1.2 million restaurants and retail stores in the US and Canada. Along with this niche, it also provides comprehensive spatial hierarchy data for over 42,000 malls and other shopping centers spread between the two countries.

3. BA45

Cost: $0.005/record; charged on a per-dataset basis

Types of data: property

BA45 has attribution data on over 125 million residential and commercial properties in the US. Its data schema has 85 attributes that cover things like when a building was built, what it’s made out of, what amenities it has, and what price it last sold for. At half a cent per record, its data is very affordable as well.

4. First American Data & Analytics

Cost: $0.05/record; charged on a per-dataset basis

Types of data: property

First American Data & Analytics specializes in property data attributes related to buying, selling, leasing, and mortgaging houses. This makes it a great source of information for real estate insurers and investors looking to gauge the financial risk associated with specific residential properties.
Its data is comprehensive enough to cover the entire US housing market.

5. Regrid
Cost: $0.05 per record; can vary based on level of detail and whether purchasing for the entire US or a single state
Types of data: property

Regrid has data on over 150 million parcels of land in the US, which can be segmented by state or county if you only need specific sets. Its premium datasets feature more attributes, including building footprint geometry.

6. Locomizer
Cost: $0.02-$0.05/record, charged on a per-dataset basis
Types of data: POI | mobility

Locomizer provides general footfall data around points of interest. It also uses machine learning to estimate how likely a person is to engage in a brand-affiliated activity (e.g. shopping, sports, movie-watching, and eating/drinking) around a particular point of interest. Locomizer’s data covers mainly the UK, but can include some other countries.

7. Veraset
Cost: contact for pricing
Types of data: mobility

Veraset’s “Movements” dataset measures foot traffic centered on major points of interest in over 150 countries worldwide. Its “Visits” dataset combines this data with property profiles on over 6 million commercial buildings in the US to show who’s visiting which businesses, and when.

8. US Census Bureau
Cost: free
Types of data: demographics

The US Census Bureau is a United States government agency that provides survey-based information on demographics throughout the country to the public. However, it isn’t always in the most usable form, which is why SafeGraph has provided a cleaned-up version of its American Community Survey data.

9. Spatial.ai
Cost: contact for pricing
Types of data: demographics

Spatial.ai uses geotagged information from social media to build its demographics datasets. This allows it to segment census block groups into over 70 lifestyle categories based on food and drink, entertainment, faith, relationships, hobbies, and more.

10.
Esri
Cost: annual subscriptions to their SaaS platform that includes data access
Types of data: POI | property | mobility | demographics | boundaries | environmental | streets | imagery

Esri partners with leading data providers to provide pretty much every kind of geospatial data that an organization could want to use. It also provides ArcGIS, a top geospatial data management system.

11. Infutor
Cost: $0.01/record; $8,000-$10,000/month
Types of data: property | demographics | address

Infutor combines comprehensive US property data with transaction and demographic data to provide anonymized consumer profiles for over 260 million Americans. It also has address data for over 360 million places across the US, including geographic coordinates.

12. US Department of Transportation
Cost: free
Types of data: address

This is another United States government institution providing free data to the public. Among its datasets is the National Address Database, which contains over 65 million confirmed records of street addresses across the US.

13. Trust for Public Land
Cost: free
Types of data: POI | property | boundaries | demographics

The Trust for Public Land provides boundary and associated property information for public parks in nearly 14,000 communities across the US. Its dataset also includes some demographic information about people who live near the parks.

14. CARTO
Cost: annual subscriptions to their SaaS platform, which integrates with their data marketplace
Types of data: POI | property | mobility | demographics | boundaries | environmental | streets

CARTO is another company that sells many different types of geospatial data, having partnered with over 40 other providers. It also provides services and software for spatial analysis and location intelligence.

15. Tomorrow.io
Cost: $10,000/year; some content is free
Types of data: environmental

Tomorrow.io’s geospatial data software and datasets provide timely and accurate weather data so businesses can prepare for inclement weather.
This has implications for events, logistics, transportation, agriculture, construction, utilities, insurance, and more.

16. ClimateCheck
Cost: $0.05/record, charged on a per-dataset basis
Types of data: property | environmental

ClimateCheck compares attributes of US properties against historical weather patterns and over 25 internationally-recognized climate change models. This allows it to provide an assessment of how vulnerable any property in the US is to natural disasters and other weather-related damage caused by climate change.

17. CustomWeather
Cost: $0.02/record
Types of data: environmental

CustomWeather collects weather data from over 80,000 sources worldwide to provide daily, monthly, and year-over-year weather forecasts and comparisons. Its dataset includes attributes such as humidity, wind speed, dew point, visibility, and many more.

18. Mapbox
Cost: annual subscriptions that vary based on the geographic region(s) covered
Types of data: mobility | boundaries | streets

Mapbox provides platforms for displaying maps, planning routes, doing custom mapping, and more. It also sells datasets for over 4 million jurisdictional boundaries worldwide; live and typical traffic on streets around the world, as reported by over 600 million active monthly users; and human mobility patterns based on over 20 billion daily location updates across the globe.

19. Bing
Cost: contact for pricing
Types of data: POI | address | environmental | streets | imagery

Bing is most commonly known as Microsoft’s search engine, and one of Google Search’s chief rivals. But it also provides a Spatial Data Services API that lets you store, query, and geocode various types of location data. Bing’s imagery also powers many online mapping applications.

20.
CarePrecise
Cost: $0.05/record; other options depend on specific product and length of license
Types of data: POI | demographics

CarePrecise’s geospatial data products cover over 6 million healthcare providers in the US, from hospitals to specialist clinics to individual practices. You can get data on what a practice is classified as, what kinds of (and how many) procedures they do, how to contact them, and what health organizations they’re affiliated with. You can even get anonymized information on patient and employee demographics and reviews.

21. Netwise
Cost: up to $0.02/record; contact sales for other options
Types of data: POI | demographics

Netwise specializes in data that is useful for business-to-business marketing. It has information on over 30 million American companies, including anonymized demographics for both employees and customers of each company.

22. DatabaseUSA
Cost: contact for pricing
Types of data: POI | demographics

DatabaseUSA is another business-to-business data specialist that provides data on over 15 million American companies, including over 40,000 new ones each week. It also has anonymized demographics data on over 260 million Americans, segmented by professionals, customers, and homeowners.

23. Transparent
Cost: $0.05/record; charged on a per-dataset basis
Types of data: property | address

Transparent provides data on over 35 million properties worldwide that have been listed as vacation rentals. It has over 50 attributes of granularity, including property type, occupant capacity, minimum length of stay, and more.

24. Vertical Knowledge
Cost: $0.05/record; $800-$7,000/month (average is $2,500-$3,000/month)
Types of data: POI | property | mobility | demographics

Vertical Knowledge collects and processes all sorts of different information that’s publicly available on the Internet. So it has data on a wide variety of subjects, including rental properties, retail brands, automotive transactions, air and sea travel, and corporate demographics.

25.
HARNESS Data
Cost: $0.005/record; charged on a per-dataset basis
Types of data: POI | property | address

HARNESS Data is a firm specializing in UK location data. It provides comprehensive information on most commercial properties in Great Britain. It also has an address-matching tool to check whether an address corresponds to a place with a differently-formatted address, or whether the place exists at all.

26. Greenwich.HR
Cost: $0.05/record; charged on a per-dataset basis
Types of data: demographics

Greenwich.HR’s data is about corporate demographics. You’ll get to look at over 85,000 data attributes on positions at over 5 million companies. Find out who’s hiring, which roles are in demand, and how much money people are making per role and/or company in over 200 countries around the world.

27. SMR Research
Cost: $0.05/record; $4/report
Types of data: POI | property

SMR Research focuses on data concerning commercial real estate. This includes attributes that public records may not provide, such as information on building tenants, square footage, owner contact information, property use classification, and spatial hierarchy metadata for multi-unit buildings (e.g. apartments and malls). It also includes data on property valuation and other financial information that could be useful in measuring insurance or credit risk.

28. CRED iQ
Cost: $0.05/record or $300-$400/user/month; free option available
Types of data: POI | property | address

CRED iQ is another firm that specializes in commercial real estate data. It provides basic property and address data for over 140,000 parcels of commercial real estate in the US, as well as finance-related information (where applicable) such as lease durations and loan terms.

29. GoWeeWee
Cost: $0.02/record
Types of data: POI

GoWeeWee has data on over 230,000 public restrooms worldwide.
Attributes include gender designation (including gender-neutral), availability of baby-changing amenities, accessibility options, and when a restroom was last cleaned.

Hopefully, this list of geospatial data services has given you a better idea of what kinds of geospatial data are out there and where you can get them. Of course, you can start right here with us at SafeGraph.

#### Geospatial Data Sources: Where to Get the Data You Need

Once you understand what kind of geospatial data your organization might need, you next need to know where to find it. There are many places where you can get geospatial data, but not all of them will provide the type(s) of data that you’re looking for.

It’s important to know where to go to get the right kinds of data for building your organization’s geospatial data ecosystem. We’ll help you do that here by listing and explaining some common reliable geospatial data sources for the following categories:

- Points of interest
- Property
- Mobility
- Demographics
- Address
- Boundary
- Environmental
- Streets
- Imagery

Let’s begin our data safari.

23 geospatial data sources to find the right data for your operations

Not every person or organization is going to need the same kinds of geospatial data. So for the sake of speed and efficiency, it helps to know ahead of time where to look.

Our list here will break down sources of geospatial data by the types of data you’ll likely be after. Note that some sources may provide more than one kind of data, so you don’t necessarily need to shop all over the place to get what you need.

1. Points of interest

Sources of points of interest (POI) data have information on pretty much all kinds of non-residential buildings and properties. Basically, anywhere people congregate and hang out, other than a private dwelling, can be considered a point of interest. Usually that’s a commercial building like a store or restaurant, but not always.
POI data also often contains attributes of the places it describes, though these can be slightly different between providers.

SafeGraph Places

SafeGraph has extensive points of interest data known as Places. This includes information on millions of global places, with optional data for businesses that have permanently closed. You can find out a place’s name, address, geographic coordinates, business type, associated brand, and more.

CAP Locations

CAP Locations data focuses specifically on commercial points of interest. It contains records on over 40,000 retail stores, restaurants, and malls across the US and Canada (with plans to expand this dataset to the UK). Their information spans from addresses and business categories to attributes like parking and tenant capacity, or the year a building opened or was remodeled.

2. Property

Sources of property data have sets of polygons and other geometric shapes to represent the physical boundaries of buildings and other properties on Earth. Mostly, these datasets are on 2-dimensional planes of length and width, but some include height to become 3-dimensional. Many of these datasets also include additional attributes of the properties they define, much like POI data.

SafeGraph Geometry

SafeGraph’s property dataset is called Geometry. It provides building footprint information for millions of POIs in the US, UK, and Canada. It also contains spatial hierarchy metadata, which shows spaces or rooms within other buildings.

Regrid

Born from a merger between Landgrid and Loveland Technologies, Regrid has data on over 150 million parcels of property across the US. Download this geospatial data to also get standardized building footprints of over 155 million structures around the country.

BA45

BA45 has a database on over 125 million properties across the US, including information regarding who owns them.
Each property has been given over 40 different attributes, from how much money it last sold for to how many stories it has to whether it has a garage, pool, and/or central HVAC system.

3. Mobility

Sources of mobility data use anonymized location readings (typically GPS-derived) reported by people’s cell phones to provide a general picture of where people are throughout the day, and when. They do not track individual users or their activity, but rather measure activity around points of interest and neighborhoods (i.e. census block groups, or CBGs) to see how frequently people visit, and in what volumes.

Veraset

Veraset has a basic mobility dataset called Movement. This offers anonymized information on movement patterns around POIs located in over 150 countries around the world. This includes timestamps and geographic coordinates. Veraset also has a Visits dataset, which combines mobility and property data to determine whether or not someone actually visited a specific location, and when.

Locomizer

Locomizer runs a geospatial data platform that measures mobility data around specific points of interest. They are mainly called upon to help their clients put together targeted advertising campaigns.

4. Demographics

Sources of demographic data contain information about the people who live in a certain geographic area. This includes bulk population counts, but these are also often segmented by attributes such as sex/gender, age, median income, and average housing costs.

US Census Bureau

In many cases, demographics data is open geospatial data because it’s made publicly available by government agencies. For example, SafeGraph has aggregated and cleaned data from the US Census Bureau and their American Community Survey report to cover demographic approximations across the US from 2016 to 2019.

Esri Demographics Data

Esri has ready-to-use demographic data in various forms for over 130 countries around the world.
It’s segmented by over 15,000 variables, including ones related to income, families/households, health, education, employment, age, gender, ethnicity, and more.

Spatial.ai

Spatial.ai does demographic data differently. Their data is sourced from social media and segmented into over 70 categories that go beyond basic attributes to model and predict how people actually behave. These range from people’s hobbies, lifestyles, and relationships to what they eat and drink, what they do for fun, and what they believe in.

5. Address

Address sources provide data on the locations of properties to power geocoding and routing. Information is usually in the form of postal address details or geographic coordinates, but may include other attributes as well.

Infutor

Among Infutor’s geospatial datasets is its National Spatial Reference File, which compiles over 360 million addresses across the US. These include precise latitude and longitude coordinates, as well as all addresses that are on file for US postal organizations (and even some that aren’t).

US Department of Transportation

The USDOT has partnered with state and local governments across the US to create a National Address Database (NAD). Its goal is to have accurate, up-to-date, and free geospatial data on addresses for use in transportation safety, emergency response, and many other government services. As of May 2021, it contains over 60.5 million records from over half the states in the US.

6. Boundary

Sources of boundary data provide information about the political divisions in geography. These include borders between countries, but can also include the boundaries of smaller administrative jurisdictions such as states, provinces, territories, counties, and regions. They can also include small civic boundaries such as school districts.

CARTO

CARTO has data on several different kinds of boundary segmentation.
These include general regional borders, but also include things like census block groups (CBGs), school districts, postal code catchment areas, and more.

Esri Places Data

Esri provides an amalgamation of several different categories of geospatial data. One is administrative boundaries, such as postal code catchment areas. This geospatial data download also includes information on POIs and properties.

Mapbox Boundaries

Mapbox Boundaries has data on over 5 million boundary sets from countries around the world. These include administrative, legislative, local, postal, and statistical boundaries such as states/provinces/counties, electoral districts, major metropolitan areas, and census block groups (CBGs).

7. Environmental

Sources for environmental data gather information on what’s going on in the natural geographic world. Some may be government agencies, while others may be private corporations or non-profit conservation groups. They track things like temperature, weather patterns, wildlife migration, and seismic activity.

ClimateCheck

ClimateCheck is an environmental data service that synthesizes over 25 internationally-recognized models on climate change. Built for the insurance and real estate industries, it assesses the risk of climate-related damage (fires, floods, heat, storms, etc.) to over 140 million individual properties in the US over the next 30 years.

Tomorrow.io

Tomorrow.io offers complete, accurate, and customizable historical weather information from around the world. Their platform is built to help minimize the impact of inclement weather on businesses. It does so through services such as monitoring and forecasting conditions, making actionable recommendations, providing team-wide alerts, and streamlining communication channels to speed up response to weather-related incidents.

CustomWeather

CustomWeather aggregates weather data from over 80,000 locations worldwide to provide the most comprehensive global weather coverage.
By providing daily, monthly, and year-over-year weather comparisons for specific places on the planet, CustomWeather provides vital climate intelligence for workers in broadcast media, agriculture, insurance, renewable energy, and more. Notable weather attributes they track include min/max/average temperature, precipitation, humidity, and atmospheric pressure.

8. Streets

Sources of street data map out the myriad of road transportation networks around the world. They may also provide metadata on these networks, such as where and when roads get the most traffic and what potential obstructions might slow commuters down. Their goal is to safely and efficiently get people where they want to go.

Mapbox Traffic Data

Mapbox’s Traffic Data contains information about over 30 billion road segments (each taking an average of 5 minutes or less to traverse) around the world, consistently updated by over 600 million monthly active users worldwide. It works with other major geospatial data solutions such as OpenStreetMap, HERE, and TomTom to provide the data necessary for route planning and traffic analysis.

Google

Google’s Roads API accepts up to 100 sets of GPS coordinates and maps those points to the geometry of known roads to determine the most likely route a vehicle took. It includes features for interpolating coordinates to better fit the actual shapes of roads, and even provides other metadata about those road segments (such as speed limits).

9. Imagery

There are many sources of images out there. But the best ones for geospatial purposes are those that show what the physical world actually looks like, as a visual frame of reference for other geospatial data. It is also helpful if they include other geospatial metadata, such as what points of interest can be seen (and perhaps information about them), what time of day it is, and what other signs and signifiers might be nearby for navigation and safety’s sake.
Bing

Many search engines, such as Microsoft Bing, now offer mapping services that incorporate aerial photography as a map layer option. Some, like Bing, also offer “street view” services that allow users to view particular road segments as if they were actually traveling on them.

Nexar

Nexar’s imagery data focuses on roadways. They use special dashboard cameras on cars to not only capture what roads and their surroundings look like, but also to detect road signs and other factors that may affect traffic. This allows navigation companies, insurance firms, governments, and others to assess the safety and driveability of roads more accurately.

These are a few of the many places out there where you can find reliable geospatial data. Now that you know where to get the data you need to build your geospatial data ecosystem, the next chapter will give you some ideas on what you can do with it all.

If you're ready to learn more, check out the next chapter, "Top 10 Uses of Geospatial Data + Where to Get It".

If you want to learn more about geospatial data types, check out “Geospatial Data Types and How You Can Use Them”.

#### Geospatial Data Types and How You Can Use Them

The first thing to know about geospatial data is that it comes in many different forms. Some datasets are more suited to certain tasks than others, and some tasks require more than one type of dataset to see the full picture.

This is why it’s important to be aware of the existence and primary uses of each geospatial data type within your geospatial data ecosystem. So let’s get started by explaining a bit about some of the most common classes of geospatial data you’ll encounter:

- Points of Interest
- Property
- Mobility
- Demographics
- Address
- Boundaries
- Environment
- Streets
- Imagery

As you can see, there are quite a few to cover. So let’s get to it.

9 geospatial data types: ways of representing places, people, and things on Earth

At SafeGraph, we’ve boiled down the geospatial data ecosystem into 9 distinct data types.
Your organization will likely need to use several of them in combination to get the insights that you want, though you may not necessarily have to use them all. It depends on what your operation does and what the specific projects you undertake entail.

1. Points of Interest

Points of interest (or POIs) are one of the most fundamental types of geospatial data. They describe any number and type of physical places on Earth (besides private residences) that people may want to visit, or use as reference in analysis. On smaller-scale maps, POI data can also abstractly represent cities or towns.

Many POI datasets may also contain additional attributes of the points they describe. For example, POI data for businesses may contain information like street addresses, mailing codes, and phone numbers, as well as open hours and brand affiliation.

It is important to note that POI data can be dynamic, especially in regard to human-made points of interest. Sticking with the business example, stores open and close all the time as their owners make strategic decisions. Stale POI data won’t be very helpful if a particular business you’re looking for isn’t at its specified location anymore, or has been replaced by a different business altogether. That’s why SafeGraph updates our Places data every month, to account for this volatility.

Primary uses:

POI data is used in many different ways. Many organizations use it in their mapping projects to show people where things are on Earth, along with other information people might want or need to know about those locations (like store hours). Real estate companies use it to analyze business opportunities and then decide on whether or not an area is worth investing in, based on predictions about population growth and competition.

Retailers and CPG brands use it to assess local market conditions and measure how large their clientele base might be in a certain area.
Financial institutions use it to track the openings and closings of different types of businesses across trade areas in order to decide which brands or sectors to invest in. And healthcare planners and providers use it to locate existing facilities, then compare their number and type to surrounding demographics to make sure everyone has access to the care they need. Check out a free sample to get started on your own.

2. Property

Property data represents the accurate physical boundaries of tangible places in the real world. Usually, it refers to the shapes of buildings or parcels of land. However, it can also be used to refer to different parts of a spatial hierarchy (i.e. multiple properties within buildings), like apartment units, stores in a mall, or offices in a business complex. Property data is often polygon data. SafeGraph’s Geometry dataset is a great example of this type of data, and includes spatial hierarchy metadata to provide detailed context about property relationships.

Primary uses:

Property data is often used in mapping as a more accurate method than point data of representing what a place looks like in the physical world. This is useful for visit attribution, or determining whether or not people actually visit a POI (as opposed to just walking past it) and how long they stay there. Insurers use property data to more accurately assess a building’s risk factor based on how many people visit it, what other businesses are inside it, and what other buildings are nearby. For example, a nail salon sharing a wall with a fireworks store will have a higher risk profile than a nail salon located next to a daycare center. So insurers can use property data to accurately assess risk and write policies. Check out our free sample to get started on your own.

3. Mobility

Mobility data refers to aggregated and anonymized data regarding where and when people move about in their daily lives.
It is usually derived from Global Positioning System (GPS) location readings reported by people’s phones. Mobility data does not provide individual mobile phone locations or activity, but instead provides aggregations of movement at the POI and Census Block Group (CBG) levels to give a general sense of volume and frequency of visits to certain locations.

Primary uses:

Mobility data has several uses. By knowing where people go and what stores they shop at, businesses can make decisions about things like where to locate their own stores, what brands to carry, and where to place advertisements.

Insurers can also use this data to develop general liability policies for properties by looking at approximate visit counts. They can also look at visit counts for different times of the year, for businesses that operate seasonally. As an example, businesses that get more foot traffic in the winter are more likely to have people slip, trip, or fall because of icy conditions. So they have a different risk profile than businesses that get most of their customers in the summer.

Urban planners use mobility data to better understand the communities they serve and how to better support the population. Measuring the volume of people going from one area of a city or county to another at a specific time of day can indicate a need for more public transportation routes, or more housing options near the destination.

4. Demographics

Demographics data refers to aggregated population counts, along with information about characteristics of the people within them. These include things like gender, age, income, housing costs, and so on.
They are usually collected through government-run censuses and surveys.

While SafeGraph doesn’t typically produce these kinds of geospatial data models, we do clean the data up for use with our POI and mobility data so that groups looking to analyze it can more easily find what they’re looking for.

Primary uses:

Demographic data is often combined with mobility data to get a sense of a business’s potential clientele: not only who visits the area, but also who actually lives there. By looking at the mobility, lifestyles, and economic strength of people who live in (and move through) an area, businesses can get an idea of whether that area is worth investing in or not. And if they do decide to invest, businesses can also use demographic data to help determine where to locate their stores, how and where to position their advertising, and what products and brands to carry. Check out our free sample to get started on your own.

5. Address

Address data is the foundation for any geospatial data. It provides navigation-related information regarding specific places, represented by pairs or sets of geographic coordinates associated with street addresses.

Primary uses:

Address data is used to map, visualize, and analyze where places are located. Address data is an important input in POI data, but differs in that it can represent residential places (as opposed to just places where businesses operate or people spend time and money). While address data can be extremely helpful as an analytical input on its own, it can also be used in conjunction with other geospatial data types to see exactly what is happening at a specific place.
For example, joining address data to weather data can reveal historical weather patterns at a granular level, and joining it to boundary data shows which school district or tax jurisdiction a place falls within.

Address data is also fundamental to geocoding: translating street addresses into geographic coordinates and vice-versa (reverse geocoding). It may also be used to check whether an address is actually tied to a real place (address validation). Street data is usually built with address segments, requiring accurate geocoding to get a true rendering of where a place is located on that street.

Address data is one of the trickiest geospatial data formats to work with because it’s difficult to standardize. Street addresses, in particular, contain multiple pieces of information that can each be commonly represented by different acronyms, abbreviations, and punctuation variations (e.g. “USA” vs. “U.S.A.” vs. “America”). This can make it very easy for a computer to mistake two addresses that point to the same location for entirely separate places. Unique identifiers (or join keys) for addresses, like Placekey, can help mitigate this problem. We’ll talk more about this in the chapter on the challenges of working with geospatial data. Check out our free sample to get started on your own.

6. Boundaries

Boundary data is like a large-scale version of property data. It outlines the limits of larger geographic areas that typically contain more than one address, property, and/or point of interest. And, like property data, it is usually represented by polygons as opposed to singular points.

Primary uses:

Boundary data serves an organizational purpose in mapping, often being used to designate separation between countries and the regions within them. On a more local scale, boundaries can be used to analyze the catchment areas of schools and other important facilities.
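
To make the catchment-area idea concrete, here is a minimal sketch of the point-in-polygon test that underlies this kind of boundary lookup. It uses the standard ray-casting technique on plain (x, y) pairs; the `catchment` rectangle and the test points are made-up values for illustration, and treating coordinates as planar is a simplifying assumption. A production workflow would more likely rely on a dedicated GIS library and real boundary files.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # planar (x, y); e.g. (longitude, latitude) over small areas


def point_in_polygon(point: Point, polygon: List[Point]) -> bool:
    """Ray casting: count polygon edges crossed by a horizontal ray cast to the right."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point can be crossed.
        if (y1 > y) != (y2 > y):
            # x-coordinate at which the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside  # each crossing toggles inside/outside
    return inside


# Hypothetical school catchment boundary (a simple rectangle) and two made-up locations.
catchment = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]
print(point_in_polygon((1.0, 1.0), catchment))  # True: falls inside the boundary
print(point_in_polygon((5.0, 1.0), catchment))  # False: falls outside the boundary
```

Running the same check for every address against every boundary polygon is, in essence, what joining address data to boundary data does.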
Or a business could use boundary data to make decisions on where to locate their shops or display their advertising, based on the rules or other attributes of the jurisdiction(s) they fall under. Real estate investors or brokers can use boundaries in much the same way.

7. Environment

Environmental data relates to natural geographic phenomena. These include things like climate (including weather and temperature patterns), tides, elevation, seismic activity, and flora/fauna habitats or migration patterns.

Primary uses:

Obviously, environmental data is critical to conservation workers and other environmentalists. But it can also be useful to people working in the insurance sector. By analyzing how prone an area is to, say, the effects of extreme weather and natural disasters (such as fires, flooding, wind damage, and structural collapse), insurers can take this information into consideration when performing risk assessments and developing liability frameworks.

8. Streets

Street data provides information about road transportation networks. It may also include information about the volume of traffic on these routes at certain times, and sometimes the causes (such as construction, inclement weather, or accidents).

Primary uses:

Street data is integral to many forms of mapping, as it provides context on transportation routes for people looking to get from one place to another. Advanced street data can also help with planning a specific (or alternate) route if one or more is overly obstructed or completely blocked off. Routing applications and tools provided by GIS software use street data as an essential input.

9. Imagery

Imagery data refers to true-to-life representations of what places look like in the physical world, whether those places are natural land and water masses or man-made structures (such as buildings and roads).
It usually consists of aerial photography or satellite imaging. Imagery data is always in a raster format, which means that it stores information as a grid instead of as points, lines, or polygons.

Primary uses: Imagery data is typically used in mapping, often as a contextual foundation for other geospatial data layers (i.e. a basemap). Conservationists and other environmentalists can also use it to get a more accurate depiction of what the surface of the Earth looks like at any given point in time. This can reveal information that is important to them but may not be present in other maps, such as tree cover, water quality or level, animal herd movements, and spread of wildfires.

A fundamental step in using geospatial data for your organization is learning precisely what’s out there, as well as what kind of information each type can (and can’t) tell you.

If you're ready to learn more, check out the next chapter "Geospatial Data Sources: Where to Get the Data You Need".

If you want to go back to basics, check out "Geospatial Data: A Comprehensive Guide".

#### Geospatial Data: A Comprehensive Guide

SafeGraph is a company built around a belief in the importance of geospatial data. We’re built that way because we believe in knowing more than just what happens. We believe in knowing why it happens, and we also believe that you can’t know that without knowing where it happens.

Now, you may ask: what is geospatial data, exactly? And what makes it such a unique and valuable asset to so many organizations? If you’re curious about what geospatial data is or how it might be able to help your business, this guide is for you.

If you’re looking for the basics, jump down to these topics now: What is geospatial data?
Importance of geospatial data for strategic decisions If you need the full crash course on geospatial data use, from the types of data out there and where you can get them to what you can do with geospatial data and how you can work effectively with it, check out our detailed guides below. Here’s what you can look forward to: Chapter 1: Geospatial Data Types and How You Can Use Them Everything has a geography, so almost any data can be made geospatial. In this guide, we’ll define geospatial data in terms of its most common categories and ways of representing places, people, and things. We break down everything from POIs to building footprints to mobility data and everything in between. If you want to learn more about geospatial data types, check out “Geospatial Data Types and How You Can Use Them”. Chapter 2: Geospatial Data Sources — Where to Get the Data You Need Some geospatial data providers specialize in one type, while others produce a wide variety of datasets. Find where to get the data you need for a particular geospatial analysis. We list 9 types of sources for your geospatial data, and 20+ providers and vendors that specialize in each of those areas. If your goal is to learn more about the sources and where you can actually get geospatial data, read through “Geospatial Data Sources — Where to Get the Data You Need”. Chapter 3: Top 10 Uses of Geospatial Data + Where to Get It What is geospatial data used for? It’s obviously critical to mapping, but it’s also seeing increasing use in business analytics and strategy planning. See the various ways geospatial data is being applied across industries and organizations, including in mapping, retail site selection, visit attribution, urban planning, network planning, investment research, and more. Learn more about use cases in “Top 10 Uses of Geospatial Data + Where to Get It”. 
Chapter 4: Geospatial Data Analytics — What It Is, Benefits, and Top Use Cases The real value of geospatial data lies in the insights gained from analyzing it. In this guide, learn what geospatial data analysis is, the benefits of using it in analytics, the top ways it’s used most effectively, and about the changing geospatial data analytics market and industry. “Geospatial Data Analytics — What It Is, Benefits, and Top Use Cases” will teach you everything you need to know about this topic. Chapter 5: 12 Methods for Visualizing Geospatial Data on a Map Visualizations are critical for giving geospatial data meaning. We break down the top 12 methods used for visualizing geospatial data (with image examples), how to do these visualizations, and in what instances they are most useful. If you need to learn more about the top “12 Methods for Visualizing Geospatial Data on a Map”, this is your step-by-step guide. Chapter 6: Geospatial Data Integration — Importance + Top 5 Challenges Using geospatial data is not without its complications, some of which you won’t find elsewhere in data science. This guide explains why data integration is necessary, and breaks down the top 5 challenges associated with geospatial data integration. Ensure you have the right guidelines, know-how, and tools to utilize geospatial data effectively, and learn how to solve problems such as data standardization, address standardization, processing times, data quality, and more. If you’re on the integration path and have questions about the process, make sure you check out “Geospatial Data Integration — Importance + Top 5 Challenges”. Chapter 7: Geospatial Data Management Best Practices — 5 Steps to a Winning Strategy Geospatial data has a few key differences from other types of data, and so requires a somewhat unique management style. If you are managing geospatial data or you need to soon, you need to identify your organization’s needs and optimize your strategy to accommodate those needs. 
We break down these questions and concepts into 5 phases that will help you plan your strategy and implement it at your organization. And we include 15 best practices you can use as the base for your geospatial data management strategy. If you’re already working with large-scale geospatial data management or you’re about to, make sure you check out “Geospatial Data Management Best Practices — 5 Steps to a Winning Strategy”.

What is geospatial data?

Geospatial data is any information about an object, event, or phenomenon relative to its location on (or near) Earth’s surface. It can often also include more details than just the address or coordinates of where it is occurring, such as timestamps, categorization, and other attribution.

The location component is what’s critical to this geospatial data definition. It means that the data doesn’t just exist in a vacuum; it inherently points to a real place (or set of places) somewhere on Earth. This makes geospatial data behave a bit differently than other types of data, but it also makes it easier to visualize and conceptualize.

Adding other attributes to geospatial data provides even more context and opens up even more avenues for analysis. For example, adding a time component allows for monitoring dynamic objects and events, such as how close a delivery truck is to reaching its drop-off destination, or if/when a severe storm over the ocean is likely to make landfall. Possibilities like these are also part of what makes geospatial data unique.

What is geospatial data in GIS?

It’s sometimes asked what the relationship between GIS and geospatial data is, since the two terms are often used together. Typically, GIS (i.e. geographic information systems) refers to a specialized system of computer software that collects, manages, analyzes, and maps geospatial data. In other words, it processes geospatial data into forms that are easier for humans to understand and use. GIS platforms are more common than you might think, too.
If you’ve ever used Google Maps to get driving directions or find the address of a local place to eat (both of which are types of geospatial data), you’ve used a GIS.

Importance of geospatial data for strategic decisions

So why is geospatial data important? Well, as we mentioned, it adds spatial (and sometimes temporal) context to information. And being able to relate data to specific places and times in the physical world makes it easier to conceptualize.

Patterns in things such as shopping habits, migrations, severe weather, and road traffic are much more apparent if they’re mapped to a representation of what the world actually looks like, as opposed to just being numbers in a table. The ability to recognize more of these patterns, and faster, is what is giving organizations that use geospatial data a competitive advantage. Here are three reasons why:

- Everything has a geography — A lot of data gives little to no information without being analyzed in the context of specific places. For instance, measuring the magnitude of an earthquake isn’t that helpful unless you also pinpoint where in the world its impacts were felt.
- Location provides context — Understanding the attributes of a location where an event occurs, along with those of the surrounding area, gives you a starting point for examining what may have caused or influenced the event.
- Geographic patterns reveal relationships — Measuring occurrences over specific places allows you to better model, analyze, and ultimately predict behaviors. You’ll understand not only what and where they will likely happen, but also why they happen at specific places and maybe even times. This will better equip you to react to, affect, or even initiate future events.

This only scratches the surface of what geospatial data encompasses and what it’s capable of.
Throughout this guide, we’ll discuss what types of geospatial data are out there, where to find them, what you can do with them, and how you can effectively put them to work towards building a better business strategy.

To lead off, we’ll cover some of the common forms that geospatial data comes in. This will help you recognize what they look like and, more importantly, understand what information they provide. You can use this to decide how they may fit into your organization’s operations.

If you're ready to learn more about geospatial data types, check out “Geospatial Data Types and How You Can Use Them”.

#### Google Places API Alternatives for Sourcing POI Data

If your organization is building a location-based app or widget, it needs information on where people can go and what they can do at those places. Google has invested heavily in geospatial apps such as Google Maps and Google Earth, so its Places API is a common choice for searching and fetching point of interest data in apps. It may not always be the best choice, though: it can be expensive, lack certain data, or have restrictive use clauses.

This article will discuss how to find the right Google Places API alternative if your business has specific needs. Here’s what’s inside:

- What is the Google Places API?
- The Difference between Google Places API and Google Maps API
- 15 Google Places API Alternatives for More Completeness, Affordability, and Usability

We’ll first talk a bit about what the Google Places API is, to give some greater context.

What is the Google Places API?

The Google Places API is a service that takes HTTP requests and returns information, in XML or JSON format, about locations on Earth. Such locations are classified as establishments (e.g. businesses or government facilities), points of interest (e.g. memorials or parks), or general geographic areas.
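To make the request-and-response idea concrete, here is a minimal Python sketch of reading a JSON place record. The sample payload, its field names (`status`, `result`, `place_id`, `types`, `geometry`), and the helper `summarize_place` are illustrative assumptions loosely modeled on a place lookup response. Check Google's current documentation for the actual schema before relying on any of them.

```python
import json

# Hypothetical response body; the field names are assumptions for
# illustration and are not a guaranteed Google Places API schema.
SAMPLE_RESPONSE = """
{
  "status": "OK",
  "result": {
    "place_id": "EXAMPLE_PLACE_ID",
    "name": "Example Coffee Shop",
    "types": ["cafe", "establishment"],
    "geometry": {"location": {"lat": 39.7392, "lng": -104.9903}}
  }
}
"""


def summarize_place(raw_json: str) -> dict:
    """Pull a few commonly needed fields out of a JSON place record."""
    payload = json.loads(raw_json)
    if payload.get("status") != "OK":
        raise ValueError(f"lookup failed with status {payload.get('status')!r}")

    place = payload["result"]
    location = place["geometry"]["location"]
    types = place.get("types", [])
    return {
        "id": place["place_id"],
        "name": place["name"],
        # Flag whether the record is tagged as an establishment
        # (as opposed to, say, a general geographic area).
        "is_establishment": "establishment" in types,
        "lat": location["lat"],
        "lng": location["lng"],
    }


if __name__ == "__main__":
    print(summarize_place(SAMPLE_RESPONSE))
```

In a real application the string would come from the body of an HTTP response rather than a constant, and the status check is where quota or authentication failures would surface.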
The API is made up of four main functions:

- Search: Shows the user a list of places close to their current location, or related to criteria that they search for (e.g. name, address, phone number).
- Details: Displays additional information about a particular place, including general user reviews of any products or services sold there.
- Photos: Allows the user to view photos of a specific place.
- Autocomplete: Predicts the name or address of a place a user is searching for as they type it, or provides an on-the-fly list of places and/or possible search queries related to the user’s current search terms.

Each location in the Places API database also has a unique Place ID that can be used to find it in either the Google Places API or Google Maps API. There is a slight difference between these two application programming interfaces, which is why they are often used together instead of one or the other. We’ll take a minute to briefly explain.

The Difference between the Google Places API and Google Maps API

Since people typically use maps to find places, it’s understandable to think that the Google Places API and Google Maps API are two different names for the same service. While they are somewhat related, they fulfill slightly different roles.

The Google Maps API primarily deals with depicting physical geography, as well as directions between locations. In that sense, it’s a starter kit for those wanting to build their own map widgets to show people where on Earth they are and how to get where they want to go. But it doesn’t necessarily provide a lot of information about what is specifically at one point on the map or another.

That’s where the Google Places API comes in. It’s more concerned with what’s interesting about a particular place. Can you eat/drink there? Shop? Sightsee? Be entertained? Rest for the night? Places API acts as a directory that allows map applications to access this information.
Think of it this way: if the Google Maps API tells people where things are and how to get there, the Places API tells people what is at those places and why they would want to go there. 15 Google Places API Alternatives for More Completeness, Affordability, and Usability As recognizable a brand as Google is, its Places API may not necessarily be the right fit for organizations needing POI and other geospatial data for their mapping applications. Companies with large operations, popular apps, or more dynamic mapping needs may find they burn through free request credits too quickly and end up with expensive bills. Other groups may need specialized data that Google doesn’t have. Still others may find Google’s data licensing and use terms too restrictive for the kinds of projects they want to build. The following is a list of alternatives to Google Places API if your organization needs a solution that’s cheaper, more flexible, or generally more tailored to its needs. 1. SafeGraph Places Headquarters: Denver, Colorado, USA Pricing: $$ Free Trial: Sample data available Best for: Fresh, accurate, comprehensive, and easy access data for firms that need precision SafeGraph’s Places dataset provides over 20 standard attributes of information regarding over 30 million distinct POIs around the world, including over 6 million parking lots. This data is updated monthly – far more frequently than most POI datasets – so your company can spend more time using it and less time worrying about having to fix it because it’s stale or inaccurate. Plus, our thorough documentation makes it easy to understand how the data is organized, and why some parts are more complete than others. Our data is available through popular data platforms (such as Amazon S3, Snowflake, Databricks Delta Share, and Microsoft Azure) or direct CSV download, so you can start using it sooner for whatever application you need. 
What you get Complete, accurate, detailed, and fresh POI data that’s clean and ready to use Extensive category tag system for richer information about places than Google Places Easy data access through a CSV file, or a variety of other common data platforms Data comes at a fair, flat, value-based price instead of expensive “pay as you go” models 2. ChainXY Source: ChainXY Headquarters: Vancouver, British Columbia, Canada Pricing: $ Free Trial: No Best for: Coverage of major brands and chains ChainXY provides data for major store and brand chains around the world. Using their self-serve portal, users can download points of interest data for specific brands and regions in the file format of their choice, such as .shp, .tab, .kml, .csv, and more. ChainXY data can be purchased on a one-off basis or as part of an annual subscription. However, the usage terms for ChainXY data do not allow for the use of the POIs externally, so there are limitations. While ChainXY could be a good alternative for the Google Places API for users looking to populate maps and applications with major brand storefronts, the data is not ideal for coverage of mom-and-pop stores or smaller brands. ChainXY also only updates their database each quarter, so data is commonly stale and not a true representation of the real world. They also do not cover non-commercial places, like transit stops, industrial warehouses, or parks. What you get POIs for major brands across the world Quarterly updates to data when possible Flexible file format download options, but restrictions on how it can be used 3. AggData Source: AggData Headquarters: Dupont, Washington, USA Pricing: $ Free Trial: Free data available Best for: Locations of major brands AggData offers POI data bundles for specific brands by geographic region. On their site, users can search for the geographic extent and commercial brand they need and purchase the data directly.
Data is delivered via a CSV download with each individual purchase, but if users would like complete access to all AggData bundles, they can pay for a premium subscription. Within their self-serve data buying experience, AggData shows how recently each bundle has been updated. The data freshness really depends on the specific bundle, but can range anywhere from one month to three years since the last update. AggData bundles do not provide contextual attributes related to store building geometry, open hours, or NAICS code, so are best used for getting a general idea of where major brands may be located in the world. What you get POIs for major brands across the world Self-serve buying experience for the exact brand and geography you need CSV with geocoded points of store locations, although the geocoding quality is not as precise as Google or others on this list 4. Precisely Source: Precisely Headquarters: Burlington, Massachusetts, USA Pricing: $$$ Free Trial: Sample data available Best for: Global coverage and interoperability with their other geospatial datasets Precisely (formerly Pitney Bowes) is a data integrity company with a geospatial data portfolio that includes points of interest and building footprint data. They have global coverage and provide data records for all types of POIs, and their large data portfolio is all connected through a unique identifier for addresses, the PreciselyID. As a large company, Precisely does a lot of things related to data integrity and their geospatial portfolio is just a small subset of their focus. While their points of interest dataset does have hundreds of millions of records, many are duplicates of each other. The data is updated quarterly, so POIs are not always a true representation of the physical world. Depending on how fresh and clean you need your POI data, Precisely POIs may be a good alternative to the Google Places API. 
What you get Global coverage of POIs from a wide variety of categories Interoperability with Precisely's other geospatial datasets using the address-based unique identifier the PreciselyID Flexibility in file format download (.csv, .shp, .tab, and more) Lots of duplicate records, so you will need to be comfortable cleaning and de-duping data 5. OpenStreetMap Source: OpenStreetMap Headquarters: Cambridge, UK Pricing: Free Free Trial: No Best for: Free, flexible data for do-it-yourself mapping projects OpenStreetMap is a popular free alternative to Google Places API. It’s built and maintained by a community of cartographers, humanitarians, software engineers, and others contributing information about points of interest all over the world. Many organizations use it because it has very relaxed licensing requirements, requiring only crediting OpenStreetMap contributors for any data used and requiring the same conditions for any derivative works produced. However, its data can be tricky to access, and can have issues with coverage, attribute fill rate, and documentation completeness. This can limit its usefulness for operations needing more precise and detailed information. What you get Free data (but can have completeness issues, as it’s contributed mainly by volunteers) Flexible licensing to create a variety of projects without running into legal issues 6. HERE Geocoding & Search API Source: HERE Technologies Headquarters: Eindhoven, Netherlands Price: $$$$ Free Trial: No Best for: Finding places based on several different types of criteria HERE’s Geocoding and Search API provides many of the same functions as the Google Places API. It allows for finding or discovering locations using names, addresses, coordinates, telephone numbers, business categories, brand names, etc. It can also autocomplete or autosuggest places based on valid addresses and terms. 
Furthermore, it can geocode addresses or reverse geocode coordinates, or – similar to how Google Places can – find a specific location based on a unique identifier. HERE’s database contains over 120 million points of interest in over 100 countries and territories. It can also draw from third-party datasets such as TripAdvisor’s. However, its coverage is limited outside of a handful of prominent countries. What you get Multi-criteria search and discovery functions, including by proximity or along a route Geocoding, reverse geocoding, autocomplete, and lookup by unique ID features Local coverage can be lacking in some geographic areas tracked 7. Foursquare Places API Source: Foursquare Headquarters: New York City, New York, USA Price: $$$$ Free Trial: No Best for: Core POI information plus crowd-powered extra details Foursquare grew its fame from location information crowdsourcing apps, but is now a general geospatial data company. Its Places API lists over 100 million POIs from over 200 countries and territories around the world. Similar to Google Places, it not only allows users to search for places through various criteria, but it also has autocomplete functions, place details, and photo capabilities. Plus, it includes place reviews and traveler’s tips on locations from users of Foursquare’s family of apps, such as City Guide and Swarm. What you get User-generated reviews and tips about locations from Foursquare app users 25 core POI data attributes; other ones available, but at additional cost Allows only temporary caching of data for enterprise customers 8. Esri ArcGIS Geocoding REST API Source: Esri Headquarters: Redlands, California, USA Price: $$$ Free Trial: No Best for: Ability to search for information about places in a variety of different ways Esri’s ArcGIS is one of the most well-known mapping applications in the world, and it has a number of APIs associated with it as well. 
The one closest to an alternative to Google Places API is the Geocoding REST API, which allows for geocoding, reverse geocoding, and searching for locations in a variety of different ways. For example, you can search for places by category type, or even how they are referenced in different languages. You can also limit searches to proximities, geofences, cities, or countries. There are many other formatting options, such as searching for specific data attributes, limiting the number of search results, and specifying whether to search for a rooftop or street address. What you get Lots of different ways to search and filter information on locations API has other functions for base maps, routing, demographics, elevation, hydrology, etc. 9. Leaflet Source: Leaflet Headquarters: Kiev, Ukraine Price: Free Free Trial: No Best for: A simple map-making tool for developers with lots of customization options Leaflet is a free, lightweight, and open-source JavaScript library that can be used to create interactive, mobile-friendly maps. It’s designed for simplicity, so it doesn’t contain any POI data itself. However, Leaflet has a large user base that has created a number of plugins for the service, some of which allow for overlaying place details from various sources. Because of its lack of cost and abundant community support, Leaflet is considered a top Google Places API free alternative. It’s made for people with software development experience, though, so you need to know how to work with coding languages to get the most out of it. What you get Free and flexible toolset that doesn’t take much processing power to use Large support community has created lots of plugins for customization Doesn’t inherently come with POI data and requires coding experience to use well 10. 
TomTom Places API Source: TomTom Headquarters: Amsterdam, Netherlands Price: $$$ Free Trial: Yes (with limitations) Best for: Proximity-based or route-based searches, or locating EV charging stations As one of the pioneers behind route-planning software, TomTom has a number of different geospatial APIs. Its Places API allows for geocoding and reverse geocoding locations (up to 10,000 at a time); searching for locations with autocomplete and fuzzy logic capabilities; searching for locations within a proximity or along a route; and even searching for electric vehicle charging stations (though this latter function is more pricey to request). TomTom Places provides POI data for about 270 countries and territories, but only has complete data for about 70 of them. What you get Ability to find locations within a certain proximity or along a specific route One of the few APIs that tracks electric vehicle charging stations Data completeness is limited for the geographic areas it tracks 11. OpenLayers Source: OpenLayers Headquarters: Beaverton, Oregon, USA Price: Free Free Trial: No Best for: Free mapmaking for developers who are only looking for specific functions OpenLayers is similar to Leaflet in that it’s a JavaScript library for building custom maps. Also like Leaflet, it requires a bit of programming knowledge to use correctly, and it needs to pull data from services like OpenStreetMap, Bing, Mapbox, Stamen, etc. So it can suffer some of the same problems regarding data breadth and completeness as some other free and/or open-source services. On the plus side, OpenLayers lets users pick the features they want to include, which makes it lightweight and thus allows maps to load quickly. What you get Free and customizable mapmaking solution Fast and lightweight, since users only need to call what they need Uses a mix of POI data sources, so data scope and fill is unreliable 12. 
LocationIQ Source: LocationIQ Headquarters: Milpitas, California, USA Price: $$$ Free Trial: Yes (with limitations) Best for: Google Maps functionality at a fraction of the cost LocationIQ positions itself as an alternative to Google Places API and Google Maps API that costs up to 90% less. It also has more flexible policies and features. LocationIQ has three different APIs: one for geocoding and reverse-geocoding; one for creating maps with customizable tiles; and one for planning routes, including measuring distances and matching coordinates to road networks. LocationIQ uses a combination of open-source and third-party data, which improves its data quality a bit. But it still can’t outperform a curated POI database. Also, it currently has strict limits on the number of API calls that can be made under each subscription plan. If these limits are exceeded, additional requests will fail. LocationIQ is trying to loosen these restrictions, but hasn’t been able to yet. What you get Contains geocoding, mapping, and/or navigation APIs Street-level accuracy in most places; house-level accuracy in some places Reasonable pricing and licensing terms Uses open-source and third-party data; data breadth and completeness can be iffy ‍ 13. Oracle Spatial Studio Source: Oracle Headquarters: Austin, Texas, USA Price: $$$$$ Free Trial: Yes (30 days) Best for: Advanced geospatial data analysis at an enterprise level Oracle’s data management solutions also include tools for viewing and analyzing geospatial data, collectively called Spatial Studio. This software allows for manipulating and enriching various forms of geospatial data – including POI, street, boundary, imagery, network, topology, and movement – in a no-coding mapping platform. Possible operations include geocoding and reverse geocoding, route finding, and other types of spatial analysis. 
What you get Wide variety of geospatial analysis tools, compatible with many data types Uses data from third-party and open sources; often requires your own data Requires subscription to Oracle Autonomous Database 14. Apple MapKit / MapKit JS Source: Apple Headquarters: Cupertino, California, USA Price: $$ Free Trial: No Best for: Ready-to-go digital mapmaking with advanced but intuitive features Apple has its own mapping service in Apple Maps, and MapKit (for mobile apps) and MapKit JS (for the web) are APIs that allow you to build off of it. In addition to standard features like geocoding / reverse geocoding, filterable searches, route planning, and place details, they have a bunch of other capabilities. These include importing your own geospatial data; limiting maps to certain regions and zoom levels; adding custom annotations to places to highlight them and provide more information; and creating panoramic photos to give views at street level. What you get Creative map-building functions like zoom/region limits, annotations, and street views Uses Apple’s proprietary location database Requires annual commitment to Apple Developer Program 15. Build your own API One last option we’ll mention is to create a custom POI data API from scratch. This can have a few advantages for your organization: It doesn’t have to pay or rely on a third party to maintain the system It maintains greater control over the actual data, and what that data can be used for It gets a system that can be built to fit its exact use cases There are some downsides, however. For one thing, it costs a huge amount of money and time upfront to secure the hardware, technical expertise, and partnerships to get the system up and running. By the same token, your organization needs just as many resources to update POI data as it changes, let alone scale up the system infrastructure when operations expand. Then there’s the issue of the quality of the data itself.
Your organization has to have dedicated employees with the skills to properly collect, clean, organize, and merge data so that this data is always as accurate, trustworthy, and current as possible. And even then, there’s still a risk that the data may lead to biased models and conclusions if your company doesn’t critically evaluate its own data collection and processing methods. So it can sometimes be better to get data from a third party so there’s less chance of it being biased towards a particular organization or industry. SafeGraph Places as an Alternative to the Google Places API Those are some options for getting point of interest data for your organization’s geospatial app or widget if Google Places API doesn’t quite fit the bill. However, we at SafeGraph feel that our Places dataset is still the best option. We work with you to negotiate a fair price for our data so you get great value for your money. Our data is also easy to access through CSV files or common data management platforms that your organization may already be using. So there’s no need to repeatedly call an API to get the data. And once you have the data, our flexible licensing terms let you do pretty much whatever you want with it. Most of all, we pride ourselves on the fact that data is all our company is about. So we put all of our effort into creating the best-quality POI datasets possible with broader and more complete coverage, as well as unrivaled data freshness for time-sensitive applications. We even have additional columns with building footprints or transaction data that can be combined with POI information for even greater context. If you’re interested in learning more about what SafeGraph data has to offer, download some sample Places data or contact our sales team. #### Google Places API Pricing: Is It Worth It for Your Business? 
Google has devoted a lot of resources to making geospatial data accessible to everyday people through products like Google Earth, Google Maps, and Google Business Profile. So it’s understandable why some would want to rely on the company’s Places API to get information about locations when building applications. But how much does the Google Places API cost? When you seriously consider that question, it can become evident that Google’s solution may not be the most economical option.

To demonstrate, we’ll break down the various costs associated with using the Google Places API. Along the way, we’ll explain how using SafeGraph’s Places data instead can serve as a better solution – and not just in terms of cost.

- How Google Places API pricing works
- Different types of data SKUs in Google Places API
- How SafeGraph is different (and why we’re better)

To start, we’ll give a general overview of how Google Places API pricing works.

##### How Google Places API pricing works

Google Places API pricing works on a “pay as you go” model. You can get a 90-day or $300-credit free trial of the Google Places API if you’ve never used paid services on Google Cloud or the Google Maps Platform (which includes the Google Places API). In addition, you get $200 worth of free request credits each month.

Beyond that, your organization will have to measure how much it uses the different services (SKUs) of the Google Places API, and create a budget accordingly. Then it will have to pay in advance to put credit on its account for one of three tiers of service. If your company’s usage of Google Places API goes beyond the amount of credit it paid for, it will result in additional charges.

###### Pay-as-you-go pricing model where you pay by SKU

Google’s API for places bases pricing on stock keeping units (SKUs). So it bills for how often its services are used, and bills different rates for different types of services.
Each SKU has three pricing tiers: one for 0 to 100,000 requests per month, another for 100,001 to 500,000 requests per month, and a third for over 500,000 requests per month.

You can calculate how much your company can expect to pay each month by using the Google Places API pricing calculator.

###### How Google Places API Autocomplete sessions work

The autocomplete function can be a bit tricky when it comes to the cost of Google Places API, since it may provide information the user ultimately doesn’t utilize. An autocomplete “session” starts once a user has typed enough characters into the search bar to generate an autocomplete suggestion. Further suggestions can be generated as the user adds or deletes characters. The session ends when the user selects a place they want to get more information about.

Each autocomplete session needs a session token that ties the session’s calls together for billing. Where that comes from depends on which Google API you’re using, as outlined below.

| API | Billing | Does your application need to provide session tokens? |
| --- | --- | --- |
| Google Places API Place Autocomplete | Can be session-based or per request | Yes |
| Google Maps JavaScript API Places Autocomplete | Can be session-based or per request | Yes |
| Google Maps JavaScript API Autocomplete Widget | Session-based billing automatically enabled | No |

##### Different types of data SKUs in Google Places API

The Google Places API classifies the different kinds of data and functions it calls when responding to a Places request.
Depending on the type of data and/or function called, there may be an additional charge, as we’ll outline below.

One or more Data SKUs can be triggered by the following calls:

| Platform | Calls |
| --- | --- |
| Android | fetchPlace(), findCurrentPlace() |
| iOS | fetchPlaceFromPlaceID:, findPlaceLikelihoodsFromCurrentLocationWithPlaceFields: |
| Web service | Each Places request, depending on the fields specified in the request |

###### Basic Data

Basic data contains things like a place’s name, address, building footprint, feature photo, category, or status of being open or permanently closed. This data is requested from the following fields:

| Platform | Fields |
| --- | --- |
| Android | address_component, adr_address, business_status, formatted_address, geometry, icon, icon_mask_base_uri, icon_background_color, name, permanently_closed, photo, type, url, utc_offset, vicinity |
| iOS | GMSPlaceFieldFormattedAddress, GMSPlaceFieldBusinessStatus, GMSPlaceFieldID, GMSPlaceFieldCoordinate, GMSPlaceFieldName, GMSPlaceFieldPhotos, GMSPlaceFieldPlusCode, GMSPlaceFieldTypes, GMSPlaceFieldViewport |
| Web service | address_component, adr_address, business_status, formatted_address, geometry, icon, name, permanently_closed, photo, place_id, plus_code, type, url, utc_offset, vicinity |

Pricing

| Calls per Month | Price per Call | Price per 1000 Calls | Price per Plan Max |
| --- | --- | --- | --- |
| 0 - 100,000 | No additional charge | No additional charge | No additional charge |
| 100,001 - 500,000 | No additional charge | No additional charge | No additional charge |
| 500,001+ | No additional charge | No additional charge | No additional charge |

As you can see, basic data calls have no additional cost on top of the price of the related Place Details request. However, that is not the case for all types of data, as we will soon demonstrate.

###### Contact Data

Contact data has to do with operational information about a business. This includes the hours it’s open, its phone number(s), and its website URL (if it has one).
This type of data is called by the following fields:

| Platform | Fields |
| --- | --- |
| Android | OPENING_HOURS, PHONE_NUMBER, WEBSITE_URI |
| iOS | GMSPlaceFieldOpeningHours, GMSPlaceFieldPhoneNumber, GMSPlaceFieldWebsite |
| Web service | formatted_phone_number, international_phone_number, opening_hours, website |

Pricing

| Calls per Month | Price per Call | Price per 1000 Calls | Price per Plan Max |
| --- | --- | --- | --- |
| 0 - 100,000 | $0.003 | $3 | $300 |
| 100,001 - 500,000 | $0.0024 | $2.40 | $1,200 |
| 500,001+ | Contact sales team | Contact sales team | Contact sales team |

Now you can see how the Google Maps or Places API cost can start to add up. If your organization is on the low volume request tier, contact data can cost up to $300 extra per month, while the middle tier can cost up to an additional $1,200 per month! With SafeGraph, we work with you to determine the right price for the exact columns and rows of data you need.

###### Atmosphere Data

Atmosphere data has to do mainly with user-generated information, such as ratings and reviews of a business. It can also include the relative price rating of that business’s products or services. It’s requested via these fields:

| Platform | Fields |
| --- | --- |
| Android | PRICE_LEVEL, RATING, USER_RATINGS_TOTAL |
| iOS | GMSPlaceFieldPriceLevel, GMSPlaceFieldRating, GMSPlaceFieldUserRatingsTotal |
| Web service | price_level, rating, review, user_ratings_total |

Pricing

| Calls per Month | Price per Call | Price per 1000 Calls | Price per Plan Max |
| --- | --- | --- | --- |
| 0 - 100,000 | $0.005 | $5 | $500 |
| 100,001 - 500,000 | $0.004 | $4 | $2,000 |
| 500,001+ | Contact sales team | Contact sales team | Contact sales team |

This type of social sentiment can be useful in some cases, but it’s very subjective. SafeGraph doesn’t include this type of data because we’re more interested in quality factual information about physical places, not opinions. So by not providing data your organization may not even need, we can save you up to $2,000 or more every month.

###### Autocomplete – Per Request

These calls refer to autocomplete requests that are invalid because a session token is not provided or is reused.
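One way to stay out of per-request billing is to create one token per autocomplete session and reuse it for every keystroke in that session. The sketch below only builds request URLs and makes no network calls. The endpoint path and the `input`, `sessiontoken`, and `key` parameter names are assumptions about the Places API web service and should be checked against Google's current documentation; `AutocompleteSession` and `YOUR_API_KEY` are hypothetical names used for illustration.

```python
import uuid
from urllib.parse import urlencode

# Assumed endpoint for the Places API Place Autocomplete web service.
AUTOCOMPLETE_URL = "https://maps.googleapis.com/maps/api/place/autocomplete/json"


class AutocompleteSession:
    """Hypothetical helper: create one instance per autocomplete session."""

    def __init__(self, api_key: str) -> None:
        self.api_key = api_key
        # A fresh random token for this session (a version 4 UUID here).
        self.token = str(uuid.uuid4())

    def request_url(self, text: str) -> str:
        """Build the URL for one keystroke, reusing this session's token."""
        params = {"input": text, "sessiontoken": self.token, "key": self.api_key}
        return f"{AUTOCOMPLETE_URL}?{urlencode(params)}"


session = AutocompleteSession("YOUR_API_KEY")
first = session.request_url("par")
second = session.request_url("paris")  # same session, so the same token

# Once the user picks a place or abandons the search, start a new session
# instead of reusing the old token.
next_session = AutocompleteSession("YOUR_API_KEY")
```

The design point is simply that the token lives as long as one user search does: a token that is missing, or carried over into the next search, is what pushes requests into the per-request SKU.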
In some cases, they may also be invalid because the user types (or copies and pastes) multiple addresses into the search field.

| Platform | Calls |
| --- | --- |
| Android | findAutocompletePredictions() |
| iOS | findAutocompletePredictionsFromQuery: |
| JavaScript | Anything from the Maps JavaScript API’s Place Autocomplete service, and some actions from the Maps JavaScript API’s Place Autocomplete Widget |
| Web service | Anything from the Places API Place Autocomplete service |

Pricing

| Calls per Month | Price per Call | Price per 1000 Calls | Price per Plan Max |
| --- | --- | --- | --- |
| 0 - 100,000 | $0.00283 | $2.83 | $283+ |
| 100,001 - 500,000 | $0.00227 | $2.27 | $1,135+ |
| 500,001+ | Contact sales team | Contact sales team | Contact sales team |

The basic idea here is that, whatever application you build on the Google Places API (other than the Google Maps JavaScript API Autocomplete Widget), it’s your responsibility to program it so that it supplies session tokens correctly for autocomplete sessions. If it doesn’t, you’ll still be on the hook for a feature of your app that doesn’t work. And even if you do get things set up properly, Google Places API’s cost can still add up with the autocomplete function if the user isn’t careful with how they set up their data requests.

SafeGraph, on the other hand, doesn’t need to try to predict what you’re looking for information on. We just bundle together the data you want and let you do the finding. So you’ll never have to worry about a needless expense that could cost you upwards of $1,100 per month.

###### Autocomplete without Place Details – Per Session

This refers to autocomplete requests where the user ultimately doesn’t select a suggested place to view more information about, and instead chooses another action. Some sample calls look like this:

| Platform | Sample calls |
| --- | --- |
| Android | findAutocompletePredictions() (.setQuery("par"), .setSessionToken(XYZ)); findAutocompletePredictions() (.setQuery("paris"), .setSessionToken(XYZ)) |
| iOS | placesClient?.findAutocompletePredictions(fromQuery: "par" ...; placesClient?.findAutocompletePredictions(fromQuery: "paris" ... |
| Web service | Place Autocomplete Request (input="par", session_token: XYZ); Place Autocomplete Request (input="paris", session_token: XYZ) |

Pricing

| Calls per Month | Price per Call | Price per 1000 Calls | Price per Plan Max |
| --- | --- | --- | --- |
| 0 - 100,000 | $0.017 | $17 | $1,700 |
| 100,001 - 500,000 | $0.0136 | $13.60 | $6,800 |
| 500,001+ | Contact sales team | Contact sales team | Contact sales team |

Even if a user never actually selects a place to view from autocomplete suggestions, you still get charged for using the autocomplete function. Based on the Google Places autocomplete API pricing, that could cost you almost $7,000 per month.

At SafeGraph, we put much more focus into making sure we not only have the data about a place, but also that said data is correct and unambiguous. This cuts down on the repeated API calls that happen when queries about the names or addresses of certain locations don’t return what the user expects.

###### Autocomplete (included with Place Details) – Per Session

These calls refer to autocomplete sessions where the user selects a suggested place to view more details about it.

| Platform | Calls |
| --- | --- |
| Android | fetchPlace() |
| iOS | fetchPlaceFromPlaceID: |
| Web service | A Place Details request |

Pricing

| Calls per Month | Price per Call | Price per 1000 Calls | Price per Plan Max |
| --- | --- | --- | --- |
| 0 - 100,000 | No additional charge | No additional charge | No additional charge |
| 100,001 - 500,000 | No additional charge | No additional charge | No additional charge |
| 500,001+ | No additional charge | No additional charge | No additional charge |

While it looks like autocomplete sessions that end with the user choosing a suggested place won’t cost you anything, that’s not exactly the case. A successful session produces a Place Details request, which does cost money. In addition, the Place Details request will also call data (Basic, Contact, or Atmosphere), based on what fields are specified in the request.
If none are specified, then all data types will be called, and you will be charged accordingly.

Again, this Google Places API autocomplete pricing is something you’ll never have to worry about with SafeGraph. We don’t try to predict what places you’ll want information on; just tell us, and if we have it, we can give it to you.

###### Query Autocomplete – Per Request

The Google Places API can also be programmed to suggest categorical phrases for narrowing a search to specific types of places (e.g. “pizza in New York, NY”). This is in addition to the places themselves. The following requests and services will call this function:

| Platform | Calls |
| --- | --- |
| JavaScript | getQueryPredictions(), the Maps JavaScript API’s SearchBox widget |
| Web service | The Places API Query Autocomplete service |

Pricing

| Calls per Month | Price per Call | Price per 1000 Calls | Price per Plan Max |
| --- | --- | --- | --- |
| 0 - 100,000 | $0.00283 | $2.83 | $283+ |
| 100,001 - 500,000 | $0.00227 | $2.27 | $1,135+ |
| 500,001+ | Contact sales team | Contact sales team | Contact sales team |

###### Place Details

Place details are called when a user selects a specific location that they want more information about (as opposed to a categorical search, as explained in the above section). These requests will do that:

| Platform | Calls |
| --- | --- |
| Android | fetchPlace() |
| iOS | fetchPlaceFromPlaceID: |
| JavaScript | getDetails(), getPlace(), getPlaces() |
| Web service | getPlaceDetails() |

Pricing

| Calls per Month | Price per Call | Price per 1000 Calls | Price per Plan Max |
| --- | --- | --- | --- |
| 0 - 100,000 | $0.017 | $17 | $1,700+ |
| 100,001 - 500,000 | $0.0136 | $13.60 | $6,800+ |
| 500,001+ | Contact sales team | Contact sales team | Contact sales team |

Remember that these requests will also call different data types (Basic, Contact, or Atmosphere) depending on what data fields are specified in the request. If no fields are specified, all types of data on a place will be called, and you will be charged for them.
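To make the compounding concrete, here is a small arithmetic sketch. It uses only the lowest-tier per-call prices quoted in this article's tables (0 to 100,000 calls per month) and assumes every call is billed at that rate; the names `PLACE_DETAILS`, `DATA_SKUS`, `cost_per_call`, and `monthly_cost` are hypothetical, and real invoices may be calculated differently.

```python
from decimal import Decimal

# Lowest-tier (0 - 100,000 calls/month) per-call prices from the tables above.
PLACE_DETAILS = Decimal("0.017")
DATA_SKUS = {
    "basic": Decimal("0"),          # no additional charge
    "contact": Decimal("0.003"),
    "atmosphere": Decimal("0.005"),
}


def cost_per_call(data_types):
    """Base Place Details price plus every data SKU the request triggers."""
    return PLACE_DETAILS + sum((DATA_SKUS[t] for t in data_types), Decimal("0"))


def monthly_cost(calls, data_types):
    """Simplified estimate: all calls billed at the lowest-tier rate."""
    return calls * cost_per_call(data_types)


basic_only = cost_per_call(["basic"])
all_data = cost_per_call(["basic", "contact", "atmosphere"])

print(basic_only, all_data)                                        # 0.017 0.025
print(monthly_cost(100_000, ["basic"]))                            # 1700.000
print(monthly_cost(100_000, ["basic", "contact", "atmosphere"]))   # 2500.000
```

Under these assumptions, leaving the fields unspecified raises the per-call price from $0.017 to $0.025, which for 100,000 calls is $2,500 rather than $1,700.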
So the Google Places API price level could go up to 2.5 cents per Place Details call if your organization is on the minimum plan and repeatedly requests all types of data, which can cost up to $2,500 a month, $800 of it for Contact and Atmosphere data alone!

With SafeGraph, most attribute categories are included in the data by default. So you don’t need to worry about costs compounding like this. Also, we have a more flexible pricing structure based on a negotiation of how much value you’ll get out of our data, rather than based on an arbitrary number of API calls.

###### Place Details – ID Refresh

This is a request to retrieve only a location’s unique Google place ID if it changed for some reason (e.g. a business closed, or a new business moved in). It is done by making a Place Details request and specifying that you only want the place ID field returned, like so:

| Platform | Calls |
| --- | --- |
| Android | fetchPlace(fields: place_id) |
| iOS | fetchPlaceFromPlaceID:, placeFields: place_id |
| Web service | getPlaceDetails(fields: place_id) |

Pricing

| Calls per Month | Price per Call | Price per 1000 Calls | Price per Plan Max |
| --- | --- | --- | --- |
| 0 - 100,000 | No additional charge | No additional charge | No additional charge |
| 100,001 - 500,000 | No additional charge | No additional charge | No additional charge |
| 500,001+ | No additional charge | No additional charge | No additional charge |

This particular function has no extra charge, and Google also recommends refreshing place IDs once every 12 months. With SafeGraph’s data, however, managing unique place IDs is done on our end through location data standards like Placekey. We also update our Places data every month, so your business gets much fresher data without needing to jump through any hoops.

###### Find Place

Find Place requests use text strings in various data fields to search for, identify, and view details on a specific location.
| Platform | Calls |
| --- | --- |
| JavaScript | findPlaceFromQuery(), findPlaceFromPhoneNumber() |
| Web service | https://maps.googleapis.com/maps/api/place/findplacefromtext/output?parameters |

Pricing

| Calls per Month | Price per Call | Price per 1000 Calls | Price per Plan Max |
| --- | --- | --- | --- |
| 0 - 100,000 | $0.017 | $17 | $1,700+ |
| 100,001 - 500,000 | $0.0136 | $13.60 | $6,800+ |
| 500,001+ | Contact sales team | Contact sales team | Contact sales team |

As with a Place Details request, a Find Place request will also call any specified data fields. If any of those data fields are classified as Contact or Atmosphere data, you will be charged for them on top of the base cost of the Find Place request. If no fields are specified, the request will return only the location’s unique place ID. Since this is considered Basic data, it will not cost anything extra.

SafeGraph Places data includes most location attributes and information as part of the package when you purchase it. So you can just search through the dataset at your convenience to find a place you’re looking for, rather than having to pay each time you call this data from an API.

###### Place – Nearby Search

Nearby Search allows you to search for a group of related places near a specified location. You can search by keywords, languages, price level, categories, and whether a place is open or closed. You can also constrain searches to a specific radius, or sort results based on relevance or proximity. Nearby Search is requested by these calls:

| Platform | Calls |
| --- | --- |
| JavaScript | nearbySearch() |
| Web service | https://maps.googleapis.com/maps/api/place/nearbysearch/output?parameters |

Pricing

| Calls per Month | Price per Call | Price per 1000 Calls | Price per Plan Max |
| --- | --- | --- | --- |
| 0 - 100,000 | $0.032 | $32 | $3,200+ |
| 100,001 - 500,000 | $0.0256 | $25.60 | $12,800+ |
| 500,001+ | Contact sales team | Contact sales team | Contact sales team |

###### Place – Text Search

Text Search allows you to search for places based on keywords related to a place’s name, address, or type of establishment.
You can also refine results by language, price level, category, and whether a place is open or closed, and constrain searches to a specific region, location, or radius around a location.

Text Search is requested by these calls:

| Platform | Calls |
| --- | --- |
| JavaScript | textSearch(), getPlaces() [if selecting a query and not a specific place] |
| Web service | https://maps.googleapis.com/maps/api/place/textsearch/output?parameters |

Pricing

| Calls per Month | Price per Call | Price per 1000 Calls | Price per Plan Max |
| --- | --- | --- | --- |
| 0 - 100,000 | $0.032 | $32 | $3,200+ |
| 100,001 - 500,000 | $0.0256 | $25.60 | $12,800+ |
| 500,001+ | Contact sales team | Contact sales team | Contact sales team |

As with Nearby Search, it is not possible to constrain the types of data fields that a Text Search calls. So each request will also incur charges from any Contact or Atmosphere data that is returned, on top of the base cost of the request.

If you use SafeGraph’s Places dataset instead, you don’t need to worry about this. You can find a place by searching over 20 standard attributes’ worth of information, including up-to-date open hours and the market’s most granular category tags. More importantly, the data is all there from the start – you don’t have to repeatedly query an API to get it, which can get pretty expensive.

###### Place Photo

This type of request allows for resizing a photo of a place so that it fits a screen. In other words, it allows you to manipulate a photo referenced by a Place Details request (which can return up to 10 photos at once); or a Find Place, Nearby Search, or Text Search request (each of which can return up to 1 photo).
It’s requested through these calls:

| Platform | Calls |
| --- | --- |
| Android | fetchPhoto() |
| iOS | loadPlacePhoto: |
| JavaScript | PlacePhoto.getUrl() |
| Web service | https://maps.googleapis.com/maps/api/place/photo?parameters |

Pricing

| Calls per Month | Price per Call | Price per 1000 Calls | Price per Plan Max |
| --- | --- | --- | --- |
| 0 - 100,000 | $0.007 | $7 | $700+ |
| 100,001 - 500,000 | $0.0056 | $5.60 | $2,800+ |
| 500,001+ | Contact sales team | Contact sales team | Contact sales team |

##### How SafeGraph is different (and why we’re better)

One of the biggest weaknesses of the Places API pricing Google uses is that it’s inflexible. It forces organizations to play a guessing game with how much their app(s) will be using the API, and choose a rate accordingly. Not only are there penalties for overuse or underuse, but each request can call multiple types of data that your organization might not need for its use cases. And more often than not, you have to pay extra for that potentially wasted data.

SafeGraph doesn’t take this kind of one-size-fits-all, pay-as-you-go approach. We work with your company to identify exactly what data you need, and then negotiate a price to make sure you’re getting value for money. You can also have us send the data to several common data management platforms that your organization may already be using. These include Amazon S3, Databricks Delta Share, and Snowflake, as well as GIS platforms such as CARTO and Esri ArcGIS. Or you can just take the data as a CSV file if that’s more convenient. The point is that all the data you buy is available to access at any time for a flat fee – no nickel-and-diming by having to call it piece by piece from an API.

The other thing that makes us different is that data products are our only products. That means 100% of what we do is delivering geospatial datasets that not only have the highest attribute fill rates, but also are as up-to-date as possible (our standard refresh rate is monthly, while most of our competitors refresh only quarterly or annually).
That goes beyond our Places dataset to information like the polygon-based building footprints of our Geometry dataset, as well as the location-based transactions our Spend dataset reveals.

In contrast, Google is a massive company that has products and services for many different industries. While this isn’t inherently a bad thing, it does limit how many resources Google can spend on any one area, such as optimizing its geospatial databases. For example, freelance food delivery company DoorDash had to do its own research to fill in some of the hyperlocal data that was difficult or impossible to find using Google Maps alone.

Google is a recognizable brand that has become good at doing a lot of different things. However, it can’t be everything to everyone. While its geospatial services work decently for a lot of people and organizations, they’re expensive and don’t always include all the details necessary for certain use cases. If your business is looking for a more complete and affordable POI data solution, come see what SafeGraph Places has to offer, or get in touch with us.

#### Introducing SafeGraph's New Data Maturity Model

The four key stages in becoming a data mature organization.

##### Data is Everywhere, But it’s Certainly Not All Created Equal

At SafeGraph, we pride ourselves on holding data up to an entirely new standard of excellence. Not only because we live and breathe ‘data’ every day and in everything we do, but more so because we know that good, clean, accurate, and comprehensive data can be hard to come by. We also know that good data can propel businesses, governments, non-profit organizations, and research institutions on an upward trajectory, enabling them to uncover unique insights and drive new innovations like never before. When used to its full potential, data can change the world and advance our understanding of society for the better. But our purpose here isn’t to talk up the value of data. You already know that data is valuable.
You wouldn’t be reading this if you thought otherwise. The real trouble is, there’s a ton of data floating around and, unfortunately, many organizations still struggle with incorporating it effectively into their overarching business strategies. This comes down to one thing: data maturity. But how do you know where your organization sits on the data maturity spectrum? Fortunately, you’ve come to the right place. In this guide, we’ll provide a fresh take on the four stages of data maturity and clearly explain how progressing along the data maturity spectrum can have a fundamental impact on your organization’s future. So at a time when even the Economist has boldly claimed that the “world’s most valuable resource is no longer oil, but data,” the big question perpetually remains: How can we use it to drive long-term value?

What sets good data apart? Learn more in SafeGraph’s Data Evaluation Checklist.

##### Key takeaways at a glance

Assessing your organization’s data maturity means understanding what happens at each stage of SafeGraph’s new data maturity model:

- Phase 1: Explorer. Data is primarily used for internal reporting purposes only.
- Phase 2: User. Data-driven insights are used to inform strategic business decisions.
- Phase 3: Leader. Data is leveraged strategically to drive competitive intelligence.
- Phase 4: Innovator. Data informs a continuous evolution of business strategy.

> Data has proven to be a competitive differentiator…. Company performance is highly correlated to data maturity.
>
> William McKnight in The Importance of Data Maturity

##### What is Data Maturity?

Sisense tells us that “data maturity is a measurement of how advanced a company’s data analysis is.” Seems like a reasonable definition, but what does that really mean?
Data maturity is less about the role that data plays within an organization’s day-to-day operations than about how it can enable organizations of all shapes and sizes to do something in the future that they couldn’t have done in the past without using data.

Therefore, when looking at data maturity from this angle, it becomes a question of empowerment: How can data be leveraged in a powerful way to unlock new insights and innovations that can eventually turn ideas into reality? Here’s what we already know to be true. Organizations have more access to data (really good data at that) than ever before. Truth be told, they probably have more than they even know what to do with. Even so, the 59% of companies (and growing) that use data analytics in some capacity every day have likely only scratched the surface in unlocking its real potential. As a starting point, many organizations primarily use data today to run standardized reports and build metrics dashboards. While this is better than nothing at all, limiting data to a purely ‘administrative analytics’ role is a huge disservice, and an insult to the data itself.

Even so, and as the most data-savvy organizations know, data can tell stories. Data can uncover unseen truths. Data can inform, improve, and even challenge decision-making at all levels. And finally, data can overhaul a business’s strategy and fuel its long-term success. All of this is possible, and quite a bit more, as long as organizations know what to do with all this data they have at their fingertips. And while there may be different interpretations of what data maturity is, our approach gets at the heart of why it truly matters. Data can transform organizations in powerful ways. They just need a clear roadmap to get there.
> 49% of companies say data helps them make better decisions, while 16% say it enables key strategic initiatives and 10% say it improves relationships with both customers and business partners.
>
> The Analytics Advantage (Deloitte)

##### The Stages of Data Maturity, According to SafeGraph

For us, data maturity is a journey of exploration, where organizations not only get more acquainted with the data sources they have to work with but also learn how to leverage them in oftentimes surprising, eye-opening, and unexpected ways. In fact, once organizations take the big step from seeing data as merely a source of information and, over time, begin to understand its real potential as an influencer (or even disruptor) of decision-making, an organization’s desire to become more data mature will likely (and immediately) increase tenfold. It’s important to keep in mind, however, that the shift from a data novice organization (what we call the ‘Explorer’ stage) to a sophisticated, data mature organization (what we call the ‘Innovator’ stage) isn’t something that can or will happen overnight. Working your way through the stages of data maturity takes time and patience. We actually see it as a journey. That’s why we decided to reimagine our data maturity model in 2021. Rather than simply create a structure that, for lack of a better way of putting it, puts organizations in a “box,” we wanted to give them a starting point for their future data maturity journey: an easy way to assess where they are today and a path forward for becoming a more data mature organization in the future. Getting there doesn’t always follow a linear path. Becoming a data mature organization takes time, effort, and a lot of work. Our framework can help you get there, as long as you are committed to unleashing the power of data to work harder for your organization.
To get an idea of what our model looks like, here’s a bird’s-eye view of each of the stages:

A snapshot of SafeGraph’s new data maturity model.

###### Phase 1: Explorer

As the name suggests, this is the stage where organizations are barely scratching the surface with the data available to them. In fact, the most defining characteristic of this stage is the lack of consistency in both how data is managed and used across the organization. Explorers tend to lack a central data infrastructure. Any data collection or analysis happening across the organization, therefore, takes place on an individual-by-individual basis. There’s simply no coherent strategy for organization-wide data sharing or data quality. Because they don’t have a centralized data strategy in place, Explorer organizations often rely solely on their own first-party data for simple reporting purposes. They have not yet begun connecting to third-party datasets to help answer critical and in-depth business questions. While these organizations have some work to do to grow along the data maturity spectrum, at least they recognize the value of data and are using it to drive rudimentary insights.

###### Phase 2: User

Becoming a User organization is a fairly big step up within our data maturity model. These organizations understand the importance of data quality and have put measures in place to make that an organization-wide standard. This includes building an internal data architecture that makes it easier to share data across departments, teams, and individuals, including ad hoc datasets from third-party providers that are used to enrich internal data sources.

What truly separates Users from Explorers is how they use data, analytics, and insights to inform decision-making. Even so, the collection and analysis of data is typically reactive, meaning that it is used primarily for measuring results and reporting on performance. User organizations don’t yet leverage data as a foundational element for business strategy planning.
###### Phase 3: Leader

Leader organizations put data at the heart of decision-making and competitive intelligence. To that end, data analysis fuels business strategy and clears a path for achieving organizational goals and maximizing business outcomes. To do this, Leaders often see the joining of third-party datasets to their own data as the differentiator giving them a competitive edge in the market. These organizations have also streamlined and centralized their data infrastructure. Not only do they have systems and standards in place to ensure the highest data quality, helping to build greater confidence around the insights provided, but they also have built an architecture that makes it possible for the entire organization to be data-driven.

###### Phase 4: Innovator

Reaching the proverbial data maturity ‘mountain top’ are the Innovators. These organizations see data as more than just a tool to inform decision-making and maximize outcomes; rather, they embrace it as a catalyst for constant and continuous innovation organization-wide.

Innovators realize that even the best business strategies must ebb and flow over time. They use data proactively, and build predictive algorithms around it, to improve business outcomes and stay one step ahead of the competition at all times. For these organizations, being able to adapt to changes in the market and in society at the drop of a hat is not optional; it is table stakes. Additionally, they regularly seek new ways of joining their own data with third-party datasets (even from what you might consider ‘non-obvious’ data sources) to remain perpetually competitive and seize new opportunities to make a marked impact. This, therefore, implies that data governance is built into all business processes, with a robust architecture in place to be able to share large amounts of high-quality data with speed and efficiency. Innovators are the nirvana of data maturity, something that all organizations should aspire to.
A high level of data maturity is the stage reached when data has woven its way deeply into the fabric of an organization and when data has become incorporated in every decision that an organization makes.Data Maturity (Sisense)4 Fool-Proof Steps for Evaluating Data SourcesIf your organization relies on data, in any way, to fuel business success, competitive intelligence, or future innovations, you must know how to weed out “good” versus “bad” data. Failing to do so could be a massive waste of time and resources. As a starting point, ask yourself the following questions: Does the data come from credible sources? When people don’t like what the data is telling them, the first instinct may be to “cry wolf” and blame it on data quality. So to avoid falling into that rut, be sure to verify the accuracy, quality, and trustworthiness of the data’s original source upfront. What can (and can’t) the data tell you? True, data can do a lot of things, but not every dataset can answer every question on your mind. Be clear about its limitations and work within those parameters. How much scrubbing will be required to clean the data? There’s always a bit of cleaning, sorting, and processing in order before being able to use data, especially if you plan to connect it to other datasets. But knowing that some datasets are inherently “cleaner” than others, set some ground rules around how much time and effort is worth scrubbing a truly messy dataset to make it usable. How will you ultimately use the data? Always have a plan in place about how you’ll put the data to work. But also give yourself some room to be creative—after all, you never know what unique and unexpected insights might arise by joining different datasets together. There’s a lot more where this came from. To learn more about what to keep in mind when assessing data source quality, be sure to check out SafeGraph’s Data Evaluation Checklist.A single dataset on its own has limited value. 
The real value from data comes from connecting it across multiple disparate datasets.

SafeGraph CEO Auren Hoffman and FICO CEO Will Lansing in Why Data Standards Matter

##### Data Maturity is a Journey, Not a Sprint

Data is one of the most valuable assets available to any organization today. Unfortunately, many simply don’t know how to use data to its fullest. If your organization falls into this category, don’t worry: it just means that you are at the start of your own data maturity journey. The good news for you: There are a lot of ways to become a data-mature business. It’s not always a linear path, nor is it going to happen overnight. But when you make the important decision to put data at the heart of your organization (to fuel business strategy, inform decision-making, and uncover competitive intelligence like never before), you are taking the first step in bringing your organization into the data age.

#### OpenStreetMap API Alternatives for More Reliable POI Data

OpenStreetMap – sometimes referred to as “OSM” – has become something like the Wikipedia of mapping: an open platform of geospatial data that anyone can contribute to or utilize for their own purposes.

Cartographers, humanitarians, software engineers, and many more have built datasets to add to OSM’s libraries, or built applications powered by OSM’s data and map-editing capabilities. Even many prominent companies are now using the OpenStreetMap API to get the data fueling their map-based applications.

Despite all this, OSM isn’t the ideal solution in all scenarios where businesses require point of interest data. There are a number of proprietary data companies that excel in areas where OSM falters: data currency, attribute fill rate, documentation support, and ease of data access.
We’ll be discussing all of this in the following sections:

- What is OpenStreetMap?
- 10 OpenStreetMap API Alternatives
- A Better (+ Cheaper) Alternative to API Data

Before we get too far, we’ll expand a bit more on the answer to the question: what is OpenStreetMap?

##### What is OpenStreetMap?

OpenStreetMap (or OSM) is a crowdsourced database of geospatial information about Earth. People contribute and edit information on places based on their local expertise, and others are able to use that data to make their own maps or power other applications (as long as they credit OpenStreetMap).

OpenStreetMap was created in 2004 in the UK by Steve Coast. His impetus for doing so was the lack of openly-available map data (both locally and globally) and the success of Wikipedia as a collaborative knowledge project. Since its inception, OpenStreetMap’s data has been at least partially adopted by a number of prominent companies. This is largely because OpenStreetMap does not charge for licensing its data, and only requires that users credit OpenStreetMap and its contributors wherever they use said data.

##### 10 OpenStreetMap API alternatives

Just because OSM’s API is currently very popular doesn’t mean it’s the best choice for all potential use cases requiring POI data. Like its inspiration, Wikipedia, OSM has some issues that stem from it being a mainly volunteer-run project. Namely, it can suffer from a lack of data completeness or freshness, of supporting explanatory documentation, and of streamlined methods to download OpenStreetMap data.

Though OSM was created to address an overabundance of proprietary POI data providers, some of these services are still worthy alternatives. They cost money, but have more complete and up-to-date data, more flexible data delivery options, more advanced search parameters, and so on. Here are 10 of the most notable ones.

1.
SafeGraph

- Source: SafeGraph
- Pricing: $$
- Free trial: Sample data available
- Best for: Accurate and complete data that’s updated regularly and easy to access

SafeGraph aims to be the authority on factual information about physical locations on Earth. Our flagship Places dataset covers over 30 million points of interest worldwide, with high fill rates for over 20 standard information attributes. Part of why it’s so complete is that we update it monthly – more frequently than most of our competitors – to keep the data fresh.

Another advantage of our data is that we can bundle it together for you and deliver it to a common GIS or data management platform, or as a CSV file. In contrast, it can often take multiple separate OpenStreetMap downloads to aggregate the datasets needed to cover all of the point of interest data your organization needs.

2. Precisely

- Source: Precisely
- Pricing: $$$
- Free trial: Sample data available
- Best for: Interoperable datasets that offer global geospatial data coverage

Precisely is a company that works with data to ensure its accuracy and integrity. To that end, it has a portfolio of global geospatial data that includes information on points of interest and building footprints, all connected through the PreciselyID join key. Precisely’s data is easier to access than an OpenStreetMap download, as it can be acquired in a variety of file formats. However, it’s updated less frequently than our data at SafeGraph (quarterly as opposed to monthly), and it can take more work to clean because many data entries are duplicates.

3. AggData

- Source: AggData
- Pricing: $
- Free trial: Free data available
- Best for: Reasonably-priced data for tracking major brand locations

AggData organizes its POI data primarily by major brands and by geographical area. So if you’re looking for all the 7-Eleven convenience store locations in Mexico, for example, you can purchase just that data.
However, there is also the option to purchase a premium subscription and get access to all of AggData’s datasets at once.

Each dataset on AggData is managed individually, so the freshness of each one can vary greatly; the typical range is anywhere from one month to three years. AggData’s POI data also lacks some useful contextual attributes, such as building footprint geometry, NAICS codes, and hours of operation.

4. ChainXY

- Source: ChainXY
- Pricing: $
- Free trial: No
- Best for: Populating regional maps with major store chain locations

Similar to AggData, ChainXY is a self-serve portal that allows for purchasing data related to major store chain locations in different regions of the world. You can purchase data by brand and region, or get unlimited access with an annual subscription. Either way, it’s not very expensive, and is easier to get than an OpenStreetMap data download because ChainXY offers it in a variety of file formats.

However, ChainXY’s terms of service restrict the use of its data to within your own organization. ChainXY also isn’t great for covering non-branded POIs, and its data is often somewhat stale because it’s only updated quarterly.

5. TomTom

- Source: TomTom
- Pricing: $$$
- Free trial: Yes (with limitations)
- Best for: Advanced search options, such as by proximity, along route, or for EV charging stations

TomTom’s Places API has POI data for over 270 countries and territories around the world, though it counts only about 70 of those regions as having complete data. It has some novel search capabilities, such as geocoding up to 10,000 addresses at once, searching for POIs within a defined geometry on a map, locating POIs close to a specific point, and even finding POIs along a travel route. TomTom’s Places API can also find electric vehicle charging stations, though this function is expensive.

6.
HERE

- Source: HERE Technologies
- Pricing: $$$$
- Free trial: No
- Best for: Powerful search capabilities, including by contact information, category, or brand

HERE’s Geocoding and Search API has a variety of ways to search for and call POI data. Eligible criteria include names, addresses, geographic coordinates, telephone numbers, business categories, brand names, and even types of food. You can also search within the proximity of a location, along a travel route, or within a geometrically-defined area. HERE has data for over 120 million points of interest in over 100 countries and territories, but only has full coverage for about 40 of the latter. The data is also fairly pricey.

7. Foursquare

- Source: Foursquare
- Pricing: $$$$
- Free trial: No
- Best for: User-submitted extra information about POIs

Some people may know Foursquare from when it used to be an app (now called “Foursquare City Guide”) that allowed users to “check in” at points of interest, then submit reviews or other advice for travelers about those places. Foursquare is now a general geospatial data company, but it still includes tips and opinions from its app users in its POI data via its Places API. It has good coverage – over 100 million points of interest in over 200 countries and territories worldwide – but the data is expensive and subject to strict licensing terms.

8. Esri ArcGIS

- Source: Esri
- Pricing: $$$
- Free trial: No
- Best for: Highly-technical search functions for precise data retrieval

ArcGIS is one of the world’s most prominent mapping applications, so it’s little surprise that Esri has built an API into it for finding POI data. As with the OSM Map API, the ArcGIS Geocoding API’s search functions are designed for people who understand the more technical side of geospatial data. The upside is that they provide very granular ways to filter out the data you need.

For example, you can search for data in specific languages or a particular geospatial format.
You can also search for POIs within a point proximity, geometric extent, city, or country. You can even define which specific place attributes to search for, specify whether to look for a place’s rooftop coordinates or street address coordinates, or designate a particular name to look for if an attribute may have multiple different values. The data is reasonably priced, too.

9. LocationIQ

- Source: LocationIQ
- Pricing: $$$
- Free trial: Yes (with limitations)
- Best for: Affordable way to get deeply detailed information on POIs

LocationIQ’s Geocoding API provides many ways to call POI data for less money than some other major POI APIs. These include geocoding and reverse geocoding with options related to language, places with multiple attribute names, and building geometry. It can also find points of interest (of specific categories) around a location, provide information about a location’s time zone, and even measure the distance between two places.

However, LocationIQ limits how much data you can call in a day. It also uses a mix of third-party proprietary data and open-source data – including some from OpenStreetMap – so its data quality can be inconsistent at times.

10. Google Places

- Source: Google
- Pricing: $$$$$
- Free trial: No
- Best for: Global POI data coverage from a well-known brand

We’d be remiss to leave as big a player as Google off this list. Google has put a lot of effort into its geospatial data products (Google Maps, Google Earth, Google My Business, etc.) over the past several years, and its Places API is no exception. Google Places is often touted as having some of the broadest and most complete global POI data coverage of any API.

However, a common complaint about Google Places is that it’s somewhat shallow in terms of the data attributes it tracks. Google also has strict licensing rules for what kinds of applications the data from its Places API can be used in.
Finally, Google drastically increased the price of its Places API in 2018, which has prompted many to seek less expensive alternatives.

##### A better (+ cheaper) alternative to API data

OpenStreetMap’s API allows companies to get free point of interest data that is easy to license for use in many applications. But these reasons for its growing adoption come with hidden costs. OSM’s data is updated in a voluntary, unscheduled way, so its datasets can vary greatly in terms of freshness and completeness. Thus, they are more likely to contain inaccuracies and duplicate entries. This costs your organization extra time, money, and effort in having to clean the data before it’s usable.

Also, although data from OpenStreetMap is free to download, the process for doing so is convoluted and inconsistent. Third-party programs are required for querying and extracting the data; most require some knowledge of computer coding or GIS to use effectively, and they don’t all extract the same level of data attribution. In addition, many datasets on OSM don’t have complete supporting documentation explaining the different data attributes. This makes comprehensive geospatial analysis difficult if using OSM’s data alone.

As an alternative to OpenStreetMap’s API, SafeGraph’s Places data is updated monthly, so you can reliably get fresh and precise POI data. Plus, we do all of the cleaning and de-duplicating work so your organization doesn’t have to. We also offer flexible delivery options for our data; you can get it as a CSV file, or have it sent to one of many popular GIS or data management platforms your organization may already be using (e.g. CARTO, Amazon S3, Snowflake, or Databricks Delta Sharing). And our extensive Places data documentation helps you spend less time figuring out what the data means and more time actually putting it to work.

If you’d like to learn more, download some sample data or get in touch with our sales team.
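
To make the cleaning cost discussed in this post more concrete, here is a minimal Python sketch of two checks an analyst might run on any POI extract: attribute fill rate and likely duplicate entries. The records and field names (`name`, `lat`, `lon`, `category`, `hours`) are hypothetical and do not reflect the actual OpenStreetMap or SafeGraph schemas, and grouping by rounded coordinates is only one simple heuristic assumed here for illustration.

```python
from collections import defaultdict

# Hypothetical POI records; the field names are illustrative only.
POIS = [
    {"name": "Joe's Coffee", "lat": 37.77490, "lon": -122.41940, "category": "cafe", "hours": "7-17"},
    {"name": "joe's coffee ", "lat": 37.77491, "lon": -122.41941, "category": None, "hours": None},
    {"name": "City Gym", "lat": 37.78000, "lon": -122.41000, "category": "gym", "hours": None},
]


def fill_rates(records, fields):
    """Return the share of records with a non-empty value for each field."""
    total = len(records)
    if total == 0:
        return {field: 0.0 for field in fields}
    return {
        field: sum(1 for r in records if r.get(field) not in (None, "")) / total
        for field in fields
    }


def likely_duplicates(records, precision=3):
    """Group record indexes that share a normalized name and rounded coordinates.

    Rounding to 3 decimal places buckets points into cells roughly 100 m wide,
    so near-duplicates that straddle a cell boundary can be missed.
    """
    groups = defaultdict(list)
    for index, record in enumerate(records):
        key = (
            record["name"].strip().lower(),
            round(record["lat"], precision),
            round(record["lon"], precision),
        )
        groups[key].append(index)
    return [indexes for indexes in groups.values() if len(indexes) > 1]


rates = fill_rates(POIS, ["category", "hours"])
duplicates = likely_duplicates(POIS)
```

In this toy extract, two of the three records have a category and only one has hours, and the first two records are flagged as a likely duplicate pair. On a real dataset, low fill rates or many duplicate groups are a rough signal of how much scrubbing will be needed before the data is usable.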
#### POI APIs: Why They’re Used and What to Use as Alternatives

POI (point of interest) data can be quite useful in a number of different business scenarios. These include adding mapping functionality to an application, providing waypoints to assist with routing and navigation software, or conducting an analysis of the business mix in a local area. But not every company has the resources to collect and manage this data on its own, so many choose to access it through an API.

There are pros and cons to this approach, which we’ll briefly look at as an extension of explaining how an API works. We’ll also outline some business use cases for POI data that have spurred companies to make use of a POI API. Finally, we’ll talk about how getting POI data from SafeGraph can give you information that’s more comprehensive and less expensive than what you can call from an API.

Here’s a summary of what’s in store:

- What is a POI API?
- What can POI data be used for?
- How to get high quality, affordable POI data

We’ll lead off with a refresher on what POI data is, and thus what a POI API is used for.

##### What is a POI API?

A POI (“point of interest”) API is a service that allows a mobile or web app to search a database for information on points of interest, then retrieve and display this information for a user. This saves the developer from needing their own database to store, search for, and retrieve this data.

For the sake of context, a point of interest is a geographic location that holds some importance to humans. This could be a building that helps people perform a useful function, such as a restaurant (for eating), an airport (for traveling), a hospital (for receiving medical care), an office building (for working), or a gym (for exercising). It could also be a place that people find aesthetically pleasing or sentimentally significant. This could be a natural phenomenon (e.g. a mountain, canyon, forest, geyser, waterfall, or exceptionally large tree), or it could be a man-made construct (e.g.
a statue, a memorial, or ruins).

A POI API example is this one from IBM used for in-vehicle navigation.

##### What can POI data be used for?

POI data is almost essential for mapping because it provides reference points as to where places are in the world. Think about how people usually don’t navigate to places using exact coordinates or addresses (unless it’s a private residence they’ve never been to before). Instead, they often describe how to get to places by relating them to where other places are.

In this way, POI data is important for giving context to numerous other types of geospatial data. It helps people read maps by indicating places they may be familiar with. It can also give information on what goes on at those places. And, as we mentioned, it helps people navigate by showing where certain places are relative to each other.

The following are some common use cases for POI data.

1. Lay a foundation for consumer-facing mapping applications

An increasing number of consumer-facing mobile and web applications require geospatial data and technology. People need to know where places are and what’s going on there for numerous purposes: navigating in their cars, determining which nearby stores have the products they want in stock, seeing which local restaurants serve the kinds of food they like, or simply finding a place to rest for the night.

Companies like Mapbox rely on having precise and accurate POI data from us so they can build mapping applications for these kinds of use cases. Businesses in logistics, automotive, outdoors, retail, travel, real estate, and more count on Mapbox’s maps having the POI data they need – and that data being correct – so customers can make informed decisions on where they can go and what they can do.

2. Give context to routing and navigation systems

When looking to get from one place to another, it’s helpful to know not only where the departure and destination points are, but also what is there so a person can recognize them.
It can also be useful to point out places and other landmarks along a route that people can use for reference.

That’s why having quality POI data from us is essential for companies like Telenav in building in-vehicle infotainment systems. Telenav doesn’t just need to get people where they need to go quickly, efficiently, and safely. It needs to let them know what’s nearby when they’re in their vehicles so they can find and purchase what they need, when they need it – fuel, food, or anything else.

3. Build retail analytics tools

Retail consulting companies also need reliable POI data to help their client businesses plan and execute their business strategies. They may want to ask questions like these:

- Trade area analysis – What is the proportion of different types of businesses in an area, and is there room for our client’s business to be competitive there?
- Site selection – If our client’s business does decide to set up a location in an area, where exactly should it go? How accessible will it be through various transportation methods? Will it be near places where people tend to congregate anyway? How close will it be to complementary or competing businesses?
- Competitive intelligence – Where are locations for competing businesses? Which neighborhoods do they control market share for? How can our client’s business position its locations and advertising to compete for this market share? Are there complementary businesses nearby that may be able to help our client’s company via partnerships?

It’s important to emphasize that consultants not only need the right data to make these decisions, but they also need it at the right time. The world of commercial real estate can move pretty quickly, and POI data that is out-of-date – or that takes too long to complete, clean, and analyze – can lead to ill-advised choices and missed opportunities.
That’s why Avison Young depends on our data to help its clients make timely site selection decisions that drive economic, social, and environmental progress for their neighborhoods.

4. Develop visit attribution modeling applications

One of the big challenges for businesses with brick and mortar stores is to measure how effective their site selection is, for both stores and advertisements. One way to do this is with visit attribution. This involves comparing the amount of foot traffic in one or more areas against how many people crossed into one or more designated spaces within those areas.

For consulting firms that offer this service, an obvious strategy is to monitor a client’s own store locations to see how many people actually visited. However, they may also want to keep an eye on places where a client has posted out-of-home advertising, to both measure impressions and estimate how many of them are turning into conversions. They may even want to attribute visits to clients’ competitors to assess how the regional industry market share is shifting.

Of course, visit attribution depends on having accurate data of what places are and where they are. Stale data showing rival stores that have long since closed can throw competitive analysis off right from the start. Likewise, if your business is using building footprint data that doesn’t accurately match where the bounds of a building truly are, it can be difficult to confidently count how many unique visits that building gets over a period of time. The latter problem is what caused Olvin to switch to SafeGraph’s data for performing more reliable retail analysis.

5. Add location context to products

Many places are associated with one or more brands. This can be important information for consumers who are loyal to certain brands, including if they need service for a specific product.
It can also be useful for services like Dosh that offer rewards systems for buying from particular groups of brands.

These companies need correct, detailed, and fresh POI data to know which stores are still operational, and to which brands they correspond. This makes it easier for them to onboard merchants to the program, direct consumers to eligible merchants, and avoid having to manually register missed transactions through customer service.

6. Power internal analysis tools

POI data can also be a useful tool for internal business intelligence. Having the right data leads to faster and deeper insights into processes such as site selection, competitive analysis, or trade area assessment. And having the right data infrastructure means that key decision-makers within your company get those insights and start acting on them sooner.

Take, for example, the case of Volta Charging. By using a combination of SafeGraph data and the Snowflake data management platform, Volta was able not only to optimize its EV charging station site selection to balance the needs of motorists and advertisers, but also to streamline data delivery to its analysts, so the company could spend less time on processing work and more time scaling up its scope and capabilities.

##### How to get high-quality, affordable POI data

A POI API is one option for getting place data to use in your company’s applications. The upside of it is that your business doesn’t have to collect, manage, and update this information using its own resources; it can draw from a pre-existing database. But there are drawbacks to this convenience.

The individual HTTP requests that APIs work through usually have fees associated with them. So using an API means getting the data your organization needs piece-by-piece, which can quickly add up to an expensive bill. Furthermore, the data APIs call is still largely controlled by the database creator.
This often results in strict licensing terms that limit the types of use cases the data can be used in. It can also cause a lack of transparency if data is incomplete or inaccurate, and a POI API’s documentation fails to explain why.

In contrast, SafeGraph works with your organization to decide on a flat price for our Places POI data (based on exactly how much data you need) so you get better value for money. You also get all the data you purchase at once, so there’s no need to call it little-by-little through an API. We can also deliver the data to several common GIS or data management platforms, or even as a CSV file. So once your organization has the data, it’s yours, and you can do (almost) anything you want with it. Finally, our extensive documentation explains how our data works, and why there may be some gaps or inconsistencies in it.

If you want to learn more about how our data can be a more complete and affordable solution than using an API to get POI information, contact us today.

#### SafeGraph’s Data Evaluation Checklist

Not All Data is Created Equal. Here are 4 Things to Look for at All Times

It goes without saying that data is virtually everywhere. Unfortunately, a lot of it is bad or unusable. Not every dataset has the same capacity to spark new innovations, drive critical insights for timely and life-changing solutions, and answer tough or complex questions. If you rely on data to do your work well, it’s important to have a fool-proof way to distinguish one data source from another (in essence, the “good” versus the “bad”) to know whether it’s worth your time and investment. Here are a few tips to get you moving in the right direction.

##### 4 Key Steps for Evaluating Data Sources

We’ve discussed at length why maintaining data standards is so critical for businesses today. For starters, some data sources can be messy, inaccurate, or simply hard to join together with other datasets.
This requires data scientists to take the extra step to scrub and clean the data before being able to extract any real value from it. And because accurate data is an essential element in today’s analytics-obsessed world, having objective criteria in place to assess whether a given dataset is fit for use is more important now than ever before.

Then, of course, there’s also the question of cost. Although data is everywhere, good and clean data often comes with a price tag, and for good reason. But there are also times when even subpar data lives behind a paywall. So, it’s important to know what kind of data you’re actually purchasing before going too far down the data “rabbit hole.” Good news for you: If you follow these four steps, you can avoid falling into the bad data trap:

##### Step 1: Determine if the data comes from credible sources

First, verify how dependable and trustworthy the data is based on its original source and be able to defend it should your findings ever come under scrutiny:

- What is the true source of the data? Some data intermediaries take source data and process it to add value. You need to know this, as it could influence your findings or potentially even be used against you should others want to challenge your findings.
- What assumptions are being applied? Datasets are often filtered to fit a set of assumptions, which can alter your results if left unchecked.
- What is the depth, breadth, and cadence of the data? Some datasets are aggregated while others represent individual transactions. Some datasets are captured from a specific moment in time while others span a time period. Some datasets are generated from large panel sizes while others reflect only a small subset. Either way, you need to know this information up front to be able to defend and justify your results.

##### Step 2: Establish what the data can (and can’t) tell you

Next, determine the limits of the data, so you can put it to good use:

- What does the data represent?
Depending on the source, data can shine a spotlight on transactions, consumer intent, foot traffic, or behavioral patterns.
- What observations can the data allow? Data can reveal explicit relationships or implied behavioral patterns just as much as it can leave gaps to be filled by assumptions. But it can’t do everything. You need to be aware of a dataset’s constraints at all times.
- What are the unique characteristics of the data? Some data providers may be the only source for a particular type of dataset. Or they may treat the data in a unique way that renders a dataset more immediately usable or joinable to other datasets. Or the data may serve as a proxy for other data that is not readily available or accessible in the market but can nonetheless be used to infer important insights.

##### Step 3: Assess the genuine usability of the data

Not all data is immediately usable without a little scrubbing. To establish how much cleaning, sorting, or processing may be required to make a dataset useful, ask the following questions:

- How is the data presented? The data may be available via GUIs and charts or packaged up as raw data. This can influence the amount of processing required.
- How easy is it to work with the data? Complex datasets often require an advanced level of expertise to work with, while others are more plug-and-play and can integrate seamlessly with data analytics platforms via APIs, thereby making it easier for even a non-technical user to analyze and glean immediate insights from the data.
- How much additional work is needed to make the data usable? If a lot of extra work is required to expand, clean, or sort a dataset, that could quickly become a roadblock for driving meaningful, actionable, or timely insights.

##### Step 4: Be clear about how you plan to use the data

Finally, you need to define how you intend to put this data to work:

- How many companies, metrics, or regions does the data apply to?
A deep, dependable, and highly accurate dataset can allow you to answer a wider range of questions that could ultimately be applicable to multiple businesses, sectors, and beyond. This is less a limitation than a way of creating new value out of existing data sources.
- Can the data be combined with other datasets? Joining one set of data with another can reveal unexpected, yet incredibly valuable insights that you couldn’t have gained from analyzing those two datasets independently. Unfortunately, depending on how clean or organized those datasets are, joining them can become a feat in and of itself.
- Are there opportunities to get creative? Some datasets can be put to use in ways you might not expect, which oftentimes generate the most immediate and long-term value. Think about it like this: Your imagination is your canvas, while data is your paint; not all “color” combinations will yield the results you anticipate, but when all the pieces fall into the right places, incredible things can happen. You just have to be creative.

##### Don’t Ever Be Duped Into Using Subpar Data Again

Every dataset is inherently different. Some are good while others are just not worth anyone’s time. Before deciding to invest in any data source for your own research or analysis, you must ask yourself the questions above to avoid potentially paying a lot for nothing at all.

Being crystal clear up front about the credibility, usability, and applicability of any given dataset, including its limitations, will help you determine what you can get out of it before spinning too many wheels. This is a fool-proof way to ensure you always get the data you need.

#### SafeGraph’s Guide to Better Location-Based Marketing

Location-based data can give AdTech companies (and marketers) a competitive edge.

##### It’s Time for AdTech Companies to Marry Location-Based Data with Audience-Based Data

There are very few (if any) marketers today who would deny the value of programmatic advertising.
In many ways, it has revolutionized how brands and businesses get closer to their target audiences and make more calculated bets on how to drive and improve conversion. Not to mention, it has helped marketers reduce a significant amount of marketing waste.

Defined broadly as the automated buying and selling of online advertising space, programmatic advertising, for those who might need the refresher, is essentially a digital marketplace where publishers can put up ad space for rent (aka, the “supply-side”) and advertisers can search for ad space that meets broader campaign goals and objectives (aka, the “demand-side”). The magic all happens on ad exchanges, where the smart technologies and algorithms of supply-side platforms (SSP) and demand-side platforms (DSP) come together to bring these advertising transactions to life (among other things).

But more importantly, there’s a massive layer of audience-based data and insights that permeates the entire programmatic advertising marketplace. Simply put, advertisers can now easily “target users based on characteristics like demographics, geography, interests, behavior, etc. without any human intervention” (source). This alone has made it possible for marketers to instantly build highly targeted and relevant digital marketing campaigns that are well-optimized up front to achieve specific performance goals.

More and more, location-based data is revolutionizing the AdTech industry. Even an ecosystem as advanced and data-driven as programmatic advertising can be better, more effective, and more competitive. The many opportunities to improve audience creation, targeting, and location-based marketing continue to grow.
So we thought we’d break down exactly how POI, building footprint, and foot traffic data can be used to create winning AdTech strategies.

In this guide, we will explore how location-based data can effectively become a competitive differentiator for AdTech companies looking to break through the noise in a now very crowded programmatic advertising marketplace. Even more, it can provide marketers with a new level of sophistication around planning, buying, optimizing, and measuring programmatic campaigns, especially with respect to online-to-offline attribution. Want to learn how to use location-based data for store visit attribution? Check out SafeGraph’s Technical Guide to Visit Attribution.

Key takeaways at a glance

We’ll discuss why now is the perfect time for AdTech companies to harness the power of location-based data, including:

- Why location-based data is the key for supercharging audience-based data insights.
- How AdTech companies can help marketers close the online-to-offline attribution gap.
- What must-have location-based datasets to begin using immediately.
- How foot traffic data can be used to build accurate attribution models.

From offline attribution to geo-targeting, location data can help marketers understand the bigger picture of the customer journey.
Location Data 101: A Primer for Marketers (MarketingLand)

Location Really Does Make All the Difference

In the physical world, we place a lot of value on location. Where a brick-and-mortar business is located, for example, can determine its potential to succeed or fail. It’s always a question of how much organic foot traffic it can drive from either people living in its surrounding area or those visiting adjacent businesses. As a variation on a theme, think back to the last time you were at an airport. Were there more people congregating at the restaurants, cafes, and bars in the center of the terminal than at those even a five- or ten-minute walk away?
The moral of the story here is that it pays to be in a great location. We’d like to think the same philosophy applies to ads as well. After all, the more you know about where your target customers live, how they move about within and outside of their local area, what kinds of businesses they frequent, and how much time they spend within those places can add new meaning, value, and understanding to existing audience insights. Armed with those valuable insights, you can not only choose the right context for your ads but also develop messaging or offers that will ultimately resonate with (and convert) your target customers.Detailed POI and building footprint information provides context for consumer behavior.In this way, audience-based data is clearly indispensable for beginning to paint a picture of who your target customer is. It can also provide a clear roadmap for getting your brand or business squarely in front of the people who actually want or need what you’re selling. Location-based data, however, serves a slightly different purpose in this equation. It can amplify those audience insights to provide a real-time snapshot of how your target customers take action on those behaviors, needs, and impulses throughout the day. Together, audience- and location-based data can form the backbone of a powerful and precise marketing strategy that ensures the right ads can reach the right people with the right message at the right time.Bringing these types of data together can provide marketers with even more granular insights into their target audiences than ever before. This is just one of the reasons why we believe that location-based data is both the new frontier and the biggest opportunity for AdTech companies today. “Customer location data has emerged over the last decade as a wealth of information for marketers,” explains MarketingLand’s Taylor Peterson (source). 
It provides a “digital footprint of where customers are spending time and how they interact with brands, both online and offline.”

More than 84% of marketers use location data in their marketing plans, and 94% plan to in the future.
83% Increase in Customers Due to Location-Based Advertising (Martech Advisor)

Remember, not all data is created equal

AdTech is all about quality, accuracy, and precision. It’s important to remember that not every data provider follows the same quality standards as we do here at SafeGraph. Unfortunately, we can’t guarantee the quality of the data you get from other sources or that it will do exactly what you need it to do, with or without serious scrubbing. Even so, by asking yourself these four important questions, you’ll be in a better position to evaluate whether any dataset you get is really worth your while:

- Does the data come from a credible source?
- What can (and can’t) the data tell me?
- Is the data immediately usable or will it need to be cleaned?
- What do I plan to do with the data?

To learn more about how to evaluate third-party datasets, be sure to check out SafeGraph’s Data Evaluation Checklist.

Closing the Online-to-Offline Attribution Gap

It’s one thing to pinpoint relevant audiences through programmatic advertising; it’s another thing to influence decision-making along the customer journey. Peterson goes on to say: “In addition to enabling a more highly targeted ad experience, location intelligence can be used to help close the gap between online and offline purchasing behavior. For businesses with storefronts, real-time location data can help fill in the holes of the customer journey by shedding light on the relationship between an online touchpoint and an in-person transaction.”

The world has transformed into a vast network of interconnected devices. In fact, it has been estimated that by 2025, there will be 75 billion IoT devices in circulation.
This includes everything from mobile phones and wearables to home security systems and voice-activated home assistants to medical devices and connected vehicles—and the list goes on. The key takeaway here: As more of these IoT devices become part of the fabric of day-to-day life, there will be an even greater influx of location-based data to tap into for understanding the very nature of consumers and their daily habits—in a more intimate and hyper-relevant way.Understanding what other stores consumers visit can help determine ad placement and messaging.This is the goal and, suffice it to say, untapped opportunity of today’s marketers. Programmatic advertising has taught us that putting the right message in front of the right consumer in the right digital context has a greater chance of driving an online conversion. With location-based data added to the mix—coupled with location-based in-app (mobile) targeting—it is possible for brands and businesses to capture people’s attention and influence their behaviors in real-time, when they’re simply going about their daily lives. However, doing this well requires having access to highly accurate location-based data. Marketers and advertisers need to understand precisely where they should target consumers, how those consumers interact with those places, and what defining characteristics—including key demographic traits—could potentially impact the effectiveness of a future ad campaign. 
In many ways, by bringing together both the quantitative and the qualitative sides of data, marketers and advertisers now have a plethora of new possibilities at their fingertips for anticipating, shaping, and, ultimately, influencing the online-to-offline customer journey.Marketers believe location-based marketing results in higher sales (90%), customer growth (86%), and higher engagement (84%).83% Increase in Customers Due to Location-Based Advertising (Martech Advisor)‍SafeGraph Places: 2 Types of Location-Based Data for AdvertisingThe SafeGraph Places dataset, updated monthly for utmost accuracy and precision, provides the in-depth POI and building footprint data you need to add extra fuel to your audience-based data. The biggest perk of using SafeGraph Places data: It can help you make well-informed advertising decisions and better anticipate eventual campaign performance.Here are the two types of location-based data that make up this powerful dataset: Points of Interest (POIs) Places includes base information—such as location name, address, category, and brand association—for the places where people spend their time and money. It also sheds light on the relationship that exists between adjacent POIs. POI data is important because it offers marketers and advertisers a unique perspective for understanding the types of places their target audiences visit throughout the day or week. Read the case study here. Building FootprintsGeometry offers building footprints for POIs derived from spatial hierarchy metadata to allow for geofencing as well as a more precise and accurate understanding of attribution. Check out this great overview of the ins and outs of building footprints for geospatial analysis, including highly targeted mobile marketing. 
With accurate building footprints, advertisers can be sure their ads reach the target audience at the right moment along the real-world customer journey.

Learn more about how these datasets work together to paint a clearer and more actionable picture around online-to-offline attribution.

Use device-level data to understand store visit attribution

AdTech firms looking to create best-in-class visit attribution prefer to create their own visit attribution models with raw mobility data. SafeGraph can help with that. If your goal is to use device-level data with our POIs and Geometry data for determining store visit attribution, there are a few important steps to take to ensure you drive the results you’re looking for, with the highest level of accuracy possible:

1. Clean GPS Data: Start by processing and cleaning datasets to account for GPS signal shift, spiking horizontal accuracies, and “jumpy” GPS pings.
2. Cluster GPS Pings Together: The objective here is to take all of the GPS pings collected and turn them into potential visits on a map without relying on the Patterns dataset.
3. Prepare the Clusters: This involves forming a geospatial join between the clusters identified in Step 2 and the polygons in our Geometry dataset in order to create a list of possible places that the clusters could be referencing.
4. Predict the Best Places: Now comes the time to identify the best or most relevant place to associate with each cluster. This part involves a number of interconnected variables, so we find it most useful to leverage machine learning to aid the classification process.

Attribution isn’t easy, but it is possible with the right location data.

Less accurate attribution models to avoid

As we developed Patterns data, we tested a few, shall we say, simpler approaches in an effort to drive the same outcome. If you’re developing your own store visit attribution model, you may be considering some of these methods.
To save you time, here’s a quick rundown of their most noticeable limitations:Closest Centroid Wins: This strategy assumes that, for any given GPS ping, the closest POI centroid is considered a “visit” to that POI if the distance is below a certain threshold. This is the approach we relied on when we first started doing work around store visit attribution, but we quickly found that it was most effective at determining visits for large standalone stores (like the Walmarts of the world). The data can become flawed when trying to replicate this for smaller building footprints, either adjacent to or within larger polygons. Any Ping Inside a Polygon: We also explored the approach of simply identifying a store visit as a sequence of GPS pings all within a single POI polygon. As in the case above, this approach works best for large POIs or outdoor places, like airports and theme parks. However, there are two primary issues we found with this approach: Drifting GPS signals mean that certain pings might not be captured as entering the building polygon or, even worse, be attributed to a neighboring polygon; and device pings may automatically switch back and forth between two different locations, which requires a good amount of clean-up to make the data usable.Any Ping Inside a Custom Geofence: The general idea behind this approach is to assume that any sequence of GPS pings that take place within a “padded” custom geofence constitutes a store visit. While the padding helps correct for some GPS drifting issues noted above, it simultaneously makes the outcome of data analysis less precise by default. As a work around for this, it’s possible to bring other attributes, like time of day or amount of time spent within a geofence, into the mix and let machine learning do the work of identifying the right place associated with a given ping. 
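
To make the first two limitations concrete, here is a minimal sketch in Python that contrasts "closest centroid wins" with a point-in-polygon check. It is an illustration only, not SafeGraph's production method: the POI names, the rectangular footprints, the planar coordinates in meters, and the 75 m threshold are all assumptions made up for this example.

```python
from math import hypot

# Hypothetical POIs in a local planar frame (meters). Names and shapes are
# invented for illustration: a small cafe sits right next to a big store.
POIS = {
    "big_box_store": [(0, 0), (100, 0), (100, 60), (0, 60)],
    "corner_cafe": [(100, 0), (110, 0), (110, 10), (100, 10)],
}


def centroid(polygon):
    """Average of the vertices; adequate for the rectangles used here."""
    xs, ys = zip(*polygon)
    return sum(xs) / len(xs), sum(ys) / len(ys)


def closest_centroid(ping, pois, max_dist_m=75.0):
    """'Closest centroid wins': nearest centroid within a threshold, else None."""
    best_name, best_dist = None, max_dist_m
    for name, polygon in pois.items():
        cx, cy = centroid(polygon)
        dist = hypot(ping[0] - cx, ping[1] - cy)
        if dist <= best_dist:
            best_name, best_dist = name, dist
    return best_name


def point_in_polygon(point, polygon):
    """Ray-casting test: count how many edges a ray to the right crosses."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal line at y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def containing_poi(ping, pois):
    """'Any ping inside a polygon': first footprint containing the ping."""
    for name, polygon in pois.items():
        if point_in_polygon(ping, polygon):
            return name
    return None


# A ping just inside the big store's wall, but nearer to the cafe's centroid.
ping = (98.0, 5.0)
print(closest_centroid(ping, POIS))  # corner_cafe
print(containing_poi(ping, POIS))    # big_box_store
```

The two methods disagree on the same ping, which is the adjacency problem described above. Note that the polygon answer is only as good as the ping itself: a few meters of GPS drift would move this ping into the cafe's footprint, which is why neither simple rule is sufficient on its own.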
Be sure to check out SafeGraph’s Technical Guide to Visit Attribution for step-by-step instructions around our unique attribution approach.

Build Stronger Campaigns with Location-Based Data

Regardless of your approach to visit attribution and AdTech analysis, one thing is clear: The most successful ad campaigns are created with a combination of location- and audience-based data. Not only will combining these two data sources give you a competitive edge over competitors who still rely solely on audience-based data, but it will also help you get closer to cracking the code around online-to-offline attribution (which is every marketer’s main goal!). The question today is not if you should weave location-based data into your analytics mix but rather when you will decide to buck the trend and do so. Location-based data is the future of programmatic advertising. This is your chance to be an “early adopter” before the rest of the industry catches on. And trust us, it’s catching on!

That being said, we understand that you might still be on the fence because you know that working with tons of new data may require a lot of work on your end. And although we’ve done everything in our power to make our datasets as easy as possible for you to use immediately, our team of data experts is always here to help get you over the finish line, so rest assured that we are here to support you throughout your location-data journey. Get in touch with us today.

#### SafeGraph’s Guide to Developing a Winning Retail Strategy with Location Data

4 ways retail businesses can harness the power of location data to make smarter decisions and remain competitive with consumers.

The Brick-and-Mortar Retail Industry is Becoming More Data-Driven Than Ever Before

Retail has always been a competitive space.
And as more consumers now shift their attention to the fast-growing e-commerce space—which accelerated at an even faster clip during the COVID-19 pandemic—brick-and-mortar retail businesses have no choice but to overhaul their end-to-end operations strategies. For one, consumer behaviors are radically different today than they were even five years ago. The pandemic has a lot to do with this shift, but in all fairness, certain consumer shopping behaviors were already evolving well before life got turned upside down. But knowing that, at least as of now, nearly 50% of consumers say they’ll continue shopping online after the worst of the pandemic is over, brick-and-mortar retailers are going to have to go the extra mile to not only win back foot traffic but also provide better and more convenient customer experiences. The good news: 90% of retail transactions still take place offline. And despite the challenges presented by COVID-19, retail is still poised to make a comeback because, let’s face it, people want to be out and about in public again. Which means now is the time for retail businesses to think about what the future looks like and proactively prepare for the next generation of retail. Accurate data will be the key for driving actionable insights around retail’s rebound in the years ahead. Location data, such as points of interest (POIs), building footprints, and foot traffic, can offer both retailers and urban planners in-depth insights for not only going head-to-head with the competition—including e-commerce—but also for developing long-term retail strategies that touch all points along the customer journey.In this guide, we’ll show how location data can give retail businesses a competitive edge in market analysis, site selection, promotional strategy, and store planning. 
Simply knowing how to use this data to your advantage will give you the power to adapt your retail strategies in real-time to ever-evolving consumer and market dynamics.Key takeaways at a glanceRetail businesses can use location data in powerful ways to fuel insights around:Market analysis: Assessing local market dynamics at a deeper level to determine whether launching a storefront in a given retail trade area will drive long-term success. ‍Retail site selection: Pinpointing the best location to build a new retail store, based on demographic alignment and other factors uncovered during the market analysis.Promotional strategy: Identifying how best to target, reach, engage, and attract desired consumer audiences—from within a local trade area—to a specific retail location.Store planning: Determining operational efficiencies for maximizing revenue potential and creating exceptional and more relevant in-store customer experiences.Location Data is Your Secret Weapon for Retail SuccessBuying into the saying, “If you build it, they will come,” isn’t a winning retail strategy. In fact, before you “build it,” there are a flurry of important decisions that need to be made to set your retail business up for success right from the start. Long gone are the days when urban planners could just leverage past experience and loosely quantifiable trends to inform retail site selection decisions. And while there’s always an element of “trusting your gut” involved in launching a successful retail business, location data has now become a retailer’s greatest strategic ally in this endeavor.Let’s take POIs as an example. As a retailer, you want—no, need—to know where and how close you might be positioned to your nearest competitor. Even more, you’d want to know how long that competitor has been there because, if your goal is to steal market share, you’ll need more than just a great location to make that happen. 
Similarly, you should think about what other complementary types of businesses are near your proposed locations, as those, too, can either generate “recycled” foot traffic to your location or hog all of the attention for themselves. In short, understanding what POIs surround your proposed location can provide keen insights into how well your business will potentially thrive within that context. Taking this a step further, you can also use foot traffic data to understand how many people, including who those people are, demographically speaking, regularly visit the other retailers in the area. And then once you’ve established your location, you can use this data in multiple ways: from determining store hours to anticipating the busiest times of the day. Finally, at a more micro-level, you can use building footprints to plan smarter store layouts that maximize revenue-earning opportunities throughout the in-store customer experience. Long story short: There are a number of actionable uses for location data that can empower you to build more holistic retail strategies that have a greater potential to succeed.Only 45% of retailers currently use location analytics to inform retail strategy, even though 74% say it’s important.Location Analytics for Retail (ESRI)Build Better Retail Strategies with Location DataRunning a successful retail business involves a lot more than simply picking the right location—even though that, in and of itself, is a critical factor for success. What you may not realize is that at every stage of a retail business’s life cycle, there is an opportunity to use location data to give it a competitive edge. Here are four specific use cases that demonstrate how location data can help retailers, urban planners, marketers, and merchandisers create better retail experiences that measurably boost the bottom line. 1. 
Market analysisWhile market analysis is a general best practice for starting up any new business, it’s absolutely critical for launching new brick-and-mortar retail store locations. Simply put, market analysis is an objective way to assess whether your business will be well-positioned to address the needs of consumers via the products and services you offer. It’s also a great way to reduce potential risks before investing too much money in making a business dream a reality. According to the U.S. Small Business Administration, a market analysis looks at the following factors that will ultimately determine your retail business’s long-term success: Demand: Do the consumers in your area want or need what your business offers? Market size: How many consumers can your business conceivably target and reach? Is that enough for your business to break even and then drive a profit?Economic indicators: Do the consumers in your area have enough disposable income to buy the products or services your business offers? Location: How far is your business from where your target consumers live and is it easy for them to get to and from your store? Market saturation: What other similar businesses already exist—and for how long—in the same area? What are their relative strengths and weaknesses? Does this area need another similar business? Is the competitive field already too full to break through?Pricing: What are your target consumers willing to pay? If there are already competitors in an area, how does the price of the products and services you offer compare to those of your competitors? If you don’t offer a lower price, what does your business uniquely offer that substantiates the price you’re asking customers to pay? Catchment analysis is a great way to position your business for success within a given marketFor a new retail location to be successful in any market, it must be able to deliver on an unmet consumer need that hasn’t been adequately addressed by other local businesses. 
Location data can help you assess this with greater precision and accuracy. This is called catchment analysis. Retailers, market analysts, and urban planners do this by leveraging geographic information (e.g., number of households or family size in a given area) coupled with key consumer demographic details (e.g., age, income, education, and occupation) to paint a picture of where your business’s customers are most likely to come from. This alone can give you a good idea about what your true market potential looks like. When you combine anonymized mobile foot traffic data with third-party data sources, like those from the U.S. Census Bureau or the U.S. Bureau of Labor Statistics, you can begin to understand consumer behaviors in much more granular detail, from media consumption patterns to how local consumers perceive your nearest competitors. This kind of information can drive unique insights that can take your market analysis to an entirely new level, both shedding light on untapped opportunities and helping you avoid the risk of potential failure.

2. Retail site selection

Whereas market analysis is intended to identify your retail business’s opportunity or potential in a given area, retail site selection is about homing in on the exact spot that will drive the greatest amount of success (and foot traffic) for your business. The rapid evolution of residential communities, especially within metropolitan suburbs, is having a massive impact on the retail landscape as well as on consumer demand within a specific retail trade area. Traditional market research tactics to understand these shifts are too slow to provide actionable insights in real-time. By the time those insights are available, they’re pretty much already obsolete. This is where location data can really give retailers a competitive edge in retail site selection.
It can help you visualize how local market dynamics have changed over an extended period of time, uncovering new opportunities and insights in real-time that wouldn’t have necessarily been apparent through more traditional or one-dimensional data sources. 90% of retail transactions still take place offline.CARTOIn addition to these insights, location data can also answer the following questions: What other POIs are located within the same retail trade area? What Census Block Groups (CBGs) feed into those POIs? How much travel takes place between different neighborhoods? How much foot traffic do those POIs drive? Who are the consumers going to those POIs (in terms of demographics)?Using the Huff Model to identify the retail trade areas of five Whole Foods locations in Los Angeles.Once you’ve decided where to place your business, you can use location data once again to help run your business more efficiently. For starters, it can help you determine: What day of the week a CBG is the busiestWhat time of day a CBG is the busiestWhen people stop in the CBG at peak travel times (breakfast, lunch, dinner)Where people travel from to get to that CBGWhether foot traffic patterns differ on weekends versus during the work weekWhy is this important? By understanding the total market dynamics surrounding your retail location, you can ensure your business is staffed up, stocked up, and ready to meet consumer demand at peak periods. So, in this way, while location data can help determine where your retail business could conceivably have the greatest revenue-driving potential, it can also help you run your business in a way that squarely addresses the needs of your local market area.A note about retail site de-selectionIn the same way that location data can help retail businesses pinpoint the best place to set up shop, it can also help inform retail site de-selection or, in simpler terms, where to close down operations. 
This became a go-to tactic for many retail chains during the COVID-19 pandemic. Faced with steadily decreasing foot traffic caused by various lockdown and social distancing measures, many retail businesses had to place strategic bets on the locations that had the greatest potential to continue driving value for the business as a whole. As a result, this also forced them to make the hard, yet data-driven choice to close down storefronts that were either seen as being unable to rebound in the short term or simply too expensive to keep open.3. Promotional strategy While location data may not always be top-of-mind when developing marketing and advertising campaigns, much of the insights we glean today to target, reach, engage, and convert our ideal customers, via virtually every digital advertising platform, is deeply rooted in location data. This is essentially a logical extension of market analysis. The only difference here is that, as opposed to assessing your market opportunity and potential, you’re now needing to drive foot traffic to your retail business location. When approached in that way, it’s basically two sides of the same coin; how you take action on the data is what separates one outcome from another. Although there are limitless ways to use location data in marketing and advertising, here are a few thought starters to show you what’s truly possible: Create location-based audiences using visits to a specific place, brand, or category.Understand which campaign actually drove in-store foot traffic by mapping attribution (conversions) in a more granular way. 
Personalize in-app experiences, push notifications, exclusive promotions, and spontaneous recommendations based on a user’s proximity to a business.Identify out-of-home (OOH) inventory and targeting based on proximity to POIs.Run “conquesting” campaigns to devices in or near a competitor’s stores.As our world gets increasingly connected through emerging technologies that become mainstream, location data will become much more prevalent as it stems from various new sources.Insider Intelligence + eMarketer (2019)4. Store planning Aside from being a well-known brand or offering consumers products and services that they truly love, it’s important for retail businesses to not only create a positive customer experience but also to ensure that the in-store experience maximizes the potential for revenue generation.According to the location intelligence experts at CARTO, “Location data generated from WiFi networks, beacons, GPS applications, and Apple Indoor Maps makes it easier for retailers to optimize layouts and maximize efficiency, answering pressing business questions related to pricing, opening hours, and operations.” In other words, by understanding foot traffic patterns in stores—beyond simply understanding foot traffic to stores—retail businesses can create better in-store experiences. Similarly, they can use this information to more accurately determine: Store opening and closing hoursPeak or rush hours Staffing needs throughout the dayBest time(s) to restock the shelvesInventory replenishment purchasing cycle Merchandising strategyStore layout and designThe big takeaway here is that location data can be used in a number of practical, functional, and immediately actionable ways that can help a retail business run more efficiently. SafeGraph Places Data for the WinThe SafeGraph Places dataset, updated monthly for utmost accuracy, provides the in-depth POI and building footprint data you need to build winning retail strategies. 
Places includes base information (such as location name, address, category, and brand association) for POIs where people spend their time or money. Geometry offers building footprints for POIs derived from spatial hierarchy metadata. By combining the SafeGraph Places datasets with first-party retail data, retailers, urban planners, and market analysts can visualize catchment analysis, understand market penetration, and create great maps to tell compelling data stories.

Make Better Decisions in Retail with Location Data

Location data is revolutionizing the retail industry for the better. Not only can it help retailers assess market dynamics and identify the best places to set up shop, but it can also enable them to run more targeted and effective promotions while, at the same time, creating better in-store experiences that drive customer loyalty and maximize revenue potential. When looking at it this way, the power of location data to transform retail businesses is truly limitless. However, we understand that analyzing location data might seem intimidating at first, especially if you haven’t used it like this before. But it doesn’t have to be. With the right tools and techniques in place, like those we’ve shared here, location data can give your retail business the competitive edge it needs to be successful in any local market of your choosing. And if you’re not sure quite where to start, our team is always here to help!

#### Store Visit Attribution: Importance, Methods, & Where to Get Data

Understanding if a device visited a place, brand, or type of store can be valuable context to have for your business. Companies use store visit information to build custom audiences for advertising purposes, to better attribute ad campaign spend, and to send contextual push-notifications in real-time. Unfortunately, accurately determining if a device visited a place can be a tough engineering problem to solve.
Messy GPS data, incomplete business listing information, and limitations in knowing exactly where places are located make visit attribution a complex problem. However, building a visit attribution solution remains a worthwhile endeavor since it enables you to enrich digital data with physical-world context. Furthermore, building a visit attribution solution in-house allows you to tune the algorithm to your specific input data and specific use case, which results in a better end solution for your customers.

We've outlined all the essential topics here, including:

- What is store visit attribution?
- Why measuring store visits is important for online-to-offline attribution
- Building footprint polygons vs store centroids for visit attribution
- How to correctly attribute visits to your store
- Store visit attribution case studies to help you get started

If you’ve got the foundational aspects of this down already, we also have a store visit attribution technical whitepaper you can check out. If you’d rather start with the basics, read on!

What is store visit attribution?

Store visit attribution uses GPS location data from mobile phones with POI data to determine if a device visited a place, brand, or type of store. There are two main methods for attributing store visits, but the most accurate way is using precise POI polygons as geofences to truly see which mobile devices passed through a threshold.

The other popular method for store visit attribution is using a centroid radius as the geofence. While this can be easily done with any data point and basic geoprocessing tools, it often contributes to incorrectly attributed visits because a centroid radius is less precise than a building footprint polygon. As a result, GPS pings can be under- or over-counted using this method of visit attribution.

Which store visit attribution method you choose will depend on the level of accuracy you need.
For some organizations, a centroid radius generated from POI data will suffice, while others need the precision provided by building footprint geometry data.

##### Why measuring store visits is so important for online-to-offline attribution

- Build custom audiences - With an accurate view of who is visiting your store (or your competitors’), you can strategically plan inventory and marketing campaigns that align with consumer needs and expectations.
- Better attribute ad campaign spend - Measuring visits to a specific location enables you to target the right people with the right ads at the right time, so you can optimize ad spending for the correct audience.
- Send contextual push notifications in real time - Geofencing is most effective when it’s accurate, so truly understanding when a person enters or leaves a specific place prevents you from targeting the wrong individuals or missing prime marketing opportunities.

##### Building footprint polygons vs store centroids for visit attribution

A store centroid radius represents the distance "as the crow flies" from a building or property's centroid. Centroid radii can be generated using simple geoprocessing tools commonly found in GIS or BI programs. They are ideal for geofencing the general area around a point and conducting proximity analysis, but are not the most accurate or strategic method for attributing exact store visits because they often lead to under- or over-attributing visits.

Building footprints are polygons that denote a structure or property's exact physical boundaries.
They are the most precise method of visit attribution because they represent specific places rather than proximity, so GPS pings can be accurately attributed to exact POIs. Geometry data that contains spatial hierarchy is especially useful for store visit attribution because it includes parent/child relationships (e.g., when a store is inside a strip mall). With spatial hierarchy metadata, building footprint polygons can be used to attribute visits to stores within other locations, making them the most accurate method for visit attribution.

To download precise building footprint polygon data that includes spatial hierarchy information, get in touch.

##### How to correctly attribute visits to your store

Here is a breakdown of SafeGraph’s store visit attribution method. To read about it in more detail, read the technical whitepaper.

###### Step 1. Cleaning GPS data

When dealing with GPS data, there are three prevalent issues that need to be addressed before correctly attributing store visits: GPS signal drift, spiking horizontal accuracies, and jumpy GPS pings. To clean the GPS data, remove any non-stationary data and filter out pings whose horizontal accuracy value is above a tuned threshold. For any two points that are close in time, compute the speed between them, and if the speed is too high, filter out the pings.

###### Step 2. Clustering GPS pings together

Next, try to determine where pings are coming from without using POI data for context. The key insight here is that if you look at a series of GPS pings on a map with no places, you can generally figure out areas that a device could have visited. After cleaning the data, do a first pass over it, creating clusters out of consecutive pings that fall within large POIs. Then do a second pass, creating clusters from the remaining blocks of unused pings using a modified DBSCAN. Finally, save the clusters and discard all unused pings to ensure you are using the most relevant pings for your analysis.

###### Step 3. Preparing the clusters & their possible places

With these clustered pings, you can now begin to analyze the data with geospatial context. Simply perform a geospatial join between the clusters and building footprint polygons. Make sure to add a buffer around each cluster to account for the horizontal accuracy uncertainty of the GPS pings.

###### Step 4. Predicting the best place for a given cluster

The final step in accurate visit attribution is to apply machine learning models that determine the most likely place visited for each ping cluster. This is especially helpful for areas where multiple places are located close together. Models can be developed using logic related to the time of day visits are recorded, as well as what type of business it is. For example, a cluster of pings between a retail store and a bar at 11 pm would likely belong to a visit to the bar, not the store.

##### Store visit attribution case studies to help you get started

Today’s top retailers and advertisers are leveraging geospatial data in their store visit attribution workflows. Here are some examples of ways leading data science teams are thinking about visit attribution.

###### 1. How Billups pioneered a data-driven solution to outdoor media placement and audience measurement problems

The outdoor advertising industry has historically lacked good data upon which to select placements that reach ideal audiences. Without quality data, the industry also struggles to help brands measure campaign effectiveness. With building footprint polygons and anonymous location data derived from mobile phones, Billups was able to complete the customer journey from exposure to in-store visit, transforming OOH into a performance-based medium. When joined with GPS data, the precise geofences increase the accuracy of detecting store visits when compared to using store centroids or geocoded street addresses for the store location.

###### 2. How Media Storm accurately attributed MAID visits with building footprint data

Media Storm found it challenging to determine from geolocation data whether a mobile advertising ID (MAID) had visited a store without accurate data on where stores were precisely located. Without exact building footprints for the stores, Media Storm’s audiences would include irrelevant MAIDs, resulting in wasted ad spend. Using brand information and NAICS codes (categories) for a place, Media Storm was able to quickly identify store locations of its clients and those of its clients’ competitors, as well as the exact building footprints of those places, to more accurately create location-based audiences and reduce inefficient ad spend.

When done correctly, visit attribution is a game changer for retailers and advertisers looking to optimize ad spend and strategically plan campaigns. To learn more about how SafeGraph data can be used for store visit attribution, contact our data experts for a free demo.

#### The Big Wake-Up Call: Re-thinking the Entire Playbook for the Quick Service Restaurant Industry

How the pandemic has fundamentally reshaped the future of the QSR industry

The pandemic hasn’t been kind to the restaurant industry, and the fun’s far from over.

As a preface to what you’re about to read, we promise not to drone on and on about the pandemic. We get it, you’re tired of talking about a topic that has dominated pretty much every part of our lives over the last two years (and counting). But now that the dust is finally settling (or, at least, as one would like to believe), we can’t help but take a step back to see how this global human experience turned many industries and businesses upside down. Some for the better and, unfortunately, some for the worse.
The quick-service restaurant (QSR) industry is a perfect example of an industry that has evolved from the inside out over the course of the pandemic—wherein the pre-pandemic status quo is a far cry from what the future of the industry looks like. And QSR business owners, from ‘mom and pop’ restaurants to multi-national chain restaurants, need to accept that reality head-on.The truth is, there’s no returning to “normal” anymore (whatever that really means). Because the pandemic has ebbed and flowed for such an extended period of time, the QSR industry has had no choice but to adapt to new consumer behaviors—and change course entirely. But what does that actually look like? To get an inside scoop on the evolution of the QSR industry during the pandemic—including the major pivot the industry must make in order to thrive from here on out—we had an in-depth conversation with Mike Lukianoff, data science entrepreneur and QSR industry guru, to map out the new QSR industry playbook. What you’ll learn by reading thisHow the pandemic expedited disruption across the entire QSR industry.How evolving consumer behaviors forced QSRs to build new business models rapidly.Why the future of the QSR industry will rely heavily on having access to the right data.“The restaurant industry has completely transformed over the past two years. The way we analyze it must change, too. What we know is that the old playbook simply isn’t working anymore.” – Mike Lukianoff, Data Science Entrepreneur & QSR Industry GuruPushing the long overdue ‘reset button’ on the QSR industryThe last thing we want to do here is to relive the experience when the pandemic first took hold. So, we’ll keep this part short and sweet. But here’s what happened in a nutshell. Everything shut down. Businesses. Schools. Restaurants. Pretty much any public space imaginable (except for grocery stores and banks). And for those people who could do their jobs remotely, the home became the new office. 
This massive shift in where people were now spending the bulk of their time—aka, at home—essentially turned business districts and commercial centers into ghost towns overnight. The big problem here, however, is that much of the QSR industry was built around people not being at home all the time. Generally speaking, the assumption was that people leave their homes in the morning—to go to work, school, the gym, the doctor, and so on—and spend their hard-earned dollars in the businesses dotted along their daily journeys. For example, people who commuted to work in the morning would typically grab their morning coffee (and pastry, let’s be honest) before heading into the office and then, more likely than not, go to a nearby restaurant to grab food for lunch, whether take-out or dine-in. There’s a good chance that they would run errands at other local businesses, too. And of course, after a long day at the office, the occasional happy hour was certainly well-deserved. So, suffice it to say, a big part of the QSR ecosystem revolved around restaurants being accessible to where people spent most of their time. This is the primary goal of retail site selection and trade area analysis. That being said, the industry was not prepared for the massive, pandemic-induced migration to remote work whatsoever. This is why, according to the National Restaurant Association, as many as 12% of restaurants closed over the course of the pandemic. Interestingly enough, these weren’t necessarily the restaurants that many QSR businesses would have, perhaps, anticipated closing—based on past performance models—prior to the pandemic. “The QSR industry was built on the premise that consumers will leave their homes daily to transact with their local community in some way.” – Mike Lukianoff, Data Science Entrepreneur & QSR Industry GuruThe QSR industry was already at an inflection (ahem, breaking) pointEvery industry goes through its own fits and starts at some point. 
The QSR industry was no exception to this rule. In fact, it could not continue down the ‘status quo’ path without reaching a breaking point. The pandemic merely expedited this, just in slightly unexpected ways. So, what do we mean by “breaking point” here? Generally speaking, the industry was already overbuilt. For example, retail trade areas zoned for restaurants (those primarily catering to daytime commuters) were oversaturated with restaurants well before COVID-19 reared its ugly head, whereas residential areas remained restaurant ghost towns. This created a huge imbalance, making it tough for the few restaurants in residential areas to keep up with growing (pickup and delivery) demand. But truth be told, had the pandemic not happened, it would have been pretty much only a matter of time until the industry “rightsized” itself naturally.

But still, the pandemic created a new dynamic that really had nothing to do with rightsizing and quite a bit more to do with an unplanned shift in consumer behaviors. As lockdowns became the new normal and retail (including restaurant) locations had to cease in-person business operations, consumers looked to digital to keep their lifestyles afloat. The restaurants that were able to make this digital pivot quickly (which just so happened to be the bigger chains with greater tech and financial resources to tap into) were much more successful at keeping the lights on. Additionally, those with drive-thru or home delivery offerings were able to weather the storm as well. As a matter of fact, during this time, digital ordering grew from less than 15% of QSR sales (pre-pandemic) to over 35% since COVID-19 became a household name. And now, it’s projected to reach 50% by 2025!

Unfortunately, on the other end of the spectrum, those smaller brick-and-mortar restaurants, typically those of the ‘mom and pop’ variety, simply couldn’t keep up over the long term. 
True, many were able to take advantage of the initial groundswell of “support for local businesses” that happened when the pandemic first hit. However, as consumers became increasingly fatigued with lockdowns and other security measures droning on, they started to lean into their own self-interest (versus community interest) and began seeking out more convenient options. This swung the pendulum back towards the big chains that were able to leverage technology to streamline the customer experience as a tactic for winning back business.Now, under normal circumstances, the expanding and contracting of the entire QSR ecosystem, especially in highly saturated trade areas, would have been a function of demand dynamics simply doing their thing. Though, when faced with unpredictable demand irregularities that had, at the time, no real end date in sight, traditional momentum models based on long-term trends fell apart. This basically threw the entire industry upside down in one fell swoop.The proverbial ‘center’ of the QSR industry shifted in a big wayTraditionally, the heart of the QSR industry revolved around business or commercial centers—in other words, the places near where people would go to work or run errands every day. But the demand that was once created by daily commuter traffic disappeared almost overnight during this so-called “residential shift.” This threw big cities an unexpected curveball. With people no longer going to the office daily, restaurant owners had to ask themselves some serious questions: Were these locations (in business districts) even needed anymore? How long can we hold out—or buy time—until people start going back to work again? If we stay open, even with the help of tech or delivery offerings, where will our customer base come from? Although some business districts have started to rebound as return-to-work restrictions have finally started to relax across the country, the reality of long-term hybrid work is still very much on the table. 
If the pandemic has taught us anything, it’s that the myth that work can’t get done from home has been completely debunked. This may explain why, in spite of employees slowly returning to the office, occupancy is still hovering at 40% (give or take). Now, here’s the longer-term problem of this shift. With many office-based employees now used to not having to commute to work daily, especially in this frothing “great resignation” job market, employers have had to make hybrid or full-time remote work an available option in order to either retain current employees or attract new talent for open roles. This has essentially caused the center of the QSR industry to officially move away from bustling business districts toward the places where people live. This shift has also created new demand dynamics (and pricing pressures) that never existed before. Simply put, this has challenged the QSR industry’s ability to thrive in the face of a constantly changing consumer landscape.

But this notion of “thriving” varies significantly based on where restaurants are located. While the restaurants in business districts have buckled under the pressure of reduced foot traffic, restaurants in residential areas have had a renaissance that continues to build steam. Neither scenario has necessarily been easy to adjust to.

“The ongoing impact of the pandemic has called into question the future of tens of thousands of QSRs zoned in ‘work/shop’ areas. Fixing this supply-and-demand disequilibrium could last well into 2023.” – Mike Lukianoff, Data Science Entrepreneur & QSR Industry Guru

##### Uncertain times make pricing decisions a moving target

The QSR industry has experienced various bouts of whiplash throughout the pandemic’s peaks and valleys. This has thrown traditional pricing models for a loop, especially in the face of labor shortages, supply chain constraints, and the inflationary environment we’re living in today. 
The question most restaurant owners are asking themselves now is, “How do I manage or maintain my cost structure without the potential risk of losing my customer base?” Answering that question isn’t necessarily easy. While there’s still a lot of great data (we’ll get to that soon) available to inform a restaurant owner’s decision-making, deriving a true sense of price elasticity in this unprecedented environment is much less reliable than it used to be. Though, not impossible. It just requires the QSR industry to build a new price elasticity model. The QSR industry needs to build a new value equation Had consumers not shifted their entire sphere of existence around where they lived during the pandemic, this would likely be a non-issue. Yet, when business districts suddenly had zero demand and residential areas, which were not previously zoned for as many restaurants, went into overdrive, the very nature of zoning had to shift as well. Until that happened, consumers had very few nearby options to choose from and, as a result, were probably more willing to pay a premium in order to get access to more restaurant choices (or at least, those with a larger delivery radius). Though, as demand increased in residential areas, the number of new restaurant choices naturally started to pop up. This essentially forced the QSR industry to build a new value equation for pricing decisions in this new environment.This is why restaurants must now approach price elasticity from both a product and location standpoint—and then develop assumptions around what the price “inflection points” would be based on the spending threshold of the consumers near those trade areas. In many ways, this new value equation is akin to doing a tightrope walk around what increase or decrease in prices would potentially cause consumers to change behaviors in some way (for the good or bad). 
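
As a rough illustration of what a localized price elasticity could look like, here is a minimal Python sketch using the standard arc (midpoint) elasticity formula. The trade-area names and every sales and price figure are invented for illustration; they do not come from any SafeGraph dataset or from the interview quoted in this piece, and a real model would need far more than two observations per location.

```python
def arc_elasticity(q_before, q_after, p_before, p_after):
    """Arc (midpoint) price elasticity of demand.

    Percentage changes are measured against the midpoint of the before and
    after values, so the result is the same whichever direction the price moves.
    """
    pct_q = (q_after - q_before) / ((q_after + q_before) / 2)
    pct_p = (p_after - p_before) / ((p_after + p_before) / 2)
    return pct_q / pct_p


# Hypothetical weekly unit sales of one menu item before and after the same
# price increase (8.00 -> 8.80) in two made-up trade areas.
observations = {
    "residential_higher_income": {"q": (1000, 970), "p": (8.00, 8.80)},
    "residential_lower_income": {"q": (1000, 850), "p": (8.00, 8.80)},
}

elasticities = {
    area: arc_elasticity(*obs["q"], *obs["p"]) for area, obs in observations.items()
}

for area, e in elasticities.items():
    # |e| > 1 means demand fell proportionally more than the price rose.
    label = "elastic" if abs(e) > 1 else "inelastic"
    print(f"{area}: elasticity {e:.2f} ({label})")
```

With these made-up numbers, the same price increase is absorbed in one trade area (elasticity of about -0.32) and backfires in the other (about -1.70), which is the kind of location-specific difference a single chain-wide elasticity estimate would hide.
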
The reason why you have to take both product and location into consideration here—thereby creating a truly localized approach to price elasticity—is because different trade areas could have radically different price inflection points. For instance, if you took into account location alone, you might assume that a trade area with few competitors would be ripe for price increases. But when you layer on the average income levels of the people living in or near that trade area, and perhaps find that those consumers don’t have the disposable income to justify price increases, any price hikes could backfire and cause consumers to consider other options. But then there are the factors of experience and convenience as well. If a restaurant offers a better overall customer experience, including online ordering and low-fee delivery service, along with a high-quality food product, then consumers may be willing to accept reasonable price hikes as long as the positive experience offered by the restaurant remains intact. Clearly, there’s still a lot of guesswork at play here because we haven’t firmly settled into this new industry dynamic. And because historical pricing models, which were deeply rooted in pre-pandemic demand dynamics and consumer behaviors, can no longer serve as a reliable blueprint for future planning, price elasticity will continue to be a moving target until the dust of the post-pandemic aftermath and so-called run-away inflation settles.“Traditional pricing models rely on stable demand trends for price elasticity and other demand metrics to work. Wild variation in demand has made this kind of modeling obsolete in the near-term.” – Mike Lukianoff, Data Science Entrepreneur & QSR Industry GuruLocation data is the key to making sense of it allIt’s clear that we can’t think about trade areas in the same way anymore. 
Although a restaurant’s location is still an important part of the value equation, the remote-work migration to residential areas has created a need for restaurants to be more aware of who their customers are as well as where they are coming from (i.e. Census Block Groups). This enables QSRs to group “like” restaurant locations by demographic cohorts instead of by retail trade area alone.But it doesn’t stop there. Because we are still operating in uncharted waters, it’s important for these businesses to look at competitor pricing (within the same trade area) as well, in order to understand what wiggle room there may be around purchase dynamics. And it’s only when you start tiering restaurant locations in this way that you can begin to make calculated assumptions about how certain locations can manipulate pricing based on various “opportunity” variables. Speaking of variables, there are two kinds that need to be taken into account for modeling: Static variables: These are the things surrounding a restaurant’s location in a trade area that, in some cases, haven’t changed in over 20 years but, as a result of the pandemic, now may look slightly different for the first time in a very long time. QSRs need to reassess what trade areas look like today as well as the new kinds of customers frequenting those locations (as in, where they are coming from, whether it’s from home or from work).This is basically the equivalent of pushing the ‘reset button’ around our understanding of trade areas and how they operate today. To derive these insights, you need access to high quality, granular, and accurate location, polygon, and foot traffic data coupled with up-to-date Census Block Group (CBG) data. 
Combined, these can paint a picture of what any given trade area really looks like today.

Dynamic variables: This is basically anything else that may fluctuate rapidly, especially in the atypical business environment we’re living in, or that has the potential to provide more “texture” around the customers who visit a specific restaurant location. This could include loyalty program-, restaurant-, and marketing-related data as well as virtually any other data sources that can add a nuanced dimension to the static variables above.

In the past, many QSRs relied on momentum revenue models to predict future patterns. But now these, too, need to be questioned, with a specific focus on capturing “error” on a daily basis, whether in revenue, supply chain issues, and so on. Unfortunately, many models fail to incorporate these outside factors into the mix and, thus, have now become somewhat out of touch with real-time consumer (demand) dynamics.

The long story short is actually quite simple: The only way to make sense of this ever-changing environment is to lean into all of the data we have at our fingertips, especially location-based data, to create new awareness and understanding of how the QSR industry can get its footing and thrive (with a greater degree of confidence) in this soon-to-be post-pandemic world.

##### So, what next? A more data-driven QSR industry.

By now, it should be clear that the business model for the QSR industry not only had to change in the face of the pandemic but also must continue to evolve and adapt itself to ever-changing consumer behaviors and demand dynamics. The truth is, prior to the pandemic the business model was already dying a slow death. It needed to be rethought and resuscitated from the inside out. Unfortunately, it took a global ‘resetting’ event to set the wheels in motion. 
Now, under normal circumstances, the forced change that the QSR industry experienced during the pandemic is something that, perhaps, would have taken 20 years to come to fruition. But the industry didn’t have 20 years to pivot and, therefore, had no choice but to embrace compressed transformation (especially of the digital sort) in merely two years. Of course, this hasn’t been easy for anyone involved, consumers included, but nonetheless, it has signaled an opportunity to create a stronger, more resilient, and more adaptable QSR industry of the future.What does that look like? It’s still anyone’s guess because, truth be told, we’re all still connecting the dots as we work through this period of unprecedented change. However, what’s clear is that the industry as a whole is now reinventing itself through the lens of innovation and automation. Not only will this create better—and dare we say, more efficient—customer experiences, but it will also set the foundation for a more sustainable QSR industry. This is the key to being able to weather whatever other headwinds we may experience down the road. But none of this is possible without using high-quality data to inform decision-making. As we explained earlier, location-based data is an absolutely critical part of this equation. However, layering other relevant data sources on top of it is what will allow us to tease out cutting-edge insights at the local level that can fundamentally transform (for the better) how QSRs make decisions around site selection, pricing, product variety, inventory, and more with confidence. While massive change is always a challenge to work through, the opportunity it leaves in its wake is undeniable. This is the QSR industry’s time to shine in the (data-driven) spotlight.About SafeGraphSafeGraph is a data company that builds datasets on the physical world for leading companies like ESRI, Domino’s, Sysco, and Jefferies. 
Our high-precision Places dataset covers business listings and POIs for any brand, anywhere in the world, in addition to building footprints and transaction data.

#### The Ultimate Guide to Alternative Data for Financial Analysis

The finance industry is full of rapid change and heated competition. Analysts need to stay a step ahead of their rivals to provide the latest information and trend breakdowns to clients. At the same time, they need to be careful that their assessments are backed by enough legitimate data to avoid exposing their clients’ investments to undue risk. That’s why many financial analysts are now consulting sources of alternative data to reinforce their research.

But what is alternative data, and why is it useful to financial analysts? This guide will explain by covering where alternative data comes from, what forms it can take, where you can get it, what advantages it has over traditional financial data, and how different types of financial analysts can use it. Here’s a brief overview of the article:

- What is alternative data?
- How is alternative data generated?
- 5 main types of alternative data
- Benefits of using alternative data for analysis
- Alternative data analysis: how it’s used
- Top alternative data providers: what to look for + comparisons

We’ll start off with an alternative data definition so you understand a bit more about what alternative data is.

##### What is alternative data?

Alternative data refers specifically to data that is used in financial analysis, but is generated or collected in non-traditional ways. That is, the company doing the analysis doesn’t create the data itself, nor does it get the data from official sources like press releases or quarterly earnings reports.

So where, then, does alternative data come from? The next section will explain.

##### How is alternative data generated?

In general, alternative data is produced by three types of sources:

Sensors: Sensors typically include things like satellites, mobile beacons, surveillance cameras, and WiFi hotspots. 
They mainly collect geospatial information, such as local weather conditions or imagery of a place from different angles. They can also track things like foot traffic in or around a particular area.Individuals: Non-transactional things that people do every day are some of the most plentiful alternative data sources. Someone posting a comment on a social network, leaving a product review on an e-commerce site, taking a survey, visiting certain websites, or even just downloading and using certain mobile apps produces data that businesses and investors alike can use.Business processes: Company operations can often produce data as a byproduct, which investors and other corporations may be able to take advantage of. For example, credit or debit card companies may track transactions they facilitate for specific businesses. Or a business may send out email or paper receipts when a customer makes a purchase; these can be tracked as data too. A company may also produce public-facing data on a website by advertising their prices, how much they’ve sold, or how many of a particular product they currently have left in stock.Generally speaking, alternative data that comes from sensors or individual actions tends to be inexpensive to acquire. But it also often doesn’t come in very workable formats, requiring a significant amount of processing to be made usable. In contrast, alternative data that comes from corporate actions tends to require little processing and can be mined for insights almost immediately. However, it’s usually more expensive.5 main types of alternative dataWe discussed in the previous section that different sources produce different types of alternative data. While neither those mentioned nor the following list are meant to be exhaustive, here are 5 examples of alternative data categories that SafeGraph deals in.1. TransactionsPeople buy and sell things every day, and data on these transactions is one of the prime sources of alternative data. 
Companies may post some of this information publicly (like on their website or in earnings reports), but typically only do so when they are mandated to by law. However, there are other ways to access this data.As one example, it could be possible to track a company’s sales and other transaction information through receipts they send out via email after a customer completes a purchase. In addition, many transactions these days are carried out through third party companies, such as financial institutions or payment processors (or both). So information on anonymous and aggregated debit card, credit card, and online account transactions may be available for purchase from the companies that facilitate them. This latter method is how we at SafeGraph compile our Spend dataset via permissioned consumer spending data for places.2. Human mobilityHuman mobility data refers to anonymized measurements regarding people’s movements within a limited geographical area over a certain period of time. At base, these include attributes like which specific places people visit, how many people visit, and how long people stay at one place before moving to another one nearby. It can also include places or directions from where people enter the area or to where they go after leaving the area. 3. Point of interestPoint of interest (POI) data refers to general information about non-residential places people may want to visit. Sometimes, these are monuments or other landmarks that attract tourists and other visitors. Often, however, they are places where people can buy or sell products and services. A few common attributes are hours of operation, price range, affiliated brands, product/service classification, street address, and contact information. Our Places dataset contains reliable, accurate POI data that is regularly updated.4. Property detailsProperty details refer to information about a parcel of land or any building(s) on it. 
It can include attributes like ownership and financial details, including leases, mortgages, previous sales, and assessed value. It can also include specifications about particular buildings, such as square footage, number of rooms, HVAC specs, construction materials, and so on.

Another important attribute is the property or building footprint(s) – visual representations of the actual physical dimensions the property and its building(s) take up. This can include spatial hierarchy metadata of buildings that are separate units inside a larger building, such as apartments, mall stores, or offices in a business complex. Have a look at our Geometry dataset to see what we mean.

5. Demographics

Demographics data is aggregated information about people in a neighborhood, city, state, country, or other geographic region. That includes attributes such as age, ethnicity, sex, income, employment status, marital status, and highest level of education achieved. Demographics data is important because it gives a general overview of people in a certain area, in terms of what their lifestyles are like and (therefore) what they are likely to spend money on.

Much of this data is publicly available for free, but it isn’t always easy to access and organize for alternative data analysis. That’s why SafeGraph offers a cleaned-up version of data from the US Census Bureau’s American Community Survey for the years 2016 to 2019.

Benefits of using alternative data for analysis

We still haven’t answered a very important question: why use alternative data at all? Why not just rely on financial data from official sources? As it turns out, there are at least four very good reasons to factor alternative data into financial analysis:

Immediacy: One of the big disadvantages of traditional financial data is that there are often long gaps between one release and the next. So until the next batch of data is released, there’s a greater risk that the current batch may become stale and irrelevant.
In contrast, alternative data is produced daily (or even more often), so using it makes it easier to stay on top of the latest happenings.

Frequency: Another advantage of alternative data being produced more frequently is that it provides a larger sample size with which to make comparisons over time. This allows for greater accuracy in spotting trends, as well as being able to tell when a pattern is actually a trend and not just an anomaly.

Context: Identifying trends is one thing, but understanding why they’re occurring is another. Looking at alternative data such as footfall, logistics, social sentiment, and even weather can help to forecast whether a company’s performance will continue as expected, or will be turning in a different direction in the near future.

Creativity: Analysts using alternative data can examine deals from unique angles, which can lead to some outside-the-box investment strategies. They may be able to spot risks with an investment that aren’t immediately apparent, or they may find potential opportunity in other deals that don’t initially seem to have much upside.

Now that we’ve established advantages to using alternative data analytics in finance, we’ll discuss examples of how investors might use it.

Alternative data analysis: how it’s used

Alternative data can be used for a number of different functions in financial analysis. But how exactly it’s used can depend on whether an analyst is working for a specific client, or for a more public-facing firm that services multiple clients simultaneously. These are commonly referred to as “buy-side” and “sell-side” positions; we’ll briefly explain more about them below.

Buy-side vs. sell-side analysts

Financial analysts typically fall into one of two categories: buy-side and sell-side.
They both use alternative data in similar ways; their main differences are in who they work for and what their roles are.

Buy-side analysts: Their job is to seek out and make recommendations on investment opportunities for a specific client, based on that client’s investment strategy. That means they consult alternative data sources to help hedge funds, private equity firms, and similar companies buy or sell financial assets for maximum returns and minimal risk.

Sell-side analysts: Their job is to provide financial information and services to clients of a brokerage firm or investment bank. So they may consult alternative data sources on behalf of those lending money or making investment decisions, trying to get these people to continue doing business with their institution by giving them sound financial advice.

Now that we’ve established some basic differences between buy-side and sell-side financial analysts, let’s look at how each type of role might use alternative data.

Alternative data use cases for financial investors + more

As we mentioned, alternative data use cases can differ depending on the type of role a financial analyst plays. But there are quite a few use cases that can benefit both buy-side and sell-side investors. Here are eight examples.

1. Monitoring online activity

Role type: Both

A lot of things are being done on the Internet these days, so it makes sense to pay attention to what goes on there. Alternative data such as online transactions, web traffic, and mobile app use can give clues as to what people are interested in. So can comments on social networks or news sites, or online product reviews. Analysts may even be able to gain insights from data on logistics companies, as shopping from home is becoming increasingly common.

2. Deducing industry or brand relationships

Role type: Both

Using alternative data to account for geospatial relationships between businesses can also be a useful financial analysis strategy.
Certain businesses that cater to similar lifestyles, but don’t directly compete, tend to do well when they are close and accessible to each other. So it’s important to take these complementary business relationships into consideration, rather than just competitive ones.

3. Forecasting demand

Role type: Sell-side

Certain kinds of alternative data can help sell-side analysts hypothesize if demand for various types of products or services will rise, fall, or remain steady. Online social sentiment and footfall counts around brick-and-mortar stores can be useful for this, but generally transaction data is a clearer indication. Points of interest data can be helpful, too, if it includes attributes on what kinds of stores are opening or closing in an area, and how many.

4. Research for sourcing deals

Role type: Buy-side

Buy-side analysts can use alternative data to look at how consumers interact with stores and brands from different angles. They can look for things like competing or complementary points of interest in an area, how much foot traffic an area gets at certain times, how many people enter specific stores, and what brands are popular at stores relative to the ones they stock. This research can become even more informative if done across areas with similar geography, or in the same area within comparable time periods.

5. Modeling financial performance

Role type: Sell-side

Measures of consumer demand are just a part of modeling how a potential investment’s financial situation will change in the near future. Alternative data can allow sell-side analysts to factor in other things such as supply chain performance, online vs. offline sales, sales in areas with similar geographic profiles (e.g. points of interest, demographics, and human mobility patterns), and social sentiment. Overall trends in an industry, or in related industries, can also be indicators of how an asset will perform for the foreseeable future.

6. 
Performing due diligence

Role type: Buy-side

Being able to assess potential investments on multiple levels is also important to buy-side analysts when they perform due diligence. They often deal in very large sums of money, which leaves little room for error. So having a greater volume and variety of data with which to examine all of an investment’s potential benefits and risks is a huge boon.

7. Finding competitive edges

Role type: Sell-side

Sell-side financial analysts are often expected to collect, analyze, and distribute reports on financial information quickly to keep up with competing firms. Since alternative data is usually fresher and more frequently produced than traditional financial data, it can be very helpful in this case. To illustrate, an investor may follow the social media accounts of particular companies to stay on top of their announcements and gauge how customers feel about them. Or they could look at POI data to track store openings and closings in a particular geographic area as an alternative measure of a company’s or industry’s performance.

8. Managing portfolios

Role type: Buy-side

Even after investments are made, buy-side analysts need to monitor the market to ensure that assets are performing as expected. This is where being able to get timely and frequent updates from alternative data sources comes in handy. It allows analysts to spot, assess, and take action on unexpected changes quickly before they potentially result in huge losses.

Top alternative data providers: what to look for + comparisons

Alternative data is a pretty broad field, so it should be little surprise that there are all sorts of different alternative data providers out there. But you want the ones that are going to get you the data you need at a price you can afford.
This section will cover things to look for and questions to ask, as well as introduce you to reliable alternative data companies that you can count on for certain types of data.

What to look for from an alternative data provider

Whether or not you choose to buy from particular alternative data vendors should depend on more than just them having the kind(s) of data your organization needs. Here are points to consider to help you avoid compromising between quality and quantity of data.

Scope: Does the supplier provide data in enough breadth or depth that you can make as objective an assessment as possible?

Attribution: How much – and what kinds of – information does the supplier provide about each data point?

Accuracy: Does the data convey precise and correct information?

Freshness: Is the data current enough that it’s still relevant to present circumstances?

Interoperability: How easy is the data to work with, especially in the context of connecting it to other datasets for broader analysis?

Cost: Are you getting value for money by only paying for data that’s relevant to what you intend to use it for?

Other questions you may want to ask about alternative data suppliers you’re thinking of sourcing data from include:

Why should you source from a particular provider if others offer the same kind of data you need?

How is the data they offer relevant to a specific question or problem you have?

If they didn’t produce the data themselves, how have they filtered or otherwise processed it, and could this possibly bias your analysis?

How much time and effort will be required to make the data usable, especially if it hasn’t been organized or filtered?

Has the data been pre-organized to help reveal particular patterns or relationships, and what assumptions might this lead you to make?

We know that can be a lot to process, so we’ll make things a little simpler for you by listing some alternative data firms we trust.

8 best alternative data providers for deep analysis

There are companies out 
there that specialize in collecting and processing different types of data. Their goal is to make it easier for analysts to get the insights they need without having to do a lot of legwork. Here are our top alternative data companies for sourcing the information you need to make financial decisions faster, more accurate, and more creative.

1. SafeGraph

Major data types: POI, property, transactions

Key use cases: retail investment, consumer insights, risk assessment, real estate site selection

SafeGraph is the market leader in global POI data. Our Places and Geometry datasets contain detailed information and building footprints for millions of locations worldwide. Our Spend dataset is the first US consumer transaction dataset that’s based on where people spend money, to give context to when and how they spend it. Use these datasets to analyze the relationship between consumers and retail stores, compare building locations with human traffic to assess accident risk, and more.

2. ClimateCheck

Major data types: US properties and historical weather patterns

Key use cases: real estate investment and risk assessment

ClimateCheck is a valuable resource for those investing or insuring in the US real estate market. It processes historical US weather data through over 25 internationally-recognized climate change models to predict weather and climate patterns for the next 30 years. The resulting data offers snapshots for over 140 million US homes of how vulnerable they are to the potential effects of climate change – droughts, storms, fires, floods, and more.

3. Greenwich.HR

Major data types: financial, labor statistics

Key use cases: workforce analytics, talent acquisition and management

Like the “.HR” in its name implies, Greenwich.HR lets you look at companies’ financial viability from a human resources perspective.
See how many positions are open, what kinds of workers are being sought after, what salary ranges are like (for over 80% of jobs), and more at over 5 million companies from over 200 countries around the world.

4. HARNESS Data

Major data types: UK points of interest, properties, and addresses; PDF document analysis

Key use cases: real estate investment, insurance risk assessment, logistics planning, fraud prevention

HARNESS DATA is one of the best alternative data sources hedge funds investing in the British Isles can consult. It has the most complete information on addresses, properties, and points of interest in the UK. That includes a free price per square meter assessment of over 16 million properties in England and Wales. It also has a software tool that can scan PDF documents for actionable, industry-specific data points.

5. Infutor

Major data types: property, demographics, phone & email communication, automotive & other transactions, addresses

Key use cases: real estate

Infutor is one of the top alternative data providers for insights on US consumers. It offers a variety of alternative data on the US including demographics, address and property information, automotive transactions, online transactions, and email & phone communication metadata. So it’s a good source for getting a general overview of who US consumers are, where they spend money, and what they purchase. It can also be useful for real estate or automotive investment.

6. Transparent

Major data types: vacation rental property

Key use cases: travel & tourism investment, hotel investment, real estate investment

Transparent tracks public data on over 35 million vacation rental property listings across all major rental property booking platforms. Its dataset includes over 50 attributes on each listing including its address, number of bedrooms, occupancy limit, minimum booking period, and price.
Monitoring the supply, demand, and competition in this market is useful if you’re investing in real estate, hotels, or travel & tourism.

7. Veraset

Major data types: property, mobility

Key use cases: visit attribution, consumer insights, site selection

Veraset aggregates its data from multiple sources to deliver accurate footfall measurements around major points of interest in over 150 countries worldwide. It also has a more precise foot traffic dataset for the US that includes building footprints of over 6 million points of interest. This makes it easy to tell if someone actually entered a building or just walked past it.

8. Vertical Knowledge

Major data types: online transactions, rental property, transportation, business summaries, points of interest

Key use cases: real estate investment, corporate research, retail insights, travel & tourism investment

Vertical Knowledge sources public data on the Internet and processes it into a privacy-compliant form. So it has lots of different alternative data examples: lists of best-selling books, air travel statistics, cruise metrics, company summaries & reviews, retail locations & information, short-term rental property details, and more.

Hopefully, this guide has given you an understanding of what alternative data is and why it’s being increasingly used in the financial industry.

#### The Ultimate Guide to Competitive Intelligence Research

In any kind of competition, you’re at an advantage if you can predict how a competitor will act and have a plan ready to counteract their moves. This requires paying attention to various sources of information to look for patterns of behavior, as well as other factors that can influence how participants will act. In the world of business, this information is commonly known as competitive intelligence.

So what is competitive intelligence, and what differentiates it from companies outright spying on each other? How does one get the information needed lawfully?
And how can your company turn this information into insights that give you an edge over the competition?

No need to worry. When gathered within acceptable bounds, competitive intelligence is not just perfectly legal; it can also be a huge boost to the decision-making capabilities of several of your company’s departments. We’ll explain the basics of what you need to know in the following sections:

What is competitive intelligence?

Benefits of using competitive intelligence

Sources of competitive intelligence: where to get it

7 competitive intelligence analysis techniques

How competitive intelligence works: how to perform competitive intelligence research

Competitive intelligence examples to learn from

Let’s start off with a competitive intelligence definition to clarify what it is and what it is not.

What is competitive intelligence?

Competitive intelligence is the collection and analysis, by a company, of openly-available data on their competitors, which is then used to develop business strategies that outperform them. Such data can include press releases, advertisements, web content, patent filings, and so on.

There is some debate over whether competitive intelligence constitutes a form of corporate spying. We will address these concerns below.

Competitive intelligence vs. industrial espionage

Competitive intelligence only uses information gained through legally acceptable processes. Industrial espionage, however, involves stealing information that a company would otherwise be allowed by law to keep hidden. Examples include trade secrets or insider information on the company’s operations.

The difference between the two hinges largely on whether the information in question has its confidentiality protected by law, as well as whether competitors obtain it through legal and ethical means. For example, a public company is typically required to publish a quarterly earnings report. Using information found in that report is fair game.
On the other hand, procuring information illegally (bribery, blackmail, privacy-violating surveillance, physical/digital theft, etc.) crosses into industrial espionage, because the confidentiality of that information is protected by law.

Benefits of using competitive intelligence

So what are the goals of competitive intelligence? Why pay attention to what your rivals are doing instead of – quite literally – minding your own business? As it turns out, your opponents can be some of your greatest teachers. This is because they’re by-and-large after the same thing: doing what your company does, but better in one or more ways.

The purpose of competitive intelligence can be found in these and other benefits:

Smarter sales tactics: If your sales team has a clear picture of how your company’s products and services stack up against competitors, they’ll be better able to address concerns from prospective clients and steer conversations to focus on your strengths.

Personalized campaign marketing: Your competitors are your competitors because they’re trying to sell similar products and services to similar types of customers. Understanding how they’re succeeding or failing in reaching audiences can give you ideas on how to position your own marketing to target specific audiences.

Better product development: Reverse-engineering a competitor’s product can help your own teams understand how it’s packaged and priced (for example), which can give teams a better starting point when designing products for your company.

Risk mitigation: Why make the same mistake that a competitor already made themselves? By looking at what worked and what didn’t from things your competitors have already tried, you’ll minimize the chance that you’ll implement something that either doesn’t work or that your competitors have improved upon anyway.

As great as this all is, we’re guessing that there’s still a rather large elephant in the room: where do you get meaningful information on what your competitors are doing?
We’ll discuss that in the very next section.

Sources of competitive intelligence: where to get it

One of the key parts of any good competitive intelligence plan is knowing where you can find the data you need without getting into legal or ethical trouble. To err on the side of caution, these are some safe and commonly-used sources:

Open location information: If you’re in an industry with brick-and-mortar stores, looking at data on competitor locations can be informative. How accessible are they by foot or local transit? How much foot traffic does the area get? How much space do the stores need? What are their operating hours? Geospatial data like the kinds SafeGraph provides can give clues as to what might work (or not) with your own stores.

Websites: Visiting competitors’ websites is a common way to get information on them. You can find clues as to what audiences they’re targeting, how they’re positioning themselves in their marketing, the products they’re selling, their price points, and other updates to their operations.

News and press releases: Pay attention to news publications, especially if they specialize in business news. They often contain announcements by competitors on their activity, such as new products they’re launching, new people they’ve hired, or other expansion moves. Competitors may even showcase these updates on their own websites (under a “News” or “Press” section).

Social networks: News feeds on social media are also potential sources of intel on competing businesses. They’re often updated frequently, so they can be good places to look for current news about products or services your competitors have in the works. Feeds that allow comments also let you see how customers are interacting with your competitors, including what they like and dislike about them.

Job boards: Another way to get hints on what competing companies are up to is by looking at what new talent they’re trying to bring on.
By searching job listing websites to see what positions competitors are hiring for, you might be able to tell if they’re expanding their businesses in new directions.

Industry conferences: Trade shows and conferences are some other good opportunities to gather intelligence on your competitors. They’ll often be showcasing or speaking about their latest ideas for products, services, or business directions. You can also talk to participants and attendees to get their perspectives on what’s being presented.

Marketing materials: Competitors’ advertisements can also be a source of information. What are their messages, and how are those messages communicated? How are the visuals (if present) designed? What products, services, or features are they focusing on? What audiences do they appear to be appealing to? Questions like these can help you figure out your competitors’ marketing strategies.

Financial statements: Competitors whose shares trade publicly on a stock exchange are required by law to release earnings reports periodically. You can look at these as indicators of whether or not your competitors’ business strategies are working.

As you can see, there are many legally-acceptable options for finding out what competing companies are up to. And again, even data regarding their store locations and the surrounding areas can provide valuable insights when put in the proper context.

Okay, so now that you know where to get the data for competitive intelligence, how do you actually put it to work? The next section will cover some competitive intelligence analysis techniques that you can use to get the most out of the data you’ve found.

7 competitive intelligence analysis techniques

Having raw data on what your competitors are doing doesn’t mean a whole lot on its own. You need to be able to organize it into a framework that allows your company to see patterns and themes that it can leverage to get ahead.
With that in mind, here are a few analysis models that are commonly used for competitive intelligence.

1. SWOT / TOWS analysis

SWOT stands for strengths, weaknesses, opportunities, and threats. It involves looking at your business through four lenses: what your company does well or uniquely compared to your competitors; areas in which you could improve or your competitors are doing better than you; chances for positive things to happen to your company; and things that may negatively impact your business. Based on this information, you can do a TOWS analysis. This reverses the focus by asking how your company can seize opportunities and avoid/lessen threats by taking advantage of its strengths and shoring up its weaknesses.

2. Porter’s Four Corners

Michael E. Porter is a famed Harvard Business School professor who has developed several competitive intelligence methods. The goal of this one is to use competitors’ motivations and actions to predict their future behavior.

On the motivations side, you have to consider two factors: drivers and management assumptions. Drivers represent your competitors’ goals, strategies, corporate cultures, leadership backgrounds, and values/missions. Management assumptions are what you think your competitors believe about their strengths and weaknesses, as well as their ability to take advantage of opportunities and deal with threats. Ask yourself what assumptions they may be making about these things, as well as about their overall involvement in the industry.

There are two factors to consider on the actions side as well: strategy and capabilities. Strategy refers to how well your competitors’ actions are aligning with their stated goals. Capabilities represent the strengths, partnerships, and other resources that allow your competitors to execute their strategies or respond to threats. Weaknesses or incorrect assumptions here may provide opportunities for your own company.

3. 
Porter’s Five Forces

Another competitive intelligence framework created by Michael E. Porter, Porter’s Five Forces analysis, helps to gauge how competitive and profitable an overall industry is. The five forces are:

How many companies in the industry are directly competing with each other

How easily a new company can break into the industry and become a competitor

How many companies supply the industry, and how costly it is to switch between them

How many customers a company has, and how costly it would be to find new ones

How easily a company’s products or services could be replaced by alternatives

Generally, industries that have fewer directly-competing companies, higher barriers to entry, more suppliers, more customers, and fewer alternatives tend to be the least competitive and the most profitable. The inverse is also generally true. However, there are a few caveats to this, so we recommend using this model in combination with other analysis methods.

4. Value chain analysis

Value chain analysis is a third competitive analysis method from Michael E. Porter. It involves looking at the cost (money, time, and human resources) of each activity involved in creating and delivering a product or service, and then comparing that against the value that customers get out of said product or service.
It commonly divides the process into five types of primary activities and four types of support activities:

Inbound logistics (primary) – managing supplies of raw materials, unsold products, and other necessary equipment or items

Operations (primary) – converting raw materials into finished products

Outbound logistics (primary) – delivering finished products to customers

Marketing (primary) – creating awareness of a company’s products and services, especially to target audiences

Service (primary) – providing customer support and product maintenance

Procurement (support) – sourcing raw materials

R & D (support) – inventing manufacturing techniques and ways to automate processes

HR management (support) – hiring, training, and retaining employees who fit the company’s business strategy

Infrastructure (support) – managing the composition of a company’s operational systems and leadership teams

Consider your company’s value proposition – what sets you apart from your competitors – and how you can fine-tune these activities to best fulfill it. Also, think about how your competitors’ value chains may be configured to support their individual business strategies.

5. BCG growth-share matrix

The BCG growth-share matrix is a business decision-making tool developed by the Boston Consulting Group. It weighs the success of a company’s products and services against the competitiveness of the overall market. (In BCG’s own terms, the two axes are relative market share and market growth rate.) Assets are grouped into four colloquial categories: “dogs” (low success, low competition), “cash cows” (high success, low competition), “stars” (high success, high competition), and “question marks” (low success, high competition).

Generally, businesses want to focus mainly on “cash cows” and “stars”. The former are reliable sources of profit because they are not only very successful, but also don’t have many alternatives.
Money from them should be invested in the latter, which are also highly successful but require lots of resources to make them stand out from a large number of other competing products and services. The potential benefit, though, is that a “star” may become a “cash cow” if it remains a market leader as competition dwindles.

“Dogs” are assets that aren’t doing well in a market where competitors have already staked out most of the market share. They may be able to succeed if given different strategies, but usually are best abandoned. “Question marks” are assets that may quickly become profitable in a highly-competitive market, but will take a lot of resources to undercut established competitors. This makes “question marks”, along with “stars”, the best assets to concentrate competitive intelligence on to see if they are worth sustaining.

6. Scenario analysis

Scenario analysis involves estimating how a company’s financial standing may change if key factors change or critical events happen (or don’t) over a specific period of time. It is often used as a risk management technique in response to unfavorable events, in order to conceptualize and avoid a theoretical worst-case scenario.

Scenario analysis works in four steps. First, it identifies key events that could affect a company’s financial standing within the designated time period. Second, it hypothesizes how likely each of those events is to happen, either independently or based on another event happening. Third, it estimates how much impact each event would have on a company if certain other events do or do not happen. Finally, putting it all together, this type of analysis uses math and statistics principles to theorize how these various scenarios could play out.
It often does so through computer simulations in order to process many of these outcomes quickly.

The important thing to remember, though, is that these outcomes are only as accurate as your assumptions on what factors are important (and to what degree), and the data you use to support those assumptions. That’s why it’s important to have thorough business intelligence (or competitive intelligence, if using this model on competitors to predict their behavior) and minimize bias when performing this type of analysis. Otherwise, your conclusions could end up being wildly inaccurate.

7. PEST analysis

PEST stands for political, economic, social, and technological. It’s an analysis framework that looks at major external factors that can affect a company’s competitiveness in the marketplace. There are also variations of PEST that cover additional factors, such as SLEPT (which adds a legal dimension), PESTLE (which also adds the environmental/ecological dimension), and STEEPLE (which additionally brings in ethics).

Political factors that might affect a company’s competitiveness include a country or region’s legislative changes to things like corporate tax and employment standards. The state of international trade relations can also greatly influence a company’s competitive power. Economic factors that can affect competition, meanwhile, include national interest rates and currency exchange rates; supply and demand for specific products and services; and a region’s level of economic growth (or lack thereof, including inflation).

Social influences on a company’s competitiveness can include demographics, lifestyle trends, and changing cultural attitudes. The importance of these factors may depend on how broad or specific a company’s target audience is. Finally, technology can influence market competition based on scientific and technological developments within a particular industry, or within society at large.
Additionally, this can be affected by the degree of national government investment.You can use PEST analysis to think about how outside disruptions and shifts may affect your business – and its competitors. This technique is especially effective when combined with methods like SWOT analysis, which lets you compare how prepared your company is to take advantage of opportunities or manage threats next to the competition.How competitive intelligence works: how to perform competitive intelligence researchSo what is the competitive intelligence research process? How do you collect data on your competitors, and formulate winning business strategies from it, without breaking the law? This section will put everything we’ve talked about so far together and show you how to build a competitive intelligence report from start to finish.Step 1: Identify both direct and indirect competitorsStart by doing some basic market research for your competitive intelligence. Look at the products and services your company offers, as well as what demographics your target market consists of. Then look for nearby businesses that sell similar products and target the same demographics. These are your primary competitors.It’s also good to identify your secondary and tertiary competitors. Secondary competitors are businesses that offer some of the same products or services you do, but tend to offer variations that attract customers in different demographics. Examples include luxury brands for affluent shoppers, or low-cost substitutes for those on tight budgets. You can use these types of companies to gauge where your company’s niche should be.Tertiary competitors are businesses that don’t sell the same products or services as your company, but still attract the same types of customers. You might be able to look to them for potential partnerships, or at least as inspiration for how to outmaneuver your rivals. 
However, you should also keep an eye on them because if they expand their product offerings, they could turn into secondary or even primary competitors.Our Places data and open US census data may be able to help you get started with this sort of analysis.Step 2: Determine what data you want to collect, and for what purposeDepending on what you (or the company shareholders) want to accomplish with your competitive intelligence research, the type of data you’ll need on your competitors might be different. For example, in many cases, marketing research plays a critical role in developing competitive intelligence. You might want to look at which products or services your competitors are really pushing, and why (maybe great customer feedback and testimonials). You might also want to see if you can find things like products or services that your competitors used to provide, but don’t anymore, and try to figure out why they discontinued them.The point is to specifically define what you’re looking for so you can narrow down where to look for it in the next step.Step 3: Find sources of data, and collect itOnce you’ve decided what information you want to find out about your competitors, you have to go find it. Financial data and marketing content are good places to start, but there are many other alternative sources of data you can consult. For example, SafeGraph data can give context to geospatial insights on competitor locations.Competitors’ websites and blogs show what they’re offering, advertising, or otherwise writing about. Similarly, social media feeds hold plenty of public information regarding feedback from your competitors’ audiences about their products and services. Be sure to check your company’s own social feeds to see if people are providing feedback for your own offerings.The actual data sources your company consults may vary, of course. 
Use your specific objectives from step 2 as a guide so you aren’t wasting time looking in places that don’t have relevant data.Step 4: Analyze the data you’ve found for strengths and weaknessesWhen you think you have enough information on your competitors to start drawing conclusions, it’s time to get analyzing. One thing that is helpful here is to build a competitive intelligence model (or two, or three, or more) based on your objectives from step 2. This allows you to organize the data you’ve found based on the metrics that are most relevant to your business.For example, you may want to do a comparison of which products or services your company and its competitors have in common, and which ones you do not. Or, you may want to compare keywords and specific messaging across marketing materials. You may even want to look at what the most common compliments and complaints are across customer reviews. You can use our examples of competitive intelligence techniques as starting points as well.Again, how you build your model (or models) depends on what your company is trying to achieve with its competitive intelligence analysis. The point, however, is that you don’t need to limit yourself to looking at the problem in just one specific way. If you’re finding it difficult to extract insights from a particular model, don’t be afraid to try a different approach.Step 5: Transform your conclusions into action itemsThe last part of the process is convincing the appropriate stakeholders in your company to take action and make decisions based on the observations you’ve made. Think of this step as your chance to be a storyteller: you’re trying to communicate not just what conclusions you drew from your data analysis, but also why they matter for a particular department or the company as a whole.How to do this effectively can differ depending on whom you’re presenting to. So if you’re not sure, ask them about the kinds of things they’re looking for. 
To give an example, a sales team might prefer battlecards: short summaries of key points on offerings, features, and pricing that show how your company and its products or services stack up against one or more competitors. These are useful when your salespeople only have a very brief time to impress upon potential clients why your company, product, or service is superior to the competition.Step 6: Rinse and repeat regularlyCompetitive intelligence shouldn’t be a one-time or even an every-now-and-then thing. Set a regular schedule – ideally, at least once per week – for when the company (or at least specific departments) should expect to receive competitive intelligence data. Also be sure to include historical data, insights, and trends in your reports if they’re applicable.This helps your company avoid three key problems. First, market conditions can change quickly, so gathering competitive intelligence irregularly can cause you to overlook events that present critical opportunities or threats. Second, presenting competitive intelligence reports irregularly means that the people in your company who need them don’t know when to expect them. This increases the risk that they’ll act rashly on the information. Third, without regular updates to look back on, stakeholders won’t have the necessary context to make decisions based on observed trends and other factors.Competitive intelligence examples to learn fromSo what do market research and competitive intelligence look like in practice? We’ve explained where to get the data, how to analyze it, and how to present it in order to inspire action from the appropriate stakeholders. Now, we’re going to show off a few competitive intelligence examples so you can see what the finished data product looks like.Retail cross-promotion opportunitiesThis measures foot traffic trends around retail locations in the US from March to April of 2020. 
That includes the number of visits, average dwell time, and average distance traveled to reach a location. These metrics are also sorted by industry.The purpose of this is to show how patterns in foot traffic to retailers – including those in specific industries – changed over the course of a month in response to shelter-in-place orders being issued to combat the spread of COVID-19. This could be used as competitive intelligence to see which businesses thrived or struggled through COVID-19 restrictions and make decisions on cross-promotion opportunities accordingly.Validating store counts for brands against company reportingA discussion of how to properly attribute store opening and closure metadata for branded points of interest, then compare it against official company reporting. It tackles issues such as limiting analysis to specific countries, counting child brands as separate from their parent brands, and differentiating between stores that are closed temporarily (due to renovations, health protocols, being outside operating hours, etc.) and stores that have been permanently closed.It’s a primer on how to use POI data as a form of competitive intelligence to gauge competitors’ business strategies. That is, you can compare a competitor’s reporting on their number of store openings and closures versus a geospatial measurement of those metrics to see if their strategy really is what it appears to be, and to what extent it’s working (or not).Mobile network footprintsA story map examining the market share of the three biggest mobile network providers in the US – Verizon, AT&T, and T-Mobile – throughout 2021. The map illustrates which carrier was dominant in each census block group, and also plots the retail store locations throughout the US for each carrier. 
Furthermore, the map shows what percentage of visitors to each retail store were on which network, and has an accompanying graph showing how many customers on each network also visited stores for the top 10 brands in the US.

This has several potential applications for competitive intelligence. It can be analyzed to see to what degree network adoption correlates with store locations, how competitive the mobile network market is in specific regions, and which big brand(s) a carrier might want to approach for cross-promotion opportunities.

We hope this guide has demonstrated to you the importance of competitive intelligence and analysis for your company’s strategic decision-making, as well as taught you ways to get the right data required for it. In fact, you can start right here with us at SafeGraph.

#### The Ultimate Guide to Geofencing for Marketing and Beyond

Marketing companies and business departments are increasingly using a powerful geospatial tool: geofencing. With geofencing, advertisers are able to deliver targeted messages to specific locations and demographics, without wasting as much effort and money sending messages to people who aren’t likely to become customers. Many other individuals and organizations are also finding use cases for geofencing.

So what is geofencing? How does it work? And how and why are marketers (among other people) using it to streamline their operations? This guide will give you a quick but thorough introduction to geofencing, including the following topics:

- What is geofencing?
- How does geofencing work, and what does it do?
- 7 benefits of geofencing
- What is geofencing used for, and who uses it?

We’ll start with a brief geofencing definition in an attempt to explain, in simple terms, what geofencing is.

What is geofencing?

Geofencing is the process of drawing a virtual boundary around a real-world place with computer software.
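To make the idea of a virtual boundary concrete, here is a minimal Python sketch of the simplest kind of geofence: a circle around a point. The store coordinates, the 100-meter radius, and the sample location pings are hypothetical, and a real system would typically rely on a mobile platform's geofencing service rather than hand-rolled distance checks.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters

# Hypothetical circular geofence around a store (coordinates are illustrative).
STORE_FENCE = {"lat": 37.7749, "lon": -122.4194, "radius_m": 100}

# Hypothetical sequence of (lat, lon) location pings from one device.
PINGS = [(37.7770, -122.4194), (37.7750, -122.4194), (37.7790, -122.4194)]


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))


def is_inside(fence, lat, lon):
    """True if the point lies within the fence's radius of its center."""
    return haversine_m(fence["lat"], fence["lon"], lat, lon) <= fence["radius_m"]


def crossing_events(fence, pings):
    """Return 'enter'/'exit' for each pair of consecutive pings on opposite sides."""
    events = []
    was_inside = None
    for lat, lon in pings:
        inside = is_inside(fence, lat, lon)
        if was_inside is not None and inside != was_inside:
            events.append("enter" if inside else "exit")
        was_inside = inside
    return events


if __name__ == "__main__":
    print(crossing_events(STORE_FENCE, PINGS))
```

In this sketch an "enter" or "exit" event is the point where an application would decide what to do next, given a boundary like the one described here.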
When the software detects a location-enabled device crossing this boundary, it triggers one or more context-dependent actions. A common one is sending a push notification to a mobile phone.Of course, this is a rather reductive way to describe geofencing’s meaning. So the next section will dive into some finer details about the way geofencing works.How does geofencing work, and what does it do?Geofencing starts with drawing a virtual boundary around a real-world place with a computer program. Sometimes it is just a radius around a specific point, or sometimes it is a polygon representing the exact spatial extent of a place. Geofences can also be small or large, encompassing everything from a single building to the entire area covered by a postal code.Geofencing technology is designed to pick up signals sent out by location-enabled devices – commonly mobile phones – when they cross the geofence. These signals include GPS, WiFi, cellular data, and radio frequency ID (RFID). Geofences are often built directly into the coding of mobile apps, and usually require the user to consent to enabling an app’s location-based services before the geofences will work. This is mainly done to address privacy issues with geofencing and other forms of location-based tracking.So what does geofencing do? Quite a few things, actually. It is mainly used in marketing to send push notifications, text messages, and other alerts to nearby shoppers. But it can also be used to track people, vehicles, or items when they enter or exit a certain area, and can even be used to disable some technologies in the process. We’ll get into some specific use cases later.Now that we’ve answered the question “How does geofencing work?”, we’ll answer another one: why use it in the first place?7 benefits of geofencingSo what's geofencing good for? Well, the main reason to use it is to track activity across pretty much any area you want. 
And that area can be as broad or as precise as you want, often right down to an individual building. That way, you can monitor only the places that are important to you and leave out areas that are irrelevant to your purposes. This has major implications in advertising, but also for some other industries and use cases.

Among other things, geofencing:

- Makes it easy to reach people: Many people these days carry a mobile phone with them to research things and stay in touch with contacts. This is especially true of those who live in urban areas, where there tend to be a lot of stores. So geofencing represents an easy way to reach potential customers without a lot of effort.
- Allows for better-targeted advertising: Geofences allow marketers to limit their marketing efforts to specific geographic areas. They can even be tuned to send out different messaging to match consumer preferences or special events happening in the area. This allows a company to advertise to people who are most likely to become customers, at the places where those people are most likely to be found.
- Has immediate impact: Most people don’t take more than a few minutes to start reading mobile phone messages after they receive them. Since geofences automatically send out messages as soon as someone crosses them, they represent a way to get a person to do something when they’re in exactly the right area to do it.
- Helps a brand stand out: Stores in busy urban areas likely have many rivals nearby. Marketing with geofences lets a store or brand stay at the top of customers’ minds and gives customers reasons to shop with them instead of at competitors.
- Costs less money: Mass advertising is expensive because it sends a company’s messaging out over a broad area. It’s also not very cost-efficient because the consumers who receive that messaging are as likely to be interested in it as not.
Geofence-based advertising solves these two problems by only targeting specific areas and demographics that a company is likely to get conversions from.
- Can be used to gather data on consumers: Geofencing can also be used to collect information, which can make it a very effective trade area analysis tool. A company that sets up geofences in an area can learn about things there like foot traffic patterns, popular nearby stores, how long people stay in one place, how often their messaging leads to a conversion, and which demographics convert most often. The company can then use this data to inform their other marketing efforts.
- Can track the movement of people or other things: Though subject to privacy stipulations, geofence monitoring has applications for a number of businesses and other institutions. For example, they can track their property or products to prevent them from being stolen or used improperly. Or they can track people or animals to tell when they enter or exit a specified area. This helps to improve efficiency by curbing the unnecessary loss of assets.

What is geofencing used for, and who uses it?

Part of being able to define geofencing is explaining who uses it, and for what purposes. We’ll begin by summarizing what geofencing is used for in some different industries and business roles. Geofencing is used in:

- Market research / advertising: Trade area analysis in order to target advertisements at specific areas and demographics.
- Asset management: Tracking when items leave a specific location, and restricting access to them if necessary.
- Fleet management: Tracking logistics vehicles to make sure they’re going to the correct destination(s).
- Drone management: Issuing a warning or deactivating a drone if it enters or leaves a certain area.
- Pet and livestock care: Alerting owners if pets or livestock wander off a property.

So what is geofencing used for – or what can it be used for – specifically in these scenarios? Here are a few more in-depth examples.

1.
Converting customers from competitorsA clever way that some marketers have used geofencing is by placing geofences around locations owned by competitors. That way, when people enter rival stores, a company can send them targeted advertisements showcasing things like better product selection, lower prices, or discount opportunities. That may be enough to get consumers to at least visit the advertised company (especially if they don’t already know where the company’s stores are located). They might even buy something, and that could be the beginning of a full-on brand loyalty switch.2. Measuring conversion rates of ad campaignsMarketers can combine the advertising and data-collecting capabilities of geofences to gauge how effective their campaigns are. For instance, they can create advertising geofences to count how many people entered the areas where they’re serving digital ads, or were exposed to a physical advertisement (such as a billboard). Then they can create visit attribution geofences around the businesses they’re advertising to measure how many people actually entered stores.3. Connecting a business’s offline and online presencesAnother way companies can use geofence-based advertising is to increase online customer engagement. They can set up geofences to notify people who enter their stores (or even just pass by) of where to find their website or social media accounts. This can prompt people to go online for more information, news, and even discounts from the company. Customers can also go on social media to ask questions or provide feedback, helping the company improve its products and services.4. Providing services or information from within appsMany modern companies have their own mobile apps, or are partnered with companies that have apps. They can use geofences to mark out certain points of interest and offer useful information about them. For instance, they can point out which nearby locations have free WiFi available. 
Or, if it’s an app for a particular brand, the app can notify the user when they’re near a store that sells that brand’s product. This helps to keep the company or brand at the top of consumers’ minds because it isn’t just selling to them; it’s doing something to help them out.5. Tracking, securing, and recovering propertyBusinesses can install location-enabled devices in their products or office supplies. Then they can set up geofences to track whenever one of these items leaves their office, warehouse, etc. This can help reduce expenditures by limiting theft that requires them to repeatedly replace their assets.This is especially important for computers and other digital equipment that is meant to only be used on the premises because it contains confidential information. A geofence can be programmed to automatically lock these devices if they are taken outside the building or office complex.6. Coordinating logistics operations and systemsCompanies that maintain fleets of trucks for delivering products or otherwise moving things around can make use of geofences as well. They can set up geofences as virtual checkpoints along their trucks’ routes. On a basic level, this can help them make sure their trucks follow their prescribed routes, as well as actually make it to their destinations and back.Companies can also use these geofences to do things such as test different truck routes for speed efficiency, accounting for things like traffic and weather. As an extension, they can also compare the average time spent on specific routes with how long a particular truck is spending on a route. This can help them evaluate things like driver performance and fuel efficiency.7. Ranching and pet careRanchers can create geofences around their properties and then attach tags, collars, or other location-enabled implements to their livestock. 
That way, if an animal breaks through a physical fence or wanders away, the geofence can immediately alert the ranchers as to where and when it happened. This allows them to quickly find the animal before it gets too far.As an added bit of security, geofences can be programmed to give livestock very mild electric shocks when they cross one. These shocks aren’t enough to harm the animal, but they may be enough to deter it from going somewhere it isn’t supposed to. Some who own pets that like to run around outdoors also use geofences this way.8. Protecting people and property from dronesFlying drones can be used for things like racing, delivering items, and taking photos or videos. However, they can be dangerous or privacy-invasive if they are used improperly. That’s why organizers of sanctioned activities can set up a geofence location so that the drones only operate within that specific area. If a drone is piloted outside of this zone, the geofence can warn the operator or even shut down the drone if necessary.For anyone who has ever wondered, “What does geofencing mean?”, we hope this overview has given you an idea of what geofencing is, how it works, why it’s used, and what it can be used for. Of course, to make geofencing work properly, you need to have accurate location data with which to build your geofences. Let SafeGraph take the worry out of that for you with our trusted Places and Geometry datasets for points of interest and building footprints, respectively. #### The Ultimate Guide to Mobility Data: Sources, Benefits, and Applications Note: SafeGraph does not license, sell, or provide individual-level mobility or foot traffic data. This guide is for educational purposes only. However, SafeGraph enables partners to build mobility insights by leveraging our Places and Geometry data. There’s a saying that many things don’t happen unless people make them happen. But where and when people do things can give important context to what they actually do. 
That’s why it’s important to have mobility data when building a geospatial data ecosystem. Although SafeGraph does not license or sell mobility data, SafeGraph enables partners to build mobility insights by leveraging our Places and Geometry data, which serve as the foundation for working with it.

But what is mobility data, exactly? Where does it come from? Why is it advantageous to have? And what kinds of applications can you use it for? We’ll answer all of these questions and more as we cover the following:

- What is mobility data?
- 3 benefits of using mobility data
- 7 advanced use cases of mobility data

We’ll start with a mobility data definition, and an explanation of how mobility data fits into larger geospatial and big data systems.

What is mobility data?

Mobility data, in a geospatial context, is an aggregated, anonymized measurement of people’s movements surrounding points of interest (POIs) or neighborhoods (e.g. census block groups or dissemination areas). It can include where people come from, how long they stay, and where they go afterwards. This kind of mobility data is sometimes also referred to as “footfall data” or “foot traffic data”.

It can be collected manually, through GPS signals, through connections to WiFi networks, through mobile beacons, and more. Apps often ask users if they want to opt into mobility data sharing before their data can be collected and anonymized as geospatial data.

An important note is that, for privacy reasons, mobility data collection does not mean tracking specific people or their activities. Rather, it only involves anonymously noting how many devices enter the proximity of a point of interest or other area, when they enter, and how long they stay.

Big data and mobility data: how it all fits

Mobility data is an integral part of the geospatial big data ecosystem, often combined with data types like property/boundary (i.e. the physical and jurisdictional dimensions of a place) and POI (i.e. what a place actually is).
Together, these data types make it a little easier to see not only where people go, but why they might go there and what they might be doing there. Other types of big data and mobility data can form powerful combinations as well, for use in both the commercial and civic sectors. We’ll discuss some specific use cases a bit later.

3 benefits of using mobility data

Working with mobility data comes with several advantages. Here are a few of the most significant benefits:

- Give human context to other data: Knowing where a place is and what it looks like might not be all that helpful if you don’t know whether people are actually going there or not. Human mobility data lets you know where people are active, so you can react to – or even predict – where they go and what they might do once they’re there.
- Get your insights fast: Sources of mobility data are in action every day, and the data itself is collected and processed fairly quickly as well. This means you can use it to do analyses without waiting for official documents that may come out much more infrequently, such as company finance reports, government announcements, or news stories.
- Measure consumer activity over time: Another benefit of mobility data being produced so consistently is that it allows for more granular insights with respect to time. With enough data, you can see patterns in where people frequently go at certain times of the day, on certain days of the week, during certain seasons, or on specific holidays.

7 advanced use cases of mobility data

Mobility datasets are most powerful when combined with other datasets (geospatial or otherwise). Here are some examples of what mobility data can do when it’s used as part of a larger data ecosystem.

1. Trade area analysis

Trade area analysis involves figuring out what types of businesses have opportunities in a geographic area, and who their competitors (or businesses that may complement them) might be.
This is where adding data on human mobility to POI information on what businesses already have a foothold in an area can come in handy. For example, if people have to travel a long distance to shop at a certain type of store or access a certain type of service, it could be a signal that there’s room in the local market for a similar service that’s more accessible. Or the flow of human traffic inside or out of an area could point to consumers visiting (or avoiding) certain types of businesses after others. There may be complementary businesses in the area, or competitors that people are passing by because they already have what they need (or perhaps patronizing if they couldn’t find exactly what they wanted at stores they already visited). 2. Retail site selection As a business owner, you usually want to move your store(s) to areas that have high foot traffic. That way, you get exposure to more potential customers for your business (though not necessarily if there are too many established competitors already in the area, which is why you should do trade area analysis first). Conversely, if you notice an area where one of your stores is located isn’t getting as much foot traffic as it used to, perhaps it’s time to close down and/or move out of that area. You can make that decision in concert with POI data on that particular store to see if it’s costing more than it’s bringing in. 3. Location-based marketing Even if a business can’t set up their store in an area with high foot traffic, they may still be able to draw in a larger customer base by setting up advertisements there. That way, people will know where to go if they can’t find what they need at stores in the high-traffic area. Of course, the company will also need to look at property data to see what advertising space is available in the area. 
They may also need to analyze street-level data to see how accessible their store is from where they place their ads, or even explore opportunities for mobile advertising (on buses, taxis, etc.). For a deeper dive into how to use mobility data this way, check out our guide to better location-based marketing. 4. Consumer insights Cloud mobility data never reveals any personally-identifiable information, as per privacy regulations. But there are ways to get a general sense of who consumers are based on where they go and what they do. This is where footfall data shines when combined with datasets like POI, demographics, and anonymized purchase records. To illustrate, a company could look at demographics data for neighborhoods around their stores to get a sense of the average age, gender ratio, income, etc. of the people who live there. They can then compare that with other behavioral data (including anonymized credit or debit card transactions or WiFi network connections) for the area around their stores to discover shopping and other activity patterns that approximate their clientele’s lifestyles. Then, based on data about the company’s own stores, they can modify their operations to appeal to the demographics and shopping patterns of people in nearby neighborhoods. For example, they could arrange a store so that departments with products nearby consumers are likely to buy are easily accessible. Or they could design their advertising so that products popular with surrounding demographics and lifestyles are front and center. 5. Financial investment research Traditional official financial data is often produced too infrequently to be useful these days, so investment firms are increasingly using alternative data for faster insights. One basic way they can do this is by joining mobility data with accurate property data to perform visit attribution analysis. 
This involves measuring how many people in a geographic area around a store actually entered the grounds of the store, and how long they stayed, versus how many simply walked past the store. Understanding mobility based on GPS data alongside POI data, property data, credit/debit transaction data, and other metrics provides a clearer picture of an individual store’s sales performance. And when performed on several stores, visit attribution can give clues regarding a company’s financial health long before quarterly reports or other official indicators are released. 6. Insurance liability assessment Insurance companies can use mobility data modeling, management, and understanding alongside POI, property, and even environmental data to create more accurate liability policies for both commercial and residential spaces. Average foot traffic plays a role in how likely a person is to be accidentally injured on a particular property, along with the property’s location (relative to weather patterns and other hazards) and layout. Also remember that footfall traffic patterns can increase or decrease depending on time and other factors. For example, some places are busy during the day, while others are patronized mostly at night. Other places may see foot traffic spikes on certain days of the week, or during certain seasons. These increases, combined with environmental hazards (darkness, rain, ice/snow, fog, etc.), can make risk profiles for some places different from others. 7. Urban planning Mobility data can be used for the public good as well. Government agencies can combine mobility patterns, big data, and transport analytics to compare where people go throughout a day versus how easily they can get to each place with existing transportation routes and methods. This may tell them that they should plan to locate more essential services (such as hospitals) near places where people typically gather. 
Or it could be a signal to build out critical infrastructure to make it easier for people in outlying areas to access important facilities. Knowing where people go, and when, can complement information on what they do. And this can lead to insights concerning why people go to certain places, as well as why they do certain things at particular places and times. This can give you a sizable and timely advantage in determining what you should do, whether you’re trying to set up a business, attract more customers, make safer or more lucrative investments, accurately assess risk, or serve your community better. #### The Ultimate Guide to Points of Interest Data In this guide, you'll learn about points of interest (POI) data & its applications, where to download free POI data, how to evaluate points of interest databases, and the top alternatives to the Google Places API & Factual Places API. What is a POI (point of interest)? A point of interest is a specific physical location which someone may find interesting. Restaurants, retail stores, and grocery stores are all examples of points of interest. Since the phrase is a mouthful, 'points of interest' is often abbreviated as ‘POI’. Many companies that sell POI databases or POI data APIs name their products ‘Places Data APIs’. What are some applications of POI data? Retailers and retail analytics firms use POI databases to get lists of where stores are located. Along with store location info, they use metadata about a POI such as the category (NAICS code) to power site-selection and trade area analysis. AdTech companies create geofences around points of interest to create location-based audiences. They also join GPS data with POI geofences for advertising measurement (online-to-offline attribution). POI data is also used in the real estate, consulting, and financial services industries. Where can you download free POI data? POI Factory is one source of free POI data, but coverage and data freshness are lacking.
POIplaza is another source of free POI data but it suffers from similar problems. In partnership with ESRI, SafeGraph has free points of interest data available on the ArcGIS Marketplace. However, this data isn’t exportable and is meant only for visualizing POI data on a map for ArcGIS Online users. For bulk access to POI data, there are several paid enterprise offerings (each with pros and cons) covered below. What are some pros & cons of the Google Places API? The Google Places API generally has the best accuracy and one of the most comprehensive datasets of POI in the market. It is high quality and they have a fantastic team that keeps the data up-to-date. However, one of the biggest disadvantages of the Google Places API is the licensing terms. It’s extremely hard to legally use their data unless you are just displaying it directly to users on a Google Map. For example, Apple or Uber find it tough to license Google data in a way that works well for them. The other worry with Google Places is that they might change their (already restrictive) terms or increase their already expensive pricing. Google did that in 2018 and it hurt various geospatial and mapping companies. What are some Google Places API alternatives? The Google Places API is one of the best-known Places APIs but enterprises often choose alternatives such as Foursquare Places, Factual Places, Facebook Places, and SafeGraph Places due to factors such as price, licensing terms, and data availability. At SafeGraph, we’ve seen companies switch from Google Places to SafeGraph Places due to SafeGraph’s broad and permissive licensing terms. Short of directly re-selling our data, almost anything goes since SafeGraph is just a data company. We encourage companies to use our data to create derivative products and applications. Our Places data is also much cheaper for enterprises looking to buy data in bulk. Lastly, SafeGraph Places has attributes that the Google Places API does not offer.
For example, when it comes to building footprint data (geofences) for a POI, Google has that data internally but it isn’t offered via API. Thus, many companies choose to augment their Google Places data with SafeGraph Geometry data or to use SafeGraph Places exclusively. What are some strengths of Factual Places data? Factual Places is another well-known Places API provider. They have global data on places, supporting 130 million POI in 52 countries. They also have many granular, category-specific POI attributes. For example, for restaurant POI, they have data on whether the restaurant accepts reservations, is cash only, is kid friendly, or has a kid's menu. What are some Factual Places alternatives? For several reasons, businesses choose Google Places, Facebook Places, Foursquare Places, and SafeGraph Places as alternatives to Factual Places. Businesses choose Foursquare Places over Factual Places because Foursquare has POI in every single country on earth while Factual only has data on 52 countries. Due to the user-generated nature of Foursquare data, Foursquare Places coverage is also higher and more accurate for POI that people tend to check into often, like restaurants and bars. But for this same reason, at less checked-into POI, like a doctor's office or grocery store, Foursquare data can be lacking compared to Factual’s data. Businesses also often choose SafeGraph Places as an alternative to Factual Places due to SafeGraph’s broad licensing rights, price, and high-accuracy coverage. Unique to SafeGraph, almost every POI has data on the exact building footprint (polygon) for a place.
Factual (and all other popular POI alternatives) offer only a location’s centroid. For geofencing applications, where knowing the precise location of a place is crucial (such as for turning GPS data into store visit intelligence), SafeGraph polygons make SafeGraph superior to other Places data vendors. How should one evaluate a POI database provider? Evaluating a POI database can be a tough task, so SafeGraph put together a POI data evaluation guide and a data evaluation checklist. Some critical questions to ask when evaluating a places database: What countries are covered? Is it US-only or international? Precision (how accurate is the data?): Are the places open businesses and real consumer POI, rather than home-business LLCs or stores that went out of business? Recall (what percentage of real-life locations are actually in the given dataset?): You can measure this for both national brands (like McDonald's) as well as for long-tail POI like mom and pop stores. Completeness (what is the fill rate for attributes?): POI data providers boast about all the metadata for a POI, like open hours or phone number or category (NAICS code), but the fill rate might be lacking. How complete is each listing on average? What is the granularity of a POI location? Is it just POI addresses? POI centroids? What about the exact building footprints (geofences) for a POI? Why is getting accurate places data so difficult? The physical world is in constant flux. Maintaining accurate data on places is an immensely difficult engineering problem. At SafeGraph, even with a top machine learning and data science team, we still find bugs and inaccuracies in our own data all the time (and we'll openly publish known discrepancies each release for transparency). You can read more about the complexities of physical data in our blog post, Forget ML: 4 Weird Edge Cases Which Confuse Even Humans When It Comes To Places Data. How are points of interest categorized? POIs are classified by NAICS codes.
NAICS stands for the North American Industry Classification System. Some example NAICS categories for POI include “Restaurants & Other Eating Places” and “Grocery Stores.” Some examples of NAICS POI sub-categories are “Limited-Service Restaurants” and “Convenience Stores.” You can view SafeGraph’s full list of POI categories to understand our POI data coverage by different NAICS code.‍Classifying a POI to its correct NAICS code is an incredibly challenging problem that we’ve worked heavily on at SafeGraph. We tackled this problem by using natural language processing and human feedback to analyze the content on a POI’s website and use that to map the POI to its most probable category.How is POI data sourced?The biggest problem with POI data is that physical places info is fragmented around the web. Company store locators, governments, commercial real estate companies, and user-review sites all have some POI data but with varying attributes, freshness, and accuracy.The tricky part is combining all the data into a unified schema and also verifying the accuracy of the different underlying datasets. You can read more about how SafeGraph creates its POI data. #### The Ultimate Guide to SafeGraph’s Geocode Data   Key Takeaways Geocoding converts addresses and place names into precise geographic coordinates. High-quality geocoded data improves navigation, analysis, and operational accuracy. Differences in geocoding data quality can lead to meaningful real-world inaccuracies. SafeGraph’s geocodes prioritize precision, consistency, and regular updates. Choosing the right geocoding provider directly impacts downstream analytics and decisions. In today’s dynamic world, accurate and timely geocoding data is more important than ever. According to Gartner, poor data quality costs organizations nearly $13 million per year. 
Businesses, organizations, and research institutions rely on precise location data to drive successful operations across use cases —from logistics and navigation to marketing and spatial analysis. This guide will explore what geocoding is, its importance, a comparison between SafeGraph’s geocoding, Google, and OpenStreetMap (OSM), and the wide-ranging applications of geocoded data. What is Geocoding? Geocoding is the process of transforming descriptions of locations, such as addresses or place names, into geographic coordinates (latitude and longitude). In simple terms, this explains what geocode means. This process is fundamental to numerous applications, enabling precise mapping, spatial analysis, and more. Some of the benefits include: Enhanced Navigation Accurate geocoding ensures navigation systems provide precise directions, critical for logistics, ride-sharing apps, and personal navigation devices. Geocoding data enables these systems to translate an address into a pinpoint location on a map, ensuring users arrive at the exact destination without ambiguity. Improved Customer Experience Businesses use geocoded location services such as Placekey to guarantee accurate deliveries and help customers find store locations effortlessly, enhancing overall satisfaction. By converting addresses into geographic coordinates accurately through reliable geocoding, companies can ensure their logistics operations run smoothly, reducing delivery errors and improving customer service. Data Analysis and Visualization Geocoding facilitates effective analysis and visualization of spatial data, revealing geographic patterns and trends that inform strategic decision-making. It allows businesses to overlay address-based data on maps, making it easier to spot trends and make data-driven decisions using geocoded data. Comparing SafeGraph's Geocoding vs. 
OSM & Google Whether building a consumer-facing application or developing an internal analytics tool, product managers and engineers rely on accurate geocoding as an essential component. We conducted an analysis comparing SafeGraph’s geocoding data to OpenStreetMap (OSM) and Google in the Little Rock area to assess precision and reliability. SafeGraph Data: Precision: The average distance between SafeGraph geocodes and actual locations is very small, with an average deviation of just 2.17 meters. Consistency Across Locations: Most SafeGraph geocoded points are highly accurate, ensuring that POIs are correctly placed. OSM & Google Data: Variable Precision: While the average distance between OSM geocodes and actual locations is also 2.17 meters, there are notable outliers. One is a Schlotzsky's, with a 15 meter difference. Another is a Subway located in a Walmart Supercenter, where the OSM and Google pins are situated on the opposite end of the building, where a nail salon is located. These outliers indicate that some OSM and Google geocoding data could return neighboring buildings or businesses, leading to potential inaccuracies in real-world applications. How SafeGraph Curates Geocodes and POIs SafeGraph's geocoding and POI data are meticulously curated to provide the most accurate and up-to-date information at scale. Monthly Refreshes Our Places dataset is updated every month, capturing the dynamic nature of the physical world and ensuring you always have the most current geocoded data. Data Validation SafeGraph employs rigorous data validation methods to maintain high data quality: Web Crawling: Seamlessly gathering public web data from thousands of sources. Machine Learning Models: Cross-referencing, validating, and classifying POIs at scale during the geocoding process. Human Validation: Developing ground truth data to ensure the highest precision and recall. Once all of the data is ingested, we go through a rigorous de-duping and merging process to make sure the dataset is clean and ready for use as reliable geocoding data.
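The comparison above rests on one measurement: the distance between a provider's geocode and the actual location of a place. As a rough illustration only, here is a minimal Python sketch of how such a deviation could be computed with the haversine formula and how outliers could be flagged. It is not SafeGraph's methodology. The function names, the 10 meter outlier threshold, and the sample coordinates are all hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def deviation_report(pairs, outlier_threshold_m=10.0):
    """Return (mean deviation in meters, {name: deviation} for outliers).

    pairs: non-empty list of (name, (lat, lon) geocode, (lat, lon) ground truth).
    The threshold is an illustrative assumption, not a published standard.
    """
    if not pairs:
        raise ValueError("pairs must not be empty")
    distances = {name: haversine_m(*geocode, *truth) for name, geocode, truth in pairs}
    mean_m = sum(distances.values()) / len(distances)
    outliers = {name: d for name, d in distances.items() if d > outlier_threshold_m}
    return mean_m, outliers


# Hypothetical sample: made-up coordinates, not real POI data.
sample_pairs = [
    ("poi_a", (34.74650, -92.28960), (34.74651, -92.28960)),  # about 1 m off
    ("poi_b", (34.74700, -92.29000), (34.74720, -92.29000)),  # about 22 m off
]
mean_m, outliers = deviation_report(sample_pairs)
print(f"mean deviation: {mean_m:.2f} m; outliers: {sorted(outliers)}")
```

Reporting outliers alongside the mean matters because an average can look small while individual pins still land on a neighboring building or business.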
Choosing the Right Geocoding Provider Accurate geocoding is essential for a wide range of business applications, including marketing, logistics, urban planning, navigation, and retail analysis. High-quality geocoded data ensures precise mapping, efficient routing, effective spatial analysis, and more, which are critical for making informed decisions and optimizing operations. While there are various providers offering geocoding data, evaluating factors such as accuracy, coverage, and data updates can help businesses choose the data provider that best meets their needs and supports their specific use cases. Additional Resources Placekey’s Geocoding Service Placekey recently released a Geocoder that converts POIs and addresses into specific location coordinates (latitude and longitude). This helps clarify what is geocoding in practice. As discussed in this blog, accurate coordinates are essential for maintaining precise address records, performing mapping, logistics, and business analysis. The Placekey Geocoder offers rooftop accuracy, ensuring the highest level of precision for your geocoding needs. Geocodes can be easily obtained by making a simple API request. The Placekey Geocoder has a generous free tier of 10,000 per day and allows you to store the geocoded results received forever. SafeGraph Places Data SafeGraph Places is a comprehensive dataset composed of high-quality POIs, leveraged by thousands of organizations globally who trust the data as their primary source of truth. It includes geospatial attributes such as address strings, geographic coordinates, brand affiliations, open and close dates, and NAICS or category codes that support accurate geocoding data workflows. FAQ’s 1. What is geocoding? Geocoding is the process of converting location descriptions such as addresses or place names into geographic coordinates like latitude and longitude. 2. What is forward geocoding? Forward geocoding translates an address or place name into geographic coordinates. 
3. What is reverse geocoding? Reverse geocoding converts geographic coordinates back into a human-readable address or place name. 4. What is a geocode? A geocode is the geographic coordinate assigned to a specific location after the geocoding process is completed. 5. Why is geocoding important? Geocoding enables accurate mapping, navigation, spatial analysis, and location-based decision-making across industries. 6. What is the difference between geocoding and geolocation? Geocoding converts addresses into coordinates, while geolocation determines a device’s or user’s position, often in real time. 7. Why do businesses use geocoding data? Businesses use geocoding data to support logistics, validate addresses, analyze spatial patterns, and improve operational accuracy. 8. How does geocoding help recognize patterns? By mapping data to specific locations, geocoded data makes spatial patterns and trends easier to identify and analyze. 9. How does geocoding support address validation? Geocoding helps confirm whether an address corresponds to a real, precise location, reducing errors in delivery and analysis. 10. What are additional geocoding techniques? Additional techniques include rooftop-level geocoding, POI-based geocoding, and hybrid methods that combine automated models with human validation.
#### Top 10 Uses of Geospatial Data + Where to Get It We’ve looked at what geospatial data is and where you can find it, but what is it used for? That’s a bit of a complicated question. There are many ways it can be used, some of which are related and others which are actually built on top of one another. For example, most use cases of geospatial data involve visualizing the data as a map. From there, a business may use their analysis to choose where to locate their stores or advertisements, and a private equity firm may use data on that to decide which companies are worth investing in. Or a government agency may use a different kind of analysis to determine where critical public buildings and infrastructure should be built. To demonstrate further, we’ll look at 10 popular uses of geospatial data and some of the types of data that power them. Top 10 uses of geospatial data & where to get the data you need Geospatial data is often used in scientific or government administration contexts, but it has an increasing number of commercial uses as well. From retail to investment to insurance, here are 10 scenarios where you can make use of geospatial data. 1. Mapping One of the most common examples of geospatial data use is visualizing the area that the data describes.
Whether it includes building footprints, transportation routes, or other points of interest, a precisely-drawn map based on accurate location data can be an immensely powerful tool. And not just to travelers who may not know their way around a particular area, either. Mapping forms the foundation of many other methods of using geospatial data. A good place to start is with SafeGraph’s Geometry Data. It uses polygons to show the locations, sizes, boundaries, and relationships between various points of interest (POIs). 2. Site Selection If you’re a business or other organization, you want to set up shop(s) where customers and other supporters are likely to visit you. Likewise, you also want to be able to tell when things have changed around a location that make it no longer viable, and it’s time to close down and move on. Geospatial data can help you do both of these things. There are a number of things to consider when selecting or deselecting retail sites. Where do you currently have locations? Are other nearby stores going to compete with you, or possibly help you? How close are you to where the customers you want to attract live? How accessible transportation-wise are your sites to those target customers? Looking at geospatial data can give you answers that will hint at whether a location will succeed or fail. A potential shortcut is using SafeGraph’s Places Data to look at the attributes of stores that are already successful (even if they aren’t your own) and try to find locations that mimic those qualities. 3. Visit Attribution Another example use case for geospatial data is telling the difference between someone actually entering your store and someone simply passing by.
Comparing accurate building footprint data with precise GPS data from mobile devices lets you know how many people actually entered the grounds of your store to buy something (or at least look around). You can also use this information to look at nearby points of interest to see if there are opportunities for sites to get greater foot traffic, or place advertising to direct more people to your store. SafeGraph’s Geometry Data can show you accurate geofences of buildings in an area so you know the specific dimensions of business locations and exactly how much patronage your stores are getting. 4. Urban Planning Geospatial data science techniques and applications can be helpful if you’re in government or the public sector. You can make sound community planning decisions based on data regarding where buildings are (and what they are), where people live, where they typically go in a day, and what routes they take to get there. For instance, you can plan construction for roads or other transportation systems to cut walk/drive times to places people visit on a regular basis, such as grocery stores. Or, you can do things the other way around and plan to build important facilities, such as hospitals or schools, in areas that are already accessible and get a lot of foot traffic. If you’re planning to use geospatial data in this context, foot traffic data can be a big help. It can give you insights into when people in your community are most active, where they come from, and where they typically go. 5. Network Planning Telecommunications systems planning is another area in which geospatial data can be informative. For example, foot traffic data can help providers plan their infrastructure around the busiest areas in a neighborhood. This can allow them to achieve as wide a coverage as possible with the least amount of hardware.
It can also inform the price they charge for setting up WiFi hotspots, with installations in higher-traffic areas being more expensive. For other businesses, knowing how much foot traffic they get at their stores allows them to decide on the scope of the telecom plan they need. That includes where they should set up WiFi hotspots, based on how many they can afford. It also allows them to manage the bandwidth limits on their hotspots based on when they expect to be busiest. 6. Investment Research If you work in an investment bank or private equity firm, knowing how to use geospatial data can be a powerful – yet often overlooked – tool for your portfolio. Combining location data for stores and other points of interest with foot traffic data allows you to model and predict consumer behavior and movement patterns. You can then combine this with other contextual data to get a better picture of how the businesses you’re investing in are actually performing, and which ones may be in for upturns or downturns. To start here, it helps to know as much as you can about the businesses in an area that you are planning on investing in. 7. Competitive Intelligence Part of analyzing how well your business is doing is knowing the competition: who they are, where they are, and how their performance may be impacting your own. Geospatial data can help you figure out how much geographic influence your company has in relation to where your customers come from and where your competitors are. For instance, you may be able to see that customers choose your store over a competitor’s because it’s more accessible in terms of parking space and transportation routes. Or you may discover that you have stores too close together in terms of walk/drive time, and that they’re competing over the same customers. Or you may notice that nearby stores you assume would be competing with you actually aren’t, because your target customers may be from different demographics and/or coming from different locations. 8.
Risk Assessment The insurance industry is another sector that is increasingly making use of geospatial data applications. To develop liability frameworks for buildings, insurers need to know a number of their geospatial traits. These include precisely where a building is, how much space it takes up, and how close it is to surrounding buildings. These attributes let them assess how vulnerable a building is to things like destructive weather, or even damage indirectly caused by other sources (e.g. a spreading fire, vehicle accident, or collapse of some sort). Additionally, insurers need to know things like how many businesses are contained in the same building, along with what they do or sell. Some businesses present more hazards than others, which can affect risk assessment, even for nearby operations. Occupancy is another factor they have to consider. The more people who live in or visit a building on a day-to-day basis, the greater the risk for an accident to happen. SafeGraph’s Geometry Data is a good starting point here. It offers precise and accurate information on the location and area of buildings, including buildings inside buildings (in the cases of malls, apartments, office complexes, etc.). Another ideal example of how this is useful is a McDonald’s location inside of a Walmart. 9. Trade Area Analysis Trade area analysis is a process to which geospatial data is integral. Think of it as a big-picture version of site selection for stores. Sometimes it’s not enough to just look at how accessible a location is and how close competitors are. Take census block groups (CBGs), for example: relative to where people live, would their general lifestyles make them want to actually buy what you’re selling? And would they be able to afford it, based on their average income range?
Asking these types of questions helps you more accurately pinpoint your customer base, allowing you to adjust your site selection (among other things) to cater to the people who will most likely be patronizing your stores. Also consider how many competitors are nearby and how well-established they are. It’s going to be tough to succeed in an area if demand for the type of product or service you sell is already being met there. This is especially true if your competitors in that area have had years to build up their reputations. You might want to consider another area where supply for your products or services hasn’t met the demand yet, and where rival businesses will be easier to compete against. Start with SafeGraph’s Places Data to know where complementary and competitive businesses are, and how you stack up against these competitors. 10. Consumer Insights A specific geospatial data sample may be able to give you even more granular insights into your customers’ shopping habits. For instance, you can look at other stores your patrons visit that feature specific brand names. Then you might decide to carry products of that brand, or to more heavily market them if you already carry them. You might do this by, say, making them the focus of rewards programs, or by holding seasonal sales. The goal is to meet consumers’ demands so that they don’t have to go to another business besides yours to get what they want. Another idea is to look at businesses that consumers visit before or after they come to yours. To illustrate, imagine that you run a smoothie shop or some other type of health-conscious restaurant. You may notice that a lot of customers are going to gyms or yoga studios before or after they drop by your store. This might tell you that they’re likely coming to you to either fuel up before a workout or recharge after one.
You can use this knowledge to approach these types of stores for cross-promotion opportunities, perhaps on specific days or at certain times. Now you know some of the things that are possible when you unlock the power of geospatial data. But how exactly do you do that? We’ll show you the starting point for getting the most out of geospatial data from SafeGraph in the next chapter. If you're ready to learn more, check out the next chapter, Geospatial Data Analytics: What It Is, Benefits, and Top Use Cases. If your goal is to learn more about the sources and where you can actually get geospatial data, read through our guide, Geospatial Data Sources — Where to Get the Data You Need. #### Ultimate Guide to Location Intelligence: Uses and Providers More and more companies and organizations are using research based on big data to make decisions and solve problems. In the corporate world, this research is known as business intelligence, or BI. Part of a company’s BI strategy can include data on where (and when) things are located and events happen. This is called location intelligence, or LI. So exactly what is location intelligence? How does it relate to business intelligence, and how is it different? What sectors inside and outside of the corporate world use it, and for what purposes? And where can one get the kinds of data needed for location intelligence? In answering these questions, this guide will give a general overview of location intelligence through the following sections: What is location intelligence? Who uses location intelligence? Location intelligence use cases: how it’s used + best applications. Where to get geospatial data for location intelligence. We’ll start off with a location intelligence definition so you can understand what the term means with a little more clarity. What is location intelligence? Location intelligence (LI) refers to using geospatial data to understand how to perform a certain task or solve a specific problem.
This is typically done by overlaying geospatial data on a map to study relationships between locations, or how a location’s attributes change over a period of time. The first recorded utilization of location intelligence techniques was in London, England during the mid-19th century. A physician named John Snow was able to use geospatial data to trace and minimize the impact of a cholera outbreak in one of the city’s districts. He did so by mapping out areas of the district where infections had occurred, and then comparing them against a map of the district’s water supply points. In doing so, he was able to pinpoint and disable the specific water pump where the outbreak had originated. Since then, LI has played an increasingly integral role in businesses, governments, and academia – and become much more technical. Location intelligence (LI) vs. business intelligence (BI) The difference between location and business intelligence is in what kind of data is used and what the resulting insights are applied to. Business intelligence involves the integration and analysis of several different kinds of data. However, the information gained from this analysis is used for a specific purpose: to make decisions that improve a business’s operations. Location intelligence, meanwhile, uses a specific class of data: geospatial data. That includes things like information about points of interest, building footprints, footfall patterns, weather systems, and road traffic. However, it is not used only to solve problems and improve processes for businesses; it has several other use cases as well. In short, you can have location-based business intelligence, but not all business intelligence is location-based. Who uses location intelligence? As we just mentioned, businesses can make use of location intelligence, but they’re not the only organizations that do so.
Here’s a quick list of sectors that use location intelligence, and why:

- Retail: Selecting sites for brick and mortar stores that are most likely to get repeat customers
- Insurance: Accurately assessing risk and pricing policies for properties or people
- Finance: Making more accurate predictions about which industries or brands will be good investments, and why
- Healthcare: Measuring accessibility of health services and how geosocial factors affect people’s health (and risks to it)
- Marketing: Understanding who a brand’s main customers are and targeting campaigns to reach them at the places they typically go to
- Real Estate: Choosing to invest in particular properties that will give good returns when sold to commercial or residential developers
- Energy: Determining average foot traffic in commercial areas to monitor power use and regulation compliance
- Telecommunications: Analyzing demand and competition in an area in order to optimally plan a network
- Government: Planning buildings and services for urban areas based on constituent needs and movement patterns

##### Location intelligence use cases: how it’s used + best applications

As we demonstrated in the previous section, many different types of businesses and organizations use location intelligence. Now, we’ll take a more in-depth look at some specific location intelligence applications.

###### 1. Mapping

Plotting geospatial data on a map is one of the cornerstone applications in location intelligence. It allows for visualizing potential relationships between sets and attributes of geospatial data. This is something that can be very difficult without actually mapping out where (and sometimes when) in the real world the data corresponds to.

###### 2. Urban planning

Location intelligence plays a big role in helping governments plan out how municipal land is to be used, as well as many other facets of urban life.
Artificial intelligence and location-based services allow local authorities to leverage geospatial data to design more efficient communities through understanding who constituents are, where they go, and what they need. That includes things like increasing accessibility to critical facilities, reducing traffic congestion, better managing waste collection and energy consumption, and deploying security personnel more efficiently to keep citizens safe.

###### 3. Utilities and telecommunications network planning

Another thing city planners have to work out is how to design utility infrastructure for inhabited areas. Again, they need to use location intelligence to look at factors such as where terrain allows systems to be built, as well as any natural or artificial obstacles they may encounter. They also need to look at what areas tend to be busiest, so they can build critical hubs in areas where they achieve maximum coverage with minimal hardware.

Telecom companies need to leverage location intelligence for similar considerations when planning their networks. They can also use it to determine the price of additional infrastructure (e.g. WiFi hotspots) for people or businesses, depending on the traffic the surrounding area gets. People and businesses themselves can use location intelligence to decide how to manage their network availability and bandwidth based on how busy their sites get (and when).

###### 4. Real estate development

When a government has designated land for residential or commercial use, real estate companies have to decide which parcels are worth investing in. To do this, they need to look at geospatial factors such as the physical features of the property, what the local environment is like, how accessible the property is, and how busy the nearby area gets. These can all affect things like the costs associated with the property, as well as how much a development project on the property will sell for.

These factors can also affect how the property is marketed.
A commercial plot close to foot traffic and other points of interest may be advertised for its accessibility to customers, while a residential plot may be advertised as being conveniently close to essential services. Conversely, a residential plot away from busy areas could be marketed as a quiet retreat from urban hustle and bustle. So real estate investors need to use location intelligence analytics to decide which parcels of land will provide the best return on their investment, and how best to get that return.

###### 5. Trade area analysis & site selection for retail

Two fundamental pieces of retail location intelligence are trade area analysis and site selection. Trade area analysis consists of a large-scale survey of business opportunities in a given geographic space. It examines things such as how likely people there are to become customers based on their demographics and lifestyles, and how well competitors already serve a business’s market niche in the area.

Site selection, as in real estate, involves a closer examination of the advantages and disadvantages of specific properties when deciding where to build a store. Is a location accessible? How efficient are supply chain routes? How much foot traffic does the surrounding area typically get? How much of that foot traffic is likely to visit and buy the store’s products, based on their demographics? How close are other nearby stores, and are they competitors, complements, or neither? These are all questions related to location intelligence that a business should ask before settling on a spot.

###### 6. Consumer insights

One way a business can understand how its selected locations will perform is by gathering and using location-based market intelligence. For example, it can look at data for a store’s area to figure out which products or brands are popular in nearby stores.
It may then choose to stock those products or brands, make them the focus of marketing materials or campaigns, or even rearrange a store to make what customers want most more accessible.

A business might also look at what other places people typically visit before or after visiting one of its stores. This may highlight complementary businesses that could be approached for cross-promotions. It may also point out competitors that customers are visiting to find certain inventory that a business’s own store doesn’t have.

###### 7. Visit attribution

Visit attribution is a form of location intelligence that combines footfall data with building footprint data. It is used to determine if a person actually entered the bounds of a location, rather than walking past it, around it, or into a neighboring location. Accurate building polygon data is critical for this, especially in buildings such as malls or airports that have multiple tenants in close proximity.

Visit attribution is usually used as location intelligence for retail stores. They use it to track how much of an area’s foot traffic is converting into store visits and purchases, especially if they are advertising nearby. They can also use it as a way to measure how much exposure their advertisements in specific locations are getting.

###### 8. Competitor intelligence

Another way a business can use location-based intelligence is to analyze the geospatial strategies of its competitors. For example, a business may observe that its competitors’ stores are more popular because they’re more accessible. They might have bigger parking lots to accommodate more cars, or they might be conveniently located close to stops for public transportation. Or a business might discover that two or more of its own stores are competing over the same customers because it takes nearby consumers similar amounts of time to travel to any of them.

A business can also look for potential opportunities that competitors’ geospatial strategies leave open.
To illustrate, a competitor may select a site for a store based on accessibility to certain neighborhoods, because it’s targeting certain demographics that live there. If the business is targeting different demographics, it may be able to build a store in a prime location, even if it’s close to a potential competitor. This is because there is less risk that the store and the competitor will fight over the same customers.

###### 9. Insurance and fraud prevention

One of the more distinctive location intelligence use cases is for the insurance industry. Weather patterns and terrain in an area can give clues as to how vulnerable a person or property is to nature-related damage. Footfall and traffic patterns can also indicate risk, as accidents are more likely to occur in areas where more people and vehicles are active. Co-tenancy is another factor, as being located next to certain risk-prone businesses or people can increase the likelihood of an accident.

Studying geospatial patterns can also help insurers and others quickly identify and respond to fraudulent claims and other transactions.

###### 10. Financial investment

Companies like private equity firms and investment banks can use location analytics and business intelligence to help manage their assets. They are looking for indicators that a business or piece of real estate will produce returns with minimal risk. So they will want to look for some of the same geospatial information and patterns that real estate developers and other businesses can use to gauge and model performance.

For example, how accessible are store or land parcel locations, and how busy do they get (and when)? Do nearby consumer demographics match the target audience of a house or store? How many confirmable visits does a location get? What other points of interest are nearby, and could they be beneficial or detrimental to the location’s operations or traffic?
This kind of business location intelligence may give clues to a company’s financial performance long before it releases the official information.

##### Where to get geospatial data for location intelligence

There’s a big thing we haven’t talked about yet in relation to these location intelligence examples: where to get the specific kinds of data you’ll need to get the insights you want. Different use cases may require different types of data, and not every provider offers the same kinds of data. With that in mind, here is a list of companies and organizations that can supply you with the types of data that power location intelligence:

###### 1. SafeGraph

Major data types: points of interest, building footprints, transactions

SafeGraph is the market leader in global POI data. Our Places and Geometry datasets have detailed attribution for points of interest and property information that includes accurate polygon geofences. This can all be used for various commercial use cases, and some non-commercial ones as well.

Also, be sure to check out our Spend dataset. It’s the first US consumer transaction dataset that’s based on where people spend money, to give context to when and how they spend it.

###### 2. Bing

Major data types: point of interest, streets, imagery

Bing is Microsoft’s search engine. As part of that service, they offer a GIS function that provides imagery of most parts of the world, information on road networks, and details about points of interest. Many commercial applications may find this data handy.

###### 3. CAP Locations

Major data types: point of interest, property

CAP Locations has data on over 1.2 million stores in the US and Canada, including retail outlets as well as restaurants. That includes stores inside malls and other shopping centers; CAP Locations has data on over 40,000 such complexes, including about 20,000 complete building footprints of them.
This is great information for most retail-based applications of location intelligence, as well as some for financial investment and insurance.

###### 4. CARTO

Major data types: point of interest, property, mobility, demographics, boundaries, environmental, streets

CARTO previously specialized in environmental data, but now it’s partnered with over 40 other companies to provide most types of geospatial data. It even has pre-mapped datasets designed for analyzing specific geographic attributes. In total, it has over 10,000 datasets for a variety of location intelligence needs.

###### 5. ClimateCheck

Major data types: property, environmental

ClimateCheck has run historical US weather data through over 25 internationally-sanctioned climate change models. As a result, they can offer an assessment on any home in the US of how vulnerable it will be to weather-related damage over the next 30 years. That includes fires, storms, floods, droughts, and heat waves. So it’s a good resource for those in insurance or real estate investment.

###### 6. CustomWeather

Major data types: environmental, imagery

As its name implies, CustomWeather sells a number of different datasets on meteorological patterns and imagery of current global conditions. Among them are daily weather forecasts for over 8,500 locations worldwide, including monthly summaries and year-over-year comparisons of weather on specific days or during particular months.

It also has data on things like severe weather, air & sea travel, ski conditions, and wildfire danger. All of this is useful for conservationists and other environmentalists, of course. But it can also be applied in insurance, site selection, and decision-making for those operating (or investing) in businesses that may be impacted by inclement weather.

###### 7. Esri

Major data types: point of interest, property, mobility, demographics, boundaries, environment, streets, imagery

Esri is one of the largest location intelligence platforms out there, thanks in part to its leading mapping software, ArcGIS. It also has data for sale from over 150 partner companies, including almost every type of geospatial data that you’d need for any kind of location intelligence.

###### 8. Infutor

Major data types: property, demographics, address

Those who want to use location intelligence for marketing in the US should pay Infutor a visit. Its demographics datasets cover the social and commercial activity of over a quarter of a billion Americans. It also has data on US property attributes, plus an index of over 360 million US address records that includes some places overlooked by official US government records.

###### 9. Locomizer

Major data types: mobility

Locomizer has foot traffic data for points of interest in the UK, but also puts a unique twist on it. Its “brand affinity” dataset uses AI-driven location intelligence that blends footfall patterns and mobile application use data to estimate how likely someone is to engage with a particular brand at a certain place and time. That could be shopping, eating, learning, enjoying entertainment, traveling somewhere, and more. This particular dataset is superb for retail analysis, consumer insights, and investment decision-making.

###### 10. Mapbox

Major data types: mobility, boundaries, streets

Mapbox’s data-gathering is supported by a global community of over 500 million monthly active users. This lets the company offer datasets on over 20 billion daily mobility pings and over 30 billion road segments worldwide that are refreshed regularly for both real-time and historical traffic patterns. Mapbox also carries data on over 4 million administrative boundaries around the world.
So it’s a provider to consider for applications ranging from urban and telecom planning to retail analysis and logistics.

###### 11. Nexar

Major data types: streets, imagery

Nexar provides a unique combination of street and imagery data. It has developed AI for location intelligence on roads, not only photographing what roads look like but also identifying street signs or hazards that may affect traffic and driver safety. This can have implications for insurance firms, retail logistics, and urban planners.

###### 12. Regrid

Major data types: property

Regrid’s property data encompasses over 150 million parcels of land in the US, accounting for over 3,000 counties and almost 99% of the country’s populated areas. These datasets are flexible in that they can be bought based on certain attribute clusters and on the areas they cover (county, state, or the entire US). So anyone looking to use them for location intelligence can limit the amount of data they buy based on specific areas or attributes they need.

###### 13. Spatial.ai

Major data types: demographics

Spatial.ai enriches traditional demographics data for US census block groups with online traffic and activity data from nearby areas. This has allowed the company to create over 70 “geosocial profiles” of Americans based not only on who they are and where they live, but also what they do on the Internet. This data is great for those looking to use location intelligence in marketing.

###### 14. Tomorrow.io

Major data types: environmental

Tomorrow.io doesn’t just sell historical weather data for millions of locations worldwide. It also offers a coordination and monitoring platform that makes it easy for organizations to communicate and make decisions based on that data. So while insurers can use this weather data to assess risk, retail chains and other businesses can use the platform to minimize the degree to which weather will disrupt their operations.

###### 15. US Census Bureau

Major data types: demographics, boundary

The US Census Bureau provides publicly-available data on US census block groups and the demographics of people who live in them. This is useful information for planning communities and the infrastructure that supports them. It can also be handy for making demographics-based decisions regarding retail operations and financial investments.

SafeGraph has taken data from the American Community Survey for 2016-2019 and cleaned it for a bulk download. This includes polygon-based boundaries of census block groups.

###### 16. US Department of Transportation

Major data types: address

One of the US Department of Transportation’s projects is a National Address Database. It’s a collection of over 65 million address records from across the country, as provided by state, local, and tribal governments. This is critical data for urban planning, along with some other site selection applications.

###### 17. Veraset

Major data types: property, mobility

Veraset’s “Movement” dataset contains aggregated footfall measurements around points of interest from 150 countries worldwide. This gives you a comprehensive view of global footfall patterns. Its “Visits” dataset adds building footprints for over 6 million US properties to this foot traffic data, allowing for easy visit attribution. This makes it ideal for retail and financial investment applications.

For those who have wondered “what does location intelligence do, and what does it look like?”, we hope this guide has provided you with a starting point. You should now know what location intelligence is, who uses it, what it can do, and where to get the data you need for it.

#### What a Places API is (and the Best Alternative for POI Data)

Many mobile and web applications need to help users find locations of – and information about – points of interest. Where are stores of a particular chain, or that carry a particular brand of product? What type of cuisine is available at nearby restaurants?
How far is it to the nearest gas station or EV charging spot?

The problem is that most of the developers of these apps don’t have their own database containing this information, so they have to source it from somewhere.

Often, this “somewhere” is a Places API. So what exactly is this? And how does it work? We’re going to answer those questions, and in doing so, demonstrate how other options – like SafeGraph’s Places data – may be a better fit for your organization. Here, you’ll learn:

- What is a Places API?
- What types of data requests are available with a Places API?
- How a Places API works
- Places API vs. SafeGraph Places data: a comparison

We’ll start with the basics of what a Places API is, and list some common POI providers on the market.

##### What is a Places API?

A Places API is a service that fetches information about points of interest in response to HTTP requests. It is usually used by mobile or web mapping applications to help people find nearby facilities and relevant information about them, such as operating hours and contact details.

A Places API is often used by application developers when they require some part of their app to provide information about places, such as a store locator or accommodations booking widget.
The API allows the app to call this information from another company’s pre-existing database, rather than the developer having to collect, store, and manage this information themselves.

###### Common Places APIs you may have heard of

- Google Places API: A subservice of Google Maps that helps applications locate and provide details about establishments, points of interest, and other geographic locations.
- Foursquare Places API: Known for its crowdsourced place information apps, Foursquare has an API that lets you tap into facts about places, as well as traveler tips and opinions regarding places.
- HERE Geocoding & Search API: An API that allows for searching for places and place information by address, name, category, ID, and even distance from coordinates.
- Geoapify Places API: This API offers some unique ways to search for places, such as within a transit time or other type of isochrone, or by the amenities available at a place.
- TomTom Search API: Made by a pioneering consumer GPS company, TomTom’s Search API allows for searches filtered by geometries and categories, geocoding, autocomplete, and even the ability to look for electric vehicle charging stations.
- Precisely Places API: An API that allows for searching for places by name, address, coordinates, administrative area, category, or isochrones, as well as viewing place details or counting the number of POIs in an area.

##### What types of data requests are available with a Places API?

- Place Search: Displays a list of places within a certain proximity of a user’s location, and/or related to user-input search terms and other search criteria.
- Place Details: Retrieves available attributes describing a place.
These could include an address, business status, phone number, operating hours, photos, ratings, and reviews.
- Place Photos: Allows for resizing a photo referenced by a Place Search or Place Details request in order to properly display it in an application.
- Place Autocomplete: Predicts the name or address of a place as the user types it, and offers to fill it in. Also displays an on-the-fly list of places the user may be searching for.
- Query Autocomplete: Predicts a categorical search (not related to a specific place) as the user types it, and offers to fill it in. Also displays an on-the-fly list of possible search queries that may be relevant to the user’s search terms.
- Geocode: Takes an address and shows the precise geographic coordinates the address refers to.
- Reverse Geocode: Takes a set of geographic coordinates and returns the address of the point of interest.
- Address Verification: Checks if a place has a valid mailing address so that packages can be delivered to it.
- Phone Verification: Checks if a place has a registered phone number, helping to streamline communications and avoid fraud.
- Email Verification: Checks if a place has a valid email address registered to it for accurate email delivery and protection of potentially sensitive data.
- Streets: Finds information about the street(s) nearest to the point of interest, including the nearest intersection and the posted speed limit on that street (or streets).
- Property Information: Provides detailed information on a specific building, including its available amenities or sale-related data.

##### How a Places API works

Think of a Places API as a library that a mobile or web application can quickly consult for information about points of interest when a user asks about them.
A user can ask questions like:

- What notable places are near me?
- What can I find at places nearby?
- Are there places of a specific type nearby?
- Where can I find a specific place I’m looking for?
- Can I find a specific place even if I don’t know its full name or address?

Here’s a rough outline of how the Places API works:

1. The user of a mobile or web application takes an action that would require information about a point of interest, such as touching a location symbol on a map.
2. The application translates this action into an HTTPS request and sends it to a Places API, along with an API key that lets the Places API know the request is authorized.
3. The Places API uses the instructions in the request to search for the relevant information in its database. This could be a common query string, or a specific place or set of places denoted by unique place IDs.
4. The Places API returns a JSON or XML response that contains the information the request asked for.
5. This information is displayed to the user on the application.

SafeGraph’s Places data skips most of this process. It doesn’t need to be requested over HTTPS, and it doesn’t need an API key to access. The data can be delivered to your organization through a CSV file, or on common GIS or data management platforms like Esri, CARTO, Amazon Web Services, Snowflake, and Databricks Delta Sharing. Once it’s there, you can search through it or manipulate it pretty much any way you want.

##### Places API vs. SafeGraph Places data: a comparison

While a Places API is a convenient way to fetch point of interest information for mobile and web applications, it’s not without its drawbacks. One major disadvantage is that it requires an application to repeatedly make requests to the API whenever it needs the relevant information. And there are fees associated with most of these requests.
While these fees are typically small (usually about 3 cents per request at most), if an app is popular and issues a lot of requests from many different users, these fees can add up quickly.

Another downside of this setup is that the API developer retains the majority of the control over the data. The terms of use for many Places APIs have restrictions on the ability of app developers to keep any of the data they request from the API. Some may have even further restrictions on the kinds of applications that can use the data, including derivative works. You can see an example for yourself in the Google Maps Platform Terms of Service.

With SafeGraph Places, you can get a bulk CSV download of points of interest data – including detailed attributes about each place, not just the locations themselves – without having to call the data piece-by-piece from an API. This has two advantages in this context. First, it means we can work with your organization to decide on one flat price for the data that gets you better value for money. So there’s no need to be nickel-and-dimed for the data based on how often you use it.

The other advantage is that your organization gets to keep the data it buys and has greater creative control over it. Want to create a map of local EV charging stations? Model the landscape of competing or complementary businesses near your company’s stores? Explore how many essential facilities are accessible to residential areas in a city? Some Places API developers don’t allow their data to be used for these kinds of applications. But you can use SafeGraph Places data for all of these things and more.

Places APIs are repositories from which applications can get locations of – and information on – various points of interest. But with how they’re built, they don’t always give the most relevant information for certain use cases.
And with how they operate, their usage is often restricted, and they can rack up numerous transactions just to access the data – which can cost your organization a small fortune in fees.

In contrast, SafeGraph’s Places data is available for a flat negotiated price depending on what specific data your organization needs. It can be delivered to common GIS or data platforms, or given to your business as a CSV file – it’s all there right from the start, without having to call an API every time. And once your organization has the data, it’s free to put that data to work for several more use cases than many Places APIs allow.

Get in touch with us today to get a demo of what our Places dataset can do for your organization.

#### Where to Buy Alternative Data for Cutting-Edge Insights

‘Alternative data’ is a term that has become popular in the financial sector, as investors shift their attention away from quarterly earnings reports and press releases as exclusive sources of finance information. Instead, they’re looking for additional information that’s more readily available and can let them see the performance of potential assets through different lenses.

But where can they find this alternative data? What can (and can’t) it tell them? And how do they know the product they’re getting is correct, timely, and privacy-sensitive? We’ll explore these questions as we explain where to buy alternative data through the following sections:

- What alternative data is + why you should use it
- Using alternative data: buy-side vs. sell-side analysts
- 23 alternative data sources to buy top-quality alternative data

We’ll start with a fuller explanation of what alternative data is and why it’s worth using.

##### What alternative data is + why you should use it

Alternative data refers to financial analysis data a company sources from non-traditional areas.
In other words, the company doesn’t create this data via its own operations, nor does it collect this data from official financial information outlets (such as news releases and government-mandated reports).

So why should you, as a financial analyst, use alternative data? The answer has to do with the fact that, comparatively speaking, official financial data is limited in scope and published rather infrequently. That is, it doesn’t provide many contextual signals as to why a company’s financial situation is going in the direction it is. And by the time the data is released, you’ll likely be scrambling to react to major financial shifts as they’re happening.

Alternative data, meanwhile, can provide clues about a company’s performance that may be more closely linked to its on-the-ground operations. These include things such as visits to its brick-and-mortar stores, or what its official accounts on social networks are posting. Alternative data is also produced much more often. So if you monitor it and analyze the trends you find in it, you may be able to predict major market events and prepare for them long before they are officially announced.

##### Using alternative data: buy-side vs. sell-side analysts

Alternative data use cases can generally be categorized according to the two major financial analyst role groups: buy-side and sell-side.

- Buy-side analyst: Someone who scouts out investment opportunities for a company, such as a hedge fund or private equity firm, that invests directly on behalf of clients. They have a lot to survey and little room for error, so they will often combine advice from sell-side analysts with their own research. They can use alternative data for functions such as deal sourcing, due diligence, and portfolio management.
- Sell-side analyst: A person who works for an investment bank, brokerage, or other firm that sells access to financial information, advice, and other services.
In order to grow and maintain their firm’s clientele, they have to provide accurate and relevant insights to them faster than competitors can. So they may use alternative data for things like building predictive models of a company’s performance, including demand forecasts for that company’s products and/or services.

So where do analysts actually find the alternative data they need to carry out these functions? We’ll recommend some places in the next section.

##### 23 alternative data sources to buy top-quality alternative data

There are all different kinds of alternative data out there: credit and debit transactions (online or offline), social media activity, mobile app usage stats, and even geospatial factors like weather or traffic. You need to find sources that are thorough, accurate, up-to-date, affordable, and privacy-compliant. Here are 23 of our top suggestions.

###### SafeGraph

Major data types: points of interest, building footprints, transactions

SafeGraph provides detailed information for millions of locations around the world. We also build data that adds context to what happens at those locations, such as transactions. Our Spend dataset gives anonymized transaction reports (both online and offline) for specific stores and brands in the US.

You can use our data for things like understanding consumers’ shopping habits and relationships with different stores and brands, as well as the relationships between the stores and brands themselves. You can also monitor how much patronage businesses get through visit attribution.

###### BA45

Major data types: property

BA45 has data on the specifications of over 125 million US land parcels. These specs include details from how many buildings are on a parcel and who owns those buildings (or the overall parcel) to how many rooms are in a building and what its walls are made out of. This all makes BA45’s data great info for performing due diligence if you’re thinking about investing in either commercial or residential real estate.
It’s really affordable, too.

###### Bing

Major data types: imagery, roads

Microsoft’s search engine, Bing, has a map feature that includes the ability to see road networks and either aerial or satellite imagery for most places on Earth. It also has interactive street-level panoramic views for many US locations, as well as some locations in Europe and Canada. It’s a free and handy tool for scouting out real estate and business opportunities in an area based on how accessible the area is, what businesses are already there, and what the lay of the land looks like.

###### CAP Locations

Major data types: point of interest and property (retail stores, restaurants, and shopping centers)

CAP Locations has general, real estate, and building footprint data for over 1.2 million retail stores and restaurants in the US and Canada. That includes up to 300 information attributes, including things like parking lot capacities and when a business was opened (or renovated). CAP Locations also has spatial hierarchy metadata on over 40,000 shopping centers, including polygon-based geofence models for about half of them. So it’s helpful for those investing in malls (or certain brands common to malls) who want to know what kinds of businesses will likely succeed there, based on local shopping patterns.

###### CARTO

Major data types: demographics, human mobility, transactions, housing, road traffic, points of interest, environment, geography, social behavior

Originally an environmental conservation project, CARTO has grown into a leading alternative data hub. It features over 10,000 datasets from over 40 partnered data providers, most of which are geospatial in nature – human mobility, road traffic, local demographics, housing data, etc. These include over 750 pre-mapped datasets designed for specific analyses.
In short, CARTO’s platform allows you to use many different types of alternative data towards pretty much any kind of financial analysis you’d want to do.

###### ClimateCheck

Major data types: real estate; weather and environment

Those considering investing in residential real estate in the US should definitely have a look at ClimateCheck. It combines property data on over 140 million US homes with historical US weather data, extrapolated over the next 30 years based on over 25 international climate change models. This allows ClimateCheck to offer an assessment of how vulnerable a property is to weather-related damage or destruction, including wildfires, storms, and floods.

###### CustomWeather

Major data types: weather and environment; satellite imagery

CustomWeather offers a variety of weather data solutions. It has current and real-time condition monitoring; short-term and long-term forecasting; airport delay forecasting; severe weather tracking; maritime conditions and travel routes forecasting; satellite imagery; and historical weather comparisons for over 8,500 places worldwide. It also has special reports for things like sun & moon cycles, wildfire danger, ski conditions, and air quality.

Weather can impact all kinds of industries, and affects certain geographic areas more than others. So checking alternative data on it from a company like CustomWeather can help you mitigate risk in terms of what you invest in, and where.

###### Esri

Major data types: satellite imagery, demographics, transactions, social behavior, corporate summaries, environmental, foot and vehicle traffic, properties, points of interest, boundaries

Similar to CARTO, Esri is a rather catch-all company when it comes to places to buy alternative data – and geospatial data in particular. Its software platform, ArcGIS, is one of the most powerful mapping and location analysis tools available.
A major reason why is that Esri has filled it with all kinds of alternative data you can analyze, from property details and satellite pictures to company summaries and consumer spending habits. So ArcGIS can be customized to support any kind of geography-based financial analysis you need.

###### First American Data & Analytics

Major data types: property

First American Data & Analytics offers comprehensive property data that covers the entire US housing market. Much of its data relates to legal and ownership matters, such as sale prices, property taxes, mortgages, and foreclosures. But it also has data on property elements such as basic location and characteristics, land use zoning, flood vulnerability, and school catchment area. All of this is usable information for assessing the potential value or risk (either physical or financial) of investing in a specific piece of real estate or its homeowners.

###### Greenwich.HR

Major data types: finance, employment statistics

Greenwich.HR’s specialty is data on worldwide labor markets. Its data lets you look inside the hiring practices of over 5 million companies across over 200 countries. See things like how much new talent a company is looking to onboard, which kinds of positions are in demand, and what salary ranges for specific positions look like (Greenwich.HR boasts a nearly 80% completion rate for salary data). Data like this can give you a different perspective on what kinds of companies are likely to be successful in the near future.

###### HARNESS Data

Major data types: points of interest, property, and address (UK); PDF documents

HARNESS Data’s “Addressable” dataset is the most comprehensive list of points of interest, property details, and addresses in the UK. In fact, HARNESS Data is so sure that its product is the best that it offers, as a free sample, a list of prices per square meter for over 16 million pieces of real estate across England and Wales.
So if you’re investing in the British Isles, you’ll want to go to HARNESS Data to get alternative data.

HARNESS Data also sells a tool called “PDFx” that extracts data points from PDF document files, which can help speed up manual research.

###### Infutor

Major data types: property, demographics, phone & email communication, automotive & other transactions, addresses

Infutor is the authority on the economic activity of over 260 million Americans, and sells alternative data on many different subjects. These include car sales, phone & email communications, consumer demographics & purchases, and property attributes & addresses. Infutor’s wide variety of datasets can be useful whether you’re investing in real estate, automobiles, telecommunications, or other specific businesses.

###### Locomizer

Major data types: footfall, brand affinity

Locomizer’s main offering is foot traffic data around points of interest in the UK. What sets them apart, however, is their brand affinity estimate dataset. This uses a blend of footfall data and mobile application use data, fed through a machine learning algorithm, to determine how likely a person in a particular place at a specific time is to engage in a brand-related activity. That could be eating at a particular restaurant chain, watching a specific athlete or sports team play, taking a branded mode of transportation somewhere, and so on. This is very valuable for investors, who can not only see that people gather in certain locations, but also make educated guesses as to why they might be there and what they might do.

###### Mapbox

Major data types: boundaries, road networks & traffic, human mobility

Mapbox’s geospatial data is sourced from hundreds of platforms, fed by over 500 million monthly active users. That’s why it’s confident that it has the most accurate global datasets, covering over 4 million administrative boundaries, 20 billion daily human mobility updates, and over 30 billion road segments’ worth of typical and live traffic data.
Mapbox’s data is great for real estate portfolio management, accessibility and logistics analysis, demand forecasting, impact modeling, and more.

###### Regrid

Major data types: property

Regrid is a superb place to purchase alternative data if you’re investing in US real estate. It has up to 120 attributes’ worth of details on over 150 million US land parcels across over 3,000 counties. That includes over 155 million polygon-based building footprints. Regrid has flexible pricing options for its data, so if you’re only interested in properties within a certain area of the US – or certain details about them – you can choose a data package tailored and priced to suit your needs.

###### Spatial.ai

Major data types: social media, demographics

Spatial.ai takes data on social media activity, particularly posts where the user identifies where they’re posting from, and blends it with US demographic data. The result is what it calls “geosocial profiles”, approximations of people’s lifestyles in US census block groups based on activity indicated on social networks as being from nearby locations. These profiles can be very useful for investors trying to figure out what businesses are popular, or are likely to be successful, in specific US neighborhoods.

###### Tomorrow.io

Major data types: weather and environment

Tomorrow.io’s software platform and data are designed to help businesses plan and work to minimize disruptions to their operations caused by weather. With historical weather data for millions of places around the world, this is definitely a company to check out if you’re looking to get alternative data for factoring weather into your financial analysis.

###### Transparent

Major data types: short-term rental properties

Transparent’s data focuses on short-term property rentals, with over 50 data attributes on over 35 million listings around the world.
Now that the industry has gone mainstream with the advent of businesses like Flipkey and Airbnb, it can be an important one to watch, depending on your investment objectives. It’s obviously a factor in real estate investment, but it can also be one for travel and tourism investment, as many people rent these properties for vacations. Hotel financiers can also see how much competition their selected sites face.

###### Trust for Public Land

Major data types: boundary, property, and nearby demographic information for parks

The Trust for Public Land is an expert on public parks data, as it’s committed to conserving and expanding natural spaces for public recreational use. It offers free data on property information and geographic boundaries for parks in over 14,000 communities across the US. It also tracks some demographic attributes of populations who live near parks. This could be useful to investors for understanding the public recreation habits of people who live in a defined geographic area.

###### US Census Bureau

Major data types: demographics, boundaries

SafeGraph provides a cleaned-up version of the US Census Bureau’s American Community Survey, suitable for a bulk data download. The dataset includes geometric representations of US census block groups, as well as over 7,500 demographic attributes of the people who live within those neighborhoods. This is incredibly valuable information for investors to understand who consumers are in a given area of the US, and thus which businesses they are likely to patronize.

###### US Department of Transportation

Major data types: points of interest, boundaries, roads & transportation, addresses

The US Department of Transportation’s National Address Database project has recorded over 65 million confirmed US street addresses. Investors could use this to verify location details of businesses and real estate assets in the US.
A sub-organization of the USDOT, the Bureau of Transportation Statistics, also has all sorts of pre-mapped datasets that can be analyzed for transportation investment or assessing the accessibility of places. And all of this data is free and publicly available.

###### Veraset

Major data types: property, foot traffic

Veraset gives you a choice of two geospatial datasets. Its “Movement” dataset covers over 150 countries around the world, using hundreds of billions of anonymized GPS signals to track daily footfall around various points of interest. Its “Visits” dataset currently only covers the US, but combines foot traffic data with precise polygon property models to count daily visits to over 6 million businesses. Either dataset is handy if you’re looking at a company’s performance through the lens of how busy locations near their stores get, but “Visits” in particular makes it easy to attribute visits to a specific store.

###### Vertical Knowledge

Major data types: online purchases, property rentals, transportation transactions, corporate insights, points of interest

Vertical Knowledge is another company that has a little bit of all different kinds of alternative data, as it sources its datasets from what’s publicly available on the Internet. You’ll find data there on things like best-selling books, air travel, cruises, car rentals & purchases, short-term housing rentals, company summaries, and retail points of interest. So whatever your investment strategy, you’ll likely find some alternative data from Vertical Knowledge that will provide you with insights.

Hopefully you now have a better idea of what kinds of alternative data are out there, and where some of the most reliable places to get it are.

#### Where to Buy Geospatial Data from Reliable Providers

Geospatial data can be a valuable tool for many use cases, including real estate site selection, city planning, retail marketing, insurance assessment, and even financial investing.
But it’s difficult to collect the mass quantities necessary for meaningful insights on your own. It can be time-consuming and expensive, especially if you’re running other operations.

But it can also be difficult to know where to buy geospatial data from a reliable source. There are different types of geospatial data out there, and having inaccurate, incomplete, or the wrong type of data can do more harm than good. So you need to know what specifically you’re looking for and where you can get data that is correct, current, detailed, and easy to work with.

To help you out, we’ll give short explanations and list some reliable providers for the following data types:

- Point of interest (POI) data
- Property data
- Mobility data
- Demographic data
- Address data
- Boundary data
- Environmental data
- Streets data
- Imagery data

Let’s look at some dependable places where you can buy geospatial data.

##### Point of interest (POI) data

Point of interest (or POI) data gives important information about non-residential physical places. These can be things like businesses, government or public service buildings, monuments, and other landmarks. This type of data has many applications, including mapping, market research, and urban planning.

###### SafeGraph

SafeGraph’s Places dataset features comprehensive information on over 40 million points of interest around the world. We update it every month to account for businesses opening, moving, or closing locations. The dataset also features a persistent address standard called Placekey to solve issues with differently-formatted addresses that refer to the same place.

###### CAP Locations

CAP Locations specializes in data on restaurants, retail stores, and malls across the US and Canada. It has data on over 1.2 million individual businesses and over 42,000 shopping centers, including building footprint and spatial hierarchy property data.

##### Property data

Property data consists of details about a specific parcel of land or a building that sits on it.
This can include ownership information such as how much it was last purchased for, the length of the current lease on it, and any other financing details. It can also include polygon-based representations of building footprints, sometimes with spatial hierarchy metadata (i.e. multiple units within the same building). It is mainly used for visit attribution (i.e. determining if someone entered a physical space) and for risk assessment in either insurance or finance.

###### BA45

BA45 provides over 85 attributes’ worth of information on over 125 million commercial and residential US properties. That includes information on land use, zoning, the number of buildings on the land parcel, and who owns the land or buildings.

###### First American Data & Analytics

First American’s data deals with the financial attributes of the entire US property market: assessed value, last sale price, mortgage information, lease terms, and more. That means it’s most useful for investors or insurers assessing the financial or physical risk associated with real estate transactions and home ownership.

To see how, watch our webinar featuring First American Data & Analytics on using third-party data to perform advanced types of property analysis.

###### Regrid

Regrid’s property data covers over 150 million US land parcels, which accounts for about 99% of the country’s population. You can get data for the entire US, a particular state, or a specific county within a state. You can also choose options based on what data attributes you need.

##### Mobility data

Mobility data, in a geospatial sense, refers to anonymized movement patterns of people around points of interest over time. It provides a general picture of the flow of human traffic, as well as where people tend to congregate and stay for longer amounts of time.
It has applications for real estate site selection, market research, insurance risk assessment, and transportation planning.

###### Locomizer

In addition to general POI-based footfall data for the UK, Locomizer offers brand affinity estimate data. This dataset, powered by machine learning, calculates the probability of people interacting with a particular brand (e.g. entering their store, buying their products, or paying for their services) versus other nearby brands in and around a point of interest.

###### Veraset

Veraset has foot traffic data for points of interest in over 150 countries around the world. It also has a dataset that combines building footprint data for over 6 million commercial US properties with mobility data for the surrounding area. This makes retail visit attribution a snap.

##### Demographic data

Demographic data describes the people who live within a certain geographic area. That includes things like age, gender, income, and ethnicity. It’s most often used for businesses to research their clientele and make decisions accordingly.

###### US Census Bureau

The US Census Bureau provides publicly-available demographic data on Americans in various places across the country. But this data isn’t in a very workable format for companies to get many meaningful insights out of it. That’s why SafeGraph provides a bulk version formatted for more large-scale granular analysis.

Our webinar on how packaged goods manufacturers can combine census data and POI data to perform local market analysis demonstrates how useful this is.

###### Spatial.ai

Spatial.ai enriches demographics data with geotagged social media posts (i.e. posts where the location is provided) to create what it calls “geosocial profiles”. So it can tell you not just basic attributes of who lives in US census block groups, but also what their overall lifestyles are like.
It has over 70 categories based on attributes like leisure, relationship status, diet, pastimes, and more.

###### Esri

Esri is one of the leading geospatial data companies in the world. Part of that is its best-in-class geographic information management tool, ArcGIS. It also partners with over 150 trusted data providers around the world, so you can get geospatial data of pretty much any kind from them. That includes demographics data for over 130 countries that can be organized and filtered by over 15,000 variables.

##### Address data

Address data covers navigation-oriented location designations for places. These can include geographic coordinates, postal addresses, or street addresses. This kind of information is usually used for administrative purposes, including verifying that a POI or residence actually exists at a given address. It’s also used for geocoding and reverse-geocoding, which means converting an administrative address to latitude and longitude or vice-versa.

###### Infutor

Infutor’s main strength is its demographics data, enriched by anonymized property and shopping data, that covers over 260 million Americans. However, its National Spatial Reference File is also a great resource. It contains over 360 million points of address data – including geographic coordinates – to help find places in the US that official government records might miss.

###### US Department of Transportation

The US Department of Transportation has a National Address Database featuring over 65 million confirmed street addresses in the US. The USDOT works on this project along with state, local, and tribal governments to enhance transportation safety, emergency response, mail delivery, and other administrative services.

##### Boundary data

Boundary data gives information about political borders and other geographic divisions that extend beyond a single property. These also include things like school catchment areas, city jurisdiction limits, and other regional separators.
They’re commonly used in mapping, but can also be used for government planning, real estate brokering, and retail site selection.

###### Trust for Public Land

The Trust for Public Land is an organization dedicated to preserving natural spaces for public use. As part of that mission, it provides data on the boundaries, property details, and surrounding demographics of public parks in over 14,000 US communities. This is useful for factoring green space into urban planning.

###### CARTO

CARTO is similar to Esri in that it’s an all-in-one geospatial data company. Through partnerships with over 40 other data providers, it allows you to purchase geospatial data of many different kinds, including boundary data. CARTO also provides tools and services for analyzing this data, so you can understand why certain things happen where and when they do.

As an example, watch our webinar with CARTO on creating customer catchment areas for retail stores using foot traffic data.

##### Environmental data

Environmental data is about natural phenomena as they relate to specific places on Earth. These most commonly include temperature, weather, and climate trends. They can also be things like tides, seismic activity, land elevation, or habitats and migration patterns of non-human organisms. Environmental data is typically used by natural scientists in their research, but it can also be used by insurers to assess the risk of climate-related injury or property damage.

###### Tomorrow.io

Tomorrow.io provides historical weather data for millions of locations around the world, meant to help all manner of organizations anticipate and act to mitigate the impacts of inclement weather. It also provides a software platform to make it easier to monitor this data, form action plans based on certain weather conditions, and communicate between teams for timely weather-related decision-making.

###### ClimateCheck

ClimateCheck uses over 25 internationally-recognized climate change models to analyze historical weather patterns in the US.
Based on this, it can help insurers, real estate companies, and individuals assess the risk of climate and weather-related damage to over 140 million US homes.

###### CustomWeather

Among its meteorological services, CustomWeather sells daily weather data collected from over 80,000 sources around the world. The data can also be arranged to provide monthly weather overviews, year-over-year comparisons of weather during a particular month, and more.

##### Streets data

Streets data provides information on transportation networks. Most often, it covers roads for automobiles (along with attributes such as traffic volumes and potential obstacles). However, it can also include train tracks, bus routes, walking trails, and even air or sea routes. Its main use is for mapping and measuring an area’s accessibility, as well as planning how to get from one place to another.

###### Mapbox

Mapbox taps its over 600 million monthly active users worldwide to deliver some of the most accurate and up-to-date data regarding streets around the world. That includes both real-time data on traffic volume, as well as average traffic volume based on historical trends.

##### Imagery data

Imagery data refers to pictures that show what the physical world looks like. They are typically taken from airplanes or satellites, but can also be taken from cameras closer to the ground (e.g. “street view” cameras). They are often used to give visual context to other types of geospatial data, which has applications ranging from advertising to conservation efforts.

###### Bing

Bing is Microsoft’s web search engine. As part of its services, it provides a mapping platform that includes aerial imagery of most places on Earth. It also has street-level views of many places throughout the US, as well as some places in Canada and western Europe.

The data is out there – you just need to know the right places to get it from. Hopefully, this list makes your quest to get geospatial data that’s right for your organization a little easier.
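If you’re curious how the visit attribution use case from the property data section (determining if someone entered a physical space) can work in practice, here is a minimal Python sketch. It assumes a building footprint is available as a simple polygon of coordinate pairs and that location pings are plain points; the names `point_in_polygon`, `count_visits`, `store_footprint`, and `pings` are hypothetical and not part of any provider’s product. It uses a standard ray-casting test on flat coordinates, which is only a rough approximation of what a real attribution pipeline would do (no GPS error handling, dwell time, or map projection).

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y), e.g. (longitude, latitude) treated as flat coordinates


def point_in_polygon(point: Point, polygon: List[Point]) -> bool:
    """Ray-casting test: return True if the point lies inside the polygon."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point can be crossed
        if (y1 > y) != (y2 > y):
            # x-coordinate where this edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside  # each crossing to the right flips the state
    return inside


def count_visits(pings: List[Point], footprint: List[Point]) -> int:
    """Count how many pings fall inside a building footprint."""
    return sum(1 for ping in pings if point_in_polygon(ping, footprint))


# Hypothetical rectangular footprint and a handful of made-up pings
store_footprint = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]
pings = [(1.0, 1.0), (5.0, 1.0), (2.0, 2.5), (-1.0, 4.0)]

visits = count_visits(pings, store_footprint)
```

In this toy example, only the pings that land inside the footprint count as visits, while the ones that pass by outside the building are ignored. That is the basic distinction between “entered the store” and “walked past it” that building footprint data makes possible.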
#### Where to Buy Location Data: Best Providers for the Top 9 Types

Location data has become increasingly vital as businesses and consumers seek geographical context to make more informed data-driven decisions. In today’s world, location data is readily available, but finding it in the right place and format is what makes the difference. That is why understanding where and how to buy location data is essential.

Geospatial data varies across a broad range of factors, including data accuracy, depth, and area coverage, all of which can influence the overall quality, credibility, and cost of a project. Businesses should be aware that the quality and standards of the data vary depending on the geospatial data provider. To accomplish a project, some organizations require the use of multiple datasets that complement one another. As a result, they must understand how to buy location data that is both trustworthy and relevant to the task at hand.

This article highlights the top types of location data, and where to buy it based on the type of data you need:

- Points of interest (POI)
- Property
- Mobility
- Demographic
- Address
- Boundary
- Environmental
- Street
- Imagery

Check out the type of data you’re looking for, and learn about the best place to acquire it based on your needs and what you’re using it for.

If you need to learn about the top 9 types of location data in more detail, check out our detailed guide on Geospatial Data Types and How You Can Use Them.

##### Top 9 Types & Where to Buy Location Data Based On Your Needs

If you’re looking to purchase location data for your business, it’s best to first figure out what kind you need and then find a reliable place to get it.

We’ve done a bit of the legwork for you here by giving you a brief overview of 9 major types of location data, along with some of the most common use cases for each of them. We’ve also included a brief description of one of the leading providers of each data type to help you start your search.

###### 1. Points of interest (POI) data

Points of interest data gives information on most types of buildings, properties, and landmarks. These can include shops, arenas/stadiums, restaurants, monuments, and other notable natural or man-made landmarks (such as Uluru / Ayers Rock in Australia or the London Eye in England). Generally, POIs do not include places where individuals live, like homes or apartment units. Each POI dataset and provider is different, so definitions and inclusions vary.

Primary use cases of POI data:

- Mapping projects to see where things are on Earth
- Real estate business opportunity analysis
- Market condition assessment for retailers and CPG brands
- Financial analysis (opening & closing of businesses across trade areas)
- Healthcare planning based on locations of existing facilities

Best place to buy points of interest (POI) data: SafeGraph

At SafeGraph, we carefully curate our POI data to represent the source of truth for physical places. Our enhanced ML algorithm is designed to keep the data reliable and accurate without jeopardizing privacy standards. We also offer a wide range of data services in several forms, serving businesses, industry, and the general public.

###### 2. Property data

Property data shows the boundaries of individual buildings or parcels of land. The most valuable property data includes spatial hierarchy metadata, i.e. multiple units within the same property (apartments, mall stores, offices, etc.).

Primary use cases of property data:

- Mapping – more accurate than POI data for showing what places actually look like
- Visit attribution – did a person enter a building or just walk by it?
- Insurance – assessing building risk factors

Best place to buy property data: SafeGraph

SafeGraph also has property data in the form of its Geometry dataset. This data provides building footprints and spatial hierarchy metadata for millions of points of interest in the US, UK, and Canada.
This helps to determine the size, area, and boundaries of places, as well as their location-based relationships to each other.

###### 3. Mobility data

Mobility data represents anonymized counts of people visiting a point of interest or a neighborhood (i.e. census block group, or CBG). It can often show not only how many people visit, but also when they visit, how long they stay, where they come from, and where else they go.

Primary use cases of mobility data:

- Store and advertisement placements for businesses
- Writing general liability insurance policies based on a POI’s (seasonal) visit count
- Urban planning for transportation routes, housing, etc.

Best place to buy mobility data: Veraset

Veraset has mobility data which provides monthly anonymized information about visitors to many points of interest.

###### 4. Demographic data

Demographic data counts people in a geographic area and sorts them based on various attributes: age range, gender, marital status, employment status, and more.

Primary use cases of demographic data:

- Understanding a business’s potential customers – spending power, lifestyle, etc.
- Choosing where to locate stores and ads based on nearby economic strength
- Determining what products and brands to carry for expected clientele

Best place to get demographic data: US Census Bureau (via SafeGraph)

Cost: Free

A lot of demographic data is publicly and freely available through government agencies, such as the US Census Bureau. However, it isn’t always easily accessible or cleanly formatted for the kinds of analyses it’s used for. That’s why SafeGraph offers a bulk download of US census data with over 7,500 attributes, organized by (and paired with geometry from) census block groups. This makes it easier to put to work right away, or combine with other data sets.

###### 5. Address data

Address data provides specific information on where a location is on Earth, through coordinate pairings, street address attributes, or both.
Addresses form the basis of geocoding and reverse geocoding processes, and can be enriched with other geospatial data to further understand what is happening at a specific place.

Primary use cases of address data:

- Mapping and visualizing where places are located on Earth
- Representation of residential buildings (in contrast to POI data)
- Geocoding and reverse geocoding (street addresses ↔ coordinates)
- Giving context to other geospatial data types (e.g. weather, streets, jurisdictions)

Best place to buy address data: Infutor

Cost: 0.5¢ per record

Infutor is a leader in consumer identity management and resolution. Its National Spatial Reference File is the largest and most comprehensive database of addresses and geographic coordinates in the US.

###### 6. Boundary data

Image source: Wikimedia Commons

Boundaries denote borders between geographic areas large enough to contain multiple POIs, properties, or addresses. Examples can include countries, states/provinces/regions, or catchment areas for certain services (such as school districts or emergency services).

Primary use cases of boundary data:

- Political and other organizational separations in mapping
- Urban planning to ensure schools and other public services cover sufficient areas
- Business or real estate planning based on jurisdictional rules or other attributes

Best place to buy boundary data: CARTO

Cost: Varies according to needs; contact their sales team for a demo

CARTO is the leading cloud-native location intelligence platform. It has nearly 10,000 sets of geospatial data, including over 600 on geographical boundaries.

###### 7. Environmental data

Image source: Esri ArcGIS

Environmental data covers natural phenomena.
These include weather and temperature patterns, land elevation, seismic activity, tides, and plant/animal habitats or migratory patterns.

Primary use cases of environmental data:
- Environmentalism and natural conservation efforts
- Crisis and disruption response for businesses
- Insurance – assessing risk based on likelihood of (and vulnerability to) natural disasters

Best place to buy environmental data: ClimateCheck

Cost: 5¢ per record

ClimateCheck combines data from over 25 international models on climate change to gauge the risk of climate-related damage to US properties over the next 3 decades. Its Climate Risk Snapshot dataset assesses over 140 million US properties for how vulnerable they are to natural disasters like heat waves, droughts, floods, fires, and severe storms. This makes it valuable for insurers, real estate agents, and others.

8. Streets data

(Image source: Ordnance Survey Data Hub)

Streets data gives information about transportation networks – most often roads for cars and other land-borne vehicles. That can include traffic and other obstructions.

Primary use cases of streets data:
- Gives context to address data in mapping (i.e. how to get from one place to another)
- Route planning functions in GIS software
- Planning detours if routes are obstructed or blocked off

Best place to buy streets data: Mapbox

Cost: Annual subscriptions that vary based on the geographic region(s) covered

Mapbox Traffic Data is powered by machine learning algorithms, processing inputs from over 600 million active monthly users (across over 45,000 mobile applications) and constantly testing them against observed moving vehicles. Processing over 1 million trips and 300 million miles daily, Mapbox is able to deliver constantly-updating streets data for over 30 billion road segments around the world, checked against over 400 billion live location updates each month. Plus their data works with multiple platforms, such as OpenStreetMap, HERE, or TomTom.

9. Imagery data

(Image source: United States Geological Survey)

Geospatial imagery data shows what places in the world physically look like. It helps to give concrete visual reference to more abstract types of location data.

Primary use cases of imagery data:
- Gives context to other types of geospatial data in mapping (i.e. a basemap)
- Moment-by-moment updates on Earth’s status for environmentalists/conservationists

Best place to buy imagery data: Esri

Cost: Annual subscriptions that vary based on accuracy and quantity of data purchased

Positioning themselves as the company that deals in “the science of where”, Esri has partnered with many leading location data providers to offer many different types of geospatial data. Their “Places Data” features land parcels and building footprints that can provide insights into property ownership, municipal zoning, and more. Esri also makes the popular geographic information system software ArcGIS.

That’s a quick rundown of the key types of location data and some of the best places to find each of them. Of course, you can start right here at SafeGraph: we have POI and property data.

#### Where to Get Business and Store Location Data + Why You Should Use It

A street address is just one piece of location data for a business. Unfortunately, street addresses don’t tell you much about the business besides how to get there or where to send them mail. If you’re in private equity, banking, insurance, real estate, or retail, comprehensive business location data can tell you so much more.

You can learn which businesses and brands are hot (or not) in a certain area, where their customers come from (and their demographics), where the prime spots to live or set up shop are, how at-risk a store is for an accident or weather-related damage, and more.

As useful as these insights sound, you first need to know where to get the data for analysis.
You can collect it yourself manually or through publicly-available information, but that will usually take up too much of your company’s time, money, and effort. We’ll point you towards some places to get usable geospatial data on stores and other businesses quickly, as part of our agenda:
- What is store location data?
- Types of store location data
- 6 benefits of using store location data
- Where to get business & store location data

We’ll start by explaining what “location data” on stores and businesses encompasses.

What is store location data?

Business and store location data is a type of “point of interest” (POI) data. This refers to data on non-residential buildings or other locations that people may want to visit. At its most basic, this includes geographic location, but can also include other attributes as well.

Specifically, retail store location data likely includes attributes such as hours of operation, categories (and brands) of products sold, services provided, and average pricing compared to nearby businesses in similar industries. Food-oriented stores, including restaurants, may also have data on the specific types of foods they sell or serve.

Information on stores and businesses can also include other types of data from the geospatial ecosystem, as we’ll explain in the next section.

Also, keep in mind that geospatial information regarding stores and businesses can be very dynamic. Businesses may change their pricing, inventory, or operating hours every so often. Stores may also change locations or close down depending on where the overall business sees market opportunities.

That means it’s important to get location data that’s relatively fresh, to make sure the information you have on specific stores and businesses is accurate.
This is why SafeGraph’s Places data is updated monthly, to account for changes like these.

Types of data you can join to store location data for deeper insights

Information about store and business locations can be broken down into several different categories. Here are some ways store location data providers may represent the information they have on businesses.
- Point of Interest: General location and descriptive information about non-residential buildings and other landmarks. Helpful for conducting retail trade area analysis and site selection based on businesses (including potential competitors) in an area.
- Property: Polygon modeling of a building’s shape and footprint, which sometimes includes units within buildings. Useful for visit attribution (i.e. determining how many people actually set foot inside a store) and for insurance risk assessments.
- Mobility: Measures foot traffic around stores and other POIs. Mainly used by businesses to inform their site selection and inventory selection, as well as where to place advertisements.
- Demographics: Population counts segmented by various attributes (age, gender, income, lifestyle, etc.). Gives businesses a general idea of what types of people frequent their stores.
- Address: Location as provided by geographic coordinates or postal information. Useful for helping customers find businesses relative to street data and their own location.
- Environment: Local weather conditions and natural phenomena. Can be useful for site selection and risk assessment based on vulnerability to fires, floods, storms, earthquakes, etc.
- Imagery: Photographic depictions of a place. Can be used to help a store’s customers pick out the store from its surroundings, or for a business to showcase some of its products.

6 benefits of using store location data

So what’s the use in having retail location data? What can you do with it? Quite a few things, as it turns out.
Geospatial data for stores and businesses has applications in sectors ranging from retail itself to the likes of insurance, real estate, and financial services.

Here are a few ways companies in these industries can use location data for businesses and stores.
- Conduct comprehensive trade area analysis: See what types of businesses are flourishing or have an opportunity in a geographic area based on what shops are already there.
- Improve site selection & deselection: Place stores and advertisements in areas where you’re bound to attract lots of customers, and move away from places where there’s too much competition.
- Help people settle closer to things they need: Use metrics like grocery store location data to spot real estate opportunities, so people can settle down where important services are easily accessible for them.
- Assess insurance liability: Where a store is and how many people typically visit it can indicate how vulnerable it is to damage or accidents that insurance may need to cover.
- Build customer demographic profiles: Get a sense of who your customers are and what their lifestyles are like to adjust your operations and product/service offerings accordingly.
- Show off your store and what it offers: Give customers a visual reference for what your space looks like so they can find it easily, and show off your services or products in action to entice them.

Where to get business & store location data

Business location data management is difficult to do on your own. It requires a lot of manual research and/or setting up systems to monitor any changes in information. Instead of reinventing the wheel, it’s much easier to purchase your data from companies like SafeGraph that specialize in location data collection. This allows you to get the data that’s most relevant to you (whether in type or attributes), and your supplier will often do most of the work in keeping you updated if something changes.

Here are some top providers.

1. SafeGraph

Cost: Contact for pricing

SafeGraph has comprehensive data on millions of businesses, stores, and other points of interest across the world through our Places dataset. We also provide accurate footprints for these places through our Geometry dataset.

This combination of datasets is great for mapping, visit attribution, and more.

2. CAP Locations

Cost: Contact for pricing

CAP Locations specializes in location data for malls and other shopping centers - over 40,000 across the US and Canada, in fact. They also have data on how many people visit these places per hour, what general neighborhoods they come from, and even what shops inside the mall they visit. They’ve even expanded the analysis range for almost half of the shopping centers they cover to create comprehensive trade areas for broader study. Finally, they provide POI data on over 1.2 million retail stores and restaurants, including persistent unique IDs, store categories, ownership hierarchies, and more.

3. SMR Research

Cost: $4/report

One of SMR’s product offerings is their Enhanced Commercial Property Database, which goes the extra mile to provide attribute information for stores that public records may not have. These attributes include building tenants (not just the owners), approximate square footage, owner contact information, what the property is specifically used for, its approximate value, and its specific POI name. They also provide links to images on GIS platforms, grouping for real estate parcels that are part of the same property (e.g. apartments, stores in malls, offices in business complexes), insurance and credit risk models, and more.

4. CRED iQ

Cost: $300-$400/user/month; free option available

Beyond basic property data, CRED iQ provides advanced financial data on stores and businesses in the US. That includes details on whether a property is being financed by a loan, and the full terms of that loan.
Their higher-tier services also provide data on tenants who are leasing properties and when those leases expire, financial summaries of tenants, approximate valuations of properties, contact information for property owners, and more. It’s all useful information for commercial investors, banks assessing credit risk, or those looking to lease their properties.Why worry about how to collect location data, much less how to organize it into a useful form, when companies like SafeGraph already do most of the heavy lifting for you? Visit SafeGraph to sample the kinds of data that can boost your business’s analytics and decision-making. #### Why Accurate Data is Important for Business Operations There are many maxims out there about how data has become one of the most critical resources to businesses and other organizations. At SafeGraph, we agree that institutions can make better decisions when those decisions are driven by data. However, we also offer the caveat that simply having more data to work with rarely, if ever, increases the likelihood that the right decisions will be made. In fact, we would argue it’s much more important for data to be accurate than abundant. Basing a decision on incorrect or irrelevant data is often worse than not having enough of the right data to support a decision. As a way of explaining how, we’ll look at why accurate data is important through each of the following sections: What is accurate data? Why data accuracy is important in business 5 other benefits of having accurate data for business operations What causes data inaccuracy (and how to avoid it) How to improve data quality How SafeGraph ensures data quality Before we get too deep into things, let’s answer a fundamental question: what makes data accurate? What is accurate data? Accurate data refers to information that reflects reality or another source of truth. That is, it can be tested against a fact or other evidence to determine that it represents something how it actually is. 
This could include things like a person’s contact information or a place’s location on Earth. Accuracy is often confused with precision, but the two terms differ. Accuracy is about how close a value is to the truth, while precision is about how close repeated values are to one another, regardless of whether they are correct. So data can be accurate, precise, both, or neither.

5 factors, in addition to accuracy, that affect data quality

So why is accurate data important? On a macro level, it’s part of a group of interrelated factors that affect how reliable data is for various use cases. This is referred to as “data quality”.

Here are explanations of the other attributes that contribute to data quality:

Completeness – It’s difficult to judge the quality of data that isn’t available in the first place. Likewise, if certain data is missing from a dataset, it can be more difficult to draw reliable conclusions from the data that is available.

Relevance – Quality data can still be unhelpful if it doesn’t answer the question(s) your organization is interested in. Before gathering data, have your company set clear intentions on what it wants to learn and why. This lets your organization have an idea of what kinds of data to look for right from the start.

Validity – Another important aspect of data quality is making sure your organization can reasonably compare similar types of data. If data is dissimilar, including being presented in different formats (e.g. 12-hour vs. 24-hour clock) or measured by different units (e.g. pounds vs. kilograms), it can be difficult to organize or analyze properly.

Timeliness – Tied closely to data accuracy is the time between when data is produced and when it is collected and used. The shorter this time period, the more likely the data is to remain accurate.
Conversely, the longer it has been since the event data refers to has occurred, the more likely conditions have changed and the data is no longer relevant.‍ Consistency – Related to our earlier discussion of precision, consistency refers to how often data is accurate across multiple datasets. Even if data is correct in one dataset, if it is different in content or format in another dataset, then separate groups could draw unique conclusions and be working under non-uniform assumptions. This can make it difficult for departments within the same company, or multiple cooperating companies, to work together efficiently. Why data accuracy is important in business Next, let’s look at the following question through a corporate lens: “Why is it important to have accurate data?”. Modern businesses are integrating data into more and more of their operations. While this carries the promise of greater competitive advantages if done correctly, it also means there’s much more to lose if the data is wrong. The following points will illustrate why having accurate data is critical to various facets of your company. 1. It enables better decision-making Businesses can be more confident in the decisions they make if they have accurate and relevant data as evidence to base those decisions on. This has a number of benefits, including decreasing risk and making it easier to achieve consistent results. 2. It improves productivity More accurate data makes your business more efficient for a very simple reason. The fewer inaccuracies your company’s data has, the less time employees will have to spend finding and correcting these errors. That frees up more time for employees to work on the tasks and projects your organization wants to prioritize. It also makes it easier for your business’s various departments to work together efficiently. 3. 
It focuses audience targeting and marketing efforts

With accurate data on your company’s customers, it becomes easier for your marketing team to know exactly what your target audience is. Accurate data also helps your business expand its advertising efforts through appealing to consumers with similar traits to those in your core customer base. It can even inform your organization’s content or product design in order to keep existing customers engaged.

4. It develops and preserves brand credibility

Accurate data builds trust in your business from both inside and outside. Internally, quality data that helps make a more productive, reliable, and successful company can smooth the adoption of cutting-edge data-driven technologies and systems. Externally, quality data – when it’s properly managed – helps to show customers that your organization is responsive to their needs, takes their security seriously, and provides reliable information. It also simplifies compliance with ever-changing industry regulations.

5. It saves time, money, and other assets

In helping your business do all of these positive things, it also follows that accurate data helps your company avoid a number of pitfalls. At base, it reduces the need to spend time and money finding and fixing errors in the data. This is a resource-intensive task, and if it isn’t done properly, it can lead to further problems – especially because data errors tend to compound on top of one another.

For example, bad data can lead to mistargeted marketing efforts. This means your organization is wasting time and money advertising to demographics that aren’t likely to yield customers. Worse, this can make existing patrons feel your company is no longer catering to what they want or giving them useful information, and so they may start searching for alternatives. Poor quality data can also cause your business to run afoul of industry regulations, resulting in further damage to its credibility – not to mention expensive fines.
5 other benefits of having accurate data for business operations The above reasons list why accurate data is important to a business, but there are other benefits too: Better AI implementation: Many modern businesses are using machine learning and other artificial intelligence techniques to automate processes and quickly build predictive models. But these algorithms are often only as good as the data used to train them. That’s why it’s important to use accurate and consistent data, as this leads to more reliable outputs. Easier identification of core problems: A pitfall of poor quality data is that errors are often caused by other errors, making it difficult to trace where the root issue occurred. Having more accurate, consistent, and timely data makes it simpler to isolate and correct mistakes without having to wait for high-level signals that something has gone wrong. Competitive advantage: Business is by-and-large a competition. So having quality data helps your company keep up with competitors and industry trends. With accurate data, your organization may be able to spot and take advantage of opportunities faster than your rivals can. Without accurate data, your business can fall behind the times. Improved customer service: A key part of satisfying customers is to understand their perspectives and be responsive to their needs. Having accurate data about their preferences and interests aids your business in preparing for what they may need assistance with, and perhaps also what they want to learn about or purchase next. This cycle of feedback and engagement helps build and retain a loyal customer base.‍ Increased ROI: In essence, data is an asset that a business has to invest in. So taking the care to ensure its quality right off the bat means there won’t be as much need to do so down the road. Ultimately, this lowers the costs associated with the data and lets your company start generating value from it sooner. 
What causes data inaccuracy (and how to avoid it)

We’ve spent much of this piece answering the question “Why is it important to ensure that data is accurate within a company?” Now, let’s approach the question from a more fundamental angle: how does data become inaccurate in the first place?

Things are always changing, so it’s impossible to get data 100% right, 100% of the time. However, there are certain processes and systems (or a lack thereof) within organizations that tend to cause data to be further away from reality than it should be. Here are five examples, along with explanations of how to manage them to avoid data quality degradation.

1. Manual data entry

Human error is a common cause of inaccuracies in data. No matter how detail-oriented and careful someone is, they are still at risk of making mistakes when transcribing data. This risk increases with the amount of data a person has to manage, as well as with the number of people who are allowed to access and edit data.

Solution: Install systems in your organization’s databases to check for common input errors. Spell checking is a key one, but so are validation rules for making sure data is entered in the correct format and measurement. Note that even these aren’t immune to human error, so be sure to test them regularly to make sure they work properly. It’s also a good idea to put controls in place to manage who can access and edit certain data in your organization. This reduces the risk of someone who shouldn’t be editing your company’s data tampering with it.

2. Lack of data standardization

Another frequent cause of poor data quality is a lack of validation standards. Data could be correct, but could still cause sorting and analysis problems if there are formatting changes between similar records or multiple versions of the same record. Examples include uppercase vs. lowercase letters, punctuation, abbreviations, units of measurement, and date formats (e.g. 4/3/2022 could be April 3rd or March 4th, depending on whether month-day-year or day-month-year formatting is used).

Solution: There should be organization-wide norms on how to classify different types of data, and what format each one should be in. Set out clear guidelines so there’s no ambiguity as to when a certain kind of data is being referenced and how it should be represented.

3. Data decay

Data decay is the opposite of data timeliness. It occurs when the status of something in the real world changes, making data that refers to it no longer accurate or relevant. This usually happens when certain data is not used or accessed for an extended period of time. And that is often a symptom of a company investing too heavily in data collection instead of tools to clean, sort, and manage data in a timely manner.

Solution: Have a diligent data team that stays on top of potential changes to data and revises it regularly. Investing in automated data management systems and/or dedicated data quality tools can help as well. A more general way to address this problem is to focus on collecting relevant and accurate data for your business, rather than try to collect as much data as possible.

4. Data siloing

Data siloing refers to a problem where data someone within an organization needs is somewhere inside that same organization, but the person cannot access it. They may lack the proper authorization credentials for that space, or they may not even know the data exists there. This can prompt an employee to try and find comparable data from outside sources. And that can cause data consistency issues due to duplicate records, especially if the outside data is different in content or format than what an organization already has on file.

Solution: As with data standardization, having a well-defined system of validation rules and categorization for what certain types of data are (and aren’t) can help reduce inconsistencies.
Another step that can be useful is to invest in a dedicated data catalog solution. This can help people in your organization know what data is available to them, evaluate its relevance to a particular use case, and seamlessly gain access to it.

5. Poor data culture

A general reason why data inaccuracy can occur at an organization is that employees have not been trained to pay attention to data quality. This is because, traditionally, it’s been thought to only be important to IT teams and BI specialists. Other employees typically focus on their tasks without even realizing they may be causing data accuracy errors, and address incorrect data only after it results in a costly mistake.

Solution: It’s critical that all members of a business – not just the IT and BI people – be educated on why data quality is important. They should be taught how to maintain data accuracy in the course of their work, including how to use modern data quality tools to clean and manage data. This is especially important as data becomes increasingly essential to modern business decisions, and as business intelligence tools become more accessible for any type of employee.

How to improve data quality

Let’s digress one more time from the question of “Why is detailed and accurate data important for my business?” In the previous section, we discussed some reasons why an organization’s data may not be as accurate as it should be. Here, we’ll look at the other side of the coin and share some guidelines on how to keep your company’s data quality from degrading.

Make a data collection plan: A fundamental way to ensure data quality is to plan for it at the collection stage. Set guidelines for what kind of data your company will collect; how it will collect and manage it; and who will be involved in the collection process (and what their roles are). This will help cut down on initial data entry problems.
Set data quality goals: Key stakeholders need to evaluate which facets of data quality your business is doing well in, and which ones could use improvement. They should then work to solve how to fix your organization’s data quality shortcomings, including setting realistic goals that your company’s data entry team can handle. You don’t want the data entry team under unnecessary pressure, as this will often create more data accuracy errors than it fixes. Use quality data sources: It may seem obvious, but an effective way to avoid data quality issues is to get quality data from reliable sources during the collection process. While no distributor is perfect, your business should be able to assess providers regarding factors that point to their data being more or less usable than that of other vendors. The better quality data your company starts with, the less work it has to do to clean and maintain the data up to target standards. Create guidelines for intra-organization data flow: Your organization should develop protocols for how departments should distribute and integrate data, as well as communicate on data-related issues. This helps to lessen inconsistencies caused by data siloing and not following data formatting standards, which are common problems during these processes. Lay out a data audit process: Errors in data are inevitable, so it’s important for your business to have a system in place for addressing them. That system should explicitly identify who in your company is responsible for correcting data accuracy errors, and what methods they should use to find and fix these mistakes. Also schedule how often these audits will be done. A higher frequency will usually result in data that stays accurate longer, but you’ll have to weigh this against how much time, money, and engineering power your organization can afford to spend on these tasks. 
Continue to revise the data quality assurance cycle: It’s important to audit not just the data itself, but also the processes through which your business ensures the integrity of its data. Document and periodically review the data quality issues that your company is running into to determine which ones are most commonly coming up (and which aren’t). This should give your organization an idea of where it needs to fine-tune its data quality assurance program so that it doesn’t keep getting the same data errors over and over. How SafeGraph ensures data quality A big part of why SafeGraph is able to deliver some of the highest-quality data in the industry is because it’s our sole focus. Many of our competitors curate geospatial data as just one part of a larger suite of services, including things like data management platforms, data visualization software, and other data analysis tools. SafeGraph doesn’t have any of these other things; we devote our entire operation to sourcing, cleaning, and distributing the highest-quality data we can, as fast as we can. To illustrate, our point of interest dataset – Places – is curated through three main steps. First, we crawl public web domains and use publicly available APIs for accurate and up-to-date information about all different types of POIs and information about them. Next, we license third-party datasets to fill in any gaps we find in the public information we collected. Finally, we pass the metadata for all of the places we find through a rigorous de-duping and merging process. This allows us to standardize address formats, merge or remove duplicate records, and assign relevant place subcategories. And since data is our entire business, we can complete these processes for all of our datasets to remain fresh on a monthly basis. This allows us to not only expand our datasets more frequently, but also ensure they maintain their accuracy and completeness for longer periods of time. 
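
To make the standardize-and-merge step above a little more concrete, here is a minimal Python sketch. It is an illustration only, not SafeGraph's actual pipeline: the `SUFFIXES` table, the `normalize_address` and `merge_duplicates` helpers, and the sample records are all hypothetical, and a production process would need far richer matching rules than an exact match on normalized name and address.

```python
import re
from collections import defaultdict

# Hypothetical abbreviation table; a real one would be much larger.
SUFFIXES = {"street": "st", "avenue": "ave", "boulevard": "blvd",
            "road": "rd", "suite": "ste"}

def normalize_address(raw):
    """Lowercase, drop punctuation, collapse whitespace, abbreviate suffixes."""
    cleaned = re.sub(r"[^\w\s]", " ", raw.lower())
    return " ".join(SUFFIXES.get(tok, tok) for tok in cleaned.split())

def merge_duplicates(records):
    """Group records by (normalized name, normalized address) and merge each
    group into one record. For each field, the first non-empty value wins."""
    groups = defaultdict(list)
    for rec in records:
        key = (rec["name"].strip().lower(), normalize_address(rec["address"]))
        groups[key].append(rec)

    merged = []
    for (_, address_key), recs in groups.items():
        combined = {}
        for rec in recs:
            for field, value in rec.items():
                if value and not combined.get(field):
                    combined[field] = value
        combined["address"] = address_key  # store the standardized form
        merged.append(combined)
    return merged

# Made-up sample data: the first two rows describe the same place.
places = [
    {"name": "Joe's Coffee", "address": "12 Main Street, Suite 4", "phone": ""},
    {"name": "joe's coffee ", "address": "12 main st. ste 4", "phone": "555-0100"},
    {"name": "City Books", "address": "9 Oak Avenue", "phone": "555-0199"},
]
deduped = merge_duplicates(places)
```

Here the two "Joe's Coffee" rows collapse into one record that keeps the phone number only the second row supplied. A pass like this has to be re-run on every refresh, which for the monthly cadence described above means twelve times a year.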
In contrast, other companies in our industry publish updates to their data only quarterly or semiannually on average.

Merely analyzing any and all data your business can gather won’t necessarily lead to better decision-making. On the contrary, your company could be hurting itself if it draws the wrong conclusions from the data – and there are many reasons this could happen. The data could be irrelevant to your organization’s goals, significantly outdated, or simply not indicative of how things really are. That’s why having accurate data is a vital part of building a solid foundation for your business’s operations and strategies.

The importance of accurate data in healthcare, finance, urban planning, retail, marketing, and many other industries cannot be overstated. Even otherwise correct decisions, when guided by incorrect data, can leave your organization no further ahead – or, in a worst-case scenario, even further behind. Don’t get stuck working with inaccurate data. Contact SafeGraph today, or get a sample of our point of interest data, to see how powerful quality data can be for your business.

#### Why Your Business Should Use Financial Data Visualizations

How much work is your company’s financial data really doing? Sure, it forms the basis of your organization’s investment-related decision-making. However, it’s only as good as your ability to organize it, interpret it, draw the right conclusions from it, and take appropriate action. And let’s face it: not everyone on your team has the data science skills to do all those things by themselves.

Enter financial data visualizations. Putting financial data in these more picture-based formats makes it easier to digest for non-technical team members than it would be in a bunch of tables and spreadsheets. So the right data gets understood by the right people, who can take the right actions on it at the right time.
Learn about the power of visualizing financial data in these sections:

- What are financial data visualizations?
- 9 reasons why financial data visualizations are useful
- How to get data for financial visualizations
- 6 financial data visualization examples to get you started
- Top 8 financial data visualization tools

We’ll start by reiterating in a bit more detail what we mean by visualizing financial data, and then get into why and how to do it.

What are financial data visualizations?

Financial data visualizations are methods of visually (usually pictorially) representing financial data such as profit & loss, sales figures, income & expenses, assets & liabilities, and equity. Some common examples include charts, graphs, maps, infographics, diagrams, and virtual dashboards.

The overall point of visualizing financial data is to make it more accessible to key stakeholders so they can take appropriate action on it. We’ll explain some specific examples in the next section.

9 reasons why financial data visualizations are useful

When you visualize financial data, you’re helping people at your organization see beyond just numbers and spreadsheets. You’re exploring a key question (or maybe multiple ones), or telling a story people can easily draw conclusions from.
Here are some reasons why that’s so powerful:

- Most people are visual learners – Sight is the dominant information-processing sense in most human beings, so representing data visually makes it easier to understand and remember.
- Compartmentalize relevant data – Stakeholders don’t want to go through the massive quantities of data companies handle these days, so visualizations allow for chunking down data into relevant pieces that are easier to process.
- See trends easier – Data visualizations, especially interactive ones, make it much easier to spot patterns that can be taken advantage of – or anomalies that require corrective action.
- Give geospatial context – Mapping financial data to physical locations allows stakeholders to see beyond what happens to where (and sometimes when) it happens, helping them understand why it happens.
- Get the bigger picture – Visualizing data from multiple datasets at the same time, especially when it’s a mix of financial and non-financial data, can reveal insights a single dataset on its own might not be able to tell you.
- Make decisions faster – Because data visualization allows stakeholders to more quickly connect the dots between discrete pieces of data, they can spend more time taking action on insights and less time interpreting to get those insights in the first place.
- Reduce the risk of errors – Since data visualization makes it easier to use patterns and trends as guides, it’s less prone to errors compared to doing calculations and analysis manually.
- Predict and set goals more accurately – Another benefit of seeing data in context with visualizations is it becomes easier to make more accurate predictions, as well as set more reasonable and precise financial targets.
- Improve your marketing – Customers often understand less about a company’s data than the people inside the company do, so visualizing that data makes it easier for them to retain information and quickly make comparisons.

How to get data for financial visualizations

Usually,
financial data will come from within your own company. Other companies may release their financial data, but generally only as a form of marketing or because they are mandated by law to do so (e.g. quarterly earnings reports). You can find some of this information from government sources, but other times you may be able to find it on a company’s website, social media feeds, or advertisements.

The problem is this data often isn’t in structures or formats that make it easy to visualize or otherwise draw inferences from. That means it can take extra time, effort, and resources to wrangle this data before it’s usable. Fortunately, SafeGraph is a company that specializes in cleaning data ahead of time to make it easier for you to get right to the analysis part.

Our Spend dataset is a first-of-its-kind database on debit and credit card transactions that includes attribution related to both time and location. That means you can use it to compare spending trends at points of interest across a geographic area, or measure the sales performance of specific brands at multiple locations. You can even track a specific store to see how the number and magnitude of transactions change over time.

6 financial data visualization examples to get you started

We should offer the caveat here that gaining the benefits of visualizing financial data requires choosing an appropriate way to present it. Otherwise, you risk your audience not understanding the insights you’re trying to convey. Or worse, the audience might draw the wrong conclusions because the data has been misrepresented somehow.

To show you what we mean, here are a handful of financial data visualization examples SafeGraph has created or contributed to that present financial data effectively.

1.
QSR Consumer Spending Trends

A dashboard powered by Tableau and SafeGraph data that illustrates consumer spending habits at quick-serve restaurants throughout the state of Delaware from January of 2020 to September of 2021. It includes measures of transactions by home city, average spending over time periods, transactions by intermediary, and average transaction total by customer income bracket.

There are many insights that could be derived from these visualizations. Companies can see where customers are visiting their stores from and what demographics they belong to. They can also see when the prime months for business are so they can ramp up their operations accordingly. Third-party payment managers (particularly Square and Visa, in this case) can also use them as barometers to see which QSR chains to approach for (wider) adoption.

2. Understanding Changes in Consumer Behavior During a Natural Disaster: Hurricane Ida

An examination of human mobility and consumer spending in New Orleans (and other parts of Louisiana) before, during, and after Hurricane Ida in August of 2021. From the visualizations, we inferred two major patterns. First, coastal areas of New Orleans (and southeastern Louisiana in general) saw larger drops in movement during and after the hurricane, as people tended to remain sheltered for longer in more vulnerable areas. Second – as pointed to by the example above – mobility and spending saw sharp declines when the natural disaster hit, then took about 2 to 3 weeks to recover to pre-disaster levels.

Visualizing these kinds of data can help with disaster readiness, response, and recovery. Companies and governments can plan when (and where) to concentrate their resources in order to accommodate pre-disaster stockpiling, aid people most isolated by the disaster, and speed up a return to normal social and economic activity for the affected areas.

3.
How to Enrich POS Data to Analyze & Predict CPG Sales

A project CARTO did with data from Mastercard, Spatial.ai, Applied Geographic Solutions, the US Census Bureau, and our Patterns dataset. It sought to model the sale of liquor products across Iowa through 2018-2019 in relation to local geospatial factors such as nearby points of interest, per-capita income of demographics, and social sentiment regarding interest in liquor. The goal was to create a model that could be used to predict sales in areas with similar geospatial conditions, without needing historical sales data for those areas.

4. Analyzing Foot Traffic Data in Private Equity

A webinar we hosted with CARTO and American Securities in December of 2020 discussing how foot traffic (and other types of alternative data) can be used to inform investments in private companies. An example is the dashboard pictured above, which visualizes grocery and convenience store visits in New York in March and April of 2020 after the onset of the COVID-19 pandemic. The overall trends show people were visiting these types of stores less frequently, and were not traveling as far from home to do so.

However, certain stores actually saw large increases in visits. And while unique visits to grocery and convenience stores decreased, dwell times and spending at these locations actually increased. These patterns could point to people trying to cut travel times and overall trips in order to reduce their potential exposure to the COVID-19 virus. Or they may have been trying to stock up because they anticipated stores closing or not having certain items available.

In any case, these are the sorts of behavioral insights geographically visualizing financial data can provide private equity investors. It can help them determine which types of stores will be in demand based on what kinds of products they carry, how many people live nearby, and how accessible the store locations are.
It can also help investors guess things like whether new stores opening in an area will be successful based on the level of competition they have.

5. Improving Economic Forecasting with Alternative Data

Another webinar we hosted, this time with representatives from Goldman Sachs. It explains why and how to use alternative data to supplement official financial data in predicting economic trends, especially over volatile time periods such as during the COVID-19 pandemic.

The visualization shown above displays the average rate of personal consumer spending in the United States, based on whether consumers were receiving unemployment benefits or not, over a stretch of the COVID-19 pandemic (March-December 2020). It also notes dates on which there were significant changes to US unemployment benefits.

As can be seen, spending among both groups hit a trough in late March and early April as unemployment, store closures, and public health restrictions limited economic activity. Past that, however, the consumer group receiving unemployment insurance had much more dramatic spikes in spending patterns relative to when benefits and government stimulus payments increased or reduced. Both groups returned to about the median rate of spending by the end of the year.

Visualizations like these can provide a more on-the-ground snapshot of economic activity in a geographic area. They can display trends in things like spending patterns, bankruptcy filings, foot traffic in rural vs. urban areas, consumer sentiment regarding particular issues, and even economic activity in locations sensitive to major market shifts. This can let a company forecast the state of an industry or economy before the official data is released. Then the company can use the official data to gauge the accuracy of its predictions.

6.
Validating Spend Data for Brands Against Company Reporting

A demonstration of how our new Spend dataset can be used to compare measured financial data against figures a company officially reports. The example above compares Target’s reported quarterly earnings in 2020-2021 with the total amount of money spent in their stores over the same time period. Visualizing financial data in this way can allow companies to observe how spending trends change in response to certain events, or to compare profit & loss metrics against actual sales performance. Spend even allows this to be done within specific geographic areas, right down to individual stores.

Top 8 financial data visualization tools

Another part of doing financial data visualization correctly is using the right tools for it. The best ones let you consolidate your datasets, build visualizations through dragging and dropping assets, search and get help through intuitive functions, and/or teach virtually anyone on your team how to create visualizations through a user-friendly interface. Here are some of the top platforms to construct your financial data visualization on.

1. Tableau

Tableau is currently one of the most popular general platforms for visualizing data. Its very intuitive interface allows for running queries by simply dragging and dropping data sources, and is designed with clean visuals that are easy to understand because they aren’t overly technical. Tableau’s role-based licensing allows team members to work within their skill sets, and its vast user community is always coming up with new ways to make the experience more powerful and user-friendly.

2. Microsoft Power BI

A data visualization platform from Microsoft, Power BI can take both structured and unstructured financial data and turn it into understandable business models with the help of advanced AI technology. It also allows for monitoring and visualizing analytics in real time.
Power BI can be integrated with Excel – another popular financial data analytics tool – as well as many other Microsoft programs and systems.

3. Esri ArcGIS

ArcGIS is one of the most powerful platforms for analyzing spatial data, but it can also be used to analyze financial data and other forms of data. It’s primarily a mapping product, so it’s perfect for creating data visualizations, especially ones that illustrate data’s relationship to locations. ArcGIS also has a whole host of extensions for use cases in finance, business, military, energy, environmentalism, and beyond. Learn more about SafeGraph’s integration with ArcGIS.

4. CARTO Builder

CARTO Builder is another spatial data analysis platform that can also be used to visualize financial data and other types of data. It integrates with many other popular cloud data and analytics services such as Databricks, Amazon Redshift, Google BigQuery, and Snowflake. It also can draw on an in-house catalog of over 10,000 datasets concerning finance, human mobility, weather, demographics, points of interest, road traffic, and more from over 40 trusted sources. Wherever you source your data from, CARTO’s Builder platform makes it easy to carry out analysis quickly and construct a visualization out of it. Learn more about SafeGraph’s integration with CARTO.

5. Mapbox Studio & Mapbox Tiling Service

Mapbox is all about mapping – where things are, where people are going, and how far the boundaries of places extend. So it’s a fantastic platform if you want to visualize financial data with respect to geography on a map. Mapbox Tiling Service makes it easy to plug in sources of financial data to visualize them in a map that changes as the data does. Or, if you want full control, Mapbox Studio allows you to design custom maps that suit your purposes. You can swap color palettes, change road widths, add terrain contours, make building footprints 3-dimensional, toggle points of interest on or off by category, and much more.

6.
Domo

Domo is a notable up-and-coming business intelligence platform that includes some neat data visualization tools. For one, it will suggest an appropriate visualization to use based on the data you upload. It also lets your team collaborate on and annotate visualizations in real time, as well as set alerts for when key metrics hit significant benchmarks. You can also use Domo to build custom interactive dashboards that are optimized to display consistently on any device. Learn more about SafeGraph’s integration with Domo.

7. Amazon QuickSight

Amazon QuickSight is a data visualization platform powered by cutting-edge machine learning technologies. This lets you create a visualization or do advanced financial data analysis by asking simple, natural language questions. As part of the Amazon Web Services suite of cloud-based infrastructures, platforms, and apps, QuickSight also connects with services like S3 and Athena to store and query data securely. Learn more about SafeGraph’s integration with AWS.

8. HEAVY.AI

Formerly known as OmniSci, HEAVY.AI uses state of the art graphics and computing to put you in control of how you visualize your financial data. It allows you to filter common attributes from across internal and external datasets, and even monitor them over time. It also allows for quickly iterating dashboards multiple times so you can make comparisons, estimates, and forecasts without changing your core parameters. It even has mapping capabilities so you can add a location dimension to your analysis – as global or as granular as you’d like.

So now you know what financial data visualizations are and why they’re useful. You know where to get the data for them, what effective ones look like, and which platforms can help you build them with ease. So there’s just one thing left to do: get out there and start making them.
But be sure to swing by our site first to look at some samples from our Spend dataset; you might want to get familiar with some kinds of alternative data that could help you. ### Data Examples #### Austin Neighborhood Catchment Areas URL: https://www.safegraph.com/data-examples/austin-catchment-areas/ #### Austin Tourism Map URL: https://www.safegraph.com/data-examples/austin-tourism-map/ #### Gas & EV Charging Stations URL: https://www.safegraph.com/data-examples/gas-ev-charging-stations/ #### Global Energy POIs & Nearby Places URL: https://www.safegraph.com/data-examples/global-energy-pois-nearby-places/ #### Grocery Store Access URL: https://www.safegraph.com/data-examples/grocery-store-access/ #### March Madness Stadium Footprints URL: https://www.safegraph.com/data-examples/2025-march-madness-stadium-footprints/ #### Mexico City Attractions URL: https://www.safegraph.com/data-examples/mexico-city-attractions/ #### Now Available: SafeGraph UK Places Data URL: https://www.safegraph.com/data-examples/now-available-safegraph-uk-places-data/ #### Parking Lot Polygons URL: https://www.safegraph.com/data-examples/parking-lot-polygons/ #### SafeGraph Category Tags URL: https://www.safegraph.com/data-examples/safegraph-category-tags/ #### SafeGraph Places Statistics: POI Data Coverage by Country URL: https://www.safegraph.com/data-examples/places-statistics/ #### San Francisco Public Transit Points URL: https://www.safegraph.com/data-examples/san-francisco-public-transit-points/ #### Starbucks vs. Dunkin' Predominance URL: https://www.safegraph.com/data-examples/starbucks-vs-dunkin-predominance/ #### UK Brand Distribution URL: https://www.safegraph.com/data-examples/uk-brand-distribution/ #### Visit Attribution Methods: Radius vs. 
Building Footprint URL: https://www.safegraph.com/data-examples/visit-attribution-methods/ ### Events #### A Fireside Chat with David Rothschild, Economist at Microsoft Research URL: https://www.safegraph.com/events/fireside-chat-david-rothschild/ #### A Fireside Chat with Dr. Nicholas Christakis URL: https://www.safegraph.com/events/fireside-chat-with-nicholas-christakis/ #### Accelerating Economic Recovery with Industry Intelligence URL: https://www.safegraph.com/events/accelerating-economic-recovery-with-industry-intelligence/ #### Analyzing Canadian Communities with POI Data URL: https://www.safegraph.com/events/analyzing-canadian-communities-with-poi-data/ #### Announcing SafeGraph Spend: The First Places-Based Transaction Dataset URL: https://www.safegraph.com/events/announcing-safegraph-spend/ #### Best Practices for Working with Large Quantities of Geospatial Data URL: https://www.safegraph.com/events/best-practices-for-working-with-large-quantities-of-geospatial-data-2/ #### Building Stronger Customer Profiles with Consumer Brand Affinities URL: https://www.safegraph.com/events/analyzing-cross-store-shopping-behavior/ #### Canadian Market Analysis with POI Data from the AWS Data Exchange URL: https://www.safegraph.com/events/canadian-market-analysis-with-poi-data-from-the-aws-data-exchange/ #### Data Science Salon 2021: Industrial POI Analysis: The Future of Data Science in Ecommerce and Retail URL: https://www.safegraph.com/events/data-salon-virtual-applying-ai-machine-learning-to-retail-ecommerce/ #### GeoIgnite 2022: Accurate Geospatial Analysis for a Dynamically Changing World URL: https://www.safegraph.com/events/accurate-geospatial-analysis/ #### GeoIgnite 2022: Maintaining Quality Places and POI Data (and Why It Matters) URL: https://www.safegraph.com/events/maintaining-quality-places-data/ #### Going Global: Announcing SafeGraph Global Brands On-demand webinar Going Global: Announcing SafeGraph Global Brands Fletcher Berryman & Daniel Gray Explore 
the Future of Global Places Data

The Global POI Data Gap, and How SafeGraph Is Closing It

Hear from Fletcher Berryman, Product Manager at SafeGraph, about the strategy and technical details behind SafeGraph Global Brands, the industry's most comprehensive global dataset of places data for any brand, anywhere in the world. Most companies working with location data outside the US encounter significant gaps: missing POIs, stale records, and no reliable way to track how physical locations change over time.

SafeGraph resolves these issues with a brand-first approach to global expansion and a monthly update cadence that keeps data accurate, reliable, and ready to use.

This webinar will explain:

- Why location data is dynamic, and what that means for businesses tracking physical places globally
- Why quality global POI data is scarce outside the US, and how SafeGraph is tackling that gap
- How SafeGraph defines a "brand", from global chains to neighbourhood multi-location businesses
- How SafeGraph shifted from a country-first to a brand and category-first international strategy
- How open/close tracking lets you measure market change without relying on mobility data
- How AtScale's semantic layer makes global POI data analysis-ready on day one across any BI tool

https://vimeo.com/637156445?fl=pl&fe=sh

Watch this webinar to transform how you work with global places data.
#### How to Operationalize Geospatial Datasets into Self-Service Analytics at Scale URL: https://www.safegraph.com/events/operationalizing-blended-geospatial-data/ #### How to Turn a Bunch of Data Into a Site De-Selection Strategy URL: https://www.safegraph.com/events/how-to-turn-a-bunch-of-data-into-a-site-de-selection-strategy/ #### Knowledge Series: Best Practices for Working with Large Datasets URL: https://www.safegraph.com/events/knowledge-series-best-practices-for-working-with-large-datasets/ #### Knowledge Series: Introductory Geospatial Analysis URL: https://www.safegraph.com/events/knowledge-series-geospatial-analysis/ #### Leveraging Geospatial Data to Improve the Top and Bottom Line at the Enterprise Level URL: https://www.safegraph.com/events/leveraging-geospatial-data-to-improve-the-top-and-bottom-line-at-the-enterprise-level/ #### Modeling OOH Exposure with Spatial Analytics and Places Data URL: https://www.safegraph.com/events/modeling-ooh-exposure/ #### Modeling Real World Change with Geospatial Data URL: https://www.safegraph.com/events/odsc-east-webinar/ #### Pint-Sized but Mighty Happy Hour URL: https://www.safegraph.com/events/snowflake-summit-hh/ #### Points of Interest Just Got More Interesting URL: https://www.safegraph.com/events/points-of-interest-just-got-more-interesting/ #### Polygon Data: Technical Deep Dive into Critical Components URL: https://www.safegraph.com/events/polygon-data-technical-deep-dive/ #### Snowflake Data Marketplace Partner Lunch & Learn for Retail URL: https://www.safegraph.com/events/snowflake-partner-retail-lunch-learn/ #### The Power of POI Open and Close Data in Geospatial Analysis URL: https://www.safegraph.com/events/the-power-of-open-close-data/ #### Top Use Cases for Polygon Data (+ How They’re Created) URL: https://www.safegraph.com/events/top-use-cases-for-polygon-data/ #### Working with Point POIs URL: https://www.safegraph.com/events/working-with-point-pois/ ### Publications #### [COVID-19] Social 
Distancing in Texas URL: https://www.safegraph.com/publications/covid-19-social-distancing-in-texas/ #### 20 Stunning Geosocial Solutions In 2020 URL: https://www.safegraph.com/publications/20-stunning-geosocial-solutions-in-2020/ #### 3 Reasons General Electric Stock Could Double in 2021 URL: https://www.safegraph.com/publications/3-reasons-general-electric-stock-could-double-in-2021/ #### 71 people who went to the polls on April 7 got COVID-19; tie to election uncertain URL: https://www.safegraph.com/publications/71-people-who-went-to-the-polls-on-april-7-got-covid-19-tie-to-election-uncertain/ #### A County-level Dataset for Informing the United States' Response to COVID-19 URL: https://www.safegraph.com/publications/a-county-level-dataset-for-informing-the-united-states-response-to-covid-19/ #### A Data-Driven Take on Social Distancing in the United States URL: https://www.safegraph.com/publications/a-data-driven-take-on-social-distancing-in-the-united-states/ #### A framework for delineating the scale, extent and characteristics of American retail centre agglomerations URL: https://www.safegraph.com/publications/a-framework-for-delineating-the-scale-extent-and-characteristics-of-american-retail-centre-agglomerations/ #### A Natural Coronavirus Experiment Is Playing Out In Kentucky And Tennessee URL: https://www.safegraph.com/publications/a-natural-coronavirus-experiment-is-playing-out/ #### A Resurgence of the Virus, and Lockdowns, Threatens Economic Recovery URL: https://www.safegraph.com/publications/a-resurgence-of-the-virus-and-lockdowns-threatens-economic-recovery/ #### A Social Network Under Social Distancing: Risk-Driven Backbone Management During COVID-19 and Beyond URL: https://www.safegraph.com/publications/a-social-network-under-social-distancing-risk-driven-backbone-management-during-covid-19-and-beyond/ #### A Study Into Location-Based Covid-19 Cases Throughout Los Angeles URL: 
https://www.safegraph.com/publications/a-study-into-location-based-covid-19-cases-throughout-los-angeles/ #### Across US, a 'tale of two cities' as some embrace reopening amid coronavirus and others remain wary URL: https://www.safegraph.com/publications/across-us-a-tale-of-two-cities-as-some-embrace-reopening-amid-coronavirus-and-others-remain-wary/ #### Aggregated cellphone data shows Americans are slowly going out more amid coronavirus crisis URL: https://www.safegraph.com/publications/aggregated-cellphone-data-shows-americans-are-slowly-going-out-more-amid-coronavirus-crisis/ #### Ahead of U.S. August jobs data, high frequency numbers still show stodgy progress URL: https://www.safegraph.com/publications/ahead-of-u-s-august-jobs-data-high-frequency-numbers-still-show-stodgy-progress/ #### AI 50: America’s Most Promising Artificial Intelligence Companies URL: https://www.safegraph.com/publications/ai-50-americas-most-promising-artificial-intelligence-companies-2/ #### AI 50: America’s Most Promising Artificial Intelligence Companies URL: https://www.safegraph.com/publications/ai-50-americas-most-promising-artificial-intelligence-companies/ #### Aiming to become the definitive source for location data, SafeGraph raises $45M URL: https://www.safegraph.com/publications/aiming-to-become-the-definitive-source-for-location-data-safegraph-raises-45m/ #### Alameda County - Cases and Visits Dashboard URL: https://www.safegraph.com/publications/alameda-county-cases-and-visits-dashboard/ #### America Is on the Road to Relapse Not Recovery URL: https://www.safegraph.com/publications/america-is-on-the-road-to-relapse-not-recovery/ #### America’s cautious comeback URL: https://www.safegraph.com/publications/americas-cautious-comeback/ #### Americans are Delaying Medical Care, and It's Devastating Health-Care Providers URL: https://www.safegraph.com/publications/americans-are-delaying-medical-care-and-its-devastating-health-care-providers/ #### Americans excel at coronavirus 
safety precautions URL: https://www.safegraph.com/publications/americans-excel-at-coronavirus-safety-precautions/ #### Americans splurged over Labor Day weekend. That's the good news URL: https://www.safegraph.com/publications/americans-splurged-over-labor-day-weekend-thats-the-good-news/ #### Are Republicans or Democrats Social Distancing More? URL: https://www.safegraph.com/publications/are-republicans-or-democrats-social-distancing-more/ #### As states start to reopen, here’s where people are going URL: https://www.safegraph.com/publications/as-states-start-to-reopen-heres-where-people-are-going/ #### Association of Mobile Phone Location Data Indications of Travel and Stay-at-Home Mandates With COVID-19 Infection Rates in the US URL: https://www.safegraph.com/publications/association-of-mobile-phone-location-data-indications-of-travel-and-stay-at-home-mandates-with-covid-19-infection-rates-in-the-us/ #### AtScale Adds New Datasets and BI Dashboards to AtScale Data Insights Marketplace to Uncover New Insights URL: https://www.safegraph.com/publications/atscale-adds-new-datasets-and-bi-dashboards-to-atscale-data-insights-marketplace-to-uncover-new-insights/ #### Bankruptcy and the COVID-19 Crisis URL: https://www.safegraph.com/publications/bankruptcy-and-the-covid-19-crisis/ #### Belief in Science Influences Physical Distancing in Response to COVID-19 Lockdown Policies URL: https://www.safegraph.com/publications/belief-in-science-influences-physical-distancing-in-response-to-covid-19-lockdown-policies/ #### BIG DATA ANALYTICS SHOWS HOW AMERICA’S INDIVIDUALISM COMPLICATES CORONAVIRUS RESPONSE URL: https://www.safegraph.com/publications/big-data-analytics-shows-how-americas-individualism-complicates-coronavirus-response/ #### Big Data-Derived tool facilitates closer monitoring of recovery from natural disasters URL: https://www.safegraph.com/publications/big-data-derived-tool-facilitates-closer-monitoring-of-recovery-from-natural-disasters/ #### Big Data/COVID-19 
News – 6/1/2020 URL: https://www.safegraph.com/publications/big-data-covid-19-news-6-1-2020/ #### Bikers From at Least 39 States Tracked to Rally With Few Masks URL: https://www.safegraph.com/publications/bikers-from-at-least-39-states-tracked-to-rally-with-few-masks/ #### Black Lives Matter protests did not cause an uptick in covid-19 cases URL: https://www.safegraph.com/publications/black-lives-matter-protests-did-not-cause-an-uptick-in-covid-19-cases/ #### Black Lives Matter Protests, Social Distancing, and COVID-19 URL: https://www.safegraph.com/publications/black-lives-matter-protests-social-distancing-and-covid-19-20/ #### Building a resilience framework from phone location data URL: https://www.safegraph.com/publications/building-a-resilience-framework-from-phone-location-data/ #### Business Exit During the COVID-19 Pandemic: Non-Traditional Measures in Historical Context URL: https://www.safegraph.com/publications/business-exit-during-the-covid-19-pandemic-non-traditional-measures-in-historical-context/ #### Buy Norwegian Cruise Line Stock Before It Doubles URL: https://www.safegraph.com/publications/buy-norwegian-cruise-line-stock-before-it-doubles/ #### Calibrating the dynamic Huff model for business analysis using location big data URL: https://www.safegraph.com/publications/calibrating-the-dynamic-huff-model-for-business-analysis-using-location-big-data/ #### Canadian Retail Foot Traffic Jumps in Sign of Pent-Up Demand URL: https://www.safegraph.com/publications/canadian-retail-foot-traffic-jumps-in-sign-of-pent-up-demand/ #### CARTS: Chicago Fed Advance Retail Trade Summary URL: https://www.safegraph.com/publications/carts-chicago-fed-advance-retail-trade-summary/ #### Causal Chain: Shelter-In-Place, Social Distancing, and Suppression of COVID-19 URL: https://www.safegraph.com/publications/causal-chain-shelter-in-place-social-distancing-and-suppression-of-covid-19/ #### CDC COVID Data Tracker URL: 
https://www.safegraph.com/publications/cdc-covid-data-tracker/ #### CDC report says people in four key cities are listening to stay at home orders URL: https://www.safegraph.com/publications/cdc-report-says-people-in-four-key-cities-are-listening-to-stay-at-home-orders/ #### CDC Report: San Francisco Among Four Key Cities Listening To Stay At Home Orders URL: https://www.safegraph.com/publications/cdc-report-san-francisco-among-four-key-cities-listening-to-stay-at-home-orders/ #### Cell phone data show Bay Area’s early response to shelter orders paid off URL: https://www.safegraph.com/publications/cell-phone-data-show-bay-areas-early-response-to-shelter-orders-paid-off/ #### Cell phone data shows coronavirus kept churchgoers at home in every state on Easter URL: https://www.safegraph.com/publications/cell-phone-data-shows-coronavirus-kept-churchgoers-at-home-in-every-state-on-easter/ #### Cell phone mobility data reveals heterogeneity in stay-at-home behavior during the SARS-CoV-2 pandemic URL: https://www.safegraph.com/publications/cell-phone-mobility-data-reveals-heterogeneity-in-stay-at-home-behavior-during-the-sars-cov-2-pandemic/ #### Cellphone data shows that Americans respected stay-at-home orders but are starting to move again URL: https://www.safegraph.com/publications/cellphone-data-shows-that-americans-respected-stay-at-home-orders-but-are-starting-to-move-again/ #### Cellphone data shows where people are going in Maricopa County during pandemic URL: https://www.safegraph.com/publications/cellphone-data-shows-where-people-are-going-in-maricopa-county-during-pandemic/ #### CEO Political Leanings and Store-Level Economic Activity during COVID-19 Crisis: Effects on Shareholder Value and Public Health URL: https://www.safegraph.com/publications/ceo-political-leanings-and-store-level-economic-activity-during-covid-19-crisis-effects-on-shareholder-value-and-public-health/ #### Characterizing the Spread of COVID-19 from Human Mobility Patterns and 
SocioDemographic Indicators URL: https://www.safegraph.com/publications/characterizing-the-spread-of-covid-19-from-human-mobility-patterns-and-sociodemographic-indicators/ #### Chicago Fed Advance Retail Trade Summary Weekly Index of Retail Trade URL: https://www.safegraph.com/publications/chicago-fed-advance-retail-trade-summary-weekly-index-of-retail-trade/ #### Chicago making strides against coronavirus, data shows; Lightfoot says ‘we’re not where we need to get’ URL: https://www.safegraph.com/publications/chicago-making-strides-against-coronavirus-data-shows-lightfoot-says-were-not-where-we-need-to-get/ #### Chicago Must Revive the Not-So-Magnificent Mile to Thrive Again URL: https://www.safegraph.com/publications/chicago-must-revive-the-not-so-magnificent-mile-to-thrive-again/ #### Chipotle Mexican Grill PT Raised to $820 at Jefferies; Sees Solid Q3 URL: https://www.safegraph.com/publications/chipotle-mexican-grill-pt-raised-to-820-at-jefferies-sees-solid-q3/ #### Coffee Time - Do Consumers Prefer Local or Corporate Coffee Shops URL: https://www.safegraph.com/publications/coffee-time-do-consumers-prefer-local-or-corporate-coffee-shops/ #### College Student Contribution to Local COVID-19 Spread: Evidence from University Spring Break Timing URL: https://www.safegraph.com/publications/college-student-contribution-to-local-covid-19-spread-evidence-from-university-spring-break-timing/ #### Colorado Stay-At-Home Order: Most Compliant Counties, Ranked URL: https://www.safegraph.com/publications/colorado-stay-at-home-order-most-compliant-counties-ranked/ #### Community venue exposure risk estimator for the COVID-19 pandemic URL: https://www.safegraph.com/publications/community-venue-exposure-risk-estimator-for-the-covid-19-pandemic/ #### Consumer response to corporate political statements: Evidence from geolocation data URL: https://www.safegraph.com/publications/consumer-response-to-corporate-political-statements-evidence-from-geolocation-data/ #### Controlling 
Epidemic Spread: Reducing Economic Losses with Targeted Closures URL: https://www.safegraph.com/publications/controlling-epidemic-spread-reducing-economic-losses-with-targeted-closures/ #### Coronavirus cases drop by up to 44% due to shelter-in-place orders, study drawing on CDC data shows URL: https://www.safegraph.com/publications/coronavirus-cases-drop-by-up-to-44-due-to-shelter-in-place-orders-study-drawing-on-cdc-data-shows/ #### Coronavirus lockdowns work URL: https://www.safegraph.com/publications/coronavirus-lockdowns-work/ #### Coronavirus surges aren't linked to Black Lives Matter protests URL: https://www.safegraph.com/publications/coronavirus-surges-arent-linked-to-black-lives-matter-protests/ #### Coronavirus Tracker: See the impact of coronavirus on the economy. URL: https://www.safegraph.com/publications/coronavirus-tracker-see-the-impact-of-coronavirus-on-the-economy/ #### Covid-19: Cell Phone Study Shows Where Infection Takes Place URL: https://www.safegraph.com/publications/covid-19-cell-phone-study-shows-where-infection-takes-place/ #### COVID-19 and the Ninth District economy: A dashboard URL: https://www.safegraph.com/publications/covid-19-and-the-ninth-district-economy-a-dashboard/ #### Covid-19 Cases Surge in 460 Counties Sturgis Riders Hailed From URL: https://www.safegraph.com/publications/covid-19-cases-surge-in-460-counties-sturgis-riders-hailed-from/ #### COVID-19 economic policy effects on consumer spending and foot traffic in the U.S. 
URL: https://www.safegraph.com/publications/covid-19-economic-policy-effects-on-consumer-spending-and-foot-traffic-in-the-u-s/ #### COVID-19 Hits the Economy Hard in Iowa URL: https://www.safegraph.com/publications/covid-19-hits-the-economy-hard-in-iowa/ #### COVID-19 Impact on Indiana Businesses URL: https://www.safegraph.com/publications/covid-19-impact-on-indiana-businesses/ #### COVID-19 Mobility Network Modeling URL: https://www.safegraph.com/publications/covid-19-mobility-network-modeling/ #### COVID-19 Mortality Projections for US States URL: https://www.safegraph.com/publications/covid-19-mortality-projections-for-us-states/ #### COVID-19 persists ahead of Minnesota stay-at-home decision URL: https://www.safegraph.com/publications/covid-19-persists-ahead-of-minnesota-stay-at-home-decision/ #### COVID-19 Public-Private Partnership: Syracuse University College of Environmental Science and makepath Working with Safegraph to Understand Virus Mortality URL: https://www.safegraph.com/publications/covid-19-public-private-partnership-syracuse-university-college-of-environmental-science-and-makepath-working-with-safegraph-to-understand-virus-mortality/ #### COVID-19 restrictions: Map of COVID-19 case trends, restrictions and mobility URL: https://www.safegraph.com/publications/covid-19-restrictions/ #### COVID-19 Spatial Research URL: https://www.safegraph.com/publications/covid-19-spatial-research/ #### COVID-19 Transmission Dynamics and Effectiveness of Public Health Interventions in New York City during the 2020 Spring Pandemic Wave URL: https://www.safegraph.com/publications/covid-19-transmission-dynamics-and-effectiveness-of-public-health-interventions-in-new-york-city-during-the-2020-spring-pandemic-wave/ #### COVID-19’s Striking Impact on Grocery Store Foot Traffic URL: https://www.safegraph.com/publications/covid-19s-striking-impact-on-grocery-store-foot-traffic/ #### Customer Loyalty and the Persistence of Revenues and Earnings URL: 
https://www.safegraph.com/publications/customer-loyalty-and-the-persistence-of-revenues-and-earnings/ #### Dashboard graphs those working from home during COVID-19 URL: https://www.safegraph.com/publications/dashboard-graphs-those-working-from-home-during-covid-19/ #### Data Driven COVID-19 Analytics for Dallas County URL: https://www.safegraph.com/publications/data-driven-covid-19-analytics-for-dallas-county/ #### Data related to "Working Remotely and the Supply-side Impact of Covid-19" URL: https://www.safegraph.com/publications/data-related-to-working-remotely-and-the-supply-side-impact-of-covid-19/ #### Data trackers show U.S. retail traffic edging up URL: https://www.safegraph.com/publications/data-trackers-show-u-s-retail-traffic-edging-up/ #### Day Trips Instead of Destinations: Tourist Hot Spots Brace for Lean Years URL: https://www.safegraph.com/publications/day-trips-instead-of-destinations-tourist-hot-spots-brace-for-lean-years/ #### Density and Distancing in the Covid-19 Pandemic URL: https://www.safegraph.com/publications/density-and-distancing-in-the-covid-19-pandemic/ #### Differential COVID‐19 case positivity in New York City neighborhoods: Socioeconomic factors and mobility URL: https://www.safegraph.com/publications/differential-covid-19-case-positivity-in-new-york-city-neighborhoods-socioeconomic-factors-and-mobility/ #### Differential COVID‐19 case positivity in New York City neighborhoods: Socioeconomic factors and mobility URL: https://www.safegraph.com/publications/differential-covid-19-case-positivity-in-new-york-city-neighborhoods-socioeconomic-factors-and-mobility-2/ #### Dine in or Take out? Trends on Restaurant Service Demand amid the COVID-19 Pandemic URL: https://www.safegraph.com/publications/dine-in-or-take-out-trends-on-restaurant-service-demand-amid-the-covid-19-pandemic/ #### Do you vote like a Toyota? Or a Ford? 
URL: https://www.safegraph.com/publications/do-you-vote-like-a-toyota-or-a-ford/ #### Driving Disease URL: https://www.safegraph.com/publications/driving-disease/ #### Driving Over Air Travel, Takeout Over Cafes: Pandemic Shapes Consumption URL: https://www.safegraph.com/publications/driving-over-air-travel-takeout-over-cafes-pandemic-shapes-consumption/ #### Early Evidence on Social Distancing in Response to COVID-19 in the United States URL: https://www.safegraph.com/publications/early-evidence-on-social-distancing-in-response-to-covid-19-in-the-united-states/ #### Economic Activity and COVID-19 Transmission: Evidence from an Estimated Economic-Epidemiological Model URL: https://www.safegraph.com/publications/economic-activity-and-covid-19-transmission-evidence-from-an-estimated-economic-epidemiological-model/ #### Editorial: California, Bay Area officials bear burden of reopening amid coronavirus URL: https://www.safegraph.com/publications/editorial-california-bay-area-officials-bear-burden-of-reopening-amid-coronavirus/ #### Effectiveness and Compliance to Social Distancing During COVID-19 URL: https://www.safegraph.com/publications/effectiveness-and-compliance-to-social-distancing-during-covid-19/ #### Effects of COVID-19 on demographic populations and social distancing measure URL: https://www.safegraph.com/publications/effects-of-covid-19-on-demographic-populations-and-social-distancing-measure/ #### Effects of the COVID-19 Pandemic on Park Use in U.S. Cities URL: https://www.safegraph.com/publications/effects-of-the-covid-19-pandemic-on-park-use-in-u-s-cities/ #### End of third quarter shows bright spots, holes in U.S. 
economic recovery URL: https://www.safegraph.com/publications/end-of-third-quarter-shows-bright-spots-holes-in-u-s-economic-recovery/ #### Estimating the infection-fatality risk of SARS-CoV-2 in New York City during the spring 2020 pandemic wave: a model-based analysis URL: https://www.safegraph.com/publications/estimating-the-infection-fatality-risk-of-sars-cov-2-in-new-york-city-during-the-spring-2020-pandemic-wave-a-model-based-analysis/ #### Exploring the Economic Trends of Santa Barbara County URL: https://www.safegraph.com/publications/exploring-the-economic-trends-of-santa-barbara-county/ #### Exploring the relationship between mobility and COVID-19 infection rates for the second peak in the United States using phase-wise association URL: https://www.safegraph.com/publications/exploring-the-relationship-between-mobility-and-covid-19-infection-rates-for-the-second-peak-in-the-united-states-using-phase-wise-association/ #### Exploring Top Grocery Stores by State URL: https://www.safegraph.com/publications/exploring-top-grocery-stores-by-state/ #### Fear, Lockdown, and Diversion: Comparing Drivers of Pandemic Economic Decline 2020 URL: https://www.safegraph.com/publications/fear-lockdown-and-diversion-comparing-drivers-of-pandemic-economic-decline-2020/ #### Fewer people staying home as government officials discuss lifting restrictions URL: https://www.safegraph.com/publications/fewer-people-staying-home-as-government-officials-discuss-lifting-restrictions/ #### Five insights about harnessing data and AI from leaders at the frontier URL: https://www.safegraph.com/publications/five-insights-about-harnessing-data-and-ai-from-leaders-at-the-frontier/ #### Foot Traffic Still Down More Than 90% in Seven NYC Zip Codes, According to Cell Phone Location Data URL: https://www.safegraph.com/publications/foot-traffic-still-down-more-than-90-in-seven-nyc-zip-codes-according-to-cell-phone-location-data/ #### Forecasting the Spread of COVID-19 under Different Reopening 
Strategies URL: https://www.safegraph.com/publications/forecasting-the-spread-of-covid-19-under-different-reopening-strategies/ #### Geospatial Dashboards During COVID-19 URL: https://www.safegraph.com/publications/geospatial-dashboards-during-covid-19/ #### Geospatial Data Headache Solved with Placekey Launch URL: https://www.safegraph.com/publications/geospatial-data-headache-solved-with-placekey-launch/ #### Give Me Liberty AND Give Me Death: A Science Experiment in Six Days URL: https://www.safegraph.com/publications/give-me-liberty-and-give-me-death-a-science-experiment-in-six-days/ #### Global Covid Vital Signs URL: https://www.safegraph.com/publications/global-covid-vital-signs/ #### God is in the Rain: The Impact of Rainfall-Induced Early Social Distancing on COVID-19 Outbreaks URL: https://www.safegraph.com/publications/god-is-in-the-rain-the-impact-of-rainfall-induced-early-social-distancing-on-covid-19-outbreaks/ #### GRAPHIC-Across U.S. electoral battlegrounds, recovery may be ebbing as virus flows URL: https://www.safegraph.com/publications/graphic-across-u-s-electoral-battlegrounds-recovery-may-be-ebbing-as-virus-flows/ #### Hair Salons Reopen, and Americans Rush Back URL: https://www.safegraph.com/publications/hair-salons-reopen-and-americans-rush-back/ #### Hardware retail during COVID-19 URL: https://www.safegraph.com/publications/hardware-retail-during-covid-19/ #### Here's how COVID-19 is impacting foot traffic across the U.S. 
URL: https://www.safegraph.com/publications/heres-how-covid-19-is-impacting-foot-traffic-across-the-u-s/ #### Here’s how to safely go to the salon during the pandemic, according to stylists and health experts URL: https://www.safegraph.com/publications/heres-how-to-safely-go-to-the-salon-during-the-pandemic-according-to-stylists-and-health-experts/ #### High-Frequency Data Prove Their Staying Power With Fed’s Buy-In URL: https://www.safegraph.com/publications/high-frequency-data-prove-their-staying-power-with-feds-buy-in/ #### Home Depot and Lowe’s Primed for Historic Growth During Pandemic URL: https://www.safegraph.com/publications/home-depot-and-lowes-primed-for-historic-growth-during-pandemic/ #### Home Depot, Lowe’s Primed for Historic Growth During Coronavirus URL: https://www.safegraph.com/publications/home-depot-lowes-primed-for-historic-growth-during-coronavirus/ #### How are people reacting to social distancing guidelines? URL: https://www.safegraph.com/publications/domo-com-covid19-impact-social-distancing/ #### How COVID-19 made high-frequency data a go-to economic indicator URL: https://www.safegraph.com/publications/how-covid-19-made-high-frequency-data-a-go-to-economic-indicator/ #### How do political beliefs impact the response to coronavirus? URL: https://www.safegraph.com/publications/how-do-political-beliefs-impact-the-response-to-coronavirus/ #### How Effective Is Social Distancing? URL: https://www.safegraph.com/publications/how-effective-is-social-distancing/ #### How good are Michiganders at social distancing? Here's a look at Greater Lansing's grades URL: https://www.safegraph.com/publications/how-good-are-michiganders-at-social-distancing-heres-a-look-at-greater-lansings-grades/ #### How has COVID-19 Impacted Customer Relationship Dynamics at Restaurant Food Delivery Businesses? URL: https://www.safegraph.com/publications/how-has-covid-19-impacted-customer-relationship-dynamics-at-restaurant-food-delivery-businesses/ #### How J. 
Crew's bankruptcy sets the stage for a 'shakeout' in retail URL: https://www.safegraph.com/publications/how-j-crews-bankruptcy-sets-the-stage-for-a-shakeout-in-retail/ #### How Location Data Streams Can Be Used to Fight Against COVID-19 URL: https://www.safegraph.com/publications/how-location-data-streams-can-be-used-to-fight-against-covid-19/ #### How many visitors did Wisconsin businesses lose because of the COVID-19 pandemic and social distancing? URL: https://www.safegraph.com/publications/how-many-visitors-did-wisconsin-businesses-lose-because-of-the-covid-19-pandemic-and-social-distancing/ #### How SafeGraph measures mobility during COVID-19 URL: https://www.safegraph.com/publications/how-safegraph-measures-mobility-during-covid-19/ #### How Starbucks' Open Bathroom Policy is Impacting Foot Traffic URL: https://www.safegraph.com/publications/how-starbuck-open-bathroom-policy-is-impacting-foot-traffic/ #### How to use COVID-19 Public Data in Spatial Analysis URL: https://www.safegraph.com/publications/how-to-use-covid-19-public-data-in-spatial-analysis/ #### Human mobility data and machine learning reveal geographic differences in alcohol sales and alcohol outlet visits across U.S. 
states during COVID-19 URL: https://www.safegraph.com/publications/human-mobility-data-and-machine-learning-reveal-geographic-differences-in-alcohol-sales-and-alcohol-outlet-visits-across-u-s-states-during-covid-19/ #### Identifying high risk communities across the city of Los Angeles URL: https://www.safegraph.com/publications/identifying-high-risk-communities-across-the-city-of-los-angeles/ #### In the US, COVID-19 Transmission Thrives in Nicer Weather URL: https://www.safegraph.com/publications/in-the-us-covid-19-transmission-thrives-in-nicer-weather/ #### Income Effect and the Private Contribution of Public Goods: Household Mobility and the Economic Impact Payment during the COVID-19 Pandemic URL: https://www.safegraph.com/publications/income-effect-and-the-private-contribution-of-public-goods-household-mobility-and-the-economic-impact-payment-during-the-covid-19-pandemic/ #### Individualism During Crises: Big Data Analytics of Collective Actions and Policy Compliance amid COVID-19 URL: https://www.safegraph.com/publications/individualism-during-crises-big-data-analytics-of-collective-actions-and-policy-compliance-amid-covid-19/ #### Interactive graph shows Sacramento museum and park foot traffic after latest COVID surge URL: https://www.safegraph.com/publications/sacramento-museum-and-park-foot-traffic/ #### Interactive Map: Protests in wake of George Floyd killing touch all 50 states URL: https://www.safegraph.com/publications/interactive-map-protests-in-wake-of-george-floyd-killing-touch-all-50-states/ #### Interdependence and the Cost of Uncoordinated Responses to COVID-19 URL: https://www.safegraph.com/publications/interdependence-and-the-cost-of-uncoordinated-responses-to-covid-19/ #### Interdependence and the cost of uncoordinated responses to COVID-19 URL: https://www.safegraph.com/publications/interdependence-and-the-cost-of-uncoordinated-responses-to-covid-19-20/ #### Internal and External Effects of Social Distancing in a Pandemic URL: 
https://www.safegraph.com/publications/internal-and-external-effects-of-social-distancing-in-a-pandemic/ #### Internal and External Effects of Social Distancing in a Pandemic URL: https://www.safegraph.com/publications/internal-and-external-effects-of-social-distancing-in-a-pandemic-2/ #### Iowans Were Scared Into Taking the Virus Seriously URL: https://www.safegraph.com/publications/iowans-were-scared-into-taking-the-virus-seriously/ #### Is It Safer to Visit a Coffee Shop or a Gym? URL: https://www.safegraph.com/publications/is-it-safer-to-visit-a-coffee-shop-or-a-gym/ #### Leaders at the Frontier Offer Insights About Harnessing Data and AI URL: https://www.safegraph.com/publications/leaders-at-the-frontier-offer-insights-about-harnessing-data-and-ai/ #### Learning from Friends in a Pandemic: Social Networks and the Macroeconomic Response of Consumption URL: https://www.safegraph.com/publications/learning-from-friends-in-a-pandemic-social-networks-and-the-macroeconomic-response-of-consumption/ #### Liquor Consumption: a Data Story URL: https://www.safegraph.com/publications/liquor-consumption-a-data-story/ #### Marijuana Consumption: a Data Story URL: https://www.safegraph.com/publications/marijuana-consumption-a-data-story/ #### Mayor de Blasio Announces Launch of The NYC Recovery Data Partnership With Open Call for Data URL: https://www.safegraph.com/publications/mayor-de-blasio-announces-launch-of-the-nyc-recovery-data-partnership-with-open-call-for-data/ #### McDonald’s Celebrity Buzz Fuels U.S. Growth as Overseas Lags URL: https://www.safegraph.com/publications/mcdonalds-celebrity-buzz-fuels-u-s-growth-as-overseas-lags/ #### McDonald’s Celebrity Buzz Fuels U.S. Growth as Overseas Lags URL: https://www.safegraph.com/publications/mcdonalds-celebrity-buzz-fuels-u-s-growth-as-overseas-lags-2/ #### McDonald’s U.S. 
Traffic Beats 2019 Levels at Lunch, Dinner: Data URL: https://www.safegraph.com/publications/mcdonalds-u-s-traffic-beats-2019-levels-at-lunch-dinner-data/ #### Measuring Wisconsin Economic Activity Using Foot Traffic Data URL: https://www.safegraph.com/publications/measuring-wisconsin-economic-activity-using-foot-traffic-data/ #### Meet the company helping scientists study Covid-19 with your location data URL: https://www.safegraph.com/publications/meet-the-company-helping-scientists-study-covid-19-with-your-location-data/ #### Meet the people still sheltering in place in the Bay Area: ‘I don’t see an end to it’ URL: https://www.safegraph.com/publications/meet-the-people-still-sheltering-in-place-in-the-bay-area-i-dont-see-an-end-to-it/ #### Memphis city leaders use data from cell phones to track spread of COVID-19 URL: https://www.safegraph.com/publications/memphis-city-leaders-use-data-from-cell-phones-to-track-spread-of-covid-19/ #### Mobile location big data can help predict the potential infected areas as coronavirus spreads URL: https://www.safegraph.com/publications/mobile-location-big-data-can-help-predict-the-potential-infected-areas-as-coronavirus-spreads/ #### Mobile Phone Data Show More Americans Are Leaving Their Homes, Despite Orders URL: https://www.safegraph.com/publications/mobile-phone-data-show-more-americans-are-leaving-their-homes-despite-orders/ #### Mobile phone location data from US shows more people leaving home despite stay-at-home orders URL: https://www.safegraph.com/publications/mobile-phone-location-data-from-us-shows-more-people-leaving-home-despite-stay-at-home-orders/ #### Mobility and Engagement Index URL: https://www.safegraph.com/publications/mobility-and-engagement-index/ #### Mobility network modeling explains higher SARS-CoV-2 infection rates among disadvantaged groups and informs reopening strategies URL: 
https://www.safegraph.com/publications/mobility-network-modeling-explains-higher-sars-cov-2-infection-rates-among-disadvantaged-groups-and-informs-reopening-strategies/ #### Mobility network models of COVID-19 explain inequities and inform reopening URL: https://www.safegraph.com/publications/mobility-network-models-of-covid-19-explain-inequities-and-inform-reopening/ #### Modeling the Spatial Factors of COVID-19 in New York City URL: https://www.safegraph.com/publications/modeling-the-spatial-factors-of-covid-19-in-new-york-city/ #### Monitoring foot traffic during a pandemic URL: https://www.safegraph.com/publications/monitoring-foot-traffic-during-a-pandemic/ #### More cities and states are opening bars and restaurants despite mounting evidence of potential danger URL: https://www.safegraph.com/publications/more-cities-and-states-are-opening-bars-and-restaurants-despite-mounting-evidence-of-potential-danger/ #### NC vs. SC: What GPS phone data say about how well residents stay home for coronavirus URL: https://www.safegraph.com/publications/nc-vs-sc-what-gps-phone-data-say-about-how-well-residents-stay-home-for-coronavirus/ #### Neighborhood income and physical distancing during the COVID-19 pandemic in the U.S. 
URL: https://www.safegraph.com/publications/neighborhood-income-and-physical-distancing-during-the-covid-19-pandemic-in-the-u-s/ #### New Coronavirus Surges Slow Economic Recovery URL: https://www.safegraph.com/publications/new-coronavirus-surges-slow-economic-recovery/ #### New COVID-19 modeling: Social distancing is working in MN — but only if we keep it up URL: https://www.safegraph.com/publications/new-covid-19-modeling-social-distancing-is-working-in-mn-but-only-if-we-keep-it-up/ #### New Model Suggests California Can Relax Social Distancing May 17 URL: https://www.safegraph.com/publications/new-model-suggests-california-can-relax-social-distancing-may-17/ #### New Study: College Spring Break Helped Spread The Coronavirus URL: https://www.safegraph.com/publications/new-study-college-spring-break-helped-spread-the-coronavirus/ #### New York City Reopening Splits Along Lines of Wealth and Race URL: https://www.safegraph.com/publications/new-york-city-reopening-splits-along-lines-of-wealth-and-race/ #### New York City’s shutdown reduced spread of coronavirus by 70 percent, study finds URL: https://www.safegraph.com/publications/new-york-citys-shutdown-reduced-spread-of-coronavirus-by-70-percent-study-finds/ #### No NFL fans at MetLife Stadium hurts more than just the New York Giants and the New York Jets URL: https://www.safegraph.com/publications/no-nfl-fans-at-metlife-stadium-hurts-more-than-just-the-new-york-giants-and-the-new-york-jets/ #### Not All Food Delivery Options Are Savory URL: https://www.safegraph.com/publications/not-all-food-delivery-options-are-savory/ #### OmniSci Welcomes SafeGraph and Veraset to Its Data Catalog, Providing POI/GPS Data for Commercial, Business, Government URL: https://www.safegraph.com/publications/omnisci-welcomes-safegraph-and-veraset-to-its-data-catalog-providing-poi-gps-data-for-commercial-business-government/ #### Patterns of Denver Car Thefts and Nearby Population URL: 
https://www.safegraph.com/publications/patterns-of-denver-car-thefts-and-nearby-population/ #### PCCI’s Vulnerability Index URL: https://www.safegraph.com/publications/pccis-vulnerability-index/ #### Peloton and the fate of the fitness industry URL: https://www.safegraph.com/publications/peloton-and-the-fate-of-the-fitness-industry/ #### People are leaving their homes a lot less. CDC numbers show how much URL: https://www.safegraph.com/publications/people-are-leaving-their-homes-a-lot-less-cdc-numbers-show-how-much/ #### Percolation of temporal hierarchical mobility networks during COVID-19 URL: https://www.safegraph.com/publications/percolation-of-temporal-hierarchical-mobility-networks-during-covid-19/ #### Peter Thiel invests in Denver geospatial data startup's $45M Series B round URL: https://www.safegraph.com/publications/peter-thiel-invests-in-denver-geospatial-data-startups-45m-series-b-round/ #### Phone data show consumers avoiding stores, restaurants as COVID surges URL: https://www.safegraph.com/publications/phone-data-show-consumers-avoiding-stores-restaurants-as-covid-surges/ #### Placekey: The new standard identifier for physical places is adopted by world’s largest organizations, goes no-code URL: https://www.safegraph.com/publications/placekey-the-new-standard-identifier-for-physical-places-is-adopted-by-worlds-largest-organizations-goes-no-code/ #### Plunging U.S. 
GDP through June gives way to slow climb back in July URL: https://www.safegraph.com/publications/plunging-u-s-gdp-through-june-gives-way-to-slow-climb-back-in-july/ #### Points-of-Interest from Mapillary Street-level Imagery: A Dataset For Neighborhood Analytics URL: https://www.safegraph.com/publications/points-of-interest-from-mapillary-street-level-imagery-a-dataset-for-neighborhood-analytics/ #### Polarization and Public Health: Partisan Differences in Social Distancing during COVID-19 URL: https://www.safegraph.com/publications/polarization-and-public-health-partisan/ #### Political Beliefs affect Compliance with COVID-19 Social Distancing Orders URL: https://www.safegraph.com/publications/political-beliefs-affect-compliance-with-covid-19-social-distancing-orders/ #### Political beliefs affect compliance with COVID-19 social distancing orders URL: https://www.safegraph.com/publications/political-beliefs-affect-compliance-with-covid-19-social-distancing-orders-2/ #### Poorer areas have gone from being the least mobile before COVID-19 to the most mobile URL: https://www.safegraph.com/publications/poorer-areas-have-gone-from-being-the-least-mobile-before-covid-19-to-the-most-mobile/ #### Predicting Stages in Omnichannel Path to Purchase: A Deep Learning Model URL: https://www.safegraph.com/publications/predicting-stages-in-omnichannel-path-to-purchase-a-deep-learning-model/ #### Predicting when the current epidemic phase will end: initial estimates on when we could shift to containment strategies in the US URL: https://www.safegraph.com/publications/predicting-when-the-current-epidemic-phase-will-end-initial-estimates-on-when-we-could-shift-to-containment-strategies-in-the-us/ #### Private Precaution and Public Restrictions: What Drives Social Distancing and Industry Foot Traffic in the COVID-19 Era? 
Copy URL: https://www.safegraph.com/publications/private-precaution-and-public-restrictions-what-drives-social-distancing-and-industry-foot-traffic-in-the-covid-19-era-copy/ #### Probabilistic Program Inference in Network-Based Epidemiological Simulations URL: https://www.safegraph.com/publications/probabilistic-program-inference-in-network-based-epidemiological-simulations/ #### Projections for first-wave COVID-19 deaths across the US using social-distancing measures derived from mobile phones URL: https://www.safegraph.com/publications/projections-for-first-wave-covid-19-deaths-across-the-u-s-using-social-distancing-measures-derived-from-mobile-phones/ #### Projections for first-wave COVID-19 deaths across the US using social-distancing measures derived from mobile phones URL: https://www.safegraph.com/publications/projections-for-first-wave-covid-19-deaths-across-the-us-using-social-distancing-measures-derived-from-mobile-phones/ #### Public mobility data enables COVID-19 forecasting and management at local and global scales URL: https://www.safegraph.com/publications/public-mobility-data-enables-covid-19-forecasting-and-management-at-local-and-global-scales/ #### Pursuing Stakeholder Capitalism Is an Impossible Task When Stakeholders Have Different Beliefs URL: https://www.safegraph.com/publications/pursuing-stakeholder-capitalism-is-an-impossible-task-when-stakeholders-have-different-beliefs/ #### Putting Context Over Coordinates with New Location Encoding Standard URL: https://www.safegraph.com/publications/putting-context-over-coordinates-with-new-location-encoding-standard/ #### Putting the Air Transportation System to sleep: a passenger perspective measured by passenger-generated data URL: https://www.safegraph.com/publications/putting-the-air-transportation-system-to-sleep-a-passenger-perspective-measured-by-passenger-generated-data/ #### Rationing Social Contact During the COVID-19 Pandemic: Transmission Risk and Social Benefits of US Locations URL: 
https://www.safegraph.com/publications/rationing-social-contact-during-the-covid-19-pandemic-transmission-risk-and-social-benefits-of-us-locations/ #### Rationing social contact during the COVID-19 pandemic: Transmission risk and social benefits of US locations URL: https://www.safegraph.com/publications/rationing-social-contact-during-the-covid-19-pandemic-transmission-risk-and-social-benefits-of-us-locations-2/ #### Remember Lake of the Ozarks party pics? Many other places boomed Memorial Day URL: https://www.safegraph.com/publications/remember-lake-of-the-ozarks-party-pics-many-other-places-boomed-memorial-day/ #### Reopening Wisconsin: Regional Health and Economic Factors URL: https://www.safegraph.com/publications/reopening-wisconsin-regional-health-and-economic-factors/ #### Retail Spending in July Topped Pre-Pandemic Levels URL: https://www.safegraph.com/publications/retail-spending-in-july-topped-pre-pandemic-levels/ #### Role of meteorological factors in the transmission of SARS-CoV-2 in the United States URL: https://www.safegraph.com/publications/role-of-meteorological-factors-in-the-transmission-of-sars-cov-2-in-the-united-states/ #### Rural Americans Stopped Staying In. Then Covid-19 Hit. URL: https://www.safegraph.com/publications/rural-americans-stopped-staying-in-then-covid-19-hit/ #### SafeGraph creates a dashboard showing foot traffic patterns across the U.S. 
URL: https://www.safegraph.com/publications/safegraph-creates-a-dashboard-showing-foot-traffic-patterns-across-the-u-s/ #### SafeGraph Raises $45M to Democratize Access to Places Data URL: https://www.safegraph.com/publications/safegraph-raises-45m-to-democratize-access-to-places-data/ #### SafeGraph Social Distancing (Block Group) URL: https://www.safegraph.com/publications/safegraph-social-distancing-block-group/ #### SafeGraph Unveils the Shelter in Place Dashboard URL: https://www.safegraph.com/publications/safegraph-unveils-the-shelter-in-place-dashboard/ #### San Francisco COVID-19 Data Tracker URL: https://www.safegraph.com/publications/san-francisco-covid-19-data-tracker/ #### San Francisco flattened the curve early. Now, coronavirus cases are surging. URL: https://www.safegraph.com/publications/san-francisco-flattened-the-curve-early-now-coronavirus-cases-are-surging/ #### San Jose Social Distancing Compliance URL: https://www.safegraph.com/publications/san-jose-social-distancing-compliance/ #### See Where Texans Are Staying Home to Fight COVID-19 URL: https://www.safegraph.com/publications/see-where-texans-are-staying-home-to-fight-covid-19/ #### Shelter in Place? Depends on the Place: Corruption and Social Distancing in American States URL: https://www.safegraph.com/publications/shelter-in-place-depends-on-the-place-corruption-and-social-distancing-in-american-states/ #### Smart or Lucky? 
How Florida Dodged the Worst of Coronavirus URL: https://www.safegraph.com/publications/smart-or-lucky-how-florida-dodged-the-worst-of-coronavirus/ #### Smart phone data reveals how states are responding to shelter-in-place orders URL: https://www.safegraph.com/publications/smart-phone-data-reveals-how-states-are-responding-to-shelter-in-place-orders/ #### Smartphone Data: Many Americans Ignored Thanksgiving Travel Warnings From The CDC URL: https://www.safegraph.com/publications/smartphone-data-many-americans-ignored-thanksgiving-travel-warnings-from-the-cdc/ #### Social connections with COVID-19-affected areas increase compliance with mobility restrictions URL: https://www.safegraph.com/publications/social-connections-with-covid-19-affected-areas-increase-compliance-with-mobility-restrictions/ #### Social distancing ‘substantially varies’ by income, study finds URL: https://www.safegraph.com/publications/social-distancing-substantially-varies-by-income-study-finds/ #### Social Distancing and Social Capital: Why U.S. Counties Respond Differently to COVID-19 URL: https://www.safegraph.com/publications/social-distancing-and-social-capital-why-u-s-counties-respond-differently-to-covid-19/ #### Social Distancing Data with Ryan Fox Squire URL: https://www.safegraph.com/publications/social-distancing-data-with-ryan-fox-squire/ #### Social distancing in Black and white neighborhoods in Detroit: A data-driven look at vulnerable communities URL: https://www.safegraph.com/publications/social-distancing-in-black-and-white-neighborhoods-in-detroit-a-data-driven-look-at-vulnerable-communities/ #### Social Distancing Is Working, According To Your Cellphone Data URL: https://www.safegraph.com/publications/social-distancing-is-working-according-to-your-cellphone-data/ #### Social Distancing Might Stop. And Start. And Stop. And Start. Until We Have A Vaccine. 
URL: https://www.safegraph.com/publications/coronavirus-distancing-new-normal/ #### Social distancing responses to COVID-19 emergency declarations strongly differentiated by income URL: https://www.safegraph.com/publications/social-distancing-responses-to-covid-19-emergency-declarations-strongly-differentiated-by-income/ #### Social Influence in the COVID-19 Pandemic: Community Establishments' Closure Decisions Follow Those of Nearby Chain Establishments URL: https://www.safegraph.com/publications/social-influence-in-the-covid-19-pandemic-community-establishments-closure-decisions-follow-those-of-nearby-chain-establishments/ #### Socially Determined, SafeGraph to enhance SDOH URL: https://www.safegraph.com/publications/socially-determined-safegraph-to-enhance-sdoh/ #### Stanford and Carnegie Mellon find race and age bias in mobility data that drives COVID-19 policy URL: https://www.safegraph.com/publications/stanford-and-carnegie-mellon-find-race-and-age-bias-in-mobility-data-that-drives-covid-19-policy/ #### Stanford scientists' computer model predicts COVID-19 spread in cities URL: https://www.safegraph.com/publications/stanford-scientists-computer-model-predicts-covid-19-spread-in-cities/ #### Stanford-led team creates a computer model that can predict how COVID-19 spreads in cities URL: https://www.safegraph.com/publications/stanford-led-team-creates-a-computer-model-that-can-predict-how-covid-19-spreads-in-cities/ #### Starbuck Open Bathroom Policy Comes With Heavy Cost, Study Finds URL: https://www.safegraph.com/publications/starbuck-open-bathroom-policy-comes-with-heavy-cost-study-finds/ #### State of New Jersey Using Tyler Technologies' Solution to Understand Economic Data URL: https://www.safegraph.com/publications/state-of-new-jersey-using-tyler-technologies-solution-to-understand-economic-data/ #### Sturgis motorcycle rally in South Dakota in August linked to more than 250,000 coronavirus cases, study finds URL: 
https://www.safegraph.com/publications/sturgis-motorcycle-rally-in-south-dakota-in-august-linked-to-more-than-250-000-coronavirus-cases-study-finds/ #### Suitsupply bets on New York’s retail recovery with expansion of SoHo store URL: https://www.safegraph.com/publications/suitsupply-bets-on-new-yorks-retail-recovery-with-expansion-of-soho-store/ #### Texas Is Showing the World How to Reopen Cautiously URL: https://www.safegraph.com/publications/texas-is-showing-the-world-how-to-reopen-cautiously/ #### The ‘Rocket Ship’ Economic Recovery Is Crashing URL: https://www.safegraph.com/publications/the-rocket-ship-economic-recovery-is-crashing/ #### The AEI Housing Center’s Nowcast: Reopening of 40 Metro Area Economies URL: https://www.safegraph.com/publications/the-aei-housing-centers-nowcast-reopening-of-40-metro-area-economies/ #### The Black Lives Matter Protests Have Taught Us More About The Coronavirus URL: https://www.safegraph.com/publications/the-black-lives-matter-protests-have-taught-us-more-about-the-coronavirus/ #### The causal effects of chronic air pollution on the intensity of COVID-19 disease: Some answers are blowing in the wind URL: https://www.safegraph.com/publications/the-causal-effects-of-chronic-air-pollution-on-the-intensity-of-covid-19-disease-some-answers-are-blowing-in-the-wind/ #### The Contagion Externality of a Superspreading Event: The Sturgis Motorcycle Rally and COVID-19 URL: https://www.safegraph.com/publications/the-contagion-externality-of-a-superspreading-event-the-sturgis-motorcycle-rally-and-covid-19/ #### The Cost of Staying Open: Voluntary Social Distancing and Lockdowns in the US URL: https://www.safegraph.com/publications/the-cost-of-staying-open-voluntary-social-distancing-and-lockdowns-in-the-us/ #### The COVID-19 Pandemic: Government vs. 
Community Action Across the United States URL: https://www.safegraph.com/publications/the-covid-19-pandemic-government-vs-community-action-across-the-united-states/ #### The data on how Covid-19 disrupted summer vacations URL: https://www.safegraph.com/publications/the-data-on-how-covid-19-disrupted-summer-vacations/ #### The Data Science of COVID-19 Spread: Some Troubling Current and Future Trends URL: https://www.safegraph.com/publications/the-data-science-of-covid-19-spread-some-troubling-current-and-future-trends/ #### The Delta Variant Is Already Leaving Its Mark on Business URL: https://www.safegraph.com/publications/the-delta-variant-is-already-leaving-its-mark-on-business/ #### The Fed Boldly Saves Markets. Now It’s Worrying About Main Street Business URL: https://www.safegraph.com/publications/the-fed-boldly-saves-markets-now-its-worrying-about-main-street-business/ #### The Finance 202: Consumers are closing their wallets again as coronavirus infections soar URL: https://www.safegraph.com/publications/the-finance-202-consumers-are-closing-their-wallets-again-as-coronavirus-infections-soar/ #### The geometry of the pandemic in America URL: https://www.safegraph.com/publications/the-geometry-of-the-pandemic-in-america/ #### The iconic brands that could disappear because of coronavirus URL: https://www.safegraph.com/publications/the-iconic-brands-that-could-disappear-because-of-coronavirus/ #### The Impact of Coronavirus (COVID-19) on Foot Traffic URL: https://www.safegraph.com/publications/the-impact-of-coronavirus-covid-19-on-foot-traffic-publication/ #### The Impact of COVID-19 on Trips to Urban Amenities: Examining Travel Behavior Changes in Somerville, MA URL: https://www.safegraph.com/publications/the-impact-of-covid-19-on-trips-to-urban-amenities-examining-travel-behavior-changes-in-somerville-ma/ #### The Impact of Statewide Stay-at-Home Orders: Estimating the Heterogeneous Effects Using GPS Data from Mobile Devices URL: 
https://www.safegraph.com/publications/the-impact-of-statewide-stay-at-home-orders-estimating-the-heterogeneous-effects-using-gps-data-from-mobile-devices/ #### The limits of smartphone data are on display as the country seeks to reopen URL: https://www.safegraph.com/publications/the-limits-of-smartphone-data-are-on-display-as-the-country-seeks-to-reopen/ #### The numbers behind the jobs numbers don't look so hot URL: https://www.safegraph.com/publications/the-numbers-behind-the-jobs-numbers-dont-look-so-hot/ #### The Persuasive Effect of Fox News: Non-Compliance with Social Distancing During the Covid-19 Pandemic URL: https://www.safegraph.com/publications/the-persuasive-effect-of-fox-news-non-compliance-with-social-distancing-during-the-covid-19-pandemic/ #### The Relationship between In-Person Voting, Consolidated Polling Locations, and Absentee Voting on COVID-19: Evidence from the Wisconsin Primary URL: https://www.safegraph.com/publications/the-relationship-between-in-person-voting-consolidated-polling-locations-and-absentee-voting-on-covid-19-evidence-from-the-wisconsin-primary/ #### The role of alcohol outlet visits derived from mobile phone location data in enhancing domestic violence prediction at the neighborhood level URL: https://www.safegraph.com/publications/the-role-of-alcohol-outlet-visits-derived-from-mobile-phone-location-data-in-enhancing-domestic-violence-prediction-at-the-neighborhood-level/ #### The spread of social distancing URL: https://www.safegraph.com/publications/the-spread-of-social-distancing-2/ #### The Spread of Social Distancing URL: https://www.safegraph.com/publications/the-spread-of-social-distancing/ #### The U.S. job market is still in very bad shape. 
Just wait until the fiscal time bomb goes off URL: https://www.safegraph.com/publications/the-u-s-job-market-is-still-in-very-bad-shape-just-wait-until-the-fiscal-time-bomb-goes-off/ #### The Unspoken Messages of COVID-19 Restrictions URL: https://www.safegraph.com/publications/the-unspoken-messages-of-covid-19-restrictions/ #### The US Didn’t See A Major Additional COVID-19 Thanksgiving Surge. Christmas Will Be A Bigger Challenge. URL: https://www.safegraph.com/publications/the-us-didnt-see-a-major-additional-covid-19-thanksgiving-surge-christmas-will-be-a-bigger-challenge/ #### There venues are high-risk areas for spreading coronavirus, model suggests URL: https://www.safegraph.com/publications/there-venues-are-high-risk-areas-for-spreading-coronavirus-model-suggests/ #### These charts show how Boston is coming back to life URL: https://www.safegraph.com/publications/these-charts-show-how-boston-is-coming-back-to-life/ #### ThinkData announces new partnership with SafeGraph URL: https://www.safegraph.com/publications/thinkdata-announces-new-partnership-with-safegraph/ #### This may be the post-pandemic economy’s most closely watched indicator URL: https://www.safegraph.com/publications/this-may-be-the-post-pandemic-economys-most-closely-watched-indicator/ #### Three Ways Data Scientists are Fighting COVID-19 URL: https://www.safegraph.com/publications/three-ways-data-scientists-are-fighting-covid-19/ #### Time-series clustering for home dwell time during COVID-19: what can we learn from it? 
URL: https://www.safegraph.com/publications/time-series-clustering-for-home-dwell-time-during-covid-19-what-can-we-learn-from-it/ #### Timing of Community Mitigation and Changes in Reported COVID-19 and Community Mobility URL: https://www.safegraph.com/publications/timing-of-community-mitigation-and-changes-in-reported-covid-19-and-community-mobility/ #### Timing of State and Territorial COVID-19 Stay-at-Home Orders and Changes in Population Movement — United States, March 1–May 31, 2020 URL: https://www.safegraph.com/publications/timing-of-state-and-territorial-covid-19-stay-at-home-orders-and-changes-in-population-movement-united-states-march-1-may-31-2020/ #### Top Fast Food Chains by State URL: https://www.safegraph.com/publications/top-fast-food-chains-by-state/ #### Tracing the Country's Stay-at-Home Behaviors URL: https://www.safegraph.com/publications/tracing-the-countrys-stay-at-home-behaviors/ #### Trump’s Tulsa Rally Drew People From Dozens of Virus Hot Spots in U.S. URL: https://www.safegraph.com/publications/trumps-tulsa-rally-drew-people-from-dozens-of-virus-hot-spots-in-u-s/ #### Two New Lockdown Studies URL: https://www.safegraph.com/publications/two-new-lockdown-studies/ #### U.S. and Oklahoma Economic Outlook in the Midst of COVID-19 and Low Oil Prices URL: https://www.safegraph.com/publications/u-s-and-oklahoma-economic-outlook-in-the-midst-of-covid-19-and-low-oil-prices/ #### U.S. county level analysis to determine If social distancing slowed the spread of COVID-19 URL: https://www.safegraph.com/publications/u-s-county-level-analysis-to-determine-if-social-distancing-slowed-the-spread-of-covid-19/ #### U.S. Downtowns Yearn for Vaccine as Merchant Traffic Off 70% URL: https://www.safegraph.com/publications/u-s-downtowns-yearn-for-vaccine-as-merchant-traffic-off-70/ #### U.S. economic rebound may be a slow train for the unemployed URL: https://www.safegraph.com/publications/u-s-economic-rebound-may-be-a-slow-train-for-the-unemployed/ #### U.S. 
Geographic Responses to Shelter in Place Orders URL: https://www.safegraph.com/publications/u-s-geographic-responses-to-shelter-in-place-orders-geospatial/ #### U.S. hits fiscal cliff with jobs, economic recovery in the balance URL: https://www.safegraph.com/publications/u-s-hits-fiscal-cliff-with-jobs-economic-recovery-in-the-balance/ #### Understanding the Racial and Income Gap in Commuting for Work Following COVID-19 URL: https://www.safegraph.com/publications/understanding-the-racial-and-income-gap-in-commuting-for-work-following-covid-19/ #### Unemployment's second wave? Stodgy reopening, virus surge may undercut U.S. jobs URL: https://www.safegraph.com/publications/unemployments-second-wave-stodgy-reopening-virus-surge-may-undercut-u-s-jobs/ #### Universal Studios’ Crowds Look Severely Thin After Reopening URL: https://www.safegraph.com/publications/universal-studios-crowds-look-severely-thin-after-reopening/ #### Vaccinations Help Dallas County’s COVID-19 Risk Drop 40 Percent in May, According to PCCI’s Vulnerability Index URL: https://www.safegraph.com/publications/vaccinations-help-dallas-countys-covid-19-risk-drop-40-percent-in-may-according-to-pccis-vulnerability-index/ #### Virus surges. Work hours plateau. U.S. may be flattening the wrong curve URL: https://www.safegraph.com/publications/virus-surges-work-hours-plateau-u-s-may-be-flattening-the-wrong-curve/ #### Visits to U.S. stores, restaurants stall as concern over economic recovery grows URL: https://www.safegraph.com/publications/visits-to-u-s-stores-restaurants-stall-as-concern-over-economic-recovery-grows/ #### Walking in the University of Memphis: Which College Campuses Opened in Fall 2020? 
URL: https://www.safegraph.com/publications/walking-in-the-university-of-memphis-which-college-campuses-opened-in-fall-2020/ #### Walmart Trailed Supermarkets Amid Peak Panic-Buying: Data URL: https://www.safegraph.com/publications/walmart-trailed-supermarkets-amid-peak-panic-buying-data/ #### Walmart trailed supermarkets amid peak panic-buying: data URL: https://www.safegraph.com/publications/walmart-trailed-supermarkets-amid-peak-panic-buying-data-2/ #### Want to Help a Local Restaurant? Do This. URL: https://www.safegraph.com/publications/want-to-help-a-local-restaurant-do-this/ #### We’ve been cooped up with our families for almost a year. This is the result. URL: https://www.safegraph.com/publications/weve-been-cooped-up-with-our-families-for-almost-a-year-this-is-the-result/ #### What U.S. leaders say affects whether Americans stay at home, CDC data suggests URL: https://www.safegraph.com/publications/what-u-s-leaders-say-affects-whether-americans-stay-at-home-cdc-data-suggests/ #### What's New in Esri Demographics (Sep 2020) URL: https://www.safegraph.com/publications/whats-new-in-esri-demographics-sep-2020/ #### Where Americans are still staying at home the most URL: https://www.safegraph.com/publications/where-americans-are-still-staying-at-home-the-most/ #### Which workers bear the burden of social distancing policies? URL: https://www.safegraph.com/publications/which-workers-bear-the-burden-of-social-distancing-policies/ #### White House or State House: Who do we listen to on social distancing? URL: https://www.safegraph.com/publications/white-house-or-state-house-who-do-we-listen-to-on-social-distancing/ #### Who Would Have Predicted This? 
Americans Excel at Staying Home URL: https://www.safegraph.com/publications/who-would-have-predicted-this-americans-excel-at-staying-home/ #### Why economists, hedge funds, and health officials are using this startup’s data to understand the pandemic URL: https://www.safegraph.com/publications/why-economists-hedge-funds-and-health-officials-are-using-this-startups-data-to-understand-the-pandemic/ #### Why maps matter in our response to COVID-19 URL: https://www.safegraph.com/publications/why-maps-matter-in-our-response-to-covid-19/ #### William Watson: Surviving the ups and downs on the info corona-coaster URL: https://www.safegraph.com/publications/william-watson-surviving-the-ups-and-downs-on-the-info-corona-coaster/ #### Your Income Predicts How Well You Can Socially Distance URL: https://www.safegraph.com/publications/your-income-predicts-how-well-you-can-socially-distance/ ### Use Cases #### ARITY + SAFEGRAPH ARITY + SAFEGRAPH Building Trips + Audiences with SafeGraph Places Partner with a modern data provider so you can spend less time sourcing data, and more time driving business outcomes. Learn More Places Data to Enhance Arity's Trips and Audience Product Combine connected car data with SafeGraph's accurate points of interest to contextualize a full trip with stops. Explore the interactive dashboard INRIX enhances vehicle trips data with precise POIs from SafeGraph Historically, investors analyzed passenger and commercial vehicle trip data separately. However, INRIX's Trips Plus product combined with SafeGraph Places provides a significant opportunity to assess whether a brand's manufacturing, logistics, distribution, or shipping activities align with consumer behaviors or diverge from them. Read the case study Access Extensive Coverage Across Brands and Categories SafeGraph data allows you to further segment your analysis by specific brands or categories. Explore our global place coverage to see the possibilities. 
View all the stats The Industry’s Most Trusted Places Data
Learn more about accurate and precise global POI data #### Global OOH Global OOH Improve out-of-home advertising with SafeGraph Places Use accurate global points of interest to segment your audience and build campaigns based on proximity to real-world places. Learn More Fuel Higher Performing OOH Advertising Campaigns See how Clear Channel Europe used SafeGraph’s global point of interest data to pinpoint the correct OOH ad placements based on their customers’ marketing goals. Read the case study Precise Building Footprints for Visit Attribution Use SafeGraph point of interest geofences to attribute visits by a third party to help inform audience creation. Read the whitepaper Access Extensive Coverage Across Brands and Categories SafeGraph data allows you to further segment your analysis by specific brands or categories. Explore our global place coverage to see the possibilities. View all Stats The Industry’s Most Trusted Places Data
Learn more about accurate and precise global POI data #### Lead Generation Lead Generation Improve Lead Generation With High Quality Places Data Power sales teams with accurate places data to target the right locations, increase conversions, and maintain an up-to-date database of prospects. Learn More Stay ahead of the competition with the freshest data With continuous data collection and monthly updates in bulk, you'll gain access to the most current, reliable, global location dataset so you can be the first to know when there’s a new location to target. Best Practices in Applying Accurate Location Data to CPG Improve conversion rates with granular insights Tailor your outbound messaging by knowing more about a location than its industry. Instead of just "Full-Service Restaurant" (NAICS 722511), we also provide tags like 'Pizza', 'Lunch', 'Dinner', 'Drive Through', and 'Late Night' so that you can glean more meaningful details about that specific restaurant. See what else SafeGraph Places has to offer Don’t waste time targeting the wrong locations Increase sales efficiency by easily identifying which locations have shut down, so you avoid repeated follow-ups or showing up at the wrong spots. Read about our open and close data The Industry’s Most Trusted Places Data
Learn more about accurate and precise global POI data #### Transaction Enrichment Transaction Enrichment Contextualize Transactions with High Quality Places Data Accurate and up-to-date global points of interest (POI) to power transaction enrichment capabilities. Learn More Understand the “where” for every transaction Fill in data gaps with precise point of interest details including merchant name, category, address, lat/long, and more. For example, turn a “POS PURCHASE POS0828 WALGREENS” on a credit card statement into insights like “this purchase was made at the Walgreens at 1300 Bush Street in San Francisco, CA”. Explore Our POI Attributes Easily match financial data with Places data Aligning data from various transaction sources with SafeGraph Places data doesn’t need to be a heavy lift. Unlike other sources of POI data, our unique columns such as Store ID provide geographic coordinates to help you join data to specific merchant locations. Explore SafeGraph’s Store ID Gain Confidence with the Leading Merchant Database As a business solely focused on building high quality datasets, we use rigorous sourcing and verification to ensure that you're receiving the most accurate, current, and reliable data available with our monthly updates. Our fine-tuned set of rich attributes provides all the context you need for each transaction. See an example of our coverage The Industry’s Most Trusted Places Data
Learn more about accurate and precise global POI data #### WOOLBRIGHT + SAFEGRAPH WOOLBRIGHT + SAFEGRAPH Expand redevelopment efforts with SafeGraph Places Partner with a modern data provider so you can spend less time sourcing data, and more time driving business outcomes. Learn More Places Data to Support Woolbright’s Expansion Efforts Use accurate points of interest and building footprints to purchase and redevelop underperforming shopping centers across the US. Explore the Interactive Dashboard Access Extensive Coverage Across Brands and Categories SafeGraph data allows you to further segment your analysis by specific brands or categories. Explore our global place coverage to see the possibilities. View all Stats Avison Young Uses SafeGraph Data to Offer Local Market Insights for Commercial Real Estate Site Selection In the past, Avison Young’s team of analysts would often spend up to 40% of their time (per project) cleaning data to make it usable. According to Julia Adams, Director of Data Science at Avison Young, “SafeGraph makes it possible for us to provide the most updated, real-time insights to our clients with the greatest amount of accuracy and precision.” Read the Case Study The Industry’s Most Trusted Places Data
Learn more about accurate and precise global POI data ### Custom Landing Pages #### Esri Partner Conference 2023 URL: https://www.safegraph.com/lp/epc-2023/ #### Esri User Conference 2023 URL: https://www.safegraph.com/lp/esri-user-conference-2023/ #### Geocoded addresses URL: https://www.safegraph.com/lp/geocoded-addresses/ #### Get High Quality POI Data The Most Trusted Partner for Accurate Global POI Data Clean, high-quality places data, so your analytics, models, and location products start with truth. Find the Right POI Data for Your Product The Problem with Most POI Datasets Many POI datasets lack consistency, accuracy, and long-term reliability. Outdated Locations Static POI data quickly falls behind as businesses open, close, and relocate constantly. Duplicate & Unstable Records Inconsistent IDs and duplicate listings disrupt joins and historical tracking. Limited Business Attributes Basic name and address records are not enough for serious analytics or modeling. How SafeGraph Delivers Top-Quality POI Data Built for teams that rely on accurate, scalable, and continuously updated point of interest data. Continuously Updated Locations Structured monthly refresh cycles keep your POI dataset aligned with real-world change. Clean, Stable Identifiers Designed for reliable joins, longitudinal analysis, and enterprise workflows. Rich, Standardized Attributes A production-ready POI database with consistent categories, brand mapping, and verified business POI coverage. Get Started with POI Data Trusted by the World’s Leading Innovators and Builders Why teams choose SafeGraph data SafeGraph is the trusted choice for enterprises that rely on accurate, production-ready location data. Here’s why. Production-ready from day one Data arrives clean, structured, and normalized. No heavy preprocessing required.
Less time fixing bad data Reduce engineering and analyst time spent sorting, cleaning, and reconciling inconsistent location data. Built for real-world change Frequent updates capture openings, closures, and category changes across geographies. Schedule a Demo See the Difference in Places Data Quality Other data providers vs. SafeGraph data Inside the SafeGraph POI Database A comprehensive POI dataset built with stable identifiers, precise geospatial data, and rich business attributes. Placekey Universal placekeys are unique IDs that allow Places data to be easily joined with other datasets. Geographic Coordinates Exact latitude and longitude for each POI location. Industries & Categories Granular tags that provide context for detailed POI data. Opened/Closed Dates Track business lifecycle data for accurate timelines. Brands Brand names and IDs for precise chain-level analysis. Store IDs Store-level identifiers for clean joins and data enrichment. Polygons Precise POI footprints as part of geospatial places attributes. ...and more Open hours, phone numbers, website URLs, and other location dataset attributes. Here’s What Our Customers Say About Us Real stories from teams using our POI data to power their decisions. “When the dataset is good, we can trust it to make business decisions. Working with SafeGraph data has made our jobs much easier in this regard.” Andy Stevens, Chief Data Officer, Clear Channel Europe “With our previous data provider, the polygons didn’t always align to where stores really were.
It took us sometimes a month or two to import and clean the data before we could even use it.” Matt Taaffe, VP of Product, Olvin “We expect nothing less than the gold standard in data…which is precisely why we decided to partner with SafeGraph.” James Sexton-Barrow, Head of Planning at Mobsta “With SafeGraph, we’ve not only improved the efficiency and effectiveness of our analysis but also have been able to increase our speed to value; now our analysts can answer our clients’ questions and deliver actionable insights faster than ever before.” Julian Adams, Director of Data Science, Avison Young “SafeGraph takes data quality very seriously, which is why, if a POI is included in the Places dataset, we can always trust it’s a real location.” Oban MacTavish, CEO at Spade Ready to Build with Reliable POI Data? Connect with our team to explore how SafeGraph’s production-ready POI database can support your business growth.
Talk to a Data Expert FAQs 1. What is SafeGraph POI data? SafeGraph POI data is a structured point of interest dataset containing verified business POI records, geographic coordinates, categories, and operational attributes. 2. What does the SafeGraph POI database include? The POI database includes business names, precise location data, brand relationships, industry classifications, and lifecycle indicators such as open and closed dates. 3. How is this POI dataset different from others? Our POI dataset is updated monthly, built with stable identifiers, and designed for production-ready analytics and enterprise use. 4. Who uses SafeGraph business POI data? Retail, financial services, consumer goods, and software teams rely on our business POI data for mapping, market analysis, and location intelligence. All Rights Reserved © 2026 SafeGraph #### Influencer Bios URL: https://www.safegraph.com/lp/influencer-bios/ #### Local Business Appointments API URL: https://www.safegraph.com/lp/local-business-appointments-api/ #### POI Data for AI Trip Planners URL: https://www.safegraph.com/lp/poi-data-for-ai-trip-apps/ #### Reservations Terms of Service URL: https://www.safegraph.com/lp/reservations-terms-of-service/ #### SafeGraph Aliases for Emergency Response URL: https://www.safegraph.com/lp/safegraph-aliases-for-emergency-response/ #### SafeGraph and CoreLogic Partnership
URL: https://www.safegraph.com/lp/corelogic-safegraph-partnership/ #### SafeGraph Location-Based Data URL: https://www.safegraph.com/lp/safegraph-location-based-data/ #### SafeGraph Places for Retail URL: https://www.safegraph.com/lp/safegraph-places-for-retail/ #### SDSC London 2023 URL: https://www.safegraph.com/lp/sdsc-london-2023/ #### Spatial Analysis in 2025: Key Trends URL: https://www.safegraph.com/lp/spatial-analysis-in-2025-key-trends/ #### Vacancies URL: https://www.safegraph.com/lp/vacancies/ ### Free Data #### Download Open Census Data & Neighborhood Demographics Free Data Download Open Census Data & Neighborhood Demographics Free bulk download of the complete US Decennial Census and American Community Survey data from 2016-2020, with CBG geometry. Download the free dataset What's included in this download: 2016 5-year ACS; 2017 5-year ACS; 2018 5-year ACS; 2019 5-year ACS; NEW 2020 5-year ACS; 2010-2019 Census Block Group geometries*; NEW 2020-2029 Census Block Group geometries; NEW 2020 decennial redistricting data. *At this time SafeGraph datasets use 2010-2019 CBG geometries. Why we offer this:
Read the full origin story

#### Download SafeGraph data for Moscow, Las Vegas & Washington State in Senzing JSON

Free download of SafeGraph data from Washington State, Las Vegas, and Moscow in Senzing JSON.

What's included in this download:

- Every data attribute found in SafeGraph Places (see schema here)
- SafeGraph data for Washington State in Senzing JSON
- SafeGraph data for Moscow in Senzing JSON
- SafeGraph data for Las Vegas in Senzing JSON

*Additional information about using SafeGraph data with Senzing can be found here

#### Free Data: Retail Brands in Miami Dade - POI Dataset

SafeGraph Places: Retail Brands in Miami-Dade

Unlock Comprehensive POI Data for Miami's Retail Scene

What's included in this download:

- Detailed Retail Chains List: A comprehensive list of retail chains in Miami, FL.
- Key Attributes: Addresses, phone numbers, categories, and more.
- Data Schema: View the full data schema here.

Why SafeGraph?

- Accurate and Up-to-Date: Regularly updated data to ensure you have the latest information.
- Comprehensive Coverage: Extensive details on places around the world.
- Trusted by Industry Leaders: Used by top companies for accurate location-based insights.

#### Global Places Sample Data
URL: https://www.safegraph.com/free-data/global-places-sample-data/

#### Open Census Data
URL: https://www.safegraph.com/free-data/open-census-data-2/

#### Open Dataset: Address in French Polynesia

SafeGraph Places: Address Data in French Polynesia

Unlock Verified, Structured Addresses

What's included in this download:

- Detailed Addresses: A comprehensive list of addresses in French Polynesia.
- Key Attributes: Primary Number, Street, City, Sub Region, Region, Postal Code, Latitude, Longitude.
- Data Schema: View the full data schema here.

This work is openly licensed via CC BY 4.0.

Why SafeGraph?
French Polynesia spans over 100 islands, many of which have no formal address systems. As a result, large parts of the country have no addresses to capture. SafeGraph provides strong coverage in the populated areas where address systems do exist, offering verified, high-precision data where it matters most.

#### Open Dataset: French Polynesia
URL: https://www.safegraph.com/free-data/open-dataset-french-polynesia-2/

#### Parking Lots Sample Data
URL: https://www.safegraph.com/free-data/parking-lots-sample-data-2/

#### Precision Polygon Data with SafeGraph
URL: https://www.safegraph.com/free-data/precision-polygon-data-with-safegraph/

#### Precision Polygon Data with SafeGraph Geometry

SafeGraph Geometry attributes contain POI footprints and spatial hierarchy metadata. Available for 15 million POIs globally.

What's included in this download:

- 100 sample rows including basic location information and building shapes
- Every column or data attribute found in SafeGraph Geometry (see schema here)
- Representing a randomized set of brands in San Francisco, CA

Why SafeGraph Geometry Data?

We emphasize accuracy so that you get better outcomes from our Geometry data. For the most popular branded POIs important to our customers, we hand draw polygons to ensure they are close to 100% accurate. Our technical documentation outlines which building footprints are hand drawn and which are machine generated, so you know exactly what data you are working with.

Spatial hierarchy is a key aspect of our polygon data, representing the relationship between two or more polygons. This allows you to understand if a polygon includes a parking lot, if it shares a building footprint with another POI, if it’s a child or parent to another POI, and more.
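The spatial hierarchy questions above (parent or child, shared footprint, parking lot included) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the column names `parent_placekey`, `polygon_class`, and `includes_parking_lot` are assumed to mirror the Geometry schema and should be checked against the published schema, and the sample rows and placekeys are invented.

```python
from collections import defaultdict

# Hypothetical sample rows: placekeys and values are invented for illustration.
# Column names are assumptions and should be checked against the Geometry schema.
rows = [
    {"placekey": "mall-1", "parent_placekey": None,
     "polygon_class": "OWNED_POLYGON", "includes_parking_lot": True},
    {"placekey": "shop-a", "parent_placekey": "mall-1",
     "polygon_class": "SHARED_POLYGON", "includes_parking_lot": False},
    {"placekey": "shop-b", "parent_placekey": "mall-1",
     "polygon_class": "SHARED_POLYGON", "includes_parking_lot": False},
    {"placekey": "cafe-1", "parent_placekey": None,
     "polygon_class": "OWNED_POLYGON", "includes_parking_lot": False},
]


def children_by_parent(rows):
    """Group child placekeys under the parent POI they point to."""
    children = defaultdict(list)
    for row in rows:
        parent = row["parent_placekey"]
        if parent:
            children[parent].append(row["placekey"])
    return dict(children)


def shares_footprint(row):
    """True when the POI shares its building footprint with another POI."""
    return row["polygon_class"] == "SHARED_POLYGON"


def with_parking(rows):
    """Placekeys whose polygon is flagged as including a parking lot."""
    return [row["placekey"] for row in rows if row["includes_parking_lot"]]


hierarchy = children_by_parent(rows)
for parent, kids in hierarchy.items():
    print(f"{parent} is the parent of {', '.join(kids)}")
print("Shared footprints:", [r["placekey"] for r in rows if shares_footprint(r)])
print("Polygons including a parking lot:", with_parking(rows))
```

With real data the same grouping would typically run over the full Geometry file rather than an in-memory list.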
#### Request data to be appended with Placekey

We Want to Hear From You: Request Data for Enhanced Usability

SafeGraph can help clean datasets and append Placekey to them to make them easier for you to use.

At SafeGraph, we specialize in cleaning and enhancing datasets by appending Placekey. This process significantly simplifies data usage and integration, making your work more efficient and insightful. Are there datasets you frequently use, beyond what SafeGraph offers? Whether it’s comprehensive Census data or a unique POI file discovered on GitHub, we’re interested in learning about the diverse datasets you rely on in your daily operations. Please ensure the dataset you submit contains an address.

#### request-data
URL: https://www.safegraph.com/free-data/request-data-2/

#### Retail Brands Miami Dade
URL: https://www.safegraph.com/free-data/retail-brands-miami-dade-2/

#### SafeGraph Free Parking Lot Sample Data for Downloads

Parking Lot Polygons

A premium set of Geometry rows depicting the shape and size of surface parking lots and their relationship to surrounding POIs.

What's included in this download:

- 100+ sample rows of Places and Geometry for all Walmarts in Pennsylvania
- Every column or data attribute found in SafeGraph Parking Lots Geometry (see schema here)
- Includes related parking lot Placekeys for easily joining the two files

#### SafeGraph Global Places POI Sample Download

Global Places POI Data for Any Brand Anywhere in the World

If you are a data leader looking for a global POI dataset that is clean, accurate, and reliable, SafeGraph's Global Places may be for you.
What's Included in This Download:

- Every column or data attribute found in SafeGraph Places (see schema here)
- 100 sample rows of points of interest (POI) from a randomized set of countries
- Representing 10 brands (BMW, Burberry, Skechers, Burger King, KFC, Urban Outfitters, Domino's Pizza, CEVA Logistics, Gucci, Holiday Inn)

#### Safegraph Spend Data Sample

Aggregated, Permissioned, and Anonymized Consumer Spending Data for Places

Access a sample of SafeGraph transaction attributes, which includes Median Spend per Transaction, Spend by Day, Spend by Customer Frequency, Online vs In-person Spend, and more.

What's included in this download:

- 100 sample rows of SafeGraph Spend data from randomized brands
- Every column or data attribute found in SafeGraph Spend (see schema here)

The Most Comprehensive Transaction Data Tied to Places

Spend data is built with the largest source of credit and debit transactions available, plus a proprietary methodology to match transaction data to 1.1M+ individual POIs and over 5,000 brands to provide location-based insights.

#### Senzing Data Sample
URL: https://www.safegraph.com/free-data/senzing-data-sample-2/

#### Spend Data Sample
URL: https://www.safegraph.com/free-data/spend-data-sample-2/

#### Starbucks and Dunkin POI
URL: https://www.safegraph.com/free-data/starbucks-and-dunkin-poi/

#### Starbucks Vs Dunkin' Location Data

SafeGraph Places: Starbucks and Dunkin’ US Points of Interest

Fresh, Accurate Places Data Reflecting Dynamic, Real-World Change

Get access to a valuable dataset featuring:

- Comprehensive Data Attributes: Detailed information found in SafeGraph Places (see schema here)
- Extensive POI List: Locations across the United States for two major brands - Starbucks and Dunkin'.
- Actionable data: Accurate and up-to-date data to support your business needs

Why Choose SafeGraph?
- Accuracy You Can Trust: Regularly updated to ensure you have the freshest data.
- Wide Coverage: Detailed information on locations across the world.
- Used by Industry Leaders: Trusted by top companies for precise location-based insights.

### Product Info

#### Parking Lots
URL: https://www.safegraph.com/product-info/parking-lots/

#### Point POIs
URL: https://www.safegraph.com/product-info/point-poi/

#### Store ID
URL: https://www.safegraph.com/product-info/store-id/

### Scorecard

#### Parking Lots in the Top 24 Largest US Cities

Using SafeGraph Parking Lots data, see what percentage of total land area consists of surface parking lots in the 24 largest US cities by population.

#### SafeGraph Restaurant Scorecard

Every day, restaurants open, close, and relocate. They sometimes change names, or get acquired by other brands. At the same time, consumers are constantly changing the way they interact with restaurants based on their own personal financial situation and the larger economy.

In such a dynamically changing world, it can be difficult to stay on top of these trends and truly understand what the restaurant market landscape looks like. With accurate and fresh points of interest (POI) data, data scientists can easily understand how many restaurants opened in a given month, at any level of geographic granularity. Enriching each of these restaurant locations with consumer behavior data provides the necessary context for understanding the full picture, enabling data scientists to measure how economic trends impact the restaurant market.

This infographic reflects US market changes and provides insight into how consumer behavior is impacted by economic trends. Check out the breakdown and then download a sample of SafeGraph data to get started building your own restaurant industry analytics.

##### Restaurant types with the most store openings per state

The restaurant industry is extremely dynamic.
Full service restaurants, quick service restaurants, drinking places, and snack bars all have unique target customers and ideal market conditions that determine whether they thrive or flounder. For example, during the COVID-19 pandemic, quick service restaurants (QSRs) had to adapt to changing consumer behaviors in order to stay afloat. Success or failure in adapting can lead brands to expand into new areas of demand or to close underperforming restaurants.

In these infographics, we use the SafeGraph Places dataset to see how many restaurant and bar locations opened in April and May 2022 and to identify any regional trends in store openings. We specifically look at NAICS codes beginning with 722 so that we isolate our findings to categories in the food service industry. From those NAICS codes, we identify the restaurant category in each state with the largest number of location openings, using the opened_on column in SafeGraph Places.

##### Restaurant types with the most store closings per state

According to CNBC, 60% of restaurants close within one year of opening, and 80% close before their five-year anniversary. Restaurants close every day in every type of economy, but the closure rate can fluctuate due to factors like the pandemic, stimulus checks, or recession.

Each month, SafeGraph measures how many restaurant and bar locations closed using our Places dataset. To understand the state of the food service industry and how it differs regionally, we specifically look at NAICS codes that begin with 722 for April and May 2022. We identify the restaurant category in each state with the largest number of store location closings during that time period, using the closed_on column in SafeGraph Places.

##### Restaurant brands with the most store openings nationwide

Restaurant openings can be the result of many factors, such as an increase in demand in an area, or the success and expansion of a particular brand.
For example, in times of recession consumers may choose to eat more at less expensive restaurants, resulting in a boom in demand for those brand locations. Analyzing restaurant store location openings by brand can reveal interesting insights related to consumer demand and economic health.

To see which restaurant brands experienced the most growth during April and May 2022, we measure the number of restaurant location openings by brand. We then identify the five brands with the most POIs opened in the US, using the opened_on column in SafeGraph Places.

##### Restaurant brands with the most store closings nationwide

Restaurant closures can occur for a variety of reasons, including increased competition in a particular area, or a decrease in consumer demand. As an example, during a recession consumers may choose to spend less money at expensive restaurants, resulting in a decrease in demand and a need for brands to close restaurant locations that are underperforming.

To understand which restaurant brands contracted the most in the US during April and May 2022, we measure the number of restaurant location closings by brand. We then identify the five brands with the most POIs closed in the US, using the closed_on column in SafeGraph Places.

##### Restaurant brands with the biggest decrease in transaction volume nationwide

Consumer behavior fluctuates throughout the year based on factors like seasonality and popular trends, but also over time as a result of the larger economy. Looking at transaction data associated with specific points of interest (POIs) helps reveal these patterns in consumer spending and how they impact different restaurant brands. To see which restaurant brands saw the biggest decrease in transaction volume during this time period in the US, we use SafeGraph Spend data to compare the number of transactions month over month at brands with NAICS starting in 722.
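The month-over-month comparison described above can be sketched as follows. This is a minimal illustration, not the scorecard's actual pipeline: the field names (`brand`, `naics_code`, `month`, `transactions`) and the sample figures are hypothetical, and ranking by percentage change rather than absolute change is an assumption.

```python
from collections import defaultdict

# Hypothetical records: field names and numbers are invented for illustration
# and are not the actual SafeGraph Spend schema.
records = [
    {"brand": "Brand A", "naics_code": "722513", "month": "2022-04", "transactions": 1000},
    {"brand": "Brand A", "naics_code": "722513", "month": "2022-05", "transactions": 800},
    {"brand": "Brand B", "naics_code": "722511", "month": "2022-04", "transactions": 500},
    {"brand": "Brand B", "naics_code": "722511", "month": "2022-05", "transactions": 600},
    {"brand": "Brand C", "naics_code": "445110", "month": "2022-04", "transactions": 900},
    {"brand": "Brand C", "naics_code": "445110", "month": "2022-05", "transactions": 300},
]


def month_over_month_change(records, naics_prefix, previous, current):
    """Fractional change in transactions per brand between two months."""
    totals = defaultdict(lambda: defaultdict(int))
    for rec in records:
        # Keep only categories whose NAICS code starts with the given prefix.
        if rec["naics_code"].startswith(naics_prefix):
            totals[rec["brand"]][rec["month"]] += rec["transactions"]

    changes = {}
    for brand, by_month in totals.items():
        before = by_month.get(previous, 0)
        if before == 0:
            continue  # no baseline month, so the change is undefined
        after = by_month.get(current, 0)
        changes[brand] = (after - before) / before
    return changes


changes = month_over_month_change(records, "722", "2022-04", "2022-05")
# Sort ascending so the largest decreases come first.
biggest_decreases = sorted(changes.items(), key=lambda item: item[1])
for brand, change in biggest_decreases:
    print(f"{brand}: {change:+.0%}")
```

Sorting the same dictionary in descending order would surface the brands with the biggest increases instead.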
With these insights, data scientists can attribute a restaurant brand’s performance to larger economic or regional trends.

##### Restaurant brands with the biggest increase in transaction volume nationwide

Consumers often change how they interact with restaurants based on what time of year it is, what is popular among their friends and family at the moment, and also how the economy is doing. Transaction data for individual places shows how consumer spending changes over time, and also across regions. To measure this, we use SafeGraph Spend data to compare the number of transactions at POIs with NAICS starting in 722 during the given time period. Identifying the restaurant brands with the biggest increase in transaction volume month over month can help indicate how the economy is impacting consumer spending behavior and overall brand health.

##### Restaurant brand affinities in the US

Consumers spend money at multiple restaurant brands, and understanding these relationships is critical to profiling customers and building trade areas. While two restaurants may be competitors, their customers may actually frequent different locations. These types of insights can help restaurant brands understand how best to serve their customers and better compete in the market.

Using cross-shopping columns in SafeGraph Spend, we choose two competitive restaurant brands and identify the top three other restaurant brands their customers shop at. We do this by counting the number of times each restaurant brand’s POIs show any spend with a related restaurant brand, reflecting the number of POIs where a customer had spent money at both that specific location and the related brand.

#### SafeGraph Retail Scorecard

Every day, businesses open, close, and relocate. They sometimes change names, or get acquired by larger brands.
At the same time, consumers are constantly changing the way they interact with retail businesses based on their own personal financial situation and the larger economy.

In such a dynamically changing world, it can be difficult to stay on top of these trends and truly understand what the retail market landscape looks like. With accurate and fresh points of interest (POI) data, data scientists can easily understand how many businesses opened in a given month, at any level of geographic granularity. Enriching each of these store locations with consumer behavior data provides the necessary context for understanding the full picture, enabling data scientists to measure how economic trends impact the retail market.

Check out trends for the spring of 2022 and then download a sample of SafeGraph data to get started building your own retail analytics.

##### Retail categories with the most store openings per state

According to the National Retail Foundation, more than 8,100 new retail store locations opened across the United States in 2021. Whether a result of a new brand opening, an existing brand expanding its market footprint, or a store choosing to relocate to a new space across town, stores open each and every day.

In these infographics, we use the SafeGraph Places dataset to see how many retail store locations opened and identify any regional trends in store openings. We specifically look at NAICS codes beginning with 44 or 45 so that we isolate our findings to categories in the retail industry. From those NAICS codes, we identify the retail category in each state with the largest number of store location openings in the given time period, using the opened_on column in SafeGraph Places.

##### Retail categories with the most store closings per state

According to the National Retail Foundation, around 3,950 retail store locations closed in 2021. This was a significant drop in closures compared to 2020, which saw over 10,700 retail store locations close.
Businesses close every day in every type of economy, but the closure rate can fluctuate due to factors like the pandemic, stimulus checks, or recession.

Each month, SafeGraph measures how many retail store locations closed using our Places dataset. For these infographics, to understand the state of the retail industry and how it differs regionally, we specifically look at NAICS codes that begin with 44 or 45 for April and May 2022. We identify the retail category in each state with the largest number of store location closings, using the closed_on column in SafeGraph Places.

##### Retail brands with the most store openings nationwide

Store openings can be the result of many factors, such as an increase in demand for certain products or services in an area, or the success and expansion of a particular brand. For example, in times of recession consumers may choose to shop more at bargain outlets, resulting in a boom in demand for those store locations. Analyzing retail store location openings by brand can reveal interesting insights related to consumer demand and economic health.

To see which retail brands experienced the most growth in the US during the given time period, we measure the number of store location openings by brand. We then identify the five brands with the most POIs opened in the US, using the opened_on column in SafeGraph Places.

##### Retail brands with the most store closings nationwide

Store closures can occur for a variety of reasons, including increased competition in a particular area, or a decrease in consumer demand for specific retail goods. As an example, during a recession consumers may choose to spend less money on luxury goods, resulting in a decrease in demand and a need for brands to close store locations that are underperforming.

To understand which retail brands contracted the most in the US during the given time period, we measure the number of store location closings by brand.
We then identify the five brands with the most POIs closed in the US, using the closed_on column in SafeGraph Places.

##### Retail brands with the biggest decrease in transaction volume nationwide

Consumer behavior fluctuates throughout the year based on factors like seasonality and popular trends, but also over time as a result of the larger economy. Looking at transaction data associated with specific points of interest (POIs) helps reveal these patterns in consumer spending and how they impact different retail brands. To see which retail brands saw the biggest decrease in transaction volume, we use SafeGraph Spend data to compare the number of transactions month over month at brands with NAICS starting in 44 or 45 during April and May 2022. With these insights, data scientists can attribute a retail brand’s performance to larger economic or regional trends.

##### Retail brands with the biggest increase in transaction volume nationwide

Consumers often change how they interact with retail brands based on what time of year it is, what is popular among their friends and family at the moment, and also how the economy is doing. Transaction data for individual places shows how consumer spending changes over time, and also across regions. In this infographic, we use SafeGraph Spend data to compare the number of transactions at POIs with NAICS starting in 44 or 45 for April and May 2022. Identifying the retail brands with the biggest increase in transaction volume month over month can help indicate how the economy is impacting consumer spending behavior and overall brand health.

##### Retail brand affinities in the US

Consumers spend money at multiple retail brands, and understanding these relationships is critical to profiling customers and building trade areas. While two stores may be competitors, their customers may actually frequent different stores.
These types of insights can help retailers understand how best to serve their customers and better compete in the market.

Using cross-shopping columns in the May and June releases of SafeGraph Spend, we choose two competitive brands and identify the top three other brands their customers shop at. We do this by counting the number of times each brand’s POIs show any spend with a related brand, reflecting the number of POIs where a customer had spent money at both that specific location and the related brand.
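The affinity count described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the scorecard's actual code: the `related_brands` field is a hypothetical stand-in for a cross-shopping column, and the brand names and values are invented.

```python
from collections import Counter

# Hypothetical POI rows: "related_brands" stands in for a cross-shopping column
# mapping each related brand to some measure of shared spend. Values are invented.
pois = [
    {"brand": "Brand A", "related_brands": {"Brand X": 12, "Brand Y": 3}},
    {"brand": "Brand A", "related_brands": {"Brand X": 5, "Brand Z": 0}},
    {"brand": "Brand A", "related_brands": {"Brand Y": 7, "Brand W": 1}},
    {"brand": "Brand B", "related_brands": {"Brand X": 9}},
]


def top_affinities(pois, brand, n=3):
    """Count, per related brand, how many of `brand`'s POIs show any spend with it."""
    counts = Counter()
    for poi in pois:
        if poi["brand"] != brand:
            continue
        for related, value in poi["related_brands"].items():
            # A POI counts once per related brand when any shared spend exists.
            if related != brand and value > 0:
                counts[related] += 1
    return counts.most_common(n)


affinities = top_affinities(pois, "Brand A")
for related, poi_count in affinities:
    print(f"{related}: appears at {poi_count} Brand A POIs")
```

Counting POIs rather than summing spend keeps one very busy location from dominating the ranking, which matches the counting approach the text describes.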