Hello there! I am trying to look at the origin locations of visitors to all POIs in my region of interest, and I have the following questions:
When doing the ‘Micro’ normalization of visitor_home_cbgs to obtain a true population count of visitors to each POI from each origin CBG, I see that a Colab tutorial drops every visitor_home_cbgs entry with a count of less than 5. The idea is to eliminate any CBG with 4 visitors, because that value could be anything from 2 to 4 actual sampled visitors due to the noise added for privacy. So my question is: if we filter out all visitor_home_cbgs entries < 5 and then apply the equation
(SG CBG visitor raw count / SG CBG sample size) x CBG population
to estimate the true number of visitors to that POI from that specific CBG, how can we rely on the result when we are filtering out a lot of the visitors to each POI? How meaningful is an analysis that filters out every CBG with fewer than 5 visitors?
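To make the calculation concrete, here is a minimal sketch of the filter followed by the scaling step, as I understand it. It assumes visitor_home_cbgs arrives as a JSON string mapping CBG to raw count; the function name, the two lookup dicts, and the CBG ids and numbers are made up for illustration and are not taken from the tutorial.

```python
import json


def estimate_visitors(visitor_home_cbgs, sample_size, population, min_count=5):
    """Scale raw per-CBG visitor counts for one POI up to population estimates.

    visitor_home_cbgs: JSON string mapping CBG id -> raw visitor count
    sample_size:       dict, CBG id -> SG sample size for that CBG (assumed input)
    population:        dict, CBG id -> census population of that CBG (assumed input)
    """
    estimates = {}
    for cbg, count in json.loads(visitor_home_cbgs).items():
        if count < min_count:
            continue  # drop entries below the threshold (the noisy "4" values)
        if not sample_size.get(cbg) or cbg not in population:
            continue  # cannot scale without a nonzero sample size and a population
        # (raw count / sample size) x population
        estimates[cbg] = count / sample_size[cbg] * population[cbg]
    return estimates


# Made-up example: one CBG survives the filter, the other (count of 4) is dropped.
example = estimate_visitors(
    '{"060371234001": 10, "060371234002": 4}',
    sample_size={"060371234001": 80, "060371234002": 50},
    population={"060371234001": 1600, "060371234002": 900},
)
print(example)
```

In this toy case the second CBG contributes nothing to the estimate even though it sent somewhere between 2 and 4 sampled visitors, which is exactly the loss I am asking about.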
Similar to Q1, I am comparing the number of visitor_home_cbgs entries with a count of 4 in a monthly Patterns dataset versus a weekly Patterns dataset. Since the weekly sampling period is roughly 1/4 as long as the monthly one, wouldn’t we expect a much larger share of visitor_home_cbgs counts to equal 4 and therefore have to be filtered out? I’m wondering whether a study like this would produce vastly different answers for monthly versus weekly Patterns because of the relatively large number of counts equal to 4.
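One way to check this might be to measure, for each dataset, what share of all visitor_home_cbgs entries sit exactly at 4. The sketch below assumes each row is a JSON string of CBG to count; the function name and the two tiny input lists are invented purely to show the comparison and say nothing about the real data.

```python
import json


def share_equal_to(rows, value=4):
    """Return the fraction of all (POI, CBG) entries whose count equals `value`."""
    total = 0
    matching = 0
    for row in rows:
        for count in json.loads(row).values():
            total += 1
            if count == value:
                matching += 1
    return matching / total if total else 0.0


# Invented rows, one JSON string per POI.
monthly_rows = ['{"a": 12, "b": 4}', '{"c": 7, "d": 9}']
weekly_rows = ['{"a": 4, "b": 4}', '{"c": 4, "d": 5}']

monthly_share = share_equal_to(monthly_rows)
weekly_share = share_equal_to(weekly_rows)
print(monthly_share, weekly_share)
```

Running the same function over the real monthly and weekly files would show how much more of the weekly data is lost to the filter.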
Also, has anyone tried to do this for thousands of POIs in a city to generate a map of origin locations? The process of exploding visitor_home_cbgs, counting up each CBG’s visitors, and adding them to a master list of all CBGs in the country seems like it would be a nested-loop nightmare, but I haven’t been able to find any resources from someone who has tried it.
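A rough sketch of how the aggregation might avoid a master list entirely: accumulate counts in a single pass with a Counter, so only CBGs that actually appear are ever stored. This assumes each POI row carries visitor_home_cbgs as a JSON string; the function name, threshold default, and sample rows are hypothetical.

```python
import json
from collections import Counter


def total_visitors_by_cbg(rows, min_count=5):
    """Sum raw visitor counts per home CBG across many POIs in one pass."""
    totals = Counter()
    for row in rows:
        for cbg, count in json.loads(row).items():
            if count >= min_count:  # same filter as in the tutorial
                totals[cbg] += count
    return totals


# Invented rows, one JSON string per POI.
rows = ['{"A": 6, "B": 4}', '{"A": 5, "C": 10}']
totals = total_visitors_by_cbg(rows)
print(totals.most_common())
```

The work here grows with the number of (POI, CBG) pairs rather than with POIs times all CBGs in the country. With the data in a pandas DataFrame, the same idea could presumably be expressed with `DataFrame.explode` followed by a `groupby` sum, though I have not tested that at scale.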
Sorry if this was confusing; let me know if I should restate my questions more succinctly!
Link to the Colab notebook explaining the process for obtaining estimates of population counts of visitors from each CBG: Google Colab