The NY Times recently ran a story based on the new three-year ACS data that the Census Bureau released a couple of weeks ago (see my previous post for details). They created some maps for this story using a geography that I would never have thought to use.
Outside of Decennial Census years, it is difficult to map demographic patterns and trends within large cities: you’ll typically get one figure for the entire city, with no breakdown for areas within it. Data for small areas like census tracts and zip codes is not available outside the ten-year census (yet), and large cities exist as single municipal divisions that aren’t subdivided. New York City is an exception, as it is the only city composed of several counties (boroughs) and thus can be subdivided. But the borough data still doesn’t reveal much about patterns within the city.
The NY Times used PUMAs – Public Use Microdata Areas – to subdivide the city into smaller areas and mapped rents and income. PUMAs are aggregations of census tracts and were designed for aggregating and mapping public microdata. Microdata consists of a selection of actual individual responses from the census or survey with the personal identifying information (name, address, etc.) stripped away. Researchers can build their own indicators from scratch, aggregate them to PUMAs, and then figure out the degree to which the sample represents the entire population.
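To make the "build your own indicator and aggregate it to PUMAs" idea concrete, here is a minimal sketch in Python. It assumes each microdata record carries a PUMA code and a sampling weight (the number of people the record stands for). The field names `puma`, `weight`, and `same_home` and all of the values are made up for illustration; they are not taken from any Census file.

```python
from collections import defaultdict

# Hypothetical microdata: one row per sampled person. Field names and
# values are invented for this example, not drawn from a real file.
records = [
    {"puma": "A", "weight": 120, "same_home": True},
    {"puma": "A", "weight": 80, "same_home": False},
    {"puma": "B", "weight": 150, "same_home": True},
    {"puma": "B", "weight": 50, "same_home": True},
    {"puma": "B", "weight": 100, "same_home": False},
]


def weighted_share_by_puma(rows, flag):
    """Return {puma: weighted percent of people for whom row[flag] is true}."""
    total = defaultdict(float)  # sum of weights per PUMA
    hits = defaultdict(float)   # sum of weights where the flag is true
    for row in rows:
        total[row["puma"]] += row["weight"]
        if row[flag]:
            hits[row["puma"]] += row["weight"]
    return {puma: 100.0 * hits[puma] / total[puma] for puma in total}


shares = weighted_share_by_puma(records, "same_home")
for puma, pct in sorted(shares.items()):
    print(f"PUMA {puma}: {pct:.1f}% in the same home")
```

Summing weights rather than counting rows is what lets a small sample stand in for the whole population; judging how reliable each estimate is would be a separate step that this sketch leaves out.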
Since PUMAs have a large population, the new three-year ACS data is available at the PUMA level. The PUMAs essentially become surrogates for neighborhoods or clusters of neighborhoods, and in fact several NYC agencies have created districts or neighborhoods based on these boundaries for statistical or planning purposes. This wasn’t the original intent for creating or using PUMAs, but it’s certainly a useful application of them.
You can check out the NY Times article and maps here – Census Shows Growing Diversity in New York City (12/9/08). I tested ACS / PUMA mapping out myself by downloading some PUMA shapefiles from the Census Bureau’s Generalized Cartographic Boundaries page, grabbing some of the new three-year ACS data from American FactFinder, and creating a map of Philly. In the map below, you’re looking at 2005-2007 averaged data that shows the percentage of residents who were living in their current home a year earlier. If you know Philly, you can see that the PUMAs do a reasonable job of approximating regions in the city – South Philly, Center City, West Philly, etc.
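In GIS terms, the workflow above is a table join followed by classification: match each row of the data table to a boundary by PUMA code, then sort the values into classes for shading. Here is a rough, library-free sketch of that logic in Python. The PUMA codes, percentages, and the 80 to 100 range are all hypothetical, and the `boundaries` dictionary is just a stand-in for the polygons a real shapefile would hold.

```python
# Hypothetical inputs. Codes are kept as strings so leading zeros survive.
boundaries = {"00101": "polygon-1", "00102": "polygon-2", "00103": "polygon-3"}
table = [
    {"puma": "00101", "pct_same_home": 82.5},
    {"puma": "00102", "pct_same_home": 91.0},
]


def join_table_to_boundaries(shapes, rows, key, field):
    """Left join: every boundary keeps a slot, None where no row matched."""
    values = {row[key]: row[field] for row in rows}
    return {code: values.get(code) for code in shapes}


def equal_interval_class(value, low, high, n_classes):
    """Return a class index from 0 to n_classes - 1, or None if no value."""
    if value is None:
        return None
    width = (high - low) / n_classes
    index = int((value - low) // width)
    return max(0, min(index, n_classes - 1))  # clamp values at the edges


joined = join_table_to_boundaries(boundaries, table, "puma", "pct_same_home")
classes = {code: equal_interval_class(v, 80.0, 100.0, 4) for code, v in joined.items()}
print(classes)
```

Joining from the boundary side means a PUMA with no matching row shows up as None instead of silently dropping off the map.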
The problem I ran into here was that data did not exist for all of the PUMAs – in this case, South Philly and half of North Philly had values of zero. According to the footnotes on the ACS site, there were no values for these areas because “no sample observations or too few sample observations were available to compute an estimate, or a ratio of medians cannot be calculated because one or both of the median estimates falls in the lowest interval or upper interval of an open-ended distribution”. So even though the PUMA geography is generally available, there still may be cases where data for particular variables for individual geographies is missing.
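One practical consequence: a missing estimate should not end up on the map as a zero. Below is a small Python sketch that converts placeholder entries to None before mapping, so "no estimate" can be drawn differently from a true 0. The set of placeholder strings is an assumption for this example and should be checked against the footnotes of the table you actually download; the PUMA labels and values are made up.

```python
# Assumed placeholder strings for "no estimate available"; verify these
# against the footnotes of the actual table before relying on them.
MISSING_MARKERS = {"", "-", "N", "(X)"}


def parse_estimate(raw):
    """Return a float, or None when the cell holds a missing-data marker."""
    text = raw.strip()
    if text in MISSING_MARKERS:
        return None
    return float(text.replace(",", ""))  # tolerate thousands separators


# Hypothetical downloaded cells, keyed by made-up PUMA labels.
raw_values = {"puma_1": "84.2", "puma_2": "-", "puma_3": "0.0", "puma_4": "1,250"}

parsed = {name: parse_estimate(cell) for name, cell in raw_values.items()}
missing = sorted(name for name, value in parsed.items() if value is None)
print("No estimate for:", missing)
```

Here `puma_3` keeps its genuine 0.0 while `puma_2` is flagged as missing, which is the distinction that gets lost when blanks are read in as zeros.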
Just for the heck of it, I tried looking at the annual ACS data, which is limited to more populated areas (an area must have at least 65k people, whereas three-year estimates cover areas with at least 20k), and even more data was missing (in this instance, all the areas in the northeast). Even though PUMAs have a minimum population of 100k people, the ACS sampling is county based. So even if the sample size for a county is ideal, there may not be enough sample observations for individual places within the county to compute an estimate. At least, that’s my guess. Regardless, PUMAs are still worth a look for the city and data you’re interested in.