Archive for December 2008

Mapping ACS Census Data for Urban Areas With PUMAs

The NY Times recently ran a story based on the new three-year ACS data that the Census Bureau released a couple of weeks ago (see my previous post for details). They created some maps for this story using a geography that I would never have thought to use.

Outside of decennial census years, it is difficult to map demographic patterns and trends within large cities: you’ll typically get one figure for the entire city and can’t get a breakdown for areas within it. Data for small areas like census tracts and ZIP codes is not available outside the ten-year census (yet), and large cities exist as single municipal divisions that aren’t subdivided. New York City is an exception, as it is the only city composed of several counties (boroughs) and thus can be subdivided. But the borough data still doesn’t reveal much about patterns within the city.

The NY Times used PUMAs – Public Use Microdata Areas – to subdivide the city into smaller areas and mapped rents and income. PUMAs are aggregations of census tracts and were designed for aggregating and mapping public microdata. Microdata consists of a selection of actual individual responses from the census or survey with the personal identifying information (name, address, etc.) stripped away. Researchers can build their own indicators from scratch, aggregate them to PUMAs, and then figure out the degree to which the sample represents the entire population.
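To make the "build your own indicator and aggregate it to PUMAs" idea concrete, here is a minimal sketch in Python. The records, the field names (puma, weight, same_home), and the PUMA codes are all made up for illustration, and it assumes each microdata record carries a weight saying how many people that respondent stands in for; check the documentation for the actual file you download.

```python
from collections import defaultdict

# Hypothetical microdata: one row per respondent, with a PUMA code, a survey
# weight, and a yes/no answer. Every value here is invented for illustration.
records = [
    {"puma": "04101", "weight": 120, "same_home": True},
    {"puma": "04101", "weight": 80, "same_home": False},
    {"puma": "04102", "weight": 100, "same_home": True},
    {"puma": "04102", "weight": 100, "same_home": True},
    {"puma": "04102", "weight": 50, "same_home": False},
]

def weighted_share_by_puma(rows, flag):
    """Return {puma: weighted percent of respondents where row[flag] is True}."""
    total = defaultdict(float)
    hits = defaultdict(float)
    for row in rows:
        total[row["puma"]] += row["weight"]
        if row[flag]:
            hits[row["puma"]] += row["weight"]
    return {puma: 100.0 * hits[puma] / total[puma] for puma in total}

shares = weighted_share_by_puma(records, "same_home")
for puma, pct in sorted(shares.items()):
    print(puma, round(pct, 1))
```

Summing weights rather than counting rows is what turns a sample into a population estimate; a plain row count would only describe the respondents themselves.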

Since PUMAs have a large population, the new three-year ACS data is available at the PUMA level. The PUMAs essentially become surrogates for neighborhoods or clusters of neighborhoods, and in fact several NYC agencies have created districts or neighborhoods based on these boundaries for statistical or planning purposes. This wasn’t the original intent for creating or using PUMAs, but it’s certainly a useful application of them.

You can check out the NY Times article and maps here – Census Shows Growing Diversity in New York City (12/9/08). I tested ACS / PUMA mapping out myself by downloading some PUMA shapefiles from the Census Bureau’s Generalized Cartographic Boundaries page, grabbing some of the new three-year ACS data from the American Factfinder, and creating a map of Philly. In the map below, you’re looking at 2005-2007 averaged data that shows the percentage of residents who were living in the same home a year earlier. If you know Philly, you can see that the PUMAs do a reasonable job of approximating regions in the city – South Philly, Center City, West Philly, etc.

The problem I ran into here was that data did not exist for all of the PUMAs – in this case, South Philly and half of North Philly had values of zero. According to the footnotes on the ACS site, there were no values for these areas because “no sample observations or too few sample observations were available to compute an estimate, or a ratio of medians cannot be calculated because one or both of the median estimates falls in the lowest interval or upper interval of an open-ended distribution”. So even though the PUMA geography is generally available, there still may be cases where data for particular variables for individual geographies is missing.
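One practical consequence: a missing estimate should not be mapped as if it were a real zero. The sketch below shows one way to keep "no data" separate before classifying values for a choropleth. The PUMA codes, the percentages, the class breaks, and the assumption that a suppressed cell arrives as an empty or non-numeric string are all hypothetical; the actual download format may differ.

```python
# Hypothetical downloaded cells keyed by PUMA code; "" and "-" stand in for
# estimates that were not published.
raw = {"04101": "82.5", "04102": "", "04103": "79.1", "04104": "-"}

def parse_estimate(text):
    """Convert a downloaded cell to a float, or None when it is not a number."""
    try:
        return float(text)
    except ValueError:
        return None

def classify(value, breaks=(75.0, 80.0, 85.0)):
    """Assign a choropleth class; missing values get their own 'no data' class."""
    if value is None:
        return "no data"
    for i, upper in enumerate(breaks):
        if value < upper:
            return f"class {i + 1}"
    return f"class {len(breaks) + 1}"

classes = {puma: classify(parse_estimate(cell)) for puma, cell in raw.items()}
for puma, label in sorted(classes.items()):
    print(puma, label)
```

Giving missing areas their own class (typically drawn in gray or hatched) keeps them from blending into the lowest category on the map.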

Just for the heck of it, I tried looking at the annual ACS data, which is limited to more populated areas (at least 65k people, versus at least 20k for the three-year estimates), and even more data was missing (in this instance, all the areas in the northeast). Even though PUMAs have a minimum population of 100k people, the ACS sampling is county-based. So even if the sample size for a county is ideal, there may not be enough sample observations for individual areas within the county to compute an estimate. At least, that’s my guess. Regardless, it’s still worth looking at for the city and data you’re interested in.

ACS Data for Philly PUMAs

Census Bureau Releases New ACS Data

The Census Bureau released its new American Community Survey data the other day. Three-year averages for a variety of socio-economic variables are now available for all geographic areas that have at least 20,000 people. The ACS has been releasing annual data for most of this decade for areas with at least 65,000 people and will continue to do so. They didn’t provide data for smaller areas because the numbers were not as statistically robust. Now that they have three years of data, they can average the numbers for three years and get sound data for areas with a population of at least 20k.

Data for 2005 to 2007 is available now, and like the annual numbers, you’ll get a range of values and a confidence interval. For example, we can say with 90% confidence that the estimated population of Atlantic City, NJ between 2005 and 2007 was 35,770, plus or minus 1,749 people. The Bureau created this estimate based on a sample of 1,379 people in AC.
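As a quick worked example of reading these numbers, the sketch below turns the Atlantic City estimate and its margin of error into a lower and upper bound, and backs out an approximate standard error. The function names are my own, and the 1.645 divisor is the standard z-score for a 90% confidence level; treat it as a rough illustration rather than the Bureau's exact procedure.

```python
Z_90 = 1.645  # z-score corresponding to a 90% confidence level

def confidence_bounds(estimate, margin_of_error):
    """Return the (lower, upper) bounds implied by an estimate and its margin of error."""
    return estimate - margin_of_error, estimate + margin_of_error

def standard_error(margin_of_error, z=Z_90):
    """Approximate the standard error behind a published margin of error."""
    return margin_of_error / z

low, high = confidence_bounds(35770, 1749)
se = standard_error(1749)
print(low, high)  # 34021 37519
print(round(se))  # 1063
```

In other words, the ACS is saying the 2005-2007 population was probably somewhere between 34,021 and 37,519.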

Next year, the Census Bureau will release new annual numbers for areas with a population of at least 65k, and will update the three-year averages for areas with at least 20k by adding the newest year of data and dropping the oldest one to calculate a new average.
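The "add the newest year, drop the oldest" idea is a rolling window, and can be sketched as follows. The annual figures are invented, and taking a plain mean of yearly values is a simplification for illustration; the Bureau's actual multi-year estimates are not necessarily computed this way.

```python
from collections import deque

def rolling_periods(annual, window=3):
    """Yield (first_year, last_year, mean) for each full window of consecutive years."""
    recent = deque(maxlen=window)  # the oldest year falls out automatically
    for year in sorted(annual):
        recent.append((year, annual[year]))
        if len(recent) == window:
            values = [value for _, value in recent]
            yield recent[0][0], recent[-1][0], sum(values) / window

# Made-up annual figures for a single area.
annual = {2005: 30.0, 2006: 33.0, 2007: 36.0, 2008: 39.0}
periods = list(rolling_periods(annual))
for first, last, mean in periods:
    print(f"{first}-{last}: {mean:.1f}")
```

With these inputs the first window covers 2005-2007 and the next one covers 2006-2008, which mirrors how the three-year release will shift forward each year.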

All of the data is available through the American Factfinder.

If you are looking for figures for basic indicators (population, race, gender, age, and housing units) for basic geographic areas (states, counties, places, and metro areas), you’ll probably want to consider using estimates from the Bureau’s Population Estimates program instead. Their annual estimates are based on a demographic calculation that factors in births, deaths, and migration, and are not based on a survey (according to that program, Atlantic City had 39,684 residents in 2007 – 3,914 more people than the ACS midrange estimate of 35,770). If you’re looking for any other kind of data (ethnicity, immigration status, income, poverty, rent, home value, etc.) the ACS is your best bet.

By 2010 the Bureau will begin releasing ACS five-year averages for all geographic areas. Of course, we’ll also have our next decennial census in 2010. The big change here is that, since we’ll have the ACS churning out data for all areas for every year from that point forward, the Bureau is doing away with the long form (which was sent to one in six households) that was issued in past censuses, and will only collect data using the basic short form, which gets distributed to everyone. For more info on this change, see the Bureau’s Census 2010 info page.
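To pull the population thresholds from this post together, here is a small sketch that lists which ACS releases should cover an area of a given size. The function name and labels are my own, the five-year entry reflects the planned release rather than something available today, and meeting a threshold is no guarantee that every variable will actually have a published value.

```python
def acs_products(population):
    """List the ACS releases whose population threshold an area meets."""
    products = []
    if population >= 65000:
        products.append("1-year")  # annual data: areas with at least 65,000 people
    if population >= 20000:
        products.append("3-year")  # three-year averages: at least 20,000 people
    products.append("5-year")      # planned five-year averages: all areas
    return products

print(acs_products(35770))  # Atlantic City, using the ACS estimate above
print(acs_products(5000))
```

By this logic Atlantic City gets the three-year and five-year releases but not the annual one.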