Posts Tagged ‘census geography’

ACS Trend Reports and Census Geography Guide

Sunday, February 12th, 2012

I recently received my first question from someone who wanted to compare 2005-2007 ACS data with 2008-2010. With the release of the latter, we can make historical comparisons with the three-year data for the first time, since we now have estimates that don’t overlap. We should be able to make some interesting comparisons, since the first set covers the real estate boom years (remember those?) and the second covers the Great Recession. One resource that makes such comparisons relatively painless is over at the Missouri Census Data Center. They’ve put together a really clean and simple interface called the ACS Trends Menu, which allows you to select either two one-year estimates or two three-year estimates and compare them for several different census geographies – states, counties, MCDs, places, metros, Congressional Districts, PUMAs, and a few others – for the entire US (not just Missouri). The end result is a profile that groups data into the Economic, Demographic, Social, and Housing categories that the Census uses for its Demographic Profile tables. The calculations for change and percent change for the estimates and margins of error are done for you.

Downloading the data is not as straightforward – the links to extract it just brought me some error messages, so it’s still a work in progress. Until then, a simple copy and paste into your spreadsheet of choice will work fine.
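If you do paste estimates into your own tool, the change calculations are easy to reproduce. The sketch below is not the MCDC's code. It uses the approximation commonly documented for ACS estimates from non-overlapping periods: the margin of error of a difference is the square root of the sum of the squared margins, and a change is treated as significant at the 90 percent level when it is larger than that combined margin. The function name and the sample numbers are made up for illustration.

```python
import math


def compare_estimates(est_old, moe_old, est_new, moe_new):
    """Compare two ACS estimates from non-overlapping periods.

    Assumes both margins of error are published at the 90 percent
    confidence level, which is the ACS default.
    """
    change = est_new - est_old
    # Margin of error of a difference between two independent estimates.
    moe_change = math.sqrt(moe_old ** 2 + moe_new ** 2)
    # Percent change relative to the earlier estimate (undefined if it is 0).
    pct_change = change * 100 / est_old if est_old else None
    # The change is significant at 90 percent if it exceeds its own margin.
    significant = abs(change) > moe_change
    return {
        "change": change,
        "moe_change": moe_change,
        "pct_change": pct_change,
        "significant": significant,
    }


# Invented numbers, only to show the call.
result = compare_estimates(est_old=1000, moe_old=30, est_new=1100, moe_new=40)
print(result)
```

The margin of error for the percent change itself needs a separate ratio formula, which is left out here to keep the sketch short.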

ACS Trends Menu

If you like the interface, they’ve created separate ones for downloading profiles from any of the ACS periods or from the 2010 Census. The difference here is that you’re looking at one time frame; not across time periods. The interface and the output are the same, but in these menus you can compare four different geographies at once in one profile. Unlike the Trends reports, both the ACS and 2010 Census profiles have easy, clear cut ways to download the profiles as a PDF or a spreadsheet. If you’re happy with data in a profile format and want an interface that’s a little less confusing to navigate than the American Factfinder, these are all great alternatives (and if you’re building web applications these profiles are MUCH easier to work with – you can easily build permanent links or generate them on the fly).

The US Census Bureau also recently put together a great resource called the Guide to State and Local Census Geography. They provide a census geography overview of each state: 2010 population, land area, bordering states, year of entry into the union, population centroids, and a description of how local government is organized in the state (i.e. do they have minor civil divisions or only incorporated cities and unincorporated land, etc). You get counts for every type of geography – how many counties, tracts, ZCTAs, and so on – AND best of all you can download all of this data directly in tab delimited files. Need a list of every county subdivision in a state, with codes, land area, and coordinates? No problem – it’s all there.

2010 Census Generalized Cartographic Boundary Files

Thursday, December 22nd, 2011

I’ve had a few interesting projects that have kept me busy at the end of this year. I’ll do a post or two after New Years, once I’m back in the office and can take some screen shots to illustrate.

In the meantime I have one tidbit I can mention – the Census Bureau has released the 2010 version of the Generalized Cartographic Boundary Files. These files are generalized versions of the TIGER files, with smoothed and simplified boundaries and areas of coastal water removed. They haven’t posted them on the same page as the 2000 and 1990 boundaries; they’ve mentioned they’re creating a new interface to host all of them, which is currently a work in progress at http://www.census.gov/geo/www/cob/.

However, you can get access to all the 2010 boundaries via the FTP site – you just need to know what you’re looking at. All the files are named with codes to identify the geographic coverage, summary level, and resolution / scale. There’s a README file on the FTP page that tells you how to identify each.

But in brief – The file names look like this: gz_2010_ss_lll_vv_rr.zip, where:

  • ss is the state INCITS / FIPS code which you can look up here – ‘us’ is a national level file.
  • lll is the summary level or unit of geography – the README file has a table with each code. The most common ones: 040 for state, 050 for county, 060 for county subdivisions, 140 for census tracts, 160 for places, 310 for metropolitan and micropolitan statistical areas, 860 for ZCTAs. (No PUMAs – 2010 PUMA boundaries haven’t been drawn yet, and 2000 PUMA boundaries are still being used in the latest ACS.)
  • vv is a version number for the file.
  • rr is resolution – most of the files are 500k = 1:500,000, which is the least generalized and best for mapping state-level to regional areas. For national level files you also have the option of 5m = 1:5,000,000 and 20m = 1:20,000,000, which are more generalized and better for national mapping.
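
If you’re pulling a lot of these files off the FTP site, the naming scheme above is easy to decode in a script. Here’s a minimal sketch: the function name is hypothetical, the lookup table only holds the summary levels listed above (the README has the full set), and the example file name is one I made up to fit the pattern, so check it against the actual FTP listing.

```python
import re

# Summary level codes mentioned above; the README has the complete table.
SUMMARY_LEVELS = {
    "040": "state",
    "050": "county",
    "060": "county subdivision",
    "140": "census tract",
    "160": "place",
    "310": "metropolitan / micropolitan statistical area",
    "860": "ZCTA",
}

# Resolution codes and the map scale each one stands for.
SCALES = {"500k": "1:500,000", "5m": "1:5,000,000", "20m": "1:20,000,000"}

# gz_2010_ss_lll_vv_rr.zip, where ss is a two digit state code or 'us'.
PATTERN = re.compile(
    r"^gz_2010_(\d{2}|us)_(\d{3})_([0-9A-Za-z]+)_(500k|5m|20m)\.zip$"
)


def parse_boundary_filename(name):
    """Split a 2010 cartographic boundary file name into its parts."""
    match = PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a recognized boundary file name: {name}")
    state, level, version, resolution = match.groups()
    return {
        "state": state,
        "summary_level": level,
        # Codes missing from the short table above are reported as unknown.
        "geography": SUMMARY_LEVELS.get(level, "unknown"),
        "version": version,
        "scale": SCALES[resolution],
    }


# Made-up example: census tracts (140) for state code 36 at 1:500,000.
print(parse_boundary_filename("gz_2010_36_140_00_500k.zip"))
```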

The Census Bureau has been doing a lot of tweaking to their website lately. The legacy version of the American Factfinder is set to disappear for good on Jan 20, 2012.

2010 American Community Survey Releases

Friday, September 23rd, 2011

The US Census Bureau released the new annual data for the 2010 American Community Survey; this dataset includes an extensive number of demographic, socio-economic, and housing estimates (with margins of error) for all geographic areas in the US that have a population of at least 65,000 people. This is the first ACS survey that is weighted based on the 2010 Census, and that is tabulated entirely on the new 2010 Census geography; exceptions include PUMAs and urban areas, which typically aren’t redrawn until a couple of years after a decennial census is taken. Data for these areas will be reported based on the 2000 Census geography. This will also be the first ACS that is distributed via the new American Factfinder. Previous ACS datasets should be moved to the new Factfinder by the end of this year.

According to the release schedule, data for the three-year ACS (2008-2010) for areas with at least 20,000 residents will be published in October, and the five-year ACS (2006-2010) for geography down to census tracts will be released in December. The three-year dataset hits a milestone this year, as for the first time we’ll have datasets with mutually exclusive years that can be feasibly compared for historical change (the 2005-2007 dataset versus 2008-2010). It should prove interesting as the earlier dataset represents the end of the brief boom years while the current one depicts the depth of the Great Recession. There will be some challenges in making comparisons, as the base for weighting the estimates and the geography used to tabulate them are different for each dataset (2000 Census in the earlier dataset versus 2010 Census in the latest one).

ZIP Code KML Map for NYC Census Data

Saturday, September 10th, 2011

With the release of both the 2010 Census profiles for ZCTAs (ZIP Code Tabulation Areas) and the TIGER line files for 2010 Census geographies, I created another Google Map finding aid for NYC neighborhood data by ZIP code (I previously created one for PUMAs with American Community Survey data). Once again I used the Export to KML plugin that was created for ArcGIS. This allowed me to use the TIGER shapefile in ArcGIS to create the map I wanted and then export it as a KML, while using fields in the attribute table of each feature to insert the ZCTA number into stable links for the census profiles, automatically generating unique urls for each feature. Click on the ZCTA in the map, and then click on a link to open a profile directly from the new American Factfinder.

There were two new obstacles I had to contend with this time. The first was that my department has finally migrated to Windows 7 from Windows XP, and I upgraded from ArcGIS 9.3 to 10. I had to reinstall the Export to KML plugin (version 2.5.5) and ran into trouble; fortunately all the work-arounds were included in the plugin’s documentation. I don’t have administrator rights on my machine, so I had to have someone install the plugin as an administrator; this included running the initial setup file AND running Arc as an administrator as you add and turn the plugin on. That was straightforward, but when I ran it the first time I got an error message – there’s a particular Windows dll or ocx file that the plugin needs and it was missing (presumably something that was included in XP but not in 7). I downloaded the necessary file, and with administrator rights moved it into the system32 folder and registered the file via the command line. After that I was good to go.

The second issue was with the Census Bureau’s new American Factfinder. With the old Factfinder the urls that were generated as you built and accessed tables were static and you could simply save and bookmark them. Not the case in the new Factfinder; you can bookmark some basic tables but most of them are “too complex to bookmark”; you can save and download queries from the online app but that’s it. After some digging I found a CB document that tells you how you can create deep links to any query you run and table you create. The url consists of a fixed series of codes that identify the dataset, year, table, and geography. So this link:

http://factfinder2.census.gov/bkmk/table/1.0/en/DEC/10_DP/DPDP1/8600000US10010

tells us that we’re getting a table from version 1.0 of the American Factfinder in English. It’s from the Decennial Census, 2010 Demographic Profiles, Demographic Profile Table 1, for ZCTA 10010 (860 is the summary level code that indicates we’re looking at ZCTAs). So for the plugin to create the links, I just included this URL but for the last five digits I specified the attribute from the ZCTA shapefile that held the ZCTA code. So when the plugin creates the KML, each KML feature has a link generated that is specific to it:

http://factfinder2.census.gov/bkmk/table/1.0/en/DEC/10_DP/DPDP1/8600000US[ZCTA5CE10]

You can see this previous post for details on how the Export to KML plugin works.
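
Outside of the plugin, the same substitution is simple to script. This is just a sketch of the idea, not what Export to KML does internally: the function name is hypothetical, and the only things taken from above are the link format and the 860 summary level prefix.

```python
# Fixed part of the deep link shown above: Factfinder version 1.0, English,
# Decennial Census, 2010 Demographic Profiles, Demographic Profile Table 1.
BASE_URL = "http://factfinder2.census.gov/bkmk/table/1.0/en/DEC/10_DP/DPDP1/"

# Geography prefix for ZCTAs (860 is the ZCTA summary level).
ZCTA_PREFIX = "8600000US"


def zcta_profile_url(zcta):
    """Build the profile link for one ZCTA, e.g. a ZCTA5CE10 value."""
    code = str(zcta).zfill(5)  # restore leading zeros lost in numeric fields
    if len(code) != 5 or not code.isdigit():
        raise ValueError(f"not a five digit ZCTA code: {zcta!r}")
    return f"{BASE_URL}{ZCTA_PREFIX}{code}"


print(zcta_profile_url("10010"))
```

The zero padding is there because ZCTA codes stored as numbers in an attribute table lose their leading zeros.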

For now, the 2010 and 2000 Census are in the new American Factfinder. The American Community Survey, the Economic Census, population estimates, and a few other datasets are still in the older, legacy Factfinder. According to the CB all of this data will be migrated to the new Factfinder by the end of 2011 and the legacy version will disappear. At that point I’ll have to update my PUMA map so that it points to the profiles in the new Factfinder.

You can take a look at the ZCTA map and profiles below (I’m hosting it on the NYC data resource guide I’ve created for my college). As I’ve written before, ZCTAs are odd Census geographies since they are approximations of residential USPS ZIP Codes created by aggregating census blocks based on addresses; you can see in many instances where boundaries have a blocky teeth-like appearance instead of straight lines. Since they’re created directly by aggregating blocks, ZCTAs don’t correspond or mesh with other census boundaries like tracts or PUMAs, or even legal boundaries like counties. In some cases my assignment of county-based colors doesn’t ring true. For example, ZCTA 11370 includes part of the East Elmhurst neighborhood in Queens and Rikers Island, which is in the Bronx. ZCTA 10463 includes the Bronx neighborhoods of Kingsbridge and Spuyten Duyvil and the Manhattan neighborhood of Marble Hill (a geographic anomaly; it’s not on the Island of Manhattan but it’s part of Manhattan borough).

The most salient issue with ZCTAs is that they are only tabulated for the decennial census and not the American Community Survey; the currency of data and spectrum of census variables will be limited compared to other types of geography.



2010 Census Redistricting Data

Sunday, April 17th, 2011

The Redistricting Summary Data [P.L. 94-171] from the 2010 Census has all been published for the nation, states, counties, and places, and is available via the new American Factfinder. The redistricting data includes basic demographic data: total population, race, Hispanic or Latino origin, and number of housing units occupied and vacant. Data is available down to census blocks and is available for most (but not all – no ZCTAs or PUMAs) geographies.

If you want all the data for a state, don’t want to slog through the Factfinder, and are comfortable working with large text files, you can FTP the summary data from the Redistricting Data homepage. If you want basic summary data for states, counties, and places and don’t want to fuss with the Factfinder or text files, you can download Excel spreadsheets from the Redistricting Data Press Kit. They also have some pdf / jpg maps showing county level population and population change, plus interactive map widgets like the one below for the country and for each state. 2010 Redistricting TIGER Shapefiles have also been released for geographies included in the redistricting dataset.

The full 2010 Census for all geographies will be released throughout this summer and into the fall in Summary File 1 [SF1]. Stay tuned.

Relating ZIP Codes / ZCTAs to PUMAs

Saturday, March 19th, 2011

Ever since I created the Google Maps finding aid for census data for NYC PUMAs and the associated PUMA – NYC neighborhood names maps, I’ve received several requests for tables or maps that relate PUMAs to ZIP Codes. These are usually from non-profits in NYC who have lists of donors, members, or constituents with addresses, and they want to relate the addresses (using the ZIP) to recent demographic data from American Community Survey (ACS) for the broader neighborhood where the ZIP is located.

The problem is that ZIP Codes are an all around pain. They actually don’t exist as areas with distinct boundaries; ZIP Codes are all address based, with ZIPs tied to addresses along street segments. The USPS doesn’t publish these tables or create maps; they contract this out for private companies to do, who turn around and sell these products for hefty fees.

Fortunately the Census Bureau has used these address tables to create approximations of ZIP Codes that they call ZCTAs or ZIP Code Tabulation Areas. ZCTAs are aggregates of census blocks that attempt to mimic ZIP Codes that exist as areas; codes associated with specific single-point firms or organizations are dropped. Since ZIPs were created by the USPS, ZCTAs do not nest or mesh with any census geography; they cross PUMA, county, and in some cases even state boundaries. They are also less stable than census geography, with frequent changes, and as statistical areas they vary widely in area and population. For this reason ZCTA data is only published every ten years in the decennial census; it’s not included in the ACS (so far).

With these caveats in mind, I used the Missouri Census Data Center’s MABLE/GEOCORR engine to correlate ZCTAs with PUMAs. While the interface looks a little retro and daunting, it’s actually pretty simple. You choose the state, the two geographies you want to relate, the weighting method for allocating one to the other, and an output format that includes CSV or HTML. I also used an option that lets you type in FIPS codes for the counties you want, so I didn’t end up with the entire state.

This method was the way to go, as they give you the option to allocate geographies based on population and not simply land area; each ZCTA was allocated to PUMAs based on where the majority of the ZCTA’s population lived using 2000 census block data. The final output contains one row for each ZCTA to PUMA combination. So you had multiple rows for ZCTAs that weren’t contained within a single PUMA, and for each of those ZCTAs you had fields that showed the percentage of the ZCTA’s population that lived in each PUMA (along with the actual population number) as well as the percentage of the PUMA’s population that lived in that ZCTA.

I took that table and cleaned it up in a spreadsheet, so that I was left with one row for each ZCTA, where the ZCTA was allocated to one PUMA based on where the majority of its population lives. I used some ZCTA and PUMA boundaries that I had originally downloaded and subsequently cleaned up from the 2009 TIGER shapefiles page, added them to QGIS, joined the ZCTA allocation table to the ZCTA geography, and mapped the result. I color-coded ZCTAs so that clusters of ZCTAs within a particular PUMA had the same color. Then I overlaid the PUMA boundaries on top to see how well they corresponded.
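
I did the clean-up by hand in a spreadsheet, but the same reduction can be sketched in a few lines of code. The column names (`zcta`, `puma`, `share`) and the codes in the sample rows are hypothetical; the real MABLE / GEOCORR output uses its own headers, so rename to match. Note that this keeps the PUMA with the largest share of each ZCTA’s population, even when that share is under half.

```python
def allocate_by_largest_share(rows):
    """Reduce ZCTA-to-PUMA rows to one PUMA per ZCTA.

    Each row is a dict with a ZCTA code, a PUMA code, and the share of
    the ZCTA's population living in that PUMA (hypothetical field names).
    """
    best = {}  # zcta -> (puma, share) for the largest share seen so far
    for row in rows:
        zcta = row["zcta"]
        share = float(row["share"])
        if zcta not in best or share > best[zcta][1]:
            best[zcta] = (row["puma"], share)
    return {zcta: puma for zcta, (puma, _share) in best.items()}


# Invented sample: one ZCTA split across two PUMAs, one wholly inside a PUMA.
sample = [
    {"zcta": "99901", "puma": "00001", "share": "0.35"},
    {"zcta": "99901", "puma": "00002", "share": "0.65"},
    {"zcta": "99902", "puma": "00002", "share": "1.0"},
]
print(allocate_by_largest_share(sample))
```

Keeping the share alongside the result would let you flag the weak allocations, which is what the original multi-row table is good for.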

In the end, they didn’t correspond all that well. There was a fairly good relationship in Manhattan, an OK relationship in Queens and Staten Island, and a rather lousy relationship in the Bronx and Brooklyn. I overlaid greenspace and facilities (airports, shipyards, etc) boundaries I had, and that made some difference; you could see in some areas where ZCTAs overlapped two PUMAs that the overlap coincided with parks, cemeteries, or other areas with low or no residential population in one of the PUMAs.

I’ve posted both sets of tables, maps, and some instructions on the NYC neighborhoods resource page. You can use the original MABLE / GEOCORR table to judge where allocations were good and where they were not so good based on population. For now, the engine is still based on 2000 Census geography and data. Even though the Census has started releasing 2010 TIGER files based on 2010 Census geography, ZCTAs and PUMAs are often some of the last geographies to be updated; current releases of the ACS are still based on the 2000 geographies. Stay tuned to the Census Bureau and MCDC websites for news on updates, and keep the MABLE / GEOCORR in mind if you want to create lists to relate census geographies by population or land area.

NYCRDC 2nd Annual Workshop

Tuesday, March 25th, 2008

I attended the first of the three workshops held at Baruch College as part of the New York Census Research Data Center’s 2nd Annual Workshop series. The NYCRDC provides confidential census microdata to researchers at secure facilities at Baruch and Cornell.

This year’s theme is census geography and mapping, and there were a number of great presentations that covered census geography from the global down to the block level. My personal favorite was a presentation that illustrated the composition and evolution of census tracts – using Legos! Not the real ones mind you, but digital photos of Legos that were enhanced and tied together with Flash in a Powerpoint presentation.

I have provided a link to the 2nd Annual Workshop page before – but there it is again. Powerpoints, and perhaps video footage, of the presentations should be posted there relatively soon.

I also gave a promo to the hands-on GIS workshop that I’ll be doing as part of the second workshop of the series. Two weeks to go, and I still have a lot to do…

