2010 Census Generalized Cartographic Boundary Files

Thursday, December 22nd, 2011

I’ve had a few interesting projects that have kept me busy at the end of this year. I’ll do a post or two after New Year’s, once I’m back in the office and can take some screenshots to illustrate.

In the meantime I have one tidbit I can mention – the Census Bureau has released the 2010 version of the Generalized Cartographic Boundary Files. These files are generalized versions of the TIGER files, with smoothed and simplified boundaries and areas of coastal water removed. They haven’t posted them on the same page as the 2000 and 1990 boundaries; they’ve mentioned they’re creating a new interface to host all of them, which is currently a work in process at

However, you can get access to all the 2010 boundaries via the FTP site – you just need to know what you’re looking at. All the files are named with codes to identify the geographic coverage, summary level, and resolution / scale. There’s a README file on the FTP page that tells you how to identify each.

But in brief – the file names are built from the components ss, lll, vv, and rr, where:

  • ss is the state INCITS / FIPS code which you can look up here – ‘us’ is a national level file.
  • lll is the summary level or unit of geography – the README file has a table with each code. The most common ones: 040 for state, 050 for county, 060 for county subdivisions, 140 for census tracts, 160 for places, 310 for metropolitan and micropolitan statistical areas, 860 for ZCTAs. (No PUMAs – 2010 PUMA boundaries haven’t been drawn yet, and 2000 PUMA boundaries are still being used in the latest ACS).
  • vv is a version number for the file.
  • rr is resolution – most of the files are 500k = 1:500,000, which is the least generalized and best for mapping state-level to regional areas. For national level files you also have the option of 5m = 1:5,000,000 and 20m = 1:20,000,000, which are more generalized and better for national mapping.
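As a rough illustration of how these codes fit together, here is a minimal Python sketch that turns the ss, lll, and rr components into a readable description. The lookup tables only contain the codes listed above, and the function name is hypothetical; the README remains the authoritative list.

```python
# Summary level codes mentioned above; the README has the full table.
SUMMARY_LEVELS = {
    "040": "state",
    "050": "county",
    "060": "county subdivision",
    "140": "census tract",
    "160": "place",
    "310": "metropolitan / micropolitan statistical area",
    "860": "ZCTA",
}

# Resolution codes and the map scale each one stands for.
RESOLUTIONS = {
    "500k": "1:500,000",
    "5m": "1:5,000,000",
    "20m": "1:20,000,000",
}


def describe_components(ss, lll, rr):
    """Describe one file's codes in plain words (hypothetical helper)."""
    coverage = "national" if ss.lower() == "us" else f"state FIPS {ss}"
    level = SUMMARY_LEVELS.get(lll, f"summary level {lll} (see README)")
    scale = RESOLUTIONS.get(rr, f"resolution {rr} (see README)")
    return f"{coverage}, {level}, {scale}"


print(describe_components("us", "040", "20m"))
print(describe_components("36", "140", "500k"))
```

Codes that are not in the tables fall through to a "see README" message rather than raising an error, since the tables here are deliberately incomplete.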

The Census Bureau has been doing a lot of tweaking to their website lately. The legacy version of the American Factfinder is set to disappear for good on Jan 20, 2012.

Mapping Domestic Migration with IRS Data

Friday, November 18th, 2011

Forbes magazine just published a neat interactive map on American migration using data NOT from the Census, but from – the IRS. Whether you fill it out virtually or the old fashioned way, everyone fills in their address at the top of the 1040, and the IRS stores this in a database. If you file from a different address from one year to the next you must have moved, and the IRS publishes a summary file of where people went (with all personal information and practically all filing data stripped away).

The Forbes map taps into five years of this data and lets you see all domestic in-migration and out-migration from a particular county. The map is a flow or line map with lines going from the county you choose to each target – net in-migration to your county is colored in blue and net out-migration is red. You can also hover over the sending and receiving counties to see how many people moved. Click on the map to choose your county or search by name; you also have the option of searching for cities or towns, as the largest place within each county is helpfully identified and tied to the data.

It’s relatively straightforward and fun to explore. Some of the trends are pretty striking – the difference between declining cities (Wayne County – Detroit MI) and growing ones (Travis County – Austin TX) is pretty vivid, as is the change in migration during the height of the housing boom period in 2005 compared to the depth of the bust in 2009 (see Maricopa County – Phoenix AZ). More subtle is the difference in the scope of migration between urban and rural counties, with the former having more numerous and broader connections and the latter having smaller, more localized exchanges. Case in point is my home state of Delaware – urban New Castle County (Wilmington) compared to rural Sussex County (Seaford). There are many other stories to see here – the exodus from New Orleans after Katrina and the subsequent return of residents, the escape from Los Angeles to the surrounding mountain states, and the pervasiveness of Florida as a destination for everybody (click on the thumbnails below for full images of each map).

Detroit 2009

Wayne Co MI (Detroit) 2009

Austin 2009

Travis Co TX (Austin) 2009

Phoenix 2005

Maricopa Co AZ (Phoenix) 2005

Phoenix 2009

Maricopa Co AZ (Phoenix) 2009

Wilmington 2009

New Castle Co DE (Wilmington) 2009

Seaford 2009

Sussex Co DE (Seaford) 2009

While the map is great, the even better news is that the data is free and can be downloaded by anyone from the IRS Statistics page. They provide a lot of summary data – information for individuals is never reported. The individual tax data page with data gleaned from the 1040 has the most data that is geographic in nature. If you wanted to see how much and what kind of tax is collected by state, county, and ZIP code you could get it there. The US Population Migration data used to create the Forbes map is also there and the years from 2005 to 2009 are free (migration data from 1991 to 2004 is available for purchase).

You can download separate files for county inflow and county outflow on a state by state basis in Excel (.xls) format, or you can download the entire enormous dataset in .dat or .csv format. The data that’s reported is the number of filings and exemptions that represent a change in address by county from one year to the next, and includes the aggregated adjusted gross income of the total filers. There are some limitations – in order to protect confidentiality, if the flow from one county to another has fewer than 10 moves that data is lumped into an “other” category. International migration is also lumped into one international category (on the Forbes map, both the other category where two counties have a flow of fewer than 10 and the foreign migration category are not depicted).
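To make the inflow / outflow and suppression logic concrete, here is a small Python sketch. The records are made up, and the tuple layout (origin, destination, moves) is an assumption for illustration; the actual IRS files have their own column layout, which you would need to check before adapting this.

```python
from collections import defaultdict

# Made-up records: (origin county, destination county, number of moves).
flows = [
    ("A", "B", 120),
    ("B", "A", 45),
    ("A", "C", 7),   # fewer than 10 moves, so it would be suppressed
    ("C", "A", 30),
]

SUPPRESSION_THRESHOLD = 10  # flows below this get lumped into "other"


def suppress_small_flows(records, threshold=SUPPRESSION_THRESHOLD):
    """Relabel the destination of small flows as 'other', summed per origin."""
    totals = defaultdict(int)
    for origin, dest, moves in records:
        key = (origin, dest if moves >= threshold else "other")
        totals[key] += moves
    return dict(totals)


def net_migration(records, county):
    """In-migration minus out-migration for one county."""
    inflow = sum(m for o, d, m in records if d == county)
    outflow = sum(m for o, d, m in records if o == county)
    return inflow - outflow


print(suppress_small_flows(flows))
print(net_migration(flows, "A"))
```

A negative result means the county lost more movers than it gained, which is what the Forbes map draws in red.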

The IRS migration data is often used when creating population estimates; when combined with vital stats on births and deaths it can serve as the migration piece of the demographic equation.
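The demographic equation mentioned here is usually written as: ending population = starting population + births - deaths + net migration. A minimal sketch with made-up numbers follows (the function name is hypothetical, and real estimates programs involve many more adjustments than this):

```python
def estimate_population(base_population, births, deaths, net_migration):
    """Apply the basic demographic balancing equation for one period."""
    return base_population + births - deaths + net_migration


# Made-up inputs: a county of 100,000 with 1,200 births, 900 deaths,
# and a net loss of 52 movers over the year.
print(estimate_population(100_000, 1_200, 900, -52))
```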

2010 Census Data Being Released

Thursday, June 16th, 2011

The US Census Bureau has begun releasing data for Summary File 1, which is the primary summary data set that the Bureau tabulates. They will release data for groups of states on a weekly basis from June through September. Alabama and Hawaii were the first states released today. California, Delaware, Kansas, Pennsylvania and Wyoming are out next week.

This data is based on the 100% count of the population and is being released for geographies that nest within states: states, counties, county subdivisions, places, census tracts, ZCTAs, and congressional districts, and in some cases block groups and blocks. You can download the data table by table by building queries via the new American Factfinder, or power users can download entire datasets via the FTP site.

You’ll see how small the 2010 Census is compared to the past: we’re only going to get basic demographic variables. The extensive number of socio-economic indicators – education, income, language, employment status, etc – are no longer collected as part of the decennial census; you have to turn to the American Community Survey for this data, which is released on an annual basis.

Here’s what’s in the 2010 Census:

  • Total Population
  • Urban and Rural Population
  • Gender and Age
  • Race
  • Hispanic or Latino Origin
  • Households (Including Type and Size)
  • Group Quarters
  • Families
  • Family Relationships
  • Housing Units
  • Occupancy Status (Occupied or Vacant)
  • Tenure (Owner or Renter Occupied)

Many of these variables are cross-tabulated by age, gender, race, Hispanic or Latino Origin, Household Type, and Household Size. Once we get to the fall of 2011 we’ll start to see national level data for divisions, regions, and metropolitan areas.

2010 Census Redistricting Data

Sunday, April 17th, 2011

The Redistricting Summary Data [P.L. 94-171] from the 2010 Census has all been published for the nation, states, counties, and places, and is available via the new American Factfinder. The redistricting data includes basic demographic data: total population, race, Hispanic or Latino origin, and number of housing units occupied and vacant. Data is available down to census blocks and is available for most (but not all – no ZCTAs or PUMAs) geographies.

If you don’t want all the data for a state, don’t want to slog through the Factfinder, and are comfortable working with large text files, you can FTP the summary data from the Redistricting Data homepage. If you want basic summary data for states, counties, and places and don’t want to fuss with the Factfinder or text files, you can download Excel spreadsheets from the Redistricting Data Press Kit. They also have some pdf / jpg maps showing county level population and population change, plus interactive map widgets like the one below for the country and for each state. 2010 Redistricting TIGER Shapefiles have also been released for geographies included in the redistricting dataset.

The full 2010 Census for all geographies will be released throughout this summer and into the fall in Summary File 1 [SF1]. Stay tuned.

Some 2010 Census Updates

Monday, February 7th, 2011

Some geography updates to pass along regarding new US Census data:

  • The Census has released a few 2010 map widgets that you can embed in web pages. One shows population change, density, and apportionment for the whole country at the state level, while the other shows population, race, and Hispanic change for states at the county level. As of this post only four states are ready (LA, MS, NJ, and VA) but they’ll be adding the rest once they’re available.
  • The 2010 TIGER Line Files are starting to be released; they’ve changed the download interface a little bit based on user feedback. Most summary levels / geographic areas are available; some (like ZIP Codes and PUMAs) will be released later this year.
  • They’re also rolling out the new interface for the American Factfinder; currently you can get 2000 Census data, some population estimates, and the 2010 Census data as it becomes available. Other datasets like the American Community Survey and Economic Census will be added over time. Some maps and gov docs librarians have expressed concern about the change – apparently when you download the data from the new interface the FIPS codes are not “ready to go” for joining to shapefiles; there’s one long geo id that has to be parsed. The other concern is that the 1990 Census won’t be carried over into the new interface at all. The original American Factfinder is slated to come down towards the end of this year.
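Parsing the long geo id is not hard once you know its layout. The sketch below assumes the id looks like 0500000US36061: a three-digit summary level, four more digits, the letters US, and then the concatenated FIPS codes. That layout and the function name are assumptions for illustration, so check them against an actual download before relying on this.

```python
def parse_geo_id(geo_id):
    """Split an assumed '<summary level><4 digits>US<FIPS codes>' id into parts."""
    prefix, sep, codes = geo_id.partition("US")
    if not sep or len(prefix) < 3 or len(codes) < 2:
        raise ValueError(f"unexpected geo id layout: {geo_id!r}")

    summary_level = prefix[:3]
    parsed = {
        "summary_level": summary_level,
        "fips": codes,        # full code, usually what you join on
        "state": codes[:2],   # two-digit state FIPS
    }
    # Only counties and tracts carry a three-digit county code after the state.
    if summary_level in ("050", "140") and len(codes) >= 5:
        parsed["county"] = codes[2:5]
    if summary_level == "140" and len(codes) >= 11:
        parsed["tract"] = codes[5:11]
    return parsed


print(parse_geo_id("0500000US36061"))
```

Keeping every piece as a string preserves leading zeros, which matters when the result is joined to a shapefile's FIPS field.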

Track 2010 Census Participation Rates

Tuesday, March 30th, 2010

The 2010 Census is in full swing – the target date of April 1st is coming up soon. I mailed my form back last week. If you’re curious as to how many others have mailed theirs back, check out the bureau’s interactive Take 10 Map. Built on top of a Google Map interface, it allows you to track participation rates by state, county, place, reservation, and census tract. You can zoom in to change the scale and select different geography, or enter a zip code, city, or state to zoom to an area of choice. Clicking on an area will display its participation rate to date, compared to the state and national rates.

Data is updated daily, Monday through Friday. Once you click on a particular area, if you click the Track Participation Rate link it will create a widget that you can embed in a website to provide the updated rate. Unlike a lot of the other interactive web maps floating around these days, the bureau does give you the ability to download the actual data behind the map, if you want to do some analysis of your own.

NY Times Interactive Maps

Sunday, March 29th, 2009

The New York Times has been putting together some great, web-based, thematic maps lately. I thought I’d provide a summary of some of the latest and greatest here.

US Maps

  • Immigration Explorer – Explore foreign born groups for the United States by county, based on the decennial census from 1880 to 2000. Choropleth maps of the largest immigrant group per county and graduated circle maps depicting the size of each group. 3/10/2009.
  • The Geography of a Recession – Choropleth map of the US that shows the annual change in unemployment by county. Lets you filter by county type (urban, rural, manufacturing areas, housing bubbles). 3/7/2009.
  • A Growing Detention Network – a graduated circle map of the US that shows detention centers where people are held on immigration violations, by number of detainees and type of facility. 12/26/2008.
  • Where Homes Are Worth Less Than the Mortgage – State-based US choropleth and graduated circle maps of the housing and debt crisis. 11/10/2008.

NYC Maps

  • New Yorkers Assess Their City – How New Yorkers rate their neighborhood based on quality of life, city services, education, transportation and crime. Based on a large survey of city residents. Choropleth maps of community districts. 3/7/2009.
  • Census Shows Growing Diversity in New York City – Choropleth maps show median rent and median income by PUMAs in 2000 and 2007. An example of mapping 3-year ACS data by PUMAs to show patterns below the county level. 12/9/2008.

World Maps

  • A Map of Olympic Medals – A cartogram of countries based on the number of Olympic medals they’ve won for every Olympics from 1896 to 2008. Mouse over to get medal counts. 8/4/2008.

Census Cartographic Boundary Files

Tuesday, May 13th, 2008

I’ve worked with these files a number of times and just used them again recently, and thought I would share the process you need to go through to prepare them for use in ArcGIS, as they are not “ready to go”. If you are not using ArcGIS, you can still follow these general steps using the specific tools that your software provides.

I would opt for the Cartographic Boundary Files (CBF) over the TIGER shapefiles (that the census just released) when making a national-level thematic map, as the generalization of the CBF makes the boundaries look cleaner at this scale. Also, the generalized files show land boundaries along coasts, while the TIGER files show the legal boundaries that extend into the water. The latter are not great for thematic maps, particularly as the Great Lakes states look distorted (as their boundaries extend into the lakes).

I’ll use the state and equivalent areas as an example, as those are the files I’ve just worked with. After downloading and unzipping the national-level shapefiles, you’ll need to take the following steps in the ArcCatalog:

  • Define the projection, as the files are undefined. According to metadata on the website, the files are in simple NAD83. In the ArcToolbox, the tool is under Data Management Tools, Projections and Transformations, Define Projection. Once you launch the tool, you will need to select the North American Datum 1983 as the coordinate system, which is stored under Geographic Coordinate Systems for North America.

  • After you define the projection, the next step is to reproject the layer to another projection that is more suitable for displaying the US. If you are making a map for basic presentation, a projected coordinate system like Albers Equal Area Conic would be a good choice (most atlases and maps of the continental US use this projection). Alaska, Hawaii, and Puerto Rico will be distorted, but we will be able to give them a separate data frame in ArcMap with their own projection later on. The tool is in the ArcToolbox under Data Management Tools, Projections and Transformations, Features, Project. Note that this is a DIFFERENT tool than the one we used in the last step. Define Projection is used to tell ArcGIS what projection a file is in if it is undefined, while Features, Project is used to reproject a vector file from one projection to another. A file MUST have a defined projection BEFORE you can reproject it.

  • The CBFs are stored as single-part features, which means that each distinct polygon will have its own record in the attribute table. For example, each of the Hawaiian Islands will have its own record in the table. This is a problem if you plan to join state-level data to your shapefile, as the data from the join will be repeated for each record. So if you have a table with population data for each of the states and you join it to the shapefile, each individual Hawaiian island will be assigned the total population of Hawaii. If you run statistics on your data, you’ll get inflated counts. To avoid this, we need to convert the CBF to a multi-part feature, where each state will have only one record in the attribute table. To do this, we use the Dissolve tool under Data Management Tools, Generalization, Dissolve. The Dissolve fields will be the basis for dissolving the individual parts of the states into one state feature. In this case, we would choose the STATE field (FIPS code) and NAME field as the dissolve fields, which will give us one feature for each state (if we chose DIVISION or REGION as the field, we would aggregate the polygons to create those larger geographic areas).

  • The next step is to decide whether you want to keep your shapefile as an independent file, or bring it into a geodatabase. The geodatabase is handy if you have lots of other tables and shapefiles that you are using in your project. Right-click in the catalog tree to create a new personal or file geodatabase. Then select your shapefile and right click to export it to your new geodatabase.

  • Whether you stick with a shapefile or go with a geodb, the next step is to open ArcMap and add your file to it. Now, you’ll have to make a decision about Puerto Rico. If you have a dataset where you want to map data for it, then you need not do anything. Since I am making presidential election maps and Puerto Rico doesn’t vote in the electoral college, I needed to delete it. To do so, go into an Edit mode under the Editor toolbar, select PR in the attribute table or map, delete it, then save. You’ll be left with a file for the 50 states and DC.

  • At this point, if you are going to join table data to your features, do so. Your features have a FIPS code, so you can use that to do the join (NEVER use names for joining – stick with codes). I often will add a new column to my features and plug in the two letter postal abbreviations, since they are commonly used for identifying states.

  • Once you’ve joined your data and are ready to make a finished map, the last step will be adding two new data frames for Alaska and Hawaii. Since AK and HI are distant from the continental US, it is better to create separate frames for all three rather than trying to display them in one. Copy your current data layer (not the features – the layer which is indicated by the yellow rectangles layered on top of each other) in the table of contents, and paste it below. Activate that layer, and name the layer Alaska. Then right click on the properties for the data layer and go to the coordinates tab. Modify the coordinate system of the data layer by choosing Alaska Albers Equal Area Conic. This will reproject the data on the fly and will display Alaska in a more appropriate projection (as the continental projection distorts it). Then, in the Layout View, you can resize the Alaska data frame and zoom in to focus just on AK. Repeat these steps for Hawaii (and Puerto Rico if you’re mapping it), and you’ll have a good looking US map!
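The dissolve and join steps above are the ones most likely to trip people up, so here is a plain Python sketch of the logic. This is not the ArcGIS Dissolve tool, just an illustration of why single-part records inflate counts and how grouping by the STATE code fixes it. The area and population numbers are made up, and the helper names are hypothetical.

```python
from collections import defaultdict

# Single-part records: one row per polygon (made-up areas).
parts = [
    {"STATE": "15", "NAME": "Hawaii", "area": 4.0},
    {"STATE": "15", "NAME": "Hawaii", "area": 1.5},
    {"STATE": "15", "NAME": "Hawaii", "area": 0.5},
    {"STATE": "10", "NAME": "Delaware", "area": 2.0},
]

# State-level table keyed by FIPS code, stored as text to keep leading zeros.
population = {"15": 1000, "10": 600}  # made-up values


def dissolve(records, fields):
    """Merge records sharing the same values in `fields`, summing their area."""
    merged = defaultdict(float)
    for rec in records:
        merged[tuple(rec[f] for f in fields)] += rec["area"]
    return [dict(zip(fields, key), area=area) for key, area in merged.items()]


def join_population(records, table):
    """Attach population to each record by FIPS code (never by name)."""
    return [dict(rec, population=table[rec["STATE"]]) for rec in records]


# Joining before dissolving repeats Hawaii's population on every island.
inflated = sum(r["population"] for r in join_population(parts, population))

# Dissolving first leaves one record per state, so totals come out right.
states = dissolve(parts, ["STATE", "NAME"])
correct = sum(r["population"] for r in join_population(states, population))

print(inflated, correct)
```

With these made-up numbers the undissolved join counts Hawaii three times, which is exactly the inflated statistic the dissolve step is meant to prevent.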

