American Factfinder Tutorial & Census Geography Updates

Monday, July 23rd, 2012

I’ve been enmeshed in the census lately as I’ve been writing a paper about the American Community Survey. Here are a few things to share:

  • Since I frequently receive questions about how to use the American Factfinder, I’ve created a brief tutorial with screenshots demonstrating a few ways to navigate it. I illustrate how to download a profile for a single census tract from the American Community Survey, and how to download a table for all ZIP Code Tabulation Areas (ZCTAs) in a county using the 2010 Census.
  • New boundaries for PUMAs based on 2010 census geography have been released; they’re not available from the TIGER web-based interface yet but you can get state-based files from the FTP site. I’ve downloaded the boundaries for New York and there are small changes here and there from the 2000 Census boundaries; not surprising as PUMAs are built from tracts and tract boundaries have changed. One big bonus is that PUMAs now have names associated with them, based on local government suggestions. In NY State they either take the name of counties with some directional element (east, central, south, etc.), or the name of MCDs that are contained within them. In NYC they’ve been given the names of community districts.
  • I’ve done some digging through the FAQs and discovered that the census is going to stick with the old 2000 PUMA boundaries for the next release of the American Community Survey: the 2011 ACS, which will be released at the end of this year. 2010 PUMAs won’t be used until the 2012 ACS, to be released at the end of 2013.
  • Urban Areas are the other holdovers in the ACS that use 2000 vintage boundaries. The ACS will also transition to the 2010 boundaries for urban areas in the 2012 ACS.
  • In the course of my digging I discovered that the census will begin including ZCTA-level data as part of the 5-year ACS estimates, beginning with the 2011 release this year. 2010 ZCTA boundaries are already available, and 2010 Census data has already been released for ZCTAs. The ACS will use the 2010 vintage ZCTAs for each release until they’re redrawn for 2020.
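
When matching ACS tables to boundary files, the vintage rules above can be kept in a small lookup. The following is a minimal sketch that records only the releases mentioned in these notes; the names `VINTAGE_BY_ACS_YEAR` and `boundary_vintage` are hypothetical, and anything outside the 2011 and 2012 releases is deliberately left out rather than guessed.

```python
# Boundary vintage used by each ACS release, limited to what the notes above state.
# Keys are ACS release years; values are the census year of the boundaries used.
VINTAGE_BY_ACS_YEAR = {
    "puma": {2011: 2000, 2012: 2010},
    "urban_area": {2011: 2000, 2012: 2010},
    "zcta": {2011: 2010, 2012: 2010},  # ZCTAs appear in the 5-year estimates only
}


def boundary_vintage(geography, acs_year):
    """Return the census year whose boundaries a given ACS release uses."""
    try:
        return VINTAGE_BY_ACS_YEAR[geography][acs_year]
    except KeyError:
        # Refuse to guess for geographies or years the notes do not cover.
        raise ValueError(f"no vintage recorded for {geography!r} in the {acs_year} ACS")


for geography in ("puma", "urban_area", "zcta"):
    for year in (2011, 2012):
        print(geography, year, "->", boundary_vintage(geography, year))
```

Raising an error for unlisted years is a design choice: it is safer for a join to fail loudly than to pair estimates with the wrong set of boundaries.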

GIS Workshops This Apr & May

Sunday, March 25th, 2012

This semester I’ll be teaching three workshops with Prof. Deborah Balk in spatial tools and analysis. Sponsored by the CUNY Institute of Demographic Research (CIDR), the workshops will be held on Baruch College’s campus in midtown NYC on Friday afternoons. The course is primarily intended for data and policy analysts who want to gain familiarity with the basics of map making and spatial analysis; registration is open to anyone. The workshops progress from basic to intermediate skills that cover making a map (Apr 27th), geospatial calculations (May 4th), and geospatial analysis (May 11th). We’ll be using QGIS and participants will work off of their own laptops; we’ll also be demonstrating some of the processes in ArcGIS and participants will receive an evaluation copy of that software. Each workshop is $300 or you can register for all three for $750.

For full details check out this flier. You can register via the College’s CAPS website; do a search for DEM and register for each session (DEM0003, DEM0004, and DEM0005).

Country Centroids File Updated

Monday, February 13th, 2012

A brief note – I’ve updated and replaced the country centroids file that I was previously hosting. I extracted data with geographic centroids in latitude and longitude for each country and dependency in the world using extracts from the NGA’s GNS and the USGS GNIS. Data is current as of Feb 2012, with long and short names for countries and two letter alpha FIPS and ISO codes for identification and attribute linking. Available for download on the Resources page.

2010 Census Generalized Cartographic Boundary Files

Thursday, December 22nd, 2011

I’ve had a few interesting projects that have kept me busy at the end of this year. I’ll do a post or two after New Years, once I’m back in the office and can take some screen shots to illustrate.

In the meantime I have one tidbit I can mention – the Census Bureau has released the 2010 version of the Generalized Cartographic Boundary Files. These files are generalized versions of the TIGER files, with smoothed and simplified boundaries and areas of coastal water removed. They haven’t posted them on the same page as the 2000 and 1990 boundaries; they’ve mentioned they’re creating a new interface to host all of them, which is currently a work in progress.

However, you can get access to all the 2010 boundaries via the FTP site – you just need to know what you’re looking at. All the files are named with codes to identify the geographic coverage, summary level, and resolution / scale. There’s a README file on the FTP page that tells you how to identify each.

But in brief, each file name is built from the following components:

  • ss is the state INCITS / FIPS code which you can look up here – ‘us’ is a national level file.
  • lll is the summary level or unit of geography – the README file has a table with each code. The most common ones: 040 for state, 050 for county, 060 for county subdivisions, 140 for census tracts, 160 for places, 310 for metropolitan and micropolitan statistical areas, 860 for ZCTAs. (No PUMAs: 2010 PUMA boundaries haven’t been drawn yet, and 2000 PUMA boundaries are still being used in the latest ACS.)
  • vv is a version number for the file.
  • rr is resolution – most of the files are 500k = 1:500,000, which is the least generalized and best for mapping state-level to regional areas. For national level files you also have the option of 5m = 1:5,000,000 and 20m = 1:20,000,000, which are more generalized and better for national mapping.
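
The components above can be decoded programmatically. The following is a minimal sketch, assuming the files follow the pattern `gz_2010_ss_lll_vv_rr.zip`; the `gz_2010_` prefix and `.zip` extension are my recollection of the FTP listing rather than something stated here, so check them against the README. The function name `parse_boundary_filename` is hypothetical, and the summary level table holds only the codes listed above.

```python
import re

# ASSUMPTION: the "gz_2010_" prefix and ".zip" suffix are recalled from the FTP
# listing, not stated in the post; confirm them against the README.
FILENAME = re.compile(
    r"^gz_2010_(?P<ss>us|\d{2})_(?P<lll>\d{3})_"
    r"(?P<vv>[0-9A-Za-z]+)_(?P<rr>500k|5m|20m)\.zip$"
)

# Only the summary levels named in the post; the README has the full table.
SUMMARY_LEVELS = {
    "040": "state",
    "050": "county",
    "060": "county subdivision",
    "140": "census tract",
    "160": "place",
    "310": "metropolitan / micropolitan statistical area",
    "860": "ZCTA",
}

SCALES = {"500k": 500_000, "5m": 5_000_000, "20m": 20_000_000}


def parse_boundary_filename(name):
    """Split a boundary file name into coverage, summary level, version, scale."""
    match = FILENAME.match(name)
    if match is None:
        raise ValueError(f"unrecognized file name: {name!r}")
    ss, lll = match["ss"], match["lll"]
    return {
        "coverage": "national" if ss == "us" else f"state FIPS {ss}",
        "summary_level": SUMMARY_LEVELS.get(lll, f"unlisted code {lll}"),
        "version": match["vv"],
        "scale_denominator": SCALES[match["rr"]],
    }


# 36 is the FIPS code for New York; 140 is census tracts.
print(parse_boundary_filename("gz_2010_36_140_00_500k.zip"))
```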

The Census Bureau has been doing a lot of tweaking to their website lately. The legacy version of the American Factfinder is set to disappear for good on Jan 20, 2012.

Gothos Back In Business

Thursday, October 6th, 2011

Gothos had some downtime this past weekend, as I was cleaning up after a hacking attack. I managed to get things fixed and was removed from Google’s blacklist, so all is now good and the site is safe. If you visited during the last two weeks of Sept into the first weekend of Oct, I’d recommend running a virus and an anti-spyware scanner on your machine just in case.

I managed to restore all the posts and images, and most of the resources – a few are missing. If you linked to this site or to any individual post your links should still work. I did lose all of the comments and the user list. At this point I’m leery about opening membership back up; it looked like a lot of garbage was inserted via comments. Since this site has functioned more or less as a traditional, one-way resource and not as a collaborative blog, I’m going to keep comments turned off and not allow members for now. If you’d like to continue following I’d suggest you subscribe to the RSS feed or bookmark the site and check back now and again. Of course, you’re always welcome to send me an email if you have questions or comments.

Since I’m essentially starting fresh I went with a different theme for a new look and feel. I’d like to thank my host, Webfaction, for their help in getting me through this, and I’d recommend the Sucuri website scanner to anyone who owns a site or has suspicions about one they’re visiting.

2010 American Community Survey Releases

Friday, September 23rd, 2011

The US Census Bureau released the new annual data for the 2010 American Community Survey; this dataset includes an extensive number of demographic, socio-economic, and housing estimates (with margins of error) for all geographic areas in the US that have a population of at least 65,000 people. This is the first ACS survey that is weighted based on the 2010 Census, and that is tabulated entirely on the new 2010 Census geography; exceptions include PUMAs and urban areas, which typically aren’t redrawn until a couple of years after a decennial census is taken. Data for these areas will be reported based on the 2000 Census geography. This will also be the first ACS that is distributed via the new American Factfinder. Previous ACS datasets should be moved to the new Factfinder by the end of this year.

According to the release schedule, data for the three year ACS (2008-2010) for areas with at least 20,000 residents will be published in October and the five year ACS (2006-2010) for geography down to census tracts will be released in December. The three year dataset hits a milestone this year, as for the first time we’ll have datasets with mutually exclusive years that can be feasibly compared for historical change (the 2005-2007 dataset versus 2008-2010). It should prove interesting as the earlier dataset represents the end of the brief boom years while the current one depicts the depth of the Great Recession. There will be some challenges in making comparisons, as the base for weighting the estimates and the geography used to tabulate them are different for each dataset (2000 Census in the earlier dataset versus 2010 Census in the latest one).
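
Since every ACS estimate comes with a margin of error, a change between two non-overlapping periods is only meaningful if it exceeds the sampling noise. The following is a minimal sketch of the standard test: convert each 90 percent margin of error to a standard error by dividing by 1.645, then compare the difference to the combined standard error. The estimates in the example are made up for illustration, and the test does not account for the differences in weighting base or geography mentioned above.

```python
import math

Z_90 = 1.645  # ACS margins of error are published at the 90 percent confidence level


def change_z_score(est_early, moe_early, est_late, moe_late):
    """Z score for the difference between two independent ACS estimates."""
    se_early = moe_early / Z_90
    se_late = moe_late / Z_90
    return (est_late - est_early) / math.sqrt(se_early ** 2 + se_late ** 2)


def is_significant_change(est_early, moe_early, est_late, moe_late):
    """True when the change is statistically significant at the 90 percent level."""
    return abs(change_z_score(est_early, moe_early, est_late, moe_late)) > Z_90


# Hypothetical median household income for one area, 2005-2007 versus 2008-2010.
early_estimate, early_moe = 52000, 1500
late_estimate, late_moe = 49000, 1800

print(round(change_z_score(early_estimate, early_moe, late_estimate, late_moe), 2))
print(is_significant_change(early_estimate, early_moe, late_estimate, late_moe))
```

This only applies to periods that share no survey years; overlapping multi-year estimates are not independent and need a different treatment.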

Updates for QGIS 1.7 Wroclaw

Tuesday, July 5th, 2011

The latest version of QGIS, 1.7 “Wroclaw”, was released a few weeks ago. Some of the recent updates render parts of my GIS Practicum out of date (unless you’re sticking with versions 1.5 or 1.6), so I’ll be making updates later this summer for the upcoming workshops this academic year. In the meantime, I wanted to summarize the most salient changes here for the participants in this past spring’s workshops, and for anyone else who may be interested. Here are the big two changes that affect the tutorial / manual:

Transforming Projections – In previous versions you would go under Vector > Data Management Tools > Export to New Projection. In 1.7 this has been dropped from the ftools vector menu. To transform the projection of a file, you select it in the Map Legend (ML), right-click, hit Save As, give it a new name and specify the new CRS there. The QGIS developers have provided some info on how QGIS handles projections that’s worth checking out. You can go in the settings and have QGIS transform projections on the fly, which is fine depending on what you’re going to do. My preference is to play it safe – do the transformations and make sure all your files and the window share the same CRS. It can save you headaches later on.
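
To see why mixing coordinate systems causes headaches, it helps to look at what a transformation does to the numbers. The following is a minimal sketch of one simple case, converting longitude and latitude in degrees to spherical Mercator meters. It is an illustration only, not what QGIS runs internally (QGIS relies on the PROJ library and supports far more systems); the function name is hypothetical.

```python
import math

EARTH_RADIUS_M = 6378137.0  # sphere radius used by spherical Mercator


def lonlat_to_spherical_mercator(lon_deg, lat_deg):
    """Convert degrees of longitude / latitude to spherical Mercator meters."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y


# The same location expressed in two coordinate systems: a pair of small numbers
# in degrees versus a pair of values in the millions of meters.
lon, lat = -74.0, 40.7
print((lon, lat))
print(lonlat_to_spherical_mercator(lon, lat))
```

A layer stored in degrees and a layer stored in meters will not line up unless one of them is transformed, which is the reason for keeping every file and the map window in the same CRS.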

Table Joins – In previous versions you would also accomplish this under Vector > Data Management Tools > Join Attributes, where you’d join a DBF or CSV to a shapefile to create a new file with both the geometry and the data. Now that’s out, and QGIS can support dynamic joins, similar to ArcGIS where you couple the attribute table to the shapefile without permanently fusing the two. To do this you must add your DBF or CSV directly to your project; do this the same way you’d add a vector layer. Hit the Add Vector button, and change the drop down at the bottom so you can see all files (not just shapefiles) in your directory. Add your table. Then select your layer in the ML and double click to open the properties menu. You’ll see a new tab for Joins. Hit the tab and hit the plus button to add a new join. You’ll select your table, the table’s join field, and the layer’s join field. Hit OK to join ’em, and it’s done. Open the attribute table for the layer and you’ll see all columns for the layer and the joined field.
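
Conceptually, a table join matches each feature’s key against a key column in the table and attaches the table’s other columns. The following is a minimal sketch of that logic in plain Python, not of the QGIS implementation; the field names, the `join_attributes` helper, and the values in the CSV are hypothetical.

```python
import csv
import io

# Hypothetical attribute rows for a county layer (geometry omitted).
layer_rows = [
    {"GEOID": "36061", "NAME": "New York"},
    {"GEOID": "36047", "NAME": "Kings"},
    {"GEOID": "36081", "NAME": "Queens"},
]

# Hypothetical CSV table; Queens is missing on purpose.
table_csv = io.StringIO("geoid,pop_change\n36061,-1.2\n36047,3.4\n")


def join_attributes(layer_rows, table_rows, layer_key, table_key):
    """Attach table columns to each layer row where the key values match."""
    lookup = {row[table_key]: row for row in table_rows}
    extra_fields = [f for f in table_rows[0] if f != table_key] if table_rows else []
    joined = []
    for feature in layer_rows:
        match = lookup.get(feature[layer_key])
        new_row = dict(feature)
        for field in extra_fields:
            # Features with no matching table row get an empty value.
            new_row[field] = match[field] if match else None
        joined.append(new_row)
    return joined


table_rows = list(csv.DictReader(table_csv))
for row in join_attributes(layer_rows, table_rows, "GEOID", "geoid"):
    print(row)
```

Two things to watch for: the keys must be the same type on both sides (here both are text, so leading zeros survive), and values read from a CSV arrive as text unless they are converted to numbers.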

This sounds great, but I had some trouble when I went to symbolize my data. Using the old symbology tab, I couldn’t classify any of my columns from my attribute table using Equal Intervals; it populated each class with zeros. Quantiles worked fine. If I switched to the new symbology, I still couldn’t use Equal Intervals, and Quantiles and Natural Breaks only worked partially – my dataset contained negative values which were simply dropped instead of classified. Grrrrr. I got around this by selecting my layer in the ML (after doing the join), right-clicking, and saving it as a new layer. This permanently fused the shapefile with the attributes, and I had no problem classifying and mapping data in the new file. I went to the forum and asked for help to see if this is a bug – will report back with the results.
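
For reference, here is how the two classification schemes are expected to behave on data with negative values. The following is a minimal sketch with hypothetical values and function names; it shows the textbook calculations and does not reproduce or diagnose the QGIS behavior described above.

```python
import statistics


def equal_interval_breaks(values, classes):
    """Upper bounds of classes that split the data range into equal widths."""
    low, high = min(values), max(values)
    width = (high - low) / classes
    breaks = [low + width * i for i in range(1, classes)]
    return breaks + [high]  # use the true maximum as the last bound


def quantile_breaks(values, classes):
    """Upper bounds of classes holding roughly equal numbers of values."""
    cuts = statistics.quantiles(values, n=classes, method="inclusive")
    return cuts + [max(values)]


def classify(value, breaks):
    """Index of the first class whose upper bound is at least the value."""
    for index, upper in enumerate(breaks):
        if value <= upper:
            return index
    return len(breaks) - 1


# Hypothetical percent change values, including negatives.
values = [-12.0, -3.5, 0.0, 2.0, 4.5, 8.0, 20.0]

ei_breaks = equal_interval_breaks(values, 4)
q_breaks = quantile_breaks(values, 4)
print("equal interval:", ei_breaks)
print("quantile:", q_breaks)
print([classify(v, ei_breaks) for v in values])
```

Negative values are just the low end of the range in both schemes, so every value should land in a class; none should be dropped.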

Here are some other updates worth noting:

  • Feature Count – if you right click on a layer in the ML, you can select the Feature Count option and the number of features will be listed under your layer. If you’ve classified your features, it will do a separate count for each classification.
  • Measuring Tool – it looks like it’s now able to convert units on the fly, so even if you’re using a CRS with units in degrees, it can convert it to meters and kilometers (you can go into the options menu if you want to switch to feet and miles).
  • Labels – it looks like the experimental labelling features have been made default in this version, and they do work better, especially with polygons.
  • Map Composer Undo – undo and redo buttons have been added to the map composer, which makes it much easier to work with.
  • Map Composer Symbols – if you go to insert an image in the map composer, you’ll have a number of default SVG symbols you can load, including north arrows.
  • Export to PDF – from the map composer, this was pretty buggy in the past but I was able to export a map today without any problems at all.
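
On the measuring tool: turning a distance between two points stored in degrees into meters requires some model of the earth. The following is a minimal sketch of one common approach, the haversine great-circle formula on a sphere. I’m not claiming this is the calculation QGIS uses; the function names are hypothetical, and a sphere is only an approximation of the earth’s shape.

```python
import math

EARTH_RADIUS_M = 6371008.8  # mean earth radius; treating the earth as a sphere
METERS_PER_FOOT = 0.3048
METERS_PER_MILE = 1609.344


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def meters_to_feet(meters):
    return meters / METERS_PER_FOOT


def meters_to_miles(meters):
    return meters / METERS_PER_MILE


# One degree of latitude along a meridian, in several units.
distance = haversine_m(-74.0, 40.0, -74.0, 41.0)
print(round(distance / 1000, 1), "km")
print(round(meters_to_miles(distance), 1), "miles")
```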

FOSS4G In Denver This Sept

Monday, June 20th, 2011

I’m all set to go to FOSS4G 2011, the global conference on Free and Open Source Software for Geospatial, organized by OSGeo. The conference takes place in Denver, CO from Mon Sept 12 to Fri the 16th. The first two days (12th-13th) consist of morning and afternoon workshops while the main conference takes place from the 14th to the 16th and features talks, presentations, tutorials, exhibits, and some fun social events.

The full program is available here, and it looks like it’s chock full of interesting presentations and lots of great learning opportunities via the workshops and tutorials. I’ll be presenting on Weds afternoon, for those interested in my adventures in introducing QGIS on a college campus.

If you’re on the fence about attending, consider this: this is the sixth year for the conference and it’s only the second time that it’s been held in North America (Canada hosted the 2nd conference in 2007) and the first time it’s being hosted in the US. So if you’re in North America and getting funding from your organization for travel is an issue, now’s your best chance to go. This is truly an international conference (it has also been hosted in Switzerland, South Africa, Australia, and Spain) so it probably won’t be back on these shores for a while.

Here’s some more motivation – early registration at the discounted rate ends on June 30th!

2010 Census Data Being Released

Thursday, June 16th, 2011

The US Census Bureau has begun releasing data for Summary File 1, which is the primary summary data set that the Bureau tabulates. They will release data for groups of states on a weekly basis from June through September. Alabama and Hawaii were the first states released today. California, Delaware, Kansas, Pennsylvania and Wyoming are out next week.

This data is based on the 100% count of the population and is being released for geographies that nest within states: states, counties, county subdivisions, places, census tracts, ZCTAs, and congressional districts, and in some cases block groups and blocks. You can download the data table by table by building queries via the new American Factfinder, or power users can download entire datasets via the FTP site.

You’ll see how small the 2010 Census is compared to the past: we’re only going to get basic demographic variables. The extensive number of socio-economic indicators – education, income, language, employment status, etc – are no longer collected as part of the decennial census; you have to turn to the American Community Survey for this data, which is released on an annual basis.

Here’s what’s in the 2010 Census:

  • Total Population
  • Urban and Rural Population
  • Gender and Age
  • Race
  • Hispanic or Latino Origin
  • Households (Including Type and Size)
  • Group Quarters
  • Families
  • Family Relationships
  • Housing Units
  • Occupancy Status (Occupied or Vacant)
  • Tenure (Owner or Renter Occupied)

Many of these variables are cross-tabulated by age, gender, race, Hispanic or Latino Origin, Household Type, and Household Size. Once we get to the fall of 2011 we’ll start to see national level data for divisions, regions, and metropolitan areas.

2010 Census Redistricting Data

Sunday, April 17th, 2011

The Redistricting Summary Data [P.L. 94-171] from the 2010 Census has all been published for the nation, states, counties, and places, and is available via the new American Factfinder. The redistricting data includes basic demographic data: total population, race, Hispanic or Latino origin, and number of housing units occupied and vacant. Data is available down to census blocks and is available for most (but not all – no ZCTAs or PUMAs) geographies.

If you want all the data for a state, don’t want to slog through the Factfinder, and are comfortable working with large text files, you can FTP the summary data from the Redistricting Data homepage. If you want basic summary data for states, counties, and places and don’t want to fuss with the Factfinder or text files, you can download Excel spreadsheets from the Redistricting Data Press Kit. They also have some pdf / jpg maps showing county level population and population change, plus interactive map widgets like the one below for the country and for each state. 2010 Redistricting TIGER Shapefiles have also been released for geographies included in the redistricting dataset.

The full 2010 Census for all geographies will be released throughout this summer and into the fall in Summary File 1 [SF1]. Stay tuned.

