Posts Tagged ‘american community survey’

Government Shutdown: Alternate Resources for US Census Data

Sunday, October 13th, 2013

As the US government shutdown continues (thanks to a handful of ideological nutcases in congress) those of us who work with and rely on government data are re-learning the lesson of why it’s important to keep copies of things. This includes having alternate sources of information floating around on the web and in the cloud, as well as the tried and true approach of downloading and saving datasets locally. There have been a number of good posts (like this succinct one) to point users to alternatives to the federal sources that many of us rely on. I’ll go into more detail here with my suggestions on where to access US Census data, based on user-level and need.

  • The Social Explorer: this web-mapping resource for depicting and accessing US Census data from 1790 to present (including the 2010 Census and the latest American Community Survey data) is intuitive and user-friendly. Many academic and public libraries subscribe to the premium edition that provides full access to all features and datasets (so check with yours to see if you have access), while a basic free version is available online. Given the current circumstances the Social Explorer team has announced that it will open the hatch and provide free access to users who request it.
  • The NHGIS (National Historic GIS): this project is managed by the Minnesota Population Center and also provides access to all US Census data from 1790 to present. While it’s a little more complex than the Social Exlorer, the NHGIS is the better option for downloading lots of data en-masse, and is the go-to place if you need access to all datasets in their entirety, including all the detail from the American Community Survey (as the Social Explorer does not include margins of error for any of the ACS estimates) or if you need access to other datasets like the County Business Patterns. Lastly – it is the alternative to the TIGER site for GIS users who need shapefiles of census geography. You have to register to use NHGIS, but it’s free. For users who need microdata (decennial census, ACS, Current Population Survey), you can visit a related MPC project to the NHGIS: IPUMS.
  • The Missouri Census Data Center (MCDC): I’ve mentioned a number of their tools in the past; they provide easy-to-access profiles from the 2010 Census and American Community Survey, as well as historical trend reports for the ACS. For intermediate users they provide extract applications for the 2010 Census and ACS for creating spreadsheets and SAS files for download, and for advanced users the Dexter tool for downloading data en-masse from 1980 to present. Unlike the other resources no registration or sign-up is required. I also recommend the MCDC’s ACS and 2010 Census profiles to web designers and web mappers; if you’ve created online resources that tapped directly into the American Factfinder via deep links (like I did), you can use the MCDC’s profiles as an alternative. The links to their profiles are persistent and use a logical syntax (as it looks like there’s no end in site to this shutdown I may make the change-over this week). Lastly, the MCDC is a great resource for technical documentation about geography and datasets.
  • State and local government: thankfully many state and local governments have taken subsets of census data of interest to people in their areas and have recompiled and republished it on the web. These past few weeks I’ve been constantly sending students to the NYC Department of City Planning’s population resources. Take a look at your state data center’s resources, as well as local county or city planning departments, transportation agencies, or economic development offices to see what they provide.

Article on Working With the American Community Survey

Monday, June 17th, 2013

I’ve got another article that’s just hit the presses. In this one I discuss the American Community Survey: how it differs from the Decennial Census, when you should use it versus other summary data sets, how to work with the different period estimates, and how to create derived estimates and calculate their margins of error. For that last piece I’ve essentially done an extended version of this old post on Excel formulas, with several different and updated examples.

The article is available via Emerald’s journal database. If you don’t have access to it from your library feel free to contact me and I’ll send you a copy (can’t share this one freely online).

Title: The American Community Survey: practical considerations for researchers
Author(s): Francis P. Donnelly
Citation: Francis P. Donnelly, (2013) “The American Community Survey: practical considerations for researchers”, Reference Services Review, Vol. 41 Iss: 2, pp.280 – 297
Keywords: American Community Survey, Census, Census geography, Data handling, Decennial census, Demographic data, Government data processing, Government information, Margins of error, Sample-based data, United States of America, US Census Bureau
Article type: Technical paper
DOI: 10.1108/00907321311326228 (Permanent URL)
Publisher: Emerald Group Publishing Limited

Downloading Data for Small Census Geographies in Bulk

Tuesday, May 7th, 2013

I needed to download block group level census data for a project I’m working on; there was one particular 2010 Census table that I needed for every block group in the US. I knew that the American Factfinder was out – you can only download block group data county by county (which would mean over 3,000 downloads if you want them all). I thought I’d share the alternatives I looked at; as I searched around the web I found many others who were looking for the same thing (i.e. data for the smallest census geographies covering a large area).

The Census FTP site at http://www2.census.gov/

This would be the first logical step, but in the end it wasn’t optimal based on my need. When you drill down through Census 2010, Summary File 1, you see a file for every state and a national file. Initially I thought – great! I’ll just grab the national file. But the national file does NOT contain the small census statistical areas – no tracts, block groups, or blocks. If you want those small areas you have to download the files for each of the states – 51 downloads. When you download the data you can also download an MS Access database, which is an empty shell with the geography and field headers, and you can import each of the text file data tables (there a lot of them for 2010 SF1) into the db and match them to the headers during import (the instructions that were included for doing this were pretty good). This is great if you need every variable in every table for every geography, but I was only interested in one table for one geography. I could just import the one text file with my table, but then I’d have to do this import process 51 times. The alternative is to use some Python to get that one text file for every state into one big file and then do the import once, but I opted for a different route.

The NHGIS at https://www.nhgis.org/

I always recommend this resource to anyone who’s looking for historical census data or boundary files, but it’s also good if you want current data for these small areas. I was able to use their query window to widdle down the selection by dataset (2010 SF1), geography (block groups), and topic (Hispanic origin and race in my case), then I was able to choose the table I needed. On the last screen before download I was able to check a box to include all 50 states plus DC and PR in one file. I had to wait a couple minutes for the request to process, then downloaded the file as a CSV and loaded it into my database. This was the best solution for my circumstances by far – one table for all block groups in the country. If you had to download a lot (or all) of the tables or variables for every block group or block it may take quite awhile, and plugging through all of those menus to select everything would be tedious – if that’s your situation it may be easier to grab everything using the Census FTP.

nhgis

UExplore / Dexter at http://mcdc.missouri.edu/applications/uexplore.shtml

The Missouri Census Data Center’s UExplore / Dexter tool lets you choose a dataset and takes you to a window that resembles a file system, with a ton of files in it. The MCDC takes their extracts directly from the Census, so they’re structured in a similar way to the FTP site as state-based files. They begin with the state prefix and have a name that indicates geography – there are files for block groups, blocks, and one for everything else. There are national files (which don’t contain small census areas) that begin with ‘us’. The difference here is – when you click on a file, it launches a query window that let’s you customize the extract. The interface may look daunting at first, but it’s worth exploring (and there’s a tutorial to help guide you). You can choose from several output formats, specific variables or tables (if you don’t want them all), and there are a bunch of handy options that you can specify like aggregation or percent totals. In addition to the complete datasets, they’ve also created ‘Standard Extracts’ that have the most common variables, if you want just a core subset. While the NHGIS was the best choice for my specific need, the customization abilities in Dexter may fit your needs – and the state-level block group and block data is conveniently broken out from the other files.

Lastly…

There are a few others tools – I’ll give an honorable mention to the Summary File Retrieval tool, which is an Excel plugin that lets you tap directly into the American Community Survey from a spreadsheet. So if you wanted tracts or block groups for a wide area for but a small number of variables (I think 20 is the limit) that could be a winner, provided you’re using Excel 2007 or later and are just looking at the ACS. No dice in my case, as I needed Decennial Census data and use OpenOffice at home.

NYC Geodatabase in Spatialite

Wednesday, February 6th, 2013

I spent much of the fall semester and winter interim compiling and creating the NYC geodatabase (nyc_gdb), a desktop geodatabase resource for doing basic mapping and analysis at a neighborhood level – PUMAs, ZIP Codes / ZCTAs, and census tracts. There were several motivations for doing this. First and foremost, as someone who is constantly introducing new people to GIS it’s a pain sending people to a half dozen different websites to download shapefiles and process basic features and data before actually doing a project. By creating this resource I hoped to lower the hurdles a bit for newcomers; eventually they still need to learn about the original sources and data processing, but this gives them a chance to experiment and see the possibilities of GIS before getting into nitty gritty details.

Second, for people who are already familiar with GIS and who have various projects to work on (like me) this saves a lot of duplicated effort, as the db provides a foundation to build on and saves the trouble of starting from scratch each time.

Third, it gave me something new to learn and will allow me to build a second part to my open source GIS workshops. I finally sat down and hammered away with Spatialite (went through the Spatialite Cookbook from start to finish) and learned spatial SQL, so I could offer a resource that’s open source and will compliment my QGIS workshop. I was familiar with the Access personal geodatabases in ArcGIS, but for the most part these serve as simple containers. With the ability to run all the spatial SQL operations, Spatialite expands QGIS functionality, which was something I was really looking for.

My original hope was to create a server-based PostGIS database, but at this point I’m not set up to do that on my campus. I figured Spatialite was a good alternative – the basic operations and spatial SQL commands are relatively the same, and I figured I could eventually scale up to PostGIS when the time comes.

I also created an identical, MS Access version of the database for ArcGIS users. Once I got my features in Spatialite I exported them all out as shapefiles and imported them all via ArcCatalog – not too arduous as I don’t have a ton of features. I used the SQLite ODBC driver to import all of my data tables from SQLite into Access – that went flawlessly and was a real time saver; it just took a little bit of time to figure out how to set up (but this blog post helped).

The databases are focused on NYC features and resources, since that’s what my user base is primarily interested in. I purposefully used the Census TIGER files as the base, so that if people wanted to expand the features to the broader region they easily could. I spent a good deal of time creating generalized layers, so that users would have the primary water / coastline and large parks and wildlife areas as reference features for thematic maps, without having every single pond and patch of grass to clutter things up. I took several features (schools, subway stations, etc) from the City and the MTA that were stored in tables and converted them to point features so they’re readily useable.

Given that focus, it’s primarily of interest to NYC folks, but I figured it may be useful for others who wish to experiment with Spatialite. I assumed that most people who would be interested in the database would not be familiar with this format, so I wrote a tutorial that covers the database and it’s features, how to add and map data in QGIS, how to work with the data and do SQL / spatial SQL in the Spatialite GUI, and how to map data in ArcGIS using the Access Geodb. It’s Creative Commons, Attribution, Non-Commercial, Share-alike, so feel free to give it a try.

I spent a good amount of time building a process rather than just a product, so I’ll be able to update the db twice a year, as city features (schools, libraries, hospitals, transit) change and new census data (American Community Survey, ZIP Business Patterns) is released. Many of the Census features, as well as the 2010 Census data, will be static until 2020.

American Factfinder Tutorial & Census Geography Updates

Monday, July 23rd, 2012

I’ve been en-meshed in the census lately as I’ve been writing a paper about the American Community Survey. Here are a few a things to share:

  • Since I frequently receive questions about how to use the American Factfinder, I’ve created a brief tutorial with screenshots demonstrating a few ways to navigate it. I illustrate how to download a profile for a single census tract from the American Community Survey, and how to download a table for all ZIP Code Tabulation Areas (ZCTAs) in a county using the 2010 Census.
  • New boundaries for PUMAs based on 2010 census geography have been released; they’re not available from the TIGER web-based interface yet but you get can state-based files from the FTP site. I’ve downloaded the boundaries for New York and there are small changes here and there from the 2000 Census boundaries; not surprising as PUMAs are built from tracts and tract boundaries have changed. One big bonus is that PUMAs now have names associated with them, based on local government suggestions. In NY State they either take the name of counties with some directional element (east, central, south, etc), or the name of MCDs that are contained within them. In NYC they’ve been given the names of community districts.
  • I’ve done some digging through the FAQs at https://askacs.census.gov/ and discovered that the census is going to stick with the old 2000 PUMA boundaries for the next release of the American Community Survey – the 2011 ACS will be released at the end of this year. 2010 PUMAs won’t be used until the 2012 ACS, to be released at the end of 2013.
  • Urban Areas are the other holdovers in the ACS that use 2000 vintage boundaries. The ACS will also transition to the 2010 boundaries for urban areas in the 2012 ACS.
  • In the course of my digging I discovered that the census will begin including ZCTA-level data as part of the 5-year ACS estimates, beginning with the 2011 release this year. 2010 ZCTA boundaries are already available, and 2010 Census data has already been released for ZCTAs. The ACS will use the 2010 vintage ZCTAs for each release until they’re redrawn for 2020.

ACS Trend Reports and Census Geography Guide

Sunday, February 12th, 2012

I recently received my first question from someone who wanted to compare 2005-2007 ACS data with 2008-2010. With the release of the latter, we can make historical comparisons with the three year data for the first time since we have estimates that don’t overlap. We should be able to make some interesting comparisons, since the first set covers the real estate boom years (remember those?) and the second covers the Great Recession. One resource that makes such comparisons relatively painless is over at the Missouri Census Data Center. They’ve put together a really clean and simple interface called the ACS Trends Menu, which allows you to select either two one period estimates or two three period estimates and compare them for several different census geographies – states, counties, MCDs, places, metros, Congressional Districts, PUMAs, and a few others – for the entire US (not just Missouri). The end result is a profile that groups data into the Economic, Demographic, Social, and Housing categories that the Census uses for its Demographic Profile tables. The calculations for change and percent change for the estimates and margins of error are done for you.

Downloading the data is not as straightforward – the links to extract it just brought me some error messages, so it’s still a work in progress. Until then, a simple copy and paste into your spreadsheet of choice will work fine.

ACS Trends Menu

If you like the interface, they’ve created separate ones for downloading profiles from any of the ACS periods or from the 2010 Census. The difference here is that you’re looking at one time frame; not across time periods. The interface and the output are the same, but in these menus you can compare four different geographies at once in one profile. Unlike the Trends reports, both the ACS and 2010 Census profiles have easy, clear cut ways to download the profiles as a PDF or a spreadsheet. If you’re happy with data in a profile format and want an interface that’s a little less confusing to navigate than the American Factfinder, these are all great alternatives (and if you’re building web applications these profiles are MUCH easier to work with – you can easily build permanent links or generate them on the fly).

The US Census Bureau also recently put together a great resource called the Guide to State and Local Census Geography. They provide a census geography overview of each state: 2010 population, land area, bordering states, year of entry into the union, population centroids, and a description of how local government is organized in the state – (i.e. do they have municipal civil divisions or only incorporated cities and unincorporated land, etc). You get counts for every type of geography – how many counties, tracts, ZCTAs, and so on, AND best of all you can download all of this data directly in tab delimited files. Need a list of every county subdivision in a state, with codes, land area, and coordinates? No problem – it’s all there.

2010 American Community Survey Releases

Friday, September 23rd, 2011

The US Census Bureau released the new annual data for the 2010 American Community Survey; this dataset includes an extensive number of demographic, socio-economic, and housing estimates (with margins of error) for all geographic areas in the US that have a population of at least 65,000 people. This is the first ACS survey that is weighted based on the 2010 Census, and that is tabulated entirely on the new 2010 Census geography; exceptions include PUMAs and urban areas, which typically aren’t redrawn until a couple of years after a decennial census is taken. Data for these areas will be reported based on the 2000 Census geography. This will also be the first ACS that is distributed via the new American Factfinder. Previous ACS datasets should be moved to the new Factfinder by the end of this year.

According to the release schedule data for the three year ACS (2008-2010) for areas with at least 20,000 residents will be published in October and the five year ACS (2006-2010) for geography down to census tracts will be released in December. The three year dataset hits a milestone this year, as for the first time we’ll have datasets with mutually exclusive years that can be feasibly compared for historical change (the 2005-2007 dataset versus 2008-2010). It should prove interesting as the earlier dataset represents the end of the brief boom years while the current one depicts the depth of the great recession. There will be some challenges in making comparisons, as the base for weighting the estimates and the geography used to tabulate them is different for each dataset (2000 Census in the earlier dataset versus 2010 Census in the latest one).

Google Maps to Create a Census Finding Aid

Thursday, May 13th, 2010

Yikes! It’s been quite awhile since my last post (the past couple months have been a little tough for me), but I just finished an interesting project that I can share.

I constantly get questions from students who are interested in getting recent demographic and socio-economic profiles for neighborhoods in New York City. The problem is that neighborhoods are not officially defined, so we have to look for a surrogate. The City has created neighborhood-like areas out of census tracts called community districts and they publish profiles for them, but this data is from the decennial census  and not current enough for their needs.  ZIP code data is also only available from the decennial census.

We can use PUMAs (Public Use Microdata Areas) to approximate neighborhoods in large cities, and they are published as part of the 3 year estimates of the American Community Survey. The problem is, in order to look up the data from the census you need to search by PUMA number – there are no qualitative place names. The city and the census have worked together to assign names to neighborhoods as part of the NYC Housing and Vacancy Survey, but this is the only place (I’ve found) that uses these names. You need to look in several places to figure out what the PUMA number and boundaries for an area are and then navigate through the census site to find it. Too much for the average student who visits me at the reference desk or emails me looking for data.

My solution was to create a finding aid in Google maps that tied everything together:

View Larger Map

I downloaded PUMA boundaries from the Census TIGER file site in a shapefile format. I opened them up in ArcGIS and used an excellent script that I downloaded called Export to KML. ArcGIS 9.3 does support KML exports via the toolbox, and there are a number of other scripts and stand-alone programs that can do this (I tried several) but Export to KML was best (assuming you have access to ArcGIS) in terms of the level of customization and the thoroughness of the user documentation. I symbolized the PUMAs in ArcGIS using the colors and line thickness that I wanted and fired up the tool. It allows you to automatically group and color features based on the layer’s symbology. I was able to add a “snippet” to each feature to help identify it (I used the PUMA number as the attribute name and the neighborhood name as my snippet, so both appear in the legend) and added a description that would appear in the pop up window when that feature is clicked. In that description, I added the URL from the ACS census profile page for a particular PUMA – the cool part here is that the URL is consistent and contains the PUMA number. So, I replaced the specific number and inserted the [field] name from the PUMAs attribute table that contained the number. When I did the export, the URLs for each individual feature were created with their PUMA number inserted into the link.

There were a few quirks – I discovered that you can’t automatically display labels on a Google Map without subterfuge, like creating the labels as images and not text. Google Earth (but not Maps) supports labels if you create multi-geometry where you have a point for a label and a polygon for the feature. If you select a labeling attribute on the initial options screen of the Export to KML tool, you create an icon in the middle of each polygon that has a different description pop-up (which I didn’t want so I left it to none and lived without labels). I made my features 75% transparent (a handy feature of Export to KML) so that you could see the underlying Google Map features through the PUMA, but this made the fill AND the lines transparent, making the features too difficult to see. After the export I opened the KML in a text editor and changed the color values for the lines / boundaries by hand, which was easy since the styles are saved by feature group (boroughs) and not by individual feature (pumas). I also manually changed the value of the folder open element (from 0 to 1) so that the feature and feature groups (pumas and boroughs) are expanded by default when someone opens the map.

After making the manual edits, I uploaded the KML to my webserver and pasted the url for it into the Google Maps search box, which overlayed my KML on the map. Then I was able to get a persistent link to the map and code for embedding it into websites via the Google Map Interface. No need to add it to Google My Maps, as I have my own space. One big quirk – it’s difficult to make changes to an existing KML once you’ve uploaded and displayed it. After I uploaded what I thought would be my final version I noticed a typo. So I fixed it locally, uploaded the KML and overwrote the old one. But – the changes I made didn’t appear. I tried reloading and clearing the cache in my browser, but no good – once the KML is uploaded and Google caches it, you won’t see any of your changes until Google re-caches. The conventional wisdom is to change the name of the file every single time – which is pretty dumb as you’ll never be able to have a persistent link to anything. There are ways to circumvent the problem, or you can just wait it out. I waited one day and by the next the file was updated; good enough for me, as I’ll only need to update it once a year.

I’m hosting the map, along with some static PDF maps and a spreadsheet of PUMA names and neighborhood numbers, from the NYC Data LibGuide I created (part of my college’s collection of research guides). If you’re looking for neighborhood names to associate with PUMA numbers for your city, you’ll have to hunt around and see if a local planning agency or non-profit has created them for a project or research study (as the Census Bureau does not create them). For example, the County of Los Angeles Department of Mental Health uses pumas in a large study they did where they associated local place names with each puma.

If you’re interested in dabbling in some KML, there’s Google’s KML tutorial. I’d also recommend The KML Handbook by Josie Wernecke. The catch for any guide to KML is that while all KML elements are supported by Google Earth, there’s only partial support for Google Maps.

Update on Some Data Sources

Saturday, October 31st, 2009

Here’s my last chance to squeeze in a post before the month is over. There have been a lot of changes and updates with some key data sites lately. Here’s a summary:

  • The homepage for gdata, which provides global GIS data that was created as part of UC Berkeley’s Biogeomancer project, has moved to the DIVA-GIS website. DIVA-GIS is a free GIS software project designed specifically for biology and ecology applications, with support from UC Berkeley as well as several other research institutions and independent contributors. It looks like the old download interface has been incorporated into the DIVA-GIS page.
  • The US Census Bureau has recently released its latest iteration of the TIGER shapefiles, the 2009 TIGER/Line Shapefiles. Since they seem to be making annual updates, which has involved changing the URLs around, it may be better to link to their main TIGER shapefile page where you can get to the latest and previous versions of the files.
  • The bureau has released its latest American Community Survey (ACS) data: 2008 annual estimates for geographic areas with 65,000 plus people, and three year 2006-2008 estimates for geographic areas with 20,000 plus people. Available through the American Factfinder.
  • Over the summer, UM Information Studies student Clint Newsom and I created a 2005-2007 PUMA-level New York Metropolitan ACS Geodatabase (NYMAG). It’s available for download on the new Baruch Geoportal, which was re-launched as a public website this past September. It’s a personal geodatabase in Microsoft Access format, so it can only be directly used with ArcGIS. I plan on creating the 2006-2008 version sometime between January and March 2010, and hope to release an Access and SQLite version, as the latest development versions of QGIS now offer direct support for SQlite geodatabases in the Spatialite format (which is awesome!).
  • While it’s not a source for GIS data or attribute tables, it’s still worth mentioning that the CIA World Factbook completely revised their website this past summer. The previous web versions of the factbook took their design cues from the old paper copies of the report. The CIA revamped the entire site and apparently will be using a model of continuous rather than annual updates. It’s a great site for getting country profiles – another good option is the UN World Statistics Pocketbook, which is part of the UNdata page.

Print Composer in QGIS – ACS Puma Maps

Sunday, July 12th, 2009

ny_youth_pumasI wrapped up a project recently where I created some thematic maps of 2005-2007 ACS PUMA level census data for New York State. I decided to do all the mapping in open source QGIS, and was quite happy with the result, which leads me to retract a statement from a post I made last year, where I suggested that QGIS may not be the best for map layout. The end product looked just as good as maps I’ve created in ArcGIS. There were a few tricks and quirks in using the QGIS Print Composer and I wanted to share those here. I’m using QGIS Kore 1.02, and since I was at work I was using Windows XP with SP3 (I run Ubuntu at home but haven’t experimented with all of these steps yet using Linux). Please note that the data in this map isn’t very strong – the subgroup I was mapping was so small that there were large margins of errors for many of the PUMAs, and in many cases the data was suppressed. But the map itself is a good example of what an ACS PUMA map can look like, and is a good example of what QGIS can do.

  • Inset Map – The map was of New York State, but I needed to add an inset map of New York City so the details there were not obscured. This was just a simple matter of using the Add New Map button for the first map, and doing it a second time for the inset. In the item tab for the map, I changed the preview from rectangle to cache and I had maps of NY state in each map. Changing the focus and zoom of the inset map was easy, once I realized that I could use the scroll on my mouse to zoom in and out and the Move Item Content button (hand over the globe) to re-position the extent (you can also manually type in the scale in the map item tab). Unlike other GIS software I’ve experimented with, the extent of the map layout window is not dynamically tied to the data view – which is a good thing! It means I can have these two maps with different extents based on data in one data window. Then it was just a matter of using the buttons to raise or lower one element over another.
  • Legend – Adding the legend was a snap, and editing each aspect of the legend, the data class labels, and the categories was a piece of cake. You can give your data global labels in the symbology tab for the layer, or you can simply alter them in the legend. One quirk for the legend and the inset map – if you give assign a frame outline that’s less than 1.0, and you save and exit your map, QGIS doesn’t remember this setting if when you open your map again – it sets the outline to zero.
  • Text Boxes / Labels – Adding them was straightforward, but you have to make sure that the label box is large enough to grab and move. One annoyance here is, if you accidentally select the wrong item and move your map frame instead of the label, there is no undo button or hotkey. If you have to insert a lot of labels or free text, it can be tiresome because you can’t simply copy and paste the label – you have to create a new one each time, which means you have to adjust your font size and type, change the opacity, turn the outline to zero, etc each time. Also, if the label looks “off” compared to any automatic labeling you’ve done in the data window, don’t sweat it. After you print or export the map it will look fine.
  • North Arrow – QGIS does have a plugin for north arrows, but the arrow appears in the data view and not in the print layout. To get a north arrow, I inserted a text label, went into the font menu, and chose a font called ESRI symbols, which contains tons of north arrows. I just had to make the font really large, and experiment with hitting keys to get the arrow I wanted.
  • Scale Bar – This was the biggest weakness of the print composer. The scale bar automatically takes the unit of measurement from your map, and there doesn’t seem to be an option to convert your measurement units. Which means you’re showing units in feet, meters, or decimal degrees instead of miles or kilometers, which doesn’t make a lot of sense. Since I was making a thematic map, I left the scale bar off. If anyone has some suggestions for getting around this or if I’m totally missing something, please chime in.
  • Exporting to Image – I exported my map to an image file, which was pretty simple. One quirk here – regardless of what you set as your paper size, QGIS will ignore this and export your map out as the optimal size based on the print quality (dpi) that you’ve set (this isn’t unique to QGIS – ArcGIS behaves the same way when you export a map). If you create an image that you need to insert into a report or web page, you’ll have to mess around with the dpi to get the correct size. The map I’ve linked to in this post uses the default 300 dpi in a PNG format.
  • Printing to PDF – QGIS doesn’t have a built in export function for PDF, so you have to use a PDF print driver via your print screen (if you don’t have the Adobe PDF printer or a reasonable facsimile pre-installed, there are a number  of free ones available on sourceforge – PDFcreator is a good one). I tried Adobe and PDFcreator and ran into trouble both times. For some reason when I printed to PDF it was unable to print the polygon layer I had in either the inset map or the primary map (I had a polygon layer of pumas and a point layer of puma centroids showing MOEs). It appeared that it started to draw the polygon layer but then stopped near the top of the map. I fiddled with the internal settings of both pdf drivers endlessly to no avail, and after endless tinkering found the answer. Right before I go to print to pdf, if I selected the inset map, chose the move item content button (hand with globe), used the arrow key to move the extent up one, and then back one to get it to it’s original position, then printed the map, it worked! I have no idea why, but it did the trick. After printing the map once, to print it again you have to re-do this trick. I also noticed that after hitting print, if the map blinked and I could see all the elements, I knew it would work. But, if the map blinked and I momentarily didn’t see the polygon layer, I knew it wouldn’t export correctly.

Despite a few quirks (what software doesn’t have them), I was really happy with the end result and find myself using QGIS more and more for making basic to intermediate maps at work. Not only was the print composer good, but I was also able to complete all of the pre-processing steps using QGIS or another open source tool. I’ll wrap up by giving you the details of the entire process, and links to previous posts where I discuss those particular issues.

I used 2005-2007 American Community Survey (ACS) date from the US Census Bureau, and mapped the data at the PUMA level. I had to aggregate and calculate percentages for the data I downloaded, which required using a number of spreadsheet formulas to calculate new margins of error; (MOEs). I downloaded a PUMA shapefile layer from the US Census Generalized Cartographic Boundary files page, since generalized features were appropriate at the scale I was using. The shapefile had an undefined coordinate system, so I used the Ftools add-on in QGIS I converted the shapefile from single-part to multi-part features. Then I used Ftools to join my shapefile to the ACS data table I had downloaded and cleaned-up (I had to save the data table as a DBF in order to do the join). Once they were joined, I classified the data using natural breaks (I sorted and eyeballed the data and manually created breaks based on where I thought there were gaps). I used the Color Brewer tool to choose a good color scheme, and entered the RGB values in the color / symbology screen. Once I had those colors, I saved them as custom colors so I could use them again and again. Then I used Ftools to create a polygon centroid layer out of my puma/data layer. I used this new point layer to map my margin of error values. Finally, I went into the print composer and set everything up. I exported my maps out as PNGs, since this is a good image format for preserving the quality of the maps, and as PDFs.


Copyright © 2014 Gothos. All Rights Reserved.
No computers were harmed in the 0.482 seconds it took to produce this page.

Designed/Developed by Lloyd Armbrust & hot, fresh, coffee.