Posts Tagged ‘thematic mapping’

Print Composer in QGIS – ACS Puma Maps

Sunday, July 12th, 2009

I wrapped up a project recently where I created some thematic maps of 2005-2007 ACS PUMA level census data for New York State. I decided to do all the mapping in open source QGIS, and was quite happy with the result, which leads me to retract a statement from a post I made last year, where I suggested that QGIS may not be the best for map layout. The end product looked just as good as maps I’ve created in ArcGIS. There were a few tricks and quirks in using the QGIS Print Composer and I wanted to share those here. I’m using QGIS 1.0.2 (Kore), and since I was at work I was using Windows XP with SP3 (I run Ubuntu at home but haven’t experimented with all of these steps yet using Linux). Please note that the data in this map isn’t very strong – the subgroup I was mapping was so small that there were large margins of error for many of the PUMAs, and in many cases the data was suppressed. But the map itself is a good example of what an ACS PUMA map can look like, and of what QGIS can do.

  • Inset Map – The map was of New York State, but I needed to add an inset map of New York City so the details there were not obscured. This was just a simple matter of using the Add New Map button for the first map, and doing it a second time for the inset. In the item tab for the map, I changed the preview from rectangle to cache and I had maps of NY state in each map. Changing the focus and zoom of the inset map was easy, once I realized that I could use the scroll on my mouse to zoom in and out and the Move Item Content button (hand over the globe) to re-position the extent (you can also manually type in the scale in the map item tab). Unlike other GIS software I’ve experimented with, the extent of the map layout window is not dynamically tied to the data view – which is a good thing! It means I can have these two maps with different extents based on data in one data window. Then it was just a matter of using the buttons to raise or lower one element over another.
  • Legend – Adding the legend was a snap, and editing each aspect of the legend, the data class labels, and the categories was a piece of cake. You can give your data global labels in the symbology tab for the layer, or you can simply alter them in the legend. One quirk for the legend and the inset map – if you assign a frame outline that’s less than 1.0, and you save and exit your map, QGIS doesn’t remember this setting when you open your map again – it sets the outline to zero.
  • Text Boxes / Labels – Adding them was straightforward, but you have to make sure that the label box is large enough to grab and move. One annoyance here is, if you accidentally select the wrong item and move your map frame instead of the label, there is no undo button or hotkey. If you have to insert a lot of labels or free text, it can be tiresome because you can’t simply copy and paste the label – you have to create a new one each time, which means you have to adjust your font size and type, change the opacity, turn the outline to zero, etc each time. Also, if the label looks “off” compared to any automatic labeling you’ve done in the data window, don’t sweat it. After you print or export the map it will look fine.
  • North Arrow – QGIS does have a plugin for north arrows, but the arrow appears in the data view and not in the print layout. To get a north arrow, I inserted a text label, went into the font menu, and chose a font called ESRI symbols, which contains tons of north arrows. I just had to make the font really large, and experiment with hitting keys to get the arrow I wanted.
  • Scale Bar – This was the biggest weakness of the print composer. The scale bar automatically takes the unit of measurement from your map, and there doesn’t seem to be an option to convert your measurement units. Which means you’re showing units in feet, meters, or decimal degrees instead of miles or kilometers, which doesn’t make a lot of sense. Since I was making a thematic map, I left the scale bar off. If anyone has some suggestions for getting around this or if I’m totally missing something, please chime in.
  • Exporting to Image – I exported my map to an image file, which was pretty simple. One quirk here – regardless of what you set as your paper size, QGIS will ignore this and export your map out as the optimal size based on the print quality (dpi) that you’ve set (this isn’t unique to QGIS – ArcGIS behaves the same way when you export a map). If you create an image that you need to insert into a report or web page, you’ll have to mess around with the dpi to get the correct size. The map I’ve linked to in this post uses the default 300 dpi in a PNG format.
  • Printing to PDF – QGIS doesn’t have a built in export function for PDF, so you have to use a PDF print driver via the print dialog (if you don’t have the Adobe PDF printer or a reasonable facsimile pre-installed, there are a number of free ones available on sourceforge – PDFcreator is a good one). I tried Adobe and PDFcreator and ran into trouble both times. For some reason, when I printed to PDF it was unable to print the polygon layer I had in either the inset map or the primary map (I had a polygon layer of pumas and a point layer of puma centroids showing MOEs). It appeared that it started to draw the polygon layer but then stopped near the top of the map. I fiddled with the internal settings of both pdf drivers to no avail, and after a lot of tinkering found the answer. Right before printing to pdf, if I selected the inset map, chose the Move Item Content button (hand with globe), used the arrow key to move the extent up one, and then back one to get it to its original position, then printed the map, it worked! I have no idea why, but it did the trick. After printing the map once, you have to re-do this trick to print it again. I also noticed that after hitting print, if the map blinked and I could see all the elements, I knew it would work. But if the map blinked and I momentarily didn’t see the polygon layer, I knew it wouldn’t export correctly.
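To make the image export point above concrete, here is a minimal Python sketch of the usual relationship between layout size, dpi, and pixel dimensions. It assumes the exported image is simply the layout size in inches multiplied by the dpi, which the post does not confirm for QGIS; the function names and the page size are hypothetical, chosen only for illustration.

```python
def export_pixels(width_in, height_in, dpi):
    """Return (width, height) in pixels for a layout size in inches at a given dpi."""
    return round(width_in * dpi), round(height_in * dpi)


def dpi_for_width(width_in, target_px):
    """Return the dpi at which a layout width_in inches wide exports target_px wide."""
    return target_px / width_in


# Hypothetical example: an 11 x 8.5 inch landscape layout.
print(export_pixels(11, 8.5, 300))  # (3300, 2550)
print(dpi_for_width(11, 1100))      # 100.0
```

Working backwards with something like dpi_for_width is one way to avoid trial and error when an image has to fit a fixed width in a report or web page.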

Despite a few quirks (what software doesn’t have them), I was really happy with the end result and find myself using QGIS more and more for making basic to intermediate maps at work. Not only was the print composer good, but I was also able to complete all of the pre-processing steps using QGIS or another open source tool. I’ll wrap up by giving you the details of the entire process, and links to previous posts where I discuss those particular issues.

I used 2005-2007 American Community Survey (ACS) data from the US Census Bureau, and mapped the data at the PUMA level. I had to aggregate and calculate percentages for the data I downloaded, which required using a number of spreadsheet formulas to calculate new margins of error (MOEs). I downloaded a PUMA shapefile layer from the US Census Generalized Cartographic Boundary files page, since generalized features were appropriate at the scale I was using. The shapefile had an undefined coordinate system, so I used the Ftools add-on in QGIS to define one. I then converted the shapefile from single-part to multi-part features. Then I used Ftools to join my shapefile to the ACS data table I had downloaded and cleaned up (I had to save the data table as a DBF in order to do the join). Once they were joined, I classified the data using natural breaks (I sorted and eyeballed the data and manually created breaks based on where I thought there were gaps). I used the Color Brewer tool to choose a good color scheme, and entered the RGB values in the color / symbology screen. Once I had those colors, I saved them as custom colors so I could use them again and again. Then I used Ftools to create a polygon centroid layer out of my puma/data layer. I used this new point layer to map my margin of error values. Finally, I went into the print composer and set everything up. I exported my maps out as PNGs, since this is a good image format for preserving the quality of the maps, and as PDFs.
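The post doesn’t list the spreadsheet formulas used for the new margins of error, so the following is only a sketch of how such a calculation might look in Python. It assumes the Census Bureau’s standard approximation formulas for derived ACS estimates (the MOE of a sum, and the MOE of a proportion where the numerator is a subset of the denominator); the function names and the sample numbers are made up for illustration.

```python
import math


def moe_sum(moes):
    """MOE of a sum of estimates: square root of the sum of the squared MOEs."""
    return math.sqrt(sum(m * m for m in moes))


def moe_proportion(num, den, moe_num, moe_den):
    """MOE of the proportion num / den, where num is a subset of den."""
    p = num / den
    under_root = moe_num ** 2 - (p ** 2) * (moe_den ** 2)
    if under_root < 0:
        # When the value under the root is negative, fall back to the
        # ratio formula, which adds the second term instead.
        under_root = moe_num ** 2 + (p ** 2) * (moe_den ** 2)
    return math.sqrt(under_root) / den


# Made-up example: two categories combined into one subgroup.
combined_moe = moe_sum([30, 40])
print(combined_moe)  # 50.0

# Made-up example: subgroup of 200 (+/- 50) out of 1000 (+/- 100).
print(round(moe_proportion(200, 1000, 50, 100), 4))
```

Multiplying the proportion and its MOE by 100 gives the percentage and its margin, which is the kind of value that could then be mapped on the centroid layer.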

Mapping ACS Census Data for Urban Areas With PUMAs

Tuesday, December 16th, 2008

The NY Times wrote a story recently based on the new 3 year ACS data that the Census Bureau released a couple weeks ago (see my previous post for details). They created some maps for this story using geography that I would never have thought to use.

Outside of Decennial Census years, it is difficult to map demographic patterns and trends within large cities as you’ll typically get one figure for the entire city and you can’t get a break down for areas within. Data for areas like census tracts and zip codes is not available outside the ten-year census (yet), and large cities exist as single municipal divisions that aren’t subdivided. New York City is an exception, as it is the only city composed of several counties (boroughs) and thus can be subdivided. But the borough data still doesn’t reveal much about patterns within the city.

The NY Times used PUMAs – Public Use Microdata Areas – to subdivide the city into smaller areas and mapped rents and income. PUMAs are aggregations of census tracts and were designed for aggregating and mapping public microdata. Microdata consists of a selection of actual individual responses from the census or survey with the personal identifying information (name, address, etc.) stripped away. Researchers can build their own indicators from scratch, aggregate them to PUMAs, and then figure out the degree to which the sample represents the entire population.

Since PUMAs have a large population, the new three-year ACS data is available at the PUMA level. The PUMAs essentially become surrogates for neighborhoods or clusters of neighborhoods, and in fact several NYC agencies have created districts or neighborhoods based on these boundaries for statistical or planning purposes. This wasn’t the original intent for creating or using PUMAs, but it’s certainly a useful application of them.

You can check out the NY Times article and maps here – Census Shows Growing Diversity in New York City (12/9/08). I tested ACS / PUMA mapping out myself by downloading some PUMA shapefiles from the Census Bureau’s Generalized Cartographic Boundaries page, grabbing some of the new annual ACS data from the American Factfinder, and creating a map of Philly. In the map below, you’re looking at 2005-2007 averaged data that shows the percentage of residents who lived in their current home last year. If you know Philly, you can see that the PUMAs do a reasonable job of approximating regions in the city – South Philly, Center City, West Philly, etc.

The problem I ran into here was that data did not exist for all of the PUMAs – in this case, South Philly and half of North Philly had values of zero. According to the footnotes on the ACS site, there were no values for these areas because “no sample observations or too few sample observations were available to compute an estimate, or a ratio of medians cannot be calculated because one or both of the median estimates falls in the lowest interval or upper interval of an open-ended distribution”. So even though the PUMA geography is generally available, there still may be cases where data for particular variables for individual geographies is missing.

Just for the heck of it, I tried looking at the annual ACS data, which is limited to more populated areas (annual estimates require a population of at least 65k, whereas 3 year estimates cover areas with at least 20k), and even more data was missing (in this instance, all the areas in the northeast). Even though PUMAs have a minimum population of 100k people, the ACS sampling is county based. So even if the sample size for a county is ideal, there may not be enough sample observations for individual places within the county to compute an estimate. At least, that’s my guess. Regardless, it’s still worth looking at for the city and data you’re interested in.

ACS Data for Philly Pumas

Open Source GIS for Thematic Mapping

Wednesday, September 3rd, 2008

I’ve been exploring the open source GIS alternatives, and have been pretty overwhelmed by the number of choices. For an overview of what’s out there, you can check out The State of Open Source GIS (a large pdf) from Refractions Research, and a series of comparison tables assembled by a geography prof at the Univ of Calgary. You can also search the web for “Open Source GIS”, and you’ll find a number of blogs, forums, lists, and sites that cover it in some detail.

There are a lot of alternatives, and many of them are geared to a particular purpose: raster vs vector, viewer vs map making vs analysis, etc. I’m looking for something that’s cross-platform that I can use for vector-based thematic mapping, and something that I can easily introduce and teach to novices. I need software that allows me to: work with common formats like shapefiles, transform projections, add data tables and join them to shapefiles, symbolize data with a good selection of color schemes, classify data with several methods including natural breaks, add labels, and produce maps as pdfs, images, or in print. Preferably, I want something that has strong map layout capabilities. I don’t want to use a graphic design package for final map creation.

I’ve looked at five options that are all great, but there isn’t one that covers everything I’m looking for:

  • GRASS – an established and powerful GIS. Setting up the environment for working with files takes some getting used to (you can’t simply open a window and start adding files), and the native graphic interface is complex. GRASS only works with its own native file formats, so you have to import everything into that format first. GRASS was really designed for analysis and modeling (tasks at which it excels), but is not the best choice for basic thematic mapping, or for novices.
  • QGIS – one of the strongest attributes of QGIS is that it harnesses the power of GRASS in a more user friendly environment. If you use the GRASS plug-in and work with GRASS datasets, you can do table joins and have access to several classification methods. If you use QGIS on its own (working with shapefiles or raster images), you won’t have these capabilities. It seems that projection transformations are limited to a certain subset of projections, and common global thematic map projections like Robinson or Winkel Tripel are missing (you would have to create them manually). QGIS does have a print layout screen but color schemes are limited, and labeling isn’t good – it automatically labels every single polygon in a multi-part layer, and the only way to turn it off is via a hack. QGIS is a great viewer (particularly for rasters) and is a good alternative GRASS front end, but probably won’t be your choice for making high-quality thematic maps.
  • gvSIG – billed as an alternative to the old ArcView 3.x, gvSIG lives up to this reputation. A map project has separate, defined areas for data views, maps, and data tables. You can do projection transformations and it does support EPSG and ESRI, but the process is a little confusing. The map window has a default projection, but when you add a layer it uses the layer’s projection while keeping the window’s projection, and it’s difficult to figure out what projection the layer is in (if this makes any sense!). You can do table joins, and there is a good selection of color schemes for classification and several classification methods. It is the only one that I’ve seen that has natural breaks as an option. It also has the best map layout compared to the other software I’ve looked at and you can export maps out to a number of formats. The biggest weakness is labeling, which is very rudimentary. It doesn’t place labels in the center of polygons, but offsets them slightly (appropriate for points but not polygons), and has no conflict detection. It does allow you to create annotation layers, and improvements are in the works for the next version. Since the software was created in Spain, you’ll occasionally find a menu here or a button there that was missed in translation to English.
  • uDig – the windows and toolbars are not as GIS-like as the other software options, which makes finding things a little tougher. uDig has good projection transformation support, a great selection of color schemes for symbolization, and excellent label placement (the best, by far), with conflict detection and different placement options. Like QGIS, the map layout screen is located under the print option. The templates are rudimentary but are easy to use and get the job done. The detractors here are data classification (natural breaks is not a choice) and table joins – there is no option for adding and joining attribute tables to spatial files whatsoever.
  • OpenJUMP – has a great interface, particularly for working with attribute tables, a good selection of color schemes for symbolization and good label placement features. Table joins are supported for text files (no DBFs). Equal Intervals is the only data classification method available (no natural breaks), and projection transformation is only available via a plug-in. The bigger issue is that there is no print option or map layout. These are available through a plugin as well, but I haven’t tried installing it yet. Plugin installation requires altering or over-writing some of the program files, and I was hesitant to try.

There are always work-arounds to fill in the features that are missing. For file and projection transformations, you can always use the GDAL / OGR tools, which I would recommend (although annoyingly I can’t seem to get the projection transformations to Robinson or Winkel to work). The NCEAS at UC Santa Barbara has a nice wiki with examples of commands. Table joins can be accomplished outside the GIS using a database package. You could use a spreadsheet or stats package to figure out break points for data, and change the breaks manually in the GIS. For labels, you can export labels out as annotation, or you can convert polygons to points and use the points layer as a label layer (just make the points invisible but turn the labeling on). Then you can edit that file and move the labels around to get them in the right position. If you make lots of thematic maps for the same area, you can use the same label file over and over again.
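As one illustration of working out break points outside the GIS, here is a small Python sketch. It is not the Jenks natural breaks algorithm; it is a simple heuristic that places breaks in the widest gaps of the sorted data, which roughly mimics sorting a column and eyeballing where the gaps are. The function name and the sample values are hypothetical.

```python
def gap_breaks(values, n_classes):
    """Return n_classes - 1 break points placed in the widest gaps of the sorted data."""
    data = sorted(set(values))
    if n_classes < 2 or len(data) < n_classes:
        raise ValueError("need n_classes >= 2 and at least n_classes distinct values")
    # Pair each gap with the index of its lower neighbour.
    gaps = [(data[i + 1] - data[i], i) for i in range(len(data) - 1)]
    # Keep the widest gaps, then return their midpoints in ascending order.
    widest = sorted(gaps, reverse=True)[: n_classes - 1]
    return sorted((data[i] + data[i + 1]) / 2 for _, i in widest)


# Made-up percentages with two obvious gaps.
sample = [2, 3, 4, 10, 11, 12, 25, 26]
print(gap_breaks(sample, 3))  # [7.0, 18.5]
```

The resulting values can then be typed in as manual class breaks in whichever package is doing the mapping. For a real natural breaks classification, a stats package that implements Jenks would be the better route.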

This isn’t an exhaustive overview and I haven’t created a consistent procedure for testing all the options. There are a few other elements that I would also want to explore (How good is the support for legends? Can you normalize data or calculate new attribute fields? Can you add a graticule? Change the background color for a view? Convert a table of XY coordinates to a point layer? Is there a geoprocessing tool for generalizing layers?)

If pressed, which option would I choose? I’m inclined to go with gvSIG, which really reminds me of the old ArcView, and would hope that label placement improves with the next version – each of these software packages is a constantly improving work in progress. Perhaps it’s best to regard these software options as different tools in one toolbox. If I need to make a thematic map of the world, showing population by country with only a few labels, then I’ll go with gvSIG, where I can easily do table joins, use natural breaks, and have lots of colors at my disposal. If I need to make basic reference maps, say ZIP codes of the NYC metro area, then I’ll go with uDig, as I’ll be able to quickly label them all and still have good color scheme choices. If I need to do geoprocessing or analysis, I’ll also have to evaluate the options – maybe look at plugins, or hunker down and learn GRASS.

In the end, I’m certainly grateful that there are solid, open, and free choices out there and that people are freely giving their time and talent to everyone’s benefit. Why consider these alternatives? I’ll cover that in my next post.


Copyright © 2017 Gothos. All Rights Reserved.