Posts Tagged ‘grass’

Thiessen Polygons and Listing Neighboring Features

Monday, January 2nd, 2012

I was helping someone with a project recently that I thought would be straightforward but turned out to be rather complex. We had a list of about 10,000 addresses that had to be plotted as coordinates, and then we needed to create Thiessen or Voroni polygons for each point to create market areas. Lastly we needed to generate an adjacency table or list of neighbors; for every polygon list all the neighboring polygons.

For step one I turned to the USC Geocoding service to geocode the addresses; I became a partner a ways back so I could batch geocode datasets for students and faculty on my campus. Once I had coordinates I plotted them in ArcGIS 10 (and learned that the Add XY data feature had been moved to File > Add Data > Add XY Data). Step 2 seemed easy enough; in Arc you go to ArcToolbox > Analysis Tools > Proximity > Create Thiessen Polygons. This creates a polygon for each point and assigns the attributes of each point to the polygon.

I hit a snag with Step 3 – Arc didn’t have a tool for generating the adjacency table. After a thorough search of the ESRI and Stack Exchange forums, I stumbled on the Find Adjacent Features Script by Ken Buja which did exactly what I wanted in ArcGIS 9.2 and 9.3, but not in 10. I had used this script before on a previous project, but I’ve since upgraded and can’t go back. So I searched some more until I found the Find Adjacent & Neighboring Polygons Tool by cmaene. I was able to add this custom toolbox directly to ArcToolbox, and it did exactly what I wanted in ArcGIS 10. I get to select the unique identifying field, and for every ID I get a list of the IDs of the neighboring polygons in a text file (just like Ken’s tool). This tool also had the option of saving the list of neighbors for each feature directly in the attribute table of a shapefile (which is only OK for small files with few neighbors; fields longer than 254 characters get truncated), and it gave you the option of listing neighbors to the next degree (a list of all the neighbor’s neighbors).

Everything seemed to run fine, so I re-ran the tool on a second set of Thiessen polygons that I had clipped with an outline of the US to create something more geographically realistic (so polygons that share a boundary only in the ocean or across the Great Lakes are not considered neighbors).

THEN – TROUBLE. I took some samples of the output table and checked the neighbors of a few features visually in Arc. I discovered two problems. First, I was missing about a thousand records or so in the output. When I geocoded them I couldn’t get a street-level address match for every record; the worse case scenario was a plot to the ZCTA / ZIP code centroid for the address, which was an acceptable level of accuracy for this project. The problem is that if there are many point features plotted to the same coordinate (because they share the same ZIP), a polygon was created for one feature and the overlapping ones fell away (you can’t have overlapping Thiessen polygons). Fortunately this also wasn’t an issue for the person I was helping; we just needed to join the output table back to the master one to track which ones fell out and live with the result.

The bigger problem was the output was wrong. I discovered that the neighbor list for most of the features I checked, especially polygons that had borders on the outer edge of the space, had incomplete lists; each feature had several (and in some cases, all) neighbors missing. Instead of using a shapefile of Thiessen’s I tried running the tool on polygons that I generated as feature classes within an Arc geodatabase, and got the same output. For the heck of it I tried dissolving all the Thiessen’s into one big polygon, and when I did that I noticed that I had orphaned lines and small gaps in what should have been one big, solid rectangle. I tried checking the geometry of the polygons and there were tons of problems. This led me to conclude that Arc did a lousy job when constructing the topology of the polygons, and the neighbor tool was giving me bad output as a result.

Since I’ve been working more with GRASS, I remembered that GRASS vectors have strict topology rules, where features have shared boundaries (instead of redundant overlapping ones). So I imported my points layer from a shapefile into GRASS and then used the v.voroni tool to create the polygons. The geometry looked sound, the attributes of each point were assigned to a polygon, and for overlapping points one polygon was created and attributes of the shared points were dumped. I exported the polygons out as a shapefile and brought them back into Arc, ran the Find Adjacent & Neighboring Polygons tool, spot checked the neighbors of some features, and voila! The output was good. I clipped these polygons with my US outline, ran the tool again, and everything checked out.

Morals of this story? When geocoding addresses consider how the accuracy of the results will impact your project. If a tool or feature doesn’t exist assume that someone else has encountered the same problem and search for solutions. Never blindly accept output; take a sample and do manual checks. If one tool or piece of software doesn’t work, try exporting your data out to something else that will. Open source software and Creative Commons tools can save the day!

Footnote – apparently it’s possible to create lists of adjacent polygons in GRASS using the sides option in, although it isn’t clear to me how this is accomplished; the documentation talks about categories of areas on the right and left of a boundary, but not on all sides of an area. Since I already had a working solution I didn’t investigate further.

Giving GRASS GIS a Try

Saturday, July 30th, 2011

I’ve been taking the plunge in learning GRASS GIS this summer, as I’ve been working at home (and thus don’t have access to ArcGIS) on a larger and more long-term project (and thus don’t want to mess around with shapefiles and csv tables). I liked the idea of working with GRASS vectors, as everything is contained in one folder and all my attributes are stored rather neatly in a SQLite database.

I started out using QGIS to create my mapset and to connect it to my SQLite db which I had created and loaded with some census data. Then I thought, why not give the GRASS interface a try? I started using the newer Python-wx GUI and as I’m trying different things, I bounce back and forth between using the GUI for launching commands and the command line for typing them in – all the while I have Open Source GIS A GRASS GIS Approach at my side and the online manual a click away . So far, so good.

I loaded and cleaned a shapefile with the GRASS GUI (the GUI launches,, abd v.clean) and it’s attributes were loaded into the SQLite database I had set (using db.connect – need to do this otherwise a DBF is created by default for storing attributes). Then I had an age-old task to perform – the US Census FIPS / ANSI codes where stored in separate fields, and in order to join them to my attribute tables I had to concatenate them. I also needed to append some zeros to census tract IDs that lacked them – FIPS codes for states are two digits long, counties are three digits, and tracts are between four and six digits, but to standardize them four digit tracts should have two zeros appended.

Added the new JOIN_ID column using v.db.addcol, then did the following using db.execute:

UPDATE tracts_us99_nad83


UPDATE tracts_us99_nad83
WHERE length(JOIN_ID)=9

So this:

01 077 0113
01 077 0114
01 077 011502
01 077 011602

Becomes this:


db.execute GRASS GUI

I could have done this a few different ways from within GRASS: instead of the separate v.db.addcol command I could have written a SQL statement in db.execute to alter the table and add a column. Or, instead of db.execute I could have used the v.db.update command.

My plan is to use GRASS for geoprocessing and analysis (will be doing some buffering, geographic selection, and basic spatial stats), and QGIS for displaying and creating final maps. I just used to transform an attribute table with coordinates in my db to a point vector. But now I’m realizing that in order to draw buffers, I’ll need a projected coordinate system that uses meters or feet, as inputting degrees for a buffer distance (for points throughout the US) isn’t going to work too well. I’m also having trouble figuring out how to link my attribute data to my vectors – I can easily use v.db.join to fuse the two together, but there is a way to link them more loosely using the CAT ID number for each vector, but I’m getting stuck. We’ll see how it goes.

Some final notes – since I’m working with large datasets (every census tract in the US) and GRASS uses a topological data model where common node and boundaries between polygons are shared, geoprocessing can take awhile. I’ve gotten in the habit of testing things out on a small subset of my vectors, and once I get it right I run the process on the total set.

Lastly, there are times where I read about commands in the manual and for the life of me I can’t find them in the GUI – for example, finding a menu to delete (i.e. permanently remove) layers. But if you type the command without any of its parameters in the command line (in this case, g.remove) it will launch the right window in the GUI.

GRASS GIS Interface

Open Source GIS for Thematic Mapping

Wednesday, September 3rd, 2008

I’ve been exploring the open source GIS alternatives, and have been pretty overwhelmed by the number of choices. For an overview of what’s out there, you can check out The State of Open Source GIS (a large pdf) from Refractions Research, and a series of comparison tables assembled by a geography prof at the Univ of Calgary. You can also search the web for “Open Source GIS”, and you’ll find a number of blogs, forums, lists, and sites that cover it in some detail.

There are a lot of alternatives, and many of them are geared to a particular purpose: raster vs vector, viewer vs map making vs analysis, etc. I’m looking for something that’s cross-platform that I can use for vector-based thematic mapping, and something that I can easily introduce and teach to novices. I need software that allows me to: work with common formats like shapefiles, transform projections, add data tables and join them to shapefiles, symbolize data with a good selection of color schemes, classify data with several methods including natural breaks, add labels, and produce maps as pdfs, images, or in print. Preferably, I want something that has strong map layout capabilities. I don’t want to use a graphic design package for final map creation.

I’ve looked at five options that are all great, but there isn’t one that covers everything I’m looking for :

  • GRASSGRASS – an established and powerful GIS. Setting up the environment for working with files takes some getting used to (you can’t simply open a window and start adding files), and the native graphic interface is complex. GRASS only works with it’s own native file formats, so you have to import everything into that format first. GRASS was really designed for analysis and modeling (tasks for which it excels), but is not the best choice for basic thematic mapping, or for novices.
  • QGISQGIS – one of the strongest attributes of QGIS is that it harnesses the power of GRASS in a more user friendly environment. If you use the GRASS plug-in and work with GRASS datasets, you can do table joins and have access to several classification methods. If you use QGIS on it’s own (working with shapefiles or raster images), you won’t have these capabilities. It seems that projection transformations are limited to a certain subset of projections, and common global thematic map projections like Robisnon or Winkel Tripel are missing (you would have to create them manually). QGIS does have a print layout screen bu color schemes are limited, and labeling isn’t good – it automatically labels every single polygon in a multi-part layer, and the only way to turn it off is via a hack. QGIS is a great viewer (particularly for rasters) and is a good alternative GRASS front end, but probably won’t be your choice for making high-quality thematic maps.
  • gvSIGgvSIG – billed as an alternative to the old ArcView 3.x, gvSIG lives up to this reputation. A map project has separate, defined areas for data views, maps, and data tables. You can do projection transformations and it does support EPSG and ESRI, but the process is a little confusing. The map window has a default projection, but when you add a layer it uses the layer’s projection but keeps the windows projection, and it’s difficult to figure out what projection the layer is in (if this makes any sense!) You can do table joins, and there is a good selection of color schemes for classification and several classification methods. It is the only one that I’ve seen that has natural breaks as an option. It also has the best map layout compared to the other software I’ve looked at and you can export maps out to a number of formats. The biggest weakness is labeling, which is very rudimentary. It doesn’t place labels in the center of polygons, but offsets them slightly (appropriate for points but not polygons), and has no conflict detection. It does allow you to create annotation layers, and improvements are in the works for the next version. Since the software was created in Spain, you’ll occasionally find a menu here or a button there that was missed in translation to English.
  • udiguDIG – the windows and toolbars are not as GIS-like as the other software options, which makes findings things a little tougher. Udig has good projection transformation support, great selection of color schemes for symbolization, and excellent label placement (the best, by far), with conflict detection and different placement options. Like QGIS, the map layout screen is located under the print option. The templates are rudimentary but are easy to use and get the job done. The detractors here are data classification (natural breaks is not a choice) and table joins – there is no option for adding and joining attribute tables to spatial files whatsoever.
  • openJUMPOpenJUMP – has a great interface, particularly for working with attribute tables, good selection of color schemes for symbolization and good label placement features. Table joins are supported for text files (no DBFs). Equal Intervals is the only data classification method available (no natural breaks), and projection transformation is only available via a plug-in. The bigger issue is that there is no print option or map layout. These are available through a plugin as well, but I haven’t tried installing it yet. Plugin installation requires altering or over-writing some of the program files, and I was dubious to try.

There are always work-arounds to fill in the features that are missing. For file and projection transformations, you can always use the GDAL / OGR tools, which I would recommend (although annoyingly I can’t seem to get the projection transformations to Robinson or Winkel to work). The NCEAS at UC Santa Barbara has a nice wiki with examples of commands. Table joins can be accomplished outside the GIS using a database package. You could use a spreadsheet or stats package to figure out break points for data, and change the breaks manually in the GIS. For labels, you can export labels out as annotation, or you can convert polygons to points and use the points layer as a label layer (just make the points invisible but turn the labeling on). Then you can edit that file and move the labels around to get them in the right position. If you make lots of thematic maps for the same area, you can use the same label file over and over again.

This isn’t an exhaustive overview and I haven’t created a consistent procedure for testing all the options. There are a few other elements that I would also want to explore (How good is the support for legends? Can you normalize data or calculate new attribute fields? Can you add a graticule? Change the background color for a view? Convert a table of XY coordinates to a point layer? Is there a geoprocessing tool for generalizing layers?)

If pressed, which option would I choose? I’m inclined to go with gvSIG, which really reminds me of the old ArcView, and would hope that label placement improves with the next version – each of these software packages are constantly improving works in progress. Perhaps it’s best to regard these software options as different tools in one toolbox. If I need to make a thematic map of the world, showing population by country with only a few labels, then I’ll go with gvSIG, where I can easily do table joins, use natural breaks, and have lots of colors at my disposal. If I need to make basic reference maps, say ZIP codes of the NYC metro area, then I’ll go with uDig, as I’ll be able to quickly label them all and still have good color scheme choices. If I need to do geoprocessing or analysis, I’ll also have to evaluate the options – maybe look at plugins, or hunker down and learn GRASS.

In the end, I’m certainly grateful that there are solid, open, and free choices out there and that people are freely giving their time and talent to everyone’s benefit. Why consider these alternatives? I’ll cover that in my next post.

Copyright © 2017 Gothos. All Rights Reserved.
No computers were harmed in the 0.355 seconds it took to produce this page.

Designed/Developed by Lloyd Armbrust & hot, fresh, coffee.