Posts Tagged ‘GIS’

Natural Earth Vector and Raster Data

Tuesday, December 15th, 2009

I haven’t been posting regularly as I’ve been swamped this semester – but now that it’s coming to an end I should be able to crank out a post or two each month.

I recently saw a message on Maps-L about a new GIS data source, Natural Earth, and just got around to taking a look at it. It’s run by a volunteer organization dedicated to providing free, integrated, public domain map layers for producing high-quality maps at small scales. They have a pretty comprehensive website that includes a blog, feature list, contributor information, and details on how to volunteer.

Natural Earth provides smooth, generalized vector and raster layers at three scales: 1:10m, 1:50m, and 1:110m. My screenshot of the Delmarva peninsula shows the distinctions (beige area is 110m, red line is 50m, and blue line is 10m).

[Image: nat_earth]

Having a choice of scales with vector and raster data layers from the same source is a huge plus (many other country-level boundary files available on the web are detailed and suitable for large scale maps, but look messy when you zoom out to a smaller scale). Natural Earth also provides outlines for land and water (including legal water boundaries for all the Pacific islands), hydrographic features generalized to the different scales, ice shelves, urban areas, and several lat/long grid line layers.

For country boundaries they’ve gotten around the tangled issue of country definitions by providing different layers for different definitions, so you can choose the one that’s most appropriate – sovereign states (so, Greenland would be part of the Denmark polygon, Alaska and Puerto Rico part of the US, and French Guiana part of France), countries (Greenland separate from the Denmark polygon, Puerto Rico separate from the US, Alaska part of the US, and French Guiana part of France), and subunits (each place its own polygon). As you move down this hierarchy, places are linked back to their whole (so there are fields in the subunit file that list which country and sovereign state it’s part of).

At this point subdivisions (states / provinces) are only provided for the US and Canada. They do provide some descriptive metadata for each layer on the website, but the metadata doesn’t follow any standardized format for geographic data. The biggest missing link is unique identifiers – none of the countries have ISO or FIPS codes, so there aren’t any fields to join attribute data to for thematic mapping (except country name, which never works smoothly given the amount of variation with names).
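
To show why joining on country name rarely works smoothly, here is a minimal Python sketch. The alias table, the place names, and the numeric values are all invented placeholders for illustration; none of them come from the Natural Earth attribute tables.

```python
import unicodedata

# Hypothetical alias table: maps name variants to one agreed spelling.
ALIASES = {
    "united states of america": "united states",
    "usa": "united states",
    "cote d'ivoire": "ivory coast",
}

def normalize(name):
    """Lowercase, strip accents and extra whitespace, then apply aliases."""
    text = unicodedata.normalize("NFKD", name)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = " ".join(text.lower().replace("’", "'").split())
    return ALIASES.get(text, text)

def join_by_name(layer_names, table):
    """Return (matched, unmatched) when joining a data table to layer names."""
    lookup = {normalize(key): value for key, value in table.items()}
    matched, unmatched = {}, []
    for name in layer_names:
        key = normalize(name)
        if key in lookup:
            matched[name] = lookup[key]
        else:
            unmatched.append(name)
    return matched, unmatched

# Placeholder names and values, only to exercise the join.
layer = ["United States", "Côte d'Ivoire", "Myanmar"]
data = {"USA": 1, "Ivory Coast": 2, "Burma": 3}
matched, unmatched = join_by_name(layer, data)
print(matched)
print(unmatched)
```

Anything that lands in the unmatched list has to be fixed by hand or added to the alias table, which is exactly the chore that a shared ISO or FIPS code field would avoid.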

Overall this looks like a great resource. Vector data is in shapefile format, raster data is in TIFF, and everything is defined as simple WGS 84, so these files should work with almost any GIS package, ready to go.

Update on Some Data Sources

Saturday, October 31st, 2009

Here’s my last chance to squeeze in a post before the month is over. There have been a lot of changes and updates with some key data sites lately. Here’s a summary:

  • The homepage for gdata, which provides global GIS data that was created as part of UC Berkeley’s Biogeomancer project, has moved to the DIVA-GIS website. DIVA-GIS is a free GIS software project designed specifically for biology and ecology applications, with support from UC Berkeley as well as several other research institutions and independent contributors. It looks like the old download interface has been incorporated into the DIVA-GIS page.
  • The US Census Bureau has recently released its latest iteration of the TIGER shapefiles, the 2009 TIGER/Line Shapefiles. Since they seem to be making annual updates, which has involved changing the URLs around, it may be better to link to their main TIGER shapefile page where you can get to the latest and previous versions of the files.
  • The bureau has released its latest American Community Survey (ACS) data: 2008 annual estimates for geographic areas with 65,000 plus people, and three year 2006-2008 estimates for geographic areas with 20,000 plus people. Both are available through American FactFinder.
  • Over the summer, UM Information Studies student Clint Newsom and I created a 2005-2007 PUMA-level New York Metropolitan ACS Geodatabase (NYMAG). It’s available for download on the new Baruch Geoportal, which was re-launched as a public website this past September. It’s a personal geodatabase in Microsoft Access format, so it can only be directly used with ArcGIS. I plan on creating the 2006-2008 version sometime between January and March 2010, and hope to release both an Access and a SQLite version, as the latest development versions of QGIS now offer direct support for SQLite geodatabases in the SpatiaLite format (which is awesome!).
  • While it’s not a source for GIS data or attribute tables, it’s still worth mentioning that the CIA World Factbook completely revised their website this past summer. The previous web versions of the factbook took their design cues from the old paper copies of the report. The CIA revamped the entire site and apparently will be using a model of continuous rather than annual updates. It’s a great site for getting country profiles – another good option is the UN World Statistics Pocketbook, which is part of the UNdata page.

Geographic Information: Literacy and Systems

Wednesday, August 5th, 2009

I’ve been spending a good portion of my summer working on the course that I’m going to teach this fall. The library at my college offers credit courses in Information Studies which students can take as a minor – they can choose two 3000 level courses and then a 4000 level capstone course. My course is a 3000 level special topics course which I’ve called Geographic Information: Literacy and Systems.

My situation is rather peculiar. I can’t teach this course as a pure GIS course, since it’s an information studies class and not geography or earth sciences. Beyond that, my college does not have a geography department, and earth sciences are not an individual department but are combined with other natural and physical sciences. With the exception of a regional geography class offered by the anthropology department, my college doesn’t offer geography instruction. So even if I could teach a pure GIS class, it’s unlikely that any of the students would have any foundational geographic knowledge.

I also can’t teach the course as a “library” class where I’m training people to be map or GIS librarians, because that isn’t the point of the info studies minor. The minor is meant to introduce students to the foundational principles of information – what is information, how do we search for it, organize it, what is its context in society, etc. I also could not teach the course as a basic software class, as that isn’t really appropriate for a college course. In short, I couldn’t find a model that I could follow, as what I’m doing falls outside these traditional realms.

So I decided to build the course around the concept of geographic information, where I’ll cover some foundational geography, cartography, and GIS from an information science perspective that encompasses organization, search and retrieval, data processing, and assessment and analysis of GI. I’ve divided the class into four units that cover geographic information and fundamental geography, maps as information objects, and two units of GIS. In the first GIS unit we’ll cover the theoretical aspects and the basics of using the software with datasets that I’ll provide. In the second unit we’ll deal with the nitty gritty of actually searching for and processing freely available GIS data. In the last couple of weeks I’ll spend some time on web mapping and on geographic analysis and research.

Many of the concepts that I’ll be teaching are things that I never formally learned in a college course, such as a discussion of the kinds of administrative and statistical divisions that exist in the world, why they exist, and how data is collected for them. The second GIS unit on data processing is something that I feel is never adequately covered in GIS classes, but is essential for doing just about anything in GIS. I think this is also pertinent to information studies, as it involves a discussion of the difference between data and information and how you can turn one into the other.

I’ve decided to use all open source software. Since these are undergraduate students who probably won’t be entering a geography related field, and we are a commuter campus where students have to make special trips to get to computer labs, I didn’t see any logic in using ArcGIS. With the open source software they can use it anywhere and there will be a better chance that they’ll use it after the course is over (and after they graduate). I’ve opted to go with QGIS as it covers all the bases I need. I liked gvSIG but had too many problems with the map layout – I might be able to cut my way through them, but can sophomore business and English majors? QGIS is also more thoroughly documented (in English), which is important since this is an introductory class.

I’m using Krygier and Wood’s Making Maps as my textbook, along with a few chapters here and there from other texts. I have looked to the pages Krygier’s created for his courses for guidance, and like the stream of consciousness style he used for writing his notes. I’ll post an annotated reading list later.

Since I’m breaking molds, I’ve also decided not to use Blackboard to organize the whole course and am using a blog and various other bits and pieces of software for creating assignments, organizing the roster, etc. If you’re interested you can follow along on my course blog – (only students can register). Classes start on August 31st…

Print Composer in QGIS – ACS Puma Maps

Sunday, July 12th, 2009

[Image: ny_youth_pumas]

I wrapped up a project recently where I created some thematic maps of 2005-2007 ACS PUMA level census data for New York State. I decided to do all the mapping in open source QGIS, and was quite happy with the result, which leads me to retract a statement from a post I made last year, where I suggested that QGIS may not be the best for map layout. The end product looked just as good as maps I’ve created in ArcGIS. There were a few tricks and quirks in using the QGIS Print Composer and I wanted to share those here. I’m using QGIS 1.0.2 (Kore), and since I was at work I was using Windows XP with SP3 (I run Ubuntu at home but haven’t experimented with all of these steps yet using Linux). Please note that the data in this map isn’t very strong – the subgroup I was mapping was so small that there were large margins of error for many of the PUMAs, and in many cases the data was suppressed. But the map itself is a good example of what an ACS PUMA map can look like, and of what QGIS can do.

  • Inset Map – The map was of New York State, but I needed to add an inset map of New York City so the details there were not obscured. This was just a simple matter of using the Add New Map button for the first map, and doing it a second time for the inset. In the item tab for the map, I changed the preview from rectangle to cache and I had maps of NY state in each map. Changing the focus and zoom of the inset map was easy, once I realized that I could use the scroll on my mouse to zoom in and out and the Move Item Content button (hand over the globe) to re-position the extent (you can also manually type in the scale in the map item tab). Unlike other GIS software I’ve experimented with, the extent of the map layout window is not dynamically tied to the data view – which is a good thing! It means I can have these two maps with different extents based on data in one data window. Then it was just a matter of using the buttons to raise or lower one element over another.
  • Legend – Adding the legend was a snap, and editing each aspect of the legend, the data class labels, and the categories was a piece of cake. You can give your data global labels in the symbology tab for the layer, or you can simply alter them in the legend. One quirk for the legend and the inset map – if you assign a frame outline that’s less than 1.0, and you save and exit your map, QGIS doesn’t remember this setting when you open your map again – it sets the outline to zero.
  • Text Boxes / Labels – Adding them was straightforward, but you have to make sure that the label box is large enough to grab and move. One annoyance here is that if you accidentally select the wrong item and move your map frame instead of the label, there is no undo button or hotkey. If you have to insert a lot of labels or free text, it can be tiresome because you can’t simply copy and paste the label – you have to create a new one each time, which means you have to adjust your font size and type, change the opacity, turn the outline to zero, etc. each time. Also, if the label looks “off” compared to any automatic labeling you’ve done in the data window, don’t sweat it. After you print or export the map it will look fine.
  • North Arrow – QGIS does have a plugin for north arrows, but the arrow appears in the data view and not in the print layout. To get a north arrow, I inserted a text label, went into the font menu, and chose a font called ESRI symbols, which contains tons of north arrows. I just had to make the font really large, and experiment with hitting keys to get the arrow I wanted.
  • Scale Bar – This was the biggest weakness of the print composer. The scale bar automatically takes the unit of measurement from your map, and there doesn’t seem to be an option to convert your measurement units. Which means you’re showing units in feet, meters, or decimal degrees instead of miles or kilometers, which doesn’t make a lot of sense. Since I was making a thematic map, I left the scale bar off. If anyone has some suggestions for getting around this or if I’m totally missing something, please chime in.
  • Exporting to Image – I exported my map to an image file, which was pretty simple. One quirk here – regardless of what you set as your paper size, QGIS will ignore this and export your map out as the optimal size based on the print quality (dpi) that you’ve set (this isn’t unique to QGIS – ArcGIS behaves the same way when you export a map). If you create an image that you need to insert into a report or web page, you’ll have to mess around with the dpi to get the correct size. The map I’ve linked to in this post uses the default 300 dpi in a PNG format.
  • Printing to PDF – QGIS doesn’t have a built-in export function for PDF, so you have to use a PDF print driver via the print dialog (if you don’t have the Adobe PDF printer or a reasonable facsimile pre-installed, there are a number of free ones available on sourceforge – PDFcreator is a good one). I tried Adobe and PDFcreator and ran into trouble both times. For some reason when I printed to PDF it was unable to print the polygon layer I had in either the inset map or the primary map (I had a polygon layer of PUMAs and a point layer of PUMA centroids showing MOEs). It appeared that it started to draw the polygon layer but then stopped near the top of the map. I fiddled with the internal settings of both PDF drivers to no avail, and after endless tinkering found the answer. Right before printing to PDF, if I selected the inset map, chose the Move Item Content button (hand with globe), used the arrow key to move the extent up one, and then back one to get it to its original position, and then printed the map, it worked! I have no idea why, but it did the trick. After printing the map once, you have to re-do this trick to print it again. I also noticed that after hitting print, if the map blinked and I could see all the elements, I knew it would work. But if the map blinked and I momentarily didn’t see the polygon layer, I knew it wouldn’t export correctly.

Despite a few quirks (what software doesn’t have them), I was really happy with the end result and find myself using QGIS more and more for making basic to intermediate maps at work. Not only was the print composer good, but I was also able to complete all of the pre-processing steps using QGIS or another open source tool. I’ll wrap up by giving you the details of the entire process, and links to previous posts where I discuss those particular issues.

I used 2005-2007 American Community Survey (ACS) data from the US Census Bureau, and mapped the data at the PUMA level. I had to aggregate and calculate percentages for the data I downloaded, which required using a number of spreadsheet formulas to calculate new margins of error (MOEs). I downloaded a PUMA shapefile layer from the US Census Generalized Cartographic Boundary files page, since generalized features were appropriate at the scale I was using. The shapefile had an undefined coordinate system, so I had to define it first. Using the Ftools add-on in QGIS, I converted the shapefile from single-part to multi-part features. Then I used Ftools to join my shapefile to the ACS data table I had downloaded and cleaned up (I had to save the data table as a DBF in order to do the join). Once they were joined, I classified the data using natural breaks (I sorted and eyeballed the data and manually created breaks based on where I thought there were gaps). I used the Color Brewer tool to choose a good color scheme, and entered the RGB values in the color / symbology screen. Once I had those colors, I saved them as custom colors so I could use them again and again. Then I used Ftools to create a polygon centroid layer out of my PUMA/data layer. I used this new point layer to map my margin of error values. Finally, I went into the print composer and set everything up. I exported my maps out as PNGs, since this is a good image format for preserving the quality of the maps, and as PDFs.
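
For anyone who would rather script the margin of error step than build spreadsheet formulas, here is a rough Python sketch. It uses the approximation formulas the Census Bureau publishes for derived ACS estimates; I’m assuming those are equivalent to the spreadsheet formulas mentioned above, and the sample numbers are invented purely to show the arithmetic.

```python
import math

def moe_sum(moes):
    """MOE of a sum of estimates: square root of the sum of squared MOEs."""
    return math.sqrt(sum(m ** 2 for m in moes))

def moe_proportion(num, den, moe_num, moe_den):
    """MOE of the proportion num/den, where num is a subset of den."""
    p = num / den
    radicand = moe_num ** 2 - (p ** 2) * (moe_den ** 2)
    if radicand < 0:
        # If the value under the root is negative, fall back to the
        # ratio formula, which adds the two terms instead.
        radicand = moe_num ** 2 + (p ** 2) * (moe_den ** 2)
    return math.sqrt(radicand) / den

# Invented example: two subgroups aggregated, then taken as a share of a total.
subgroup_estimates = [300, 400]
subgroup_moes = [40, 30]
total, total_moe = 2000, 120

combined = sum(subgroup_estimates)
combined_moe = moe_sum(subgroup_moes)
share = combined / total
share_moe = moe_proportion(combined, total, combined_moe, total_moe)
print(combined, combined_moe, round(share * 100, 1), round(share_moe * 100, 2))
```

A large share_moe relative to share is the signal that a PUMA’s value is too shaky to read much into, which was the situation for many of the PUMAs in this map.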

Transform Projections with GDAL / OGR

Tuesday, April 14th, 2009

The GDAL / OGR tools are an open source, cross platform, command-line toolkit that can be used for viewing GIS metadata, performing attribute queries, and converting file formats, among other things. They can also be used for transforming coordinate systems and projections for GIS files. I’ll demonstrate in this brief tutorial how to accomplish this using the OGR tools, which are for vector based GIS. The raster based GDAL tools work in a similar fashion.

Viewing basic coordinate system / projection info:

ogrinfo -al -so world_wgs.shp

Where ogrinfo is the name of the tool, -al is a switch to report on all layers in the file, -so is a switch to display summary info only (rather than listing every feature), and world_wgs.shp is the name of our file. Run that command and we’ll get something that looks like this, with info about the features, coordinate system, and attribute fields of our shapefile:

INFO: Open of `world_wgs.shp'
using driver `ESRI Shapefile' successful.

Layer name: world_wgs
Geometry: Polygon
Feature Count: 243
Extent: (-179.808664, -89.677397) - (179.808664, 83.435942)
Layer SRS WKT:
GEOGCS["GCS_WGS_1984",
DATUM["WGS_1984",
SPHEROID["WGS_1984",6378137,298.257223563]],
PRIMEM["Greenwich",0],
UNIT["Degree",0.017453292519943295]]
CNTRY_NAME: String (254.0)
FIPS_CNT_1: String (254.0)
ISO_2DIGIT: String (254.0)
ISO_3DIGIT: String (254.0)
STATUS: String (254.0)
COLORMAP: Real (18.6)
CONTINENT: String (254.0)
UN_CONTINE: String (254.0)
REGION: String (254.0)
UN_REGION: String (254.0)
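
If you need to check this summary from a script rather than by eye, a small parser is enough. The sketch below is Python and works on a pasted copy of the output above; the function name and the returned keys are my own invention, and the test for a defined coordinate system assumes the older WKT style shown here (GEOGCS / PROJCS), so it may need adjusting for other GDAL versions.

```python
import re

# Trimmed copy of the ogrinfo summary shown above.
SAMPLE = """\
Layer name: world_wgs
Geometry: Polygon
Feature Count: 243
Extent: (-179.808664, -89.677397) - (179.808664, 83.435942)
Layer SRS WKT:
GEOGCS["GCS_WGS_1984",
DATUM["WGS_1984",
SPHEROID["WGS_1984",6378137,298.257223563]],
PRIMEM["Greenwich",0],
UNIT["Degree",0.017453292519943295]]
CNTRY_NAME: String (254.0)
"""

def parse_summary(text):
    """Pull the feature count, extent, and SRS status out of summary text."""
    info = {}
    m = re.search(r"^Feature Count:\s*(\d+)", text, re.M)
    if m:
        info["feature_count"] = int(m.group(1))
    m = re.search(r"^Extent:\s*\(([^)]+)\)\s*-\s*\(([^)]+)\)", text, re.M)
    if m:
        xmin, ymin = (float(v) for v in m.group(1).split(","))
        xmax, ymax = (float(v) for v in m.group(2).split(","))
        info["extent"] = (xmin, ymin, xmax, ymax)
    # The line after "Layer SRS WKT:" starts the definition, if there is one.
    m = re.search(r"^Layer SRS WKT:\s*\n(.+)", text, re.M)
    first_line = m.group(1).strip() if m else ""
    info["srs_defined"] = first_line.startswith(("GEOGCS", "PROJCS"))
    return info

info = parse_summary(SAMPLE)
print(info)
```

The srs_defined flag is the useful part: a False value is the cue to define the coordinate system before attempting any transformation.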

Convert coordinate systems supported by EPSG

GDAL / OGR and most open source GIS software support projections and coordinate systems that are part of the EPSG library. If you want to do a conversion between two coordinate systems and they are both supported by EPSG, you just have to reference the EPSG code that’s used to identify the system that you want to project to. You can look up codes using spatialreference.org.

Let’s say we want to convert our shapefile that’s in WGS 84 (common lat and long) to NAD 83 (used frequently in North America):

ogr2ogr -t_srs EPSG:4269 world_new.shp world_wgs.shp

Where ogr2ogr is the name of the tool, -t_srs is the option for transforming to a target coordinate system, EPSG:4269 is the code that identifies the coordinate system we want the new file to have (NAD83), world_new.shp is the name of the output file that will have the new coordinate system, and world_wgs.shp is our input file. Note that the output file is listed before the input file. If you run the command and get no error message, you’re in good shape. Just run the ogrinfo command on the new file to verify that it’s been re-projected.

Convert coordinate system not supported by EPSG

The EPSG library is extensive, but doesn’t contain everything, particularly some global and continental map projections. GDAL / OGR can still do the job, but you’ll have to provide the tool with the proper frame of reference since the EPSG library doesn’t have the info. Let’s say we want to project our WGS file to the Robinson Projection, which is not part of EPSG.

First, go back to spatialreference.org and search for Robinson. Its ID code is ESRI 54030 – not part of the EPSG library. Click on the link for the projection to open its window. You’ll be able to look at the projection data in a number of standard file formats. Select OGC_WKT from the list, and it will open the text in a new window, showing you the parameters of that projection. In your browser, go up to file, save as, and save the file as robinson_ogcwkt.txt in the same directory as the shapefile you want to reproject.

Now that you have the projection info stored in the text file, run the following command to make the conversion:

ogr2ogr -t_srs robinson_ogcwkt.txt world_rob.shp world_wgs.shp

It’s the same command as our previous one, except that you’re referencing the text file that holds the projection definition instead of an EPSG code.

Define an undefined coordinate system

If you run the ogrinfo command and your coordinate system is undefined, you should define it before doing anything else, and you must define an undefined projection before converting to another projection. Look at the metadata that came with your file or go back to the source to figure out what it is. For example the US Census Bureau Generalized Cartographic Boundary Files for 2000 are in NAD83 according to their metadata, but the files lack a projection definition.

To define one, use the following command:

ogr2ogr -a_srs EPSG:4269 states_nad83.shp states_unknown.shp

The only difference here is that the -a_srs option is used to assign a coordinate system to a file without transforming its coordinates – the rest of the parameters are the same. If you’re defining a non-EPSG projection, use the same method from the previous example – download a definition file from spatialreference.org and use the file name in place of the EPSG code.
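
If you run these conversions on a lot of files, it can help to build the commands in a script so the two options don’t get mixed up. This is a small Python sketch; the helper name ogr2ogr_args and its assign_only parameter are my own, and the code only builds the argument list shown in the examples above rather than calling GDAL itself.

```python
import shlex

def ogr2ogr_args(src, dst, srs, assign_only=False):
    """Build an ogr2ogr argument list.

    assign_only=False uses -t_srs: transform the coordinates to `srs`.
    assign_only=True uses -a_srs: only label the output with `srs`.
    `srs` can be an EPSG code such as "EPSG:4269" or the name of a
    text file that holds a projection definition.
    """
    flag = "-a_srs" if assign_only else "-t_srs"
    # ogr2ogr expects the output file before the input file.
    return ["ogr2ogr", flag, srs, dst, src]

define_cmd = ogr2ogr_args("states_unknown.shp", "states_nad83.shp",
                          "EPSG:4269", assign_only=True)
robinson_cmd = ogr2ogr_args("world_wgs.shp", "world_rob.shp",
                            "robinson_ogcwkt.txt")
print(shlex.join(define_cmd))
print(shlex.join(robinson_cmd))
```

On a machine with GDAL / OGR installed, a list like this could be handed to subprocess.run(define_cmd, check=True) to actually run the conversion.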

More help and where to download:

UC Santa Barbara NCEAS and the UC Davis Soil Lab both have short tutorials and sample commands of GDAL / OGR.

If you want to thumb through the world’s map projections, the folks at radicalcartography have a nice projection reference page with visuals and brief descriptions.

Visit the GDAL / OGR page for downloading, or if you’re a Windows or Mac user, you can download QGIS and GDAL / OGR together from the QGIS download page. Linux users can get GDAL / OGR via their package manager – depending on your distro, you may have it already.

ReferenceUSA for business data

Sunday, November 30th, 2008

Sorry that November has been another crummy month for posts. Here’s one that I’ve been meaning to write for quite a while.

While there is a lot of free GIS data out there, one of the black holes is business data. Specifically, if you want to plot all of the businesses in one industry or all of the branches or locations of one company, where do you get the data? I’ve found that, if you need a comprehensive resource, this is one of those datasets that you have to pay for.

At our library we subscribe to a great business directory called ReferenceUSA, which is produced by a company called InfoUSA. Their directories of American and Canadian businesses are extremely comprehensive and cover every business large and small. They also have an international directory that has mid-size to large businesses. You can generate lists of businesses using several criteria and filters.

[Image: Search using NAICS and ZIP Code]

For places, you can specify the entire country, states, counties, places, or ZIP codes. You can generate lists based on company names, keywords, or NAICS codes to grab all of the businesses in one industry. Once you have your list, you can click on each individual business to get a detailed profile. For GIS purposes, you’ll want to use the download option. Depending on your subscription, you’ll be able to download only a certain number of records at a time (we can get 25 records per download). Just download as a csv file, save, open in a spreadsheet, then start downloading subsequent batches and start copying and pasting records into a master file.
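
If copying and pasting dozens of 25-record batches sounds tedious, the merge can be scripted. Here is a minimal Python sketch that stacks batches and drops repeated records; the column names and company names are made up for the example and are not the actual ReferenceUSA field names.

```python
import csv
import io

def merge_batches(batches, key_fields):
    """Combine CSV batches (as text) into one list of rows, skipping repeats."""
    seen, rows, header = set(), [], None
    for text in batches:
        reader = csv.DictReader(io.StringIO(text))
        if header is None:
            header = reader.fieldnames
        for row in reader:
            # A record counts as a repeat if all key fields match.
            key = tuple(row[field] for field in key_fields)
            if key not in seen:
                seen.add(key)
                rows.append(row)
    return header, rows

# Two invented batches that overlap by one record.
batch1 = "Company Name,ZIP Code\nAcme Coffee,10016\nBean There,10017\n"
batch2 = "Company Name,ZIP Code\nBean There,10017\nCup Half Full,10018\n"
header, rows = merge_batches([batch1, batch2], ["Company Name", "ZIP Code"])
print(header, len(rows))
```

With real downloads you would read each saved csv file into text first and then write the merged rows out with csv.DictWriter.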

[Image: Coffee shops in midtown - download to get all the data]

When you go to download, you’ll be prompted to choose basic, detailed, or custom. Basic isn’t going to cut it, as it’s missing the key fields – latitude and longitude coordinates. Choose the detailed option to get all of the fields. The custom option has some bugs – you’ll get lat and long without decimal places and some of the data for fields will be missing. Once you have all of the detailed records, you can delete a lot of the unnecessary fields. You’ll want to, as many of the field headings are not database friendly – many are long and contain spaces, which will cause problems when you go to import the table into GIS. So be sure to delete any that you don’t need and fix the ones you do need.
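
Fixing the headings can also be scripted. The Python sketch below replaces spaces and punctuation with underscores and trims names to 10 characters, which is the field name limit in the DBF tables that shapefiles use. The sample headings are invented stand-ins, not the real ReferenceUSA column names.

```python
import re

def clean_field_names(headers, max_len=10):
    """Make headings database friendly: no spaces, short, and unique."""
    cleaned, used = [], set()
    for raw in headers:
        name = re.sub(r"[^0-9A-Za-z]+", "_", raw.strip()).strip("_").upper()
        if not name or name[0].isdigit():
            name = "F_" + name  # field names should start with a letter
        name = name[:max_len]
        base, n = name, 1
        while name in used:
            # Truncation can create duplicates, so number the repeats.
            suffix = str(n)
            name = base[:max_len - len(suffix)] + suffix
            n += 1
        used.add(name)
        cleaned.append(name)
    return cleaned

sample = ["Company Name", "ZIP Code", "Location Employee Size",
          "Location Employee Count", "Latitude"]
print(clean_field_names(sample))
```

The numbering step matters because two long headings that share their first 10 characters would otherwise collapse into the same field name.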

Once you have your table ready, add it to your favorite GIS program. In ArcGIS you can use the Add XY Table feature to plot the points and turn them into a shapefile. Remember to specify the X coordinate as your longitude field and the Y coordinate as latitude, and define your geographic coordinate system as WGS 84. Once you plot them, right click on the feature in the Table of Contents and export them out as a shapefile so you have a permanent layer (see my previous XY post for more details). You can map the businesses as regular old points, or make some graduated symbols based on some of the attributes, like sales or total employees (ReferenceUSA doesn’t provide the exact data, but identifies a range, e.g. 1 to 10 employees, 11 to 25, etc).
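
The X = longitude, Y = latitude rule is easy to get backwards, so here is a short Python sketch that makes the ordering explicit by turning table rows into GeoJSON points (GeoJSON also puts longitude first). This is a side illustration rather than part of the ArcGIS workflow above, and the field names and the sample record are hypothetical.

```python
import json

def rows_to_geojson(rows, lon_field="Longitude", lat_field="Latitude"):
    """Turn table rows into GeoJSON points, with X = longitude, Y = latitude."""
    features = []
    for row in rows:
        lon, lat = float(row[lon_field]), float(row[lat_field])
        # This only catches swapped columns when the longitude is beyond
        # +/-90, so it is a partial safeguard, not a guarantee.
        if not (-180 <= lon <= 180 and -90 <= lat <= 90):
            raise ValueError(f"coordinates out of range: {lon}, {lat}")
        props = {k: v for k, v in row.items() if k not in (lon_field, lat_field)}
        features.append({
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": props,
        })
    return {"type": "FeatureCollection", "features": features}

# Invented record for illustration.
rows = [{"Company Name": "Example Cafe", "Latitude": "40.75", "Longitude": "-73.98"}]
collection = rows_to_geojson(rows)
print(json.dumps(collection, indent=2))
```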

Most of the open source alternatives also have a tool or plugin that allows you to plot XY data. Of course, the data does include address fields if you wanted to geocode your points rather than plot XY (but plotting XY is a million times easier and doesn’t require downloading huge street network files).

The good news here is that if you’re not affiliated with a university, you can probably get access to this db from a large public library, as many will have a subscription to a business directory as a matter of course. If they don’t have RefUSA they may have an alternative like the D and B Million Dollar Database. It’s another business directory that allows you to download XY data for businesses, but it is not nearly as comprehensive.

Open Source GIS Wrap-up

Tuesday, September 30th, 2008

I’ve been on an open source GIS tear this month, so in this post I’ll wrap up some odds and ends:

  • There is a project called Sextante, which is essentially an open source ArcToolbox for gvSIG. It adds a lot of geoprocessing and analysis functions and is pretty easy to install. There are 200 + tools in the box, but for some reason not all of them are active. I’m not sure why this is the case, but haven’t poked around much to find out.
  • There are also a number of extra plugins for QGIS that are available through the QGIS wiki under PluginRepository; they include plugins that add more symbolization and that make table joins possible. Haven’t had a chance to try this yet either, but it sounds like these extras could make QGIS a lot more viable as a thematic mapping option.
  • I found out about the QGIS plugins from this article, which offers a good overview of QGIS. The article also discusses one of the other shortcomings of open source GIS – the lack of support for a simple, desktop geodatabase similar to the Microsoft Access personal geodatabases. PostGIS is certainly powerful and there has been a lot written about it, but a server based geodatabase is not always the best solution, particularly for small, stand-alone projects. There is a cool project called SpatiaLite, where someone has created geographically enabled SQLite databases (which are small, stand-alone dbs). You can export shapefiles to them, or simply view and edit the attributes in a shapefile via a virtual connection. Based on what I’ve looked at thus far, you can access SQLite databases directly in GRASS and when using GRASS datasets via QGIS, but I haven’t been able to connect to a SQLite db with the other software I’ve looked at – it’s just not supported yet.
  • In researching open source GIS, I’ve looked at a book specifically on GRASS, Open Source GIS: A GRASS Approach, as well as two books on web mapping (GIS for Web Developers: Adding ‘Where’ to Your Web Applications and Web Mapping Illustrated: Using Open Source GIS Toolkits), which cover GDAL and OGR, QGIS, GIS servers, PostGIS and PostgreSQL, and a few other tools. There is a recently published book that focuses specifically on open source desktop GIS – Desktop GIS: Mapping the Planet with Open Source Tools. I pre-ordered a copy on Amazon that was supposed to ship in mid-September, but is now being delayed until late October. Based on the table of contents it looks pretty thorough and covers many of the choices I listed in my previous post, and I’m looking forward to its arrival.

Why Consider ArcGIS Alternatives?

Wednesday, September 17th, 2008

Last week I shared my adventures evaluating open source software. Why bother looking at alternatives to ArcGIS? There are significant barriers to entry with ArcGIS. Whenever I give an introductory GIS presentation, I inevitably have to answer the question of “How can I get access to this software?” The answer is that you have to spend a lot of money, or if your institution already has a subscription, you need to go through a lengthy process to get access.

  • Price. A single, stand-alone copy of ArcView costs $1500. Not only is that prohibitively expensive for me, it’s impossible for students. Which means that students who are taking a GIS class have to use the software in a computer lab on campus to complete assignments. This is not always convenient for many students, and is particularly problematic where I work since we are primarily a commuter campus.
  • License limitations. If you’re running Arc through a central license server, PCs have to be connected to the server through a hardwired connection – no wireless. Our library has a laptop checkout program for students which would give students an alternative to using a computer lab. But not being able to install the software on a laptop eliminates this possibility. It also makes it a pain for me to give presentations, as I always have to make sure that the room I’ll be presenting in has the software. My short term solution is to use an eval copy on a laptop. You can purchase USB keys that have the license info on them, but if you work in a large, complex academic or government setting, getting one can be a challenge. And every year we have to go through the process of getting the license renewed.
  • Installation and Bugs. As Arc users know, installation can be time consuming, particularly since you can’t have two versions of Arc installed concurrently – you have to uninstall one before installing the new one. And how many service packs have been issued for version 9.2? Six. IT people love it when they have to install fixes in a dozen labs / classrooms in the middle of a semester, particularly when they have to do it 5 or 6 times a year. In reality, we skip several service packs and live with the bugs.
  • Forced Obsolescence. This is particularly aggravating. Every year or two, we all have to go through the ritual of making an upgrade, which involves time consuming un-installation and installation. And you need to make sure that different branches of your organization that use GIS are on the same page, otherwise you’ll run into incompatibility issues (like when mxd files created in version 9.2 don’t work in 9.1).
  • Cross platform. I run a Linux box at home and occasionally would like to take my work with me. There are a number of students and faculty members at my school who are ardent Mac users. But ArcGIS runs only on Windows.

The open source alternatives are free, easy to install (usually), can be installed anywhere without restrictions, the software doesn’t expire, and upgrades are a rather simple affair. The obvious downside is that none of them have the power, scope, or usability that ArcGIS has. At least, not yet.

Open Source GIS for Thematic Mapping

Wednesday, September 3rd, 2008

I’ve been exploring the open source GIS alternatives, and have been pretty overwhelmed by the number of choices. For an overview of what’s out there, you can check out The State of Open Source GIS (a large pdf) from Refractions Research, and a series of comparison tables assembled by a geography prof at the Univ of Calgary. You can also search the web for “Open Source GIS”, and you’ll find a number of blogs, forums, lists, and sites that cover it in some detail.

There are a lot of alternatives, and many of them are geared to a particular purpose: raster vs vector, viewer vs map making vs analysis, etc. I’m looking for something that’s cross-platform that I can use for vector-based thematic mapping, and something that I can easily introduce and teach to novices. I need software that allows me to: work with common formats like shapefiles, transform projections, add data tables and join them to shapefiles, symbolize data with a good selection of color schemes, classify data with several methods including natural breaks, add labels, and produce maps as pdfs, images, or in print. Preferably, I want something that has strong map layout capabilities. I don’t want to use a graphic design package for final map creation.

I’ve looked at five options that are all great, but no single one covers everything I’m looking for:

  • GRASS – an established and powerful GIS. Setting up the environment for working with files takes some getting used to (you can’t simply open a window and start adding files), and the native graphic interface is complex. GRASS only works with its own native file formats, so you have to import everything into that format first. GRASS was really designed for analysis and modeling (tasks at which it excels), but is not the best choice for basic thematic mapping, or for novices.
  • QGIS – one of the strongest attributes of QGIS is that it harnesses the power of GRASS in a more user friendly environment. If you use the GRASS plug-in and work with GRASS datasets, you can do table joins and have access to several classification methods. If you use QGIS on its own (working with shapefiles or raster images), you won’t have these capabilities. It seems that projection transformations are limited to a certain subset of projections, and common global thematic map projections like Robinson or Winkel Tripel are missing (you would have to create them manually). QGIS does have a print layout screen, but color schemes are limited and labeling isn’t good – it automatically labels every single polygon in a multi-part layer, and the only way to turn it off is via a hack. QGIS is a great viewer (particularly for rasters) and is a good alternative GRASS front end, but probably won’t be your choice for making high-quality thematic maps.
  • gvSIG – billed as an alternative to the old ArcView 3.x, gvSIG lives up to this reputation. A map project has separate, defined areas for data views, maps, and data tables. You can do projection transformations and it does support EPSG and ESRI, but the process is a little confusing. The map window has a default projection, but when you add a layer it uses the layer’s projection but keeps the window’s projection, and it’s difficult to figure out what projection the layer is in (if this makes any sense!) You can do table joins, and there is a good selection of color schemes for classification and several classification methods. It is the only one that I’ve seen that has natural breaks as an option. It also has the best map layout compared to the other software I’ve looked at and you can export maps out to a number of formats. The biggest weakness is labeling, which is very rudimentary. It doesn’t place labels in the center of polygons, but offsets them slightly (appropriate for points but not polygons), and has no conflict detection. It does allow you to create annotation layers, and improvements are in the works for the next version. Since the software was created in Spain, you’ll occasionally find a menu here or a button there that was missed in translation to English.
  • uDig – the windows and toolbars are not as GIS-like as the other software options, which makes finding things a little tougher. uDig has good projection transformation support, a great selection of color schemes for symbolization, and excellent label placement (the best, by far), with conflict detection and different placement options. Like QGIS, the map layout screen is located under the print option. The templates are rudimentary but are easy to use and get the job done. The detractors here are data classification (natural breaks is not a choice) and table joins – there is no option for adding and joining attribute tables to spatial files whatsoever.
  • OpenJUMP – has a great interface, particularly for working with attribute tables, a good selection of color schemes for symbolization, and good label placement features. Table joins are supported for text files (no DBFs). Equal Intervals is the only data classification method available (no natural breaks), and projection transformation is only available via a plug-in. The bigger issue is that there is no print option or map layout. These are available through a plugin as well, but I haven’t tried installing it yet. Plugin installation requires altering or over-writing some of the program files, and I was hesitant to try.
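
Several of the notes above turn on whether a package offers natural breaks or only equal intervals. As a rough illustration of why that matters, here is a small Python sketch that computes both sets of class breaks for a list of values. It is not how any of these packages implement classification; the function names are my own, the sample values are made up, and the natural breaks routine is a plain dynamic-programming version of the idea behind Jenks (minimize the squared deviation within each class).

```python
def equal_interval_breaks(values, k):
    """Upper bounds of k classes of equal width between min and max."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + width * i for i in range(1, k + 1)]


def natural_breaks(values, k):
    """Upper bounds of k classes that minimize the total within-class
    squared deviation (the idea behind Jenks natural breaks).
    Assumes len(values) >= k."""
    xs = sorted(values)
    n = len(xs)
    # Prefix sums give the squared deviation of any slice in constant time.
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for i, x in enumerate(xs):
        s1[i + 1] = s1[i] + x
        s2[i + 1] = s2[i] + x * x

    def ssd(i, j):
        # Sum of squared deviations from the mean for xs[i:j], with j > i.
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # cost[c][j] is the best cost of splitting xs[:j] into c classes;
    # split[c][j] is where the last of those classes starts.
    cost = [[inf] * (n + 1) for _ in range(k + 1)]
    split = [[0] * (n + 1) for _ in range(k + 1)]
    cost[0][0] = 0.0
    for c in range(1, k + 1):
        for j in range(c, n + 1):
            for i in range(c - 1, j):
                candidate = cost[c - 1][i] + ssd(i, j)
                if candidate < cost[c][j]:
                    cost[c][j] = candidate
                    split[c][j] = i

    # Walk back through the split table to recover each class's upper bound.
    bounds = []
    j = n
    for c in range(k, 0, -1):
        bounds.append(xs[j - 1])
        j = split[c][j]
    return bounds[::-1]


# Made-up values with three obvious clusters.
sample = [1, 2, 3, 4, 50, 52, 54, 98, 99, 100]
print(equal_interval_breaks(sample, 3))
print(natural_breaks(sample, 3))
```

With clustered data like the sample, equal intervals cut the range into thirds regardless of where the values sit, while natural breaks put the class boundaries at the gaps between clusters. The triple loop is fine for a few hundred features but would be slow on very large layers.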

There are always work-arounds to fill in the features that are missing. For file and projection transformations, you can always use the GDAL / OGR tools, which I would recommend (although annoyingly I can’t seem to get the projection transformations to Robinson or Winkel to work). The NCEAS at UC Santa Barbara has a nice wiki with examples of commands. Table joins can be accomplished outside the GIS using a database package. You could use a spreadsheet or stats package to figure out break points for data, and change the breaks manually in the GIS. For labels, you can export labels out as annotation, or you can convert polygons to points and use the points layer as a label layer (just make the points invisible but turn the labeling on). Then you can edit that file and move the labels around to get them in the right position. If you make lots of thematic maps for the same area, you can use the same label file over and over again.
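
To make the “join outside the GIS” idea concrete, here is a minimal Python sketch using the built-in sqlite3 module as the database package. Everything in it is an assumption for illustration: the table and column names are hypothetical, the rows stand in for a shapefile’s attribute table and a separate data table, and the population figures are made up. A left join is used so that every feature is kept even when it has no matching data row.

```python
import sqlite3

# Hypothetical rows standing in for a shapefile's attribute table (DBF)
# and a separate data table; the ISO code is the join key.
countries = [("CAN", "Canada"), ("MEX", "Mexico"), ("USA", "United States")]
population = [("CAN", 10.0), ("USA", 30.0)]  # made-up values, not real data


def join_tables(shapes, data):
    """Left-join data rows onto shape attributes so no feature is dropped."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE shapes (iso TEXT PRIMARY KEY, name TEXT)")
    con.execute("CREATE TABLE data (iso TEXT PRIMARY KEY, pop REAL)")
    con.executemany("INSERT INTO shapes VALUES (?, ?)", shapes)
    con.executemany("INSERT INTO data VALUES (?, ?)", data)
    rows = con.execute(
        "SELECT s.iso, s.name, d.pop FROM shapes AS s "
        "LEFT JOIN data AS d ON d.iso = s.iso ORDER BY s.iso"
    ).fetchall()
    con.close()
    return rows


for row in join_tables(countries, population):
    print(row)
```

Features without a match come back with an empty (None) value rather than disappearing, which makes it easy to spot join keys that don’t line up before the table goes back into the GIS.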

This isn’t an exhaustive overview and I haven’t created a consistent procedure for testing all the options. There are a few other elements that I would also want to explore (How good is the support for legends? Can you normalize data or calculate new attribute fields? Can you add a graticule? Change the background color for a view? Convert a table of XY coordinates to a point layer? Is there a geoprocessing tool for generalizing layers?)

If pressed, which option would I choose? I’m inclined to go with gvSIG, which really reminds me of the old ArcView, and would hope that label placement improves with the next version – each of these software packages is a constantly improving work in progress. Perhaps it’s best to regard these software options as different tools in one toolbox. If I need to make a thematic map of the world, showing population by country with only a few labels, then I’ll go with gvSIG, where I can easily do table joins, use natural breaks, and have lots of colors at my disposal. If I need to make basic reference maps, say ZIP codes of the NYC metro area, then I’ll go with uDig, as I’ll be able to quickly label them all and still have good color scheme choices. If I need to do geoprocessing or analysis, I’ll also have to evaluate the options – maybe look at plugins, or hunker down and learn GRASS.

In the end, I’m certainly grateful that there are solid, open, and free choices out there and that people are freely giving their time and talent to everyone’s benefit. Why consider these alternatives? I’ll cover that in my next post.

My Goings On

Monday, May 26th, 2008

It’s been a while since my last post – I’ve been locked away in my apartment on research leave for the past two weeks. I’m working on a database-backed web directory for finding GIS data, as I’m tired of dealing with bookmarks, HTML lists, and protracted web searches for keeping track of datasets. The goal is to keep it simple and standards-based. The basic architecture is in place; I’m just struggling to learn and apply PHP. Fingers crossed, I may have a prototype ready by the end of my next round of leave this summer.

I’m also still reading Georeferencing by Linda Hill, which extensively covers gazetteers and metadata standards. It’s definitely worth checking out, from a library near you.


Copyright © 2012 Gothos. All Rights Reserved.