Archive for the ‘Resources’ Category

Centroids for Countries

Tuesday, February 10th, 2009

I just added a new resource and updated another one on the resources page. I put together a file that contains the centroids (geographic centers) of all of the countries in the world, plus a few territories and dependencies. The centroids are in latitude and longitude coordinates based on WGS 84 in two formats: decimal degrees and degrees / minutes / seconds. It’s a tab delimited text file that you can open or import into any spreadsheet or database program. Each record is uniquely identified by a FIPS 10 code.

I downloaded most of the data from the NGA’s GeoNames Server (GNS). I blogged about the GNS awhile back, pointing out that you could query this gazetteer for individual places or you could download files that have all the features for each country in the world. While it took some time to figure out, you can actually take a middle road and query the database for specific categories of features that you can download. I used the text-based search and the links on the left side of the screen actually open different input boxes that you can use to query or exclude data. I managed to query top-level administrative units (countries) and to exclude most variant country names. After I downloaded the file, I still had to go in and do some clean-up, and I had to go back and get countries I missed by hand – these were mostly dependencies and territories that were excluded based on the search I did (Greenland, French Guiana, Netherlands Antilles, and a number of others).

Then I realized that the GNS excludes the United States and all of its territories. So, I went over to the USGS Geographic Names Information Service (GNIS) and grabbed the data for the US territories. The GNIS is simpler to navigate and you can download records pretty easily. They didn’t have a record for the United States as a whole, so I had to go over to the Census Bureau to get coordinates for the US centroid.

I brought all of these records into one file and placed it on the resources page for download, along with some metadata to describe it. Why would you want to use this stuff? You can use if for basic distance calculations, or as a annotated label field for label placement in GIS. More about that in my next post.

I also updated the country code cross-reference file that I took from the CIA World Factbook. You can use this as a bridge table to relate tables that use different identifiers. So if you wanted to join the fips-based centroid file to an iso-based shapefile of countries, you can join the centroids to the bridge first based on fips, and then that new table to the shapefile based on iso.

Social Explorer and New ACS Census Data

Thursday, January 22nd, 2009

This is kind of a follow-up to my last post – the Social Explorer, a great interactive mapping site that allows you to map US Census data, has added the 2005-2007 American Community Survey data to their site at the PUMA level. This is the smallest geographic area that is available for recent data, until we get to the 2010 Census and 2010 ACS. At this point you can look at total population, race, and Hispanic ethnicity. It looks like you can make maps, but you can’t export the data unless you subscribe to the full version.

The Social Explorer allows you to map a wide selection of decennial census data all the way back to the 1790 census (they have a partnership with NHGIS, which provides historical data and boundary files for free download with registration). Tract-level data is available back to 1940. While you can map the data, and you can generate slideshows and download static maps as image files, you can only generate reports for the 2000 census. In order to get full access for report generation and other features, you’ll have to subscribe (or find access to a library that does).

Social Explorer also works with ARDA (Association of Religious Data Archives) to create maps of county-level religious affiliation (since the US Census does not collect this data by law). Of all the interactive mapping sites I’ve seen, the Social Explorer is one of the slickest and easiest to use.

Open Source GIS Wrap-up

Tuesday, September 30th, 2008

I’ve been on an open source GIS tear this month, so in this post I’ll wrap up some odds and ends:

  • There is a project called Sextante, which is essentially an open source ArcToolbox for gvSIG. It adds a lot of geoprocessing and analysis functions and is pretty easy to install. There are 200 + tools in the box, but for some reason not all of them are active. I’m not sure why this is the case, but haven’t poked around much to find out.
  • There are also a number of extra plugins for QGIS that are available through the QGIS wiki under PluginRepository; they include plugins that add more symbolization and that make table joins possible. Haven’t had a chance to try this yet either, but it sounds like these extras could make QGIS a lot more viable as a thematic mapping option.
  • I found out about the QGIS plugins from this article, which offers a good overview of QGIS. The article also discusses one of the other shortcomings of open source GIS – the lack of a support for a simple, desktop geodatabase similar to the Microsoft Access personal geodatabases. PostGIS is certainly powerful and there has been a lot written about it, but a server based geodatabase is not always the best solution, particularly for small, stand-alone projects. There is a cool project called Spatiallite, where someone has created geographically enabled SQLite databases (which are small, stand alone dbs). You can export shapefiles to them, or simply view and edit the attributes in a shapefile via a virtual connection. Based on what I’ve looked at thus far, you can access SQlite databases directly in GRASS and when using GRASS datasets via QGIS, but I haven’t been able to connect to a SQlite db with the other software I’ve looked at – it’s just not supported yet.
  • In researching open source GIS, I’ve looked at a book specifically on GRASS, Open Source GIS: A Grass Approach, as well as two books on web mapping (GIS for Web Developers: Adding ‘Where’ to Your Web Applications and Web Mapping Illustrated: Using Open Source GIS Toolkits)which cover GDAL and OGR, QGIS, GIS servers, PostGIS and PostgreSQL, and a few other tools. There is a book that’s recently been published that focusses specifically on Open Source Desktop GIS – Desktop GIS: Mapping the Planet with Open Source Tools. I pre-ordered a copy on Amazon that was supposed to ship in Mid September, but is now being delayed until late October. Based on the table of contents it looks pretty thorough and covers many of the choices I listed in my previous post, and I’m looking forward to its arrival.

Why Consider ArcGIS Alternatives?

Wednesday, September 17th, 2008

Last week I shared my adventures evaluating open source software. Why bother looking at alternatives to ArcGIS? There are significant barriers of entry to ArcGIS. Whenever I give an introductory GIS presentation to anyone, I inevitably have to answer the question of “How can I get access to this software?” Inevitably, the answer is you have to spend a lot of money, or if your institution already has a subscription, you need to go through a lengthy process to get access.

  • Price. A single, stand-alone copy of ArcView costs $1500. Not only is that prohibitively expensive for me, it’s impossible for students. Which means that students who are taking a GIS class have to use the software in a computer lab on campus to complete assignments. This is not always convenient for many students, and is particularly problematic where I work since we are primarily a commuter campus.
  • License limitations. If you’re running Arc through a central license server, PCs have to be connected to the server through a hardwired connection – no wireless. Our library has a laptop checkout program for students which would give students an alternative to using a computer lab. But not being able to install the software on a laptop eliminates this possibility. It also makes it a pain for me to give presentations, as I always have to make sure that the room I’ll be presenting in has the software. My short term solution is to use an eval copy on a laptop. You can purchases USB keys that have the license info on them, but if you work in a large, complex academic or government setting, getting one can be a challenge. And every year we have to go through the process of getting the license renewed.
  • Installation and Bugs. As Arc users know, installation can be time consuming, particularly since you can’t have two versions of Arc installed concurrently – you have to uninstall one before installing the new one. And how many service packs have been issued for version 9.2? Six. IT people love it when they have to install fixes in a dozen labs / classrooms in the middle of a semester, particularly when they have to do it 5 or 6 times a year. In reality, we skip several service packs and live with the bugs.
  • Forced Obsolescence. This is particularly aggravating. Every year or two, we all have to go through the ritual of making an upgrade, which involves time consuming un-installation and installation. And you need to make sure that different branches of your organization that use GIS are on the same page, otherwise you’ll run into incompatibility issues (like when mxd files created in version  9.2 don’t work in 9.1).
  • Cross platform. I run a linux box at home and occasionally would like to take my work with me. There are a number of students and faculty members at my school who are ardent Mac users. But ArcGIS runs only on Windows.

The open source alternatives are free, easy to install (usually), can be installed anywhere without restrictions, the software doesn’t expire, and upgrades are a rather simple affair. The obvious downside is that none of them have the power, scope, or usability that ArcGIS has. At least, not yet.

Open Source GIS for Thematic Mapping

Wednesday, September 3rd, 2008

I’ve been exploring the open source GIS alternatives, and have been pretty overwhelmed by the number of choices. For an overview of what’s out there, you can check out The State of Open Source GIS (a large pdf) from Refractions Research, and a series of comparison tables assembled by a geography prof at the Univ of Calgary. You can also search the web for “Open Source GIS”, and you’ll find a number of blogs, forums, lists, and sites that cover it in some detail.

There are a lot of alternatives, and many of them are geared to a particular purpose: raster vs vector, viewer vs map making vs analysis, etc. I’m looking for something that’s cross-platform that I can use for vector-based thematic mapping, and something that I can easily introduce and teach to novices. I need software that allows me to: work with common formats like shapefiles, transform projections, add data tables and join them to shapefiles, symbolize data with a good selection of color schemes, classify data with several methods including natural breaks, add labels, and produce maps as pdfs, images, or in print. Preferably, I want something that has strong map layout capabilities. I don’t want to use a graphic design package for final map creation.

I’ve looked at five options that are all great, but there isn’t one that covers everything I’m looking for :

  • GRASSGRASS – an established and powerful GIS. Setting up the environment for working with files takes some getting used to (you can’t simply open a window and start adding files), and the native graphic interface is complex. GRASS only works with it’s own native file formats, so you have to import everything into that format first. GRASS was really designed for analysis and modeling (tasks for which it excels), but is not the best choice for basic thematic mapping, or for novices.
  • QGISQGIS – one of the strongest attributes of QGIS is that it harnesses the power of GRASS in a more user friendly environment. If you use the GRASS plug-in and work with GRASS datasets, you can do table joins and have access to several classification methods. If you use QGIS on it’s own (working with shapefiles or raster images), you won’t have these capabilities. It seems that projection transformations are limited to a certain subset of projections, and common global thematic map projections like Robisnon or Winkel Tripel are missing (you would have to create them manually). QGIS does have a print layout screen bu color schemes are limited, and labeling isn’t good – it automatically labels every single polygon in a multi-part layer, and the only way to turn it off is via a hack. QGIS is a great viewer (particularly for rasters) and is a good alternative GRASS front end, but probably won’t be your choice for making high-quality thematic maps.
  • gvSIGgvSIG – billed as an alternative to the old ArcView 3.x, gvSIG lives up to this reputation. A map project has separate, defined areas for data views, maps, and data tables. You can do projection transformations and it does support EPSG and ESRI, but the process is a little confusing. The map window has a default projection, but when you add a layer it uses the layer’s projection but keeps the windows projection, and it’s difficult to figure out what projection the layer is in (if this makes any sense!) You can do table joins, and there is a good selection of color schemes for classification and several classification methods. It is the only one that I’ve seen that has natural breaks as an option. It also has the best map layout compared to the other software I’ve looked at and you can export maps out to a number of formats. The biggest weakness is labeling, which is very rudimentary. It doesn’t place labels in the center of polygons, but offsets them slightly (appropriate for points but not polygons), and has no conflict detection. It does allow you to create annotation layers, and improvements are in the works for the next version. Since the software was created in Spain, you’ll occasionally find a menu here or a button there that was missed in translation to English.
  • udiguDIG – the windows and toolbars are not as GIS-like as the other software options, which makes findings things a little tougher. Udig has good projection transformation support, great selection of color schemes for symbolization, and excellent label placement (the best, by far), with conflict detection and different placement options. Like QGIS, the map layout screen is located under the print option. The templates are rudimentary but are easy to use and get the job done. The detractors here are data classification (natural breaks is not a choice) and table joins – there is no option for adding and joining attribute tables to spatial files whatsoever.
  • openJUMPOpenJUMP – has a great interface, particularly for working with attribute tables, good selection of color schemes for symbolization and good label placement features. Table joins are supported for text files (no DBFs). Equal Intervals is the only data classification method available (no natural breaks), and projection transformation is only available via a plug-in. The bigger issue is that there is no print option or map layout. These are available through a plugin as well, but I haven’t tried installing it yet. Plugin installation requires altering or over-writing some of the program files, and I was dubious to try.

There are always work-arounds to fill in the features that are missing. For file and projection transformations, you can always use the GDAL / OGR tools, which I would recommend (although annoyingly I can’t seem to get the projection transformations to Robinson or Winkel to work). The NCEAS at UC Santa Barbara has a nice wiki with examples of commands. Table joins can be accomplished outside the GIS using a database package. You could use a spreadsheet or stats package to figure out break points for data, and change the breaks manually in the GIS. For labels, you can export labels out as annotation, or you can convert polygons to points and use the points layer as a label layer (just make the points invisible but turn the labeling on). Then you can edit that file and move the labels around to get them in the right position. If you make lots of thematic maps for the same area, you can use the same label file over and over again.

This isn’t an exhaustive overview and I haven’t created a consistent procedure for testing all the options. There are a few other elements that I would also want to explore (How good is the support for legends? Can you normalize data or calculate new attribute fields? Can you add a graticule? Change the background color for a view? Convert a table of XY coordinates to a point layer? Is there a geoprocessing tool for generalizing layers?)

If pressed, which option would I choose? I’m inclined to go with gvSIG, which really reminds me of the old ArcView, and would hope that label placement improves with the next version – each of these software packages are constantly improving works in progress. Perhaps it’s best to regard these software options as different tools in one toolbox. If I need to make a thematic map of the world, showing population by country with only a few labels, then I’ll go with gvSIG, where I can easily do table joins, use natural breaks, and have lots of colors at my disposal. If I need to make basic reference maps, say ZIP codes of the NYC metro area, then I’ll go with uDig, as I’ll be able to quickly label them all and still have good color scheme choices. If I need to do geoprocessing or analysis, I’ll also have to evaluate the options – maybe look at plugins, or hunker down and learn GRASS.

In the end, I’m certainly grateful that there are solid, open, and free choices out there and that people are freely giving their time and talent to everyone’s benefit. Why consider these alternatives? I’ll cover that in my next post.

FIPS and ISO Country Code Table

Tuesday, July 22nd, 2008

I’ve created a bridge table to relate various country codes: FIPS 10, the three versions of ISO 3166 (two alpha, three alpha, and three numeric), NATO, and the internet country codes. Ok, ok, I didn’t create it, the CIA World Factbook did and has it listed in their appendix, but in HTML. I just copied it and made it database friendly: fixed column headings, removed dashes for nulls, removed trailing and leading spaces, converted numeric codes to text while preserving zeros, and saved it in a tab-delimited text file. You can download it from the Resources page.

If you open it in Excel, Excel annoyingly converts the three digit numeric ISO code back to a number and drops the leading zeros. You can fix this using a trick I illustrated in a previous post, or import it into Calc or Access instead. You’ll be able to designate that field as text during the import process. If you stash the table in a geodatabase, you will be able to relate features and tables that use different codes through this bridge table.

I was also searching for the ISO 3166-2 codes for 1st level subdivisions within countries (like states in the US or provinces in Canada), but had trouble finding anything official as I think they may be copyrighted. I eventually found one source that claims that they received permission to post the codes, so you can take a look there. I also stumbled across a good reference source called Statoids, which gives you background information, lists, and codes for subdivisions on a country by country basis. I’ve added both of these to the Links – Resources page.

Searching for Foreign Census Data

Saturday, July 5th, 2008

I’ve been looking for census data for various countries, and have visited the usual suspects that aggregate this data – the CIA World Factbook and the United Nations Population Information Network. Other supra-national orgs like the IMF and World Bank also create and compile this info. These are fine sources, particularly if your goal is to look at basic data for several (or all) countries. But if you are studying or writing about one country in particular, it may seem odd to cite the UN, and even odder to cite the CIA. It would be better to go right to the source – the chief statistical agency in that particular country. In all likelihood, this agency would also have more in-depth stats than the aggregators.

But – where is the source? Rather than be left to the mercy of google, where you’ll uncover the obvious suspects and lots of commercial sites and joe-schmoes who republished some data from last decade, visit the US Census Bureau’s list of foreign statistical agencies, which will lead you right to the source.

Assuming you can find some pages with some data (census data isn’t public domain in every country and isn’t necessarily online for free, or at all, in which case you may need to go with some of the aggregate sources), the next obstacle will be overcoming the language barrier. Many countries will publish pages in several languages, including English. Some may publish only limited info in English, or no info in English at all. If you don’t read the lingua franca, you can try a translating tool like Babblefish or the Google Language Tool to translate the page for you. The translation may not be perfect, but it should be good enough where you can figure out what you need (although if the language you are translating doesn’t use the Roman alphabet and Arabic numerals – i.e. 1,2,3 etc, you may have some trouble).

The toughest obstacle to overcome may be the organizational barrier. If you are familiar with the US Census Bureau, you’ll know that it’s a large and complex organization with many subdivisions and datasets (decennial census, acs, population estimates, etc). And despite it’s enormity, it doesn’t collect all socio-economic data (religious affiliation) and may not be the best source for all data (current labor force stats). Well – other countries are just as complicated, so be wary!

Another strategy would be to visit Wikipedia – not to cite as a source, but to find what sources they use. You’ll find many country specific articles that cite the CIA Factbook or the UN, but some of the more detailed and well written ones do cite reports written by the statistical agencies for the country in question, often with a link to the page or report. If you have access to some library databases, like Gale Virtual Reference, they will (usually) cite sound references as well. Happy hunting!

Hypercities – Hypermedia Berlin

Sunday, May 4th, 2008

Last week, I met Prof John Maciuika , a Baruch CUNY professor who is one of the Academic Directors of the Hypercities project. Created by the UCLA Center for Digital Humanities, this site is essentially like a geographic content management system for cities. Built on the Google Map interface, users can explore city streets and features for a variety of time periods. Historical maps have been scanned in, and current features from Google can be layered on top. For each historical period, you can see photos, video, commentary, and Google KML files for several landmarks – buildings, parks, and districts.

If you register with the site, you can create your own account where you can upload your own content. For profs who want to use this as part of their teaching, you can grant access to portions of your account to groups (a class for instance), where they will be able to view and interact with your content. For example, students may need to explore various architectural projects by visiting them on the map, and then they could upload their assignments (and even photos of their own) to the map where the rest of the group can see them.

Hypermedia Berlin Screen ShotCurrently, Berlin is the featured city and the one that has the most complete content. There is some content for Los Angeles, and there are plans to add New York, Chicago, Paris, and Beijing. Check it out at http://www.hypercities.com/. It’s still a work in progress, so some of the features may not be active yet.

GIS and Mapping Titles

Friday, April 25th, 2008

I’ve been building the library’s collection of geography, GIS, and demography books, and many of them have started arriving. I thought I’d share some of best picks with everyone. Here’s a subset of some of the GIS titles (and more are on the way!)

A to Z GIS : an illustrated dictionary of geographic information systems 2nd ed
Wade, Tasha and Sommer, Shelly
ESRI Press, 2006
Convenient print copy of ESRI’s online glossary, with nice illustrations.

Beyond mapping : meeting national needs through enhanced geographic information science
National Research Council (U.S.). Committee on Beyond Mapping
National Academies Press, 2006
A concise overview of GIS: where it’s been, where it is, where it needs to go, and why.

Elevation data for floodplain mapping
National Research Council (U.S.). Committee on Floodplain Mapping Technologies
National Academies Press, 2007.
The specific topic is quite salient, but this book also serves as great intro to rasters and remote sensing in general.

Geographic information systems demystified
Galati, Stephen R
Artech House, 2006.
A non-ESRI overview of GIS. It’s been checked out since it arrived, so it must be good!

Georeferencing : the geographic associations of information
Hill, Linda L.
MIT Press, 2006.
This is what I’m reading now. It covers spatial cognition, digitization, coordinates, ontologies, gazetteers, metadata, info retrieval, and more.

Getting to know ArcGIS desktop : basics of ArcView, ArcEditor, and ArcInfo 2nd ed., updated for ArcGIS 9.
Ormsby, Tim, et. al.
ESRI Press, 2004
At this point, the old stand-by introduction to ArcGIS

Making maps : a visual guide to map design for GIS
Krygier, John and Wood, Denis
Guilford Press, 2005
Easy to read and cleverly organized overview of cartographic conventions and design for GIS

Statistical methods for geography : a student guide 2nd ed.
Rogerson, Peter A.
Sage Publications, 2006.
Nice overview of this topic with many examples.

Census Update: Shapefiles, ACS, Estimates

Wednesday, April 16th, 2008

I’m in Boston at the Association of American Geographers (AAG) annual conference this week, and attended a great series that explored what the Census Bureau is currently up to. Here are some hi-lites:

The Bureau is now providing the TIGER line files in shapefile format! Before, it was only possible to get generalized cartographic boundary files directly from the bureau in shapefile format. Now, you can get the boundaries in their original detail from a public domain source. Includes 2000 census geography plus some updates for 2007 for states, counties, metros, places, zips, districts, pumas, and more. Currently, it does not include tracts, block groups, or blocks.

Census TIGER Shapefile Download

The 2008 release of the American Community Survey (ACS) will include two datasets. There will be the annual numbers for geographies that have over 65,000 people, and for the first time there will be three year averages for geographies that have over 20,000 people. In each succeeding year, this average will be recalculated by adding in the most recent year and dropping the oldest one. Data for geographies with less than 20,000 people will become available in 2011 and will be based on five year averages. The good news is that, from that point forward, data will be available for all areas every single year. The bad news is that the long form (the one in six sample of households taken in the decennial census) is being discontinued and will not be conducted in 2010. Census 2010 will consist solely of the short form questions (the 100% count that covers the basic demographic variables). The ACS will serve as the replacement to the long form, but in most cases the data will not be suitable for making historical comparisons (i.e. comparing 2010 to 2000).

Bureau reps gave an overview of their Population Estimates program. Unlike the ACS which is survey based, estimates are calculated using a cohort component analysis that accounts for births, deaths, and migration each year. Estimates are calculated nationally and at the county level. The county numbers are used to create estimates for each state, which are then adjusted to fit national numbers. Data is available for total population, race, age (broken down by gender for each year at the national level and for five year groups below that) and housing units. Some data is also available for metropolitan areas (which are county based) and county subdivisions (for total population only).

The Bureau gave an overview of Dataferret, which is a tool for data power users. It is available in two versions, as download-able software or as a browser-based JAVA applet, and allows users to gather and process data from several different government sources (unlike the Amerivan Factfinder, which focuses solely on downloading census data).

Finally, things are ramping up for the 2010 Decennial Census. The bureau is updating its master address files and has almost finished recalibrating the TIGER files for each county, so that boundaries are precise within a maximum limit of 70 meters.


Copyright © 2012 Gothos. All Rights Reserved.
No computers were harmed in the 0.332 seconds it took to produce this page.

Designed/Developed by Lloyd Armbrust & hot, fresh, coffee.