Archive for the ‘Resources’ Category

NYC Subway and Transit GIS Layers

Saturday, July 24th, 2010

I’ve started outlining a one-day, introductory GIS practicum / workshop that I hope to offer in the coming academic year. One of the primary examples I want to use in the workshop is site selection for a retail store, and I thought it would be great to use a subway layer as part of the exercise. But alas, I searched high and low for a layer late last year (for a site selection project) and couldn’t find a publicly available one. I had purchased some proprietary layers, but really don’t want to use them for this workshop because I want to be able to freely distribute all of the materials to anyone; the layer I purchased is also outdated now because the MTA cut many services (including two subway lines) last month.

But thanks to Steve Romalewski at the CUNY Mapping Service, there’s now an alternative! Steve’s work is a HUGE contribution to the GIS community in New York and fills a glaring hole in the city’s collection of freely available GIS data. The MTA does host a data feed service (based on the General Transit Feed Specification created by Google) where it provides the geography of all its transit services, among other things. Steve downloaded and processed this raw data and turned it into shapefiles. He quickly discovered that it required a fair amount of scrubbing to be usable, and he’s cleaned it up and documented the entire process in great detail in several posts on his blog (Spatiality). Links to download individual shapefiles are available at the bottom of each post, following his discussion of issues and methodology for each set of layers. The CUNY Center for Urban Research has created an index page with each post, which you can access here.

In addition, he’s created a lyr file for the subway lines in order to symbolize them correctly by color and a separate mxd file for labels. While the shapefiles represent where the lines are, there are some problems representing them as they appear cartographically on the MTA’s subway maps. Many lines, including some with different colors, share the same trunk line. For example, the A and C trains (blue lines) share the same trunk with the B and D trains (orange lines) along 8th Ave from 59th St to 145th St. Depending on how you sort your symbol categories, you’ll see only the color (and line) that is drawn on top. Steve points out two ways of solving this issue. You can edit the geography and offset one of the lines, which is tedious and creates problems as you change scale (he has some great screen shots that depict this). Or, if you’re using ArcGIS, you can use the cartographic tools he shows off to offset lines by prioritizing values in the attribute table. The second approach is preferable, as it gives the illusion that the lines are side by side cartographically while keeping the geometry of the shapefile intact.

So if you’re using ArcGIS you’ll be good to go. I’ve downloaded the files to play around with, but as I’m at home and using QGIS I had some more work to do, since lyr and mxd files are proprietary ESRI formats that the open source packages can’t handle. I’ve assigned the appropriate colors to each subway line and saved them as a QGIS style file (.qml), which you can import in the symbology window to quickly and easily get the right colors (which I plucked from the MTA’s website). I’ve also saved the RGB and hex values for each line in a text file, if you’re using some other GIS software and need to input them manually. As far as I know there isn’t an easy way to circumvent the shared-line subway problem if you’re using QGIS (see screenshot below), so you’d have your work cut out for you if you want to faithfully represent the lines the way they appear on the MTA maps. But if you’re using the layers for analysis (which is what I’ll be doing) or you don’t need to emulate “the” subway map in exact detail, it shouldn’t matter.
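If your GIS software only accepts RGB triplets and you have the hex codes (or want to double-check one against the other), the conversion is simple enough to script. This is just a minimal sketch: the function name and the two sample codes are my own illustrations, not values copied from the text file, so check them against the file before relying on them.

```python
def hex_to_rgb(hex_color):
    """Convert a hex color such as '#FF6319' to an (R, G, B) tuple of 0-255 integers."""
    value = hex_color.lstrip("#")
    if len(value) != 6:
        raise ValueError(f"expected six hex digits, got {hex_color!r}")
    # Each pair of hex digits is one channel: red, green, blue.
    return tuple(int(value[i:i + 2], 16) for i in (0, 2, 4))

# Example values only (assumed, not taken from the text file).
line_colors = {"A": "#0039A6", "B": "#FF6319"}
rgb_by_line = {line: hex_to_rgb(code) for line, code in line_colors.items()}
```

The resulting tuples can be typed into any color dialog that asks for separate red, green, and blue values.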

NYC subway layers from CUNY Mapping Service in QGIS


Footnote – for anyone who is interested, the proprietary data that I purchase for the college is from a company called Halcrow. The entire NYC transportation package costs $465. It includes NYC subways and buses (lines and stations for each, along with ridership statistics from 2008 and a historical bus stops layer from 1998), LIRR and Metro North (lines and stations), but also includes the PATH train, freight lines, and truck routes.

Learning Python at PyCamp

Thursday, June 10th, 2010

I got back from leave a couple of weeks ago, and spent part of it at a Python boot camp. I’ve gotten tired of hacking away at data in spreadsheets and read in several places that Python is a good language to learn for beginning programmers – it’s also open source, flexible, and is used by many in the GIS community for processing data and building plugins and software (the instructor for the camp, Chris Calloway, pointed me to this presentation on Python scripting techniques for ArcGIS).

The workshop was a three-day event hosted at Penn State by the Triangle Zope and Python Users Group (TriZPUG). It was geared towards beginners and non-programmers (although many of my fellow classmates were IT and systems people) and provided a pretty thorough review of all of the elements of the language – now it’s up to me to tie it all together! The price was extremely reasonable (only $300 for a 3 day class!) and I’d certainly recommend it if there’s a camp in your area; although I would also recommend reading a book or taking a tutorial to familiarize yourself with the basics BEFORE attending the class; I did, and as a result I think I got more out of it than I would have had going in cold.

The next PyCamp is being held in LA in a few days, and the following one will be in Toronto from Aug 30th to Sept 3rd (although this isn’t posted on the website yet); the normal workshop is a five day affair, the one I attended was a mini 3 day version which suited my needs pretty well.

There are tons of Python tutorials on the web and Python’s site is pretty definitive. If you’re looking for a book, I’d recommend Practical Programming: An Introduction to Computer Science Using Python. Unlike the “Learn Language X” books, this one introduces you to general theory and practice in programming, and the authors illustrate the applications with practical examples using Python – it’s been immensely helpful to me. Now that I’m around the initial learning curve, I’ve been relying more on Beginning Python: From Novice to Professional, which is better as a reference book and good for illustrating many of the uses for individual objects, methods, etc (which I had a hard time grasping before I covered the basics of programming).

Google Maps to Create a Census Finding Aid

Thursday, May 13th, 2010

Yikes! It’s been quite a while since my last post (the past couple of months have been a little tough for me), but I just finished an interesting project that I can share.

I constantly get questions from students who are interested in getting recent demographic and socio-economic profiles for neighborhoods in New York City. The problem is that neighborhoods are not officially defined, so we have to look for a surrogate. The City has created neighborhood-like areas out of census tracts called community districts and they publish profiles for them, but this data is from the decennial census and not current enough for their needs. ZIP code data is also only available from the decennial census.

We can use PUMAs (Public Use Microdata Areas) to approximate neighborhoods in large cities, and they are published as part of the 3 year estimates of the American Community Survey. The problem is, in order to look up the data from the census you need to search by PUMA number – there are no qualitative place names. The city and the census have worked together to assign names to neighborhoods as part of the NYC Housing and Vacancy Survey, but this is the only place (I’ve found) that uses these names. You need to look in several places to figure out what the PUMA number and boundaries for an area are and then navigate through the census site to find it. Too much for the average student who visits me at the reference desk or emails me looking for data.

My solution was to create a finding aid in Google maps that tied everything together:


I downloaded PUMA boundaries from the Census TIGER file site in a shapefile format. I opened them up in ArcGIS and used an excellent script that I downloaded called Export to KML. ArcGIS 9.3 does support KML exports via the toolbox, and there are a number of other scripts and stand-alone programs that can do this (I tried several) but Export to KML was best (assuming you have access to ArcGIS) in terms of the level of customization and the thoroughness of the user documentation. I symbolized the PUMAs in ArcGIS using the colors and line thickness that I wanted and fired up the tool. It allows you to automatically group and color features based on the layer’s symbology. I was able to add a “snippet” to each feature to help identify it (I used the PUMA number as the attribute name and the neighborhood name as my snippet, so both appear in the legend) and added a description that would appear in the pop up window when that feature is clicked. In that description, I added the URL from the ACS census profile page for a particular PUMA – the cool part here is that the URL is consistent and contains the PUMA number. So, I replaced the specific number and inserted the [field] name from the PUMAs attribute table that contained the number. When I did the export, the URLs for each individual feature were created with their PUMA number inserted into the link.
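The field substitution trick is easy to imitate outside of ArcGIS if you ever need to rebuild the links yourself. Here’s a minimal sketch in Python: the URL pattern, the function name, and the sample rows are all made-up placeholders (the real ACS profile address isn’t reproduced here), so treat it as an illustration of the idea rather than what the Export to KML script does internally.

```python
# Hypothetical URL pattern; substitute the real profile address you are linking to.
URL_TEMPLATE = "https://example.org/acs/profile?geo=puma&id={puma}"

def description_for(puma_number, neighborhood):
    """Return the pop-up description for one PUMA feature."""
    # The PUMA number is dropped into the one spot where the URLs differ.
    url = URL_TEMPLATE.format(puma=puma_number)
    return f'{neighborhood} (PUMA {puma_number}): <a href="{url}">ACS profile</a>'

# Made-up rows standing in for the PUMA attribute table.
rows = [("03701", "Neighborhood A"), ("03702", "Neighborhood B")]
descriptions = [description_for(puma, name) for puma, name in rows]
```

Keeping the PUMA numbers as strings preserves any leading zeros, which matters when the number is part of a link.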

There were a few quirks – I discovered that you can’t automatically display labels on a Google Map without subterfuge, like creating the labels as images and not text. Google Earth (but not Maps) supports labels if you create multi-geometry where you have a point for a label and a polygon for the feature. If you select a labeling attribute on the initial options screen of the Export to KML tool, you create an icon in the middle of each polygon that has a different description pop-up (which I didn’t want so I left it to none and lived without labels). I made my features 75% transparent (a handy feature of Export to KML) so that you could see the underlying Google Map features through the PUMA, but this made the fill AND the lines transparent, making the features too difficult to see. After the export I opened the KML in a text editor and changed the color values for the lines / boundaries by hand, which was easy since the styles are saved by feature group (boroughs) and not by individual feature (pumas). I also manually changed the value of the folder open element (from 0 to 1) so that the feature and feature groups (pumas and boroughs) are expanded by default when someone opens the map.
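If you’d rather not make those edits by hand every time you re-export, the same two changes can be scripted. This is a rough sketch using Python’s standard library, not something I used for the project: the function name, the namespace (KML 2.2 is assumed, and older exports may declare a different one), and the sample KML are assumptions for illustration. KML colors are written as aabbggrr, so the first two hex digits control opacity.

```python
import xml.etree.ElementTree as ET

# Assumed namespace; check the xmlns attribute at the top of your own file.
KML_NS = "http://www.opengis.net/kml/2.2"

def fix_kml(kml_text, line_color="ff000000"):
    """Make boundary lines opaque and expand folders by default.

    line_color uses KML's aabbggrr order, so "ff000000" is opaque black.
    """
    ET.register_namespace("", KML_NS)  # write tags back out without a prefix
    root = ET.fromstring(kml_text)
    ns = {"k": KML_NS}
    # Only line colors are touched; polygon fills keep their transparency.
    for color in root.findall(".//k:LineStyle/k:color", ns):
        color.text = line_color
    # Flip existing <open> flags inside folders from 0 to 1.
    for open_el in root.findall(".//k:Folder/k:open", ns):
        open_el.text = "1"
    return ET.tostring(root, encoding="unicode")

# Tiny made-up document with one shared style and one folder.
SAMPLE = (
    '<kml xmlns="http://www.opengis.net/kml/2.2"><Document>'
    '<Style id="s"><LineStyle><color>40ff0000</color></LineStyle>'
    '<PolyStyle><color>40ff0000</color></PolyStyle></Style>'
    '<Folder><name>Bronx</name><open>0</open></Folder>'
    '</Document></kml>'
)
fixed = fix_kml(SAMPLE)
```

Because the styles are stored once per feature group, the loop only has a handful of elements to change no matter how many PUMAs are in the file.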

After making the manual edits, I uploaded the KML to my webserver and pasted the url for it into the Google Maps search box, which overlaid my KML on the map. Then I was able to get a persistent link to the map and code for embedding it into websites via the Google Map Interface. No need to add it to Google My Maps, as I have my own space. One big quirk – it’s difficult to make changes to an existing KML once you’ve uploaded and displayed it. After I uploaded what I thought would be my final version I noticed a typo. So I fixed it locally, uploaded the KML and overwrote the old one. But the changes I made didn’t appear. I tried reloading and clearing the cache in my browser, but no good – once the KML is uploaded and Google caches it, you won’t see any of your changes until Google re-caches. The conventional wisdom is to change the name of the file every single time – which is pretty dumb as you’ll never be able to have a persistent link to anything. There are ways to circumvent the problem, or you can just wait it out. I waited one day and by the next the file was updated; good enough for me, as I’ll only need to update it once a year.

I’m hosting the map, along with some static PDF maps and a spreadsheet of PUMA numbers and neighborhood names, from the NYC Data LibGuide I created (part of my college’s collection of research guides). If you’re looking for neighborhood names to associate with PUMA numbers for your city, you’ll have to hunt around and see if a local planning agency or non-profit has created them for a project or research study (as the Census Bureau does not create them). For example, the County of Los Angeles Department of Mental Health used PUMAs in a large study where they associated local place names with each PUMA.

If you’re interested in dabbling in some KML, there’s Google’s KML tutorial. I’d also recommend The KML Handbook by Josie Wernecke. The catch for any guide to KML is that while all KML elements are supported by Google Earth, there’s only partial support for Google Maps.

Evaluating Open Source GIS for Libraries

Wednesday, March 17th, 2010

I’ve hit a couple of milestones this month.

I had my first peer-reviewed journal article published, Evaluating open source GIS for libraries. After my initial exploration of open source GIS that I documented on this blog over a year and a half ago, I took a systematic approach to evaluating a number of software packages for thematic mapping. This article documents the tests and results and provides the requisite background on open source software, GIS, and how both are manifest in academic libraries. Given the lengthy process of academic publishing (the whole process began in Dec 2008 with my first test and ended in March 2010 with publication), some of my observations of individual software packages have changed with the release of bug fixes, new features, and new versions. Generally, individual software packages and open source GIS as a whole have improved during this short span of time, but my primary observations and the big picture still hold.

Title: Evaluating open source GIS for libraries
Author(s): Francis P. Donnelly
Journal: Library Hi Tech
Year: 2010 Volume: 28 Issue: 1 Page: 131 – 151
ISSN: 0737-8831
DOI: 10.1108/07378831011026742
Publisher: Emerald Group Publishing Limited

I’ve previously mentioned Steiniger and Bocher’s excellent article, An overview on current free open source desktop GIS developments in the International Journal of Geographical Information Science, which Steiniger has posted on his website. I recently discovered he’s written a second article with Hay entitled Free and Open Source Geographic Information Tools for Landscape Ecology in Ecological Informatics, which is also available there. The second article provides an in-depth look and great summary tables of landscape analysis applications for eight different open source GIS apps, focusing on advanced tools for researchers. In contrast, my article focuses on basic mapping capabilities for novice to intermediate users.

The other milestone is this blog – I just noticed that we’ve passed the two year mark. While there have only been a few public comments here and there, I have received a number of emails over the years with questions and comments and the number of visitors to the site has grown consistently from month to month. I’m glad that it’s been useful to so many people; it’s certainly been useful to me (as an extension to my feeble brain) and I’ll endeavor to keep it going. Thanks to everyone for your comments and feedback. Best – frank

Mapping Hard to Count Areas for Census 2010

Tuesday, February 23rd, 2010

There was an interesting article in the New York Times today about neighborhoods in New York that typically get under-counted in the Census. These include areas with high immigrant populations as well as places that have had new construction since the last census, as the buildings haven’t been added to the Census Bureau’s master address file.

What the article didn’t mention is that CUNY’s Center for Urban Research has created a great online app called the Census 2010 Hard to Count mapping site. The site is built on the Census Bureau’s Tract Level Planning Database, which identified twelve population and housing variables, such as language isolation, recent movers, poverty, and crowded housing, that were associated with low mail response in the 2000 Census. This tool was designed to help Census reps, local government officials, and community activists identify traditionally under-counted areas to ensure a more complete count this time around.

The database is national in scope, and you can easily map tracts for a particular state, county, city, metro area, or tribal area, and you can search for an area using an individual address. The map is built on a Google Maps interface, and zooming in will change the units mapped from larger units (states, counties, etc) to tracts. You can easily select one of the twelve variables color-coded in the menu to the left of the map, or a Hard to Count index of all the variables.

Reading List for Geographic Information Course

Saturday, August 29th, 2009

The fall semester is here, and I’m about to start teaching the class I mentioned in my last post (an information studies course on geographic information). I thought I’d share my reading list and try out the Open Book plugin. I chose my readings based on: my particular audience (undergraduate students from many disciplines with little or no background in geography), relevance (materials appropriate in a hybrid information studies / geography course), cost (assigning the students a single textbook that’s reasonably priced and covers all the bases, and supplementing it with other readings), and copyright (staying within the bounds of fair use by not assigning too much from a single work). Here goes:

[openbook]1593852002[/openbook] I decided to go with Krygier and Wood’s Making Maps as my assigned textbook. Since cartography is a visual and technical art, I thought it made sense to use a book that relies on visuals for explanations rather than text. It’s approachable, particularly for my students who won’t be coming from a geography background, affordable, wonderfully quirky, and covers all of the essentials of the geographic framework and map interpretation and design independent of specific GIS software.

[openbook]1405106727[/openbook] I’m using the first chapter of Cresswell’s book as a succinct introduction to how individuals define places, but would recommend the rest of the text for classes that cover geographic concepts and methods.

[openbook]026208354X[/openbook] I’m assigning the second and third chapters of Hill’s book. The second chapter, which discusses how people process, store, and use geographic information is the best summary of this topic that I’ve ever seen, and the third chapter is a good overview of the different types of geographic objects. As a librarian-geo nerd, I love the chapters that deal with coordinate metadata and gazetteers, but won’t be using them in this class.

[openbook]0262620014[/openbook] This is an urban planning / design classic, and I’ll have my students read the summary of Lynch’s city elements (based on his research, Lynch proposed that people mentally break the urban environment down into five types of elements in order to organize and navigate the city: paths, edges, districts, nodes, and landmarks).

[openbook]0470129050[/openbook] This is the only traditional textbook that I’ll be borrowing from (I actually used it when I was a freshman, way back when). While I’m using the previous three books to discuss egocentric places, or how we as individuals conceive of place, I’m using the first chapter of this book to give the students an overview of geocentric places – the formal, defined hierarchy of places that exist in the world – and to introduce them to the concept of regions.

[openbook]0226534146[/openbook] This has become a modern classic and I almost assigned it as a second textbook. I am assigning the chapter on maps for propaganda as a background to our discussion on map interpretation and communication, and will later use the chapter on census maps to talk about the effects of data classification and choice of enumeration units.

[openbook]1934356069[/openbook] This is the only software book that I’ll be using chapters from, so the students have some formal guide for using QGIS (in addition to the QGIS documentation). I’m using the chapters on vector and raster data.

[openbook]1412910161[/openbook] This concise, excellent book deals strictly with the concepts and principles behind GIS. I’m using the chapters on spatial search and geoprocessing, but would recommend the entire book for any GIS course, novice to advanced.

In addition to chapters from these books, I’ll also be using:

  • “Revolutions in Mapping” by John Noble Wilford, National Geographic Feb 1998 – a great overview of the history of cartography
  • USGS GIS poster – if there is such a thing, this is a “web classic” and an accessible intro to GIS
  • One article from a scholarly journal and one article from a mass market magazine to illustrate how geographic research is covered and used
  • And for shameless self-promotion, a summary I wrote about US Census data – In Three Parts

Finally, an honorable mention:

[openbook]1593855664[/openbook] If I were teaching an introductory GIS course in a geography or earth sciences department, this is certainly the book I would use, and for those of you in that boat I’d recommend checking it out. It does an excellent job of covering GIS principles without being software specific, contains exercises at the end of each chapter, and is well written and affordable. Since the scope of my course is broader than GIS and my audience more general and diverse, I opted to leave it out (but may still assign a chapter).

Print Composer in QGIS – ACS Puma Maps

Sunday, July 12th, 2009

I wrapped up a project recently where I created some thematic maps of 2005-2007 ACS PUMA level census data for New York State. I decided to do all the mapping in open source QGIS, and was quite happy with the result, which leads me to retract a statement from a post I made last year, where I suggested that QGIS may not be the best for map layout. The end product looked just as good as maps I’ve created in ArcGIS. There were a few tricks and quirks in using the QGIS Print Composer and I wanted to share those here. I’m using QGIS Kore 1.02, and since I was at work I was using Windows XP with SP3 (I run Ubuntu at home but haven’t experimented with all of these steps yet using Linux). Please note that the data in this map isn’t very strong – the subgroup I was mapping was so small that there were large margins of error for many of the PUMAs, and in many cases the data was suppressed. But the map itself is a good example of what an ACS PUMA map can look like, and of what QGIS can do.

  • Inset Map – The map was of New York State, but I needed to add an inset map of New York City so the details there were not obscured. This was just a simple matter of using the Add New Map button for the first map, and doing it a second time for the inset. In the item tab for the map, I changed the preview from rectangle to cache and I had maps of NY state in each map. Changing the focus and zoom of the inset map was easy, once I realized that I could use the scroll on my mouse to zoom in and out and the Move Item Content button (hand over the globe) to re-position the extent (you can also manually type in the scale in the map item tab). Unlike other GIS software I’ve experimented with, the extent of the map layout window is not dynamically tied to the data view – which is a good thing! It means I can have these two maps with different extents based on data in one data window. Then it was just a matter of using the buttons to raise or lower one element over another.
  • Legend – Adding the legend was a snap, and editing each aspect of the legend, the data class labels, and the categories was a piece of cake. You can give your data global labels in the symbology tab for the layer, or you can simply alter them in the legend. One quirk for the legend and the inset map – if you assign a frame outline that’s less than 1.0, and you save and exit your map, QGIS doesn’t remember this setting when you open your map again – it sets the outline to zero.
  • Text Boxes / Labels – Adding them was straightforward, but you have to make sure that the label box is large enough to grab and move. One annoyance here is, if you accidentally select the wrong item and move your map frame instead of the label, there is no undo button or hotkey. If you have to insert a lot of labels or free text, it can be tiresome because you can’t simply copy and paste the label – you have to create a new one each time, which means you have to adjust your font size and type, change the opacity, turn the outline to zero, etc each time. Also, if the label looks “off” compared to any automatic labeling you’ve done in the data window, don’t sweat it. After you print or export the map it will look fine.
  • North Arrow – QGIS does have a plugin for north arrows, but the arrow appears in the data view and not in the print layout. To get a north arrow, I inserted a text label, went into the font menu, and chose a font called ESRI symbols, which contains tons of north arrows. I just had to make the font really large, and experiment with hitting keys to get the arrow I wanted.
  • Scale Bar – This was the biggest weakness of the print composer. The scale bar automatically takes the unit of measurement from your map, and there doesn’t seem to be an option to convert your measurement units. Which means you’re showing units in feet, meters, or decimal degrees instead of miles or kilometers, which doesn’t make a lot of sense. Since I was making a thematic map, I left the scale bar off. If anyone has some suggestions for getting around this or if I’m totally missing something, please chime in.
  • Exporting to Image – I exported my map to an image file, which was pretty simple. One quirk here – regardless of what you set as your paper size, QGIS will ignore this and export your map out as the optimal size based on the print quality (dpi) that you’ve set (this isn’t unique to QGIS – ArcGIS behaves the same way when you export a map). If you create an image that you need to insert into a report or web page, you’ll have to mess around with the dpi to get the correct size. The map I’ve linked to in this post uses the default 300 dpi in a PNG format.
  • Printing to PDF – QGIS doesn’t have a built in export function for PDF, so you have to use a PDF print driver via your print screen (if you don’t have the Adobe PDF printer or a reasonable facsimile pre-installed, there are a number of free ones available on sourceforge – PDFcreator is a good one). I tried Adobe and PDFcreator and ran into trouble both times. For some reason when I printed to PDF it was unable to print the polygon layer I had in either the inset map or the primary map (I had a polygon layer of pumas and a point layer of puma centroids showing MOEs). It appeared that it started to draw the polygon layer but then stopped near the top of the map. I fiddled with the internal settings of both pdf drivers to no avail, and after endless tinkering found the answer. Right before printing to pdf, if I selected the inset map, chose the move item content button (hand with globe), used the arrow key to move the extent up one, and then back one to get it to its original position, then printed the map, it worked! I have no idea why, but it did the trick. After printing the map once, to print it again you have to re-do this trick. I also noticed that after hitting print, if the map blinked and I could see all the elements, I knew it would work. But, if the map blinked and I momentarily didn’t see the polygon layer, I knew it wouldn’t export correctly.

Despite a few quirks (what software doesn’t have them), I was really happy with the end result and find myself using QGIS more and more for making basic to intermediate maps at work. Not only was the print composer good, but I was also able to complete all of the pre-processing steps using QGIS or another open source tool. I’ll wrap up by giving you the details of the entire process, and links to previous posts where I discuss those particular issues.

I used 2005-2007 American Community Survey (ACS) data from the US Census Bureau, and mapped the data at the PUMA level. I had to aggregate and calculate percentages for the data I downloaded, which required using a number of spreadsheet formulas to calculate new margins of error (MOEs). I downloaded a PUMA shapefile layer from the US Census Generalized Cartographic Boundary files page, since generalized features were appropriate at the scale I was using. The shapefile had an undefined coordinate system, so I used the Ftools add-on in QGIS to define it. I also converted the shapefile from single-part to multi-part features. Then I used Ftools to join my shapefile to the ACS data table I had downloaded and cleaned up (I had to save the data table as a DBF in order to do the join). Once they were joined, I classified the data using natural breaks (I sorted and eyeballed the data and manually created breaks based on where I thought there were gaps). I used the Color Brewer tool to choose a good color scheme, and entered the RGB values in the color / symbology screen. Once I had those colors, I saved them as custom colors so I could use them again and again. Then I used Ftools to create a polygon centroid layer out of my puma/data layer. I used this new point layer to map my margin of error values. Finally, I went into the print composer and set everything up. I exported my maps out as PNGs, since this is a good image format for preserving the quality of the maps, and as PDFs.
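For anyone who would rather script the margin of error calculations than build them in a spreadsheet, here is a minimal sketch in Python. It uses the Census Bureau’s published approximation formulas for derived estimates, which I’m assuming are equivalent to the spreadsheet formulas mentioned above; the function names and the sample numbers are made up for illustration.

```python
import math

def moe_sum(moes):
    """MOE of a sum of estimates: square root of the sum of the squared MOEs."""
    return math.sqrt(sum(m * m for m in moes))

def moe_proportion(num, den, moe_num, moe_den):
    """MOE of the proportion num / den, where num is a subset of den."""
    p = num / den
    radicand = moe_num ** 2 - (p ** 2) * (moe_den ** 2)
    if radicand < 0:
        # When the value under the root is negative, the guidance is to
        # fall back to the ratio formula, which uses a plus sign instead.
        radicand = moe_num ** 2 + (p ** 2) * (moe_den ** 2)
    return math.sqrt(radicand) / den

# Made-up example: two subgroups combined, then taken as a share of a total.
combined_estimate = 1200 + 800
combined_moe = moe_sum([300, 400])
share_moe = moe_proportion(combined_estimate, 10000, combined_moe, 600)
```

Multiply the proportion and its MOE by 100 if you are mapping percentages. Both formulas are approximations, so MOEs for heavily aggregated data should be read with some caution.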

Updated Links for Data and Resources

Saturday, April 25th, 2009

I recently went through my pages of suggested links for data and resources to update and clean them up. I’ve included many of the cool resources I’ve discovered since I started writing this blog, which ended up in individual posts but not in these pages. I went over the resources page in particular, to try and classify the reference materials, tools, and software into useful categories rather than just having one large blob of stuff.

Transform Projections with GDAL / OGR

Tuesday, April 14th, 2009

The GDAL / OGR tools are an open source, cross platform, command-line toolkit that can be used for viewing GIS metadata, performing attribute queries, and converting file formats, among other things. It can also be used for transforming coordinate systems and projections for GIS files. I’ll demonstrate in this brief tutorial how to accomplish this using the OGR tools, which are for vector based GIS. The raster based GDAL tools work in a similar fashion.

Viewing basic coordinate system / projection info:

ogrinfo -al -so world_wgs.shp

Where ogrinfo is the name of the tool, -al is a switch that reports on all layers in the file (so you don’t have to name a layer), -so is a switch that limits the output to summary info instead of listing every feature, and world_wgs.shp is the name of our file. Run that command and we’ll get something that looks like this, with info about the features, coordinate system, and attribute fields of our shapefile:

INFO: Open of `world_wgs.shp'
using driver `ESRI Shapefile' successful.

Layer name: world_wgs
Geometry: Polygon
Feature Count: 243
Extent: (-179.808664, -89.677397) - (179.808664, 83.435942)
Layer SRS WKT:
GEOGCS["GCS_WGS_1984",
DATUM["WGS_1984",
SPHEROID["WGS_1984",6378137,298.257223563]],
PRIMEM["Greenwich",0],
UNIT["Degree",0.017453292519943295]]
CNTRY_NAME: String (254.0)
FIPS_CNT_1: String (254.0)
ISO_2DIGIT: String (254.0)
ISO_3DIGIT: String (254.0)
STATUS: String (254.0)
COLORMAP: Real (18.6)
CONTINENT: String (254.0)
UN_CONTINE: String (254.0)
REGION: String (254.0)
UN_REGION: String (254.0)
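If you want to check many files, the summary text can be picked apart with a short script. This is a minimal sketch, assuming output shaped like the sample above; the function name is hypothetical, and other GDAL versions may print different WKT keywords, so the patterns may need adjusting.

```python
import re

def parse_ogrinfo_summary(text):
    """Extract the feature count, extent, and SRS name from ogrinfo -al -so output."""
    info = {}
    count = re.search(r"Feature Count:\s*(\d+)", text)
    if count:
        info["feature_count"] = int(count.group(1))
    # \S between the two coordinate pairs accepts whatever dash character is printed
    extent = re.search(
        r"Extent:\s*\(([-\d.]+),\s*([-\d.]+)\)\s*\S\s*\(([-\d.]+),\s*([-\d.]+)\)", text
    )
    if extent:
        info["extent"] = tuple(float(v) for v in extent.groups())
    # The first GEOGCS or PROJCS keyword names the layer's coordinate system
    srs = re.search(r'(GEOGCS|PROJCS)\["([^"]+)"', text)
    if srs:
        info["srs_type"], info["srs_name"] = srs.group(1), srs.group(2)
    return info

sample = """Layer name: world_wgs
Geometry: Polygon
Feature Count: 243
Extent: (-179.808664, -89.677397) - (179.808664, 83.435942)
Layer SRS WKT:
GEOGCS["GCS_WGS_1984",
DATUM["WGS_1984",
"""
print(parse_ogrinfo_summary(sample))
```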

Convert coordinate systems supported by EPSG

GDAL / OGR and most open source GIS software support projections and coordinate systems that are part of the EPSG library. If you want to do a conversion between two coordinate systems and they are both supported by EPSG, you just have to reference the EPSG code that’s used to identify the system that you want to project to. You can look up codes using spatialreference.org.

Let’s say we want to convert our shapefile that’s in WGS 84 (common lat and long) to NAD 83 (used frequently in North America):

ogr2ogr -t_srs EPSG:4269 world_new.shp world_wgs.shp

Where ogr2ogr is the name of the tool, -t_srs is the option for transforming from one coordinate system to another, EPSG:4269 is the code that identifies the coordinate system we want the new file to have (NAD83), world_new.shp is the name of the output file that will have the new coordinate system, and world_wgs.shp is our input file. Note that the output file comes before the input file. If you run the command and get no error message, you’re in good shape. Just run the ogrinfo command on the new file to verify that it’s been re-projected.
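If you have a batch of files to convert, the same command can be assembled in a script. This is a rough sketch rather than a finished tool: the helper names and the "_new" output suffix are hypothetical, and the command is only run if ogr2ogr is found on your PATH.

```python
import shutil
import subprocess
from pathlib import Path

def reproject_command(src, target_srs, suffix="_new"):
    """Build the ogr2ogr argument list: options, then output file, then input file."""
    src = Path(src)
    dst = src.with_name(src.stem + suffix + src.suffix)
    return ["ogr2ogr", "-t_srs", target_srs, str(dst), str(src)]

def reproject(src, target_srs, suffix="_new"):
    """Run ogr2ogr if it is on the PATH; otherwise just report the command."""
    cmd = reproject_command(src, target_srs, suffix)
    if shutil.which("ogr2ogr") is None:
        print("ogr2ogr not found; would run:", " ".join(cmd))
        return cmd
    subprocess.run(cmd, check=True)  # raises an error if ogr2ogr fails
    return cmd

# Only builds and prints the command; call reproject() to actually run it
print(" ".join(reproject_command("world_wgs.shp", "EPSG:4269")))
```

Building the argument list separately from running it makes it easy to print and inspect the commands before touching any files.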

Convert coordinate system not supported by EPSG

The EPSG library is extensive, but doesn’t contain everything, particularly some global and continental map projections. GDAL / OGR can still do the job, but you’ll have to provide the tool with the proper frame of reference since the EPSG library doesn’t have the info. Let’s say we want to project our WGS file to the Robinson Projection, which is not part of EPSG.

First, go back to spatialreference.org and search for Robinson. Its ID code is ESRI 54030 – not part of the EPSG library. Click on the link for the projection to open its window. You’ll be able to look at the projection data in a number of standard file formats. Select OGC_WKT from the list, and it will open the text in a new window, showing you the parameters of that projection. In your browser, go up to file, save as, and save the file as robinson_ogcwkt.txt in the same directory as the shapefile you want to reproject.

Now that you have the projection info stored in the text file, run the following command to make the conversion:

ogr2ogr -t_srs robinson_ogcwkt.txt world_rob.shp world_wgs.shp

It’s the same command as our previous one, except that you’re referencing the text file with your data instead of an EPSG code.

Define an undefined coordinate system

If you run the ogrinfo command and your coordinate system is undefined, you should define it before doing anything else; you must define an undefined coordinate system before converting to another one. Look at the metadata that came with your file or go back to the source to figure out what it is. For example, the US Census Bureau Generalized Cartographic Boundary Files for 2000 are in NAD83 according to their metadata, but the files lack a projection definition.

To define one, use the following command:

ogr2ogr -a_srs EPSG:4269 states_nad83.shp states_unknown.shp

The only difference here is the -a_srs command is used to assign a coordinate system to a file – the rest of the parameters are the same. If you’re defining a non-EPSG projection, use the same method from the previous example – download a definition file from spatialreference.org and use the file name in place of the EPSG code.
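For shapefiles, the coordinate system definition lives in a sidecar .prj file, so a quick way to spot undefined layers in a folder is to look for shapefiles without one. A minimal sketch, with a hypothetical function name; it won't catch a .prj file that exists but holds the wrong definition.

```python
from pathlib import Path

def shapefiles_missing_prj(folder):
    """Return names of .shp files in folder with no matching .prj sidecar file."""
    missing = []
    for shp in sorted(Path(folder).glob("*.shp")):
        if not shp.with_suffix(".prj").exists():
            missing.append(shp.name)
    return missing

if __name__ == "__main__":
    # Check the current directory and flag files that need -a_srs first
    for name in shapefiles_missing_prj("."):
        print(name, "has no .prj; assign a coordinate system with -a_srs first")
```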

More help and where to download:

UC Santa Barbara NCEAS and the UC Davis Soil Lab both have short tutorials and sample commands of GDAL / OGR.

If you want to thumb through the world’s map projections, the folks at radicalcartography have a nice projection reference page with visuals and brief descriptions.

Visit the GDAL / OGR page for downloading, or if you’re a Windows or Mac user, you can download QGIS and GDAL / OGR together from the QGIS download page. Linux users can get GDAL / OGR through their package manager; depending on your distro, you may have it already.

QGIS: Data Defined Labeling and Table Joins

Saturday, March 7th, 2009

A little while ago I posted a text file with geographic centroids (centers) for each of the world’s countries. The reason why I put this together was that I wanted to test the data defined labeling features in QGIS. While automatic labeling in QGIS isn’t so hot (overlapping labels, multiple labels for each polygon), there are some powerful features for storing and referencing columns for annotation within the attribute table of shapefiles. One of the neat features is the ability to place labels based on coordinates stored in the attribute table.

The first step was to take the centroids file and join it to a shapefile of the world’s countries based on a common ID field, in this case FIPS country codes. QGIS doesn’t support table joins directly, but you can accomplish this with a good plugin called fTools, which includes a lot of additional and useful features. The instructions for getting fTools up and running are available on the fTools website; the installation doesn’t require you to download any files, you just handle everything through the QGIS plugin manager (if you have trouble seeing the plugin manager or getting fTools to appear, check to make sure that you have Python installed on your machine). Once fTools is up and running, you’ll see a Tools dropdown menu next to your other menus; drop it down, select data management tools, and then join attribute tables. You’ll get a dialog box asking which shapefile and field you want to join and which shapefile or table you want to join to it. The plugin only supports joins from other shapefiles and dbf tables, so you have to save the country centroids text file as a dbf before you do the join (you can do this in Calc or a pre-2007 version of Excel). These aren’t dynamic joins; fTools will create a new shapefile with the table fields attached.
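The kind of static, key-based join described here can be pictured with a small sketch. This is not fTools’ own code, just an illustration of matching records on a shared FIPS code and writing out new records; the field names (FIPS, NAME, LONG, LAT) and sample values are hypothetical.

```python
def join_on_key(features, table_rows, key):
    """Return new records: each feature's fields plus fields from the matching table row."""
    lookup = {row[key]: row for row in table_rows}
    joined = []
    for feat in features:
        match = lookup.get(feat[key], {})
        merged = dict(feat)  # copy, so the original records are left untouched
        # Add table fields without overwriting fields the feature already has
        for field, value in match.items():
            merged.setdefault(field, value)
        joined.append(merged)
    return joined

countries = [{"FIPS": "FR", "NAME": "France"}, {"FIPS": "XX", "NAME": "Unmatched"}]
centroids = [{"FIPS": "FR", "LONG": 2.0, "LAT": 46.0}]
for record in join_on_key(countries, centroids, "FIPS"):
    print(record)
```

Records with no match in the table simply keep their original fields, which is worth checking for after any join since mismatched codes are a common cause of missing labels.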

Once the join is complete, you can add the new shapefile with the new fields, click on the layer, and navigate to the labels tab. Hit the checkbox to turn the labels on, select the field that contains the label in the dropdown box at the top, then select data defined position from the menu below. You’ll see a new series of dropdowns on the right, and you can select your longitude column for the X coordinate and latitude column for the Y coordinate. Hit OK, and voila! You’ll have labels that are centered in the middle of each country.

Of course, the label placement will not be perfect in every case. There will be label overlap in areas with small countries, areas with many countries clustered together, and with countries that have long names. The scale and size of the font will also be a factor, and placing the country name in the center is not always ideal for small island nations. However, you can easily change the label placement by going into edit mode and changing the coordinates in the attribute table. You can mouse over the map and use the coordinate information that’s displayed beside the scale in the lower right-hand corner of the window to determine which coordinates work best for a given situation. If you produce several maps at the same area and scale, you can use the same settings over and over again. You can also globally change the placement of all the labels using some of the other label options, such as placing all labels above or to the top-right of the centroid.

Now in order for all of this to work, the coordinates in the country centroid file must be in the same coordinate system as the shapefile. Since the country centroid file uses basic latitude and longitude, I was able to do this with a shapefile that was in the basic WGS 84 geographic coordinate system. If you’re using a different geographic coordinate system or a projected coordinate system, you’ll have to convert the coordinates in the centroid file to match that system. I haven’t delved into this too deeply yet, but there are a number of free tools that you can download that should do this – one of them is called GEOTRANS, and it’s available for free download from the NGA. It can handle batch transformations of coordinate data stored in text files, and supports conversions to several different geographic and projected systems.
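To make the idea concrete, here is a minimal sketch of converting latitude and longitude centroids to a projected system by hand. It assumes spherical Web Mercator (EPSG:3857) as the target, chosen only because its forward formula is short; it is not a system used in this post, and for other systems a dedicated tool like GEOTRANS is the better route. The function name and sample point are hypothetical.

```python
import math

EARTH_RADIUS_M = 6378137.0  # sphere radius used by spherical (Web) Mercator

def lonlat_to_web_mercator(lon_deg, lat_deg):
    """Project lon/lat in degrees to spherical Mercator x/y in meters."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y

# Hypothetical centroid at roughly 2 degrees east, 46 degrees north
x, y = lonlat_to_web_mercator(2.0, 46.0)
print(round(x), round(y))
```

The converted x and y values would replace the longitude and latitude columns in the centroid table before the join, so the label coordinates match the projected shapefile.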

QGIS Label Placement With XY Coordinates


Copyright © 2012 Gothos. All Rights Reserved.