SpatiaLite and QGIS: Loading, Joining, Mapping Shapefiles and Tables
I stuck with with the Long Term Support Version of QGIS (1.02) last semester while I was teaching, but now I finally have had a chance to experiment with the latest version (1.4) which has a lot of great new features including: improved symbolization, labeling, print layouts, and support for SpatiaLite – a personal (single file) geodaatbase based on SQLite. For a summary of the new QGIS features check out the QGIS blog and this developer’s blog, and for an overview of SpatialLite you can go to the official docs page and this tutorial. The latter will show you the obvious strengths of SpatialLite – the ability to store features and attributes in one container, with the ability to run standard SQL and spatial queries on both. Since that’s covered pretty well, I thought I’d run through a basic operation – how do you load a shapefile and an attribute table in SpatialLite, join them, connect to the database in QGIS and map the data. I’m using the SpatialLite GUI, but for those more inclined you could use the command line tool instead.
Fire up the GUI, and create a new, empty geodatabase under the File menu.Once we have a container, we can hit the load a shapefile button. I have a census PUMA layer for NYC that I’ve formatted by erasing water features. Click load, go to the path, give the file a nice brief name, and specify the SRID – the EPSG code that specifies what coordinate system my shapefile is in. In this case, it’s 4269 as the layer is in NAD83 (you can check your files by opening the prj file in a text editor or by using the OGR tools).
Once it’s loaded, you can expand the listing in the table of contents to see all the field names of the feature, and you can right click on it and choose the edit option to see all of the data in the attribute table.
Next we can load a data table. I have a 2006-2008 ACS census table in tab-delimited text format that I’ve pre-formatted. The table has the number of workers (labor force age 16+) and number of workers who commuted to work via the subway for every PUMA in the State of New York (it’s faster to download the whole state and filter out the city PUMAs later). Hit the load txt/csv button, specify a path, a new table name (subway_commuters), the delimiter used, and load the table. It’s given a different icon in the table of contents (toc), to distinguish a regular data table from a feature class.
The next step is to join them together; I already insured that they both share a common, unique identifier; a FIPS code that has a state and PUMA code. If I run a standard SELECT query I can join the tables in a temporary view – but that’s not what I want. I can save the query as a view, but I won’t be able to access the view within QGIS (at least not with this current stable version of SpatialLite, 2.31). What we have to do here is create a brand new table that combines both the puma feature class and the subway commuter table (referred to in Microsoft Access land as a Make Table Query). Here’s the SQL that we type in the command window:
CREATE TABLE pumas_nyc_subcom AS
FROM nyc_pumas, sub_commuters
Execute the query, and we get a message that an empty results set was generated. Uh, ok. But then if we select the database path at the top of the TOC , right-click, and refresh, we’ll see our new combined table, pumas_nyc_subcom, and we can expand it and take a look at the data. The join worked, but we’re not done yet. Right now this is just a regular old data table (notice the icon?) We have to turn this into a feature class next.
Expand the fields for the new table in the TOC, select the Geometry field, right click, and check the geometry. We’ll see that it’s MULTIPOLYGON geometry, the projection is still NAD83, and there are 55 features (the non-NYC PUMAS were filtered out during the join, leaving us just with NYC data). Right click on Geometry again, choose the option to Recover Geometry. Specify the geometry type and the SRID, run, refresh the database, and success. A little globe appears next to pumas_nyc_subcom, indicating that it’s now a feature class.
At this point we can fire up QGIS. In the toolbar for versions post 1.02, there should be a connect to SpatialLite button. Hit connect, add a New database, and browse to get to it. Once it’s loaded, then we can hit connect to connect to it, and we’ll be able to see all feature classes (but NOT data tables, which is why we had to go through the join). Select pumas_nyc_subcom, which has features and data, and click add.
As with any GIS, now we have to symbolize the features to map the subway commuters. Right click on the layer in the table of contents, select properties, and you’ll get to the recently redesigned properties menu. Go to Symbology, map the subway commuters field by graduated values, change some colors, and voila, a map!
The developers are still experimenting with improvements – there’s a button in the upper right-hand corner of the symbology tab that asks you if you want to try the New Symbology – this is a new layout, with the introduction of graduated color palettes. It’s pretty slick, but still a work in progress (color ranges are assigned from dark to light, with the lowest values getting the darkest color; the opposite of cartographic convention). The same label properties are there too, but you can experiment with the improved labeling engine under the Plugins menu. The automatic placement of labels is vastly improved.
Mapping totals for subway commuters isn’t as interesting as mapping the percentage of commuters in each PUMA who ride the subway. So I’ll share my experiments working with calculated fields (in SpatialLite and QGIS) in my next post.