Joining CSV Files in QGIS

DBF files are the other thorny issue that comes up when I’m teaching the Intro to GIS workshops with QGIS – specifically how do you create and edit them? Doing that has been pretty inconvenient in the Windows world since they were deprecated in Excel 2007. You can download plugins or basic stand-alone software to do the job. Since I’m a Linux user I have the Open Office suite by default, and I can easily work with DBFs in Calc. That’s an option for Windows and Mac users too, but it’s kind of a drag to download an entire office suite just for working with DBFs (assuming most folks are use MS Office).

Another possibility is to dump DBF all together; even though the join table option in the fTools menu in QGIS only presents you the option of joining shapfiles or DBF, it turns out if you choose the DBF radio button option you can actually point to DBFs OR CSV files when you’re browsing to point to your data table. The join proceeds the same way, and you get a new shapefile with the data from the CSV joined to it. CSVs can be easily created from any spreadsheet, text editor, or database program in any operating system, so you can prep your attribute data in your program of choice before exporting to CSV and importing to GIS.

The problem with this approach is that all of the fields from the CSV file are automatically saved as text or strings when they’re appended to the shapefile. This means that any numeric data you have can’t be treated numerically; you can’t perform calculations or classify data as value ranges for mapping. You can go into the attribute table, enter the edit mode, and use the field calculator to create new fields where you convert the values to integers, but this adds a bunch of duplicate fields and is rather messy. You can’t delete columns from shapefiles within QGIS; you have to edit the attributes outside the software to remove extra columns.

Here’s an easy work around: you can create a CSVT file to go along with your CSV file. Open a text editor like Notepad or gedit and create a one line file where you specify the data type for each of the fields in your CSV file. Save the CSVT with the SAME NAME as your CSV file. Now when you go to join your CSV to your shapefile in QGIS, it will read the CSVT and save all of your fields properly in the new shapefile.

So for a CSV file like this with six fields:

CODE, ST, ID_NAICS51, EMP_51, ID_TOTAL, TOTAL_EMP
01, AL, ENU0100010551, 27013, ENU0100010510, 1570188
02, AK, ENU0200010551, 6988, ENU0200010510, 237708
04, AZ, ENU 0400010551, 41833, ENU0400010510, 2174919 …

You’d create a CSVT file like this:

“String”, “String”, “String”, “Integer”, “String”, “Integer”

So the 4th and 6th fields that have numeric values are saved as integers. There are a few other field types you can use, like Real for decimal numbers and some Date / Time options, and you can specify length and precision – see this post for details.

Using shapefiles, CSVs, and CSVTs are fine for small to medium projects and for the introductory workshops I’m teaching; geodatabases are another option and are certainly better for medium to large projects, but introducing them in my intro workshop is a little too much.

(NOTE – in QGIS 1.7 the join tool has been dropped from the ftools menu. To join a csv or dbf in 1.7+, you have to add your data table as a vector to your map, and then use the join tab within the properties menu of the feature you want to join it to. See this post for details.)

Tags: , , , , , , ,

Comments are closed.


Copyright © 2014 Gothos. All Rights Reserved.
No computers were harmed in the 0.346 seconds it took to produce this page.

Designed/Developed by Lloyd Armbrust & hot, fresh, coffee.