A brief note – I’ve updated and replaced the country centroids file that I was previously hosting. I extracted data with geographic centroids in latitude and longitude for each country and dependency in the world using extracts from the NGA’s GNS and the USGS GNIS. Data is current as of Feb 2012, with long and short names for countries and two letter alpha FIPS and ISO codes for identification and attribute linking. Available for download on the Resources page.
A little while ago I posted a text file with geographic centroids (centers) for each of the world’s countries. The reason why I put this together was that I wanted to test the data-defined labeling features in QGIS. While automatic labeling in QGIS isn’t so hot (overlapping labels, multiple labels for each polygon), there are some powerful features for storing and referencing columns for annotation within the attribute table of shapefiles. One of the neat features is the ability to place labels based on coordinates stored in the attribute table.
The first step was to take the centroids file and join it to a shapefile of the world’s countries based on a common ID field, in this case FIPS country codes. QGIS doesn’t support table joins directly, but you can accomplish this with a good plugin called fTools, which includes a lot of additional and useful features. The instructions for getting fTools up and running are available on the fTools website; the installation doesn’t require you to download any files, you just handle everything through the QGIS plugin manager (if you have trouble seeing the plugin manager or getting fTools to appear, check to make sure that you have Python installed on your machine). Once fTools is up and running, you’ll see a Tools dropdown menu next to your other menus – drop it down, select data management tools and join attribute tables. You’ll get a dialog box asking which shapefile and field you want to join and which shapefile or table you want to join to it. The plugin only supports joins from other shapefiles and dbf tables, so you have to save the country centroids text file as a dbf before you do the join (you can do this in Calc or a pre-2007 version of Excel). These aren’t dynamic joins; fTools will create a new shapefile with the table fields attached.
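If you want to see what the join is doing under the hood, here is a minimal sketch in plain Python. It is not how fTools is implemented; the function name `join_by_key`, the field names, and the sample values are all made up for illustration. It matches records on a shared FIPS code and, like the plugin, produces a new table rather than a dynamic link.

```python
def join_by_key(base_rows, extra_rows, key):
    """Return copies of base_rows with fields from the matching extra_rows attached."""
    lookup = {row[key]: row for row in extra_rows}
    joined = []
    for row in base_rows:
        merged = dict(row)  # copy, so the original table is left untouched
        match = lookup.get(row[key], {})
        for field, value in match.items():
            if field != key:
                merged[field] = value
        joined.append(merged)
    return joined

# Hypothetical sample records; coordinates are rough, illustrative values.
countries = [{"FIPS": "BC", "NAME": "Botswana"}, {"FIPS": "XX", "NAME": "Nowhere"}]
centroids = [{"FIPS": "BC", "LAT": "-22.0", "LONG": "24.0"}]

result = join_by_key(countries, centroids, "FIPS")
```

Records without a match (the made-up "XX" here) come through with no coordinate fields, which is worth checking for after any join.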
Once the join is complete, you can add the new shapefile with the new fields, click on the layer, and navigate to the labels tab. Hit the checkbox to turn the labels on, select the field that contains the label in the dropdown box at the top, then select data defined position from the menu below. You’ll see a new series of dropdowns on the right, and you can select your longitude column for the X coordinate and latitude column for the Y coordinate. Hit OK, and voila! You’ll have labels that are centered in the middle of each country.
Of course, the label placement will not be perfect in every case. There will be label overlap in areas with small countries, areas with many countries clustered together, and with countries that have long names. The scale and size of the font will also be a factor, and placing the country name in the center is not always ideal for small island nations. However, you can easily change the label placement by going into edit mode and changing the coordinates in the attribute table. You can mouse over the map and use the coordinate information that’s displayed beside the scale in the lower right-hand corner of the window to determine which coordinates work best for a given situation. If you produce several maps of the same area at the same scale, you can use the same settings over and over again. You can also globally change the placement of all the labels using some of the other label options, such as placing all labels above or to the top-right of the centroid.
Now in order for all of this to work, the coordinates in the country centroid file must be in the same coordinate system as the shapefile. Since the country centroid file uses basic latitude and longitude, I was able to do this with a shapefile that was in the basic WGS 84 geographic coordinate system. If you’re using a different geographic coordinate system or a projected coordinate system, you’ll have to convert the coordinates in the centroid file to match that system. I haven’t delved into this too deeply yet, but there are a number of free tools that you can download that should do this – one of them is called GEOTRANS, and it’s available for free download from the NGA. It can handle batch transformations of coordinate data stored in text files, and supports conversions to several different geographic and projected systems.
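To give a sense of what such a conversion involves, here is a small sketch of one well-known case: turning WGS 84 longitude and latitude into spherical Web Mercator meters (the system often labeled EPSG:3857). This is only an illustration of the math for that one projection, not what GEOTRANS does internally, and the function name is my own. Conversions between different datums are more involved and are better left to a dedicated tool.

```python
import math

EARTH_RADIUS_M = 6378137.0  # sphere radius used by spherical Web Mercator

def lonlat_to_web_mercator(lon, lat):
    """Convert decimal-degree longitude/latitude to spherical Mercator x/y in meters."""
    if abs(lat) >= 90:
        raise ValueError("Mercator is undefined at the poles")
    x = EARTH_RADIUS_M * math.radians(lon)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat) / 2))
    return x, y
```

Longitude is the X input and latitude the Y input, the same ordering used for the label fields.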
I just added a new resource and updated another one on the resources page. I put together a file that contains the centroids (geographic centers) of all of the countries in the world, plus a few territories and dependencies. The centroids are in latitude and longitude coordinates based on WGS 84 in two formats: decimal degrees and degrees / minutes / seconds. It’s a tab delimited text file that you can open or import into any spreadsheet or database program. Each record is uniquely identified by a FIPS 10 code.
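If you ever need to move between the two formats yourself, the arithmetic is simple. The sketch below is my own illustration (the function names are not part of the file or any library): it rounds to whole seconds and uses a hemisphere letter instead of a minus sign for the degrees / minutes / seconds form.

```python
def dd_to_dms(value, is_latitude):
    """Convert decimal degrees to (degrees, minutes, seconds, hemisphere)."""
    if is_latitude:
        hemisphere = "N" if value >= 0 else "S"
    else:
        hemisphere = "E" if value >= 0 else "W"
    total_seconds = round(abs(value) * 3600)
    degrees, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return degrees, minutes, seconds, hemisphere

def dms_to_dd(degrees, minutes, seconds, hemisphere):
    """Convert degrees, minutes, seconds and a hemisphere letter to decimal degrees."""
    sign = -1 if hemisphere in ("S", "W") else 1
    return sign * (degrees + minutes / 60 + seconds / 3600)
```

For example, `dd_to_dms(40.8492, True)` gives `(40, 50, 57, "N")`. Converting back loses a little precision because of the rounding to whole seconds.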
I downloaded most of the data from the NGA’s GEOnet Names Server (GNS). I blogged about the GNS a while back, pointing out that you could query this gazetteer for individual places or you could download files that have all the features for each country in the world. While it took some time to figure out, you can actually take a middle road and query the database for specific categories of features that you can download. I used the text-based search; the links on the left side of the screen open different input boxes that you can use to query or exclude data. I managed to query top-level administrative units (countries) and to exclude most variant country names. After I downloaded the file, I still had to go in and do some clean-up, and I had to go back and get countries I missed by hand – these were mostly dependencies and territories that were excluded based on the search I did (Greenland, French Guiana, Netherlands Antilles, and a number of others).
Then I realized that the GNS excludes the United States and all of its territories. So, I went over to the USGS Geographic Names Information System (GNIS) and grabbed the data for the US territories. The GNIS is simpler to navigate and you can download records pretty easily. They didn’t have a record for the United States as a whole, so I had to go over to the Census Bureau to get coordinates for the US centroid.
I brought all of these records into one file and placed it on the resources page for download, along with some metadata to describe it. Why would you want to use this stuff? You can use it for basic distance calculations, or as an annotated label field for label placement in GIS. More about that in my next post.
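As an example of a basic distance calculation, here is a sketch of the haversine formula, which gives the great-circle distance between two longitude / latitude points on a sphere. The function name and the 6371 km mean Earth radius are my choices for illustration; treating the Earth as a sphere is an approximation, which is usually fine for centroid-to-centroid distances.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in kilometers between two points given in decimal degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))
```

You would feed it the decimal degree columns from the centroid file, one pair of countries at a time.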
I also updated the country code cross-reference file that I took from the CIA World Factbook. You can use this as a bridge table to relate tables that use different identifiers. So if you wanted to join the FIPS-based centroid file to an ISO-based shapefile of countries, you can join the centroids to the bridge first based on FIPS, and then join that new table to the shapefile based on ISO.
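Here is a rough sketch of that two-step join in Python. The field names (`FIPS`, `ISO`, `LAT`, `LONG`), the function names, and the sample coordinates are assumptions for illustration and won't necessarily match the headings in the actual files.

```python
def centroids_by_iso(centroids, bridge):
    """Step 1: use the bridge table to re-key FIPS-based centroids by ISO code."""
    fips_to_iso = {row["FIPS"]: row["ISO"] for row in bridge}
    keyed = {}
    for centroid in centroids:
        iso = fips_to_iso.get(centroid["FIPS"])
        if iso is not None:
            keyed[iso] = centroid
    return keyed

def attach_centroids(shape_rows, centroids, bridge):
    """Step 2: join the re-keyed centroids to ISO-based shapefile records."""
    by_iso = centroids_by_iso(centroids, bridge)
    joined = []
    for row in shape_rows:
        centroid = by_iso.get(row["ISO"])
        merged = dict(row)
        merged["LAT"] = centroid["LAT"] if centroid else None
        merged["LONG"] = centroid["LONG"] if centroid else None
        joined.append(merged)
    return joined

# Illustrative records: Botswana is BC in FIPS and BW in ISO; Germany is GM and DE.
bridge = [{"FIPS": "BC", "ISO": "BW"}, {"FIPS": "GM", "ISO": "DE"}]
centroids = [{"FIPS": "BC", "LAT": -22.0, "LONG": 24.0}]
shapes = [{"ISO": "BW", "NAME": "Botswana"}, {"ISO": "DE", "NAME": "Germany"}]

labeled = attach_centroids(shapes, centroids, bridge)
```

Germany ends up with empty coordinates here only because the sample centroid list leaves it out, which shows how unmatched records surface.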
Sorry that November has been another crummy month for posts. Here’s one that I’ve been meaning to write for quite a while.
While there is a lot of free GIS data out there, one of the black holes is business data. Specifically, if you want to plot all of the businesses in one industry or all of the branches or locations of one company, where do you get the data? I’ve found that, if you need a comprehensive resource, this is one of those datasets that you have to pay for.
At our library we subscribe to a great business directory called ReferenceUSA, which is produced by a company called InfoUSA. Their directories of American and Canadian businesses are extremely comprehensive and cover every business large and small. They also have an international directory that has mid-size to large businesses. You can generate lists of businesses using several criteria and filters.
For places, you can specify the entire country, states, counties, places, or ZIP codes. You can generate lists based on company names, keywords, or NAICS codes to grab all of the businesses in one industry. Once you have your list, you can click on each individual business to get a detailed profile. For GIS purposes, you’ll want to use the download option. Depending on your subscription, you’ll be able to download only a certain number of records at a time (we can get 25 records per download). Just download each batch as a csv file, save it, and open it in a spreadsheet, then keep downloading subsequent batches and copying and pasting the records into a master file.
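If copying and pasting dozens of batches sounds tedious, the stitching can be scripted. This is a sketch under the assumption that every batch has the same header row; the function name and sample columns are hypothetical and not the actual ReferenceUSA headings.

```python
import csv
import io

def merge_batches(batches):
    """Combine CSV texts that share one header into a single list of row dictionaries."""
    merged = []
    header = None
    for text in batches:
        reader = csv.DictReader(io.StringIO(text))
        if header is None:
            header = reader.fieldnames
        elif reader.fieldnames != header:
            raise ValueError("batch header does not match the first batch")
        merged.extend(reader)
    return merged

# Two made-up batches standing in for separate downloads.
batch_one = "Company Name,Latitude,Longitude\nAcme Hardware,40.8492,-73.8800\n"
batch_two = "Company Name,Latitude,Longitude\nBest Bagels,40.7000,-73.9000\n"

master = merge_batches([batch_one, batch_two])
```

With real downloads you would read each saved file's text in place of the sample strings.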
When you go to download, you’ll be prompted to choose basic, detailed, or custom. Basic isn’t going to cut it, as it’s missing the key fields – latitude and longitude coordinates. Choose the detailed option to get all of the fields. The custom option has some bugs – you’ll get lat and long without decimal places and some of the data for fields will be missing. Once you have all of the detailed records, you can delete a lot of the unnecessary fields. You’ll want to, as many of the field headings are not database friendly – many are long and contain spaces, which will cause problems when you go to import the table into GIS. So be sure to delete any that you don’t need and fix the ones you do need.
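Fixing the headings can also be automated. The sketch below is one possible rule set, not a standard: it swaps anything that isn't a letter or digit for an underscore, uppercases the result, and trims it to 10 characters, which is the field-name limit in the dbf tables that shapefiles use. The function name and the `F_` prefix for names that start with a digit are my own inventions.

```python
import re

def clean_field_name(name, max_length=10):
    """Make a column heading safe for import: no spaces or punctuation, limited length."""
    cleaned = re.sub(r"[^0-9A-Za-z]+", "_", name.strip()).strip("_").upper()
    if cleaned and cleaned[0].isdigit():
        cleaned = "F_" + cleaned  # hypothetical prefix so the name starts with a letter
    return cleaned[:max_length]
```

For instance, `clean_field_name("Employee Size (Range)")` returns `"EMPLOYEE_S"`. Trimming can make two long headings collide, so check the cleaned names for duplicates before importing.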
Once you have your table ready, add it to your favorite GIS program. In ArcGIS you can use the Add XY Data feature to plot the points and turn them into a shapefile. Remember to specify the X coordinate as your longitude field and the Y coordinate as latitude, and define your geographic coordinate system as WGS 84. Once you plot them, right click on the feature in the Table of Contents and export them out as a shapefile so you have a permanent layer (see my previous XY post for more details). You can map the businesses as regular old points, or make some graduated symbols based on some of the attributes, like sales or total employees (ReferenceUSA doesn’t provide the exact data, but identifies a range, e.g. 1 to 10 employees, 11 to 25, etc).
Most of the open source alternatives also have a tool or plugin that allows you to plot XY data. Of course, the data does include address fields if you wanted to geocode your points rather than plot XY (but plotting XY is a million times easier and doesn’t require downloading huge street network files).
The good news here is that if you’re not affiliated with a university, you can probably get access to this db from a large public library, as many will have a subscription to a business directory as a matter of course. If they don’t have RefUSA they may have an alternative like the D and B Million Dollar Database. It’s another business directory that allows you to download XY data for businesses, but it is not nearly as comprehensive.
Sorry that October has obviously been a pretty weak month for posts. I’ve been driven to distraction lately and haven’t done much GIS related work.
I was working on a project this week that involved manipulating data tables, so I thought I’d share a couple tips here. A number of months ago I wrote a post about manipulating FIPS codes and text-based ID fields. But what if you have to manipulate numeric fields? Adding decimal places, zeros, etc? The answer is – math!
In one field, I had a population figure from the 1970 Census that had been rounded to the hundreds place, so it was listed like this:
Bronx   14718
I wanted to make this a little more explicit by adding the appropriate zeros, so in Excel (or Calc if you prefer) I created a formula to multiply this by 100 =(c2*100) to get the full number with zeros:
Bronx   1471800
I also had fields with latitude and longitude coordinates in decimal degrees, but they lacked decimal points. The longitude field also lacked the minus sign, which means if we plotted the points they would end up in Asia instead of North America (longitude east of the dateline and west of the prime meridian is notated as negative in decimal degrees, as is latitude south of the equator). I knew from the metadata that each coordinate pair was precise to four decimal places, and I knew all of my points were in North America. So I created a formula where I took the latitude and divided by ten thousand =(c3/10000) and took the longitude, divided by ten thousand and multiplied by -1 =((c4/10000)*-1). Here’s the before and after:
Bronx   408492   738800
Bronx   40.8492   -73.8800
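The same fixes are easy to script if you have more rows than you want to handle in a spreadsheet. This sketch just mirrors the arithmetic above; the function name and the `western` flag are mine, and it assumes, as in my data, that every point belongs in the western hemisphere and north of the equator.

```python
def fix_coordinates(raw_lat, raw_lon, decimals=4, western=True):
    """Restore the missing decimal point and, for western longitudes, the minus sign."""
    scale = 10 ** decimals
    lat = raw_lat / scale
    lon = raw_lon / scale
    if western:
        lon = -lon
    return lat, lon

# The Bronx record from the example above.
population_1970 = 14718 * 100
bronx = fix_coordinates(408492, 738800)
```

Here `population_1970` is 1471800 and `bronx` is `(40.8492, -73.88)`, matching the before and after rows.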
Some of this may seem pretty obvious, but if you’re used to working with text-based ID fields all of the time (like I am), it’s easy to forget that all you need is simple math to fix number fields.
The last step I took was to check for null values. A few of my data points had 0,0 listed for lat and long, because coordinate data was missing for those particular places. The problem is that 0 IS a value! If we plotted this data, these points would show up where the equator and prime meridian meet below western Africa. You have to represent “no data” as a blank value or null, and not as a zero. I fixed those, plotted, and was good to go.
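In code, the fix amounts to treating a 0,0 pair as missing before you plot anything. A minimal sketch (the function name is hypothetical, and it assumes none of your real locations sit exactly at 0,0):

```python
def zero_pair_to_null(lat, lon):
    """Treat a 0,0 coordinate pair as missing data; leave every other pair alone."""
    if lat == 0 and lon == 0:
        return None, None
    return lat, lon
```

Note that only the pair is tested: a point with a latitude of exactly 0 and a real longitude is left as is, since it could legitimately lie on the equator.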
Here’s a tutorial I’ve been meaning to write: adding a table of longitude and latitude coordinates to ArcMap and turning them into features. For this example, I’ll be using place names from the GEOnet Names Server country files. The US National Geospatial-Intelligence Agency has a pretty extensive list of geographic features for each country, with coordinates in many formats, including longitude and latitude in decimal degrees. I’ll use Botswana in southern Africa as an example, as it has a small record set and because I have some admin boundaries handy that I’ve downloaded from SAHIMS.
- Download the file from the GNS and unzip it. It is a tab-delimited text file. If you like, you can open it in Excel or another spreadsheet to see what it looks like. This works fine for this example, but won’t work for larger or more populated countries because the files will exceed the maximum number of records that a spreadsheet can handle (65k). You’ll need to import the file into a database (Access for example) if you want to take a look in those cases. In either event, you’ll be able to add the text file directly to ArcMap, so no worries.
- Open ArcMap and under the Tools menu, select Add XY Data. In the dialog box, you’ll select the file that contains your XY coordinates. Choose the text file you’ve downloaded. ArcMap will then search through the fields and look for appropriate ones to add as X and Y fields. In this case, it should correctly choose LONG for X and LAT for Y. If Arc couldn’t figure it out, you would have to specify which columns have the coordinates. Longitude is ALWAYS the X coordinate, and Latitude is ALWAYS the Y. Finally, you’ll select a projection. Choose the standard geographic coordinate system WGS 1984, which is usually a safe bet when adding long/lat data from most sources.
- Hit OK, and Arc will plot the coordinates (after you click through the warning message). In this example, it looks like there is one wayward point, way to the north. When you see something like this, it often means that one of the coordinates is missing a minus sign: latitudes below the equator are negative, as are longitudes east of the international date line and west of the prime meridian. If you use the Identify tool, you’ll see that the minus sign for latitude for this wayward point is missing. The easiest thing to do would be to go back into the text file, edit it, and add it to ArcMap again.
- Even though Arc has plotted the points, they still don’t exist as features (remember the warning message? That’s essentially what it was saying). Select the plotted points in the Table of Contents, right-click, select Data, and select Export. Export the points out as a new shapefile or a feature class in a geodatabase. Then add the new features to the map.
- At this point, it may be helpful to have a frame of reference for all of these points. Get your hands on some administrative layers, like country boundaries. I downloaded the outline of Botswana from SAHIMS. This step usually requires projecting and reprojecting, as you’ll need to get your points layer to match the projection of the other files you’re working with. I always use the ArcToolbox within ArcCatalog to fiddle with projections and then add the finished files to a new, blank map in ArcMap. In my case, the Botswana boundary was undefined – I had to consult the metadata from their website to figure out what the projection is (NAD 1927) and then define it using the ArcToolbox (Data Management Tools, Projections and Transformations, Define Projection). Then, I had to convert the Botswana points layer from WGS 1984 to match the boundary’s NAD 1927 projection (using Data Management Tools, Projections and Transformations, Feature, Project).
- Add the projected boundary and reprojected points to your map. Many of these points are point features (villages, towns, farms, mountain peaks), while others represent the geographic centers of lines (roads, rivers) or areas (administrative areas, parks, reserves). You’ll probably want to extract certain kinds of features. At this point, you’ll want to take a look at the attribute table for the points file and consult the GNS description for the names files. The description will tell you what each of the data columns represents and what all of the codes mean. The FC field will come in quite handy here, as it designates categories for each feature. So if we wanted to extract populated places, under the Selection Menu in ArcMap we could do a Select by Attribute where the field FC is equal to P, which is the code for populated place features. Once they are selected, you can do a Data, Export to create a new shapefile with just those features.
- Alternatives do abound here. If you prefer, you could do a lot of the work of editing and creating feature subsets within a geodatabase. You can also follow these same, general procedures using open source tools (I believe that QGIS has a tool for adding XY data). And while we’re discussing a specific example here, the same basic steps would apply for any XY dataset.
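As one example of doing parts of this outside of a GIS, here is a sketch that reads a tab-delimited names file, pulls out the populated places by FC code, and flags any record with a positive latitude, which can't be right for a country that lies entirely south of the equator like Botswana. The sample rows, the FULL_NAME column, and the function names are made up for illustration; real GNS files have many more columns.

```python
import csv
import io

# Hypothetical three-row extract with illustrative values.
SAMPLE = (
    "FC\tLAT\tLONG\tFULL_NAME\n"
    "P\t-24.65\t25.91\tGaborone\n"
    "H\t-19.50\t23.00\tSample River\n"
    "P\t21.17\t27.51\tSample Town\n"
)

def read_rows(text):
    """Parse tab-delimited text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))

def select_feature_class(rows, code):
    """Keep only the rows whose FC field equals the given code."""
    return [row for row in rows if row["FC"] == code]

def suspicious_latitudes(rows):
    """Return rows with a positive latitude, a likely missing minus sign for this country."""
    return [row for row in rows if float(row["LAT"]) > 0]

rows = read_rows(SAMPLE)
populated = select_feature_class(rows, "P")
wayward = suspicious_latitudes(rows)
```

The selection is the scripted equivalent of a Select by Attribute where FC equals P, and the latitude check catches the kind of wayward point described in the steps above before it ever gets plotted.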