Formulas for Working With Census ACS Data in Excel / Calc

After downloading US census data, you often need to reformat it before using it. It’s quite common that you download files where the population is broken down by gender and age, and you need to aggregate the data to get a total or divide a particular characteristic to get a percent total. This is pretty straightforward if you’re working with decennial census data, but data from the American Community Survey (ACS) is a little trickier to deal with since you’re working with estimates that have a margin of error. When creating new data, you also have to calculate what the margin of error is for your derived numbers. I’ll walk through some examples of how you would do this in a spreadsheet (the formulas below will work in either Excel or Calc).

Creating an Aggregate

We’ll use the following data in our example:

screenshot1

We have the total population of people three years and older who are enrolled in school, and a breakdown of this population enrolled in grades 1 through 4 and grades 5 through 8 in a few counties in New York, with margins of error for each data point. Our data is from the 3 year averaged 2005-2007 American Community Survey.

Let’s say we want to create a total for students who are enrolled in grades 1 through 8 for each county. We create a new column and sum the estimates for each county with the formula e3+g3, or sum(e3:g3).

To calculate a margin or error (MOE) for our grade 1 to 8 data, first we have to use the find and replace command to get rid of the “+/-” signs in the MOE column, so our spreadsheet will treat our values as numbers and not text (this is an issue if you downloaded the data as an Excel file - if you download a txt file the +/- is not included). Depending on the dataset you’re working with, you may also need to replace dashes, which represent data that was null or not estimated.

Once the data is cleaned up, we can insert a new column with this formula:

=SQRT((F3^2)+(H3^2))

This calculates our new margin of error by squaring the moes for each of our data points, summing the results together, and taking the square root of that sum. In other words,

=SQRT((MOE1^2)+(MOE2^2))

Once that’s done, you may want to round the new MOE to a whole number.

Creating a Percent Total

Let’s calculate the percentage of the population 3 years and older enrolled in school that are in grades 1 through 8. Based on what we have thus far (I hid the columns E,F,G, and H for grades 1-4 and 5-8 in this screenshot, as we don’t need them):

screenshot-2

We insert a new column where we divide our subgroup by the total, as you would expect - I3/C3. In the next column we insert the following formula to create a MOE for our new percent total:

=(SQRT((J3^2)-((K3^2)*(D3^2))))/C3

This one’s a little weightier than our last formula. We’re taking the square of our percent total (K3) and the square of the MOE of the total population (D3), multiplying them together, then subtracting that number from the square of the MOE of our subgroup (J3). Then we take the square root of the whole thing, then divide it by our total population (C3). If you’re saying - HUH? Maybe this is clearer:

=(SQRT((MOEsubset^2)-((PercentTotal^2)*(MOEtotalpop^2))))/TotalPop

Finally, we have something like this:

screenshot-3

Based on our data, we can say things like “There were approximately 30,556 students enrolled in 1st through 8th grade per year in Dutchess County, NY between 2005 and 2007, plus or minus 1,184 students. An estimated 37% of the population enrolled in school in the county was in the 1st through 8th grade, plus or minus 1%.” The ACS estimates have a 90% confidence interval.

Wrap Up

In this example we worked with aggregating and calculating percentages based on characteristics. We could also use these same formulas to aggregate data by geography, if we wanted to add the characteristics for all the counties together.

For the full documentation on working with ACS data, take a look at the appendix in the Census’ ACS Compass Guide, What General Data Users Need to Know. It provides you with the formulas in their proper statistical notation (for those of you more mathematically inclined than I) and includes formulas for calculating other kinds of numbers, such as ratios and percent change. It does provide you with worked-through examples, but not with spreadsheet formulas. I used their examples when I created formulas the first time around, so I could compare my formula results to their examples to insure that I was getting it right. I’d strongly recommend doing that before you start plugging away with your own data - one misplaced parentheses and you could end up with a different (and incorrect) result.

2007 Economic Census

The US Census Bureau has begun releasing data for the 2007 Economic Census. The bureau conducts the survey of businesses every five years - all medium to large size businesses and multi-part businesses are counted, samples are taken for smaller businesses, and various administrative records are used to calculate businesses with no employees (i.e. freelancers). All businesses are categorized hierarchically by North American Industrial Classification System (NAICS) codes and the data is reported by industry and geography. The number of establishments, employees, payroll, and sales are counted for the nation, states, counties, metro areas, places, and zip codes.

At this point national industry totals for the broadest categories of NAICS are available, as are preliminary numbers for the most specific NAICS categories (six digit) at the national level. Data for smaller geographic areas will be released between October 2009 and August 2010.

The biggest change from the 2002 Economic Census is the delivery method for the data. There will be no more 90 page pdf files or HTML tables that drill down six levels. All of the data will be released via the American Factfinder only. Other changes include the addition of some new geography (CDPs with at least 5000 people), new metro area definitions, and the revised 2007 definitions for NAICS which include small changes to the Finance, Insurance, Real Estate, Professional Services, and Administrative Services categories.

Additional changes for 2007, the data release schedule, NAICS codes, and methodology docs are all available at the 2007 Economic Census homepage within the Census Bureau’s website.

All of the data is aggregated by industry and geography - you cannot get lists of businesses with names and addresses as this information is kept confidential. Furthermore, to maintain confidentiality, if one company controls a large share of the market for a specific sector within a specific geographic area, or if there few businesses within a sector in a specific geographic area, much of the data (with the exception of the number of businesses) remains classified (marked with a D for disclosure). Oftentimes this means that data for industries within small areas (big box retail in a small town) and data for industries with few establishments in an area (mining establishments in New York City) are hidden. The smaller the geography, the more likely it is that the data will not be disclosed. This becomes a technical issue if you want / need to move this data into a database, as these pesky disclosure notes are stored in the same columns as the data and prevent you for designating the fields as numeric.

Given the delay between the time the data is collected and the time it is released, it isn’t particularly helpful for analyzing our current economic climate, but it does provide a snapshot of the way the US economy looked at that moment, and is useful in understanding how the economy is evolving. Be aware that when making comparisons to past data, you have to correct for changes in geography and NAICS definitions. The differences between 2002 and 2007 are not too great, but more adjustments are necessary as you go further back in time. The Bureau provides data back to 1997 through the American Factfinder and some data from 1992 on an older page. If you need to go back further, you’ll be entering the realm of (gasp!) CD-ROMs or the paper reports.

Creating a New Shapefile in ArcGIS: Part II

In my previous post I gave an overview of how to create a shapefile from scratch, where we created a point layer to identify places and neighborhoods in NYC. In this post, I’ll pick up where we left off.

Whenever you create new features in a shapefile, ArcGIS automatically adds a couple of fields, including an auto-number ID field that uniquely identifies each feature. This was sufficient for our example as the 291 place names we were working with do not have a standard ID number that represents them. If we were creating features that did have a recognized ID number or code, we certainly would want to add an additional field to hold that number. This would allow us to share and relate our data to other datasets that use that conventional ID. For example, if we had a layer with the 50 states, we would want to have a FIPS number or the two digit postal code for each state in the attribute table, so we could relate our states feature to the zillions of other state-based data tables out there that also use these codes.

It’s also helpful to add other identifiers to relate our place names to some larger geographic area. Why? Let’s say we want to filter our neighborhoods by borough - perhaps we just want to label neighborhoods in Manhattan or calculate distances only between places that appear in the Bronx. It would be useful to have a borough code or some other code associated with each of our place names for running queries.

scrnshot6As it turns out, the City of New York does use a standardized system of three digit codes to identify all boroughs and community districts in the city. In our example, the code for Manhattan Community District 12, which contains Inwood and Washington Heights, is 112. The first digit identifies the borugh and the second two digits identify the district. It would be a good idea to assign each of our neighborhoods this district code, so we could filter our features by either borough or district.

When we create each feature, we could manually type in the code in it’s own field just like we added the neighborhood names, but that would be rather tedious - and unnecessary. A better choice would be to do a spatial join. Whereas a “regular” join allows us to join attribute tables based on a common ID field, a spatial join allows us to assign attributes to one layer based on their geographical relationship to another layer.

scrnshot7In the Table of Contents, right click on the neighborhoods layer and choose Joins and Relates - Joins. We’ll get the familiar Join dialog box. However, if you hit the first drop down box that says Join Attributes From a Table and choose Join Data Based on Spatial Location, we’ll get the options for doing a spatial join. Choose the community districts as the layer to join to the neighborhoods, and since we’re joining points to polygons we’ll choose parameters that are relevant for relating these two features. In this case, give each point (neighborhood / place) the attributes of the polygons (districts) that it falls inside. ArcGIS will create a new point layer with the joined fields when you hit OK. Open the attribute table of the new point layer, and you’ll see the additional fields, including the community district numbers. You’ll also get some rather useless fields from the district layer, like the length and area of each district, which you can safely delete.

So instead of tediously entering these numbers by hand for each neighborhood, we simply run the spatial join process once (after we’ve finished adding the points for all 291 neighborhoods) and the IDs are automatically added.

Creating a New Shapefile in ArcGIS: Part I

I’m working with a grad student who needs to create a new shapefile from scratch, and thought I’d turn the instructions for doing this in ArcGIS into a tutorial / post for creating new point layers. The idea in this example is to create a point layer that shows the relative center of 291 neighborhoods in New York City. Since many of these neighborhoods are place names without finite boundaries, we’ll have to use various sources (NYC Planning map and Rand McNally street maps) to pinpoint the relative center of each neighborhood.

These points will be used for labeling each neighborhood. In this case, creating a new, georeferenced layer is preferable to creating 291 text labels on a map that are not tied to geography in any way.

  • The first step is to download some layers from the NYC Department of Planning to use for reference, such as a layer for boroughs and community districts. Community districts are used by the city to approximate neighborhoods. Many of the neighborhoods that we are trying to plot are, in many cases, smaller areas or places within these boundaries.
  • scrnshot1Next, open ArcCatalog and create a folder to store the data. Then, right click on the folder in the table of contents and select New - Shapefile. In the Create New Shapefile window, we give the shapefile a name, select Point as the feature type, and hit Edit to change the
    coordinate system. In the Spatial Reference Properties menu, we’ll import a coordinate system from one of the files we downloaded from NYC Planning, which uses New York State Plane for Long Island. Click OK and OK again, and we’ll have a new shapefile.
  • scrnshot2Right now, our new shapefile isn’t very exciting because it’s empty - you can preview it in the catalog to see for yourself. If you preview the table, you’ll see that Arc created three fields - FID, Shape, and ID, which it will automatically fill in when we start creating features. Before we do that, we’ll have to add an additional column to store the name of the neighborhood. To do that, open ArcMap and add the neighborhood layer to the map. Then, right click on the layer in the Table of Contents and open the attribute table. Hit the Options button and choose Add Field. In the Add Field menu, name the new field, choose Text as the type, and change the length to 80 (in case we have some neighborhoods with long names). Hit OK, and you’ll have a new field.
  • scrnshot3Let’s add our reference layers next. Hit the Add Data button (or File - Add Data), and add the borough boundaries and community districts (if you don’t see anything after you add them, right click on one of these layers and choose Zoom to Layer). Go into the symbology tab for each layer and change their display to make the areas appear more distinctive. Make sure your neighborhood layer is on top of your other layers.
  • Now it’s time to start plotting neighborhoods. Go to the Selection menu - Set selectable Layers, and turn off all the layers except the neighborhood layer. Then, use the dropdown on the Editor Toolbar and Select Start Editing (if you don’t see the Editor Toolbar, make sure it’s activated by going to View - Toolbars and select it). scrnshot4On the Editor Toolbar, make sure the Create New Feature task is activated and that the target layer is the neighborhood layer, and not any of the reference layers. Zoom in to the top of Manhattan. With the Pencil tool selected in the toolbar, and using your sources (NYC planning map, Rand McNally street map, whatever), click on the map to approximate where the center of the Inwood neighborhood would be. A blue dot should appear on the map. Then right-click on the neighborhoods layer in the Table of Contents and open the attributes table. You’ll see a brand new record for your new dot. Click in the empty field for Name, type in the name of the neighborhood, and press enter.
  • That’s the process! Next, locate the area for Washington Heights and click on the map to create the point for that neighborhood. The new dot will appear hi-lighted, while the previous dot for Inwood will now appear as a regular point symbol. Now it’s just a matter of plugging away. Make sure to occasionally save your edits by clicking Editor and choosing Save Edits. If you make a mistake, you can delete a feature by selecting the Select Feature tool in the regular tool bar (white arrow with a blue and white feature box next to it), select the particular point, and hit the delete key. If you’re having trouble pinpointing the right location for the neighborhood, try downloading additional reference layers to guide you. The NYC DOITT also has a page with GIS layers for the city with features like parks and streets that may be helpful. When you’re finished editing, choose Stop Editing under the Editor Toolbar.

    scrnshot5

  • The ultimate goal of this exercise was to get neighborhood labels to appear without the actual point. To accomplish this, change the point symbol for the neighborhood to nothing by going into the Symbology tab for the layer and reducing the fill to no color, the outline to nothing, and the size to zero. Then open the Labels tab under the Properties menu, turn labels on using the name field as the label field, select Placement Properties and choose the setting to place the labels on top of the point, hit ok, and voila! Perfectly centered neighborhood names that are part of a georeferenced layer.

This covers the basics. In the next post, I’ll go a little further and discuss adding additional fields to the new file, without having to type them in manually.

Updated Links for Data and Resources

I recently went through my pages of suggested links for data and resources to update and clean them up. I’ve included many of the cool resources I’ve discovered since I started writing this blog, which ended up in individual posts but not in these pages. I went over the resources page in particular, to try and classify the reference materials, tools, and software into useful categories rather than just having one large blob of stuff.

Transform Projections with GDAL / OGR

The GDAL / OGR tools are an open source, cross platform, command-line toolkit that can be used for viewing GIS metadata, performing attribute queries, and converting file formats, among other things. It can also be used for transforming coordinate systems and projections for GIS files. I’ll demonstrate in this brief tutorial how to accomplish this using the OGR tools, which are for vector based GIS. The raster based GDAL tools work in a similar fashion.

Viewing basic coordinate system / projection info:

ogrinfo -al -so world_wgs.shp

Where ogrinfo is the name of the tool, -al is a switch to get detailed info about the layer, -so is a switch to display summary info, and world_wgs.ship is the name of our file. Run that command and we’ll get something that looks like this, with info about the features, coordinate system, and attribute fields of our shapefile:

INFO: Open of `world_wgs.shp’
using driver `ESRI Shapefile’ successful.

Layer name: world_wgs
Geometry: Polygon
Feature Count: 243
Extent: (-179.808664, -89.677397) - (179.808664, 83.435942)
Layer SRS WKT:
GEOGCS["GCS_WGS_1984",
DATUM["WGS_1984",
SPHEROID["WGS_1984",6378137,298.257223563]],
PRIMEM["Greenwich",0],
UNIT["Degree",0.017453292519943295]]
CNTRY_NAME: String (254.0)
FIPS_CNT_1: String (254.0)
ISO_2DIGIT: String (254.0)
ISO_3DIGIT: String (254.0)
STATUS: String (254.0)
COLORMAP: Real (18.6)
CONTINENT: String (254.0)
UN_CONTINE: String (254.0)
REGION: String (254.0)
UN_REGION: String (254.0)

Convert coordinate systems supported by EPSG

GDAL / OGR and most of the open source GIS software supports projections and coordinate systems that are part of the EPSG library. If you want to do a conversion between two coordinate systems and they are both supported by EPSG, you just have to reference the EPSG code that’s used to identity the system that you want to project to. You can look up codes using spatialreference.org.

Let’s say we want to convert our shapefile that’s in WGS 84 (common lat and long) to NAD 83 (used frequently in North America):

ogr2ogr -t_srs EPSG:4269 world_new.shp world_wgs.shp

Where ogr2ogr is the name of the tool, -t_srs is the command for transforming from one coordinate system to the other, EPSG:4269 is the code that identifies the coordinate system we want the new file to have - NAD83, world_new.shp is the name of the output file that will have the new projection that we want, and world_wgs.shp is our input file. If you run the command and get no error message, you’re in good shape. Just run the ogrinfo command on the new file to verify that it’s been re-projected.

Convert coordinate system not supported by EPSG

The EPSG library is extensive, but doesn’t contain everything, particularly some global and continental map projections. GDAL / OGR can still do the job, but you’ll have to provide the tool with the proper frame of reference since the EPSG library doesn’t have the info. Let’s say we want to project our WGS file to the Robinson Projection, which is not part of EPSG.

First, go back to spatialreference.org and search for Robinson. Its ID code is ESRI 54030 - not part of the EPSG library. Click on the link for the projection to open its window. You’ll be able to look at the projection data in a number of standard file formats. Select OGC_WKT from the list, and it will open the text in a new window, showing you the parameters of that projection. In your browser, go up to file, save as, and save the file as robinson_ogcwkt.txt in the same directory as the shapefile you want to reproject.

Now that you have the projection info stored in the text file, run the following command to make the conversion:

ogr2ogr -t_srs robisnon_ogcwkt.txt world_rob.shp world_wgs.shp

It’s the same command as our previous one, except that you’re referencing the text file with your data instead of an EPSG code.

Define an undefined coordinate system

If you run the ogrinfo command and your coordinate system is undefined, you should define it before doing anything else, and you must define an undefined projection before converting to another projection. Look at the metadata that came with you file or go back to the source to figure out what it is. For example the US Census Bureau Generalized Cartographic Boundary Files for 2000 are in NAD83 according to their metadata, but the files lack a projection definition.

To define one, use the following command:

ogr2ogr -a_srs EPSG:4269 states_nad83.shp states_unknown.shp

The only difference here is the -a_srs command is used to assign a coordinate system to a file - the rest of the parameters are the same. If you’re defining a non-EPSG projection, use the same method from the previous example - download a definition file from spatialreference.org and use the file name in place of the EPSG code.

More help and where to download:

UC Santa Barbara NCEAS and the UC Davis Soil Lab both have short tutorials and sample commands of GDAL / OGR.

If you want to thumb through the world’s map projections, the folks at radicalcartography have a nice projection reference page with visuals and brief descriptions.

Visit the GDAL / OGR page for downloading, or if you’re a Windows or Mac user, you can download QGIS and GDAL / OGR together from the QGIS download page. Linux users can get GDAL / OGR via your package handler - depending on your distro, you may have it already.

NY Times Interactive Maps

The New York Times has been putting together some great, web-based, thematic maps lately. I thought I’d provide a summary of some of the latest and greatest here.

US Maps

  • Immigration Explorer - Explore foreign born groups for the United States by county, based on the decennial census from 1880 to 2000. Choropleth maps of the largest immigrant group per county and graduated circle maps depicting the size of each group. 3/10/2009.
  • The Geography of a Recession - Choropleth map of the US that shows the annual change in unemployment by county. Lets you filter by county type (urban, rural, manufacturing areas, housing bubbles). 3/7/2009.
  • A Growing Detention Network - a graduated circle map of the US that shows detention centers where people are held on immigration violations, by number of detainees and type of facility. 12/26/2008.
  • Where Homes Are Worth Less Than the Mortgage - State-based US choropleth and graduated circle maps of the housing and debt crisis. 11/10/2008.

NYC Maps

  • New Yorkers Assess Their City - How New Yorkers rate their neighborhood based on quality of life, city services, education, transportation and crime. Based on a large survey of city residents. Choropleth maps of community districts. 3/7/2009.
  • Census Shows Growing Diversity in New York City - Choropleth maps show median rent and median income by PUMAs in 2000 and 2007. An example of mapping 3-year ACS data by PUMAs to show patterns below the county level. 12/9/2008.

World Maps

  • A Map of Olympic Medals - A cartogram of countries based on the number of olympic medals they’ve won for every olympics from 1896 to 2008. Mouse over to get medal counts. 8/4/2008.

QGIS: Data Defined Labeling and Table Joins

A little while ago I posted a text file with geographic centroids (centers) for each of the world’s countries. The reason why I put this together was that I wanted to test the data defined labeling features in QGIS. While automatic labeling in QGIS isn’t so hot (overlapping labels, multiple lables for each polygon), there are some powerful features for storing and referencing columns for annotation within the attribute table of shapefiles. One of the neat features is the ability to place labels based on coordinates stored in the attribute table.

The first step was to take the centroids file and join in to a shapefile of the worlds countries based on a common ID field, in this case FIPS country codes. QGIS doesn’t support table joins directly, but you can accomplish this with a good plugin called fTools, which includes a lot of additional and useful features. The instructions for getting fTools up and running are available on the fTools website; the installation doesn’t require you to download any files, you just handle everything through the QGIS plugin manager (if you have trouble seeing the plugin manager or getting fTools to appear, check to make sure that you have python installed on your machine). Once fTools is up and running, you’ll see a Tools dropdown menu next to your other menus - drop it down, select data management tools and join attribute tables. You’ll get a dialog box asking which shapefile and field you want to join and which shapefile or table you want to join to it. The plugin only supports joins from other shapefiles and dbf tables, so you have to save the save the country centroids text file as a dbf before you do the join (you can do this in Calc or a pre-2007 version of Excel). These aren’t dynamic joins; fTools will create a new shapefile with the table fields attached.

Once the join is complete, you can add the new shapefile with the new fields, click on the layer, and navigate to the labels tab. Hit the checkbox to turn the labels on, select the field that contains the label in the dropdown box at the top, then select data defined position from the menu below. You’ll see a new series of dropdowns on the right, and you can select your longitude column for the X coordinate and latitude column for the Y coordinate. Hit OK, and voila! You’ll have labels that are centered in the middle of each country.

Of course, the label placement will not be perfect in every case. There will be label overlap in areas with small countries, areas with many countries clustered together, and with countries that have long names. The scale and size of the font will also be a factor, and placing the country name in the center is not always ideal for small island nations. However, you can easily change the label placement by going into an edit mode and changing the coordinates in the attribute table to get optimal placement. You can mouse over the map and use the coordinate information that’s displayed beside the scale in the lower right-hand corner of the window to determine which coordinates are most optimal for a given situation. If you produce several maps at the same area and scale, you can use the same settings over and over again. You can also globally change the placement of all the labels using some of the other label options, such as placing all labels above or to the top-right of the centroid.

Now in order for all of this to work, the coordinates in the country centroid file must be in the same coordinate system as the shapefile. Since the country centroid file uses basic latitude and longitude, I was able to do this with a shapefile that was in the basic WGS 84 geographic coordinate system. If you’re using a different geographic coordinate system or a projected coordinate system, you’ll have to convert the coordinates in the centroid file to match that system. I haven’t delved into this too deeply yet, but there are a number of free tools that you can download that should do this - one of them is called GEOTRANS, and it’s available for free download from the NGA. It can handle batch transformations of coordinate data stored in text files, and supports conversions to several different geographic and projected systems.

QGIS Label Placement With XY Coordinates

QGIS Label Placement With XY Coordinates

Comments Disabled

Because I’m being spammed to death, I’ve disabled the ability for anyone to leave comments on posts and have restricted it to registered users only.

But for some reason, when someone registers for this blog they aren’t sent an email confirmation with a temporary password as there is something wrong with the configuration of my email server and how it communicates with Wordpress. I know of one possible solution that I’m going to try this weekend to fix this. Until then, no one will be able to leave comments or officially register. Hopefully I can get this straightened out soon…

Centroids for Countries

I just added a new resource and updated another one on the resources page. I put together a file that contains the centroids (geographic centers) of all of the countries in the world, plus a few territories and dependencies. The centroids are in latitude and longitude coordinates based on WGS 84 in two formats: decimal degrees and degrees / minutes / seconds. It’s a tab delimited text file that you can open or import into any spreadsheet or database program. Each record is uniquely identified by a FIPS 10 code.

I downloaded most of the data from the NGA’s GeoNames Server (GNS). I blogged about the GNS awhile back, pointing out that you could query this gazetteer for individual places or you could download files that have all the features for each country in the world. While it took some time to figure out, you can actually take a middle road and query the database for specific categories of features that you can download. I used the text-based search and the links on the left side of the screen actually open different input boxes that you can use to query or exclude data. I managed to query top-level administrative units (countries) and to exclude most variant country names. After I downloaded the file, I still had to go in and do some clean-up, and I had to go back and get countries I missed by hand - these were mostly dependencies and territories that were excluded based on the search I did (Greenland, French Guiana, Netherlands Antilles, and a number of others).

Then I realized that the GNS excludes the United States and all of its territories. So, I went over to the USGS Geographic Names Information Service (GNIS) and grabbed the data for the US territories. The GNIS is simpler to navigate and you can download records pretty easily. They didn’t have a record for the United States as a whole, so I had to go over to the Census Bureau to get coordinates for the US centroid.

I brought all of these records into one file and placed it on the resources page for download, along with some metadata to describe it. Why would you want to use this stuff? You can use if for basic distance calculations, or as a annotated label field for label placement in GIS. More about that in my next post.

I also updated the country code cross-reference file that I took from the CIA World Factbook. You can use this as a bridge table to relate tables that use different identifiers. So if you wanted to join the fips-based centroid file to an iso-based shapefile of countries, you can join the centroids to the bridge first based on fips, and then that new table to the shapefile based on iso.