<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Gothos</title>
	<atom:link href="http://gothos.info/feed/" rel="self" type="application/rss+xml" />
	<link>http://gothos.info</link>
	<description>A Geospatial Librarian&#039;s World</description>
	<lastBuildDate>Sun, 25 Mar 2012 15:56:39 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>GIS Workshops This Apr &amp; May</title>
		<link>http://gothos.info/2012/03/gis-workshops-this-apr-may/</link>
		<comments>http://gothos.info/2012/03/gis-workshops-this-apr-may/#comments</comments>
		<pubDate>Sun, 25 Mar 2012 15:56:39 +0000</pubDate>
		<dc:creator>Frank</dc:creator>
				<category><![CDATA[Announcements]]></category>
		<category><![CDATA[Events]]></category>
		<category><![CDATA[ArcGIS]]></category>
		<category><![CDATA[GIS]]></category>
		<category><![CDATA[new york city]]></category>
		<category><![CDATA[qgis]]></category>
		<category><![CDATA[Tutorial]]></category>

		<guid isPermaLink="false">http://gothos.info/?p=819</guid>
		<description><![CDATA[This semester I&#8217;ll be teaching three workshops with Prof. Deborah Balk in spatial tools and analysis. Sponsored by the CUNY Institute of Demographic Research (CIDR), the workshops will be held on Baruch College&#8217;s campus in midtown NYC on Friday afternoons. The course is primarily intended for data and policy analysts who want to gain familiarity [...]]]></description>
			<content:encoded><![CDATA[<p>This semester I&#8217;ll be teaching three workshops with Prof. Deborah Balk in spatial tools and analysis. Sponsored by the <a href="http://www.cuny.edu/about/centers-and-institutes/cidr.html" target="_blank">CUNY Institute of Demographic Research</a> (CIDR), the workshops will be held on Baruch College&#8217;s campus in midtown NYC on Friday afternoons. The course is primarily intended for data and policy analysts who want to gain familiarity with the basics of map making and spatial analysis; registration is open to anyone. The workshops progress from basic to intermediate skills that cover making a map (Apr 27th), geospatial calculations (May 4th), and geospatial analysis (May 11th). We&#8217;ll be using QGIS and participants will work off of their own laptops; we&#8217;ll also be demonstrating some of the processes in ArcGIS and participants will receive an evaluation copy of that software. Each workshop is $300 or you can register for all three for $750.</p>
<p>For full details <a href="http://gothos.info/resource_files/CIDR_SpatialWorkshop_2012.pdf" target="_blank">check out this flier</a>. You can register via the College&#8217;s <a href="http://www.baruched.com/" target="_blank">CAPS website</a>; do a search for DEM and register for each session (DEM0003, DEM0004, and DEM0005).</p>
]]></content:encoded>
			<wfw:commentRss>http://gothos.info/2012/03/gis-workshops-this-apr-may/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Screen Scraping Data with Python</title>
		<link>http://gothos.info/2012/03/screen-scraping-data-with-python/</link>
		<comments>http://gothos.info/2012/03/screen-scraping-data-with-python/#comments</comments>
		<pubDate>Fri, 09 Mar 2012 12:52:58 +0000</pubDate>
		<dc:creator>Frank</dc:creator>
				<category><![CDATA[Data Processing]]></category>
		<category><![CDATA[census data]]></category>
		<category><![CDATA[population centroids]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[Tutorial]]></category>

		<guid isPermaLink="false">http://gothos.info/?p=797</guid>
		<description><![CDATA[I had a request recently for population centers (aka population centroids) for all the counties in the US. The Census provides the 2010 centroids in state level files and in one national file for download, but the 2000 centroids were provided in HTML tables on individual web pages for each state. Rather than doing the [...]]]></description>
			<content:encoded><![CDATA[<p>I had a request recently for population centers (aka population centroids) for all the counties in the US. The Census provides the <a href="http://www.census.gov/geo/www/2010census/centerpop2010/county/countycenters.html" target="_blank">2010 centroids</a> in state level files and in one national file for download, but the <a href="http://www.census.gov/geo/www/cenpop/county/ctyctrpg.html" target="_blank">2000 centroids</a> were provided in HTML tables on individual web pages for each state. Rather than doing the tedious work of copying and pasting 51 web pages into a spreadsheet, I figured this was my chance to learn how to do some screen scraping with Python. I&#8217;m certainly no programmer, but based on what I&#8217;ve learned (I took a three day workshop a couple years ago) and by consulting books and crawling the web for answers when I get stuck, I&#8217;ve been able to write some decent scripts for processing data. </p>
<p>For screen scraping there&#8217;s a must-have module called <a href="http://www.crummy.com/software/BeautifulSoup/" target="_blank">Beautiful Soup</a> which easily lets you parse web pages, well-formed or not. After reading the <a href="http://www.crummy.com/software/BeautifulSoup/bs3/documentation.html#Quick%20Start" target="_blank">Beautiful Soup Quickstart</a> and some nice advice I found on a <a href="http://stackoverflow.com/questions/2081586/web-scraping-with-python" target="_blank">post on Stack Overflow</a>, I was able to build a script that looped through each of the state web pages, scraped the data from the tables, and dumped it into a delimited text file. Here&#8217;s the code:</p>
<p><code>
<pre>
## Frank Donnelly Feb 29, 2012
## Scrapes 2000 centers of population for counties from individual state web pages
## and saves in one national-level text file.

from urllib.request import urlopen
from bs4 import BeautifulSoup

output_file=open('CenPop2000_Mean_CO.txt','a')
header=['STATEFP','COUNTYFP','COUNAME','STNAME','POPULATION','LATITUDE','LONGITUDE']
output_file.writelines(",".join(header)+"\n")

url='http://www.census.gov/geo/www/cenpop/county/coucntr%s.html'

fips=['01','02','04','05','06','08','09','10',
'11','12','13','15','16','17','18','19','20',
'21','22','23','24','25','26','27','28','29','30',
'31','32','33','34','35','36','37','38','39','40',
'41','42','44','45','46','47','48','49','50',
'51','53','54','55','56']

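for i in fips:
  # Fetch and parse the page for this state's FIPS code
  soup = BeautifulSoup(urlopen(url % i).read())
  # The state name is everything after the first four words of the page title
  titleTag = soup.html.head.title
  list=titleTag.string.split()
  name=(list[4:])
  state=' '.join(name)

  # The second table on the page holds the data; write one line per row
  for row in soup('table')[1].tbody('tr'):
    tds = row('td')
    # Parentheses let the tuple continue onto the next line
    line=(tds[0].string, tds[1].string, tds[2].string, state,
          tds[3].string.replace(',',''), tds[4].string, tds[5].string)

    output_file.writelines(",".join(line)+"\n")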

output_file.close()
</pre>
<p></code></p>
<p>After installing the modules, step 1 is to import them into the script. I initially got a little stuck here, because the standard modules for working with URLs (urllib and urllib2) that I&#8217;ve seen in books and other examples weren&#8217;t working for me. I discovered that since I&#8217;m using Python 3.x and not the 2.x series, something <a href="http://stackoverflow.com/questions/2792650/python3-error-import-error-no-module-name-urllib" target="_blank">had changed recently</a> and I had to change how I was referencing urllib (in Python 3, urlopen is imported from urllib.request).</p>
<p>With that out of the way I created a text file, a list with the column headings I want, and then wrote those column headings to my file.</p>
<p>Next I read in the URL. Since the Census uses a static URL that varies for each state only by FIPS code, I was able to assign the URL to a variable and insert a %s placeholder where the FIPS code goes. I created a list of all the FIPS codes, and then I run through a loop &#8211; for every FIPS code in the list I pass that code into the URL where the %s placeholder is, and process that page.</p>
<p>The first bit of info I need to grab is the name of the state, which doesn&#8217;t appear in the table. I grab the title tag from the page and split its text into a list of words, then take everything from index 4 (the fifth word) to the end of the list to capture the state name, and collapse those list elements back into one string (I have to do this for states with multi-word names &#8211; New, North, South, etc.).</p>
<p>So we go from the HTML Title tag:</p>
<p>County Population Centroids for New York</p>
<p>To a list with elements 0 to 5:</p>
<p>list=["County", "Population", "Centroids", "for", "New", "York"]</p>
<p>To a shorter list with elements 4 to end:</p>
<p>name=["New","York"]</p>
<p>To a string:</p>
<p>state=&#8221;New York&#8221;</p>
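<p>The same three steps can be sketched as a small standalone function. This is only an illustration: the function name state_from_title is made up for the example, and it assumes every page title begins with the same four words.</p>
<pre>
def state_from_title(title):
    """Return the state name from a page title such as
    'County Population Centroids for New York'."""
    words = title.split()      # split the title into a list of words
    name = words[4:]           # keep everything from index 4 onward
    return ' '.join(name)      # rejoin multi-word names into one string

print(state_from_title('County Population Centroids for New York'))
</pre>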
<p>But the primary goal here is to grab everything in the table. So we identify the table in the HTML that we want &#8211; the first table in those pages [0] is just an empty frame and the second one [1] is the one with the data. For every row (tr) in the table we can reference and grab each cell (td), and string those cells together as a line by referencing them in the list. As I string these together I also insert the state name so that it appears on every line, and for the cell at index 3 (total population in 2000) I strip out any commas (numbers in the HTML table included commas, a major no-no that leads to headaches in a csv file). After we grab that line we dump it into the output file, with each value separated by a comma and each record on its own line (using the new line character). Once we&#8217;ve looped through the table on each state&#8217;s page, we close the file.</p>
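<p>An alternative to stripping commas by hand is to let the csv module from Python&#8217;s standard library handle the quoting. The sketch below is not what the script above does; the function name write_rows, the sample row, and the in-memory buffer are all stand-ins so the example runs without fetching a page.</p>
<pre>
import csv
import io

def write_rows(rows, handle):
    """Write each row as one comma-delimited line; csv.writer quotes
    any value that itself contains a comma."""
    writer = csv.writer(handle, lineterminator='\n')
    for row in rows:
        writer.writerow(row)

# Made-up sample row with a comma inside the population value
sample = [['01', '001', 'Example County', 'Alabama', '43,671', '+32.5', '-86.5']]
buffer = io.StringIO()
write_rows(sample, buffer)
print(buffer.getvalue())
</pre>
<p>With this approach the population value is written inside double quotes, so a spreadsheet or GIS package reading the file still treats it as a single field.</p>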
<p>There are a few variations I could have tried; I could have read the FIPS codes in from a table rather than inserting them into the script, but I preferred to keep everything together. I could have read the state names in as a list, or coupled them with the codes in a dictionary. This would have been less risky than relying on the state name in the title tag, but since the pages were well-formed and I wanted to experiment a little I went the title tag route. Instead of typing the codes in by hand I used Excel trickery to concatenate commas to the end of each code, and then concatenated all the values together in one cell so I could copy and paste the list into the script.</p>
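<p>A rough sketch of the dictionary variation follows. It lists only three states to keep it short, and the names STATES and state_urls are hypothetical; the URL pattern is the one used in the script above.</p>
<pre>
URL = 'http://www.census.gov/geo/www/cenpop/county/coucntr%s.html'

# FIPS code to state name; a full version would list every code
STATES = {'01': 'Alabama', '02': 'Alaska', '36': 'New York'}

def state_urls(states, url=URL):
    """Return (state name, page url) pairs in FIPS code order."""
    return [(states[code], url % code) for code in sorted(states)]

for state, page in state_urls(STATES):
    print(state, page)
</pre>
<p>Because the state name comes from the dictionary, the loop no longer depends on the wording of each page title.</p>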
<p>You can <a href="http://www.census.gov/geo/www/cenpop/county/coucntr01.html" target="_blank">go here to see an individual state page</a> and source, and <a href="http://gothos.info/resource_files/CenPop2000_Mean_CO.txt" target="_blank">here to see what the final output</a> looks like. Or if you&#8217;re just looking for a national level file of 2000 population centroids for counties that you can download, look no further!</p>
]]></content:encoded>
			<wfw:commentRss>http://gothos.info/2012/03/screen-scraping-data-with-python/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Country Centroids File Updated</title>
		<link>http://gothos.info/2012/02/country-centroids-file-updated/</link>
		<comments>http://gothos.info/2012/02/country-centroids-file-updated/#comments</comments>
		<pubDate>Mon, 13 Feb 2012 05:02:10 +0000</pubDate>
		<dc:creator>Frank</dc:creator>
				<category><![CDATA[Announcements]]></category>
		<category><![CDATA[Resources]]></category>
		<category><![CDATA[countries]]></category>
		<category><![CDATA[country codes]]></category>
		<category><![CDATA[latitude]]></category>
		<category><![CDATA[longitude]]></category>
		<category><![CDATA[xy coordinates]]></category>

		<guid isPermaLink="false">http://gothos.info/?p=792</guid>
		<description><![CDATA[A brief note &#8211; I&#8217;ve updated and replaced the country centroids file that I was previously hosting. I extracted data with geographic centroids in latitude and longitude for each country and dependency in the world using extracts from the NGA&#8217;s GNS and the USGS GNIS. Data is current as of Feb 2012, with long and [...]]]></description>
			<content:encoded><![CDATA[<p>A brief note &#8211; I&#8217;ve updated and replaced the country centroids file that I was previously hosting. I extracted data with geographic centroids in latitude and longitude for each country and dependency in the world using extracts from the NGA&#8217;s GNS and the USGS GNIS. Data is current as of Feb 2012, with long and short names for countries and two letter alpha FIPS and ISO codes for identification and attribute linking. Available for download on the <a href="http://gothos.info/resources/" target="_blank">Resources page</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://gothos.info/2012/02/country-centroids-file-updated/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>ACS Trend Reports and Census Geography Guide</title>
		<link>http://gothos.info/2012/02/acs-trend-reports-and-census-geography-guide/</link>
		<comments>http://gothos.info/2012/02/acs-trend-reports-and-census-geography-guide/#comments</comments>
		<pubDate>Mon, 13 Feb 2012 04:37:01 +0000</pubDate>
		<dc:creator>Frank</dc:creator>
				<category><![CDATA[Data Sources]]></category>
		<category><![CDATA[Resources]]></category>
		<category><![CDATA[2010 Census]]></category>
		<category><![CDATA[acs]]></category>
		<category><![CDATA[american community survey]]></category>
		<category><![CDATA[census]]></category>
		<category><![CDATA[census data]]></category>
		<category><![CDATA[census geography]]></category>
		<category><![CDATA[FIPS codes]]></category>
		<category><![CDATA[geography]]></category>
		<category><![CDATA[united states]]></category>
		<category><![CDATA[xy coordinates]]></category>

		<guid isPermaLink="false">http://gothos.info/?p=788</guid>
		<description><![CDATA[I recently received my first question from someone who wanted to compare 2005-2007 ACS data with 2008-2010. With the release of the latter, we can make historical comparisons with the three year data for the first time since we have estimates that don&#8217;t overlap. We should be able to make some interesting comparisons, since the [...]]]></description>
			<content:encoded><![CDATA[<p>I recently received my first question from someone who wanted to compare 2005-2007 ACS data with 2008-2010. With the release of the latter, we can make historical comparisons with the three year data for the first time since we have estimates that don&#8217;t overlap. We should be able to make some interesting comparisons, since the first set covers the real estate boom years (remember those?) and the second covers the Great Recession. One resource that makes such comparisons relatively painless is over at the <a href="http://mcdc.missouri.edu/" target="_blank">Missouri Census Data Center</a>. They&#8217;ve put together a really clean and simple interface called the <a href="http://mcdc1.missouri.edu/acsprofiles/acstrendmenu.html" target="_blank">ACS Trends Menu</a>, which allows you to select either two one-year estimates or two three-year estimates and compare them for several different census geographies &#8211; states, counties, MCDs, places, metros, Congressional Districts, PUMAs, and a few others &#8211; for the entire US (not just Missouri). The end result is a profile that groups data into the Economic, Demographic, Social, and Housing categories that the Census uses for its Demographic Profile tables. The calculations for change and percent change for the estimates and margins of error are done for you.</p>
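<p>As a rough illustration of the arithmetic behind a trend comparison like this, the Python sketch below computes the change and percent change between two estimates, plus a margin of error for the change. It assumes the commonly used approximation that the margin of error of a difference is the square root of the sum of the squared margins; whether the Trends Menu uses exactly this method is an assumption, and the function name and numbers are invented.</p>
<pre>
import math

def change_with_moe(est_old, moe_old, est_new, moe_new):
    """Return (change, percent change, margin of error of the change).
    The earlier estimate must not be zero."""
    change = est_new - est_old
    pct_change = 100.0 * change / est_old
    # Approximate margin of error for the difference of two estimates
    moe_change = math.sqrt(moe_old ** 2 + moe_new ** 2)
    return change, pct_change, moe_change

# Invented values: 1,000 (+/- 30) in the first period, 1,100 (+/- 40) in the second
print(change_with_moe(1000, 30, 1100, 40))
</pre>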
<p>Downloading the data is not as straightforward &#8211; the links to extract it just brought me some error messages, so it&#8217;s still a work in progress. Until then, a simple copy and paste into your spreadsheet of choice will work fine. </p>
<p><a href="http://gothos.info/wp-content/uploads/2012/02/acs_trends.png"><img src="http://gothos.info/wp-content/uploads/2012/02/acs_trends-300x291.png" alt="ACS Trends Menu" title="acs_trends" width="300" height="291" class="aligncenter size-medium wp-image-789" /></a></p>
<p>If you like the interface, they&#8217;ve created separate ones for downloading profiles from any of the <a href="http://mcdc1.missouri.edu/acsprofiles/acsprofilemenu.html" target="_blank">ACS periods</a> or from the <a href="http://mcdc1.missouri.edu/sf1_2010/sf1_2010_menu.html" target="_blank">2010 Census</a>. The difference here is that you&#8217;re looking at one time frame, not across time periods. The interface and the output are the same, but in these menus you can compare four different geographies at once in one profile. Unlike the Trends reports, both the ACS and 2010 Census profiles have easy, clear cut ways to download the profiles as a PDF or a spreadsheet. If you&#8217;re happy with data in a profile format and want an interface that&#8217;s a little less confusing to navigate than the American Factfinder, these are all great alternatives (and if you&#8217;re building web applications these profiles are MUCH easier to work with &#8211; you can easily build permanent links or generate them on the fly).</p>
<p>The US Census Bureau also recently put together a great resource called the <a href="http://www.census.gov/geo/www/guidestloc/guide_main.html" target="_blank">Guide to State and Local Census Geography</a>. They provide a census geography overview of each state: 2010 population, land area, bordering states, year of entry into the union, population centroids, and a description of how local government is organized in the state (e.g. whether it has minor civil divisions or only incorporated cities and unincorporated land). You get counts for every type of geography &#8211; how many counties, tracts, ZCTAs, and so on, AND best of all you can download all of this data directly in tab delimited files. Need a list of every county subdivision in a state, with codes, land area, and coordinates? No problem &#8211; it&#8217;s all there. </p>
]]></content:encoded>
			<wfw:commentRss>http://gothos.info/2012/02/acs-trend-reports-and-census-geography-guide/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Thiessen Polygons and Listing Neighboring Features</title>
		<link>http://gothos.info/2012/01/thiessen-polygons-and-listing-neighboring-features/</link>
		<comments>http://gothos.info/2012/01/thiessen-polygons-and-listing-neighboring-features/#comments</comments>
		<pubDate>Mon, 02 Jan 2012 21:03:33 +0000</pubDate>
		<dc:creator>Frank</dc:creator>
				<category><![CDATA[Data Processing]]></category>
		<category><![CDATA[Resources]]></category>
		<category><![CDATA[ArcGIS]]></category>
		<category><![CDATA[geoprocessing]]></category>
		<category><![CDATA[GIS]]></category>
		<category><![CDATA[grass]]></category>
		<category><![CDATA[xy coordinates]]></category>

		<guid isPermaLink="false">http://gothos.info/?p=766</guid>
		<description><![CDATA[I was helping someone with a project recently that I thought would be straightforward but turned out to be rather complex. We had a list of about 10,000 addresses that had to be plotted as coordinates, and then we needed to create Thiessen or Voronoi polygons for each point to create market areas. Lastly we [...]]]></description>
			<content:encoded><![CDATA[<p>I was helping someone with a project recently that I thought would be straightforward but turned out to be rather complex. We had a list of about 10,000 addresses that had to be plotted as coordinates, and then we needed to create <a href="http://support.esri.com/en/knowledgebase/GISDictionary/term/Thiessen%20polygons" target="_blank">Thiessen</a> or <a href="http://support.esri.com/en/knowledgebase/GISDictionary/term/Voronoi%20diagram" target="_blank">Voronoi</a> polygons for each point to create market areas. Lastly we needed to generate an adjacency table or list of neighbors; for every polygon list all the neighboring polygons.</p>
<p>For step one I turned to the <a href="https://webgis.usc.edu/Services/Geocode/Default.aspx" target="_blank">USC Geocoding service</a> to geocode the addresses; I became a partner a ways back so I could batch geocode datasets for students and faculty on my campus. Once I had coordinates I plotted them in ArcGIS 10 (and learned that the Add XY data feature had been moved to File &gt; Add Data &gt; Add XY Data). Step 2 seemed easy enough; in Arc you go to ArcToolbox &gt; Analysis Tools &gt; Proximity &gt; Create Thiessen Polygons. This creates a polygon for each point and assigns the attributes of each point to the polygon.</p>
<p><a href="http://gothos.info/wp-content/uploads/2012/01/thiessen.png"><img src="http://gothos.info/wp-content/uploads/2012/01/thiessen-300x151.png" alt="" title="thiessen polygons" width="300" height="151" class="aligncenter size-medium wp-image-768" /></a></p>
<p>I hit a snag with Step 3 &#8211; Arc didn&#8217;t have a tool for generating the adjacency table. After a thorough search of the ESRI and Stack Exchange forums, I stumbled on the <a href="http://arcscripts.esri.com/details.asp?dbid=15805" target="_blank">Find Adjacent Features Script</a> by Ken Buja which did exactly what I wanted in ArcGIS 9.2 and 9.3, but not in 10. I had used this script before on a previous project, but I&#8217;ve since upgraded and can&#8217;t go back. So I searched some more until I found the <a href="http://resources.arcgis.com/gallery/file/geoprocessing/details?entryID=50F58FCF-1422-2418-884B-A053393CEF92" target="_blank">Find Adjacent &#038; Neighboring Polygons Tool</a> by cmaene. I was able to add this custom toolbox directly to ArcToolbox, and it did exactly what I wanted in ArcGIS 10. I get to select the unique identifying field, and for every ID I get a list of the IDs of the neighboring polygons in a text file (just like Ken&#8217;s tool). This tool also had the option of saving the list of neighbors for each feature directly in the attribute table of a shapefile (which is only OK for small files with few neighbors; fields longer than 254 characters get truncated), and it gave you the option of listing neighbors to the next degree (a list of all the neighbor&#8217;s neighbors). </p>
<p><a href="http://gothos.info/wp-content/uploads/2012/01/FANP.png"><img src="http://gothos.info/wp-content/uploads/2012/01/FANP-300x199.png" alt="" title="Find Adjacent Polygons Tool" width="300" height="199" class="aligncenter size-medium wp-image-769" /></a></p>
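<p>The idea of an adjacency table can be illustrated outside of ArcGIS with a short Python sketch. This is not how either tool works internally. It assumes each polygon is a list of vertex tuples, that neighboring polygons share edges with exactly matching vertices, and that sharing an edge is what makes two polygons neighbors; the function name and sample squares are made up.</p>
<pre>
from collections import defaultdict

def adjacency(polygons):
    """polygons maps an ID to a list of (x, y) vertices in ring order.
    Returns a dict mapping each ID to a sorted list of neighbor IDs."""
    edge_owners = defaultdict(set)
    for pid, ring in polygons.items():
        for i in range(len(ring)):
            a, b = ring[i], ring[(i + 1) % len(ring)]
            edge = tuple(sorted((a, b)))   # same key in either direction
            edge_owners[edge].add(pid)
    neighbors = {pid: set() for pid in polygons}
    for owners in edge_owners.values():
        for pid in owners:
            neighbors[pid].update(owners - {pid})
    return {pid: sorted(ids) for pid, ids in neighbors.items()}

# Three unit squares in a row: A | B | C
squares = {
    'A': [(0, 0), (1, 0), (1, 1), (0, 1)],
    'B': [(1, 0), (2, 0), (2, 1), (1, 1)],
    'C': [(2, 0), (3, 0), (3, 1), (2, 1)],
}
for pid, ids in sorted(adjacency(squares).items()):
    print(pid, ','.join(ids))
</pre>
<p>An approach like this depends entirely on clean shared boundaries: if two polygons that should touch have slightly different vertices along their common edge, they will not be reported as neighbors.</p>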
<p>Everything seemed to run fine, so I re-ran the tool on a second set of Thiessen polygons that I had clipped with an outline of the US to create something more geographically realistic (so polygons that share a boundary only in the ocean or across the Great Lakes are not considered neighbors).</p>
<p><a href="http://gothos.info/wp-content/uploads/2012/01/thiessen_clip.png"><img src="http://gothos.info/wp-content/uploads/2012/01/thiessen_clip-300x154.png" alt="" title="thiessen polygons clipped" width="300" height="154" class="aligncenter size-medium wp-image-770" /></a></p>
<p>THEN &#8211; TROUBLE. I took some samples of the output table and checked the neighbors of a few features visually in Arc. I discovered two problems. First, I was missing about a thousand records or so in the output. When I geocoded them I couldn&#8217;t get a street-level address match for every record; the worst-case scenario was a plot to the ZCTA / ZIP code centroid for the address, which was an acceptable level of accuracy for this project. The problem is that if there are many point features plotted to the same coordinate (because they share the same ZIP), a polygon was created for one feature and the overlapping ones fell away (you can&#8217;t have overlapping Thiessen polygons). Fortunately this also wasn&#8217;t an issue for the person I was helping; we just needed to join the output table back to the master one to track which ones fell out and live with the result.</p>
<p>The <em>bigger</em> problem was the output was wrong. I discovered that the neighbor lists for most of the features I checked, especially polygons on the outer edge of the space, were incomplete; each feature had several (and in some cases, all) neighbors missing. Instead of using a shapefile of Thiessen polygons I tried running the tool on polygons that I generated as feature classes within an Arc geodatabase, and got the same output. For the heck of it I tried dissolving all the Thiessen polygons into one big polygon, and when I did that I noticed that I had orphaned lines and small gaps in what should have been one big, solid rectangle. I tried checking the geometry of the polygons and there were tons of problems. This led me to conclude that Arc did a lousy job when constructing the topology of the polygons, and the neighbor tool was giving me bad output as a result.</p>
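<p>One way to anticipate the first problem is to look for records that share a coordinate before building the polygons. The Python sketch below is only an illustration; the function name, record IDs, and coordinates are made up.</p>
<pre>
from collections import defaultdict

def duplicate_coordinates(points):
    """points is a list of (record ID, x, y) tuples. Returns a dict of
    coordinates shared by more than one record, mapped to their IDs."""
    by_coord = defaultdict(list)
    for rec_id, x, y in points:
        by_coord[(x, y)].append(rec_id)
    # Keep only coordinates that belong to two or more records
    return {xy: ids for xy, ids in by_coord.items() if len(ids) != 1}

# Invented sample: records 1 and 2 were both plotted to the same centroid
points = [(1, -73.98, 40.75), (2, -73.98, 40.75), (3, -74.0, 40.7)]
print(duplicate_coordinates(points))
</pre>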
<p><a href="http://gothos.info/wp-content/uploads/2012/01/thiessen_dissolve.png"><img src="http://gothos.info/wp-content/uploads/2012/01/thiessen_dissolve-300x150.png" alt="" title="thiessen polygons dissolved" width="300" height="150" class="aligncenter size-medium wp-image-771" /></a></p>
<p>Since I&#8217;ve been working more with GRASS, I remembered that GRASS vectors have strict topology rules, where features have shared boundaries (instead of redundant overlapping ones). So I imported my points layer from a shapefile into GRASS and then used the <a href="http://grass.osgeo.org/gdp/html_grass64/v.voronoi.html" target="_blank">v.voronoi</a> tool to create the polygons. The geometry looked sound, the attributes of each point were assigned to a polygon, and for overlapping points one polygon was created and attributes of the shared points were dumped. I exported the polygons out as a shapefile and brought them back into Arc, ran the Find Adjacent &#038; Neighboring Polygons tool, spot checked the neighbors of some features, and voila! The output was good. I clipped these polygons with my US outline, ran the tool again, and everything checked out.</p>
<p><a href="http://gothos.info/wp-content/uploads/2012/01/output.png"><img src="http://gothos.info/wp-content/uploads/2012/01/output-300x129.png" alt="" title="sample output neighboring polygons" width="300" height="129" class="aligncenter size-medium wp-image-772" /></a></p>
<p>Morals of this story? When geocoding addresses consider how the accuracy of the results will impact your project. If a tool or feature doesn&#8217;t exist assume that someone else has encountered the same problem and search for solutions. Never blindly accept output; take a sample and do manual checks. If one tool or piece of software doesn&#8217;t work, try exporting your data out to something else that will. Open source software and Creative Commons tools can save the day!</p>
<p>Footnote &#8211; apparently it&#8217;s possible to create lists of adjacent polygons in GRASS using the sides option in <a href="http://grass.osgeo.org/gdp/html_grass64/v.to.db.html" target="_blank">v.to.db</a>, although it isn&#8217;t clear to me how this is accomplished; the documentation talks about categories of areas on the right and left of a boundary, but not on all sides of an area. Since I already had a working solution I didn&#8217;t investigate further.</p>
]]></content:encoded>
			<wfw:commentRss>http://gothos.info/2012/01/thiessen-polygons-and-listing-neighboring-features/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>2010 Census Generalized Cartographic Boundary Files</title>
		<link>http://gothos.info/2011/12/2010-census-generalized-cartographic-boundary-files/</link>
		<comments>http://gothos.info/2011/12/2010-census-generalized-cartographic-boundary-files/#comments</comments>
		<pubDate>Fri, 23 Dec 2011 00:54:21 +0000</pubDate>
		<dc:creator>Frank</dc:creator>
				<category><![CDATA[Announcements]]></category>
		<category><![CDATA[Data Sources]]></category>
		<category><![CDATA[2010 Census]]></category>
		<category><![CDATA[census]]></category>
		<category><![CDATA[census geography]]></category>
		<category><![CDATA[geography]]></category>
		<category><![CDATA[Mapping]]></category>
		<category><![CDATA[united states]]></category>

		<guid isPermaLink="false">http://gothos.info/?p=756</guid>
		<description><![CDATA[I&#8217;ve had a few interesting projects that have kept me busy at the end of this year. I&#8217;ll do a post or two after New Year&#8217;s, once I&#8217;m back in the office and can take some screen shots to illustrate. In the meantime I have one tidbit I can mention &#8211; the Census Bureau has [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve had a few interesting projects that have kept me busy at the end of this year. I&#8217;ll do a post or two after New Year&#8217;s, once I&#8217;m back in the office and can take some screen shots to illustrate.</p>
<p>In the meantime I have one tidbit I can mention &#8211; the Census Bureau has released the 2010 version of the Generalized Cartographic Boundary Files. These files are generalized versions of the TIGER files, with smoothed and simplified boundaries and areas of coastal water removed. They haven&#8217;t posted them on the same page as the <a href="http://www.census.gov/geo/www/cob/bdy_files.html" target="_blank">2000 and 1990 boundaries</a>; they&#8217;ve mentioned they&#8217;re creating a new interface to host all of them, which is currently a work in progress at <a href="http://www.census.gov/geo/www/cob/" target="_blank">http://www.census.gov/geo/www/cob/</a>.</p>
<p>However, you can get access to all the 2010 boundaries via <a href="http://www2.census.gov/geo/tiger/GENZ2010/" target="_blank">the FTP site</a> &#8211; you just need to know what you&#8217;re looking at. All the files are named with codes to identify the geographic coverage, summary level, and resolution / scale. There&#8217;s a README file on the FTP page that tells you how to identify each. </p>
<p>But in brief &#8211; The file names look like this: gz_2010_ss_lll_vv_rr.zip, where:</p>
<ul>
<li>
ss is the state INCITS / FIPS code which you can <a href="http://www.census.gov/geo/www/ansi/statetables.html" target="_blank">look up here</a> &#8211; &#8216;us&#8217; is a national level file.
</li>
<li>
lll is the summary level or unit of geography &#8211; the README file has a table with each code. The most common ones: 040 for state, 050 for county, 060 for county subdivisions, 140 for census tracts, 160 for places, 310 for metropolitan and micropolitan statistical areas, 860 for ZCTAs. (No PUMAs- 2010 PUMA boundaries haven&#8217;t been drawn yet, and 2000 PUMA boundaries are still being used in the latest ACS).
</li>
<li>
vv is a version number for the file.
</li>
<li>
rr is resolution &#8211; most of the files are 500k = 1:500,000, which is the least generalized and best for mapping state-level to regional areas. For national level files you also have the option of 5m = 1:5,000,000 and 20m = 1:20,000,000, which are more generalized and better for national mapping.
</li>
</ul>
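<p>The naming pattern above can be unpacked with a few lines of Python. This is a minimal sketch: the function name is hypothetical, the sample file name is simply one that follows the pattern, and the scale lookup covers only the three resolutions mentioned here.</p>
<pre>
# Scale denominators for the resolution codes described above
SCALES = {'500k': 500000, '5m': 5000000, '20m': 20000000}

def parse_boundary_filename(filename):
    """Split a name like gz_2010_us_050_00_5m.zip into its coded parts."""
    stem = filename.rsplit('.', 1)[0]      # drop the .zip extension
    prefix, year, area, level, version, resolution = stem.split('_')
    return {
        'year': year,
        'area': area,               # state FIPS code, or 'us' for national
        'summary_level': level,     # e.g. 050 for county
        'version': version,
        'scale': SCALES.get(resolution),
    }

print(parse_boundary_filename('gz_2010_us_050_00_5m.zip'))
</pre>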
<p>The Census Bureau has been doing a lot of tweaking to their website lately. The legacy version of the American Factfinder is set to disappear for good on Jan 20, 2012.</p>
]]></content:encoded>
			<wfw:commentRss>http://gothos.info/2011/12/2010-census-generalized-cartographic-boundary-files/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Mapping Domestic Migration with IRS Data</title>
		<link>http://gothos.info/2011/11/mapping-domestitc-migration-with-irs-data/</link>
		<comments>http://gothos.info/2011/11/mapping-domestitc-migration-with-irs-data/#comments</comments>
		<pubDate>Fri, 18 Nov 2011 15:23:15 +0000</pubDate>
		<dc:creator>Frank</dc:creator>
				<category><![CDATA[Data Sources]]></category>
		<category><![CDATA[Maps]]></category>
		<category><![CDATA[counties]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[IRS]]></category>
		<category><![CDATA[migration]]></category>
		<category><![CDATA[united states]]></category>

		<guid isPermaLink="false">http://gothos.info/?p=731</guid>
		<description><![CDATA[Forbes magazine just published a neat interactive map on American migration using data NOT from the Census, but from &#8211; the IRS. Whether you fill it out virtually or the old fashioned way, everyone fills in their address at the top of the 1040, and the IRS stores this in a database. If you file [...]]]></description>
			<content:encoded><![CDATA[<p>Forbes magazine just published a neat <a href="http://www.forbes.com/special-report/2011/migration.html" target="_blank">interactive map on American migration</a> using data NOT from the Census, but from &#8211; the IRS. Whether you fill it out virtually or the old fashioned way, everyone fills in their address at the top of the 1040, and the IRS stores this in a database. If you file from a different address from one year to the next you must have moved, and the IRS publishes a summary file of where people went (with all personal information and practically all filing data stripped away).</p>
<p>The Forbes map taps into five years of this data and lets you see all domestic in-migration and out-migration from a particular county. The map is a flow or line map with lines going from the county you choose to each target &#8211; net in-migration to your county is colored in blue and net out-migration is red. You can also hover over the sending and receiving counties to see how many people moved. Click on the map to choose your county or search by name; you also have the option of searching for cities or towns, as the largest place within each county is helpfully identified and tied to the data.</p>
<p>It&#8217;s relatively straightforward and fun to explore. Some of the trends are pretty striking &#8211; the difference between declining cities (Wayne County &#8211; Detroit MI) and growing ones (Travis County &#8211; Austin TX) is pretty vivid, as is the change in migration during the height of the housing boom period in 2005 compared to the depth of the bust in 2009 (see Maricopa County &#8211; Phoenix AZ). More subtle is the difference in the scope of migration between urban and rural counties, with the former having more numerous and broader connections and the latter having smaller, more localized exchanges. Case in point is my home state of Delaware &#8211; urban New Castle County (Wilmington) compared to rural Sussex County (Seaford). There are many other stories to see here &#8211; the exodus from New Orleans after Katrina and the subsequent return of residents, the escape from Los Angeles to the surrounding mountain states, and the pervasiveness of Florida as a destination for everybody (click on the thumbnails below for full images of each map).</p>
<table border="0">
<tbody>
<tr>
<td>
<p><div id="attachment_733" class="wp-caption aligncenter" style="width: 160px"><a href="http://gothos.info/wp-content/uploads/2011/11/wayneco_mi.png"><img class="size-thumbnail wp-image-733" title="Detroit 2009" src="http://gothos.info/wp-content/uploads/2011/11/wayneco_mi-150x150.png" alt="Detroit 2009" width="150" height="150" /></a><p class="wp-caption-text">Wayne Co MI (Detroit) 2009</p></div></td>
<td>
<p><div id="attachment_734" class="wp-caption aligncenter" style="width: 160px"><a href="http://gothos.info/wp-content/uploads/2011/11/travisco_austin.png"><img class="size-thumbnail wp-image-734" title="Austin 2009" src="http://gothos.info/wp-content/uploads/2011/11/travisco_austin-150x150.png" alt="Austin 2009" width="150" height="150" /></a><p class="wp-caption-text">Travis Co TX (Austin) 2009</p></div></td>
<td>
<p><div id="attachment_735" class="wp-caption aligncenter" style="width: 160px"><a href="http://gothos.info/wp-content/uploads/2011/11/maricopa_az_05.png"><img class="size-thumbnail wp-image-735" title="Phoenix 2005" src="http://gothos.info/wp-content/uploads/2011/11/maricopa_az_05-150x150.png" alt="Phoenix 2005" width="150" height="150" /></a><p class="wp-caption-text">Maricopa Co AZ (Phoenix) 2005</p></div></td>
</tr>
<tr>
<td>
<div id="attachment_736" class="wp-caption aligncenter" style="width: 160px"><a href="http://gothos.info/wp-content/uploads/2011/11/maricopa_az_09.png"><img class="size-thumbnail wp-image-736" title="Phoenix 2009" src="http://gothos.info/wp-content/uploads/2011/11/maricopa_az_09-150x150.png" alt="Phoenix 2009" width="150" height="150" /></a><p class="wp-caption-text">Maricopa Co AZ (Phoenix) 2009</p></div></td>
<td>
<p><div id="attachment_737" class="wp-caption aligncenter" style="width: 160px"><a href="http://gothos.info/wp-content/uploads/2011/11/newcastleco_de.png"><img class="size-thumbnail wp-image-737" title="Wilmington 2009" src="http://gothos.info/wp-content/uploads/2011/11/newcastleco_de-150x150.png" alt="Wilmington 2009" width="150" height="150" /></a><p class="wp-caption-text">New Castle Co DE (Wilmington) 2009</p></div></td>
<td>
<p><div id="attachment_738" class="wp-caption aligncenter" style="width: 160px"><a href="http://gothos.info/wp-content/uploads/2011/11/sussexco_de.png"><img class="size-thumbnail wp-image-738" title="Seaford 2009" src="http://gothos.info/wp-content/uploads/2011/11/sussexco_de-150x150.png" alt="Seaford 2009" width="150" height="150" /></a><p class="wp-caption-text">Sussex Co DE (Seaford) 2009</p></div></td>
</tr>
</tbody>
</table>
<p>While the map is great, the even better news is that the data is free and can be downloaded by anyone from the <a href="http://www.irs.gov/taxstats/" target="_blank">IRS Statistics page</a>. They provide a lot of summary data &#8211; information for individuals is never reported. The <a href="http://www.irs.gov/taxstats/indtaxstats/article/0,,id=98123,00.html" target="_blank">individual tax data page</a> with data gleaned from the 1040 has the most data that is geographic in nature. If you wanted to see how much and what kind of tax is collected by state, county, and ZIP code you could get it there. The <a href="http://www.irs.gov/taxstats/article/0,,id=212683,00.html" target="_blank">US Population Migration data</a> used to create the Forbes map is also there and the years from 2005 to 2009 are free (migration data from 1991 to 2004 is available for purchase).</p>
<p>You can download separate files for county inflow and county outflow on a state by state basis in Excel (.xls) format, or you can download the entire enormous dataset in .dat or .csv format. The data that&#8217;s reported is the number of filings and exemptions that represent a change in address by county from one year to the next, and includes the aggregated adjusted gross income of the total filers. There are some limitations &#8211; in order to protect confidentiality, if the flow from one county to another has fewer than 10 moves that data is lumped into an &#8220;other&#8221; category. International migration is also lumped into one international category (on the Forbes map, both the other category where two counties have a flow of fewer than 10 and the foreign migration category are not depicted).</p>
<p>The IRS migration data is often used when creating population estimates; when combined with vital stats on births and deaths it can serve as the migration piece of the demographic equation.</p>
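<p>To make that last point concrete, here is a rough Python sketch of the basic demographic equation: the starting population plus natural increase (births minus deaths) plus net migration (inflow minus outflow). The function names and all of the numbers are invented for illustration, and the sketch assumes the inflow and outflow counts (for example, exemptions from the IRS files) are a reasonable stand-in for people, which is an approximation.</p>

```python
def net_migration(inflow, outflow):
    """People moving in minus people moving out."""
    return inflow - outflow


def estimate_population(base_population, births, deaths, inflow, outflow):
    """Basic demographic equation: base + natural increase + net migration."""
    natural_increase = births - deaths
    return base_population + natural_increase + net_migration(inflow, outflow)


# Invented figures for a hypothetical county over one year
estimate = estimate_population(
    base_population=100000,
    births=1200,
    deaths=900,
    inflow=5000,   # e.g. exemptions moving in, used as a proxy for people
    outflow=5600,  # e.g. exemptions moving out
)
print(estimate)  # 99700
```

<p>In this made-up case the county gains 300 people through natural increase but loses 600 through migration, so the estimate drops below the base.</p>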
]]></content:encoded>
			<wfw:commentRss>http://gothos.info/2011/11/mapping-domestitc-migration-with-irs-data/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Gothos Back In Business</title>
		<link>http://gothos.info/2011/10/gothos-back-in-business/</link>
		<comments>http://gothos.info/2011/10/gothos-back-in-business/#comments</comments>
		<pubDate>Thu, 06 Oct 2011 15:52:00 +0000</pubDate>
		<dc:creator>Frank</dc:creator>
				<category><![CDATA[Announcements]]></category>
		<category><![CDATA[gothos]]></category>

		<guid isPermaLink="false">http://gothos.info/?p=728</guid>
		<description><![CDATA[Gothos had some downtime this past weekend, as I was cleaning up after a hacking attack. I managed to get things fixed and was removed from Google&#8217;s blacklist, so all is now good and the site is safe. If you visited during the last two weeks of Sept into the first weekend of Oct, I&#8217;d [...]]]></description>
			<content:encoded><![CDATA[<p>Gothos had some downtime this past weekend, as I was cleaning up after a hacking attack. I managed to get things fixed and the site was removed from Google&#8217;s blacklist, so all is now good and the site is safe. If you visited during the last two weeks of Sept into the first weekend of Oct, I&#8217;d recommend running a virus and an anti-spyware scanner on your machine just in case.</p>
<p>I managed to restore all the posts and images, and most of the resources &#8211; a few are missing. If you linked to this site or to any individual post your links should still work. I did lose all of the comments and the user list. At this point I&#8217;m leery about opening membership back up; it looked like a lot of garbage was inserted via comments. Since this site has functioned more or less as a traditional, one-way resource and not as a collaborative blog, I&#8217;m going to keep comments turned off and not allow members for now. If you&#8217;d like to continue following I&#8217;d suggest you subscribe to the RSS feed or bookmark the site and check back now and again. Of course, you&#8217;re always welcome to send me an email if you have questions or comments.</p>
<p>Since I&#8217;m essentially starting fresh I went with a different theme for a new look and feel. I&#8217;d like to thank my host, <a href="http://www.webfaction.com/">Webfaction</a>, for their help in getting me through this, and I&#8217;d recommend the <a href="http://sitecheck.sucuri.net/scanner/">Sucuri website scanner</a> to anyone who owns a site or has suspicions about one they&#8217;re visiting.</p>
]]></content:encoded>
			<wfw:commentRss>http://gothos.info/2011/10/gothos-back-in-business/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>2010 American Community Survey Releases</title>
		<link>http://gothos.info/2011/09/2010-american-community-survey-releases/</link>
		<comments>http://gothos.info/2011/09/2010-american-community-survey-releases/#comments</comments>
		<pubDate>Fri, 23 Sep 2011 15:18:50 +0000</pubDate>
		<dc:creator>Frank</dc:creator>
				<category><![CDATA[Announcements]]></category>
		<category><![CDATA[Data Sources]]></category>
		<category><![CDATA[american community survey]]></category>
		<category><![CDATA[census]]></category>
		<category><![CDATA[census data]]></category>
		<category><![CDATA[census geography]]></category>

		<guid isPermaLink="false">http://gothos.info/?p=720</guid>
		<description><![CDATA[The US Census Bureau released the new annual data for the 2010 American Community Survey; this dataset includes an extensive number of demographic, socio-economic, and housing estimates (with margins of error) for all geographic areas in the US that have a population of at least 65,000 people. This is the first ACS survey that is [...]]]></description>
			<content:encoded><![CDATA[<p>The US Census Bureau released the <a href="http://www.census.gov/acs/www/data_documentation/2010_release/" target="_blank">new annual data for the 2010 American Community Survey</a>; this dataset includes an extensive number of demographic, socio-economic, and housing estimates (with margins of error) for all geographic areas in the US that have a population of at least 65,000 people. This is the first ACS survey that is weighted based on the 2010 Census, and that is tabulated entirely on the <a href="http://www.census.gov/acs/www/data_documentation/geography/" target="_blank">new 2010 Census geography</a>; exceptions include PUMAs and urban areas, which typically aren&#8217;t redrawn until a couple of years after a decennial census is taken. Data for these areas will be reported based on the 2000 Census geography. This will also be the first ACS that is distributed via the <a href="http://factfinder2.census.gov/faces/nav/jsf/pages/index.xhtml" target="_blank">new American Factfinder</a>. Previous ACS datasets should be moved to the new Factfinder by the end of this year.</p>
<p>According to the <a href="http://www.census.gov/acs/www/data_documentation/2010_release_schedule/" target="_blank">release schedule</a>, data for the three-year ACS (2008-2010) for areas with at least 20,000 residents will be published in October and the five-year ACS (2006-2010) for geography down to census tracts will be released in December. The three-year dataset hits a milestone this year, as for the first time we&#8217;ll have datasets with mutually exclusive years that can be feasibly compared for historical change (the 2005-2007 dataset versus 2008-2010). It should prove interesting as the earlier dataset represents the end of the brief boom years while the current one depicts the depth of the great recession. There will be some challenges in making comparisons, as the base for weighting the estimates and the geography used to tabulate them are different for each dataset (2000 Census in the earlier dataset versus 2010 Census in the latest one).</p>
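<p>Since ACS estimates come with margins of error, comparing two periods usually means checking whether the difference is statistically significant. The Python sketch below follows the standard approach of converting each published 90 percent margin of error to a standard error (dividing by 1.645) and computing a z-score for the difference. The function name and the estimates are invented for illustration, and the sketch ignores the weighting and geography differences mentioned above.</p>

```python
import math

Z_90 = 1.645  # critical value for the 90 percent confidence level


def compare_estimates(est_early, moe_early, est_late, moe_late):
    """Return the z-score for the change and whether it is significant at 90%."""
    se_early = moe_early / Z_90
    se_late = moe_late / Z_90
    z = (est_late - est_early) / math.sqrt(se_early ** 2 + se_late ** 2)
    return z, abs(z) > Z_90


# Invented estimates: 52,000 +/- 1,500 (2005-2007) vs 49,000 +/- 1,600 (2008-2010)
z_score, significant = compare_estimates(52000, 1500, 49000, 1600)
print(round(z_score, 2), significant)
```

<p>If the absolute z-score is larger than 1.645, the change between the two periods is unlikely to be due to sampling error alone.</p>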
]]></content:encoded>
			<wfw:commentRss>http://gothos.info/2011/09/2010-american-community-survey-releases/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Goings on at FOSS4G 2011</title>
		<link>http://gothos.info/2011/09/goings-on-at-foss4g-2011/</link>
		<comments>http://gothos.info/2011/09/goings-on-at-foss4g-2011/#comments</comments>
		<pubDate>Thu, 15 Sep 2011 18:53:55 +0000</pubDate>
		<dc:creator>Frank</dc:creator>
				<category><![CDATA[Events]]></category>
		<category><![CDATA[Resources]]></category>
		<category><![CDATA[FOSS4G]]></category>
		<category><![CDATA[open source]]></category>

		<guid isPermaLink="false">http://gothos.info/?p=716</guid>
		<description><![CDATA[I&#8217;m at FOSS4G in Denver this week (Free and Open Source for Geospatial conference) and have learned a few things (eventually all presentations, audio and visuals of slides, will be available online): There will be a QGIS update, version 1.71, sometime this month; it&#8217;s a minor release that will fix a few bugs. Some future [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;m at <a href="http://2011.foss4g.org/" target="_blank">FOSS4G in Denver</a> this week (Free and Open Source for Geospatial conference) and have learned a few things (eventually all presentations, audio and visuals of slides, will be available online):</p>
<ul>
<li>There will be a <a href="http://2011.foss4g.org/sessions/qgis-whats-new" target="_blank">QGIS update</a>, version 1.7.1, sometime this month; it&#8217;s a minor release that will fix a few bugs. Some future version of QGIS will include a Data Browser (think ArcCatalog).</li>
<li>For folks who have asked me how they can get more cartographic production power out of QGIS, <a href="http://2011.foss4g.org/sessions/quantum-gis-inkscape-cartographic-tools-attractive-maps" target="_blank">Inkscape looks</a> like a good option &#8211; folks at UC Davis have been experimenting with it with some success.</li>
<li>Learned about a documentation system for open source (or any) projects <a href="http://2011.foss4g.org/sessions/documenting-your-open-source-project-sphinx" target="_blank">called Sphinx</a>; documents are stored as reStructuredText files with some Python scripts that link them together and provide formatting for output and display.</li>
<li>Got a great, clear, concise overview of what&#8217;s involved with an open source <a href="http://2011.foss4g.org/sessions/introduction-opensource-webmapping-beyond-google-maps" target="_blank">web mapping stack</a>.</li>
<li>There&#8217;s a <a href="http://2011.foss4g.org/sessions/defining-gis-essential-gis-functions-users" target="_blank">study at Idaho State</a> (affiliated with the group of folks there that created MapWindow) that&#8217;s attempted to define the core functions of GIS based on a survey of GIS users. You can view their data by contacting the project lead.</li>
<li>Educators at a community college in Arizona are experimenting with an open source raster program called Opticks; a viable alternative to more expensive packages like ERDAS and IDRISI.</li>
<li>There are some new Python libraries you can use to create and mine KML data.</li>
<li><a href="http://2011.foss4g.org/sessions/new-way-open-data" target="_blank">The FCC</a> used a clever method for collapsing / aggregating US Census geography from the block level to create their Broadband Map.</li>
<li>While I&#8217;ve heard of and poked around the <a href="http://2011.foss4g.org/sessions/open-data-0" target="_blank">Open Street Map Project</a>, I never realized that many of the users were contributing to the project by walking, cycling, and driving around with GPS units and uploading their tracks to create and update road networks around the world. They also use some free datasets (like the Census TIGER files and equivalents from other countries) to augment and provide a frame of reference for their systems.</li>
<li>Data in the UK is finally opening up some more, and demand for products from the <a href="http://2011.foss4g.org/sessions/open-season-open-standards-open-source-and-open-data-ordnance-survey" target="_blank">Ordnance Survey</a> has been off the charts.</li>
<li>My presentation on using <a href="http://2011.foss4g.org/sessions/qgis-academic-library-case-study" target="_blank">QGIS in an Academic library</a> went pretty well, and I was pleased to discover I&#8217;m not the only GIS librarian at the conference! I&#8217;ve met folks from Ontario, Alberta, and Kansas.</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://gothos.info/2011/09/goings-on-at-foss4g-2011/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

