Posts Tagged ‘census data’

Downloading Data for Small Census Geographies in Bulk

Tuesday, May 7th, 2013

I needed to download block group level census data for a project I’m working on; there was one particular 2010 Census table that I needed for every block group in the US. I knew that the American Factfinder was out – you can only download block group data county by county (which would mean over 3,000 downloads if you want them all). I thought I’d share the alternatives I looked at; as I searched around the web I found many others who were looking for the same thing (i.e. data for the smallest census geographies covering a large area).

The Census FTP site at http://www2.census.gov/

This would be the first logical step, but in the end it wasn’t optimal based on my need. When you drill down through Census 2010, Summary File 1, you see a file for every state and a national file. Initially I thought – great! I’ll just grab the national file. But the national file does NOT contain the small census statistical areas – no tracts, block groups, or blocks. If you want those small areas you have to download the files for each of the states – 51 downloads. When you download the data you can also download an MS Access database, which is an empty shell with the geography and field headers, and you can import each of the text file data tables (there a lot of them for 2010 SF1) into the db and match them to the headers during import (the instructions that were included for doing this were pretty good). This is great if you need every variable in every table for every geography, but I was only interested in one table for one geography. I could just import the one text file with my table, but then I’d have to do this import process 51 times. The alternative is to use some Python to get that one text file for every state into one big file and then do the import once, but I opted for a different route.

The NHGIS at https://www.nhgis.org/

I always recommend this resource to anyone who’s looking for historical census data or boundary files, but it’s also good if you want current data for these small areas. I was able to use their query window to widdle down the selection by dataset (2010 SF1), geography (block groups), and topic (Hispanic origin and race in my case), then I was able to choose the table I needed. On the last screen before download I was able to check a box to include all 50 states plus DC and PR in one file. I had to wait a couple minutes for the request to process, then downloaded the file as a CSV and loaded it into my database. This was the best solution for my circumstances by far – one table for all block groups in the country. If you had to download a lot (or all) of the tables or variables for every block group or block it may take quite awhile, and plugging through all of those menus to select everything would be tedious – if that’s your situation it may be easier to grab everything using the Census FTP.

nhgis

UExplore / Dexter at http://mcdc.missouri.edu/applications/uexplore.shtml

The Missouri Census Data Center’s UExplore / Dexter tool lets you choose a dataset and takes you to a window that resembles a file system, with a ton of files in it. The MCDC takes their extracts directly from the Census, so they’re structured in a similar way to the FTP site as state-based files. They begin with the state prefix and have a name that indicates geography – there are files for block groups, blocks, and one for everything else. There are national files (which don’t contain small census areas) that begin with ‘us’. The difference here is – when you click on a file, it launches a query window that let’s you customize the extract. The interface may look daunting at first, but it’s worth exploring (and there’s a tutorial to help guide you). You can choose from several output formats, specific variables or tables (if you don’t want them all), and there are a bunch of handy options that you can specify like aggregation or percent totals. In addition to the complete datasets, they’ve also created ‘Standard Extracts’ that have the most common variables, if you want just a core subset. While the NHGIS was the best choice for my specific need, the customization abilities in Dexter may fit your needs – and the state-level block group and block data is conveniently broken out from the other files.

Lastly…

There are a few others tools – I’ll give an honorable mention to the Summary File Retrieval tool, which is an Excel plugin that lets you tap directly into the American Community Survey from a spreadsheet. So if you wanted tracts or block groups for a wide area for but a small number of variables (I think 20 is the limit) that could be a winner, provided you’re using Excel 2007 or later and are just looking at the ACS. No dice in my case, as I needed Decennial Census data and use OpenOffice at home.

NYC Geodatabase in Spatialite

Wednesday, February 6th, 2013

I spent much of the fall semester and winter interim compiling and creating the NYC geodatabase (nyc_gdb), a desktop geodatabase resource for doing basic mapping and analysis at a neighborhood level – PUMAs, ZIP Codes / ZCTAs, and census tracts. There were several motivations for doing this. First and foremost, as someone who is constantly introducing new people to GIS it’s a pain sending people to a half dozen different websites to download shapefiles and process basic features and data before actually doing a project. By creating this resource I hoped to lower the hurdles a bit for newcomers; eventually they still need to learn about the original sources and data processing, but this gives them a chance to experiment and see the possibilities of GIS before getting into nitty gritty details.

Second, for people who are already familiar with GIS and who have various projects to work on (like me) this saves a lot of duplicated effort, as the db provides a foundation to build on and saves the trouble of starting from scratch each time.

Third, it gave me something new to learn and will allow me to build a second part to my open source GIS workshops. I finally sat down and hammered away with Spatialite (went through the Spatialite Cookbook from start to finish) and learned spatial SQL, so I could offer a resource that’s open source and will compliment my QGIS workshop. I was familiar with the Access personal geodatabases in ArcGIS, but for the most part these serve as simple containers. With the ability to run all the spatial SQL operations, Spatialite expands QGIS functionality, which was something I was really looking for.

My original hope was to create a server-based PostGIS database, but at this point I’m not set up to do that on my campus. I figured Spatialite was a good alternative – the basic operations and spatial SQL commands are relatively the same, and I figured I could eventually scale up to PostGIS when the time comes.

I also created an identical, MS Access version of the database for ArcGIS users. Once I got my features in Spatialite I exported them all out as shapefiles and imported them all via ArcCatalog – not too arduous as I don’t have a ton of features. I used the SQLite ODBC driver to import all of my data tables from SQLite into Access – that went flawlessly and was a real time saver; it just took a little bit of time to figure out how to set up (but this blog post helped).

The databases are focused on NYC features and resources, since that’s what my user base is primarily interested in. I purposefully used the Census TIGER files as the base, so that if people wanted to expand the features to the broader region they easily could. I spent a good deal of time creating generalized layers, so that users would have the primary water / coastline and large parks and wildlife areas as reference features for thematic maps, without having every single pond and patch of grass to clutter things up. I took several features (schools, subway stations, etc) from the City and the MTA that were stored in tables and converted them to point features so they’re readily useable.

Given that focus, it’s primarily of interest to NYC folks, but I figured it may be useful for others who wish to experiment with Spatialite. I assumed that most people who would be interested in the database would not be familiar with this format, so I wrote a tutorial that covers the database and it’s features, how to add and map data in QGIS, how to work with the data and do SQL / spatial SQL in the Spatialite GUI, and how to map data in ArcGIS using the Access Geodb. It’s Creative Commons, Attribution, Non-Commercial, Share-alike, so feel free to give it a try.

I spent a good amount of time building a process rather than just a product, so I’ll be able to update the db twice a year, as city features (schools, libraries, hospitals, transit) change and new census data (American Community Survey, ZIP Business Patterns) is released. Many of the Census features, as well as the 2010 Census data, will be static until 2020.

Screen Scraping Data with Python

Friday, March 9th, 2012

I had a request recently for population centers (aka population centroids) for all the counties in the US. The Census provides the 2010 centroids in state level files and in one national file for download, but the 2000 centroids were provided in HTML tables on individual web pages for each state. Rather than doing the tedious work of copying and pasting 51 web pages into a spreadsheet, I figured this was my chance to learn how to do some screen scraping with Python. I’m certainly no programmer, but based on what I’ve learned (I took a three day workshop a couple years ago) and by consulting books and crawling the web for answers when I get stuck, I’ve been able to write some decent scripts for processing data.

For screen scraping there’s a must-have module called Beautiful Soup which easily let’s you parse web pages, well or ill-formed. After reading the Beautiful Soup Quickstart and some nice advice I found on a post on Stack Overflow, I was able to build a script that looped through each of the state web pages, scraped the data from the tables, and dumped it into a delimited text file. Here’s the code:

## Frank Donnelly Feb 29, 2012
## Scrapes 2000 centers of population for counties from individual state web pages
## and saves in one national-level text file.

from urllib.request import urlopen
from bs4 import BeautifulSoup

output_file=open('CenPop2000_Mean_CO.txt','a')
header=['STATEFP','COUNTYFP','COUNAME','STNAME','POPULATION','LATITUDE','LONGITUDE']
output_file.writelines(",".join(header)+"\n")

url='http://www.census.gov/geo/www/cenpop/county/coucntr%s.html'

fips=['01','02','04','05','06','08','09','10',
'11','12','13','15','16','17','18','19','20',
'21','22','23','24','25','26','27','28','29','30',
'31','32','33','34','35','36','37','38','39','40',
'41','42','44','45','46','47','48','49','50',
'51','53','54','55','56']

for i in fips:
  soup = BeautifulSoup(urlopen(url %i).read())
  titleTag = soup.html.head.title
  list=titleTag.string.split()
  name=(list[4:])
  state=' '.join(name)  
 
  for row in soup('table')[1].tbody('tr'):
    tds = row('td')
    line=tds[0].string, tds[1].string, tds[2].string, state, 
    tds[3].string.replace(',',''), tds[4].string, tds[5].string

    output_file.writelines(",".join(line)+"\n")     

output_file.close()

After installing the modules step 1 is to import them into the script. I initially got a little stuck here, because there are also some standard modules for working with urls (urllib and urlib2) that I’ve seen in books and other examples that weren’t working for me. I discovered that since I’m using Python 3.x and not the 2.x series, something had changed recently and I had to change how I was referencing urllib.

With that out of the way I created a a text file, a list with the column headings I want, and then wrote those column headings to my file.

Next I read in the url. Since the Census uses a static URL that varies for each state by FIPS code, I was able to assign the URL to a variable and inserted the % symbol to substitute where the FIPS code goes. I created a list of all the FIPS codes, and then I run through a loop – for every FIPS code in the list I pass that code into the url where the % place holder is, and process that page.

The first bit of info I need to grab is the name of the state, which doesn’t appear in the table. I grab the title tag from the page and save it as a list, and then grab everything from the fourth element (fifth word) to the end of the list to capture the state name, and then collapse those list elements back into one string (have to do this for states that have multiple words – New, North, South, etc.).

So we go from the HTML Title tag:

County Population Centroids for New York

To a list with elements 0 to 5:

list=["County", "Population", "Centroids", "for", "New", "York"]

To a shorter list with elements 4 to end:

name=["New","York"]

To a string:

state=”New York”

But the primary goal here is to grab everything in the table. So we identify the table in the HTML that we want – the first table in those pages [0] is just an empty frame and the second one [1] is the one with the data. For every row (tr) in the table we can reference and grab each cell (td), and string those cells together as a line by referencing them in the list. As I string these together I also insert the state name so that it appears on every line, and for the third list element (total population in 2000) I strip out any commas (numbers in the HTML table included commas, a major no-no that leads to headaches in a csv file). After we grab that line we dump it into the output file, with each value separated by a comma and each record on it’s own line (using the new line character). Once we’ve looped through each table on each page for each state, we close the file.

There are a few variations I could have tried; I could have read the FIPS codes in from a table rather than inserting them into the script, but I preferred to keep everything together. I could have read the state names in as a list, or coupled them with the codes in a dictionary. This would have been less risky then relying on the state name in the title tag, but since the pages were well-formed and I wanted to experiment a little I went the title tag route. Instead of typing the codes in by hand I used Excel trickery to concatenate commas to the end of each code, and then concatenated all the values together in one cell so I could copy and paste the list into the script.

You can go here to see an individual state page and source, and here to see what the final output looks like. Or if you’re just looking for a national level file of 2000 population centroids for counties that you can download, look no further!

ACS Trend Reports and Census Geography Guide

Sunday, February 12th, 2012

I recently received my first question from someone who wanted to compare 2005-2007 ACS data with 2008-2010. With the release of the latter, we can make historical comparisons with the three year data for the first time since we have estimates that don’t overlap. We should be able to make some interesting comparisons, since the first set covers the real estate boom years (remember those?) and the second covers the Great Recession. One resource that makes such comparisons relatively painless is over at the Missouri Census Data Center. They’ve put together a really clean and simple interface called the ACS Trends Menu, which allows you to select either two one period estimates or two three period estimates and compare them for several different census geographies – states, counties, MCDs, places, metros, Congressional Districts, PUMAs, and a few others – for the entire US (not just Missouri). The end result is a profile that groups data into the Economic, Demographic, Social, and Housing categories that the Census uses for its Demographic Profile tables. The calculations for change and percent change for the estimates and margins of error are done for you.

Downloading the data is not as straightforward – the links to extract it just brought me some error messages, so it’s still a work in progress. Until then, a simple copy and paste into your spreadsheet of choice will work fine.

ACS Trends Menu

If you like the interface, they’ve created separate ones for downloading profiles from any of the ACS periods or from the 2010 Census. The difference here is that you’re looking at one time frame; not across time periods. The interface and the output are the same, but in these menus you can compare four different geographies at once in one profile. Unlike the Trends reports, both the ACS and 2010 Census profiles have easy, clear cut ways to download the profiles as a PDF or a spreadsheet. If you’re happy with data in a profile format and want an interface that’s a little less confusing to navigate than the American Factfinder, these are all great alternatives (and if you’re building web applications these profiles are MUCH easier to work with – you can easily build permanent links or generate them on the fly).

The US Census Bureau also recently put together a great resource called the Guide to State and Local Census Geography. They provide a census geography overview of each state: 2010 population, land area, bordering states, year of entry into the union, population centroids, and a description of how local government is organized in the state – (i.e. do they have municipal civil divisions or only incorporated cities and unincorporated land, etc). You get counts for every type of geography – how many counties, tracts, ZCTAs, and so on, AND best of all you can download all of this data directly in tab delimited files. Need a list of every county subdivision in a state, with codes, land area, and coordinates? No problem – it’s all there.

2010 American Community Survey Releases

Friday, September 23rd, 2011

The US Census Bureau released the new annual data for the 2010 American Community Survey; this dataset includes an extensive number of demographic, socio-economic, and housing estimates (with margins of error) for all geographic areas in the US that have a population of at least 65,000 people. This is the first ACS survey that is weighted based on the 2010 Census, and that is tabulated entirely on the new 2010 Census geography; exceptions include PUMAs and urban areas, which typically aren’t redrawn until a couple of years after a decennial census is taken. Data for these areas will be reported based on the 2000 Census geography. This will also be the first ACS that is distributed via the new American Factfinder. Previous ACS datasets should be moved to the new Factfinder by the end of this year.

According to the release schedule data for the three year ACS (2008-2010) for areas with at least 20,000 residents will be published in October and the five year ACS (2006-2010) for geography down to census tracts will be released in December. The three year dataset hits a milestone this year, as for the first time we’ll have datasets with mutually exclusive years that can be feasibly compared for historical change (the 2005-2007 dataset versus 2008-2010). It should prove interesting as the earlier dataset represents the end of the brief boom years while the current one depicts the depth of the great recession. There will be some challenges in making comparisons, as the base for weighting the estimates and the geography used to tabulate them is different for each dataset (2000 Census in the earlier dataset versus 2010 Census in the latest one).

ZIP Code KML Map for NYC Census Data

Saturday, September 10th, 2011

With the release of both the 2010 Census profiles for ZCTAs (ZIP Code Tabulation Areas) and the TIGER line files for 2010 Census geographies, I created another Google Map finding aid for NYC neighborhood data by ZIP code (I previously created one for PUMAs with American Community Survey data). Once again I used the Export to KML plugin that was created for ArcGIS. This allowed me to use the TIGER shapefile in ArcGIS to create the map I wanted and then export it as a KML, while using fields in the attribute table of each feature to insert the ZCTA number into stable links for the census profiles, automatically generating unique urls for each feature. Click on the ZCTA in the map, and then click on a link to open a profile directly from the new American Factfinder.

There were two new obstacles I had to contend with this time. The first was that my department has finally migrated to Windows 7 from Windows XP, and I upgraded from ArcGIS 9.3 to 10. I had to reinstall the Export to KML plugin (version 2.5.5) and ran into trouble; fortunately all the work-arounds were included in the plugin’s documentation. I don’t have administrator rights on my machine, so I had to have someone install the plugin as an administrator; this included running the initial setup file AND running Arc as an administrator as you add and turn the plugin on. That was straightforward, but when I ran it the first time I got an error message – there’s a particular Windows dll or ocx file that the plugin needs and it was missing (presumably something that was included in XP but not in 7). I downloaded the necessary file, and with administrator rights moved it into the system32 folder and registered the file via the command line. After that I was good to go.

The second issue was with the Census Bureau’s new American Factfinder. With the old Factfinder the urls that were generated as you built and accessed tables were static and you could simply save and bookmark them. Not the case in the new Factfinder; you can bookmark some basic tables but most of them are “too complex to bookmark”; you can save and download queries from the online ap but that’s it. After some digging I found a CB document that tells you how you can create deep links to any query you run and table you create. The url consists of a fixed series of codes that identify the dataset, year, table, and geography. So this link:

http://factfinder2.census.gov/bkmk/table/1.0/en/DEC/10_DP/DPDP1/8600000US10010

Tells us that were getting a table from version 1.0 of the American Factfinder in English. It’s from the Decennial Census, 2010 Demographic Profiles, Demographic Profile Table 1, for ZCTA 10010 (860 is the summary level code that indicates we’re looking at ZCTAs). So for the plugin to create the links, I just included this URL but for the last five digits I specified the attribute from the ZCTA shapefile that held the ZCTA code. So when the plugin creates the KML, each KML feature has a link generated that is specific to it:

http://factfinder2.census.gov/bkmk/table/1.0/en/DEC/10_DP/DPDP1/8600000US[ZCTA5CE10]

You can see this previous post for details on how the Export to KML plugin works.

For now, the 2010 and 2000 Census are in the new American Factfinder. The American Community Survey, the Economic Census, population estimates, and a few other datasets are still in the older, legacy Factfinder. According to the CB all of this data will be migrated to the new Factfinder by the end of 2011 and the legacy version will disappear. At that point I’ll have to update my PUMA map so that it points to the profiles in the new Factfinder.

You can take a look at the ZCTA map and profiles below (I’m hosting it on the NYC data resource guide I’ve created for my college). As I’ve written before, ZCTAs are odd Census geographies since they are approximations of residential USPS ZIP Codes created by aggregating census blocks based on addresses; you can see in many instances where boundaries have a blocky teeth-like appearance instead of straight lines. Since they’re created directly by aggregating blocks, ZCTAs don’t correspond or mesh with other census boundaries like tracts or PUMAs, or even legal boundaries like counties. In some cases my assignment of county-based colors doesn’t ring true. For example, ZCTA 11370 includes part of the East Elmhurst neighborhood in Queens and Rikers Island, which is in the Bronx. ZCTA 10463 includes the Bronx neighborhoods of Kingsbridge and Spuyten Duyvil and the Manhattan neighborhood of Marble Hill (a geographic anomaly; it’s not on the Island of Manhattan but it’s part of Manhattan borough).

The most salient issue with ZCTAs is that they are only tabulated for the decennial census and not the American Community Survey; the currency of data and spectrum of census variables will be limited compared to other types of geography.


View Larger Map

2010 Census Data Being Released

Thursday, June 16th, 2011

The US Census Bureau has begin releasing data for Summary File 1, which is the primary summary data set that the Bureau tabulates. They will release data for groups of states on a weekly basis from June through September. Alabama and Hawaii were the first states released today. California, Delaware, Kansas, Pennsylvania and Wyoming are out next week.

This data is based on the 100% count of the population and is being released for geographies that nest within states: states, counties, county subdivisions, places, census tracts, ZCTAs, and congressional districts, and in some cases block groups and blocks. You can download the data table by table by building queries via the new American Factfinder, or power users can download entire datasets via the FTP site.

You’ll see how small the 2010 Census is compared to the past: we’re only going to get basic demographic variables. The extensive number of socio-economic indicators – education, income, language, employment status, etc – are no longer collected as part of the decennial census; you have to turn to the American Community Survey for this data, which is released on an annual basis.

Here’s what’s in the 2010 Census:

  • Total Population
  • Urban and Rural Population
  • Gender and Age
  • Race
  • Hispanic or Latino Origin
  • Households (Including Type and Size)
  • Group Quarters
  • Families
  • Family Relationships
  • Housing Units
  • Occupancy Status (Occupied or Vacant)
  • Tenure (Owner or Renter Occupied)

Many of these variables are cross-tabulated by age, gender, race, Hispanic or Latino Origin, Household Type, and Household Size. Once we get to the fall of 2011 we’ll start to see national level data for divsions, regions, and metropolitan areas.

2010 Census Redistricting Data

Sunday, April 17th, 2011

The Redistricting Summary Data [P.L. 94-171] from the 2010 Census has all been published for the nation, states, counties, and places, and is available via the new American Factfinder. The redistricting data includes basic demographic data: total population, race, Hispanic or Latino origin, and number of housing units occupied and vacant. Data is available down to census blocks and is available for most (but not all – no ZCTAs or PUMAs) geographies.

If you don’t want all the data for a state, don’t want to slog through the Factfinder, and are comfortable working with large text files, you can FTP the summary data from the Redistricting Data homepage. If you want basic summary data for states, counties, and places and don’t want to fuss with the Factfinder or text files, you can download Excel spreadsheets from the Redistricting Data Press Kit. They also have some pdf / jpg maps showing county level population and population change, plus interactive map widgets like the one below for the country and for each state. 2010 Redistricting TIGER Shapefiles have also been released for geographies included in the redistricting dataset.

The full 2010 Census for all geographies will be released throughout this summer and into the fall in Summary File 1 [SF1]. Stay tuned.

Some 2010 Census Updates

Monday, February 7th, 2011

Some geography updates to pass along regarding new US Census data:

  • The Census has released a few 2010 map widgets that you can embed in web pages. One shows population change, density, and apportionment for the whole country at the state level, while the other shows population, race, and Hispanic change for states at the county level. As of this post only four states are ready (LA, MS, NJ, and VA) but they’ll be adding the rest once they’re available.
  • The 2010 TIGER Line Files are starting to be released; they’ve changed the download interface a little bit based on user feedback. Most summary levels / geographic areas are available; some (like ZIP Codes and PUMAs) will be released later this year.
  • They’re also rolling out the new interface for the American Factfinder; currently you can get 2000 Census data, some population estimates, and the 2010 Census data as it becomes available. Other datasets like the American Community Survey and Economic Census will be added over time. Some maps and gov docs librarians have expressed concerned about the change – apparently when you download the data from the new interface the FIPS codes are not “ready to go” for joining to shapefiles; there’s one long geo id that has to be parsed. The other concern is that the 1990 Census won’t be carried over into the new interface at all. The original American Factfinder is slated to come down towards the end of this year.


Google Maps to Create a Census Finding Aid

Thursday, May 13th, 2010

Yikes! It’s been quite awhile since my last post (the past couple months have been a little tough for me), but I just finished an interesting project that I can share.

I constantly get questions from students who are interested in getting recent demographic and socio-economic profiles for neighborhoods in New York City. The problem is that neighborhoods are not officially defined, so we have to look for a surrogate. The City has created neighborhood-like areas out of census tracts called community districts and they publish profiles for them, but this data is from the decennial census  and not current enough for their needs.  ZIP code data is also only available from the decennial census.

We can use PUMAs (Public Use Microdata Areas) to approximate neighborhoods in large cities, and they are published as part of the 3 year estimates of the American Community Survey. The problem is, in order to look up the data from the census you need to search by PUMA number – there are no qualitative place names. The city and the census have worked together to assign names to neighborhoods as part of the NYC Housing and Vacancy Survey, but this is the only place (I’ve found) that uses these names. You need to look in several places to figure out what the PUMA number and boundaries for an area are and then navigate through the census site to find it. Too much for the average student who visits me at the reference desk or emails me looking for data.

My solution was to create a finding aid in Google maps that tied everything together:

View Larger Map

I downloaded PUMA boundaries from the Census TIGER file site in a shapefile format. I opened them up in ArcGIS and used an excellent script that I downloaded called Export to KML. ArcGIS 9.3 does support KML exports via the toolbox, and there are a number of other scripts and stand-alone programs that can do this (I tried several) but Export to KML was best (assuming you have access to ArcGIS) in terms of the level of customization and the thoroughness of the user documentation. I symbolized the PUMAs in ArcGIS using the colors and line thickness that I wanted and fired up the tool. It allows you to automatically group and color features based on the layer’s symbology. I was able to add a “snippet” to each feature to help identify it (I used the PUMA number as the attribute name and the neighborhood name as my snippet, so both appear in the legend) and added a description that would appear in the pop up window when that feature is clicked. In that description, I added the URL from the ACS census profile page for a particular PUMA – the cool part here is that the URL is consistent and contains the PUMA number. So, I replaced the specific number and inserted the [field] name from the PUMAs attribute table that contained the number. When I did the export, the URLs for each individual feature were created with their PUMA number inserted into the link.

There were a few quirks – I discovered that you can’t automatically display labels on a Google Map without subterfuge, like creating the labels as images and not text. Google Earth (but not Maps) supports labels if you create multi-geometry where you have a point for a label and a polygon for the feature. If you select a labeling attribute on the initial options screen of the Export to KML tool, you create an icon in the middle of each polygon that has a different description pop-up (which I didn’t want so I left it to none and lived without labels). I made my features 75% transparent (a handy feature of Export to KML) so that you could see the underlying Google Map features through the PUMA, but this made the fill AND the lines transparent, making the features too difficult to see. After the export I opened the KML in a text editor and changed the color values for the lines / boundaries by hand, which was easy since the styles are saved by feature group (boroughs) and not by individual feature (pumas). I also manually changed the value of the folder open element (from 0 to 1) so that the feature and feature groups (pumas and boroughs) are expanded by default when someone opens the map.

After making the manual edits, I uploaded the KML to my webserver and pasted the url for it into the Google Maps search box, which overlayed my KML on the map. Then I was able to get a persistent link to the map and code for embedding it into websites via the Google Map Interface. No need to add it to Google My Maps, as I have my own space. One big quirk – it’s difficult to make changes to an existing KML once you’ve uploaded and displayed it. After I uploaded what I thought would be my final version I noticed a typo. So I fixed it locally, uploaded the KML and overwrote the old one. But – the changes I made didn’t appear. I tried reloading and clearing the cache in my browser, but no good – once the KML is uploaded and Google caches it, you won’t see any of your changes until Google re-caches. The conventional wisdom is to change the name of the file every single time – which is pretty dumb as you’ll never be able to have a persistent link to anything. There are ways to circumvent the problem, or you can just wait it out. I waited one day and by the next the file was updated; good enough for me, as I’ll only need to update it once a year.

I’m hosting the map, along with some static PDF maps and a spreadsheet of PUMA names and neighborhood numbers, from the NYC Data LibGuide I created (part of my college’s collection of research guides). If you’re looking for neighborhood names to associate with PUMA numbers for your city, you’ll have to hunt around and see if a local planning agency or non-profit has created them for a project or research study (as the Census Bureau does not create them). For example, the County of Los Angeles Department of Mental Health uses pumas in a large study they did where they associated local place names with each puma.

If you’re interested in dabbling in some KML, there’s Google’s KML tutorial. I’d also recommend The KML Handbook by Josie Wernecke. The catch for any guide to KML is that while all KML elements are supported by Google Earth, there’s only partial support for Google Maps.


Copyright © 2013 Gothos. All Rights Reserved.
No computers were harmed in the 0.626 seconds it took to produce this page.

Designed/Developed by Lloyd Armbrust & hot, fresh, coffee.