Archive for the ‘Maps’ Category.

Google Maps to Create a Census Finding Aid

Yikes! It’s been quite a while since my last post (the past couple of months have been a little tough for me), but I just finished an interesting project that I can share.

I constantly get questions from students who are interested in getting recent demographic and socio-economic profiles for neighborhoods in New York City. The problem is that neighborhoods are not officially defined, so we have to look for a surrogate. The City has created neighborhood-like areas out of census tracts called community districts and it publishes profiles for them, but this data is from the decennial census and not current enough for the students’ needs. ZIP code data is also only available from the decennial census.

We can use PUMAs (Public Use Microdata Areas) to approximate neighborhoods in large cities, and data for them is published as part of the 3-year estimates of the American Community Survey. The problem is, in order to look up the data from the census you need to search by PUMA number – there are no qualitative place names. The city and the census have worked together to assign names to neighborhoods as part of the NYC Housing and Vacancy Survey, but this is the only place (I’ve found) that uses these names. You need to look in several places to figure out what the PUMA number and boundaries for an area are and then navigate through the census site to find it. That’s too much for the average student who visits me at the reference desk or emails me looking for data.

My solution was to create a finding aid in Google Maps that ties everything together.

I downloaded PUMA boundaries from the Census TIGER file site in a shapefile format. I opened them up in ArcGIS and used an excellent script that I downloaded called Export to KML. ArcGIS 9.3 does support KML exports via the toolbox, and there are a number of other scripts and stand-alone programs that can do this (I tried several) but Export to KML was best (assuming you have access to ArcGIS) in terms of the level of customization and the thoroughness of the user documentation. I symbolized the PUMAs in ArcGIS using the colors and line thickness that I wanted and fired up the tool. It allows you to automatically group and color features based on the layer’s symbology. I was able to add a “snippet” to each feature to help identify it (I used the PUMA number as the attribute name and the neighborhood name as my snippet, so both appear in the legend) and added a description that would appear in the pop up window when that feature is clicked. In that description, I added the URL from the ACS census profile page for a particular PUMA – the cool part here is that the URL is consistent and contains the PUMA number. So, I replaced the specific number and inserted the [field] name from the PUMAs attribute table that contained the number. When I did the export, the URLs for each individual feature were created with their PUMA number inserted into the link.
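The URL substitution step is easy to picture outside of the export tool. Here is a minimal Python sketch of the idea. The URL pattern, the `[PUMA5]` field name, and the sample records are all made up for illustration; they are not the real ACS profile address or the actual internals of Export to KML.

```python
# Sketch of the "[field] in a URL template" idea described above.
# The URL pattern and the field name are hypothetical placeholders.
URL_TEMPLATE = "https://example.org/acs/profile?geo=puma&id=[PUMA5]"

# A few made-up attribute-table rows: PUMA number plus neighborhood name.
pumas = [
    {"PUMA5": "03701", "NAME": "Sample Neighborhood A"},
    {"PUMA5": "03702", "NAME": "Sample Neighborhood B"},
]


def fill_template(template, record):
    """Replace every [FIELD] token with that field's value from the record."""
    result = template
    for field, value in record.items():
        result = result.replace("[" + field + "]", str(value))
    return result


def build_description(record):
    """Build the HTML snippet that would appear in a feature's pop-up window."""
    url = fill_template(URL_TEMPLATE, record)
    return '<a href="{0}">ACS profile for PUMA {1}</a>'.format(url, record["PUMA5"])


descriptions = [build_description(r) for r in pumas]
```

Because the only thing that varies from one profile page to the next is the PUMA number, one template produces a working link for every feature.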

There were a few quirks – I discovered that you can’t automatically display labels on a Google Map without subterfuge, like creating the labels as images and not text. Google Earth (but not Maps) supports labels if you create multi-geometry where you have a point for a label and a polygon for the feature. If you select a labeling attribute on the initial options screen of the Export to KML tool, you create an icon in the middle of each polygon that has a different description pop-up (which I didn’t want so I left it to none and lived without labels). I made my features 75% transparent (a handy feature of Export to KML) so that you could see the underlying Google Map features through the PUMA, but this made the fill AND the lines transparent, making the features too difficult to see. After the export I opened the KML in a text editor and changed the color values for the lines / boundaries by hand, which was easy since the styles are saved by feature group (boroughs) and not by individual feature (pumas). I also manually changed the value of the folder open element (from 0 to 1) so that the feature and feature groups (pumas and boroughs) are expanded by default when someone opens the map.
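I made these edits by hand, but the same two changes could be scripted. The sketch below uses Python's standard library on a tiny made-up KML document; the style id, folder name, and color values are assumptions for illustration, and a real export will have a different structure. KML colors are written as aabbggrr hex, so `40` in the first two digits is roughly 25% opaque (75% transparent) and `ff` is fully opaque.

```python
import xml.etree.ElementTree as ET

# A tiny stand-in for an exported file (structure and ids are made up).
SAMPLE_KML = (
    '<kml xmlns="http://www.opengis.net/kml/2.2"><Document>'
    '<Style id="bronx">'
    '<LineStyle><color>40ffffff</color><width>2</width></LineStyle>'
    '<PolyStyle><color>400000ff</color></PolyStyle>'
    '</Style>'
    '<Folder><name>Bronx</name><open>0</open></Folder>'
    '</Document></kml>'
)


def fix_kml(kml_text, line_color="ff000000"):
    """Make boundary lines opaque and expand folders by default."""
    root = ET.fromstring(kml_text)
    # Work out the namespace from the root tag so older KML versions also work.
    ns = root.tag[1:root.tag.index("}")] if root.tag.startswith("{") else ""
    prefix = "{" + ns + "}" if ns else ""
    # Only the line color changes; the polygon fill keeps its transparency.
    for line_style in root.iter(prefix + "LineStyle"):
        color = line_style.find(prefix + "color")
        if color is not None:
            color.text = line_color
    # Switch every folder's open element from 0 to 1.
    for folder in root.iter(prefix + "Folder"):
        open_element = folder.find(prefix + "open")
        if open_element is not None:
            open_element.text = "1"
    if ns:
        # Keep the output free of ns0: prefixes.
        ET.register_namespace("", ns)
    return ET.tostring(root, encoding="unicode")


fixed = fix_kml(SAMPLE_KML)
```

Since the styles are stored once per feature group, the loop touches only a handful of elements no matter how many PUMAs are in the file.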

After making the manual edits, I uploaded the KML to my webserver and pasted the URL for it into the Google Maps search box, which overlaid my KML on the map. Then I was able to get a persistent link to the map and code for embedding it into websites via the Google Maps interface. No need to add it to Google My Maps, as I have my own space. One big quirk – it’s difficult to make changes to an existing KML once you’ve uploaded and displayed it. After I uploaded what I thought would be my final version I noticed a typo. So I fixed it locally, uploaded the KML and overwrote the old one. But the changes I made didn’t appear. I tried reloading and clearing the cache in my browser, but no good – once the KML is uploaded and Google caches it, you won’t see any of your changes until Google re-caches. The conventional wisdom is to change the name of the file every single time – which is pretty dumb as you’ll never be able to have a persistent link to anything. There are ways to circumvent the problem, or you can just wait it out. I waited one day and by the next the file was updated; good enough for me, as I’ll only need to update it once a year.

I’m hosting the map, along with some static PDF maps and a spreadsheet of PUMA numbers and neighborhood names, from the NYC Data LibGuide I created (part of my college’s collection of research guides). If you’re looking for neighborhood names to associate with PUMA numbers for your city, you’ll have to hunt around and see if a local planning agency or non-profit has created them for a project or research study (as the Census Bureau does not create them). For example, the County of Los Angeles Department of Mental Health used PUMAs in a large study where they associated local place names with each PUMA.

If you’re interested in dabbling in some KML, there’s Google’s KML tutorial. I’d also recommend The KML Handbook by Josie Wernecke. The catch for any guide to KML is that while all KML elements are supported by Google Earth, there’s only partial support for Google Maps.

Track 2010 Census Participation Rates

The 2010 Census is in full swing – the target date of April 1st is coming up soon. I mailed my form back last week. If you’re curious as to how many others have mailed theirs back, check out the bureau’s interactive Take 10 Map. Built on top of a Google Maps interface, it allows you to track participation rates by state, county, place, reservation, and census tract. You can zoom in to change the scale and select different geography, or enter a ZIP code, city, or state to zoom to an area of choice. Clicking on an area will display its participation rate to date, compared to the state and national rates.

Data is updated daily, Monday through Friday. Once you click on a particular area, if you click the Track Participation Rate link it will create a widget that you can embed in a website to provide the updated rate. Unlike a lot of the other interactive web maps floating around these days, the bureau does give you the ability to download the actual data behind the map, if you want to do some analysis of your own.

Reading List for Geographic Information Course

The fall semester is here, and I’m about to start teaching the class I mentioned in my last post (an information studies course on geographic information). I thought I’d share my reading list and try out the Open Book plugin. I chose my readings based on: my particular audience (undergraduate students from many disciplines with little or no background in geography), relevance (materials appropriate in a hybrid information studies / geography course), cost (wanting to assign the students a single, reasonably priced textbook that covers all the bases, supplemented with other readings), and copyright (staying within the bounds of fair use by not assigning too much from a single work). Here goes:

Making Maps
John Krygier, Denis Wood; The Guilford Press 2005

I decided to go with Krygier and Wood’s Making Maps as my assigned textbook. Since cartography is a visual and technical art, I thought it made sense to use a book that relies on visuals for explanations rather than text. It’s approachable, particularly for my students who won’t be coming from a geography background, affordable, wonderfully quirky, and covers all of the essentials of the geographic framework and map interpretation and design independent of specific GIS software.

Place
Tim Cresswell; Blackwell Publishing Limited 2004

I’m using the first chapter of Cresswell’s book as a succinct introduction to how individuals define places, but would recommend the rest of the text for classes that cover geographic concepts and methods.

Georeferencing
Linda L. Hill; The MIT Press 2006

I’m assigning the second and third chapters of Hill’s book. The second chapter, which discusses how people process, store, and use geographic information is the best summary of this topic that I’ve ever seen, and the third chapter is a good overview of the different types of geographic objects. As a librarian-geo nerd, I love the chapters that deal with coordinate metadata and gazetteers, but won’t be using them in this class.

Image Of The City
Kevin Lynch; M.I.T. Press 1960

This is an urban planning / design classic, and I’ll have my students read the summary of Lynch’s city elements (based on his research, Lynch proposed that people mentally break the urban environment down into five types of elements in order to organize and navigate the city: paths, edges, districts, nodes, and landmarks).

Realms, Regions And Concepts

This is the only traditional textbook that I’ll be borrowing from (I actually used it when I was a freshman, way back when). While I’m using the previous three books to discuss egocentric places, or how we as individuals conceive of place, I’m using the first chapter of this book to give the students an overview of geocentric places – the formal, defined hierarchy of places that exist in the world – and to introduce them to the concept of regions.

How To Lie With Maps
Mark S. Monmonier; University Of Chicago Press 1991

This has become a modern classic and I almost assigned it as a second textbook. I am assigning the chapter on maps for propaganda as a background to our discussion on map interpretation and communication, and will later use the chapter on census maps to talk about the effects of data classification and choice of enumeration units.

Desktop GIS
Gary Sherman; Pragmatic Bookshelf 2008

This is the only software book that I’ll be using chapters from, so the students have some formal guide for using QGIS (in addition to the QGIS documentation). I’m using the chapters on vector and raster data.

Key Concepts And Techniques In GIS
Jochen Albrecht; Sage Publications Ltd 2007

This concise, excellent book deals strictly with the concepts and principles behind GIS. I’m using the chapters on spatial search and geoprocessing, but would recommend the entire book for any GIS course, novice to advanced.

In addition to chapters from these books, I’ll also be using:

  • “Revolutions in Mapping” by John Noble Wilford, National Geographic Feb 1998 – a great overview of the history of cartography
  • USGS GIS poster – if there is such a thing, this is a “web classic” and an accessible intro to GIS
  • One article from a scholarly journal and one article from a mass market magazine to illustrate how geographic research is covered and used
  • And for shameless self-promotion, a summary I wrote about US Census data – In Three Parts

Finally, an honorable mention:

A Primer Of GIS
Francis Harvey; The Guilford Press 2008

If I were teaching an introductory GIS course in a geography or earth sciences department, this is certainly the book I would use, and for those of you in that boat I’d recommend checking it out. It does an excellent job of covering GIS principles without being software specific, contains exercises at the end of each chapter, and is well written and affordable. Since the scope of my course is broader than GIS and my audience more general and diverse, I opted to leave it out (but may still assign a chapter).

Print Composer in QGIS – ACS Puma Maps

I wrapped up a project recently where I created some thematic maps of 2005-2007 ACS PUMA level census data for New York State. I decided to do all the mapping in open source QGIS, and was quite happy with the result, which leads me to retract a statement from a post I made last year, where I suggested that QGIS may not be the best for map layout. The end product looked just as good as maps I’ve created in ArcGIS. There were a few tricks and quirks in using the QGIS Print Composer and I wanted to share those here. I’m using QGIS 1.0.2 (Kore), and since I was at work I was using Windows XP with SP3 (I run Ubuntu at home but haven’t experimented with all of these steps yet using Linux). Please note that the data in this map isn’t very strong – the subgroup I was mapping was so small that there were large margins of error for many of the PUMAs, and in many cases the data was suppressed. But the map itself is a good example of what an ACS PUMA map can look like, and of what QGIS can do.

  • Inset Map – The map was of New York State, but I needed to add an inset map of New York City so the details there were not obscured. This was just a simple matter of using the Add New Map button for the first map, and doing it a second time for the inset. In the item tab for the map, I changed the preview from rectangle to cache and I had maps of NY state in each map. Changing the focus and zoom of the inset map was easy, once I realized that I could use the scroll on my mouse to zoom in and out and the Move Item Content button (hand over the globe) to re-position the extent (you can also manually type in the scale in the map item tab). Unlike other GIS software I’ve experimented with, the extent of the map layout window is not dynamically tied to the data view – which is a good thing! It means I can have these two maps with different extents based on data in one data window. Then it was just a matter of using the buttons to raise or lower one element over another.
  • Legend – Adding the legend was a snap, and editing each aspect of the legend, the data class labels, and the categories was a piece of cake. You can give your data global labels in the symbology tab for the layer, or you can simply alter them in the legend. One quirk for the legend and the inset map – if you assign a frame outline that’s less than 1.0, and you save and exit your map, QGIS doesn’t remember this setting when you open your map again – it sets the outline to zero.
  • Text Boxes / Labels – Adding them was straightforward, but you have to make sure that the label box is large enough to grab and move. One annoyance here is, if you accidentally select the wrong item and move your map frame instead of the label, there is no undo button or hotkey. If you have to insert a lot of labels or free text, it can be tiresome because you can’t simply copy and paste the label – you have to create a new one each time, which means you have to adjust your font size and type, change the opacity, turn the outline to zero, etc. each time. Also, if the label looks “off” compared to any automatic labeling you’ve done in the data window, don’t sweat it. After you print or export the map it will look fine.
  • North Arrow – QGIS does have a plugin for north arrows, but the arrow appears in the data view and not in the print layout. To get a north arrow, I inserted a text label, went into the font menu, and chose a font called ESRI symbols, which contains tons of north arrows. I just had to make the font really large, and experiment with hitting keys to get the arrow I wanted.
  • Scale Bar – This was the biggest weakness of the print composer. The scale bar automatically takes the unit of measurement from your map, and there doesn’t seem to be an option to convert your measurement units. Which means you’re showing units in feet, meters, or decimal degrees instead of miles or kilometers, which doesn’t make a lot of sense. Since I was making a thematic map, I left the scale bar off. If anyone has some suggestions for getting around this or if I’m totally missing something, please chime in.
  • Exporting to Image – I exported my map to an image file, which was pretty simple. One quirk here – regardless of what you set as your paper size, QGIS will ignore this and export your map out as the optimal size based on the print quality (dpi) that you’ve set (this isn’t unique to QGIS – ArcGIS behaves the same way when you export a map). If you create an image that you need to insert into a report or web page, you’ll have to mess around with the dpi to get the correct size. The map I’ve linked to in this post uses the default 300 dpi in a PNG format.
  • Printing to PDF – QGIS doesn’t have a built-in export function for PDF, so you have to use a PDF print driver via your print dialog (if you don’t have the Adobe PDF printer or a reasonable facsimile pre-installed, there are a number of free ones available on SourceForge – PDFCreator is a good one). I tried Adobe and PDFCreator and ran into trouble both times. For some reason when I printed to PDF it was unable to print the polygon layer I had in either the inset map or the primary map (I had a polygon layer of PUMAs and a point layer of PUMA centroids showing MOEs). It appeared that it started to draw the polygon layer but then stopped near the top of the map. I fiddled with the internal settings of both PDF drivers to no avail, and after a lot more tinkering found the answer. Right before printing to PDF, if I selected the inset map, chose the Move Item Content button (hand with globe), used the arrow key to move the extent up one, and then back one to get it to its original position, then printed the map, it worked! I have no idea why, but it did the trick. After printing the map once, to print it again you have to re-do this trick. I also noticed that after hitting print, if the map blinked and I could see all the elements, I knew it would work. But, if the map blinked and I momentarily didn’t see the polygon layer, I knew it wouldn’t export correctly.
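For the image export quirk, one way to cut down on the trial and error is to do a single test export and scale the dpi from there. This is a rough Python sketch that assumes the exported pixel dimensions grow in direct proportion to the dpi setting; I haven't verified that QGIS behaves exactly this way, and the numbers in the example are made up.

```python
def dpi_for_target_width(test_dpi, test_width_px, target_width_px):
    """Scale the dpi setting so the exported width lands near the target.

    Assumes pixel width is proportional to dpi.
    """
    return test_dpi * target_width_px / test_width_px


# Example: a trial export at 300 dpi came out 3300 px wide (made-up number),
# and the web page needs an image about 800 px wide.
suggested = dpi_for_target_width(300, 3300, 800)
```

Round the result to the nearest whole number, export again, and check the actual size before dropping the image into a page.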

Despite a few quirks (what software doesn’t have them), I was really happy with the end result and find myself using QGIS more and more for making basic to intermediate maps at work. Not only was the print composer good, but I was also able to complete all of the pre-processing steps using QGIS or another open source tool. I’ll wrap up by giving you the details of the entire process, and links to previous posts where I discuss those particular issues.

I used 2005-2007 American Community Survey (ACS) data from the US Census Bureau, and mapped the data at the PUMA level. I had to aggregate and calculate percentages for the data I downloaded, which required using a number of spreadsheet formulas to calculate new margins of error (MOEs). I downloaded a PUMA shapefile layer from the US Census Generalized Cartographic Boundary files page, since generalized features were appropriate at the scale I was using. The shapefile had an undefined coordinate system, so I used the Ftools add-on in QGIS to define one. I also converted the shapefile from single-part to multi-part features. Then I used Ftools to join my shapefile to the ACS data table I had downloaded and cleaned up (I had to save the data table as a DBF in order to do the join). Once they were joined, I classified the data using natural breaks (I sorted and eyeballed the data and manually created breaks based on where I thought there were gaps). I used the Color Brewer tool to choose a good color scheme, and entered the RGB values in the color / symbology screen. Once I had those colors, I saved them as custom colors so I could use them again and again. Then I used Ftools to create a polygon centroid layer out of my PUMA / data layer. I used this new point layer to map my margin of error values. Finally, I went into the print composer and set everything up. I exported my maps out as PNGs, since this is a good image format for preserving the quality of the maps, and as PDFs.
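I did the margin of error math in a spreadsheet, but for anyone who would rather script it, here is a short Python sketch. It uses the standard approximation formulas the Census Bureau publishes for derived ACS estimates (the MOE of a sum, and the MOE of a proportion where the numerator is a subset of the denominator). The function names and the sample figures are mine and purely illustrative.

```python
import math


def moe_of_sum(moes):
    """MOE for a sum of estimates: square root of the sum of squared MOEs."""
    return math.sqrt(sum(m ** 2 for m in moes))


def moe_of_proportion(numerator, numerator_moe, denominator, denominator_moe):
    """MOE for numerator / denominator, where the numerator is part of the denominator."""
    p = numerator / denominator
    under_root = numerator_moe ** 2 - (p ** 2) * (denominator_moe ** 2)
    if under_root < 0:
        # When the value under the root is negative, the usual guidance is to
        # switch to the ratio formula, which adds the second term instead.
        under_root = numerator_moe ** 2 + (p ** 2) * (denominator_moe ** 2)
    return math.sqrt(under_root) / denominator


# Made-up example: two age groups (1,200 +/- 300 and 800 +/- 400) combined
# into one subgroup, then expressed as a share of 20,000 +/- 1,000 people.
group_moe = moe_of_sum([300, 400])
share_moe = moe_of_proportion(2000, group_moe, 20000, 1000)
```

The resulting MOE is on the same scale as the proportion, so multiply both by 100 if you are mapping percentages.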

NY Times Interactive Maps

The New York Times has been putting together some great, web-based, thematic maps lately. I thought I’d provide a summary of some of the latest and greatest here.

US Maps

  • Immigration Explorer – Explore foreign born groups for the United States by county, based on the decennial census from 1880 to 2000. Choropleth maps of the largest immigrant group per county and graduated circle maps depicting the size of each group. 3/10/2009.
  • The Geography of a Recession – Choropleth map of the US that shows the annual change in unemployment by county. Lets you filter by county type (urban, rural, manufacturing areas, housing bubbles). 3/7/2009.
  • A Growing Detention Network – a graduated circle map of the US that shows detention centers where people are held on immigration violations, by number of detainees and type of facility. 12/26/2008.
  • Where Homes Are Worth Less Than the Mortgage – State-based US choropleth and graduated circle maps of the housing and debt crisis. 11/10/2008.

NYC Maps

  • New Yorkers Assess Their City – How New Yorkers rate their neighborhood based on quality of life, city services, education, transportation and crime. Based on a large survey of city residents. Choropleth maps of community districts. 3/7/2009.
  • Census Shows Growing Diversity in New York City – Choropleth maps show median rent and median income by PUMAs in 2000 and 2007. An example of mapping 3-year ACS data by PUMAs to show patterns below the county level. 12/9/2008.

World Maps

  • A Map of Olympic Medals – A cartogram of countries based on the number of Olympic medals they’ve won for every Olympics from 1896 to 2008. Mouse over to get medal counts. 8/4/2008.

Mapping ACS Census Data for Urban Areas With PUMAs

The NY Times wrote a story recently based on the new 3-year ACS data that the Census Bureau released a couple of weeks ago (see my previous post for details). They created some maps for this story using geography that I would never have thought to use.

Outside of decennial census years, it is difficult to map demographic patterns and trends within large cities as you’ll typically get one figure for the entire city and you can’t get a breakdown for areas within. Data for areas like census tracts and ZIP codes is not available outside the ten-year census (yet), and large cities exist as single municipal divisions that aren’t subdivided. New York City is an exception, as it is the only city composed of several counties (boroughs) and thus can be subdivided. But the borough data still doesn’t reveal much about patterns within the city.

The NY Times used PUMAs – Public Use Microdata Areas – to subdivide the city into smaller areas and mapped rents and income. PUMAs are aggregations of census tracts and were designed for aggregating and mapping public microdata. Microdata consists of a selection of actual individual responses from the census or survey with the personal identifying information (name, address, etc.) stripped away. Researchers can build their own indicators from scratch, aggregate them to PUMAs, and then figure out the degree to which the sample represents the entire population.

Since PUMAs have a large population, the new three-year ACS data is available at the PUMA level. The PUMAs essentially become surrogates for neighborhoods or clusters of neighborhoods, and in fact several NYC agencies have created districts or neighborhoods based on these boundaries for statistical or planning purposes. This wasn’t the original intent for creating or using PUMAs, but it’s certainly a useful application of them.

You can check out the NY Times article and maps here – Census Shows Growing Diversity in New York City (12/9/08). I tested ACS / PUMA mapping out myself by downloading some PUMA shapefiles from the Census Bureau’s Generalized Cartographic Boundaries page, grabbing some of the new 3-year ACS data from American FactFinder, and creating a map of Philly. In the map below, you’re looking at 2005-2007 averaged data that shows the percentage of residents who lived in their current home last year. If you know Philly, you can see that the PUMAs do a reasonable job of approximating regions in the city – South Philly, Center City, West Philly, etc.

The problem I ran into here was that data did not exist for all of the PUMAs – in this case, South Philly and half of North Philly had values of zero. According to the footnotes on the ACS site, there were no values for these areas because “no sample observations or too few sample observations were available to compute an estimate, or a ratio of medians cannot be calculated because one or both of the median estimates falls in the lowest interval or upper interval of an open-ended distribution”. So even though the PUMA geography is generally available, there still may be cases where data for particular variables for individual geographies is missing.

Just for the heck of it, I tried looking at the annual ACS data, which is limited to more populated areas (areas must have a population of at least 65k, whereas the 3-year estimates cover areas with at least 20k), and even more data was missing (in this instance, all the areas in the northeast). Even though PUMAs have a minimum population of 100k people, the ACS sampling is county based. So even if the sample size for a county is ideal, the sample may not be large enough to compute an estimate for individual places within that county. At least, that’s my guess. Regardless, it’s still worth looking at for the city and data you’re interested in.

ACS Data for Philly Pumas

Red States / Blue States

It’s been a busy summer – I’ve spent a good chunk of it working on an election mapping project. The library wanted to create a resource for students to use for the upcoming 2008 presidential election. Here it is:

Red States, Blue States: Electoral Strategy Behind the Map

A few of the procedures and issues I encountered while working on the project became fodder for a number of posts to this blog, so I thought I’d share the end result.

I’ve also been assembling data and pages for a server I’ve been given to provide GIS data at Baruch, and have been investigating open source alternatives to ArcGIS. More on that later…

Image Formats for Exporting Maps

I’ve been working on a project where I need to create maps in ArcGIS, save them as images, and embed them in a webpage. Seems simple enough, right? Well, it turned into a much more complicated affair, as the images I was exporting looked terrible in the file formats I was using. I thought I would share this experience, as I had a hard time finding info about it and I imagine this is a problem that many have faced at one point or another.

I was exporting some basic two-color thematic maps of the US out as jpegs, and the colors were blurry and the boundaries block-like. I tried increasing the resolution, which didn’t work because it made the images larger. Couldn’t do that, because I needed the images to be a specific size to mesh with the content on the pages I was creating. So I tried tiffs and gifs as well, which were only mildly better.

I recalled having these problems in the past, but I always got around it by exporting the maps out as pdfs, which look pretty good. But in those cases I was just trying to preserve the map in a static format, and since you can’t embed pdfs into html (you can only link to them) that option was out. I’ve used emf files when my goal was to insert the image into a Word document, but emfs are not recognized by web browsers nor can they be embedded in html, so no dice there.

As I delved into this further, I discovered that pdfs and emfs looked good because they are vector based. Since the map I was creating is vector based, the conversion is pretty clean. The jpegs and tiffs are raster based. So when you make the conversion, the image quality suffers, particularly when using jpegs as the files get compressed. So, what I really needed here was a vector based image format that you could view in a web browser.

This is when I stumbled upon svg files – scalable vector graphics. They are an open standard, vector based, and are essentially xml files. You can even open them in a text editor and, if you know what you’re doing, edit them. They are scalable because you can zoom in and out without the resolution getting poor. SVG files can be viewed using recent versions of the Firefox browser, and you can embed them into html using an object or embed tag (can’t use the standard img tag). They look great – crystal clear. The problem here is that Internet Explorer doesn’t support svg without a special Adobe plugin. Doh! Which means if you’re designing a web page with svg files, only 18% of the web surfing population can view them without having to bend over backwards. So, that’s not going to work.

Then I was surfing around Wikipedia (for unrelated reasons), and noticed that several maps embedded in their pages are in SVG format. And, I was able to view them in Firefox and IE without a problem, and without a plugin. Then I discovered on one of the documentation pages that they use a program within the MediaWiki software called RSVG that draws from a library called librsvg, which rasterizes all of their svg files. The program looks like it does a great job. But getting the web server I’m using configured to handle this is beyond my control. But it’s good to know that there is a server-side solution.

I did find a detailed page on Wikipedia that was created to guide people in submitting images to the site, and they recommended using SVGs or PNGs – portable network graphics, which are an open standard raster format. They had some useful illustrations comparing the quality of the different images and the reasons why some are better than others.

In the end, I went with the png format, which still isn’t as crisp as the svg but is far better than the other rasters. And, it’s widely supported, so no problems embedding it in html with the standard img tags. Some older versions of the IE browser may render them a little funny, or not at all, but you’re safe if you’re using version 6 or 7. Hopefully, the next version of IE will support svg, as it does provide some great opportunities for creating maps outside of GIS. If you have an svg file with countries of the world (download one for free from Wikipedia), you can open it in a text editor and assign different countries different background colors based on a range of values. But that’s another story.
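To give a taste of that other story, here is a rough Python sketch of coloring svg shapes by value. It assumes each shape carries an id attribute you can match to your data and takes its color from a fill attribute. The two-rectangle sample file, the ids, the class breaks, and the hex colors are all made up for illustration; real map files from Wikipedia may be organized differently (for example, with CSS classes or style attributes instead of fill).

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

# A two-shape stand-in for a map file (ids and geometry are made up).
SAMPLE_SVG = (
    '<svg xmlns="http://www.w3.org/2000/svg" width="200" height="100">'
    '<path id="aa" d="M0 0 H90 V90 H0 Z" fill="#cccccc"/>'
    '<path id="bb" d="M100 0 H190 V90 H100 Z" fill="#cccccc"/>'
    '</svg>'
)

# Upper bounds for the first two classes and one color per class (three classes).
BREAKS = [10, 20]
COLORS = ["#fee8c8", "#fdbb84", "#e34a33"]


def color_for(value):
    """Return the color of the first class whose upper bound exceeds the value."""
    for upper, color in zip(BREAKS, COLORS):
        if value < upper:
            return color
    return COLORS[-1]


def recolor(svg_text, values):
    """Set the fill of every element whose id appears in the values dict."""
    ET.register_namespace("", SVG_NS)  # keep the output free of ns0: prefixes
    root = ET.fromstring(svg_text)
    for element in root.iter():
        shape_id = element.get("id")
        if shape_id in values:
            element.set("fill", color_for(values[shape_id]))
    return ET.tostring(root, encoding="unicode")


recolored = recolor(SAMPLE_SVG, {"aa": 5, "bb": 25})
```

Write the returned text to a new .svg file and you have a simple choropleth without ever opening GIS software.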

In summary, when you want to save maps as static files:

- Use pngs if you want to embed them in a web page
- Use emf if you want to embed them in a word processing document
- Use pdf to preserve the map in a stand-alone file for linking to or printing
- Use svg for preserving stand-alone maps for viewing locally or printing

Compare:

JPEG Map vs PNG Map

(You can download an SVG example as well and take a look in Firefox. For some reason, if you try and view it directly from here, you’ll see the code and not the image – may have something to do with the configuration of this webserver – I’ll have to investigate).

Useful links:

- Wikipedia guidelines for submitting images, includes discussion of jpeg, png, svg
- Wikipedia guidelines for svg
- Download svg map images from Wikipedia
- Instructions for embedding svg into html
- SVG homepage
- PNG homepage
- Adobe SVG viewer