I’m coming out of my blog hibernation for this announcement – the US Census Bureau is proposing to drop the 3-year series of the American Community Survey in fiscal year 2016. A colleague mentioned that he overheard this at a meeting yesterday. Searching the web, I found a post at the Free Government Information site which points to this Census Bureau press release. The press release cites the predictable reasons (budget constraints, funding priorities, etc.) for dropping the series. Oddly, the news comes through some random site and not through the Census Bureau’s website, where there’s no mention of it. I saw that Stanford also had a post, where they shared the same press release.
I kept searching for some definitive proof, and through someone’s tweet I found a link to a PDF of the US Census Bureau’s Budget Estimates for Fiscal Year 2016, presented to Congress in February 2015. I found confirmation buried on page CEN – 106 (the 100th page in a 190-page doc):
Restoration of ACS Data Products ($1.5 million): Each year, the ACS releases a wide range of data products widely used by policymakers, Federal, state and local governments, businesses and the public to make decisions on allocation of taxpayer-funds, the location of businesses and the placement of products, emergency management plans, and a host of other matters. Resource constraints have led to the cancellation of data products for areas with populations between 20 and 60 thousand based on 3-year rolling averages of ACS data (known as the “3-Year Data” Product). They have also resulted in delays in the release of the 1- and 5- year Public Use Macro Sample (PUMS) data files and canceled the release of the 5- year Comparison Profile data product and the Spanish Translation of the 1- and 5- year Puerto Rico data products.
The Census Bureau proposes to terminate permanently the 3-Year Data Product. The Census Bureau intended to produce this data product for a few years when the ACS was a new survey. Now that the ACS has collected data for nearly a decade, this product can be discontinued without serious impacts on the availability of the estimates for these communities.
The ACS would like to restore the timely release of the other essential products in FY2016. The continued absence of these data products will impact the availability of data – especially for Puerto Rico – to public and private sector decision makers.
So at this point it’s still just a proposal. The benefits, besides the ability to release other datasets in a timely fashion, would be simplification for users. Instead of choosing among three datasets, there would be only two – the one year and the five year. You choose the one year for large areas and the five year for everywhere else. In terms of disadvantages, consider this example – here is the number of children enrolled in nursery school in NY State PUMA 03808, which covers Murray Hill, Gramercy, and Stuyvesant Town in the eastern half of Midtown Manhattan:
Population Over 3 Years Old Enrolled in Nursery / Pre-school
- 1 year 2013: 1,166 +/- 609
- 3 year 2011-2013: 1,549 +/- 530
- 5 year 2009-2013: 1,819 +/- 409
Since PUMAs are statistical areas built to contain at least 100k people, data for all of them is available in each series. Like all ACS estimates, these come with margins of error at the 90% confidence level. Look at the data for the 1-year series. The margin of error (ME) is so large that it’s approximately 50% of the estimate, which in my opinion makes it worthless for just about any application. The estimate itself is much lower than the estimates for the other two series. It’s true that it’s only capturing the latest year, but administrative data and news reports suggest that the number of nursery school children in the district that covers this area has been relatively stable over time, with modest increases (geographically the district covers an area much larger than this PUMA). This suggests that the 1-year estimate itself is not very reliable.
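To make the 90% confidence interval concrete, here is a quick sketch in Python that turns each estimate and margin of error from the list above into lower and upper bounds. The function name and the dictionary layout are my own, made up for illustration; the only thing assumed about the ACS is that the interval is the estimate plus or minus the published margin of error.

```python
# Estimates and 90% margins of error for PUMA 03808, copied from the list above.
SERIES = {
    "1-year 2013": (1166, 609),
    "3-year 2011-2013": (1549, 530),
    "5-year 2009-2013": (1819, 409),
}


def confidence_interval(estimate, moe):
    """Return (lower, upper) bounds: the estimate minus and plus its margin of error."""
    return estimate - moe, estimate + moe


for label, (estimate, moe) in SERIES.items():
    low, high = confidence_interval(estimate, moe)
    print(f"{label}: {low:,} to {high:,}")
```

For the 1-year series the interval runs from 557 to 1,775, a spread of more than 1,200 children around an estimate of 1,166.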
The 5-year estimate may be closer to reality, and its ME is only about 22% of the estimate. But it covers five years in time. If you wanted something that was a compromise – more timely than the five year but with a lower ME than the one year – then the three year series was your choice, in this case with an ME that’s about 34% of the estimate. But under this proposal, this choice goes away and you have to make do with either 1-year estimates (which will be lousy for geographies that aren’t far above the 65k population threshold, and lousy for small population groups wherever they are located), or better 5-year estimates that cover a greater time span.
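The percentages I quoted are just the margin of error divided by the estimate. Here is a minimal Python sketch of that arithmetic, plus the coefficient of variation (CV), which is another common way to judge reliability. The function names are hypothetical, and the CV step assumes the usual conversion for ACS data: divide the 90% margin of error by 1.645 to get the standard error.

```python
Z_90 = 1.645  # z-score for a 90% confidence level, used to convert ME to standard error

# Estimates and 90% margins of error for PUMA 03808, from the list earlier in the post.
SERIES = {
    "1-year 2013": (1166, 609),
    "3-year 2011-2013": (1549, 530),
    "5-year 2009-2013": (1819, 409),
}


def moe_percent(estimate, moe):
    """Margin of error as a percentage of the estimate."""
    return 100 * moe / estimate


def coefficient_of_variation(estimate, moe):
    """Standard error (ME / 1.645) as a percentage of the estimate."""
    standard_error = moe / Z_90
    return 100 * standard_error / estimate


for label, (estimate, moe) in SERIES.items():
    pct = moe_percent(estimate, moe)
    cv = coefficient_of_variation(estimate, moe)
    print(f"{label}: ME is {pct:.0f}% of the estimate, CV is {cv:.1f}%")
```

The ME works out to roughly 52%, 34%, and 22% of the estimate for the 1-year, 3-year, and 5-year series, so each step up in time span buys a noticeably tighter estimate.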