The US Census Bureau has begun releasing data for the 2007 Economic Census. The bureau conducts this survey of businesses every five years: all medium and large businesses and multi-part businesses are counted, smaller businesses are sampled, and various administrative records are used to estimate figures for businesses with no employees (such as freelancers). All businesses are categorized hierarchically by North American Industry Classification System (NAICS) codes, and the data is reported by industry and geography. The number of establishments, the number of employees, payroll, and sales are reported for the nation, states, counties, metro areas, places, and ZIP codes.
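To make the hierarchy concrete, here is a minimal Python sketch of how a six-digit NAICS code nests inside its broader categories. The function name naics_ancestry is made up for illustration, and the sketch assumes a plain six-digit code; a few sectors are published as ranges (Manufacturing is 31-33, for example), which this simple prefix approach does not label specially.

```python
# Names of the NAICS levels, keyed by the number of digits in the code.
LEVELS = {
    2: "sector",
    3: "subsector",
    4: "industry group",
    5: "NAICS industry",
    6: "national industry",
}

def naics_ancestry(code):
    """Return (level name, code prefix) pairs from broadest to most specific."""
    if not (code.isdigit() and len(code) == 6):
        raise ValueError("expected a six-digit NAICS code, got %r" % code)
    return [(LEVELS[n], code[:n]) for n in sorted(LEVELS)]

# 541511 is Custom Computer Programming Services, within sector 54.
ancestry = naics_ancestry("541511")
for level, prefix in ancestry:
    print(level, prefix)
```

Because each level is just a prefix of the next, rolling detailed data up to a broader category is a matter of grouping on the first few digits of the code.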
At this point, national industry totals for the broadest NAICS categories are available, as are preliminary national numbers for the most specific (six-digit) NAICS categories. Data for smaller geographic areas will be released between October 2009 and August 2010.
The biggest change from the 2002 Economic Census is the delivery method for the data. There will be no more 90-page PDF files or HTML tables that drill down six levels. All of the data will be released via the American FactFinder only. Other changes include the addition of some new geography (CDPs with at least 5,000 people), new metro area definitions, and the revised 2007 NAICS definitions, which include small changes to the Finance, Insurance, Real Estate, Professional Services, and Administrative Services categories.
Additional changes for 2007, the data release schedule, NAICS codes, and methodology docs are all available at the 2007 Economic Census homepage within the Census Bureau’s website.
All of the data is aggregated by industry and geography. You cannot get lists of businesses with names and addresses, as this information is kept confidential. Furthermore, to maintain confidentiality, if one company controls a large share of the market for a specific sector within a specific geographic area, or if there are few businesses within a sector in a specific geographic area, much of the data (with the exception of the number of businesses) is withheld (marked with a D, for disclosure). Oftentimes this means that data for industries within small areas (big box retail in a small town) and data for industries with few establishments in an area (mining establishments in New York City) are hidden. The smaller the geography, the more likely it is that the data will not be disclosed. This becomes a technical issue if you want or need to move this data into a database, as these pesky disclosure notes are stored in the same columns as the data and prevent you from designating the fields as numeric.
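One way to work around the mixed columns is to split each value into a numeric field and a separate flag field before loading it into a database. The Python sketch below illustrates the idea; the column names and sample rows are invented for the example and do not reflect the actual layout of the Census Bureau's files, so adjust them to match whatever you download.

```python
import csv
import io

def split_disclosure(value):
    """Return (number or None, flag) for one raw cell.

    Numeric cells come back as (int, ""); anything that does not parse
    as a whole number, such as "D", comes back as (None, original text).
    """
    text = value.strip().replace(",", "")
    try:
        return int(text), ""
    except ValueError:
        return None, text

# Hypothetical sample data: one suppressed sales figure, one disclosed.
SAMPLE = """naics,establishments,sales_1000
212,3,D
441,120,"1,250,000"
"""

def load_rows(handle):
    """Read CSV rows, splitting the sales column into a value and a flag."""
    rows = []
    for row in csv.DictReader(handle):
        sales, flag = split_disclosure(row["sales_1000"])
        rows.append({
            "naics": row["naics"],
            "establishments": int(row["establishments"]),
            "sales_1000": sales,   # None where the figure was withheld
            "sales_flag": flag,    # "D" where the figure was withheld
        })
    return rows

rows = load_rows(io.StringIO(SAMPLE))
for row in rows:
    print(row)
```

With the flag in its own column, the sales field can be declared numeric (with nulls for withheld values), and you can still tell a suppressed figure apart from one that is simply missing.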
Given the delay between the time the data is collected and the time it is released, it isn’t particularly helpful for analyzing our current economic climate, but it does provide a snapshot of the way the US economy looked at that moment, and it is useful for understanding how the economy is evolving. Be aware that when making comparisons to past data, you have to correct for changes in geography and NAICS definitions. The differences between 2002 and 2007 are not too great, but more adjustments are necessary the further back in time you go. The Bureau provides data back to 1997 through the American FactFinder and some data from 1992 on an older page. If you need to go back further, you’ll be entering the realm of (gasp!) CD-ROMs or the paper reports.