Wednesday, November 17, 2010

Four good sources of bicycle statistics

National Statistical Data Sources

National data sets serve both statistical and discursive functions. The data sets included below cannot speak to specific individuals, and some of them cannot even speak to specific places. Nor can causal relationships be extrapolated from them without violating basic statistical principles (correlation is not causation!). What they can do is show the correlation between two variables (say, biking to work and education level) and indicate how likely it is that an increase in one will accompany an increase in the other. Thus, though they cannot characterize specific bicyclists or explain why certain people in certain places bike more than others, they can identify characteristics that bicyclists and neighborhoods with high rates of bicycling are likely to have. Perhaps unsurprisingly, data from three of the four studies below crop up in planning and advocacy discourse at both the Federal and local Austin levels; this fact alone indicates that these data sets merit further study, both for what information can be mined from them and for the ways in which they discursively construct bicyclists and bike-friendly neighborhoods.
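To make the idea of "correlation between two variables" concrete, here is a minimal Python sketch of a Pearson correlation coefficient. The neighborhood figures are invented purely for illustration and do not come from any of the data sets discussed in this post; the function name pearson_r is likewise just a label chosen for this example.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of the same length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares around the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / sqrt(var_x * var_y)


# HYPOTHETICAL neighborhoods (made-up numbers, not real Census figures):
# percent of adults with a bachelor's degree, and percent of workers biking to work.
pct_bachelors = [15.0, 22.0, 30.0, 41.0, 55.0, 63.0]
pct_bike_commute = [0.4, 0.6, 0.9, 1.8, 2.5, 3.9]

r = pearson_r(pct_bachelors, pct_bike_commute)
print(f"Pearson r = {r:.2f}")
```

A value of r near +1 or -1 says the two variables move together closely across neighborhoods. It says nothing about whether one causes the other, and nothing about any individual resident.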

US Census Bureau
Decennial Census 2000 (2010), Summary File 3, and the 2009 American Community Survey (

Description: Both the Decennial Census and the American Community Surveys include questions on bicycling to work. Though the American Community Surveys include continuously collected data and are therefore more up-to-date, they are estimates, not true counts, and their sample sizes are too small to accurately estimate behavior at the block group level (a block group is a small cluster of a few blocks within an urban census tract, which itself is usually a small parcel of no more than six block groups). Thus, for estimating the percentage of Austin commuters who biked to work in 2009, the American Community Survey theoretically provides the most accurate numbers. For estimating bike commuting at the neighborhood level, however, the much larger sample size in the Decennial Census theoretically provides more accurate numbers. (The League of American Bicyclists has used American Community Survey data to put together bicycling commuting rates for the 70 largest cities and the US for the past ten years; see
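The sample-size point can be illustrated with the textbook margin of error for a proportion, z * sqrt(p(1 - p) / n). The sketch below assumes simple random sampling, which is a simplification of how the Census Bureau actually samples and weights, and every number in it (the 2% bike share and the three sample sizes) is hypothetical.

```python
from math import sqrt


def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a sample proportion p with n respondents.

    Uses the normal approximation under simple random sampling; it becomes
    unreliable when n * p is very small, which is exactly the small-area case.
    """
    if not 0 <= p <= 1:
        raise ValueError("p must be between 0 and 1")
    if n <= 0:
        raise ValueError("n must be positive")
    return z * sqrt(p * (1 - p) / n)


# HYPOTHETICAL inputs: a 2% bike-commute share and illustrative sample sizes.
assumed_share = 0.02
for label, n in [("whole city", 20000), ("census tract", 400), ("block group", 60)]:
    moe = margin_of_error(assumed_share, n)
    print(f"{label:>12}: n = {n:>5}, margin of error = +/- {moe * 100:.2f} points")
```

Because the margin of error shrinks only with the square root of n, a sample that is perfectly adequate for a citywide estimate can be nearly useless once it is divided among block groups.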

Pros: Together, the Decennial Census and the American Community Surveys provide large amounts of data and allow for correlations between biking and a host of other variables, including household income, race, ethnicity, gender, population density, education level, and household size. Also, sample data can be downloaded and manipulated (hooray!).

Cons: Because the Census data focus on biking to work, they are unable to capture cycling for non-work purposes, including running errands, biking to places of entertainment like bars, coffeeshops, and movie theaters, and recreational riding. They also exclude cycling as work. Further, focusing on work leaves out students, who very likely bike at least as much as workers. Also, the question about transportation to work asks for the main mode of transportation to work used by the respondent in a recent “reference week,” and only allows one mode to be reported, even if the respondent uses multiple modes on a daily basis. Thus, it excludes people who ride to work but use their bikes for the shorter portion of their trip, as well as people who ride to work but less often than they use some other mode. Further, the Census provides no data on cycling frequency or on the time of year the data were collected, nor (because it cannot report records for specific individuals) does it indicate potential routes to work. Finally, as with any survey, all information is self-reported and is therefore only as accurate as the respondent wants it to be, though I imagine that the threat of penalty under Federal law keeps a few people from lying on their surveys. The Census can thus only provide information on people who self-report as regular bike commuters, and its picture of cyclists is accordingly skewed.

Bureau of Transportation Statistics
2009 Omnibus Household Survey (

Description: Administered annually by the Federal Bureau of Transportation Statistics, the Omnibus Household Survey is a 15-minute nationwide telephone survey of approximately 1,000 randomly selected households. The 2009 Survey also includes data for “Target MSAs,” metropolitan statistical areas around major cities (LA, New York, Chicago, etc.) that include transit in their modal mix. It aims to gauge public opinion regarding the national transportation system (including roads, railroads, and airplanes) and includes two questions about bicycling: one, whether the respondent bicycled at all, for any reason, in the “reference week” preceding the survey, and the other, how many days in the past week the respondent biked.

Pros: Like the Census data, the Omnibus Household Survey collects demographic and geographic data as well as biking data, so it also allows for correlations between biking and several other variables. It is also available for download, and its relatively small size (about 1,600 respondents) makes it easy to work with. In addition, it considers biking to be both transportation AND a sport, thereby capturing riders who ride for recreation and for non-work purposes in addition to commuters. Further, it attempts to quantify biking by asking for the number of days biked in the previous week, thus allowing for correlations between bicycling frequency and variables like education level, neighborhood urbanicity, income level, etc. It also includes questions on availability of bike infrastructure, thus allowing for correlations between cycling and available lanes/paths, etc.

Cons: Although the data are broken into a national set and a focused urban set, the sets are neither place-specific nor particularly large. With a total of roughly 1,600 respondents in the 2009 Survey, even if the data could be broken down into neighborhoods or even cities, there are not enough respondents for results to be statistically significant in any particular place. Further, these data are generated via randomized phone calls to landline telephones, and as of 2008 nearly 20% of households were wireless only, so the data may be skewed toward older people, less urban people, and people with children or other dependents. Finally, although asking about days biked in the previous week does provide more insight into bicycling behaviors than the Census question does, more depth regarding reasons for cycling, mileage, routes, and trips taken would provide a more nuanced picture of cyclists.
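The landline problem is a coverage bias, and a rough Python sketch shows how it could distort an estimate. Only the 20% wireless-only share comes from the figure cited above; the two biking rates are hypothetical placeholders, since the survey itself cannot tell us how often wireless-only households bike.

```python
def population_rate(landline_rate, wireless_rate, wireless_share):
    """Overall biking rate when the population mixes landline and wireless-only households."""
    if not 0 <= wireless_share <= 1:
        raise ValueError("wireless_share must be between 0 and 1")
    return (1 - wireless_share) * landline_rate + wireless_share * wireless_rate


def coverage_bias(landline_rate, wireless_rate, wireless_share):
    """How far a landline-only survey estimate sits from the overall population rate."""
    return landline_rate - population_rate(landline_rate, wireless_rate, wireless_share)


# The 20% wireless-only share is the figure cited above for 2008.
# Both biking rates below are HYPOTHETICAL, chosen only to illustrate the mechanism.
wireless_share = 0.20
landline_rate = 0.10  # assumed share of landline households that biked last week
wireless_rate = 0.20  # assumed share of wireless-only households that biked last week

true_rate = population_rate(landline_rate, wireless_rate, wireless_share)
bias = coverage_bias(landline_rate, wireless_rate, wireless_share)
print(f"landline-only estimate: {landline_rate:.1%}")
print(f"overall rate:           {true_rate:.1%}")
print(f"bias:                   {bias * 100:+.1f} percentage points")
```

The bias works out to wireless_share * (landline_rate - wireless_rate), so it vanishes if the two groups bike at the same rate and grows as the groups diverge or as more households drop their landlines.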

Association of Pedestrian and Bicycle Professionals/ USDoT
2010 Women’s Cycling Survey (

Description: The 2010 Women’s Cycling Survey was administered in the spring of 2010 by the Association of Pedestrian and Bicycle Professionals and sponsored by the Federal Highway Administration; Mark Schultz and Anna Sibley of the Department of Public Health at UNC-Greensboro handled the preliminary analysis. Its aim was to determine bicycling attitudes and behaviors among female cyclists. Unlike the other surveys discussed here, this survey was conducted online via SurveyMonkey, and its distribution was not controlled for geography – instead, it was circulated via bike-related blogs and websites. It garnered more than 13,000 respondents, all but 200 of whom were women, and more than 6,000 of those women reported biking daily for transportation. The demographics and cycling behaviors of its respondents differ significantly from those of the Census Bureau and BTS Surveys above and the Attitudes and Behaviors Survey below; I suspect the difference in survey design and distribution may be the largest contributing factor.

Pros: Because the survey was distributed online and data are therefore not dependent on the respondent having a landline, data may represent more cyclists more fully than other surveys. Also, the sample size is huge! The survey also goes into more detail regarding cycling behavior, and includes questions on miles biked per week, reasons for biking, and factors that may increase cycling (bike lanes, off-road paths, etc.). By focusing on women, the survey also collects substantial information on a cycling population that is often overlooked in popular cycling discourse.

Cons: Probably the largest problem with these data is their opacity: as yet, I have been unable to find even a representative sample of the survey questions or the data collected. Also, the vast majority of its respondents (85%) reported having a Bachelor’s degree or higher, 90% reported that they are white, 80% characterized their communities as either medium or large cities, and roughly half of respondents reported that they rode daily. These percentages are considerably skewed with respect to the other three data sets and national statistics collected by the Census Bureau; the non-randomized distribution method may have significantly skewed the data. Also, because we have no comparable data (the survey will only be performed once, and does not include male respondents), its findings may be relevant for correlating female cycling with bike infrastructure, but I’m not sure that we can safely draw any other conclusions from them.

Bureau of Transportation Statistics/USDoT
2002 National Survey of Bicyclist and Pedestrian Attitudes and Behaviors (

Description: This 9,000-respondent telephone survey, which was conducted in the summer of 2002, appears to be one of the most frequently cited sources of cycling statistics for planning agencies and advocacy groups alike. It was the first national survey to go into detail about how, how often, where, and why people walk or bike, and it goes into depth about safety concerns, distance biked, type and number of trips, start points, end points, and thoughts on bicycling infrastructure. The survey also collected info on many of the demographic variables mentioned in the above surveys, thus allowing for correlations between cycling and a variety of factors.

Pros: The survey has a large sample size, making it statistically relevant at many geographic levels. It also provides the most detailed data about who bikes where and why, and the randomized telephone survey method helps keep basic demographic information on par with national data.

Cons: Although the reports do provide base counts and the actual wording of questions, I have yet to find either the actual data or a representative sample of it. This opacity is concerning because although the reports do a great deal with race, they do very little with social class, and based on the data I have been able to work with, class correlates much more strongly with bicycling than does race. Also, as with the Omnibus Household Survey, the reliance on landline phones may skew the results toward people with landlines, although this may not have been as much of a concern in 2002. Finally, the age of the survey may be a problem, as bike networks have exploded and bike commuting has increased by 44% since 2000 (see the League of American Bicyclists’ summary of American Community Survey biking rates, above).
