

APPENDIX C. CONFIRMATION OF THE NEGATIVE BINOMIAL DISTRIBUTION TO CRASH DATA

As stated in the literature, the empirical Bayes formulation applied herein assumes that, for a particular site i, the number of crashes Ki,y over the years y follows the Poisson distribution. Further, for a particular year y, the number of crashes Ki,y across the different sites i follows the negative binomial distribution. Under these two assumptions, the expected crash counts mi,y of a group of sites are gamma distributed.(10,11,12,13) Figure 33 illustrates these concepts, where Ki,y is the actual crash count for site i and year y and mi,y is the expected crash count for site i and year y.


Figure 33. Chart. Relationship between the Poisson and Negative Binomial Distributions for Crash Frequencies.
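The relationship among the three distributions can be illustrated with a small simulation. The sketch below is not part of the original analysis: the gamma shape and scale values, the number of simulated sites, and all function names are assumptions chosen only for illustration. Each simulated site receives a gamma-distributed expected crash count, and its observed count is then drawn from a Poisson distribution with that mean. The resulting counts across sites should show the overdispersion (variance greater than mean) that characterizes the negative binomial distribution.

```python
import math
import random


def sample_poisson(mean, rng):
    """Draw one Poisson variate using Knuth's multiplication method.

    Adequate for the small means used here; not intended for very large means.
    """
    threshold = math.exp(-mean)
    count = 0
    product = rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_site_counts(n_sites, shape, scale, rng):
    """Simulate one year of crash counts across sites.

    Each site's expected count m is gamma distributed; the observed
    count K is Poisson distributed with mean m.
    """
    counts = []
    for _ in range(n_sites):
        expected_count = rng.gammavariate(shape, scale)
        counts.append(sample_poisson(expected_count, rng))
    return counts


# Hypothetical parameters, chosen only for illustration.
rng = random.Random(42)
shape, scale = 2.0, 3.0
counts = simulate_site_counts(20000, shape, scale, rng)

sample_mean = sum(counts) / len(counts)
sample_var = sum((k - sample_mean) ** 2 for k in counts) / (len(counts) - 1)

# Theoretical negative binomial moments implied by the gamma mixing distribution.
nb_mean = shape * scale
nb_var = nb_mean + nb_mean ** 2 / shape

print(f"sample mean {sample_mean:.2f} vs. theoretical {nb_mean:.2f}")
print(f"sample variance {sample_var:.2f} vs. theoretical {nb_var:.2f}")
```

A pure Poisson model would give a variance equal to the mean; the extra term in the variance comes from the site-to-site variation in the expected counts.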

The Poisson and negative binomial distributions were tested with the data sets for selected states as described herein.

Verification of the Poisson Distribution

Using data from Virginia and Arizona, two techniques were used to verify that the Poisson distribution is appropriate. First, theoretical and actual frequencies were compared graphically. Second, the chi-square test was used to determine whether a statistically significant difference existed between the actual and theoretical distributions of Ki,y over time.

Figure 34 compares the actual crash frequency distribution and the Poisson distribution using one site on Interstate 85 in Virginia between milepost 19.52 and milepost 24.73, looking at the annual number of crashes between 1991 and 1999.


Figure 34. Chart. Comparison of Poisson Distribution and Actual Crash Distribution.

Table 27 shows that the calculated Χ2 value is less than the critical (tabulated) Χ2 value, which means that the assumed Poisson distribution is accepted. The computed chi-square value measures the divergence of the observed data from the Poisson distribution; because it is less than the tabulated chi-square value, the hypothesis that the observed and theoretical distributions are the same cannot be rejected at the 5 percent significance level.

Table 27. Poisson validation description and results using the total crashes at four test sites.

State     Test Site                   Sample Size  Data Years  Χ2 calculated  Χ2 tabulated, 0.05  Χ2 tabulated, 0.01
Arizona   I-8 mp 42.06 to 54.96       10           1991-2000   9.530          14.07               18.47
Arizona   I-10 mp 19.79 to 26.65      10           1991-2001   6.738          15.51               20.08
Virginia  I-85 mp 19.52 to 24.73      5            1995-1999   8.764          11.07               15.09
Virginia  I-81 mp 206.04 to 213.48    6            1995-2000   6.468          7.82                11.33
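The decision rule applied in table 27 can be sketched in a few lines. The report does not state how the chi-square statistic was formed, so the statistic below is an assumption: a Pearson statistic in which every year's expected count equals the site's mean annual count, which is one common check for a Poisson process with a constant mean. The annual counts are hypothetical and the function names are illustrative. The final lines apply the same comparison to the calculated and tabulated values of the two Virginia rows of table 27.

```python
def poisson_dispersion_statistic(annual_counts):
    """Pearson chi-square statistic for annual counts at one site.

    Assumes a constant Poisson mean, so each year's expected count
    is the sample mean of the observed counts.
    """
    mean = sum(annual_counts) / len(annual_counts)
    return sum((k - mean) ** 2 / mean for k in annual_counts)


def fit_is_accepted(chi2_calculated, chi2_tabulated):
    """The assumed distribution is accepted when the calculated value
    is below the tabulated (critical) value."""
    return chi2_calculated < chi2_tabulated


# Hypothetical annual crash counts for one site (not taken from the report).
hypothetical_counts = [12, 9, 15, 11, 13]
statistic = poisson_dispersion_statistic(hypothetical_counts)

# Tabulated 5 percent chi-square value for 4 degrees of freedom (n - 1 years).
CHI2_05_DF4 = 9.488
print(f"statistic = {statistic:.3f}, accepted = {fit_is_accepted(statistic, CHI2_05_DF4)}")

# Calculated and tabulated (0.05) values from the Virginia rows of table 27.
virginia_rows = {
    "I-85 mp 19.52 to 24.73": (8.764, 11.07),
    "I-81 mp 206.04 to 213.48": (6.468, 7.82),
}
virginia_results = {
    site: fit_is_accepted(calculated, tabulated)
    for site, (calculated, tabulated) in virginia_rows.items()
}
print(virginia_results)
```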

Verification of the Negative Binomial Distribution

A similar procedure was used to test the validity of the negative binomial distribution, except that crash rates, as defined in figure 3, rather than the total number of crashes were used as the variable of interest. (Crash rates rather than crash counts were used because the section lengths vary.) Table 28 summarizes the results of the chi-square test; together with visual inspection of figure 35, these results suggest that the negative binomial distribution is appropriate for these data.
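The fitting step that precedes such a test can be sketched as follows. The report does not describe how the negative binomial parameters were estimated or how crash rates were grouped, so everything here is an assumption: the sketch uses method-of-moments estimates on hypothetical integer-valued observations, and the function names are illustrative. The theoretical frequencies it produces are the kind of values that would be compared with the actual frequencies in a chi-square test.

```python
import math


def negative_binomial_pmf(k, mean, dispersion):
    """P(K = k) for a negative binomial distribution with the given mean
    and variance mean + mean**2 / dispersion."""
    log_p = (
        math.lgamma(k + dispersion)
        - math.lgamma(dispersion)
        - math.lgamma(k + 1)
        + dispersion * math.log(dispersion / (dispersion + mean))
        + k * math.log(mean / (dispersion + mean))
    )
    return math.exp(log_p)


def moment_estimates(values):
    """Method-of-moments estimates of the mean and dispersion parameter."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    if variance <= mean:
        raise ValueError("no overdispersion: moment estimate is undefined")
    return mean, mean ** 2 / (variance - mean)


def theoretical_frequencies(n_sites, mean, dispersion, max_value):
    """Expected number of sites at each value 0..max_value."""
    return [
        n_sites * negative_binomial_pmf(k, mean, dispersion)
        for k in range(max_value + 1)
    ]


# Hypothetical observations across eight sites (not taken from the report).
observations = [0, 1, 1, 2, 3, 5, 8, 12]
fitted_mean, fitted_dispersion = moment_estimates(observations)
frequencies = theoretical_frequencies(len(observations), fitted_mean, fitted_dispersion, 12)
print(f"mean = {fitted_mean:.3f}, dispersion = {fitted_dispersion:.3f}")
```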

Table 28. Negative binomial validation description and results.

State  Year  Sample Size  Test Site         Crash Type  Χ2 calculated  Χ2 tabulated, 0.05  Χ2 tabulated, 0.01  Result at 5% Level  Result at 1% Level
VA     1991  91           85n,95n,81n       total       42.326         44.8                60.1                Yes                 Yes
VA     1992  90           85n,95n,81n       total       14.516         19.68               24.75               Yes                 Yes
VA     1993  91           85n,95n,81n       total       10.730         20.08               14.07               Yes                 Yes
VA     1995  117          85,95,81n         total       21.386         22.37               27.72               Yes                 Yes
VA     1996  116          85,95,81n         total       18.153         18.31               23.91               Yes                 Yes
VA     1997  116          85,95,81n         total       14.135         19.68               24.76               Yes                 Yes
VA     1999  84           85n,85s,81n       total       29.852         27.59               33.44               No (close)*         Yes
NC     1993  26           40,95,77,85       total       4.005          6                   9.22                Yes                 Yes
NC     1999  25           40,95,77          total       7.385          11.07               15.09               Yes                 Yes
ID     1992  32           84,86,90,15       total       28.327         38.89               45.67               Yes                 Yes
ID     1994  32           84,86,90,16       total       28.305         37.66               44.34               Yes                 Yes
ID     1999  32           84,86,90,16       total       36.322         42.57               49.61               Yes                 Yes
ID     2000  32           84,86,90,16       total       26.209         32.68               38.96               Yes                 Yes
AZ     1991  277          8,10,15,17,19,40  total       24.151         21.03               26.25               No (close)*         Yes
AZ     1993  277          8,10,15,17,19,40  total       20.392         23.37               27.71               Yes                 Yes
AZ     1994  278          8,10,15,17,19,40  total       17.697         26.3                32.03               Yes                 Yes
AZ     1995  277          8,10,15,17,19,40  total       21.232         22.37               27.71               Yes                 Yes
AZ     1996  278          8,10,15,17,19,40  total       19.977         22.37               27.71               Yes                 Yes
AZ     1998  279          8,10,15,17,19,41  total       21.164         22.37               27.71               Yes                 Yes
AZ     1999  280          8,10,15,17,19,42  total       20.149         23.69               29.17               Yes                 Yes
AZ     2000  281          8,10,15,17,19,43  total       24.187         27.59               33.44               Yes                 Yes

*A chi-square test at a given significance level can show only that a theoretical distribution does not fit the data. Thus, if a calculated chi-square value is large enough to exceed the tabulated 5 percent chi-square value, then it can be said that "researchers are 95 percent certain that the two distributions are different." In the two rows marked with asterisks, the two distributions are different with 95 percent certainty but not with 99 percent certainty. In all other cases, it cannot be shown at the 95 percent level that the theoretical and actual distributions are different; therefore, they are presumed to be the same.
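The two-level reading described in the note can be written out as a short worked example. This is only an illustrative sketch with an assumed function name; the numeric values are the calculated and tabulated chi-square values reported in table 28 for Virginia in 1992 and 1999 and for Arizona in 1991.

```python
def classify_fit(chi2_calculated, chi2_tabulated_05, chi2_tabulated_01):
    """Report whether the theoretical distribution is accepted at each level.

    The fit is accepted at a level when the calculated value is below
    the tabulated value for that level.
    """
    return {
        "5 percent": chi2_calculated < chi2_tabulated_05,
        "1 percent": chi2_calculated < chi2_tabulated_01,
    }


# (calculated, tabulated at 0.05, tabulated at 0.01), as reported in table 28.
rows = {
    "VA 1992": (14.516, 19.68, 24.75),
    "VA 1999": (29.852, 27.59, 33.44),
    "AZ 1991": (24.151, 21.03, 26.25),
}

results = {label: classify_fit(*values) for label, values in rows.items()}
for label, outcome in results.items():
    print(label, outcome)
```

For the two asterisked rows the calculated value falls between the two tabulated values, which is why the fit is rejected at the 5 percent level but accepted at the 1 percent level.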


Figure 35. Chart. Comparison of Negative Binomial Distribution and Actual Crash Distribution (Probability Density Function).

 



 
FHWA-HRT-05-042