United States Department of Transportation - Federal Highway Administration

APPENDIX B

DEVELOPMENT OF BASE MODELS

     The base models for the accident prediction algorithm were developed as part of the preparation of two FHWA reports, Accident Models for Two-Lane Rural Roads: Segments and Intersections, and Accident Models for Rural Intersections: 4-Lane by 2-Lane Stop-Controlled and 2-Lane by 2-Lane Signalized.(3,5) This appendix describes the data base development, the base model for roadway segments, and the base models for at-grade intersections.

Data Base Development

     The base models were developed with geometric design, traffic control, traffic volume, and accident data on roadway sections and intersections on rural two-lane highways in California, Michigan, Minnesota, and Washington. These data were obtained from the FHWA Highway Safety Information System (HSIS). The geometric design data in the HSIS files, from databases maintained by the States identified above, were supplemented with additional data obtained by Vogt and Bared from field measurements and photolog review. These efforts are described more fully in two FHWA reports and a published paper.(3,4,5)

Base Model for Roadway Segments

     In the modeling of roadway segment accidents, the dependent variable included 5 years of accident data (1985-1989) for 619 rural two-lane roadway segments in Minnesota and 3 years of accident data (1993-1995) for 712 roadway segments in Washington. The model development excluded roadway segments within 76 m (250 ft) of an at-grade intersection and excluded the (relatively few) accidents that occurred more than 76 m (250 ft) from an intersection but were identified by the investigating officer as related to an intersection.

     The independent variables representing geometric design, traffic control, and traffic volume used in modeling included:

All of these independent variables were found to have a statistically significant relationship to roadway section accidents.

     The base model for roadway segments was developed from the HSIS roadway segment data for rural two-lane highways in Minnesota and Washington. The base model is presented below:

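$$N_{br} = \mathrm{EXPO}\,\exp(0.6409 + 0.1388\,\mathrm{STATE} - 0.0846\,\mathrm{LW} - 0.0591\,\mathrm{SW} + 0.0668\,\mathrm{RHR} + 0.0084\,\mathrm{DD}) \left(\sum_i WH_i \exp(0.0450\,\mathrm{DEG}_i)\right) \left(\sum_j WV_j \exp(0.4652\,V_j)\right) \left(\sum_k WG_k \exp(0.1048\,\mathrm{GR}_k)\right) \tag{42}$$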

where:

Nbr = predicted number of total accidents per year on a particular roadway segment;
EXPO = exposure in million vehicle-miles of travel per year = (ADT)(365)(L)(10^-6);
ADT = average daily traffic volume (veh/day) on roadway segment;
L = length of roadway segment (mi);
STATE = location of roadway segment (0 in Minnesota, 1 in Washington);
LW = lane width (ft); average lane width if the two directions of travel differ;
SW = shoulder width (ft); average shoulder width if the two directions of travel differ;
RHR = roadside hazard rating; this measure takes integer values from 1 to 7 and represents the average level of hazard in the roadside environment along the roadway segment. (For the development of the roadside hazard rating, see Zegeer et al.;(6) for definitions of individual rating levels, see Appendix D.);
DD = driveway density (driveways per mi) on the roadway segment;
WHi = weight factor for the ith horizontal curve in the roadway segment; the proportion of the total roadway segment length represented by the portion of the ith horizontal curve that lies within the segment. (The weights, WHi, must sum to 1.0.);
DEGi = degree of curvature for the ith horizontal curve in the roadway segment (degrees per 100 ft);
WVj = weight factor for the jth crest vertical curve in the roadway segment; the proportion of the total roadway segment length represented by the portion of the jth crest vertical curve that lies within the segment. (The weights, WVj, must sum to 1.0.);
Vj = crest vertical curve grade rate for the jth crest vertical curve within the roadway segment in percent change in grade per 31 m (100 ft) = |gj2 - gj1|/lj;
gj1, gj2 = roadway grades at the beginning and end of the jth vertical curve (percent);
lj = length of the jth vertical curve (in hundreds of feet);
WGk = weight factor for the kth straight grade segment; the proportion of the total roadway segment length represented by the portion of the kth straight grade segment that lies within the segment. (The weights, WGk, must sum to 1.0.); and
GRk = absolute value of grade for the kth straight grade on the segment (percent).
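
     As an illustration of how equation (42) can be evaluated, the following is a minimal Python sketch, not part of the original report. The function and argument names (`weighted_factor`, `predict_segment_accidents`, `horiz`, `crest`, `grade`) are hypothetical; the coefficients are copied from equation (42). Each curve or grade is passed as a (weight, value) pair, and any uncovered length is treated as a subsegment with a value of zero so that the weights sum to 1.0.

```python
import math

# Coefficients copied from equation (42) / Table 29.
B0, B_STATE, B_LW, B_SW, B_RHR, B_DD = 0.6409, 0.1388, -0.0846, -0.0591, 0.0668, 0.0084
B_DEG, B_V, B_GR = 0.0450, 0.4652, 0.1048


def weighted_factor(coef, parts):
    """Return sum(w * exp(coef * x)) over (w, x) pairs.

    Length not covered by `parts` is treated as a subsegment with x = 0,
    so the weights effectively sum to 1.0.
    """
    covered = sum(w for w, _ in parts)
    if covered > 1.0 + 1e-9:
        raise ValueError("weights must not exceed 1.0")
    return sum(w * math.exp(coef * x) for w, x in parts) + (1.0 - covered)


def predict_segment_accidents(adt, length_mi, state, lw, sw, rhr, dd,
                              horiz=(), crest=(), grade=()):
    """Predicted total accidents per year on one segment, per equation (42)."""
    expo = adt * 365 * length_mi * 1e-6  # million vehicle-miles per year
    linear = (B0 + B_STATE * state + B_LW * lw + B_SW * sw
              + B_RHR * rhr + B_DD * dd)
    return (expo * math.exp(linear)
            * weighted_factor(B_DEG, horiz)
            * weighted_factor(B_V, crest)
            * weighted_factor(B_GR, grade))


# Hypothetical Minnesota segment: level tangent vs. 20% of length on a 5-degree curve.
base = predict_segment_accidents(2000, 1.0, 0, 12, 8, 3, 5)
curved = predict_segment_accidents(2000, 1.0, 0, 12, 8, 3, 5, horiz=[(0.2, 5.0)])
print(f"tangent: {base:.3f}  with curve: {curved:.3f}")
```

     Because the model is multiplicative, each coefficient can be read as a ratio: for example, one additional foot of lane width multiplies the prediction by exp(-0.0846), all else being equal.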

     The model was developed with extended negative binomial regression analysis. This extension of the standard negative binomial regression analysis technique was devised by Miaou.(3) In negative binomial models for roadway segments, the mean number of accidents in a specified time period is typically represented in the form:

EXPO exp(aX + bY + ...)

which is equivalent to:

EXPO exp(aX) exp(bY) ...

where EXPO is a measure of exposure, X and Y are measures of roadway segment characteristics, and a and b are appropriate regression coefficients.

     The extended negative binomial regression analysis technique devised by Miaou replaces some of the factors exp(aX) by expressions of the form:

w1 exp(aX1) + w2 exp(aX2) + ... + wm exp(aXm)

where X1, X2, ..., Xm are local variables along the roadway segment, characterizing for subsegments what X attempts to measure for the entire segment. For example, X1, ..., Xm might represent the degree of curvature for individual horizontal curves while X is the average degree of curvature for the roadway segment as a whole. Thus, X is a composite variable, while the Xi values represent variation within the segment. Such variation occurs for many variables used in accident modeling of roadway segments, even on supposedly homogeneous segments. Notable examples are degree of curvature, grade, and change of grade per unit length. The variable wi is the proportion of the segment length to which the value Xi applies. (The default value for Xi is assumed to be zero if there is no horizontal or vertical curve or if the grade is level, and an artificial subsegment with this value is added, if necessary, so that the weights, wi, always sum to 1.0.)(3)
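
     The difference between the composite form exp(aX) and the weighted form can be seen in a small numerical sketch. The example below is illustrative only: the function names are hypothetical, the coefficient is the curvature coefficient from equation (42), and the segment (20 percent of its length on a 10-degree curve, the rest tangent) is invented.

```python
import math

A_DEG = 0.0450  # curvature coefficient from equation (42)


def extended_factor(a, parts):
    """Weighted form: w1*exp(a*X1) + ... + wm*exp(a*Xm)."""
    return sum(w * math.exp(a * x) for w, x in parts)


def composite_factor(a, parts):
    """Standard form exp(a*X), with X the length-weighted average of the Xi."""
    x_bar = sum(w * x for w, x in parts)
    return math.exp(a * x_bar)


# (weight, degree of curvature); the tangent is the artificial subsegment with X = 0.
parts = [(0.2, 10.0), (0.8, 0.0)]
ext = extended_factor(A_DEG, parts)
comp = composite_factor(A_DEG, parts)
print(f"extended: {ext:.4f}  composite: {comp:.4f}")
```

     The weighted form is never smaller than the composite form because the exponential function is convex; the two agree only when the characteristic is constant along the segment.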

     The extended negative binomial regression model decomposes the roadway segment into subsegments within which the roadway characteristic measured by X is constant. If this is done for two or more variables (e.g., X, Y, ...), the method assumes that the variables are independent of one another, so that the value Yj occurs with a particular value of Xi. Although such independence cannot be assured, the extended negative binomial regression model attempts to capture the effect of variation within a segment in an additive manner consistent with the basic form of the model.(3)

     Table 29 summarizes the model presented in equation (42) with the coefficient, standard deviation, and significance level (p) for each independent variable and the overdispersion parameter (k). The goodness-of-fit measures for the roadway segment base model include R2, the traditional measure of the percentage of variation in accident frequency explained by the independent variables in the model, as well as Rk2, defined as:

$$R_k^2 = 1 - \frac{k}{k_{max}} \tag{43}$$

where:

k = the overdispersion parameter for the regression model; and
kmax = the overdispersion parameter in a model with no covariables (the so-called “zero model”).

This latter measure of goodness of fit, Rk2, has been proposed by Miaou.(9) For the roadway segment base model in equation (42), the values of the goodness-of-fit measures are R2 = 0.6547 and Rk2 = 0.8291.
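
     Equation (43) is simple enough to check by hand; the short sketch below does so in Python. The function names are hypothetical. The value of kmax is not reported in this appendix, so the sketch only backs out the value implied by the reported k and Rk2, which should be treated as an inference rather than a published figure.

```python
def r_k_squared(k, k_max):
    """Goodness-of-fit measure of equation (43)."""
    return 1.0 - k / k_max


def implied_k_max(k, rk2):
    """Solve equation (43) for kmax given k and Rk2."""
    return k / (1.0 - rk2)


k = 0.3056      # overdispersion parameter, Table 29
rk2 = 0.8291    # reported Rk2 for equation (42)
k_max = implied_k_max(k, rk2)
print(f"implied kmax = {k_max:.3f}")
```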

Table 29. Model Parameters and Goodness of Fit for Equation (42).

| Parameter | Intercept | State | LW | SW | RHR | DD | DEG | V | GR | Overdispersion parameter (k) |
|---|---|---|---|---|---|---|---|---|---|---|
| Coefficient | 0.6409 | 0.1388 | -0.0846 | -0.0591 | 0.0668 | 0.0084 | 0.0450 | 0.4652 | 0.1048 | 0.3056 |
| Standard deviation | 0.5008 | 0.0659 | 0.0425 | 0.0114 | 0.0211 | 0.0026 | 0.0078 | 0.1260 | 0.0287 | 0.0331 |
| Significance level (p) | 0.2006 | 0.0351 | 0.0465 | 0.0001 | 0.0015 | 0.0011 | 0.0001 | 0.0002 | 0.0003 | 0.0001 |

Note: The values of the goodness-of-fit measures are R2 = 0.6547 and Rk2 = 0.8291
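
     The appendix does not state how the significance levels in Table 29 were computed. As a plausibility check only, the sketch below assumes a two-sided normal (Wald) test on the ratio of coefficient to standard deviation; that assumption, and the function name `wald_p`, are not from the original report. Several tabulated values appear consistent with it to within rounding.

```python
import math


def wald_p(coef, sd):
    """Two-sided p-value for z = coef / sd under a standard normal (assumed test)."""
    z = abs(coef / sd)
    return math.erfc(z / math.sqrt(2.0))


# (coefficient, standard deviation, tabulated p) from Table 29.
rows = {
    "STATE": (0.1388, 0.0659, 0.0351),
    "LW": (-0.0846, 0.0425, 0.0465),
    "RHR": (0.0668, 0.0211, 0.0015),
}
for name, (coef, sd, p_table) in rows.items():
    print(f"{name}: computed p = {wald_p(coef, sd):.4f}, table p = {p_table}")
```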

Table 30 presents descriptive statistics for the variables in the roadway segment model.

Table 30. Descriptive Statistics for Roadway Segments Used in Modeling.

| Variable | Mean | Standard deviation | Minimum | 25th percentile | Median | 75th percentile | Maximum |
|---|---|---|---|---|---|---|---|
| MINNESOTA (619 roadway segments) | | | | | | | |
| ADT (veh/day) | 2,402 | 1,937 | 208 | 1,176 | 1,866 | 2,900 | 15,162 |
| L (mi) | 1.14 | 1.30 | 0.10 | 0.26 | 0.66 | 1.50 | 8.24 |
| LW (ft) | 11.54 | 0.67 | 10.00 | 11.00 | 12.00 | 12.00 | 12.00 |
| SW (ft) | 7.08 | 2.44 | 0.00 | 6.00 | 8.00 | 8.00 | 12.00 |
| RHR | 2.14 | 0.98 | 1.00 | 1.00 | 2.00 | 3.00 | 6.00 |
| DD (driveways per mi) | 6.58 | 10.25 | 0.00 | 0.89 | 3.73 | 7.68 | 100.00 |
| DEG (degrees/100 ft) | 0.51 | 0.95 | 0.00 | 0.00 | 0.08 | 0.62 | 7.50 |
| V (percent/100 ft) | 0.066 | 0.092 | 0.00 | 0.007 | 0.037 | 0.086 | 0.888 |
| GR (percent) | 0.38 | 0.52 | 0.00 | 0.10 | 0.24 | 0.45 | 4.46 |
| WASHINGTON (712 roadway segments) | | | | | | | |
| ADT (veh/day) | 3,352 | 3,199 | 159 | 1,261 | 2,239 | 4,455 | 17,766 |
| L (mi) | 0.75 | 0.83 | 0.10 | 0.27 | 0.554 | 0.948 | 13.23 |
| LW (ft) | 11.37 | 0.56 | 9.00 | 11.00 | 11.00 | 12.00 | 12.00 |
| SW (ft) | 5.01 | 2.35 | 0.00 | 3.00 | 5.00 | 7.00 | 10.00 |
| RHR | 3.67 | 1.57 | 1.00 | 2.00 | 3.00 | 6.00 | 7.00 |
| DD (driveways per mi) | 10.12 | 12.41 | 0.00 | 2.07 | 6.12 | 13.61 | 85.07 |
| DEG (degrees/100 ft) | 1.03 | 2.13 | 0.00 | 0.00 | 0.32 | 1.31 | 30.55 |
| V (percent/100 ft) | 0.068 | 0.127 | 0.000 | 0.000 | 0.026 | 0.083 | 1.997 |
| GR (percent) | 0.92 | 1.17 | 0.00 | 0.20 | 0.49 | 1.13 | 6.92 |

Conversion: 1 ft = 0.305 m; 1 mi = 1.61 km

     The exposure variable, EXPO, was treated as a scale factor in the development of the model. Therefore, even though multiple years of accident data were used in developing the model, the expected annual accident frequency can be determined from the model if EXPO is determined as (ADT)(365)(L)(10^-6).

     Other variables investigated in the development of the roadway segment base models included posted speed limit, truck percentage, and intersection density (i.e., number of intersections per mile). Posted speed limit was found to be negatively correlated with accident frequency, while truck percentage and intersection density were positively correlated with accident frequency. However, none of these three variables was statistically significant in regression models that included the variables listed above. Grade change per unit length of roadway was also considered for sag vertical curves and for all vertical curves (i.e., both sags and crests), but the version of the variable for crest vertical curves (Vj) had the greatest statistical significance and was, therefore, retained in the final model. Two weather-related variables were also investigated in the modeling for the Minnesota roadway segments: number of rain days and number of snow days per year. These variables were based on the climate district in which each roadway segment was located. These variables were found to be negatively correlated with accident frequency. Because these variables were not sufficiently local (i.e., they represented climate districts rather than the climate of individual roadway segments) and because they were only marginally statistically significant, they were not collected for the Washington data and were omitted from the final model.

Base Models for At-Grade Intersections

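     Base models have been developed for three types of at-grade intersections on rural two-lane highways: three-leg STOP-controlled intersections, four-leg STOP-controlled intersections, and four-leg signalized intersections.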

Models for each of these intersection types were developed using three different modeling approaches, described below. The available database used in modeling included 5 years of accident data (1985-1989) for selected STOP-controlled intersections in Minnesota and 3 years of accident data (1993-1995) for selected signalized intersections in California and Michigan. Following the description of the modeling approaches, the models developed for each intersection type are presented.

     The available accident data for at-grade intersections generally included accidents within 76 m (250 ft) of the intersection on the major road, typically a State highway. For minor roads that are also State highways, data were generally available for all accidents that occur within 76 m (250 ft) of the intersection. For minor roads that are not State highways, accidents on the minor road that are classified as intersection-related are typically assigned the milepost of the intersection on the major road and are therefore included in the available data. In Michigan, all minor-road accidents within 31 m (100 ft) of the intersection and, in California, all accidents within 76 m (250 ft) of the intersection are included in the available data. In Minnesota, all accidents that occurred on minor-road approaches and were identified as related to the intersection are included in the available data.

Modeling Approaches

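     Three different modeling approaches were employed to develop candidate base models for each intersection type: limitation of the analysis to intersection-related accidents, limitation of the analysis to selected accident types, and use of all accidents with an offset for expected roadway-segment accidents.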

     Each of these modeling techniques is described in greater detail in the following discussion. A fourth modeling technique that was considered but not used is explained at the end of this appendix.

     The research reports by Vogt and Bared from the FHWA project that developed the base models and the related paper include only the models based on accidents that were identified by the investigating officer as being intersection related.(3,4,5) The models based on accident types that are generally related to intersection operations and on the iterative offset approach were developed as part of the same project but are documented only in this report.

Limitation of the Analysis to Intersection-Related Accidents

     The first modeling approach used to develop base models for at-grade intersections was to limit the dependent variable to those accidents that occurred within the curbline limits of the intersection or within 76 m (250 ft) of the intersection on any approach and were identified by the investigating officer as related to the intersection. This definition is generally reasonable because the vast majority of intersection-related accidents on a rural two-lane highway would be expected to occur within 76 m (250 ft) of the intersection to which they are related. For example, traffic queues that extend more than 76 m (250 ft) from an intersection and which might lead to rear-end collisions remote from the intersection are much less common in rural than in urban areas.

     Not all accidents within 76 m (250 ft) of an intersection would be expected to be related to the operation of that intersection. For example, collisions between motor vehicles and animals and collisions involving vehicles entering or leaving a driveway may occur within 76 m (250 ft) of an intersection but may have no particular relationship to the presence of that intersection. In other words, some accidents would be expected to occur on the roadway near the intersection whether the intersection were present or not. The first approach to modeling distinguishes between intersection-related and non-intersection-related accidents based on judgments made by the investigating officer or by an accident coder and recorded in the State’s computerized accident record system. Nominally, such judgments make exactly the distinction that is desired, but there is concern that investigating officers and accident coders may not make such judgments consistently.

     Once the dependent variable was defined, as described above, predictive models were developed using extended negative binomial regression in a manner similar to the development of the roadway segment model in equation (42).

Limitation of the Analysis to Selected Accident Types

     The second modeling approach used was to limit the dependent variable to accidents that occurred within the curbline limits of the intersection and are of accident types that are generally related to intersection operations. The accident types classified as related to the intersection were:

All other accident types were classified as unrelated to the intersection.

     The lists of accident types classified as intersection-related and non-intersection-related were established through two special studies that involved a review of hard copy police accident reports and classification of accidents as intersection-related or non-intersection-related based on the judgment of safety experts.

     This approach to modeling eliminates the concern raised in the first modeling approach about the potential misjudgment in classifying accidents made by investigating officers or accident coders. However, the approach based on classification of particular accident types as intersection-related has a similar concern since not every accident of the types identified above is actually related to an intersection and some accidents of other types are, in fact, intersection-related. For example, the proposed classification scheme based on accident types would classify all reported right-turn accidents within 76 m (250 ft) of the intersection as intersection-related, including accidents that are, in fact, related to turning movements at nearby driveways. The effectiveness of this classification could be improved by an agency by utilizing other fields of the accident record such as a driveway involvement indicator, if available, in classifying accidents.

     Once the dependent variable was defined, as described above, predictive models were developed using extended negative binomial regression in a manner similar to the development of the roadway segment model in equation (42).

Use of All Accidents with an Offset for Expected Roadway-Segment Accidents

     The third modeling technique uses as the dependent variable all accidents that occur within the curbline limits of a particular intersection and all accidents that occur within 76 m (250 ft) of that intersection, regardless of the accident type and regardless of the judgment made by the investigating officer or accident coder concerning the relationship of the accident to the intersection. Since this dependent variable includes some accidents that are clearly unrelated to the intersection, the expected frequency of roadway segment accidents, as predicted by equation (42), is used as an offset factor in the model development. This modeling used an iterative technique that makes successive approximations to the model coefficients.

The iterative offset modeling approach was performed as follows. Let:

Zi = exp(ki + aiX + biY + ...) (44)

denote an equation with integer subscript i estimating the mean number of intersection-related accidents per unit time, Zi, in terms of intersection characteristics (X, Y, ...). Let N denote the mean number of non-intersection-related accidents on a 152 m (500 ft) section of roadway containing the intersection (i.e., 76 m or 250 ft on either side of the intersection), as predicted by a roadway section model such as equation (42), as applied to Minnesota (STATE=0).

     Let A be the mean number of accidents of all kinds per unit time within 76 m (250 ft) of the intersection. Let:

$$\mathrm{OFFSET}_i = \ln\left(\frac{N + Z_i}{Z_i}\right) \tag{45}$$

     The modeling approach used is iterative. Equation (44) implies that the following relationship must be valid for iteration i+1:

$$Z_{i+1} = \exp(k_{i+1} + a_{i+1}X + b_{i+1}Y + \cdots) \tag{46}$$

A negative binomial model for A is sought of the form:

$$A = \exp(\mathrm{OFFSET}_i + k_{i+1} + a_{i+1}X + b_{i+1}Y + \cdots) \tag{47}$$

which can be expressed as:

$$A = \exp(\mathrm{OFFSET}_i)\,\exp(k_{i+1} + a_{i+1}X + b_{i+1}Y + \cdots) \tag{48}$$

$$A = \exp(\mathrm{OFFSET}_i)\,Z_{i+1} \tag{49}$$

$$A = \frac{N + Z_i}{Z_i}\,Z_{i+1} \tag{50}$$

     The initial negative binomial model for intersection-related accidents is represented by an equation for Z0 in the form of equation (44). The offset technique represented by equations (47) through (50) is applied repeatedly to obtain a sequence of new models for Z1, Z2, Z3, ... . This process is continued until the coefficients kn, an, bn, ... cease to change appreciably; i.e., until Zn+1 is approximately the same as Zn. The appropriate model for the mean number of intersection-related accidents is then:

Z = exp(kn + anX + bnY + ...) (51)

 

In principle, A = N + Z, or Z = A - N. Thus, from the model for roadway segment accidents used to estimate N and the initial model for intersection-related accidents Z0, the offset technique yields a final model for intersection-related accidents, Z. This final model depends on the choice of the model for N, but should not generally depend on the initial model for Z0. Thus, initial estimates or starting values of the regression coefficients k0, a0, b0, ... can be selected through engineering judgment or an alternative preliminary model.
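
     The convergence of the offset iteration toward Z = A - N can be illustrated with a deliberately simplified sketch. The example below is a toy, not the procedure used in the project: it considers a single site with an intercept-only model, and it replaces the negative binomial regression at each step with a direct solve for the intercept that makes exp(OFFSETi + ki+1) equal to A. The function name and the numerical values of A and N are hypothetical.

```python
import math


def iterate_offset(a_mean, n_mean, z0, tol=1e-9, max_iter=100):
    """Toy version of the iterative offset technique (intercept-only, one site).

    a_mean: mean number of all accidents near the intersection (A)
    n_mean: mean number of roadway-segment accidents from the segment model (N)
    z0:     starting value for intersection-related accidents (Z0)
    Assumes a_mean > n_mean > 0 and z0 > 0.
    """
    z = z0
    for _ in range(max_iter):
        offset = math.log((n_mean + z) / z)   # equation (45)
        k_next = math.log(a_mean) - offset    # intercept solving equation (47)
        z_next = math.exp(k_next)             # equation (46)
        if abs(z_next - z) < tol:
            return z_next
        z = z_next
    raise RuntimeError("offset iteration did not converge")


z_final = iterate_offset(a_mean=2.0, n_mean=0.5, z0=1.0)
print(f"Z = {z_final:.6f}, A - N = {2.0 - 0.5:.6f}")
```

     In this toy case the result does not depend on the starting value Z0, which mirrors the statement above that the final model should not generally depend on the initial model.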

Models Developed

     The following discussion presents the candidate base models developed for three-leg STOP-controlled intersections, four-leg STOP-controlled intersections, and four-leg signalized intersections.

Three-Leg STOP-Controlled Intersections

     Candidate base models were developed for three-leg intersections with STOP control on the minor-road approach. The dependent variables used in these models have been described above in the discussion of modeling techniques. All of the models for three-leg STOP-controlled intersections used a data set of approximately 382 intersections in Minnesota including 5 years of accident data (1985-1989). There were minor variations in sample size from one model to the next because of small amounts of missing data. The candidate independent variables considered in predicting accidents at three-leg STOP-controlled intersections were:

Table 31 presents descriptive statistics for these variables.

     A candidate model for three-leg STOP-controlled intersections developed using the intersection-related accident definition based on the investigating officer’s assessment of each accident is:

Nbi = exp(-12.99 + 0.805ln ADT1 + 0.504ln ADT2 + 0.290VCI + 0.034HI + 0.029SPDI + 0.173RHRI + 0.27RT + 0.0045SKEW3) (52)

where:

Nbi = predicted number of total accidents per year at a particular intersection and within 76 m (250 ft) in either direction along the major road;
ADT1 = average daily traffic volume (veh/day) on the major road; if the ADTs differ between the major-road legs, they should be averaged;
ADT2 = average daily traffic volume (veh/day) on the minor road;
VCI = crest vertical curve grade rate on the major road within 76 m (250 ft) of the intersection = (1/m) ΣVi for all crest vertical curves wholly or partly within 76 m (250 ft) of the intersection;
m = number of crest vertical curves wholly or partly within 76 m (250 ft) of the intersection;
HI = horizontal curvature change rate on the major road within 76 m (250 ft) of the intersection = (1/n) ΣDEGi for all horizontal curves wholly or partly within 76 m (250 ft) of the intersection;
n = number of horizontal curves within 76 m (250 ft) of the intersection;
SPDI = posted speed limit on the major road (mi/h);
RHRI = roadside hazard rating within 76 m (250 ft) of the intersection on the major road [see description of the variable RHR in Equation (42)];
RT = presence of right-turn lane on the major road (0 = no right-turn lane present; 1 = right-turn lane present); and
SKEW3 = intersection angle (degrees) minus 90 for the angle between the major-road leg in the direction of increasing stations and a leg to the right; 90 minus intersection angle (degrees) for the angle between the major-road leg in the direction of increasing stations and a leg to the left.
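
     The derived variables VCI, HI, and SKEW3 follow directly from the definitions above. The helpers below are a minimal sketch with hypothetical names; returning zero when no curve lies within 76 m (250 ft) of the intersection is an assumption, chosen to match the zero minimum values in the descriptive statistics.

```python
def mean_rate(values):
    """(1/m) * sum of values; used for VCI (from Vi) and HI (from DEGi).

    Returns 0.0 when no curve is present (assumption, see lead-in).
    """
    return sum(values) / len(values) if values else 0.0


def skew3(angle_deg, leg_side):
    """SKEW3 for a three-leg intersection.

    angle_deg: angle between the major-road leg in the direction of
               increasing stations and the minor-road leg.
    leg_side:  "right" or "left", the side on which the minor leg lies.
    """
    if leg_side == "right":
        return angle_deg - 90.0
    if leg_side == "left":
        return 90.0 - angle_deg
    raise ValueError("leg_side must be 'right' or 'left'")


vci = mean_rate([0.2, 0.4])   # two crest curves near the intersection
hi = mean_rate([])            # no horizontal curves nearby
print(vci, hi, skew3(75.0, "right"), skew3(75.0, "left"))
```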

Table 31. Descriptive Statistics for 382 Three-Leg STOP-Controlled Intersections in Minnesota Used in Modeling.

| Variable | Mean | Standard deviation | Minimum | 25th percentile | Median | 75th percentile | Maximum |
|---|---|---|---|---|---|---|---|
| ADT1 (veh/day) | 3,718 | 3,725 | 201 | 1,239 | 2,333 | 4,627 | 19,413 |
| ADT2 (veh/day) | 408 | 531 | 5 | 103 | 237 | 478 | 4,206 |
| ln ADT1 | 7.81 | 0.91 | 5.30 | 7.12 | 7.75 | 8.44 | 9.87 |
| ln ADT2 | 5.40 | 1.14 | 1.51 | 4.64 | 5.47 | 6.17 | 8.34 |
| VCI (percent/100 ft) | 0.142 | 0.300 | 0 | 0 | 0 | 0 | 4.39 |
| HI (degrees/100 ft) | 1.22 | 2.52 | 0 | 0 | 0 | 2 | 29 |
| SPDI (mi/h) | 52.8 | 4.6 | 22.5 | 52.5 | 55 | 55 | 55 |
| RHRI | 2.10 | 0.88 | 1 | 1 | 2 | 3 | 5 |
| RT | 0.42 | 0.49 | 0 | 0 | 0 | 1 | 1 |
| SKEW3 (degrees) | -0.67 | 24.98 | -90 | 0 | 0 | 0 | 85.1 |
| ND1 | 1.24 | 1.44 | 0 | 0 | 1 | 2 | 9 |

Conversion: 1 ft = 0.305 m

     Equation (52) includes all of the candidate independent variables except the number of driveways, which was dropped because its coefficient was not statistically significant (p=0.5405) and because the coefficient had a negative sign, which is opposite to the direction expected. Table 32 summarizes the model parameters and goodness of fit for the model in equation (52). Goodness of fit for models like equation (52) is measured by R2 and Rk2, as noted earlier for equation (42), as well as by RPD2. This last goodness-of-fit measure, RPD2, has been proposed by Fridstrøm for use with negative binomial models.(43) A description of this goodness-of-fit measure is also provided by Vogt and Bared.(3)

Table 32. Model Parameters and Goodness of Fit for Equation (52).

| Parameter | Intercept | ln ADT1 | ln ADT2 | VCI | HI | SPDI | RHRI | RT | SKEW3 | Overdispersion parameter (k) |
|---|---|---|---|---|---|---|---|---|---|---|
| Coefficient | -12.99 | 0.805 | 0.504 | 0.290 | 0.034 | 0.029 | 0.173 | 0.27 | 0.0045 | 0.481 |
| Standard deviation | 1.15 | 0.064 | 0.071 | 0.294 | 0.033 | 0.018 | 0.068 | 0.140 | 0.0032 | 0.100 |
| Significance level (p) | 0.000 | 0.0001 | 0.0001 | 0.323 | 0.3004 | 0.107 | 0.0108 | 0.0561 | 0.1578 | 0.0001 |

Note: The values of the goodness-of-fit measures are R2 = 0.4409, Rk2 = 0.7805, and RPD2 = 0.6279

     The model in equation (52) was reevaluated including only those variables that were statistically significant with a significance level (p) of 0.150 or less. This model is presented below:

Nbi = exp(-11.28 + 0.79ln ADT1 + 0.49ln ADT2 + 0.19RHRI + 0.28RT) (53)

     Table 33 summarizes the model parameters and goodness of fit for the model in equation (53).

     A candidate model developed for three-leg STOP-controlled intersections using only those accident types generally related to intersection operations is:

Nbi = exp(-12.82 + 1.001ln ADT1 + 0.406ln ADT2 + 0.22RHRI + 0.33RT + 0.0040SKEW3) (54)

Equation (54) includes all of the candidate independent variables except speed limit (p=0.41), number of driveways (p=0.56), horizontal curvature (p=0.62), and vertical curvature (p=0.40), which were dropped because their coefficients were not statistically significant. Table 34 summarizes the model parameters and goodness of fit for the model in equation (54).

Table 33. Model Parameters and Goodness of Fit for Equation (53).

| Parameter | Intercept | ln ADT1 | ln ADT2 | RHRI | RT | Overdispersion parameter (k) |
|---|---|---|---|---|---|---|
| Coefficient | -11.28 | 0.79 | 0.49 | 0.19 | 0.28 | 0.54 |
| Standard deviation | 0.063 | 0.062 | 0.068 | 0.067 | 0.14 | 0.102 |
| Significance level (p) | 0.0001 | 0.0001 | 0.0001 | 0.0035 | 0.0402 | 0.0001 |

Note: The values of the goodness-of-fit measures are R2 = 0.3955, Rk2 = 0.7546, and RPD2 = 0.6109.

Table 34. Model Parameters and Goodness of Fit for Equation (54).

| Parameter | Intercept | ln ADT1 | ln ADT2 | RHRI | RT | SKEW3 | Overdispersion parameter (k) |
|---|---|---|---|---|---|---|---|
| Coefficient | -12.82 | 1.001 | 0.406 | 0.22 | 0.33 | 0.0040 | 0.46 |
| Standard deviation | 0.73 | 0.072 | 0.073 | 0.073 | 0.15 | 0.0029 | 0.106 |
| Significance level (p) | 0.0001 | 0.0001 | 0.0001 | 0.0024 | 0.032 | 0.17 | 0.0001 |

Note: The values of the goodness-of-fit measures are R2 = 0.4181, Rk2 = 0.8070, and RPD2 = 0.7228.

     The model in equation (54) was reevaluated including only those variables that were statistically significant with a significance level of 0.150 or less. This model is presented below:

Nbi = exp(-13.01 + 1.015ln ADT1 + 0.42ln ADT2 + 0.23RHRI + 0.29RT) (55)

     Table 35 summarizes the model parameters and goodness of fit for the model in equation (55).

Table 35. Model Parameters and Goodness of Fit for Equation (55).

| Parameter | Intercept | ln ADT1 | ln ADT2 | RHRI | RT | Overdispersion parameter (k) |
|---|---|---|---|---|---|---|
| Coefficient | -13.01 | 1.015 | 0.42 | 0.23 | 0.29 | 0.49 |
| Standard deviation | 0.72 | 0.072 | 0.071 | 0.073 | 0.14 | 0.108 |
| Significance level (p) | 0.0001 | 0.0001 | 0.0001 | 0.0020 | 0.041 | 0.0001 |

Note: The values of the goodness-of-fit measures are R2 = 0.4008, Rk2 = 0.7953, and RPD2 = 0.7162.

     A candidate model for three-leg STOP-controlled intersections developed with the iterative offset technique is:

Nbi = exp(-12.40 + 0.74ln ADT1 + 0.53ln ADT2 + 0.36VCI + 0.028SPDI + 0.14RHRI + 0.0063SKEW3) (56)

     Equation (56) includes all of the candidate independent variables except number of driveways (p=0.35), presence of right-turn lane (p=0.47), and horizontal curvature (p=0.50), which were dropped because their coefficients were not statistically significant. Table 36 summarizes the model parameters and goodness of fit for the model in equation (56). It should be noted that the goodness-of-fit measure, RPD2, is not directly applicable to models developed with the iterative offset technique and, therefore, is not presented in Table 36. Moreover, R2 must be interpreted with caution since it measures the goodness of fit for the combined roadway segment and intersection model for all accidents within 76 m (250 ft) of the intersection. Since most of these accidents are intersection-related, R2 at least roughly measures the goodness of fit for intersection-related accidents.

Table 36. Model Parameters and Goodness of Fit for Equation (56).

| Parameter | Intercept | ln ADT1 | ln ADT2 | VCI | SPDI | RHRI | SKEW3 | Overdispersion parameter (k) |
|---|---|---|---|---|---|---|---|---|
| Coefficient | -12.40 | 0.74 | 0.53 | 0.36 | 0.028 | 0.14 | 0.0063 | 0.52 |
| Standard deviation | 0.93 | 0.063 | 0.064 | 0.28 | 0.013 | 0.070 | 0.0029 | 0.090 |
| Significance level (p) | 0.0001 | 0.0001 | 0.0001 | 0.20 | 0.033 | 0.053 | 0.032 | 0.0001 |

Note: The values of the goodness-of-fit measures are R2 = 0.4163 and Rk2 = 0.5809.

     The model in equation (56) was reevaluated including only those variables that were statistically significant with a significance level of 0.150 or less. This model is presented below:

Nbi = exp(-12.25 + 0.75ln ADT1 + 0.52ln ADT2 + 0.026SPDI + 0.15RHRI + 0.0059SKEW3) (57)

     Table 37 summarizes the model parameters and goodness of fit for the model in equation (57).

     A decision was made to use in the accident prediction algorithm the models containing only those independent variables that are statistically significant at a significance level of 0.15 or less, like equations (53), (55), and (57).
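
     To show how the three reduced models compare for a single site, the sketch below evaluates equations (53), (55), and (57) in Python. The coefficients are copied from those equations; the dictionary layout, function name, and example site are hypothetical and are not part of the original report.

```python
import math

# Coefficients copied from equations (53), (55), and (57).
MODELS = {
    53: {"const": -11.28, "ln_adt1": 0.79, "ln_adt2": 0.49, "rhri": 0.19, "rt": 0.28},
    55: {"const": -13.01, "ln_adt1": 1.015, "ln_adt2": 0.42, "rhri": 0.23, "rt": 0.29},
    57: {"const": -12.25, "ln_adt1": 0.75, "ln_adt2": 0.52, "spdi": 0.026,
         "rhri": 0.15, "skew3": 0.0059},
}


def predict_three_leg(eq, adt1, adt2, **site):
    """Predicted accidents per year at a three-leg STOP-controlled intersection.

    `site` supplies the remaining variables by name (rhri, rt, spdi, skew3);
    variables that the chosen equation does not use are ignored.
    """
    coefs = MODELS[eq]
    x = {"ln_adt1": math.log(adt1), "ln_adt2": math.log(adt2), **site}
    linear = coefs["const"] + sum(
        b * x[name] for name, b in coefs.items() if name != "const")
    return math.exp(linear)


# Hypothetical site near the median values in Table 31.
site = {"rhri": 2, "rt": 0, "spdi": 55, "skew3": 0}
for eq in (53, 55, 57):
    print(eq, round(predict_three_leg(eq, 2333, 237, **site), 3))
```

     The three predictions are not directly comparable in meaning, since each equation was fitted to a differently defined dependent variable, as described under Modeling Approaches.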

Table 37. Model Parameters and Goodness of Fit for Equation (57).

| Parameter | Intercept | ln ADT1 | ln ADT2 | SPDI | RHRI | SKEW3 | Overdispersion parameter (k) |
|---|---|---|---|---|---|---|---|
| Coefficient | -12.25 | 0.75 | 0.52 | 0.026 | 0.15 | 0.0059 | 0.52 |
| Standard deviation | 0.92 | 0.062 | 0.064 | 0.013 | 0.065 | 0.0029 | 0.089 |
| Significance level (p) | 0.0001 | 0.0001 | 0.0001 | 0.046 | 0.019 | 0.043 | 0.0001 |

Note: The values of the goodness-of-fit measures are R2 = 0.4069 and Rk2 = 0.5765.

Four-Leg STOP-Controlled Intersections

     Candidate base models were developed for four-leg intersections with STOP control on the minor-road approaches. The dependent variables used in these models have been described above in the discussion of modeling techniques. All of the models for four-leg STOP-controlled intersections used a data set of approximately 324 intersections in Minnesota including 5 years of accident data (1985-1989). There were minor variations in sample size from one model to the next because of small amounts of missing data. The candidate independent variables considered in predicting accidents at four-leg STOP-controlled intersections were:

Table 38 presents descriptive statistics for these variables.

     A candidate model for four-leg STOP-controlled intersections developed using the intersection-related accident definition is:

Nbi = exp(-10.43 + 0.603ln ADT1 + 0.609ln ADT2 + 0.29VCI + 0.045HI + 0.019SPDI + 0.12ND1 - 0.0049SKEW4) (58)

where:

ND1 = number of driveways on the major road within 76 m (250 ft) of the intersection; and
SKEW4 = intersection angle (degrees) expressed as one-half of the angle to the right minus one-half of the angle to the left for the angles between the major-road leg in the direction of increasing stations and the right and left legs, respectively.

Table 38. Descriptive Statistics for 324 Four-Leg STOP-Controlled Intersections in Minnesota.

Variable              Mean    Standard   Minimum  25th        Median  75th        Maximum
                              deviation           percentile          percentile
ADT 1 (veh/day)       2,216   1,966      174      972         1,739   2,611       14,611
ADT 2 (veh/day)       304     383        7        105         191     365         3,414
ln ADT 1              7.42    0.75       5.16     6.88        7.46    7.87        9.59
ln ADT 2              5.25    0.97       1.93     4.65        5.25    5.90        8.14
VCI (percent/100 ft)  0.146   0.280      0        0           0.023   0.207       2.942
HI (degrees/100 ft)   0.46    1.08       0        0           0       0.25        8.00
SPDI (mi/h)           54.0    3.3        30       55          55      55          55
ND1                   0.61    1.14       0        0           0       1           6
SKEW4 (degrees)       -0.14   18.34      -60.00   -0.44       0.00    0.58        75.00

Conversion: 1 ft = 0.305 m; 1 mi = 1.61 km

     Equation (58) includes all of the candidate independent variables except roadside hazard rating (p=0.28) and presence of right-turn lanes (p=0.66), neither of which was statistically significant. Table 39 summarizes the model parameters and goodness of fit for the model in equation (58).

Table 39. Model Parameters and Goodness of Fit for Equation (58).

                        Independent variable                                                              Over-dispersion
Parameter               Intercept  ln ADT 1  ln ADT 2  VCI       HI        SPDI      ND1       SKEW4     parameter (k)
Coefficient             -10.43     0.603     0.609     0.29      0.045     0.019     0.12      -0.0049   0.205
Standard deviation      1.32       0.084     0.069     0.26      0.047     0.018     0.05      0.0033    0.065
Significance level      0.000      0.000     0.000     0.26      0.34      0.29      0.01      0.13      0.0016

Note: The values of the goodness-of-fit measures are R^2 = 0.5944, R_k^2 = 0.8336, and R_PD^2 = 0.7364.
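
     As a rough illustration of how equation (58) is applied, the sketch below evaluates it with the coefficients listed in table 39 at the median values reported in table 38. The function and argument names are hypothetical and do not come from the source; the result is in the units of Nbi as defined earlier in this appendix.

```python
import math

# Coefficients for equation (58), copied from table 39.
COEF_58 = {
    "intercept": -10.43,
    "ln_adt1": 0.603,
    "ln_adt2": 0.609,
    "vci": 0.29,
    "hi": 0.045,
    "spdi": 0.019,
    "nd1": 0.12,
    "skew4": -0.0049,
}


def predict_eq58(adt1, adt2, vci, hi, spdi, nd1, skew4):
    """Evaluate equation (58). The function name is hypothetical."""
    c = COEF_58
    linear = (
        c["intercept"]
        + c["ln_adt1"] * math.log(adt1)
        + c["ln_adt2"] * math.log(adt2)
        + c["vci"] * vci
        + c["hi"] * hi
        + c["spdi"] * spdi
        + c["nd1"] * nd1
        + c["skew4"] * skew4
    )
    return math.exp(linear)


# Median values of each variable, taken from table 38.
n_median = predict_eq58(
    adt1=1739, adt2=191, vci=0.023, hi=0.0, spdi=55, nd1=0, skew4=0.0
)
print(round(n_median, 3))
```

     Because the model is log-linear, each coefficient acts multiplicatively: in this sketch, adding one driveway (ND1) scales the prediction by exp(0.12) regardless of the other inputs.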

     The model in equation (58) was reevaluated including only those variables that were statistically significant with a significance level of 0.150 or less. This model is presented below:

Nbi = exp(-9.34 + 0.60 ln ADT1 + 0.61 ln ADT2 + 0.13 ND1 - 0.0054 SKEW4) (59)

Table 40 summarizes the model parameters and goodness of fit for the model in equation (59).

Table 40. Model Parameters and Goodness of Fit for Equation (59).

                        Independent variable                                Over-dispersion
Parameter               Intercept  ln ADT 1  ln ADT 2  ND1       SKEW4     parameter (k)
Coefficient             -9.34      0.601     0.61      0.13      -0.0054   0.24
Standard deviation      0.72       0.078     0.069     0.039     0.0034    0.071
Significance level (p)  0.0001     0.0001    0.0001    0.0009    0.108     0.0008

Note: The values of the goodness-of-fit measures are R^2 = 0.5662, R_k^2 = 0.8081, and R_PD^2 = 0.7326.
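
     The screening step that leads from equation (58) to equation (59) can be sketched as a simple filter on the significance levels in table 39. This is only an illustration: the names below are hypothetical, and the source refit the model after dropping variables, which is why the coefficients in equation (59) differ from those in equation (58). The filter reproduces only the choice of variables.

```python
# Significance levels for the independent variables in equation (58),
# copied from table 39 (intercept and over-dispersion parameter omitted).
P_VALUES_58 = {
    "ln ADT1": 0.000,
    "ln ADT2": 0.000,
    "VCI": 0.26,
    "HI": 0.34,
    "SPDI": 0.29,
    "ND1": 0.01,
    "SKEW4": 0.13,
}

THRESHOLD = 0.15  # significance level used in the source for retaining a variable


def screen_variables(p_values, threshold=THRESHOLD):
    """Return the variables whose significance level is at or below the threshold."""
    return [name for name, p in p_values.items() if p <= threshold]


retained = screen_variables(P_VALUES_58)
print(retained)  # ['ln ADT1', 'ln ADT2', 'ND1', 'SKEW4']
```

     The retained set matches the variables that appear in equation (59).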

     A candidate model developed using only those accident types generally related to intersection operations is:

Nbi = exp(-9.40 + 0.55ln ADT1 + 0.65ln ADT2 + 0.31 VCI + 0.14ND1 - 0.0049 SKEW4) (60)

     Equation (60) includes all of the candidate independent variables except speed limit (p=0.25 to 0.50), presence of right-turn lane (p=0.59), and horizontal curvature (p=0.48), which were not statistically significant, and roadside hazard rating (p=0.12), which was marginally statistically significant but whose coefficient was negative, opposite to the direction expected. Table 41 summarizes the model parameters and goodness of fit for the model in equation (60).

     The model in equation (60) was reevaluated including only those variables that were statistically significant with a significance level of 0.150 or less. This model is presented below:

Nbi = exp(-9.30 + 0.53 ln ADT1 + 0.67 ln ADT2 + 0.15 ND1 - 0.0057 SKEW4) (61)

Table 41. Model Parameters and Goodness of Fit for Equation (60).

                        Independent variable                                          Over-dispersion
Parameter               Intercept  ln ADT 1  ln ADT 2  VCI       ND1       SKEW4     parameter (k)
Coefficient             -9.40      0.55      0.65      0.31      0.14      -0.0049   0.253
Standard deviation      0.77       0.085     0.073     0.27      0.05      0.0038    0.079
Significance level (p)  0.0001     0.0001    0.0001    0.25      0.004     0.1997    0.0014

Note: The values of the goodness-of-fit measures are R^2 = 0.5495, R_k^2 = 0.8131, and R_PD^2 = 0.7183.

Table 42. Model Parameters and Goodness of Fit for Equation (61).

                        Independent variable                                Over-dispersion
Parameter               Intercept  ln ADT 1  ln ADT 2  ND1       SKEW4     parameter (k)
Coefficient             -9.30      0.53      0.67      0.15      -0.0057   0.293
Standard deviation      0.78       0.085     0.074     0.043     0.0039    0.086
Significance level (p)  0.0001     0.0001    0.0001    0.0004    0.14      0.0007

Note: The values of the goodness-of-fit measures are R^2 = 0.5020, R_k^2 = 0.7835, and R_PD^2 = 0.7047.

Table 42 summarizes the model parameters and goodness of fit for the model in equation (61).
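
     The significance levels in these tables can be checked approximately against the coefficients and standard deviations. The source does not state which test was used; the sketch below assumes a two-sided normal (Wald) test on the ratio of coefficient to standard deviation, so small differences from the tabulated values are expected because of rounding and the choice of test. The function name is hypothetical.

```python
import math


def wald_p_value(coefficient, std_dev):
    """Two-sided p-value for z = coefficient / std_dev under a standard normal.

    Assumption: a Wald z-test; the source does not say how p was computed.
    """
    z = abs(coefficient) / std_dev
    # For a standard normal, P(|Z| > z) = erfc(z / sqrt(2)).
    return math.erfc(z / math.sqrt(2.0))


# SKEW4 and ND1 entries from table 42.
p_skew4 = wald_p_value(-0.0057, 0.0039)
p_nd1 = wald_p_value(0.15, 0.043)
print(round(p_skew4, 3), round(p_nd1, 4))
```

     Under this assumption the SKEW4 value comes out close to the 0.14 reported in table 42, below the 0.15 threshold used to retain variables.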

     A candidate model for four-leg STOP-controlled intersections developed with the iterative offset technique is:

Nbi = exp(-11.25 + 0.69ln ADT1 + 0.56ln ADT2 + 0.033 SPDI + 0.11ND1 - 0.21 RT - 0.0059 SKEW4) (62)

     Equation (62) includes all of the candidate independent variables except horizontal curvature (p=0.51) and vertical curvature (p=0.30 to 0.41), which were not statistically significant, and roadside hazard rating (p=0.11), which was marginally statistically significant but whose coefficient was negative, opposite to the direction expected. Table 43 summarizes the model parameters and goodness of fit for the model in equation (62). No value of R_PD^2 is computed for equation (62) because this goodness-of-fit measure is not directly applicable to the iterative offset technique.

Table 43. Model Parameters and Goodness of Fit for Equation (62).

                        Independent variable                                                    Over-dispersion
Parameter               Intercept  ln ADT 1  ln ADT 2  SPDI      ND1       RT        SKEW4     parameter (k)
Coefficient             -11.25     0.69      0.56      0.033     0.107     -0.21     -0.0059   0.203
Standard deviation      1.59       0.080     0.073     0.021     0.054     0.12      0.0031    0.065
Significance level      0.0001     0.0001    0.0001    0.11      0.048     0.082     0.057     0.0018

Note: The values of the goodness-of-fit measures are R^2 = 0.5563 and R_k^2 = 0.7984.

     All of the variables in equation (62) are statistically significant at the 0.15 significance level. Therefore, no alternative version of equation (62) was developed.

Four-Leg Signalized Intersections

     Candidate base models were developed for four-leg signalized intersections. The dependent variables used in these models were those described above for the intersection-related and selected-accident-type modeling techniques. All of the models for four-leg signalized intersections used a data set of 49 intersections, 18 in California and 31 in Michigan, with 3 years of accident data (1993-1995) for each intersection. The candidate independent variables considered in predicting accidents at four-leg signalized intersections on rural two-lane highways were:

Table 44 presents descriptive statistics for these variables.

Table 44. Descriptive Statistics for 49 Four-Leg Signalized Intersections in California and Michigan Used in Modeling.

Variable                  Mean    Standard   Minimum  25th        Median  75th        Maximum
                                  deviation           percentile          percentile
ADT 1 (veh/day)           10,491  4,331      4,917    7,568       8,900   13,133      25,133
ADT 2 (veh/day)           4,367   2,369      940      2,800       3,670   5,080       12,478
ln ADT 1                  9.18    0.39       8.50     8.93        9.09    9.48        10.13
ln ADT 2                  8.26    0.49       6.85     7.93        8.21    8.53        9.43
SUMLADT                   17.44   0.65       16.34    16.94       17.49   17.78       19.14
PCTLEFT2 (percent)        28.4    15.1       2.5      19.0        25.7    35.7        75.70
VEICOM (percent/100 ft)   1.88    1.87       0.00     0.50        1.43    2.54        8.13
PTRUCK (percent)          9.0     6.7        2.7      5.0         7.7     11.2        45.4
ND1                       3.00    3.00       0.00     0.00        3.00    4.00        15.00
PROTLT a                  -       -          -        -           -       -           -

Conversion: 1 ft = 0.305 m

a 43 percent of the four-leg signalized intersections have protected left-turn signal phases and 57 percent do not.

     Negative binomial models were developed for predicting accident experience at four-leg signalized intersections on rural two-lane highways using the modeling technique based on the investigating officer’s identification of intersection-related accidents. A candidate model developed using this approach is:

Nbi = exp(-6.12 + 0.46 SUMLADT - 0.61 PROTLT - 0.013 PCTLEFT2 + 0.12 VEICOM + 0.030 PTRUCK) (63)

where:

SUMLADT = ln ADT 1 + ln ADT 2;
PROTLT = presence of protected left-turn signal phase on one or more major-road approaches (= 1 if present; = 0 if not present);
PCTLEFT2 = percentage of minor-road left-turning traffic at the signal for the morning and evening peak hours combined;
VEICOM = grade rate for all vertical curves (crests and sags) any portion of which is within 244 m (800 ft) of the intersection, averaged for the major- and minor-road legs of the intersection; and
PTRUCK = percentage of trucks (vehicles with more than four wheels) entering the intersection for the morning and evening peak hours combined.

     The variable PROTLT indicates the presence of either a fully protected left-turn signal phase or a protected-permitted phase.
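
     Because SUMLADT is a sum of logarithms, the traffic volume term in equation (63) can be restated as a power of the product of the two volumes. The identity below follows directly from the definition of SUMLADT; the symbol β is notation introduced here for the SUMLADT coefficient and does not appear in the source.

```latex
\exp\left(\beta \cdot \mathrm{SUMLADT}\right)
  = \exp\left(\beta \left(\ln ADT_1 + \ln ADT_2\right)\right)
  = \left(ADT_1 \cdot ADT_2\right)^{\beta},
  \qquad \beta = 0.46 \text{ in equation (63)}
```

     A model written with SUMLADT therefore applies the same exponent to the major- and minor-road volumes, whereas a model with separate ln ADT1 and ln ADT2 terms allows the two exponents to differ.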

     An alternative to equation (63) that uses the major- and minor-road traffic volumes separately is:

Nbi = exp(-6.95 + 0.62ln ADT1 + 0.39ln ADT2 - 0.68 PROTLT - 0.014 PCTLEFT2 + 0.13 VEICOM + 0.032PTRUCK) (64)

     Tables 45 and 46 summarize the model parameters and goodness of fit for equations (63) and (64), respectively.
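
     To show how the two forms compare in practice, the sketch below evaluates equations (63) and (64) at the median values from table 44, with and without a protected left-turn phase. The function names and the choice of the median site are illustrative assumptions, not part of the source.

```python
import math


def predict_eq63(adt1, adt2, protlt, pctleft2, veicom, ptruck):
    """Equation (63): combined volume term SUMLADT. Hypothetical function name."""
    sumladt = math.log(adt1) + math.log(adt2)
    return math.exp(
        -6.12 + 0.46 * sumladt - 0.61 * protlt - 0.013 * pctleft2
        + 0.12 * veicom + 0.030 * ptruck
    )


def predict_eq64(adt1, adt2, protlt, pctleft2, veicom, ptruck):
    """Equation (64): separate major- and minor-road volume terms."""
    return math.exp(
        -6.95 + 0.62 * math.log(adt1) + 0.39 * math.log(adt2)
        - 0.68 * protlt - 0.014 * pctleft2 + 0.13 * veicom + 0.032 * ptruck
    )


# Median values from table 44 (PROTLT is set separately below).
MEDIAN_SITE = dict(adt1=8900, adt2=3670, pctleft2=25.7, veicom=1.43, ptruck=7.7)

for protlt in (0, 1):
    n63 = predict_eq63(protlt=protlt, **MEDIAN_SITE)
    n64 = predict_eq64(protlt=protlt, **MEDIAN_SITE)
    print(f"PROTLT={protlt}: eq. (63) = {n63:.2f}, eq. (64) = {n64:.2f}")
```

     In either equation, switching PROTLT from 0 to 1 scales the prediction by a fixed factor, exp(-0.61) for equation (63) and exp(-0.68) for equation (64), independent of the other inputs.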

     Negative binomial models were developed for predicting accident experience at four-leg signalized intersections on rural two-lane highways using the modeling technique based on the accident types generally considered to be intersection-related. A model using the sum of the major- and minor-road traffic volume variables is:

Nbi = exp(-4.96 + 0.39 SUMLADT - 0.38 PROTLT - 0.015 PCTLEFT2 + 0.103 VEICOM + 0.027PTRUCK) (65)

     An alternative form of equation (65) using separate variables for the major- and minor-road traffic volumes is:

Nbi = exp(-6.084 + 0.60ln ADT1 + 0.29ln ADT2 - 0.47 PROTLT - 0.017 PCTLEFT2 + 0.11 VEICOM + 0.029PTRUCK) (66)

     Negative binomial models comparable to equations (65) and (66) including an independent variable representing the number of driveways within 76 m (250 ft) of the intersection were also developed. A model using the sum of the major- and minor-road traffic volume variables is:

Nbi = exp(-4.11 + 0.33 SUMLADT - 0.30 PROTLT - 0.016 PCTLEFT2 + 0.100 VEICOM + 0.023 PTRUCK + 0.035 ND1) (67)

Table 45. Model Parameters and Goodness of Fit for Equation (63).

                        Independent variable                                          Over-dispersion
Parameter               Intercept  SUMLADT   PROTLT    PCTLEFT2  VEICOM    PTRUCK    parameter (k)
Coefficient             -6.12      0.46      -0.611    -0.013    0.12      0.030     0.12
Standard deviation      2.60       0.15      0.151     0.0048    0.051     0.014     0.032
Significance level (p)  0.018      0.0017    0.0001    0.0052    0.014     0.033     0.0002

Note: The values of the goodness-of-fit measures are R^2 = 0.5208, R_k^2 = 0.6414, and R_PD^2 = 0.2550.

Table 46. Model Parameters and Goodness of Fit for Equation (64).

                        Independent variable                                                    Over-dispersion
Parameter               Intercept  ln ADT 1  ln ADT 2  PROTLT    PCTLEFT2  VEICOM    PTRUCK    parameter (k)
Coefficient             -6.95      0.62      0.39      -0.68     -0.014    0.13      0.032     0.12
Standard deviation      2.79       0.25      0.17      0.18      0.0047    0.045     0.014     0.032
Significance level      0.013      0.013     0.023     0.0002    0.0023    0.0039    0.028     0.0003

Note: The values of the goodness-of-fit measures are R^2 = 0.5053, R_k^2 = 0.6490, and R_PD^2 = 0.2362.

where:

ND1 = number of driveways within 76 m (250 ft) of the intersection on the major road.

     An alternative form of equation (67) using separate variables for the major- and minor-road traffic volumes is:

Nbi = exp(-5.46 + 0.60ln ADT1 + 0.20ln ADT2 - 0.40PROTLT - 0.018 PCTLEFT2 + 0.11 VEICOM + 0.026PTRUCK + 0.041 ND1) (68)

Tables 47 through 50 summarize the model parameters and goodness of fit for equations (65) through (68), respectively.
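
     Since equations (63) through (68) all share the same log-linear form, they can be evaluated with one generic routine driven by a table of coefficients. The sketch below does this for equations (67) and (68); the dictionary keys, the function name, and the example site (median values from table 44, no protected left-turn phase) are assumptions made for illustration.

```python
import math

# Coefficients as printed in equations (67) and (68).
EQ67 = {"intercept": -4.11, "SUMLADT": 0.33, "PROTLT": -0.30,
        "PCTLEFT2": -0.016, "VEICOM": 0.100, "PTRUCK": 0.023, "ND1": 0.035}
EQ68 = {"intercept": -5.46, "lnADT1": 0.60, "lnADT2": 0.20, "PROTLT": -0.40,
        "PCTLEFT2": -0.018, "VEICOM": 0.11, "PTRUCK": 0.026, "ND1": 0.041}


def predict(coefficients, site):
    """Evaluate exp(intercept + sum of coefficient * variable) for one site."""
    features = dict(site)
    # Derived volume terms used by one or the other model form.
    features["lnADT1"] = math.log(site["ADT1"])
    features["lnADT2"] = math.log(site["ADT2"])
    features["SUMLADT"] = features["lnADT1"] + features["lnADT2"]
    linear = coefficients["intercept"]
    for name, beta in coefficients.items():
        if name != "intercept":
            linear += beta * features[name]
    return math.exp(linear)


# Example site built from the medians in table 44 (an illustrative choice).
site = {"ADT1": 8900, "ADT2": 3670, "PROTLT": 0, "PCTLEFT2": 25.7,
        "VEICOM": 1.43, "PTRUCK": 7.7, "ND1": 3}

n67 = predict(EQ67, site)
n68 = predict(EQ68, site)

# Relative effect of one additional driveway under equation (68).
one_more_driveway = dict(site, ND1=site["ND1"] + 1)
driveway_ratio = predict(EQ68, one_more_driveway) / n68
print(round(n67, 2), round(n68, 2), round(driveway_ratio, 3))
```

     Keeping the coefficients in data rather than in code makes it straightforward to swap in any of the other candidate models without rewriting the arithmetic.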

A negative binomial model for predicting accident experience at four-leg signalized intersections on rural two-lane highways was also developed using the iterative offset technique. This model uses a different functional form for the major and minor road ADT variables than was used in the preceding models. This model is:

Nbi = ADT1^0.307 exp(0.0000376 ADT1) ADT2^0.461 LTLN1 RTLN1 (69)

where:

LTLN1 = factor for number of major-road left-turn lanes present at the intersection (LTLN1 = 1.000 for zero left-turn lanes, 0.934 for one left-turn lane, and 0.737 for two left-turn lanes); and
RTLN1 =

     The value of the overdispersion parameter for this model is 0.26.

Table 47. Model Parameters and Goodness of Fit for Equation (65).

                        Independent variable                                          Over-dispersion
Parameter               Intercept  SUMLADT   PROTLT    PCTLEFT2  VEICOM    PTRUCK    parameter (k)
Coefficient             -4.96      0.39      -0.38     -0.015    0.103     0.027     0.14
Standard deviation      3.078      0.18      0.17      0.006     0.042     0.013     0.039
Significance level (p)  0.107      0.0309    0.022     0.0101    0.013     0.040     0.0005

Note: The val