Xuesong Wang, Mohamed Abdel-Aty
Abstract: Longitudinal intersection crash data consist of observations on a cross-section of intersections recorded over several time periods. Such combined cross-sectional and time-series data exhibit positive temporal correlation within each intersection. Applying the basic negative binomial regression to these data leads to invalid statistical inference, because the reported test statistics and standard errors are based on a misspecified variance. Generalized Estimating Equations (GEEs) extend Generalized Linear Models (GLMs) to the analysis of longitudinal data by accounting for the correlation among the repeated observations for a given intersection. The main objective of this study is to use GEEs with a negative binomial link function to model temporal correlation in longitudinal intersection crash data. The analysis is based on 3 years of data for 208 four-legged signalized intersections in the Central Florida area. Models of intersection crash frequency were fitted using GEEs with a negative binomial link function under four different correlation structures. Subsequent main-effect analysis identified the relative effect of each variable in the models. Intersections with heavy traffic, a larger total number of lanes, a large number of phases per cycle, high speed limits, and urban locations are correlated with high crash frequencies, whereas intersections with more exclusive right-turn lanes, a partial left-turn protection phase, and an asphalt mixture surface have a lower risk of crashes.
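To make the notion of a working correlation structure concrete, the following is a minimal sketch of how such matrices might look for three yearly observations per intersection. The abstract does not name the four structures that were compared; the four shown here (independent, exchangeable, first-order autoregressive, unstructured) are an assumption based on the choices commonly offered by GEE software. The function names and all correlation values are hypothetical and purely illustrative, not estimates from the study.

```python
# Illustrative working correlation matrices for repeated crash counts
# observed at one intersection. All names and numeric values below are
# assumptions for illustration; none are taken from the study.

def independence(t):
    """Repeated observations are treated as uncorrelated."""
    return [[1.0 if i == j else 0.0 for j in range(t)] for i in range(t)]

def exchangeable(t, rho):
    """Every pair of time periods shares the same correlation rho."""
    return [[1.0 if i == j else rho for j in range(t)] for i in range(t)]

def ar1(t, rho):
    """Correlation decays as rho ** lag between time periods."""
    return [[rho ** abs(i - j) for j in range(t)] for i in range(t)]

def unstructured(t, pairwise):
    """Each pair (i, j), i < j, has its own freely chosen correlation."""
    return [
        [1.0 if i == j else pairwise[(min(i, j), max(i, j))] for j in range(t)]
        for i in range(t)
    ]

T = 3      # three yearly observations per intersection
RHO = 0.5  # illustrative correlation, not an estimate

structures = {
    "independent": independence(T),
    "exchangeable": exchangeable(T, RHO),
    "AR(1)": ar1(T, RHO),
    "unstructured": unstructured(T, {(0, 1): 0.6, (0, 2): 0.3, (1, 2): 0.5}),
}

for name, matrix in structures.items():
    print(name)
    for row in matrix:
        print("  ", [round(value, 3) for value in row])
```

In a GEE fit, the chosen structure only describes how the repeated counts for one intersection are assumed to be related; the correlation parameters themselves are estimated from the data rather than fixed in advance as they are in this sketch.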