Credibility-aware learning for automated driving crash severity prediction: An audit-guided approach
Li, Z., Liang, H., & Ye, Y.* (2026, December 14-15). Credibility-aware learning for automated driving crash severity prediction: An audit-guided approach [Poster Presentation]. The 30th International Conference of Hong Kong Society for Transportation Studies, Hong Kong, China.
Abstract: As automated driving systems (ADS) continue to move from controlled testing toward real-world road operation, their safety performance has become a central concern for researchers, regulators, and technology developers. Although crash reports now provide an important basis for empirical safety assessment, accurate prediction of ADS crash severity is still hindered by two persistent data limitations: the small number of reported crashes and the extreme scarcity of severe outcomes. Under such conditions, conventional classifiers are often dominated by majority classes, while synthetic samples introduced for rebalancing may improve class proportions without necessarily improving sample credibility. This study proposes an audit-guided framework for ADS crash severity prediction using crash reports from the U.S. National Highway Traffic Safety Administration (NHTSA) Standing General Order database. The analysis first identifies a competitive prediction backbone without audit enhancement. Candidate minority samples generated by multiple rebalancing methods are then evaluated by a teacher committee trained only on real observations. Four complementary criteria are considered: probability support, class-boundary separation, cross-model consistency, and local-density plausibility. To avoid subjective weight assignment, the framework combines within-class percentile normalization with Pareto-based ranking to retain only the most credible synthetic samples for final training. The selected model is further interpreted with SHapley Additive exPlanations (SHAP) to examine the relative influence and directional effects of key variables on severity outcomes. The study shifts the focus from sample generation alone to sample credibility control and offers a practical, data-driven approach for ADS safety assessment, targeted testing, and risk-sensitive deployment.
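
The weight-free selection step described in the abstract (within-class percentile normalization followed by Pareto-based ranking) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the four audit scores have already been computed for each synthetic candidate, that higher is better on every criterion, and that only the first non-dominated front is kept. The function names, the `max_front` parameter, and the toy scores are all hypothetical.

```python
import numpy as np

def percentile_normalize(scores, labels):
    """Map each criterion to a within-class percentile in (0, 1].

    scores: (n_samples, n_criteria) raw audit scores, higher = more credible.
    labels: (n_samples,) class label of each synthetic candidate.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    out = np.empty_like(scores)
    for cls in np.unique(labels):
        idx = np.flatnonzero(labels == cls)
        block = scores[idx]
        # Fraction of same-class candidates whose score is <= this one
        # (tie-aware, so equal scores receive equal percentiles).
        out[idx] = (block[None, :, :] <= block[:, None, :]).mean(axis=1)
    return out

def pareto_fronts(points):
    """Non-dominated sorting: 0 marks the first (best) front."""
    points = np.asarray(points, dtype=float)
    front = np.full(len(points), -1)
    remaining = np.arange(len(points))
    level = 0
    while remaining.size:
        sub = points[remaining]
        # j dominates i if j is >= i on every criterion and > on at least one.
        ge = (sub[None, :, :] >= sub[:, None, :]).all(axis=2)
        gt = (sub[None, :, :] > sub[:, None, :]).any(axis=2)
        dominated = (ge & gt).any(axis=1)
        front[remaining[~dominated]] = level
        remaining = remaining[dominated]
        level += 1
    return front

def select_credible(scores, labels, max_front=0):
    """Boolean mask of candidates on fronts 0..max_front within their class."""
    labels = np.asarray(labels)
    norm = percentile_normalize(scores, labels)
    keep = np.zeros(len(labels), dtype=bool)
    for cls in np.unique(labels):
        idx = np.flatnonzero(labels == cls)
        keep[idx[pareto_fronts(norm[idx]) <= max_front]] = True
    return keep

# Toy example: four candidates of one severity class, with columns standing in
# for probability support, boundary separation, consistency, density plausibility.
toy_scores = [[0.9, 0.8, 0.9, 0.7],
              [0.5, 0.4, 0.3, 0.2],
              [0.6, 0.9, 0.5, 0.6],
              [0.1, 0.1, 0.1, 0.1]]
toy_labels = [2, 2, 2, 2]
mask = select_credible(toy_scores, toy_labels)
```

In this toy case the first and third candidates are retained because neither dominates the other, while the remaining two are dominated on all four criteria. Because Pareto ranking depends only on the ordering of scores, no criterion weights need to be chosen. The percentile step puts the four criteria on a common scale within each class.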
