Identifying the factors associated with the probability of roadside work zone collisions enables decision makers to better assess and control the risk of scheduling a particular maintenance or construction activity by modifying the characteristics of the operation. This requires studying how work zone properties affect the risk of roadside work zone collisions. Much of the existing work in this area relies on police traffic collision reports, which do not include data on the characteristics of the work zone itself. This paper develops a comprehensive data set of 42 features describing time, location, work zone characteristics, traffic volume, and road properties. Applying recent machine learning techniques, such as extreme gradient boosting classifiers, to this extensive feature set allows a more accurate analysis of the factors that affect the risk of work zone collisions or indicate a higher-than-baseline chance of a roadside crash. Our statistical analysis reveals 10 important features and shows that four of them are significantly associated with higher probabilities of roadside work zone collisions.