Very few studies have examined the impact of the built environment on urban rail transit ridership at the station-to-station (origin-destination) level. Moreover, most direct ridership models (DRMs) rely on a simple, a priori assumed linear or log-linear relationship in which the estimated parameters are assumed to hold across the entire data space of the explanatory variables. Such models cannot detect how the effects of built environment features on urban rail transit ridership change across different values of those features, which may bias the results and hide non-negligible detail. To address these research gaps, this study develops a time-of-day origin-destination DRM using smart card data from the metro system of Nanjing, China. It applies a gradient boosting regression trees model as a more refined data mining approach to investigate the non-linear associations between features of the built environment and station-to-station ridership. Data on the built environment, station type, demographics, and travel impedance, including a less commonly used variable (detour), were collected and used in the analysis. The empirical results show that most independent variables are associated with station-to-station ridership in a discontinuous, non-linear way, regardless of the time period. The built environment on the origin side has a larger effect on station-to-station ridership than the built environment on the destination side during the morning peak hours, while the opposite holds for the afternoon peak hours and at night. The results also indicate that transfer times are a more important variable than detour and route distance.
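To illustrate why gradient boosting regression trees can recover discontinuous, non-linear associations that a linear or log-linear DRM would smooth over, the following is a minimal sketch in pure Python. It is not the model estimated in the study: the data are synthetic, the single feature (`job_density`) and its threshold at 4 are hypothetical, and the trees are restricted to one split each (stumps) with a squared-error loss. In practice a library implementation such as scikit-learn's `GradientBoostingRegressor` would normally be used instead.

```python
import random

def fit_stump(x, residuals):
    """Return (threshold, left_value, right_value) for the best single split."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    xs = [x[i] for i in order]
    rs = [residuals[i] for i in order]
    total, n = sum(rs), len(rs)
    best = None
    left_sum = 0.0
    for k in range(1, n):
        left_sum += rs[k - 1]
        if xs[k] == xs[k - 1]:
            continue  # cannot split between identical feature values
        right_sum = total - left_sum
        # Maximising this quantity is equivalent to minimising squared error.
        gain = left_sum ** 2 / k + right_sum ** 2 / (n - k)
        if best is None or gain > best[0]:
            threshold = (xs[k] + xs[k - 1]) / 2
            best = (gain, threshold, left_sum / k, right_sum / (n - k))
    return best[1:]

def fit_gbrt(x, y, n_trees=200, learning_rate=0.1):
    """Boost stumps on the residuals of the current prediction."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_trees):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        t, lv, rv = fit_stump(x, residuals)
        stumps.append((t, lv, rv))
        pred = [p + learning_rate * (lv if xi <= t else rv)
                for p, xi in zip(pred, x)]
    return base, stumps, learning_rate

def predict(model, xi):
    base, stumps, lr = model
    return base + lr * sum(lv if xi <= t else rv for t, lv, rv in stumps)

# Synthetic data (assumption): ridership jumps once the hypothetical
# feature `job_density` exceeds 4, and is flat on either side.
rng = random.Random(0)
job_density = [rng.uniform(0, 10) for _ in range(200)]
ridership = [(100 if d < 4 else 300) + rng.gauss(0, 5) for d in job_density]

model = fit_gbrt(job_density, ridership)
low_a, low_b = predict(model, 1.0), predict(model, 3.0)
high = predict(model, 6.0)
print(f"predicted ridership at 1.0: {low_a:.1f}, 3.0: {low_b:.1f}, 6.0: {high:.1f}")
```

Because each tree contributes a step function, the fitted response stays roughly flat within each regime and changes abruptly at the threshold. A single linear coefficient would instead spread that jump across the whole range of the feature.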
Transportation Research Part D: Transport and Environment
Published - May 2020
- Built environment
- Gradient boosting regression trees
- Non-linear effect
- Station-to-station ridership