*PhD Defense*
February 26, 2015, 10:00–12:00
4080 AIR Building
Thi Bich Thuy Luong
TS PhD Candidate

In this research, we propose a series of models designed to take advantage of the availability of data, both structured and unstructured, from a variety of sources, ranging from passive data to questionnaires to social media, to analyze underlying patterns and trends in travel and activity behavior. The results support enhancements both in transportation planning and in the application of programming to support such efforts.
First, we introduce a framework for automatically inferring the travel modes and trip purposes of human movement tracked by a GPS device. We use a multiple-changepoint algorithm to divide trajectories into segments using only speed data, with no use of referencing information or assumptions about the participants’ temporal or location contexts. A Random Forest then classifies segments as moving or non-moving, and travel mode is predicted for the moving segments. Next, multiple machine learning algorithms are employed, validated, and tested to identify the most suitable model for inferring trip purposes. Estimation results indicate that Random Forest performs best. Its overall prediction accuracy is over 80% on the testing set, both with and without socio-demographic variables; it predicts “shop” trips with an accuracy of 92.1%, and its accuracy for “go home” and “studying” trips reaches 100%.
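As a rough illustration of the segmentation step only, the sketch below splits a speed series with binary segmentation, one simple member of the multiple-changepoint family. The abstract does not say which changepoint algorithm the dissertation uses, so this choice, the penalty value, the minimum segment length, the function names, and the sample speed trace are all assumptions made for illustration. The Random Forest classification of the resulting segments is not shown.

```python
from statistics import mean


def sse(values):
    """Cost of one segment: sum of squared deviations from its mean."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(speeds, start, end, min_len):
    """Return (gain, index) for the split of [start, end) that most reduces cost."""
    whole = sse(speeds[start:end])
    best_gain, best_idx = 0.0, None
    for i in range(start + min_len, end - min_len + 1):
        gain = whole - sse(speeds[start:i]) - sse(speeds[i:end])
        if gain > best_gain:
            best_gain, best_idx = gain, i
    return best_gain, best_idx


def find_changepoints(speeds, penalty, min_len=3):
    """Binary segmentation: keep splitting while the cost reduction exceeds the penalty."""
    changepoints = []
    stack = [(0, len(speeds))]
    while stack:
        start, end = stack.pop()
        if end - start < 2 * min_len:
            continue  # too short to hold two segments of min_len points
        gain, idx = best_split(speeds, start, end, min_len)
        if idx is not None and gain > penalty:
            changepoints.append(idx)
            stack.append((start, idx))
            stack.append((idx, end))
    return sorted(changepoints)


def segment_features(speeds, changepoints):
    """Summarize each segment with simple speed features for a later classifier."""
    bounds = [0] + changepoints + [len(speeds)]
    return [
        {"start": a, "end": b,
         "mean_speed": mean(speeds[a:b]),
         "max_speed": max(speeds[a:b])}
        for a, b in zip(bounds, bounds[1:])
    ]


# Hypothetical speed trace in m/s: stationary, then walking, then driving.
speeds = [0.0, 0.1, 0.0, 0.1, 0.0,
          1.4, 1.5, 1.3, 1.4, 1.5,
          12.0, 13.0, 12.5, 13.5, 12.0]
changepoints = find_changepoints(speeds, penalty=1.0)
segments = segment_features(speeds, changepoints)
```

Per-segment features such as these could then be passed to a classifier to label each segment as moving or non-moving. The penalty controls how readily the series is split, and in practice it would need tuning against labeled trajectories.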
Additionally, we analyze data on responses to the introduction of light rail service, collected in waves, to complement and evaluate knowledge about how personal travel behavior varies over time of day, day of week, and between waves. Our results indicate that, although average activity duration varies significantly across days of the week and waves, the random effects of these two factors on activity duration were minor, whereas time of day contributed over one third of the total variance in duration.
Finally, the dissertation demonstrates the use of Twitter data as a potentially important source for understanding comments, criticisms, and responses about light rail in Los Angeles. The results can be useful for exploring trends among commuters and how their emotions varied according to the light rail line they used, the time of day, and the day of the week.