Recent developments in sensors, GPS, and smartphones have produced large amounts of mobility data. At the same time, large-scale crowd-generated social media data, such as geo-tagged tweets, provide rich semantic information about locations and events. Combining mobility data with the surrounding social media data makes it possible to understand semantically why a person travels to a location at a particular time (e.g., to attend a local event or visit a point of interest). Previous research on mobility data mining has mainly focused on mining patterns from the mobility data alone. In this paper, we study the problem of using social media to annotate mobility data. Because social media data is often noisy, the key research problem lies in choosing a model that retrieves only the words relevant to a given mobility record. We propose three approaches to this problem: a frequency-based method, a Gaussian mixture model, and kernel density estimation (KDE). We show that KDE is the most suitable model because it captures the locality of the word distribution well. We test our proposal on a real dataset collected from Twitter and demonstrate the effectiveness of our techniques through both case studies and a comprehensive evaluation.
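To make the KDE idea concrete, the following is a minimal sketch rather than the method evaluated in this paper. It assumes a purely spatial setting with planar coordinates, an isotropic Gaussian kernel, and a fixed bandwidth; the function names (`kde_density`, `rank_words`), the bandwidth value, and the toy data are all hypothetical. Each word is scored by the estimated density of its geo-tagged occurrences at the location of the mobility record, so words concentrated near the record rank above words that are merely frequent overall.

```python
import math
from collections import defaultdict


def kde_density(query, points, bandwidth):
    """Estimate a 2-D Gaussian kernel density at `query` from `points`.

    Assumes planar (x, y) coordinates and an isotropic kernel whose
    standard deviation equals `bandwidth` (an illustrative choice).
    """
    if not points:
        return 0.0
    qx, qy = query
    # Normalising constant of a 2-D isotropic Gaussian, averaged over points.
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2 * len(points))
    total = 0.0
    for px, py in points:
        squared_distance = (qx - px) ** 2 + (qy - py) ** 2
        total += math.exp(-squared_distance / (2.0 * bandwidth ** 2))
    return norm * total


def rank_words(record_xy, tagged_words, bandwidth=0.5, top_k=3):
    """Rank words by the density of their occurrences at `record_xy`.

    `tagged_words` is an iterable of (word, (x, y)) pairs, standing in
    for words extracted from geo-tagged posts (hypothetical input format).
    """
    occurrences = defaultdict(list)
    for word, xy in tagged_words:
        occurrences[word].append(xy)
    scores = {
        word: kde_density(record_xy, points, bandwidth)
        for word, points in occurrences.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Toy data: "concert" is concentrated near the origin, "coffee" is spread
# across the map, and "airport" is concentrated far from the origin.
tweets = [
    ("concert", (0.1, 0.0)), ("concert", (0.0, 0.2)), ("concert", (-0.1, 0.1)),
    ("coffee", (0.0, 0.0)), ("coffee", (5.0, 5.0)),
    ("coffee", (-4.0, 3.0)), ("coffee", (6.0, -2.0)),
    ("airport", (9.0, 9.0)), ("airport", (9.2, 8.9)),
]

ranked = rank_words((0.0, 0.0), tweets)
print(ranked)
```

In this toy data, "coffee" occurs more often than "concert" in total, so a raw count over the whole dataset would place it first, whereas the density-based score favors the word whose occurrences cluster around the record. The bandwidth controls how local the estimate is and would need to be chosen for the data at hand.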