Predicting patients' risk of developing certain diseases is an important research topic in healthcare. Accurately identifying and ranking the similarity among patients based on their historical records is a key step in personalized healthcare. Electronic health records (EHRs), which are irregularly sampled and vary in the number of visits per patient, cannot be used directly to measure patient similarity because they lack an appropriate representation. Moreover, an effective approach to measuring patient similarity on EHRs is needed. In this paper, we propose two novel deep similarity learning frameworks that simultaneously learn patient representations and measure pairwise similarity. We use a convolutional neural network (CNN) to capture locally important information in EHRs and then feed the learned representation into a triplet loss or a softmax cross-entropy loss. After training, we can obtain pairwise distances and similarity scores. Using this similarity information, we then perform disease prediction and patient clustering. Experimental results show that the CNN can better represent longitudinal EHR sequences, and that our proposed frameworks outperform state-of-the-art distance metric learning methods.
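As a rough illustration of the triplet-loss variant, the following minimal sketch embeds visit sequences of different lengths with a single convolution-plus-max-pooling step and scores one triplet. It is not the architecture or the training procedure used in the paper: the function names `embed` and `triplet_loss`, the filter shapes, the ReLU activation, the squared Euclidean distance, and the margin value are all assumptions chosen for illustration.

```python
import numpy as np

def embed(visits, filters):
    """Map a (T, d) visit matrix to a fixed-size (k,) vector.

    Hypothetical stand-in for a CNN encoder: slide k filters of width w
    over the visit axis, apply ReLU, then max-pool over time so that
    patients with different numbers of visits get vectors of equal size.
    """
    k, w, d = filters.shape
    T = visits.shape[0]  # assumes T >= w
    # All length-w windows of consecutive visits: (T - w + 1, w, d)
    windows = np.stack([visits[t:t + w] for t in range(T - w + 1)])
    # Correlate every window with every filter: (T - w + 1, k)
    feats = np.tensordot(windows, filters, axes=([1, 2], [1, 2]))
    return np.maximum(feats, 0.0).max(axis=0)

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge loss that asks the anchor to be closer to the positive
    than to the negative by at least `margin` (squared Euclidean)."""
    d_pos = np.sum((anchor - positive) ** 2, axis=-1)
    d_neg = np.sum((anchor - negative) ** 2, axis=-1)
    return np.maximum(d_pos - d_neg + margin, 0.0)

# Synthetic data only: three patients with 5, 8, and 6 visits,
# each visit described by 3 made-up features.
rng = np.random.default_rng(0)
filters = rng.normal(size=(4, 2, 3))  # k=4 filters, width w=2, d=3
emb_a = embed(rng.normal(size=(5, 3)), filters)
emb_p = embed(rng.normal(size=(8, 3)), filters)
emb_n = embed(rng.normal(size=(6, 3)), filters)
loss = triplet_loss(emb_a, emb_p, emb_n)
```

In this sketch the max-pooling over the visit axis is what removes the dependence on sequence length, so pairwise distances between any two patients are well defined. In practice the filters would be learned by minimizing the loss over many triplets rather than drawn at random.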
All Science Journal Classification (ASJC) codes
- Medicine (miscellaneous)
- Biomedical Engineering
- Pharmaceutical Science
- Computer Science Applications
- Electrical and Electronic Engineering