Multimodal deep learning for cervical dysplasia diagnosis

Tao Xu, Han Zhang, Xiaolei Huang, Shaoting Zhang, Dimitris N. Metaxas

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

118 Scopus citations


To improve the diagnostic accuracy of cervical dysplasia, it is important to fuse multimodal information collected during a patient’s screening visit. However, current multimodal frameworks suffer from low sensitivity at high specificity levels, due to their limitations in learning correlations among highly heterogeneous modalities. In this paper, we design a deep learning framework for cervical dysplasia diagnosis by leveraging multimodal information. We first employ the convolutional neural network (CNN) to convert the low-level image data into a feature vector fusible with other non-image modalities. We then jointly learn the non-linear correlations among all modalities in a deep neural network. Our multimodal framework is an end-to-end deep network which can learn better complementary features from the image and non-image modalities. It automatically gives the final diagnosis for cervical dysplasia with 87.83% sensitivity at 90% specificity on a large dataset, which significantly outperforms methods using any single source of information alone and previous multimodal frameworks.
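The headline figure in the abstract is a sensitivity reported at a fixed specificity. As a rough illustration of how such an operating point can be computed from classifier scores, here is a minimal sketch in plain Python. The function name `sensitivity_at_specificity`, the rule "predict positive when score >= threshold", and the toy scores and labels are all assumptions made for illustration; none of them come from the paper, and the authors' evaluation procedure may differ.

```python
import math


def sensitivity_at_specificity(scores, labels, target_specificity=0.90):
    """Return the highest sensitivity reachable while specificity >= target.

    Assumption (not from the paper): a case is predicted positive when its
    score is >= the threshold; labels are 1 (positive) or 0 (negative).
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")

    best = 0.0
    # Candidate thresholds: every observed score, plus +inf (predict nothing positive).
    for threshold in sorted(set(scores)) + [math.inf]:
        true_negatives = sum(1 for s in negatives if s < threshold)
        true_positives = sum(1 for s in positives if s >= threshold)
        specificity = true_negatives / len(negatives)
        sensitivity = true_positives / len(positives)
        if specificity >= target_specificity:
            best = max(best, sensitivity)
    return best


# Toy data, invented for illustration only: 10 negatives and 5 positives.
toy_scores = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.80,
              0.30, 0.50, 0.60, 0.70, 0.90]
toy_labels = [0] * 10 + [1] * 5

print(sensitivity_at_specificity(toy_scores, toy_labels, 0.90))  # prints 0.8
```

In the toy data, a threshold of 0.50 keeps 9 of 10 negatives below it (90% specificity) while 4 of 5 positives score at or above it, giving 80% sensitivity. Scanning every observed score as a threshold is quadratic in the number of cases, which is acceptable for a sketch but would normally be replaced by a sorted sweep on a large dataset.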

Original language: English (US)
Title of host publication: Medical Image Computing and Computer-Assisted Intervention - MICCAI 2016 - 19th International Conference, Proceedings
Editors: Gozde Unal, Sebastian Ourselin, Leo Joskowicz, Mert R. Sabuncu, William Wells
Publisher: Springer Verlag
Number of pages: 9
ISBN (Print): 9783319467221
State: Published - 2016

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 9901 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Computer Science (all)

