Conceptual data sampling for breast cancer histology image classification

Eman Rezk, Zainab Awan, Fahad Islam, Ali Jaoua, Somaya Al Maadeed, Nan Zhang, Gautam Das, Nasir Rajpoot

Research output: Contribution to journal › Article

4 Citations (Scopus)

Abstract

Data analytics have become increasingly complicated as the amount of data has grown. One technique that enables analytics on large datasets is data sampling, in which a portion of the data is selected so that the characteristics of the full dataset are preserved. In this paper, we introduce a novel data sampling technique rooted in formal concept analysis theory. The technique creates samples based on the distribution of the data across a set of binary patterns. The proposed sampling technique is applied to classifying regions of breast cancer histology images as malignant or benign. The performance of our method is compared with that of classical sampling methods. The results indicate that our method is efficient and generates an illustrative sample of small size. It is also competitive with other sampling methods in terms of sample size and sample quality, as measured by classification accuracy and F1 measure.
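The abstract does not spell out the sampling algorithm, so the following Python sketch is illustrative only and should not be read as the authors' method. It shows the two standard derivation operators of formal concept analysis on a binary object-attribute table, followed by a hypothetical sampler that groups objects by their binary pattern and draws a fraction from each group. The function names, the `fraction` parameter, and the per-pattern stratification rule are all assumptions introduced here.

```python
from collections import defaultdict
import random


def intent(objects, context):
    """Attributes shared by every object in `objects` (FCA derivation operator)."""
    n_attrs = len(context[0])
    return {a for a in range(n_attrs) if all(context[o][a] for o in objects)}


def extent(attrs, context):
    """Objects that possess every attribute in `attrs` (FCA derivation operator)."""
    return {o for o in range(len(context)) if all(context[o][a] for a in attrs)}


def pattern_stratified_sample(context, fraction, seed=0):
    """Hypothetical sampler: keep roughly `fraction` of the rows of each
    distinct binary pattern, and at least one row per pattern.

    This is an assumption made for illustration, not the paper's algorithm.
    """
    rng = random.Random(seed)
    groups = defaultdict(list)
    for idx, row in enumerate(context):
        groups[tuple(row)].append(idx)  # rows with identical patterns share a group
    sample = []
    for indices in groups.values():
        k = min(len(indices), max(1, round(fraction * len(indices))))
        sample.extend(rng.sample(indices, k))
    return sorted(sample)


# Toy binary context: 7 objects (rows) described by 2 binary attributes (columns).
context = [(1, 0), (1, 0), (1, 0), (1, 0), (0, 1), (0, 1), (1, 1)]
sample = pattern_stratified_sample(context, fraction=0.5)
```

In this sketch a pair (A, B) is a formal concept when `extent(B, context) == A` and `intent(A, context) == B`. Drawing from every pattern group is one simple way to keep rare patterns represented in a small sample; the paper may weight or select patterns differently.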

Original language: English (US)
Pages (from-to): 59-67
Number of pages: 9
Journal: Computers in Biology and Medicine
Volume: 89
DOI: https://doi.org/10.1016/j.compbiomed.2017.07.018
State: Published - Oct 1 2017

Fingerprint

Histology
Image classification
Breast Neoplasms
Sampling
Sample Size
Formal concept analysis

All Science Journal Classification (ASJC) codes

  • Computer Science Applications
  • Health Informatics

Cite this

Rezk, E., Awan, Z., Islam, F., Jaoua, A., Al Maadeed, S., Zhang, N., ... Rajpoot, N. (2017). Conceptual data sampling for breast cancer histology image classification. Computers in Biology and Medicine, 89, 59-67. https://doi.org/10.1016/j.compbiomed.2017.07.018
Rezk, Eman ; Awan, Zainab ; Islam, Fahad ; Jaoua, Ali ; Al Maadeed, Somaya ; Zhang, Nan ; Das, Gautam ; Rajpoot, Nasir. / Conceptual data sampling for breast cancer histology image classification. In: Computers in Biology and Medicine. 2017 ; Vol. 89. pp. 59-67.
@article{97c8c8c957c649cba527fcd753f2580d,
title = "Conceptual data sampling for breast cancer histology image classification",
author = "Eman Rezk and Zainab Awan and Fahad Islam and Ali Jaoua and {Al Maadeed}, Somaya and Nan Zhang and Gautam Das and Nasir Rajpoot",
year = "2017",
month = "10",
day = "1",
doi = "10.1016/j.compbiomed.2017.07.018",
language = "English (US)",
volume = "89",
pages = "59--67",
journal = "Computers in Biology and Medicine",
issn = "0010-4825",
publisher = "Elsevier Limited",

}

Rezk, E, Awan, Z, Islam, F, Jaoua, A, Al Maadeed, S, Zhang, N, Das, G & Rajpoot, N 2017, 'Conceptual data sampling for breast cancer histology image classification', Computers in Biology and Medicine, vol. 89, pp. 59-67. https://doi.org/10.1016/j.compbiomed.2017.07.018

Conceptual data sampling for breast cancer histology image classification. / Rezk, Eman; Awan, Zainab; Islam, Fahad; Jaoua, Ali; Al Maadeed, Somaya; Zhang, Nan; Das, Gautam; Rajpoot, Nasir.

In: Computers in Biology and Medicine, Vol. 89, 01.10.2017, p. 59-67.

TY - JOUR

T1 - Conceptual data sampling for breast cancer histology image classification

AU - Rezk, Eman

AU - Awan, Zainab

AU - Islam, Fahad

AU - Jaoua, Ali

AU - Al Maadeed, Somaya

AU - Zhang, Nan

AU - Das, Gautam

AU - Rajpoot, Nasir

PY - 2017/10/1

Y1 - 2017/10/1

UR - http://www.scopus.com/inward/record.url?scp=85026816963&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85026816963&partnerID=8YFLogxK

U2 - 10.1016/j.compbiomed.2017.07.018

DO - 10.1016/j.compbiomed.2017.07.018

M3 - Article

C2 - 28783538

AN - SCOPUS:85026816963

VL - 89

SP - 59

EP - 67

JO - Computers in Biology and Medicine

JF - Computers in Biology and Medicine

SN - 0010-4825

ER -