Cervical Cancer Screening using Data Mining Technique

Authors

  • Saritchai Predawan Sirindhorn College of Public Health, Chonburi
  • Pinkamon Sompeewong Sirindhorn College of Public Health, Chonburi

Keywords:

Cervical Cancer, Correlation of Data, Data Mining

Abstract

Cervical cancer is one of the most common cancers in women. Screening and diagnosis currently rely on several methods, including review of medical history, high-risk HPV testing, examination of body fluids, Pap smear, and tissue biopsy. In this paper, we propose a cervical cancer screening method based on data mining with the Ant-Miner algorithm. The objective was to identify data mining techniques that yield an efficient cervical cancer screening model, combining classification with correlation-based feature selection.

Experiments on a medical dataset (858 samples, 32 attributes, 4 classes) showed that Correlation-based Feature Selection (CFS), which holds that good feature sets contain attributes highly correlated with the class, rapidly identifies and screens out irrelevant, redundant, and missing features, and identifies relevant features as long as their relevance does not strongly depend on other features. CFS yields a smaller feature set while maintaining high screening performance as measured by accuracy and precision. The results show that age, number of sexual partners, age at first sexual intercourse, number of pregnancies, hormonal contraceptive use, and IUD use are the main predictive features for cervical cancer.

Across all four classes, the screening model achieved an average accuracy of 94.68% and an average precision of 93.78%. By class, the results were: Hinselmann, accuracy 93.26% and precision 90.00%; Schiller, accuracy 90.86% and precision 95.24%; Cytology, accuracy 96.26% and precision 92.10%; and Biopsy, accuracy 98.35% and precision 97.78%. Data mining with the Ant-Miner algorithm proved advantageous for cervical cancer screening, with excellent performance.
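The idea behind CFS described in the abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes the subset merit heuristic from Hall's formulation, merit = k * r_cf / sqrt(k + k(k-1) * r_ff), where k is the subset size, r_cf is the mean feature-class correlation, and r_ff is the mean feature-feature correlation. Absolute Pearson correlation and a simple greedy forward search are simplifying assumptions, and the toy data, the function names, and the `min_gain` threshold are hypothetical. The Ant-Miner classification step is not covered.

```python
import math
from itertools import combinations


def pearson(a, b):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    # A constant column carries no information; treat its correlation as 0.
    return cov / math.sqrt(var_a * var_b) if var_a > 0 and var_b > 0 else 0.0


def cfs_merit(features, target, subset):
    """CFS merit of the feature indices in `subset`."""
    k = len(subset)
    # Mean absolute correlation between each selected feature and the class.
    r_cf = sum(abs(pearson(features[j], target)) for j in subset) / k
    # Mean absolute correlation between every pair of selected features.
    pairs = list(combinations(subset, 2))
    r_ff = 0.0
    if pairs:
        r_ff = sum(abs(pearson(features[a], features[b])) for a, b in pairs) / len(pairs)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


def greedy_cfs(features, target, min_gain=1e-9):
    """Forward selection: keep adding the feature that raises the merit most."""
    selected, best = [], 0.0
    remaining = list(range(len(features)))
    while remaining:
        score, j = max((cfs_merit(features, target, selected + [c]), c) for c in remaining)
        if score <= best + min_gain:
            break  # no candidate improves the merit meaningfully
        selected.append(j)
        remaining.remove(j)
        best = score
    return selected, best


# Hypothetical toy data (not from the cervical cancer dataset).
target = [0, 0, 0, 0, 1, 1, 1, 1]
features = [
    [1, 2, 1, 3, 6, 7, 5, 8],      # relevant: tracks the class
    [2, 4, 2, 6, 12, 14, 10, 16],  # redundant: exactly twice the first feature
    [5, 1, 4, 2, 3, 5, 1, 4],      # weakly related to the class
]

selected, merit = greedy_cfs(features, target)
print("selected feature indices:", selected, "merit:", round(merit, 4))
```

In this toy setup the redundant copy adds nothing to the merit and the weakly related feature lowers it, which mirrors the abstract's point that CFS favours small feature sets that are highly correlated with the class.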

Author Biographies

Saritchai Predawan, Sirindhorn College of Public Health, Chonburi

Instructor

Pinkamon Sompeewong, Sirindhorn College of Public Health, Chonburi

Pharmacist

References

World Health Organization. Cervix uteri [Internet]. 2019 [cited 2020 February 10]. Available from: https://gco.iarc.fr/today/data/factsheets/cancers/23-Cervix-uteri-fact-sheet.pdf

Post Today. Faculties of medicine at four institutions propose sound cancer-fighting policies to reduce deaths [Internet]. 2019 [cited 2020 March 2]. Available from: https://www.posttoday.com/pr/597295

Editorial team. Cervical cancer: causes, symptoms, diagnosis, treatment, and preventive vaccine [Internet]. 2018 [cited 2020 February 19]. Available from: https://www.honestdocs.co/cervical-cancer-symptoms-treatment-prevention

UCI Machine Learning Repository. Cervical cancer (Risk Factors) Data Set [Internet]. 2017 [cited 2020 March 3]. Available from: https://archive.ics.uci.edu/ml/datasets/Cervical+cancer+%28Risk+Factors%29

Fernandes K, Cardoso SJ, Fernandes J. Transfer Learning with Partial Observability Applied to Cervical Cancer Screening. IbPRIA 2017;243-50.

Akyol K. A Study on Test Variable Selection and Balanced Data for Cervical Cancer Disease. Int J of Information Engineering and Electronic Business 2018;(10):1-7.

Hall MA. Correlation-based feature selection for discrete and numeric class machine learning. In Proceedings of the 17th International Conference on Machine Learning 2000;359-66.

Parpinelli R, Lopes H, Freitas A. Data Mining With an Ant Colony Optimization Algorithm. Evolutionary Computation, IEEE Transactions on 2002 Sep;(6):321-32.

Raschka S. Model Evaluation, Model Selection, and Algorithm Selection in Machine Learning. Wisconsin: University of Wisconsin–Madison; 2018.

Eakasit Pacharawongsakda. An Introduction to Data Analysis with Data Mining Techniques [in Thai]. Bangkok: Asia Digital Press Co., Ltd.; 2014.

Al-Wesabi YMS, Choudhury A, Won D. Classification of Cervical Cancer Dataset. In Proceedings of the 2018 IISE Annual Conference 2018;1456-61.

Unlersen MF, Sabanci K, Ozcan M. Determining Cervical Cancer Possibility by Using Machine Learning Methods. International Journal of Latest Research in Engineering and Technology 2017 December;03(12):65-71.

Published

2021-01-16 — Updated on 2026-02-12
