Using Fine Needle Aspiration Data to Classify Breast Cancer Types by Machine Learning

Authors

  • Rami Suleiman Khader, Al-Quds Open University
  • Mohamed Mahmoud Dweib, Al-Quds Open University
  • Yousef Saleh Abuzir, Al-Quds Open University

DOI:

https://doi.org/10.33977/2106-000-008-003

Keywords:

Machine Learning (ML), Breast Cancer Classifications, Decision Tree Classifier (DTC), Support Vector Machine (SVM), Random Forest Classifier (RFC), Fine Needle Aspiration

Abstract

Objectives: Breast cancer is a leading cause of death worldwide and the foremost in Palestine, and early diagnosis often improves patient outcomes. However, accurately diagnosing small tumors is challenging and carries a high risk of human error. This study seeks to improve breast cancer classification using machine learning (ML) algorithms.

Methods: The study applied three machine learning techniques, Decision Tree Classifier (DTC), Support Vector Machine (SVM), and Random Forest Classifier (RFC), to classify breast tumors. The three algorithms were evaluated using a confusion matrix and several performance metrics on a dataset containing 569 samples and 29 features.

Results: The Decision Tree Classifier (DTC) achieved the highest scores, reaching 100% in accuracy, precision, sensitivity, and specificity.

Conclusions: The study highlights the excellent performance of the Decision Tree Classifier in classifying breast cancer, which could significantly improve diagnostic accuracy and patient outcomes. The results indicate that DTC could be a useful ML model for reducing human diagnostic errors and improving early detection and care in clinical settings, and they call for further studies to refine and confirm its effectiveness.
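The four metrics reported in the abstract (accuracy, precision, sensitivity, and specificity) can all be read off a binary confusion matrix. The following is a minimal sketch of that calculation in plain Python, not the authors' code: the function names, the choice of 1 as the positive (malignant) label, and the toy label lists are assumptions made for illustration only and are not data from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Counts for a binary classifier (positive class = malignant, by assumption)."""
    tp: int  # positive cases predicted positive
    fp: int  # negative cases predicted positive
    tn: int  # negative cases predicted negative
    fn: int  # positive cases predicted negative


def build_confusion_matrix(y_true, y_pred, positive=1):
    """Tally the four confusion-matrix cells from paired labels."""
    tp = fp = tn = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1
            else:
                fp += 1
        else:
            if actual == positive:
                fn += 1
            else:
                tn += 1
    return ConfusionMatrix(tp=tp, fp=fp, tn=tn, fn=fn)


def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def summarize(cm):
    """Compute accuracy, precision, sensitivity, and specificity."""
    total = cm.tp + cm.fp + cm.tn + cm.fn
    return {
        "accuracy": safe_ratio(cm.tp + cm.tn, total),
        "precision": safe_ratio(cm.tp, cm.tp + cm.fp),
        "sensitivity": safe_ratio(cm.tp, cm.tp + cm.fn),  # recall on positives
        "specificity": safe_ratio(cm.tn, cm.tn + cm.fp),  # recall on negatives
    }


# Toy labels for illustration only (1 = malignant, 0 = benign).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

cm = build_confusion_matrix(y_true, y_pred)
metrics = summarize(cm)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

A classifier scores 100% on all four metrics only when both off-diagonal cells (fp and fn) are zero, which is the situation the abstract reports for DTC.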

Author Biographies

Rami Suleiman Khader, Al-Quds Open University

Master’s Student

Mohamed Mahmoud Dweib, Al-Quds Open University

Associate Professor

Yousef Saleh Abuzir, Al-Quds Open University

Professor

Published

2025-06-02

How to Cite

Khader, R. S., Dweib, M. M., & Abuzir, Y. S. (2025). Using Fine Needle Aspiration Data to Classify Breast Cancer Types by Machine Learning. Palestinian Journal of Technology and Applied Sciences (PJTAS), 1(8). https://doi.org/10.33977/2106-000-008-003
