Paper Title
A Model for the Classification of Breast Cancer Using Random Forest Algorithm
Oyelakin, Akinyemi Moruff
Breast cancer is a common disease among women globally. Past studies have used machine learning techniques to speed up prediction of the disease from labeled datasets. This study proposes a supervised machine learning approach for the classification of breast cancer. The model was built using the Random Forest algorithm. The dataset chosen for this study is the Wisconsin Breast Cancer (Diagnostic) dataset, originally released by the University of Wisconsin Hospitals, Madison. The Python programming language and some of its libraries were used for the experimental analyses. The dataset was split 75:25 into training and testing sets, respectively. The metrics used to evaluate the model were accuracy, precision, recall, F1-score, and Cohen’s Kappa statistic. In the experimental analyses, the model achieved an accuracy of 96%, a precision of 98%, a recall of 96%, an F1-score of 97%, and a Cohen’s Kappa of 91%. These results indicate strong classification performance on the chosen evaluation metrics.
Breast Cancer Classification, Machine Learning, Feature Selection, Predictive Accuracy
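
The workflow summarized in the abstract (a 75:25 train/test split, a Random Forest classifier, and five evaluation metrics) can be sketched roughly as follows. This is a minimal illustration and not the author's code. It assumes that scikit-learn is the Python library used and that its bundled copy of the Wisconsin Diagnostic dataset (`load_breast_cancer`) matches the data in the study. The function name `evaluate_random_forest`, the random seed, the stratified split, and the number of trees are all hypothetical choices that the abstract does not specify, so the scores it prints should not be expected to match the reported figures exactly.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (
    accuracy_score,
    cohen_kappa_score,
    f1_score,
    precision_score,
    recall_score,
)
from sklearn.model_selection import train_test_split


def evaluate_random_forest(test_size=0.25, seed=42):
    """Train a Random Forest on the Wisconsin Diagnostic data and score it.

    The seed, stratification, and n_estimators are assumptions made for
    this sketch; the abstract only states the 75:25 split and the metrics.
    """
    # scikit-learn's bundled copy of the dataset: 569 samples, 30 features.
    # In this copy, label 0 is malignant and label 1 is benign.
    X, y = load_breast_cancer(return_X_y=True)

    # 75% training, 25% testing, keeping class proportions similar (assumed).
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, stratify=y, random_state=seed
    )

    model = RandomForestClassifier(n_estimators=100, random_state=seed)
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)

    # Precision, recall, and F1 use scikit-learn's default positive label (1).
    return {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred),
        "recall": recall_score(y_test, y_pred),
        "f1": f1_score(y_test, y_pred),
        "cohen_kappa": cohen_kappa_score(y_test, y_pred),
    }


if __name__ == "__main__":
    for name, value in evaluate_random_forest().items():
        print(f"{name}: {value:.3f}")
```

Cohen’s Kappa is included alongside accuracy because it corrects for agreement expected by chance, which matters when the two classes are not equally frequent, as is the case for this dataset.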