Machine Learning and Resampling Techniques for Enhancing Credit Card Fraud Detection in Imbalanced Dataset
DOI:
https://doi.org/10.33977/2106-000-007-001
Keywords:
Credit card fraud detection, imbalanced data, machine learning, resampling techniques, SMOTE technique, baseline model, logistic regression, decision tree
Abstract
Objectives: This research addresses the challenge of credit card fraud detection, which is complicated by highly imbalanced data in which only a small fraction of transactions are fraudulent. It evaluates three machine learning methods (a Baseline Model, Logistic Regression, and Decision Tree) in conjunction with resampling techniques for handling imbalanced data in fraud detection.
Methods: The study applies machine learning algorithms to a credit card transaction dataset and uses three resampling techniques (Oversampling, Undersampling, and SMOTE) to address class imbalance. It compares the models and assesses the impact of each resampling method on fraud detection performance, with the aim of improving accuracy.
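The three resampling ideas named above can be illustrated with a minimal sketch. It uses only the Python standard library and hypothetical toy data; the function names (`random_oversample`, `random_undersample`, `smote_like`) and the parameter `k` are assumptions made for illustration and are not taken from the study, which may have used a different implementation. The `smote_like` function shows only the core SMOTE idea of interpolating between a minority sample and one of its nearest minority neighbours, and it assumes at least two minority samples.

```python
import random

def random_oversample(majority, minority, rng):
    """Duplicate minority rows (with replacement) until the classes are balanced."""
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return majority, minority + extra

def random_undersample(majority, minority, rng):
    """Keep a random subset of majority rows equal in size to the minority class."""
    return rng.sample(majority, len(minority)), minority

def smote_like(majority, minority, rng, k=3):
    """Add synthetic minority rows by interpolating between a minority row
    and one of its k nearest minority neighbours (the core SMOTE idea)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    while len(minority) + len(synthetic) < len(majority):
        base = rng.choice(minority)
        neighbours = sorted((p for p in minority if p is not base),
                            key=lambda p: sq_dist(base, p))[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return majority, minority + synthetic

rng = random.Random(0)
# Hypothetical toy data: 20 legitimate rows and 4 fraudulent rows, 2 features each.
legit = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(20)]
fraud = [(rng.gauss(3, 0.5), rng.gauss(3, 0.5)) for _ in range(4)]

for name, resample in [("oversample", random_oversample),
                       ("undersample", random_undersample),
                       ("smote_like", smote_like)]:
    maj, mino = resample(legit, fraud, rng)
    print(name, len(maj), len(mino))
```

Oversampling and the SMOTE-style function both grow the minority class to the size of the majority class, while undersampling shrinks the majority class instead. In practice, resampling is normally applied to the training split only, so that the test set keeps the original class ratio.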
Results: The results indicate that the Synthetic Minority Over-sampling Technique (SMOTE) outperforms traditional methods, achieving an accuracy of 99.89%. The Decision Tree model excels further, with 99.92% accuracy, higher recall (78.79%), and precision (98.11%). These findings underscore the potential of specialized machine learning techniques in improving fraud detection.
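Because fraudulent transactions are rare, accuracy alone says little, which is why recall and precision are reported alongside it. The following sketch, under stated assumptions, computes all three metrics from confusion-matrix counts. The counts are hypothetical and are not the study's data, and the helper name `classification_metrics` is introduced here only for illustration.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall) from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0  # no positive predictions
    recall = tp / (tp + fn) if (tp + fn) else 0.0     # no actual positives
    return accuracy, precision, recall

# Hypothetical test set: 10,000 transactions, 20 of them fraudulent.
# A classifier that labels every transaction as legitimate:
always_legit = classification_metrics(tp=0, fp=0, fn=20, tn=9980)
# A classifier that catches 15 of the 20 frauds with 2 false alarms:
some_model = classification_metrics(tp=15, fp=2, fn=5, tn=9978)

for name, (acc, prec, rec) in [("always_legit", always_legit),
                               ("some_model", some_model)]:
    print(f"{name}: accuracy={acc:.4f} precision={prec:.4f} recall={rec:.4f}")
```

In this toy setting the trivial classifier already reaches 99.8% accuracy while detecting no fraud at all, so differences between models show up mainly in recall and precision.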
Conclusions: This research emphasizes the importance of resampling methods in addressing imbalanced data in credit card fraud detection. The Decision Tree model and the SMOTE technique offer practical solutions for real-world applications. The study provides insights for enhancing fraud detection and highlights the role of advanced machine learning in combating credit card fraud effectively.
License
This work is licensed under a Creative Commons Attribution 4.0 International License.
- The editorial board confirms its commitment to intellectual property rights.
- Researchers must also commit to intellectual property rights.
- Copyright and publication rights for the research are owned by the Journal once the researcher is notified of the paper's approval. Scientific materials published or approved for publication in the Journal should not be republished unless written acknowledgment is obtained from the Deanship of Scientific Research.
- Research papers should not be published or republished unless written acknowledgment is obtained from the Deanship of Scientific Research.
- The researcher has the right to be credited as the author of the research and to have their name placed on all copies, editions, and volumes published.
- The author has the right to request that published papers be attributed to them.