Machine Learning and Resampling Techniques for Enhancing Credit Card Fraud Detection in Imbalanced Dataset

Authors

  • Yousef Abuzir Al-Quds Open University
  • Saleh Y. Abuzir University of Brescia

DOI:

https://doi.org/10.33977/2106-000-007-001

Keywords:

Credit Card Fraud Detection, imbalanced data, machine learning, resampling techniques, SMOTE technique, baseline model, logistic regression, decision tree

Abstract

Objectives: This research addresses the challenge of credit card fraud detection, which is complicated by highly imbalanced data in which only a small fraction of transactions are fraudulent. It evaluates machine learning methods, including the Baseline Model, Logistic Regression, and Decision Tree, in conjunction with resampling techniques for handling imbalanced data in fraud detection.

Methods: The study utilizes a structured approach involving a dataset, machine learning algorithms, and resampling techniques (Oversampling, Undersampling, SMOTE) to address class imbalance in credit card fraud detection. It aims to improve accuracy by comparing models and assessing the impact of resampling methods on fraud detection performance.
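The three resampling strategies named above can be illustrated with a minimal sketch. This is not the authors' implementation: the toy data, the function names (random_oversample, random_undersample, smote_like), and the neighbour count k are all assumptions made for illustration, and the SMOTE-like routine only reproduces the core interpolation idea of the technique.

```python
import random

def random_oversample(majority, minority, rng):
    """Duplicate randomly chosen minority rows until both classes are the same size."""
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return majority, minority + extra

def random_undersample(majority, minority, rng):
    """Keep a random subset of majority rows equal in size to the minority class."""
    return rng.sample(majority, len(minority)), minority

def smote_like(majority, minority, rng, k=3):
    """Create synthetic minority rows by interpolating between a minority row
    and one of its k nearest minority neighbours (the core idea of SMOTE).
    Assumes the minority class has at least two rows."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    while len(minority) + len(synthetic) < len(majority):
        base = rng.choice(minority)
        neighbours = sorted((p for p in minority if p is not base),
                            key=lambda p: sq_dist(base, p))[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return majority, minority + synthetic

rng = random.Random(0)
# Hypothetical toy data: 50 legitimate rows and 5 fraudulent rows, 2 features each.
legit = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(50)]
fraud = [(rng.gauss(3, 0.5), rng.gauss(3, 0.5)) for _ in range(5)]

for name, resample in [("oversampling", random_oversample),
                       ("undersampling", random_undersample),
                       ("SMOTE-like", smote_like)]:
    maj, mino = resample(legit, fraud, rng)
    print(f"{name}: {len(maj)} legitimate, {len(mino)} fraudulent")
```

Oversampling and the SMOTE-like routine both grow the minority class to the majority size, while undersampling shrinks the majority class instead. Resampling is normally applied to the training split only, so that evaluation still reflects the original class ratio.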

Results: The results indicate that the Synthetic Minority Over-sampling Technique (SMOTE) outperforms traditional methods, achieving an accuracy of 99.89%. The Decision Tree model excels further, with 99.92% accuracy, higher recall (78.79%), and precision (98.11%). These findings underscore the potential of specialized machine learning techniques in improving fraud detection.
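Accuracy, precision, and recall answer different questions on imbalanced data, which is why the abstract reports all three. The sketch below computes them from confusion-matrix counts. The counts are hypothetical and chosen only for illustration; they are not the study's data, and the function name metrics is an assumption.

```python
def metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall) from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0  # flagged cases that are truly fraud
    recall = tp / (tp + fn) if (tp + fn) else 0.0     # fraud cases that were caught
    return accuracy, precision, recall

# Hypothetical test set: 10,000 transactions, 20 of them fraudulent.
# A baseline that labels every transaction as legitimate:
baseline = metrics(tp=0, fp=0, fn=20, tn=9980)
# A hypothetical classifier that catches 15 of the 20 frauds with 2 false alarms:
model = metrics(tp=15, fp=2, fn=5, tn=9978)

for name, (acc, prec, rec) in [("baseline", baseline), ("model", model)]:
    print(f"{name}: accuracy={acc:.4f} precision={prec:.4f} recall={rec:.4f}")
```

With these made-up counts the baseline reaches 99.8% accuracy while detecting no fraud at all, so a high accuracy figure is only informative when read alongside recall and precision.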

In conclusion, this research emphasizes the importance of resampling methods in addressing imbalanced data in credit card fraud detection. The Decision Tree model and the SMOTE technique offer practical solutions for real-world applications. This study provides insights for enhancing fraud detection and highlights the role of advanced machine learning in combating credit card fraud effectively.

Author Biography

Yousef Abuzir, Al-Quds Open University

Associate Professor

Published

2024-01-27

How to Cite

Abuzir, Y., & Abuzir, S. Y. (2024). Machine Learning and Resampling Techniques for Enhancing Credit Card Fraud Detection in Imbalanced Dataset. Palestinian Journal of Technology and Applied Sciences (PJTAS), 1(7). https://doi.org/10.33977/2106-000-007-001
