Optimizing Support Vector Machine Classification Based on Semantic-Text Knowledge Enrichment

Authors

Keywords:

Support Vector Machine, Semantic Enrichment, Text Classification, Latent Dirichlet Allocation, Semantic Concepts Extraction.

Abstract

In this research, we enhance the performance of the Support Vector Machine (SVM) in text classification by applying semantic-knowledge enrichment. We propose a semantic-knowledge enrichment scheme that injects new concepts into the original content of the text documents. A pre-processing technique is proposed for cleaning the text and extracting features, from which semantic concepts are generated using the WordNet database and the open-source Natural Language Toolkit (NLTK). Additionally, the Latent Dirichlet Allocation model, combined with the online variational Bayes algorithm, is used as a dimensionality reduction technique to generate abstract concepts from the raw text. In our experiment, we describe the process of preparing the data (cleaning, transforming, and weighting the feature vectors in a multi-dimensional space) as a step toward measuring the performance metrics of SVM before and after applying our proposed approach on two different datasets. K-fold cross-validation is used to validate the proposed approach. Moreover, a confusion matrix is used to measure the accuracy and the macro-averages of precision, recall, and F1. The evaluation showed accuracy improvements from 94% to 98.3% for dataset-1 and from 88% to 93% for dataset-2. Moreover, the training time of the classifier, measured in seconds, was reduced to 32% and 17% for dataset-1 and dataset-2 respectively, in comparison with the training time on the original data before applying our proposed enrichment scheme.
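
As an illustration of the evaluation step mentioned in the abstract, the following minimal sketch shows how accuracy and the macro-averaged precision, recall, and F1 can be derived from a confusion matrix. It is not the authors' implementation: the function names, the class labels, and the toy predictions are all hypothetical, and macro F1 is assumed here to be the unweighted mean of the per-class F1 scores.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Build a matrix whose rows are true classes and columns are predicted classes."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


def macro_scores(matrix):
    """Return accuracy and macro-averaged precision, recall, and F1."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))
    precisions, recalls, f1s = [], [], []
    for i in range(n):
        tp = matrix[i][i]
        predicted = sum(matrix[r][i] for r in range(n))  # column sum
        actual = sum(matrix[i])                          # row sum
        p = tp / predicted if predicted else 0.0
        r = tp / actual if actual else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)
    return {
        "accuracy": correct / total if total else 0.0,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical toy data, not taken from the paper's datasets.
LABELS = ["sport", "tech"]
y_true = ["sport", "sport", "sport", "tech", "tech", "tech"]
y_pred = ["sport", "sport", "tech", "tech", "tech", "tech"]

matrix = confusion_matrix(y_true, y_pred, LABELS)
scores = macro_scores(matrix)
print(matrix)
print(scores)
```

In a K-fold setting, the same computation would be repeated on the held-out fold of each split and the resulting scores averaged across folds.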

 

DOI

Author Biography

Mr. Nasim Kamal Hamaydeh, Al Quds Open University

Over the past 16 years, Mr. Shadi Diab has been an information technology professional serving the education sector in Palestine. He has gained broad exposure to and experience in different domains of information technology, including education, training, management, and certification and testing programs. He is currently head of the accreditation and internet-based testing unit at the ICT center of Al-Quds Open University and social media supervisor at the Al-Quds Education Satellite Channel. Mr. Diab holds an MSc degree in Computer Science and has several accredited certifications, not only in information technology but also in education, training, and management. These include Infodev Certified Business Incubator Manager (BIM), Training Methods and Skills for Managers (TMSM), Microsoft Certified Trainer (MCT), and Microsoft Office Specialist 2007, 2010 and Master certifications. He is also an accredited ICDL Quality Assurance Officer and holds Adobe Certified Associate in Visual Communication (CS3), ICDL and Advanced ICDL (V5 and V6), and Right Strategies on Social Media certifications, along with Educational Development Management and Right of Children during Armed Conflicts. Mr. Diab has been an effective participant in different training events. He designed and managed a computer-based "Training Need Assessment" test for PA Cabinet of Ministries staff to measure their computer literacy skill levels. In addition, as a trainer he delivered several training sessions in ICDL and Microsoft Office to PMO, UNRWA, government security services, the Palestinian Central Bureau of Statistics (PCBS), and Technical and Vocational Education and Training (TVET) institutions. Mr. Diab has participated in different local and international conferences and has one published journal paper, "Classification Of Questions And Learning Outcome Statements (Los) Into Bloom’s Taxonomy (BT) By Similarity Measurements", International Journal of Managing Information Technology (IJMIT), Vol. 9, No. 2, May 2017.
He was a member of the scientific committee of the Fourth International Conference on Computer and Information Technology (PICCIT), Hebron, Palestine, 2015, and a speaker at the International Conference on Research in Education and Science (ICRES), May 18-21, 2017, Ephesus, Turkey, organized by Iowa State University, the International Society for Research in Education and Science (ISRES), and Eastern Virginia Medical School (EVMS). Moreover, he is a member of the organizing committee of the International Conference of Technology, Engineering and Science, Turkey (http://www.icontes.net). He also organized participation in the Microsoft Office competition and the Certiport World Cup Competition in Palestine in 2008, 2009, and 2011, and helped build partnerships and work agreements between Al Quds Open University and international educational companies in information and communication technologies, including but not limited to Certiport, ORACLE, Prometric, ECDL Foundation, Microsoft, Red Hat, and Milte2.

Published

2019-03-05

How to Cite

Diab, M. S., & Hamaydeh, M. N. K. (2019). Optimizing Support Vector Machine Classification Based on Semantic-Text Knowledge Enrichment. Palestinian Journal of Technology and Applied Sciences (PJTAS), (2). Retrieved from https://journals.qou.edu/index.php/PJTAS/article/view/2329
