Optimizing Support Vector Machine Classification Based on Semantic-Text Knowledge Enrichment
Keywords:
Support Vector Machine, Semantic Enrichment, Text Classification, Latent Dirichlet Allocation, Semantic Concepts Extraction.
Abstract
In this research, we enhance the performance of the Support Vector Machine (SVM) in text classification by applying semantic-knowledge enrichment. We propose a semantic-knowledge enrichment scheme that injects new concepts into the original contents of the text documents. A pre-processing technique is proposed for cleaning the text and extracting features, from which semantic concepts are generated using the WordNet database and the open-source Natural Language Toolkit (NLTK). Additionally, the online variational Bayes algorithm combined with the Latent Dirichlet Allocation (LDA) model is used as a dimensionality reduction technique to generate abstract concepts from the raw text. In our experiment, we describe the process of cleaning and transforming the data and weighting the feature vectors in a multi-dimensional space, as a step toward measuring the performance metrics of SVM before and after applying our proposed approach on two different datasets. K-fold cross-validation is used to validate the proposed approach. Moreover, a confusion matrix is used to measure the accuracy and the macro-averages of precision, recall and F1. The evaluation showed improvements in accuracy from 94% to 98.3% for dataset-1 and from 88% to 93% for dataset-2. Moreover, the training time of the classifier, measured in seconds, was reduced to 32% and 17% for dataset-1 and dataset-2 respectively, in comparison with the training time on the original data before applying our proposed enrichment scheme.
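As an illustration of the evaluation step mentioned in the abstract, the following minimal sketch derives accuracy and macro-averaged precision, recall and F1 from a confusion matrix. It is not the authors' code: the function name `macro_metrics`, the example matrix, and the convention that macro F1 is the unweighted mean of per-class F1 scores are all assumptions made for illustration, and the numbers are not taken from the paper.

```python
def macro_metrics(cm):
    """Accuracy and macro-averaged precision, recall and F1.

    cm[i][j] is the number of samples whose true class is i and whose
    predicted class is j (a hypothetical layout chosen for this sketch).
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    accuracy = sum(cm[k][k] for k in range(n)) / total

    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = cm[k][k]
        predicted_k = sum(cm[i][k] for i in range(n))  # column sum
        actual_k = sum(cm[k])                          # row sum
        p = tp / predicted_k if predicted_k else 0.0
        r = tp / actual_k if actual_k else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    # Macro-averaging: every class contributes equally, whatever its size.
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical two-class confusion matrix, for illustration only.
example_cm = [[8, 2],
              [1, 9]]
result = macro_metrics(example_cm)
print(result)
```

In a K-fold setting, one plausible use would be to compute these values on each held-out fold and average them across folds; the abstract does not specify how the folds were aggregated.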
License
- The editorial board confirms its commitment to the intellectual property rights.
- Researchers also have to commit to the intellectual property rights.
- The copyright and publication rights of the research are owned by the Journal once the researcher is notified of the paper's approval. The scientific materials published or approved for publishing in the Journal should not be republished unless a written acknowledgment is obtained from the Deanship of Scientific Research.
- Research papers should not be published or republished unless a written acknowledgement is obtained from the Deanship of Scientific Research.
- The researcher has the right to accredit the research to himself, and to place his name on all the copies, editions and volumes published.
- The author has the right to request the accreditation of the published papers to himself.