Status: Published
A suite of swarm dynamic multi-objective algorithms for rebalancing extremely imbalanced datasets
Jinyan Li1,2; Simon Fong1; Raymond K. Wong3; Sabah Mohammed4; Jinan Fiaidhi4; Yunsick Sung5
2017-11-23
Source Publication: Applied Soft Computing
ISSN: 1568-4946
Volume: 69, Pages: 784-805
Abstract

Imbalanced datasets can be found in a number of fields; they are commonly regarded as big data because of their sheer volume and high attribute dimensions. As the name suggests, imbalanced big datasets come with an extremely imbalanced ratio between the number of majority class and minority class samples. Traditional methods have been attempted but still cannot fully, effectively, and reliably solve the imbalanced class classification problem, especially when the distribution of the classes is exceedingly imbalanced. In this paper, we propose a collection of algorithms to solve the problem of imbalanced datasets in binary data classification. Most traditional methods rebalance the imbalanced dataset merely by matching the data quantities of the two classes. Our proposed algorithms, which take the form of a suite of variants, focus on guaranteeing the credibility of the classification model and reaching the greatest possible accuracy by dynamically rebalancing the training dataset with multi-objective swarm intelligence optimisation. The new algorithms are extended from those we proposed earlier, which had a single objective: first find a set of solutions that satisfy the Kappa criterion, then search for the solution in the set that offers the highest accuracy. Two main modifications are made in the new algorithms. The first is multi-objective optimisation, which aims at finding a solution that satisfies several criteria at the same time, such as accuracy and a list of credibility indicators. The other enhancement is the incremental operation of the multi-objective optimisation. Incremental optimisation is imperative for processing data feeds that may arrive in a streaming manner. Instead of waiting for the full data archive to be available before optimisation, incremental optimisation rebalances the data feed segment by segment on the fly. The experimental results from the suite of proposed algorithms show that they can effectively attain better and more stable performance from the classification model, accompanied by much greater credibility, than the other five traditional methods when imbalanced datasets are used as training datasets for inducing a classifier.
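
The two-step selection attributed above to the earlier single-objective algorithms (filter by the Kappa criterion, then pick the highest accuracy) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Confusion` type, the function names, the candidate confusion matrices, and the Kappa threshold of 0.2 are all hypothetical and chosen only for illustration. Kappa is computed here as Cohen's kappa for a binary confusion matrix.

```python
from typing import NamedTuple, Optional, Sequence


class Confusion(NamedTuple):
    """Binary confusion matrix (hypothetical helper type)."""
    tp: int  # minority samples predicted as minority
    fn: int  # minority samples predicted as majority
    fp: int  # majority samples predicted as minority
    tn: int  # majority samples predicted as majority


def accuracy(c: Confusion) -> float:
    """Fraction of samples classified correctly."""
    return (c.tp + c.tn) / (c.tp + c.fn + c.fp + c.tn)


def cohen_kappa(c: Confusion) -> float:
    """Cohen's kappa: agreement beyond what chance alone would give."""
    total = c.tp + c.fn + c.fp + c.tn
    p_observed = (c.tp + c.tn) / total
    p_chance = ((c.tp + c.fn) * (c.tp + c.fp)
                + (c.fp + c.tn) * (c.fn + c.tn)) / total ** 2
    if p_chance == 1.0:  # degenerate case: only one class present
        return 0.0
    return (p_observed - p_chance) / (1.0 - p_chance)


def select_candidate(candidates: Sequence[Confusion],
                     kappa_threshold: float) -> Optional[Confusion]:
    """Step 1: keep candidates meeting the Kappa threshold.
    Step 2: among those, return the one with the highest accuracy."""
    credible = [c for c in candidates if cohen_kappa(c) >= kappa_threshold]
    if not credible:
        return None
    return max(credible, key=accuracy)


# Hypothetical results on 1000 samples with 10 minority cases.
majority_only = Confusion(tp=0, fn=10, fp=0, tn=990)  # predicts majority always
rebalanced = Confusion(tp=8, fn=2, fp=40, tn=950)     # after some rebalancing

best = select_candidate([majority_only, rebalanced], kappa_threshold=0.2)
```

In this made-up example the majority-only classifier has the higher accuracy but a Kappa of zero, so the filter discards it and the rebalanced candidate is selected. That is the sense in which accuracy alone says little about credibility on an extremely imbalanced dataset.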

Keyword: Swarm Intelligence Algorithms; Dynamic Multi-objective; Big Highly Imbalanced Dataset; Binary Classification; Rebalancing Algorithm
DOI: 10.1016/j.asoc.2017.11.028
Indexed By: SCIE
Language: English
WOS Research Area: Computer Science
WOS Subject: Computer Science, Artificial Intelligence; Computer Science, Interdisciplinary Applications
WOS ID: WOS:000438775200050
Publisher: ELSEVIER, RADARWEG 29, 1043 NX AMSTERDAM, NETHERLANDS
Scopus ID: 2-s2.0-85044657110
Document Type: Journal article
Collection: DEPARTMENT OF COMPUTER AND INFORMATION SCIENCE
Corresponding Author: Simon Fong
Affiliation:
1. Department of Computer and Information Science, University of Macau, Taipa, Macao
2. Big Data PDU, Huawei Software Technologies Co., Ltd., Nanjing, China
3. School of Computer Science and Engineering, University of New South Wales, Sydney, 2000, Australia
4. Department of Computer Science, Lakehead University, Thunder Bay, Canada
5. Dept of Multimedia Engineering, Dongguk-Seoul, 30, Pildong-ro 1gil, Jung-gu, Seoul, 04602, Republic of Korea
First Author Affiliation: University of Macau
Corresponding Author Affiliation: University of Macau
Recommended Citation
GB/T 7714: Jinyan Li, Simon Fong, Raymond K. Wong, et al. A suite of swarm dynamic multi-objective algorithms for rebalancing extremely imbalanced datasets[J]. Applied Soft Computing, 2017, 69, 784-805.
APA: Jinyan Li, Simon Fong, Raymond K. Wong, Sabah Mohammed, Jinan Fiaidhi, & Yunsick Sung (2017). A suite of swarm dynamic multi-objective algorithms for rebalancing extremely imbalanced datasets. Applied Soft Computing, 69, 784-805.
MLA: Jinyan Li, et al. "A suite of swarm dynamic multi-objective algorithms for rebalancing extremely imbalanced datasets." Applied Soft Computing 69 (2017): 784-805.