Status: Published
Rare event prediction using similarity majority under-sampling technique
Jinyan Li1; Simon Fong1; Shimin Hu1; Victor W. Chu2; Raymond K. Wong3; Sabah Mohammed4; Nilanjan Dey5
2017-11-24
Conference Name: SCDS: International Conference on Soft Computing in Data Science
Source Publication: Soft Computing in Data Science
Volume: 788
Pages: 23-39
Conference Date: 27-28 November
Conference Place: Yogyakarta, Indonesia
Abstract

In data mining, it is not uncommon to face an imbalanced classification problem in which the samples of interest are rare. Training data with too many ordinary samples and too few rare ones misleads the classifier: it over-fits by learning too much from the majority class and under-fits the minority class, lacking the power to recognize it. This work studies a novel rebalancing technique that under-samples the majority class (reduces its size by sampling) to ease the imbalance in the class distribution without synthesizing extra training samples. This simple method is called the Similarity Majority Under-Sampling Technique (SMUTE). By measuring the similarity between each majority class sample and its surrounding minority class samples, SMUTE effectively discriminates between majority and minority class samples while taking care not to change too much of the underlying non-linear mapping between the input variables and the target classes. Two experiments are conducted and reported in this paper: one is an extensive performance comparison of SMUTE with state-of-the-art methods on generated imbalanced data; the other uses real data representing a case of natural disaster prevention in which accident samples are rare. SMUTE is found to perform favourably compared with the other methods in both cases.
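
The idea of scoring each majority class sample by its similarity to nearby minority class samples can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state which similarity measure SMUTE uses, how many minority samples count as "surrounding", or which majority samples are retained. Cosine similarity, the parameter `k`, the flag `keep_most_similar`, and the function names below are all assumptions made for illustration only.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length numeric vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def similarity_undersample(majority, minority, keep, k=3, keep_most_similar=True):
    """Return `keep` majority samples chosen by similarity to minority samples.

    Each majority sample is scored by the sum of its similarities to the
    k most similar minority samples (a stand-in for "surrounding" samples).
    Whether to retain the most or the least similar samples is an assumption
    exposed through `keep_most_similar`.
    """
    if not minority:
        raise ValueError("minority set must not be empty")
    k = min(k, len(minority))

    scored = []
    for idx, sample in enumerate(majority):
        sims = sorted(
            (cosine_similarity(sample, m) for m in minority), reverse=True
        )
        scored.append((sum(sims[:k]), idx))

    # Rank by score, then restore the original order of the retained samples.
    scored.sort(key=lambda pair: pair[0], reverse=keep_most_similar)
    kept = sorted(idx for _, idx in scored[:keep])
    return [majority[i] for i in kept]


if __name__ == "__main__":
    majority = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, 0.1]]
    minority = [[1.0, 0.0], [1.0, 0.2]]
    print(similarity_undersample(majority, minority, keep=2, k=2))
```

No synthetic samples are created here: the majority class is only reduced to `keep` samples, which matches the under-sampling setting described in the abstract. The selection rule itself should be checked against the full paper before any real use.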

Keyword: Imbalanced Classification; Under-sampling; Similarity Measure; SMUTE
DOI: 10.1007/978-981-10-7242-0_3
Language: English
Scopus ID: 2-s2.0-85036465009
Document Type: Conference paper
Collection: DEPARTMENT OF COMPUTER AND INFORMATION SCIENCE
Affiliation:
1. Department of Computer Information Science, University of Macau, Macau SAR, China
2. School of Computer Science and Engineering, Nanyang Technological University, Singapore, Singapore
3. School of Computer Science and Engineering, University of New South Wales, Sydney, Australia
4. Department of Computer Science, Lakehead University, Thunder Bay, Canada
5. Department of Information Technology, Techno India College of Technology, Kolkata, India
First Author Affiliation: University of Macau
Recommended Citation
GB/T 7714
Jinyan Li, Simon Fong, Shimin Hu, et al. Rare event prediction using similarity majority under-sampling technique[C], 2017, 23-39.
APA: Jinyan Li, Simon Fong, Shimin Hu, Victor W. Chu, Raymond K. Wong, Sabah Mohammed, & Nilanjan Dey (2017). Rare event prediction using similarity majority under-sampling technique. Soft Computing in Data Science, 788, 23-39.
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.