Status | Published |
Rare event prediction using similarity majority under-sampling technique | |
Jinyan Li1; Simon Fong1; Shimin Hu1; Victor W. Chu2; Raymond K. Wong3; Sabah Mohammed4; Nilanjan Dey5 | |
2017-11-24 | |
Conference Name | SCDS: International Conference on Soft Computing in Data Science |
Source Publication | Soft Computing in Data Science |
Volume | 788 |
Pages | 23-39 |
Conference Date | 27-28 November |
Conference Place | Yogyakarta, Indonesia |
Abstract | In data mining it is not uncommon to face an imbalanced classification problem in which the samples of interest are rare. Training data with too many ordinary samples and too few rare ones misleads the classifier: it over-fits by learning too much from the majority class and under-fits the minority class, which it then fails to recognize. This work studies a novel rebalancing technique that under-samples (reduces by sampling) the majority class to ease the imbalanced class distribution without synthesizing extra training samples. This simple method is called the Similarity Majority Under-Sampling Technique (SMUTE). By measuring the similarity between each majority class sample and its surrounding minority class samples, SMUTE effectively discriminates between majority and minority class samples while taking care not to change too much of the underlying non-linear mapping between the input variables and the target classes. Two experiments are conducted and reported in this paper: one is an extensive performance comparison of SMUTE with state-of-the-art methods using generated imbalanced data; the other uses real data representing a case of natural disaster prevention where accident samples are rare. SMUTE is found to perform favourably compared with the other methods in both cases. |
Keyword | Imbalanced Classification; Under-sampling; Similarity Measure; SMUTE |
DOI | 10.1007/978-981-10-7242-0_3 |
Language | English |
Scopus ID | 2-s2.0-85036465009 |
Document Type | Conference paper |
Collection | DEPARTMENT OF COMPUTER AND INFORMATION SCIENCE |
Affiliation | 1.Department of Computer Information Science, University of Macau, Macau SAR, China 2.School of Computer Science and Engineering, Nanyang Technological University, Singapore, Singapore 3.School of Computer Science and Engineering, University of New South Wales, Sydney, Australia 4.Department of Computer Science, Lakehead University, Thunder Bay, Canada 5.Department of Information Technology, Techno India College of Technology, Kolkata, India |
First Author Affiliation | University of Macau |
Recommended Citation GB/T 7714 | Jinyan Li,Simon Fong,Shimin Hu,et al. Rare event prediction using similarity majority under-sampling technique[C], 2017, 23-39. |
APA | Li, J., Fong, S., Hu, S., Chu, V. W., Wong, R. K., Mohammed, S., & Dey, N. (2017). Rare event prediction using similarity majority under-sampling technique. Soft Computing in Data Science, 788, 23-39. |
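
The abstract describes under-sampling the majority class by measuring the similarity between each majority sample and its surrounding minority samples, but it does not state the similarity measure, the neighbourhood size, or the selection rule. The following is only a minimal sketch of that general idea, not the authors' SMUTE implementation. The function names, the choice of cosine similarity, the parameter `k`, and the `keep_most_similar` switch are all illustrative assumptions.

```python
import math


def cosine(a, b):
    """Cosine similarity of two non-zero vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def similarity_undersample(majority, minority, keep, k=3,
                           keep_most_similar=False):
    """Return sorted indices of the majority samples to retain.

    Hypothetical rule: score each majority sample by its mean cosine
    similarity to its k most similar minority samples, rank by score,
    and keep `keep` samples from one end of the ranking.
    """
    scores = []
    for sample in majority:
        sims = sorted((cosine(sample, m) for m in minority), reverse=True)
        top = sims[:k]  # the k most similar minority samples
        scores.append(sum(top) / len(top))
    # Ascending by default (least similar first); descending if requested.
    order = sorted(range(len(majority)), key=lambda i: scores[i],
                   reverse=keep_most_similar)
    return sorted(order[:keep])


majority = [[1, 0], [0, 1], [1, 1], [1, 0.1]]
minority = [[1, 0]]
kept = similarity_undersample(majority, minority, keep=2, k=1)
rebalanced = [majority[i] for i in kept] + minority
```

In this sketch the retained majority samples are combined with all minority samples to form the rebalanced training set, so no synthetic samples are created. Which end of the similarity ranking should be kept is left as a parameter because the abstract does not specify it.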