Status: Published
Robust high-dimensional bioinformatics data streams mining by ODR-ioVFDT
Dantong Wang (1); Simon Fong (1); Raymond K. Wong (2); Sabah Mohammed (3); Jinan Fiaidhi (3); Kelvin K. L. Wong (4)
2017-02-23
Source Publication: Scientific Reports
ISSN: 2045-2322
Volume: 7
Other Abstract

Outlier detection in bioinformatics data stream mining has received significant attention from research communities in recent years. A persistent dilemma is how to distinguish noise from a genuine exception, and whether to discard an exceptional value or devise an extra decision path to accommodate it. In this paper, we propose a novel algorithm called ODR with incrementally Optimized Very Fast Decision Tree (ODR-ioVFDT) for handling outliers during continuous data learning. An adaptive interquartile-range based identification method sets a tolerance threshold, which is then used to judge whether a data instance of exceptional value should be included in training. This differs from traditional approaches, in which outlier detection and removal are two separate steps in processing the data. The proposed algorithm is tested on datasets from five bioinformatics scenarios, comparing the performance of our model with that of other models without ODR. The results show that ODR-ioVFDT performs better in classification accuracy, kappa statistic, and time consumption. Applied to bioinformatics streaming data processing, where it detects and quantifies information about the life phenomena, states, characters, variables, and components of an organism, ODR-ioVFDT can help diagnose and treat disease more effectively.
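The interquartile-range idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' ODR-ioVFDT implementation: it assumes each incoming instance has already been reduced to a single numeric score, and the class name `IQRStreamFilter`, the sliding window, the warm-up length, and the conventional multiplier `k = 1.5` are all hypothetical choices made for illustration. Accepted values would then be passed on to whatever incremental learner is in use.

```python
from collections import deque
import statistics


class IQRStreamFilter:
    """Illustrative IQR-based gate for a numeric data stream (hypothetical names)."""

    def __init__(self, window=100, k=1.5, warmup=8):
        # Sliding window of recently accepted values; the bounds adapt as it changes.
        self.window = deque(maxlen=window)
        self.k = k              # assumed tolerance multiplier applied to the IQR
        self.warmup = warmup    # values accepted unconditionally before gating starts

    def bounds(self):
        # Quartiles of the current window; "inclusive" treats the window as the full sample.
        q1, _, q3 = statistics.quantiles(self.window, n=4, method="inclusive")
        iqr = q3 - q1
        return q1 - self.k * iqr, q3 + self.k * iqr

    def accept(self, x):
        """Return True if x should be passed on for training, False if treated as an outlier."""
        if len(self.window) < self.warmup:
            self.window.append(x)
            return True
        low, high = self.bounds()
        ok = low <= x <= high
        if ok:
            # Only accepted values update the window in this sketch.
            self.window.append(x)
        return ok


if __name__ == "__main__":
    flt = IQRStreamFilter(window=50, k=1.5, warmup=8)
    stream = [10, 11, 9, 10, 12, 11, 10, 9, 100, 11]
    kept = [x for x in stream if flt.accept(x)]
    print(kept)
```

In this sketch rejected values never enter the window, so the bounds cannot follow a genuine shift in the data; a practical variant would need some policy for admitting persistent "outliers", which is exactly the dilemma the abstract raises.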

DOI: 10.1038/srep43167
Indexed By: SCIE
Language: English
WOS Research Area: Science & Technology - Other Topics
WOS Subject: Multidisciplinary Sciences
WOS ID: WOS:000394748700001
Publisher: NATURE PUBLISHING GROUP, MACMILLAN BUILDING, 4 CRINAN ST, LONDON N1 9XW, ENGLAND
Scopus ID: 2-s2.0-85013655760
Document Type: Journal article
Collection: DEPARTMENT OF COMPUTER AND INFORMATION SCIENCE
Corresponding Author: Kelvin K. L. Wong
Affiliation:
1. Department of Computer and Information Science, University of Macau, SAR, Macau
2. School of Computer Science and Engineering, University of New South Wales, Australia
3. Department of Computer Science, Lakehead University, Thunder Bay, Canada
4. School of Medicine, University of Western Sydney, New South Wales, Australia
Recommended Citation
GB/T 7714: Dantong Wang, Simon Fong, Raymond K. Wong, et al. Robust high-dimensional bioinformatics data streams mining by ODR-ioVFDT[J]. Scientific Reports, 2017, 7.
APA: Dantong Wang, Simon Fong, Raymond K. Wong, Sabah Mohammed, Jinan Fiaidhi, & Kelvin K. L. Wong (2017). Robust high-dimensional bioinformatics data streams mining by ODR-ioVFDT. Scientific Reports, 7.
MLA: Dantong Wang, et al. "Robust high-dimensional bioinformatics data streams mining by ODR-ioVFDT." Scientific Reports 7 (2017).
