Status | Published |
Graph Enhanced Fuzzy Clustering for Categorical Data Using a Bayesian Dissimilarity Measure | |
Zhang, Chuanbin1,2; Chen, Long1; Zhao, Yin Ping3; Wang, Yingxu1; Chen, C. L.P.4 | |
2022-07-11 | |
Source Publication | IEEE TRANSACTIONS ON FUZZY SYSTEMS |
ISSN | 1063-6706 |
Volume | 31, Issue: 3, Pages: 810-824 |
Abstract | Categorical data is common in real-world applications, and discovering valuable patterns in such data by clustering is of great importance. However, categorical values lack a natural quantitative relationship, so traditional clustering approaches, which are usually developed for numerical data, perform poorly on categorical datasets. To address this problem and improve clustering performance on categorical data, we propose a novel fuzzy clustering model. First, by approximating the maximum a posteriori (MAP) estimation of a discrete distribution of the data partition, a new fuzzy clustering objective function is designed for categorical data. A Bayesian dissimilarity measure is formulated in this objective to handle the subtle relationships between categorical values efficiently. Then, to further enhance clustering performance, a novel Kullback-Leibler (KL) divergence-based graph regularization is integrated into the clustering objective to exploit prior knowledge about the dataset, for example, information about correlations between data points. The proposed model is solved by alternating optimization, and experimental results on synthetic and real-world datasets show that it outperforms classical and relevant state-of-the-art algorithms. We also present a parameter analysis of our approach and conduct a comprehensive study of the effectiveness of the Bayesian dissimilarity measure and the KL divergence-based graph regularization. |
Keyword | Bayesian Methods; Categorical Data; Fuzzy Centroids; Fuzzy Clustering; Graph; Kullback–Leibler (KL) Divergence |
DOI | 10.1109/TFUZZ.2022.3189831 |
Indexed By | SCIE |
Language | English |
WOS Research Area | Computer Science ; Engineering |
WOS Subject | Computer Science, Artificial Intelligence ; Engineering, Electrical & Electronic |
WOS ID | WOS:000966728800001 |
Scopus ID | 2-s2.0-85134251985 |
Document Type | Journal article |
Collection | Faculty of Science and Technology; Department of Computer and Information Science |
Corresponding Author | Chen, Long; Zhao, Yin Ping |
Affiliation | 1. Department of Computer and Information Science, Faculty of Science and Technology, University of Macau, Macau, China; 2. School of Computer Science and Software, Zhaoqing University, Zhaoqing 526061, China; 3. Northwestern Polytechnical University, School of Software, Xi'an, 710072, China; 4. South China University of Technology, School of Computer Science and Engineering, Guangzhou, 510641, China |
First Author Affiliation | Faculty of Science and Technology |
Corresponding Author Affiliation | Faculty of Science and Technology |
Recommended Citation GB/T 7714 | Zhang, Chuanbin,Chen, Long,Zhao, Yin Ping,et al. Graph Enhanced Fuzzy Clustering for Categorical Data Using a Bayesian Dissimilarity Measure[J]. IEEE TRANSACTIONS ON FUZZY SYSTEMS, 2022, 31(3), 810 - 824. |
APA | Zhang, Chuanbin., Chen, Long., Zhao, Yin Ping., Wang, Yingxu., & Chen, C. L.P. (2022). Graph Enhanced Fuzzy Clustering for Categorical Data Using a Bayesian Dissimilarity Measure. IEEE TRANSACTIONS ON FUZZY SYSTEMS, 31(3), 810 - 824. |
MLA | Zhang, Chuanbin,et al."Graph Enhanced Fuzzy Clustering for Categorical Data Using a Bayesian Dissimilarity Measure".IEEE TRANSACTIONS ON FUZZY SYSTEMS 31.3(2022):810 - 824. |
Files in This Item: | There are no files associated with this item. |
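The abstract mentions a KL divergence-based graph regularization over fuzzy memberships but does not give its formula. As a rough illustration only, the sketch below assumes one generic form of such a penalty: the sum over pairs of points of a graph weight times the KL divergence between their membership vectors. The function name `kl_graph_regularizer`, the matrices `U` and `W`, and the smoothing constant `eps` are all hypothetical and are not taken from the paper.

```python
import numpy as np


def kl_graph_regularizer(U, W, eps=1e-12):
    """Illustrative penalty: sum over (i, j) of W[i, j] * KL(U[i] || U[j]).

    U: (n, c) fuzzy membership matrix, each row sums to 1.
    W: (n, n) non-negative graph weights encoding prior pairwise similarity.
    This is an assumed generic form, not the objective used in the paper.
    """
    U = np.clip(np.asarray(U, dtype=float), eps, None)  # avoid log(0)
    U = U / U.sum(axis=1, keepdims=True)                # renormalize rows
    log_U = np.log(U)
    self_term = np.sum(U * log_U, axis=1)   # sum_k U[i,k] * log U[i,k]
    cross_term = U @ log_U.T                # sum_k U[i,k] * log U[j,k]
    kl = self_term[:, None] - cross_term    # kl[i, j] = KL(U[i] || U[j])
    return float(np.sum(np.asarray(W, dtype=float) * kl))


# Points 0 and 1 are linked in the graph; point 2 is isolated.
W = np.array([[0, 1, 0],
              [1, 0, 0],
              [0, 0, 0]], dtype=float)

# Linked points share the same memberships: the penalty vanishes.
U_agree = np.array([[0.9, 0.1], [0.9, 0.1], [0.1, 0.9]])
# Linked points disagree: the penalty is positive.
U_differ = np.array([[0.9, 0.1], [0.1, 0.9], [0.1, 0.9]])

penalty_agree = kl_graph_regularizer(U_agree, W)
penalty_differ = kl_graph_regularizer(U_differ, W)
```

Under this assumed form, the penalty is zero when graph-linked points have identical memberships and grows as they diverge, which is the general way a graph term can encourage correlated points to fall into the same clusters.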