Status | Published |
Extracting Top-κ Frequent and Diversified Patterns in Knowledge Graphs | |
Zeng, Jian (1); U, Leong Hou (2); Yan, Xiao (3); Li, Yan (4); Han, Mingji (5); Tang, Bo (3) | |
2024-02 | |
Source Publication | IEEE Transactions on Knowledge and Data Engineering |
ISSN | 1041-4347 |
Volume | 36, Issue: 2, Pages: 608-626 |
Abstract | A knowledge graph contains many real-world facts that can support various analytical tasks, e.g., exceptional fact discovery and claim checking. In this work, we extract top-$k$ frequent and diversified patterns from a knowledge graph while capturing user interest. Specifically, we first formalize the core-based top-$k$ frequent pattern discovery problem, which finds the $k$ patterns with the highest frequency among those extended from a core pattern specified by the user query. In addition, to diversify the top-$k$ frequent patterns, we define a distance function that measures the dissimilarity between two patterns, and return top-$k$ patterns in which the pairwise diversity of any two resultant patterns exceeds a given threshold. As the search space of candidate patterns is exponential w.r.t. the number of nodes and edges in the knowledge graph, discovering frequent and diversified patterns is computationally challenging. To achieve high efficiency, we propose a suite of techniques: (1) a meta-index that avoids generating invalid candidate patterns; (2) an upper bound on the frequency score (i.e., MNI) of a candidate pattern, which is used to prune unqualified candidates early and to prioritize the enumeration order of patterns; (3) an advanced join-based approach that computes the MNI of candidate patterns efficiently; and (4) a lower bound for the distance function, together with incremental computation of the pairwise diversity among patterns. Using real-world knowledge graphs, we experimentally verify the efficiency and effectiveness of the proposed techniques. We also demonstrate the utility of the extracted patterns through case studies. |
Keyword | Knowledge Discovery; Graph Pattern Mining; Data Exploration |
DOI | 10.1109/TKDE.2022.3233594 |
Indexed By | SCIE |
Language | English |
WOS Research Area | Computer Science ; Engineering |
WOS Subject | Computer Science, Artificial Intelligence ; Computer Science, Information Systems ; Engineering, Electrical & Electronic |
WOS ID | WOS:001140611700001 |
Publisher | IEEE COMPUTER SOC, 10662 LOS VAQUEROS CIRCLE, PO BOX 3014, LOS ALAMITOS, CA 90720-1314 |
Scopus ID | 2-s2.0-85164412387 |
Document Type | Journal article |
Collection | DEPARTMENT OF COMPUTER AND INFORMATION SCIENCE; Faculty of Science and Technology; THE STATE KEY LABORATORY OF INTERNET OF THINGS FOR SMART CITY (UNIVERSITY OF MACAU) |
Corresponding Author | Tang,Bo |
Affiliation | 1. Harbin Institute of Technology, Harbin, China; 2. Department of Computer and Information Science, State Key Laboratory of Internet of Things for Smart City, Centre for Data Science, University of Macau, Macau, China; 3. Department of Computer Science and Engineering, Research Institute of Trustworthy Autonomous Systems, Southern University of Science and Technology, Shenzhen, China; 4. Shenzhen Polytechnic, Shenzhen, China; 5. Boston University, Boston, MA, USA |
Recommended Citation GB/T 7714 | Zeng,Jian,U,Leong Hou,Yan,Xiao,et al. Extracting Top-κ Frequent and Diversified Patterns in Knowledge Graphs[J]. IEEE Transactions on Knowledge and Data Engineering, 2024, 36(2), 608-626. |
APA | Zeng,Jian., U,Leong Hou., Yan,Xiao., Li,Yan., Han,Mingji., & Tang,Bo (2024). Extracting Top-κ Frequent and Diversified Patterns in Knowledge Graphs. IEEE Transactions on Knowledge and Data Engineering, 36(2), 608-626. |
MLA | Zeng,Jian,et al."Extracting Top-κ Frequent and Diversified Patterns in Knowledge Graphs".IEEE Transactions on Knowledge and Data Engineering 36.2(2024):608-626. |
Files in This Item: | There are no files associated with this item. |
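
The abstract names two ingredients that a short sketch can make concrete: the MNI frequency score and the threshold-based diversification of the top-$k$ result. The Python below is a minimal illustration under stated assumptions, not the authors' algorithm. It assumes MNI refers to the commonly used minimum image-based support (the smallest number of distinct graph nodes that any single pattern variable maps to across all embeddings). The names `mni_support`, `jaccard_distance`, and `diversified_top_k` are hypothetical, the Jaccard distance over edge-label sets is a stand-in for the paper's unspecified distance function, and the greedy selection omits the meta-index, bounds, and join-based computation described in the abstract.

```python
from typing import Callable, Dict, FrozenSet, Hashable, List, Sequence, Tuple

# An embedding maps each pattern variable to one graph node (assumed representation).
Embedding = Dict[str, Hashable]
# A pattern is reduced to its set of edge labels (a simplification for illustration).
Pattern = FrozenSet[str]


def mni_support(embeddings: Sequence[Embedding]) -> int:
    """Minimum image-based support: for each pattern variable, count the distinct
    graph nodes it is mapped to, then take the minimum over all variables."""
    if not embeddings:
        return 0
    images: Dict[str, set] = {}
    for emb in embeddings:
        for var, node in emb.items():
            images.setdefault(var, set()).add(node)
    return min(len(nodes) for nodes in images.values())


def jaccard_distance(a: Pattern, b: Pattern) -> float:
    """Placeholder dissimilarity (NOT the paper's distance function)."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def diversified_top_k(
    candidates: Sequence[Tuple[Pattern, int]],
    k: int,
    threshold: float,
    distance: Callable[[Pattern, Pattern], float] = jaccard_distance,
) -> List[Tuple[Pattern, int]]:
    """Greedy heuristic: scan candidates by descending frequency and keep a
    pattern only if its distance to every kept pattern exceeds the threshold."""
    chosen: List[Tuple[Pattern, int]] = []
    for pattern, freq in sorted(candidates, key=lambda c: c[1], reverse=True):
        if len(chosen) == k:
            break
        if all(distance(pattern, kept) > threshold for kept, _ in chosen):
            chosen.append((pattern, freq))
    return chosen


if __name__ == "__main__":
    embeddings = [
        {"x": "a", "y": "p"},
        {"x": "b", "y": "p"},
        {"x": "c", "y": "q"},
    ]
    print(mni_support(embeddings))

    p1 = frozenset({"bornIn", "worksAt"})
    p2 = frozenset({"bornIn", "worksAt", "livesIn"})
    p3 = frozenset({"directed", "starredIn"})
    print(diversified_top_k([(p1, 10), (p2, 9), (p3, 5)], k=2, threshold=0.5))
```

In the usage example, variable `x` has three distinct images and `y` has two, so the score is bounded by `y`. The second candidate overlaps heavily with the first and is skipped, even though it is more frequent than the third.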