
What is Normal, What is Strange, and What is Missing in a Knowledge Graph

conference contribution
posted on 2023-11-29, 18:12 authored by Caleb Belth, X. Zheng, Jilles Vreeken, Danai Koutra
Knowledge graphs (KGs) store highly heterogeneous information about the world in the structure of a graph, and are useful for tasks such as question answering and reasoning. However, they often contain errors and are missing information. Vibrant research in KG refinement has worked to resolve these issues, tailoring techniques to either detect specific types of errors or complete a KG. In this work, we introduce a unified solution to KG characterization by formulating the problem as unsupervised KG summarization with a set of inductive, soft rules, which describe what is normal in a KG, and thus can be used to identify what is abnormal, whether it be strange or missing. Unlike first-order logic rules, our rules are labeled, rooted graphs, i.e., patterns that describe the expected neighborhood around a (seen or unseen) node, based on its type and information in the KG. Stepping away from the traditional support/confidence-based rule mining techniques, we propose KGist, Knowledge Graph Inductive SummarizaTion, which learns a summary of inductive rules that best compress the KG according to the Minimum Description Length principle, a formulation that we are the first to use in the context of KG rule mining. We apply our rules to three large KGs (NELL, DBpedia, and Yago) and to tasks such as compression, various types of error detection, and identification of incomplete information. We show that KGist outperforms task-specific, supervised and unsupervised baselines in error detection and incompleteness identification (identifying the location of up to 93% of missing entities, over 10% more than baselines), while also being efficient for large knowledge graphs.
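
To make the Minimum Description Length idea in the abstract concrete, the following is a minimal Python sketch under heavy simplifying assumptions. It is not the paper's encoding or algorithm: the toy graph, the rule shape `(root type, predicate, target type)`, the bit costs, and the function names (`matches`, `exceptions`, `description_length`, `greedy_summary`) are all hypothetical and chosen only for illustration. A rule is kept when the bits it saves on the edges it explains outweigh the bits spent on stating the rule and listing its exceptions; those exceptions are then candidates for missing information.

```python
from math import log2

# Toy KG (hypothetical data): triples plus one type per entity.
triples = {
    ("hamlet", "writtenBy", "shakespeare"),
    ("emma", "writtenBy", "austen"),
    ("ulysses", "writtenBy", "joyce"),
    ("dubliners", "publishedIn", "1914"),
}
types = {
    "hamlet": "Book", "emma": "Book", "ulysses": "Book", "dubliners": "Book",
    "shakespeare": "Author", "austen": "Author", "joyce": "Author",
    "1914": "Year",
}


def matches(rule, triple):
    """A rule (root type, predicate, target type) matches a triple of that shape."""
    root, pred, target = rule
    s, p, o = triple
    return types[s] == root and p == pred and types[o] == target


def exceptions(rule):
    """Entities of the rule's root type that lack the expected edge."""
    satisfied = {t[0] for t in triples if matches(rule, t)}
    return {n for n, ty in types.items() if ty == rule[0]} - satisfied


def description_length(rules):
    """Two-part cost in bits: L(rules) + L(graph | rules). Assumed toy encoding."""
    n_types = len(set(types.values()))
    n_preds = len({p for _, p, _ in triples})
    rule_bits = 2 * log2(n_types) + log2(n_preds)   # state one rule
    node_bits = log2(len(types))                    # name one entity
    raw_edge_bits = 2 * node_bits + log2(n_preds)   # spell out one triple
    total, explained = 0.0, set()
    for rule in rules:
        n_targets = sum(1 for ty in types.values() if ty == rule[2])
        hits = {t for t in triples if matches(rule, t)}
        # Explained edges only need their object picked among target-type nodes.
        total += rule_bits + len(hits) * log2(n_targets)
        total += len(exceptions(rule)) * node_bits
        explained |= hits
    return total + len(triples - explained) * raw_edge_bits


def greedy_summary():
    """Add the rule that shrinks the total cost most until nothing helps."""
    candidates = sorted({(types[s], p, types[o]) for s, p, o in triples})
    selected, best = [], description_length([])
    while True:
        scored = [(description_length(selected + [r]), r)
                  for r in candidates if r not in selected]
        if not scored or min(scored)[0] >= best:
            return selected
        best, rule = min(scored)
        selected.append(rule)


summary = greedy_summary()
for rule in summary:
    print(rule, "exceptions:", sorted(exceptions(rule)))
```

In this toy graph the rule "a Book has a writtenBy edge to an Author" is selected because it compresses three edges, and the one book without an author is reported as an exception. The rule "a Book has a publishedIn edge to a Year" is rejected because listing its three exceptions costs more than the single edge it explains.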


Preferred Citation

Caleb Belth, X. Zheng, Jilles Vreeken and Danai Koutra. What is Normal, What is Strange, and What is Missing in a Knowledge Graph. In: The Web Conference (WWW). 2020.

Primary Research Area

  • Empirical and Behavioral Security

Secondary Research Area

  • Trustworthy Information Processing

Name of Conference

The Web Conference (WWW)


Open Access Type

  • Unknown


@inproceedings{cispa_all_3038,
  title = "What is Normal, What is Strange, and What is Missing in a Knowledge Graph",
  author = "Belth, Caleb and Zheng, X. and Vreeken, Jilles and Koutra, Danai",
  booktitle = "{The Web Conference (WWW)}",
  year = "2020",
}
