Shapley Homology: Topological Analysis of Sample Influence for Neural Networks

Kaixuan Zhang, Qinglong Wang, Xue Liu, C. Lee Giles

Research output: Contribution to journal › Article

Abstract

Data samples collected for training machine learning models are typically assumed to be independent and identically distributed (i.i.d.). Recent research has demonstrated that this assumption can be problematic, as it oversimplifies the manifold of structured data. This observation has motivated different research areas, such as data poisoning, model improvement, and explanation of machine learning models. In this work, we study the influence of a sample on determining the intrinsic topological features of its underlying manifold. We propose the Shapley homology framework, which provides a quantitative metric for the influence of a sample on the homology of a simplicial complex. Our proposed framework consists of two main parts: homology analysis, in which we compute the Betti number of the target topological space, and Shapley value calculation, in which we distribute the topological features of a complex built from data points among the individual points. By interpreting the influence as a probability measure, we further define an entropy that reflects the complexity of the data manifold. Furthermore, we provide a preliminary discussion of the connection between Shapley homology and the Vapnik-Chervonenkis dimension. Empirical studies show that, when the zero-dimensional Shapley homology is applied to neighborhood graphs, samples with higher influence scores have a greater impact on the accuracy of neural networks that determine graph connectivity, and that regular grammars with higher entropy values are more difficult to learn.
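The two parts of the framework described above can be illustrated with a minimal sketch. This is not the authors' implementation; it is a toy example, assuming the characteristic function of the cooperative game is the zero-dimensional Betti number β₀ (the number of connected components) of an induced subgraph, with Shapley values computed by exact enumeration over subsets, which is only feasible for very small graphs:

```python
import math
from itertools import combinations

def betti0(points, edges):
    """Zero-dimensional Betti number (connected-component count)
    of the graph induced on the vertex set `points`, via union-find."""
    parent = {p: p for p in points}
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for u, v in edges:
        if u in parent and v in parent:
            ru, rv = find(u), find(v)
            if ru != rv:
                parent[ru] = rv
    return len({find(p) for p in parent})

def shapley_influence(vertices, edges):
    """Exact Shapley value of each vertex for v(S) = betti0(S),
    averaging marginal contributions over all subsets."""
    n = len(vertices)
    phi = {v: 0.0 for v in vertices}
    for v in vertices:
        others = [u for u in vertices if u != v]
        for k in range(n):
            for S in combinations(others, k):
                S = frozenset(S)
                weight = (math.factorial(k) * math.factorial(n - k - 1)
                          / math.factorial(n))
                gain = betti0(S | {v}, edges) - betti0(S, edges)
                phi[v] += weight * gain
    return phi

# Toy neighborhood graph: a path 0-1-2 plus an isolated vertex 3.
verts = [0, 1, 2, 3]
edges = [(0, 1), (1, 2)]
phi = shapley_influence(verts, edges)

# Interpret the (normalized) influence scores as a probability
# measure and compute its Shannon entropy.
total = sum(phi.values())
p = {v: phi[v] / total for v in verts}
entropy = -sum(pi * math.log2(pi) for pi in p.values() if pi > 0)
```

By Shapley efficiency, the influence scores sum to β₀ of the full graph (here 2), and the isolated vertex receives the largest score since adding it always creates a new component. In general β₀ is not monotone, so some Shapley values can be zero or negative; normalizing them into a probability measure, as the entropy definition requires, presumes the scores are nonnegative, which holds in this example.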

Original language: English (US)
Pages (from-to): 1355-1378
Number of pages: 24
Journal: Neural Computation
Volume: 32
Issue number: 7
State: Published - Jul 1, 2020

All Science Journal Classification (ASJC) codes

  • Arts and Humanities (miscellaneous)
  • Cognitive Neuroscience
