Independent informative subgraph mining for graph information retrieval

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

5 Citations (Scopus)

Abstract

In order to enable scalable querying of graph databases, intelligent selection of subgraphs to index is essential. An improved index can reduce response times for graph queries significantly. For a given subgraph query, graph candidates that may contain the subgraph are retrieved using the graph index, and subgraph isomorphism tests are then performed to prune out graphs that do not satisfy the query. However, since the space of all possible subgraphs of the whole set of graphs is prohibitively large, feature selection is required to identify a good subset of subgraph features for indexing. Thus, one of the key issues is: given the set of all possible subgraphs of the graph set, which subset of features is optimal, in the sense that the algorithm retrieves the smallest set of candidate graphs and reduces the number of subgraph isomorphism tests? We introduce a graph search method for subgraph queries based on subgraph frequencies. Then, we propose several novel feature selection criteria based on mutual information: Max-Precision, Max-Irredundant-Information, and Max-Information-Min-Redundancy. Finally, we show theoretically and empirically that our proposed methods retrieve a smaller candidate set than previous methods. For example, using the same number of features, our method improves the precision of the query candidate set by 4%-13% in comparison to previous methods. As a result, the response time of subgraph queries is improved correspondingly.
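
The filter-then-verify flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' method: it assumes every node carries a unique label, so that a subgraph isomorphism test collapses to an edge-set inclusion check, and it takes precision to mean the fraction of candidate graphs that actually contain the query. All names (`make_graph`, `build_index`, `search`, the toy database, and the two feature sets) are hypothetical and chosen only for illustration.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

Edge = Tuple[str, str]
Graph = FrozenSet[Edge]


def make_graph(*pairs: Tuple[str, str]) -> Graph:
    """Build an undirected graph as a set of edges with sorted endpoints."""
    return frozenset((a, b) if a <= b else (b, a) for a, b in pairs)


def contains(graph: Graph, pattern: Graph) -> bool:
    # ASSUMPTION: node labels are unique, so edge-set inclusion stands in
    # for a real (and much more expensive) subgraph isomorphism test.
    return pattern <= graph


def build_index(database: Dict[str, Graph], features: List[Graph]) -> Dict[Graph, Set[str]]:
    """Map each indexed feature to the ids of the graphs that contain it."""
    return {f: {gid for gid, g in database.items() if contains(g, f)} for f in features}


def search(query: Graph, index: Dict[Graph, Set[str]], database: Dict[str, Graph]):
    """Filter with the index, then verify each surviving candidate."""
    candidates = set(database)
    for feature, posting in index.items():
        if contains(query, feature):      # only features inside the query can prune
            candidates &= posting
    answers = {gid for gid in candidates if contains(database[gid], query)}
    precision = len(answers) / len(candidates) if candidates else 1.0
    return candidates, answers, precision


# Toy database (hypothetical data).
db = {
    "g1": make_graph(("A", "B"), ("B", "C"), ("C", "D")),
    "g2": make_graph(("A", "B"), ("B", "C")),
    "g3": make_graph(("A", "B"), ("C", "D")),
    "g4": make_graph(("B", "C"), ("C", "D")),
}
query = make_graph(("A", "B"), ("B", "C"))

# Two feature sets of the same size; only the choice of features differs.
weak = [make_graph(("A", "B")), make_graph(("C", "D"))]
strong = [make_graph(("A", "B")), make_graph(("B", "C"))]

for name, features in (("weak", weak), ("strong", strong)):
    cands, answers, precision = search(query, build_index(db, features), db)
    print(name, sorted(cands), sorted(answers), round(precision, 2))
```

In this toy setting both feature sets index two subgraphs, yet the second one leaves fewer candidates to verify, which is the effect the feature selection criteria in the abstract aim for. The mutual-information criteria themselves are not reproduced here.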

Original language: English (US)
Title of host publication: ACM 18th International Conference on Information and Knowledge Management, CIKM 2009
Pages: 563-572
Number of pages: 10
DOIs: https://doi.org/10.1145/1645953.1646026
State: Published - Dec 1 2009
Event: ACM 18th International Conference on Information and Knowledge Management, CIKM 2009 - Hong Kong, China
Duration: Nov 2 2009 to Nov 6 2009

Publication series

Name: International Conference on Information and Knowledge Management, Proceedings

Other

Other: ACM 18th International Conference on Information and Knowledge Management, CIKM 2009
Country: China
City: Hong Kong
Period: 11/2/09 to 11/6/09

Fingerprint

Graph
Information retrieval
Query
Feature selection
Isomorphism
Response time
Indexing
Selection criteria
Redundancy
Data base
Mutual information

All Science Journal Classification (ASJC) codes

  • Decision Sciences(all)
  • Business, Management and Accounting(all)

Cite this

Sun, B., Mitra, P., & Giles, C. L. (2009). Independent informative subgraph mining for graph information retrieval. In ACM 18th International Conference on Information and Knowledge Management, CIKM 2009 (pp. 563-572). (International Conference on Information and Knowledge Management, Proceedings). https://doi.org/10.1145/1645953.1646026
Sun, Bingjun ; Mitra, Prasenjit ; Giles, C. Lee. / Independent informative subgraph mining for graph information retrieval. ACM 18th International Conference on Information and Knowledge Management, CIKM 2009. 2009. pp. 563-572 (International Conference on Information and Knowledge Management, Proceedings).
@inproceedings{ca472dc2e5464c409ec40dc531b8f4f3,
title = "Independent informative subgraph mining for graph information retrieval",
author = "Bingjun Sun and Prasenjit Mitra and Giles, {C. Lee}",
year = "2009",
month = "12",
day = "1",
doi = "10.1145/1645953.1646026",
language = "English (US)",
isbn = "9781605585123",
series = "International Conference on Information and Knowledge Management, Proceedings",
pages = "563--572",
booktitle = "ACM 18th International Conference on Information and Knowledge Management, CIKM 2009",

}

Sun, B, Mitra, P & Giles, CL 2009, Independent informative subgraph mining for graph information retrieval. in ACM 18th International Conference on Information and Knowledge Management, CIKM 2009. International Conference on Information and Knowledge Management, Proceedings, pp. 563-572, ACM 18th International Conference on Information and Knowledge Management, CIKM 2009, Hong Kong, China, 11/2/09. https://doi.org/10.1145/1645953.1646026

Independent informative subgraph mining for graph information retrieval. / Sun, Bingjun; Mitra, Prasenjit; Giles, C. Lee.

ACM 18th International Conference on Information and Knowledge Management, CIKM 2009. 2009. p. 563-572 (International Conference on Information and Knowledge Management, Proceedings).

TY - GEN

T1 - Independent informative subgraph mining for graph information retrieval

AU - Sun, Bingjun

AU - Mitra, Prasenjit

AU - Giles, C. Lee

PY - 2009/12/1

Y1 - 2009/12/1

UR - http://www.scopus.com/inward/record.url?scp=74549174134&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=74549174134&partnerID=8YFLogxK

U2 - 10.1145/1645953.1646026

DO - 10.1145/1645953.1646026

M3 - Conference contribution

AN - SCOPUS:74549174134

SN - 9781605585123

T3 - International Conference on Information and Knowledge Management, Proceedings

SP - 563

EP - 572

BT - ACM 18th International Conference on Information and Knowledge Management, CIKM 2009

ER -

Sun B, Mitra P, Giles CL. Independent informative subgraph mining for graph information retrieval. In ACM 18th International Conference on Information and Knowledge Management, CIKM 2009. 2009. p. 563-572. (International Conference on Information and Knowledge Management, Proceedings). https://doi.org/10.1145/1645953.1646026