DSI: A method for indexing large graphs using distance set

Yubo Kou, Yukun Li, Xiaofeng Meng

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Recent years have witnessed a great increase in modeling data as large graphs in multiple domains, such as XML, the semantic web, and social networks. In these settings, researchers are interested in queries of the following form: given a large graph G and a query Q, report all the matches of Q in G. Since subgraph isomorphism checking has been proved to be NP-complete [1], it is infeasible to scan the whole large graph for answers, especially when the query is also large. Hence, the "filter-verification" approach is widely adopted. In this approach, researchers first index the neighborhood of each vertex in the large graph, then filter the vertices, and finally run a subgraph matching algorithm. Previous techniques mainly focus on efficient matching algorithms and pay little attention to indexing techniques. However, appropriate indexing techniques can improve the efficiency of query response by generating fewer candidates. In this paper we investigate indexing techniques on large graphs and propose an index structure, DSI (Distance Set Index), to capture the neighborhood of each vertex. With the distance set index, more vertices can be pruned, resulting in a much smaller search space. A subgraph matching algorithm is then performed in that search space. We have applied our index structure to real and synthetic datasets. Extensive experiments demonstrate the efficiency and effectiveness of our indexing technique.
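The filter-verification pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not define the actual DSI structure, so the index below is only an assumed stand-in: for each vertex it stores, for every distance d up to `max_dist`, the count of each label among vertices within distance d. All function names, the `max_dist` parameter, and the example graphs are hypothetical and not taken from the paper. The pruning rule relies on the fact that a match maps a query vertex's d-neighborhood injectively into the d-neighborhood of its image, so the data vertex must have at least as many vertices of each label within each distance.

```python
from collections import Counter, deque

def distance_profile(adj, labels, source, max_dist):
    """For d = 1..max_dist, count labels of vertices within distance d of source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:  # breadth-first search, cut off at max_dist
        u = queue.popleft()
        if dist[u] == max_dist:
            continue
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return [Counter(labels[v] for v, dv in dist.items() if 1 <= dv <= d)
            for d in range(1, max_dist + 1)]

def build_index(adj, labels, max_dist):
    """Assumed stand-in for a neighborhood index; not the paper's DSI."""
    return {v: distance_profile(adj, labels, v, max_dist) for v in adj}

def dominates(big, small):
    """True if every per-distance label count in small is covered by big."""
    return all(b[label] >= n
               for b, s in zip(big, small)
               for label, n in s.items())

def filter_candidates(g_labels, g_index, q_labels, q_index):
    """Filter step: keep data vertices whose label and profile fit the query vertex."""
    return {q: [v for v in g_index
                if g_labels[v] == q_labels[q] and dominates(g_index[v], q_index[q])]
            for q in q_index}

def verify(g_adj, q_adj, candidates):
    """Verification step: backtracking over injective, edge-preserving mappings."""
    order = sorted(q_adj, key=lambda q: len(candidates[q]))
    matches = []

    def extend(i, mapping, used):
        if i == len(order):
            matches.append(dict(mapping))
            return
        q = order[i]
        for v in candidates[q]:
            if v in used:
                continue
            # Every already-mapped query neighbor must be adjacent to v in G.
            if all(mapping[n] in g_adj[v] for n in q_adj[q] if n in mapping):
                mapping[q] = v
                used.add(v)
                extend(i + 1, mapping, used)
                del mapping[q]
                used.remove(v)

    extend(0, {}, set())
    return matches

def find_matches(g_adj, g_labels, q_adj, q_labels, max_dist=2):
    g_index = build_index(g_adj, g_labels, max_dist)
    q_index = build_index(q_adj, q_labels, max_dist)
    candidates = filter_candidates(g_labels, g_index, q_labels, q_index)
    return candidates, verify(g_adj, q_adj, candidates)

# Hypothetical example: undirected graphs as adjacency sets plus vertex labels.
g_adj = {1: {2}, 2: {1, 3}, 3: {2}, 4: {5}, 5: {4}}
g_labels = {1: "A", 2: "B", 3: "C", 4: "A", 5: "B"}
q_adj = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
q_labels = {"a": "A", "b": "B", "c": "C"}
candidates, matches = find_matches(g_adj, g_labels, q_adj, q_labels)
```

In this toy input, vertices 4 and 5 share labels with query vertices but have no C-labeled vertex nearby, so the filter discards them before the backtracking search runs. A larger `max_dist` prunes more at the cost of a bigger index, which is the kind of trade-off an indexing technique has to balance.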

Original language: English (US)
Title of host publication: Web-Age Information Management - 11th International Conference, WAIM 2010, Proceedings
Pages: 297-308
Number of pages: 12
DOIs: https://doi.org/10.1007/978-3-642-14246-8_30
State: Published - Aug 3 2010
Event: 11th International Conference on Web-Age Information Management, WAIM 2010 - Jiuzhaigou, China
Duration: Jul 15 2010 to Jul 17 2010

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 6184 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 11th International Conference on Web-Age Information Management, WAIM 2010
Country: China
City: Jiuzhaigou
Period: 7/15/10 to 7/17/10

Fingerprint

Graph Distance
Indexing
Graph in graph theory
Matching Algorithm
Subgraph
Query
Semantic Web
XML
Search Space
Data structures
Filter
Data Modeling
Vertex of a graph
Social Networks
Isomorphism
Efficient Algorithms
NP-complete problem
Experiments
Demonstrate
Experiment

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

Kou, Y., Li, Y., & Meng, X. (2010). DSI: A method for indexing large graphs using distance set. In Web-Age Information Management - 11th International Conference, WAIM 2010, Proceedings (pp. 297-308). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 6184 LNCS). https://doi.org/10.1007/978-3-642-14246-8_30
Kou, Yubo ; Li, Yukun ; Meng, Xiaofeng. / DSI : A method for indexing large graphs using distance set. Web-Age Information Management - 11th International Conference, WAIM 2010, Proceedings. 2010. pp. 297-308 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).
@inproceedings{3ac3c17fee6344a6ae0fa39668c903d0,
title = "DSI: A method for indexing large graphs using distance set",
author = "Yubo Kou and Yukun Li and Xiaofeng Meng",
year = "2010",
month = "8",
day = "3",
doi = "10.1007/978-3-642-14246-8_30",
language = "English (US)",
isbn = "3642142451",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
pages = "297--308",
booktitle = "Web-Age Information Management - 11th International Conference, WAIM 2010, Proceedings",

}

Kou, Y, Li, Y & Meng, X 2010, DSI: A method for indexing large graphs using distance set. in Web-Age Information Management - 11th International Conference, WAIM 2010, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 6184 LNCS, pp. 297-308, 11th International Conference on Web-Age Information Management, WAIM 2010, Jiuzhaigou, China, 7/15/10. https://doi.org/10.1007/978-3-642-14246-8_30

TY - GEN

T1 - DSI

T2 - A method for indexing large graphs using distance set

AU - Kou, Yubo

AU - Li, Yukun

AU - Meng, Xiaofeng

PY - 2010/8/3

Y1 - 2010/8/3

UR - http://www.scopus.com/inward/record.url?scp=77955024273&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=77955024273&partnerID=8YFLogxK

U2 - 10.1007/978-3-642-14246-8_30

DO - 10.1007/978-3-642-14246-8_30

M3 - Conference contribution

AN - SCOPUS:77955024273

SN - 3642142451

SN - 9783642142451

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 297

EP - 308

BT - Web-Age Information Management - 11th International Conference, WAIM 2010, Proceedings

ER -

Kou Y, Li Y, Meng X. DSI: A method for indexing large graphs using distance set. In Web-Age Information Management - 11th International Conference, WAIM 2010, Proceedings. 2010. p. 297-308. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). https://doi.org/10.1007/978-3-642-14246-8_30