Powerpoint template for scientific poster

Download Report

Transcript Powerpoint template for scientific poster

Multi-Constrained Graph Pattern Matching in
Large-Scale Contextual Social Graphs
Guanfeng Liu, Kai Zheng, Yan Wang, Mehmet A. Orgun, An Liu, Lei Zhao and Xiaofang Zhou
School of Computer Science and Technology, Soochow University, China
Advanced Data Analytics Lab, Soochow University, China
School of ITEE, The University of Queensland, Australia
Introduction
Graph Pattern Matching (GPM) plays a significant role in social
network analysis, which has been widely used in, for example,
experts finding, social community mining and social position detection.
Given a pattern graph and a data graph, a GPM algorithm finds those
subgraphs that match the query graph. However, the existing GPM
methods do not consider the multiple constraints on edges in a query
graph, which are commonly exist in various applications such as,
crowdsourcing travel, social network based e-commerce and study
group selection, etc.
Contributions:
a. We propose a new notion of Multiple-Constrained
Simulation (MCS).
b. We propose a concept called Strong Social Component
(SSC).
c. We propose a novel index structure and a graph
compression method for SSC with polynomial time
complexity.
d. We propose a Heuristic Algorithm for MC-GPM, called
HAMC .
Graph Isomorphism
Bounded Simulation
(Ullmann et al., 1976)
(Fan et al., 2010)
Strong Social Component (SSC)
Multiple Constrained Simulation
subsumes the classical NP1
Complete multi-constrained path selection problem (Korkmaz et al.,
INFOCOM’01), and thus is NP-Complete
HAMC
A subgraph is said to be socially strongly connected if each
vertex associated with a high role impact factor value in a
specific domain is connected with the edges associated with
intimate social relationships and strong social trust
relationships.
Graph Compression
• Join-based method (Fan et al., VLDB2010,ICDE2014,
VLDB2014) aims to find a maximal matching that contains all
matching subgraphs in a data graph.
• Exploration-based method (Sun et al., VLDB2012, Cheng et
al., ICDE2013) aims to quickly answer a GPM query
• The time complexity of HAMC is
Experiments
Index of SSC
The average number of answers
returned after visiting 1000 mapped
paths