TY - GEN

T1 - Towards a faster network-centric subgraph census

AU - Paredes, Pedro

AU - Ribeiro, Pedro

PY - 2013

Y1 - 2013

N2 - Determining the frequency of small subgraphs is an important computational task lying at the core of several graph mining methodologies, such as network motifs discovery or graphlet based measurements. In this paper we try to improve a class of algorithms available for this purpose, namely network-centric algorithms, which are based upon the enumeration of all sets of κ connected nodes. Past approaches would essentially delay isomorphism tests until they had a finalized set of κ nodes. In this paper we show how isomorphism testing can be done during the actual enumeration. We use a customized g-trie, a tree data structure, in order to encapsulate the topological information of the embedded subgraphs, identifying already known node permutations of the same subgraph type. With this we avoid redundancy and the need of an isomorphism test for each subgraph occurrence. We tested our algorithm, which we called FaSE, on a set of different real complex networks, both directed and undirected, showcasing that we indeed achieve significant speedups of at least one order of magnitude against past algorithms, paving the way for a faster network-centric approach.

AB - Determining the frequency of small subgraphs is an important computational task lying at the core of several graph mining methodologies, such as network motifs discovery or graphlet based measurements. In this paper we try to improve a class of algorithms available for this purpose, namely network-centric algorithms, which are based upon the enumeration of all sets of κ connected nodes. Past approaches would essentially delay isomorphism tests until they had a finalized set of κ nodes. In this paper we show how isomorphism testing can be done during the actual enumeration. We use a customized g-trie, a tree data structure, in order to encapsulate the topological information of the embedded subgraphs, identifying already known node permutations of the same subgraph type. With this we avoid redundancy and the need of an isomorphism test for each subgraph occurrence. We tested our algorithm, which we called FaSE, on a set of different real complex networks, both directed and undirected, showcasing that we indeed achieve significant speedups of at least one order of magnitude against past algorithms, paving the way for a faster network-centric approach.

KW - Complex networks

KW - G-tries

KW - Graph mining

KW - Graphlets

KW - Network motifs

KW - Subgraphs

UR - http://www.scopus.com/inward/record.url?scp=84893212925&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84893212925&partnerID=8YFLogxK

U2 - 10.1145/2492517.2492535

DO - 10.1145/2492517.2492535

M3 - Conference contribution

AN - SCOPUS:84893212925

SN - 9781450322409

T3 - Proceedings of the 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2013

SP - 264

EP - 271

BT - Proceedings of the 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2013

PB - Association for Computing Machinery

T2 - 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2013

Y2 - 25 August 2013 through 28 August 2013

ER -