## Abstract

Determining the frequency of small subgraphs is an important graph mining primitive. One major class of algorithms for this task is based upon the enumeration of all sets of k connected nodes. These are known as network-centric algorithms. FAst Subgraph Enumeration (FaSE) is a exact algorithm for subgraph counting that contrasted with its past approaches by performing the isomorphism tests while doing the enumeration, encapsulating the topological information in a g-trie and thus largely reducing the number of required isomorphism tests. Our goal with this paper is to expand this approach by providing an approximate algorithm, which we called Rand-FaSE. It uses an unbiased sampling estimator for the number of subgraphs of each type, allowing an user to trade some accuracy for even faster execution times. We tested our algorithm on a set of representative complex networks, comparing it with the exact alternative, FaSE. We also do an extensive analysis by studying its accuracy and speed gains against previous sampling approaches. With all of this, we believe FaSE and Rand-FaSE pave the way for faster network-centric census algorithms.

Original language | English (US) |
---|---|

Article number | 17 |

Pages (from-to) | 1-18 |

Number of pages | 18 |

Journal | Social Network Analysis and Mining |

Volume | 5 |

Issue number | 1 |

DOIs | |

State | Published - Jan 1 2015 |

Externally published | Yes |

## All Science Journal Classification (ASJC) codes

- Information Systems
- Communication
- Media Technology
- Human-Computer Interaction
- Computer Science Applications

## Keywords

- Complex networks
- G-tries
- Graph mining
- Graphlets
- Network motifs
- Subgraphs