TY - GEN
T1 - Detecting community structures in Hi-C genomic data
AU - Cabreros, Irineo
AU - Abbe, Emmanuel
AU - Tsirigos, Aristotelis
N1 - Publisher Copyright: © 2016 IEEE.
PY - 2016/4/26
Y1 - 2016/4/26
AB - Community detection (CD) algorithms are applied to Hi-C data to discover new communities of loci in the 3D conformation of human and mouse DNA. We find that CD has some distinct advantages over pre-existing methods: (1) it is capable of finding a variable number of communities, (2) it can detect communities of DNA loci either adjacent or distant in the 1D sequence, and (3) it allows us to obtain a principled value of k, the number of communities present. Forcing k = 2, our method recovers earlier findings of Lieberman-Aiden et al. (2009), but letting k be a parameter, our method obtains an optimal value of k∗ = 6, discovering new candidate communities. In addition to discovering large communities that partition entire chromosomes, we also show that CD can detect small-scale topologically associating domains (TADs) such as those found in Dixon et al. (2012). CD thus provides a natural and flexible statistical framework for understanding the folding structure of DNA at multiple scales in Hi-C data.
KW - Community Detection
KW - DNA Folding
KW - Hi-C
KW - Mixed-Membership Models
UR - http://www.scopus.com/inward/record.url?scp=84992332909&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84992332909&partnerID=8YFLogxK
U2 - 10.1109/CISS.2016.7460568
DO - 10.1109/CISS.2016.7460568
M3 - Conference contribution
AN - SCOPUS:84992332909
T3 - 2016 50th Annual Conference on Information Systems and Sciences, CISS 2016
SP - 584
EP - 589
BT - 2016 50th Annual Conference on Information Systems and Sciences, CISS 2016
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 50th Annual Conference on Information Systems and Sciences, CISS 2016
Y2 - 16 March 2016 through 18 March 2016
ER -