TY - GEN
T1 - On the duplication distance of binary strings
AU - Alon, Noga
AU - Bruck, Jehoshua
AU - Farnoud, Farzad
AU - Jain, Siddharth
N1 - Publisher Copyright:
© 2016 IEEE.
PY - 2016/8/10
Y1 - 2016/8/10
N2 - We study the tandem duplication distance between binary sequences and their roots. This distance is motivated by genomic tandem duplication mutations and counts the smallest number of tandem duplication events that are required to take one sequence to another. We consider both exact and approximate tandem duplications, the latter leading to a combined duplication/Hamming distance. The paper focuses on the maximum value of the duplication distance to the root. For exact duplication, denoting the maximum distance to the root of a sequence of length n by f(n), we prove that f(n) = Θ(n). For the case of approximate duplication, where a β-fraction of symbols may be duplicated incorrectly, we show using the Plotkin bound that the maximum distance has a sharp transition from linear to logarithmic in n at β = 1/2.
AB - We study the tandem duplication distance between binary sequences and their roots. This distance is motivated by genomic tandem duplication mutations and counts the smallest number of tandem duplication events that are required to take one sequence to another. We consider both exact and approximate tandem duplications, the latter leading to a combined duplication/Hamming distance. The paper focuses on the maximum value of the duplication distance to the root. For exact duplication, denoting the maximum distance to the root of a sequence of length n by f(n), we prove that f(n) = Θ(n). For the case of approximate duplication, where a β-fraction of symbols may be duplicated incorrectly, we show using the Plotkin bound that the maximum distance has a sharp transition from linear to logarithmic in n at β = 1/2.
UR - http://www.scopus.com/inward/record.url?scp=84985916609&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84985916609&partnerID=8YFLogxK
U2 - 10.1109/ISIT.2016.7541301
DO - 10.1109/ISIT.2016.7541301
M3 - Conference contribution
AN - SCOPUS:84985916609
T3 - IEEE International Symposium on Information Theory - Proceedings
SP - 260
EP - 264
BT - Proceedings - ISIT 2016; 2016 IEEE International Symposium on Information Theory
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2016 IEEE International Symposium on Information Theory, ISIT 2016
Y2 - 10 July 2016 through 15 July 2016
ER -