## Abstract

The deletion distance between two binary words u, v ∈ {0, 1}^{n} is the smallest k such that u and v share a common subsequence of length n−k. A set C of binary words of length n is called a k-deletion code if every pair of distinct words in C has deletion distance greater than k. In 1965, Levenshtein initiated the study of deletion codes by showing that, for k ≥ 1 fixed and n going to infinity, a k-deletion code C ⊆ {0, 1}^{n} of maximum size satisfies Ω_k(2^{n}/n^{2k}) ≤ |C| ≤ O_k(2^{n}/n^{k}). We make the first asymptotic improvement to these bounds by showing that there exist k-deletion codes with size at least Ω_k(2^{n} log n/n^{2k}). Our proof is inspired by Jiang and Vardy’s improvement to the classical Gilbert–Varshamov bounds. We also establish several related results on the number of longest common subsequences and shortest common supersequences of a pair of words with given length and deletion distance.
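The two definitions in the abstract can be made concrete with a small sketch. The function names `lcs_length`, `deletion_distance`, and `is_k_deletion_code` are illustrative choices, not taken from the paper, and the quadratic dynamic program is only the textbook way to compute a longest common subsequence; it plays no role in the paper's probabilistic argument.

```python
from itertools import combinations


def lcs_length(u: str, v: str) -> int:
    """Length of a longest common subsequence of u and v (textbook DP)."""
    # prev[j] holds the LCS length of the processed prefix of u and v[:j].
    prev = [0] * (len(v) + 1)
    for a in u:
        curr = [0]
        for j, b in enumerate(v, start=1):
            if a == b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def deletion_distance(u: str, v: str) -> int:
    """Smallest k such that u and v share a common subsequence of length n - k."""
    if len(u) != len(v):
        raise ValueError("both words must have the same length n")
    return len(u) - lcs_length(u, v)


def is_k_deletion_code(code: list[str], k: int) -> bool:
    """True if every pair of distinct words has deletion distance greater than k."""
    return all(deletion_distance(u, v) > k for u, v in combinations(set(code), 2))
```

For instance, `0101` and `1010` share the common subsequence `010`, so their deletion distance is 1 and the pair is not a 1-deletion code, whereas `000` and `111` share no symbol and form a 2-deletion code of length 3.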

Original language | English (US) |
---|---|
Pages (from-to) | 125-130 |
Number of pages | 6 |
Journal | IEEE Transactions on Information Theory |
Volume | 70 |
Issue number | 1 |
State | Published - Jan 1 2024 |

## All Science Journal Classification (ASJC) codes

- Information Systems
- Library and Information Sciences
- Computer Science Applications

## Keywords

- Deletion codes
- Longest common subsequence
- Probabilistic combinatorics