Some database systems that use a form of B-tree for the underlying data structure do not do rebalancing on deletion. This means that a bad sequence of deletions can create a very unbalanced tree. Yet such databases perform well in practice. Avoidance of rebalancing on deletion has been justified empirically and by averagecase analysis, but to our knowledge, no worst-case analysis has been done. We do such an analysis. We show that the tree height remains logarithmic in the number of insertions, independent of the number of deletions. Furthermore, the amortized time for an insertion or deletion, excluding the search time, is O(1), and nodes are modified by insertions and deletions with a frequency that is exponentially small in their height. The latter results do not hold for standard B-trees. By adding periodic rebuilding of the tree, we obtain a data structure that is theoretically superior to standard B-trees in many ways. Our results suggest that rebalancing on deletion not only is unnecessary but may be harmful.
All Science Journal Classification (ASJC) codes
- Information Systems