Low-Complexity Interactive Algorithms for Synchronization from Deletions, Insertions, and Substitutions

Vasuki Narasimha Swamy, Kannan Ramchandran and Ramji Venkataramanan

EECS Department
University of California, Berkeley
Technical Report No. UCB/EECS-2015-227
December 2, 2015

http://www2.eecs.berkeley.edu/Pubs/TechRpts/2015/EECS-2015-227.pdf

Consider two remote nodes having binary sequences X and Y , respectively. Y is an edited version of X, where the editing involves random deletions, insertions, and substitutions, possibly in bursts. The goal is for the node with Y to reconstruct X with minimal exchange of information over a noiseless link. The communication is measured in terms of both the total number of bits exchanged and the number of interactive rounds of communication. This report focuses on the setting where the number of edits is o( n ), where n is the length of log n X. We first describe an interactive synchronization algorithm with near-optimal communication rate and average computational complexity that is linear in n which we call VTSync algorithm. This algorithm uses interaction to efficiently split the source sequence into substrings containing exactly one deletion or insertion. Each of these substrings is then synchronized using an optimal one-way synchronization code based on the single-deletion correcting channel codes of Varshamov and Tenengolts (VT codes). We then build on VTSync algorithm in three different ways. First, it is modified to work with a single round of interaction. The reduction in the number of rounds comes at the expense of higher communication, which is quantified. We then look at an extension to the practically important case where the insertions and deletions may occur in (potentially large) bursts. Finally, we show how to synchronize the sources to within a target Hamming distance. This feature can be used to differentiate between substitution and indel edits. In addition to theoretical performance bounds, we provide several validating simulation results for the proposed algorithms.

Advisor: Anant Sahai


BibTeX citation:

@mastersthesis{Narasimha Swamy:EECS-2015-227,
    Author = {Narasimha Swamy, Vasuki and Ramchandran, Kannan and Venkataramanan, Ramji},
    Title = {Low-Complexity Interactive Algorithms for Synchronization from Deletions, Insertions, and Substitutions},
    School = {EECS Department, University of California, Berkeley},
    Year = {2015},
    Month = {Dec},
    URL = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2015/EECS-2015-227.html},
    Number = {UCB/EECS-2015-227},
    Abstract = {Consider two remote nodes having binary sequences X and Y , respectively. Y is an edited version of X, where the editing involves random deletions, insertions, and substitutions, possibly in bursts. The goal is for the node with Y to reconstruct X with minimal exchange of information over a noiseless link. The communication is measured in terms of both the total number of bits exchanged and the number of interactive rounds of communication.
This report focuses on the setting where the number of edits is o( n ), where n is the length of log n
X. We first describe an interactive synchronization algorithm with near-optimal communication rate and average computational complexity that is linear in n which we call VTSync algorithm. This algorithm uses interaction to efficiently split the source sequence into substrings containing exactly one deletion or insertion. Each of these substrings is then synchronized using an optimal one-way synchronization code based on the single-deletion correcting channel codes of Varshamov and Tenengolts (VT codes).
We then build on VTSync algorithm in three different ways. First, it is modified to work with a single round of interaction. The reduction in the number of rounds comes at the expense of higher communication, which is quantified. We then look at an extension to the practically important case where the insertions and deletions may occur in (potentially large) bursts. Finally, we show how to synchronize the sources to within a target Hamming distance. This feature can be used to differentiate between substitution and indel edits. In addition to theoretical performance bounds, we provide several validating simulation results for the proposed algorithms.}
}

EndNote citation:

%0 Thesis
%A Narasimha Swamy, Vasuki
%A Ramchandran, Kannan
%A Venkataramanan, Ramji
%T Low-Complexity Interactive Algorithms for Synchronization from Deletions, Insertions, and Substitutions
%I EECS Department, University of California, Berkeley
%D 2015
%8 December 2
%@ UCB/EECS-2015-227
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2015/EECS-2015-227.html
%F Narasimha Swamy:EECS-2015-227