We revisit the Double Digest problem, which occurs in sequencing of large DNA strings and consists of reconstructing the relative positions of cut sites from two different enzymes: we first show that Double Digest is strongly NP-complete, improving upon previous results that only showed weak NP-completeness. Even the (experimentally more meaningful) variation in which we disallow coincident cut sites turns out to be strongly NP-complete. In a second part, we model errors in data as they occur in real-life experiments: we propose several optimization variations of Double Digest that model partial cleavage errors. We then show APX-completeness for most of these variations. In a third part, we investigate these variations with the additional restriction that conincident cut sites are disallowed, and we show that it is NP-hard to even find feasible solutions in this case, thus making it impossible to guarantee any approximation ratio at all.
|Title of host publication||Proceedings of the 9th International Computing and Combinatorics Conference (COCOON'2003, Big Sky MT, USA, July 25-28, 2003)|
|Editors||T. Warnow, B. Zhu|
|Place of Publication||Berlin|
|Publication status||Published - 2003|
|Name||Lecture Notes in Computer Science|
Cieliebak, M., Eidenbenz, S., & Woeginger, G. J. (2003). Double digest revisited : Complexity and approximability in the presence of noisy data. In T. Warnow, & B. Zhu (Eds.), Proceedings of the 9th International Computing and Combinatorics Conference (COCOON'2003, Big Sky MT, USA, July 25-28, 2003) (pp. 519-527). (Lecture Notes in Computer Science; Vol. 2697). Berlin: Springer. https://doi.org/10.1007/3-540-45071-8_52