Reconstructing Circular Order from Inaccurate Adjacency Information
Applications in NMR Data Interpretation
Ming-Yang Kao

Problem Description
[Figure: the pairs (160,520), (540,160), (520,220), (190,540), (220,480), (480,190)]

Problem Description
Given: the pairs (220,480), (160,520), (480,190), (540,160), (520,220), (190,540)
Find: the correct order

Introduction
• Nuclear Magnetic Resonance (NMR)
  – Use a strong magnetic field to align nuclei (isotopes).
  – When the applied radiation induces a spin transition, the nuclei are said to be in resonance with that radiation.

NMR Measurement
• Chemical Shift
  – Measured in ppm
  – Electrons in the molecule have small magnetic fields
  – When a magnetic field is applied, electrons tend to oppose the applied field
• NMR Spectrum

Determining Protein Structure Using NMR
1. NMR spectral data generation
2. Peak picking
3. Peak assignment
4. Structural restraint extraction
5. Structure calculation

NMR Data Interpretation
• Peak Assignment
  – Map resonance peaks from different NMR spectra to the same residue
  – Identify adjacency relationships
  – Assign the segments to the protein sequence
• Currently done manually
• Bottleneck for high-throughput structure determination

Peak Assignment
• Two kinds of information available
  – Distribution of spin systems for different amino acids
  – The adjacency information between spin systems (our focus)

Problem Description (Input)
[Figure: the pairs (a1,b1), ..., (a6,b6), with the values a1, ..., a6 and b1, ..., b6 marked]

Problem Description (Output)
[Figure: the pairs being placed in order, starting (a1,b1), (a5,b5), (a3,b3), ?]
[Figure: the six pairs, with their values ordered as follows]
  b5 ≤ b1 ≤ b6 ≤ b2 ≤ b3 ≤ b4
  a1 ≤ a3 ≤ a6 ≤ a4 ≤ a2 ≤ a5

Equivalent Problem Description
[Figure: the same six pairs, with the marked values now labeled u1, ..., u6 and v1, ..., v6]

Cyclic Augmentation
A matching M is called a cyclic augmentation if H ∪ M forms a Hamiltonian cycle.
Not every matching forms a cycle.

Cost of an edge in M
• Endpoints 270 and 200: the cost of this edge is 70
• Endpoints 100 and 1200: the cost of this edge is 1100

Minimum Bipartite Cyclic Augmentation
Input:
  U = {u1, u2, …, un}
  V = {v1, v2, …, vn}
  H: a perfect matching between U and V
Output: a perfect matching M such that
  1. H ∪ M forms a cycle
  2. ∑_{(u,v) ∈ M} |u − v| is minimized (the sum of the costs of the edges)

Bottleneck Bipartite Cyclic Augmentation
Input:
  U = {u1, u2, …, un}
  V = {v1, v2, …, vn}
  H: a perfect matching between U and V
Output: a perfect matching M such that
  1. H ∪ M forms a cycle
  2. max_{(u,v) ∈ M} |u − v| is minimized (the cost of the most expensive edge)

Outline
• MD: the minimum cost matching
• We will transform MD into an optimal-cost matching using exchange operations
• Some properties of an optimal matching prune the space of exchanges required
• Exchange graph
• Optimal matching: MST in the exchange graph

MD: the minimum cost matching
The minimum cost matching may not be a cyclic augmentation.

Exchanges
Exchanges between different cycles merge them.

Cost of an Exchange
[Figure: an exchange, with a distance labeled x]
Cost of the exchange is 2x.

Transform MD into a minimum cost cyclic augmentation using exchange operations.
Which exchanges will yield the optimal cyclic augmentation?

Clusters
[Figure: clusters labeled l1, ..., l8]

Exchange Graph
[Figure: clusters l1, ..., l8 and a graph with labels 12, 23, 34, 45, 56, 67, 78]
Nodes ≡ Cycles in MD
Edges ≡ Adjacent Clusters in MD
Weight on Edges ≡ Cost of corresponding Exchange

Solution
Exchanges corresponding to the Minimum Spanning Tree on the Exchange Graph yield a minimum cost cyclic augmentation.

Results
• Minimum Bipartite Cyclic Augmentation: Ω(n log n)
• Bottleneck Bipartite Cyclic Augmentation: 3-approximation algorithm

The End
Thank You
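
Supplementary sketch
The definitions on the Cyclic Augmentation and Minimum Bipartite Cyclic Augmentation slides can be checked on small inputs with a brute-force search. The Python below is only an illustration, not the exchange-graph/MST algorithm of the talk. All function names are hypothetical. It assumes that matchings are stored as index lists (h[i] is the V index matched to U index i), that the first coordinates of the pairs form U and the second coordinates form V, and that sorting both sides and pairing them rank by rank gives a minimum cost matching under the |u − v| cost (a standard fact for points on a line, taken here to play the role of MD).

```python
from itertools import permutations


def is_cyclic_augmentation(h, m):
    """Return True if H ∪ M is one cycle through all 2n points.

    h[i] and m[i] are the V indices matched to U index i by H and M.
    """
    n = len(h)
    h_inv = [0] * n
    for u, v in enumerate(h):
        h_inv[v] = u
    # Alternate an M edge (u -> v) with an H edge (v -> next u) until
    # the walk returns to its starting U point.
    steps, u = 0, 0
    while True:
        steps += 1
        u = h_inv[m[u]]
        if u == 0:
            break
    return steps == n


def total_cost(U, V, m):
    """Sum of |u - v| over the edges of M."""
    return sum(abs(U[i] - V[m[i]]) for i in range(len(U)))


def bottleneck_cost(U, V, m):
    """Cost of the most expensive edge of M."""
    return max(abs(U[i] - V[m[i]]) for i in range(len(U)))


def sorted_matching(U, V):
    """Match the k-th smallest U value to the k-th smallest V value."""
    u_order = sorted(range(len(U)), key=U.__getitem__)
    v_order = sorted(range(len(V)), key=V.__getitem__)
    m = [0] * len(U)
    for ui, vi in zip(u_order, v_order):
        m[ui] = vi
    return m


def brute_force_min_cyclic_augmentation(U, V, h):
    """Try every perfect matching; only practical for very small n."""
    best = None
    for m in permutations(range(len(U))):
        if is_cyclic_augmentation(h, m):
            cost = total_cost(U, V, m)
            if best is None or cost < best[0]:
                best = (cost, m)
    return best


# Illustrative data: the six pairs from the opening slide.
pairs = [(160, 520), (540, 160), (520, 220), (190, 540), (220, 480), (480, 190)]
U = [a for a, _ in pairs]
V = [b for _, b in pairs]
H = list(range(len(pairs)))  # H joins the two coordinates of each pair

print(brute_force_min_cyclic_augmentation(U, V, H))
print(is_cyclic_augmentation(H, sorted_matching(U, V)))
```

On this data every U value has an exactly equal V value, so the sorted matching already closes a single cycle at cost 0. A two-pair input such as U = [1, 10], V = [2, 11] shows the other case: the sorted matching coincides with H, so it is not a cyclic augmentation, and the only cyclic augmentation costs 18.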