A New Algorithm for Sampling Closed Equilateral Random Walks Jason Cantarella, Clayton Shonkwiler, and Erica Uehara University of Georgia, Colorado State University, and Ochanomizu University Applied Math Seminar November 13, 2014 Random Walks (and Polymer Physics) Physics Question What is the average shape of a polymer in solution or in melt? Protonated P2VP Roiter/Minko Clarkson University Plasmid DNA Alonso-Sarduy, Dietler Lab EPF Lausanne Random Walks Let Arm(n) be the moduli space of random walks in R3 consisting of n unit-length steps up to translation. Then Arm(n) ∼ = S 2 (1) × . . . × S 2 (1). {z } | n Random Walks Let Arm(n) be the moduli space of random walks in R3 consisting of n unit-length steps up to translation. Then Arm(n) ∼ = S 2 (1) × . . . × S 2 (1). {z } | n Let Pol(n) ⊂ Arm(n) be the submanifold of closed random walks (or random polygons); i.e., those walks which satisfy n X i=1 ~ei = ~0. A Random Walk with 3,500 Steps A Closed Random Walk with 3,500 Steps Main Problem Main Question How can we sample closed equilateral random walks according to the codimension-3 Hausdorff measure on Pol(n) ⊂ Arm(n)? Motivating Question How do we understand Pol(n)? What is this space? Point of Talk New direct sampling algorithms backed by symplectic geometry. Sample ensembles are guaranteed to be perfect. O(n3 ) complexity, but reasonably fast in practice. (Incomplete?) History of Sampling Algorithms • Markov Chain Algorithms • crankshaft (Vologoskii 1979, Klenin 1988) • polygonal fold (Millett 1994) • Direct Sampling Algorithms • triangle method (Moore 2004) • generalized hedgehog method (Varela 2009) • sinc integral method (Moore 2005, Diao 2011) (Incomplete?) History of Sampling Algorithms • Markov Chain Algorithms • crankshaft (Vologoskii et al. 1979, Klenin et al. 1988) • convergence to correct distribution unproved • polygonal fold (Millett 1994) • convergence to correct distribution unproved • Direct Sampling Algorithms • triangle method (Moore et al. 2004) • samples a subset of closed polygons • generalized hedgehog method (Varela et al. 2009) • unproved whether this is correct distribution • sinc integral method (Moore et al. 2005, Diao et al. 2011) • requires sampling complicated 1-d polynomial densities Action-Angle Coordinates Definition A triangulation T of the n-gon has n − 3 nonintersecting chords. The lengths of these chords obey triangle inequalities, so they lie in a convex polytope in Rn−3 called the moment polytope P. Definition If T n−3 = (S 1 )n−3 , there are action-angle coordinates: α : P × T n−3 → Pol(n)/ SO(3) 14 ✓2 d1 d2 ✓1 c ~1) using the action-angle map. FIG. 2: This figure shows how to construct an equilateral pentagon in Pol(5; First, we pick a point in the moment polytope shown in Figure 3 at center. We have now specified diagonals d and d of the pentagon, so we may build the three triangles in the triangulation from their side lengths, Main Theorem Theorem (with Cantarella, arXiv:1310.5924) α pushes the standard probability measure on P × T n−3 forward to the correct probability measure on Pol(n)/ SO(3). Corollary Sampling P and T n−3 independently is a perfect sampling algorithm for Pol(n). Main Theorem Theorem (with Cantarella, arXiv:1310.5924) α pushes the standard probability measure on P × T n−3 forward to the correct probability measure on Pol(n)/ SO(3). Ingredients of the Proof. Kapovich–Millson toric symplectic structure on polygon space + Duistermaat–Heckman theorem + Hitchin’s theorem on compatibility of Riemannian and symplectic volume on symplectic reductions of Kähler manifolds + Howard–Manon–Millson analysis of polygon space. Corollary Sampling P and T n−3 independently is a perfect sampling algorithm for Pol(n). Understanding the Triangulation Polytope If the chord lengths are di then the triangle inequalities are 0 ≤ d1 ≤ 2 1 ≤ di + di+1 |di − di+1 | ≤ 1 0 ≤ dn−3 ≤ 2 and the moment polytope is 2 d1 1 d2 0 0 1 2 Understanding the Triangulation Polytope If the chord lengths are di then the triangle inequalities are 0 ≤ d1 ≤ 2 1 ≤ di + di+1 |di − di+1 | ≤ 1 0 ≤ dn−3 ≤ 2 and the moment polytope is d2 ≤ 2 2 d1 d2 ≤ d1 + 1 d1 ≤ 2 1 d2 d1 + d2 ≥ 1 0 0 d1 ≤ d2 + 1 1 2 Understanding the Triangulation Polytope If the chord lengths are di then the triangle inequalities are 0 ≤ d1 ≤ 2 1 ≤ di + di+1 |di − di+1 | ≤ 1 0 ≤ dn−3 ≤ 2 and the moment polytope is (2, 3, 2) d1 d2 d3 (0, 0, 0) (2, 1, 0) Hit-and-Run Theorem For any convex polytope P, the hit-and-run Markov chain is uniformly ergodic with respect to Lebesgue measure on P. Hit-and-Run Theorem For any convex polytope P, the hit-and-run Markov chain is uniformly ergodic with respect to Lebesgue measure on P. Hit-and-Run Theorem For any convex polytope P, the hit-and-run Markov chain is uniformly ergodic with respect to Lebesgue measure on P. Hit-and-Run Theorem For any convex polytope P, the hit-and-run Markov chain is uniformly ergodic with respect to Lebesgue measure on P. Hit-and-Run Theorem For any convex polytope P, the hit-and-run Markov chain is uniformly ergodic with respect to Lebesgue measure on P. Hit-and-Run Theorem For any convex polytope P, the hit-and-run Markov chain is uniformly ergodic with respect to Lebesgue measure on P. A Markov Chain for Closed Random Walks Definition (TSMCMC(β)) If xk = (~pk , θ~k ) ∈ P × T n−3 , define xk +1 by: • With probability β, update ~ pk by a hit-and-run step on P. ~k with a new uniformly • With probability 1 − β, replace θ sampled point in T n−3 . At each step, construct the corresponding closed random walk α(xk ) using action-angle coordinates. A Markov Chain for Closed Random Walks Definition (TSMCMC(β)) If xk = (~pk , θ~k ) ∈ P × T n−3 , define xk +1 by: • With probability β, update ~ pk by a hit-and-run step on P. ~k with a new uniformly • With probability 1 − β, replace θ sampled point in T n−3 . At each step, construct the corresponding closed random walk α(xk ) using action-angle coordinates. Proposition (with Cantarella) TSMCMC(β) is uniformly ergodic with respect to the standard probability measure on Pol(n)/SO(3). A change of coordinates If we introduce a fake chordlength d0 = 1, and make the linear transformation ai = di − di+1 , for 0 ≤ i ≤ n − 4, an−3 = dn−3 − d0 then our inequalities 0 ≤ d1 ≤ 2 1 ≤ di + di+1 |di − di+1 | ≤ 1 0 ≤ dn−3 ≤ 2 become −1 ≤ ai ≤ 1, | {z X |di −di+1 |≤1 ai = 0, } 2 | i−1 X j=0 aj + ai ≤ 1 {z di +di+1 ≥1 } A change of coordinates If we introduce a fake chordlength d0 = 1, and make the linear transformation ai = di − di+1 , for 0 ≤ i ≤ n − 4, an−3 = dn−3 − d0 then our inequalities 0 ≤ d1 ≤ 2 1 ≤ di + di+1 |di − di+1 | ≤ 1 0 ≤ dn−3 ≤ 2 become X −1 ≤ ai ≤ 1, ai = 0, | {z } easy conditions 2 i−1 X j=0 aj + ai ≤ 1 | {z } hard conditions Basic Idea Definition The n-dimensional cross-polytope C is the slice of the hypercube [−1, 1]n+1 by the plane x1 + · · · + xn+1 = 0. 0 → 1 2 2 2 1 1 0 0 0 1 2 Idea Sample points in the cross polytope, which all obey the “easy conditions”, and reject any samples which fail to obey the “hard conditions”. Relative volumes Theorem (Marichal-Mossinghoff) The volume of the projection of the (n − 3)-dimensional cross-polytope C is Pb n−2 2 c j=0 n−2 j (−1)j (n − 2j − 2)n−3 (n − 3)! Theorem (Khoi, Takakura, Mandini) The volume of the (n − 3)-dimensional moment polytope for n-edge equilateral polygons P is − Pb n2 c j=0 (−1)j (n − 2j)n−3 2(n − 3)! n j Runtime of algorithm depends on acceptance ratio Acceptance ratio = bounded below by Vol(P) Vol(C) 1 n. is conjectured ∼ 6 n+2 . It is certainly 0.14 0.12 0.10 Vol(P) Vol(C) 0.08 graph of 0.06 6 n+2 (in red) 0.04 0.02 100 200 300 400 number of edges in polygon 500 Sampling the Cross Polytope Definition The hypersimplex ∆k ,n is the slice of the cube [0, 1]n with Pn Pn−1 n−1 with k − 1 ≤ i=1 xi = k or the slab of [0, 1] i=1 xi ≤ k . Theorem (Stanley) There is a unimodular triangulation1 of ∆k ,n in [0, 1]n−1 indexed by permutations of (1, . . . , n − 1) with k − 1 descents. ψ −1 −→ Standard triangulation 1 Stanley triangulation a decomposition into disjoint simplices of equal volume Sampling the Cross Polytope Definition The hypersimplex ∆k ,n is the slice of the cube [0, 1]n with Pn Pn−1 n−1 with k − 1 ≤ i=1 xi = k or the slab of [0, 1] i=1 xi ≤ k . Theorem (Stanley) There is a unimodular triangulation1 of ∆k ,n in [0, 1]n−1 indexed by permutations of (1, . . . , n − 1) with k − 1 descents. ψ −1 −→ Standard triangulation 1 Stanley triangulation a decomposition into disjoint simplices of equal volume Sampling the Cross Polytope Definition The hypersimplex ∆k ,n is the slice of the cube [0, 1]n with Pn Pn−1 n−1 with k − 1 ≤ i=1 xi = k or the slab of [0, 1] i=1 xi ≤ k . Theorem (Stanley) There is a unimodular triangulation1 of ∆k ,n in [0, 1]n−1 indexed by permutations of (1, . . . , n − 1) with k − 1 descents. ψ −1 −→ Standard triangulation 1 Stanley triangulation a decomposition into disjoint simplices of equal volume Sampling the Cross Polytope Definition The hypersimplex ∆k ,n is the slice of the cube [0, 1]n with Pn Pn−1 n−1 with k − 1 ≤ i=1 xi = k or the slab of [0, 1] i=1 xi ≤ k . Theorem (Stanley) There is a unimodular triangulation1 of ∆k ,n in [0, 1]n−1 indexed by permutations of (1, . . . , n − 1) with k − 1 descents. ψ −1 −→ Standard triangulation 1 Stanley triangulation a decomposition into disjoint simplices of equal volume Sampling the Cross Polytope Definition The hypersimplex ∆k ,n is the slice of the cube [0, 1]n with Pn Pn−1 n−1 with k − 1 ≤ i=1 xi = k or the slab of [0, 1] i=1 xi ≤ k . Theorem (Stanley) There is a unimodular triangulation1 of ∆k ,n in [0, 1]n−1 indexed by permutations of (1, . . . , n − 1) with k − 1 descents. ψ −1 −→ Standard triangulation 1 Stanley triangulation a decomposition into disjoint simplices of equal volume Sampling the Cross Polytope Definition The hypersimplex ∆k ,n is the slice of the cube [0, 1]n with Pn Pn−1 n−1 with k − 1 ≤ i=1 xi = k or the slab of [0, 1] i=1 xi ≤ k . Theorem (Stanley) There is a unimodular triangulation1 of ∆k ,n in [0, 1]n−1 indexed by permutations of (1, . . . , n − 1) with k − 1 descents. ψ −1 −→ Standard triangulation 1 Stanley triangulation a decomposition into disjoint simplices of equal volume Parity Issues Recall that we’re interested in the cross polytope determined by a0 + a1 + . . . + an−3 = 0 in the cube [−1, 1]n−2 . After translation and scaling, this corresponds to the slice x0 + x1 + . . . + xn−3 = n−2 2 in the cube [0, 1]n−2 . For even n this is just the hypersimplex ∆(n−2)/2,n−2 and Stanley’s triangulation applies. For odd n this is not a hypersimplex. The Algorithm (for n even) Moment Polytope Sampling Algorithm (with Cantarella and Uehara) 1 Generate a permutation of n − 3 numbers with n/2 − 2 descents. O(n2 ) time . 2 Generate a point in the corresponding simplex of the Stanley triangulation for ∆(n−2)/2,n−2 . 3 Project up to the cross polytope in [0, 1]n−2 , over to the cross polytope in [−1, 1]n−2 and down to a set of diagonals. 4 Test the proposed set of diagonals against the “hard” conditions. acceptance ratio > 1/n 5 Generate dihedral angles from T n−3 . 6 Build sample polygon in action-angle coordinates. Example MPSA 160-gon Example MPSA 160-gon Example MPSA 160-gon Example MPSA 160-gon Example MPSA 160-gon Example MPSA 160-gon Example MPSA 160-gon Example MPSA 160-gon Example MPSA 160-gon Example MPSA 160-gon Example MPSA 160-gon Example MPSA 160-gon Example MPSA 160-gon Example MPSA 160-gon Example MPSA 160-gon Example MPSA 160-gon Example MPSA 160-gon Example MPSA 160-gon Example MPSA 160-gon Example MPSA 160-gon Example MPSA 160-gon Example Sampling Calculation We ran 10,000 samples of 48-gons using MPSA and 10,000 samples of 48-gons using TSMCMC and computed total curvature: Algorithm MPSA TSMCMC(β) TSMCMC(β, δ) Exact Value Result 76.5568 78.0419 76.6142945311 76.6014 95% confidence interval (76.4763, 76.6572) (76.4848, 79.545) (76.5163, 76.7122) TSMCMC(β) (no permutations) used lagged autocovariances up to lag 2018 in computing the error estimate, while TSMCMC(β, δ) (90% permutations) used up to lag 4. MPSA doesn’t need to compute any lagged autocovariances and just uses the sample variance to find the confidence interval. A Combinatorial Mystery Recall the volumes of the cross polytope and moment polytope: Vol(C) = Pb n−2 2 c Vol(P) = − j=0 Pb n2 c j=0 (−1)j (n − 2j − 2)n−3 (n − 3)! (−1)j (n − 2j)n−3 2(n − 3)! n j n−2 j A Combinatorial Mystery Recall the volumes of the cross polytope and moment polytope: Vol(C) = = m Pb n−2 2 c j=0 (−1)j (n − 2j − 2)n−3 2n−3 (n − 3)! (n − 3)! n−3 , n/2 − 2 n−2 j where k is the Eulerian number which gives the number of permutations of {1, . . . , m} with exactly k descents. The Eulerian numbers satisfy the recurrence relation DmE m−1 m−1 = (k + 1) + (m − k ) k k k −1 Eulerian Numbers m k counts the number of permutations of {1, . . . , m} with exactly k descents and satisfies the recurrence relation DmE m−1 m−1 = (k + 1) + (m − k ) k k k −1 m k k m 1 1 1 1 1 1 1 1 1 1 4 11 26 57 120 247 502 1 11 66 302 1191 4293 14608 1 26 302 2416 15619 88234 1 57 1191 15619 156190 1 120 1 4293 247 1 88234 14608 502 1 A Combinatorial Mystery Recall the volumes of the cross polytope and moment polytope: Vol(C) = Pb n−2 2 c Vol(P) = − j=0 Pb n2 c j=0 (−1)j (n − 2j − 2)n−3 (n − 3)! (−1)j (n − 2j)n−3 2(n − 3)! n j n−2 j A Combinatorial Mystery Recall the volumes of the cross polytope and moment polytope: Vol(P) = − Pb n2 c ' conj. j=0 √ (−1)j (n − 2j)n−3 2(n − 3)! n j 3 3 Cn/2 8 where Cm is the mth Catalan number 2m Γ(2m + 1) 1 Cm = = . m+1 m Γ(m + 2)Γ(m + 1) Catalan Numbers and Moment Polytope Volumes The ratio of Vol(P) to Cn/2 : 1. 0.8 Vol(P) Cn/2 0.6 √ 3 3 8 0.4 (in red) 0.2 0 20 40 60 80 number of edges in polygon 100 A Recurrence Relation for Vol(P) The normalized volume (n − 3)! Vol(P) of the moment polytope is the k = 1 case of a two-parameter family V (n, k ) given by the recurrence relation V (n, k ) = (n − k − 1)V (n − 1, k − 1) + (n + k − 1)V (n − 1, k + 1) subject to the boundary conditions V (n, 0) = 0, V (3, 1) = 1, V (3, 2) = 1 . 2 A Recurrence Relation for Vol(P) The normalized volume (n − 3)! Vol(P) of the moment polytope is the k = 1 case of a two-parameter family V (n, k ) given by the recurrence relation V (n, k ) = (n − k − 1)V (n − 1, k − 1) + (n + k − 1)V (n − 1, k + 1) 0 1 0 0 0 0 0 0 0 1 2 A Recurrence Relation for Vol(P) The normalized volume (n − 3)! Vol(P) of the moment polytope is the k = 1 case of a two-parameter family V (n, k ) given by the recurrence relation V (n, k ) = (n − k − 1)V (n − 1, k − 1) + (n + k − 1)V (n − 1, k + 1) 0 0 0 0 0 0 0 0 1 2 5 24 154 1280 13005 156800 1 2 1 4 22 160 1445 15680 199066 1 8 75 800 9821 137088 1 16 236 3584 58478 1 32 1 721 64 1 15232 2178 128 1 cf. OEIS A012249 “Volume of a certain rational polytope. . . ” Catalan Numbers and Eulerian Numbers Conjecture √ 3 3 Vol(P) ' Cn/2 . 8 Catalan Numbers and Eulerian Numbers Conjecture √ 3 3 Vol(P) ' Cn/2 . 8 Combining the conjecture with Vol(C) = Stirling’s approximation, and 2n−3 (n−3)! D n−3 n/2−2 Theorem (Giladi–Keller ’94) n−3 n/2 − 2 ' s 6 (n − 3)! π(n − 2) would suffice to prove √ Vol(P) 6 n−2 ' . Vol(C) n3/2 E , √ 6 n−2 n3/2 √ 6 n−2 n3/2 6 n+2 are basically for Vol(P) Vol(C) : and estimates indistinguishable as asymptotic 1. 0.8 graph of 6 n+2 (in blue) Vol(P) 0.6 Vol(C) 0.4 graph of √ 6 n−2 n3/2 20 30 (in yellow) 0.2 10 vs. 40 number of edges in polygon 50 6 n+2 Future Directions √ • Prove conjectured volume Vol(P) ' 3 8 3 Cn/2 , which implies the volume ratio estimate. • Extend technique to odd numbers of edges. (Requires a new method for sampling the cross-polytope.) • Improve runtime to O(n2 ) using a reordering method. • Improve algorithm for generating random k -descent permutations. • Reference implementation and reference ensembles of polygons. Thank you! Thank you for listening! Supported by: Simons Foundation NSF DMS–1105699 Background • Probability Theory of Random Polygons from the Quaternionic Viewpoint Jason Cantarella, Tetsuo Deguchi, and Clayton Shonkwiler Communications on Pure and Applied Mathematics 67 (2014), no. 10, 1658–1699. • The Expected Total Curvature of Random Polygons Jason Cantarella, Alexander Y. Grosberg, Robert Kusner, and Clayton Shonkwiler arXiv:1210.6537. To appear in American Journal of Mathematics. • The Symplectic Geometry of Closed Equilateral Random Walks in 3-space Jason Cantarella and Clayton Shonkwiler arXiv:1310.5924.