Research Summary

Dimitrios Giannakis

April 21, 2015

My research is at the interface between applied mathematics and climate atmosphere ocean science. My primary applied mathematics research interests are in geometrical data analysis algorithms and statistical modeling of dynamical systems. These tools are applied to a range of problems in climate science, including intraseasonal oscillations of organized tropical convection and Arctic sea-ice variability. Since joining the Center for Atmosphere Ocean Science at the Courant Institute, I have also worked on information-theoretic methods to quantify predictability and model error in dynamical systems [1–3] and Markov chain Monte Carlo algorithms for signals with intermittent instabilities [4].

Geometrical data analysis for dynamical systems

High-dimensional data generated by dynamical systems are encountered in many disciplines in science and engineering. For instance, in atmosphere ocean science, the dynamics take place in an infinite-dimensional phase space where the coupled nonlinear partial differential equations for fluid flow and thermodynamics are defined, and the observed data correspond to functions of that phase space, such as temperature or circulation measured over a geographical region of interest. Mathematically, the observed data at time $t_i$ can be represented by a vector $x_i \in \mathbb{R}^n$ with $n \gg 1$, and the time-ordered collection of these vectors forms a high-dimensional time series. There exists a strong need for applied mathematics techniques to extract and predict important phenomena which are an outcome of the underlying dynamics, including the El Niño Southern Oscillation (ENSO) in the ocean and the Madden-Julian Oscillation (MJO) in the atmosphere. My work in high-dimensional time series analysis has focused on the following two main questions:

1. Dimension reduction and spatiotemporal pattern extraction; i.e., how to represent the data using a small number of coordinates (and their associated spatiotemporal patterns) in a manner that reveals intrinsic timescales of the dynamics [5, 6].

2. Nonparametric forecasting; i.e., how to predict future values of observables (or probability densities) from the current initial data given a training dataset of past observations of the system, but without having access to the equations of motion and without using a parametric reduced model [7, 8].

A common theme in this work has been to combine kernel methods from harmonic analysis and machine learning with ideas from dynamical systems theory to construct dimension reduction maps adapted to the dynamical system generating the data, and to learn operators governing the time evolution of observables and probability densities.

Kernel methods to detect slow intrinsic timescales

Consider a time series, $x = \{x_0, x_1, \ldots, x_s\}$, consisting of samples $x_i \in \mathbb{R}^n$ taken at times $t_i = i\,\delta t$ with a uniform timestep $\delta t$. The general setting of interest is that the samples are generated by an abstract dynamical system operating in a phase space manifold $M$, and the samples are the outcome of a vector-valued function on that manifold; i.e., $x_i = F(a_i)$ with $a_i \in M$. We also have a dynamical flow $T_t$ on $M$ such that $a_i = T_{t_i} a_0$ with $t_i = i\,\delta t$, which we will assume to be ergodic with invariant measure $p_{eq}$. Dimension reduction can be described in terms of a map $\Phi : \mathbb{R}^n \to \mathbb{R}^l$, where $\Phi(x_i) = (\phi_1(x_i), \ldots, \phi_l(x_i))$ and $l \ll n$, and our objective is to construct this map empirically from the observed data and endow it with the following "desirable" properties:
• $\Phi$ should preserve the manifold structure of $M$.

• $\Phi$ should be intrinsic to the dynamical system generating the data; i.e., it should have strong invariance properties under changes of the observation modality $F$.

• The reduced coordinates $\phi_i$ should individually reveal meaningful dynamical processes embedded in the high-dimensional observed signal.

In kernel methods for data analysis (e.g., [9–11]), one constructs dimension reduction maps through eigenfunctions of a Markov operator $P$, acting on scalar functions on $M$, that is constructed from exponentially decaying kernel functions. The kernel can be thought of as a measure of pairwise similarity $K : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}_+$ between samples in data space. If $M$ is equipped with a volume form (for our purposes, the volume form $d p_{eq}$ associated with the invariant measure of the dynamics), the kernel naturally leads to an integral operator $G$ acting on functions on $M$ through the expression
$$G f(a_i) = \int_M K(F(a_i), F(a_j))\, f(a_j)\, d p_{eq}(a_j).$$
In practical applications, the action of $G$ is approximated by a Monte Carlo sum in time, which corresponds to integration with respect to the invariant measure of the dynamics. The Markov operator $P$ is then constructed through a sequence of normalizations of $G$ ensuring that $P f = f$ if $f$ is constant, and the corresponding dimension reduction coordinates are computed by solving the eigenvalue problem $P \phi_i = \lambda_i \phi_i$. (Technically, the dimension reduction coordinates include eigenvalue-dependent scaling factors, which we omit here.) Geometrically, this procedure is motivated by the fact that for a suitable choice of kernel and in the limit of large data, $P$ approximates the heat kernel on the manifold associated with a Riemannian metric that depends on $K$; it is a well-established fact that eigenfunctions of heat kernels and the associated Laplace-Beltrami operators can be used to embed manifolds in Euclidean spaces with optimal preservation of the Riemannian geometry [9, 12–14]. This observation motivates the design of kernels for dynamical systems to obtain the desirable properties listed above from the properties of the corresponding induced geometry. To that end, in [5, 6] a family of kernels was introduced that modifies the geometry of the data by incorporating two dynamics-dependent features:

Delay-coordinate mappings. Following state-space reconstruction methods [15–17], we construct a new observation map $\tilde F : M \to \mathbb{R}^{nq}$ through lagged sequences of $F$: $X_i = \tilde F(a_i) = (x_i, x_{i-1}, \ldots, x_{i-(q-1)})$. If $q$ is sufficiently large (and under mild conditions on the dynamical system, the observation function, and the sampling interval), then with high probability the data points $X_i$ are in one-to-one correspondence with the points $a_i$ on the attractor. Thus, the time series $\{X_i\}$ becomes Markovian even if the observations $F$ are incomplete (i.e., $F(M)$ is not a diffeomorphic copy of $M$).
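As a concrete illustration of this pipeline, the following minimal Python sketch builds delay-coordinate vectors from snapshots, forms a radial Gaussian kernel, normalizes it to a Markov matrix, and extracts the leading eigenfunctions. This is not the NLSA implementation of [5, 18]; the Gaussian kernel, the bandwidth eps, the diffusion-maps-style normalization, and the toy two-frequency observation are all illustrative assumptions.

```python
# Illustrative sketch: delay embedding -> kernel -> Markov operator ->
# eigenfunctions. All parameter choices (q, eps, l) are assumptions.
import numpy as np

def delay_embed(x, q):
    """Map snapshots x_i in R^n to lagged vectors X_i = (x_i, x_{i-1}, ..., x_{i-q+1})."""
    s = x.shape[0]
    return np.stack([x[i - q + 1 : i + 1][::-1].ravel() for i in range(q - 1, s)])

def markov_eigenfunctions(X, eps, l):
    """Gaussian kernel -> Markov normalization -> leading l eigenfunctions."""
    sq = (X ** 2).sum(axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)  # squared distances
    K = np.exp(-d2 / eps)
    q_norm = K.sum(axis=1)
    K = K / np.outer(q_norm, q_norm)                  # density normalization
    P = K / K.sum(axis=1, keepdims=True)              # row-stochastic: P f = f for constant f
    lam, phi = np.linalg.eig(P)
    order = np.argsort(-lam.real)                     # phi[:, 0] is constant; use phi[:, 1:]
    return lam.real[order[:l]], phi.real[:, order[:l]]

# Incomplete scalar observations of a two-frequency (torus) flow:
t = 0.05 * np.arange(1000)
x = (np.cos(t) + 0.5 * np.cos(np.sqrt(30.0) * t))[:, None]  # observation F
X = delay_embed(x, q=40)                                    # delay map F~
lam, phi = markov_eigenfunctions(X, eps=10.0, l=5)          # reduced coordinates
```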
The delay-coordinate map $\tilde F$ also modifies the geometry of the data, since distances in delay-coordinate space depend on differences $X_i - X_j$ between "videos," as opposed to differences $x_i - x_j$ between "snapshots." In work in collaboration with Andrew Majda on so-called nonlinear Laplacian spectral analysis (NLSA) algorithms [18], it was experimentally observed that the use of kernels in delay-coordinate space significantly enhances the ability to extract distinct timescales from high-dimensional signals with individual eigenfunctions. AOS examples demonstrating this behavior can be found in Fig. 5 ahead, as well as in [18–23]. A more rigorous theoretical justification of the enhanced timescale separation capability of these eigenfunctions was given in independent work by Berry et al. [24], where it was shown that the geometry of the data in delay-coordinate space is biased towards the most stable Lyapunov subspace of the dynamical system generating the data, and this subspace is independent of the observation modality. In summary, eigenfunctions from kernels with delay-coordinate maps are "good" dimension reduction coordinates in the sense that they tend to represent processes with a coherent temporal character, which are also intrinsic to the dynamics.

Dependence on the vector field of the dynamics. The generator of the dynamics, i.e., the skew-symmetric operator $v$ giving the time derivative of functions through $v(f)(a) = \lim_{t\to 0} (f(T_t a) - f(a))/t$, maps to a vector field $V$ in delay-coordinate space which can be approximated by finite differences in time; e.g., $\hat V_i = (X_{i+1} - X_{i-1})/(2\,\delta t)$ is a second-order approximation of $V$ at state $a_i$, measuring the local time tendency of the state vector. Moreover, for a state $X_j$ lying in a neighborhood of $X_i$, the vector $u = X_j - X_i$ approximates a tangent vector on $M$, and $\cos\theta_i = u \cdot \hat V_i / (\|u\|\,\|\hat V_i\|)$ approximates the cosine of the angle between that tangent vector and $V$ in the geometry inherited by $M$ from the embedding $\tilde F$. In [6], a one-parameter family of "cone kernels" was introduced that incorporates this directional dependence through the expression
$$K(X_i, X_j) = e^{-A(X_i, X_j)/\delta t^2}, \quad A(X_i, X_j) = \frac{\|X_i - X_j\|^2\,[(1 - \zeta\cos^2\theta_i)(1 - \zeta\cos^2\theta_j)]^{1/2}}{\|\hat V_i\|\,\|\hat V_j\|}, \quad \zeta \in [0, 1). \tag{1}$$
In (1), the sampling interval $\delta t$ controls the bandwidth of the kernel, so that the limit of large data corresponds to $\delta t \to 0$. Moreover, the parameter $\zeta$ controls the influence of the directional terms, and for $\zeta > 0$ cone kernels preferentially assign large similarity to pairs of samples whose displacement vector is locally aligned with the dynamical flow. In particular, in the limit $\zeta \to 1$ the Riemannian metric induced by cone kernels becomes degenerate, assigning arbitrarily small norm to tangent vectors parallel to $v$. As a result, the associated Laplace-Beltrami operator $\Delta$ depends on the directional derivatives of functions along $v$, as opposed to the full gradient. The outcome of this asymptotic structure of $\Delta$ is that its eigenfunctions extremize a Dirichlet energy that penalizes variations along the integral curves of $v$. This property is independent of observation modality, and endows the eigenfunctions with invariance under a weakly restrictive class of transformations of the data (including conformal transformations). Another consequence of the structure of $\Delta$ is that the time series of the eigenfunction values, $t_i \mapsto \phi_j(a_i)$, capture intrinsic slow timescales of the dynamics. Figure 1 displays a visualization of this "along-$v$" property for a dynamical system on the two-torus.
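As a sketch of how (1) can be evaluated in practice, the code below approximates $\hat V_i$ by central differences along the trajectory and assembles $A(X_i, X_j)$; since only $\cos^2\theta$ enters (1), the orientation of the displacement vector $u$ is immaterial. The function name and the small regularization constant are illustrative, not the reference implementation of [6]; the resulting kernel matrix would then be normalized to a Markov operator exactly as before.

```python
# Illustrative evaluation of the cone kernel (1). X holds delay vectors along
# a trajectory, dt is the sampling interval, and the 1e-15 floor guards the
# u = 0 diagonal (where A = 0 and hence K = 1).
import numpy as np

def cone_kernel(X, dt, zeta=0.995):
    """K(X_i, X_j) from (1), evaluated on interior samples where V-hat exists."""
    V = (X[2:] - X[:-2]) / (2.0 * dt)            # V-hat_i, second order in time
    Xc = X[1:-1]                                  # matching interior samples
    Vn = np.linalg.norm(V, axis=1)                # ||V-hat_i||
    U = Xc[None, :, :] - Xc[:, None, :]           # u[i, j] = X_j - X_i
    un2 = (U ** 2).sum(axis=2) + 1e-15            # ||u||^2, floored on the diagonal
    ci2 = np.einsum('ijk,ik->ij', U, V) ** 2 / (un2 * Vn[:, None] ** 2)  # cos^2(theta_i)
    cj2 = np.einsum('ijk,jk->ij', U, V) ** 2 / (un2 * Vn[None, :] ** 2)  # cos^2(theta_j)
    A = un2 * np.sqrt((1.0 - zeta * ci2) * (1.0 - zeta * cj2)) / np.outer(Vn, Vn)
    return np.exp(-A / dt ** 2)

K = cone_kernel(X[:400], dt=0.05, zeta=0.995)     # X from the previous sketch
```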
The along-$v$ property is also beneficial in kernel analog forecasting techniques, discussed below.

Figure 1. Laplace-Beltrami eigenfunctions $\phi_i$ for a dynamical system on the two-torus, illustrating the "along-$v$" property of the eigenfunctions obtained from the cone kernels in (1) with $\zeta \approx 1$. From left to right, the columns display representative eigenfunctions obtained through the diffusion maps algorithm [10] for a radial Gaussian kernel, the cone kernel (1) with $\zeta = 0$, and the cone kernel with $\zeta = 0.995$. A portion of the dynamical trajectory is also plotted as a black line for reference. Notice that in the $\zeta = 0.995$ case the leading eigenfunctions vary predominantly in directions transverse to $v$. As a result, the level sets of these eigenfunctions are aligned with the orbits of the dynamics, and the time series $t_i \mapsto \phi_j(a_i)$ vary slowly. Figure reproduced from [6].

Kernel analog forecasting

Analog forecasting is a nonparametric technique introduced by Lorenz in 1969 [25], which predicts the evolution of states of a dynamical system (or observables defined on the states) by following the evolution of the sample in a historical record of observations which most closely resembles the current initial data. In the initialization stage of analog forecasting, one identifies an analog, i.e., the state in the historical record which most closely resembles the initial data. Traditionally, this is accomplished using Euclidean distances in the ambient data space, so that the analog $x_i$ in the historical record $x = \{x_0, x_1, \ldots\}$ corresponding to the initial data $y$ is given by
$$x_i = \operatorname*{argmin}_{x_j \in x} \|y - x_j\|.$$
Then, in the forecast step, the historical evolution of that state is followed for the desired lead time $\tau$, and the observable of interest is predicted based on its value on the analog. Denoting the time series of observable values corresponding to the historical record by $\{f_0, f_1, \ldots\}$, and the $k$-step shift map of that time series by $S_k f_j = f_{j+k}$, the analog forecast $\hat f(y, \tau)$ at lead time $\tau = k\,\delta t$ is given by $\hat f(y, \tau) = S_k f_i$, where $i$ is the timestamp of the analog $x_i$. (A minimal code sketch of this procedure follows below.) Two major factors influencing the efficacy of analog forecasting are (1) the identification of skillful analogs in the training data, and (2) the choice of forecast observable (predictand).

In work with Jane Zhao [8], a kernel nonparametric forecasting technique was developed which improves both of the above aspects of traditional analog forecasting. First, note that the ability to identify skillful analogs amounts to being able to identify subsets of the training data whose dynamical evolution will shadow the future time evolution for the given initial data. This suggests that using the cone kernels in (1) to select analogs on the basis of kernel affinity should improve forecast skill, at least in the short to medium term, as analogs with time tendency ($V$) similar to the time tendency of the initial data will be favored. Similarly, selecting analogs in delay-coordinate space (i.e., using lagged sequences $Y$ of initial data instead of snapshots) should improve skill, especially in situations with incomplete initial data. Both of these ingredients are included in our proposed scheme. Another improvement over traditional analog forecasting is to replace single-analog prediction with prediction based on a weighted ensemble of analogs. Mathematically, the procedure to construct weighted ensembles of analogs is motivated by out-of-sample extension techniques for functions on manifolds [26, 27].
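Here is the minimal sketch of the traditional single-analog procedure promised above, under the assumption that a historical record x_hist with a corresponding observable series f_hist is available; all names are illustrative.

```python
# Single-analog, Euclidean-distance analog forecasting: find the nearest
# historical state to the initial data y, then read the observable k steps
# ahead along the historical record.
import numpy as np

def analog_forecast(y, x_hist, f_hist, k):
    """f-hat(y, tau) = S_k f_i with i = argmin_j ||y - x_j||."""
    i = np.argmin(np.linalg.norm(x_hist - y, axis=1))  # initialization: find the analog
    i = min(i, len(f_hist) - 1 - k)                    # keep the shifted index in range
    return f_hist[i + k]                               # forecast: apply the shift map S_k
```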
For example, in the geometric harmonics technique [26] (which is related to the Nyström method for out-of-sample extension), the observable $f$ is represented by a truncated expansion $f \approx f_l = \sum_{k=0}^{l} c_k \phi_k$ in the eigenfunction basis of a kernel operator $P$, and then an estimate $\hat f(Y)$ of the value of $f$ at an out-of-sample state $Y$ is computed as a weighted sum of the in-sample eigenfunction values. Specifically, we have
$$\hat f(Y) = \sum_{j=1}^{l} \frac{c_j}{\lambda_j} \sum_i p(Y, X_i)\, \phi_j(X_i),$$
where $\lambda_j$ is the eigenvalue corresponding to $\phi_j$, and $p(\cdot, \cdot)$ is the kernel of $P$. In [8], this expression is modified to produce an analog forecast at lead time $\tau = k\,\delta t$ by applying the shift map to the eigenfunctions, giving
$$\hat f(Y, \tau) = \sum_{j=1}^{l} \frac{c_j}{\lambda_j} \sum_i p(Y, X_i)\, \phi_j(X_{i+k}). \tag{2}$$
A similar construction can be made using the Laplacian pyramids algorithm for out-of-sample extension [27], a multiscale iterative method that does not make direct use of the eigenfunctions. With regard to the choice of the forecast observable, (2) suggests that higher forecast skill should be possible for observables which are well approximated by slowly varying eigenfunctions. The leading eigenfunctions from the cone kernels in (1) are good candidates for such observables, and in practice we find that these eigenfunctions describe physically meaningful patterns with favorable predictability properties. Figure 2 shows results from a challenging application involving low-frequency sea surface temperature (SST) patterns in the North Pacific sector of a comprehensive climate model, where parametric regression models fail to beat a trivial persistence forecast. (A code sketch of the weighted-ensemble forecast (2) is given below.)

Nonparametric forecasting with shift maps

The objective of this work [7], carried out in collaboration with Tyrus Berry and John Harlim, is to use a smooth orthonormal basis of functions, obtained through the diffusion maps algorithm, to approximate the semigroup of solution operators of stochastic differential equations on manifolds directly from the data, without knowing or estimating the drift and diffusion coefficients. Let $L$ be the generator of a stochastic process on a smooth manifold $M$ with invariant measure $p_{eq}$. (In the absence of stochastic effects, $L$ reduces to the vector field $v$ discussed earlier, but in the setting of interest here drift and diffusion are both present.) We denote the semigroup of solution operators over time $\tau$ by $e^{\tau L}$, so that $e^{\tau L} f(x) = \mathbb{E}_x f(x_\tau)$ gives the expectation of $f(x_\tau)$ at time $\tau$ conditioned on $x_0 = x$. The adjoint $L^*$ of $L$ in the Hilbert space $L^2(M, p_{eq})$ is the Fokker-Planck operator, governing the evolution of probability densities relative to the equilibrium measure. That is, an initial density $\rho_0$ relative to $p_{eq}$ evolves according to $\rho_\tau = e^{\tau L^*} \rho_0$, and if the density is uniform relative to the equilibrium measure (e.g., at asymptotic times, $\tau \to \infty$) we have $\rho_\tau = 1$. We denote the inner product of $L^2(M, p_{eq})$ by $\langle \cdot, \cdot \rangle_{p_{eq}}$.

Next, consider an orthonormal basis $\{\phi_i\}_{i=0}^{\infty}$ of $L^2(M, p_{eq})$. Such a basis can be constructed through the eigenfunctions of the generator $\Delta$ of a gradient flow ($\Delta$ is a weighted Laplacian), which can be approximated from data using the diffusion maps algorithm [10] and variable-bandwidth kernel techniques [28]. As above, we denote the $k$-step shift map of the sampled time series $\{x_0, x_1, \ldots, x_N\}$ by $S_k$, so that for any observable $f$, $S_k f(x_i) = f(x_{i+k})$.
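The shift map $S_k$ just defined is also the basic ingredient of the weighted-ensemble forecast (2) from the previous subsection. A minimal sketch, assuming precomputed in-sample eigenpairs (lam, phi) of $P$, expansion coefficients c of the observable, and kernel values pY = $p(Y, X_i)$ for the out-of-sample initial data (all names illustrative):

```python
import numpy as np

def kernel_analog_forecast(pY, phi, lam, c, k):
    """Weighted-ensemble analog forecast (2) at lead time tau = k * dt.

    pY  : kernel values p(Y, X_i) against the training states
    phi : in-sample eigenfunctions, phi[i, j] = phi_j(X_i)
    lam : eigenvalues lambda_j of P
    c   : expansion coefficients c_j of the observable f
    """
    phi_shift = phi[k:]                    # S_k phi_j, sampled at X_{i+k}
    w = pY[: phi_shift.shape[0]]           # align kernel weights with the shifts
    return np.sum((c / lam) * (w @ phi_shift))
```

Setting k = 0 recovers the geometric harmonics estimate $\hat f(Y)$ above.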
Figure 2. Analog forecasting skill of the Pacific decadal oscillation (PDO) and the North Pacific gyre oscillation (NPGO) in the coupled climate model CCSM3. The PDO and NPGO are the dominant modes of low-frequency (interannual to decadal) variability in the North Pacific, and are well represented by a pair of cone kernel eigenfunctions [6]. The panels show root mean squared error (RMSE) and pattern correlation (PC) scores as a function of lead time $\tau$ (in months) for hindcasts of these modes using the persistence forecast, traditional single-analog forecasting based on Euclidean distances, single-analog forecasting based on diffusion distances, and kernel weighted-ensemble forecasts using the Nyström extension (2) and Laplacian pyramids. Notice the improvement of forecast skill compared to the persistence and single-analog techniques, especially for the NPGO. Figure reproduced from [8].

The key observation made in [7] (which can be viewed as a generalization of the finite-difference approximation of vector fields in [6]) is that the matrix elements obtained through the Monte Carlo sum
$$\hat A_{ij}(\tau) = \frac{1}{N} \sum_{l=0}^{N-k} \phi_i(x_l)\, S_k \phi_j(x_l)$$
provide a noisy approximation to the matrix elements $A_{ij}(\tau) = \langle \phi_i, e^{\tau L} \phi_j \rangle_{p_{eq}}$ of the solution semigroup, where $\tau = k\,\delta t$ and $\delta t$ is the sampling interval. In particular, the estimates $\hat A_{ij}(\tau)$ are unbiased, and their variance is of order $\lambda_j \tau / N$, where $\lambda_j$ is the eigenvalue corresponding to the eigenfunction $\phi_j$ of $\Delta$. Similarly, the quantities $\hat A^*_{ij}(\tau) = \hat A_{ji}(\tau)$ approximate the matrix elements of the operator $e^{\tau L^*}$ governing the evolution of probability densities relative to the equilibrium measure.
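A minimal sketch of this estimator and of its adjoint action on density coefficients, assuming the basis values phi[l, i] = $\phi_i(x_l)$ along the trajectory have been computed (e.g., by the kernel construction sketched earlier); the 1/N normalization follows the Monte Carlo sum above, and all names are illustrative.

```python
import numpy as np

def semigroup_matrix(phi, k):
    """A-hat_ij(tau) = (1/N) sum_l phi_i(x_l) phi_j(x_{l+k}), over available pairs."""
    N = phi.shape[0]
    return phi[: N - k].T @ phi[k:] / N

def evolve_density_coefficients(c0, A_hat):
    """Apply A-hat*_ij = A-hat_ji to the expansion coefficients of a density."""
    return A_hat.T @ c0
```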
To forecast the evolution of an initial density $\rho_0$, we first compute its spectral expansion $\rho_0 = \sum_{i=0}^{\infty} c_i \phi_i$ with $c_i = \langle \phi_i, \rho_0 \rangle$, and then evaluate the density $\rho_\tau$ at time $\tau$ using
$$\rho_\tau(x) \approx \hat\rho_\tau(x) = \sum_{i,j=0}^{\infty} c_j\, \hat A^*_{ij}(\tau)\, \phi_i(x). \tag{3}$$
This scheme, which we refer to as the diffusion forecast, can be thought of as a spectral Galerkin method for the Fokker-Planck equation, formulated in a basis inherited from the gradient flow on the dataset. By keeping track of the full density, the method is able to provide both the mean forecast and uncertainty quantification (UQ), e.g., through the second moment of $\rho_\tau$. Moreover, the method is valid for arbitrary sampling intervals, and the forecast densities have the correct long-time behavior by construction. Figure 3 shows forecast skill results from this method, as well as from other nonparametric techniques, for the Lorenz 63 model. Representative basis functions from the gradient flow are also shown in Fig. 3. Additional applications to stochastic systems on the torus and ENSO forecasting can be found in [7].

Figure 3. Nonparametric forecasting of the state vector of the Lorenz 63 model. In this example, the initial density is a Gaussian with variance 0.01, centered on a randomly sampled point close to the attractor. The training dataset consists of 5000 points sampled at a timestep $\delta t = 0.01$. The diffusion forecast is performed via (3) using 4500 eigenfunctions, and its root mean squared error (RMSE) is comparable to that of an ensemble forecast with 5000 samples which has access to the true model. The diffusion forecast also provides a reasonable UQ (error estimate) through the standard deviation of the forecast distribution. Also shown for reference are results from nonparametric models based on local linearization; in some cases, these models perform comparably to the diffusion forecast for the mean, but generally provide poorer UQ. The left-hand panels display a selection of the eigenfunctions of the gradient flow used to represent the shift map and the initial probability density. Figure adapted from [7].

Applications in climate atmosphere ocean science

As with many other science and engineering disciplines, climate atmosphere ocean science has been experiencing an exponential increase in the amount of data collected by observational networks or output by numerical models. For instance, the CMIP5 data archive [29] contains several petabytes of climate model output from modeling centers around the world, and similarly comprehensive observational and reanalysis datasets are available. Contained in these data is information that can lead to significant advances in our scientific understanding and forecasting capability of important phenomena evolving on daily to decadal timescales. However, due to the sheer volume and complexity of the data, "look and see" approaches have limited potential in extracting that information.
Frequently, data analysis techniques are used to define the phenomena of interest themselves, and in such cases there exists a clear need for methods that require minimal ad hoc preprocessing of the data. For instance, the indices and spatiotemporal patterns for ENSO and the MJO, which are constructed through methods such as Fourier analysis or principal component analysis of SST, outgoing longwave radiation (OLR), and other spatiotemporal fields (e.g., [30]), impact our theoretical understanding of these phenomena, as well as how we assess their representation in weather and climate models. Despite significant advances in the skill of forecasts with large-scale numerical models in recent years, there exist notable examples (including the applications discussed below) where limited spatial and temporal resolution, parameterization of unresolved processes, and lack of knowledge of the operating physical laws result in poor dynamical representation of the phenomena of interest. In such cases, low-order statistical models are useful alternatives to large-scale numerical models, providing comparable or superior forecast skill. My work in AOS has broadly focused on using the data analysis techniques described above to (1) extract physically meaningful modes of variability in the ocean [5, 18], atmosphere [19, 23, 31], and cryosphere [20–22] from models and observations with minimal preprocessing of the data, and (2) study the potential predictability of these modes [23, 32], and construct low-order parametric [4] and nonparametric [7, 8] models for their prediction. Two topics that I am particularly interested in, and which are described in detail below, are the co-variability of Arctic sea ice with the ocean and atmosphere, and tropical intraseasonal oscillations (ISOs).

Arctic sea-ice reemergence in models and observations

Arctic sea ice is a sensitive component of the climate system, with dynamics and variability that are strongly coupled to the atmosphere and ocean. In addition to the strong declining trends observed in recent years [33], Arctic sea ice also exhibits large internal variability. Studies using comprehensive climate models have estimated that 50–60% of recent Arctic sea-ice changes can be attributed to externally forced trends, with the remainder resulting from internal variability in the climate system [34]. Therefore, the challenge of making accurate projections of future Arctic sea-ice conditions hinges on quantifying the sea-ice response to changes in external forcing (e.g., greenhouse gas forcing), and on understanding the nature and magnitude of internal variability in the coupled ice-ocean-atmosphere system. Recently, Blanchard-Wrigglesworth et al. [35] proposed a mechanism for sea-ice–SST co-variability in which sea-ice and SST anomalies trade off, allowing for unexpected memory effects in sea ice. In this mechanism, spring sea-ice anomalies imprint SST anomalies of opposite sign, which persist over the summer months. During the fall, ice grows southward and interacts with these SST anomalies, reproducing ice anomalies of the same sign as in the spring. This mechanism (together with a similar, but weaker, mechanism operating from fall to spring) potentially provides a source of predictability on seasonal to interannual timescales, a problem of significant current interest [36]. Arctic sea-ice reemergence mechanisms were studied in a series of papers with Mitchell Bushuk and Andrew Majda [20, 21].
In this work, a multivariate extension of the NLSA algorithm was used to construct spatiotemporal modes of sea-ice, oceanic, and atmospheric co-variability. This method essentially uses the kernel in (1) with $\zeta = 0$ to combine data sources with different physical dimensions in a "unit-free" manner, and without requiring variance normalization. To represent sea ice, the ocean, and the atmosphere, we used sea ice concentration (SIC; the fractional area covered by sea ice), SST, and sea level pressure (SLP) data over the North Pacific and Arctic domains. We studied control integrations of the CCSM3 model, as well as observational and reanalysis data from the HadISST dataset and the ERA-Interim product. In both the model and observational data, we recovered spatiotemporal modes of variability which provide a reduced representation of the sea-ice reemergence mechanism proposed in [35]. These mode families also reveal time-dependent aspects of reemergence, which were not accessible in previous studies. In particular, interannual components of large-scale SLP variability, which emerge objectively from this analysis, are found to be related to coherent sea-ice reemergence events in geographically distinct regions, and suggest an SLP–sea-ice reemergence mechanism. Spatiotemporal reconstructions illustrating these processes are displayed in Fig. 4.

In recent work with Mitchell Bushuk [22], the analysis in [20, 21] was extended to a hierarchy of models with reduced atmospheric and oceanic dynamics. This hierarchy was designed to probe the relative roles of the ocean and atmosphere in sea-ice reemergence; in particular, it includes a slab ocean model (SOM) with simplified oceanic dynamics but full ocean–atmosphere coupling, as well as so-called CORE models with prescribed atmospheres and no ocean-to-atmosphere feedback. We found that a CORE model with an interannual atmosphere has a more realistic reemergence signal than the SOM, suggesting that the SST–SIC mechanism is able to operate as a standalone mechanism. Yet, in models with ocean-to-atmosphere coupling (including the SOM), atmospheric variability plays a key role in reemergence, setting the spatial patterns of SIC reemergence. This suggests that dynamical feedback from the ocean to the atmosphere is essential in creating large-scale organized patterns of SIC–SLP co-variability, such as the patterns displayed in Fig. 4. Ongoing and future work in this area includes forecasting total sea-ice area and volume with the nonparametric techniques described above, and studying the role of sea-ice thickness in reemergence. Another topic of interest is the extraction of modes representing the global warming trend in sea-ice time series.

Figure 4. Arctic sea-ice reemergence reconstructed through NLSA. The panels display composites of SIC, SST, and SLP anomalies at different months of the year, computed from the reconstructed samples in a 900-year control integration of the CCSM3 climate model by averaging over all years in which the activity of the leading low-frequency NLSA mode exceeded a threshold. During the March sea-ice maximum, there exist strong SIC anomalies in the marginal ice zones (particularly in the Bering and Labrador seas), which leave an imprint in SST anomalies of the opposite sign (i.e., positive SIC anomalies are accompanied by negative, cold, SST anomalies).
The SIC anomalies disappear during the summer months, but the SST anomaly persists. In the growth period starting in September, sea-ice growth is favored (weakened) in the areas of negative (positive) SST anomalies, leading to a reemergence of the SIC anomalies from the previous spring. Associated with this process is a low-frequency annular SLP mode which produces poleward (warm) and equatorward (cold) winds in the North Pacific and North Atlantic basins, respectively. These winds are consistent with an atmospheric teleconnection setting up anticorrelated SIC anomalies in the two basins. Figure reproduced from [21].

Tropical intraseasonal oscillations

Organized tropical convection is a key element of global climate dynamics, with direct impact on both short-term weather forecasting and long-term climate projections. Among the dominant modes of tropical variability, intraseasonal oscillations (ISOs) play a key role in explaining large-scale convective organization at subseasonal timescales while also influencing the global climate through extratropical interactions [30]. The dominant ISO modes are the well-known Madden-Julian oscillation (MJO) [37], a 30–90-day eastward-propagating pattern with zonal wavenumber 1–4 that is mainly active in the boreal winter, and the boreal summer ISO (BSISO), which has a more pronounced poleward-propagating pattern with weakened eastward propagation [38]. These modes influence, and are influenced by, a plethora of other modes in the climate system spanning diurnal to interannual timescales. However, despite their dynamical significance and the fact that they are a major source of predictability on intraseasonal timescales, accurate simulation of the MJO and other ISOs by large-scale dynamical models remains elusive. A significant challenge in understanding the behavior of ISOs and their connections to other modes of variability is that the phenomena themselves are defined subjectively through some data analysis technique [39], and the data are frequently subjected to preprocessing (e.g., bandpass filtering) in order to isolate the signal of interest. In this work, carried out in collaboration with Nan Chen, Andrew Majda, Eniko Szekely, and Wen-wen Tung, our objectives have been to extract modes of organized tropical convection without preprocessing the data [19, 23], quantify the potential predictability of the MJO conditional on large-scale regimes [23], and construct low-order stochastic models for MJO prediction [4]. Figure 5 shows Laplace-Beltrami eigenfunctions obtained via the NLSA algorithm applied to infrared brightness temperature data ($T_b$; a proxy for convective activity) from the CLAUS multi-satellite archive. The data have a 0.5° spatial resolution and are sampled every $\delta t = 3$ hours over the period 1983–2006. The eigenfunctions displayed in Fig. 5 represent the MJO and BSISO, as well as modulated diurnal modes with intraseasonal modulating envelopes. Besides the modes in Fig. 5, the NLSA spectrum contains eigenfunctions representing ENSO and the annual cycle and its harmonics; the timescale separation in the eigenfunctions is a consequence of the delay-coordinate mapping, as discussed earlier (here, the number of lags is q = 512, corresponding to a 64-day temporal window). A key difference between the patterns in Fig. 5 and the corresponding modes obtained via comparable linear techniques is their strong intermittency and seasonality, leading to superior separation of the boreal-winter MJO and the BSISO through NLSA. Phase composites of these patterns are displayed in Fig. 6.
In [4], a nonlinear stochastic oscillator model is developed to predict the MJO eigenfunctions. In this model, the intermittency of the MJO modes is modeled via a hidden variable representing time-dependent damping. This variable interacts with the MJO modes through energy-conserving quadratic nonlinearities, which also contribute to the strong parametric robustness of the model. The hidden variable at forecast initialization is estimated through a maximum likelihood calculation for conditionally Gaussian random variables. These models yield skillful predictions out to 40, 25, and 18 days in strong, moderate, and weak MJO winters, respectively. Furthermore, the ensemble spread is an accurate indicator of forecast uncertainty at long lead times. Current and future work in this area includes applying the nonparametric techniques described above to MJO and BSISO forecasting, and studying the circulation aspects of the identified ISO patterns and their impact on the Indian monsoon.

Figure 5. Time series of Laplace-Beltrami eigenfunctions from NLSA representing the MJO (first row from top), the BSISO (third row), and amplitude-modulated diurnal modes with intraseasonal envelopes (second and fourth rows) in the CLAUS $T_b$ dataset; the right-hand panels display the corresponding frequency spectra. Notice the strong intermittency and seasonality of the MJO and BSISO eigenfunctions, which are predominantly active during the boreal winter and summer months, respectively. The eigenfunctions are shown here for a two-year portion of the 1983–2006 training dataset which includes the intensive observing period of the TOGA COARE field campaign (November 1, 1992 to February 28, 1993); the significant MJO events identified via NLSA are in agreement with the directly observed TOGA COARE events. Other modes in the NLSA spectrum (not shown here) represent the annual cycle and its harmonics and ENSO. Delay-coordinate mapping is crucial for producing the observed timescale separation in the eigenfunctions. Figure adapted from [23].

Figure 6. Phase composites of the MJO (left) and BSISO (right) recovered by NLSA from CLAUS $T_b$ data. Color-coded are $T_b$ anomalies (in K), with negative (positive) anomalies representing enhanced (suppressed) cloudiness and convection. In the case of the MJO, initiation over the Indian Ocean, accompanied by the development of anomalously high convection over eastern Africa and tropical eastern Brazil, takes place in Phases 1 and 2, and is followed by propagation over the Maritime Continent and the western Pacific warm pool in Phases 3–5 and 6–7, respectively. Termination at the dateline takes place in Phase 8. For the BSISO, Phase 1 corresponds to a cluster of positive $T_b$ anomalies developing in the central Indian Ocean. In Phases 2–4, that cluster moves northeastward towards the Bay of Bengal and India and branches off towards the western Pacific and the Monsoon Trough, bypassing the Maritime Continent from the north. Following the dry phase of the BSISO, a cluster of anomalously high convection develops in Phase 5 in the central Indian Ocean, and propagates towards India and the western Pacific in Phases 6–8, completing the BSISO cycle. Figure reproduced from [23].

References

[1] D. Giannakis and A. J. Majda. Quantifying the predictive skill in long-range forecasting. Part I: Coarse-grained predictions in a simple ocean model. J. Climate, 25:1793–1813, 2012. doi:10.1175/2011jcli4143.1.

[2] D. Giannakis and A. J. Majda. Quantifying the predictive skill in long-range forecasting. Part II: Model error in coarse-grained Markov models with application to ocean-circulation regimes. J. Climate, 25:1814–1826, 2012. doi:10.1175/jcli-d-11-00110.1.
[3] D. Giannakis, A. J. Majda, and I. Horenko. Information theory, model error, and predictive skill of stochastic models for complex nonlinear systems. Phys. D, 241:1735–1752, 2012. doi:10.1016/j.physd.2012.07.005.

[4] N. Chen, A. J. Majda, and D. Giannakis. Predicting the cloud patterns of the Madden-Julian Oscillation through a low-order nonlinear stochastic model. Geophys. Res. Lett., 41(15):5612–5619, 2014. doi:10.1002/2014gl060876.

[5] D. Giannakis and A. J. Majda. Nonlinear Laplacian spectral analysis for time series with intermittency and low-frequency variability. Proc. Natl. Acad. Sci., 109(7):2222–2227, 2012. doi:10.1073/pnas.1118984109.

[6] D. Giannakis. Dynamics-adapted cone kernels. SIAM J. Appl. Dyn. Sys., 2015. In press.

[7] T. Berry, D. Giannakis, and J. Harlim. Nonparametric forecasting of low-dimensional dynamical systems. Phys. Rev. E, 91:032915, 2015. doi:10.1103/PhysRevE.91.032915.

[8] Z. Zhao and D. Giannakis. Analog forecasting with dynamics-adapted kernels. Nonlinearity, 2014. In review.

[9] M. Belkin and P. Niyogi. Laplacian eigenmaps for dimensionality reduction and data representation. Neural Comput., 15:1373–1396, 2003. doi:10.1162/089976603321780317.

[10] R. R. Coifman and S. Lafon. Diffusion maps. Appl. Comput. Harmon. Anal., 21:5–30, 2006. doi:10.1016/j.acha.2006.04.006.

[11] T. Berry and T. Sauer. Local kernels and the geometric structure of data. Appl. Comput. Harmon. Anal., 2015. doi:10.1016/j.acha.2015.03.002. In press.

[12] P. H. Bérard. Spectral Geometry: Direct and Inverse Problems, volume 1207 of Lecture Notes in Mathematics. Springer-Verlag, Berlin, 1989.

[13] P. W. Jones, M. Maggioni, and R. Schul. Manifold parametrizations by eigenfunctions of the Laplacian and heat kernels. Proc. Natl. Acad. Sci., 105:1803–1808, 2008.

[14] J. Portegies. Embeddings of Riemannian manifolds with heat kernels and eigenfunctions. arXiv:1311.7568, 2014.

[15] F. Takens. Detecting strange attractors in turbulence. In Dynamical Systems and Turbulence, volume 898 of Lecture Notes in Mathematics, pages 366–381. Springer, Berlin, 1981. doi:10.1007/bfb0091924.

[16] D. S. Broomhead and G. P. King. Extracting qualitative dynamics from experimental data. Phys. D, 20(2–3):217–236, 1986. doi:10.1016/0167-2789(86)90031-x.

[17] T. Sauer, J. A. Yorke, and M. Casdagli. Embedology. J. Stat. Phys., 65(3–4):579–616, 1991. doi:10.1007/bf01053745.

[18] D. Giannakis and A. J. Majda. Comparing low-frequency and intermittent variability in comprehensive climate models through nonlinear Laplacian spectral analysis. Geophys. Res. Lett., 39:L10710, 2012. doi:10.1029/2012GL051575.

[19] W.-w. Tung, D. Giannakis, and A. J. Majda. Symmetric and antisymmetric signals in MJO deep convection. Part I: Basic modes in infrared brightness temperature. J. Atmos. Sci., 71:3302–3326, 2014. doi:10.1175/jas-d-13-0122.1.

[20] M. Bushuk, D. Giannakis, and A. J. Majda. Reemergence mechanisms for North Pacific sea ice revealed through nonlinear Laplacian spectral analysis. J. Climate, 27:6265–6287, 2014. doi:10.1175/jcli-d-13-00256.1.

[21] M. Bushuk, D. Giannakis, and A. J. Majda. Arctic sea ice reemergence: The role of large-scale oceanic and atmospheric variability. J. Climate, 2015. doi:10.1175/jcli-d-14-00354.1. In press.
[22] M. Bushuk and D. Giannakis. Sea-ice reemergence in a model hierarchy. Geophys. Res. Lett., 2015. In review.

[23] E. Szekely, D. Giannakis, and A. J. Majda. Extraction and predictability of coherent intraseasonal signals in infrared brightness temperature data. Climate Dyn., 2015. In minor revision.

[24] T. Berry, R. Cressman, Z. Greguric Ferencek, and T. Sauer. Time-scale separation from diffusion-mapped delay coordinates. SIAM J. Appl. Dyn. Sys., 12:618–649, 2013.

[25] E. N. Lorenz. Atmospheric predictability as revealed by naturally occurring analogues. J. Atmos. Sci., 26:636–646, 1969. doi:10.1175/1520-0469(1969)26<636:aparbn>2.0.co;2.

[26] R. R. Coifman and S. Lafon. Geometric harmonics: A novel tool for multiscale out-of-sample extension of empirical functions. Appl. Comput. Harmon. Anal., 21:31–52, 2006. doi:10.1016/j.acha.2005.07.005.

[27] N. Rabin and R. R. Coifman. Heterogeneous datasets representation and learning using diffusion maps and Laplacian pyramids. In 12th SIAM International Conference on Data Mining, 2012.

[28] T. Berry and J. Harlim. Variable bandwidth diffusion kernels. Appl. Comput. Harmon. Anal., 2014. doi:10.1016/j.acha.2015.01.001. In press.

[29] K. E. Taylor, R. J. Stouffer, and G. A. Meehl. An overview of CMIP5 and the experiment design. Bull. Amer. Meteor. Soc., 93:485–498, 2011. doi:10.1175/bams-d-11-00094.1.

[30] W. K. M. Lau and D. E. Waliser. Intraseasonal Variability in the Atmosphere–Ocean Climate System. Springer-Verlag, Berlin, 2011.

[31] D. Giannakis, W.-w. Tung, and A. J. Majda. Hierarchical structure of the Madden-Julian oscillation in infrared brightness temperature revealed through nonlinear Laplacian spectral analysis. In 2012 Conference on Intelligent Data Understanding (CIDU), pages 55–62, Boulder, Colorado, 2012. doi:10.1109/CIDU.2012.6382201.

[32] D. Giannakis and A. J. Majda. Limits of predictability in the North Pacific sector of a comprehensive climate model. Geophys. Res. Lett., 39:L24602, 2012. doi:10.1029/2012gl054273.

[33] D. J. Cavalieri and C. L. Parkinson. Arctic sea ice variability and trends. The Cryosphere, 6(4):881–889, 2012. doi:10.5194/tc-6-881-2012.

[34] J. C. Stroeve, V. Kattsov, A. Barrett, M. Serreze, T. Pavlova, M. Holland, and W. N. Meier. Trends in Arctic sea ice extent from CMIP5, CMIP3 and observations. Geophys. Res. Lett., 39:L16502, 2012. doi:10.1029/2012gl052676.

[35] E. Blanchard-Wrigglesworth, K. C. Armour, C. M. Bitz, and E. DeWeaver. Persistence and inherent predictability of Arctic sea ice in a GCM ensemble and observations. J. Climate, 24:231–250, 2011. doi:10.1175/2010jcli3775.1.

[36] J. C. Stroeve, L. C. Hamilton, C. M. Bitz, and E. Blanchard-Wrigglesworth. Ensemble skill of the SEARCH Sea Ice Outlook 2008–2013. Geophys. Res. Lett., 41(7):2411–2418, 2014. doi:10.1002/2014gl059388.

[37] R. A. Madden and P. R. Julian. Detection of a 40–50 day oscillation in the zonal wind in the tropical Pacific. J. Atmos. Sci., 28(5):702–708, 1971. doi:10.1175/1520-0469(1971)028<0702:doadoi>2.0.co;2.

[38] K. Kikuchi, B. Wang, and Y. Kajikawa. Bimodal representation of the tropical intraseasonal oscillation. Climate Dyn., 38:1989–2000, 2012. doi:10.1007/s00382-011-1159-1.

[39] G. N. Kiladis, J. Dias, K. H. Straub, M. C. Wheeler, S. N. Tulich, K. Kikuchi, K. M. Weickmann, and M. J. Ventrice.
A comparison of OLR and circulation-based indices for tracking the MJO. Mon. Wea. Rev., 142:1697–1715, 2014. doi:10.1175/mwr-d-13-00301.1.