DELAY EMBEDDINGS FOR FORCED SYSTEMS: II. STOCHASTIC FORCING 1 2 3 2 J. Stark , D.S. Broomhead , M.E. Davies and J. Huke 1 2 3 Centre for Nonlinear Dynamics and its Applications, University College London, Gower Street, London, WC1E 6BT. Department of Mathematics, University of Manchester Institute of Science and Technology, PO Box 88, Manchester, M60 1QD. Department of Electronic Engineering, Queen Mary, University of London, Mile End Road, London E1 4NS. Monday, July 15, 2002 ABSTRACT Takens’ Embedding Theorem forms the basis of virtually all approaches to the analysis of time series generated by nonlinear deterministic dynamical systems. It typically allows us to reconstruct an unknown dynamical system which gave rise to a given observed scalar time series simply by constructing a new state space out of successive values of the time series. This provides the theoretical foundation for many popular techniques, including those for the measurement of fractal dimensions and Liapunov exponents, for the prediction of future behaviour, for noise reduction and signal separation, and most recently for control and targeting. Current versions of Takens’ Theorem assume that the underlying system is autonomous (and noise free). Unfortunately this is not the case for many real systems. In a previous paper, one of us showed how to extend Takens’ Theorem to deterministically forced systems. Here, we use similar techniques to prove a number of delay embedding theorems for arbitrarily and stochastically forced systems. As a special case, we obtain embedding results for Iterated Functions Systems, and we also briefly consider noisy observations. Embedding Stochastic Systems 2 1. INTRODUCTION This paper continues the work begun in [Stark, 1999] where one of us developed techniques for proving extensions of the Takens’ Embedding Theorem to forced dynamical systems. In that paper, these methods were applied to the case of forcing by a finite dimensional deterministic system. In many applications the assumption that the forcing is deterministic is not a reasonable one, and the aim of this paper is therefore to extend these results to far more general forcing processes. Recall that Takens’ Embedding Theorem ([Takens, 1980]; see also [Aeyels, 1981] and [Sauer et al., 1991]) provides the theoretical foundation for the analysis of time series generated by nonlinear deterministic dynamical systems. Informally, it says that if we take a scalar observable ϕ of the state x of a deterministic dynamical system then typically we can reconstruct a copy of the original system by considering blocks (ϕ(xt), ϕ(xt+τ ), ϕ(xt+2τ ), …, ϕ(xt+(d-1)τ )) of d successive observations of ϕ, for d sufficiently large. Here xt is the state of the system at time t, and τ > 0 is some sampling interval. Since xt will usually be unknown, whilst ϕ(xt) is a quantity we can measure in practice, this result has stimulated a vast range of applications in fields ranging from fluid dynamics, through electrical engineering to biology, medicine and economics (for a good overview see e.g. [Ott et al, 1994]). One might even say that this one theorem has given rise to virtually a new branch of nonlinear dynamics, often informally called chaotic time series analysis. However, for Takens’ Theorem to be valid, we need to assume both that the dynamics is deterministic (ie that there is some mapping f such that xt+τ = f(xt)), and that both the dynamics, and the observations are autonomous (so that f and ϕ depend on x only). Note that by rescaling time, if necessary, we may assume that τ = 1, and hence restrict t to integer values. In [Stark, 1999] we extended Takens’ Theorem to the non-autonomous case where f is also a function of some other variable yt , where yt itself is generated by a deterministic system, so that yt+1 = g(yt) for some g. Here we turn to systems driven by far more general processes. It turns out that the same approach can encompass a number of different cases. In particular the theorems proved below apply to a wide class of stochastic dynamical systems (which we can think of as deterministic systems driven by some stochastic process), to input-output systems and to irregularly sampled systems. Input-output systems have already been considered in this context by [Casdagli, 1992], and our results here in essence prove his conjecture on reconstructing such systems. Broadly speaking our approach results in a reconstruction framework where both the dynamical systems and the delay reconstruction map are indexed by the realization of the forcing process. This leads to results similar to the Bundle Delay Embedding Theorem (Theorem 3.2) of [Stark, 1999], and indeed the proof of the main “Stochastic Takens’ Embedding Theorem” here (Theorem 2.3 below) closely parallels that of Theorem 3.2 of [Stark, 1999]. We also consider a number of variations of this main theorem, including the case of Iterated Function Systems (eg [Norman, 1968; Barnsley, 1988]) and of noisy observations. It is perhaps interesting to note that the latter, at least in the case where the forcing process takes on continuous values is far the easiest case to prove, and amounts to little more than the classical Whitney Embedding Theorem (eg [Hirsch, 1976]). Embedding Stochastic Systems 3 We assume familiarity with the standard Takens’ framework, and use the notation of [Stark, 1999]. Most of the transversality techniques employed here closely follow those developed in that paper, and we shall make use of a number of technical results and calculations derived there. We begin by describing a general framework for treating arbitrarily forced systems, and stating the principal theorems proved in this paper. 2. DELAY RECONSTRUCTION FOR ARBITRARY FORCING A convenient formalism for incorporating arbitrary forcing into a dynamical system is that of random dynamical systems (eg [Kifer, 1988; Arnold, 1998]). This encompasses both a wide class of noisy systems, as well as input-output systems in the terminology of [Casdagli, 1992], irregularly sampled systems and others. In the case of discrete time systems, which is the situation that we treat here, we assume that the forcing at time i∈Z is specified by a variable ωi , drawn from some appropriate space N. In the case of noisy systems, ωi is chosen at random, with respect to some probability measure µ on N. The state of the system at time i∈Z is denoted by xi ∈M and evolves according to xi+1 = f(xi , ωi ) (2.1) As is usual in the standard Takens’ framework, we assume that M is a smooth compact manifold. The case of M non-compact can be treated (eg [Takens, 1980; Huke, 1993]), but will not be considered here further. If we think of ωi as a parameter, then we can interpret (2.1) as a standard dynamical system with noise or forcing on the parameters. It can also be helpful to write fω (xi ) instead of f(xi ,ωi ). This suggests the interpretation that instead of applying the same map f every time, we choose a different map fω at each time step. The case of a single deterministic system f subject to additive (dynamical) noise can be included in this formalism by setting fω (xi ) = f(xi ) + ωi . i i i In the most general case (assuming M compact), N can be taken to be the space Dr(M) of all diffeomorphisms on M [Kifer, 1988]. At the other extreme, N may be a discrete space consisting of a finite number of points, so that fω is chosen from a finite set of maps. This leads to the well known example of so called iterated function systems (eg [Norman, 1968; Barnsley, 1988]). In this paper we shall choose N somewhere between these two extremes, and make the standing assumption that N is a compact manifold. i 2.1. SHIFT SPACES Standard dynamical systems theory cannot directly encompass non-autonomous systems of the form ( 2.1). The usual approach to non-autonomous systems is to enlarge the state space sufficiently to make the system autonomous. This is well known in the case of a periodically forced differential equation which can be transformed into an autonomous system by the addition of a dummy variable to represent time. In the context of arbitrary forcing, this is most easily accomplished through the use of an appropriate shift space. Thus, let Σ = N Z be the space of biinfinite sequences ω = ( …, ω-1 , ω0 , ω1 , …) of elements in N with the product topology. Since we assume that N is compact, the Tychonoff theorem implies that Σ is also compact. Let σ : Σ → Σ be the usual shift map Embedding Stochastic Systems 4 f x f(x,Ω0) Ω Σ ΣΩ Figure 2.1 Graphical representation of a random dynamical system, from [Stark, 2001]. [σ(ω)] i = ωi+1 where ωi ∈N is the i th component of ω ∈Σ. Then the evolution of xi given by (2.1) can be represented by the skew product T : M×Σ → M×Σ (see Figure 2.1) defined by T(x,ω) = (f(x,ω0),σ(ω)) (2.2) Since the space Σ contains all possible sequences of elements in N, this gives us a very general model of systems driven by arbitrary sequences. Furthermore, if one wants to restrict interest to a particular class of input sequences, one can replace Σ by any shift invariant closed subset. If in addition we have some measure µ on N then the corresponding product measure µ∞ on Σ is shift invariant and hence (σ , µ∞) gives rise to a (Bernoulli) stochastic process. We can also consider general σ-invariant measures µΣ to take account of correlations in the choice of successive ωi (so that for instance ω is a Markov process). The restriction to σ-invariant measures corresponds to a stationarity condition on the corresponding random process. Additionally, f could be taken to be a general function f : M×Σ → M, rather than just depending on a single component ω 0 of ω. We shall not do so here, though as we shall soon see, the delay reconstruction map will depend on more than a single element of ω. If we dispense with the measure µ, the same formalism can also be used to model a deterministic system driven by an arbitrary input sequence ω. This arises frequently in communications systems where the sequence ω would represent the information being transmitted (e g [Broomhead et al, 1999]). Another application is to irregularly sampled time series where ωi denotes the time between sample i and i+1 (see [Martin, 1998] , and Example 3.5 of [Stark, 1999]). 2.2. C ONJUGACIES FOR RANDOM D YNAMICAL SYSTEMS The crucial question we need to address in order to develop an analogue of Takens’ Theorem for random dynamical systems is what do we mean by a delay reconstruction of T ? It is well Embedding Stochastic Systems 5 known (eg [Takens, 1980; Stark, 1999; Stark, 2001]) that the fundamental property required of a reconstruction is that the reconstructed system (which we shall call F, in line with [Stark, 1999; Stark, 2001]) should be equivalent to the original dynamical system f under a coordinate change. In the standard Takens embedding framework therefore we have F = Φ°f°Φ -1 , where Φ is the delay reconstruction map. We thus need to determine what it means for two random dynamical systems T and T' to be equivalent in this way. The most general concept is simply to require T' = H °T °H -1 for some fΩ h-1Ω(x) fΩ(h-1Ω(x)) Ω ΣΩ Σ h-1Ω hΣΩ fΩ' x fΩ' (x) Ω Σ ΣΩ Figure 2.2 Co-ordinate change for random dynamical systems, from [Stark, 2001]. Embedding Stochastic Systems 6 (invertible) co-ordinate change H : M×Σ → M×Σ. It may not be possible to arrange for T' = H°T °H -1 to hold for all ω ∈Σ, and hence we may either require this for only µΣ -almost every ω in the probabilistic setting, or for only generic ω in a topological setting. In the context of delay reconstruction, however, it turns out to be convenient to place some further restrictions on H. The space M×Σ will in general be infinite dimensional, and hence we have little hope of reconstructing it in a finite dimensional delay space Rd. Furthermore ϕ is a function of M only and hence attempting to reconstruct Σ using ϕ seems foolhardy (though note that in the case of deterministic forcing, this is in fact possible, [Stark, 1999]). We therefore want to restrict ourselves to conjugacies which correspond to reconstructing only M and leaving Σ untouched. A reasonable interpretation of this is to require that the Σ component of H is the identity (Figure 2.2). In other words, we only consider conjugacies of the form H = (h , Id ) for some map h : M×Σ → M. If H is to be invertible for µΣ -almost every ω, or for generic ω, then hω = h(•,ω) : M → M has to be invertible for µΣ -almost every ω, or for generic ω. We shall refer to co-ordinate changes of this form as bundle conjugacies (using the term loosely for any of hω , h and H itself). Co-ordinate changes of this form are common in the theory of random dynamical systems (eg [Arnold, 1998]). ~ ~ Given a skew product (2.2), it is convenient to define f : M×Σ → M by f (x,ω) = f(x,ω0) so that ~ ~ ~ ~ T = (f ,σ). If similarly, T' = (f ',σ) then T' = H °T °H -1 is equivalent to (f ',σ ) = (h , Id )°(f ,σ)°(h , ~ ~ ~ ~ Id )-1 . If we denote f ω = f (•,ω) : M → M we have (h , Id )°(f ,σ)°(h , Id )-1 (•,ω) = (h , Id )°(f ω°hω-1 , ~ -1 σ(ω)) = (hσ(ω)°f ω°hω , σ(ω)). Hence ~ f ω' = ~ hσ(ω)° f ω°hω-1 (2.3) ~ ~ where as usual f ω' = f '(•,ω) : M → M. Note that in [Stark et al., 1997] and [Stark, 1999] we ~ abused notation to write fω instead of f ω . Observe the similarity between (2.3) and the deterministic conjugacy f ' = h°f°h-1. Essentially all we do in the random case is index both the dynamics and the co-ordinate change with ω. The only slightly delicate point is the σ appearing in hσ(ω). The reason for this is that by the time we come to apply h in equation (2.3) we have carried out one time step of the dynamics, and hence ω has moved to σ(ω) (Figure 2.2). In applications to reconstruction we need to consider one further issue, namely that the domain and range of H will typically not be the same space. Thus, recall that in the standard Takens framework the delay reconstruction map Φ is a map Φ : M → Rd. The main content of Takens Theorem is that this is an embedding (a diffeomorphism onto its image Φ(M)) and hence we can define the co-ordinate change F = Φ°f°Φ -1 on Φ(M). In the present context Φ will depend on ω and so for each ω can be regarded as a map Φ ω : M×Σ → R d (see §2.3 below). Informally our concept of a “Stochastic Takens’ Theorem” is to require Φ to be a bundle embedding. By this we mean that Φ ω should be an embedding (a diffeomorphism onto its image Φ ω(M)) for typical ω, ie for µΣ -almost every ω, or for generic ω. Note that in general the image Φ ω(M) will be different for each ω (but all these images will be diffeomorphic to M). The range of the map H = (Φ , Id ) will thus not be M×Σ as above, but rather the reconstruction space Rd×Σ . When Φ is a bundle embedding, the co-ordinate change T' = H °T °H -1 is defined on H(M×Σ). This space cannot typically be written in the form M'×Σ for some appropriate M', but it is bundle diffeomorphic to M×Σ. Embedding Stochastic Systems 2.3. 7 THE STOCHASTIC TAKENS ’ THEOREM We shall now define the reconstruction map more precisely, and give several versions of the a “Stochastic Takens’ Theorem”. Suppose that we observe the skew product (2.2) using a measurement function ϕ : M → R, so that the observed time series is generated by ϕi = ϕ(xi ) (for the moment we assume that ϕ is independent of ω). The usual delay map can then be written as Φ f,ϕ(x,ω) (ϕ(x), ϕ( fω (x)), ϕ( fω ω (x)), …, ϕ( fω = 0 1 0 d -2 …ω 0 (x))) † where fω …ω = fω ° … ° fω (as in §3.5 of [Stark, 1999]). Observe that Φ f,ϕ can be regarded as either a map M×N d-1 → R d, or a map M×Σ → R d. In the latter form, Φf,ϕ is a candidate for a bundle embedding which is equivalent to the condition that Φf,ϕ,ω = Φ f,ϕ(•,ω) should be an embedding for µΣ -almost every ω, or for generic ω. In [Stark, 1999] we were able to prove that this was indeed the case for finite dimensional deterministic forcing. Given that in the setting described in §2.1, σ defines a deterministic dynamical system (albeit an infinite dimensional one) and Φ f,ϕ,ω depends only on a finite number of components of ω, it should not come as a great surprise that we can modify Theorem 3.2 of [Stark, 1999] to hold in the present context. Thus, as in [Stark, 1999] denote by Dr(M×N, M) the space of maps f : M×N → M such that fy : M → M is a C r diffeomorphism of M for any y, where as usual fy(x) = f(x,y). Then i 0 i 0 Theorem 2.1 Let M and N be compact manifolds of dimension m ≥ 1 and n respectively. Suppose that d ≥ 2m+1. Then for r ≥ 1, there exists a residual set of ( f,ϕ)∈Dr(M×N, M)× C r(M,R) such that for any ( f ,ϕ) in this set there is an open dense set of ω in Σ such that Φf,ϕ,ω is an embedding. The same approach also gives a measure theoretic version of this theorem. Since the proof ultimately relies on Sard’s Theorem, we need to impose some conditions on the shift invariant measure on Σ that describes the stochastic forcing. The most straightforward case is when we assume that this measure is a product measure µ∞ arising from a measure µ on N, as discussed above in §2.1. This gives the following statement of the theorem, already announced in [Stark et al., 1997] and [Stark, 1999]: Theorem 2.2 Let M and N be compact manifolds of dimension m ≥ 1 and n respectively and let µ be a measure on N which is absolutely continuous with respect to Lebesgue measure. Suppose that d ≥ 2m+1. Then for r ≥ 1, there exists a residual set of (f,ϕ)∈Dr(M×N, M)× C r(M,R) such that for any ( f,ϕ) in this set Φ f,ϕ,ω is an embedding for µ∞ almost every ω . Our method of proof, however, allows for more general stochastic processes. The key property required of µΣ is that the marginal measure µd-1 on cylinders of length d-1 is absolutely continuous with respect to Lebesgue measure. Recall that µd-1 is the measure on N d-1 defined by µd-1 (U) = µΣ (( πd-1 )-1 (U)) for all measurable sets U ⊂ N d-1, where πd-1 : Σ → N d-1 is the projection πd-1 (ω) = (ω0, …, ωd-2 ). We then have Theorem 2.3 (Takens’ Theorem for Stochastic Systems) Let M and N be compact manifolds of dimension m ≥ 1 and n respectively and let µΣ be an invariant measure on Σ = N Z such that µd-1 is absolutely continuous with respect to Lebesgue measure on N d-1. Suppose that d ≥ Embedding Stochastic Systems 8 2m+1. Then for r ≥ 1, there exists a residual set of (f,ϕ)∈Dr(M×N, M)× C r(M,R) such that for any ( f,ϕ) in this set Φ f,ϕ,ω is an embedding for µΣ almost every ω. Note that Φf,ϕ,ω only depends on the finite number of components (ω0, …, ωd-2)∈N d-1 and hence as we shall see in §3.1 below it is sufficient to prove that Φf,ϕ,ω is an embedding for µd-1 almost every (ω0 , …, ωd-2). When N is zero dimensional, and hence consists of a finite number of points, the only set which has µd-1 full measure is the whole of N d-1 . Thus when dim N = 0 the theorem implies that Φ f,ϕ,ω is an embedding for all ω. It turns out that in this case we need to use a different proof (see §3.2 below for a more detailed explanation) to that for dim N > 0, which also gives an open dense set of (f,ϕ), rather than merely a residual set. We thus get Theorem 2.4 (Takens’ Theorem for Iterated Function Systems) Let M and N be compact manifolds of dimension m ≥ 1 and n = 0 respectively (so that N is a finite set of points). Suppose that d ≥ 2m+1 and r ≥ 1. Then there exists an open dense set of (f,ϕ)∈Dr(M×N, M)× C r(M, R) such that for any ( f,ϕ) in this set Φ f,ϕ,ω is an embedding for every ω∈Σ. This thus encompasses both Theorems 2.1 and 2.3 for dim N = 0, and is proved separately in §6. It is also possible to incorporate “noisy observations” in either of these results. This in fact should lead to easier proofs, since it allows us greater freedom in making perturbations. On the other hand, such a generalization does significantly complicate the notation. We therefore postpone discussion of this until §2.5 below. 2.4. RECONSTRUCTING THE D YNAMICS The most important consequence of the standard Takens’ Theorem is that when Φ is an embedding it possible to reconstruct a copy of the original system from successive observations of ϕ. Essentially the same situation holds in our generalized framework (Figure 2.3). Thus suppose that ω is such that Φ f,ϕ,ω and Φ f,ϕ,σ(ω) are both embeddings of M. Then the map Fω = Φ f,ϕ,σ(ω)° fω ° (Φ f,ϕ,ω )-1 is well defined and is a diffeomorphism between Φ f ,ϕ,ω (M) ⊂ R d and Φ f,ϕ,σ(ω)(M) ⊂ Rd. Let (xi ,σ i(ω)) be an orbit of ( f,σ), so that xi+1 = f (xi ,ωi ), recall that ϕi = ϕ (xi ) and define zi = (ϕi , ϕi+1 , … , ϕi+d-1 ). Then zi = Φf,ϕ,σ (ω)(xi ) and hence 0 zi+1 i = Φf,ϕ,σ (xi+1) = Φf,ϕ,σ (fω (xi )) = Φf,ϕ,σ(σ (ω))( fω (( Φ f,ϕ,σ (ω))-1 (zi ))) = Fσ (ω)(zi ) i+1(ω) i+1(ω) i i i i i Therefore in exact analogy to the standard Takens’ framework, Fσ (ω) is just the map which shifts a block of the time series forward by one time step, and hence ( ϕi , ϕi+1 , … , ϕi+d-1 ) a (ϕi+1 , ϕi+2 , … , ϕi+d) is bundle conjugate to our original dynamics fω . Note however, that whereas fω only depends on ωi = σ i(ω)0 , the map Fσ (ω) depends on ωi , ωi+1 , … , ωi+d-1 . Also in contrast to the standard framework, different Fω have different domains (each a subset of Rd diffeomorphic to M). There is no reason in general why these should all be disjoint. i i i i The first d-1 components of Fω are trivial. If we denote the last component by Gω : Φf,ϕ,ω(M) → R then Embedding Stochastic Systems ϕi+d 9 Gσ (ω)(ϕi , ϕi+1 , … , ϕi+d-1 ) = i If we write out the dependence on σ i(ω) explicitly, we get ϕi+d G(ϕi , ϕi+1 , … , ϕi+d-1 , ωi , ωi+1 , … , ωi+d-1 ) = (2.4) The existence of such a function was conjectured by [Casdagli, 1992]. From another point of view, processes of this form are known in signal processing under the name of Nonlinear AutoRegressive Moving Average (NARMA) models. Some authors (eg [Chen and Billings, 1989]) have already used models with the general structure of (2.4), (but with ωi restricted to one dimension). More commonly, however, it is assumed that there is only a single additive random component: ϕi+d G(ϕi , ϕi+1 , … , ϕi+d-1 ) + ω i+d-1 = fΩ xi xi+1 Observation Ω ΣΩ Σ i f,,Ω R d FΩ zi+1 zi Delay Reconstruction f,,Ω(M) Ω Σ ΣΩ Figure 2.3 Delay reconstruction of a random dynamical system. Embedding Stochastic Systems 10 It is clear that estimating both G and ω in a model of the form (2.4) will be a major challenge. Note that ωi , …, ωi+d-2 have all been determined by time i+d-1, and so there is at least some hope of estimating them from previous values of the time series. By contrast ωi+d-1 corresponds to new uncertainty entering the system in the time step from i+d-1 to i+d. 2.5. NOISY OBSERVATIONS So far we have assumed that the observations are completely noise free. This is clearly unrealistic in many applications. In this section we therefore generalize our approach to cover the possibility of noise in the observations. We can envisage two cases: the observation noise may be independent of the noise in the dynamics, or both may depend on the same underlying stochastic process. In the first case, which is more likely to be relevant in applications, we can construct another shift space Σ' = (N')Z to represent the observation dynamics, where N' is some compact manifold. The observation function is now ϕ : M×N' → R, which gives rise to the delay map Φf,ϕ,ω,η defined by Φf,ϕ,ω,η(x) = (ϕη (x), ϕη ( fω (x)), ϕη ( fω ω (x)), … , ϕη ( fω 0 1 0 2 1 0 d-1 d-2 …ω 0 (x))) † where ϕη (x) = ϕ(x,ηi) for η∈Σ'. We can then require this to be an embedding for typical (ω,η)∈ Σ×Σ' : i Theorem 2.5 Let M, N and N' be compact manifolds, with m = dim M > 0. Suppose that d ≥ 2m+1. Then for r ≥ 1, there exists a residual set of ( f,ϕ)∈Dr(M×N, M)× C r(M×N', R) such that for any ( f,ϕ) in this set there is an open dense set Σf,ϕ ⊂ Σ×Σ' such that Φf,ϕ,ω,η is an embedding for all (ω ,η)∈Σf,ϕ . If µΣ and µ'Σ' are invariant measures on Σ and Σ' respectively, such that µd-1 and µ'd are absolutely continuous with respect to Lebesgue measure on N d-1 and (N')d respectively, then we can choose Σf,ϕ such that µΣ ×µ'Σ'(Σf,ϕ) = 1. As with Theorem 2.3, we need to treat the cases dim N = 0 and dim N' = 0 separately. Details can be found in §7 below, which also contains the proofs of the various resulting versions of Theorem 2.5. The other possibility is that the observation noise depends on the same process as the dynamic noise. The relevant delay map is then ~ Φ f,ϕ,ω(x) = (ϕω (x), ϕω ( fω (x)), ϕω ( fω ω (x)), … , ϕω (fω 0 1 0 2 1 0 d-1 d-2 …ω 0 (x))) † where ϕω (x) = ϕ(x,ωi ). We then get i Theorem 2.6 Let M and N be compact manifolds, with m = dim M > 0. Suppose that d ≥ 2m+1. Then for r ≥ 1, there exists a residual set of ( f,ϕ)∈Dr(M×N, M)× C r(M×N, R) such that for any ( f,ϕ) in this set there is an open dense set Σf,ϕ ⊂ Σ such that ~ Φ f,ϕ,ω is an embedding for all Z ω ∈Σf,ϕ . If µΣ is an invariant measure on Σ = N such that µd is absolutely continuous with respect to Lebesgue measure on N d, then we can choose Σf,ϕ such that µΣ (Σf,ϕ ) = 1. Again, the case dim N = 0 is treated separately. Details and proofs are in §8. Embedding Stochastic Systems 11 3. STRUCTURE OF THE PROOFS OF THEOREMS 2.1 AND 2.3 The proofs of Theorems 2.1 and 2.3 are largely motivated by the proof of Theorem 3.2 of [Stark, 1999]. The main new ingredient is that the driving system is now infinite dimensional. However, since Φ f,ϕ,ω depends only on a finite number of components of ω , we can easily reduce the theorem to a finite dimensional one. This is done in §3.1 below. Additionally, in the measure theoretic case (Theorem 2.3) we require a measure theoretic version of the finite dimensional Parametric Transversality Theorem ([Abraham, 1963; Abraham and Robbin, 1967] or see Appendix A of [Stark, 1999]). As far as we are aware there is no published proof of this theorem, but it follows trivially by replacing the use of Smale’s Density Theorem in the standard Parametric Transversality Theorem by an application of Sard’s Theorem. We sketch the argument in Appendix A. 3.1. REDUCTION TO FINITE D IMENSIONS Since Φ f ,ϕ,ω depends only on ω0, …, ωd-2 , it turns out to be sufficient to consider subsets of N d-1 rather than of Σ itself. Recall that we defined the projection πd-1 : Σ → N d-1 by πd-1 (ω) = ( ω0, …, ωd-2 ). Given any U ⊂ N d-1 we can define the cylinder ΣU by ΣU = (πd-1 )-1 (U) so that ΣU { ω ∈Σ : (ω 0 , … , ωd-2 )∈U } = Then Lemma 3.1 If U is open and dense in N d-1 then ΣU is open and dense in Σ . If U is residual in N d-1 then ΣU is residual in Σ . Proof Recall that by the definition of the product topology on Σ , πd-1 is a continuous open mapping. Thus if U is open then ΣU = ( πd-1 )-1 (U) is open. Now suppose that ΣU is not dense. Then Σ\ΣU contains some open set V. Since πd-1 (V) ∩ U = ∅, and πd-1(V) is open this means that U cannot be dense. Hence if U is dense, then so is ΣU . If U is residual then U contains the countable intersection of dense open sets Ui . By above, each (πd-1 )-1 (Ui ) is open and dense and hence their intersection is residual. ■ If µΣ is a measure on Σ recall that we defined the marginal measure µd-1 on N d-1 by µd-1 (U) = µΣ (( πd-1 )-1 (U)) for all measurable sets U ⊂ N d-1. This definition immediately implies Lemma 3.2 If U ⊂ N d-1 has full measure with respect to µd-1 , then ΣU has full measure with respect to µΣ . We also have the elementary Lemma 3.3 Suppose that U ⊂ N d-1 is a measurable set of full Lebesgue measure (ie N d-1 \U has Lebesgue measure 0). Then U is dense in N d-1 and µd-1 (U) = 1 for any measure µd-1 that is absolutely continuous with respect to Lebesgue measure on N d-1 . Proof If U is not dense then N d-1 \U contains an open set and hence is not of Lebesgue measure 0. Hence U cannot be of full Lebesgue measure. This contradiction implies that U must be dense if it has full Lebesgue measure. For the second part, if N d-1 \U has Lebesgue measure 0, then µd-1 (N d-1 \U) = 0 by the definition of absolute continuity. Thus µd-1 (U) = 1, as required. ■ Embedding Stochastic Systems 12 Finally, if ω ∈Σ, define y = πd-1 (ω) = (ω 0 , … , ωd-2 )∈N d-1. Then Φf,ϕ,ω(x) = Φ f,ϕ,y(x) where Φ f,ϕ,y(x) (ϕ( f (0)(x,y)), ϕ( f (1)(x,y)), …, ϕ(f (d-1)(x,y))) † = (3.1) with f (i) : M×N d-1 → M defined in an analogous fashion to §3.2 of [Stark, 1999] by f (i+1) (x,y) = f ( f (i)(x,y),yi ) = fy ( f (i)(x,y)) and f (0)(x,y) = x. Here yi is the i th component of y, so that yi = ωi ; we adopt this notation to emphasize that y depends on only a finite number of components of ω. i Using Lemmas 3.1, 3.2 and 3.3, we can then immediately deduce Theorems 2.1 and 2.3 from Theorem 3.4 Let M and N be compact manifolds, with m = dim M > 0 and suppose that d ≥ 2m+1. Then for r ≥ 1, there exists a residual set of (f,ϕ)∈Dr(M×N, M)× C r(M, R) such that for any ( f,ϕ) in this set there is an open set Uf,ϕ ⊂ N d-1 of full Lebesgue measure such that Φf,ϕ,y is an embedding for all y∈Uf,ϕ . 3.2. THE D IMENSION OF N Unfortunately, it turns out that to proceed further we have to treat the case of n = dim N = 0 (ie the iterated function system case) separately from the general case of n ≥ 1. The reason for this is that as in Theorem 3.2 of [Stark, 1999] one of our first steps is to eliminate driving sequences ω with any repeating entries. Thus after reduction to finite dimensions, we remove those ω for which ωi = ωj for any i ≠ j with 0 ≤ i, j ≤ d-2. When n ≥ 1, the set of such ω has zero Lebesgue measure in N d-1 and hence can be ignored. However, when n = 0 and N consists of a finite number of discrete points, this set will in general have positive measure. In effect, for n = 0 we have to prove the theorem for all ω , not just Lebesgue almost all. Additional insight into the n = 0 case comes from the observation that if N is a single point, then Theorems 2.1 and 2.3 reduce to the standard Takens’ Theorem. Any proof of these theorems for n = 0 must therefore include a proof of the standard result. Such a proof (eg see §4 of [Stark, 1999]) has to treat the short periodic orbits of f with some care. The image under the delay map of any point on such an orbit has components that are necessarily identical. This is because for such points f i (x) = f j(x) for some 0 ≤ i < j ≤ 2d-1. Each such point therefore has to be embedded individually. This can easily be done, since for generic f on a compact M there is only a finite number of short periodic orbits (this is the easy part of the Kupka-Smale Theorem; a self-contained proof is given in §4.2.1 of [Stark, 1999]). In Theorems 2.1 and 2.3, the analogues of points on short periodic orbits are points whose images under Φ f,ϕ,y are identical, that is points such that xi = xj for some 0 ≤ i < j ≤ 2d-1. Thus define ~ Pω … ω 0 q-1 = { x∈M : fω i-1…ω0 (x) = fω j-1 …ω0 (x) for some i ≠ j with 0 ≤ i < j ≤ q } A technically intricate, but conceptually straightforward extension of the argument in §5.1.1 of ~ [Stark, 1999] can be used to show that for an open dense set of f, the set Pω … ω consists of a finite number of points for any given ω0, ω1, … , ωq-1 . Thus, if dim N = 0, so that there is only a finite number of choices of ω0 , … , ωq-1 , the set 0 ~ (q) P = U (ω 0 ,…,ω q −1 )∈N q ~ Pω … ω 0 q-1 q-1 Embedding Stochastic Systems 13 is finite, and each point in can be dealt with individually. On the other hand if dim N ≥ 1, then this set is no longer finite, and a different approach is necessary, eg the one adopted in Theorem 3.2 of [Stark, 1999]. To summarize, therefore, we have to give somewhat different proofs in the cases dim N ≥ 1 and dim N = 0 respectively. We shall outline the main ideas of the proof in the next two sections, and the detailed proofs are then given in §5 and §6 respectively. 3.3. MAIN IDEAS : DIM N ≥ 1 When dim N ≥ 1, Theorem 3.4 is very similar to Theorem 3.2 of [Stark, 1999], and it is not surprising that the proofs proceed along almost identical lines. The strategy is to first show that ~ for a residual set of f and ϕ, TΦ f,ϕ is transversal to the zero section in T Rd and Φ f,ϕ × Φf,ϕ is transversal to the diagonal in R d×R d (with the domain of Φ f,ϕ,y × Φf,ϕ,y being restricted to M×M\∆, where ∆ is the diagonal in M×M ). Here Φ f,ϕ : M×N d-1 → R d is defined by Φ f,ϕ(x,y) = Φ f,ϕ,y(x), ~ with Φ f,ϕ,y given by (3.1), and TΦ f,ϕ is the tangent map of Φf,ϕ restricted to the unit tangent bun~ dle TM = { v∈TM : ||v || = 1 } of M. We then treat y as a parameter and apply the Parametric ~ Transversality Theorem to the maps y a TΦ f,ϕ,y and y a Φ f,ϕ,y × Φf,ϕ,y . This shows that for a ~ residual set of full measure of y the maps TΦ f,ϕ,y and Φ f,ϕ,y × Φf,ϕ,y are transversal to the zero section in T Rd and the diagonal in Rd×Rd respectively. We then apply the same dimension counting argument as was used repeatedly in [Stark, 1999] to show that if d ≥ 2m +1, then transver~ sality implies non-intersection. Thus for a residual set of full measure of y the image of TΦ f,ϕ ,y cannot intersect the zero section in T R d and the image of Φ f,ϕ,y × Φf,ϕ,y (restricted to M×M\∆) cannot intersect the diagonal in Rd×Rd. But these are precisely the statements of the immersivity and injectivity of Φ f,ϕ,y , respectively. Exactly as in [Stark, 1999], the big problem is points such that xi = xj with i ≠ j. We proceed ~ just as we did there, by ensuring that such points occur on a family of submanifolds WI of ~ M×N d-1 and then dealing with each of these separately. Again, each WI is characterized by the set of pairs (i,j) for which xi = xj , and the more such pairs (i, j) there are, the smaller the dimen~ ~ ~ sion of the corresponding WI. In particular we shall see that WI,y = WI ∩ (M×{y}) has codimension (d-γ )m where γ is the number of distinct points in the set {x0, … , xd-1 }. Thus it seems plau~ sible that generically Φf,ϕ,y should embed WI,y if γ ≥ 2(m - (d- γ )m) + 1. But this is always satisfied ~ if d ≥ 2m + 1, since then 2m + 1 - 2(d- γ )m - γ ≤ (d- γ )(1-2m) ≤ 0. Since the union of the WI,y over all I is M ×{y} this means that Φ f,ϕ,y should embed the whole of M×{y}. Of course, the problem with this argument as stated is that the residual set of f and ϕ for which Φf,ϕ,y is an embedding will depend on y, and hence a priori the intersection of these sets over an open and dense set of y will not necessarily be residual. We deal with this as in §6 of [Stark, 1999] by ~ combining the construction of WI , which itself is a transversality argument with the proof of the ~ ~ transversality of TΦ f,ϕ,y and Φ f,ϕ,y × Φf,ϕ,y on WI . This allows a single invocation of the Parametric Transversality Theorem to yield a residual set of f and ϕ. 3.4. MAIN IDEAS : DIM N = 0 The overall structure closely follows the transversality proof of the standard Takens’ Theorem given in [Stark, 1999], which in turn is based on Takens’ original proof ([Takens, 1980; Huke, 1993]). We first ensure that periodic orbits of period less than 2d are isolated and have distinct Embedding Stochastic Systems 14 eigenvalues, and then embed these individually. We then use the Parametric Transversality Theorem exactly as outlined above for dim N > 0 to show that for a dense set of f and ϕ, the ~ map TΦ f,ϕ is transversal to the zero section in T Rd and Φf,ϕ × Φf,ϕ is transversal to the diagonal in R d×R d. Counting dimensions shows that if d ≥ 2m, transversality implies non-intersection, so that Φ f,ϕ is respectively immersive and injective. The main complication is dealing with the short periodic orbits. In particular, a priori it is possible for two such orbits for two different sequences (ω0, ω1, … , ωq-1 ) and (ω'0, ω'1, … , ω'q'-1 ) to have points in common. Not only does this prevent us perturbing ϕ independently on each orbit when we embed the orbit, but in fact it turns out to be an obstruction if we try and mimic the proof of Lemma 4.8 of [Stark, 1999] to show that generically periodic orbits are isolated. A different way of stating this problem is that in the standard case, if x is a periodic orbit of f of minimal period q then f i (x) = x if and only if i = 0 (mod q). This need no longer be the case in the present situation (and in fact the definition of minimal period requires some care). It turns out that to avoid this problem, we need to carry out an inductive step which simultaneously shows that if periodic orbits of period less than q are isolated and those for different sequences are distinct, then the same holds for period q orbits. This then allows us to carry the remainder of the proof in a more or less straightforward fashion. 4. PRELIMINARY CALCULATIONS 4.1. NOTATION As in §6 of [Stark, 1999] we first prove the theorem for a sufficiently large r, and then show in §5.5 that this implies the theorem for all r ≥ 1. It turns out that initially we shall want f and ϕ to be C 2r where r = n(d-1) (recall that we assume d ≥ 2m+1 and m ≥ 1). Also note that since Φ f,ϕ,y depends continuously on y (Lemma 4.3 below) and embeddings are open in C k(M, R d) (eg [Hirsch, 1976]), the set of y such that Φf,ϕ,y is an embedding for a fixed (f,ϕ) is open. Hence all that we need to prove is that this set has full Lebesgue measure. Recall that in §3.1, given ( f,ϕ)∈D2r(M×N, M)× C 2r(M, R), we defined f (i)∈D2r(M×N d-1, M) by f (i+1) (x,y) = f ( f (i)(x,y),yi ) = fy ( f (i)(x,y)) and f (0)(x,y) = x. The delay reconstruction map Φf,ϕ : M× N d-1 → Rd is then given by i Φ f,ϕ(x,y) (ϕ( f (0)(x,y)), ϕ( f (1)(x,y)), … , ϕ( f (d-1)(x,y))) † = and for a given y∈N d-1 , we also define Φf,ϕ,y : M → Rd as in (3.1) by Φ f,ϕ,y(x) Φ f,ϕ(x,y) = (4.1) Finally, let ρ : D2r(M×N, M)× C 2r(M, R) → Dr(M×N d-1, Rd) be the map that takes ( f,ϕ) to Φf,ϕ ρ( f,ϕ) = Φ f,ϕ (4.2) Recall (eg Appendix A of [Stark, 1999]) that the evaluation map evρ : D2r(M×N, M)× C 2r(M, R)× M×N d-1 → Rd of ρ is defined by evρ( f,ϕ,x,y) = ρ( f,ϕ)(x,y) = Φf,ϕ(x,y). In proving immersivity, we shall also want the corresponding map for the tangent map of Φ f,ϕ . Since we are only interested Embedding Stochastic Systems 15 in the immersivity of each Φ f,ϕ,y individually, we only want to differentiate in the x direction. Thus τ : D2r(M×N, M)× C 2r(M, R) → VB r-1 (TM×N d-1,T Rd) is given by τ(f,ϕ) T 1Φf,ϕ = (4.3) where T1 is the partial tangent operator in the x direction, so that T 1Φ(v) = TΦ(v,0). Here, extending the notation of Appendix B.3 of [Stark, 1999], VB r-1(TM×N d-1,T R d) is the space of maps from TM×N d-1 to T R d that are linear on the fibres of TM, and whose dependence on (x,y)∈M×N d-1 is C r , but on v∈Tx M is only C r-1 . Thus V B r-1(TM×N d-1,T R d) is a subspace of C r-1(TM×N d-1,T Rd), however since TM is not compact, the latter lacks a natural manifold structure. More usefully, note that for a given y∈N d-1 , a map in VB r-1(TM×N d-1,T Rd) is defined by its ~ action on unit vectors in TM. We can thus also treat τ as mapping into C r-1 (TM×N d-1,T R). The evaluation map evτ : D2r(M×N, M)× C 2r(M,R)×TM×N d-1 → Rd is given by evτ( f,ϕ,v,y) = τ(f,ϕ)(v,y) = T xΦ f,ϕ,y(v) Immersivity is defined by the condition that T xΦ f,ϕ,y(v) ≠ 0 for all v∈TM such that v ≠ 0. By linearity it is sufficient to consider just those v such that ||v || = 1, and hence to restrict ourselves ~ to the unit tangent bundle TM = { v∈TM : ||v || = 1 }. To emphasize this, as in §3.3 above we ~ ~ shall denote the restriction of TΦ f,ϕ,y to TM by TΦf,ϕ,y . 4.2. SMOOTHNESS OF THE EVALUATION MAPS In order to apply the Parametric Transversality Theorem to evρ×ρ and evτ we need to show that these evaluation maps are smooth. This is done in a very similar fashion to the analogous results for deterministic forcing given in Appendix C.1 and C.2 of [Stark, 1999]. We also take this opportunity to compute various derivatives of the evaluation maps that we shall use later. We begin with ρ̂i : D2r(M×N, M) → Dr(M×N d-1, M) given by ρ̂i (f ) f (i) = (4.4) _ _ Lemma 4.1 The map ρ̂i is C r. Given a η∈T f D2r(M×N, M), denote ηi = T f ρ̂i (η). Then η0 = 0 and for i = 1, …, d-1 we have _ ηi(x,y) _ η(xi-1 ,yi-1 ) + T (xi-1,yi-1) f (ηi-1 (x,y),0) = (4.5) Proof By induction on i. Since ρ̂0( f ) = Id for all f, ρ̂0 is trivially C r. Suppose that ρ̂i-1 for some i > 1 is C r. Now, f (i) = f ° ( f (i-1), πi-1 ), where πi : M×N d-1 → N is the projection π i (x,y) = yi . Therefore ρ̂i ( f ) = χ((ρ̂i-1 ( f ), πi-1 ), f ) where χ : D r(M×N d-1, M×N)× D2r(M×N, M) → Dr(M×N d-1, M) is the composition χ(F, f ) = f ° F. Hence ρ̂i = χ ° ((ρ̂i-1 , π̂i-1 ), Id ) where I d : D2r(M×N, M) → D2r(M×N, M) is the identity Id( f ) = f and π̂i : D2r(M×N, M) → C r(M×N d-1, N) is the constant map π̂i ( f ) = πi . Clearly Id and π̂i are C ∞ maps, and it is well known that the composition operator χ is C r ([Eells, 1966; Foster, 1975; Franks, 1979], or see Theorem B.2 of [Stark, 1999]). Hence by the chain rule ρ̂i is C r. Embedding Stochastic Systems 16 _ Since ρ̂0( f ) = Id for all f, we have η0 = 0. We differentiate ρ̂i ( f ) = χ(( ρ̂i-1 ( f ), πi-1 ), f ) using the chain rule and the fact that T (F, f)χ(ζ,η ) = η °F + Tf °ζ (see [Eells, 1966; Foster, 1975; Franks, 1979] or Theorem B.2 of [Stark, 1999]). This yields T f ρ̂i (η) = η ° (ρ̂i-1 ( f ), πi-1 ) + Tf ° (T f ρ̂i-1 (η), _ _ _ 0). Substituting Tf ρ̂i (η ) = ηi and ρ̂i-1 ( f ) = f (i-1) we obtain ηi = η ° ( f (i-1), πi-1 ) + Tf ° (ηi-1 ,0). Evaluating this at (x,y) and using the fact that f (i-1)(x,y) = xi-1 and πi-1 (x,y) = xi-1 gives (4.5). ■ The corresponding evaluation map evρ̂ : D2r(M×N, M)×M×N d-1 → M is given by evρ̂ ( f,x,y) = f (i)(x,y) = xi . Note that, with one or two exceptions, we will only ever need to evaluate Tevρ̂ on vectors of the form (η,0x,0y). In other words, we only need to consider perturbations to f but not to x or y. i i i _ Corollary 4.2 The evaluation map evρ̂ is C r and T (f,x,y)evρ̂ (η,0x,0y) = ηi(x,y). i i Proof By Corollary B.3 of [Stark, 1999] (which is a simple corollary of the smoothness of composition), the evaluation operator is smooth. More precisely ev : Dr(M×N d-1, M)×M ×N d-1 → M given by ev(F,x,y) = F(x,y) is C r and T (F,x,y)ev(ζ,v,w) = ζ(x,y) + T (x,y)F(v,w). But evρ̂ = ev ° (ρ̂i ×Idx × Idy) and hence evρ̂ is C r by the chain rule. Differentiating evρ̂ = ev ° (ρ̂i × Idx × Idy) using the chain rule we obtain i i T (f,x,y)evρ̂ (η,v,w) _ ηi(x,y) + T (x ,y) f (i)(v,w) = i i (4.6) _ Evaluating this for v = 0x , w = 0y yields T(f,x,y)evρ̂ (η,0x,0y) = ηi(x,y), as claimed. ■ i Now define ρi : D2r(M×N,M)× C 2r(M, R) → C r(M×N d-1, R) by ρi (f,ϕ) ϕ ° f (i) = (4.7) Lemma 4.3 The map ρi is C r, with T (f,ϕ)ρi (η,ξ) _ ξ ° f (i) + Tϕ ° ηi = (4.8) The evaluation map evρ : D2r(M×N, M)× C 2r(M,R)×M×N d-1 → R is also C r and if we denote Ξ = ( f,ϕ,x,y) then i T Ξevρ (η,ξ,v,w) _ ξ(xi ) + T xiϕ[ηi(x,y) + T (x,y)f (i)(v,w)] = i (4.9) Proof We have ρi = χ' ° (ρ̂i ×Idϕ) where χ' is the composition χ' : C r(M×N d-1, M)× C 2r(M, R) → C r(M×N d-1, M) and Idϕ is the identity on C 2r(M, R). As above, χ' and Idϕ are C r and so is ρ̂ by Lemma 4.1. Hence ρi is C r by the chain rule. Differentiating ρi = χ' °(ρ̂i ×Idϕ), eg using Theorem _ B.2 of [Stark, 1999], we obtain T (f,ϕ)ρi (η,ξ) = ξ ° ρ̂i ( f ) + Tϕ ° T f ρ̂i (η) = ξ° f (i) + Tϕ ° ηi , as required. Turning to the evaluation function evρ , we proceed in a similar fashion to Corollary 4.2. We have evρ = ev' ° ( ρi ×Idx ×Idy) where ev' : C r(M×N d-1,R)×M×N d-1 → R is the evaluation ev'(Φ,x,y) = Φ(x,y). By Corollary B.3 of [Stark, 1999] this is C r and T (Φ,x,y)ev'(ζ,v,w) = ζ(x,y) + T (x,y)Φ(v, w). Hence evρ is C r by the chain rule and TΞevρ (η,ξ,v,w) = T (f,ϕ)ρi (η,ξ)(x,y) + T (x,y)( ρi (f,ϕ))(v,w) _ _ = ξ° f (i)(x,y) + Tx ϕ(ηi(x,y)) + T x ϕ(T (x,y) f (i)(v,w)) = ξ(xi ) + Tx ϕ[ηi(x,y) + T(x,y) f (i)(v,w)]. ■ i i i i i i i Embedding Stochastic Systems 17 We shall not need to evaluate T Ξevρ except for vectors of the form (0f ,ξ,0x ,0y), in other words, we only consider perturbations to ϕ, but not to f, x or y. We now turn to analogous results for the derivative of ϕ °f (i) and so define τi : D2r(M×N, M)× C 2r(M, R) → VB r-1 (TM×N d-1,T R) by i τi (f,ϕ) T 1(ϕ ° f (i)) = (4.10) where as usual T 1 is the partial tangent in the x direction. Lemma 4.4 The operator τi is C r. Its evaluation map ev τ : D2r(M×N, M)× C 2r(M, R)×TM×N d-1 → T R is C r-1 and if we denote Ξ' = ( f,ϕ,v,y) then i T Ξ' evτ (0f ,ξ,0v ,0y) ϖ(T xiξ(vi)) = i where xi = f (i)(x,y) and vi = T (x,y) f (i)(v,0y) = T 1,(x,y) f (i)(v) and ϖ is the canonical involution on T(T R). Note that in [Stark, 1999] the canonical involution is denoted by ω , but here we want to avoid confusion with the usage of ω as the forcing sequence in §2 and §3.1. Proof We have τi = σ1' ° ρi where σ1' : C r(M×N d-1,R) → VB r-1 (TM×N d-1, T R) is the operator that takes the partial tangent in the x direction. Then σ1' (Φ)(w,y) = T1Φ(w,y) = TΦ(w,0y) = σ'(Φ)(w, 0 y) where σ ' : C r(M×N d-1,R) → VB r-1 (TM×TN d-1 , T R) is the full tangent operator. By Lemma B.11 of [Stark, 1999], σ' is C ∞ and T Φ σ'(ζ) = ϖ ° Tζ , Let ιy : VB r-1 (TM×TN d-1 , T R) → VB r-1 (TM× N d-1 , T R) be the inclusion defined by ιy(Ψ)(v,y) = Ψ(v,0y). Thus ιy(Ψ) = Ψ ° (Id×LN ), where LN : N d-1 → TN d-1 is the zero section in TN d-1 , ie LN (y) = 0y . Hence ιy is just composition on the right with a fixed function. This is just a linear map in the local chart described in Appendix B.1 of [Stark, 1999], and hence ιy is C ∞, with TΨιy(ζ)(v,y) = ζ(v,0y) (this result appears as Proposition 4.2 in [Franks, 1979]). But σ'1(Φ)(v,y) = σ'(Φ)(v,0y ) = ιy(σ'(Φ))(v, y) so that σ'1 = ιy ° σ' and thus σ1' is C ∞. Hence by the chain rule τi = σ1' ° ρi = ιy ° σ' ° ρi is C r. d-1 d-1 d-1 Proceeding as in Corollary 4.2 and Lemma 4.3, we see that the evaluation function evτ is given by ev τ = ev'' °(τi ×Idv ×Idy) where ev'' : VB r-1 (TM×N d-1, T R)×TM×N d-1 → T R is the evaluation ev''(Ψ,v,y) = Ψ(v,y). By Corollary B.3 of [Stark, 1999] this is C r-1 and T (Ψ,v,y)ev''(ζ,w,w' ) = ζ(v, y) + T (v,y)Ψ(w,w' ). Thus ev τ is C r-1 by the chain rule and TΞ' evτ (0f ,ξ,0v ,0y ) = T (f,ϕ)τi (0f ,ξ)(v,y). Differentiating τi = ιy ° σ' °ρi using the chain rule gives T (f,ϕ)τi (0f ,ξ) = T ιy ° ϖ °T(T ( f,ϕ)ρi (0f ,ξ)). By (4.8) we have T (f,ϕ)ρi (0f ,ξ) = ξ ° f (i) and hence T Ξ' evτ (0f ,ξ,0v ,0y ) = ϖ(T ξ ° Tf (i)(v,0y )) = ϖ (T x ξ ° T (x,y) f (i)(v,0y)) = ϖ(T x ξ(vi)) as required. ■ i i i i i i i 5. PROOF OF THEOREM 3.4 FOR DIM N ≥ 1 It turns out that initially we shall want f and ϕ to be C 2r where r = n(d-1) (recall that we assume d ≥ 2m+1 and m ≥ 1). Also note that since Φ f,ϕ,y depends continuously on y (Lemma 4.3 below) and embeddings are open in C k(M, Rd) (eg [Hirsch, 1976]), the set of y such that Φ f,ϕ,y is an embedding for a fixed (f,ϕ) is open. Hence all that we need to prove is that this set has full Lebesgue measure. As long as n ≥ 1, the set of y∈N d-1 such that yi = y j for some i ≠ j is closed, nowhere dense and ~ has zero Lebesgue measure. We can thus ignore it and restrict ourselves to Nd-1 where ~ Nd-1 = { y∈N d-1 : yi ≠ yj for all i ≠ j } (5.1) Embedding Stochastic Systems 18 PARTITIONS OF THE C OMPONENTS OF Φ 5.1. As sketched in §3.3 above, our overall strategy will be to make evρ×ρ and evτ transversal to specific submanifolds of Rd×Rd and T Rd. The difficulty in doing this occurs at points where xi = xj with i ≠ j. At such points, the i th and j th components of Φf,ϕ are not cannot be perturbed independently. We therefore want to define independent subsets of such components. We employ the same notation as in [Stark, 1999]. Thus, let I = { I1, I2, … , Iα } be a partition of { 0, … , d-1 }, and define the associated equivalence relation ~I on {0, …, d-1} by i ~I j if and only if i and j are in the same element of the partition. Given any such partition I, let JI be a set containing precisely one element from each Ik for k = 1, … , α . There will typically be many ways to choose such a J I , but we arbitrarily select just one. Clearly JI has α elements. Write these as JI = { j1, j2, α … , jα } with j1 < j2 < … < jα . For a given partition I, define the map Φf,ϕ,I : M×N d-1 → R by Φ f,ϕ,I(x,y) (ϕ( f (j )(x,y)), ϕ( f (j )(x,y)), …, ϕ( f (jα)(x,y))) † = 1 2 (5.2) We can also write this as Φ f,ϕ,I(x,y) (ϕ(xj ), ϕ(xj ), …, ϕ(xjα ))† = 1 2 where xi = f (i)(x,y). This has the advantage that it emphasizes that we observe the x co-ordinate only, but the disadvantage that it obscures the dependence of xi on y, which can be a potential source of errors when performing calculations. Let ρI : D2r(M×N, M)× C 2r(M, R) → C r(M×N d-1, α R ) be defined in the obvious way by ρI(f,ϕ) Φ f,ϕ,I = (5.3) α and for any y∈N d-1 , define Φf,ϕ,I,y : M → R by Φ f,ϕ,I,y(x) Φ f,ϕ,I(x,y) = Note that ρI = (ρj , ρj , … , ρjα )† where ρi (f,ϕ) = ϕ °f (i) as defined in (4.7). As an immediate consequence of this we get 1 2 Corollary 5.1 The map ρI and its evaluation map evρ are C r for any partition I of { 0, … , d-1 }. If Ξ = ( f,ϕ,x,y) then I T Ξevρ (0f ,ξ,0x ,0y) I (ξ(xj ), ξ(xj ), …, ξ(xjα ))† = 1 2 This corollary is the crucial result underlying our whole approach: it shows that if the points xj , α xj , … , xjα are distinct then T Ξevρ is surjective, and hence transversal to any submanifold of R . ~ α Finally let τI : D 2r(M×N,M)× C 2r(M,R) → C r-1 (TM×N d-1,T R ) be defined by 1 2 I τI( f,ϕ) T 1Φf,ϕ,I = (5.4) where, as above, T1 is the partial tangent operator in the x direction. Thus evτ ( f,ϕ,v,y) I τI( f,ϕ)(v,y) = T xΦ f,ϕ,I,y(v) = Similarly to ρI we have τI = (τj , τj , …, τjα )†, where τi ( f, ϕ) = T1(ϕ ° f (i)) as in (4.10), giving: 1 2 Embedding Stochastic Systems 19 Corollary 5.2 The map τI and its evaluation map evτ are C r-1 for any partition I of { 0, … , d-1 }. If Ξ' = (f,ϕ,v,y) then I T Ξ' evτ (0f ,ξ,0x ,0y ) 5.2. I (ϖ(T x ξ(vj )), ϖ(T x ξ(vj )), … , ϖ(T xα ξ(vjα ))) † = 1 1 2 2 SURJECTIVITY OF evρ̂ Recall from Lemma 4.1 and Corollary 4.2 that ρ̂i : D2r(M×N, M) → Dr(M×N d-1, M) given by ρ̂i (f ) _ _ = f (i ) is C r and T (f,x,y)evρˆ (η,0x,0y) = ηi(x,y) where ηi satisfies (4.5). Let ρ̂ = (ρ̂0 , ρ̂1 , …, ρ̂d-1 ) : D2r(M×N, M) → D r(M×N d-1, Md) and denote the corresponding evaluation map evρ̂ : D 2r(M× N, M)×M×N d-1 → Md. Then evρ̂ is C r and i ~ ~ Lemma 5.3 T (f,x,y)evρ̂ is surjective at all ( f,x,y)∈D2r(M×N, M)×M×Nd-1 , where Nd-1 is defined in (5.1). More precisely, given any (u0, u1, … , ud-1) ∈T x M× … ×Tx M, we can find a η ∈Tf D2r(M× N, M) such that T(f,x,y)evρ̂(η,u0,0y) = (u0, u1, … , ud-1). ~ Proof If y∈Nd-1 then the points { y0, y1, … , yd-2 } are all distinct, and hence { (x0, y0), (x1, y1), … , (xd-2 , yd-2 ) } are distinct. By Corollary C.12 of [Stark, 1999], given any set of ui∈T x M for i = 1, … , d-1 we can find a η∈T f D2r(M×N, M) such that η(xi-1 , yi-1 ) = ui for i = 1, … , d-1. So fix some i∈{ 1, … , d-1 } and some ui∈T x M. Choose η∈T f D2r(M×N, M) such that η(xi-1 , yi-1 ) = ui , η(xi , yi ) _ = -T (x ,y ) f (ui,0) if i ≠ d-1 and η(xj, yj) = 0 for j ≠ i-1, i . Then by (4.6) T (f,x,y)evρ̂ (η,0x,0y ) = ηi(x,y) _ = η(xi-1 , yi-1 ) = ui and if i ≠ d-1 then T(f,x,y)evρ̂ (η,0x,0y ) = η(xi , yi ) + T(x ,y ) f (ηi(x,y),0) = -T(x ,y ) f (ui , _ 0) + T (x ,y ) f (ui ,0) = 0. Furthermore ηj(x,y) = 0 for j ≠ i-1, i and hence T (f,x,y)evρ̂(η,0x,0y ) = (0, … , 0, ui , 0, … , 0)∈T x M× … ×T x M× … ×Tx M . Hence by linearity, given any (u1, … , ud-1 )∈T x M× … ×Tx M, we can find a η∈T f C 2r(M×N, M) such that T(f,x,y)evρ̂(η,0x,0y ) = (0, u1, … , ud-1). 0 d-1 i i i i+1 i i i i i i 0 i d-1 i i d-1 1 It remains to treat the first component. Note that the corresponding proof in Lemma 5.12 of [Stark, 1999] contains an erroneous calculation, but the argument used here is equally valid in that case. Recall that f (0)(x,y) = x and hence given any u0∈T x M, we have by (4.6) T (f,x,y)evρ̂(0f ,u0, 0 y ) = (u0, u1, … , ud-1), for some u1, …, ud-1 whose precise values do not concern us (in fact ui = _ T (x,y) f (i)(u0,0) since if η = 0 we have ηi = 0 by (4.5)). By the first part of the proof, choose η∈ T f D2r(M×N, M) such that T(f,x,y)evρ̂(η,0x,0y ) = (0, u1, … , ud-1). Then T (f,x,y)evρ̂ (-η,u0 ,0y ) = (u 0,0, … , 0), and hence by linearity T(f,x,y)evρ̂ is surjective. ■ 0 5.3. 0 IMMERSIVITY OF Φ As in [Stark, 1999], the basic idea is to make evτ transversal to the zero section in T Rd and then ~ count dimensions. Observe that if for some v∈T x M we have T xΦ f,ϕ,I,y(v) ≠ 0 for some I, then T xΦ f,ϕ,y(v) ≠ 0. Therefore to prove that Φ f,ϕ is immersive at x it is sufficient to show that for all ~ α v∈T x M we have TxΦ f,ϕ,I,y(v) ≠ 0 for some I. If we define the zero section LI in T R by LI = { v∈TM : v = 0 } then our aim is to show that the image of TxΦf,ϕ,y does not intersect LI. As in [Stark, 1999], we proceed by defining a codimension (d-α )m submanifold of Md by ∆I = { (z0, z1, …, zd-1 )∈Md : zi = zj if and only if i ~I j } Embedding Stochastic Systems 20 Recall that ρ̂ = ( ρ̂0, ρ̂1, …, ρ̂d-1 ) : C 2r(M×N, M) → C r(M×N d-1, Md) with ρ̂i as in (4.4), and that evρˆ ~ is C r by Corollary 4.2. Let ιτ be the inclusion ιτ : C r(M×N d-1,Md ) → C r-1 (TM×N d-1,Md) given by ~ ιτ(F) = F ° (τM×Id) where τM : TM → M is the tangent bundle projection, thus ιτ(F)(v,y) = F(x,y) where x∈M is the point such that v∈T xM, ie x = τ M(v). Since ιτ is just composition with fixed ~ ~ α maps, it is C r. Now define τ'I : D2r(M×N, M)× C 2r(M, R) → C r-1 (TM×Nd-1 , T R ×Md) by τ'I( f,ϕ) (τI( f,ϕ), ιτ ° ρ̂( f )) = ~ ~ ~ Note that we have restricted the domain of τ'I(f,ϕ) to TM×Nd-1 , where Nd-1 is defined in (5.1). In other words we only consider those y∈N d-1 in which no two components are the same. Since ~ ~ ~ α Nd-1 is not compact there is no natural manifold structure on C r-1 (TM×Nd-1 , T R ×Md) and we cannot speak of τ'I being smooth. However, it is only really the evaluation map evτ' : D2r(M×N, ~ ~ α M)× C 2r(M, R)×TM×Nd-1 → T R ×M that we need. This is given by I evτ' ( f,ϕ,v,y) I = (evτ ( f,ϕ,v,y),evρ̂( f,τM(v),y)) = (τI( f,ϕ)(v,y),ρ̂( f )(x,y)) = (T xΦ f,ϕ,I,y(v),( f (0)(x,y), f (1)(x,y), … , f (d-1)(x,y))) I where x = τM(v). Since evτ (by Lemma 4.4), evρ̂ (by Corollary 4.2) and τM are all C r-1 so is evτ' . We now claim that I I Proposition 5.4 Given any partition I of {0, … , d-1 }, T(f,ϕ,v,y)evτ' is surjective at all (f,ϕ,v,y)∈ ~ ~ D2r(M×N, M)× C 2r(M, R)×TM×Nd-1 such that ρ̂( f )(x,y)∈∆I . Hence, in particular evτ' is transversal to LI×∆I . I I Proof Suppose that evτ' ( f,ϕ,v,y)∈LI×∆I . We aim to evaluate T Ξevτ' , where Ξ = ( f,ϕ,v,y). We first consider the second component. Let evρ̂τ be the evaluation map of ιτ ° ρ̂, so that evρ̂τ( f,v,y) = evρ̂( f,τM(v),y). Thus by the chain rule T (f,v,y)evρˆτ(η,w,w' ) = T (f,x,y)evρˆ(η,T vτM(w),w' ) where x = τM(v). Let ρ̂( f )(x,y) = z ∈∆I . Then by Lemma 5.3, given any u = (u0, u1, … , ud-1)∈T z∆I , there exists a η∈T f D2r(M×N,M) such that T (f,x,y) evρ̂(η,u0,0y) = u. Furthermore τM is a submersion, ie T vτM is surjective for all v∈TM. This can be shown using local co-ordinates, see for instance the paragraph following Lemma B.4 of [Stark, 1999]. Hence there exists a w ∈T v(TM) such that T vτM(w) = u0 and hence T (f,v,y)evρ̂τ(η, w,0y ) = u (so that in particular T(f,v,y)evρˆτ is surjective). I I Turning to the first component, for any ξ∈T ϕ C 2r(M,R) we have by linearity T Ξevτ (η,ξ,w,0y ) I = T Ξevτ (η,0ϕ,w,0y ) + TΞevτ (0η ,ξ,0v ,0y ) = ~ u + TΞevτ (0η ,ξ,0v ,0y ) I I I α for some ~ u ∈T(T R ), independent of ξ. By Corollary 5.2, T Ξevτ (0f ,ξ,0x ,0y ) = ( ϖ(T x ξ(vj )), ϖ(T x ξ(vj )), … , ϖ(T xα ξ(vjα ))) †, where xi = f (i)(x,y) and vi = T(x,y) f (i)(v,0y ). 2 I 2 1 1 Since evρ̂τ( f,v,y) = evρ̂( f,τM(v),y) = ρ̂( f )(x,y)∈∆I , the points xj , xj , … , xjα are all distinct. For a fixed y, f (j ) is a diffeomorphism and ||v|| = 1 and hence vj ≠ 0 for i = 1, … , α . Thus, by Corol_ α lary C.16 of [Stark, 1999], for any u∈T 0(T R ) there exists a ξ∈T ϕ C 2r(M, R) such that T Ξevτ (0f , _ _ _ _ ξ,0x ,0y ) = u - ~ u . Hence TΞevτ (η,ξ,w,0y ) = ~ u + (u - ~ u ) = u and so T Ξevτ' (η,ξ,w,0y ) = ( u, u). Thus T Ξevτ' is surjective, and in particular transversal to LI ×∆I , as required. ■ 1 2 i i I I I I Embedding Stochastic Systems 21 ~ ~ Observe that dim TM×Nd-1 - codim LI ×∆I = 2m - 1 + n(d-1) - (d- α )m - α, and 2m - 1 + n(d-1) - (d- α )m - α ≤ d - 2 - (d -α)m - α + n(d-1) ≤ (d -α)(1-m) - 2 + n(d-1) < n(d-1) - 1 (5.5) ~ ~ Recall that r = n(d-1) and hence dim TM×Nd-1 - codim LI ×∆I < r - 1. Since evτ' is C r-1 , it satisfies the smoothness condition of the Parametric Transversality Theorem ([Abraham, 1963; Abraham and Robbin, 1967] or see Appendix A of [Stark, 1999]). Thus there is a residual set of ( f,ϕ)∈D2r(M×N, M)× C 2r(M, R) for which τ'I( f,ϕ) is transversal to LI ×∆I . Fix any ( f,ϕ) in this set, ~ ~ α and define τ'f,ϕ,I : Nd-1 → C r-1 (TM, T R ×Md) by τ'f,ϕ,I(y)(v) I τ'I(f,ϕ)(v,y) = so that evτ' ϕ (v,y) = τ'I(f,ϕ)(v,y). Since by our choice of ( f,ϕ), τ'I( f,ϕ) is transversal to LI ×∆I , we see that evτ' ϕ is transversal to LI ×∆I . Also, evτ' ϕ (v,y) = evτ' (f,ϕ,v,y) and so evτ' ϕ is C r-1 . Similarly ~ ~ to above, we have dim TM - codim LI ×∆I = 2m - 1 - (d- α )m - α and hence using (5.5) dim TM - codim LI ×∆I < -1 < r - 1. Therefore, by the Measure Theoretic Finite Dimensional Parametric Transversality Theorem (Appendix A), τ'f,ϕ,I(y) is transversal to LI ×∆I for a residual set of full ~ Lebesgue measure of y in Nd-1 . ~ The dimension of TM is 2m - 1 and that of LI ×∆I is α + dm - (d-α )m = (m+1)α. Using (5.5) ~ again, we have 2m - 1 < (d-α)m + α and hence dim TM + dim LI ×∆I = 2m - 1 + (m+1)α < (d- α )m α ~ ~ + α + (m+1)α = 2α + dm = dim T R ×Md. Now, dim T v(TM) = dim TM, and the dimension of ~ ~ ~ T v[τ'f,ϕ,I(y)](T v(TM)) cannot be greater than that of T v(TM). Hence dim T v[τ'f,ϕ,I(y)](T v(TM)) ≤ ~ ~ dim TM. Suppose that the image of TM intersected LI ×∆I . Denote τ'f,ϕ,I(y)(v) = z∈LI×∆I . Then ~ dim Tz(LI ×∆I) = dim LI ×∆I and hence dim T v[τ'f,ϕ,I(y)](T v(TM)) + dim T z(LI ×∆I) < 2α + dm = α ~ α dim T z(T R ×Md). Hence T v[τ'f,ϕ,I(y)](T v(TM)) and Tz(LI ×∆I) cannot together span T z(T R ×Md) and thus the intersection cannot be transversal. f, ,I f, ,I f, ,I I f, ,I We thus see that if τ'f,ϕ,I(y) is transversal to LI ×∆I , then its image cannot intersect LI ×∆I . Hence, ~ either T xΦ f,ϕ,I,y(v)∉LI or ρ̂( f )(x,y)∉∆I . But for every (x,y)∈M×Nd-1 we have ρ̂( f )(x,y)∈∆I for some I. For this choice of I we must thus have T xΦ f,ϕ,I,y(v)∉LI . But if T xΦ f,ϕ,I,y(v)∉LI , then T xΦ f,ϕ,y(v) ≠ ~ 0. Hence for a residual set of full Lebesgue measure of y we have TxΦ f,ϕ,y(v) ≠ 0 for all v∈TM, so that Φ f,ϕ,y is immersive, as required. ■ 5.4. INJECTIVITY OF Φ This closely parallels §6.3 of [Stark, 1999]. As indicated in §3.3, the strategy is to make Φ f,ϕ × Φ f,ϕ transversal to the diagonal in R d×R d. As in the previous section, we need to need to take particular care at points x such that xi = xj with i ≠ j, but we now additionally also have to consider pairs (x,x' )∈(M×M)\∆ such that xi = x'j (where ∆ is the diagonal in M×M ). As in §6.3 of [Stark, 1999], we proceed by letting R be a subset of JI × JI (possibly empty) and setting JI,R = { i∈JI : (i,i' )∉R for any i'∈JI } Embedding Stochastic Systems 22 For each choice of R, we aim to restrict to points (x,x' )∈(M×M)\∆ such that xi = x'i' if and only if (i,i' )∈R. Let β be the number of elements in R and γ the number of elements in JI,R . Since JI,R is obtained by removing at most β elements from JI, we have γ ≥ α - β. If γ > 0, write JI,R = {j'1, j'2, … , j'γ }, with j'1 < j'2 < … < j'γ . Our aim will be to ensure that the sets {xj' , xj' , … , xj'γ } and {x'j' , x'j' , … , x' j'γ} are distinct. As in §5.6 and §6.3 of [Stark, 1999], we thus define the map ρˆ' : ~ D2r(M×N,M) → C r((M×M)\∆×Nd-1 , M d×M d ) by 1 2 1 2 ρ̂'( f )(x,x',y) = (ρ̂( f )(x,y),ρ̂( f )(x',y)) = (( f (0)(x,y), … , f (d-1)(x,y)), ( f (0)(x',y), … , f (d-1)(x',y))) and the set ∆I,R { (z0, … , zd-1 , z'0, … , z'd-1 )∈M d×M d : (z0, … , zd-1 )∈∆I , and = if (i,i' )∈JI ×JI then zi = z'i' if and only if (i,i' )∈R } Then if evρˆ'( f,x,x',y) = ρ̂'( f )(x,x',y)∈∆I,R , we see that the points { xj' , xj' , … , xj'γ } are all distinct and the sets {xj' , xj' , … , xj'γ } and {x'j' , x'j' ,… , x'j'γ} do not intersect. Observe that for points in ∆I,R we may independently set α of the points (z0, … , zd-1 ). The relation R then fixes β of the points (z'0, … , z'd-1), leaving us free to fix the remaining d- β points. Thus the dimension of ∆I,R is (d-β+ α )m or in other words its codimension in M d×M d is (d-α R)m, where αR = α - β. Next, γ define the map Φ f,ϕ,I,R : M ×N d-1 → R by 1 2 1 1 2 2 Φ f,ϕ,I,R(x,y) = (ϕ( f (j' )(x,y)), ϕ( f (j' )(x,y)), … , ϕ( f (j'γ)(x,y))) † = (ϕ(xj' ), ϕ(xj' ), …, ϕ(xj'γ ))† 1 1 2 2 with Φ f,ϕ,I,R(x,y) = 0 if γ = 0 (using the convention that R0 = {0}). Then, as usual, define Φ f,ϕ,y,I,R γ : M → R by Φ f,ϕ,y,I,R(x) Φ f,ϕ,I,R(x,y) = Observe that if Φ f,ϕ,y,I,R(x) ≠ Φ f,ϕ,y,I,R(x' ) then Φf,ϕ,y(x) ≠ Φ f,ϕ,y(x' ). Thus if for all (x,x' )∈(M×M)\∆ we have Φ f,ϕ,y,I,R(x) ≠ Φ f,ϕ,y,I,R(x' ) for some I,R, then Φ f,ϕ,y is injective on M. Define ρI,R : D2r(M× ~ γ γ N,M)× C 2r(M, R) → C r((M×M)\∆×Nd-1 , R ×R ) by ρI,R( f,ϕ)(x,x',y) = (Φ f,ϕ,I,R(x,y),Φ f,ϕ,I,R(x',y)) ~ γ γ and ρ'I,R : D2r(M×N,M)× C 2r(M, R) → C r((M×M)\∆×Nd-1 , R ×R ×Md×M d ) by ρ'I,R( f,ϕ) = (ρI,R( f,ϕ),ρ̂'( f )) We thus have evρ' ( f,ϕ,x,x',y) I,R = (evρ ( f,ϕ,x,x',y),evρ̂'( f,x,x',y)) = (ρI,R( f,ϕ)(x,x',y),ρ̂'( f )(x,x',y)) = (Φ f,ϕ,I,R(x,y),Φ f,ϕ,I,R(x',y),ρ̂'( f )(x,x',y)) I ,R Embedding Stochastic Systems 23 ((ϕ( f (j' )(x,y)), ϕ( f (j' )(x,y)), … , ϕ( f (j'γ)(x,y))), = 1 2 (ϕ( f (j' )(x',y)), ϕ( f (j' )(x',y)), … , ϕ( f (j'γ)(x',y))), 1 2 ( f (0)(x,y), … , f (d-1)(x,y)),( f (0)(x',y), … , f (d-1)(x',y))) ((ϕ(xj' ), ϕ(xj' ), …, ϕ(xj'γ )),(ϕ(x'j' ), ϕ(x'j' ), …, ϕ(x'j'γ )), = 1 2 1 2 (x0, … , xd-1 ), (x'0, … , x'd-1 ))) By Lemma 4.3 evρ is C r and by Corollary 4.2 evρ̂' is C r and hence evρ' is C r. Finally, let ∆γ be γ γ the diagonal in R×R . I ,R I,R Proposition 5.5 Given any partition I of { 0, … , d-1 }, and any subset R of J I × JI (possibly empty) evρ' is transversal to ∆γ ×∆ I,R for all I,R. I,R Proof Denote Ξ = ( f,ϕ,x,x',y) and suppose that evρ' (Ξ)∈∆γ ×∆ I,R , for some (x,x',y)∈(M×M\∆)× ~ ~ Nd-1 . As in Proposition 5.4, we first consider the second component. For y∈Nd-1 the points { y0, y1, … , yd-2 } are disjoint, and hence (xi , yi ) ≠ (xi' , yi'), (x'i , yi ) ≠ (xi' , yi') and (xi , yi ) ≠ (x'i ' , yi') for all i ≠ i'. Furthermore x ≠ x' (recall we only consider (x,x' )∈(M×M\∆)) and fy is a diffeomorphism for all yi ∈N and hence (xi , yi ) ≠ (x'i , yi ). Hence { (x0, y0), (x1, y1), … , (xd-2 , yd-2 ), (x'0, y'0), (x'1, y'1), … , (x'd-2 , y'd-2 ) } are disjoint. By Corollary C.12 of [Stark, 1999], given any set of ui∈T x M and u'i ∈ T x' M for i = 1, … , d-1 we can find a η∈T f D2r(M×N, M) such that η(xi-1 , yi-1 ) = ui and η(x'i-1 , yi-1 ) = u'i for i = 1, … , d-1. Hence by a straightforward extension of the proof of Lemma 5.3, given any (u1, … , ud-1)∈T x M× … ×Tx M and (u'1, … , u'd-1)∈T x M× … ×Tx M we can find a η∈T f C 2r(M× N, M) such that T( f,x,x',y)evρˆ'(η,0x,0x' ,0y ) = ((0, u1, … , ud-1),(0, u'1, … , u'd-1)). On the other hand, as in the second part of the proof of Lemma 5.3, T ( f,x,x',y)evρ̂'(0η , u0 ,0x' ,0y ) = ((u0, u1, … , ud-1),(0, … , 0)) and hence given any u0∈T x M, there exists η∈T f C 2r(M× N, M) such that T( f,x,x',y)evρ̂'(η,u0 ,0x' , 0 y ) = ((u0, 0, … , 0),(0, … , 0)). Similarly given any u'0∈T x' M, there exists η∈T f C 2r(M×N, M) such that T ( f,x,x',y)evρ̂'(η,0x,u'0 ,0y ) = ((0, … , 0),(u'0, 0, … , 0)). Hence by linearity given any (u0, … , ud-1)∈T x M× … ×Tx M and (u'0, … , u'd-1)∈T x' M× … ×Tx' M we can find a η∈T f C 2r(M×N, M) such that T ( f,x,x',y)evρ̂'(η,u0 ,u'0 ,0y ) = ((u0, … , ud-1),(u'0, … , u'd-1)) (so that in particular T( f,x,x',y)evρ̂' is surjective). I,R i 1 d-1 1 0 d-1 0 0 d-1 0 i i d-1 γ γ Now let us turn to the other component T Ξevρ of T Ξevρ' . If γ = 0, then T(R ×R ) is a single point, and hence the surjectivity of T( f,x,x',y)evρˆ' is sufficient to ensure the surjectivity of TΞevρ' . If γ > 0, then for any ξ∈T ϕ C 2r(M,R) we have by linearity I,R I ,R T Ξevρ (η,ξ,u0 ,u'0 ,0y ) I ,R I,R = T Ξevρ (η,0ϕ ,u0 ,u'0 ,0y ) + TΞevρ (0η ,ξ,0x ,0x' ,0y ) = (~ u ,~ u ' ) + TΞevρ (0η ,ξ,0x ,0x' ,0y ) I ,R I ,R I ,R γ γ γ γ ~ ,u ~ ' ) is for some (~ u ,~ u ' )∈T (z,z)(R ×R ) = T zR ×TzR , where ρI,R( f,ϕ)(x,x',y) = (z,z)∈∆γ . Note that (u independent of ξ. By (4.9), T Ξevρ (0η ,ξ,0x ,0x' ,0y ) = ((ξ(xj' ), ξ(xj' ), … , ξ(xj'γ )),(ξ(x'j' ), ξ(x'j' ), … , ξ(x'j'γ ))). Since ρ̂'( f )(x,x',y)∈∆I,R , the points xj , xj , … , xjα are all distinct and { xj , xj , … , xjα } ∩ { __ x'j' , x'j' , … , x'j'γ } = ∅. Thus, as in the proof of Proposition 6.2 of [Stark, 1999], given any (u,u' )∈ γ γ T zR ×TzR , by Corollary C.12 of [Stark, 1999] there exists a ξ∈T ϕ C 2r(M, R) such that TΞevρ (0η , _ _ _ _ _ ~- ~ ξ,0x ,0x' ,0y ) = ( u-u',0z) - (u u ',0z). Then T Ξevρ (η,ξ,u0,u'0,0y ) = ( ~ u ,~ u ') + (u-u',0z) - (u~- ~ u ',0z) = ( u _ ~ ~ _ _ __ - u' + u ',u ' ). But (u',u' )∈T (z,z)∆γ and ( ~ u ',~ u ' )∈T (z,z)∆γ , and hence (u,u' )∈ Image T Ξevρ + T (z,z)∆γ . I ,R 1 2 1 2 1 2 1 1 2 I ,R 2 I ,R I ,R Embedding Stochastic Systems γ 24 γ Thus Image TΞevρ + T (z,z)∆γ = T (z,z)(R ×R ), and since as we have shown above, the second component of TΞevρ' is surjective, this implies that evρ' is transversal to ∆γ × ∆I,R as required. ■ I ,R I,R I,R Now, just as in the proof of immersivity in § 5.3 all that remains is to count dimensions and ap~ ply the Parametric Transversality Theorem. We have dim (M×M)\∆×Nd-1 - codim ∆γ ×∆I,R = 2m + n(d-1) - γ - (d- α R)m. Since γ ≥ αR , we have 2m + n(d-1) - γ - (d- α R)m ≤ d - 1 - (d -αR)m - αR + n(d-1) ≤ (d -αR)(1-m) - 1 + n(d-1) < n(d-1) (5.6) ~ Recall that r = n(d-1) and hence dim (M×M)\∆×Nd-1 - codim ∆γ ×∆I,R < r. Since evρ' is C r, it satisfies the smoothness condition of the Parametric Transversality Theorem ([Abraham, 1963; Abraham and Robbin, 1967] or see Appendix A of [Stark, 1999]). Thus there is a residual set of ( f,ϕ)∈D2r(M×N, M)× C 2r(M, R) for which ρ'I,R( f,ϕ ) is transversal to ∆γ ×∆I,R . Fix any ( f,ϕ ) in ~ γ γ this set, and define ρ'f,ϕ,I,R : Nd-1 → C r((M×M)\∆, R ×R ×Md×M d) by I,R ρ'f,ϕ,I,R(y)(x,x' ) = ρ'I,R( f,ϕ)(x,x',y) evρ' ϕ (x,x',y) = ρ'f,ϕ,I,R(y)(x,x' ) Then f, ,I,R Since by our choice of (f,ϕ), ρ'I,R( f,ϕ) is transversal to ∆γ ×∆I,R , we see that evρ' ϕ is transversal to ∆γ ×∆I,R . Also, evρ' ϕ is C r and dim (M×M)\∆ - codim ∆γ ×∆I,R = 2m - γ - (d- α R)m. Hence, using (5.6) we have dim (M×M)\∆ - codim ∆γ ×∆I,R < 0. Thus by the Measure Theoretic Finite Dimensional Parametric Transversality Theorem (Appendix A), ρ'f,ϕ,I,R(y) is transversal to ∆γ ×∆I,R ~ for a residual set of full Lebesgue measure of y in Nd-1 . f, ,I,R f, ,I,R The dimension of (M×M)\∆ is 2m and that of ∆γ ×∆I,R is γ + 2dm - (d-α R)m. Using (5.6) again, we have 2m - (d- α R)m < γ and hence dim (M×M)\∆ + dim ∆γ ×∆I,R = 2m + γ + 2dm - (d- α R)m < γ γ 2 γ + 2dm = dim R ×R ×Md×M d. Now, dim T (x,x' )(M×M)\∆ = dim (M×M)\∆, and the dimension of T(x,x')[ρ'f,ϕ,I,R(y)](T(x,x')(M×M)\∆) cannot be greater than that of T(x,x')(M×M)\∆. Suppose that the image of (M×M\∆) intersected ∆γ ×∆I,R . Denote ρ'f,ϕ,I,R(y)(x,x' ) = z∈∆γ ×∆I,R . Then dim Tz(∆γ × ∆ I,R) = dim ∆ γ ×∆I,R and hence dim T (x,x')[ρ'f,ϕ,I,R(y)](T(x,x')(M×M)\∆) + dim T z(∆γ ×∆I,R) < 2γ + γ γ 2dm = dim R ×R ×Md×M d. Hence T (x,x')[ρ'f,ϕ,I,R(y)](T(x,x')(M×M)\∆) and T z(∆γ ×∆I,R) cannot toγ γ gether span T z(R ×R ×Md×M d) and thus the intersection cannot be transversal. We thus see that if ρ'f,ϕ,I,R(y) is transversal to ∆γ ×∆I,R , then its image cannot intersect ∆γ ×∆I,R . Hence, either (Φ f,ϕ,I,R(x,y),Φ f,ϕ,I,R(x',y))∉∆γ or ρ̂'( f )(x,x',y)∉∆I,R. But for every (x,x',y)∈(M×M)\∆ × ~ Nd-1 we have ρ̂'( f )(x,x',y)∉∆I,R for some I,R. For this choice of I and R we must thus have (Φ f,ϕ,I,R(x,y),Φ f,ϕ,I,R(x',y))∉∆ γ . But by definition this means that Φ f,ϕ,I,R(x,y) ≠ Φ f,ϕ,I,R(x',y) and hence Φ f,ϕ(x,y) ≠ Φ f,ϕ(x',y). Hence for a residual set of full Lebesgue measure of y we have Φ f,ϕ(x,y) ≠ Φ f,ϕ(x',y) for all (x,x' )∈(M×M)\∆, so that Φ f,ϕ,y is injective, as required. ■ Embedding Stochastic Systems 5.5. 25 LOWER D EGREES OF SMOOTHNESS We can deduce Theorem 3.4 in the case where f and ϕ are C k for 1 ≤ k < 2n(d-1) using an identical argument to that in §6.4 of [Stark, 1999]. Fix k such that 1 ≤ k < 2n(d-1) and denote ~ B k = Dk(M×N, M)× C k(M, R). For ( f,ϕ)∈B k let N( f,ϕ) ⊂ N d-1 be the set of y such that Φf,ϕ,y is an ~ embedding, and let B( f,ϕ) = N d-1\N( f,ϕ). Note that since Φ f,ϕ,y depends continuously on y and ~ embeddings are open in C k(M, R d), the set N( f,ϕ) is necessarily open. Given an ε > 0, define ~ ~ the ε-neighbourhood of B( f,ϕ) by B( f,ϕ,ε) = { y∈N d-1 : d (y,B( f,ϕ)) < ε }, where d is the metric ~ on N( f,ϕ). ~ Let E k be the set of ( f,ϕ) in B k for which ν(N( f,ϕ)) = 1, where ν is Lebesgue measure. By §5.3 and §5.4, E n(d-1) is dense in B n(d-1). Since the latter is dense in B k we see that E n(d-1) is dense in B k. But E n(d-1) ⊂ E k, and hence E k is dense in B k. This space is separable and hence we may choose a countable set { ( fi ,ϕi)∈E k : i∈N } which is dense in B k. For any i, we have B( fi ,ϕi) I B( f ,ϕ ,ε) = i i ε >0 and since µ(B( fi ,ϕi)) = 0, we have ν(B( fi ,ϕi,ε)) → 0 as ε → 0. Hence given δ > 0, we may choose an ε(i,δ) > 0 such that ν(N d-1 \B( fi ,ϕi,ε(i,δ))) > 1 - δ. Now, N d-1 \B( fi ,ϕi,ε(i,δ)) is closed and hence compact. Using the continuity of Φf,ϕ,y with respect to f,ϕ and y and the density of embeddings in C k(M, Rd), we can find an open neighbourhood N( fi ,ϕi,δ) of ( fi ,ϕi) in B k such that Φ f,ϕ,y is an embedding for all ( f,ϕ)∈N( fi ,ϕi,δ) and y∈N d-1 \B( fi ,ϕi,ε(i,δ)). Since { ( fi ,ϕi) : i∈N } is dense in B k, the union of N( fi ,ϕi,δ) is open and dense in B k. Thus if we let δn be a sequence such that δn → 0 and define N I U N( f ,ϕ ,δ ) = n∈N i i n i∈N we see that N is residual. Now if ( f,ϕ)∈N, then there exists a sequence in such that ( f,ϕ)∈ N( fi ,ϕi ,δn ) for each n. Thus Φf,ϕ,y is an embedding for all y∈N d-1 \B( fi ,ϕi ,ε(i,δn)) for each n. Hence if we define n n ~ N'( f,ϕ) UN = d-1 n n \B( fi ,ϕi ,ε(i,δn)) n n n∈N ~ ~ ~ then N'( f,ϕ) ⊂ N( f,ϕ). Since ν (N d-1 \B( fi ,ϕi ,ε(i,δn))) → 1 as n → ∞, we have ν (N'( f,ϕ)) = 1 and ~ hence ν (N( f,ϕ)) = 1. Thus ( f,ϕ)∈E k, and so N ⊂ E k. Therefore E k contains a residual set, as required. n n 6. PROOF OF THEOREM 2.4 As in the proof for dim N > 0, we have Φf,ϕ,ω(x) = Φf,ϕ,y(x) where y = ( ω 0 , … , ωd-2 )∈N d-1 and it is thus sufficient to show that for and open dense set of (f,ϕ)∈Dr(M×N, M)× C r(M,R) the map Φ f,ϕ,y is an embedding for every y∈N d-1. As usual, we first prove the theorem for a sufficiently large r, and then show that this implies the theorem for all r ≥ 1. Unlike in the previous section for dim N > 0, the latter argument turns out to be elementary in the present context (as it is in the standard Takens’ Theorem). In particular, embeddings are open in C r(M, R d ) (eg [Hirsch, 1976]), and for a fixed y, Φ f,ϕ,y depends continuously on f and ϕ. Thus, for a fixed y the set of Embedding Stochastic Systems 26 (f,ϕ) for which Φf,ϕ,y is an embedding is open in Dr(M×N, M)× C r(M, R) for any r ≥ 1. If dim N = 0, there is only a finite number of y in Nd-1 , and hence the set of (f,ϕ) for which Φ f,ϕ,y is an embedding for all y∈Nd-1 is also open in Dr(M×N, M)× C r(M, R). On the other hand Dr(M×N, M)× C r(M, R) is dense in D1(M×N, M)× C 1(M, R) for any r ≥ 1. Hence, if we show that for some r ≥ 1, the set of (f,ϕ) for which Φf,ϕ,y is an embedding for all y∈Nd-1 is dense in Dr(M×N, M)× C r(M, R), then this set is also dense in D1(M×N, M)× C 1(M, R), as required. It turns out that as in the proof of the standard Takens’ Theorem in [Stark, 1999], it is sufficient to take r = 3. However, in order to use the calculations in §4, and to be consistent with the proof for dim N > 0, we shall work in D2r(M×N, M)× C 2r(M, R) with r = 2. As already indicated in §3.4, we begin be showing that for an open dense set of f the short periodic orbits are isolated, are distinct for different sequences, and have distinct eigenvalues. As in the proof of the standard Takens’ Theorem, we shall concern ourselves with sequences of length less than 2d. 6.1. GENERIC PROPERTIES OF PERIODIC ORBITS FOR IFS’S We begin by giving a precise definition of a periodic orbit, and its minimal period. Definition 6.1 Given a finite sequence (y0, y1, … , yq-1 )∈Nq, let p be the least integer with 0 < p ≤ q for which yi+p = yi for all i = 0, … , q-p-1, or in words so that we can write (y0, y1, … , yq-1 ) as k repeats of (y0, y1, … , yp-1 ), where q = kp. We shall denote this by (y0, … , yp-1 , y0, … , yq-1 , … , y0, … , yq-1 ) = (y0 , y1, … , yp-1 )k. We call p the prime length of (y0, y1, … , yq-1 ) and refer to the initial sequence (y0, y1, … , yp-1 ) of length p as the prime factor of (y0, y1, … , yq-1 ). The sequence (y0, y1, … , yq-1 )∈Nq is called prime if p = q. By a periodic orbit of f∈D2r(M×N d-1, M) we mean a periodic orbit of the skew product (2.2) in the usual sense. It thus consists of an orbit { (xi ,yi ) } such that (xi+q ,yi+q ) = (xi ,yi ) for all i∈Z, where q is the period of the orbit. Any periodic orbit is thus associated with a finite sequence (y0, y1, … , yq-1 )∈Nq and can be defined by Definition 6.2 A periodic point of a sequence (y0, y1, … , yq-1 )∈Nq is a fixed point of fy …y , that is a point x∈M such that xq = fy …y (x) = x. The minimal period of such an orbit is kp, where k is the least positive integer such that (fy …y )k(x) = x and (y0, y1, … , yp-1 ) is the prime factor of (y0, y1, … , yq-1 ). A periodic orbit is hyperbolic if Tx fy …y has no eigenvalues on the unit circle. q-1 q-1 0 0 p-1 0 q-1 0 Denote the set of periodic orbits for a particular sequence (y0, y1, … , yq-1 ) by Py …y 0 = q-1 { x∈M : fy q-1 …y0 (x) = x } Note that unlike the standard case, it is not necessarily true that if x∈Py …y then xi ∈Py …y where xi = fy …y (x). Instead, we trivially have: 0 i-1 q-1 0 q-1 0 Lemma 6.3 Suppose that x is a periodic orbit of (y0, y1, … , yq-1 )∈Nq. Then xi is a periodic orbit for (yi , yi+1, … , yq-1 , y0, … , yi-1 )∈Nq Proof By definition x0 = xq = fy (xi ) and xi = fy q-1 …yi i-1 …y0 (x0). Thus fy (xi ) = xi . i-1 …y0yq-1 …yi ■ Embedding Stochastic Systems 27 The next lemma shows that the minimal period divides q and that the orbit segment {x0, x1, … , xq-1 } is just the initial segment {x0 , x1, … , xkp-1 } repeated r times, where q = kpr. This is well known for standard dynamical systems, but perhaps not entirely obvious in the present setting. Lemma 6.4 Suppose that x is a periodic orbit of (y0, y1, … , yq-1 )∈Nq with minimal period kp. Then xi = x(i mod kp) and yi = y(i mod kp) for all i with 0 ≤ i ≤ q. Proof Write q = kpr + j, for 0 ≤ j < kp, so that j = q mod kp. By definition, ( fy …y )k(x0) = x0 , and since (y0 , y1, … , ykpr-1 ) = (y0 , y1, … , yp-1 )kr, we have xkpr = fy …y (x0) = ( fy …y )kr(x0 ) = x0 . Hence fy …y (x0) = fy …y (xkpr) = xq = x0 . Since q is divisible by p, so is j, say j = k'p, and (ykpr , ykpr+1, … , yq-1 ) = ( y0, y1, … , yp-1 )k'. Thus ( fy …y )k'(x0) = x0 , with k' < k. But, by definition k is the least integer k > 0 such that ( fy …y )k(x0) = x0 and hence we must have k' = 0. Thus (y0, y1, … , yq-1 ) consists of r' repeats of ( y0, y1, … , ykp-1 ) and xi = fy …y (x0) = fy …y ( fy …y )r'(x0) = fy …y (x0) where i = kpr' + j', with j' = i mod kp. But (ykpr', ykpr'+1, … , yi-1 ) consists of the initial j' symbols of (y0, y1, … , yp-1 ), in other words (ykpr', ykpr'+1, … , yi-1 ) = (y0, y1, … , yj'-1 ). Thus xi = fy …y (x0) = xj' , as required. ■ q-1 kpr q-1 kpr-1 0 p-1 0 p-1 0 kpr p-1 p-1 0 0 i-1 0 i-1 kpr' kp-1 0 i-1 kpr' j'-1 0 Note however, that unlike in the usual definition for a single f, is possible to have fy …y (x) = x for some i such that 0 < i < kp. This poses a serious obstacle in trying to generalise the argument of §4.2 of [Stark, 1999] to the present case, and much of the following construction is designed to overcome this difficulty. Thus define i-1 B1(q) 0 { f∈D2r(M×N d-1, M) : Py …y consists of a finite number of points for all (y0, y1, … , yi-1 )∈Ni for all 0 < i < q } = 0 i-1 B2(q) { f∈D2r(M×N d-1, M) : if Py …y ∩ Py' …y' ≠ ∅ for some (y0, y1, … , yi-1 ) ≠ (y'0, y'1, … , y'i'-1 ) with 0 < i,i' < q then (y0, y1, … , yi-1 ) and (y'0, y'1, … , y'i'-1 ) have the same prime factor } = 0 i-1 0 i'-1 B3(q) { f∈D2r(M×N d-1, M) : all x∈Py …y are hyperbolic for all (y0, y1, … , yi-1 )∈Ni for all 0 < i < q } = 0 i-1 and finally B (q) B1(q) ∩ B2(q) ∩ B3(q) = with the convention that B (1) = D2r(M×N d-1, M). Note that if (y0, y1, … , yi-1 ) and (y'0, y'1, … , y'i'-1 ) have the same prime factor and x∈(Py …y ∩ Py' …y' ) then x is the same periodic orbit under both (y0, y1, … , yi-1 ) and (y'0, y'1, … , y'i'-1 ). The following lemma is trivial, but is the key step in our proof of Theorem 3.4 for dim N = 0: 0 i-1 0 i'-1 Lemma 6.5 Suppose that f∈B (q) and for some (y0, y1, … , yq-1 )∈Nq and some x∈M we have xq = xi = x0 for some 0 < i < q. Then (y0, y1, … , yi-1 ) and (y i , yi-1 , … , yq-1 ) are both powers (ie repeats) of the prime factor of (y0, y1, … , yq-1 ), and i is a multiple of the prime length of (y0, y1, … , yq-1 ). Proof We have fy …y (x0) = x0 and fy …y (x0) = fy …y (xi ) = xi = x0 . Thus x0∈(Py …y ∩ Py …y ) and since both 0 < i < q and 0 < q-i < q, the definition of B (q) implies that (y0, y1, … , yi-1 ) and (yi , yi-1 , … , yq-1 ) have the same prime factor, which we denote (y0, y1, … , yp-1 ). Hence (y0, y1, … , yi-1 ) = i-1 0 q-1 i q-1 i 0 i-1 i q-1 Embedding Stochastic Systems 28 (y0, y1, … , yp-1 )k and (yi , yi-1 , … , yq-1 ) = (y0, y1, … , yp-1 )k' for some k,k' with i = kp and q-i = k'p. Thus q = (k+k' )p and (y0, y1, … , yq-1 ) = (y0, y1, … , yp-1 )k+k' so that (y0, y1, … , yp-1 ) is also the prime factor of (y0, y1, … , yq-1 ). ■ Corollary 6.6 Suppose that f∈B (q) and x is a periodic orbit of (y0 , y1, … , yq-1 )∈Nq for f. Then the minimal period of x is the least i with 0 < i ≤ q such that xi = x0. Proof Let i be the least i with 0 < i ≤ q such that xi = x0. By Lemma 6.5, ( y0, y1, … , yi-1 ) = (y0, y1, … , yp-1 )k for some k. Thus (fy …y )k(x) = x, and since i is the least integer such that xi = x0, k is the least integer such that (fy …y )k(x) = x. Hence kp = i is the minimal period of x. ■ p-1 p-1 0 0 Corollary 6.7 Suppose that f∈B (q) and x is a periodic orbit of (y0 , y1 , … , yq-1 )∈Nq for f with minimal period i. Then xj = xj' if and only if j = j' mod i for any 0 ≤ j, j' < q. In particular the points {x0, x1, … , xi-1 } are all distinct, ie xj ≠ xj' for all 0 ≤ j < j' < q. Proof Suppose that xj = xj' for some 0 ≤ j < j' < q. By Lemma 6.4 , xj' = x(j' mod i) and hence we may assume without loss of generality that j < j' ≤ j+i. Recall from Lemma 6.3 that xj is a periodic orbit of the sequence (yj, yj+1, … , yq-1 , y0, … , yj-1)∈Nq for f. Since xj' = xj, Corollary 6.6 implies that the minimal period i' of xj is less than or equal to j' - j, and hence i' ≤ j' - j ≤ i. But, by Lemma 6.4, xi' = x0 and hence Corollary 6.6 implies that the minimal period i of x0 for the sequence (y0, y1, … , yq-1 ) is less than or equal to i', in other words i ≤ i'. Thus i' = j' - j = i, and j = j' mod i as claimed. ■ This corollary shows that for f in B(q), periodic orbits behave as for standard dynamical systems. Analogously to Lemma 4.8 of [Stark, 1999] we now aim to show that B (q) is open and dense in D2r(M×N, M). Since the proof here is somewhat longer, we break it up into a number of steps, dealing with each Bk(q) in turn. The overall proof is by induction, so that we will show that if for some q ≥ 0, B(q) is open and dense in D2r(M×N, M) then each of the Bk(q+1) is open dense in B (q). _ For any given y = (y0, y1, … , yq-1 )∈Nq define ρy : D2r(M×N d-1, M) → Dr(M,M) by _ ρy( f ) = fy q-1 …y0 Thus evρ_ ( f,x) = evρ̂ ( f,x,y) where ρ̂q is as in §4.2. Thus by Corollary 4.2, evρ_ is C r and T (f,x)evρ_ (η, _ _ _ _ 0 x ) = ηq(x,y), with ηq satisfying ηj(x,y) = η(xj-1,yj-1) + T (x ,y ) f (ηj-1(x,y),0). Since T(x ,y ) f (v,0) = T x fy (v) we have y q y j-1 i-1 j-1 y i-1 i-1 i-1 q _ ηq(x,y) ∑T = (η(xj-1,yj-1)) f (6.1) xj yq-1 …yj j =1 with the convention that T x fy …y = Id if q = j. Now, suppose that x is periodic for y = (y0, y1, … , yq-1 ). Let the minimal period of x be i, so that by Lemma 6.4 xj = x(j mod i). Then we can rearrange the sum in (6.1) to give j q-1 j r −1 _ ηq(x,y) = ∑ i (T x fy 0 i-1 …y0 s =0 where q = ri. Let T (r) be the linear operator )s ∑T j =1 f (η(xj-1,yj-1)) xj yi-1 …yj (6.2) Embedding Stochastic Systems 29 r −1 T (r) ∑ (T = )s f (6.3) x0 yi-1 …y0 s =0 with the convention that T (0) = Id. A simple calculation yields: Lemma 6.8 Suppose that f∈B (q' ) for some 0 < q ≤ q', x is periodic for (y0, y1, … , yq-1 )∈Nq, with minimal period i. Then T (r) is invertible. Proof By considering the Jordan Normal Form of T x fy …y we see that the eigenvalues of T (r) are of the form 1 + λ + … + λ r-1 , where λ is an eigenvalue of T x fy …y . Since xi = x0, the segment {x0, x1, … , xi-1 } is a periodic orbit for the sequence (y0, y1, … , yi-1 ). By definition, B (q') ⊂ B (q), and thus Tx fy …y has no eigenvalues on the unit circle and hence in particular λ r ≠ 1. But (1+λ+ …+λ r-1 )(1-λ) = 1-λ r, and so the eigenvalues of T (r) are all non-zero. ■ 0 i-1 0 0 0 i-1 0 i-1 0 Essentially the same proof as that of Lemma 4.8 of [Stark, 1999] gives: Lemma 6.9 Suppose that f∈B (q') for some 0 < q ≤ q', x is periodic for (y 0, y1, … , yq-1 )∈Nq, with minimal period i. Then given any u∈T x M, there exists a η∈T f D2r(M×N, M) such that T(f,x)evρ_ (η, 0 x) = u. Thus in particular T(f,x)evρ_ is surjective. y y Proof By Corollary 6.7 the points {x0, x1, … , xi-1 } are all distinct. As in the proof of Lemma 5.3, we can use Corollary C.12 of [Stark, 1999] to construct a η ∈T f D2r(M×N, M) such that η(xj, yj) = 0x for j = 0, … , i-2, and η(xi-1 , yi-1 ) takes on whatever value we want. By Lemma 6.8, T (r) is invertible, and hence given u∈T x M, we can choose η(xi-1 , yi-1 ) = (T (r))-1 (u). Substituting this into (6.2) immediately gives T (f,x)evρ_ (η,0x) = u, as required. ■ i+1 y Corollary 6.10 If for some q > 0, B (q) is open and dense in D2r(M×N, M) then B1(q+1) is open and dense in B (q), and hence open and dense in D2r(M×N, M). Proof If B (q) is open and dense in D2r(M×N, M), then it is a Banach manifold, and Tf B (q) = T f D2r(M×N, M) for all f∈B (q). Let ∆ = { (x,x' )∈M×M : x = x' } be the diagonal in M×M. For any _ given y = (y0, y1, … , yq-1 )∈Nq define ρ : B (q) → Dr(M,M×M) by _ ρ _ (CId, ρy) = where CId : B (q) → Dr(M,M) is the constant map CId( f ) = Id. Thus, evρ_ is C r and T (f,x)evρ_ (η,0x) = (0x,evρ_ (η,0x)). If evρ_( f,x)∈∆ then xq = x0 so that x0 is a periodic orbit for the sequence (y0, y1, … , yq-1). Thus, by Lemma 6.9, Image T (f,x)evρ_ = {0x}×Tx M. On the other hand T (x,x)∆ = { (u,u' )∈ T (x,x)(M×M) : u' = u } and hence Image T(f,x)evρ_ + T(x,x)∆ =T (x,x)(M×M). y i Thus, evρ_ is transversal to ∆. The codimension of ∆ is m and the dimension of M is also m, so _ that dim M - codim ∆ = 0 < r. Hence, by the Parametric Transversality Theorem, ρ( f ) is trans_ versal to ∆ for an open dense set of f∈B (q). For any such f, the set ( ρ( f ))-1 (∆) is a codimension m _ submanifold of M. Thus (ρ( f )) -1 (∆) has dimension 0 for an open dense set of f∈B (q), and since _ M is compact, it consists of a finite number of isolated points. But ρ( f ) = (Id, fy …y ) and hence _ (ρ( f )) -1 (∆) = { x∈M : fy …y (x) = x } = Py …y . Taking the intersection over all (y0, y1, … , yq-1 )∈Nq we obtain an open dense set of f∈B (q) for which each Py …y consists of a finite number of points, as required. ■ q-1 0 0 q-1 0 q-1 0 q-1 Embedding Stochastic Systems 30 A somewhat more complicated version of the same line of reasoning gives: Lemma 6.11 If for some q > 0, B (q) is open and dense in D2r(M×N, M) then B2(q+1) is open and dense in B (q), and hence open and dense in D2r(M×N, M). Proof Fix some y = (y0, y1, … , yi-1 )∈Ni and y' = (y'0, y'1, … , y'i'-1 )∈Ni' such that y ≠ y' with 0 < _ i,i' ≤ q. Define ρy,y' : B (q) → Dr(M,M×M×M) by _ ρy,y'( f ) = (Id, fy i-1 …y0 , fy' i'-1 …y'0 ) We aim to show that evρ_ is transversal to the bi-diagonal ∆' = { (x,x',x'' )∈M×M×M : x = x' = x'' }. _ _ Then using Corollary 4.2 as in Corollary 6.10, we have T(f,x)evρ_ (η,0x) = (0x ,ηi(x,y),η'i (x,y' )) _ _ where ηi(x,y) satisfies (6.1) and η'i (x,y' ) the analogue of (6.1) for y'. Suppose that evρ_ ( f,x)∈∆' so that x = xi = x'i' and hence x is a periodic orbit for both y and y'. Let the minimal period for y be p and that for y' be p', with i = rp and i' = r'p'. Rearranging the sum in (6.1), as in (6.2), gives y,y' y,y' y,y' r −1 _ ηi(x,y) ∑ (T = p s f ) x0 yp-1 …y0 s =0 ∑T η'i (x,y' ) ∑ = (η(xj-1, yj-1)) f xj yp-1 …yj (6.4) j =1 p′ r ′−1 _ ∑T )s (T x fy' p'-1 …y'0 0 s =0 (η(x'j-1, y'j-1)) f x'j' y'p'-1 …y'j' (6.5) j =1 where x'j = fy' …y' (x). Now suppose that (y0, y1, … , yi-1 ) and (y'0, y'1, … , y'i'-1 ) do not have the same prime factor, and suppose without loss of generality that p ≤ p'. We claim that (y0, y1, … , yp-1 ) ≠ (y'0, y'1, … , y'p-1 ). Suppose not. Then x'p = fy' …y' (x) = fy …y (x) = xp. But p is the minimal period of x with respect to (y0, y1, … , yi-1 ) and hence x'p = xp = x. Hence by Corollary 6.6, the minimal period p' of x with respect to (y'0, y'1, … , y'i'-1 ) is less than or p. Hence p = p' and (y0, y1, … , yp-1 ) = (y'0, y'1, … , y'p'-1). Thus, by Lemma 6.4 (y0, y1, … , yi-1 ) = (y0, y1, … , yp-1 )r and (y'0, y'1, … , y'i'-1 ) = (y'0, y'1, … , y'p'-1)r' = (y0, y1, … , yp-1 )r' and hence (y0, y1, … , yi-1 ) and (y'0, y'1, … , y'i'-1 ) have the same prime factor, which contradicts our assumption that they do not. Thus (y0, y1, … , yp-1 ) ≠ (y'0, y'1, … , y'p-1 ). j-1 0 p-1 0 p-1 0 Let k be the least integer with 0 ≤ k < p such that yk ≠ y'k. Recall that x'0 = x = x0, whilst if k > 0, then (y0, y1, … , yk-1 ) = (y'0, y'1, … , y'k-1) and hence x'j = fy' …y' (x) = fy …y (x) = xj for all 0 ≤ j ≤ k. Hence we have (x'j ,y'j) = (xj , yj) for all 0 ≤ j ≤ k, but ( x'k, y'k) ≠ (x k, yk). Furthermore, by Corollary 6.7 the points {x0 , x1 , … , xp-1 } are all distinct as are the points {x'0, x'1, … , x'p-1 }. Since x'k = xk, this implies that x'k ≠ xj for all j ≠ k with 0 ≤ j < p and xk ≠ x'j for all j ≠ k with 0 ≤ j < p'. Hence as in Lemmas 5.3 and 6.9 we can we can use Corollary C.12 of [Stark, 1999] to construct a η∈T f D2r(M×N, M) such that η(xj , yj) = 0x for j ≠ k with 0 ≤ j < p , η(x'j' , y'j') = 0 x' for j ≠ k with 0 ≤ j < p' and η(xk, yk), η(x'k, y'k) take on whatever values we want. By Lemma 6.8, T (r) and j-1 0 j-1 0 i'+1 i+1 r ′−1 T' (r) ∑ (T = f )s x0 y'p'-1 …y'0 s =0 are invertible, whilst T x fy …y and T x' fy' …y' are both invertible by the definition of D2r(M× N, M). Hence given u,u'∈T x M, choose η(xk, yk ) = (T (r)°T x fy …y )-1 (u) and η(x'k,y'k ) = (T' (r)° k+1 p-1 k+1 k+1 p'-1 k+1 k+1 p-1 k+1 Embedding Stochastic Systems 31 T x' fy' …y' )-1 (u' ). Evaluating using (6.4) and (6.5) gives T (f,x)evρ_ (η,0x) = (0x,u,u' ), so that Image T (f,x)evρ_ = {0x}×TxM×TxM. Since T (x,x,x)∆' = { (u,u',u'' )∈T (x,x,x)(M×M×M) : u = u' = u'' } we have Image T (f,x)evρ_ + T(x,x,x)∆' =T (x,x,x)(M×M×M) and hence evρ_ is transversal to ∆'. k+1 p'-1 k+1 y,y' y,y' y,y' y,y' We now apply the same dimension counting argument as in Propositions 5.4 and 5.5. The codimension of ∆' is 2m and the dimension of M is m, so that dim M - codim ∆' < 0 < r. Thus _ by the Parametric Transversality Theorem, ρy,y'( f ) = (Id, fy …y , fy' …y ) is transversal to ∆' for an open dense set of f∈B (q). For any such f, suppose that (Id, fy …y , fy' …y )(x)∈∆'. The dimension of T xM is m and hence the dimension of T x(Id, fy …y , fy' …y )(T xM) is at most m. The dimension of T (x,x,x)∆' is also m and hence the dimension of T x(Id, fy …y , fy' …y )(T xM) + T(x,x,x)∆' is at most 2m. However, the dimension of T (x,x,x)(M×M×M) is 3m and hence T x(Id, fy …y , fy' …y )(T xM) and _ T (x,x,x)∆' cannot span T (x,x,x)(M×M×M). Thus the only way for ρy,y'( f ) to be transversal to ∆' is for _ the image of ρy,y'( f ) not to intersect ∆', in other words (Id, fy …y , fy' …y )(x)∉∆' for all x∈M. We have therefore shown that if (y0, y1, … , yi-1 ) and (y'0, y'1, … , y'i'-1 ) do not have the same prime factor then for an open set of f∈B (q) we cannot have x = xi = x'i' . Taking the intersection of such open dense sets for all possible choices of y and y', we see that B2(q+1) is open and dense in B (q), as required. ■ i-1 0 i'-1 0 i-1 0 i'-1 0 i-1 0 i'-1 i-1 0 0 i'-1 0 i-1 0 i'-1 0 i-1 0 i'-1 0 It now remains to show that for a dense set of f∈B (q), periodic orbits of period q are hyperbolic. Intuitively this is obvious since by above there is a finite number of such orbits, and hence if necessary we can independently perturb the eigenvalues of each orbit to ensure that there are no eigenvalues on the unit circle. For the convenience of the reader, we give a formal proof, once again using the Parametric Transversality Theorem. We begin with a results analogous to Lemma C.9 of [Stark, 1999] (see also Lemma 4.4 above): _ Lemma 6.12 Let τ y : D2r(M×N d-1, M) → VB r-1 (TM,TM) be given by _ τ y( f ) = T x fy q-1 …y0 for some fixed y = (y0, y1, … , yq-1 )∈Nq. Then evτ_ is C r-1 and T (f,v)evτ_ (η,0v) = y y _ ϖ(T xηq,y(v)) _ _ _ _ where ηq,y is given by ηq,y(x) = ηq(x,y), with ηq as in (4.6) and (6.1). _ _ Proof We have τ y = σ ° ρy where σ : C r(M, M) → VB r-1 (TM,TM) is the tangent operator, as in Appendix B.3 of [Stark, 1999]. By Lemma B.11 of [Stark, 1999], σ is C ∞ and T f σ(η) = ϖ °Tη . _ On the other hand ρy( f )(x) = ρ̂q ( f )(x,y) where ρ̂q : D2r(M×N, M) → Dr(M×N q, M) is as defined _ _ in §4.2, so that by Lemma 4.1, ρ̂q is C r and T f ρ̂q (η)=ηq. Thus ρy( f ) = ρ̂q ( f ) ° ey, where ey : M → _ M×N q is just ey(x) = (x,y). Hence ρy( f ) = iy(ρ̂q ( f )), where iy : Dr(M×N q, M) → Dr(M, M) is given by iy(g) = g ° ey. Hence iy is just composition on the right, and so by Proposition 4.2 of [Franks, 1979] it is C ∞ , as in Lemma 4.4. Its tangent map is given by T g ιy(η)(x) = η (x,y), or in other _ _ words with a mild abuse of notation, T g ιy(η) = η °ey. We thus have τ y = σ ° iy ° ρ̂q and hence τ y is _ _ _ _ _ _ C r, with Tf τ y(η) = ϖ(T xηq,y) where ηq,y = ηq° ey, so that ηq,y(x) = ηq(x,y). _ Similarly to Lemma 4.4, the evaluation function evτ_ is given by evτ_ = ev ° (τ y ×Idv) where ev : VB r-1 (TM, TM)×TM → TM is the evaluation ev(Ψ,v) = Ψ(v). By Corollary B.3 of [Stark, 1999] y y Embedding Stochastic Systems 32 this is C r-1 and T (Ψ,v)ev(η,w) = η(v) + TvΨ(w). Thus evτ_ is C r-1 by the chain rule and T(f,v)evτ_ (η, _ _ 0 v) = Tf τ y(η)(v) = ϖ(T xηq,y(v)), as required. ■ y y Denote by TCM the complexification of TM, that is TCM = { u+iv : u,v∈TM such that τM(u) = τM(v) }, where τM : TM → M is the tangent bundle projection, as in §5.3, so that the condition ~ τM(u) = τM(v) simply requires u and v to be in the same fibre. As usual, let T CM = { u+iv : u,v∈ TM such that τM(u) = τM(v) and ||u ||2 + ||v ||2 = 1 } be the unit bundle in T CM. Let T be the unit _ ~ circle in C and define the operator τ : D2r(M×N, M) → C r-1 (T CM×T, TCM×TCM) by _ τ ( f ) _ (Λ,τ y( f )) = _ ~ where Λ : T CM×T → T CM is the fixed map Λ(w,λ) = λ w. Thus ev _τ (f,w,λ) = τ ( f )(w,λ) = (λw, T x fy …y (w)). Let ∆C = { (w,w' )∈T CM×TCM : w = w' } be the diagonal in TCM×TCM, note _ that if (w,w' )∈∆C, then w and w' must be in the same fibre of T CM. Thus if τ ( f )(w,λ)∈∆C then x = fy …y (x) and T x fy …y (w) = λw, so that x is a periodic orbit with an eigenvalue of unit _ modulus, and vice versa. Hence if the image of τ ( f )(w,λ) does not intersect ∆C, then any periodic orbits of the sequence (y0, y1, … , yq-1 ) must be hyperbolic. We thus use the usual parametric transversality argument to show that the evaluation map of evτ_ is transversal to ∆C, and then count dimensions to show that transversality implies non-intersection. q-1 0 q-1 0 q-1 0 Lemma 6.13 If for some q > 0, B (q) is open and dense in D2r(M×N, M) then evτ_ is transversal to ∆C . Proof Suppose evτ_(f,w,λ)∈∆C. Then x is a periodic orbit of (y0 , y1, … , yq-1 ) with λ and eigenvalue of T x fy …y . This implies that x must have minimal period q, since if not, it would be a periodic orbit for (y0, y1, … , yi-1 ) with 0 ≤ i < q. Since f∈B (q), by definition T xfy …y has no eigenvalues of unit modulus. But Tx fy …y = (T x fy …y )r, where q = ir, and therefore the eigenvalues of T x fy …y are of the form µr, where µ is an eigenvalue of T x fy …y . Since || λ || = 1, this contradicts our assumption that evτ_(f,w,λ)∈∆C. _ By Lemma 6.12 we have T(f,w,λ)evτ_(η,0w,0λ) = (0λw,ϖ(T xηq,y(w)). Differentiating (6.1) gives q-1 0 q-1 q-1 0 i-1 i-1 0 0 0 i-1 0 q _ T xηq,y ∑ T (Tf = x yq-1 …yj j =1 °ηy °fy j-1 j-2 …y0 ) q ∑T = η(yj-1, x j-1 ) (Tfy q-1 …yj j =1 )°T x ηy °T x fy j-1 j-1 (6.6) j-2 …y0 Given w∈T C,xM, as usual denote wj = Tx fy …y (w). By Corollary 6.7 the points {x0, x1, … , xq-1 } are all distinct and thus by Corollary C.16 of [Stark, 1999], given any u∈T w (T CM) there exists a η∈T f D2r(M×N, M) such that ϖ(T x ηy (wq-1 )) = u, and T x ηy (wj-1) = 0 for 0 < j < q. Therefore, since T(Tfy …y ) is linear, we have Tη(y x )(Tfy …y )(T x ηy (wj-1)) = 0 for 0 < j < q, so that j-2 0 q q-1 q-1 j _ ϖ(T xηq,y(w) q-1 j-1, = j-1 j-1 q-1 j ϖ(T xq-1 ηyq-1 (wq-1 )) j-1 j-1 j-1 = u Thus Image T (f,w,λ)evτ_ = {0λw}×Tw(T CM). On the other hand, T (w,w)∆ C = { (u,u' )∈T (w,w)(T CM× T CM) : u = u' } and thus Image T(f,w,λ)evτ_ + T (w,w)∆C =T (w,w)(T CM×TCM). Hence evτ_ is transversal to ∆C, as required. ■ Embedding Stochastic Systems 33 Corollary 6.14 If for some q > 0, B (q) is open and dense in D2r(M×N, M) then B3(q+1) is open and dense in B (q), and hence open and dense in D2r(M×N, M). Proof It is sufficient to consider orbits with period q since by definition if f∈B (q) then any periodic orbit for (y0, y1, … , yi-1 ) with 0 < i < q is hyperbolic. As in Lemma 6.10, if B(q) is open and dense in D2r(M×N, M), then it is a Banach manifold, and T f B (q) = Tf D2r(M×N, M) for all f∈B (q). The (real) dimension of T CM is 3m and hence the codimension of ∆C is 3m, whilst the dimen~ ~ sion of T CM×T is also 3m, so that dim T CM×T - codim ∆C = 0 < r-1. Since evτ_ is C r-1 , by the Pa_ rametric Transversality Theorem, τ ( f ) is transversal to ∆ C for an open dense set of f∈B (q). For _ ~ ~ any such f, suppose that τ ( f )(w,λ)∈∆C, for some (w,λ)∈T CM× T. The dimension of T (w,λ)(T CM× _ _ ~ T) is 3m and hence the dimension of Image T (w,λ)(τ ( f )) = T (w,λ)(τ ( f ))(T (w,λ)(T CM×T)) is at most 3m. The dimension of T (w,w)∆C is also 3m and the dimension of T(w,w)(T CM×TCM) is 6m. How_ ever, note that since T x fy …y (w) is linear, for any α∈C we have τ ( f )(α w,λ) = (αλ w, αT x fy …y (w)) _ _ _ = α τ ( f )(w,λ). Hence if τ ( f )(w,λ)∈∆C then τ ( f )(α w, λ)∈∆ C for all α ∈T. Thus the intersection of _ Image T (w,λ)(τ ( f )) and T(w,w)∆C has (real) dimension at least 1, and so the dimension of Image _ _ _ T (w,λ)(τ ( f )) + T (w,w)∆C is at most 6m - 1. Thus if τ ( f )(w,λ)∈∆C then τ ( f ) cannot be transversal to _ ~ ∆C at (w,λ). Hence τ ( f )(w,λ)∉∆C for all (w,λ)∈T CM×T, or in other words any periodic orbits of minimal period q must be hyperbolic. ■ q-1 0 q-1 0 Corollary 6.15 B (q) is open and dense in D2r(M×N, M) for all q > 0. Proof By induction on q. The result holds for q = 1 trivially by definition. Suppose that it holds for some q > 1. Then by Corollaries 6.10, 6.14 and Lemma 6.11, each of B1(q+1), B2(q+1) and B3(q+1) are open and dense in B (q), and hence in D2r(M×N, M). Thus B (q+1) = B1(q+1) ∩ B 2(q+1) ∩ B3(q+1) is open and dense in D2r(M×N, M), thereby completing the inductive step. ■ Next, we use Lemma 6.12, and a similar (but more convoluted) argument to Lemma 6.13 and Corollary 6.14 to show that generically the eigenvalues of any periodic orbit are distinct and of multiplicity one. We proceed in two stages, defining B4(q) { f∈D2r(M×N d-1, M) : if x∈Py …y for some (y0, y1, … , yi-1 )∈Ni with 0 < i < q then the Jordan Normal Form of Tx fy …y is diagonal } = 0 i-1 B5(q) i-1 0 { f∈D2r(M×N d-1, M) : if x∈Py …y for some (y0, y1, … , yi-1 )∈Ni with 0 < i < q and w,w' are eigenvectors of T x fy …y with the same eigenvalue, then w' = αw for some α∈C } = 0 i-1 i-1 0 We begin by recalling that the Whitney (or direct) sum of TCM with itself is T CM⊕T CM = { (w, ~ ~ ~ w' )∈T CM×TCM : τM(w) = τM(w' ) }, and similarly for unit vectors T CM⊕ T CM = { (w,w' )∈T CM× ~ ~ ~ T CM : τM(w) = τM(w' ) }. Thus TCM⊕T CM has (real) dimension 5m and T CM⊕ T CM has dimension 5m-2. Let ∆(4) C = { (w,w',w'',w''' )∈(T CM⊕T CM)×(T CM⊕T CM) : w = w'', w' = w''' } and define _ ~ ~ τ JNF : D2r(M×N, M) → C r-1 (T CM⊕ T CM×C,(T CM⊕T CM)×(T CM⊕T CM)) by _ τ JNF( f )(w,w',λ) (λw,w+ λw', T x fy = q-1 …y0 (w),T x fy q-1 …y0 (w' )) Recall that τM ° T fy …y = fy …y ° τM and hence if τM(w) = τM(w' ) = x then τM(T x fy …y (w)) = xq = _ _ τM(T x fy …y (w' )) and thus τ JNF( f )(w,w',λ)∈(T CM⊕T CM)×(T CM⊕T CM) as claimed. If τ JNF( f )(w, w',λ)∈∆(4) C , w is an eigenvector of T x f y …y with eigenvalue λ, whilst w' satisfies T x f y …y (w' ) - λw' q-1 0 q-1 0 q-1 0 q-1 0 q-1 0 q-1 0 Embedding Stochastic Systems 34 = w, so that it is a generalized eigenvector for λ. As usual, we want to prove that this not occur for a dense open set of f∈B (q). We do this by showing that evτ_ is transversal to ∆(4) C , so that by _ the Parametric Transversality Theorem τ JNF( f ) is transversal to ∆(4) for a dense open set of f. C We then use the usual dimension counting argument to show that transversality can only occur _ _ if the image of τ JNF( f ) does not intersect ∆(4) is C r-1 and C . Thus, by Lemma 6.12 evτ JNF JNF T (f,w,w',λ)evτ_ (η,0w,0w' ,0λ) _ (0λw,0w+λw' ,ϖ(T xηq,y(w),ϖ(T xηq,y(w' )) = JNF _ It turns out to be convenient to only consider periodic orbits of minimal period q, and hence to organize the proof inductively. We thus begin with: Lemma 6.16 If for some q ≥ 0, B(q) ∩ B4(q) is open and dense in B (q) and hence in D2r(M×N, M) ~ ~ then the evaluation map evτ_ : (B (q)∩B4(q))×T CM⊕ T CM× C → (T CM⊕T CM)×(T CM⊕T CM) is transversal to ∆(4) C . JNF Proof Similarly to Corollary 6.10, since B (q) ∩ B4(q) is open and dense in D2r(M×N, M), it is a Banach manifold, and T f (B (q)∩B4(q)) = T f D2r(M×N, M) for all f∈(B (q)∩B4(q)). Suppose evτ_ (f,w,w', λ)∈∆(4) C , so that x = τ M(w) = τM(w' ) is periodic of period q. As in Lemma 6.13, suppose the minimal period of x is i, with q = ri, with r > 1. Since f ∈(B (q)∩B4(q)), the Jordan Normal Form of T x fy …y is diagonal. Hence, the Jordan Normal Form of Tx fy …y = (T x fy …y )r is also diagonal, contradicting evτ_ (f,w,w',λ)∈∆(4) C . Hence the minimal period of x must be q, so that by Corollary 6.7 the points {x0, x1, … , xq-1 } are all distinct. By Corollary C.18 of [Stark, 1999], the mapping from η to (Tx ηy , T x ηy , … , T x ηy ) is therefore a submersion. Thus, we can choose a η∈T f (B (q)∩ B4(q)) such that Tx ηy = 0 for 0 < j < q and T x ηy is whatever linear map we require. Since λw = Tx fy …y (w) and T x fy …y (w' ) - λw' = w, we have w ≠ αw' for all α ∈T. This is because if w' = αw for some α ∈T, then w = T x fy …y (α w) - λαw = 0 by the linearity of Tx fy …y , contradicting || w || = ~ 1 for all w∈T CM. Since fy …y is a diffeomorphism, we have wj ≠ αw'j for all α ∈T, for all 0 ≤ j ≤ q, where wj = Tx fy …y (w) and w'j = T x fy …y (w' ). Thus, given any u∈T w (T CM) and u'∈T w' (T CM) we can find a η∈Tf (B (q)∩B4(q)) such that ϖ(T x ηy (wq-1 )) = u, ϖ(T x ηy (w'q-1 )) = u', and T x ηy (wj-1) = T x ηy (w'j-1) = 0 for 0 < j < q. Using (6.6) this gives JNF i-1 0 q-1 0 i-1 0 JNF 0 0 1 j-1 q-1 1 q-1 j-1 0 q-1 q-1 j-1 j-1 q-1 0 q-1 0 0 0 j-1 0 j-1 q-1 0 i-1 q-1 q q-1 q-1 q q-1 j-1 j-1 j-1 T (f,w,w',λ)evτ_ (η,0w,0w' ,0λ ) = JNF (0λw,0w+λw' ,u,u' ) But T (λw,w+λw',λw,w+λw' )∆(4) C = { (u,u',u'',u''' )∈T (λw,w+λ w',λ w,w+λ w' )(T CM⊕T CM)×(T CM⊕T CM) : u = u'', u' = u'' }. Thus Image T (f,w,w',λ)evτ_ + T(λw,w+λw',λw,w+λw' )∆(4) C = T (λw,w+λ w',λ w,w+λ w' )(T CM⊕T CM)×(T CM⊕T CM). (4) _ Hence evτ is transversal to ∆C , as required. ■ JNF JNF Corollary 6.17 B (q) ∩ B4(q) is open and dense in D2r(M×N, M). Proof By induction on q. The lemma holds trivially for q = 1 by definition. Suppose that it ~ ~ holds for some q > 1. The codimension of ∆(4) C is 5m and the dimension of T CM⊕ T CM×C is 5m, ~ ~ _ so that dim T CM⊕ T CM×C - codim ∆(4) is C r-1 . Thus by the Parametric C = 0 < r-1, and evτ _ (q) (q) Transversality Theorem, τ JNF( f ) is transversal to ∆(4) C for an open dense set of f∈(B ∩B4 ). For _ _ (4) any such f, suppose that τ JNF( f )(w,w',λ)∈∆C for some (w,w',λ). Then τ JNF( f )(α w,α w',λ) = (λα w, α w+ λα w',α T x fy …y (w),αTx fy …y (w' )) = (λα w,αw+ λα w', λα w,α (w+ λw' ))∈∆(4) C for any α ∈T. Hence _ has dimension at least 1. Thus the intersection of Image T(w,w',λ)(τ JNF( f )) and T(λw,w+λw',λw,w+λw' )∆(4) C _ (4) the dimension of Image T (w,w',λ)(τ JNF( f )) + T (λw,w+λw',λw,w+λw' )∆C is at most 10m-1 and so ImJNF q-1 0 q-1 0 Embedding Stochastic Systems 35 _ age T(w,w',λ)(τ JNF( f )) + T (λw,w+λw',λw,w+λw' )∆(4) cannot span T (λw,w+λw',λw,w+λw' )((T CM⊕T CM)×(T CM⊕ C _ _ (4) T CM)). Since τ JNF( f ) is transversal to ∆C , this implies that the image of τ JNF( f ) cannot intersect (q+1) (q+1) ∆(4) ∩ B4 is C , and hence no eigenvalue of Tx f y …y has a generalized eigenvector. Thus B (q) (q) 2r open and dense in B ∩ B4 , hence open and dense in D (M×N, M), thereby completing the inductive step. ■ q-1 0 We now go on to show that B (q) ∩ B5(q) is dense in D2r(M×N, M). This requires a small modifica_ _ ~ ~ tion of τ JNF, namely τ 2 : D2r(M×N, M) → C r-1 (T CM⊕ T CM×C,(T CM⊕T CM)×(T CM⊕T CM)), given by _ τ 2( f )(w,w',λ) (λw,λw',T x fy = q-1 …y0 (w),T x fy q-1 …y0 (w' )) _ If τ 2( f )(w,w',λ)∈∆(4) C then λw = Tx f y …y (w) so that xq = τM(T x f y …y (w)) = τM(λw) = x, and so x is periodic with period q. Furthermore w and w' are both (unit) eigenvectors of T x fy …y with the same eigenvalue λ. We want to show that this cannot happen if w and w' are independent, that is w ≠ α w' for all α∈T and as usual we employ a transversality argument. Thus, by Lemma 6.12 evτ_ is C r-1 and q-1 0 q-1 0 q-1 0 2 T (f,w,w',λ)evτ_ (η,0w,0w' ,0λ ) _ _ (0λw,0λw' ,ϖ(T xηq,y(w),ϖ(T xηq,y(w' )) = 2 ~ ~ ~ Unfortunately, however, for evτ_ to be transversal to ∆(4) C we need to restrict to ( T CM⊕ T CM)\ ∆, ~ ~ ~ where ∆ = {(w,w' )∈T CM⊕ T CM : w ≠ α w' for some α∈T } is the set of pairs of linearly depend~ ~ ent vectors in T CM⊕ T CM. Note that since w and w' both have unit norm, it is sufficient to consider α ∈T, rather than α ∈C. Then: 2 Lemma 6.18 If for some q ≥ 0, B(q) ∩ B5(q) is open and dense in B (q) and hence in D2r(M×N, M) ~ ~ ~ then the evaluation map evτ_ : (B (q) ∩ B5(q))×((T CM⊕ T CM)\ ∆)×C → (T CM⊕T CM)×(T CM⊕T CM) is transversal to ∆(4) C . 2 Proof This is identical to the proof of Lemma 6.16, except that evτ_ (f,w,w',λ)∈∆(4) C no longer implies that w ≠ α w' for all α ∈T. This condition now follows by definition by excluding points ~ in ∆. ■ 2 Corollary 6.19 B (q) ∩ B4(q) ∩ B5(q) is open and dense in D2r(M×N, M). Proof By induction on q. The lemma holds trivially for q = 1 by definition. Suppose that it holds for some q > 1. We first show that B (q+1) ∩ B5(q+1) is dense in B (q), closely following the ar~ ~ ~ gument in Corollary 6.17. The dimension of ((T CM⊕ T CM)\ ∆)×C is 5m, whilst the codimension ~ ~ ~ (4) r-1 _ of ∆(4) . Thus by the C is 5m so that dim ((T CM⊕ T CM)\ ∆)×C - codim ∆C = 0 < r-1, and evτ is C _ (4) Parametric Transversality Theorem, τ 2( f ) is transversal to ∆C for a residual set of f∈B (q). Note ~ ~ ~ that since (( T CM⊕ T CM)\ ∆)×C is not closed, we do not get transversality on an open set in this _ case. For any such f, suppose that τ 2( f )(w,w',λ)∈∆(4) C for some (w,w',λ). The dimension of _ ~ ~ ~ T (w,w',λ)(((T CM⊕ T CM)\ ∆)×C) is 5m and hence the dimension of Image T (w,w',λ)(τ 2( f )) is at most 5m. The dimension of T (λw,λw',λw,λw' )∆(4) C is also 5m and the dimension of T (λw,λ w',λ w,λ w' )((T CM⊕T CM)× _ (T CM⊕T CM)) is 10m. However, similarly to Corollaries 6.14 and 6.17, if τ 2( f )(w,w',λ)∈∆(4) then _ _ C ( ( τ 2( f )(α w,α 'w', λ )∈∆(4) for any α , α '∈ T. Hence the intersection of Image T τ f )) and (w,w',λ ) 2 C _ (4) T (λw,λw',λw,λw' )∆C has dimension at least 2, and so the dimension of Image T (w,w',λ)(τ 2( f )) + _ _ (4) T (λw,λw',λw,λw' )∆(4) C is at most 10m - 2. Thus τ 2( f ) cannot be transversal to ∆C at (w,w',λ) if τ 2( f )(w, 2 Embedding Stochastic Systems 36 _ ~ ~ ~ (4) w',λ)∈∆(4) C , so that τ 2( f )(w, w',λ)∉∆C for all (w,w',λ)∈(( T CM⊕ T CM)\ ∆)×C). This implies that if w and w' are both eigenvectors of T x fy …y with the same eigenvalue λ, then w = αw' for some α∈C. Thus, we have shown that B (q+1) ∩ B5(q+1) is dense in B (q), hence dense in D2r(M×N, M). i-1 0 (q+1) (q+1) Now, suppose that f∈(B(q+1) ∩ B4 ∩ B5 ). Since the periodic orbits of f of period q or less are all hyperbolic, the Implicit Function Theorem implies that there is an open neighbourhood of f in D2r(M×N, M) which has the same number of periodic orbits of each period, and furthermore the position of these orbits depends continuously on f. Thus, the Jacobian T x fy …y of any such periodic orbits varies continuously with f, and hence so do its eigenvalues. There is thus an open neighbourhood of f in which the eigenvalues of independent eigenvectors are distinct. (q+1) This neighbourhood is thus in B5(q+1), since B(q+1) ∩ B4 is open in D2r(M×N, M) by Corollary (q+1) (q+1) (q+1) 6.17, B ∩ B4 ∩ B5 is open in D2r(M×N, M). We have show above that B (q+1) ∩ B5(q+1) is 2r (q+1) dense in D (M×N, M) and again by Corollary 6.17 B(q+1) ∩ B4 is dense in D2r(M×N, M). Thus (q+1) (q+1) (q+1) B ∩ B4 ∩ B5 ■ is open and dense in D2r(M×N, M), completing the inductive step. i-1 0 Finally, it turns out to be convenient to place one further restriction on the periodic orbits of f, namely that f does not map one periodic orbit to another in too small a number of iterates. This is a generalization of Lemma 6.11 and will be proved by a similar argument. Although not strictly necessary, this property does considerably simplify the proofs of Propositions 6.25, 6.26 and 6.30 below. To motivate the precise formulation we employ note that hitherto, given a sequence (y0 , y1 , … , yq-1 )∈Nq we have considered orbits periodic with respect to the whole sequence, so that xq = x. However, when we come to the proof of Theorem 3.4, we are concerned specifically with orbits where xi = xj for some 0 ≤ i ≠ j ≤ q-1, without the preceding segment (y0, y1, … , yi-1 ) and the following segment (yj, yj+1, … , yq-1 ) necessarily bearing any relation to the periodic part (yi , yi+1, … , yj-1). We thus define Definition 6.20 Given a sequence (y0, y1, … , yq-1 )∈Nq and some i,j,k∈Z with 0 ≤ i ≠ j ≤ q-1 define k mod(i,j) to be the unique k'∈{ i, i+1, … , j-1 } such that k' = k mod (j-i). Then let b –(i,j) = min { k∈{ 0, 1, … , i } : yk' = y(k' mod(i,j)) for all k ≤ k' ≤ i } b +(i,j) = max { k∈{ j, j+1, … , q } : yk' = y(k' mod(i,j)) for all j-1 ≤ k' ≤ k-1 } We shall call the segment (yb (i,j), yb (i,j)+1, … , yb – – + ■ ) the maximal periodic segment of (i,j). (i,j)-1 The asymmetry in the definition of b–(i,j) and b +(i,j) arises because yj plays no role in determining whether or not xi = xj. Note that we have yk = y(k mod(i,j)) for all k∈{ b –(i,j), … , b +(i,j)-1 }, but yb (i,j)-1 ≠ y((b (i,j)-1) mod(i,j)) (if b–(i,j) > 0) and yb (i,j) ≠ y(b (i,j) mod(i,j)) (if b +(i,j) < q). Using Lemma 6.3, it is straightforward to see that if xi is periodic for (yi , yi+1, … , yj-1) under some f∈D2r(M× N d-1 , M), then the first of these relations also holds for xk. – – + + Lemma 6.21 Given f∈D2r(M×N d-1, M), x∈M and (y 0 , y1 , … , yq-1 )∈Nq such that xi ∈Py …y for some 0 ≤ i ≠ j ≤ q-1 then xk = x(k mod(i,j)) for all k∈{ b –(i,j), … , b +(i,j) }. i j-1 Proof Suppose k'∈{ i, i+1, … , j-1 }. Then by Lemma 6.3, xk' is periodic for (yk' , yk'+1, … , yj-1, yi , yi+1, … , yk'-1), so that fy …y y …y (xk') = xk'. First consider a k∈{ b –(i,j), … , b +(i,j) } such that k' = k mod(i,j) and k > k'. Then yk'' = y(k'' mod(i,j)) for all k' ≤ k'' < k and hence (yk' , yk'+1, … yk-1 ) = (yk' , yk'+1, … , yj-1, yi , yi+1, … , yk'-1)r for some r. Thus xk = fy …y (xk' ) = (fy …y y …y )r(xk' ) = xk' . k'-1 j-1 i k' k-1 k' k'-1 j-1 i k' Embedding Stochastic Systems 37 Similarly, if k < k', then (yk , yk+1, … yk'-1) = (yk' , yk'+1, … , yj-1, yi , yi+1, … , yk'-1)r for some r. Then, since fy …y is invertible, xk = (fy …y )-1 (xk') = ((fy …y y …y )r)-1 (xk') = ((fy …y y …y )-1 )r(xk') = xk' = x(k ■ mod(i,j)). Hence in both cases, xk = xk = x(k mod(i,j)) as required. k k'-1 k'-1 k' k'-1 k k'-1 j-1 i j-1 i k' If we now restrict to f∈B (q), we also have xb (i,j)-1 ≠ x((b (i,j)-1) mod(i,j)) (if b –(i,j) > 0) and xb (i,j)+1 ≠ x((b (i,j)+1) mod(i,j)) (if b +(i,j) < q), in analogy to the relations for yb (i,j)-1 and yb (i,j). However, we shall in fact need the stronger: – – + + – + Lemma 6.22 Suppose f∈B (q) and xi ∈Py …y for some (y0, y1, … , yq-1 )∈Nq. Then {x0, … , xb (i,j)-1} {xb (i,j), … , xb (i,j)} = ∅, and similarly {xb (i,j)+1, … , xq } ∩ {xb (i,j), … , xb (i,j)} = ∅. – i + j-1 + – – ∩ + Proof We show that xk∉{xb (i,j), … , xb (i,j)} for all 0 ≤ k < b –(i,j) by contradiction; the proof for b +(i,j) < k ≤ q is similar. So suppose that xk = xk' with 0 ≤ k < b–(i,j) ≤ k' ≤ b+(i,j). By Lemma 6.21, xk' = xk'' for some i ≤ k'' < j and thus by Lemma 6.3, xk = xk' = xk'' is periodic for (yk'', yk''+1, … , yj-1, yi , … , yk''-1 ). On the other hand xk = xk'' implies that xk is periodic for (yk, yk+1, … , yk''-1 ). Hence by the definition of B (q), (yk'', yk''+1, … , yj-1, yi , … , yk''-1 ) and (yk, yk+1, … , yk''-1 ) are multiples of the same prime factor, of length say p, so that j-i = ps and k''-k = ps'. This prime factor must agree with the last p entries of (yk, yk+1, … , yk''-1 ), and we may thus write it as (yk''-p, yk''-p+1, … , yk''-1 ). Thus (yk'', yk''+1, … , yj-1, yi , … , yk''-1 ) = ( yk''-p, yk''-p+1, … , yk''-1 )s and (yk, yk+1, … , yk''-1 ) = (yk''-p, yk''-p+1, … , yk''-1 )s'. Working upwards from k'' in the first of these relations, we see that yk''+r = yk''-p+(r mod p) for all r∈{ k'', … , j-1 }, whilst working backwards yields yr = y(r mod( k''-p, k''-1)) for all r∈ { i, … , k''-1 }. But yk''-p+(r mod p) = y((k''+r) mod(k''-p, k''-1)) and so yr = y(r mod(k''-p, k''-1)) for all r∈{ i, … , j-1 }. Similarly, working downwards from k''-1 in the second relation gives yr = y(r mod( k''-p, k''-1)) for all r∈ { k, … , k''-1 }. But given any r∈{ k, … , k''-1 }, by definition (r mod(i,j) - r) is divisible by (j-i) and hence by p, whilst (r mod(k''-p, k''-1) - r) is also divisible by p. Hence (r mod(i,j) - r mod(k''-p, k''-1)) is divisible by p, so that r mod(k''-p, k''-1) = (r mod(i,j)) mod(k''-p, k''-1). Since (r mod(i,j))∈{ i, … , j-1 }, this implies by above that y(r mod(i,j)) = y(r mod(k''-p, k''-1)) and since yr = y(r mod(k''-p, k''-1)) we have yr = y(r mod(i,j)) for all r∈{ k, … , k''-1 }. Thus b–(i,j) ≤ k which contradicts our initial assumption that k < b–(i,j). ■ – + Finally, we can show that for a dense open set of f∈B (q) we cannot have more that one periodic segment on an orbit of length q. Thus define B6(q) { f∈D2r(M×N d-1, M) : if x∈Py …y with 0 < j < q for some (y0, y1, … , yq-1 )∈Nq and for some j < i' < j' < q we have b–(0,j) < b–(i',j' ) then fy …y (x)∉Py …y } = 0 j-1 i'-1 0 i' j'-1 Note that if f∈B6(q) and xi ∈Py …y then we cannot have fy …y (xi )∈Py …y for any i', j' such that j < i' < j' < q and b–(i,j) < b –(i', j' ) and hence it is sufficient to have x∈Py …y in the definition of B6(q), rather that the apparently more general xi ∈Py …y . We want to show that B (q) ∩ B6(q) is open and _ dense in D2r(M×N, M). Define ρ4 : D2r(M×N d-1, M) → Dr(M,M×M×M×M) by i j-1 i'-1 ρ4( f ) = i' (Id, fy j-1 …y0 , fy i'–1…y0 j-1 j'-1 0 i _ i i-1 , fy j'-1 …y0 ) _ _ similarly to ρy,y' in Lemma 6.11. Observe that if ρ4( f )(x)∈∆4, where ∆4 = { (x,x',x', x''' )∈M×M× M×M : x = x', x'' = x''' }, then x = xj and xi' = xj ' , so that x∈Py …y and xi' ∈Py …y . As usual, we therefore aim to prove that evρ_ is transversal to ∆4, and then use a dimension counting argu_ _ ment to show that ρ4( f ) can only be transversal to ∆4 if the image of ρ4( f ) does not intersect ∆4: 0 j-1 4 i' j'-1 Embedding Stochastic Systems 38 Lemma 6.23 B (q) ∩ B6(q) is open and dense in B (q) and hence in D2r(M×N, M). Proof First observe that xi' ∈Py …y if and only if x b (i',j')∈Py …y . Hence we may assume – without loss of generality that b (i',j' ) = i'. As in Corollary 6.10, Corollary 4.2 implies that evρ_ _ _ _ _ defined as above is C r and T (f,x)evρ_ (η,0x) = (0x,ηj(x,y),ηi'(x,y), ηj'(x,y)) where ηi(x,y) satisfies (6.1). We aim to show that evρ_ : B(q)×M → M×M×M×M is transversal to ∆4. Suppose that evρ_ ( f, x)∈∆4, so that x∈Py …y and xi' ∈Py …y . Let the minimal period of x for (y0, y1, … , yj-1) be p and of xi' for (yi', yi'+1, … , yj'-1 ) be p'. Using (6.2) we have i' – j'-1 b –( i',j') b –( i',j')+j'-i'-1 4 4 4 0 i-1 i' j'-1 4 p _ ηj(x,y) ∑T = (s) (T x fy i (η(xi-1 ,yi-1 ))) p-1 …yi i =1 with T (s) invertible by Lemma 6.8. As usual, by Corollary 6.7 the points {x0, x1, … , xp-1 } are all distinct and so given a u∈T x M, we can construct a η ∈T f B (q) using Corollary C.12 of [Stark, _ 1999] such that η(xi , yi ) = 0x for i = 0, … , p-2, and η(xp-1 , yp-1 ) = (T (s))-1 (u), so that ηi(x,y) = u. The points { (xp,yp),… , (xb (0, j)-1,yb (0, j)-1) } are all determined by xi = x (i mod p) and yi = y (i mod p). _ Hence this choice of η∈T f B (q) fixes a value of ηb (0, j)(x,y)∈T x M (which we do not need to evaluate). By Corollary 6.7, xb (0, j) ≠ xi for any i ≠ b+(0,j) mod p, so that (xb (0, j), yb (0, j)) ≠ (xi ,yi ) for all i ≠ b +(0,j) mod p. On the other hand, by the definition of b +(0,j) we have yb (0, j) ≠ y(b (0, j) mod p), and hence (xb (0, j), yb (0, j)) ≠ (x(b (0, j) mod p), y(b (0, j) mod p)). Thus (xb (0, j), yb (0, j))∉{ (x0,y 0), (x1,y 1), …, (xb (0, j)-1, yb (0, j)-1) } and hence we can choose the value of η(xb (0, j), yb (0, j)) independently of η(xi , yi ) _ _ for i = 0, … , b +(0, j)-1. Taking η(xb (0, j), yb (0, j)) = -T (x ,y ) f (ηb (0, j)(x,y),0) gives ηb (0, j)+1(x,y) = 0x by (4.5). Furthermore by Lemma 6.22, {xb (0, j)+1, … , xq } ∩ {x0, x1, … , xb (0, j)} = ∅ , and _ hence we can choose η(xi , yi ) = 0 x for i = b +(0, j)+1, … , q-1, which gives ηi(x,y) = 0 x for i = b +(0, j)+1, … , q. Hence, this choice of η∈T f B (q) gives T (f,x)evρ_ (η,0x) = (0x,u,0x ,0x ). j i+1 + + + + b + (0,j) + + + + + + + + + + + + + b + (0,j) + b + (0,j)+1 b + (0,j) i+1 + + + + + + i 4 i' j' Turning to the last two components of T (f,x)evρ_ , by Lemma 6.22 and our assumption that b–(i', j' ) = i' we have {x0, x1, … , xi'-1 } ∩ {xi' , … , xj'-1} = ∅. Hence we may choose a η'∈T f B (q) such that η'(xi , yi ) = 0 x for i = 0, … , i'-1, and η'(xi , yi ) takes on whatever values we require for i = j', … , _ j'+p'-1. For such a choice, using (6.1) we have ηi'(x,y) = 0 x and 4 i+1 i' i ′+ p′ _ ηj+i'(x,y) ∑T = (s' ) (T x fy i (η(xi-1 , yi-1 ))) i'+p'-1 …yi i =i ′+1 with T (s') invertible by Lemma 6.8. Thus, given u'∈T x M, we take η'∈T f B (q) so that η(xi , yi ) = 0 x _ for i = i', … , i'+p'-2, and η(xi'+p'-1, yi'+p'-1) = (T (s'))-1 (u' ), so that ηj'(x,y) = u'. This gives T(f,x)evρ_ (η', 0 x ) = (0x,0x ,0x ,u' ). Hence we have (0x,u,0x ,u' )∈T (f,x)evρ_ (T (f,x)(B (q)× M )) for all u∈T x M and u'∈ T x M. But (u,u,u',u' )∈T (x,x,x',x' )∆ 4 and so T (f,x)evρ_ (T (f,x)(B (q)×’’M )) + T (x,x,x',x' )∆ 4 spans T (x,x,x',x' )(M× M×M×M). Thus evρ_ is transversal to ∆4. The dimension of M is m and the codimension of ∆4 is 2m and hence dim M - codim ∆ 4 < 0 < r. Since evρ_ is C r, the Parametric Transversality Theo_ rem implies that ρ4( f ) is transversal to ∆4 for and open dense set of f∈B (q). Choose any such f _ _ and suppose ρ4( f )(x)∈∆4. Then the dimension of T xρ4( f )(T xM) is at most m and the dimension _ of T(x,x,x',x' )∆4 is 2m. Hence Image Txρ4( f ) + T (x,x,x',x' )∆4 cannot span T (x,x,x',x' )(M×M×M×M) which _ _ has dimension 4m. Thus for ρ4( f ) to be transversal to ∆4 we must have ρ4( f )(x)∉∆4 for all x∈M, so that f∈B6(q). Hence B (q) ∩ B6(q) is open and dense in B (q) as required. ■ j' j i' i' j' 4 ' 4 j 4 4 i+1 4 Embedding Stochastic Systems 6.2. 39 EMBEDDING THE SHORT PERIODIC ORBITS Given the definitions in the previous sections, define ~ (q) B = B (q) ∩ B4(q) ∩ B5(q) ∩ B6(q) ~ From now on, we shall restrict ourselves to an f∈B (2d) and aim to construct an open dense set of ϕ in C 2r(M, R) for which Φ f,ϕ,y is an embedding for all y∈Nd-1 , by analogy to the “Unstated Takens’ Theorem” of [Huke, 1993] (Theorems 2.2 and 4.7 of [Stark, 1999]). In fact, as indicated at the beginning of §6, it is sufficient to prove that such ϕ are dense, and since there is only a finite number of y in N d-1, we can consider at single y at a time. As already indicated in §3.2, the main difficulty occurs at those points x∈M such that xi = xj for ~ some 0 ≤ i < j ≤ d-1. In the previous section we showed that for f∈B (2d) the number of such points was finite, and our next aim is to construct an open dense set Af of ϕ in C 2r(M, R) for which Φ f,ϕ,y embeds this finite set for all y∈Nd-1 . Thus recall from §3.2 that we define ~ Py … y 0 = q-1 { x∈M : fy i-1 …y0 (x) = fy j-1 …y0 (x) for some i ≠ j with 0 ≤ i < j ≤ q } Thus in terms of the previous section ~ Py … y 0 U = q-1 (fy i-1 …y0 )-1 (Py …y ) i j-1 0≤i < j <q ~ Since fy …y is invertible and Py …y is a finite set if f∈B (q') for any 0 < q ≤ q', this means that Py … y is finite for each (y0, y1, … , yq-1 )∈Nq and hence i-1 0 i j-1 0 q-1 ~ (q) P U = ( y0 ,…, yq −1 )∈N ~ Py … y 0 q-1 q ~ is finite if f∈B (q') for any 0 < q ≤ q'. Note that if x∈ P (q) then it is possible to have fy …y (x)∈Py …y and fy' …y' (x)∈Py' …y' for some (y0, y1, … , yj-1) ≠ (y'0, y'1, … , y'j'-1 ). One could exclude this using a proof similar to that of Lemma 6.23, but in fact we do not need to do this. We are now in a position to give the analogues of Proposition 4.2 and 4.3 of [Stark, 1999]. i'-1 0 i-1 i' j'-1 0 i j-1 ~ Proposition 6.24 The set of ϕ such that ϕ is 1-1 on P (2d-1) is open and dense in C 2r(M, R). ~ Proof This is intuitively obvious since P (2d-1) consists of a finite number of points, and so we can perturb ϕ independently on each point to ensure that it takes a distinct value there. Furthermore if ϕ is injective, then clearly so is any ϕ ' in an open neighbourhood of ϕ . A formal proof is identical to the proof of Proposition 4.2 in §4.2.2 of [Stark, 1999]. ■ ~ Note that since ϕ is the first component of Φf,ϕ,y, this means that Φ f,ϕ,y is also injective on P (2d-1) ~ (2d-1) for an open dense set of ϕ, ie Φ f,ϕ,y(x) ≠ Φ f,ϕ,y(x' ) for all x,x'∈P such that x ≠ x'. The reason ~ ~ that we want injectivity on P (2d-1) rather than just P (d-1) is that this simplifies the treatment of ~ injectivity in the general case below. By contrast, we only need Φ f,ϕ,y to be immersive on P (d-1). ~ In fact, it turns out to be simplest to phrase this in terms of each individual Py … y : 0 d-2 ~ Proposition 6.25 Suppose that x∈Py … y for some y = (y0, y1, … , yd-2 )∈Nd-1 , then the set of ϕ such that T xΦ f,ϕ,y has rank m is open and dense in C 2r(M, R). 0 d-2 Embedding Stochastic Systems 40 Proof The kth component of TxΦ f,ϕ,y is given by a k = T x ϕ ° T x fy …y and we thus need to show ~ that m such linear maps are independent. Since x∈Py … y , we have xi = xj for some 0 ≤ i < j ≤ d1. We shall separately treat the two cases where b +(i,j) - b-(i,j) ≥ m, and b +(i,j) - b-(i,j) < m. Thus first consider the former, which is simply the proof of Proposition 4.3 in §4.2.3 of [Stark, 1999] applied to the maximal periodic segment {xb (i,j), … , xb (i,j)}. Without loss of generality, we may assume that b-(i,j) = i (this has no effect on the details of the proof, but serves to considerably simplify the notation). Let p be the minimal period of xi for (yi , yi+1, … , yj-1). Given any k such that i ≤ k ≤ i + m -1, since i + m -1 < b+(i,j), we have xk = x(k mod(i,i+p)) by Lemma 6.21. Hence ak = a(k mod(i,i+p)) ° A r where r is such that k = k mod(i,i+p) + rp, and A = T x fy …y . Thus (ai, ai+1, … , ai+m-1 ) = (ai, ai+1, … , ai+p-1 , ai ° A, ai+1 ° A, … , ai+p-1 ° A, … , ai+s ° Ar'), where s = (m-1-i) mod p and m-1 = i + s + rp. By [Huke, 1993] (see Lemma 4.9 of [Stark, 1999] this implies that (ai, ai+1, … , ai+m-1 ) are linearly independent, and hence TxΦ f,ϕ,y has rank m, for a dense open set of ϕ. ~ We now turn to the case b +(i,j) - b-(i,j) < m. Then since f∈B (2d) we cannot have xi' = xj ' for any i',j' such that 0 ≤ i' < j' < b -(i,j), or such that b +(i,j) < i' < j' ≤ d-1. Nor can we have xi' = xj ' with 0 ≤ i' < b-(i,j) and b +(i,j) < j' ≤ d-1 since in that case we would have b–(i',j' ) < b -(i,j) < b+(i,j) < b +(i',j' ). Since f∈B2(q), this means that (yb (i,j), yb (i,j)+1, … , yb (i,j)-1) and (yb (i',j'), yb (i',j')+1 , … , yb (i',j')-1) have the same prime factor, violating the maximality of (yb (i,j), yb (i,j)+1, … , yb (i,j)-1). Thus {x0, … , xb (i,j)-1} ∪ {xb (i,j)+1, … , xd-1 } are distinct. This set consists of b -(i,j) + d-1 - b+(i,j) > m - 1 since d > 2m and b +(i,j) - b-(i,j) < m. Hence {x0, … , xb (i,j)-1} ∪ {xb (i,j)+1, … , xd-1 } consists of at least m distinct points. It is now intuitively obvious that for a dense open set of ϕ, we can choose Tx ϕ on these points so that (a0, … , ab (i,j)-1; ab (i,j)+1, … , ad-1) has rank m. Formally, this follows from the fact, proved in Corollary C.18 of [Stark, 1999], that the map ϕ a (T x ϕ , T x ϕ , … , T x ϕ, Tx ϕ, … ,T x ϕ) is a submersion. But (a0, …, ab (i,j)-1; ab (i,j)+1, … , ad-1 ) is a subset of the components of T xΦ f,ϕ,y , and hence if it has rank m, so has T xΦ f,ϕ,y . ■ ~ We now define Af to be the set of ϕ such that ϕ is 1-1 on P (2d-1) and T xΦ f,ϕ,y has rank m for all x∈ ~ Py … y . By the previous two propositions, Af is open and dense in C 2r(M, R). Now, as in Lemma 4.4 of [Stark, 1999], let k – k-1 0 d-2 0 + i i+p-1 i – – + – – – + – – + + – + k – + 0 b + ( i,j)+1 d-1 – + b -(i,j)-1 d- 2 0 1 E 2r ~ { ( f,ϕ)∈D 2r(M×N, M)× C 2r(M, R) : f∈B (2d), ϕ∈Af } = The same proof as in [Stark, 1999] shows that E 2r is open and dense in D 2r(M×N, M)× C 2r(M, R). It is thus a Banach manifold and T ( f,ϕ)E 2r = T( f,ϕ)D 2r(M×N, M)× C 2r(M,R) for any ( f,ϕ)∈E 2r. From now on, we shall restrict ourselves to E 2r. 6.3. IMMERSIVITY OF Φ This is identical to the proof for the standard Takens’ Theorem in §4 of [Stark, 1999]. For a ~ given y∈Nd-1 define τy : E 2r → C r-1 (TM,T Rd) by τy(f,ϕ) = TΦ f,ϕ,y (6.7) where Φ f,ϕ,y is defined in (4.1) by Φ f,ϕ,y(x) = Φf,ϕ(x,y). Thus evτ ( f,ϕ,v) = evτ( f,ϕ,v,y), where τ(f,ϕ) = T 1Φf,ϕ as in (4.3). Thus by Lemma 4.4, evτ is C r-1 and T ( f,ϕ,v)evτ (0f ,ξ,0x ) = ( ϖ(T x ξ(v0)), ϖ(T x ξ(v1)), … , ϖ(T x ξ(vd-1 ))) †, where as usual vi = Tx fy …y (v), x = τM(v) and xi = fy …y (x,y) and y d-1 y 1 i-1 0 y 0 i-1 0 Embedding Stochastic Systems 41 vi = Tx fy …y (v). Let L be the 0 section in T Rd, and then, exactly as Proposition 4.5 of [Stark, 1999], we have: i-1 0 Proposition 6.26 Given any y∈Nd-1 , evτ is transversal to L. y ~ Proof Suppose evτ ( f,ϕ,v)∈L, so that T xΦ f,ϕ,y(v) = 0. Since v∈TM, we have ||v || = 1 and hence v ~ ≠ 0. Thus rank TxΦ f,ϕ,y < m, and since ϕ∈Af , we cannot have x∈Py … y . Thus the points {x0, … , xd-1 } are distinct. Furthermore, since fy …y is a diffeomorphism, vi ≠ 0 for all i. Hence, by Corollary C.16 of [Stark, 1999] T ( f,ϕ,v)evτ is surjective, and hence evτ is transversal to L. ■ ~ ~ Now, as usual, dim TM = 2m-1, codim L = d and so if d ≥ 2m-1, we have dim TM - codim L < 0 < r-1. Thus by the Parametric Transversality Theorem, τy(f,ϕ) is transversal to L for an open dense set of ( f,ϕ)∈E 2r and hence an open dense set in D 2r(M×N, M)× C 2r(M, R). Fix any such ( f, ~ ~ ϕ), and suppose that τy(f,ϕ)(v) = TxΦ f,ϕ,y(v)∈L for some v∈TM. Then dim T v(TM) = 2m-1, and ~ so the dimension of T v[TΦ f,ϕ,y](T v(TM)) cannot be greater than 2m-1. On the other hand dim T uL = d, and dim T u(T Rd) = 2d, for any u∈T Rd. Thus if d ≥ 2m-1, then dim Image Tv(T xΦ f,ϕ,y) + dim TuL < dim Tu(T Rd), so that Image Tv(T xΦ f,ϕ,y) + TuL cannot span Tu(T Rd). Thus the only way for τy(f,ϕ) = TΦ f,ϕ,y to be transversal to L is for the image of TΦ f,ϕ,y not to intersect L. Thus ~ T xΦ f,ϕ,y(v) ≠ 0 for all v∈TM, ie Φ f,ϕ,y is immersive for an open dense set of ( f,ϕ)∈E 2r. Taking the intersection over the finite number of y∈Nd-1 , we obtain an open dense set for which Φ f,ϕ,y is immersive for all y. y i-1 0 d-2 0 y y INJECTIVITY OF Φ 6.4. Similarly to §4 of [Stark, 1999] define ρy : E 2r → C r((M×M)\∆, Rd×Rd ) for a given y∈Nd-1 by ρy( f,ϕ)(x,x') = (Φ f,ϕ,y(x),Φ f,ϕ,y(x' )) where ∆ = { (x,x' )∈M×M : x = x' } is the diagonal in M×M. Then evρ ( f,ϕ,x,x' ) = (evρ( f,ϕ,x,y), evρ( f,ϕ,x',y)) where ρ( f,ϕ) = Φ f,ϕ is as in (4.2). Thus evρ is C r by Lemma 4.3 and T ( f,ϕ,x,x' )evρ (0η ,ξ, 0 x ,0x' ) = (( ξ(x), ξ(x1 ), … , ξ(xd-1 )),(ξ(x'), ξ(x'1 ), … , ξ(x'd-1 ))) † . Let ∆ d be the diagonal in Rd×Rd, and suppose that evρ ( f,ϕ,x,x' )∈∆ d, so that Φ f,ϕ,y(x) = Φ f,ϕ,y(x' ). Then in particular ϕ(x) = ϕ(x' ) ~ but x ≠ x' and thus by the definition of Af , at least one of x or x' cannot be in P (2d-1). We shall suppose without loss of generality that this is x. Then y y y y ~ ~ Lemma 6.27 Suppose f∈B (2d), x ≠ x', x∉P (2d-1) and for some i,i'∈{ 0, 1, … , d-1 } we have xi = x'i '. Then xi ≠ x'i'' for all i''∈{ 0, 1, … , d-1 } such that i' ≠ i'' and x'i' ≠ xi'' for all i''∈{ 0, 1, … , d-1 } such that i ≠ i''. Proof If xi = x'i'' for some i'' such that i' ≠ i'' then x'i' = x'i'' and hence x'i' = xi is a periodic point of ~ period less than 2 d, contradicting x∉P (2d-1). Similarly if x'i' = xi'' for some i'' such that i ≠ i'' then ~ xi = x'i' = xi'' , so that xi is periodic, again contradicting x∉P (2d-1). ■ Given any i∈{ 0, 1, … , d-1 }, it thus makes sense to define a chain starting at i = i0 to be a sequence (i0, i1, … , iκ-1 ) with ij∈{ 0, 1, … , d-1 } and xi = x'i for all j∈{ 0, 1, … , κ-1 }. j j+1 ~ ~ Lemma 6.28 Suppose f∈B (2d), x ≠ x', x∉P (2d-1) and (i0 , i1 , … , iκ-1 ) is a chain starting at i = i 0. Then {i0, i1, … , iκ-1 } are distinct, ie ij ≠ ij' for any 0 ≤ i ≠ i' ≤ κ-1. Embedding Stochastic Systems 42 Proof Since fy …y is a diffeomorphism for all i, we have xi = x'i for all i∈{ 0, 1, … , d-1 } and hence ij ≠ ij+1 for all j∈{ 0, 1, … , κ-2 }. Hence either ij < ij+1 or ij > ij+1, suppose the former. Then we claim that ij+1 < ij+2. Suppose not, so that ij+1 > ij+2, and denote ij = i, ij+1 = i' and ij+2 = i'', so that i' > i and i' > i''. Then xi' = fy …y (xi ), x'i' = fy …y (x'i'' ) and xi = x'i' , xi' = x'i'' . This implies that fy …y y …y (xi ) = fy …y ( fy …y (xi )) = fy …y (xi') = fy …y (x'i'' ) = x'i' = xi . Hence xi is periodic for (yi , ~ yi+1, … , yi'-1 , yi'' , yi''+1, … , yi'-1 ), contradicting x∉P (2d-1). Similarly if ij > ij+1 then ij+1 > ij+2. Repeating this argument for j = 0, 1, … , κ-2, we either have i0 < i1 < … < iκ-1 or i0 > i1 > … > iκ-1 . ■ i-1 0 i'-1 i'-1 i'' i'-1 i i'-1 i'' i'-1 i i i'-1 i'-1 i'' i'' i'-1 i'' Since {i 0, i1, … , iκ-1 } ⊂ { 0, 1, … , d-1 } this lemma implies that κ ≤ d, and in particular, every chain has to be finite. Thus we may define a maximal chain to be one that cannot be extended (in either direction), ie a chain such that x'i ∉{x0, x1, … , xd-1 } and xi ∉{x'0 , x'1 , … , x'd-1 }. Observe that if xi ∉{x'0 , x'1 , … , x'd-1 } and x'i ∉{x0, x1, … , xd-1 } then the maximal chain starting at i is (i). Not every i has a maximal chain starting from it, but: 0 k-1 ~ ~ Lemma 6.29 Suppose f∈B (2d), x ≠ x' and x∉P (2d-1). Then the maximal chains give a disjoint partition of { 0, 1, … , d-1 }, ie every i∈{ 0, 1, … , d-1 } lies in precisely one maximal chain. Proof Given a chain (i0, i1, … , iκ-1 ), we either have xi ∉{x'0 , x'1 , … , x'd-1 } or xi = x'i' for some i'∈ { 0, 1, … , d-1 } in which case (i0, i1, … , iκ-1 ,i' ) is also a chain and by Lemma 6.27 the choice of i' is unique. Similarly, either x'i ∉{x0, x1, … , xd-1 } or x'i = xi'' for some unique i''∈{ 0, 1, … , d-1 } and (i'', i0, i1, … , iκ-1 ) is a chain. Since by Lemma 6.28 every chain has length less than or d, we can only extend (i 0, i1, … , iκ-1 ) a finite number of times in either direction. When the chain can no longer be extended, it is maximal, and hence (i0 , i1 , … , iκ-1 ) is a sub-chain of a unique maximal chain. Every i lies in at least the one point chain ( i), and hence it lies in at least one maximal chain. But by Lemma 6.27, two distinct maximal chains cannot intersect, and hence each i lies in a unique maximal chain. ■ k-1 k-1 0 0 We now have the tools required to prove Proposition 6.30 Given any y∈Nd-1 , evρ is transversal to ∆d. y ~ Proof Suppose evρ ( f,ϕ,x,x' )∈∆ d, so that as above without loss of generality we have x∉P (2d-1). Denote z = Φf,ϕ,y(x) = Φ f,ϕ,y(x'), so that evρ ( f,ϕ,x,x' ) = (z,z) and for convenience write Ξ = ( f,ϕ, x,x'). Let ei be a basis vector for T z R and e(i) be the vector (0, … , ei , … ,0)∈T zRd, so that e(0), e(1), … , e(d-1) forms a basis for T zR d. Denote by S the span of Image T Ξevρ + T (z,z)∆d. To show that evρ is transversal to ∆d, we thus need to prove that S = T (z,z)(Rd×Rd) = TzRd×TzRd. Since (u,u)∈ T (z,z)∆d for any u∈T zRd, it is sufficient to show that (e(i),0)∈S for all i∈{ 0, 1, … , d-1 }. By Lemma 6.29 the maximal chains form a partition of { 0, 1, … , d-1 } and hence it is sufficient to show that (e(i ),0)∈S for all j∈{ 0, 1, … , κ -1 } where (i0, i1, … , iκ-1 ) is a maximal chain. We do this in~ ductively, starting from the end of the chain. Since x∉P (2d-1), the points {x0 , … , xd-1 } are distinct, and xi ∉{x'0 , x'1 , … , x'd-1 } because the chain is maximal. Thus as in Proposition 5.5, by Corollary C.12 of [Stark, 1999] there exists a ξ∈T ϕ C r(M, R) such that ξ(xi ) = ei , ξ(xi ) = 0 for all i ≠ iκ-1 and ξ(x'i ) = 0 for all i. Thus T Ξevρ (0η ,ξ, 0 x ,0x' ) = (e(iκ ),0), and hence (e(iκ ),0)∈S. Since (e(iκ ),e(iκ ))∈T (z,z)∆ d ⊂ S, we also have (e(iκ ),e(iκ )) - (e(iκ ),0) = (0,e(iκ ))∈S. Now xi = x'i and by Lemma 6.27, xi ≠ x'i for all i ≠ iκ-1 and xi ≠ xi for all i ≠ iκ-2 . Thus, as above, we can find a ξ∈ T ϕ C r(M, R) such that TΞevρ (0η ,ξ, 0 x ,0x' ) = (e(iκ ),e(iκ )), and hence (e(iκ ),e(iκ ))∈S. We have already y y i y y j k-1 -1 -1 y -1 -1 k-1 k-1 -1 -1 -1 -1 k-2 k-2 k-2 y -2 -1 -2 -1 k-1 Embedding Stochastic Systems 43 shown that (0,e(iκ ))∈S, and so (e(iκ ),e(iκ )) - (0,e(iκ )) = (e(iκ ),0)∈S. Repeating this argument κ-1 times, we show that (e(i ),0)∈S for all j∈{ 0, 1, … , κ -1 }, as required. ■ -1 -2 j -1 -1 -2 Now, as usual, we apply the Parametric Transversality Theorem and count dimensions. We have dim (M×M)\∆ = 2m, codim ∆d = d and so if d ≥ 2m-1, we have dim (M×M)\∆ - codim ∆d < 0 < r. Thus by the Parametric Transversality Theorem, ρy(f,ϕ) is transversal to ∆d for a residual set of ( f,ϕ)∈E 2r and hence a residual set in D 2r(M×N, M)× C 2r(M, R) (we do not get an open set since (M×M)\∆ is not closed, but as indicated at the beginning of §6, density is sufficient). Fix any such ( f,ϕ), and suppose that ρy( f,ϕ)(x,x',y) = (Φ f,ϕ,y(x),Φ f,ϕ,y(x'))∈∆d for some (x,x' )∈(M× M)\∆. Then dim T (x,x' )((M×M)\∆) = 2m, and so the dimension of T (x,x' )[Φ f,ϕ,y ×Φf,ϕ,y](T (x,x' )((M× M)\∆)) cannot be greater than 2m. On the other hand dim T (z,z)∆d = d, and dim T (z,z)(Rd×Rd) = 2d. Thus if d ≥ 2m-1, then dim Image T(x,x' )[Φ f,ϕ,y ×Φf,ϕ,y] + dim T (z,z)∆ d < dim T (z,z)(Rd×Rd), so that Image T (x,x' )[Φ f,ϕ,y ×Φf,ϕ,y] + dim T (z,z)∆ d cannot span T (z,z)(R d× R d). Thus the only way for ρy( f,ϕ ) = Φ f,ϕ,y ×Φf,ϕ,y to be transversal to ∆ d is for the image of Φ f,ϕ,y ×Φf,ϕ,y not to intersect ∆d. Thus Φf,ϕ,y(x) ≠ Φ f,ϕ,y(x') for all x ≠ x', ie Φf,ϕ,y is injective for a residual set of ( f,ϕ)∈E 2r. Taking the intersection over the finite number of y∈Nd-1 , we obtain a residual set for which Φ f,ϕ,y is injective for all y, as required. 7. THEOREM 2.5: VARIATIONS AND PROOFS Since Φ f,ϕ,ω,η depends only on ω0, …, ωd-2 and η0, …, ηd-1, then as for Theorem 2.3, it is sufficient to consider Φ f,ϕ,y,z(x) (ϕ(f (0)(x,y),z0), ϕ(f (1)(x,y),z1), …, ϕ(f (d-1)(x,y),zd-1 ))† = for (y ,z)∈N d-1 ×(N')d. The reduction to this case is carried out exactly as in §3.1, and the details are therefore left to the reader. 7.1. THE D IMENSION OF N AND N' Depending on the dimensionality of N and N' a number of versions of Theorem 2.5 are possible. By far the easiest case is when dim N' > 0. Then η0, …, ηd-1 are distinct for µ'Σ' -almost all η, and hence the maps ϕη , ϕη , … , ϕη are independent. Theorem 2.5 then amounts to little more than the standard Whitney Embedding Theorem (eg [Hirsch, 1976]), and in fact can be strengthened to give: 0 1 d-1 Theorem 7.1 Let M, N and N' be compact manifolds, with m = dim M > 0 and dim N' > 0. Suppose that d ≥ 2m+1 and r ≥ 1. Given any f∈Dr(M×N, M) there exists a residual set of ϕ ∈ C r(M×N', R) such that for any ϕ in this set there is an open dense set Σf,ϕ ⊂ Σ×Σ' such that Φf,ϕ,ω,η is an embedding for all (ω ,η)∈Σf,ϕ . If µΣ and µ'Σ' are invariant measures on Σ and Σ' such that µd-1 and µ'd are absolutely continuous with respect to Lebesgue measure on N d-1 and (N')d respectively, then we can choose Σf,ϕ such that µΣ ×µ'Σ'(Σf,ϕ) = 1. Using the same procedure as in §3.1 this follows from: Theorem 7.2 Let M, N and N' be compact manifolds, with m = dim M > 0 and dim N' > 0. Suppose that d ≥ 2m+1 and r ≥ 1 and let ν d-1 and ν 'd denote Lebesgue measure on N d-1 and (N')d respectively. Given any f∈Dr(M×N, M) there exists a residual set of ϕ∈C r(M×N', R) such Embedding Stochastic Systems 44 that for any ϕ in this set there is an open dense set Uf,ϕ ⊂ N d-1 ×(N')d such that ν d-1 ×ν'd(Uf,ϕ) = 1 and the delay map Φf,ϕ,y,z is an embedding for all (y,z)∈Uf,ϕ. We give an easy direct proof of this, independent of the other proofs in this paper. Essentially, this proof amounts to a transversality proof of the Whitney Embedding Theorem (eg [Hirsch, 1976]). It is sketched in § 7.2 below. Observe that a minor modification of this proof can be used to show that given any f∈Dr(M×N, M) and any y∈N d-1 there is a residual set of ϕ ∈ C r(M×N', R) for which Φf,ϕ,y,z is an embedding for an open dense set of z of full measure. However, this residual set of ϕ depends on y, and hence this version does not seem to be useful from the point of applications. An alternative approach to Theorem 2.5 is to repeat the transversality arguments of §5.3 and §5.4 (when dim N > 0) and §6.3, §6.4 (when dim N = 0), replacing ϕ by ϕ j , ϕj , …, ϕjα, where ϕj(x) = ϕ(x,zj). This works for any dimension of N', but we obtain slightly stronger results when dim N' = 0, and hence we restrict ourselves to this case. 1 2 Theorem 7.3 Let M, N and N' be compact manifolds, with m = dim M > 0, dim N > 0 and dim N' = 0. Suppose that d ≥ 2m+1 and r ≥ 1. Then there exists a residual set of ( f,ϕ)∈ Dr(M×N, M)× C r(M×N', R) such that for any ( f,ϕ) in this set there is an open dense set Σf,ϕ ⊂ Σ such that Φf,ϕ,ω,η is an embedding for all ( ω ,η)∈Σf,ϕ ×Σ'. If µΣ is an invariant measure on Σ, such that µd-1 is absolutely continuous with respect to Lebesgue measure on N d-1, then we can choose Σf,ϕ such that µΣ (Σf,ϕ) = 1. As usual this is deduced from the final dimensional version, using the procedure given in §3.1: Theorem 7.4 Let M, N and N' be compact manifolds, with m = dim M > 0, dim N > 0 and dim N' = 0. Suppose that d ≥ 2m+1 and r ≥ 1 and let ν d-1 be Lebesgue measure on N d-1. Then there exists a residual set of ( f,ϕ)∈Dr(M×N, M)× C r(M×N', R) such that for any ( f,ϕ) in this set there is an open dense set Uf,ϕ ⊂ N d-1 such that νd-1 (Uf,ϕ ) = 1 and the delay map Φf,ϕ,y,z is an embedding for all (y,z)∈Uf,ϕ ×(N')d. The proof is sketched in §7.3 below. Finally we consider the case dim N = 0 and dim N' = 0: Theorem 7.5 Let M, N and N' be compact manifolds, with m = dim M > 0, dim N = dim N' = 0. Suppose that d ≥ 2m+1 and r ≥ 1. Then there exists an open dense set of ( f,ϕ)∈Dr(M×N, M)×C r(M×N', R) such that for any ( f,ϕ) in this set Φf,ϕ,ω,η is an embedding for all ( ω ,η)∈Σ×Σ'. This is immediately follows from the final dimensional version, which is proved in §7.4 below. Theorem 7.6 Let M, N and N' be compact manifolds, with m = dim M > 0, dim N = dim N' = 0. Suppose that d ≥ 2m+1 and r ≥ 1. Then there exists an open dense set of ( f,ϕ)∈Dr(M×N, M)× C r(M×N', R) such that for any ( f,ϕ ) in this set Φf,ϕ,y,z is an embedding for all (y,z)∈N d-1× (N')d. The proofs in this section all follow a familiar pattern: construction of an appropriate evaluation function, proof of its transversality, application of the Parametric Transversality Theorem and finally dimension counting. As usual we work with a sufficiently smooth f and ϕ, and reduce to C 1 at the end, and since Φf,ϕ,y,z is continuous in all its parameters, and embeddings are open, it is enough to prove that appropriate sets are dense and/or of full measure. We leave Embedding Stochastic Systems 45 most such repetitious technical details to the reader, and focus on constructing the necessary evaluation functions and showing their transversality. 7.2. SKETCH PROOF FOR DIM N' > 0 Although essentially independent of previous proofs in this paper, the proof of Theorem 7.2 still uses familiar ingredients. We start by defining ~ N'd { z∈(N')d : zi ≠ zj for all i ≠ j } = ~ ~ by analogy to (5.1). Since µ'd(N'd) = 1 we can restrict ourselves to z∈N'd . As usual, denote ϕz (x) ~ ~ = ϕ(x,zi ). Given f∈D2r(M×N, M), define τf : C 2r(M×N', R) → C r-1 (TM×N d-1 ×N'd ,T Rd) by τf (ϕ)(v,y,z) i T x Φf,ϕ,y,z(v) = Then evτ is C r-1 and T (ϕ,v,y,z)evτ (ξ,0v ,0y ,0z ) = (ϖ(T x ξ(v0,z0)), ϖ(T x ξ(v1,z1)), … , ϖ(T x ξ(vd-1,zd-1 ))) †. Since the points {z0 , … , zd-1 } are distinct, we can choose each Tx ξ(vi,zi ) independently, and hence T (ϕ,v,y,z)evτ is surjective. Thus evτ is transversal to the zero section L in T Rd. By the Parametric Transversality Theorem, therefore, there is a residual set of ϕ∈C 2r(M×N', R) such that ~ ~ τf (ϕ) is transversal to L. Fix any such ϕ and define τf,ϕ : N d-1 ×N'd → C r-1 (TM,T Rd) by τf,ϕ(y,z) = TΦf,ϕ,y,z . Then evτ is C r-1 and evτ (y,z,v) = τf,ϕ(y,z)(v) = τf (ϕ)(v,y,z) and hence evτ is transversal to L. Applying the Measure Theoretic Finite Dimensional Parametric Transversality Theorem (Appendix A), we obtain an open dense set of (y,z) of full µd-1 ×µ'd measure for which TΦf,ϕ,y,z is ~ transversal to L. But the dimension of Tv(TM) is 2 m-1, that of TwL is d and hence the dimension of the span of the image of T v(TΦf,ϕ,y,z) and T wL is at most 2m + d -1 < 2d if d ≥ 2m+1. Hence as usual, transversality implies that TΦf,ϕ,y,z does not intersect L and hence Φf,ϕ,y,z is immersive. ~ To prove injectivity, we define ρf : C 2r(M×N', R)→ C r((M×M)\∆×N d-1 ×N'd , Rd×Rd ) by f f 0 1 d-1 i f f f,ϕ f,ϕ f,ϕ ρf (ϕ)(x,x',y,z) = (Φ f,ϕ,y,z(x),Φ f,ϕ,y,z(x' )) Thus evρ is C r and T (ϕ,x,x',y,z)evρ (ξ,0x ,0x' ,0y ,0z ) = ((ξ(x,z0 ), ξ(x1,z1), … , ξ(xd-1 ,zd-1 )),(ξ(x',z0 ), ξ(x'1, ~ z1), … , ξ(x'd-1 ,zd-1 ))) †. Since for z∈N'd, the points {z0, … , zd-1 } are distinct, and xi ≠ x'i for all i, the points { (x0, z0), (x1, z1), … , (xd-1 , zd-1 ), (x'0, z0), (x'1, z1), … , (x'd-1 , zd-1 ) } are all distinct. Therefore T (ϕ,x,x',y,z)evρ is surjective, so that evρ is transversal to the diagonal ∆d in Rd×Rd. By the Parametric Transversality Theorem, there is a residual set of ϕ∈C 2r(M×N', R) such that ρf (ϕ) is transversal ~ to ∆d. Fix any such ϕ and define ρf,ϕ : N d-1 ×N'd → C r((M×M)\∆,Rd) by ρf,ϕ(y,z) = Φf,ϕ,y,z ×Φf,ϕ,y,z , so that ρf,ϕ(y,z)(x,x' ) = ρf (ϕ)(x,x',y,z). Then evρ ϕ is C r and evρ ϕ(y,z,x,x' ) = ρf (ϕ)(x,x',y,z) and hence evρ ϕ is transversal to ∆ d. Applying the Measure Theoretic Finite Dimensional Parametric Transversality Theorem (Appendix A), we obtain an open dense set of (y,z) of full µd-1 ×µ'd measure for which Φ f,ϕ,y,z ×Φf,ϕ,y,z is transversal to ∆d. The dimension of T (x,x' )(M×M)\∆ = is 2m and that of T(w,w)∆d is d and hence the dimension of the span of the image of T (x,x' )(Φ f,ϕ,y,z ×Φf,ϕ,y,z) and T (w,w)∆ d is at most 2m + d < 2d if d ≥ 2m+1. Hence as usual, transversality implies that Φ f,ϕ,y,z ×Φf,ϕ,y,z does not intersect ∆d and hence Φf,ϕ,y,z(x) ≠ Φ f,ϕ,y,z(x') for all x ≠ x', as required. f f f f f, f, f, Embedding Stochastic Systems 7.3. 46 SKETCH PROOF FOR DIM N > 0 The main difference between Theorems 2.3 and 2.5 is that in the latter, ϕ has an additional argument. If anything, this enhances our ability to make perturbations and hence should not affect the transversality of τ'I and ρ'I,R which are the key to the proof of Theorem 2.3 in §5.3 and §5.4. This is indeed the case and the transversality arguments in Propositions 5.4 and 5.5 can be repeated almost verbatim, replacing ϕ by ϕj , ϕj , …, ϕjα and ξ by ξj , ξj , …, ξjα, to give a yield a proof of Theorem 2.5 for the case dim N > 0. However, since in the case dim N' > 0 we have already given a proof in the previous section of a slightly stronger result, we shall restrict ourselves here to the case dim N' = 0, in which case we can prove that Φf,ϕ,y,z is an embedding for all z. This is because when dim N' = 0, there are only a finite number of possible z∈(N')d. We can thus work with a single fixed z∈(N')d, and at the end of the proof simply take the finite intersection of the appropriate residual sets of ( f,ϕ)∈D 2r(M×N,M)× C 2r(M×N',R). 1 2 1 2 α Thus fix z∈(N')d and by analogy to (5.2) define Φ f,ϕ,I,z : M×N d-1 → R by Φ f,ϕ,I,z(x,y) = (ϕ( f (j )(x,y),zj ), ϕ( f (j )(x,y),zj ), …, ϕ( f (jα)(x,y),zjα))† = (ϕ1( f (j )(x,y)), ϕ2(f (j )(x,y)), …, ϕα( f (jα)(x,y)) † 1 2 1 1 2 2 α where ϕj(x) = ϕ(x,zj) and Φf,ϕ,I,y,z : M → R by Φ f,ϕ,I,y,z(x) = Φ f,ϕ,I,z(x,y). Then, similarly to (5.4) let ~ α τI,z : D 2r(M×N,M)× C 2r(M×N',R) → C r-1 (TM× N d-1 ,T R ) by τI,z( f,ϕ) = TΦ f,ϕ,I,z which has the evaluation function evτ ( f,ϕ,v,y) = τI,z( f,ϕ)(v,y) = TxΦ f,ϕ,I,y,z(v), x = τM(v). As in §5.3, if TxΦ f,ϕ,I,y,z(v) ≠ 0 for some I, then T xΦ f,ϕ,y,z(v) ≠ 0. Hence to show that Φf,ϕ,y,z is immersive ~ it suffices to prove that for every v∈T x M there is some I such that T xΦ f,ϕ,I,y,z(v) does not lie in LI. ~ ~ α To show this, define τ'I,z : D2r(M×N, M)× C 2r(M×N', R) → C r-1 (TM×Nd-1 , T R ×Md) by I,z τ'I,z( f,ϕ) (τI,z( f,ϕ), ιτ ° ρ̂( f )) = so that, as in §5.3, evτ' is C r-1 and I,z evτ' ( f,ϕ,v,y) I,z (T xΦ f,ϕ,I,y,z(v),( f (0)(x,y), f (1)(x,y), … , f (d-1)(x,y))) = The proof that evτ' is transversal to LI×∆I is identical to that for evτ' given in Propositions 5.4. Let Ξ = ( f,ϕ,v,y) and suppose that evτ' (Ξ)∈LI×∆I . The second component of evτ' is the same as the second component of evτ' and hence is surjective. Thus given any u = (u0, u1, … , ud-1)∈T _w∆I , there exists η∈T f D2r(M×N,M) and w ∈T v(TM) such that T (f,v,y)evρ̂τ(η,w,0y ) = u, where evρ̂τ is the evaluation map of ιτ ° ρ̂. The first component is given by T Ξevτ (0f ,ξ,0x ,0y ) = (ϖ (T x ξj (vj )), ϖ(T x ξj (v j )), … , ϖ(T xα ξjα(vjα ))) † where now ξ∈T ϕ C 2r(M×N', R) and ξj(x) = ξ(x,zj). Since ρ̂( f )(x, _ α y)∈∆I , the points xj , xj , … , xjα are all distinct. Thus for any u∈T 0(T R ) there exists a ξ∈T ϕ C 2r(M, _ _ R) such that TΞevτ (0f ,ξ,0x ,0y ) = u - T Ξevτ (η,0ϕ,w,0y ) and hence T Ξevτ' (η,ξ,w,0y ) = ( u,u). Thus T Ξevτ is surjective so that T Ξevτ is transversal to LI×∆I . I,z I I,z I,z I 2 2 I,z 2 1 I,z 1 2 I,z 1 1 I,z I,z I,z By the Parametric Transversality Theorem there is a residual set of ( f,ϕ)∈D2r(M×N, M)× C 2r(M× ~ N', R) for which τ'I,z( f,ϕ) is transversal to LI ×∆I . Fix any ( f,ϕ) in this set, and define τ'f,ϕ,I,z : Nd-1 ~ α → C r-1 (TM, T R ×Md) by Embedding Stochastic Systems τ'f,ϕ,I,z(y)(v) 47 τ'I,z(f,ϕ)(v,y) = so that evτ' ϕ is transversal to LI ×∆I . Therefore, by the Measure Theoretic Finite Dimensional Parametric Transversality Theorem (Appendix A), τ'f,ϕ,I,z(y) is transversal to LI ×∆I for a residual ~ set of full Lebesgue measure of y in Nd-1 . The same dimension counting argument as in §5.3 shows that this implies that the image of τ'f,ϕ,I,z(y) cannot intersect LI ×∆I , so that either ~ T xΦ f,ϕ,I,y,z(v) ≠ 0 or ρ̂( f )(x,y)∉∆I . But for every (x,y)∈M×Nd-1 we have ρ̂( f )(x,y)∈∆ I for some I, and thus T xΦ f,ϕ,I,y,z(v) ≠ 0 for some I. For this choice of I we must thus have TxΦ f,ϕ,I,y(v)∉LI . ~ Thus T xΦ f,ϕ,y,z(v) ≠ 0 for all v∈TM, so that Φ f,ϕ,y,z is immersive. f, ,I,z γ To prove injectivity, we define Φ f,ϕ,I,R,z : M ×N d-1 → R as in §5.4 by Φ f,ϕ,I,R,z(x,y) = (ϕ( f (j' )(x,y),zj' ), ϕ( f (j' )(x,y),zj' ), … , ϕ( f (j'γ)(x,y),zj'α ))† = (ϕj'(xj' ), ϕj'(xj' ), …, ϕj'γ (xj'γ ))† 1 1 1 2 2 1 2 2 ~ γ γ and let ρI,R,z : D2r(M×N,M)× C 2r(M×N', R) → C r((M×M)\∆×Nd-1 , R ×R ) by ρI,R,z( f,ϕ)(x,x',y) = (Φ f,ϕ,I,R,z(x,y),Φ f,ϕ,I,R,z(x',y)) ~ γ γ and ρ'I,R,z : D2r(M×N,M)× C 2r(M×N', R) → C r((M×M)\∆×Nd-1 , R ×R ×Md×M d ) with ρ'I,R,z( f,ϕ) (ρI,R,z( f,ϕ),ρ̂'( f )) = We thus need to show that for every (x,x' )∈(M×M)\∆, there is some I,R such that ρI,R,z( f,ϕ)(x,x', y)∉∆γ . To do this, we first prove that evρ' is transversal to ∆γ ×∆ I,R , as in Proposition 5.5. Let Ξ = ( f,ϕ,x,x',y) and suppose that evρ' (Ξ)∈∆γ ×∆ I,R The second component evρ̂' of evρ' is identical to that of evρ' and hence T ( f,x,x',y)evρˆ' is surjective, and in particular, given any (u 0, … , ud-1)∈ T x M× … ×Tx M and (u'0, … , u'd-1)∈T x' M× … × Tx' M we can find a η∈T f C 2r(M×N, M) such that T ( f,x,x',y)evρ̂'(η,u0 ,u'0 ,0y ) = ((u0, … , ud-1),(u'0, … , u'd-1)). If γ = 0, this is sufficient to ensure the surjectivity of evρ' . Otherwise, let TΞevρ (η,0ξ ,u0 ,u'0 ,0y ) = ( ~ u ,~ u ' ). Since ρ̂'( f )(x,x',y)∈∆ I,R , the points xj , xj , … , xjα are all distinct and { xj , xj , … , xjα } ∩ { x'j' , x'j' , … , x'j'γ } = ∅. The first component of T Ξevρ' is given by T Ξevρ (0η ,ξ,0x ,0x' ,0y ) = (ξj' (xj' ), ξj' (xj' ), …, ξj'γ (xj'γ )),(ξj' (x'j' ), ξj' (x'j' ), __ γ γ … , ξj'γ (x'jγ' ))) and thus, given any (u,u' )∈T _wR ×T_wR , there exists a ξ∈T ϕ C 2r(M×N', R) such that _ _ _ _ _ u -~ u ',0w_ ). Then T Ξevρ (η,ξ,u0 ,u'0 ,0y ) = ( u-u'+~ u ',~ u ' ). Since T Ξvρ (0η ,ξ,0x ,0x' ,0y ) = ( u-u',0w ) - ( ~ _ _ _ _ ~ _ _ ∆ and ( ~ _ _ ∆ we have ( u,u' )∈ Image T ev _ _ ∆ . Hence ev both (u',u' )∈T (w, u ', u ' )∈ T + T ρ ρ' w) γ (w,w) γ Ξ (w,w) γ is transversal to ∆γ ×∆ I,R . I,R,zz I,R,zz 0 d-1 I,R,z I,R 0 1 I,R,z 2 I ,R,z 1 1 1 2 2 2 1 I ,R,z I ,R,z 2 d-1 I,R,z 1 I ,R,z 1 2 2 I ,R,z I,R,zz By the Parametric Transversality Theorem there is a residual set of ( f,ϕ)∈D2r(M×N, M)× C 2r(M× N', R) for which ρI,R,z( f,ϕ) is transversal to ∆γ ×∆I,R . Fix any ( f,ϕ) in this set, and define ρ'f,ϕ,I,R,z : ~ γ γ Nd-1 → C r((M×M)\∆, R ×R ×Md×M d) by ρ'f,ϕ,I,R,z(y)(x,x' ) = ρ'I,R,z( f,ϕ)(x,x',y) Thus evρ' ϕ (x,x',y) = ρ 'I,R,z( f,ϕ)(x,x',y) and so evρ' ϕ is transversal to ∆γ ×∆I,R . Thus by the Measure Theoretic Finite Dimensional Parametric Transversality Theorem (Appendix A), ~ ρ'f,ϕ,I,R,z(y) is transversal to ∆γ ×∆I,R for a residual set of full Lebesgue measure of y in Nd-1 . The same dimension counting argument as in §5.4 shows that therefore the image of ρ'f,ϕ,I,R,z(y) canf, ,I,R,z f, ,I,R,z Embedding Stochastic Systems 48 not intersect ∆γ ×∆I,R . Thus for every (x,x' )∈(M×M)\∆ we have Φf,ϕ,I,R,z(x,y) ≠ Φ f,ϕ,I,R,z(x',y) for some I,R and thus Φf,ϕ,y,z(x) ≠ Φ f,ϕ,y,z(x' ). Hence Φ f,ϕ,y,z is injective. 7.4. PROOF FOR DIM N = 0, DIM N' = 0 This is similar in spirit to the previous sections, except that we follow §6.3 and §6.4 instead of ~ ~ §5.3 and §5.4. We thus start by recalling the definition of P (2d-1) for a given f∈B (2d). Then, as in Proposition 6.24, we construct a dense open set of ϕ∈C 2r(M×N', R) for which ϕ(x,z0) ≠ ϕ(x',z0) ~ ~ for all x,x'∈P (2d-1) such that x ≠ x' and all z0∈N'. This is possible since P (2d-1) and N' consist of a finite number of points. Next, as in Proposition 6.25, we construct an open and dense set of ϕ∈ ~ C 2r(M×N', R) for which if x∈Py … y for some y∈Nd-1 , then T xΦ f,ϕ,y,z has rank m for all z∈(N')d. We let Af be the intersection of the resulting sets of ϕ and define E 2r as before to be the set of ( f,ϕ) ~ such that f∈B (2d) and ϕ∈Af . This is open dense in D 2r(M×N, M)× C 2r(M×N', R). Now, fix both y∈ ~ Nd-1 and z∈(N')d and by analogy to (6.7), define τy,z : E 2r → C r-1 (TM,T Rd) by 0 d-2 τy,z(f,ϕ) = TΦ f,ϕ,y,z Then evτ is C r-1 and T ( f,ϕ,v)evτ (0f ,ξ,0x ) = ( ϖ(T x ξz (v 0)), ϖ(T x ξz (v 1 )), … , ϖ (T x ξz (vd-1))) † . If ~ evτ ( f,ϕ,v)∈L, then rank T xΦ f,ϕ,y,z < m, and since ϕ∈Af , we cannot have x∈Py … y . Thus the points {x0, … , xd-1 } are distinct and so T ( f,ϕ,v)evτ is surjective, and hence evτ is transversal to L. Thus by the Parametric Transversality Theorem, τy,z(f,ϕ) is transversal to L for an open dense set of ( f,ϕ)∈D 2r(M×N, M)× C 2r(M×N', R). For any such ( f,ϕ) the same dimension calculation as in §6.3 shows that transversality implies non-intersection. Hence τy,z(f,ϕ)(v) = T xΦ f,ϕ,y,z(v)∉L for ~ all v∈TM, so that Φ f,ϕ,y,z is immersive. y,z y,z y,z 0 0 1 1 d-1 d-1 0 y,z d-2 y,z To prove injectivity, we define ρy,z : E 2r → C r((M×M)\∆, Rd×Rd ) by ρy,z( f,ϕ)(x,x') = (Φ f,ϕ,y,z(x),Φ f,ϕ,y,z(x' )) If evρ ( f,ϕ,x,x' )∈∆d, we have Φ f,ϕ,y,z(x) = Φ f,ϕ,y,z(x' ) and hence in particular ϕz (x) = ϕz (x' ). Since ~ ϕ∈Af , at least one of x or x' cannot be in P (2d-1) and as before we assume without loss of generality that this is x. Then evρ is C r and we show that evρ is transversal to ∆d exactly as in the proof of Proposition 6.30. Thus by the Parametric Transversality Theorem, ρy,z(f,ϕ) is transversal to ∆d for a residual set of ( f,ϕ ) in D 2r(M×N, M)× C 2r(M×N', R). For any such ( f,ϕ) the same dimension calculation as in §6.4 shows that transversality implies non-intersection. Hence for a residual set of ( f,ϕ) we have Φ f,ϕ,y,z(x) ≠ Φ f,ϕ,y,z(x' ) for all x ≠ x', ie Φ f,ϕ,y,z is injective. y,z 0 0 y,z y,z Taking the intersection over all y∈Nd-1 and z∈(N')d we obtain a residual set of ( f,ϕ) for which Φ f,ϕ,y,z is an injective immersion, and hence an embedding since M is compact. Finally, embeddings are open in C r(M, Rd) and Φf,ϕ,y,z depends continuously on f and ϕ, and hence the set of ( f,ϕ) for which Φf,ϕ,y,z is an embedding is open and dense. 8. PROOF OF THEOREM 2.6 We now turn to the case where both f and ϕ depend on the same process, so that after reduction to finite dimensions we want to show that Φ'f,ϕ,y(x) = (ϕ(f (0)(x,y),y0), ϕ(f (1)(x,y),y1), …, ϕ(f (d-1)(x,y),yd-1 ))† Embedding Stochastic Systems 49 is an embedding. As with both Theorem 2.3 and Theorem 2.5 we have a different approach depending on the dimension of N. The case of dim N > 0 is very similar to §7.2 above, whilst that for dim N = 0 parallels §7.4. We briefly sketch details in the next two sections respectively. 8.1. DIM N>0 When dim N > 0 we can use a very similar approach to §7.2 above to obtain a slightly stronger version of Theorem 2.6: Theorem 8.1 Let M and N be compact manifolds, with m = dim M > 0 and dim N > 0. Suppose that d ≥ 2m+1 and r ≥ 1. Given any f∈Dr(M×N, M) there exists a residual set of ϕ∈C r(M× N, R) such that for any ϕ in this set there is an open dense set Σf,ϕ ⊂ Σ such that ~ Φ f,ϕ,ω is an emZ bedding for all ω ∈Σf,ϕ . If µΣ is an invariant measure on Σ = N with µd-1 absolutely continuous with respect to Lebesgue measure on N d-1, then we can choose Σf,ϕ such that µΣ (Σf,ϕ ) = 1. Using the usual reduction to N d-1, as in §3.1, this follows from Theorem 8.2 Let M and N be compact manifolds, with m = dim M > 0 and dim N > 0. Suppose that d ≥ 2m+1 and r ≥ 1 and let ν d-1 be Lebesgue measure on N d-1. Then given any f∈ Dr(M×N, M), there exists a residual set of ϕ∈C r(M×N, R) such that for any ϕ in this set there is ~ an open dense set Uf,ϕ ⊂ N d-1 such that ν d-1 (Uf,ϕ ) = 1 and the delay map Φ f, ϕ,y is an embedding for all y∈Uf,ϕ . Sketch Proof This is identical to the proof of Theorem 7.2, apart from appropriate changes to relevant definitions. The dependence of ϕ on y causes us no difficulties since in establishing transversality we never need make perturbations in y. Thus fix f∈D2r(M×N, M) and define τ'f : ~ ~ C 2r(M×N, R) → C r-1 (TM×Nd ,T Rd) by τ'f (ϕ)(v,y) T x Φ'f,ϕ,y(v) = Then ev τ' is C r-1 and T (ϕ,v,y)evτ' (ξ,0v ,0y ) = (ϖ(T x ξ(v0,y0)), ϖ(T x ξ(v1,y1)), … , ϖ(T x ξ(vd-1,yd-1 ))) †. Since the points {y0 , … , yd-1 } are distinct, we can choose each Tx ξ(vi,yi ) independently, and hence T (ϕ,x,y)evτ' is surjective. Thus evτ' is transversal to the zero section L in TRd and by the Parametric Transversality Theorem, therefore, there is a residual set of ϕ∈C 2r(M×N, R) such that ~ ~ τf' (ϕ) is transversal to L. Fix any such ϕ and define τf,' ϕ : Nd → C r-1 (TM,T Rd) by τ'f ,ϕ(y) = TΦ'f,ϕ,y . Then ev τ' is C r-1 and evτ' (y,v) = τ'f (ϕ)(v,y) and hence ev τ' is transversal to L. Applying the Measure Theoretic Finite Dimensional Parametric Transversality Theorem (Appendix A), we obtain an open dense set of y of full µd measure for which TΦ'f,ϕ,y is transversal to L. The same dimension calculation as in §7.2 then shows that, as usual, transversality implies that TΦ'f,ϕ,y does not intersect L and hence Φ'f,ϕ,y is immersive. ~ To prove injectivity, we define ρ'f : C 2r(M×N, R)→ C r((M×M)\∆×Nd , Rd×Rd ) by f f 0 1 d-1 i f f f,ϕ f,ϕ = f,ϕ ρ'f (ϕ)(x,x',y) (Φ'f,ϕ,y(x),Φ'f,ϕ,y(x' )) Thus evρ' is C r and T (ϕ,x,x',y)evρ' (ξ,0x ,0x' ,0y ) = ((ξ(x,y0), ξ(x1,y1), … , ξ(xd-1 ,yd-1 )),(ξ(x',y0 ), ξ(x'1,y1), ~ … , ξ(x'd-1 ,yd-1 ))) †. Since for y∈Nd , the points {y0 , … , yd-1 } are distinct, and xi ≠ x'i for all i, the points { (x0, y0), ( x1, y1), … , (xd-1 , yd-1 ), (x'0, y0), ( x'1, y1), … , (x'd-1, yd-1 ) } are all distinct. Therefore f f Embedding Stochastic Systems 50 T (ϕ,x,x',y)evρ' is surjective, so that evρ' is transversal to the diagonal ∆d in Rd×Rd. By the Parametric Transversality Theorem, there is a residual set of ϕ∈C 2r(M×N, R) such that ρ'f (ϕ) is transversal ~ to ∆d. Fix any such ϕ and define ρ'f,ϕ :Nd → C r((M×M)\∆,Rd) by ρ'f,ϕ(y)(x,x' ) = ρ'f (ϕ)(x,x',y). Then evρ' ϕ is C r and ev ρ ' ϕ(y,x,x' ) = ρf' (ϕ)(x,x',y) and hence ev ρ' ϕ is transversal to ∆ d . Applying the Measure Theoretic Finite Dimensional Parametric Transversality Theorem (Appendix A), we obtain an open dense set of y of full µd measure for which Φ'f,ϕ,y ×Φ'f,ϕ,y is transversal to ∆d. The same dimension calculation as in §7.2 shows that transversality implies that does not intersect ∆d and hence Φf,ϕ,y,z(x) ≠ Φ f,ϕ,y,z(x') for all x ≠ x', as required. f f f, f, f, 8.2. N=0 DIM In this case, as with Theorem 2.4 and Theorem 7.5 we can strengthen Theorem 2.6 to Theorem 8.3 Let M and N be compact manifolds, with m = dim M > 0 and dim N = 0. Suppose that d ≥ 2m+1 and r ≥ 1. Then there exists an open dense set of ( f,ϕ)∈Dr(M×N,M)× C r(M×N, R) such that for any ( f,ϕ) in this set Φ'f,ϕ,ω is an embedding for all ω ∈Σ. This is follows from the final dimensional version: Theorem 8.4 Let M and N be compact manifolds, with m = dim M > 0 and dim N = 0. Suppose that d ≥ 2m+1 and r ≥ 1. Then there exists an open dense set of ( f,ϕ)∈Dr(M×N,M)× C r(M×N, R) such that for any ( f,ϕ) in this set Φ'f,ϕ,y is an embedding for all y∈N d. Sketch Proof This closely follows the proof of Theorem 7.6, which in turn was based on §6.3 and §6.4. As in Proposition 6.24, we construct a dense open set of ϕ ∈C 2r(M×N, R) for which ~ ϕ(x,y0) ≠ ϕ(x',y0) for all x,x'∈P (2d-1) such that x ≠ x' and all y0∈N. Then, as in Proposition 6.25, ~ we construct an open and dense set of ϕ∈C 2r(M×N, R) ) for which if x∈Py … y for some y∈Nd, then T xΦ'f,ϕ,y has rank m for all y∈Nd. We let Af be the intersection of the resulting sets of ϕ and ~ define E 2r as before to be the set of ( f,ϕ ) such that f∈B (2d) and ϕ ∈Af . This is open dense in ~ D 2r(M×N, M)× C 2r(M×N, R). Now fix y∈Nd and define τy : E 2r → C r-1 (TM,T Rd) by d-2 τ'y(f,ϕ) 0 T xΦ'f,ϕ,y = Then evτ' is C r-1 and T ( f,ϕ,v)evτ' (0f ,ξ,0x ) = (ϖ(T x ξy (v 0)), ϖ (T x ξy (v 1 )), … , ϖ(T x ξy (vd-1))) † . If ~ evτ' ( f,ϕ,v)∈L, then rank T xΦ'f,ϕ,y < m, and since ϕ∈Af , we cannot have x∈Py … y . Thus the points {x0, … , xd-1 } are distinct and so T ( f,ϕ,v)evτ' is surjective, and hence evτ' is transversal to L. Thus by the Parametric Transversality Theorem, τ'y(f,ϕ) is transversal to L for an open dense set of ( f, ϕ)∈D 2r(M×N, M)× C 2r(M×N, R). For any such ( f,ϕ) the same dimension calculation as in §6.3 ~ shows that transversality implies non-intersection. Therefore T xΦ'f,ϕ,y(v)∉L for all v∈TM, so that Φ'f,ϕ,y is immersive. y y y 0 0 1 1 d-1 0 y d-1 d-2 y To prove injectivity, we define ρ'y : E 2r → C r((M×M)\∆, Rd×Rd ) by ρ'y( f,ϕ)(x,x') = (Φ'f,ϕ,y(x),Φ'f,ϕ,y(x' )) If evρ' ( f,ϕ,x,x' )∈∆d, we have Φ'f,ϕ,y(x) = Φ'f,ϕ,y(x' ) and in particular ϕy (x) = ϕy (x' ). Since ϕ∈Af , at ~ least one of x or x' cannot be in P (2d-1) and as usual we assume without loss of generality that this is x. We show that evρ' is transversal to ∆d exactly as in the proof of Proposition 6.30. Thus y y 0 0 Embedding Stochastic Systems 51 by the Parametric Transversality Theorem, ρ'y( f,ϕ) is transversal to ∆d for a residual set of ( f,ϕ) in D 2r(M×N, M)× C 2r(M×N, R). For any such ( f,ϕ) the same dimension calculation as in §6.4 shows that transversality implies non-intersection and hence that Φ'f,ϕ,y(x) ≠ Φ'f,ϕ,y(x' ) for all x ≠ x'. Hence Φ'f,ϕ,y is injective for a residual set of ( f,ϕ). Taking the intersection over all y∈Nd we obtain a residual set of ( f,ϕ) for which Φ'f,ϕ,y is an injective immersion, and hence an embedding since M is compact. Finally, embeddings are open in C r(M, Rd) and Φ'f,ϕ,y depends continuously on f and ϕ, and hence the set of ( f,ϕ) for which Φ'f,ϕ,y is an embedding is open and dense. APPENDIX A Measure Theoretic Finite Dimensional Parametric Transversality Theorem Let A, M and N be C r manifolds and ρ : A → C r(M,N) a map such that evρ is C r. Let L ⊂ N be a submanifold of finite codimension p in N. Suppose that A and M are finite dimensional, that r > max {0, m-p} where m is the dimension of M and that evρ is transversal to L. Then the set of a such that ρ(a) is not transversal to L has zero Lebesgue measure in A. Furthermore if L is closed and M is compact then the set of such a is open. Sketch Proof This is identical to the proof of the standard Parametric Transversality Theorem, except that instead of Smale’s Density Theorem, we use Sard’s Theorem. Recall that this states that the set of regular values of a C r map f : M → N has full Lebesgue measure in N if r ≥ max {0, m-n} where m = dim M and n = dim N. A regular value is a point y∈N such that T xf is surjective at any x∈f -1 (y) (so that any y such that f -1 (y) = ∅ is automatically regular). So, suppose that evρ is transversal to L, then L' = (evρ)-1 (L ) is a C r submanifold of A×M of codimension p. Since evρ is transversal to L, given any (a,x)∈L' and any u∈T yN, where y = evρ(a, x) = ρ(a)(x), there exists a α ∈T a A, v∈T xM and w∈T yL such that u = T (a,x)evρ(α ,v ) + w = T a ρ(α )(x) + Txρ(a)(v) + w. Let πL' : L' → A be the restriction to L' of the projection πA : A×M → A onto the first factor and suppose that a is a regular value of πL' . If (πL')-1 (a) = ∅, then ρ(a)(x)∉L for all x∈M and so ρ(a) is trivially transversal to L. On the other hand, if for some x∈M we have ρ(a)(x)∈L then given any u∈T yN, by above there exists a α , v and w such that Ta ρ(α )(x) + Txρ(a)(v) + w = u. Since T(a,x)πL' is surjective, there exists a v'∈T xM such that (α ,v' )∈T (a,x)L', which implies that (Ta ρ(α )(x) + T xρ(a)(v' ))∈T yL. Thus by linearity Txρ(a)(v-v' ) + (Ta ρ(α )(x) + T xρ(a)(v' ) + w) = u, which implies that ρ(a) is transversal to L. Hence if a is a regular value of πL' then ρ(a) is transversal to L (the converse is also true) and so the set of a such that ρ(a) is transversal to L has full Lebesgue measure. ■ ACKNOWLEDGEMENTS JS would like to thank helpful conversations with Ludwig Arnold over many years. Without his encouragement and generous advice, this paper would never have been written. Some of the preliminary work was carried out at IMPA, Brazil, during the International Conference on Dynamical Systems held there in August 1993, and JS is extremely grateful to IMPA for their hospitality and financial support, and to the Royal Society for proving travel funds to attend Embedding Stochastic Systems 52 this conference. The authors would also like to thank the Basic Research Institute in Mathematical Sciences at Hewlett-Packard in Bristol for its hospitality in the summer of 1994 and the Isaac Newton Institute Part for its hospitality and financial support during the programme “From Finite to Infinite Dimensional Dynamical Systems” in the summer of 1995. Finally, we would like to gratefully acknowledge financial support from the UK Engineering and Physical Sciences Research Council (GR/J97243 and GR/L42513), from the Nuffield Foundation (SCI/180/92/365/G), the Royal Commission for the Exhibition of 1851 and the Royal Society (Leverhume Trust Senior Research Fellowship awarded to JS). REFERENCES Abraham R, 1963, Transversality in Manifolds of Mappings, Bull. Am. Math. Soc., 69, 470474. Abraham R and Robbin J, 1967, Transversal Mappings and Flows, W.D. Benjamin, New York. Aeyels D, 1981, Generic Observability of Differentiable Systems, SIAM J. Control and Optimization, 19, 595-603. Arnold L, 1998, Random Dynamical Systems, Springer-Verlag, Berlin. Barnsley M, 1988, Fractals Everywhere, Academic Press, Boston. Broomhead DS, Huke JP and Muldoon MR, 99, Oversampling Nonlinear Digital Channels, in preparation. Casdagli M, 1992, A Dynamical Systems Approach to Modelling Input-Output Systems. In: Casdagli M and Eubank S, eds, Nonlinear Modelling and Forecasting, Addison-Wesley, Reading, MA. Chen S and Billings SA, 1989, Modelling and Analysis of Non-linear Time Series, Int. J. Control, 50, 2151-2171. Eells J, 1966, A Setting for Global Analysis, Bull. Amer. Math. Soc., 72, 751-807. Foster MJ, 1975, Calculus on Vector Bundles, J. London Math. Soc., 11, 65-73. Franks J, 1979, Manifolds of Cr Mappings and Applications to Dynamical Systems, Adv. in Math. Supplementary Studies, 4, 271-290. Huke, J.P.,1993, Embedding Nonlinear Dynamical Systems, A Guide to Takens Theorem, Internal Report, DRA, Malvern, UK. Kifer Y, 1988, Random Perturbations of Dynamical Systems, Birkhäuser, Basle. Martin R, 1998, Irregularly Sampled Signals: Theories and Techniques for Analysis, PhD Thesis, UCL. Norman MF, 1968, Some Convergence Theorems for Stochastic Learning Operators, J. Math. Psychology, 5, 61-101. Ott E, Sauer T and Yorke JA, 1994, Coping with Chaos, Wiley, New York. Sauer T, Yorke JA and Casdagli M, 1991, Embedology, J. Stat. Phys., 65, 579-616. Stark J, 1999, Delay Embeddings for Forced Systems. I. Deterministic Forcing, Journal of Nonlinear Science, 9, 255-332. Stark J, 2001, Delay Reconstruction: Dynamics v Statistics. In: Mees AI, ed, Nonlinear Dynamics and Statistics, Birkhäuser, Basle. Embedding Stochastic Systems 53 Stark J, Broomhead DS, Davies ME and Huke J, 1997, Takens embedding theorems for forced and stochastic systems, Nonlinear Analysis-Theory Methods & Applications, 30, 5303-5314. Takens F, 1980, Detecting Strange Attractors in Fluid Turbulence. In: Rand DA and Young L-S, eds., Dynamical Systems and Turbulence, Warwick 1980., Springer-Verlag, Berlin.