Fully Reflexive Intensional Type Analysis in Type Erasure Semantics Bratin Saha Valery Trifonov Zhong Shao Department of Computer Science Yale University !"$#%#&' August 27, 2000 Abstract Compilers for polymorphic languages must support runtime type analysis over arbitrary source language types for coding applications like garbage collection, dynamic linking, pickling, etc. On the other hand, compilers are increasingly being geared to generate type-safe object code. Therefore, it is important to support runtime type analysis in a framework that generates type correct object code. In this paper, we show how to integrate runtime type analysis, over arbitrary types in a source language, into a system that can propagate types through all phases of compilation. Keywords: runtime type analysis, type-safe object code 1 Introduction Modern programming paradigms increasingly rely on applications requiring runtime type analysis, like dynamic linking, garbage collection, and pickling. For example, Java adopts dynamic linking and garbage collection as central features. Distributed programming requires that code and data on one machine be pickled for transmission to a different machine. In a polymorphic language, the compiler must rely on runtime type information to implement these applications. Furthermore, these applications may operate on arbitrary runtime values; therefore, the compiler must support the analysis of arbitrary source language types. We term this ability of analyzing arbitrary source language types as fully reflexive type analysis. On the other hand, one of the primary goals of a modern compiler is to generate certified code. A certifying compiler [11] is appealing for a number of reasons. We no longer need to trust the correctness of the compiler; instead, we can verify the correctness of the generated code. Checking the correctness of a compiler-generated proof (of a program property) is much easier than proving the correctness of the compiler. Moreover, since we can verify code before executing it, we are no longer restricted ( This research was sponsored in part by the Defense Advanced Research Projects Agency ISO under the title “Scaling Proof-Carrying Code to Production Compilers and Security Policies,” ARPA Order No. H559, issued under Contract No. F30602-99-1-0519, and in part by NSF Grants CCR-9633390 and CCR9901011. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Defense Advanced Research Projects Agency or the U.S. Government. to executing code generated only by trusted compilers. A necessary step in building a certifying compiler is to have the compiler generate code that can be type-checked before execution. The type system ensures that the code accesses only the provided resources, makes legal function calls, etc. Therefore, it is important to support runtime type analysis (over arbitrary source language types) in a framework that can generate type correct object code. Crary et al. [3] proposed a framework that can propagate types through all phases of compilation. The main idea is to construct and pass terms representing types, instead of the types themselves, at runtime. This allows the use of pre-existing term operations to deal with runtime type information. Semantically, singleton types are used to connect a type to its representation. From an implementor’s point of view, this framework (hereafter referred to as the CWM framework) seems to simplify some phases in a type preserving compiler; most notably, typed closure conversion [9]. However, the framework proposed in [3] supports the analysis of inductively defined types only; specifically, it does not support the analysis of polymorphic or recursive types. This limits the applicability of their system since most type analyzing applications must deal with recursive objects or polymorphic code blocks. In this paper, we extend the CWM framework and encode a language supporting fully reflexive type analysis into this framework. The language is based on our previous work [13]; accordingly, it introduces polymorphism at the kind level to handle the analysis of quantified types. This requires a significant extension of the CWM framework. Moreover, even with kind polymorphism, recursive types pose a problem. We deal with this by suitably constraining the analysis of recursive types in the source language, and by using an unconventional rule for )+*, -/./0$1$)+*, expressions in the target language. The rest of the paper is organized as follows. We give an overview of intensional type analysis in Section 2. We present the source language 247365 in Section 3. Section 4 shows the target language 2 38 that extends the CWM framework. We offer a translation from 2 7395 to 2 38 in Section 5. 2 Intensional type analysis Harper and Morrisett [7] proposed intensional type analysis and presented a type-theoretic framework for expressing computations that analyze types at runtime. They introduced two oper- ators for explicit type analysis: : ;<>=/?@A%= for the term level and B ;<=C!=? for the type level. For example, a polymorphic subscript function for arrays might be written as the following pseudocode: AD0$EGFIHKJMLN: ;<P =?@AD=JO P *) 1$:RQ 1$:%A%0$E U C!=@,/QSCT=/@, AD0$E U9Y QSE>*DV=-$AD0$EXW w P 1$:upMZ Zpg p g /up 0 , pd, @Y ?/= pJ p K H K r % L { ¡ p 2 ¢ J f x v T L { p {W£v p {c{ B p ;<>=C!=?{*)4`a{$¤ ¥n¦D§${¨©§{ª§{ ª b {©Ze{  ``Z Z]bY { b{ g>JufvxLT{  `ag \W£v bx`2JufvxLT{b g rcLT{  g `¹HKrKL{b Figure 2: Syntactic sugar for 2 76 3 5 types and then instantiating it to a specific kind before forming polymorphic types. More importantly, this technique also removes the negative occurrence of from the kind of the argument of the constructor g . Hence in our new 2 7365 calculus (see Figures 1 and 2), we extend | } with variable and polymorphic kinds (r and g>rcLNv ) and add a type constant g of kind grKL/`ar[Zcb^Z to the type language. The polymorphic type gJµf9vxL%{ is now Y represented as g \W£v `2JufvxLT{b . Y While analyzing a polymorphic type, g W£v { , the kind v must be held abstract to ensure termination of the analysis [13]. ThereB fore, the ;<=C!=? operator needs a kind abstraction HKrKL%{ in the branch corresponding to the g constructor. We provide kind abY straction and kind application {W£v at the type level. The formation rules for these constructs, excerpted from Figure 3, are We define the syntax of the 247365 calculus in Figures 1 and 2. The static semantics of 2 7365 uses the following three environments: fnfnF fnfnF fnfnF {fnfnF Figure 1: Syntax of the 2 7395 language 3 The source language 2 79 3 5 s `b Y Y `¼j½¢b¾¶fnfnF³°¿p yÀp ¶cW£v p¶KW { p¶ ¶ pt)¹*, -º {9»¶pd0$1$)¹*, -$º {9»¶ pt: ;<>=?/@AD=º {»{ *)4`a¶¤ ¥n¦%§¶¨©§¶ª§¶ ª §¶Áb Typing this subscript function is more because it P P interesting, P must have all of the types 1$:%@C!C!@%;[P Z 1$:\Z 1$: , C!=@, @C!C!@%;]P Z P 1$:^Z_C!=@, , and E*DV=-$@C!CT@%;M`aJcbdZ 1$:ZeJ for J other than 1$: and C!=@, . To assign a type to the subscript function, we need a construct at the type level that parallels the : ;<>=/?@A%= analysis at the term level. The subscript operation would then be typed as P AD0$E fhg>JMLji B C!C!@%;k`aJcb^Z 1$:^ZeJ where iC!CT@%;lF 2JcL ;P <>=?/@AD=P JR*) 1$:RQ 1$:%@C!C!@%; CU !=@,/QSC!=@, @C!C!@%; U QSE*DV=-$@C!CT@%; B The ;<=?@AD= in the above example is a special case of the B ;<=C!=? construct in [7], which supports primitive recursion over types. m vfnfnF pMvtZvp rpg>rcLv `«¬­¯®b±°fnfnF³²´pH rKL°µp HKJ¢fvxL°µpM2yzfa{4LT¶pd·Vy¸fa{4LT° pt)¹*, -º {9»° Here AD0$E analyzes the type J of the array elements and returns the of type P appropriate subscript function. We assume that arrays P 1$: and CT=/@, have specialized representations, say 1$:@C!CT@%; and C!=@, @C!C!@%; , and therefore have specialized subscript functions; all other arrays use the default (boxed) representation. sort environment kind environment type environment `a>b oIp4mcqr oIpstq%Jufv oIpxwMq%yzfa{ The detailed typing rules are given in Figures 3 and 4 and the important term reduction rules in Figure 5. The language 247395 extends the language 2 73 proposed in [13] with recursive types, and some additional constructs for analyzing recursive types. This section only gives an overview of the language, the reader may refer to [13] for more details. m[ôs mKqTrX§sÄÃ{ fMv mK§NsÄÃHKrKL{¿fg>rcLv mK§NsÅÃ{¿fgrcL/v m ôv [ Y mX§/sÄÃG{W£v f¡v v Æ r Similarly, at the term level, the : ;<>=?/@AD= operator must analyze polymorphic types where the quantified type variable may be of an arbitrary kind. To avoid the necessity of analyzing kinds, the : ;<=?@AD= must bind a variable to the kind of the quantified type variable. Therefore, we introduce kind abstraction HxrcL° and Y kind application ¶KW£v at the term level. To assign types to these new constructs at the term level, we need a type level construct g rKL{ that binds the kind variable r in the type { . The formation rules are shown below. In the impredicative |X} calculus, the polymorphic types g>J[fvL%{ can be viewed as generated by an infinite set of type constructors g~ of kind `vZcbZ , one for each kind v . The type gJf9vxLT{ is then represented as g ~ `2Jf9vxL{ b . The kinds of constructors that can generate types of kind would then be P 1$: f ZZ fZZ g hf`ZcbdZ LLL g ~ f`vtZcbdZ LLL mcqTrX§su§w[ðµf{ mK§Nsu§wÃ\H4rKL°Çfg/rcLT{ We can avoid the infinite number of g~ constructors by defining a single constructor g of polymorphic kind grKL/`ar[ Z cbdZ mX§/su§%w[ö[fg/rcL{Èm]ôv Y mK§Nsu§wöKW£v Éf{ v Æ r Furthermore, since our goal is fully reflexive type analysis, we need to analyze kind-polymorphic types as well. As with poly2 mIÃÊv Kind formation m]ÃÊ Term formation r[ËÌm m[ÃÊvÍmIév mcq!r[ÃÊv mK§/su§wö[f{ÈmK§NsÄÃ{tÑ_{ fM m]Ãr m]ÃÊvtZÎv m]ÃgrKLv mK§Nsu§wö]f{4 Kind environment formation m[ôs m]ÃÊs m[ÃÊv m[Ão mK§/sÄÃw P mX§/su§wòf $ 1 : m]ôsuq%Jufv mK§/sÄÃ{ f¡v Type formation mIôs P mX§/sÄà 1$: fc mX§/sÄÃÌ`Z Zb fcÉZ_ÉZ_ mX§/sÄÃg fgrKL/`ar[ZcbdZ mX§/sÄÃg fc`agrKL/cb^Z mX§/sÄà ,0 fc`ÉZ_cbXZ_ mX§/sÄÃt, @?=GfcÉZ_ m[ÃÌs mcq!rK§sÅÃ{¿f¡v mX§sÄÃHXrcL%{¿fgrcLNv mK§sÄÃ\wÒy¸fa{ in w mcqrX§su§w[ðµf{ mX§/su§wÃyf{ mK§Nsu§wÃ\H4rKL°Çfg/rcLT{ mK§NstqJufv>§wðÇf{ mK§Nsu§wcq%y¸fa{Êö[f{ mK§Nsu§wÃHKJufvL°ÇfgJ¢fvxLT{ mX§/su§wô2y¸fa{xLT¶[f{ÓZ { mK§Nsu§wÃG¶[fg È { m]ôv Y Y mK§su§TwöcW£v f{W+v Y mK§/s¢§wö[fg W£v È { mK§NsÄÃG{ fMv Y mK§/s¢§wöXW {x f{M{4 Jufv in s mX§/sÄÃGJ³fkv m]ÃÌs mK§Nsu§wö[f{ mK§/s¢§wÃG¶[f{ ZÔ{ÈmK§/su§wö f{ mK§/su§wö¶ f{ mK§sÄÃ{fg>rcLvÏm]ÃÊv Y mK§/sÄÃ{W+v f¡v v Æ r mK§Nsu§wcqy¸fa{©Ã°Çf{ {ÕF]g r¡Ö6LLLrd×xL%g>JkÖfv4Ö6LL/LJKØ©fvØL{>ÖXZe{$Ù ÚIÛÝÜ q%Þ ÛÝÜ mX§/stq%J¢fvtÃ{ fMv mX§/sÄé2J¢fvxL{¿f¡vtZv mK§/s¢§w÷Vyzfa{4L!°µf{ mX§/sÅÃ{¿f¡v ZvÍmK§NsÄÃG{ fMv mK§sÄÃ{fcZÒmX§/su§%wö[f{` ,0{b mX§/su§TwÃ)¹*, -º {»¶[f ,0{ mX§/sÄÃ{M{ fkv mK§sÄÃ{ fc mK§sÄÃ{$¤ ¥¯¦fc mK§sÄÃ{¨ÎfcZZZZ mK§sÄÃ{ ª fgrKL/`ar[Z_cbXZ_`ar[ZcbdZ mK§sÄÃ{ ª fc`agrKL/cbdZ`agrKLNcbdZ B mK§NsÄà ;<>=/CT=/?{t*)4`a{¤ ¥¯¦D§{¨©§{ª§{ ª bufM mK§NsÄÃG{¿fMZ_ßmK§/s¢§wÃG¶]f ,0{ mK§Nsu§wÃ0$1$)+*, -$º {9»¶If{` ,0{ b mX§sÄÃ{ fcZ mX§sÄÃ{ fc P mX§su§w[ö¤ ¥¯¦f{ 1$: mX§su§w[ö ¨ fgJ¢fMLgJ fML%{`aJZ J b Y Type environment formation mK§NsÅÃw mX§su§w[öªÎfgrcLg>Jufar]Z_ML%{`ag W r Jcb mX§su§w[ö ª fgJ¢f`agrcLjcbL%{`ag NJcb m]ôs mX§sÄÃwÐmK§/sÄÃ{ fM mX§su§w[ö Á fgJ¢fZML{` ,0 Jcb mK§/sÄÃo mX§/sÄÃwcq%y¸fa{ mK§/s¢§%wÃ: ;<=?@AD=º {9»{ *)`a¶¤ ¥¯¦§¶¨©§¶ª§¶ ª §¶Ábuf{c{ Figure 3: Formation rules of 247395 3 Type reduction mX§/sÄÃG{>ÖÑ{Ù´f¡v mK§stqTJufv ÃG{¿fkvÏmK§/sÄÃ{ f¡v mK§NsSÃÌ`2Jufv L{ b{ Ñ{ { Æ J f v M mcqTrX§sÄÃG{¿f¡vÍm]ôv Y mX§/sÄÃÊ`¹HXrcL{ b>W£v Ñ { v Æ r fMv v Æ r mX§/sÄÃG{¿f¡vtZv ÆË à«`a{b J B mK§sÄà B ;<>=C!=?{ Ö *)x`a{ ¤ ¥n¦ §{ ¨ §{ ª §{ ª b9Ñ_{ Ö fc mK§sÄà ; <>=C!=?{Ù*)x`a{$¤ ¥n¦D§{¨©§{ª>§{ ª b9Ñ_{4Ù fc B mX§sÄà ;<=C!=?M``Z Z]b{>Ö{$Ù/b*)`a{$¤ ¥n¦D§{¨©§{ª§{ ª b Ñ_{¨{Ö{Ù${ Ö { Ù fM B mK§NstqJufv à ;<=C!=?M`a{cJcb*)x`a{¤ ¥¯¦§{¨©§{ª§{ ª b6Ñ{ fM Y B mK§NsÅà ;<>=/CT=/?cY `ag W£v {b9*)4`a{ ¤ ¥n¦ §{ ¨ §{ ª §{ ª b Ñ{ª¡W£v {`2JufvTL{xTbufc Y B mKqTrK§/sÄà ;<>=C!=?K`a{W r b*)x`a{$¤ ¥¯¦§{¨Ê§{ª§{ ª b9Ñ{ fc B mK§NsSà ;<>=C!=?K`ag {b6*)`a{$¤ ¥¯¦§{¨©§{ª§{ ª b6Ñ{ ª {`¹HKrKL%{ bufM ÆË à«`a{b mX§/sÅÃ{¿fgr L/vÍr Y mX§/sÅÃHXrcL{W r Ñ{ fgr L/v B mX§/stqJ¢fÉà ;<>=C!=?c`a{`%, @?=>Jcbb*)`a{$¤ ¥n¦D§{¨©§{ª§{ ª b6Ñ_{ fM P B B mX§sÄà ;<=C!=? 1$:9*)4`a{$¤ ¥n¦D§{¨Ê§{ª§{ ª b¢fM mK§sÄà ;<=C!=?M` ,0{b6*)`a{$¤ ¥¯¦§{¨©§{ª§{ ª b6Ñ ,0X`2JufML{xTbufc P B mK§NsÅà ;<>=/CT=/? 1$:6*)x`a{ ¤ ¥¯¦ §{ ¨ §{ ª §{ ª b6Ñ{ ¤ ¥¯¦ fc B mK§NsÄà ;<>=C!=?K`%, @?=>{b6*)4`a{¤ ¥¯¦§{¨©§${ª§{ ª bufc B mK§NsÄà ;<>=/CT=/?c`%, @?={b4*)x`a{$¤ ¥¯¦§{¨©§{ª§${ ª b6Ñ{¿fM Figure 4: Selected 2 7365 type reduction rules mK§sÄô2JufvxLT{cJRÑ_{fMvZv morphic types, we can represent the type grcL{ as the application of a type constructor g of kind `agrKL/cbtZh to the type HXrcL{ . The reduction rule for analyzing a kind-polymorphic type is B ;<=C!=?M`ag rKL{b6B *)`a{$¤ ¥¯¦§{¨©§{ª§{ ª b6Ñ { ª `¹HXrcL{ b`¹HXrcL ;<=C!=?{t*)x`a{ ¤ ¥¯¦ §{ ¨ §${ª§{ ª bb B The g -branch of ;<>=C!=? gets as arguments the body of the quantified type and a kind function encapsulating the result of the analysis on the body of the quantified type. Recursive types are formed using the ,0 constructor of kind `Z_cbXZ_ . For the analysis of recursive types we introduce a unary constructor , @?= of kind Z , which is not intended for use by the programmer. There are no term level constructs to create an object whose type contains this constructor. The introduction of the , @?= constructor is based on the work of Fegaras and Sheard [6]. Recursive types are handled in the manner suggested by the language by a 247â in [13]. When a recursive type is analyzed B B ;<=C!=? , the result is always a recursive type. The ;<=C!=? analyzes the body of the type with the ã -bound variable protected under the , @?= constructor. Since , @?= is the right inverse of B ;<=C!=? (Figure 4), the analysis terminates when it reaches a type variable. B ;<=C!=?M` ,0 B {b4*)x`a{$¤ ¥¯¦§{¨©§{ª§{ ª b6Ñ ,0X`2JufML ;<=C!=?M`a{`% , @?=JKbb*)`a{$¤ ¥¯¦§{¨©§{ª§{ ª bb Since the argument of the 0, constructor has a negative occurrence of the B kind , this case must be handled specially. Therefore, the ;<=C!=? does not act as an iterator for the ,0 constructor. Instead, it directly analyzes the body of the recursive type. In essence, we have made the 0, constructor transparent to the analysis. Operationally, the number of nested ,0 constructors strictly B decreases at every reduction of the ;<=C!=? , ensuring termination after a finite number of steps. The formation rules in Figure 3 require that the environments are well formed. Moreover, all types and kinds are also well formed. Thus, in the type formation rule mX§/sÅÃG{ f¡v , we have that més and mÃÕv . In the term formation rule mK§Nsu§wÉâ¶f { , we have that m[ôs and mK§sÄÃ\w and mK§/sÄÃ{ fc . B The ;<=C!=? operator is used for type analysis at the type level. In fact, it allows primitive recursion at the type level. It operates on types of kind and returns a type of kind (Figure 4). Depending on the head constructor of the typeP being anB alyzed, ;<=C!=? chooses one of the branches. At the 1$: type, it returns the {¤ ¥¯¦ branch. At the function type {ÉZá{ , it applies the {¨ branch to the components { and { , and to the result of recursively processing { and { . B ;<>=C!=?K`a{ÓZÔB { b6*)x`a{$¤ ¥¯¦§{¨Ê§{ª§{ ª b9Ñ {¨[{c{ ` B ;<=C!=?{t*)`a{$¤ ¥¯¦§{¨©§{ ª§{ ª bb ` ;<=C!=?{ *)4`a{$¤ ¥n¦D§{¨Ê§{ª§{ ª bb When analyzing a polymorphic type, the reduction rule is B ;<=C!Y =?c`ag>Jufv L{b6*)x`a{$¤ ¥¯¦B §{¨Ê§{ª§{ ª b6Ñ {ªMW£v `2Jufv L{b`2Jufv L ;<=C!=?{t*) `a{$¤ ¥n¦D§{¨©§{ª§{ ª bb Since {ª must be parametric in the kind v (to ensure termination, there are no facilities for kind analysis in the language [13]), it can only apply its second and third arguments to locally introduced type variables of kind v . We believe this restriction, which is crucial for preserving strong normalization of the type language, is quite reasonable in practice. For instance { ª can yield a quantified type based on the result of the analysis. 4 P P ·VK:%*í>:%C 1$î fagJ¢fMLTJZïA%:%C 1$îL FIHKJ¢fML P : ;<>P =/?@A%=º2ðzP fMB LðOZï P AD:C 1$î»JR*) 1$: P Q 1$: *í>P:%C 1$î AD:%C 1$î\Q2yz U f%AD:%C 1$î$ULy U U ëë Q H ÖfMP LTH U ÙfY ML2yzf Öcë P ÙL U Y :U *í:C 1$îdUW Ö `£VñnòbóMU :*í:C U 1$îdW Ù `£Vñnôb ZZ Q H ÖfMLTH ÙfML2yzf ÖXZ ÙL$õjö÷ Ú6ø/ù ²!ú Ú U YU g Q H rcL!H farZML2yzfag ¸W r L$õûúü¹ýÞ´úþû6ÿ² ø U U Ú g Q H U fagrcLML/2y¸fag U L$õ ² ûúü¹ýÞ´úþû6ÿ² ø Q H fP ZM ,0 U L2VU f ,0 Y L U :*í:C 1$îdW ` ,0 b `%0$1$)¹*, -º »yb P : ;<=?@AD=º {» $ 1 :9*)`a¶ ¤ ¥¯¦ §¶ ¨ §¶ª>§¶ ª §¶ Á b ÑS¶ ¤ ¥n¦ Y Y : ;<=?@AD=º {»`a{>ÖKZÔ{$Ù/b*)`a¶¤ ¥¯¦§¶¨©§¶ª§¶ ª §¶ÁbÑS¶¨¿W {>Ö W {Ù Y Y Y : ;<=?@AD=º {»`ag W£v { b*)x`a¶¤ ¥n¦D§¶¨©§¶ª>§¶ ª §¶ÁbÕÑS¶ªMW£v W { Y : ;<=?@AD=º {»`ag N{ b*)x`a¶ ¤ ¥n¦ §¶ ¨ §¶ª>§¶ ª §¶ Á b ÑS¶ ª W { Y : ;<=?@AD=º {»` ,0 { b6*)x`a¶ ¤ ¥n¦ §¶ ¨ §¶ ª §¶ ª §¶ Á b ÑS¶ Á W { : ;<=?@AD=º {»`%, @?=>{4!b*)x`a¶¤ ¥¯¦§¶¨©§¶ª§¶ ª §¶>ÁbxÑ : ;<>=/?@A%=º {9»`%, @?=>{ b*)`a¶ ¤ ¥¯¦§¶¨©§¶ª§$¶ ª §¶Áb Figure 5: Selected term reduction rules of 2 7365 Figure 6: The function toString The term expressions are mostly standard. We use explicit fold-unfold operators to implement the isomorphism between a recursive type and its unfolding. Type analysis at the term level is performed using the : ;<=?@AD= operator. Since the term level includes a fixed-point operator, : ;<=?@AD= is not iterative; it inspects a given type { and passes its components to the corresponding branch. The reduction rules for : ;<=?@AD= are in Figure 5. In this syntax the ç6è type operator is defined as: P P ç6èK` 1$:b F 1$: ç6èK`aJkÖkëGJMÙbïFç6èKP `aJkÖNbcëÌç6èc`aJcÙb ç6èK`aJ Ö ZÔ Y J Ù bÕF êc* P ç6èK`ag W r Jcb F êc* P ç6èK`ag Jcb F êc* ç6èK` ,0Jcb F ,0K`2J Ö fMLjç6èK`aJÊ`%, @?/=J Ö bbb P P P P P 1$:bzëÝ` 1$:Z 1$:bb reduces to `£êM* -Óë Note P that ç6èK`` 1$:Z complicated definition is necessary to map this êc* -b ; a more P type to êM* - . As an example of the term level analysis in 2 7395 , consider the P function :%*í:C 1$î shown in Figure 6. This function uses the type of a value P to produce its string representation. (We add the base type D A : C 1$P î to the language). We assume a primitive function P B 1$: *í>:%C 1$î that converts an integer to its string representation, and use ˆ to denote string concatenation. The language 247365 has the following properties. The proofs are similar to the proofs for the language 2 73 in [13]. Existential types can be handled similar to polymorphic types. We define a type constructor ä ä of kind grcLN`arµZåY cbkZ . The existential type äJtfjvxL%{ is then equivalent to ä äKW£v `2Juf B vL{b . The ;<=C!=? and the :¯;<=?@AD= are augmented with a {æ and a ¶æ branch respectively. The reduction rulesB are exactly analogous to the polymorphic case, except that the ;<>=C!=? now chooses the { æ branch, and the :¯;<=?@AD= chooses the ¶ æ branch. B To illustrate the type level analysis we will use the ;<>=C!=? operator to define the class of types admitting equality comparisons. We will extend the example in [7] to handle quantified types. We define a type operator ç6èÝfX±ZéP which maps P function and polymorphic types to the type êM* - . (Here êc* -  gJ´fMLJ is a type with no values). To make the example more realistic, we extend the language with a product type constructor ( ë ë ) of the same kind as ( Z Z ). The type analysis constructs operate on the ë ë constructor in a manner similar to the Z Z constructor. Proposition 3.1 (Strong Normalization) Reduction formed types is strongly normalizing. of well Proposition 3.2 (Confluence) Reduction of well formed types is confluent. For ease of presentation we use B ML-style pattern matching syntax to define a type involving ;<>=/CT=/? : Instead of B :F¿2JufML ;<=C!=?JO*)`a{ ¤ ¥¯¦ §{ ¨ §{ ª §{ ª b Ù fvxLT{4¨ where {¨ìF2JkÖ$fML2JMÙfML2J Ö fvxL/2J {ª uF[HXrcL2J¢far[ZML2J far]ZvxL%{ ª { ª F2J¢f`agrcLjcbL2J f`agrKL/vbL%{ ª we write P :>` 1$:Db F{ ¤ ¥n¦ :>`aJkÖKZ Y JMÙb©F{ ¨ :`aJ¡ÖNbqj:`aJMÙb Æ J Ö qJ Ù :>`ag \W r Jcb F{ ª 2JkÖfarcLj:`aJJkÖNb Æ J Y :>`ag NJKb F{ ª HKrKLN:`aJ´W r b Æ J Proposition 3.3 (Type Safety) If öfa{ , then either ¶ is a value, or there exists a term ¶ such that ¶kѶ and ö fa{ . 3.1 Type analysis in 2 7365 In our previous work [13], we proposed the language 247â which supports the analysis of recursive types without any restrictions. However, the resulting language gets complex and the translation into a CWM framework is not clear. Therefore, type analysis in B 247365 is restricted in two ways. First, the ;<>=/CT=/? operator must return a type of kind . Second, the result of analyzing a recursive type is always a recursive type. We believe that these restrictions do not reduce significantly the usefulness of the language in practice. 5 B The main purpose of ;<=B C!=? is to provide types to :¯;<=?@AD= terms; every branch of the ;<=C!=? types the corresponding branch of the : ;<>=/?@A%= . B Since the type of a term is always of kind , the result of the ;<=C!=? must also be of kind . Thus, B in practice, a ;<=C!=? will be employed to form types of kind . B In some cases a ;<=C!=? is used to enforce typing constraints—for example, in the case of polymorphic equality B above, a ;<>=/CT=/? was used to express the constraint that the set of equality types does B not include function or polymorphic types. In these cases the ;<>=/CT=/? merely verifies that an input type is well formed, while preserving its structure. This means that the B ;<=C!=? will map a recursive type into a recursive type. `b {fnfnF `¼j½¢b Limitations of type analysis in 2 7365 p B pcvÌZv p rp g>rcLv P 1$:upcZ Zp g pg p ,0]pK ,M p p¤ ¥¯¦up9¨åp9ªÉp Y ª pÁp ¸p pJì B p HXrcL{p {W£v p¡2J¢fvxL{ p{c{4 p @îC!=?{t*)`a{ ¤ ¥¯¦ §{ ¨ §{ ª §{ ª §{ b Y Y ¶fnfnF°µpyp¶XW£v p ¶cW { p ¶ ¶ p)¹*, -º {»¶[pX0$1$)¹*, -$º {»¶ pC!=<?@A%=º {9»¶¸*)`a¶>¤ ¥n¦D§¶¨©§¶ª§¶ ª §¶ 8 §$¶>Á§¶ ¹b Figure 7: Syntax of the 2 38 language The approach outlined in this section allows the analysis of recursive types within the term language and the type language, but imposes severe limitations on combining these analyses. One can write a polymorphic printer that analyses types at runtime, or one can write a type operator, like ç6è , to enforce invariants at the type level. However, it is not possible to write a polymorphic equality function that analyzes types at runtime and has the type g>JufMLjç6è JZïç6èJZ_E*$*, . The reason is that when the recursive type ç6èK` ,0{b is unfolded, the result is ç6èX`a{`%, @?=` ,0{ bbb . The equality function must now analyze the type {`%, @?= ` ,0{ bb , which requires it to analyze a , @?= type. However, the corresponding branch of the :¯;<=?@AD= can only contain a divergent term. The root of the problem lies in defining , @?= as a constructor for kind . The solution might require an automatic unfolding of a recursive type when it is being analyzed at the term level; the resulting language exceeds the scope of the current paper. Removing recursive types from the language again allows fully general type analysis. responds to ¤ ¥¯¦ and Z Z corresponds to 9¨ . The type analysis B construct at the type level is @îC!=? and operates only on tags. At the term level, we add representations for tags. The term level operator (now called C!=<?@AD= ) analyzes these representations. All the primitive tags have corresponding term level representations; for example, ¤ ¥¯¦ is represented by ¤ ¥¯¦ . Given any tag, the corresponding term representation can be constructed inductively. 4.2 Typing term representations The type calculus in 2 38 includes a unary type constructor of B kind Z B to type the term level representations. Given a tag { (of kind ), the term representation of { has the type Ó{ . For example, ¤ ¥¯¦ has the type ¤ ¥¯¦ . Semantically, R{ is interpreted as a singleton type that is inhabited only by the term representation of { [3]. 4 The target language 2 38 If the tag { is of a function kind vtZv , then the term representation of { is a polymorphic function from representations to representations: U U U ~ ¨ ~ `a{ b  g fvxL ~ ` b^ Z ~ `a{ b Figure 7 shows the syntax of the 2 38 language. To make the presentation simpler, we will describe many of the features in the context of the translation from 2 7365 . 4.1 vfnfnF `«¬­¯®bΰfnfnF²Ìp H4rcL°µpHXJufvxL%°Çpc2y¸fa{4L!¶pX·Vy¸fa{xLT° p)¹*, -º {»° Y Y p ^¤ ¥¯¦u p Y 6¨ p Y ¨ W { p Y ¨¿Y W { ° p ¨ W { °¸W { p ¨ W { °zW { ° Y Y Y Y Y Y p ª p ª W£v p ª W£v ¡W { p ª W£v MW { W { Y Y Y p ªMW+v W { W {x Y ° Y p ª p ª WY { p ª Y W{ ° p ^Á[ p ^ÁKW { Y p ÁcW { ° Y p ¸ p W { Y p W { Y ° p p GW { p W { ° Other applications of type analysis also follow this pattern. Consider a copying collector [14]. The ?*<j; function would use B a ;<=C!=? to express that data from a particular region has been copied into a different region. Since the structure of the data remains the same after being copied, a recursive type would still be mapped into a recursive type. The same holds true while flattening tuples. Flattening involves traversing the input type tree, and converting every tuple into the corresponding flattened type; therefore, the structure of the input type is preserved. 3.2 `¹b The analyzable components in 2 38 However a problem arises if { is of a variable kind r . The only way of knowing the type of its representation is to construct it when r is instantiated. Hence programs translated into 2 38 must be such that for every kind variable r in the program, a J , representing the type of the term corresponding type variable representation for a tag of kind r , is also available. In 2 38 , the type calculus is split into types and tags: While types classify terms, tags are used for analysis. We extend the kind calculus to distinguish between the two. The kind includes the set B of types, while the kind includes the set of tags. For every constructor that generates a type of kind , we P B have a corresponding constructor that generates a tag of kind ; for example, 1$: cor- This is why we need to extend the CWM framework. Their source language does not include kind polymorphism; therefore, 6 B p¯Mp>F p rKpF[r m]ÃÊs p£vZv pF pnvxp>Zpnv p p grKLvxpF[g>rcLj`arZcb^Zpnvxp mX§/sÅà Õ Îf B mK§/sÅÃJfr]Z Z_ mK§NsÄÃ% ¢Â J fr[Z_ mK§sÄÃ& ~  { f¡pnv > p Z U U U mX§/sÅà ~ ¨ ~  2J¢fp£vZv pnL%g fp£vp£L%{ Z { `aJ b Ô fMp£vZv%p>Z mX§/sÄÃ&t~  { fMpnvxpZ Figure 8: Translation of 2 76 3 5 kinds to 2 38 kinds mcq!rK§stq%J^far]Zà ~  {¿fkpnvxp>Z mK§NsÅÃ\ª ' ~  2J¢fp g>rcLNvxpnL%grcLg>Jfar[ZML%{`aJ´W r fMp g>rcLvxp>Z they can compute the type of all the representations statically. This is also the reason that we need to introduce a new set of primitive type constructors and split the type calculus into types and tags. Consider the g and the g type constructors in 247395 . The g constructor binds a kind v . When it is translated into 2 38 , the translated constructor must also, in addition, bind a type of kind v[Z . Therefore, we need a new constructor 9ª . Similarly, the g constructor binds a type function of kind grKLN . When it is translated into 2 38 , the translated constructor must bind a type function of kind p grcLNMp . (See Figure 8.) Therefore, we introduce a new constructor ª . Furthermore, if we only have as the primitive kind, it will no longer be inductive. (The inductive definition B would break for ª ). Therefore, we introduce a new kind (for tags), and allow analysis only over tags. This leads us to the kind translation from 247365 to 2 38 B (Figure 8). Since the analyzableB component of 2 38 is of kind , the 2 7365 kind is mapped to . The polymorphic kind grcLNv is translated to grKL/`arZÎcbcZ±pnvxp . Note that every kind variable r must now have a corresponding type variable J . Given a tag of variable kind r , the type of its term representation is given by J . Since the type of a term is always of kind , the variable J has the kind r]Z . Lemma 4.1 pnv v %Æ r 4.3 Tag analysis in 2 38 We now consider the tag analysis constructs in more detail. The term level analysis is done by the C!=<>?/@AD= construct. Figure 10 and Figure 11 show its static and dynamic semantics respectively. The expression being analyzed must be of type O{ ; therefore, C!=<?@AD= always analyzes term representation of tags. Operationally, it examines the representation at the head, selects the corresponding branch, and passes the components of the representation to the selected branch. Thus the rule for analyzing the representation of a polymorphic type is Y Y Y C!=<?@A%=º {9»(6ª¡W£v W { ~ W { `a¶ b*) ÑȶªMW£v Y W { ~ Y W { Y `a¶ b `a¶ ¤ ¥¯¦ §¶ ¨ §¶ ª §¶ ª § ¶ 8 §¶ Á §¶ b B @îCT=/? construct. The Type level analysis is performed by the B language must be fully reflexive, so @îCT=/? includes an additional branch for the new type B constructor . Since only the kind of Á contains the kind in a doubly-negative position B (Figure 10), we can define @îC!=? as an iterator over the kind B , and treat Á specially (like the ,0 constructor in 247365 ). B Figure 12 shows the reduction rules for the @îC!=? , which are B similar to the reduction rules for the source language ;<>=/CT=/? : given a tag, it calls itself recursively on the components of the tag and then passes the result of the recursive calls, along with the original components, to the corresponding branch. Thus the reduction rule for the function tag is B @îC!=?c` 9¨[{MB { b*)x`a{¤ ¥¯¦§{¨©§{ª§{ ª §) { b6Ñ { ¨ {M{ ` B @îCT=/?{t*)`a{ ¤ ¥¯¦ §{ ¨ §{ ª§{ ª §{(bb ` @îCT=/?{ *)4`a{ ¤ ¥n¦ §{ ¨ §{ ª §{ ª §{ bb Similarly, the reduction for the polymorphic tag is Y B @îC!Y=?M` 9ªcW£v { ~ { B b9*)x`a{$¤ ¥n¦D§{¨©§{ª>§{ ª §{ b6Ñ { ª W£v {~4{G`2J¢fvxL @îC!=?M`a{cJKb*)`a{ ¤ ¥¯¦ § { ¨ §{ ª §{ ª §{(bb Recursive tags are handled in a manner similar to recursive types in 2 7365 . The result is constrained to be a recursive tag. The analysis proceeds directly to the body of the tag function, with the bound B variable protected under a * tag, which is the right inverse of @îC!=? . The reduction rules also include a rule for the , constructor. The , constructor is used to handle recursive tags in the $ function (Section 4.4). This constructor is again an implementation p>F¿pnvxp np v p Æ r By induction over the structure of v . Figure 9 shows the function ~ . Suppose { is a 2 7365 type of kind v and p {p is its translation into 2 38 . The function ~ gives the type of the term representation of p {p . Since this function is used by the translation from 247395 to 2 38 , it is defined by induction on 247395 kinds. F ~ ~! #" Lemma 4.2 ` ~ b np v pnq ~ Æ r qJ By induction over the structure of v . J>b Figure 9: Types of representations at higher kinds Proof Proof Y The formation rules for tags are displayed in Figure 10. Since the translation maps 247395 type constructors to these tags, a type constructor of kind v is mapped to a corresponding tag of kind pnvxp . Thus, while the g type constructor has the kind g>rcLN`aB rÀZ BcbXZÎ , the 9ª tag has the kind g>rcLj`arÉZÎcbXZÎ`arZ bXZ . Figure 10 also shows the type of the term representation of the primitive type constructors. These types agree with the definition of the function u~ ; for example, the type of ¨ is ¢¨t¨t` 9¨©b . The term formation rules in Figure 10 use a tag interpretation function $ that is explained in Section 4.4. 7 C!=<?@AD=º {»)¤ ¥¯¦9*)4`a¶¤ ¥n¦D§¶¨©§¶ª§¶ ª §¶ 8 §¶Á§¶ ab6Ѷ¤ ¥n¦ Y Y C!=<?@AD=º {) » ¨ W { Ö `a¶ Ö bW { Ù `a¶ Ù b6*) Y Y `a¶¤ ¥n¦D§¶¨Ê§¶ª§¶ ª §¶ 8 §¶Á§ ¶ ab6Ѷ¨¿W {Ö `a¶xÖjbW {Ù `a¶Ùb Y Y Y C!=<?@AD=º {) » 9ªMW+v kW {~ W { `a¶ b*) Y Y Y `a¶¤ ¥n¦D§¶¨Ê§¶ª§¶ ª §¶ 8 §¶Á§ ¶ ab6ѶªMW£v ¡W { ~ W { `a¶ b Y C!=<?@AD=º {) » ª Y W { `a¶ b9*)4`a¶ ¤ ¥¯¦ §¶ ¨ §¶ ª §¶ ª §¶ 8 §¶ Á §¶ b6Ñ ¶ ª W {4 `a¶Tb Y C!=<?@AD=º {) » Y W { `a¶ b*)`a¶¤ ¥¯¦§¶¨©§¶ª§¶ ª §¶ 8 §¶>Á§2 ¶ *ab6Ñ ¶ W { `a¶ b mX§/sÅÃ{¿f¡v Type formation m]ôs B mX§/sÅà f ZB mX§/sÅà , fcBZ mX§/sÅ Ã ¤ ¥n¦ f B B B mX§/sÅ Ã ¨ f Z Z B mX§/sÅ Ã 9ªÎfg>rcL/`ar]ZcbdZ`ar[ B Z B bdZ mX§/sÅ Ã ª fc`ag>B rcL/`ar] Z c ^ b Z d b Z B B mX§/sÅ Ã Á fcB` Z B bXZ mX§/sÅ Ã ¢f B Z B mX§/sÅ Ã f Z B B mX§/sÅÃ{ f B mX§/sÅÃ{¤ ¥¯¦f B B B B B mX§/sÅÃ{¨Îf Z Z Z Z B B B mX§/sÅÃ{ ª fgrcLN`ar[ZcbdZ`ar[ bdZ`ar]Z bXZ B B Z mX§/sÅÃ{ ª fcB`agrcLN`aB r[ZB cb^Z b^Z`agrKL/`ar[Z_cbXZ b Z d mX§/sÅÃ) { f Z Z B B mX§sÄà @îC!=?{t*)x`a{$¤ ¥n¦D§{¨©§{ª>§{ ª §( { buf Y C!=<?@AD=º {») ÁY W { ¶ * )`a¶ ¤ ¥n¦ §¶ ¨ §$¶ ª §¶ ª § ¶ 8 §¶ Á §¶ * b6Ñ ¶>ÁXW {x `a¶ !b Y C!=<?@AD=º {) » Y W { `a¶ b*)`a¶¤ ¥¯¦§¶¨©§¶ª§¶ ª §¶ 8 §> ¶ Á§¶2 *ab6Ñ ¶ W { `a¶ b B Figure 11: Selected term reduction rules of 2 38 mX§/su§%w[ö[f{ Term formation artifact in 2 38 and has no counterpart in the source language. Its reduction rule will never be used in a program translated from 247365 . mK§sÄÃ\w mK§Nsu§wà ¤ ¥n¦ f ¤ ¥¯¦ mK§Nsu§w à ¨Îf¢¨t¨t`9¨©b mK§Nsu§w à ªÎ f ª )',+-/¨t/ . ¨t `9ªb mK§Nsu§w à ª f + ª )'0 . ¨t ` ª b mK§Nsu§w à f ¢¨t` b mK§Nsu§w à Á f +n¨t0 . ¨t ` Á b mK§Nsu§w à ¢ f ¢¨t` *¹b mX§/sÅÃ{ f B B Z 4.4 The tag interpretation function Programs in 2 38 pass tags at runtime since only tags can be analyzed. However, abstractions and the fixpoint operator must still carry type information for type checking. Therefore, these annotations must use a function mapping tags to types. Since these annotations are always of kind , this function must map tags B of kind to types of kind . This implies that we can use an iterator over tags to define the function as follows: P $¸` ¤ ¥n¦ b F 1$: $¸` ¨ J Ö J Ù bé3 F $¸U `aJ Ö bd4 Z U $¸`aJ Ù b Y U $¸` 9ªMW r J JKbÕF g farKL% J Z $z`aJ b Y 5 $¸` ª Jcb F grKLU g J far[Z_ML U $¸`aJÌW r J >b $¸` ÁJcb F ,0`2 fML $z`aJ´`%, bbb $¸`% ,JKb F JP $¸` MJcb F P 1$: $¸` Jcb F 1$: B The function $ takes a type tree in the kind space and converts it into the corresponding tree in the P kind space. Therefore, it converts the tag ¤ ¥¯¦ to the type 1$: . For the other tags, it recursively converts the components into the corresponding types. The branches for the and the * tags are bogus but of the correct kind. The language 2 38 is only intended as a target for translation from 247365 —the only interesting programs in 2 38 are the ones translated from 247365 ; therefore, the branch of $ will remain unused. Similarly, since the source language hides the , @?= constructor completely from the programmer, it does not appear in 247365 programs; hence the * branch of $ will also remain unused. mK§Nsu§wö[f$¸`a{`Á { bb mX§/su§%wÃt)¹*, -º {9»¶[f$¸` Á { b mK§/sÄÃ{f B Z B mK§su§Twö[f$k` Á {b mK§/s¢§wÃ0$1$)+*, -$º {»¶[f$z`a{`Á {bb B mX§sÄÃ{ f Z_ mX§su§w[ö fR{ mX§su§w[ö ¤ ¥¯¦ f{1 ¤ ¥¯¦ B B mX§su§w[ö¨ÎfgJkÖ$f LÓJkÖKZeg>JMÙ>f LÓJMÙ¡ZÔ{`9¨]JkÖ$JcÙb mX§su§w[öªÎfg rcLg>JfarB ZML Y g>JufarIZ #L ¨t `aJcbdZÔ{` ª W r J Jcb B mX§su§w[ö ª fgJ¢fagB rcLj`arZcb^Z L \ª ' `aJcbdZÔ{` ª Jcb mX§su§w[Ã0 ¶ fgJ¢f B L OJ B Ze{` cJcb mX§su§w[öÁ fgJ¢f B Z L u¨t`aJKbdZ {` Á Jcb mX§su§w[Ã2 ¶ *GfgJ¢f L OJZe{` Jcb mK§/s¢§%wÃC!=<?@AD=º {9»¶z*)x`a¶ ¤ ¥n¦ §¶ ¨ §¶ª§¶ ª §¶ 8 §¶ Á §¶2 *ab¢f{c{ Figure 10: Formation rules for the new constructs in 2 38 8 B B mX§/sÅà @îC!=?¤ ¥n¦6*)x`a{$¤ ¥¯¦§{¨Ê§{ª§{ ª §{(buf B B mK§NsSà @îCT=/?¤ ¥¯¦9*)4`a{¤ ¥¯¦D§{¨©§{ª§{ ª §{ b9Ñ_{$¤ ¥¯¦tf B B mK§/sÄà B @îC!=? {>Ö*)4`a{$¤ ¥n¦D§{¨Ê§{ª§{ ª §) { b9Ñ{ Ö f B mK§/sÄà @îC!=? { Ù *)4`a{ ¤ ¥n¦ §{ ¨ §{ª§{ ª §{ b9Ñ{ Ù f B mK§NsSà @îCT=/?c` ¨ { Ö B { Ù b*)`a{ ¤ ¥¯¦ §{ ¨ §{ª§{ ª §( { b6Ñ { ¨ { Ö { Ù { Ö { Ù f B mK§NstqJufv B à @îC!=?M`a{Ù$Jcb6*)4`a{¤ ¥¯¦§{¨©§{ª§${ ª §{ b6Ñ { f Y B mK§NsSà @îY CT=/?c` 9ªMW+v {>Ö{Ùb6*)4B `a{¤ ¥¯¦§{¨Ê§{ª§{ ª §( { bÑ {ªMW£v {Ö{Ù9`2JufvTL{x!b¢f p JMp>F]J P p 1$:p/7 F ¤ ¥¯¦ pnZ Zp/F89¨ p g p^F99ª p HXrcL{p/FHXrcL2JfarZ_MLp {p Y Y p {W£v pFÝp {p/W£p£vp ~ p£2J¢fvxL{p/FÝ2J¢fpnvxpnLp {p p g /p4F: ª p {M{ p/FÝp {pp { p p ãdpX; F Á p , @?/=p/F< B B p ;<>=C!=?{*)x`a{¤ ¥¯¦§{¨©§{ª§{ ª bp>F @îC!=?¡p {p*)4`p { ¤ ¥n¦ pn§p { ¨ p¯§p {ªxp¯§p { ª ¯p § 2 f B L2 f B L/p { ¤ ¥¯¦ p¯b Figure 13: Translation of 2 7395 types to 2 38 tags mcq!BrK§stqTJ farY Zà B @îCT=/?c`a{W r Jb*)x`a{$¤ ¥n¦D§{¨©§{ª>§{ ª §{(b6Ñ{ f B mK§/sÄà @îC!=?M` ª {b4*)4`a{¤ ¥¯¦§{¨©§B {ª§{ ª §{)b6Ñ { ª {`¹HKrKL2J far[ZML%{xTb¢f B B mK§NsSà @îCT=/?{t*)x`a{ ¤ ¥¯¦ §{ ¨ §${ ª §{ ª §( { bÑ_{ f B mK§/sÅà @îC!=?cB ` >{b9*)4`a{ ¤ ¥n¦ §{ ¨ §{ ª §{ ª §) { b9Ñ { M{M{4¸f ( B mK§NB stqJuf à B @îCT=/?c`a{` jJcbb9*)`a{$¤ ¥¯¦§{¨©§{ª§{ ª §( { b9Ñ_{ f B mX§/sÅà @îC!=B ?M` Á{b4B *)4`a{¤ ¥¯¦§{¨©§{ª§${ ª §) { b6Ñ ÁX`2Juf L{4buf B B mK§/sÄà @îC!=?K` * {b6*)`a{ ¤ ¥¯¦ §{ ¨ §{ ª §{ ª §{ buf B B mK§NsSà @îCT=/?c` { b9*)`a{ ¤ ¥n¦ §{ ¨ §{ ª §{ ª §( { b6Ñ{ f B B mK§/sÅà @îC!=?K`%,T{b6*)4`a{ ¤ ¥¯¦ §{ ¨ §{ª§${ ª §{ buf B B mK§NsSà @îCT=/?c`%,T{b6*)x`a{ ¤ ¥¯¦ §{ ¨ §${ª§{ ª §{ bÑá,T{f p ²Np>F]² p ypF]y p H rcLT°9pFIH rcLTHXJ^far[ZMLp °9p Y Y Y p ¶KW+v 4pF¿p ¶>pW+pnvxp ¡W u~ p HKJufvL%°9pFIHKJ¢fpnvxpnL2y=^fu~9JMLp °9p Y Y > p ¶KW { pF¿p ¶>pW+p {p `a{b pn2y¸fa{4L!¶>pF¿2y¸f $zp {pnLp ¶>p p ¶¶ pF¿p ¶>pp ¶ p p ·V^yzfa{4L!°9pF·V^y¸f $zp {pnLp °9p p )¹*, -º {9»¶>pF)¹*, -º!p {p »p ¶>p p 0$1$)¹*, -$º {9»¶>pF0$1$)¹*, -ºap {p »xp ¶>p p : ;<>=?/@AD=º {»{ *)B `a¶ ¤ ¥n¦ §¶ ¨ §¶ª> §¶ ª §¶ Á bp FC!=<>?/@AD=/º!2Juf ?L $z`p {pTJKb%» `a{ b9*) ¤ ¥¯¦ Qìp ¶ ¤ ¥n¦ p ¨ÅQìp ¶¨Õp ªÅQìp ¶ªxp ª Qìp ¶ ª p U B U U µQ H f L/2y¸f L·Vy¸f $k`p {p/` b bLTy Á Qìp ¶ Á p U B U U Q H f L/2y¸f L·Vy¸f $k`p {p/` b bLTy Figure 12: Reduction rules for 2 38 Typerec Figure 14: Translation of 2476 3 5 terms to 2 38 terms The type interpretation function has the following properties. F6$z`a{ { Æ J b Lemma 4.3 `$¸`a{ bb { Æ J 5 Translation from 247395 to 2 38 Proof Follows from the fact that none of the branches of $ has free type variables. Lemma 4.4 `$¸`a{ bb v Æ r F6$z`a{ v Æ r In this section, we show a translation from 2 7395 to 2 38 . The languages differ mainly in two ways. First, the type calculus in 2 38 is split into tags and types, with types used solely for type checking and tags used for analysis. Therefore, type passing in 247395 will get converted into tag passing in 2 38 . Second, the : ;<>=?/@AD= operator in 247365 must be converted into a C!=<?@AD= operating on term representation of tags. Figure 13 shows the translation of 2 7395 types into 2 38 tags. The primitive type constructors B get translated into the corresponding tag constructors. The ;<=C!=? gets converted into a B @îC!=? . The translation inserts an arbitrarily chosen well-kinded result into the branch for the tag since the source contains no such branch. b Proof Follows from the fact that none of the branches of $ has free kind variables. The language 2 38 has the following properties. Proposition 4.5 (Type Reduction) Reduction of well formed types is strongly normalizing and confluent. Proposition 4.6 (Type Safety) If öfa{ , then either ¶ is a value, or there exists a term ¶ such that ¶kѶ and ÃG¶ fa{ . The term translation is shown in Figure 14. The translation 9 P ` $ 1 :b^F6 ¤ ¥n¦ U B U B `Z Z[b^FIHXJuf L2Yy = fRJMU9L!Y H f L2y@f L 9¨¿W J `ay = b>W `ay@b > B `ag b^FIHxrcL!HKJ far[Z_MLHKJufar]Z L/2yA=^f ¨t `aJcbL Y Y Y 9ªMW r MW J W J `ay=b > B `ag Nb^FIHXJuf`agrKLNY `ar[ZcbdZ bL2y = f¸ª )' `aJcbL ª W J `ay = b > Y B B ` ,0b^FIHXJuf Z L2 J B f u¨t\`aJcb?L ^ÁKW J `ay = b > Y B `%, @?/=b^FIHXJuf L2y = f RJM,L *W J `ay = b > `aJcbdF]A y = > > `¹HXrcL{ b^FIH rcL!HKJ far[Z_ML `a{b > Y > Y Y `a{W£v bdF `a{ b>W£pnvxp W ~ > > `2J¢fvxL{ b^FIHXJufpnvxpnL2y = f ~ JcL `a{b > > Y > `a{M{ b^F `a{ b>W£p { p ` `a{ bb > B ¤ ¥n¦ §{ ¨ §{ª§{ ª bbRF ` ;<=C!=? t { * ) a ` { B `%·VK)fag>B Juf LÓJÉZì`a{DC6JcbjL HXJuf L2A y =f BRJcL C!=<?@AD=º2> Juf L ì`a{ C Jcb%»y = *) ^¤ ¥¯¦Q `a{$¤ ¥nB ¦Db U B U ¨ Q HXJuf L2A y =^Y f ÓJML!H U9Y f L2y @ f L > `a{ ¨ b>W JY `aA y =Y b>W `a y @>U9b Y U6Y W { C J `%)W J y = b>W { C `%)W y @b B ªSQ H rcL!HK J far[ZML!HKJufar]Z L/2y = f E¨t\`aJcbL > Y Y Y U U Y `a{ ª bU W r MW J W J `aUA y =bW£2 U9Y farKL{ C U6`aY J b `¹H farKL2 y @faJ B Lj)W J `aA y =cW y @bb ª Q HXJuf`agrKLN`arZ_cbdZ bL2y = f ¸ª )' `aJcbL Y > Y Y `a{ ª bW J `ay = bW HKrKL2 J farZML{ C `aJ´W r J b Y Y Y Y `¹H B rcLTHX J dfar]Z MLj)W JÊW r J `ay = W r W J bb > ¿Q HXJuf L2y = f ÓJML `a{$¤ ¥¯¦b B B Á Q HXJuf Z U B L2 y =df ¨t U `aJcY bL C `aJÊ` bb ÁcW£2 f L { U B U `¹H f L2 y U @f Y L U6Y U6Y )B W JÊ` b `ay = W ` W `a y @bbbDb *Q HXJuf L2A y =^f ÓJMLTA y =b Y W£p {p > `a{ b where B { C\F pn2J¢fML ;<=C!=? JR*)x`a{ ¤ ¥¯¦ §{ ¨ §{ª§${ ª bp / > the term representation of the tag. Thus the translations for type abstractions and type applications are Y Y*> p HXJufvxL%°9pFIHKJufpnvxpnL2A y =^f t~6JMLp °9p p ¶cW { pF p ¶>pW£p {p `a{b > As pointed out before, the translations of abstraction and the fixpoint operator use the tag interpretation function $ to map tags to types. We show the term representation of types in Figure 15. The primitive type constructors get translated to the corresponding term representation. The representations of type and kind functions are similar to the term translation of type and kind abstractions. The only B involved case is the term representation of a B ;<=C!=? . Since ;<>=C!=? is recursive, we use a combination of a CT=/<>?@AD= and a ·V . We will illustrate only one case here; the other cases can be reasoned about similarly. B B Consider the reduction of ;k`a{4dZ_{4 Tb , where ;{ stands B for ;<=C!=?Ê{*)G`a{ ¤ ¥¯¦ §{ ¨ §{ ª §{ ª b . This type reduces to B B { ¨ { { ` ;M`a{ bb` ;¡`a{ bb . Therefore, in the translation, the term representation of {¨ must be applied to the term representations of {x , {x , and the result of the recursive calls to the B ;<=C!=? . The representations of { and { are bound to the variables A y = and y @ ; by assumption the representations for the results of the recursive calls are obtained from the recursive calls to the function ) . In the following propositions the original 2 7365 kind environment s is extended with a kind environment su`amKb which binds J of kind rÅZ for each rSËm . Similarly a type variable the term-level translations extend the type environment w with wK`sub , binding a variable y = of type t~9J for each type variable J bound in s with kind v . Proposition 5.1 If mK§/s ÃÏ{ f v holds in 247395 , then p mKpn§pnsup£q6su`amXbép {p\fp£vp holds in 2 38 . Proof Follows directly by induction over the structure of { . Proposition 5.2 If mX§/sïÃO{Ôfv > and mX§/sïÃÕw hold in 2 79 3 5 , `a{bÕ f ~ p {p holds in 2 38 . then p mcp¯§p¯stpnq6su`amKb§/p wcpnqwc`s¢bdà Proof By induction over the structure of { . The only interesting case is that of a kind application which uses Lemma 4.2. Figure 15: Representation of 2 79 3 5 types as 2 38 terms Proposition 5.3 If mK§Nsu§wÈà ¶¾fì{ holds in 247365 , then p mKpn§pnsup£q6su`amXb§p wMpnqwK`sub^ÃÊp ¶>p\f&$¸p {p holds in 2 38 . must maintain two invariants. First, for every accessible kind variable r , it adds a corresponding type variable J ; this variable gives the type of the term representation for a tag of kind r . At every kind application, the translation uses the function ~ (Figure 9) to compute this type. Thus, the translations of kind abstractions and kind applications are p H rcLT°9p>FIH rcLTHXJfar[ZMLp °9p Proof This is proved by induction over the structure of ¶ , using Lemmas 4.3 and 4.4. Y Y Y p ¶cW£v pF p ¶>p/W£pnvxp W ~ 6 The untyped language Second, for every accessible type variable J , a term variable y = is introduced, providing the corresponding term representation of J . At every type application, the translation uses the function > `a{ b (Figure 15) to construct this representation. Furthermore, type application gets replaced by an application to a tag, and to This section shows that in 2 38 types are not necessary for comF putation. Figure 16 shows F an untyped language 2 38 . We show a translation from 2 38 to 2 38 in Figure 17. The expression ò is the integer constant one. 10 7 Related work and conclusions `«¬­¯®bΰµfnfnF³²ÌpM2yL ¶[pX·VyL °µpX)+*, -K° p ^¤ ¥n¦u p 9¨ p ¨µòI p 9¨µò9° p ¨ ò°¡ò] p ¨ ò9°kò9° p ª p 6ªcò] p ªcòò[ p ªKòòò p ªKòò^ò9° p ª p ª òI p ª ò9° p ^Á[ p ^Ád òI p ^Ádò9 ° p ¸ p ò] p *ò9° p p ò] p ò9° `¼N½Gb Our work closely follows the framework proposed in Crary et al. [3]. However, as we pointed before, they consider a language that analyzes inductively defined types only. Extending the analysis to arbitrary types makes the translation much more complicated. The splitting of the type calculus into types and tags, and defining an interpretation function to map between the two, is related to the ideas proposed by Crary and Weirich for the language LX [2], which provides a rich kind calculus and a construct for primitive recursion over types. This allows the user to define new kinds and recursive operations over types of these kinds. ¶µfnfnF³°µpyp¶ ¶pK)+*, -c¶pX0$1$)¹*, -c¶ pC!=<?@AD=¶¸*)x`a¶ ¤ ¥¯¦ §¶ ¨ §¶ ª §¶ ª §¶ 8 §¶ Á §¶ b Figure 16: Syntax of the untyped language 2 38 F This framework also resembles the dictionary passing style in Haskell [12]. The term representation of a type may be viewed as the dictionary corresponding to the type. However, the authors consider dictionary passing in an untyped calculus; moreover, they do not consider the intensional analysis of types. Dubois et al. [4] also pass explicit type representations in their extensional polymorphism scheme. However, they do not provide a mechanism for connecting a type to its representation. Minamide’s [8] type-lifting procedure is also related to our work. His procedure maintains interrelated constraints between type parameters; however, his language does not support intensional type analysis. F F[² 9ª F F ª G Y F ` 9 M ª £ W v b FGªKò F 2 L ° F Y Y F `9ªMW+v kW { b F FGªKòò F 2 L ° F Y Y Y F 2yL ¶ ` 9 M ª W+v kW { W { b F FGªKòòò F F Y Y Y FÇ·VyL ° F `ª¡W£v W { W { ¶b FGªKòòò¶ F FÇ)¹*, -c¶ F ª FG ª Y F FÇ0$1$)¹*, -K¶ ` ª W { b FG ª ò F F Y F F[¶ ò ` ª W { ¶b FG ª ò9¶ F F F[¶ ò ^Á FG^Á F F Y F F ¶ ¶xÖ [ ` ÁcW { b FG^Á^ò F Y F F ¤ ¥¯¦ G ` Á W { ¶b FG Á ò¶ F FG ¨ FG Y F FG9¨µò ` * W { b F FG ò F F Y FG9¨µò9¶ ` H *W { ¶b FG ò¶ F F FG9¨µò9¶ ò FG Y F F ` W { b FG ò F F F Y F ¨µò9¶ ò9¶xÖ ` IGW { ¶b FG\ò¶ F `%C!=<?@AD=º {9»¶¸*F )`a¶¤ ¥¯¦§¶F ¨©§$¶ªF §¶ ª F §¶ 8 §F¶Á§2¶ *F ¹bb F F F C!=<?@AD=¶ *)4`a¶¤ ¥¯¦ §¶¨ §¶ª §¶ ª §¶ 8 §¶Á §¶2 * b ²F `¹H rcLT°6b F `¹HXJufvxLT°6b F `2yzfa{4L!¶>b F `%·Vyzfa{4L!°6b F `%)¹*, -º {9»¶>b F `%0$1$)¹*, -$º {9»¶>b F Y `a¶cW£v b F Y `a¶XW { b F `a¶¶4Öb F ^¤ ¥n¦ F ¨ Y F ` 9¨¿W { b Y F ` ¨ W { ¶>b Y Y F `¨ W { ¶cW {4 b F Y Y ` ¨ W { ¶KW { ¶ Ö b F Figure 17: Translation of 2 38 to 2 38 Duggan [5] proposes another framework for intensional type analysis. His system allows for the analysis of types at the term level only. It adds a facility for defining type classes and allows type analysis to be restricted to members of such classes. Yang [15] presents some approaches to enable type-safe programming of type-indexed values in ML which is similar to term level analysis of types. Aspinall [1] studied a typed 2 -calculus with subtypes and singleton types. Necula [11] proposed the idea of a certifying compiler and showed the construction of a certifying compiler for a type-safe subset of C. Morrisett et al. [10] showed that a fully type preserving compiler generating type safe assembly code is a practical basis for a certifying compiler. We have presented a framework that supports the analysis of arbitrary source language types, but does not require explicit type passing. Instead, term level representations of types are passed at runtime. This allows the use of term level constructs to handle type information at runtime. F References The translation replaces type and kind applications (abstractions) by a dummy application (abstraction), instead of erasing them. In the typed language, a type or a kind can be applied to a fixpoint. This results in an unfolding of the fixpoint. Therefore, the translation inserts dummy applications to preserve this unfolding. [1] D. Aspinall. Subtyping with singleton types. In Proc. 1994 CSL. Springer Lecture Notes in Computer Science, 1995. [2] K. Crary and S. Weirich. Flexible type analysis. In Proc. 1999 ACM SIGPLAN International Conference on Functional Programming, pages 233–248. ACM Press, Sept. 1999. [3] K. Crary, S. Weirich, and G. Morrisett. Intensional polymorphism in type-erasure semantics. In Proc. 1998 ACM SIGPLAN International Conference on Functional Programming, pages 301–312. ACM Press, Sept. 1998. The untyped language hasF the following property which shows that term reduction in 2 38 parallels term reduction in 2 38 . Proposition 6.1 If ¶kÑ C ¶ Ö , then ¶ F F Ñ C ¶ Ö . [4] C. Dubois, F. Rouaix, and P. Weis. Extensional polymorphism. In Proc. 22nd Annual ACM Symp. on Principles of Programming Languages, pages 118–129. ACM Press, 1995. [5] D. Duggan. A type-based semantics for user-defined marshalling in polymorphic languages. In X. Leroy and A. Ohori, editors, Proc. 11 On the other hand, the language 2 73 [13] shown below has a translation into a type erasure semantics. This language supports fully general analysis of polymorphic types. 1998 International Workshop on Types in Compilation, volume 1473 of LNCS, pages 273–298, Kyoto, Japan, Mar. 1998. SpringerVerlag. [6] L. Fegaras and T. Sheard. Revisiting catamorphism over datatypes with embedded functions. In 23rd Annual ACM Symp. on Principles of Programming Languages, pages 284–294. ACM Press, Jan. 1996. [7] R. Harper and G. Morrisett. Compiling polymorphism using intensional type analysis. In Proc. 22nd Annual ACM Symp. on Principles of Programming Languages, pages 130–141. ACM Press, Jan. 1995. `b {fnfnF ìpMvtZÎvprp g>rcLv P 1$:upcZ Zp g pg N Y pGJì p {M{4 B p HKrKL{ pM2JufvxL{ p{W£v p ;<>=C!=? {t*)4`a{¤ ¥¯¦§{¨©§${ªx§${ ª b Y Y `¼j½¢b¾¶fnfnF°µp yÀp ¶KW+v p ¶cW { p ¶¶ pÌ: ;<>=?/@AD=/º {»{ *x ) `a¶>¤ ¥n¦D§¶¨Ê§¶ª§¶ ª b Therefore, to support fully reflexive type analysis in a type erasure semantics, we must base our language on 2 73 . Accordingly, to arrive at 247365 , we augment the type calculus of 2 73 with recursive types and recursive type analysis. Correspondingly, we extend the term calculus of 2 73 with )¹*, - and 0$1$)+*, - operators, and add a branch for recursive types to the : ;<>=?/@AD= operator. This, in turn, requires that we restrict type analysis suitably to ensure a sound and decidable type system. [9] Y. Minamide, G. Morrisett, and R. Harper. Typed closure conversion. In Proc. 23rd Annual ACM Symp. on Principles of Programming Languages, pages 271–283. ACM Press, 1996. [10] G. Morrisett, D. Walker, K. Crary, and N. Glew. From System F to typed assembly language. In Proc. 25th Annual ACM Symp. on Principles of Programming Languages, pages 85–97. ACM Press, Jan. 1998. [11] G. C. Necula. Compiling with Proofs. PhD thesis, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, Sept. 1998. [12] J. Peterson and M. Jones. Implementing type classes. In Proc. ACM SIGPLAN Conf. on Programming Language Design and Implementation, pages 227–236. ACM Press, June 1993. [13] V. Trifonov, B. Saha, and Z. Shao. Fully reflexive intensional type analysis. In Proc. 2000 ACM SIGPLAN International Conference on Functional Programming. ACM Press, 2000. [14] D. Wang and A. Appel. Safe garbage collection = regions + intensional type analysis. Technical report, Dept. of Computer Science, Princeton University, July 1999. [15] Z. Yang. Encoding types in ML-like languages. In Proc. 1998 ACM SIGPLAN International Conference on Functional Programming, pages 289–300. ACM Press, 1998. The language 2476 3 5 In this section, we compare 2 7365 with the languages proposed in [13], specially for those unfamiliar with the previous work. The language 247365 is a simplification of the language 2 7â proposed in [13]B which allows unrestricted type analysis. In 2 7â , the result of the ;<>=/CT=/? is not restricted to B be of kind ; moreover, while analyzing a recursive type, the ;<>=C!=? is not required to return another recursive type. The kind and type calculus of 2 7â is shown below, while the term calculus is identical. `,² ÚKJ b vfnfnF `«¬­¯®b±°fnfnF²Ìp H rcL%¶[p HKJ¢fvxL¶pc2yzfa{4LT¶pX·Vyzfa{4L!° [8] Y. Minamide. Full lifting of type parameters. Technical report, RIMS, Kyoto University, 1997. A `¹b vÉfnf£FSrp1LvÉpcvZv p g>rcLv P M p g NM tp 0[ ` ù ýû¶ J bh{Àfnf£FSJìp 1$:up;Z Z M p g , M pXY , @?/= p´B 2JufvxL{ p{c{ pHXrcL%{ p{W+v p ;<=C!=?{t*)x`a{$¤ ¥n¦D§{¨©§{ª>§{ ª §{$Áb To allow general type analysis, 247â uses parameterized kinds Lv . B The kind is then modeled as grKL L¹r . The ;<=C!=? analyzes types of kind Lv , and returns a type of kind v . (Refer to [13] for details). The presence of parameterized kinds complicates the system and makes the translation into a type erasure semantics unclear. 12