Extending Type Systems in a Library
Yuriy Solodkyy, Jaakko Järvi, Esam Mlaih
Department of Computer Science and Engineering, Texas A&M University
March 23, 2010
http://parasol.tamu.edu/~yuriys/

Motivation: Trends
Programs grow in size and complexity
  o Use multiple libraries
  o Depend on third-party components (APIs, protocols)
  o Have to account for OS, hardware & compiler differences
  o Are written by many people with different skill sets
Languages grow in abstraction level
  o Crave more features
  o Have to deal with backward compatibility
  o Provide larger standard libraries
  o Domain specialization

Motivation: Bugs
Cost the U.S. economy $60 billion each year
  o users incurred 64% of the cost
  o developers and vendors: 36%
Improvements in testing can reduce it by about a third ($23 billion)
  o won't eliminate all software errors
Collection of Software Bugs by Thomas Huckle

Motivation: Bugs
  o Ariane 5 Explosion: narrowing conversion, 6/4/1996, $500 million
  o NASA Mars Climate Orbiter: mixing metric and imperial units, 8/23/1999, $125 million
  o LA Air-Traffic Control: counter underflow, 9/14/2004
  o Patriot Missile Failure: float imprecision rounding, 02/25/1991, 28 dead, 100 injured
  o Zune's New Year Freeze: infinite loop, 12/31/2008
  o Pentium FDIV bug: incomplete entries in a look-up table, 1994, $400 million
Collection of Software Bugs by Thomas Huckle

Motivation: Solutions
Testing is not a panacea
  o Does not prove absence of errors
  o Specific to a project and cannot be reused
  o Has to be maintained in sync with the evolution of the project
  o Improvements in it will reduce the cost of bugs by about a third, but won't eliminate all software errors
Many errors can be detected without running the program through the use of:
  o Type system
  o Static analysis
Sometimes absence of specific run-time errors can be proven

The problem with Type System
Interesting ones are domain-specific
  o e.g. units, qualifiers
Impossible to account for all interesting ones in a general-purpose programming language
  o covariant typing, full static typing and subtype substitutability: pick two
  o can be designed and approached differently
Composability of multiple type systems in a single program is hard to achieve
  o e.g. units for measurements and regular expression types for patterns

XTL: eXtensible Typing Library
Type systems
  – prevent certain kinds of bugs from happening
  – not easily extensible
  – domain specific
Observation
  – many interesting domain-specific type systems can be implemented as libraries
Objective
  – explore how far a pure library solution suffices in extending a type system
Solodkyy et al. LCSD’06

Use Cases
Tracking physical quantities/units
  – converting between compatible units
  – rejecting operations that do not agree on units
  Examples (units):
    – kg = lbs: ok, convert
    – kg = km/h: error
Tracking semantic properties
  – nullability, sign, oddity of a number
  – security vulnerability to format strings
  – usage of user pointers in kernel space
  – deadlocks and data races
  Examples (semantic properties):
    – even+odd=odd
    – sprintf(buf,…);
    – T* krnl = usr;: error
Regular Expression Types
  – typing XML documents
  – ordering patterns
  Example (regular expression types):
    – T*U* <: (T|U)*
Variant Parametric Types
  – defining relations between instances of a parameterized type based on relations of its argument types
  Example (variant parametric types):
    – vector<B*> <: vector<D*>

XTL Contributions
  o We report on implementing simple type qualifiers as a C++ library
  o We implement regular expression types (in a limited form) to check XML data
  o We provide a framework to help others in extending the C++ type system for their abstractions

Example
    // a is a positive value
    double a = current_height();      // returns only positive numbers
    // b is a positive value
    double b = max_allowed_height();  // upper bound for values from previous call
    // b-a : double (the difference is positive, but not for the type system: a double may be negative though!)
    // b+a : double
    c = std::sqrt(b-a);  // b-a is assumed to be positive. Unless there is a bug!
    d = std::sqrt(b+a);  // b+a is always positive: argument and result are always positive

Refining Type Systems
What do we need?      How do we achieve that in C++?          Useful building blocks
Typing rules          Type construction via class templates   tuple, variant, optional
Evaluation rules      Function templates and overloading      enable_if
Subtyping rules       A dedicated metafunction                MPL
Composability         Library-only solution                   Interoperability of the above libraries
Objective: emulate a domain-specific type system as a library

Type System 1: Type qualifiers
Objective: providing support for type qualifiers in programs
Allow tracking of semantic properties like:
  o immutability of a certain value (const)
  o sign of a number (pos, neg)
  o assumptions about pointers (optional, nonnull)
  o trustworthiness of a certain value (tainted, untainted)
  o oddity of a number (odd, even)
  o origin of a pointer (user, kernel)
Easily defined in terms of:
  o direction (positive/negative qualifier)
  o abstract operations on qualifiers
But
  o cannot handle flow-sensitive qualifiers
  o cannot handle arbitrary reference qualifiers

Example: Qualifiers’ Hello World
Q is positive if T <: Q T
Q is negative if Q T <: T

Declare a few qualifiers:
    #include <xtl/qualdecl.hpp>
    DECLARE_NEGATIVE_QUALIFIER(pos);
    DECLARE_POSITIVE_QUALIFIER(tainted);
    // ... Other qualifiers ...

Define how different operations transfer properties:
    namespace xtl {
    template <> struct minus<pos, neg> { typedef qual<pos> type; };
    template <> struct minus<neg, pos> { typedef qual<neg> type; };
    template <> struct mul<pos, pos> { typedef qual<pos> type; };
    template <> struct mul<pos, neg> { typedef qual<neg> type; };
    template <> struct mul<nonnull, nonnull>{ typedef qual<nonnull> type;};
    template <> struct div<nonnull, nonnull>{ typedef qual<nonnull> type;};
    } // of namespace xtl

Multiplication carries nonnull & negativeness to the result type.
Subtraction does not carry nonnull on pos & neg arguments.
Positive qualifiers can be added but not dropped once they are there!

Declare your variables with appropriate properties:
    int main()
    {
        untainted<nonnull<neg<int> > > a(-42);
        pos<untainted<nonnull<int> > > b(7);
        neg<nonnull<long> > c = a * b;        // OK: drop negative qualifier untainted
        //nonnull<pos<double> > d = b - a;    // Error: nonnull isn’t carried by subtraction
        pos<tainted<double> > e = b + a*c;    // OK to add positive qualifier
        //pos<double> f = e;                  // Error: ... but not OK to drop it!
    }

Example
Function descend accepts only positive numbers; we achieve this by restricting its argument type to be a subtype of pos<double>.
Inside the function, we convert a value of a subtype into a value of a supertype.

    // Example definition that accepts only positive doubles
    template<class U>
    typename enable_if<
        typename is_subtype<U, pos<double> >::type,
        void
    >::type descend(const U& altitude)
    {
        pos<double> a = subtype_cast<pos<double> >(altitude);
        //...
    }

    // Data coming from measurements is marked untainted
    extern pos<untainted<int> > get_corridor_height();
    untainted<pos<int> > a = get_corridor_height();
    descend(a); // no negative altitudes here!

get_corridor_height returns a value that is both positive and untainted.
a can hold positive, untainted values; the order of qualifiers is not important!
No negative altitudes can appear in the call to descend.
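The enable_if restriction on descend can be tried out without XTL. The following is a minimal sketch under stated assumptions, not XTL's implementation: pos and untainted are hypothetical stand-in wrappers, the helper names strip, has_pos, is_sub_pos_double and can_descend are invented for illustration, and std::enable_if (C++11) replaces the Boost enable_if used on the slide. The subtyping test is hard-wired to the single target pos<double> instead of a general is_subtype metafunction.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical stand-ins for qualifier wrappers (NOT the real XTL templates).
template <class T> struct pos {
    T value;
    explicit pos(const T& v) : value(v) {}
};
template <class T> struct untainted {
    T value;
    explicit untainted(const T& v) : value(v) {}
};

// strip<U>: peels off all qualifier wrappers and exposes the underlying value.
template <class U> struct strip {
    typedef U type;
    static U get(const U& v) { return v; }
};
template <class T> struct strip<pos<T>> {
    typedef typename strip<T>::type type;
    static type get(const pos<T>& q) { return strip<T>::get(q.value); }
};
template <class T> struct strip<untainted<T>> {
    typedef typename strip<T>::type type;
    static type get(const untainted<T>& q) { return strip<T>::get(q.value); }
};

// has_pos<U>: true if pos occurs anywhere in the qualifier stack,
// so the order of qualifiers does not matter.
template <class U> struct has_pos : std::false_type {};
template <class T> struct has_pos<pos<T>> : std::true_type {};
template <class T> struct has_pos<untainted<T>> : has_pos<T> {};

// Simplified test for "U <: pos<double>": U must carry pos and its
// underlying type must convert to double. Dropping untainted is allowed.
template <class U> struct is_sub_pos_double
    : std::integral_constant<bool,
          has_pos<U>::value &&
          std::is_convertible<typename strip<U>::type, double>::value> {};

// descend participates in overload resolution only for subtypes of pos<double>.
template <class U>
typename std::enable_if<is_sub_pos_double<U>::value, double>::type
descend(const U& altitude) {
    pos<double> a(static_cast<double>(strip<U>::get(altitude)));
    return a.value;
}

// Detects at compile time whether descend(u) is a valid call for type U.
template <class U, class = void> struct can_descend : std::false_type {};
template <class U>
struct can_descend<U, decltype(void(descend(std::declval<const U&>())))>
    : std::true_type {};

static_assert(can_descend<untainted<pos<int>>>::value, "qualifier order 1 accepted");
static_assert(can_descend<pos<untainted<int>>>::value, "qualifier order 2 accepted");
static_assert(!can_descend<int>::value, "unqualified int rejected");
static_assert(!can_descend<untainted<int>>::value, "missing pos rejected");

// Usage mirroring the slide: an untainted, positive measurement is accepted.
inline double demo() {
    untainted<pos<int>> a(pos<int>(42));
    assert(a.value.value > 0);  // pos is only a claim in this sketch; nothing enforces it
    return descend(a);
}
```

In this sketch a call such as descend(5) fails to compile because no overload is viable, which is the effect the slide attributes to restricting the argument type. Unlike a real qualifier library, the wrappers here do not define arithmetic or conversions between qualified types.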
Type System 2: XML Typing
Objective: providing support for typing XML snippets in programs
Types can describe XML elements with certain structure
  o sequences, alternatives, elements etc.
Subtyping describes structurally more powerful types
  o types that can hold all the values of their subtype
Compile-time assurance that only valid XML documents are produced
  o when document schema changes, are we backward compatible with the old one?
Value-preserving type conversions
  o logically the same entities can be represented by different XML elements

Example
  o we use a dedicated type element to represent XML elements
  o for each tag we create a dedicated tag-type
  o we map XML data types back into C++ types
  o references are mapped to previous typedefs
  o XML Schema’s choice is mapped to Boost C++ variant
  o XML Schema’s sequence is mapped to Fusion’s tuple

XML Schema:
    <xsd:element name="name" type="xsd:string"/>
    <xsd:element name="email" type="xsd:string"/>
    <xsd:element name="icq" type="xsd:decimal"/>
C++:
    typedef element<name, string> XMLname;
    typedef element<email,string> XMLemail;
    typedef element<icq, int> XMLicq;

XML Schema:
    <xsd:element name="contact">
      <xsd:complexType>
        <xsd:choice>
          <xsd:element ref="email"/>
          <xsd:element ref="icq"/>
        </xsd:choice>
      </xsd:complexType>
    </xsd:element>
C++:
    typedef element<contact,
        boost::variant<
            XMLemail,
            XMLicq
        >
    > XMLcontact;

XML Schema:
    <xsd:element name="person">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element ref="name"/>
          <xsd:element ref="contact"/>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:element>
C++:
    typedef element<person,
        fusion::tuple<
            XMLname,
            XMLcontact
        >
    > XMLperson;

Example
Subtyping relations:
  o Tel <: AnyContact
  o ICQ <: AnyContact
  o Name, Tel, ICQ <: Name, AnyContact, AnyContact
Instantiate an XML snippet that corresponds to Person type:

    // ...
    typedef variant<Email,Tel,ICQ> AnyContact;
    typedef element<person, tuple<Name, Tel, ICQ> > Person;
    typedef element<person, tuple<Name, AnyContact, AnyContact> > PersonEx;

    int main()
    {
        Person p(make_tuple(Name("Yuriy"), Tel("555-4321"), ICQ(1234)));
        PersonEx x = p;     // OK: Subtyping conversion
        // p = x;           // ERROR: Not in subtyping relation
        ifstream xml("old-person.xml");
        xml >> x;           // read data from XML file. assumes file exists
        cout << x << endl;  // show XML source on the screen
    }

  o Person <: PersonEx, so the assignment involves a subtype conversion
  o PersonEx is not a subtype of Person
  o xml >> x parses only XML files that correspond to the PersonEx schema
  o cout << x produces XML source on the screen

XDuce
Type
  o Regular Expression Types: set of sequences over a certain domain
      concatenation     : A,B
      alternation       : A|B
      repetition        : A*
      optional          : A?
      type construction : l[A]
      recursion         : X = A,X | ø
Subtyping
  o inclusion between the sets defined by types

C++
Type
  o Regular Expression Types: set of sequences over a certain domain
      XDuce                              C++
      concatenation     : A,B            tuple<A,B>
      alternation       : A|B            variant<A,B>
      repetition        : A*             vector<A>
      optional          : A?             optional<A>
      type construction : l[A]           element<l,A>
      recursion         : X = A,X | ø    –
Subtyping
  o is_subtype and subtype_cast
Objective: similar representation and the same semantics as in XDuce

XTL’s Strengths
Simplicity
  o reasonably powerful type systems can be built without creating a language tool
Genericity
  o common interface for defining custom subtyping relation
  o common interface for defining conversion from a subtype to a supertype
Reusability
  o ready definitions to be used in other type systems
      subtyping of array types
      subtyping of function types
      subtyping of sequences and union types
      subtyping of qualified types

XTL’s Limitations
Unable to take information about control-flow into account
  o if (x>0) … does not change the type of x to pos<int>
Slows down compilation on complex type systems
  o e.g. XML types
Scaling problems on complex type systems
  o XML typing is exponential
No implicit transitivity of subtyping relation
  o library has no access to all available types
Meta-function join may return an arbitrary upper bound
  o same reason: no global information on all types

THANK YOU!
Abstract Questions deserve Abstract Answers

Compilation Times

XML Typing
n/k      1      2      3      4      5      6      7      8      9
1     1.48   1.66   1.73   1.81   2.03   2.14   2.35   2.54   2.79
2     1.52   1.77   2.29   3.67   8.28  27.38
3     2.43   2.00   2.66   6.00  31.43
4     2.62   2.03   3.63  25.57
5     2.32   2.28   7.15
6     2.73   4.13  19.30
7     2.48   2.86  56.78
8     1.74   3.65
9     1.76   4.93

Type Qualifiers
N        0      1      2      3      4      5      6      7      8      9     10
Time  1.12   1.72   2.61   2.81   3.18   3.94   4.97   5.55   6.28  15.40  19.22