Object Text Markup Language Specification

Version 0.1 (Draft)
8 Feb 2016

By: David R. Tribble
david@tribble.com

Copyright ©2013 by David R. Tribble, all rights reserved. Permission is granted to anyone to reproduce, distribute, translate, or store this document in whole or in part.

Contents

Introduction
Lexical Elements
    Spaces and Newlines
    Characters
    Comments
    Punctuation and Operators
    Names
    Reserved Keywords
    Boolean Values
    String Values
    Character Escape Sequences
    Numeric Values
Syntactical Elements
    Objects
    Types
    Names
    Attributes
Document Objects
    Document object
    Head Object
    Include Object
    Style Object
    Body Object
    Markup Object
    Data Object
Functions
    Function Object
    Statements
    Variable Definition
    If-Else Statement
    For Statement
    Do-While Statement
    While Statement
    Switch Statement
    Break Statement
    Return Statement
    Try-Catch-Finally Statement
    Throw Statement
    Lock Statement
    Block Statement
    Expression Statement
Expressions
    Attribute Values
    Type Conversions
    Regular Expressions
    Object Management
    Name Scope
    Function Invocation
    Multithreading
Standard Library Types
    Object Type
    String Type
    Bool Type
    Int Type
    Float Type
    Date Type
    URL Type
    RegEx Type
    Array Type
    Event Type
    Thread Type
    Body Type
    Markup Type
    Request Type
    Response Type
    Function Type
Markup Types
Prior Art
    HTML
    XML
    CSS
    JavaScript
    JSON
    AJAX
    HTML Browsers
Examples
Unresolved Issues
    Lexical Issues
    Syntax Issues
References

Introduction

Object Text Markup Language (OTML) is a text-based format for encoding documents composed of structured text and image elements. It is intended to be used as a standardized format for data transmitted between client web browsers and web content servers.

OTML combines the capabilities of Hyper-Text Markup Language (HTML), Extensible Markup Language (XML), Cascading Style Sheets (CSS), JavaScript, and JavaScript Object Notation (JSON) into a single unified syntax. This syntax is based primarily on JavaScript and JSON.
Whereas HTML, CSS, JavaScript, and JSON all have disparate and conflicting lexical structure and syntax, OTML has a single unified syntax that applies to all of its definable entities. OTML is designed to embody all of the capabilities and functionality of these languages, and to provide its own additional capabilities.

The primary components of OTML are the document object, request object, and response object. These are the data elements that are transmitted between client web browsers and web servers.

Everything in OTML is an object. Primitive objects are simple text string values; non-primitive objects contain sub-objects. Objects may be named. Objects may also have named attribute values attached to them.

Lexical Elements

OTML documents are text documents composed of Unicode characters. The default encoding is UTF-8, but other encodings are possible (e.g., UTF-16, ISO 8859-1 Latin-1, etc.).

Documents are composed of lexical items, which fall into these categories:

    Comments and whitespace
    Punctuation and operators
    Names
    String values
    Numeric values
    Reserved keywords

Spaces and Newlines

Spaces (and other whitespace characters such as HT) are only significant if they appear within quoted string values. In all other contexts they are ignored, and serve only to separate source tokens and to improve readability for humans.

Newlines (CR and LF characters) are only significant if they appear within quoted string values. They also indicate the end of ‘//’ comments. In all other contexts they are ignored, and serve only to separate source tokens and to improve readability for humans.

The standard encoding of newlines (end of line delimiters) follows the XML standard, i.e., a newline is composed of a CR LF pair. OTML parsers, however, must be able to read documents having newlines consisting of CR, LF, or CR LF pairs.
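
The requirement that parsers accept CR, LF, and CR LF newlines can be sketched as follows. This is only an illustration in Python, not part of the specification; the function names split_lines and normalize_newlines are hypothetical.

```python
import re

# Try the CR LF pair first so that it counts as one newline, then a lone CR or LF.
_NEWLINE = re.compile(r"\r\n|\r|\n")


def split_lines(text):
    """Split source text on any of the three accepted newline forms."""
    return _NEWLINE.split(text)


def normalize_newlines(text):
    """Rewrite every newline as the CR LF pair that the text above calls standard."""
    return _NEWLINE.sub("\r\n", text)
```

Placing the CR LF alternative first in the pattern matters: with a different ordering, a CR LF pair would be counted as two separate newlines.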
Note that newlines are only significant for recognizing the end of ‘//’ style comments, and are not required in any other context.

Characters

Source documents are composed of printable Unicode characters. Unprintable characters (e.g., control characters or undefined Unicode character codes) are treated as spaces. Implementations may choose to issue warnings when they are encountered.

It is also possible for unprintable characters to appear within quoted string values. Such characters should be coded using character escape sequences instead.

Comments

Comments are sequences of arbitrary text characters which are ignored, being syntactically equivalent to spaces. There are two forms of comments:

    // comment-text newline
    /* comment-text-and-newlines */

In the first form, all characters following the ‘//’ marker are ignored up to the next newline (or up to the end of the document or current include file).

In the second form, all characters following the ‘/*’ marker are ignored, including newlines, up to the next ‘*/’ marker (or up to the end of the document or current include file).

Comments do not nest.

Punctuation and Operators

These tokens are composed of punctuation characters, and are used as delimiters and operators.

    { } ( ) [ ]
    + - * / & | ^ >> >>> <<
    ? : ; ~
    == === != > >= < <=
    += -= *= /= &= |= ^= >>= >>>= <<=
    . , @ =

The character pairs ‘//’ and ‘/*’ are special, used to introduce comments.

Quote marks (single ' and double ") are special, used to introduce string values.

Backslash (\) is special within string values, used to specify character escape sequences.

Names

Names (a.k.a. identifiers) are used to uniquely label and identify objects within a given scope within a document.

Names are composed of alphabetic letters, digits, underscores (_), and US dollar signs ($). Names cannot begin with a digit.
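
The composition rule for names can be checked with a short routine. The sketch below is in Python and rests on an assumption that the text does not state: it treats any character in a Unicode letter category (L*) as an alphabetic letter and any character in category Nd as a digit. The function names are hypothetical, and the check does not exclude reserved keywords.

```python
import unicodedata


def is_name_start(ch):
    # Assumption: "alphabetic letter" means a Unicode general category starting with "L".
    return ch in ("_", "$") or unicodedata.category(ch).startswith("L")


def is_name_char(ch):
    # Assumption: "digit" means Unicode general category "Nd" (decimal digit).
    return is_name_start(ch) or unicodedata.category(ch) == "Nd"


def is_valid_name(text):
    """Return True if text is built only from name characters and does not start with a digit."""
    if not text:
        return False
    return is_name_start(text[0]) and all(is_name_char(c) for c in text[1:])
```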
Names can be any length, but only the first (leftmost) 63 characters are significant; any remaining characters are ignored. Implementations may choose to issue warnings if names longer than this are encountered.

Letter characters include the standard ASCII alphabetic characters (upper and lower case), as well as Latin-1 alphabetic characters, and any other Unicode code point defined to be an “alphabetic letter”. Digit characters include the standard ASCII decimal digits, as well as any Unicode code point defined to be a “digit”.

Any alphanumeric character outside the Latin-1 group can be specified using a \uHHHH character escape sequence, where HHHH are hexadecimal digits, to represent the Unicode character U+HHHH.

For example, fiché, grün, θ, Δt, π, Давид, and \u0391\u03B4\u03C1\u03B1 are all valid names.

In the interests of readability, however, it is recommended that only meaningful names be used, and that names resembling operators or reserved keywords too closely be avoided. So for example, while för is a syntactically valid name, a different name should probably be used instead to reduce confusion.

Names are not case-sensitive. This means that the names foo, FOO, and Foo all refer to the same object within a given scope.

Names beginning with two underscores (e.g., __foo) or beginning with an underscore and a dollar sign (e.g., _$foo) are reserved for private use by implementations, and should not be used in normal client code.

Every object has an implicit attribute named ‘id’ that contains the name of the object (which is null for unnamed objects).

Categories of objects can be given names of a certain format in order to aid in readability. For example, it is traditional in C-like languages to name constants (i.e., initialized variables whose values are never modified) using all-uppercase letters and digits (e.g., PI, MAX_LENGTH, RED).
Variables are traditionally named using all lowercase letters (e.g., i, count, len, user_name) or using camel-case names (e.g., userID, custName, cityState). Functions are traditionally named using all lowercase letters (e.g., find, get_value, compute_sum) or using camel-case names (e.g., either getValue or GetValue).

Reserved Keywords

While these tokens are lexically equivalent to names, they are reserved as keywords, and cannot be used as names.

    and       bool      break     case      catch     continue
    data      date      default   defined   delete    do
    else      event     false     finally   float     for
    function  if        int       lock      new       not
    null      object    or        return    string    switch
    this      throw     true      try       void      while

It is recommended that the use of names resembling keywords (e.g., för, ìf) be avoided in the interests of readability.

Boolean Values

Boolean (or logical) values are either true or false, and have type bool.

String Values

String values are sequences of zero or more Unicode characters.

String literals are specified as Unicode characters enclosed within quotes. Either matching single quote marks (') or double quote marks (") may be used.

String literals should be composed of only printable Unicode characters. Unprintable characters, including control codes, should be encoded using character escape sequences.

Some example string values:

    ""
    'Hello, world.'
    '123'
    "He said, \"I cannot tell a lie.\""
    "Pi (written as \&pi;) is approx. 3.1416."
    'The set {0,\_1,\_2} contains three elements.\r\n'
    "an\-ti\-dis\-es\-tab\-lish\-men\-tar\-i\-an\-ism"

Character Escape Sequences

Unprintable Unicode characters can be specified within string values using character escape sequences. Such a sequence is indicated by a backslash (\) followed by one or more characters that taken together specify a single Unicode character code.

A ‘\u’ followed by exactly four hexadecimal characters (upper or lower case) specifies a single Unicode character.
Characters from ‘\u0020’ to ‘\u007E’ are printable ASCII characters. Implementations may choose to issue warnings if malformed or illegal Unicode character escape sequences are encountered.

A ‘\&’ followed by a name followed by a semicolon (;) specifies a named character entity, corresponding to its equivalent XML/HTML character entity. For example, ‘\&amp;’ is the same character as XML entity ‘&amp;’, which is the ampersand character (&). The HTML 4 standard defines 252 character entities.

A ‘\’ followed by one of the specific characters listed below specifies a certain character code. A ‘\’ followed by a character that is not in the list is ignored. Implementations may choose to issue warnings if malformed or undefined sequences are encountered.

    Sequence  Character  Description           Unicode          HTML
    \\        \          Backslash             U+005C           \
    \cX       Ctrl-X     Control character     U+0000 – U+001F
    \N        CR LF      Newline               U+000D U+000A    &#13;&#10;
    \n        LF         Line feed             U+000A           &#10;
    \r        CR         Carriage return       U+000D           &#13;
    \s        SP         Space                 U+0020           &#32;
    \t        HT         Horizontal tab        U+0009           &#9;
    \uHHHH    U+HHHH     Unicode character     U+HHHH           &#xHHHH;
    \z        ZWSP       Zero-width space      U+200B           &#x200B;
    \'        '          Apostrophe            U+0027           &apos;
    \"        "          Quote                 U+0022           &quot;
    \-        SHY        Soft hyphen           U+00AD           &shy;
    \_        NBSP       Non-breaking space    U+00A0           &nbsp;
    \^        ^          Caret                 U+005E           ^
    \$        $          US dollar sign        U+0024           $
    \.        .          Period (full stop)    U+002E           .
    \?        ?          Question mark         U+003F           ?
    \*        *          Asterisk              U+002A           *
    \+        +          Plus sign             U+002B           +
    \[        [          Left square bracket   U+005B           [
    \]        ]          Right square bracket  U+005D           ]
    \(        (          Left parenthesis      U+0028           (
    \)        )          Right parenthesis     U+0029           )
    \{        {          Left curly brace      U+007B           {
    \}        }          Right curly brace     U+007D           }
    \|        |          Vertical bar          U+007C           |
    \&xxx;               Character entity xxx                   &xxx;

Most characters can be specified in more than one way. For example, Latin-1 capital A can be coded as 'A' or '\u0041'. A non-breaking space can be coded as '\_', '\u00A0', or '\&nbsp;'.
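
The escape sequences described in this section can be decoded along the following lines. This is a minimal sketch in Python rather than a reference implementation, and it makes several assumptions that the specification leaves open: \cX is mapped by keeping the low five bits of X, an unlisted or malformed sequence is dropped entirely, a trailing lone backslash is kept, and named entities are looked up in Python's HTML 4 entity table (html.entities.name2codepoint). The names SIMPLE and decode_escapes are hypothetical.

```python
import string
from html.entities import name2codepoint

# Single-character escapes and the text each one denotes.
SIMPLE = {
    "\\": "\\", "N": "\r\n", "n": "\n", "r": "\r", "s": " ", "t": "\t",
    "z": "\u200b", "'": "'", '"': '"', "-": "\u00ad", "_": "\u00a0",
}
# Punctuation escapes stand for the punctuation character itself.
SIMPLE.update({c: c for c in "^$.?*+[](){}|"})


def decode_escapes(body):
    """Decode the text between the quotes of a string literal."""
    out = []
    i, n = 0, len(body)
    while i < n:
        ch = body[i]
        if ch != "\\" or i + 1 == n:
            out.append(ch)  # ordinary character, or a trailing lone backslash
            i += 1
            continue
        nxt = body[i + 1]
        if nxt == "u":
            digits = body[i + 2:i + 6]
            if len(digits) == 4 and all(d in string.hexdigits for d in digits):
                out.append(chr(int(digits, 16)))
                i += 6
            else:
                i += 2  # assumption: a malformed \u is dropped
        elif nxt == "c" and i + 2 < n:
            # Assumption: Ctrl-X is the low five bits of X (so \cA is U+0001).
            out.append(chr(ord(body[i + 2]) & 0x1F))
            i += 3
        elif nxt == "&":
            end = body.find(";", i + 2)
            name = body[i + 2:end] if end != -1 else ""
            if name in name2codepoint:
                out.append(chr(name2codepoint[name]))
                i = end + 1
            else:
                i += 2  # assumption: for an unknown entity only the "\&" is dropped
        else:
            out.append(SIMPLE.get(nxt, ""))  # assumption: unlisted escapes vanish
            i += 2
    return "".join(out)
```

For instance, decode_escapes applied to the literal body Pi (written as \&pi;) yields the same text with the entity replaced by the Greek letter pi.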

Numeric Values

Numeric values can be specified as integers and as floating-point numbers.

Decimal integer numbers are composed of decimal digits, and do not begin with the digit ‘0’ (except for the value 0 itself). Octal integers are prefixed with a ‘0’, hexadecimal integers are prefixed with a ‘0x’, and binary integers are prefixed with a ‘0b’. Integers are 64-bit signed binary values, spanning the range from -9_223_372_036_854_775_808 to +9_223_372_036_854_775_807 (or -0x8000_0000_0000_0000 to +0x7FFF_FFFF_FFFF_FFFF).

Floating-point numbers are composed of decimal digits and must contain either a decimal point (.) or an exponent suffix. They are 64-bit IEEE double-precision binary floating-point values, with normalized values spanning the range from ±2.225073858507201e-308 to ±1.797693134862315e+308. Infinity (±INF) and Not-A-Number (NaN) values are also supported (see the float type in the standard library).

Numbers may contain underscores (_) between digits to improve readability.

    number:
        unsigned-number

    unsigned-number:
        integer
        real

    integer:
        decimal-integer
        0 octal-integer
        0 {x | X} hexadecimal-integer
        0 {b | B} binary-integer

    decimal-integer:
        0
        decimal-number
        decimal-number _ decimal-number

    decimal-number:
        digit
        decimal-number digit

    octal-integer:
        octal-number
        octal-number _ octal-number

    octal-number:
        octal-digit
        octal-number octal-digit

    hexadecimal-integer:
        hexadecimal-number
        hexadecimal-number _ hexadecimal-number

    hexadecimal-number:
        hexadecimal-digit
        hexadecimal-number hexadecimal-digit

    binary-integer:
        binary-number
        binary-number _ binary-number

    binary-number:
        0 | 1
        binary-number {0 | 1}

    real:
        fractional-real
        fractional-real exponent

    fractional-real:
        . decimal-integer
        decimal-integer .
        decimal-integer . decimal-integer

    exponent:
        {e | E} [+ | -] decimal-integer

Some example numeric values:

    0                                  // int, decimal
    65535                              // int, decimal
    5_000_000                          // int, decimal
    0177_777                           // int, octal
    0x100B                             // int, hexadecimal
    0xFFFF_FFFF                        // int, hexadecimal
    0b01101000                         // int, binary
    0b0100_0010_0100_0010              // int, binary
    1.066e-23                          // float
    .750                               // float
    166.6667                           // float
    3.14159_26535_89793_23846_26433    // float

Syntactical Elements

Objects

All objects within a document are defined using either of two syntactical forms, a primitive object definition or a (non-primitive) object definition.

    object-definition:
        primitive-object
        non-primitive-object
        array-object

    primitive-object:
        type [name] = value ;

    non-primitive-object:
        type [name] = [attribute-def…] { [object-member…] } [: name]

    array-object:
        type [name] = [attribute-def…] [ [array-member…] ] [: name]

    attribute-def:
        attribute-name = value

    attribute-name:
        name
        attribute-name . name

    object-member:
        object-definition
        ;

The general syntax for a primitive object definition is:

    type [name] = value ;

The general syntax for a non-primitive object definition is:

    type [name] [ attrib = value ]… { […object-members…] } [: name]

Function objects have a slightly more complicated syntax.

Objects may be named, which allows them to be uniquely identified within a given scope.

All objects have a type. String values have an implicit type of string. A simple object has the type it is defined as, or without an explicit type it has the type of the value assigned to it. Non-simple objects have the type specified by their type tag.

Members of an object definition are themselves object definitions.

Types

The type tag of an object definition specifies the type of the object, whether it is a markup element, data object, or variable.
Markup elements have types (such as ‘p’ for ‘paragraph’ and ‘br’ for ‘line break’) corresponding to HTML markup element tags.

Names

Objects may be named. The scope of an object name is the parent object in which it is defined. Names defined within the same scope must be unique. A name defined within a sub-object may hide other objects with the same name defined in parent objects of the sub-object.

Attributes

Objects may have named attribute values (also known as properties) attached to them.

Attribute values have a restricted syntax rather than the full expression syntax, although a parenthesized expression is a syntactically valid value. In practice, most attributes are assigned simple string values or names.

Attribute names are lexically valid names. More complex names can be defined by combining two or more names separated by dots (.), for example style.color.

[…]

Document Objects

An OTML document is a single object of type document. This object may contain other display objects, functions, variables, and data objects.

Document object

An OTML document is itself a single object definition, of type document.

    document-object:
        document [document-attribute…] { [document-member…] } [;]

    document-attribute:
        format = string
        encoding = string

    document-member:
        head-object
        include-object
        style-object
        body-object
        data-object
        function-object
        ;

A document has a (required) format attribute assigned the value "1.0", indicating that it conforms to version 1.0 of the OTML syntax specification.

By default, OTML documents are encoded with UTF-8 character encoding. An optional encoding attribute may be supplied which specifies a different character encoding.

Implementations should be capable of recognizing the first few octets of a document stream that signal UTF-8 and UTF-16 encodings, as well as recognizing byte order mark (BOM) sequences.
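
Recognizing a byte order mark at the start of a document stream might look like the sketch below. It is written in Python for illustration only; the function names sniff_encoding and decode_document are hypothetical, and the fallback to UTF-8 when no BOM is present simply mirrors the default encoding stated above. Only the UTF-8 and UTF-16 marks mentioned in the text are checked.

```python
import codecs

# BOM byte sequences for the encodings named in the text, with Python codec names.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_encoding(head, default="utf-8"):
    """Return (encoding name, BOM length in bytes) for the first octets of a document."""
    for bom, name in _BOMS:
        if head.startswith(bom):
            return name, len(bom)
    return default, 0  # no BOM: assume the default encoding


def decode_document(raw):
    """Strip any recognized BOM and decode the remaining octets."""
    encoding, skip = sniff_encoding(raw)
    return raw[skip:].decode(encoding)
```

A fuller implementation would also honor the document's encoding attribute, which this sketch ignores.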
In general, an OTML document may begin with the octets for encoding a BOM, a space character (SP), a tab character (HT), a newline character (CR or LF), a ‘/’ character (the initial character of a comment), or a ‘d’ character (the initial character of ‘document’).

Head Object

    head-object:
        base href = value ;
        link href = value rel = value ;
        meta name = value ;
        title = value ;

[…]…http-equiv?

Include Object

    include-object:
        include [include-attrib = value]… ;

    include-attrib:
        href

[…]

Style Object

    style-object:
        style [name] [style-attrib…] { [style-def…] }

    style-attrib:
        type = value
        scope = value
        inherit = value
        attribute-def

    style-def:
        name = expr ;
        ;

Style objects define the styling attributes for named style classes and for element types.

[…]…Each style-def is composed of…

This example defines a style object for a style class named bold_red_para:

    style bold_red_para
    {
        color = 'red';
        font_weight = 'bold';
    }

This style class can be used to specify that a text element is to be displayed with a red foreground color and in bold font:

    div class=bold_red_para
    {
        "Now is the winter of our discontent."
    }

Body Object

    body-object:
        body [type] name [attribute-def…] { [body-member…] }

    body-member:
        style-object
        markup-object
        data-object
        function-object
        ;

A document body contains zero or more text or markup elements. Browsers are obliged to render and display the visible text and markup elements of a body object. A body object may also contain style objects, function objects, and data objects.

[…]

Markup Object

    markup-object:
        primitive-markup-object
        non-primitive-markup-object

    primitive-markup-object:
        value

    non-primitive-markup-object:
        type [name] [markup-attribute = value]… { [markup-member…] }
        { [markup-member…] }

    markup-attribute:
        style . name
        enabled
        remain_active
        html-attribute-name

    markup-member:
        style-object
        markup-object
        data-object
        function-object
        ;

A markup object contains displayable text and graphic elements. Browsers are obliged to display the visible elements. The displayable content of a document is composed entirely of the markup objects contained within it.

Markup objects may contain other markup objects, style objects, function objects, and data objects.

Markup objects may be named, to uniquely identify them within their parent object scope.

The supported markup types are listed in the appendix.

By default, a markup object becomes disabled once its parent body executes a submit action. That is, a markup object can no longer accept or respond to GUI user actions such as mouse clicks or keyboard keystrokes after the user has initiated a document action that sends a response to the server. This is done to prevent users from clicking buttons and other elements while the browser is waiting for a response from the server. However, this disabling can be overridden by defining the markup element with a ‘remain_active=true’ attribute.

[…]

Rendering Display Elements

[…]

[…]…spaces, newlines, preserving whitespace, etc.
…

For example, the following span markup object named ‘preamble’ contains several member markup objects, most of them primitive text strings:

    span preamble
    {
        "When in the course of human events "
        "it becomes";
        i{"necessary"};
        "for one people";
        "to dissolve the " b{"bonds"}
    }

This can also be coded using style attributes in place of the i and b markup objects:

    span preamble
    {
        "When in the course of human events "
        "it becomes";
        span style.font_style='italic' {"necessary"};
        "for one people";
        "to dissolve the " span style.font_weight='bold' {"bonds"}
    }

This is intended to be displayed by browsers as something like this (with ‘necessary’ in italics and ‘bonds’ in bold):

    When in the course of human events it becomes necessary for one people to dissolve the bonds

Data Object

A data object contains one or more data members. These members can be simple named values or other data objects. Data objects are not displayed by browsers, but provide values to other functions and objects within the document.

Data objects can occur within any document, body, markup, or function object.

    data-object:
        primitive-data-obj
        non-primitive-data-obj
        array-data-obj

    primitive-data-obj:
        data primitive-data-def

    primitive-data-def:
        [type] name = expr ;

    non-primitive-data-obj:
        data non-primitive-data-def

    non-primitive-data-def:
        name [attribute-def…] { [data-member…] }

    array-data-obj:
        data array-data-def

    array-data-def:
        [type] name [attribute-def…] [ [array-member…] ]

    array-member:
        [[ expr ] =] data-def
        ;

    data-def:
        primitive-data-def
        non-primitive-data-def
        array-data-def

Primitive data objects are essentially named variables with initializers:

    data [type] name = expr ;

Non-primitive data objects have a brace-enclosed body containing member data elements:

    data name [attrib = value]… { [data-members…] }

Data arrays are defined in a similar fashion, but their member elements are enclosed within square brackets ([]).
    data [type] name [attrib = value]… [ [[ expr ] = ] array-member ; … ]

The optional ‘[expr] =’ preceding an array member specifies the subscript of a particular element within the array to be initialized. If this is not specified, then the next element within the array following the previous array member is assumed.

[…]

[…]…Examples…

Functions

Function Object

function-object:
    function [type] {name | new} ( [param-defs] ) [function-attrib = value]… block-statement

function-attrib:
    event

param-defs:
    param-def
    param-defs , param-def

param-def:
    [type] name

A function contains executable code (statements). A function is defined using the same syntax as any other object, with the addition of the leading function keyword and a parenthesized parameter list.

Every function has a name, which is either a user-defined name or new. Functions with the name new are constructors and are used to initialize new objects of their parent object type.

Every function has a parent object, which is the object in which it is defined. The parent object can be accessed by statements within the function with the this keyword. The outermost document object can be accessed with the global document variable. Function definitions can contain other function definitions, i.e., functions can be nested.

All functions have a return type. By default, a function without an explicitly defined return type returns the type object.

All functions return a value. By default, a function that terminates without executing an explicit return statement returns the value true. Executing the last statement in a function body, i.e., “falling out of” the block, is equivalent to returning a value of true.

A function may be defined to be an event handler by specifying an event attribute in its definition. The attribute is set to one of the possible event types.
Any such events initiated by the user on the parent object containing the function cause the function to be invoked and passed an event object that contains information about the event.

Statements

The following are the executable statements that can be coded within a function.

statement:
    variable-definition
    if-else-statement
    for-statement
    do-while-statement
    while-statement
    switch-statement
    case-statement
    break-statement
    return-statement
    try-catch-finally-statement
    throw-statement
    lock-statement
    block-statement
    expr-statement
    ;

Variable Definition

    type name [ = expr ] ;

A function may contain one or more local variables. These are named values that can be modified by the function statements. A variable always has a name, a type, and a value. An explicit type is required, to syntactically distinguish variable definitions from assignment statements.

Variables may be initialized with an explicit expression value, which is evaluated at the point that the variable definition statement is executed and converted into the type of the variable. If no initializer expression is specified, the variable is initialized to a default value, which is 0, 0.0, false, or null, based on the variable type.

Variables have block scope, being visible only within the parent block they are defined within, and within any sub-blocks within that same parent block. Variable definitions within a sub-block may hide other variable definitions within outer blocks, although implementations may choose to issue warnings when such cases are encountered.

If-Else Statement

if-else-statement:
    if ( expr ) statement1
    if ( expr ) statement1 else statement2

This is the basic conditional control-flow statement. The control expression is evaluated and converted into a boolean value. If the value is true, then statement1 is executed, otherwise statement2 (if present) is executed.
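As a rough illustration of the variable definition and if-else rules above, the following sketch defines a function with two local variables. The function name classify and its parameter are hypothetical, and the snippet is written against this draft's grammar rather than checked against any implementation.

```otml
function string classify(int count)
{
    int    limit;                 // No initializer: defaults to 0
    string label = "none";        // Initializer is evaluated here and converted to string

    limit = 10;

    if (count > limit)
        label = "many";
    else
    {
        // 'label' and 'count' from the outer block are visible in this sub-block
        if (count)                // int converted to bool: any non-zero value is true
            label = "some";
    }

    return label;
}
```

The inner if is wrapped in braces only to make the pairing of if and else explicit; the grammar does not require a block there.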
For Statement

for-statement:
    for ( [expr-list1] ; [expr-list2] ; [expr-list3] ) statement1

This is a looping control-flow statement. Expression-list1 is evaluated (if it is present), which typically contains pre-loop initializations. The loop then executes, first by evaluating the controlling expression-list2 (if it is present) and converting it into a boolean value. If expression-list2 is not present, it is implicitly assumed to be true. If the result is false, the loop terminates and execution continues with the statement following the for-loop statement. Otherwise the result is true, and the next iteration of the loop body (statement1) is executed. This statement is typically a block statement, but is not required to be so.

After the body statement completes, expression-list3 is executed (if it is present), which typically contains post-loop increment expressions. Then the execution flows to the top of the next iteration of the loop, evaluating expression-list2 again. Execution of the loop body continues until expression-list2 evaluates to false.

Note that if the controlling expression-list2 always evaluates to true, the loop is an infinite loop. A for loop with no controlling expression is equivalent to an infinite loop.

Executing a break statement within a loop body causes looping to terminate. Execution of a continue statement within the loop body causes the current iteration of the loop body to end, expression-list3 to be evaluated, and forces the next iteration of the loop to occur.

Do-While Statement

do-while-statement:
    do statement1 while ( expr-list ) ;

This is a looping control-flow statement. With each loop iteration, statement1 is executed, and then the controlling expression-list is evaluated and converted into a boolean value. If the result is false, the loop statement terminates and execution continues with the statement following the loop.
Otherwise the result is true, and the next iteration of the loop is performed. Looping continues until the expression-list evaluates to false.

Note that at least one iteration of the loop is always performed, prior to testing the (post-loop) conditional expression. Note also that if the controlling expression-list always evaluates to true, the do-while statement is an infinite loop.

Executing a break statement within the body of the loop causes the looping to terminate. Executing a continue statement within the body of the loop causes the current iteration to end and the controlling expression-list to be evaluated, after which the next iteration begins if the result is true.

While Statement

while-statement:
    while ( expr-list ) statement

[…]

Switch Statement

switch-statement:
    switch ( expr ) statement

case-statement:
    case expr : statement…
    default : statement…

This is a selection statement, which evaluates the switch expression, then looks for a matching case expression within the body statement. If a match is found, the statements following the matching case expression are executed. If no matching expression is found, the statements associated with the default label (if one is present) are executed. If no default statement is present, the switch statement ends, and execution continues with the next statement following it.

When statements following a given case or default label are executed, they continue being executed until a break statement is encountered or until the end of the switch statement is reached. In other words, execution of each of the switch body statements “falls through” to the next statement unless a break statement is encountered.

Break Statement

break-statement:
    break ;
    continue ;

A continue statement can only appear within the body of a looping statement (do-while, while, or for loop). A break statement can appear within a looping statement or within a switch statement body.
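The loop, switch, break, and continue rules can be seen together in a sketch like the following. The function count_kinds and its variables are hypothetical names chosen for illustration, and the code follows the draft grammar as written here; it has not been run against an implementation.

```otml
function int count_kinds(int n)
{
    int i;
    int small = 0;
    int other = 0;

    for (i = 0; i < n; i++)
    {
        if (i % 2 == 1)
            continue;             // Ends this iteration; 'i++' is still evaluated

        switch (i)
        {
            case 0:               // No break: falls through to the 'case 2' statements
            case 2:
                small++;
                break;            // Leaves the switch, not the for loop
            default:
                other++;
        }

        if (other > 100)
            break;                // Terminates the for loop
    }

    return small + other;
}
```

Note that the for statement takes expression lists, not variable definitions, so the loop counter is defined before the loop.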
Within the body of a looping statement, executing a break statement causes the looping statement to end, and the next statement following the loop to be executed. Executing a continue statement within the body of a looping statement causes the current iteration of the loop to end and forces the next iteration to begin.

Within the body of a switch statement, executing a break statement causes execution control to flow out of the switch statement body, and the next statement following the switch to be executed. In other words, this defeats the “fall through” behavior of the statements within the switch body.

Return Statement

return-statement:
    return [expr] ;

Executing a return statement causes the execution of its parent function to end, and a value to be returned to the caller of the function. If an expression is specified, it is evaluated and converted into the return type of the function, and that converted value is returned from the function. If no expression is specified, an implicit value of true, 1, or 1.0 is returned, depending on the function’s return type.

Try-Catch-Finally Statement

try-catch-finally-statement:
    try statement1 [catch ( [type] name ) statement2]… [finally statement3]

[…]

Throw Statement

throw-statement:
    throw expr ;

Executing a throw statement causes an exception to be raised.

[…]

Lock Statement

lock-statement:
    lock ( expr ) statement

The controlling expression designates an object to lock. This must be a named object l-value or an object that is accessible by functions in other threads; it cannot be a temporary object (e.g., the result of an arithmetic expression).

After the expression is evaluated, the current thread attempts to acquire an exclusive lock on the resulting object. If a lock can be acquired, the thread locks it and then proceeds to execute the statement.
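A minimal sketch of a lock statement guarding a shared object might look like this. The data object counter, its member hits, and the function bump are all hypothetical; the example assumes that counter is reachable from functions running in more than one body thread.

```otml
// Hypothetical data object shared by functions in several body threads
data counter
{
    int hits = 0;
}

function bump()
{
    lock (counter)                // A named object l-value, not a temporary
    {
        // Only the thread holding the lock on 'counter' executes this block
        counter.hits = counter.hits + 1;
    }
}
```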
If a lock cannot be acquired, another thread has a lock on the object, so the current thread blocks until the other thread releases its lock on the object, at which time the current thread again attempts to acquire a lock on the object. Locks are acquired in a non-deterministic way, being assigned arbitrarily to one of the threads requesting a lock on a given object. While the thread is blocked, it cannot execute any other statements.

A lock statement typically contains a call to the locked object’s wait() or notify() functions. These functions operate in concert to synchronize the execution of separate but cooperating threads.

Implementations may impose a limit on the number of simultaneously active locks, and may choose to throw a run-time exception when that limit is exceeded.

Block Statement

block-statement:
    { [statement…] }

A block statement (or statement block) is a sequence of zero or more statements enclosed within braces ({ }).

A new scope exists within the block. Variables and objects defined within the block exist as long as the block is active (being executed), and they are no longer accessible when the execution of the block ends. These variables and objects are not visible outside of the block. However, variables and objects defined outside the block but contained within the same outer block are visible within the block.

[…]…variable hiding…

Empty blocks (i.e., blocks containing no statements) are equivalent to an empty statement, and do not perform any operations.

Some examples of block statements:

    // Outer block
    {
        string name = "Howard";
        string species = "duck";

        // Inner block
        {
            string name2 = name;              // Initialized to "Howard"
            string species = "billionaire";   // Hides outer ‘species’ var
        }
    }

Expression Statement

expr-statement:
    expr ;

An expression statement specifies an expression to be executed.
The expression is generally executed for its side effects, which may include modifying variables and objects or calling functions. Implementations may choose to issue a warning if an expression statement has no side effects.

[…]

Expressions

Expressions are the basic building block of function code.

expr:
    query-expr

expr-list:
    expr
    expr-list , expr

Expression lists are sequences of one or more separate expressions, and are evaluated from left to right. The result and type of the expression list is the value and type of the last (rightmost) expression in the list.

query-expr:
    assignment-expr
    or-expr ? query-expr : query-expr

Query expressions (also called ternary expressions or if-then-else expressions) are evaluated by first evaluating the first expression (preceding the ‘?’ operator), also known as the controlling expression, and the result is converted into a bool value. If the value is equal to true, the second expression (following the ‘?’ operator) is evaluated and becomes the result of the entire expression; otherwise the value is false, and the third expression (following the ‘:’ operator) is evaluated and becomes the result of the entire expression. Note that only one of the second and third expressions is evaluated following the evaluation of the first controlling expression. The result of either expression is converted to a common type (which will be either the type of the second expression or the type of the third expression), and this is the type of the entire expression.

assignment-expr:
    lvalue assignment-op query-expr         // FIXME

lvalue:
    primary-expr                            // FIXME

assignment-op:
    =  *=  /=  %=  +=  -=  >>=  >>>=  <<=  &=  ^=  |=

Assignment expressions have the side effect of assigning a new value to the l-value expression (the expression preceding the assignment operator).
The value is the result of evaluating the expression following the assignment operator, and this result is converted into the type of the l-value, and then the value of the l-value is changed to the resulting converted value. The l-value expression must designate a modifiable variable or object which is accessible within the scope of the block containing the expression.

The compound assignment operators combine the current value of the l-value with the right operand using another operator, and then assign the resulting value to the l-value. For instance, ‘x *= y’ is semantically equivalent to ‘x = x * y’, except that ‘x’ is evaluated only once.

or-expr:
    and-expr
    or-expr or and-expr

and-expr:
    not-expr
    and-expr and not-expr

not-expr:
    rel-expr
    not not-expr

Logical expressions result in a bool value (true or false).

An or-expression is evaluated by first evaluating the expression to the left of the ‘or’ operator and converting the result into a bool value. If the result is true, the entire expression evaluates to true; otherwise, the expression to the right of the ‘or’ operator is evaluated and the result is converted into a bool value, which becomes the result of the entire expression. Note that if the left expression is true, the right expression is not evaluated at all (this is known as short-circuit evaluation).

An and-expression is evaluated by first evaluating the expression to the left of the ‘and’ operator and converting the result into a bool value. If the result is false, the entire expression evaluates to false; otherwise, the expression to the right of the ‘and’ operator is evaluated and the result is converted into a bool value, which becomes the result of the entire expression. Note that if the left expression is false, the right expression is not evaluated at all (this is known as short-circuit evaluation).
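The short-circuit and compound assignment rules above can be sketched as follows. The function safe_ratio and its parameters are hypothetical, and the snippet assumes that a plain or-expression is accepted wherever an expression is expected, which the draft grammar (marked FIXME) does not yet spell out.

```otml
function bool safe_ratio(int total, int parts)
{
    float ratio = 0.0;

    // 'and' short-circuits: the division is skipped when 'parts' is zero
    if (parts != 0 and total / parts > 2)
        ratio = 1.0;

    // 'or' short-circuits: 'total < 0' is not evaluated when 'parts' is zero
    bool suspicious = parts == 0 or total < 0;

    ratio *= 2;                   // Same effect as 'ratio = ratio * 2'

    return not suspicious and ratio > 0.0;
}
```

Because ‘not’ binds more tightly than ‘and’, the return expression groups as ‘(not suspicious) and (ratio > 0.0)’.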
A not-expression is evaluated by evaluating the expression following the ‘not’ operator and converting the result into a bool value. The result of the entire expression is then the logical complement of that value (i.e., if the value is true, the expression evaluates to false, and vice versa). Note that the expression ‘not x’ is similar, but not exactly identical, to the expressions ‘x == false’ and ‘x != true’.

rel-expr:
    bit-or-expr
    rel-expr ==  bit-or-expr
    rel-expr === bit-or-expr
    rel-expr !=  bit-or-expr
    rel-expr >   bit-or-expr
    rel-expr >=  bit-or-expr
    rel-expr <   bit-or-expr
    rel-expr <=  bit-or-expr

A relational expression involves a relational operator (also called a comparison operator) which specifies how the left and right expressions are to be compared to one another. The left expression is evaluated first, then the right expression is evaluated. The two results are then converted to a common type (which will be either the type of the left expression or the type of the right expression), and the converted values are compared according to the relational operator specified. The result of the comparison is a bool value (true or false), which becomes the result of the entire expression.

If either expression has type object, the only valid relational operators are equals (==, ===) and not equals (!=), and the other expression is taken as an object value. Otherwise, if either expression has type string, the other expression is converted into a string value prior to the comparison. Otherwise, if either expression has type float, the other expression is converted into a float value prior to the comparison. Otherwise, if either expression has type int, the other expression is converted into an int value prior to the comparison. Otherwise, if either expression has type bool, the other expression is converted into a bool value prior to the comparison.
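To make the order of these conversion rules concrete, the sketch below pairs operands of different types. The function compare_demo is hypothetical, and the comments only restate which conversion rule applies; they do not claim a particular result for the string comparison, since the draft does not define how converted strings are compared.

```otml
function bool compare_demo()
{
    bool a = 2 < 2.5;             // int 2 is converted to float before comparing
    bool b = true == 1;           // bool true is converted to int before comparing
    bool c = "7" == 7;            // int 7 is converted to string before comparing

    object o = null;
    bool d = o == null;           // For type object only ==, ===, and != are valid

    return a and b and c and d;
}
```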
If either expression is undefined, the relational expression evaluates to false. If both expressions are type float and either expression is NaN (not a number), the relational expression evaluates to false.

bit-or-expr:
    bit-xor-expr
    bit-or-expr | bit-xor-expr

bit-xor-expr:
    bit-and-expr
    bit-xor-expr ^ bit-and-expr

bit-and-expr:
    shift-expr
    bit-and-expr & shift-expr

shift-expr:
    add-expr
    shift-expr >> add-expr
    shift-expr >>> add-expr
    shift-expr << add-expr

The bit-wise operators can only be applied to operands of type int, and yield values of type int.

The bit-shift operators (<<, >>, >>>) shift the binary bits of their left operand by the number of bits specified by their right operand. The ‘<<’ operator shifts the bits left, filling the low-order bits with zeros; the ‘>>’ operator shifts the bits right, filling the high-order bits with copies of the most significant (sign) bit; the ‘>>>’ operator also shifts the bits right, but fills the high-order bits with zeros.

add-expr:
    mul-expr
    add-expr + mul-expr
    add-expr - mul-expr

mul-expr:
    unary-expr
    mul-expr * unary-expr
    mul-expr / unary-expr
    mul-expr % unary-expr

Additive operators and multiplicative operators can only be applied to operands of type int or float. If either operand is type float, the other operand is converted to float before the operator is applied.

The ‘+’ operator can also be applied to operands of type string, resulting in a value that is the concatenation of the two operands. Both operands must have type string for the ‘+’ operator to designate string concatenation, otherwise it designates numeric addition.

unary-expr:
    primary-expr
    ~ unary-expr
    + unary-expr
    - unary-expr
    ++ primary-expr
    -- primary-expr
    primary-expr ++
    primary-expr --

The unary operators perform arithmetic operations on a single operand.

The ‘~’ operator results in the bit-wise one’s-complement of its operand.
It can only be applied to operands of type int.

The unary ‘+’ and ‘-’ sign operators can only be applied to operands of numeric type. The ‘+’ operator does not change the value of its operand, while the ‘-’ operator results in the arithmetic negative of its operand. The resulting value is the same type as the operand.

The ‘++’ and ‘--’ operators cause their operands to be incremented and decremented by 1, respectively. These operators can only be applied to operands of numeric type. The operand expression must designate a modifiable l-value. There are two forms of these operators. Those that precede their operand are called the pre-increment and pre-decrement operators, and operate by first incrementing or decrementing their operand and then returning the resulting modified value. Those that follow their operand are called the post-increment and post-decrement operators, and operate by returning the value of their operand prior to the increment/decrement operation, and incrementing or decrementing their operand after the original value has been used.

primary-expr:
    ( expr )
    subscript-expr
    new-expr
    delete-expr
    convert-expr
    typeof-expr
    function-call-expr
    name-expr
    string
    number
    true
    false
    null

Parentheses (the ‘(’ and ‘)’ round bracket operators) are used for grouping expressions so that they are evaluated in a specific order.

The true and false values are pre-defined (built-in) primitive values of type bool. They can be converted into types int (0 and 1) and string ("true" and "false").

The null value is a pre-defined (built-in) primitive value of type object. It specifically designates no object, so that values assigned null do not refer to any object. The null value can be converted into a bool value of false.

subscript-expr:
    name-expr [ expr ]

The array subscript operator accesses an element of an array object.
The left operand (preceding the ‘[’ operator) is evaluated first, and must designate an object of array type. The right operand (inside the ‘[’ and ‘]’ brackets) is evaluated; if it has numeric type (bool, int, or float), it is converted into an int value, and the result is the index of the element within the array to be accessed. Note that array indices start at zero (0). If the right operand has type string, it is taken as a hash key designating an element of the array to be accessed. If the right operand is null or is undefined, a run-time exception is thrown.

new-expr:
    new type ( [args] )

delete-expr:
    delete expr

The new operator is used to construct new objects. A brand new object of the specified type is created (its space being allocated on the heap), and the appropriate constructor function is then invoked to initialize the object. The arguments, if any, are evaluated and passed to the constructor function, each one being assigned to its corresponding function parameter.

The appropriate constructor function is the constructor function with parameters that best match the list of arguments specified in the new expression. If no arguments are specified (known as a no-args constructor call), and no constructor function with no parameters is defined for the type, the type is instantiated and its members (if any) are default-initialized. Otherwise, if no matching constructor function can be found, a run-time exception is thrown.

The delete operator is used to destroy existing objects. It returns no value, and has no type (i.e., is undefined). When the expression designates a member of some object, that member is removed from its parent object. When the expression designates an element of some array object, that element is removed from its parent array.
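The subscript, new, and delete operators can be sketched together as below. The functions take_first and make_stamp are hypothetical, and the parameter queue is assumed to designate an object of array type; the draft does not say whether the remaining indices shift after a delete, so the example makes no assumption about that.

```otml
function object take_first(object queue)
{
    object head = queue[0];       // Numeric subscript: array indices start at zero
    delete queue[0];              // Removes the element from its parent array
    return head;                  // 'head' still holds a reference to the removed object
}

function object make_stamp()
{
    return new date();            // No-args constructor call on the library type 'date'
}
```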
Note that the deleted object continues to exist in memory as long as some other live object or variable contains a reference to it; otherwise, the deleted object will eventually be deallocated from memory by the garbage collector.

convert-expr:
    defined ( expr )
    bool ( expr )
    int ( expr )
    float ( expr )
    string ( expr )
    void ( expr )
    //Needed???
    //Needed???
    //Needed???

The conversion operators resemble built-in function calls, and convert their arguments into a specific type. The argument expression can be a value of any type.

The bool() conversion operator converts its argument into a value of type bool. If the argument is false, null, 0, 0.0, or is undefined, the operator returns a value of false; otherwise, it returns a value of true.

[???]…should int(x) et al allow x to be any type, or only numeric/bool? Or should users be required to use int.parse(x) instead?

The string() operator converts its operand into a string value. If the argument is null or is undefined, the resulting value is null. Otherwise, the value returned is the same as if the to_string() function for the argument’s type was called.

[…]

The void() operator removes the type of its operand, making it undefined.

The defined() operator is not actually a conversion operator, but determines whether its argument is defined or not, and returns a bool value.

typeof-expr:
    typeof ( expr )

The typeof() operator returns a string value indicating the type of its argument. It returns one of the values "bool", "int", "float", "string", "object", "date", "url", "event", "function", "null", or "undefined", or the name of a markup type (such as "body", "p", "br", etc.).

name-expr:
    this
    name
    primary-expr @ name
    primary-expr . name
    primitive-type . name

primitive-type:
    object
    bool
    int
    float
    string
    date
    url
    function

A name expression uniquely designates an object (i.e., an object, attribute, variable, or function).
The ‘this’ value refers to the parent object that contains the function.

An unadorned name refers to the variable or function having that name within the current execution scope.

A name expression followed by an at-sign (@) followed by a name (e.g., foo@bar) refers to an attribute (also called a property) of an object.

A name expression followed by a dotted name (e.g., foo.bar) refers to a named member of an object.

A primitive type name followed by a dotted name (e.g., date.today) designates a built-in function or variable in the run-time type library.

[…]

function-call-expr:
    name-expr ( [args] )

args:
    arg
    args , arg

arg:
    [name :] expr

A function call expression invokes a function, passing it zero or more arguments. Each argument expression is evaluated and assigned to its corresponding function parameter. If the number of arguments in the call expression does not match the parameters in the function definition, any unassigned parameters will be undefined.

The result of the function invocation has the type specified by the definition of the function. The resulting value returned from the execution of the function becomes the value of the function call expression.

More details can be found in the Function Invocation section below.

value:
    ( expr )
    name-expr
    string
    [+ | -] number
    true
    false
    null

type:
    markup-type
    primitive-type

A value expression is essentially a primary expression, except that it only occurs in contexts where an attribute (property) value is allowed.

Attribute Values

Attribute values (also called property values or just simply properties) are named values attached to objects. They are distinct from object members.

[…]

Type Conversions

Every object, value, and expression implicitly has type object, and thus can be assigned to any variable of type object with no explicit conversion operation required.
A value or expression assigned to a variable of type object retains its original type. Expressions having a value of null or that are undefined do not refer to any object.

Conditional expressions are used as the controlling expressions for control flow statements, and are implicitly converted into type bool. For expressions of type int and float, a value of zero (0) is converted into false, and all other non-zero values are converted into true. Expressions of type string are converted into false if they are empty strings ("") or are null, otherwise they are converted into true. (Note that string values such as "0" and '0.00' are converted into true.) All other expressions are converted into false if they are undefined or are null, otherwise they are converted into true.

Expressions may be converted into strings using the string() operator. Most of the primitive types provide a to_string() function that can be used to return a string value formatted in a specific way. The int, float, and date types provide a parse() function that can be used to convert a string value into the corresponding primitive type.

In a function call expression, the argument expressions in the call are assigned to their corresponding function parameter variables. Each argument is implicitly converted into the type of its parameter, which may result in a runtime exception being thrown.

Regular Expressions

Regular expressions are string values used in certain expression contexts to perform pattern matching on other string values. These string values are in the format "/p/s", where p is a pattern composed of one or more of the regular expression operators listed below, and s is zero or more regular expression suffixes.

The suffixes affect the behavior of pattern matching:

    Suffix      Description
    g           Perform pattern matching globally
    i           Ignore letter case
    m           ???                                     //FIXME

The regular expression pattern matching operators are:

    Pattern     Description
    x           Matches Unicode character 'x'
    pq          Matches p followed by q
    ^           Matches the beginning of a string
    $           Matches the end of a string
    .           Matches any single Unicode character
    p?          Matches zero or one occurrence of p
    p*          Matches zero or more occurrences of p
    p+          Matches one or more occurrences of p
    p{n,m}      Matches n to m occurrences of p
    [abc]       Matches single character 'a', 'b', or 'c'
    [a-z]       Matches single character from 'a' to 'z'
    [^a-z]      Matches single character other than 'a' to 'z'
    p|q         Matches either p or q
    (p)         Grouping, matches p
    \n          Matches group n (0 to 9)
    \b          Matches the beginning of a word
    \B          Matches the end of a word
    \d          Matches a digit character
    \D          Matches any character except digit characters
    \N          Matches a CR LF pair
    \s          Matches a whitespace character
    \S          Matches any character except whitespace characters
    \w          Matches a word character
    \W          Matches any character except word characters
    \uHHHH      Matches Unicode character U+HHHH
    \\          Matches '\'
    \^          Matches '^'
    \$          Matches '$'
    \.          Matches '.'
    \?          Matches '?'
    \*          Matches '*'
    \+          Matches '+'
    \[          Matches '['
    \]          Matches ']'
    \(          Matches '('
    \)          Matches ')'
    \{          Matches '{'
    \}          Matches '}'
    \|          Matches '|'

Object Management

New objects are created using the new operator. This is syntactically similar to a function call, except that an object type is specified instead of a function name. Markup objects can be created (e.g., p, table, tr, div, span), as well as objects of the standard library types (e.g., date, url).

[…]

Objects that go out of scope (i.e., local variables within a statement block that is no longer active) or that no longer have any other active objects referring to them are called dead objects. Dead objects are eventually subject to garbage collection, which deallocates them from memory and reclaims the space they occupy.
Objects that are still within an active scope or are still referenced by other active objects are called live objects, and are not subject to garbage collection.

[…]

    type foo
    {
        // Constructor function for 'type'
        function new()
        {
            …
        }
    }

[…]

Name Scope

[…]

Function Invocation

A function may be invoked from a function, or by an event or timer. Recursive calls are supported, meaning that a function may call itself directly, or may call another function that in turn calls it indirectly.

Upon entry to a function from a function call expression, the argument values of the call expression (if any) are assigned to their corresponding function parameters. The function parameters are essentially local variables of the function. If the number of function call arguments does not agree with the function parameters, any unassigned parameters will be undefined.

After the parameters are assigned, the first statement of the function body (a statement block enclosed within { } braces) is executed. The statements within the body of the function are executed in sequence until a return statement is executed (which returns a value back to the caller) or until the execution flows out of the function body. If an explicit return value is not specified, or if the function body is empty (contains no statements), the function returns an implicit value of true back to the caller.

[…]…the implicit this object parameter

[…]…closures, first-class function objects, and nested functions…

[…]…threads…

[…]…anonymous functions…

[…]

Multithreading

Within an OTML browser, the document object is executed within a main thread. Each body object within the document executes within its own separate thread. Implementations may choose to limit the number of active body threads within a given document.
Upon the initial loading of a body object, its thread issues a ‘load’ event, which may be handled by member functions of the body defined with an ‘event="load"’ attribute. Once execution of all load event functions ends, the body thread suspends execution and waits for events to be received from the main loop.

GUI events (such as mouse clicks and keyboard keystrokes) for a given display element are captured by the main thread and delivered to the body thread containing that element. Once the main thread has delivered an event to the appropriate thread, it resumes waiting for other events.

When an action is triggered that causes a response object to be sent to the server from the document, such as when the user clicks on an action button or enters a keystroke that results in a submit action, the main thread sends an ‘unload’ event to each of the body threads. Each body thread in turn may then handle this event with member functions defined with an ‘event="unload"’ attribute. Once execution in all of the body threads has ended, the main thread terminates all of the body threads, and then sends the response object to the server. The browser then waits for the next reply object to be received from the server.

[…]

Standard Library Types

The types below are defined using a pseudo-language which is not actually valid OTML.

Object Type

Every object contains the following variables and functions.
    type object {
        string    id;               // Name
        string    type;             // Element type, e.g., "string"
        object    parent_element;   // Parent object
        object    prev_element;     // Previous sibling object
        object    next_element;     // Next sibling object
        object    first_child;      // First child object
        object[]  elements;         // Child objects
        object[]  attributes;       // Attribute values
        …
        function bool equals(object o);
        function bool send_event(event evt);
        …
        operator object new();
        operator object new(string type);
    }

    object document;                // Document object

String Type

A string object is a (possibly empty) sequence of Unicode characters. Every string object contains the following variables and functions.

    type string {
        int       length;           // Length

        function string   char_at(int pos);
        function string   match(string pat);
        function string   replace(string pat, string new);
        function string   search(string pat);
        function string[] split(string pat);
        function string   substring(int start, int len);
        function string   substring(int start);
        function string   to_capital();
        function string   to_lower();
        function string   to_upper();
        …
        operator string [](int pos);
    }

    operator string string(object obj);

Bool Type

A bool object is a 1-bit boolean value, either true or false.

    type bool {
        string    name;             // "true" or "false"

        function string to_string();
        function int    parse(string s);
        …
    }

Int Type

An int object is a 64-bit signed binary integer value.

    type int {
        const int MIN;              // Minimum int value, -(2^63)
        const int MAX;              // Maximum int value, (2^63)-1

        function string to_string(string fmt);
        function int    parse(string s);
        function int    to_fixed(float n);
        function int    to_exponential(int n);
        …
    }

Float Type

A float object is a 64-bit IEEE double-precision floating-point real value.
    type float {
        const float MIN;            // Minimum float value
        const float MAX;            // Maximum float value
        const float INF;            // Infinity
        const float NAN;            // Not-a-Number
        const float EPSILON;        // Smallest float difference
        const float PI;             // pi, 3.1415926+
        const float E;              // e, 2.7182818+
        …
        function bool   is_inf(float x);
        function bool   is_nan(float x);
        function float  parse(string s);
        function string to_string(string fmt);
        function float  to_fixed(int n);
        function float  to_precision(int n);
        …
        function float  sqrt(float x);
        function float  log(float x);
        function float  log10(float x);
        function float  exp(float x);
        function float  pow(float x, float y);
        function float  sin(float x);
        function float  cos(float x);
        function float  tan(float x);
        function float  asin(float x);
        function float  acos(float x);
        function float  atan(float x);
        function float  atan2(float x, float y);
        function float  sinh(float x);
        function float  cosh(float x);
        function float  tanh(float x);
        …
    }

Date Type

A date object embodies the components of a date and time. Implementations are required to support dates spanning at least the range from AD 1601-01-01 to 2400-12-31.
    type date {
        int       year;             // Year (1601-2400)
        int       mon;              // Month (1-12)
        int       mday;             // Day of the month (1-31)
        int       yday;             // Day of the year (1-366)
        int       wkday;            // Day of the week (0-6)
        int       week;             // Week of the year (1-53)
        int       hour;             // Hour (0-23)
        int       min;              // Minute (0-59)
        int       sec;              // Second (0-60)
        int       msec;             // Millisecond (0-999)
        int       ticks;            // msec ticks since the Epoch
        …
        const date MIN;             // Minimum supported date value
        const date MAX;             // Maximum supported date value
        const date UNKNOWN;         // Unknown date value, < MIN
        const date NEVER;           // Never/unset date value, > MAX
        const date ERROR;           // Erroneous date value

        function date   now();
        function date   utc_now();
        function date   today();
        function date   parse(string s, string fmt);
        function string to_string(string fmt);
        function date   normalize();
        …
        function date   add_years(int delta);
        function date   add_months(int delta);
        function date   add_days(int delta);
        function date   add_hours(int delta);
        function date   add_mins(int delta);
        function date   add_secs(int delta);
        function date   add_msecs(int delta);
        …
        operator date new(int year, int mon, int mday);
        operator date new(int year, int mon, int mday,
                          int hr, int min, int sec, int msec);
        operator date new(int ticks);
    }

URL Type

A url object embodies the components of an HTTP URL, which specifies the location of a network resource.

    type url {
        int       length;           // Length
        string    protocol;         // Protocol prefix ("http")
        string    host;             // Host name
        int       port;             // Port number
        string    user;             // User name
        string    password;         // Password
        string    file;             // File
        string    query;            // Query parameters
        …
        function string query_parm(string parm);
        function url    parse(string s);
        …
        operator url new(string proto, string host, string file);
    }

    operator url url(object obj);

RegEx Type

A regular expression object is a sequence of Unicode characters. Every regular expression object contains the following variables and functions.
    type regex {
        int       last_index;       // Position of last match

        function string exec();
        function string test(string s);
    }

    operator regex regex(string s);

Array Type

All arrays contain the following variables and functions.

    type array {
        int       length;           // Number of elements

        function int  add(object obj);
        function bool remove(int index);
        function bool remove(string index);
        …
        function object[] copy(int start, int count);
        function float    avg();
        function float    variance();
        function float    stdev();
        …
        operator object [](int index);
        operator object [](string index);
    }

Event Type

An event object contains details about an event, which is usually triggered by user interaction with display elements. This includes mouse clicks and movements, and keyboard keystrokes.

    type event {
        string    id;               // Name
        string    type;             // Event type
        object    element;          // Affected object
        int       key_code;         // Unicode key code
        int       key_mask;         // Shift modifiers bitmask
        string    mouse_key;        // Mouse button type
        int       x_abs;            // Absolute X window location
        int       y_abs;            // Absolute Y window location
        int       x_offset;         // X location within parent object
        int       y_offset;         // Y location within parent object
        …
    +FIXME
        type = string;              // Event type, sans "on"
        x = num;                    // Element X coordinate
        y = num;                    // Element Y coordinate
        client_x = num;             // Browser X coordinate
        client_y = num;             // Browser Y coordinate
        screen_x = num;             // Screen X coordinate
        screen_y = num;             // Screen Y coordinate
        page_x = num;               // Page X coordinate
        page_y = num;               // Page Y coordinate
        from_element = ref;         // Object moved from
        to_element = ref;           // Object moved to
        key_code = "\uHHHH";        // Unicode char or mouse key
        target = ref;               // Parent 'this' object
        modifiers = num;            // Modifier keys bitmask
        alt_key = t/f;              // 'Alt' key was pressed
        ctrl_key = t/f;             // 'Ctrl' key was pressed
        shift_key = t/f;            // 'Shift' key was pressed
        reason = num;               // Event completion code
        height = num;               // Resize height
        width = num;                // Resize width
        return_value = t/f;         // Event return value
        …
    }

Thread Type

Each body element executes within its own separate execution thread, and the document object controls the main execution thread for the browser display. Timer events also execute in their own separate threads.

    type thread {
        string    id;               // Name
        thread    parent_thread;    // Parent thread
        object    parent_element;   // Parent object
        thread    document_thread;  // Top document thread
        bool      is_active;        // Is executing
        bool      is_waiting;       // Is waiting for an event
        bool      is_locked;        // Is waiting for a notify
        object[]  locked_objects;   // Objects locked by thread
        …
        function thread get_thread(string id);
        function bool   wait();
        function bool   notify();
        function bool   exit();
        …
        function thread set_interval(function func, object[] args, int msec);
        function thread set_timer(function func, object[] args, int msec);
        function bool   cancel_timer(thread th);
    }

Body Type

All body elements contain the following variables and functions.

    type body {
        string    id;               // Name
        thread    this_thread;      // Body execution thread
        object    document;         // Document object
        object    response;         // Response object
        bool      is_alive;         // Is executing
        bool      is_waiting;       // Is waiting for events
        …
        function body get_body(string id);
        function respond();
        function respond_to(url site);
        …
    }

Markup Type

All markup elements contain the following variables and functions.

    type markup {
        string    id;               // Name
        string    type;             // Markup type
        int       width;            // Width, in pixels
        int       height;           // Height, in pixels
        int       left_abs;         // Absolute X window location
        int       top_abs;          // Absolute Y window location
        int       left_offset;      // X location within parent object
        int       top_offset;       // Y location within parent object
        …
    }

Request Type

A browser request object is sent from the web content server to the client browser, and contains these variables and functions.
    type request {
        url       site;             // Origination site
        object[]  elements;         // Data objects sent
        …
    }

Response Type

A browser response object is returned from the client browser to the web content server, and contains these variables and functions.

    type response {
        url       site;             // Destination site
        object[]  elements;         // Data objects returned
        …
    }

Function Type

Every function object contains these variables and functions.

    type function {
        string    id;               // Name
        string    type;             // Return type
        object    parent_element;   // Parent object
        …
        function object call(object this_obj, object[] args);
        …
        operator object new();
    }

Markup Types

The following is the list of the supported markup object types. Items marked as ‘deprecated’ have been deprecated since HTML 4.01; those marked ‘deprecated (5)’ have been deprecated since HTML 5.

    Markup type   Description                 Notes
    a             Anchor
    address       Address
    b             Boldface
    bl            Bulleted list
    body          Document body
    br            Line break
    button        Input button
    code          Source code font
    del           Deleted
    div           Division box
    dir           Directory item
    dd            Directed list data item
    dl            Delimited list
    dt            Delimited list title item
    embed         Embedded object
    form          Input entry form
    font          Font
    h1 … h6       Heading                     Levels 1 (highest) to 6 (lowest)
    head          Document head
    hr            Horizontal rule
    http_equiv    Document HTTP control
    i             Italics
    img           Image graphic
    ins           Inserted
    input         Form input item
    kbd           Keyboard font
    li            List item
    link          Document link
    meta          Document meta-info
    nobr          Non-breaking span
    object        Embedded object
    ol            Ordered list
    p             Paragraph
    param         Object parameter
    pre           Preserve whitespace
    quote         Quotation
    s             Strike-out
    span          Span box
    strike        Strike-out
    table         Table
    tbody         Table body
    td            Table cell
    textarea      Text input box
    tfoot         Table footer
    th            Table heading cell
    thead         Table header
    title         Document title
    tr            Table row
    tt            Teletype font
    u             Underlined
    ul            Unordered list
    var           Variable item

The Notes column also carries the following entries, listed here in their original top-to-bottom order without row assignments. For rows a through textarea, before the heading row: deprecated (5); deprecated (5); deprecated; deprecated (5). After the heading row: deprecated (5); deprecated; deprecated, not supported; deprecated, not supported; deprecated (5); deprecated. For rows tfoot through var: deprecated; deprecated (5); deprecated.

[…]

Prior Art

HTML

The fundamental lexical difference between OTML and HTML is that display text elements in HTML are free-form while markup tags and attributes are enclosed within brackets; the opposite is true of OTML, where display text elements are enclosed within quotes and brackets while markup tags and attributes are free-form.

Almost all HTML element tags and style properties translate directly into OTML markup element types and attributes. Thus this HTML code:

    <span style="border:solid 1px red">Error!</span>

becomes this OTML code:

    span style.border="solid 1px red" { "Error!" }

HTML (actually XHTML) has slightly better implicit handling of element content bracketing, requiring a closing ‘</tag>’ for every opening ‘<tag>’. OTML requires matching ‘{’ and ‘}’ braces, but these can be accidentally omitted too easily. (Non-XML conforming HTML also suffers from this problem.) OTML solves this by allowing closing ‘}’ braces to be followed by a ‘: tag’ to indicate which block brace is being matched/closed. This syntax is not required, but is recommended for large object blocks.

All of the HTML character entities (252 of them) are retained as character escape sequences. In addition, more character escape sequences are available to provide for the most common display formatting characters. For example, this HTML fragment:

    &nbsp;&nbsp;&bull;&nbsp; Name:&nbsp;&nbsp; John Doe

could be coded in OTML as:

    "\&nbsp;\&nbsp;\&bull;\&nbsp; Name:\&nbsp;\&nbsp; John Doe"

or even more succinctly as:

    "\_\_\&bull;\_ Name:\_\_ John Doe"

[…]…what about <object> and <var> tags?
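The tag-and-style translation described in this section can be sketched mechanically. The Python helper below is only an illustration under stated assumptions: the function name `to_otml` is hypothetical, it handles a single element with plain text content, it maps hyphenated CSS property names to underscores (as in the margin_right example elsewhere in this document), and its escaping of backslashes as `\\` inside quoted text is an assumption rather than something this section confirms.

```python
def to_otml(tag, style, text):
    """Render one HTML-like element as a single-line OTML object.

    tag:   element name, e.g. "span"
    style: dict of CSS property name -> value
    text:  plain display text placed inside the element
    """
    attrs = []
    for prop, value in style.items():
        name = prop.replace("-", "_")      # margin-right -> margin_right
        attrs.append(f'style.{name}="{value}"')
    # Escape embedded quotes as \" ; escaping backslashes is an assumption.
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    head = " ".join([tag] + attrs)
    return f'{head} {{ "{escaped}" }}'


print(to_otml("span", {"border": "solid 1px red"}, "Error!"))
# prints: span style.border="solid 1px red" { "Error!" }
```

Nested elements, non-style attributes, and the optional ‘: tag’ suffix on closing braces are deliberately left out of this sketch.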
[…]…threading per body element
[…]

XML

Like XML, OTML syntax requires elements (blocks) to be enclosed within matching tags (braces).

OTML documents can be embedded within XML documents. For example:

    <?xml version="1.0" standalone="yes"?>
    <!-- OTML document embedded within an XML document -->
    <document type="text/OTML">
    document format="1.0"
    {
        head { … }
        body { … }
        data { … }
        …
    }
    </document>

Be aware that, per XML encoding rules, certain special characters must be encoded appropriately; specifically, the characters <, >, and & must be replaced by their equivalent XML character entities &lt;, &gt;, and &amp;, or by &#60;, &#62;, and &#38;, respectively.

OTML documents may contain XML documents. For example, this illustrates an XML document encoded as an OTML data object:

    // XML data, as an OTML object
    data xmldoc1
    {
        "<?xml version=\"1.0\" standalone=\"yes\"?>";
        "<employee>";
        " <name><!-- As a structure -->";
        "  <first>John</first>";
        "  <middle>Q</middle>";
        "  <last>Doe</last>";
        " </name>";
        …
        "</employee>";
    }

An alternative way to encode XML text as a data object within an OTML document, which does not require quoting every text line of the XML document, looks like this:

    // XML data, as an OTML object
    data xmldoc1
    {
        "<?xml version=\"1.0\" standalone=\"yes\"?>
        <employee>
         <name><!-- As a structure -->
          <first>John</first>
          <middle>Q</middle>
          <last>Doe</last>
         </name>
         …
        </employee>";
    }

One thing to be aware of when encoding XML within OTML is that embedded quote marks (") must be encoded appropriately, replacing them with either the OTML escaped quote sequences (\") or their equivalent Unicode character codes (\u0022) or entities (\&quot;). For example, an XML element containing embedded quote characters could be encoded as:

    "<quotation>She said, \"Well, <i>hello</i> there!\"</quotation>"

or as:

    "<quotation>She said, \u0022Well, <i>hello</i> there!\&quot;</quotation>"

[...]

CSS

OTML syntax for style objects closely resembles the syntax for CSS <style> elements.

[…]

JavaScript

OTML syntax resembles JavaScript syntax more than any other language. Some of the ways in which OTML differs from JavaScript, though, are:

•  Replacement of the ‘var’ variable definition keyword with several data type keywords (‘object’, ‘int’, ‘string’, ‘float’, and ‘bool’).
•  Variables have block scope, whereas in JavaScript they have function scope.
•  Functions and variables have a scope limited to their parent object, whereas in JavaScript they essentially have document scope (even if defined within separate <script> elements). This means that name prefixes and suffixes are not as necessary to make names unique as they are in JavaScript.
•  Variables are default-initialized, whereas in JavaScript they are not.
•  All functions return a value, whereas in JavaScript they only return values explicitly.
•  Event-handler functions must be explicitly defined as such, whereas in JavaScript they need not be. An event for an element is specified as the ‘event=xxx’ attribute of a member function of the element, instead of as arbitrary JS code in an ‘onxxx’ attribute of the element.
•  Character escape sequences are different. Most of the control code sequences (e.g., \a, \b, \v) have been removed, since they do not have much practical application within web documents. New ones have been added (e.g., \z, \_) which do have practical uses within web documents.
•  The logical ‘&&’, ‘||’, and ‘!’ operators have been replaced by the more readable ‘and’, ‘or’, and ‘not’ keywords.
•  The equality (==, !=) and relational operators (>, >=, <, <=) all have the same precedence.
•  More useful character escape sequences have been provided.
•  Regular expressions are just string values, thus requiring no special lexical analysis.
•  x.getAttribute() and x.setAttribute() are handled as direct access to an object’s attribute by name, as x@foo.

[…]…eval()…
[…]

JSON

Consider this JSON structured data value:

    "employee":
    {
        "first-name":  "Jason",
        "last-name":   "Alexander",
        "address":
        {
            "street":  "1001 Shady Oak Lane",
            "city":    "Hollywood",
            "state":   "CA"
        },
        "emp-id":      "8652-17-221255",
        "hire-date":   "2011-05-01"
    }

This can be translated directly into OTML as:

    data employee =
    {
        first_name = "Jason";
        last_name  = "Alexander";
        address =
        {
            street = "1001 Shady Oak Lane";
            city   = "Hollywood";
            state  = "CA";
        };
        emp_id     = "8652-17-221255";
        hire_date  = "2011-05-01";
    }

The primary difference between the two is that OTML data elements look syntactically more like variable definitions with initializer values (because that is essentially what they are). A more type-complete example could be coded as:

    data employee =
    {
        string first_name = "Jason";
        string last_name  = "Alexander";
        address =
        {
            string street = "1001 Shady Oak Lane";
            string city   = "Hollywood";
            string state  = "CA";
        };
        string emp_id     = "8652-17-221255";
        string hire_date  = "2011-05-01";
    }

Another difference is that an OTML data element must have a lexically valid name, whereas JSON allows an element to have any kind of name (because it is a quoted string). This means that OTML data objects and data members cannot have names that are the same as OTML keywords. It is recommended that such names be suffixed with an underscore character (for example, the JSON name "date" would become the OTML name date_).

[…]

AJAX

[…]…XMLHttpRequest object…

HTML Browsers

The fundamental difference between OTML and HTML browsers is the nature of the data transmitted between them and web servers.
HTML browsers are concerned with sending and receiving primarily textual content containing display markup elements; OTML browsers are concerned with sending and receiving data objects containing display markup elements.

Another important difference is that in HTML, executable code (<script> elements) is separate from display elements. In OTML, executable code (function objects) is integrated into, and directly part of, the display elements.

The requirements that OTML places on browsers are meant to improve the GUI experience of the user. Primary to this is the way a browser behaves when an action is instigated by the user, such as clicking a button or URL link. HTML browsers initiate a submit action, but continue to allow the user to interact with other action elements in the display, which can cause multiple (and possibly conflicting) submit actions. OTML browsers, in contrast, immediately disable all further user interactions with display elements once a submit action is initiated (unless the elements are explicitly defined to remain active). This eliminates the possibility of the user issuing multiple conflicting actions. Likewise, requiring the OTML browser to visually indicate that a submit action has been instigated (such as by dimming the display or showing a pop-up panel) makes it clear to users that they have, in fact, triggered an action.

[…]…threading, multiple independent bodies…
[…]…respond to multiple sites simultaneously from a submit, using multiple bodies…
[…]

Examples

HTML + CSS

    <!-- A paragraph -->
    <p class="para" style="margin-right:0.5in">
      Hello, world.
      This is an <i>example</i>
      of an <u>HTML</u> document.
    </p>

    <!-- Some more text -->
    <p>
      Lorem ipsum
      <i>dolor sit <b>amet</b></i>,
      consectetur adipisicing elit.
      <br/>
      <span style="color:blue">
        E = mc<sup>2</sup>.
      </span>
    </p>
    <hr/>

OTML

    // A paragraph
    p class=para style.margin_right="0.5in"
    {
        "Hello, world.";
        "This is an " i { "example" };
        "of a " u { "OTML" } " document.";
    }

    // Some more text
    p
    {
        "Lorem ipsum";
        i{"dolor sit " b{"amet"}} ",";
        "consectetur adipisicing elit.";
        br;
        span style.color="blue"
            { "E = mc" sup{"2"} "." }
    }
    hr;

Unresolved Issues

Lexical Issues

•  Newlines (CR LF, or CR/LF/CR LF?)
•  Require '\r\n' or only '\n', i.e., is '\n' CR LF or just LF?
   o  '\L' could be LF, or '\N' could be CR LF
•  Should names be case-insensitive in all contexts (color vs. Color vs. COLOR)?
•  Use XML or JavaScript names (margin_left / marginLeft)?
•  Do system names differ from user-defined names?
   o  Math.PI, date.$now, foo vs. $foo, this vs. $this
•  Should attribute names differ from variable names?
   o  x.color, x@color, x.@color, x@.color, x['color'], x['@color']
•  Regular expressions are strings, requiring no special lexer
•  Attributes and variables share the same syntax, do not need quoted values
•  Char escape sequences must not conflict with regex operators
•  Hidden variables must return in the browser response object

Syntax Issues

•  Separating content text from markup
•  Separating pure data from content
•  Do we still need var in addition to object?
•  Keywords ‘var’ and ‘object’ conflict with <var> and <object>
   o  Give <object> markup a different name
   o  Do not support <var> markup, or give it a different name
•  Should variable scoping be like JavaScript or like C?
•  Restrict include objects to the top document level only?
•  Anonymous (unnamed) dynamic functions?
•  Can plain data objects contain functions?
•  String concatenation operator (+ or .)?
•  Auto-spacing between text elements and punctuation text
   o  ‘;’ delimiter implies word spacing: {"hi" i{"mom"}} vs {"hi"; i{"mom"}}
   o  ‘\s’ could force word spacing: {"hi" \s i{"mom"}}
   o  Must be able to handle adjacent markup elements: {"e" sup{"2"}}
•  Should on_event=expr attributes be permitted on display objects?
   o  If so, what would expr contain?
•  <pre> (preserve whitespace) markup elements using quoted text elements
•  Can break and continue statements specify a loop label?
•  How is eval() handled?
•  Need an ‘expr is type’ operator?

References

1. www.ECMAscript.org
2. es5.GitHub.io/#toc