Object Text Markup Language Specification
Version 0.1 (Draft)
8 Feb 2016
By: David R. Tribble
david@tribble.com
Copyright ©2013 by David R. Tribble, all rights reserved.
Permission is granted to anyone to reproduce, distribute, translate, or store this document in whole or in part.
Contents
Introduction
Lexical Elements
    Spaces and Newlines
    Characters
    Comments
    Punctuation and Operators
    Names
    Reserved Keywords
    Boolean Values
    String Values
    Character Escape Sequences
    Numeric Values
Syntactical Elements
    Objects
    Types
    Names
    Attributes
Document Objects
    Document object
    Head Object
    Include Object
    Style Object
    Body Object
    Markup Object
    Data Object
Functions
    Function Object
    Statements
    Variable Definition
    If-Else Statement
    For Statement
    Do-While Statement
    While Statement
    Switch Statement
    Break Statement
    Return Statement
    Try-Catch-Finally Statement
    Throw Statement
    Lock Statement
    Block Statement
    Expression Statement
Expressions
    Attribute Values
    Type Conversions
    Regular Expressions
    Object Management
    Name Scope
    Function Invocation
    Multithreading
Standard Library Types
    Object Type
    String Type
    Bool Type
    Int Type
    Float Type
    Date Type
    URL Type
    RegEx Type
    Array Type
    Event Type
    Thread Type
    Body Type
    Markup Type
    Request Type
    Response Type
    Function Type
Markup Types
Prior Art
    HTML
    XML
    CSS
    JavaScript
    JSON
    AJAX
    HTML Browsers
Examples
Unresolved Issues
    Lexical Issues
    Syntax Issues
References
Introduction
Object Text Markup Language (OTML) is a text-based format for encoding documents composed of
structured text and image elements. It is intended to be used as a standardized format for data
transmitted between client web browsers and web content servers.
OTML combines the capabilities of HyperText Markup Language (HTML), Extensible Markup Language
(XML), Cascading Style Sheets (CSS), JavaScript, and JavaScript Object Notation (JSON) into a single
unified syntax. This syntax is based primarily on JavaScript and JSON.
Whereas HTML, CSS, JavaScript, and JSON all have disparate and conflicting lexical structure and syntax,
OTML has a single unified syntax that applies to all of its definable entities. OTML is designed to embody
all of the capabilities and functionality of these languages, and to provide its own additional capabilities.
The primary components of OTML are the document object, request object, and response object. These
are the data elements that are transmitted between client web browsers and web servers.
Everything in OTML is an object. Primitive objects are simple text string values; non-primitive objects
contain sub-objects. Objects may be named. Objects may also have named attribute values attached to
them.
Lexical Elements
OTML documents are text documents composed of Unicode characters. The default encoding is UTF-8,
but other encodings are possible (e.g., UTF-16 or ISO 8859-1 Latin-1).
Documents are composed of lexical items, which fall into these categories:
- Comments and whitespace
- Punctuation and operators
- Names
- String values
- Numeric values
- Reserved keywords
Spaces and Newlines
Spaces (and other whitespace characters such as HT) are only significant if they appear within quoted
string values. In all other contexts they are ignored, and serve only to separate source tokens and to
improve readability for humans.
Newlines (CR and LF characters) are significant only when they appear within quoted string values or
when they terminate a ‘//’ comment. In all other contexts they are ignored, and serve only to separate
source tokens and to improve readability for humans.
The standard encoding of newlines (end-of-line delimiters) is a CR LF pair. OTML parsers, however,
must be able to read documents having newlines consisting of CR, LF, or CR LF pairs. Note that, outside
of quoted string values, newlines are only significant for recognizing the end of ‘//’ style comments,
and are not required in any other context.
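As a rough illustration of the reading rule above, the following Python sketch accepts CR, LF, or CR LF
as a newline and rewrites each one to the standard CR LF form. The function name is hypothetical and is
not part of OTML; for simplicity the sketch is applied to the whole source text.

```python
import re

# CR LF must be tried first so that a pair is not counted as two newlines.
_NEWLINE = re.compile(r"\r\n|\r|\n")

def to_standard_newlines(text: str) -> str:
    """Rewrite every CR, LF, or CR LF newline as a CR LF pair."""
    return _NEWLINE.sub("\r\n", text)

def split_lines(text: str) -> list:
    """Split text into lines, accepting any of the three newline forms."""
    return _NEWLINE.split(text)
```
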
Characters
Source documents are composed of printable Unicode characters. Unprintable characters (e.g., control
characters or undefined Unicode character codes) are treated as spaces. Implementations may choose
to issue warnings when they are encountered.
It is also possible for unprintable characters to appear within quoted string values. Such characters
should be coded using character escape sequences instead.
Comments
Comments are sequences of arbitrary text characters which are ignored, being syntactically equivalent
to spaces. There are two forms of comments:
// comment-text newline
/* comment-text-and-newlines */
In the first form, all characters following the ‘//’ marker are ignored up to the next newline (or up to the
end of the document or current include file).
In the second form, all characters following the ‘/*’ marker are ignored, including newlines, up to the
next ‘*/’ marker (or up to the end of the document or current include file). Comments do not nest.
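The two comment forms can be sketched as a small scanner. The Python below is only an illustration
under stated assumptions: the function name is hypothetical, each comment is replaced by one space
(since comments are syntactically equivalent to spaces), and comment markers inside quoted string
values are assumed to be ordinary string content.

```python
def strip_comments(source: str) -> str:
    """Replace each // and /* */ comment with a single space."""
    out = []
    i, n = 0, len(source)
    while i < n:
        ch = source[i]
        if ch in "'\"":
            # Copy a quoted string verbatim, stepping over backslash escapes.
            j = i + 1
            while j < n and source[j] != ch:
                j += 2 if source[j] == "\\" else 1
            j = min(j + 1, n)
            out.append(source[i:j])
            i = j
        elif source.startswith("//", i):
            # Line comment: ends at the next CR or LF, or at end of input.
            j = i
            while j < n and source[j] not in "\r\n":
                j += 1
            out.append(" ")
            i = j
        elif source.startswith("/*", i):
            # Block comment: ends at the first */ (comments do not nest).
            end = source.find("*/", i + 2)
            i = n if end < 0 else end + 2
            out.append(" ")
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

Because block comments do not nest, the text after the first ‘*/’ marker is treated as ordinary source.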
Punctuation and Operators
These tokens are composed of punctuation characters, and are used as delimiters and operators.
{    }    (    )    [    ]
+    -    *    /    &    |    ^    >>    >>>    <<
?    :    ;    ~
==   ===  !=   >    >=   <    <=
+=   -=   *=   /=   &=   |=   ^=   >>=   >>>=   <<=
.    ,    @    =
The character pairs ‘//’ and ‘/*’ are special, used to introduce comments.
Quote marks (single ' and double ") are special, used to introduce string values.
Backslash (\) is special within string values, used to specify character escape sequences.
Names
Names (a.k.a. identifiers) are used to uniquely label and identify objects within a given scope within a
document. Names are composed of alphabetic letters, digits, underscores (_), and US dollar signs ($).
Names cannot begin with a digit. Names can be any length, but are considered unique only up to the
first (leftmost) 63 characters, and the rest are ignored. Implementations may choose to issue warnings if
names longer than this are encountered.
Letter characters include the standard ASCII alphabetic characters (upper and lower case), as well as
Latin-1 alphabetic characters, and any other Unicode code point defined to be an “alphabetic
letter”. Digit characters include the standard ASCII decimal digits, as well as any Unicode code point
defined to be a “digit”. Any alphanumeric character outside the Latin-1 group can be specified using a
\uHHHH character escape sequence, where HHHH are hexadecimal digits, to represent the Unicode
character U+HHHH.
For example, fiché, grün, θ, Δt, π, Давид, and \u0391\u03B4\u03C1\u03B1 are all valid names. In the
interests of readability, however, it is recommended that only meaningful names be used, and that
names resembling operators or reserved keywords too closely be avoided. So for example, while för is a
syntactically valid name, a different name should probably be used instead to reduce confusion.
Names are not case-sensitive. This means that the names foo, FOO, and Foo all refer to the same object
within a given scope.
Names beginning with two leading underscores (e.g., __foo) or beginning with an underscore and a
dollar sign (e.g., _$foo) are reserved for private use by implementations, and should not be used in
normal client code.
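The naming rules above can be sketched in Python as follows. This is an illustration only: the function
names are hypothetical, Unicode general categories (L* for letters, Nd for digits) are assumed as the
definition of “alphabetic letter” and “digit”, and str.casefold() is assumed as the case-insensitive
comparison. The sketch does not check for reserved keywords.

```python
import unicodedata

MAX_SIGNIFICANT = 63  # only the first 63 characters distinguish names

def _is_letter(ch: str) -> bool:
    # Assumption: any Unicode letter category (Lu, Ll, Lt, Lm, Lo) counts.
    return unicodedata.category(ch).startswith("L")

def _is_digit(ch: str) -> bool:
    # Assumption: Unicode decimal digits (category Nd) count as digits.
    return unicodedata.category(ch) == "Nd"

def is_valid_name(name: str) -> bool:
    """Letters, digits, '_' and '$' only; must not begin with a digit."""
    if not name or _is_digit(name[0]):
        return False
    return all(_is_letter(c) or _is_digit(c) or c in "_$" for c in name)

def canonical_name(name: str) -> str:
    """Key used to compare names: truncated to 63 characters, case-folded."""
    return name[:MAX_SIGNIFICANT].casefold()

def is_implementation_reserved(name: str) -> bool:
    """Names starting with '__' or '_$' are reserved for implementations."""
    return name.startswith(("__", "_$"))
```

Two names then refer to the same object within a scope when their canonical_name() keys are equal.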
Every object has an implicit attribute named ‘id’ that contains the name of the object (which is null for
unnamed objects).
Categories of objects can be given names of a certain format in order to aid in readability. For example,
it is traditional in C-like languages to name constants (i.e., initialized variables whose values are never
modified) using all-uppercase letters and digits (e.g., PI, MAX_LENGTH, RED). Variables are traditionally
named using all lowercase letters (e.g., i, count, len, user_name) or using camel-case names (e.g.,
userID, custName, cityState). Functions are traditionally named using all lowercase letters (e.g., find,
get_value, compute_sum) or using camel-case names (e.g., either getValue or GetValue).
Reserved Keywords
While these tokens are lexically equivalent to names, they are reserved as keywords, and cannot be
used as names.
and       bool      break     case      catch     continue
data      date      default   defined   delete    do
else      event     false     finally   float     for
function  if        int       lock      new       not
null      object    or        return    string    switch
this      throw     true      try       void      while
It is recommended that the use of names resembling keywords (e.g., för, ìf) be avoided in the interests
of readability.
Boolean Values
Boolean (or logical) values are either true or false, and have type bool.
String Values
String values are sequences of zero or more Unicode characters. String literals are specified as Unicode
characters enclosed within quotes. Either matching single quote marks (') or double quote marks (")
may be used.
String literals should be composed of only printable Unicode characters. Unprintable characters,
including control codes, should be encoded using character escape sequences.
Some example string values:
""
'Hello, world.'
'123'
"He said, \"I cannot tell a lie.\""
"Pi (written as \&pi;) is approx. 3.1416."
'The set {0,\_1,\_2} contains three elements.\r\n'
"an\-ti\-dis\-es\-tab\-lish\-men\-tar\-i\-an\-ism"
Character Escape Sequences
Unprintable Unicode characters can be specified within string values using character escape sequences.
Such a sequence is indicated by a backslash (\) followed by one or more characters that taken together
specify a single Unicode character code.
A ‘\u’ followed by exactly four hexadecimal characters (upper or lower case) specifies a single Unicode
character. Characters from ‘\u0020’ to ‘\u007E’ are printable ASCII characters. Implementations may
choose to issue warnings if malformed or illegal Unicode character escape sequences are encountered.
A ‘\&’ followed by a name followed by a semicolon (;) specifies a named character entity, corresponding
to its equivalent XML/HTML character entity. For example, ‘\&amp;’ is the same character as XML entity
‘&amp;’, which is the ampersand character (&). The HTML 4 standard defines 252 character entities.
A ‘\’ followed by one of the specific characters listed below specifies a certain character code. A ‘\’
followed by a character that is not in the list is ignored. Implementations may choose to issue warnings
if malformed or undefined sequences are encountered.
Sequence   Character   Description            Unicode            HTML
\\         \           Backslash              U+005C             &#x005C;
\cX        Ctrl-X      Control character      U+0000 – U+001F
\N         CR LF       Newline                U+000D U+000A      &#x000D;&#x000A;
\n         LF          Line feed              U+000A             &#x000A;
\r         CR          Carriage return        U+000D             &#x000D;
\s         SP          Space                  U+0020             &#x0020;
\t         HT          Horizontal tab         U+0009             &#x0009;
\uHHHH     U+HHHH      Unicode character      U+HHHH             &#xHHHH;
\z         ZWSP        Zero-width space       U+200B             &#x200B;
\'         '           Apostrophe             U+0027             &#x0027;
\"         "           Quote                  U+0022             &#x0022;
\-         SHY         Soft hyphen            U+00AD             &#x00AD;
\_         NBSP        Non-breaking space     U+00A0             &#x00A0;
\^         ^           Caret                  U+005E             &#x005E;
\$         $           US dollar sign         U+0024             &#x0024;
\.         .           Period (full stop)     U+002E             &#x002E;
\?         ?           Question mark          U+003F             &#x003F;
\*         *           Asterisk               U+002A             &#x002A;
\+         +           Plus sign              U+002B             &#x002B;
\[         [           Left square bracket    U+005B             &#x005B;
\]         ]           Right square bracket   U+005D             &#x005D;
\(         (           Left parenthesis       U+0028             &#x0028;
\)         )           Right parenthesis      U+0029             &#x0029;
\{         {           Left curly brace       U+007B             &#x007B;
\}         }           Right curly brace      U+007D             &#x007D;
\|         |           Vertical bar           U+007C             &#x007C;
\&xxx;                 Character entity xxx                      &xxx;
Most characters can be specified in more than one way. For example, Latin-1 capital A can be coded as
'A' or '\u0041'. A non-breaking space can be coded as '\_', '\u00A0', or '\&nbsp;'.
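A decoder for these escape sequences might look like the Python sketch below. It is illustrative only
and rests on several assumptions that the text does not settle: the function name is hypothetical;
‘\cX’ is mapped by masking the code of X to its low five bits; named entities are looked up in
Python's table of HTML 4 entity names; and for a backslash followed by a character that is not in the
table, the backslash is dropped and the character is kept.

```python
import re
from html.entities import name2codepoint  # HTML 4 named character entities

# Single-character escapes taken from the table of escape sequences.
SIMPLE = {
    "\\": "\\", "N": "\r\n", "n": "\n", "r": "\r", "s": " ", "t": "\t",
    "z": "\u200b", "'": "'", '"': '"', "-": "\u00ad", "_": "\u00a0",
}
# Punctuation escapes stand for the punctuation character itself.
SIMPLE.update({c: c for c in "^$.?*+[](){}|"})

# Longer forms are listed first so they win over the one-character fallback.
_ESCAPE = re.compile(r"\\(u[0-9A-Fa-f]{4}|&\w+;|c.|.)", re.DOTALL)

def decode_escapes(body: str) -> str:
    """Decode the escape sequences in the body of a string literal."""
    def repl(match):
        esc = match.group(1)
        if esc[0] == "u" and len(esc) == 5:
            return chr(int(esc[1:], 16))
        if esc[0] == "&" and len(esc) > 2:
            code = name2codepoint.get(esc[1:-1])
            # Unknown entity names are left unchanged (an assumption).
            return chr(code) if code is not None else match.group(0)
        if esc[0] == "c" and len(esc) == 2:
            # Assumption: Ctrl-X is the code of X masked to U+0000..U+001F.
            return chr(ord(esc[1]) & 0x1F)
        # Assumption: for an unlisted character, only the backslash is dropped.
        return SIMPLE.get(esc, esc)
    return _ESCAPE.sub(repl, body)
```

For instance, under these assumptions the three spellings of a non-breaking space given above all
decode to the single character U+00A0.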
Numeric Values
Numeric values can be specified as integers and as floating-point numbers.
Decimal integer numbers are composed of decimal digits, and do not begin with the digit ‘0’. Octal
integers are prefixed with a ‘0’, hexadecimal integers are prefixed with a ‘0x’, and binary integers are
prefixed with a ‘0b’. Integers are 64-bit signed binary values, spanning the range from
−9_223_372_036_854_775_808 to +9_223_372_036_854_775_807 (or −0x8000_0000_0000_0000 to
+0x7FFF_FFFF_FFFF_FFFF).
Floating-point numbers are composed of decimal digits and must contain either a decimal point (.) or an
exponent suffix. They are 64-bit IEEE double-precision binary floating-point values, whose normalized
magnitudes span the range from approximately ±2.225073858507201e-308 to ±1.797693134862315e+308.
Infinity (±INF) and Not-A-Number (NaN) values are also supported (see the float type in the standard
library).
Numbers may contain underscores (_) between digits to improve readability.
number:
    unsigned-number

unsigned-number:
    integer
    real

integer:
    decimal-integer
    0 octal-integer
    0 {x | X} hexadecimal-integer
    0 {b | B} binary-integer

decimal-integer:
    0
    decimal-number
    decimal-number _ decimal-number

decimal-number:
    digit
    decimal-number digit

octal-integer:
    octal-number
    octal-number _ octal-number

octal-number:
    octal-digit
    octal-number octal-digit

hexadecimal-integer:
    hexadecimal-number
    hexadecimal-number _ hexadecimal-number

hexadecimal-number:
    hexadecimal-digit
    hexadecimal-number hexadecimal-digit

binary-integer:
    binary-number
    binary-number _ binary-number

binary-number:
    0 | 1
    binary-number {0 | 1}

real:
    fractional-real
    fractional-real exponent

fractional-real:
    . decimal-integer
    decimal-integer .
    decimal-integer . decimal-integer

exponent:
    {e | E} [+ | -] decimal-integer
Some example numeric values:
0                                  // int, decimal
65535                              // int, decimal
5_000_000                          // int, decimal
0177_777                           // int, octal
0x100B                             // int, hexadecimal
0xFFFF_FFFF                        // int, hexadecimal
0b01101000                         // int, binary
0b0100_0010_0100_0010              // int, binary
1.066e-23                          // float
.750                               // float
166.6667                           // float
3.14159_26535_89793_23846_26433    // float
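The numeric literal forms described in this section can be recognized with a few regular expressions.
The Python sketch below is an illustration under stated assumptions, not a reference implementation:
the function name is hypothetical, any number of single underscores is accepted between digits (as in
the examples), and a digit string with only an exponent (such as 1e5) is accepted as a float, following
the prose description.

```python
import re

INT_MAX = 2**63 - 1          # largest value of a 64-bit signed integer
_D = r"[0-9]+(?:_[0-9]+)*"   # decimal digits with single '_' separators

# Each integer form: (pattern capturing the digits, numeric base).
_INT_FORMS = [
    (re.compile(r"0[xX]([0-9A-Fa-f]+(?:_[0-9A-Fa-f]+)*)"), 16),
    (re.compile(r"0[bB]([01]+(?:_[01]+)*)"), 2),
    (re.compile(r"0([0-7]+(?:_[0-7]+)*)"), 8),
    (re.compile(r"(0|[1-9][0-9]*(?:_[0-9]+)*)"), 10),
]
# A float needs a decimal point or an exponent suffix.
_REAL = re.compile(
    rf"(?:{_D}\.(?:{_D})?|\.{_D})(?:[eE][+-]?{_D})?|{_D}[eE][+-]?{_D}"
)

def parse_number(text: str):
    """Return an int or float for an unsigned numeric literal."""
    for pattern, base in _INT_FORMS:
        match = pattern.fullmatch(text)
        if match:
            value = int(match.group(1).replace("_", ""), base)
            if value > INT_MAX:
                raise ValueError("integer out of 64-bit range: " + text)
            return value
    if _REAL.fullmatch(text):
        return float(text.replace("_", ""))
    raise ValueError("not a numeric literal: " + text)
```
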
Syntactical Elements
Objects
All objects within a document are defined using either of two syntactical forms, a primitive object
definition or a (non-primitive) object definition.
object-definition:
    primitive-object
    non-primitive-object
    array-object

primitive-object:
    type [name] = value ;

non-primitive-object:
    type [name] = [attribute-def…] { [object-member…] } [: name]

array-object:
    type [name] = [attribute-def…] [ [array-member…] ] [: name]

attribute-def:
    attribute-name = value

attribute-name:
    name
    attribute-name . name

object-member:
    object-definition
    ;
The general syntax for a primitive object definition is:
type [name] = value ;
The general syntax for a non-primitive object definition is:
type [name]
    [ attrib = value ]…
{
    […object-members…]
} [: name]
Function objects have a slightly more complicated syntax.
Objects may be named, which allows them to be uniquely identified within a given scope.
All objects have a type. String values have an implicit type of string. A simple object has the type it is
defined as, or without an explicit type it has the type of the value assigned to it.
Non-simple objects have the type specified by their type tag. Members of an object definition are
themselves object definitions.
Types
The type tag of an object definition specifies the type of the object, whether it is a markup element,
data object, or variable.
Markup elements have types (such as ‘p’ for ‘paragraph’ and ‘br’ for ‘line break’) corresponding to HTML
markup element tags.
Names
Objects may be named. The scope of an object name is the parent object in which it is defined. Names
defined within the same scope must be unique. A name defined within a sub-object may hide other
objects with the same name defined in parent objects of the sub-object.
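The scoping rule can be pictured as a chain of name tables, one per object, each linked to its parent.
The Python sketch below is illustrative only; the Scope class and its methods are hypothetical and are
not part of OTML. It assumes that lookup proceeds from the innermost scope outward, and it compares
names without regard to case, as the lexical rules for names require.

```python
class Scope:
    """Name table for one object, linked to the scope of its parent."""

    def __init__(self, parent=None):
        self.parent = parent
        self.names = {}

    def define(self, name, obj):
        # Names defined within the same scope must be unique.
        key = name.casefold()
        if key in self.names:
            raise ValueError("duplicate name in scope: " + name)
        self.names[key] = obj

    def lookup(self, name):
        # Assumption: search the innermost scope first, then each parent,
        # so a name in a sub-object hides the same name in its parents.
        key = name.casefold()
        scope = self
        while scope is not None:
            if key in scope.names:
                return scope.names[key]
            scope = scope.parent
        raise KeyError(name)
```

A name defined in a child scope is found before the same name in its parent, while the parent's own
lookup is unaffected.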
Attributes
Objects may have named attribute values (also known as properties) attached to them. Attribute values
have a restricted syntax rather than the full expression syntax, although a parenthesized expression is a
syntactically valid value. In practice, most attributes are assigned simple string values or names.
Attribute names are lexically valid names. More complex names can be defined by combining two or
more names separated by dots (.), for example style.color.
[…]
Document Objects
An OTML document is a single object of type document. This object may contain other display objects,
functions, variables, and data objects.
Document object
An OTML document is itself a single object definition, of type document.
document-object:
    document [document-attribute…] { [document-member…] } [;]

document-attribute:
    format = string
    encoding = string

document-member:
    head-object
    include-object
    style-object
    body-object
    data-object
    function-object
    ;
A document has a (required) format attribute assigned the value "1.0", indicating that it conforms to
version 1.0 of the OTML syntax specification.
By default, OTML documents are encoded with UTF-8 character encoding. An optional encoding
attribute may be supplied which specifies a different character encoding. Implementations should be
capable of recognizing the first few octets of a document stream that signal UTF-8 and UTF-16
encodings, as well as recognizing byte order mark (BOM) sequences.
In general, an OTML document may begin with the octets for encoding a BOM, a space character (SP), a
tab character (HT), a newline character (CR or LF), a ‘/’ character (the initial character of a comment), or
a ‘d’ character (the initial character of ‘document’).
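Because the first character of a document is so constrained, the encoding can often be guessed from
the first few octets. The Python sketch below illustrates one possible approach; the function name and
the returned labels are hypothetical, only UTF-8 and UTF-16 are considered, and UTF-8 is returned as
the default when nothing else matches.

```python
import codecs

# Octets of the characters that may begin a document: SP, HT, CR, LF, '/', 'd'.
_FIRST = b" \t\r\n/d"

def sniff_encoding(head: bytes):
    """Return (encoding name, BOM length) guessed from the leading octets."""
    if head.startswith(codecs.BOM_UTF8):
        return "utf-8", len(codecs.BOM_UTF8)
    if head.startswith(codecs.BOM_UTF16_BE):
        return "utf-16-be", len(codecs.BOM_UTF16_BE)
    if head.startswith(codecs.BOM_UTF16_LE):
        return "utf-16-le", len(codecs.BOM_UTF16_LE)
    # No BOM: in UTF-16 an initial ASCII character is paired with a zero octet.
    if len(head) >= 2:
        if head[0] == 0 and head[1] in _FIRST:
            return "utf-16-be", 0
        if head[1] == 0 and head[0] in _FIRST:
            return "utf-16-le", 0
    return "utf-8", 0  # the default encoding
```

An explicit encoding attribute, once parsed, would presumably take precedence over such a guess.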
Head Object
head-object:
    base href = value ;
    link href = value rel = value ;
    meta name = value ;
    title = value ;
[…]…http-equiv?
Include Object
include-object:
    include [include-attrib = value]… ;

include-attrib:
    href
[…]
Style Object
style-object:
    style [name] [style-attrib…] { [style-def…] }

style-attrib:
    type = value
    scope = value
    inherit = value
    attribute-def

style-def:
    name = expr ;
    ;
Style objects define the styling attributes for named style classes and for element types.
[…]…Each style-def is composed of…
This example defines a style object for a style class named bold_red_para:
style bold_red_para
{
    color = 'red';
    font_weight = 'bold';
}
This style class can be used to specify that a text element is to be displayed with a red foreground color
and in bold font:
div class=bold_red_para { "Now is the winter of our discontent." }
Body Object
body-object:
    body [type] name [attribute-def…] { [body-member…] }

body-member:
    style-object
    markup-object
    data-object
    function-object
    ;
A document body contains zero or more text or markup elements. Browsers are obliged to render and
display the visible text and markup elements of a body object. A body object may also contain style
objects, function objects, and data objects.
[…]
Markup Object
markup-object:
    primitive-markup-object
    non-primitive-markup-object

primitive-markup-object:
    value

non-primitive-markup-object:
    type [name] [markup-attribute = value]… { [markup-member…] }
    { [markup-member…] }

markup-attribute:
    style . name
    enabled
    remain_active
    html-attribute-name

markup-member:
    style-object
    markup-object
    data-object
    function-object
    ;
A markup object contains displayable text and graphic elements. Browsers are obliged to display the
visible elements. The displayable content of a document is composed entirely of the markup objects
contained within it.
Markup objects may contain other markup objects, style objects, function objects, and data objects.
Markup objects may be named, to uniquely identify them within their parent object scope.
The supported markup types are listed in the appendix.
By default, a markup object becomes disabled once its parent body executes a submit action. That is, a
markup object can no longer accept or respond to GUI user actions such as mouse clicks or keyboard
keystrokes after the user has initiated a document action that sends a response to the server. This is
done to prevent users from clicking buttons and other elements while the browser is waiting for a
response from the server. However, this disabling can be overridden by defining the markup element
with a ‘remain_active=true’ attribute.
[…]
Rendering Display Elements
[…]
[…]…spaces, newlines, preserving whitespace, etc. …
For example, the following span markup object named ‘preamble’ contains several member markup
objects, most of them primitive text strings:
span preamble
{
    "When in the course of human events "
    "it becomes"; i{"necessary"}; "for one people";
    "to dissolve the " b{"bonds"}
}
This can also be coded using style objects in place of markup objects:
span preamble
{
"When in the course of human events "
"it becomes";
span style.font_style='italic' {"necessary"};
"for one people";
"to dissolve the "
span style.font_style='bold' {"bonds"}
}
This is intended to be displayed by browsers as something like this:
When in the course of human events it becomes necessary for
one people to dissolve the bonds
Data Object
A data object contains one or more data members. These members can be simple named values or
other data objects. Data objects are not displayed by browsers, but provide values to other functions
and objects within the document.
Data objects can occur within any document, body, markup, or function object.
data-object:
primitive-data-obj
non-primitive-data-obj
array-data-obj
primitive-data-obj:
data primitive-data-def
primitive-data-def:
[type] name = expr ;
non-primitive-data-obj:
data non-primitive-data-def
non-primitive-data-def:
name [attribute-def…] { [data-member…] }
array-data-obj:
data array-data-def
array-data-def:
[type] name [attribute-def…] [ [array-member…] ]
array-member:
[[ expr ] =] data-def ;
data-def:
primitive-data-def
non-primitive-data-def
array-data-def
Primitive data objects are essentially named variables with initializers:
data [type] name = expr ;
Non-primitive data objects have a brace-enclosed body containing member data elements:
data name
[attrib = value]…
{
[data-members…]
}
Data arrays are defined in a similar fashion, but their member elements are enclosed within square
brackets ([]).
data [type] name
[attrib = value]…
[
[[ expr ] = ] array-member ;
…
]
The optional ‘[expr] =’ preceding an array member specifies the subscript of a particular element
within the array to be initialized. If this is not specified, then the next element within the array following
the previous array member is assumed.
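The subscript rule above can be modelled with a short Python sketch. This is only an illustration under stated assumptions: `build_array` and its `(subscript, value)` pair representation are hypothetical, subscripts are assumed to be integers, and the first unsubscripted member is assumed to land at index 0 (consistent with zero-based array indices).

```python
def build_array(members):
    """Place each (subscript, value) pair into an array model.

    A subscript of None models a member written without '[expr] ='.
    """
    elements = {}
    next_index = 0  # assumption: the first unsubscripted member gets index 0
    for subscript, value in members:
        index = next_index if subscript is None else subscript
        elements[index] = value
        # An unsubscripted member follows the previously initialized member.
        next_index = index + 1
    return elements


# Two plain members, one member with an explicit subscript of 5, one more plain member.
primes = build_array([(None, 2), (None, 3), (5, 13), (None, 17)])
print(primes)  # {0: 2, 1: 3, 5: 13, 6: 17}
```

Note how the member following the explicitly subscripted one continues at index 6 rather than at index 2.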
[…]
[…]…Examples…
Functions
Function Object
function-object:
function [type] {name | new} ( [param-defs] )
[function-attrib = value]…
block-statement
function-attrib:
event
param-defs:
param-def
param-defs , param-def
param-def:
[type] name
A function contains executable code (statements). A function is defined using the same syntax as any
other object, with the addition of the leading function keyword and a parenthesized parameter list.
Every function has a name, which is either a user-defined name or new. Functions with the name new are
constructors and are used to initialize new objects of their parent object type.
Every function has a parent object, which is the object in which it is defined. The parent object can be
accessed by statements within the function with the this keyword. The outermost document object can
be accessed with the global document variable.
Function definitions can contain other function definitions, i.e., functions can be nested.
All functions have a return type. By default, a function without an explicitly defined return type returns
the type object.
All functions return a value. By default, a function that terminates without executing an explicit return
statement returns the value true.
Executing the last statement in a function body, i.e., “falling out of” the block, is equivalent to returning
a value of true.
A function may be defined to be an event handler by specifying an event attribute in its definition. The
attribute is set to one of the possible event types. Any such events initiated by the user on the parent
object containing the function cause the function to be invoked, which is passed an event object that
contains information about the event.
Statements
The following are the executable statements that can be coded within a function.
statement:
variable-definition
if-else-statement
for-statement
do-while-statement
while-statement
switch-statement
case-statement
break-statement
return-statement
try-catch-finally-statement
throw-statement
lock-statement
block-statement
expr-statement
;
Variable Definition
type name [ = expr ] ;
A function may contain one or more local variables. These are named values that can be modified by the
function statements. A variable always has a name, a type, and a value. An explicit type is required, to
syntactically distinguish variable definitions from assignment statements.
Variables may be initialized with an explicit expression value, which is evaluated at the point that the
variable definition statement is executed and converted into the type of the variable. If no initializer
expression is specified, the variable is initialized to a default value, which is 0, 0.0, false, or null,
based on the variable type.
Variables have block scope: a variable is visible only within the block in which it is defined and within
any sub-blocks of that block. A variable definition within a sub-block may hide a variable of the same
name defined in an outer block, although implementations may choose to issue a warning when this
occurs.
If-Else Statement
if-else-statement:
if ( expr ) statement1
if ( expr ) statement1 else statement2
This is the basic conditional control-flow statement. The control expression is evaluated and converted
into a boolean value. If the value is true, then statement1 is executed, otherwise statement2 (if present)
is executed.
For Statement
for-statement:
for ( [expr-list1] ; [expr-list2] ; [expr-list3] ) statement1
This is a looping control-flow statement. Expression-list1 is evaluated (if it is present), which typically
contains pre-loop initializations. The loop then executes, first by evaluating the controlling
expression-list2 (if it is present) and converting it into a boolean value. If expression-list2 is not present, it
is implicitly assumed to be true. If the result is false, the loop terminates and execution continues with
the statement following the for-loop statement. Otherwise the result is true, and the next iteration of
the loop body (statement1) is executed. This statement is typically a block statement, but is not required
to be so. After the body statement completes, expression-list3 is executed (if it is present), which
typically contains post-loop increment expressions. Then the execution flows to the top of the next
iteration of the loop, evaluating expression-list2 again. Execution of the loop body continues until
expression-list2 evaluates to false.
Note that if the controlling expression-list2 always evaluates to true, the loop is an infinite loop. A
for-loop with no controlling expression is equivalent to an infinite loop.
Executing a break statement within a loop body causes looping to terminate.
Executing a continue statement within the loop body causes the current iteration of the loop body to
end and expression-list3 to be evaluated; expression-list2 is then tested again to determine whether the
next iteration of the loop occurs.
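The evaluation order of the for statement can be sketched as a small Python model. The helper `run_for` and its convention of a body that returns "continue", "break", or None are hypothetical, chosen only to make the control flow explicit. The sketch assumes all three expression lists are present.

```python
def run_for(init, cond, step, body):
    """Model of: for (init; cond; step) body."""
    init()                      # expression-list1: evaluated once
    while cond():               # expression-list2: tested before every iteration
        signal = body()
        if signal == "break":
            break               # leaves the loop without evaluating the step
        # Normal completion and 'continue' both evaluate expression-list3.
        step()


state = {"i": 0, "seen": []}


def body():
    if state["i"] == 2:
        return "continue"       # skip the rest of this iteration
    if state["i"] == 4:
        return "break"
    state["seen"].append(state["i"])


run_for(lambda: state.update(i=0),
        lambda: state["i"] < 10,
        lambda: state.update(i=state["i"] + 1),
        body)
print(state["seen"])  # [0, 1, 3]
```

The value 2 is skipped by the continue, yet the step expression still runs, so the loop advances to 3 instead of repeating forever.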
Do-While Statement
do-while-statement:
do statement1 while ( expr-list ) ;
This is a looping control-flow statement. With each loop iteration, statement1 is executed, and then the
controlling expression-list is evaluated and converted into a boolean value. If the result is false, the
loop statement terminates and execution continues with the statement following the loop. Otherwise
the result is true, and the next iteration of the loop is performed. Looping continues until the
expression-list evaluates to false.
Note that at least one iteration of the loop is always performed, prior to testing the (post-loop)
conditional expression.
Note also that if the controlling expression-list always evaluates to true, the do-while statement is an
infinite loop.
Executing a break statement within the body of the loop causes the looping to terminate.
Executing a continue statement within the body of the loop causes the current iteration to end and the
controlling expression-list to be tested, which determines whether the next iteration of the loop occurs.
While Statement
while-statement:
while ( expr-list ) statement
[…]
Switch Statement
switch-statement:
switch ( expr ) statement
case-statement:
case expr : statement…
default : statement…
This is a selection statement, which evaluates the switch expression, then looks for a matching case
expression within the body statement. If a match is found, the statements following the matching case
expression are executed. If no matching expression is found, the statements associated with the
default label (if one is present) are executed. If no default statement is present, the switch
statement ends, and execution continues with the next statement following it.
When statements following a given case or default label are executed, they continue being executed
until a break statement is encountered or until the end of the switch statement is reached. In other
words, execution “falls through” from each of the switch body statements to the next statement unless a
break statement is encountered.
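The fall-through behavior can be illustrated with a Python model of the selection logic. The names `run_switch` and `DEFAULT`, and the convention that an action returns True to signal a break statement, are hypothetical and exist only for this sketch.

```python
DEFAULT = object()  # hypothetical marker standing in for the 'default' label


def run_switch(value, cases):
    """cases: ordered list of (label, action) pairs.

    Each action returns True if it ends with a break statement.
    """
    labels = [label for label, _ in cases]
    if value in labels:
        start = labels.index(value)
    elif DEFAULT in labels:
        start = labels.index(DEFAULT)
    else:
        return  # no matching case and no default: the switch simply ends
    # Execute from the matching label onward, falling through until a break.
    for _, action in cases[start:]:
        if action():
            break


log = []
cases = [
    (1, lambda: log.append("one")),                  # no break: falls through
    (2, lambda: log.append("two") or True),          # ends with break
    (DEFAULT, lambda: log.append("other") or True),  # ends with break
]
run_switch(1, cases)
print(log)  # ['one', 'two']
```

Because the first case has no break, execution continues into the second case before leaving the switch.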
Break Statement
break-statement:
break ;
continue ;
A continue statement can only appear within the body of a looping statement (do-while, while, or
for loop). A break statement can appear within a looping statement or within a switch statement
body.
Within the body of a looping statement, executing a break statement causes the looping statement to
end and the next statement following the loop to be executed.
Executing a continue statement within the body of a looping statement causes the current iteration of
the loop to end and forces the next iteration to begin.
Within the body of a switch statement, executing a break statement causes execution control to flow
out of the switch statement body, and for the next statement following the switch to be executed. In
other words, this defeats the “fall through” behavior of the statements within the switch body.
Return Statement
return-statement:
return [expr] ;
Executing a return statement causes the execution of its parent function to end, and for a value to be
returned to the caller of the function. If an expression is specified, it is evaluated and converted into the
return type of the function, and that converted value is returned from the function. If no expression is
specified, an implicit value of true, 1, or 1.0 is returned, depending on the function’s return type.
Try-Catch-Finally Statement
try-catch-finally-statement:
try statement1
[catch ( [type] name ) statement2]…
[finally statement3]
[…]
Throw Statement
throw-statement:
throw expr ;
Executing a throw statement causes an exception to be raised.
[…]
Lock Statement
lock-statement:
lock ( expr ) statement
The controlling expression designates an object to lock. This must be a named object l-value or an object
that is accessible by functions in other threads; it cannot be a temporary object (e.g., the result of an
arithmetic expression).
After the expression is evaluated, the current thread attempts to acquire an exclusive lock on the
resulting object. If a lock can be acquired, the thread locks it and then proceeds to execute the
statement. If a lock cannot be acquired, another thread has a lock on the object, so the current thread
blocks until the other thread releases its lock on the object, at which time the current thread again
attempts to acquire a lock on the object.
Locks are acquired in a non-deterministic way, being assigned arbitrarily to one of the threads
requesting a lock on a given object. While the thread is blocked, it cannot execute any other statements.
A lock statement typically contains a call to the locked object’s wait() or notify() functions. These
functions operate in concert to synchronize the execution of separate but cooperating threads.
Implementations may impose a limit on the number of simultaneously active locks, and may choose to
throw a run-time exception when that limit is exceeded.
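As a rough analogy only, the blocking behavior of the lock statement resembles a mutual-exclusion lock in other languages. The Python sketch below uses a separate `threading.Lock` object, whereas the lock statement described above locks the designated object itself. The names `counter`, `counter_lock`, and `add_many` are hypothetical.

```python
import threading

counter = {"value": 0}
counter_lock = threading.Lock()  # stands in for locking the 'counter' object


def add_many(times):
    for _ in range(times):
        # Plays the role of: lock (counter) statement
        # A thread that cannot acquire the lock blocks here until it is released.
        with counter_lock:
            counter["value"] += 1


threads = [threading.Thread(target=add_many, args=(10000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter["value"])  # 40000
```

The order in which the four threads obtain the lock is not predictable, but the final total is, because only one thread at a time executes the guarded statement.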
Block Statement
block-statement:
{ [statement…] }
A block statement (or statement block) is a sequence of zero or more statements enclosed within braces
({ }). A new scope exists within the block. Variables and objects defined within the block exist as long as
the block is active (being executed), and they are no longer accessible when the execution of the block
ends. These variables and objects are not visible outside of the block. However, variables and objects
defined outside the block but contained within the same outer block are visible within the block.
[…]…variable hiding…
Empty blocks (i.e., blocks containing no statements) are equivalent to an empty statement, and do not
perform any operations.
Some examples of block statements:
// Outer block
{
    string name = "Howard";
    string species = "duck";
    // Inner block
    {
        string name2 = name;               // Initialized to "Howard"
        string species = "billionaire";    // Hides outer ‘species’ var
    }
}
Expression Statement
expr-statement:
expr ;
An expression statement specifies an expression to be executed. The expression is executed generally
for its side effects, which may include modifying variables and objects or calling functions.
Implementations may choose to issue a warning if an expression statement has no side effects.
[…]
Expressions
Expressions are the basic building block of function code.
expr:
query-expr
expr-list:
expr
expr-list , expr
Expression lists are sequences of one or more separate expressions, and are evaluated from left to right.
The result and type of the expression list is the value and type of the last (rightmost) expression in the
list.
query-expr:
assignment-expr
or-expr ? query-expr : query-expr
Query expressions (also called ternary expressions or if-then-else expressions) are evaluated by first
evaluating the first expression (preceding the ‘?’ operator), also known as the controlling expression,
and the result is converted into a bool value. If the value is equal to true, the second expression
(following the ‘?’ operator) is evaluated and becomes the result of the entire expression; otherwise the
value is false, and the third expression (following the ‘:’ operator) is evaluated and becomes the result
of the entire expression. Note that only one of the second and third expressions is evaluated following
the evaluation of the first controlling expression. The result of either expression is converted to a
common type (which will be either the type of the second expression or the type of the third
expression), and this is the type of the entire expression.
assignment-expr:
lvalue assignment-op query-expr
// FIXME
lvalue:
primary-expr
// FIXME
assignment-op:
=
*=
/=
%=
+=
−=
>>=
>>>=
<<=
&=
^=
|=
Assignment expressions have the side effect of assigning a new value to the l-value expression (the
expression preceding the assignment operator). The value is the result of evaluating the expression
following the assignment operator, and this result is converted into the type of the l-value, and then the
value of the l-value is changed to the resulting converted value. The l-value expression must designate a
modifiable variable or object which is accessible within the scope of the block containing the expression.
The compound assignment operators apply a binary operator to the current value of the l-value and the
right operand before assigning the resulting value to the l-value. For instance, ‘x *= y’ is semantically
equivalent to ‘x = x * y’, except that ‘x’ is evaluated only once.
or-expr:
and-expr
or-expr or and-expr
and-expr:
not-expr
and-expr and not-expr
not-expr:
rel-expr
not not-expr
Logical expressions result in a bool value (true or false).
An or-expression is evaluated by first evaluating the expression to the left of the ‘or’ operator and
converting the result into a bool value. If the result is true, the entire expression evaluates to true;
otherwise, the expression to the right of the ‘or’ operator is evaluated and the result is converted into a
bool value, which becomes the result of the entire expression. Note that if the left expression is true,
the right expression is not evaluated at all (this is known as short-circuit evaluation).
An and-expression is evaluated by first evaluating the expression to the left of the ‘and’ operator and
converting the result into a bool value. If the result is false, the entire expression evaluates to false;
otherwise, the expression to the right of the ‘and’ operator is evaluated and the result is converted into
a bool value, which becomes the result of the entire expression. Note that if the left expression is
false, the right expression is not evaluated at all (this is known as short-circuit evaluation).
A not-expression is evaluated by evaluating the expression following the ‘not’ operator and converting
the result into a bool value. The result of the entire expression is then the logical complement of that
value (i.e., if the value is true, the expression evaluates to false, and vice versa). Note that the
expression ‘not x’ is similar, but not exactly identical, to the expressions ‘x == false’ and ‘x != true’.
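Short-circuit evaluation can be sketched in Python by passing each operand as a deferred (zero-argument) callable, so that the model decides whether the right operand is evaluated at all. The helpers `or_expr`, `and_expr`, and `operand` are hypothetical, and Python's built-in `bool()` is used here only as a stand-in for the conversion to a bool value.

```python
def or_expr(left, right):
    """Model of 'left or right'; operands are zero-argument callables."""
    if bool(left()):
        return True           # the right operand is never evaluated
    return bool(right())


def and_expr(left, right):
    """Model of 'left and right'; operands are zero-argument callables."""
    if not bool(left()):
        return False          # the right operand is never evaluated
    return bool(right())


calls = []


def operand(name, value):
    """Build a deferred operand that records when it is evaluated."""
    def thunk():
        calls.append(name)
        return value
    return thunk


result = or_expr(operand("a", 1), operand("b", 0))
print(result, calls)  # True ['a']
```

Only operand "a" appears in the evaluation record, showing that the right operand was skipped.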
rel-expr:
bit-or-expr
rel-expr == bit-or-expr
rel-expr === bit-or-expr
rel-expr != bit-or-expr
rel-expr > bit-or-expr
rel-expr >= bit-or-expr
rel-expr < bit-or-expr
rel-expr <= bit-or-expr
A relational expression involves a relational operator (also called a comparison operator) which specifies
how the left and right expressions are to be compared to one another. The left expression is evaluated
first, then the right expression is evaluated. The two results are then converted to a common type
(which will be either the type of the left expression or the type of the right expression), and the
converted values are compared according to the relational operator specified. The result of the
comparison is a bool value (true or false), which becomes the result of the entire expression.
If either expression has type object, the only valid relational operators are equals (==, ===) and not
equals (!=), and the other expression is taken as an object value. Otherwise, if either expression has
type string, the other expression is converted into a string value prior to the comparison. Otherwise,
if either expression has type float, the other expression is converted into a float value prior to the
comparison. Otherwise, if either expression has type int, the other expression is converted into an int
value prior to the comparison. Otherwise, if either expression has type bool, the other expression is
converted into a bool value prior to the comparison.
If either expression is undefined, the relational expression evaluates to false. If both expressions are
type float and either expression is NaN (not a number), the relational expression evaluates to false.
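The choice of common type for a comparison follows a fixed order of precedence, which can be sketched in Python as a simple lookup. The function `common_type` and the use of type names as strings are hypothetical. The sketch covers only the five types named in the rules above and does not model the conversions themselves or the undefined and NaN cases.

```python
# Order in which the rules above are tried: the first type found on either side wins.
PRECEDENCE = ["object", "string", "float", "int", "bool"]


def common_type(left_type, right_type):
    """Return the type to which both operands of a comparison are converted."""
    for candidate in PRECEDENCE:
        if candidate in (left_type, right_type):
            return candidate
    raise ValueError("no rule is given for these operand types")


print(common_type("int", "float"))    # float
print(common_type("bool", "string"))  # string
```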
bit-or-expr:
bit-xor-expr
bit-or-expr | bit-xor-expr
bit-xor-expr:
bit-and-expr
bit-xor-expr ^ bit-and-expr
bit-and-expr:
shift-expr
bit-and-expr & shift-expr
shift-expr:
add-expr
shift-expr >> add-expr
shift-expr >>> add-expr
shift-expr << add-expr
The bit-wise operators can only be applied to operands of type int, and yield values of type int.
The bit-shift operators (<<, >>, >>>) shift the binary bits of their left operand by the number of bits
specified by their right operand. The ‘<<’ operator shifts the bits left, filling the low-order bits with zeros;
the ‘>>’ operator shifts the bits right, filling the high-order bits with copies of the most significant (sign)
bit; the ‘>>>’ operator also shifts the bits right, but fills the high-order bits with zeros.
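The three shift operators can be modelled in Python as follows. This is a sketch under a loud assumption: the int type is taken to be a 32-bit two's-complement integer, which this section does not state. The helper names are hypothetical, and shift counts outside the range 0 to 31 are not handled.

```python
BITS = 32                 # assumed width of the int type
MASK = (1 << BITS) - 1


def to_signed(x):
    """Interpret the low 32 bits of x as a two's-complement value."""
    x &= MASK
    return x - (1 << BITS) if x & (1 << (BITS - 1)) else x


def shl(x, n):
    """Model of x << n: low-order bits are filled with zeros."""
    return to_signed(x << n)


def sar(x, n):
    """Model of x >> n: high-order bits are filled with copies of the sign bit."""
    return to_signed(x) >> n


def shr(x, n):
    """Model of x >>> n: high-order bits are filled with zeros."""
    return to_signed((x & MASK) >> n)


print(shl(1, 4), sar(-16, 2), shr(-16, 2))  # 16 -4 1073741820
```

The last two results differ only in how the vacated high-order bits are filled: with the sign bit for the first and with zeros for the second.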
add-expr:
mul-expr
add-expr + mul-expr
add-expr − mul-expr
mul-expr:
unary-expr
mul-expr * unary-expr
mul-expr / unary-expr
mul-expr % unary-expr
Additive operators and multiplicative operators can only be applied to operands of type int or float. If
either operand is type float, the other operand is converted to float before the operator is applied.
The ‘+’ operator can also be applied to operands of type string, resulting in a value that is the
concatenation of the two operands. Both operands must have type string for the ‘+’ operator to
designate string concatenation, otherwise it designates numeric addition.
unary-expr:
primary-expr
~ unary-expr
+ unary-expr
− unary-expr
++ primary-expr
−− primary-expr
primary-expr ++
primary-expr −−
The unary operators perform arithmetic operations on a single operand.
The ‘~’ operator results in the bit-wise one’s-complement of its operand. It can only be applied to
operands of type int.
The unary ‘+’ and ‘−’ sign operators can only be applied to operands of numeric type. The ‘+’ operator
does not change the value of its operand, while the ‘−’ operator results in the arithmetic negative of its
operand. The resulting value is the same type as the operand.
The ‘++’ and ‘−−’ increment operators cause their operands to be incremented and decremented by 1,
respectively. These operators can only be applied to operands of numeric type. The operand expression
must designate a modifiable l-value. There are two forms of these operators: those that precede their
operand are called the pre-increment and pre-decrement operators, and operate by first incrementing or
decrementing their operand and then returning the resulting modified value. Those that follow their
operand are called the post-increment and post-decrement operators, and operate by returning the
value of their operand prior to the increment/decrement operation, and incrementing or decrementing
their operand after the original value has been used.
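The difference between the prefix and postfix forms can be shown with a small Python model. Python has no increment operators, so a hypothetical `Cell` class stands in for a modifiable l-value, and the two helper functions model the two forms.

```python
class Cell:
    """Stands in for a modifiable l-value holding a numeric value."""

    def __init__(self, value):
        self.value = value


def pre_increment(cell):
    """Model of ++x: modify first, then yield the modified value."""
    cell.value += 1
    return cell.value


def post_increment(cell):
    """Model of x++: yield the original value, then modify."""
    original = cell.value
    cell.value += 1
    return original


x = Cell(5)
print(pre_increment(x), x.value)   # 6 6
print(post_increment(x), x.value)  # 6 7
```

Both forms leave the operand incremented; they differ only in which value the expression itself yields.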
primary-expr:
( expr )
subscript-expr
new-expr
delete-expr
convert-expr
typeof-expr
function-call-expr
name-expr
string
number
true
false
null
Parentheses (the ‘(’ and ‘)’ round bracket operators) are used for grouping expressions so that they are
evaluated in a specific order.
The true and false values are pre-defined (built-in) primitive values of type bool. They can be
converted into types int (1 and 0, respectively) and string ("true" and "false", respectively).
The null value is a pre-defined (built-in) primitive value of type object. It specifically designates no
object, so that values assigned null do not refer to any object. The null value can be converted into a
bool value of false.
subscript-expr:
name-expr [ expr ]
The array subscript operator accesses an element of an array object. The left operand (preceding the ‘[’
operator) is evaluated first, and must designate an object of array type. The right operand (inside the ‘[’
and ‘]’ brackets) is evaluated; if it has numeric type (bool, int, or float), it is converted into an int
value, and the result is the index of the element within the array to be accessed. Note that array indices
start at zero (0). If the right operand has type string, it is taken as a hash key designating an element of
the array to be accessed. If the right operand is null or is undefined, a run-time exception is thrown.
new-expr:
new type ( [args] )
delete-expr:
delete expr
The new operator is used to construct new objects. A brand new object of the specified type is created
(its space being allocated on the heap), and the appropriate constructor function is then invoked to
initialize the object. The arguments, if any, are evaluated and passed to the constructor function, each
one being assigned to its corresponding function parameter. The appropriate constructor function is the
constructor function with parameters that best match the list of arguments specified in the new
expression. If no arguments are specified (known as a no-args constructor call), and no constructor
function with no parameters is defined for the type, the type is instantiated and its members (if any) are
default-initialized. Otherwise, if no matching constructor function can be found, a run-time exception is
thrown.
The delete operator is used to destroy existing objects. It returns no value, and has no type (i.e., is
undefined). When the expression designates a member of some object, that member is removed from
its parent object. When the expression designates an element of some array object, that element is
removed from its parent array. Note that the deleted object continues to exist in memory as long as
some other live object or variable contains a reference to it; otherwise, the deleted object will
eventually be deallocated from memory by the garbage collector.
convert-expr:
defined ( expr )
bool ( expr )
int ( expr )
float ( expr )
string ( expr )
void ( expr )
//Needed???
//Needed???
//Needed???
The conversion operators resemble built-in function calls, and convert their arguments into a specific
type. The argument expression can be a value of any type.
The bool() conversion operator converts its argument into a value of type bool. If the argument is
false, null, 0, 0.0, or is undefined, the operator returns a value of false; otherwise, it returns a value
of true.
[???]…should int(x) et al allow x to be any type, or only numeric/bool? Or should users be required to
use int.parse(x) instead?
The string() operator converts its operand into a string value. If the argument is null or is undefined,
the resulting value is null. Otherwise, the value returned is the same as if the to_string() function for
the argument’s type was called.
[…]
The void() operator removes the type of its operand, making it undefined.
The defined() operator is not actually a conversion operator, but determines whether its argument is
defined or not, and returns a bool value.
typeof-expr:
typeof ( expr )
The typeof() operator returns a string value indicating the type of its argument. It returns one of the
values "bool", "int", "float", "string", "object", "date", "url", "event", "function", "null",
or "undefined", or the name of a markup type (such as "body", "p", "br", etc.).
name-expr:
this
name
primary-expr @ name
primary-expr . name
primitive-type . name
primitive-type:
object
bool
int
float
string
date
url
function
A name expression uniquely designates an object (i.e., an object, attribute, variable, or function).
The ‘this’ value refers to the parent object that contains the function.
An unadorned name refers to the variable or function having that name within the current execution
scope.
A name expression followed by an at-sign (@) followed by a name (e.g., foo@bar) refers to an attribute
(also called a property) of an object.
A name expression followed by a dotted name (e.g., foo.bar) refers to a named member of an object.
A primitive type name followed by a dotted name (e.g., date.today) designates a built-in function or
variable in the run-time type library.
[…]
function-call-expr:
name-expr ( [args] )
args:
arg
args , arg
arg:
[name :] expr
A function call expression invokes a function, passing it zero or more arguments. Each argument
expression is evaluated and assigned to its corresponding function parameter. If the number of
arguments in the call expression does not match the number of parameters in the function definition,
any unassigned parameters will be undefined.
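Argument binding can be sketched in Python as follows. Several points are assumptions made only for illustration: the `name :` form in the grammar is taken to select a parameter by name, extra positional arguments are silently ignored, and an unknown name raises an error. None of these behaviors is stated in the text above. The names `Undefined`, `UNDEFINED`, and `bind_arguments` are hypothetical.

```python
class Undefined:
    """Hypothetical marker for a parameter that received no argument."""

    def __repr__(self):
        return "undefined"


UNDEFINED = Undefined()


def bind_arguments(param_names, positional, named=None):
    """Assign call arguments to parameters; unassigned parameters stay undefined."""
    bound = {name: UNDEFINED for name in param_names}
    # Positional arguments are matched to parameters in order.
    for name, value in zip(param_names, positional):
        bound[name] = value
    # Assumption: 'name : expr' arguments select their parameter by name.
    for name, value in (named or {}).items():
        if name not in bound:
            raise KeyError(name)
        bound[name] = value
    return bound


print(bind_arguments(["a", "b", "c"], [1], {"c": 3}))
# {'a': 1, 'b': undefined, 'c': 3}
```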
The result of the function invocation has the type specified by the definition of the function. The
resulting value returned from the execution of the function becomes the value of the function call
expression.
More details can be found in the Function Invocation section below.
value:
( expr )
name-expr
string
[+ | −] number
true
false
null
type:
markup-type
primitive-type
A value expression is essentially a primary expression, except that it only occurs in contexts where an
attribute (property) value is allowed.
Attribute Values
Attribute values (also called property values or just simply properties) are named values attached to
objects. They are distinct from object members.
[…]
Type Conversions
Every object, value, and expression implicitly has type object, and thus can be assigned to any variable
of type object with no explicit conversion operation required. A value or expression assigned to a
variable of type object retains its original type. Expressions having a value of null or that are
undefined do not refer to any object.
Conditional expressions are used as the controlling expressions for control flow statements, and are
implicitly converted into type bool. For expressions of type int and float, a value of zero (0) is
converted into false, and all other non-zero values are converted into true. Expressions of type string
are converted into false if they are empty strings ("") or are null, otherwise they are converted into
true. (Note that string values such as "0" and '0.00' are converted into true.) All other expressions
are converted into false if they are undefined or are null, otherwise they are converted into true.
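The conversion rules for conditional expressions can be summarized by a Python sketch. The `UNDEFINED` sentinel and the function `to_bool` are hypothetical, Python's None stands in for null, and Python values stand in for values of the corresponding types.

```python
UNDEFINED = object()  # hypothetical sentinel for an undefined expression


def to_bool(value):
    """Model of the implicit conversion of a conditional expression to bool."""
    if value is UNDEFINED or value is None:
        return False             # undefined and null convert to false
    if isinstance(value, bool):
        return value
    if isinstance(value, (int, float)):
        return value != 0        # zero is false, all other values are true
    if isinstance(value, str):
        return value != ""       # only the empty string is false
    return True                  # any other defined, non-null object


print(to_bool("0"), to_bool(""), to_bool(0.0))  # True False False
```

As the note above points out, a string such as "0" is not empty and therefore converts to true, even though the number 0 converts to false.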
Expressions may be converted into strings using the string() operator. Most of the primitive types
provide a to_string() function that can be used to return a string value formatted in a specific way.
The int, float, and date types provide a parse() function that can be used to convert a string value
into the corresponding primitive type.
In a function call expression, the argument expressions in the call are assigned to their corresponding
function parameter variables. Each argument is implicitly converted into the type of its parameter,
which may result in a runtime exception being thrown.
Regular Expressions
Regular expressions are string values used in certain expression contexts to perform pattern matching
on other string values. These string values are in the format "/p/s", where p is a pattern composed of
one or more of the regular expression operators listed below, and s is zero or more regular expression
suffixes.
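Splitting a "/p/s" string into its pattern and suffixes can be sketched in Python. This is an approximation only: the function `compile_pattern` is hypothetical, and Python's `re` module is used as a stand-in engine whose syntax is not identical to the operators defined in this section (for example, it does not accept \N with the meaning given here). Only the i suffix is mapped to an engine flag.

```python
import re


def compile_pattern(literal):
    """Split "/p/s" into pattern p and suffixes s, then compile p."""
    if not literal.startswith("/"):
        raise ValueError("expected a leading '/'")
    last = literal.rfind("/")
    if last == 0:
        raise ValueError("expected a closing '/'")
    pattern, suffixes = literal[1:last], literal[last + 1:]
    flags = 0
    if "i" in suffixes:
        flags |= re.IGNORECASE   # the i suffix: ignore letter case
    # Other suffixes are returned to the caller but do not change compilation.
    return re.compile(pattern, flags), suffixes


regex, suffixes = compile_pattern("/h[ae]llo/gi")
print(bool(regex.search("Say HALLO")), suffixes)  # True gi
```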
The suffixes affect the behavior of pattern matching:
Suffix    Description
g         Perform pattern matching globally
i         Ignore letter case
m         ???    //FIXME
The regular expression pattern matching operators are:
Pattern    Description
x          Matches Unicode character 'x'
pq         Matches p followed by q
^          Matches the beginning of a string
$          Matches the end of a string
.          Matches any single Unicode character
p?         Matches zero or one occurrence of p
p*         Matches zero or more occurrences of p
p+         Matches one or more occurrences of p
p{n,m}     Matches n to m occurrences of p
[abc]      Matches single character 'a', 'b', or 'c'
[a−z]      Matches single character from 'a' to 'z'
[^a−z]     Matches single character other than 'a' to 'z'
p|q        Matches either p or q
(p)        Grouping, matches p
\n         Matches group n (0 to 9)
\b         Matches the beginning of a word
\B         Matches the end of a word
\d         Matches a digit character
\D         Matches any character except digit characters
\N         Matches a CR LF pair
\s         Matches a whitespace character
\S         Matches any character except whitespace characters
\w         Matches a word character
\W         Matches any character except word characters
\uHHHH     Matches Unicode character U+HHHH
\\         Matches '\'
\^         Matches '^'
\$         Matches '$'
\.         Matches '.'
\?         Matches '?'
\*         Matches '*'
\+         Matches '+'
\[         Matches '['
\]         Matches ']'
\(         Matches '('
\)         Matches ')'
\{         Matches '{'
\}         Matches '}'
\|         Matches '|'
Object Management
New objects are created using the new operator. This is syntactically similar to a function call, except
that an object type is specified instead of a function name. Markup objects can be created (e.g., p,
table, tr, div, span), as well as objects of the standard library types (e.g., date, url).
[…]
Objects that go out of scope (i.e., local variables within a statement block that is no longer active) or that
no longer have any other active objects referring to them are called dead objects. Dead objects are
eventually subject to garbage collection, which deallocates them from memory and reclaims the space
they occupy. Objects that are still within an active scope or are still referenced by other active objects
are called live objects, and are not subject to garbage collection.
[…]
type foo
{
    // Constructor function for type 'foo'
    function new()
    {
        …
    }
}
[…]
Name Scope
[…]
Function Invocation
A function may be invoked from another function, or by an event or timer. Recursive calls are supported,
meaning that a function may call itself directly, or may call another function that in turn calls it
indirectly.
Upon entry to a function from a function call expression, the argument values of the call expression (if
any) are assigned to their corresponding function parameters. The function parameters are essentially
local variables of the function. If the number of function call arguments does not agree with the function
parameters, any unassigned parameters will be undefined.
After the parameters are assigned, the first statement of the function body (a statement block enclosed
within { } braces) is executed. The statements within the body of the function are executed in sequence
until a return statement is executed (which returns a value back to the caller) or until the execution
flows out of the function body. If an explicit return value is not specified, or if the function body is empty
(contains no statements), the function returns an implicit value of true back to the caller.
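The parameter assignment and implicit return rules above can be modeled with a short Python sketch (not OTML). The names UNDEFINED, bind_parameters, and invoke are hypothetical. Silently ignoring surplus arguments is an assumption, since the text only says what happens to unassigned parameters, and a Python body that returns None stands in for an OTML body with no explicit return value.

```python
UNDEFINED = object()  # stand-in for OTML's undefined value (assumption)

def bind_parameters(param_names, arg_values):
    """Assign call arguments to parameters; unmatched parameters stay undefined."""
    bound = {name: UNDEFINED for name in param_names}
    for name, value in zip(param_names, arg_values):  # surplus arguments are dropped
        bound[name] = value
    return bound

def invoke(param_names, body, arg_values):
    """Call body with its bound parameters and apply the implicit return value."""
    local_vars = bind_parameters(param_names, arg_values)
    result = body(local_vars)
    # No explicit return value: the caller receives an implicit true.
    return True if result is None else result
```

Calling invoke(["x"], lambda v: None, [5]) models a function whose execution flows out of its body, so the caller receives true.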
[…]…the implicit this object parameter
[…]…closures, first-class function objects, and nested functions…
[…]…threads…
[…]…anonymous functions…
[…]
Multithreading
Within an OTML browser, the document object is executed within a main thread. Each body object
within the document executes within its own separate thread. Implementations may choose to limit the
number of active body threads within a given document.
Upon the initial loading of a body object, its thread issues a ‘load’ event, which may be handled by
member functions of the body defined with an ‘event="load"’ attribute. Once execution of all load
event functions ends, the body thread suspends execution and waits for events to be received from the
main loop.
GUI events (such as mouse clicks and keyboard keystrokes) for a given display element are captured by
the main thread and delivered to the body thread containing that element. Once the main thread has
delivered an event to the appropriate thread, it resumes waiting for other events.
When an action is triggered that causes a response object to be sent to the server from the document,
such as when the user clicks on an action button or enters a keyboard keystroke that results in a submit
action, the main thread sends an ‘unload’ event to each of the body threads. Each body thread in turn
may then handle this event with member functions defined with an ‘event="unload"’ attribute.
Once all of the body threads have finished executing, the main thread terminates them and then sends
the response object to the server. The browser then waits for the next reply object to be received
from the server.
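The event flow described in this section can be sketched in Python (not OTML) with one worker thread per body and a queue per thread. The class and function names, the element identifiers, and the use of plain strings as events are all hypothetical; the sketch models only the ordering of load, delivered events, and unload.

```python
import queue
import threading

class BodyThread(threading.Thread):
    """One thread per body object; events arrive through a private queue."""

    def __init__(self, body_id, element_ids):
        super().__init__()
        self.body_id = body_id
        self.element_ids = set(element_ids)
        self.events = queue.Queue()
        self.log = []

    def run(self):
        self.log.append("load")        # 'load' handlers run first
        while True:
            event = self.events.get()  # suspend until the main thread delivers an event
            self.log.append(event)
            if event == "unload":      # 'unload' handlers run, then the thread ends
                break

def dispatch(bodies, element_id, event):
    """Main thread: deliver a GUI event to the body that contains the element."""
    for body in bodies:
        if element_id in body.element_ids:
            body.events.put(event)
            return body.body_id
    return None

def submit(bodies):
    """Main thread: send 'unload' to every body, then wait for all of them to end."""
    for body in bodies:
        body.events.put("unload")
    for body in bodies:
        body.join()

def demo():
    bodies = [BodyThread("main", ["ok_button"]), BodyThread("side", ["menu"])]
    for body in bodies:
        body.start()
    dispatch(bodies, "menu", "click")
    submit(bodies)
    return {body.body_id: body.log for body in bodies}
```

In this model the response object would be sent only after submit() returns, that is, after every body thread has handled its unload event.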
[…]
Standard Library Types
The types below are defined using a pseudo-language which is not actually valid OTML.
Object Type
Every object contains the following variables and functions.
type object
{
    string    id;              // Name
    string    type;            // Element type, e.g., "string"
    object    parent_element;  // Parent object
    object    prev_element;    // Previous sibling object
    object    next_element;    // Next sibling object
    object    first_child;     // First child object
    object[]  elements;        // Child objects
    object[]  attributes;      // Attribute values
    …
    function bool equals(object o);
    function bool send_event(event evt);
    …
    operator object new();
    operator object new(string type);
}
object document;               // Document object
String Type
A string object is a (possibly empty) sequence of Unicode characters. Every string object contains the
following variables and functions.
type string
{
    int  length;                // Length
    function string    char_at(int pos);
    function string    match(string pat);
    function string    replace(string pat, string new);
    function string    search(string pat);
    function string[]  split(string pat);
    function string    substring(int start, int len);
    function string    substring(int start);
    function string    to_capital();
    function string    to_lower();
    function string    to_upper();
    …
    operator string [](int pos);
}
operator string string(object obj);
Bool Type
A bool object is a 1-bit boolean value, either true or false.
type bool
{
    string  name;               // "true" or "false"
    function string to_string();
    function bool parse(string s);
    …
}
Int Type
An int object is a 64-bit signed binary integer value.
type int
{
    const int  MIN;             // Minimum int value, -(2^63)
    const int  MAX;             // Maximum int value, (2^63)-1
    function string  to_string(string fmt);
    function int     parse(string s);
    function int     to_fixed(float n);
    function int     to_exponential(int n);
    …
}
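As a quick check of the MIN and MAX values stated above, the following Python sketch (not OTML) spells out the 64-bit signed range. The names INT_MIN, INT_MAX, and fits_otml_int are hypothetical; this draft does not say what happens when a value falls outside the range, so the sketch only tests membership.

```python
INT_MIN = -(2 ** 63)      # int.MIN as given in the type definition
INT_MAX = (2 ** 63) - 1   # int.MAX as given in the type definition

def fits_otml_int(value: int) -> bool:
    """Return True if value is representable as a 64-bit signed integer."""
    return INT_MIN <= value <= INT_MAX
```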
Float Type
A float object is a 64-bit IEEE double-precision floating-point real value.
type float
{
    const float  MIN;           // Minimum float value
    const float  MAX;           // Maximum float value
    const float  INF;           // Infinity
    const float  NAN;           // Not-a-Number
    const float  EPSILON;       // Smallest float difference
    const float  PI;            // pi, 3.1415926+
    const float  E;             // e, 2.7182818+
    …
    function bool    is_inf(float x);
    function bool    is_nan(float x);
    function float   parse(string s);
    function string  to_string(string fmt);
    function float   to_fixed(int n);
    function float   to_precision(int n);
    …
    function float   sqrt(float x);
    function float   log(float x);
    function float   log10(float x);
    function float   exp(float x);
    function float   pow(float x, float y);
    function float   sin(float x);
    function float   cos(float x);
    function float   tan(float x);
    function float   asin(float x);
    function float   acos(float x);
    function float   atan(float x);
    function float   atan2(float x, float y);
    function float   sinh(float x);
    function float   cosh(float x);
    function float   tanh(float x);
    …
}
Date Type
A date object embodies the components of a date and time. Implementations are required to support
dates spanning at least the range from AD 1601-01-01 to 2400-12-31.
type date
{
    int  year;                  // Year (1600-2400)
    int  mon;                   // Month (1-12)
    int  mday;                  // Day of the month (1-31)
    int  yday;                  // Day of the year (1-366)
    int  wkday;                 // Day of the week (0-6)
    int  week;                  // Week of the year (1-53)
    int  hour;                  // Hour (0-23)
    int  min;                   // Minute (0-59)
    int  sec;                   // Second (0-60)
    int  msec;                  // Millisecond (0-999)
    int  ticks;                 // msec ticks since the Epoch
    …
    const date  MIN;            // Minimum supported date value
    const date  MAX;            // Maximum supported date value
    const date  UNKNOWN;        // Unknown date value, < MIN
    const date  NEVER;          // Never/unset date value, > MAX
    const date  ERROR;          // Erroneous date value
    function date    now();
    function date    utc_now();
    function date    today();
    function date    parse(string s, string fmt);
    function string  to_string(string fmt);
    function date    normalize();
    …
    function date    add_years(int delta);
    function date    add_months(int delta);
    function date    add_days(int delta);
    function date    add_hours(int delta);
    function date    add_mins(int delta);
    function date    add_secs(int delta);
    function date    add_msecs(int delta);
    …
    operator date new(int year, int mon, int mday);
    operator date new(int year, int mon, int mday,
                      int hr, int min, int sec, int msec);
    operator date new(int ticks);
}
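The required date range and the 1-based yday member can be illustrated with Python's datetime module (Python, not OTML). The names REQUIRED_MIN, REQUIRED_MAX, in_required_range, and day_of_year are hypothetical. The sketch checks only the minimum range that every implementation must support; an implementation may accept a wider range.

```python
from datetime import date

REQUIRED_MIN = date(1601, 1, 1)     # earliest date every implementation must support
REQUIRED_MAX = date(2400, 12, 31)   # latest date every implementation must support

def in_required_range(year: int, mon: int, mday: int) -> bool:
    """Return True for a valid calendar date inside the required range."""
    try:
        candidate = date(year, mon, mday)
    except ValueError:              # e.g. month 13 or 30 February
        return False
    return REQUIRED_MIN <= candidate <= REQUIRED_MAX

def day_of_year(year: int, mon: int, mday: int) -> int:
    """Return the day of the year (1-366), matching the yday member."""
    return date(year, mon, mday).timetuple().tm_yday
```

Because 2400 is a leap year, the last required date, 2400-12-31, has a yday of 366.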
URL Type
A url object embodies the components of an HTTP URL, which specifies the location of a network
resource.
type url
{
    int     length;             // Length
    string  protocol;           // Protocol prefix ("http")
    string  host;               // Host name
    int     port;               // Port number
    string  user;               // User name
    string  password;           // Password
    string  file;               // File
    string  query;              // Query parameters
    …
    function string query_parm(string parm);
    function url parse(string s);
    …
    operator url new(string proto, string host, string file);
}
operator url url(object obj);
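To show how the url members relate to the parts of an HTTP URL, the following Python sketch (not OTML) uses the standard urllib.parse module. The function names split_url and query_parm_of, and the mapping of Python's path component onto the file member, are assumptions made for illustration; the example URL is invented.

```python
from urllib.parse import parse_qs, urlsplit

def split_url(text: str) -> dict:
    """Break a URL string into the members listed for the url type."""
    parts = urlsplit(text)
    return {
        "length": len(text),
        "protocol": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,
        "user": parts.username,
        "password": parts.password,
        "file": parts.path,       # assumption: 'file' is the path component
        "query": parts.query,
    }

def query_parm_of(text: str, parm: str):
    """Return the first value of a query parameter, or None if it is absent."""
    values = parse_qs(urlsplit(text).query).get(parm)
    return values[0] if values else None
```

For the invented URL "http://bob:secret@example.com:8080/docs/index.otml?lang=en&page=2", the host is "example.com", the port is 8080, and the "page" query parameter is "2".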
RegEx Type
A regular expression object is a sequence of Unicode characters. Every regular expression object
contains the following variables and functions.
type regex
{
    int  last_index;            // Position of last match
    function string exec();
    function string test(string s);
}
operator regex regex(string s);
Array Type
All arrays contain the following variables and functions.
type array
{
    int  length;                // Number of elements
    function int add(object obj);
    function bool remove(int index);
    function bool remove(string index);
    …
    function object[]  copy(int start, int count);
    function float     avg();
    function float     variance();
    function float     stdev();
    …
    operator object [](int index);
    operator object [](string index);
}
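The avg, variance, and stdev functions are not defined further in this draft. The Python sketch below (not OTML) shows one possible reading, using the population forms from the standard statistics module; whether OTML intends the population or the sample variance is an open assumption, and the name array_stats is hypothetical.

```python
from statistics import fmean, pstdev, pvariance

def array_stats(values):
    """Compute the three summary values, assuming population statistics."""
    return {
        "avg": fmean(values),
        "variance": pvariance(values),   # assumption: population variance
        "stdev": pstdev(values),         # assumption: population standard deviation
    }
```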
Event Type
An event object contains details about an event, which is usually triggered by user interaction with
display elements. Such events include mouse clicks and movements, and keyboard keystrokes.
type event
{
    string  id;                 // Name
    string  type;               // Event type
    object  element;            // Affected object
    int     key_code;           // Unicode key code
    int     key_mask;           // Shift modifiers bitmask
    string  mouse_key;          // Mouse button type
    int     x_abs;              // Absolute X window location
    int     y_abs;              // Absolute Y window location
    int     x_offset;           // X location within parent object
    int     y_offset;           // Y location within parent object
    …
+FIXME
    type = string;              // Event type, sans "on"
    x = num;                    // Element X coordinate
    y = num;                    // Element Y coordinate
    client_x = num;             // Browser X coordinate
    client_y = num;             // Browser Y coordinate
    screen_x = num;             // Screen X coordinate
    screen_y = num;             // Screen Y coordinate
    page_x = num;               // Page X coordinate
    page_y = num;               // Page Y coordinate
    from_element = ref;         // Object moved from
    to_element = ref;           // Object moved to
    key_code = "\uHHHH";        // Unicode char or mouse key
    target = ref;               // Parent 'this' object
    modifiers = num;            // Modifier keys bitmask
    alt_key = t/f;              // 'Alt' key was pressed
    ctrl_key = t/f;             // 'Ctrl' key was pressed
    shift_key = t/f;            // 'Shift' key was pressed
    reason = num;               // Event completion code
    height = num;               // Resize height
    width = num;                // Resize width
    return_value = t/f;         // Event return value
    …
}
Thread Type
Each body element executes within its own separate execution thread, and the document object
controls the main execution thread for the browser display. Timer events also execute in their own
separate threads.
type thread
{
    string    id;               // Name
    thread    parent_thread;    // Parent thread
    object    parent_element;   // Parent object
    thread    document_thread;  // Top document thread
    bool      is_active;        // Is executing
    bool      is_waiting;       // Is waiting for an event
    bool      is_locked;        // Is waiting for a notify
    object[]  locked_objects;   // Objects locked by thread
    …
    function thread  get_thread(string id);
    function bool    wait();
    function bool    notify();
    function bool    exit();
    …
    function thread set_interval(function func, object[] args, int msec);
    function thread set_timer(function func, object[] args, int msec);
    function bool cancel_timer(thread th);
}
Body Type
All body elements contain the following variables and functions.
type body
{
    string  id;                 // Name
    thread  this_thread;        // Body execution thread
    object  document;           // Document object
    object  response;           // Response object
    bool    is_alive;           // Is executing
    bool    is_waiting;         // Is waiting for events
    …
    function body get_body(string id);
    function respond();
    function respond_to(url site);
    …
}
Markup Type
All markup elements contain the following variables and functions.
type markup
{
    string  id;                 // Name
    string  type;               // Markup type
    int     width;              // Width, in pixels
    int     height;             // Height, in pixels
    int     left_abs;           // Absolute X window location
    int     top_abs;            // Absolute Y window location
    int     left_offset;        // X location within parent object
    int     top_offset;         // Y location within parent object
    …
}
Request Type
A browser request object is sent from the web content server to the client browser, and contains these
variables and functions.
type request
{
    url       site;             // Origination site
    object[]  elements;         // Data objects sent
    …
}
Response Type
A browser response object is returned from the client browser to the web content server, and contains
these variables and functions.
type response
{
    url       site;             // Destination site
    object[]  elements;         // Data objects returned
    …
}
Function Type
Every function object contains these variables and functions.
type function
{
    string  id;                 // Name
    string  type;               // Return type
    object  parent_element;     // Parent object
    …
    function object call(object this_obj, object[] args);
    …
    operator object new();
}
Markup Types
The following is the list of the supported markup object types. Items marked as ‘deprecated’ have been
deprecated since HTML 4.01; those marked ‘deprecated (5)’ have been deprecated since HTML 5.
Markup type   Description
a             Anchor
address       Address
b             Boldface
bl            Bulleted list
body          Document body
br            Line break
button        Input button
code          Source code font
del           Deleted
div           Division box
dir           Directory item
dd            Directed list data item
dl            Delimited list
dt            Delimited list title item
embed         Embedded object
form          Input entry form
font          Font
h1 … h6       Heading
head          Document head
hr            Horizontal rule
http_equiv    Document HTTP control
i             Italics
img           Image graphic
ins           Inserted
input         Form input item
kbd           Keyboard font
li            List item
link          Document link
meta          Document meta-info
nobr          Non-breaking span
object        Embedded object
ol            Ordered list
p             Paragraph
param         Object parameter
pre           Preserve whitespace
quote         Quotation
s             Strike-out
span          Span box
strike        Strike-out
table         Table
tbody         Table body
td            Table cell
textarea      Text input box
Notes
deprecated (5)
deprecated (5)
deprecated
deprecated (5)
Levels 1 (highest) to 6 (lowest)
deprecated (5)
deprecated
deprecated, not supported
deprecated, not supported
deprecated (5)
deprecated
tfoot         Table footer
th            Table heading cell
thead         Table header
title         Document title
tr            Table row
tt            Teletype font
u             Underlined
ul            Unordered list
var           Variable item
deprecated
deprecated (5)
deprecated
[…]
Prior Art
HTML
The fundamental lexical difference between OTML and HTML is that display text elements in HTML are
free-form while markup tags and attributes are enclosed within brackets; the opposite is true of OTML,
where display text elements are enclosed within quotes and brackets while markup tags and attributes
are free-form.
Almost all HTML element tags and style properties translate directly into OTML markup element types
and attributes. Thus this HTML code:
<span style="border:solid 1px red">Error!</span>
becomes this OTML code:
span style.border="solid 1px red" { "Error!" }
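The translation of an HTML style attribute into OTML style attributes can be sketched in Python (not OTML) as follows. The function name style_to_otml_attrs is hypothetical. Replacing hyphens with underscores follows the style.margin_right example given later in this document; the handling of multiple declarations is an assumption, and values that themselves contain a ';' are not handled.

```python
def style_to_otml_attrs(style: str) -> list[str]:
    """Turn 'prop:value; prop:value' into OTML style.prop="value" attributes."""
    attrs = []
    for declaration in style.split(";"):
        if not declaration.strip():
            continue  # skip the empty piece after a trailing ';'
        prop, _, value = declaration.partition(":")  # split at the first ':' only
        name = prop.strip().replace("-", "_")
        attrs.append(f'style.{name}="{value.strip()}"')
    return attrs
```

Applied to "border:solid 1px red", the sketch produces the single attribute style.border="solid 1px red" used in the example above.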
HTML (actually XHTML) has slightly better implicit handling of element content bracketing, requiring a
closing ‘</tag>’ for every opening ‘<tag>’. OTML requires matching ‘{’ and ‘}’ braces, but these can be
accidentally omitted too easily. (HTML that does not conform to XML also suffers from this problem.)
OTML addresses this by allowing a closing ‘}’ brace to be followed by a ‘: tag’ to indicate which block
brace is being matched/closed. This syntax is not required, but is recommended for large object blocks.
All of the HTML character entities (252 of them) are retained as character escape sequences. In addition,
more character escape sequences are available to provide for the most common display formatting
characters. For example, this HTML fragment:
&nbsp;&nbsp;&bull;&nbsp; Name:&nbsp;&nbsp; John Doe
could be coded in OTML as:
"\&nbsp;\&nbsp;\&bull;\&nbsp; Name:\&nbsp;\&nbsp; John Doe"
or even more succinctly as:
"\_\_\&bull;\_ Name:\_\_ John Doe"
[…]…what about <object> and <var> tags?
[…]…threading per body element
[…]
XML
Like XML, OTML syntax requires elements (blocks) to be enclosed within matching tags (braces).
OTML documents can be embedded within XML documents. For example:
<?xml version="1.0" standalone="yes"?>
<!-- OTML document embedded within an XML document -->
<document type="text/OTML">
    document format="1.0"
    {
        head { … }
        body { … }
        data { … }
        …
    }
</document>
Be aware that, per XML encoding rules, certain special characters must be encoded appropriately;
specifically, the characters <, >, and & must be replaced by their equivalent XML character entities
&lt;, &gt;, and &amp;, or by the character references &#60;, &#62;, and &#38;, respectively.
OTML documents may contain XML documents. For example, this illustrates an XML document encoded
as an OTML data object:
// XML data, as an OTML object
data xmldoc1
{
    "<?xml version=\"1.0\" standalone=\"yes\"?>";
    "<employee>";
    " <name><!-- As a structure -->";
    "  <first>John</first>";
    "  <middle>Q</middle>";
    "  <last>Doe</last>";
    " </name>";
    …
    "</employee>";
}
An alternative way to encode XML text as a data object within an OTML document, which does not
require quoting every text line of the XML document, looks like this:
// XML data, as an OTML object
data xmldoc1
{
    "<?xml version=\"1.0\" standalone=\"yes\"?>
    <employee>
     <name><!-- As a structure -->
      <first>John</first>
      <middle>Q</middle>
      <last>Doe</last>
     </name>
     …
    </employee>";
}
One thing to be aware of when encoding XML within OTML is that embedded quote marks (") must be
encoded appropriately, replacing them with either the OTML escaped quote sequence (\"), the
equivalent character code (\u0022), or the equivalent character entity (\&quot;). For example, an XML
element containing embedded quote characters could be encoded as:
"<quotation>She said, \"Well, <i>hello</i> there!\"</quotation>"
or as:
"<quotation>She said, \u0022Well, <i>hello</i> there!\&quot;</quotation>"
[...]
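The two escaping directions discussed in this section can be sketched in Python (not OTML). The function names otml_for_xml and xml_for_otml are hypothetical. The first relies on the standard xml.sax.saxutils.escape function; the second escapes only quote marks and assumes the XML text contains no backslashes, since this section does not cover them.

```python
from xml.sax.saxutils import escape

def otml_for_xml(otml_text: str) -> str:
    """Replace <, > and & with XML entities so OTML can sit inside an XML element."""
    return escape(otml_text)

def xml_for_otml(xml_text: str) -> str:
    """Wrap XML text in an OTML string literal, escaping embedded quote marks."""
    return '"' + xml_text.replace('"', '\\"') + '"'
```

The first function is for OTML embedded in an XML document; the second is for XML carried inside an OTML data object.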
CSS
OTML syntax for style objects closely resembles the syntax for CSS <style> elements.
[…]
JavaScript
OTML syntax resembles JavaScript syntax more than any other language. Some of the ways in which
OTML differs from JavaScript, though, are:
• Replacement of the ‘var’ variable definition keyword with several data type keywords (‘object’,
  ‘int’, ‘string’, ‘float’, and ‘bool’).
• Variables have block scope, whereas in JavaScript they have function scope.
• Functions and variables have a scope limited to their parent object, whereas in JavaScript they
  essentially have document scope (even if defined within separate <script> elements). This means
  that name prefixes and suffixes are not as necessary to make names unique as they are in
  JavaScript.
• Variables are default-initialized, whereas in JavaScript they are not.
• All functions return a value, whereas in JavaScript they return a value only explicitly.
• Event-handler functions must be explicitly defined as such, whereas in JavaScript they need not be.
• An event for an element is specified as the ‘event=xxx’ attribute of a member function of the
  element, instead of as arbitrary JS code in an ‘onxxx’ attribute of the element.
• Character escape sequences are different. Most of the control code sequences (e.g., \a, \b, \v) have
  been removed, since they do not have much practical application within web documents. New ones
  have been added (e.g., \z, \_) which do have practical uses within web documents.
• The logical ‘&&’, ‘||’, and ‘!’ operators have been replaced by the more readable ‘and’, ‘or’, and
  ‘not’ keywords.
• The equality (==, !=) and relational operators (>, >=, <, <=) all have the same precedence.
• More useful character escape sequences have been provided.
• Regular expressions are just string values, thus requiring no special lexical analysis.
• x.getAttribute() and x.setAttribute() are handled as direct access to an object’s attribute by
  name, as x@foo.
[…]…eval()…
[…]
JSON
Consider this JSON structured data value:
"employee": {
    "first-name": "Jason",
    "last-name":  "Alexander",
    "address": {
        "street": "1001 Shady Oak Lane",
        "city":   "Hollywood",
        "state":  "CA"
    },
    "emp-id":    "8652-17-221255",
    "hire-date": "2011-05-01"
}
This can be translated directly into OTML as:
data employee = {
    first_name = "Jason";
    last_name = "Alexander";
    address = {
        street = "1001 Shady Oak Lane";
        city = "Hollywood";
        state = "CA";
    };
    emp_id = "8652-17-221255";
    hire_date = "2011-05-01";
}
The primary difference between the two is that OTML data elements look syntactically more like
variable definitions with initializer values (because that is essentially what they are). A more
type-complete example could be coded as:
data employee =
{
    string first_name = "Jason";
    string last_name =  "Alexander";
    address =
    {
        string street = "1001 Shady Oak Lane";
        string city =   "Hollywood";
        string state =  "CA";
    };
    string emp_id =     "8652-17-221255";
    string hire_date =  "2011-05-01";
}
Another difference is that an OTML data element must have a lexically valid name, whereas JSON allows
an element to have any kind of name (because it is a quoted string). This means that OTML data objects
and data members cannot have names that are the same as OTML keywords. It is recommended that
such names be suffixed with an underscore character (for example, the JSON name "date" would
become the OTML name date_).
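The JSON-to-OTML translation and the naming rule above can be sketched in Python (not OTML). The names RESERVED, otml_name, to_otml_lines, and json_to_otml are hypothetical, and RESERVED is only an illustrative subset, not the actual OTML keyword list. Replacing hyphens with underscores follows the first_name example above; JSON arrays are not handled.

```python
import json

# Hypothetical subset of OTML reserved words, for illustration only.
RESERVED = {"date", "object", "int", "string", "float", "bool", "function", "type", "data"}

def otml_name(json_key: str) -> str:
    """Make a JSON key lexically valid: '-' becomes '_', keywords gain a trailing '_'."""
    name = json_key.replace("-", "_")
    return name + "_" if name in RESERVED else name

def to_otml_lines(value: dict, indent: int = 1) -> list[str]:
    """Render the members of a JSON object as OTML member definitions."""
    pad = "    " * indent
    lines = []
    for key, item in value.items():
        if isinstance(item, dict):
            lines.append(f"{pad}{otml_name(key)} = {{")
            lines.extend(to_otml_lines(item, indent + 1))
            lines.append(f"{pad}}};")
        else:
            # json.dumps yields a quoted string, a number, or true/false
            lines.append(f"{pad}{otml_name(key)} = {json.dumps(item)};")
    return lines

def json_to_otml(name: str, text: str) -> str:
    """Translate a JSON object into an OTML data object with the given name."""
    body = to_otml_lines(json.loads(text))
    return "\n".join([f"data {otml_name(name)} = {{", *body, "}"])
```

A JSON member named "date" comes out as date_, matching the recommendation above.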
[…]
AJAX
[…]…XMLHttpRequest object…
HTML Browsers
The fundamental difference between OTML and HTML browsers is the nature of the data transmitted
between them and web servers. HTML browsers are concerned with sending and receiving primarily
textual content containing display markup elements; OTML browsers are concerned with sending and
receiving data objects containing display markup elements.
Another important difference is that in HTML, executable code (<script> elements) is separate from
display elements. In OTML, executable code (function objects) is integrated into, and directly part of,
the display elements.
The requirements that OTML places on browsers are meant to improve the GUI experience of the user.
Primary to this is the way a browser behaves when an action is instigated by the user, such as clicking a
button or URL link. HTML browsers initiate a submit action, but continue to allow the user to interact
with other action elements in the display, which can cause multiple (and possibly conflicting) submit
actions. OTML browsers, in contrast, immediately disable all further user interactions with display
elements once a submit action is initiated (unless the elements are explicitly defined to remain active).
This eliminates the possibility of the user issuing multiple conflicting actions.
Likewise, requiring the OTML browser to visually indicate that a submit action has been instigated (such
as by dimming the display or displaying a pop-up panel) makes it clear to the user that an action has, in
fact, been triggered.
[…]…threading, multiple independent bodies…
[…]…respond to multiple sites simultaneously from a submit, using multiple bodies…
[…]
Examples
HTML + CSS
<!-- A paragraph -->
<p class="para"
   style="margin-right:0.5in">
    Hello, world.
    This is an <i>example</i>
    of an <u>HTML</u> document.
</p>
<!-- Some more text -->
<p>
    Lorem ipsum
    <i>dolor sit <b>amet</b></i>,
    consectetur adipisicing elit.
    <br/>
    <span style="color:blue">
        E = mc<sup>2</sup>.
    </span>
</p>
<hr/>
OTML
// A paragraph
p class=para
  style.margin_right="0.5in"
{
    "Hello, world.";
    "This is an " i { "example" };
    "of an " u { "OTML" } " document.";
}
// Some more text
p {
    "Lorem ipsum";
    i{"dolor sit " b{"amet"}} ",";
    "consectetur adipisicing elit.";
    br;
    span style.color="blue"
        { "E = mc" sup{"2"} "." }
}
hr;
Unresolved Issues
Lexical Issues
• Newlines (CR LF, or CR/LF/CR LF?)
• Require '\r\n' or only '\n', i.e., is '\n' CR LF or just LF?
    o '\L' could be LF, or '\N' could be CR LF
• Should names be case-insensitive in all contexts (color vs. Color vs. COLOR)?
• Use XML or JavaScript names (margin_left / marginLeft)?
• Do system names differ from user-defined names?
    o Math.PI, date.$now, foo vs. $foo, this vs. $this
• Should attribute names differ from variable names?
    o x.color, x@color, x.@color, x@.color, x['color'], x['@color']
• Regular expressions are strings, requiring no special lexer
• Attributes and variables share the same syntax, do not need quoted values
• Char escape sequences must not conflict with regex operators
• Hidden variables must return in the browser response object
Syntax Issues
• Separating content text from markup
• Separating pure data from content
• Do we still need var in addition to object?
• Keywords ‘var’ and ‘object’ conflict with <var> and <object>
    o Give <object> markup a different name
    o Do not support <var> markup, or give it a different name
• Should variable scoping be like JavaScript or like C?
• Restrict include objects to the top document level only?
• Anonymous (unnamed) dynamic functions?
• Can plain data objects contain functions?
• String concatenation operator (+ or .)?
• Auto-spacing between text elements and punctuation text
    o ‘;’ delimiter implies word spacing: {"hi" i{"mom"}} vs {"hi"; i{"mom"}}
    o ‘\s’ could force word spacing: {"hi" \s i{"mom"}}
    o Must be able to handle adjacent markup elements: {"e" sup{"2"}}
• Should on_event=expr attributes be permitted on display objects?
    o If so, what would expr contain?
• <pre> (preserve whitespace) markup elements using quoted text elements
• Can break and continue statements specify a loop label?
• How is eval() handled?
• Need an ‘expr is type’ operator?
References
1. www.ECMAscript.org
2. es5.GitHub.io/#toc