Text Based Protocols
Lesson 3
IHA presentation

Outline for today
• Text Based Protocols
• Augmented Backus-Naur Form (ABNF)
  • Motivation
  • The Language
  • Usage
• Text-based vs. Binary encoding

Augmented Backus-Naur Form
• A metalanguage
• Based on Backus-Naur Form
• Describes a formal system of a language to be used as a bidirectional communication protocol
• Many variants of Backus-Naur
  • Extended (EBNF)
  • Augmented (ABNF)

Augmented Backus-Naur Form: The Language

ABNF – Text-based Syntax
An ABNF specification is a set of derivation rules. A rule is defined by the following sequence:

    name = elements CRLF

• name: the name of the rule
• =: separates the name from the definition of the rule
• elements: one or more rule names or terminal values
• CRLF: the end-of-line marker (carriage return, line feed)

Rules resolve into a string of terminal values, sometimes called characters.

ABNF – Text-based Syntax: Example

    rulename = "abc"

ABNF strings are case insensitive. Hence

    rulename = "abc"
and
    rulename = "aBc"

will both match "Abc", "abC", "ABc", and so on.

ABNF – Operators
Concatenation: Rule1 Rule2
A rule may be defined by listing a sequence of rule names.
For example:

    foo    = %x61   ; a
    bar    = %x62   ; b
    mumble = foo bar foo

A semicolon starts a comment that continues to the end of the line. The rule <mumble> matches the lowercase string "aba". Specifying each character individually by its numeric value makes the rule case sensitive.

ABNF – Operators
Alternatives: Rule1 / Rule2
A rule may be defined by a list of alternative rules separated by a forward slash ("/"):

    foo / bar

accepts <foo> or <bar>.

Value Range Alternatives: %c##-##
A range of alternative numeric values can be specified compactly, using a dash ("-") to indicate the range:

    DIGIT = %x30-39

ABNF – Operators
Incremental Alternatives: Rule1 =/ Rule2
Additional alternatives may be added to a rule through the use of "=/", so that the rule set

    ruleset =  alt1 / alt2
    ruleset =/ alt3
    ruleset =/ alt4 / alt5

is the same as specifying

    ruleset = alt1 / alt2 / alt3 / alt4 / alt5

ABNF – Operators
Sequence Group: (Rule1 Rule2)
Elements enclosed in parentheses are treated as a single element. Thus

    elem (foo / bar) blat

matches (elem foo blat) or (elem bar blat), and

    elem foo / bar blat

matches (elem foo) or (bar blat).

ABNF – Operators
Variable Repetition: *Rule
The operator "*" preceding an element indicates repetition. The full form is

    <a>*<b>element

where <a> and <b> are optional decimal values, indicating at least <a> and at most <b> occurrences of the element.

Specific Repetition: nRule
A rule of the form

    <n>element

is equivalent to

    <n>*<n>element

ABNF – Operators
Examples - Variable Repetition: *Rule
Default values are 0 and infinity, so that
• *<element> allows any number, including zero
• 1*<element> requires at least one
• 3*3<element> allows exactly 3
• 1*2<element> allows one or two
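The repetition prefixes above map closely onto regular-expression quantifiers, which gives a quick way to try them out. The following is a minimal sketch, not part of ABNF itself: the helper names `abnf_repeat` and `matches` are made up for illustration, and the element is assumed to be DIGIT, written as a regular-expression character class.

```python
import re

DIGIT = "[0-9]"  # stands in for the ABNF rule DIGIT = %x30-39


def abnf_repeat(element_regex: str, spec: str) -> str:
    """Turn an ABNF repetition prefix ('*', '1*', '3*3', '1*2', '2')
    into a regular-expression quantifier applied to element_regex."""
    if "*" in spec:
        low, high = spec.split("*")
        low = low or "0"                  # default minimum is 0
        quantifier = "{%s,%s}" % (low, high)  # empty maximum means no upper bound
    else:
        quantifier = "{%s}" % spec        # <n>element is <n>*<n>element
    return "(?:%s)%s" % (element_regex, quantifier)


def matches(spec: str, text: str) -> bool:
    """True if the whole text is accepted by <spec>DIGIT."""
    return re.fullmatch(abnf_repeat(DIGIT, spec), text) is not None


for spec, text in [("*", ""), ("1*", ""), ("3*3", "123"), ("1*2", "123"), ("2", "42")]:
    print(f"{spec}DIGIT on {text!r}: {matches(spec, text)}")
```

The regular expression is only a convenient stand-in here; a real ABNF parser would handle rule references and the other operators as well.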
Example - Specific Repetition: nRule
• 2DIGIT is a 2-digit number
• 3ALPHA is a string of three alphabetic characters

ABNF – Operators
Optional Sequence: [Rule]
Square brackets enclose an optional element sequence:

    [foo bar]

is equivalent to

    *1(foo bar)

ABNF – Operators
Operator Precedence (highest precedence, binding tightest, first)
1. Rule name, prose-val, Terminal value
2. Comment
3. Value Range
4. Repetition
5. Grouping, Optional
6. Concatenation
7. Alternative

ABNF – Example (from Wikipedia)

    postal-address = name-part street zip-part

    name-part      = *(personal-part SP) last-name [SP suffix] CRLF
    name-part      =/ personal-part CRLF

    personal-part  = first-name / (initial ".")
    first-name     = *ALPHA
    initial        = ALPHA
    last-name      = *ALPHA
    suffix         = ("Jr." / "Sr." / 1*("I" / "V" / "X"))

    street         = [apt SP] house-num SP street-name CRLF
    apt            = 1*4DIGIT
    house-num      = 1*8(DIGIT / ALPHA)
    street-name    = 1*VCHAR

    zip-part       = town-name "," SP state 1*2SP zip-code CRLF
    town-name      = 1*(ALPHA / SP)
    state          = 2ALPHA
    zip-code       = 5DIGIT ["-" 4DIGIT]

ABNF – Exercise 1
Write a grammar to accept the following input:

    "I would like to fly from ___ to ___ (please/thanks)"

where the following cities are allowed: "paris", "new york", "dublin", and where "please" or "thanks" is an optional extra.

ABNF – Exercise 2
Specify, using ABNF, the syntax for a directory path, like

    users/smith/file
or
    users/smith/WWW/file

with zero or more directory names, followed by a file name.

ABNF – Exercise 3
Specify the syntax of an e-mail header field with the following properties:
• Name: "Weather"
• Values: "Sunny" or "Cloudy" or "Raining" or "Snowing"
• Optional parameters: ";" followed by a parameter, "=" and an integer value
• Parameters: "temperature" and "humidity"
Examples:

    Weather: Sunny; temperature=20; humidity=50
    Weather: Cloudy

Augmented Backus-Naur Form: Usage

ABNF – Text-based Syntax
Usage:
• HTTP: browsing the internet
• Session Initiation Protocol (SIP): VoIP (IP telephony), IMS (IP Multimedia Subsystem)
• Session Description Protocol (SDP)

ABNF (VoIP Example)
[Figure: two SIP endpoints; the INVITE request below is answered with 200 OK]

    INVITE sip:e9-airport.mit.edu SIP/2.0
    From: "Dennis Baron"<sip:6172531000@mit.edu>;tag=1c41
    To: sip:e9-airport.mit.edu
    Call-Id: call-1096504121-2@18.10.0.79
    Cseq: 1 INVITE
    Contact: "Dennis Baron"<sip:6172531000@18.10.0.79>
    Content-Type: application/sdp
    Via: SIP/2.0/UDP 18.10.0.79

    200 OK

VoIP (SIP Session)

Session Initiation Protocol

    SIP-message = Request / Response
    Request     = Request-Line *( message-header ) CRLF [ message-body ]
    Response    = Status-Line *( message-header ) CRLF [ message-body ]

Session Initiation Protocol

    Request-Line = Method SP Request-URI SP SIP-Version CRLF
    Status-Line  = SIP-Version SP Status-Code SP Reason-Phrase CRLF
    Method       = INVITEm / ACKm / OPTIONSm / BYEm / CANCELm / REGISTERm
                 / INFOm / PRACKm / SUBSCRIBEm / NOTIFYm / UPDATEm
                 / MESSAGEm / REFERm / PUBLISHm / extension-method
    INVITEm      = %x49.4E.56.49.54.45 ; INVITE in caps

Text-based vs. Binary Encoding

Text-based vs. Binary Encoding
ABNF Specification:

    Family = "Family" CRLF *(Person) "End of Family"
    Person = "Person" CRLF
             " Name: " 1*A CRLF
             " Birthyear: " 4D CRLF
             " Gender: " ("Male"/"Female") CRLF
             " Status: " ("unmarried"/"married"/"divorced"/"widow"/"widower")

ASN.1 Specification:

    Family ::= SEQUENCE OF Person
    Person ::= SEQUENCE {
        name      VisibleString,
        birthyear INTEGER,
        gender    Gender,
        status    Status }
    Gender ::= ENUMERATED { male(0), female(1) }
    Status ::= ENUMERATED { unmarried(0), married(1), divorced(2), widow(3), widower(4) }

Text-based vs. Binary Encoding
Example of textual encoding:

    Family
    Person
     Name: John Smith
     Birthyear: 1958
     Gender: Male
     Status: Married
    Person
     Name: Eliza Tennyson
     Birthyear: 1959
     Gender: Female
     Status: Married
    End of Family

169 octets (excl. new lines): 18% efficiency compared to ASN.1 PER, i.e. the PER encoding needs 18% as many octets.

Example of BER encoding:

    30 34
       30 16
          1A 0A  J o h n (space) S m i t h
          02 02  07 A6
          0A 01  00
          0A 01  01
       30 1A
          1A 0E  E l i z a (space) T e n n y s o n
          02 02  07 A7
          0A 01  01
          0A 01  01

54 octets: 57% efficiency compared to ASN.1 PER, i.e. the PER encoding needs 57% as many octets.

Text-based vs. Binary Encoding
The PER (unaligned variant) encoding of the same ASN.1 and the same data would be the following 31 octets (a space inside a bit string marks an octet boundary):

    00000010             number of persons in family

First person:

    00001010             10 characters
    1001010              J
    1 101111             o
    11 01000             h
    110 1110             n
    0100 000             (space)
    10100 11             S
    110110 1             m
    1101001              i
    1110100              t
    1 101000             h
    00 000010            2 octets
    00 00011110 100110   1958
    0                    male
    0 01                 married

Second person:

    000011 10            14 characters
    100010 1             E
    1101100              l
    1101001              i
    1 111010             z
    11 00001             a
    010 0000             (space)
    1010 100             T
    11001 01             e
    110111 0             n
    1101110              n
    1111001              y
    1 110011             s
    11 01111             o
    110 1110             n
    0000 0010            2 octets
    0000 01111010 0111   1959
    1                    female
    001                  married
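
The BER and PER octet counts above can be checked with a few lines of code. The following is a minimal sketch that handles only this Family example, not a general ASN.1 encoder: the names `FAMILY`, `int_octets`, `tlv`, `ber_encode` and `per_encode` are made up for illustration, BER lengths are assumed to fit the short form, and the PER part simply follows the bit layout shown on the slide.

```python
# Illustrative data: (name, birthyear, gender, status), with gender and
# status given as their ENUMERATED numbers from the ASN.1 specification.
FAMILY = [("John Smith", 1958, 0, 1), ("Eliza Tennyson", 1959, 1, 1)]


def int_octets(n: int) -> bytes:
    """Minimal two's-complement octets of a non-negative integer."""
    return n.to_bytes(n.bit_length() // 8 + 1, "big")


def tlv(tag: int, value: bytes) -> bytes:
    """BER tag-length-value, short-form length only (value under 128 octets)."""
    assert len(value) < 128
    return bytes([tag, len(value)]) + value


def ber_encode(family) -> bytes:
    persons = b""
    for name, year, gender, status in family:
        body = (tlv(0x1A, name.encode("ascii"))   # VisibleString
                + tlv(0x02, int_octets(year))     # INTEGER
                + tlv(0x0A, int_octets(gender))   # ENUMERATED
                + tlv(0x0A, int_octets(status)))  # ENUMERATED
        persons += tlv(0x30, body)                # SEQUENCE
    return tlv(0x30, persons)                     # SEQUENCE OF


def per_encode(family) -> bytes:
    bits = format(len(family), "08b")             # number of persons
    for name, year, gender, status in family:
        bits += format(len(name), "08b")          # number of characters
        bits += "".join(format(ord(c), "07b") for c in name)  # 7 bits each
        year_octets = int_octets(year)
        bits += format(len(year_octets), "08b")   # integer length in octets
        bits += "".join(format(b, "08b") for b in year_octets)
        bits += format(gender, "01b")             # 2 values fit in 1 bit
        bits += format(status, "03b")             # 5 values fit in 3 bits
    bits += "0" * (-len(bits) % 8)                # pad to a whole octet
    return int(bits, 2).to_bytes(len(bits) // 8, "big")


ber = ber_encode(FAMILY)
per = per_encode(FAMILY)
print(" ".join(f"{b:02X}" for b in ber))
print(len(ber), len(per))                         # 54 31
print(round(100 * len(per) / len(ber)), "%")      # 57 %
```

The compactness of PER comes from leaving out the tags and from using only as many bits as each type needs, which is why the encoding no longer aligns to octet boundaries.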