Aventail Web Translation
Web Developer Guide
May 2007
© Aventail Corporation 2007. All rights reserved.
Aventail, Aventail.Net, Aventail ExtraNet Center, Aventail ExtraWeb, Aventail ExtraNet, Aventail Connect,
and their respective logos are trademarks, service marks or registered trademarks of Aventail
Corporation.
Other product and company names mentioned in this publication are the trademarks of their respective
owners.
2 • WEB CONTENT TRANSLATION
Table of Contents
Overview
  Introduction
  Version Compliance
  Who is this document for?
  How does the Aventail Web Translation server work?
Content-Type of Web Pages
  Recommendations
Character Encoding
  Recommendations
Cookie translation
  Recommendations
URLs
  Recommendations
HTML translation
  Recommendations
CSS translation
JavaScript translation
  Translation rules
  Recommendations
VBScript translation
Java applet, ActiveX and Flash translation
  Recommendations
XML translation
  Recommendations
Web aliases
  Recommendations
Miscellaneous
  Referrer lookup
Overview
Introduction
A truly clientless VPN appliance requires a robust web-content translation engine.
The reason is simple: all network references within the web content must be
changed to point to the VPN appliance instead of internal hosts. With full-client VPNs
or pseudo-clientless VPN appliances that use web-deployed ActiveX or Java clients,
this host mapping can be done on the client. To support the broadest possible
range of browsers, however, web content translation is indispensable.
A simple example is effective in illustrating the translation of web content. Imagine
an HTML page with the following anchor tag that links to an internal resource:
<a href="http://owa.in.aventail.com">Outlook Web Access</a>
Within the corporate network, such a link works perfectly. When the user clicks on
the link in the browser, the latter asks the internal DNS server what the IP address
of “owa.in.aventail.com” is and retrieves the desired page.
Outside the corporate network however, say at an employee’s home, this link does
not work. The browser asks the DNS server of the local ISP what IP address
corresponds to “owa.in.aventail.com” and is told that that address doesn’t exist.
Even if the link were to a routable IP address within the corporate network, the
corporate firewall would probably prevent the browser from accessing the desired
resource.
Web content translation is the process of changing (translating) the link above into
something like:
<a href="https://ex2500.aventail.com/go/owa.in.aventail.com">Outlook Web Access</a>
The hostname is changed from the internal hostname to the DNS-resolvable
hostname of the VPN appliance. However, the appliance doesn’t hold the desired
resource; therefore that end resource must be encoded in some way within the URL.
In our example, it is encoded within the path portion of the URL.
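The mapping in this example can be sketched in a few lines of JavaScript. This is an illustration only, not the appliance's implementation: the function name toApplianceUrl is hypothetical, and the way paths and query strings are carried over is an assumption, since the example above shows only a bare hostname.

```javascript
// Hypothetical helper: rewrites an internal URL so that it points at the
// VPN appliance, encoding the original host in the path as in the example.
function toApplianceUrl(internalUrl, applianceHost) {
  const u = new URL(internalUrl);
  // Assumption: path, query, and fragment are appended unchanged after the host.
  const rest = (u.pathname === "/" ? "" : u.pathname) + u.search + u.hash;
  return "https://" + applianceHost + "/go/" + u.host + rest;
}

console.log(toApplianceUrl("http://owa.in.aventail.com", "ex2500.aventail.com"));
// prints "https://ex2500.aventail.com/go/owa.in.aventail.com"
```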
If the only kind of translation necessary were a translation of HTML links such as the
above, things would be easy. This unfortunately is not so. There are numerous ways
to reference network resources in HTML alone. JavaScript, the now-ubiquitous web
scripting language, widens the scope of the problem tremendously. In fact,
JavaScript makes the problem intractable in general: it executes code in the
browser, and it lets the user supply additional input that is unknown at the time
the server-side translation is done. For example, the user can be prompted for a URL
using JavaScript and the browser can then be instructed to go to that URL.
Version Compliance
This document is updated for each ASAP version that Aventail releases. Check that
the version running on your Aventail appliance is one of the versions covered by
this document.
This document is valid for releases ASAP 8.6 through ASAP 8.8.
Who is this document for?
This document is for Web Application Developers who wish to make their software
easy to translate by the Aventail translation engine. It provides a set of guidelines to
achieve this goal and gives a brief overview of certain aspects of the translation
engine.
How does the Aventail Web Translation server work?
The Aventail Web Translation server is part of the Aventail VPN appliance which sits
at the network perimeter. It isolates and protects private Web-based resources from
unauthorized external access.
A user first logs in to the Aventail appliance and is presented with the Workplace
page. The user then follows a link on that page to request a resource from the
internal network, or enters a URL on the Workplace page. All URLs point to the
Aventail appliance.
The Aventail Web Translation server translates an incoming URL using an "alias"
contained in the URL. Aliases are used to obscure the URLs that point to resources on
your internal (or “downstream”) servers. Because all requests are directed to the
Aventail appliance, the user only sees the incoming URL that contains the alias. The
Aventail Web Translation server matches the alias to a list it stores in memory and
translates the URL.
Once it determines that the URL submitted by the user is valid and points to a
resource on the network, the Aventail appliance checks its access control and
authentication rules to make sure the user is authorized to access the requested
resource.
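The request flow described above (alias lookup first, then access control) can be sketched as follows. Everything in this sketch is an assumption made for illustration: the alias table contents, the rule that the alias is the first path segment, the isAuthorized stand-in, and the outcome names are all hypothetical and do not describe the appliance's internals.

```javascript
// Hypothetical alias table: alias -> internal ("downstream") origin.
const aliases = new Map([["morty", "http://myapp.in.aventail.com"]]);

// Hypothetical stand-in for the appliance's access control rules.
function isAuthorized(user, backendUrl) {
  return user.allowedHosts.includes(new URL(backendUrl).hostname);
}

// Resolve "/alias/rest/of/path" to a backend URL, or null if the alias is unknown.
function resolveRequest(path, aliasTable) {
  const match = /^\/([^\/]+)(\/.*)?$/.exec(path);
  if (!match || !aliasTable.has(match[1])) return null;
  return aliasTable.get(match[1]) + (match[2] || "/");
}

// Step 1: translate the URL via its alias. Step 2: check access rules.
function handleRequest(user, path) {
  const backendUrl = resolveRequest(path, aliases);
  if (backendUrl === null) return { outcome: "unknown-alias" };
  if (!isAuthorized(user, backendUrl)) return { outcome: "denied" };
  return { outcome: "allowed", backendUrl };
}
```

The point of the ordering is that access rules are evaluated against the real internal resource, which is only known once the alias has been resolved.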
Content-Type of Web Pages
Although the Aventail translation engine possesses heuristics to guess the type of
content in an HTTP response from the backend web server, it is best to avoid relying
on this and to instead specify the type explicitly.
Recommendations
The single most important thing you can do to ensure proper translation is to make
sure that all pages are served up with the correct “Content-Type” header. In
particular, it is imperative that:
1. HTML content is served up with the “text/html” Content-Type.
2. JavaScript content is served up with the “application/x-javascript” Content-Type.
3. XML content is served up with the “text/xml” Content-Type.
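A backend application could centralize these choices in one lookup. The sketch below is a minimal illustration; the mapping from file extensions to the three content types listed above, and the function name contentTypeFor, are assumptions rather than anything the appliance requires.

```javascript
// Content types recommended in this guide, keyed by file extension.
// The extension mapping itself is an assumption made for illustration.
const RECOMMENDED_CONTENT_TYPES = {
  ".html": "text/html",
  ".js": "application/x-javascript",
  ".xml": "text/xml",
};

// Returns the recommended Content-Type for a file name, or null if this
// guide makes no recommendation for that kind of file.
function contentTypeFor(filename) {
  const dot = filename.lastIndexOf(".");
  const ext = dot === -1 ? "" : filename.slice(dot).toLowerCase();
  return RECOMMENDED_CONTENT_TYPES[ext] || null;
}
```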
Character Encoding
As an internationalized network device, the Aventail appliance uses UTF-8 exclusively
for its internal work.
Recommendations
1. Use UTF-8 exclusively for all your Web content. Do not use the Microsoft
code-pages. This is particularly important when POSTing form data.
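When form data is assembled in script, the standard URLSearchParams class and encodeURIComponent function percent-encode text as UTF-8 bytes, which is consistent with the recommendation above. The sketch below shows this; the helper name encodeForm and the field names are illustrative assumptions.

```javascript
// URLSearchParams serializes values as percent-encoded UTF-8.
function encodeForm(fields) {
  return new URLSearchParams(fields).toString();
}

console.log(encodeForm({ city: "Zürich" }));
// prints "city=Z%C3%BCrich"
```

Forms submitted directly by the browser use the encoding of the page, so the page itself should also be served as UTF-8.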
Cookie translation
The path portion of a “Set-Cookie” header is translated. The domain portion of this
header is discarded. For example, if the backend web server sends the header:
Set-cookie: x=y; path=/; domain=.in.aventail.com
and the alias associated with the web resource is “morty”, then this header is
translated to:
Set-Cookie: x=y; path=/morty/
This forces the web browser to send this cookie back only to the alias (and therefore
the web server) that set the cookie.
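The header rewrite described above can be sketched as a small function. This is an illustration, not the appliance's code: the function name translateSetCookie is hypothetical, and the handling of paths other than “/” (prepending the alias to whatever path the backend sent) is an assumption extrapolated from the single example.

```javascript
// Hypothetical sketch of Set-Cookie translation: drop the domain attribute
// and prefix the path attribute with the resource's alias.
function translateSetCookie(headerValue, alias) {
  const [nameValue, ...attributes] = headerValue.split(";").map((s) => s.trim());
  const kept = [nameValue];
  for (const attr of attributes) {
    if (attr === "") continue;
    const lower = attr.toLowerCase();
    if (lower.startsWith("domain=")) continue; // the domain portion is discarded
    if (lower.startsWith("path=")) {
      const path = attr.slice("path=".length);
      // Assumption: the alias is prepended to the path sent by the backend.
      kept.push("path=/" + alias + (path.startsWith("/") ? path : "/" + path));
    } else {
      kept.push(attr); // other attributes pass through unchanged
    }
  }
  return kept.join("; ");
}

console.log(translateSetCookie("x=y; path=/; domain=.in.aventail.com", "morty"));
// prints "x=y; path=/morty/"
```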
Recommendations
1. Avoid sophisticated client-side cookie manipulations using JavaScript.
2. Avoid using URLs in cookies. Although an attempt is made to translate those
URLs, there is some risk of letting them through.
URLs
The Aventail translation engine can handle URLs in any form:
1. Fully-qualified URLs (e.g. “http://www.acme.com/dir1/dir2/file.html”)
2. Absolute paths (e.g. “/dir1/dir2/file.html”)
3. Relative paths (e.g. “../dir2/file.html”)
Recommendations
1. It is best to use relative paths exclusively in your web application. This of
course also has the advantage of making your web application more portable
(e.g. to another web server and directory).
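One way to see why relative paths are robust is to resolve the same links against a page reached directly and against the same page reached through the appliance. The sketch below uses the standard URL class to resolve links the way a browser does. The appliance URL layout (alias “morty” as the first path segment, borrowed from the cookie example) and the page names are assumptions made for illustration.

```javascript
// The same page, reached directly and through the appliance.
// Assumption: the appliance URL carries the alias as its first path segment.
const internalPage = "http://www.acme.com/dir1/dir3/page.html";
const appliancePage = "https://ex2500.aventail.com/morty/dir1/dir3/page.html";

// Resolve a link against the URL of the page that contains it.
function resolve(link, pageUrl) {
  return new URL(link, pageUrl).href;
}

console.log(resolve("../dir2/file.html", internalPage));
// prints "http://www.acme.com/dir1/dir2/file.html"
console.log(resolve("../dir2/file.html", appliancePage));
// prints "https://ex2500.aventail.com/morty/dir1/dir2/file.html"
console.log(resolve("/dir1/dir2/file.html", appliancePage));
// prints "https://ex2500.aventail.com/dir1/dir2/file.html"
```

Under these assumptions the relative link keeps the alias even if it is never rewritten, while the absolute path loses it unless the translation engine catches it.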
HTML translation
HTML translation is handled very reliably by the Aventail appliance.
Recommendations
1. Make sure your HTML is formatted according to standard, especially the
quotes around attributes in tags. Ideally, use XHTML formatting. HTML
attributes containing a value (for example, src="path") may not be
translated if they contain any of the following errors:
a. Spaces before or after the equal sign.
src ="path" or src= "path"
b. Leading or trailing spaces within the value.
src=" path" or src="path "
c. Missing lead or end quotation mark.
src="path or src=path"
2. Avoid base tags, such as:
<base href="http://myapp.internal.acme.com/dir/" />
in your HTML code.
3. The “meta” tag is commonly used to redirect users to another page. For
example:
<meta http-equiv="refresh" content="5;url=redirectURL.html" />
The meta tag’s content attribute must be formatted carefully; don’t include
line breaks or spaces.
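The three attribute errors listed in recommendation 1 can be caught before deployment with a simple check. The sketch below is a heuristic based on regular expressions, not a real HTML parser, and the function name findAttributeProblems is hypothetical; it can report false positives, for example when an attribute value itself contains “ =”.

```javascript
// Heuristic checks for the three attribute errors listed above.
// Regular expressions are a simplification; a real checker would parse the HTML.
function findAttributeProblems(tag) {
  const problems = [];
  // a. whitespace before or after the equal sign: src ="path" or src= "path"
  if (/[\w-]\s+=/.test(tag) || /[\w-]=\s+["']/.test(tag)) {
    problems.push("space around equal sign");
  }
  // b. leading or trailing whitespace inside a double-quoted value
  if (/=\s*"(?:\s[^"]*|[^"]*\s)"/.test(tag)) {
    problems.push("space inside value");
  }
  // c. unbalanced double quotes: src="path or src=path"
  const quoteCount = (tag.match(/"/g) || []).length;
  if (quoteCount % 2 === 1) {
    problems.push("missing quotation mark");
  }
  return problems;
}

console.log(findAttributeProblems('<img src ="path">'));
console.log(findAttributeProblems('<img src="path">'));
```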
CSS translation
CSS content should be handled without difficulty.
JavaScript translation
JavaScript translation is complex and there are certain coding practices that you can
use to make sure your JavaScript code translates correctly.
Translation rules
The current Aventail JavaScript translation engine is a parse-tree based engine that
can handle complex syntax. It is a rule-based translator that makes use of Aventail’s
client-side JavaScript library. The rules are stored in:
/usr/local/extranet/etc/jstrans.cfg
The translation rules are divided into four categories:
1. Assignment statements (type ASSIGNMENT)
2. Function calls (type CALL)
3. Substitution of one language token with another (type SUBSTITUTION)
4. Special kind of substitution in a function call (type SUBARGS)
You should not need to write any new rules. It is however useful to be aware of the
rules as you follow the recommendations below.
Here are the majority of the JavaScript rules as of September 2006:
# Javascript Translation
# Assignment Statement Translation
#
# Type        Left Hand Side (LHS)    Encapsulate RHS with
#
ASSIGNMENT    location                aventail.translate_url
ASSIGNMENT    .location               aventail.translate_url
ASSIGNMENT    .href                   aventail.translate_url
ASSIGNMENT    .src                    aventail.translate_url
ASSIGNMENT    .action                 aventail.translate_url
ASSIGNMENT    document.domain         aventail.setDomain
ASSIGNMENT    document.cookie         aventail.setCookie
ASSIGNMENT    .innerHTML              aventail.postText
ASSIGNMENT    .url                    aventail.translate_url
# Function Call Translation
#
# Type    Function Name          Param    Encapsulate param with
#
CALL      .addBehavior           1        aventail.translate_url
CALL      .showModalDialog       1        aventail.translate_url
CALL      .showModelessDialog    1        aventail.translate_url
CALL      .insertAdjacentHTML    2        aventail.postText
CALL      location.replace       1        aventail.translate_url
CALL      location.assign        1        aventail.translate_url
CALL      eval                   1        aventail.post
# Substitution of one token with another
#
# lvalue/rvalue: 0: substitute always
#                1: substitute only if token is an rvalue (read from)
#                2: substitute only if token is an lvalue (written to)
#
# Type          Token                 lval/rval    Replacement
#
SUBSTITUTION    location.pathname     0            aventail.location.pathname
SUBSTITUTION    .location.pathname    0            .aventail.location.pathname
SUBSTITUTION    document.domain       1            document.aventail.getDomain()
SUBSTITUTION    document.domain       2            aventail.junk
SUBSTITUTION    .execCommand          0            .aventail.execCommand
SUBSTITUTION    location.pathname     0            aventail.location.pathname
SUBSTITUTION    .location.pathname    0            .aventail.location.pathname
SUBSTITUTION    location.host         0            aventail.location.host
SUBSTITUTION    .location.host        0            .aventail.location.host
SUBSTITUTION    location.hostname     0            aventail.location.hostname
SUBSTITUTION    .location.hostname    0            .aventail.location.hostname
SUBSTITUTION    location.port         0            aventail.location.port
SUBSTITUTION    .location.port        0            .aventail.location.port
SUBSTITUTION    location.protocol     0            aventail.location.protocol
SUBSTITUTION    .location.protocol    0            .aventail.location.protocol
SUBSTITUTION    location.href         1            aventail.location.href
SUBSTITUTION    .location.href        1            .aventail.location.href
SUBSTITUTION    location.search       1            aventail.location.search
SUBSTITUTION    .location.search      1            .aventail.location.search
SUBSTITUTION    location              1            aventail.location
SUBSTITUTION    .scripts              1            .aventail.getScripts()
# Substitution of one token with another, with a twist:
# Take the "stem" of the call and make it the first argument in the new function.
# For example:
# If we have the token "foo.bar" and the replacement "aventail.ourFoo":
# We will replace the construction "anObject.foo.bar(arg1, arg2)" with:
# aventail.ourFoo(anObject, arg1, arg2)
# This allows us to verify the type of the anObject object prior to operating on it
#
# lvalue/rvalue: 0: substitute always
#                1: substitute only if token is an rvalue (read from)
#                2: substitute only if token is an lvalue (written to)
#                3: special case, turn a flat lvalue into a function call
#
# The "3" case above is used in cases such as "foo.location" to allow us to ensure
# that "foo" is an object such as a document, window, or frame, and not some user-defined
# object that just happens to have a "location" member.
#
# Type     Token               lval/rval    Replacement
#
SUBARGS    document.close      0            aventail.docClose
SUBARGS    document.write      0            aventail.docWrite
SUBARGS    document.writeln    0            aventail.docWrite
SUBARGS    .open               0            aventail.objOpen
SUBARGS    .Open               0            aventail.objOpen
SUBARGS    .location           3            aventail.objLocation
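To make the effect of the ASSIGNMENT rules concrete, the toy rewriter below wraps the right-hand side of a few matching assignments in the function named by the rule. It is a deliberately crude sketch: it uses regular expressions where the real engine works on a parse tree, it covers only four of the rules, and it breaks on right-hand sides that contain a semicolon inside a string. The function name wrapAssignments is hypothetical.

```javascript
// Toy illustration only: the real engine works on a parse tree, not regexes.
// Each entry pairs a left-hand-side token with the wrapper from the rule table.
const ASSIGNMENT_RULES = [
  [".href", "aventail.translate_url"],
  [".src", "aventail.translate_url"],
  [".action", "aventail.translate_url"],
  [".innerHTML", "aventail.postText"],
];

function wrapAssignments(source) {
  let out = source;
  for (const [lhs, wrapper] of ASSIGNMENT_RULES) {
    // Escape regex metacharacters in the token (here, the leading dot).
    const escaped = lhs.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
    // Match "<token> = <expression>;" but not comparisons such as "==".
    const pattern = new RegExp("(" + escaped + "\\s*=(?!=)\\s*)([^;]+);", "g");
    out = out.replace(pattern, "$1" + wrapper + "($2);");
  }
  return out;
}

console.log(wrapAssignments('link.href = "http://owa.in.aventail.com";'));
// prints: link.href = aventail.translate_url("http://owa.in.aventail.com");
```

Because the URL is wrapped rather than rewritten in place, the actual translation happens in the browser at run time, which is how values that are only known on the client can still be handled.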
Recommendations
1. Do not use DOM references as variable names. For example, do not call any
of your variables “location”. See the translation rules above to know what to
avoid.
2. Avoid the “with” construct: with(object) {statements}.
3. Avoid passing DOM objects as parameters to functions. For example, avoid
writing functions of the form:
function test(mywin) { mywin.location = "http://owa.in.aventail.com"; }
Instead, make sure that the network-sensitive JavaScript appears verbatim,
e.g.
window.location = "http://owa.in.aventail.com";
In other words, do not hide the names of the underlying DOM objects.
4. Do not set a base tag using JavaScript. This invalidates all the translated
URLs on the page.
5. Do not use conditional compilation for Internet Explorer (e.g. “@if …”).
6. Do not use Microsoft Script Encoding (e.g. language “JScript.Encode”).
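Some of these recommendations can be checked mechanically. The sketch below scans JavaScript source for declarations that reuse DOM-related names, for the “with” construct, and for Internet Explorer conditional compilation. It is a rough, regex-based aid, not part of the Aventail product: the function name lintForTranslation is hypothetical, and the list of risky names is an assumption drawn from the tokens that appear in the translation rules.

```javascript
// Assumption: names taken from tokens in the translation rules above.
const RISKY_NAMES = ["location", "href", "src", "action", "domain", "cookie", "url"];

// Returns a list of warnings for constructs this guide recommends avoiding.
function lintForTranslation(source) {
  const warnings = [];
  // Recommendation 1: declarations that reuse a DOM-related name.
  const declaration = /\b(?:var|let|const|function)\s+([A-Za-z_$][\w$]*)/g;
  let match;
  while ((match = declaration.exec(source)) !== null) {
    if (RISKY_NAMES.includes(match[1])) {
      warnings.push('declaration named "' + match[1] + '"');
    }
  }
  // Recommendation 2: the "with" construct.
  if (/\bwith\s*\(/.test(source)) warnings.push('"with" construct');
  // Recommendation 5: conditional compilation for Internet Explorer.
  if (/@if\b/.test(source)) warnings.push("conditional compilation (@if)");
  return warnings;
}

console.log(lintForTranslation('var location = "Seattle"; with (obj) { a = 1; }'));
```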
VBScript translation
VBScript translation is no longer supported.
Java applet, ActiveX and Flash translation
No explicit translation of Java applets, ActiveX or Flash objects is performed. If
possible, avoid using them entirely.
Recommendations
1. If it is not possible to avoid using these objects entirely, consider constructing
the network references they need from the URL of the page they are on.
Perform this construction dynamically at run time.
XML translation
XML has no fixed vocabulary, so the translation engine cannot know on its own which
attributes hold URLs. You need to identify the portions of the XML content that
require translation. This is done in the file:
/usr/local/extranet/etc/custom-xmltrans.cfg
The format of the rules to add to this file is:
ELEMENT ATTR1 ATTR2 ... ATTRn
This instructs the translation engine to look for element “ELEMENT” in the XML and to
translate its attributes “ATTR1”, “ATTR2”, ..., “ATTRn”. These attributes are
expected to contain URLs.
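The rule format is simple enough to read as “first token is the element, the rest are its URL-bearing attributes”. The sketch below parses one rule line under that reading. The element and attribute names in the example (menuitem, href, icon) are hypothetical, as is the requirement that a rule name at least one attribute; the function parseXmlRule is not part of the Aventail product.

```javascript
// Parse one rule line of the form "ELEMENT ATTR1 ATTR2 ... ATTRn".
function parseXmlRule(line) {
  const tokens = line.trim().split(/\s+/);
  // Assumption: a useful rule names an element and at least one attribute.
  if (tokens.length < 2 || tokens[0] === "") {
    throw new Error("expected an element name followed by at least one attribute");
  }
  const [element, ...attributes] = tokens;
  return { element, attributes };
}

// Hypothetical rule: translate the "href" and "icon" attributes of <menuitem>.
console.log(parseXmlRule("menuitem href icon"));
```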
Recommendations
1. Add your XML translation rules to custom-xmltrans.cfg.
Web aliases
Web aliases are declared when you configure a resource. They are used to hide the
hostname of the internal server.
Recommendations
1. Avoid using the same name for the alias as for the top level directory of your
application. For example, if your web application lives in
“http://myapp.in.aventail.com/coolapp/”, do not use “coolapp” as the alias for
the Aventail resource.
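This collision is easy to check when choosing an alias. The sketch below compares a proposed alias with the top-level directory of the application URL; the function name aliasCollidesWithTopLevelDir is hypothetical and the comparison is a plain case-sensitive match, which is an assumption.

```javascript
// Returns true if the alias equals the first path segment of the application URL.
function aliasCollidesWithTopLevelDir(alias, appUrl) {
  const segments = new URL(appUrl).pathname.split("/").filter((s) => s !== "");
  return segments.length > 0 && segments[0] === alias;
}

console.log(aliasCollidesWithTopLevelDir("coolapp", "http://myapp.in.aventail.com/coolapp/"));
// prints true
console.log(aliasCollidesWithTopLevelDir("morty", "http://myapp.in.aventail.com/coolapp/"));
// prints false
```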
Miscellaneous
Referrer lookup
When a request for an absolute or relative URL for which there is no matching alias
comes in, the Aventail Web Translation server looks at the “Referer” HTTP header or
the referrer cookie that it sets. This header or cookie is used to correctly assemble
the destination URL. This is a best-effort attempt and should not be relied upon.
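The idea behind referrer lookup can be sketched as follows: when a request path carries no alias, the alias is recovered from the page that issued the request. This sketch covers only absolute paths and only the “Referer” header, and it assumes that translated URLs carry the alias as their first path segment. The function names and the set of known aliases are hypothetical.

```javascript
// Assumption: translated URLs look like "https://<appliance>/<alias>/<path>".
function aliasFromReferer(referer, knownAliases) {
  if (!referer) return null;
  let firstSegment;
  try {
    firstSegment = new URL(referer).pathname.split("/")[1] || "";
  } catch (e) {
    return null; // malformed Referer header
  }
  return knownAliases.has(firstSegment) ? firstSegment : null;
}

// Rebuild an aliased path for a request that arrived without one.
// requestPath is assumed to be an absolute path such as "/images/logo.gif".
function assembleDestination(requestPath, referer, knownAliases) {
  const alias = aliasFromReferer(referer, knownAliases);
  if (alias === null) return null; // nothing to go on
  return "/" + alias + requestPath;
}

console.log(assembleDestination(
  "/images/logo.gif",
  "https://ex2500.aventail.com/morty/index.html",
  new Set(["morty"])
));
// prints "/morty/images/logo.gif"
```

The null results show why this cannot be relied upon: a browser may omit the header, or the referring page may itself lack a recognizable alias.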