Email-history-APEC

History and Evolution of Electronic
Mail
(and a bit of a tutorial)
John C Klensin, Ph.D.
APEC, 2014-10-30
About Internet History – A Disclaimer
• Early period (~1965 to ~1985)
– Many parallel developments
– Extensive collaboration and idea-sharing
• Recent period
– Internet has become important
– Many claims of individual invention
• I will tell the story I know:
– It is not the only story; others may be equally
accurate
2
More Warnings
Everything is connected to everything else
Many places where this talk says (another talk)
Any time you have a spare couple of weeks…
Going to say some controversial things
Welcome questions and arguments
(mostly tomorrow)
3
Before the Beginning:
Messages to the Computer Operator
• Probably goes back to handwritten notes with
job submissions
• Some batch job control options
– For example, device mount instructions
• Similar user → operator messages in early
time-sharing systems
• Typically one-way only!
4
The CTSS Insight
• MIT’s Compatible Time-Sharing System
– Often recognized as the beginning of interactive,
multiple concurrent user, computing
• Two features of many
– Messages to operators
– Interprocess signaling between users
• Why not permit users to send messages to
each other and notify on arrival?
(Van Vleck and Morris, 1965-1966)
5
Parallel and Slightly Later
Developments
• DTSS
• Sigma-7
• MTS
• Multics
• TENEX ?
• CompuServe
• All multiple-user, single machines until
– MIT cloned CTSS and ran two separate systems
with tape transfer of data… and messages
– 6 - 12 hour turnaround, plus or minus
6
From the Beginning
• Postal mail model
– Envelope and content
– Origination, transport, and delivery systems
• Terminology changed
– Mail, electronic mail, net mail, email
– MUA, MTA, MSA, MDA
• Even regulatory concerns
7
Then the ARPANET Happened
• Original usage model involved resource sharing
– First two important application protocols were
remote login (“telnet”) and file transfer (“FTP”)
– FTP very soon acquired a “mail” verb and
conventions
– “netmail” and “user@host”
• FTP was recognized as not a really good model
• ITU OSI work, including X.400, started
8
Internet Mail Redesign 1
• Large community effort
• Mail transport separated from FTP
• Separation of envelope and headers
– Detailed specification of headers
– Detailed specification of envelope and transport
model
• DNS-based and explicit models for dealing with
relays and intermittently-connected hosts.
• ARPANET/Internet still very restricted use
• Deployed 1981-1982, DNS mostly later
9
Alternative Mail Systems
• Mail over UUCP
• Development of BITNET/EARN/NetNorth and their mail;
also JANET, etc.
• FidoNet
• Many private/proprietary mail system
developments …
Just in the US:
– ccMail
– Notes
– AOL
– MSMail
– CompuServe
– Delphi
– MCIMail
– MS Exchange (later)
• ITU/ISO X.400 / MHS
10
A World of Gateways
• People wanting to communicate no matter
which mail system they were using
• “Gateways” for translation
– Had to be built one pair at a time
– Different information models
– Never perfect
– Information often got lost, messages sometimes.
11
SMTP as Common Denominator
• Since the early 1990s, mail exchange among
other systems
– primarily went through Internet-(and SMTP-) capable
gateways
– Many-one rather than many-many conversions
• SMTP became the model for envelopes in many
other systems
• Headers:
– Internet Mail Header Format (RFC 822) for many
– X.400 for several more
– Completely proprietary for a few
12
It Just Works
(and the robustness principle)
• SMTP Design
– Very simple command structure
– Rules against guessing and transforming midway
– Can deliver almost anything – sort out at destination
– Notification of non-delivery
• Headers
– ASCII “name: value” fields
– Few requirements; recipients generally ignore what they do not
understand
• Robustness: Senders expected to be careful, receivers
liberal
• All worked well until anti-spam came along (another talk)
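A minimal sketch of the "ignore what you do not understand" rule for headers, using Python's standard `email` package. The sample message, the `KNOWN` set, and the `read_known_fields` helper are illustrative assumptions, not part of the talk.

```python
from email import message_from_string

# A sample message: ASCII "name: value" fields, a blank line, then the body.
RAW = (
    "From: alice@example.org\n"
    "To: bob@example.com\n"
    "Subject: Lunch?\n"
    "X-Future-Extension: something this reader has never heard of\n"
    "\n"
    "Noon at the usual place.\n"
)

# Field names this hypothetical reader understands (lower-cased).
KNOWN = {"from", "to", "subject", "date"}


def read_known_fields(raw):
    """Parse the header block and keep only the fields we understand."""
    msg = message_from_string(raw)
    return {
        name.lower(): value
        for name, value in msg.items()
        if name.lower() in KNOWN
    }


fields = read_known_fields(RAW)
print(fields)
```

The unknown `X-Future-Extension` field is skipped without an error, which is the liberal-receiver half of the robustness principle.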
13
Why Internationalize?
• People prefer to communicate in own
languages (obvious, and always has been)
• Use of “foreign” languages and scripts can be
hard
• Support for localization
– Very few people really care about “i18n”
– Without it as foundation, chaos or isolation
14
Going Multilingual and Multimedia
• IETF effort started ~1990 to standardize coding and
identification for non-Latin script content
– Not the first use of those scripts in Internet email
– Just mechanisms to identify what was being used,
thereby promoting interchange
• Language issues immediately came into play
• Effort expanded to multimedia mail, etc.
• Result was MIME
– Structured messages
– Content/Media type and “charset” identification
– Plus multimedia stuff
(another talk)
• And an SMTP extension/negotiation mechanism
15
The Internationalization Tradeoff
and People
• More accessibility to Internet but more
fragmentation:
– Obvious advantages for communication within a
language/script community
– Disadvantages for communication among people and
communities who use different languages and/or
scripts
• Enables local content
– More accessibility
– Translation possible, but with all the usual problems
– Email bodies are content
16
Rare and Endangered Languages and
Scripts
• Really quite important (another talk by someone else)
• May not benefit from some
internationalization approaches
– Applications software rarely adopted
– Inability to render a script produces
meaningless displays (□□□□ or ????)
– The “wait for Unicode” problem
• Further drive toward major languages
17
Requirements for internationalized
message content
• Either
– Coding scheme to transmit as ASCII only, or
– Reliable way to indicate extensions are in use
(did both)
• Clear identification of Character Set and encoding
used (“charset”)
• Optional identification of language
• SMTP extension mechanism
– Included provisions for non-ASCII-coded message
bodies
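A small sketch of the first three requirements, again with Python's standard `email` package: non-ASCII text is labeled with a charset, coded so that only ASCII travels on the wire, and optionally tagged with a language. The addresses and text are made up for illustration.

```python
from email.message import EmailMessage

text = "Grüße aus Köln\n"

msg = EmailMessage()
msg["From"] = "alice@example.org"
msg["To"] = "bob@example.com"
msg["Subject"] = "Greetings"
msg["Content-Language"] = "de"  # optional identification of language

# Label the character set and choose a coding scheme (base64)
# so that the transmitted form is ASCII-only.
msg.set_content(text, charset="utf-8", cte="base64")

wire = msg.as_bytes()

print(msg.get_content_type(), msg.get_content_charset())
print(msg["Content-Transfer-Encoding"])
print(wire.isascii())
```

Choosing `cte="8bit"` instead would correspond to the other branch on the slide: sending non-ASCII octets directly, which is only safe when the transport has indicated it can carry them.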
18
ESMTP and MIME
[Figure: one message as seen by ESMTP and MIME]
Envelope: EHLO, MAIL FROM:, RCPT TO:, DATA
Headers: From:, To:, Subject:, Date:
Content: Source Message… (label repeated three times in the figure)
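A rough illustration of the envelope/header split named on this slide. The envelope is the set of SMTP commands; the headers travel inside the data that follows `DATA`. The helper `envelope_commands` and all host names and addresses are hypothetical, and the sketch does not talk to a real server.

```python
from email.message import EmailMessage


def envelope_commands(client_host, envelope_from, envelope_to):
    """Return the client-side SMTP commands that make up the envelope."""
    commands = [f"EHLO {client_host}", f"MAIL FROM:<{envelope_from}>"]
    commands += [f"RCPT TO:<{rcpt}>" for rcpt in envelope_to]
    commands.append("DATA")  # headers and body follow this command
    return commands


# The header fields describe the message to its readers...
msg = EmailMessage()
msg["From"] = "alice@example.org"
msg["To"] = "list@example.net"
msg["Subject"] = "Envelope versus headers"
msg.set_content("Hello.\n")

# ...while the envelope tells the transport where to deliver it.
commands = envelope_commands(
    "client.example.org",
    "bounces@example.org",
    ["bob@example.com", "carol@example.com"],
)

for line in commands:
    print(line)
```

Here the envelope recipients differ from the `To:` field, as happens with mailing lists: transport follows the envelope and does not need to read the headers.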
19
The Internationalization Tradeoff and
Computer Networks
• With one, interconnected, network
– Computers are not very smart
– Mnemonics, acronyms, and codes don’t translate
• Alias models do not scale well
• Some lessons there about domains (another talk)
• In particular, when the audience is computers
– Actual protocol elements do not need translation
(at least in theory)
– Identifier strings used with protocol elements may
not translate (or need to)
20
Be Careful What You Try to
Internationalize
21
Internationalizing Domain Names
• Significant pressure for mnemonics in local
scripts
– “All will be well if we work at 2nd level and below”
– Some incorrect conceptions about DNS
– In particular, cannot enforce language
– Whoops, need TLDs (!)
• IDNA and coding
(another talk)
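For a feel of what "IDNA and coding" means in practice, here is a short sketch using Python's built-in `idna` codec, which implements the older IDNA 2003 rules. The helper names and the sample domain are assumptions chosen for illustration.

```python
def to_ascii_domain(domain):
    """Convert a Unicode domain name to its ASCII-compatible form."""
    return domain.encode("idna").decode("ascii")


def to_unicode_domain(domain):
    """Convert an ASCII-compatible domain name back to Unicode."""
    return domain.encode("ascii").decode("idna")


encoded = to_ascii_domain("bücher.example")
decoded = to_unicode_domain(encoded)

print(encoded)  # xn--bcher-kva.example
print(decoded)  # bücher.example
```

The DNS itself only ever sees the `xn--` form; the mapping is applied by applications on either side.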
22
Are IDNs Necessary?
• Socially and politically, definitely yes
• If search is used more than remembering or
guessing domain names, maybe not.
• Favorites and bookmarks can be anchored in
any language and mapped to domains in any
script
23
Beyond content to addresses
• Internationalization tradeoffs still a problem
– Good within language/script communities
– Problem when sender and recipient use different
ones.
– If I cannot read or type your address, we have a
problem (noticed in postal mail a long time ago)
• Updating email transport systems is easy
– Legacy conversion is harder
– Interface to and in MUAs is really hard.
• Unlike content, multiple character codes are a
problem for addresses
24
Messages with New Addresses to Old
Systems
• No conversion gateways
– Sender System (MSA or MTA): Can you accept
this?
– Receiver MTA: No
– Sender MTA: ok, goodbye… will tell the user
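The exchange above can be sketched as a small decision function. `SMTPUTF8` is the keyword of the standard SMTP extension for internationalized addresses; the function names, return strings, and sample addresses are illustrative assumptions, and no network traffic is involved.

```python
from typing import Iterable


def needs_smtputf8(address: str) -> bool:
    """True when the address contains any non-ASCII character."""
    return not address.isascii()


def decide_delivery(sender: str, recipient: str,
                    advertised: Iterable[str]) -> str:
    """Decide what the sending system does, given the keywords the
    receiving MTA advertised in its EHLO response."""
    if not (needs_smtputf8(sender) or needs_smtputf8(recipient)):
        return "send"
    if "SMTPUTF8" in {keyword.upper() for keyword in advertised}:
        return "send with SMTPUTF8"
    # No conversion gateway and no downgrade: give up and tell the user.
    return "reject and notify sender"


old_system = ["8BITMIME", "PIPELINING"]
new_system = ["8BITMIME", "PIPELINING", "SMTPUTF8"]

print(decide_delivery("alice@example.org", "δοκιμή@παράδειγμα.example", old_system))
print(decide_delivery("alice@example.org", "δοκιμή@παράδειγμα.example", new_system))
print(decide_delivery("alice@example.org", "bob@example.com", old_system))
```

All-ASCII mail is unaffected; only messages that actually use the new address forms depend on the receiver's support.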
26
Mail Transport
[Figure: mail transport path. Labels shown: Source Message…, MUA, MSA, MTA, Gateway, Relay, Relay, MTA, Delivery Process, Retrieval & Presentation]
27
Why no “downgrading”?
• Note: local-part@domain
• Constraints imply
– No way to do IDNA-like mapping of
addresses
– Local-part may be an arbitrary string;
domain not much better
• No translation either
• Transliteration not reliable even if agreement
could be reached
28
Email Extended for Non-ASCII
Addresses - Characteristics
• local-part@domain – entirely Unicode UTF-8
• Requires non-ASCII Unicode support in header
field data
• Addresses in envelope
– Supported through SMTP extension
– No fallback or translation/coding in transit.
– System accepting the extensions must be prepared for
any Unicode-supported script
• New addresses + older systems: No
communication
29
I Did Not Talk About MUAs
• Always the hard part
– Need to understand people and behavior, not just
computers
– Figuring out what to do when something is not
understood is hard too
• Not clear that we know how to build a perfect
one, even for all-ASCII messages and systems
(another talk)
31
As the Extensions Deploy…
• More Internet accessibility to people
unfamiliar with Latin characters
• Better ability to use non-basic-Latin email
addresses
– Both local parts and domain names
• Better communication within language
communities
• Probably little change between communities.
– Learning that from inevitable problems
32
Email Probably Has A Future
• It is as universal as human communication
• But humans still communicate better when
– Same language
– Same writing system
– Same culture
• More internationalized email probably won’t
change that
33
Thank you
Bring questions tomorrow.
34