Herein-UNL presentation

UNIVERSAL
NETWORKING
LANGUAGE
UNDL FOUNDATION
UNL
BAMAKO
7/05/05
Content
1. Seeing is believing
2. Response to challenge of our time
3. Convergence of IT, Knowledge, Language
4. What we can do with it
5. How it works
6. What it takes to have it
7. A story
UNITED NATIONS: 5x6=30 Pairs
[Diagram: direct translation between every pair of the six UN official languages: Arabic, Chinese, English, French, Russian, Spanish]
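The 5x6=30 figure can be checked with a little arithmetic. The sketch below counts ordered (source, target) pairs for direct translation and compares that with a pivot design in which each language needs only one Enconverter and one Deconverter. The function names are illustrative, and the count of two converters per language is an assumption drawn from the Enconverter/Deconverter diagrams, not a figure stated on the slide.

```python
def direct_pairs(n: int) -> int:
    """Ordered (source, target) pairs among n languages: n * (n - 1)."""
    return n * (n - 1)


def pivot_converters(n: int) -> int:
    """Assumed pivot design: one enconverter plus one deconverter per language."""
    return 2 * n


for n in (6, 15):
    print(f"{n} languages: {direct_pairs(n)} direct pairs, "
          f"{pivot_converters(n)} converters through a pivot")
```

For the six UN official languages this gives 30 direct pairs against 12 converters; the gap widens as languages are added.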
[Diagram: with UNL as the pivot, an Enconverter turns each language (Arabic, Chinese, English, French, Japanese, Russian, Spanish, etc.) into UNL, and a Deconverter turns UNL back into each language]
UNL System Architecture
[Diagram: Chinese, French, Arabic and Hindi speakers connected over the Internet through the UNL System]
The property rights of the UNL belong to the United Nations
© UNDL Foundation. All rights reserved
UNL Document creation
[Diagram: a Spanish content developer enconverts Spanish content through the Spanish UNL Language Server; the resulting UNL document is published as a web page using UNL on a web server on the Internet]
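[Diagram: UNL Language Server. ENCO converts natural language (NL) into UNL using Analysis Rules; DECO converts UNL into natural language using Generation Rules and the Co-occurrence Dictionary; both use the UNL-Language Dictionary and the Knowledge Base]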
Deconverter
[Diagram: Deco, within the UNL System, converts UNL into natural language (NL) using Generation Rules, the UNL-Language Dictionary, the Co-occurrence Dictionary and the Knowledge Base]
Enconverter
[Diagram: Enco, within the UNL System, converts natural language (NL) into UNL using Analysis Rules, the UNL-Language Dictionary and the Knowledge Base]
UNL LANGUAGE SERVER
[Diagram: a Language Server combines an Enconverter (EnCO) and a Deconverter (DeCO). In block 1 the USER works with the UNL Editor and UNL Viewer on a Language Server to produce a UNL document; in block 2 the UNL Proxy places it on the Internet; in block 3 Language Servers for Chinese, Arabic, Spanish, Hindi, Japanese and English, each with EnCO and DeCO, convert between UNL and their language]
LANGUAGE SERVER
UNL = Native Language
Enconversion Program (EnCO)
Deconversion Program (DeCO)
UNL Dictionary
UNL Grammar
UNL Knowledge Base
The UNL over the Web
[Diagram: Chinese, French, Hindi and Spanish speakers connected through the Web, each through a UNL Language server; languages shown: French, Hindi, Spanish, Russian, English, Chinese]
MERCI
Thank you
History of Great Discoveries and the
Great Inventions
• To see the Forest beyond the Trees.
• The little stories of everyday life hide the great history and
the great changes in History
• Most of them are unexpected
• Technologies are a response, a reaction to great challenges
– Technologies have the power of transforming everything,
of catalyzing other movements, of multiplying effects and
opportunities;
– Technologies have a higher or lower power of transforming
life;
– When they happen, there is a regional Risorgimento.
[Diagram: Enconverter and Deconverter linking English, Russian, Spanish, Chinese, French, Japanese, Arabic, etc. through UNL]
What can the UNL do?
1) Machine Translation
2) Multilingual Information Service (eCommerce, e-learning, e-government,
e-TV)
3) Information Retrieval System (eCommerce, e-learning, e-government)
4) Expert system
5) Encyclopaedia
Inside the UNL System
WHAT IS THE UNL?
A set of resources
comprising:
– Linguistic Resources
– A technical infrastructure
– Knowledge Assets
Linguistic resources
• Specifications of the UNL
– Universal Words (UWs)
– Master Definitions
– Attributes
– Relations
– Grammar
Technical Resources
• UNL Servers
– Enconverter
– Deconverter
• Proxy Server
• UNL Editor
• UNL Verifier
• UNL Explorer
• Manuals
Knowledge Assets
• Dictionaries: UWs, Master, Natural
Language (NL)
• Grammatical rules for each NL
Competitors?
• Computer systems which can deal
with knowledge and contents have
already been developed.
• Representations of knowledge or
contents are different from each
other.
• Moreover, a representation depends
on a language.
• Knowledge or contents of a
Who?
• In the case of machine translation,
even if we combine all the results of
research and development on
machine translation, we cannot
realize a multilingual machine
translation system that can break
language barriers.
Advantages of UNL
• The UNL, a common language for
computers:
– enables sharing knowledge and
contents among all systems
– overcomes language barriers
– reduces costs of developing knowledge
or contents
– facilitates knowledge processing
How it works
• The UNL can express concepts like
words do in natural languages. Ex:
horse, cavalo, cheval, 馬 (Chinese
ideogram).
• The UNL can express information
like natural languages do. Ex: full
description of a horse as in an
encyclopaedia entry.
How does UNL express information?
• The UNL expresses information by
distinguishing objectivity from
subjectivity.
• Objectivity is expressed using UWs
and relations.
• Subjectivity is expressed using
attributes.
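The split between objectivity and subjectivity can be pictured as a small data structure. The sketch below is a minimal illustration under stated assumptions, not the official UNL notation: the class names are hypothetical, and the relation label `agt` and the attributes `@entry`, `@past` and `@def` are used only as examples of the kind of labels the UNL specifications define.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class UW:
    """A Universal Word plus the attributes attached to it (subjective part)."""
    headword: str
    attributes: Tuple[str, ...] = ()

    def __str__(self) -> str:
        return self.headword + "".join("." + a for a in self.attributes)


@dataclass(frozen=True)
class Relation:
    """A labelled link between two UWs (objective part)."""
    label: str
    source: UW
    target: UW

    def __str__(self) -> str:
        return f"{self.label}({self.source}, {self.target})"


# Illustrative encoding of "The horse ran."
run = UW("run", ("@entry", "@past"))   # attributes carry tense and focus
horse = UW("horse", ("@def",))         # attribute carries definiteness
sentence = [Relation("agt", run, horse)]

for relation in sentence:
    print(relation)
```

Here the UWs and the relation say who did what, while the attributes record how the speaker presents it.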
Who are we

300 computer scientists and linguists from
universities and research institutions around
the world
Language Coverage
• Languages already engaged:
• 6 UN official Languages:
Arabic, Chinese, English, French,
Spanish, Russian
• Other languages:
Hindi, Indonesian, Italian,
Japanese, Korean, Mongolian,
Latvian, Portuguese, Thai
UNL R&D Network (1)
Language      Institution
Arabic        The Royal Scientific Society, Jordan
Chinese       Ministry of Electronics Industry, China
English       UNL Centre
French        University Joseph Fourier, France
German        Univ. of Saarbrucken, Germany (inactive)
Hindi         Indian Institute of Technology, India
Indonesian    BPPT Technology, Indonesia
Italian       Pisa CNR, Italy
Japanese      UNL Centre
UNL R&D Network (2)
Language      Institution
Mongolian     Mongol Pedagogical University, Mongolia
Latvian       University of Latvia, Latvia (inactive)
Portuguese    University of Sao Paulo, Brazil
Russian       Russian Academy of Science, Russia
Spanish       University Politecnica of Madrid, Spain
Swahili       Univ. of Dar es Salaam, Tanzania (inactive)
Thai          NECTEC, Thailand (inactive)
UNL Centre    UNDL Foundation
Where we are now
1) UNL the language, Relation and Attributes
(specification)
Version Approved by a committee of
Scholars and patent recognized by PCT
countries (WIPO) 2002
Dictionary of Universal Words, Knowledge
Base (Increasing volume of entries on a
continuous development)
Where we are now
2) Language Server: (Operational)
a) Deconverter (Language Generation System)
Deco: Operational
Generation Rules and Dictionary (each
language):
continuous development
b) Enconverter (UNL Generation System)
Enco: Operational
Analysis Rules, Dictionaries (each language):
continuous development
Where we are now
3) Tools & Applications:
• UNL Proxy Server: Operational
• UNL News: 4 publications in 2002
• UW Gate: under tests
• UNL Verifier: under tests
• UNL Viewer: Prototype
• UNL Editor: Prototype
• UNL Encyclopedia: Prototype
• UNL Explorer: Prototype
• Org Explorer: Prototype
Vision of the Future
• Applications in all fields of human
activities
• Advantages for international
organizations
• Bridging the Digital Gap
• Benefits for Multilingual Countries
• Content driven Technology:
• hence opportunities for employment and
self employment
• Low cost clean investment
Challenges
• Financial Resources
• Persistence: working towards
cumulative results
• More language coverage
• Expanding the R&D network
Open Policy
UNL (system) should be developed by
all peoples in the world.
• We will open:
UNL specifications
Universal Word Dictionary
Format of UNL-Language dictionary
Format of Deconversion rule
What we expect to be developed by
people in the world
UNL (system) should be developed by
all peoples in the world.
• Universal words necessary for each
language
• Language Servers for new languages
and new domains
What we expect to be developed by
people in the world
• Application systems such as:
Information Retrieval System
Search Engines
Browsers
Editors/Word Processors
Machine translation Systems
The UNL System
1) UNL (Universal Networking Language)
Dictionary of Universal Words, Relation, Attribute,
Knowledge Base
2) Language Server
i) Deconverter (Language Generation System)
Deco, Generation Rules, Dictionary (each language)
ii) Enconverter (UNL Generation System)
Enco, Analysis Rules, Dictionaries (each language)
UNL Proxy Server
• Searches for UNL in the web page
accessed by the user.
• Sends the UNL document to the
Language Server for the
selected language.
• Updates the web page so that it is displayed
in the user’s chosen language.
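The three proxy steps above can be sketched as follows. This is a toy illustration under stated assumptions: the `{unl}...{/unl}` markers around embedded UNL, the `render_page` function, and the stub language servers are all hypothetical, and a real Language Server would generate text from the UNL rather than return a fixed sentence.

```python
import re
from typing import Callable, Dict

# Assumed convention for this sketch: UNL is embedded between {unl} and {/unl}.
UNL_BLOCK = re.compile(r"\{unl\}(.*?)\{/unl\}", re.DOTALL)


def render_page(page: str, language: str,
                servers: Dict[str, Callable[[str], str]]) -> str:
    """Find UNL in the page, send it to the chosen server, update the page."""
    deconvert = servers[language]  # step 2: Language Server for that language
    # Steps 1 and 3: locate each UNL block and replace it with generated text.
    return UNL_BLOCK.sub(lambda m: deconvert(m.group(1).strip()), page)


# Stub language servers standing in for real deconverters.
def fake_english_server(unl: str) -> str:
    return "The horse ran."


def fake_spanish_server(unl: str) -> str:
    return "El caballo corrió."


servers = {"en": fake_english_server, "es": fake_spanish_server}
page = "<p>{unl}agt(run.@entry.@past, horse.@def){/unl}</p>"

print(render_page(page, "en", servers))
print(render_page(page, "es", servers))
```

The same stored page yields a different rendering for each selected language, which is the point of keeping the content in UNL.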
UNL Editor
UNL Editor – select sentence
UNL Encyclopaedia
• “Infinite library” (Jorge Luis Borges)
• Human Knowledge
• Knowledge system
• Encyclopaedias
• How to build the UNL Encyclopaedia
UNL Patent
• Purpose: Gift to Humankind
• Submitted in 1999 to the Japanese
Patent Office
• Recognized by PCT countries
(WIPO) 2002
• Application for patent and
commercial protection in major
countries
UNL Global Network of R&D
• 300 computer scientists and
linguists from universities and
research institutions around the
world
Languages covered: Arabic,
Chinese, English, French,
Japanese, German, Hindi,
Indonesian, Italian, Latvian,
Mongolian, Portuguese, Russian,
UNL Society
• Purpose: Collaboration in R&D
• Membership: Individual and
institutions
• At present: over 300 members
from 30 countries
• Future perspective: Collaboration,
support users
WE…
A global network of
computer Scientists
and Linguists +
philosophers,
mathematicians…
UNDL FOUNDATION
8, JULY 2003
UNL: A LANGUAGE
• UNL is a “language” for computers (different
from a “computer language”)
• It expresses information and knowledge in
digits, the characters that all computers
understand
• It enconverts contents from any natural
language into UNL and then deconverts them into
any other natural language
• The UNL enables peoples to build the
“reservoir” of human knowledge
from and to diverse natural languages
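The enconvert-then-deconvert path can be illustrated at the level of single words. The sketch below is a toy under stated assumptions: the dictionaries, the language codes and the function names are hypothetical, and a real Enconverter and Deconverter work on whole sentences with analysis and generation rules rather than word lookup.

```python
from typing import Dict

# Toy dictionaries: natural-language word -> shared concept label.
DICTIONARIES: Dict[str, Dict[str, str]] = {
    "en": {"horse": "horse"},
    "pt": {"cavalo": "horse"},
    "fr": {"cheval": "horse"},
}


def enconvert(word: str, lang: str) -> str:
    """Natural language -> shared concept (the pivot representation)."""
    return DICTIONARIES[lang][word]


def deconvert(concept: str, lang: str) -> str:
    """Shared concept -> natural language."""
    for word, known_concept in DICTIONARIES[lang].items():
        if known_concept == concept:
            return word
    raise KeyError(f"no {lang} word for concept {concept!r}")


def translate(word: str, source: str, target: str) -> str:
    """Translate through the pivot, never directly between two languages."""
    return deconvert(enconvert(word, source), target)


print(translate("cavalo", "pt", "fr"))
```

Adding a language to this toy means adding one dictionary, not one table per language pair.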
[Diagram: Enconverter and Deconverter linking English, Russian, Spanish, Chinese, French, Japanese, Arabic, etc. through UNL]
UNL: A SYSTEM
• UNL has been designed to
represent contents in a
language independent way.
• UNL is a system to support
multilingual information
services (mainly for Internet)
• It can also be used as a
machine translation system
European Heritage Network
(HEREIN)
• HEREIN is a very large document repository (all documents
written in three different languages)
• Great amount of human translation resources needed.
• Current contents written in UNL can be converted into more
languages.
• Web page: www.european-heritage.net
• The Network is currently composed of administrations and/or
mandated bodies from the following (27) countries:
– Andorra, Armenia, Belgium (Brussels-Capital, Flemish Region,
Walloon Region), Bulgaria, Croatia, Cyprus, Denmark, Estonia,
Finland, France, Georgia, Hungary, Ireland, Latvia, Lithuania,
Luxembourg, Norway, Poland, Portugal, Romania, Slovakia,
Slovenia, Spain, Sweden and the United Kingdom.
DEMO
1. See the Spanish report (.xml)
2. See the Spanish report in UNL
3. Load the Spanish report in UNL into the
Spanish language generator.
4. See the generated Spanish (output.txt)
5. See generation available in other
languages (Russian, English, Italian).
THE SIZE OF THE PROBLEM/CHALLENGE
The globalization of economic activities, of political relations
among states and of social lifestyles is generated, supported and
reinforced by global information systems.
The global village emerging from the convergence of
telecommunications carriers, global radio and television networks
and computers creates the conditions for a market, for sharing
affluence, and for enjoying cultural goods.
The global village also excludes millions from sharing affluence,
from health services and education, from enjoying leisure, from
the comfort of technology, from cultural exchange and exposure,
from participating in social activities, from benefiting from
economic activities, and from access to the market.
Population Forecasts for Major Cities in 2010 (unit: millions)
Rank  City (Country)               Population
(1)   Tokyo (Japan)                28.93
(2)   São Paulo (Brazil)           24.97
(3)   Bombay (India)               24.37
(4)   Shanghai (China)             21.67
(5)   Lagos (Nigeria)              21.09
(6)   Mexico City (Mexico)         18.02
(7)   Beijing (China)              17.97
(8)   Dhaka (Bangladesh)           17.55
(9)   New York (USA)               17.23
(10)  Jakarta (Indonesia)          17.20
(11)  Karachi (Pakistan)           17.02
(12)  Manila (Philippines)         16.06
(13)  Tianjin (China)              15.70
(14)  Calcutta (India)             15.70
(15)  New Delhi (India)            15.58
(16)  Los Angeles (USA)            13.91
(17)  Seoul (South Korea)          13.91
(18)  Buenos Aires (Argentina)     13.68
(19)  Cairo (Egypt)                13.42
(20)  Rio de Janeiro (Brazil)      13.32
(21)  Bangkok (Thailand)           12.74
(22)  Tehran (Iran)                11.88
(23)  Istanbul (Turkey)            11.80
(24)  Osaka (Japan)                10.60
(25)  Moscow (Russia)              10.37
(26)  Lima (Peru)                  10.07
[Map: locations of the 26 cities, numbered by rank]
Source: World Bank Data/Nishi
Top 10 Languages by Population
RANK  LANGUAGE                  POPULATION
____________________________________________
1.    CHINESE (MANDARIN)        885,000,000
2.    SPANISH                   332,000,000
3.    ENGLISH                   322,000,000
4.    BENGALI                   189,000,000
5.    HINDI                     182,000,000
6.    ARABIC (ALL COUNTRIES)    177,000,000
7.    PORTUGUESE                170,000,000
8.    RUSSIAN                   170,000,000
9.    JAPANESE                  125,000,000
10.   GERMAN, STANDARD           98,000,000
____________________________________________
Source: Ethnologue: Languages of the World
WHAT DOES UNL OFFER?
The UNL provides users with a multilingual platform and a
set of software tools enabling them to communicate with
other people in their respective languages.
With the multilingual platform in place, users can share
information and knowledge across native languages.
Citizens, governments, international organizations, and
enterprises will all benefit from the UNL, as it provides
opportunities for information sharing, education, and
e-business. The ultimate goal is to promote sustainable
development, dialogue among civilizations, economic
prosperity for all nations as well as peace among them.
Understanding How
the UNL System
Works
In Block 1, the USER writes a document in his/her native language
using a PC equipped with the UNL Language Server. The UNL “Editor” tool,
in connection with the UNL Language Server, enables him/her to write it in UNL.
As the USER types word by word, sentence after sentence, a full paragraph,
or the whole document, the UNL Enconverter software (EnCO) instantaneously
“enconverts” all inputs into UNL representations. This can be done
interactively between the writer and the computer. The UNL “Viewer”
shows the document back to the writer as it is “deconverted” from
the UNL into his/her language, which represents how the system
understands the original document being produced by the writer. This
allows him/her to check the correctness of the “enconversion”. In
such an interactive process, the writer can produce UNL documents as
accurate as he/she wishes. USERS do not need to know UNL, nor how the
EnCO and DeCO programs operate; they just need to input correct
sentences in their native languages with the help of the UNL Editor,
as with a word processor.
Once You Have It in
UNL, You Have It in All
Languages
In Block 2, the UNL document is placed on the
Internet through a “UNL Proxy server”.
Information, text, documents and web pages written
in UNL can be stored in archives, or downloaded
and shared throughout the Internet with multiple
users in all native languages equipped with the
“UNL Language Server” set. UNL documents
can be processed
on stand-alone computers, or exchanged
in local networks (LAN), or distributed
through WWW servers. They can also be
forwarded by a file transfer program. UNL
documents received at a network terminal
can be deconverted into each native
language and read by anyone on a
browser equipped with the “UNL
Language Server” set. This is one of the
outstanding features of the UNL System: it
allows for synchronous and asynchronous
operation of multiple language servers
simultaneously.
Once You Have It in
UNL, You Have It in All
Languages
Block 3 shows that each native language
has its own UNL Language Server. This
allows any user to interact with others in
his/her own language, while the others use
theirs, through the Internet. The number of
languages that can be supported is
unlimited.
The Language Servers are all equipped
with the same set of software as in
Block 1, i.e., the EnCO and DeCO programs,
as well as the Editor and Viewer tools,
which are connected to the Master
Dictionary of UWs and the UNL
Knowledge Base. Users, therefore, may
write, read and exchange UNL
documents from any language that has
developed its UNL Language Server.
They can also improve the existing UWs
Dictionary, or create one where it does
not exist, and expand the Knowledge Base
indefinitely. For these tasks, the
necessary tools, specifications,
instructions and manuals are available
on the web.