UNION INTERPARLEMENTAIRE
INTER-PARLIAMENTARY UNION
Association of Secretaries General of Parliaments
COMMUNICATION
from
Mr José Manuel ARAUJO
Deputy Secretary General of the Assembly of the Republic of Portugal
on
PARLIAMENTARY TERMINOLOGY: CREATING A TERMINOLOGICAL AND TEXTUAL
DATABASE AT THE PORTUGUESE PARLIAMENT
Geneva Session
October 2014
PARLIAMENTARY TERMINOLOGY: BUILDING A TERMINOLOGICAL AND
TEXTUAL DATABASE AT THE PORTUGUESE PARLIAMENT
Introduction
The Terminological and Textual Database of the Assembly of the Republic (BDTT-AR) is a
multilingual linguistic resource offering Portuguese, English and French. It consists of a
structured database that compiles the terminology in use within the Portuguese Parliament.
The database lists Portuguese terms with their equivalents in English and French and with
definitions of the terms, and all the sources used are rigorously documented.
The terminological database is linked to a textual database that provides access to the texts
from which the terminological information contained in the BDTT-AR was extracted. It is
thus possible to check the occurrence of the terms in one or several texts and see them in the
various contexts in which they are used.
The BDTT-AR is the outcome of a multidisciplinary collaboration between the translation
service of the Assembly of the Republic and the Linguistics Centre of the Universidade Nova
de Lisboa (CLUNL).
1. Objectives
The initial purpose of the database was to meet the needs of the Portuguese Parliament’s
translators by creating a multilingual linguistic resource that would help to solve translation
problems, and also to serve anyone who has to write texts in English and French.
Another goal emerged early in the project: the need to organise the body of parliamentary
knowledge.
Linking the Parliament with university terminologists contributed to the scientific accuracy of
the available data and helped to improve the coherence of the texts produced and made
available by the Assembly of the Republic, which thus enhanced the accuracy of its
discourse.
This project was accompanied by other internal measures in the area of translation:
• centralisation of translation requests in the Assembly translation service
• acquisition of a tool to aid translation
• creation of an application for translation requests on the Intranet
• harmonisation of terminology in English and French of several reference texts that
contribute to the smooth functioning of Parliament, including the Constitution of the
Portuguese Republic, the Rules of Procedure of the Assembly of the Republic, the
Statute of Members, the Electoral Law, etc.
2. Target audience
The database users on the Intranet are the institution’s translators, but anyone who needs
to produce specialised texts in Portuguese, English or French can make use of this
multilingual resource, such as MPs, jurists, writers and advisers.
Users who access the database on the Internet include freelance translators, including those
who work with the Assembly of the Republic (who are in fact required to consult the
database); students; teachers; and anyone else with a strong interest in the parliamentary
domain.
3. Work structure
The collaboration between the two institutions was organised as follows: the University was
responsible for developing work methods suited to the AR, designing the databases,
providing support and scientific knowledge, and training working groups involved in the
project.
The AR was charged with providing the logistical resources, creating the necessary working
groups, applying the working methods and developing the database.
The project involved three major stages:
a) implementation;
b) development;
c) maintenance.
The first two stages included several steps that began by assessing the terminology needs of
the AR.
a) Implementation
Meetings were held with the various services of the AR that produce any kind of
parliamentary documentation to decide upon the relevant documents to work with, and
how to carry out the work, as well as the most important specialised areas.
Having identified the needs of the AR in terms of terminology and knowledge organisation,
the theoretical and methodological criteria were established that led to the design of the
database by the CLUNL and its development by the Assembly of the Republic.
b) Development
This stage involved creating the terminological record with relevant terminological fields and
the information they should contain, always keeping the users of the database in mind.
A working database was also developed for data entry. The interface for the database to be
available for consultation on the Intranet and Internet was designed afterwards.
These databases were designed so that data are published immediately, without any data
migration. Records can be shown or hidden according to the criteria established by the AR,
and any such change takes effect at once in the BDTT-AR, both on the Intranet and on the
Internet.
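As an illustration only, the idea of publishing without migration can be sketched as a single store in which each record carries a visibility flag. The class, the field names and the sample terms below are hypothetical and do not describe the actual BDTT-AR implementation.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    term: str              # Portuguese term (illustrative sample)
    visible: bool = False  # hypothetical flag set under the AR's criteria


def published(entries):
    """Return the terms that readers on the Intranet and Internet can see."""
    return [e.term for e in entries if e.visible]


# One shared store serves both data entry and consultation.
store = [Entry("deputado", visible=True), Entry("grupo parlamentar")]
before = published(store)

# Changing the flag publishes the entry at once; nothing is copied or migrated.
store[1].visible = True
after = published(store)
```

Because consultation reads the same store that the working groups edit, a change of flag is the only step needed to publish or withdraw a record.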
The third step was the compilation of the corpus, i.e., the organisation of the texts deemed
relevant for this project, which the AR had selected based on the initial meetings. The corpus
initially consisted of eight texts and has been gradually supplemented with new texts as the
project has progressed. Texts already included have been updated too (the corpus is
composed of legislation and whenever it is revised new terms can be introduced) and this
makes it possible to keep abreast of parliamentary terminology as it evolves over time. The
BDTT-AR corpus currently consists of twelve texts and their updates, making a total of
twenty texts.
This stage included the processing of the corpus and its computerisation, the creation of
criteria for organising it, as well as criteria for the extraction of terms.
The corpus was then processed semi-automatically with a specific tool to enable the
Terminology Group to identify and choose candidates for terms.
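The communication does not name the tool used or describe its method. As a minimal sketch of what semi-automatic processing can look like, the following counts recurring word sequences in a text and proposes them as candidates for human review. The function name, the thresholds and the English sample text are assumptions made for illustration.

```python
import re
from collections import Counter


def candidate_terms(text, max_len=3, min_count=2):
    """Count word n-grams and keep those that recur as candidate terms."""
    words = re.findall(r"\w+", text.lower())
    counts = Counter(
        " ".join(words[i:i + n])
        for n in range(1, max_len + 1)
        for i in range(len(words) - n + 1)
    )
    return [(gram, c) for gram, c in counts.most_common() if c >= min_count]


# Illustrative sample text, not taken from the BDTT-AR corpus.
sample = (
    "The Rules of Procedure govern each parliamentary group. "
    "A parliamentary group may table a motion."
)
candidates = candidate_terms(sample)
```

The resulting list contains the useful candidate "parliamentary group" alongside noise such as the article "a", which is why a step of this kind can only propose candidates and the choice remains with people.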
In addition to creating lists of candidate terms extracted from the corpus and preparing
them for validation by the experts, the terminology process included the validation of the
terms in Portuguese and the search for their equivalents in English and French.
The contribution of experts in this case was extremely important because it allowed the data
to be validated, and only the terms validated by all the experts were made available on the
Intranet and Internet. The validation was performed in close collaboration between the
Terminology Group and the Group of Experts, both supported by the CLUNL.
It should be noted that the good cooperation established between the Group of Experts and
the CLUNL was crucial to this stage of the project. The experts, who received specific training
in this field from the CLUNL, diligently assumed the role of ‘validation agents’, helping to
ensure the accuracy of the terms made available.
After the validation process, the Terminology Group then searched for the equivalents in
English and French, based on the official translations of the above-mentioned reference
texts provided by the AR.
c) Maintenance
Finally, the maintenance stage involved ensuring that the AR could work independently,
since it would be responsible for managing the database, consulting the university
terminologists whenever necessary.
At this stage, and in order to maintain the momentum of the BDTT-AR, it was decided to
continue making the terms in the database available gradually.
4. Definitions
The availability of definitions in Portuguese was requested by the AR during the second stage
of the project, which enriched the content provided by the BDTT-AR and broadened the
database’s target audience. That audience was no longer limited to the institution’s
translators: it came to include other professionals interested in the parliament and related
issues, whether or not they worked in a foreign language, and even students.
This stage of the project had two main steps: reviewing the existing definitions and
drafting new definitions for the terms available. The Terminology Group was responsible
for reviewing and editing the definitions, preparing them for subsequent validation by the
Group of Experts.
In the process of reviewing definitions, the Terminology Group examined a list describing
some of the terms most relevant for the Parliament. The definitions in this list were then
revised/rewritten in accordance with the criteria established for the drafting of definitions
for the BDTT-AR.
New definitions were drafted en bloc, i.e., by first identifying and defining the generic
term and then setting out the characteristics specific to each related term. This approach ensured
consistency in the wording of the definitions, thereby significantly enhancing their quality
and accuracy.
The definitions, like the terms, were also subjected to validation by members of the Group of
Experts. The experts were sent lists of definitions, which were discussed at weekly
meetings attended by the whole Group of Experts and by some members of the Terminology Group
and the CLUNL, which chaired the meetings.
After each definition had been discussed it was validated or amended in accordance with the
suggestions of the experts. If no agreement was reached, the definition could be rewritten
by the Terminology Group and submitted at the next meeting.
The validation meetings, which took place between December 2010 and July 2012, were
highly productive and resulted in the definition of the terms regarded as most relevant.
The innovative character of this type of validation should be noted, because it managed to
secure a meaningful dialogue between academic terminologists and parliamentary experts,
with positive results for both institutions.
5. Working groups
Two working groups were created for the project at the AR, composed of parliamentary staff
from different services and departments of the institution:
a) Terminology Group;
b) Group of Experts.
The Terminology Group was composed of translators, terminologists, documentalists and
writers, and it was responsible for compiling the corpus, extracting the terminology from
texts, drawing up lists of terms for validation, drafting definitions, etc.
The Group of Experts, composed mainly of jurists with extensive parliamentary experience
and thorough knowledge of the legislative texts produced by or used at the Assembly, was
responsible for the validation of the terms and definitions, i.e., of all the information
available in the BDTT-AR on the Intranet and Internet. This group had an odd number of
members in case a vote was needed with respect to the validation of terms and definitions.
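The reason for an odd number of members can be shown with a short sketch. The communication does not state the voting rule; a simple majority is assumed here, and the function name is hypothetical.

```python
def panel_decision(votes):
    """Return True if a majority of experts accept the item.

    `votes` holds one boolean per expert; an odd-sized panel cannot tie.
    """
    if len(votes) % 2 == 0:
        raise ValueError("panel size must be odd to rule out ties")
    return sum(votes) > len(votes) // 2


outcome_three = panel_decision([True, True, False])               # 2 of 3 accept
outcome_five = panel_decision([False, True, False, False, True])  # 2 of 5 accept
```

With an odd panel every vote yields a decision, so a disputed term or definition never remains blocked by a tie.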
As mentioned, both groups were composed solely of parliamentary staff, who received
continuous training throughout the project from the CLUNL in the area of terminology,
whenever necessary.
The groups were also supported by a terminologist from the CLUNL, who assisted the
Terminology Group, and clarified the doubts of the Group of Experts.
6. Project implementation – availability on the Intranet and Internet
The first stage of this project ran from 1 April 2005 to 1 April 2007; the second ran from 7
September 2009 to 7 September 2012.
The terminology database was made available on the Intranet on 12 June 2006, with 400
terms and their equivalents in English and French. The same database was made available on
the Internet in November 2007.
January 2010 saw the development of the textual database, which was made available on the
Intranet in February 2011 and on the Internet in September of the same year.
We currently have about 1500 terms available on the Internet from a total of about 3500
terms entered into the working database.
As we took terms from the corpus we realised that parliamentary terminology was very
repetitive, hence the relatively small number of available terms. In addition, the difficulty of
establishing the limits of "parliamentary terminology" meant that the experts rejected the
terminology taken from the selected corpus that was not directly related to the functioning
of parliament.
7. Services provided
The BDTT-AR is a multilingual resource consisting of a terminological database and a textual
database.
The terminological database is based on the Portuguese context and provides equivalents in
English and French that relate to that context. Each terminological record contains
information relating to the parliamentary term, such as the grammatical category, definition,
acronym, equivalents and the respective contexts.
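The fields listed above suggest a record structure along the following lines. This is a sketch under assumptions: the class and attribute names are invented for illustration, and the real BDTT-AR schema is not described in this communication.

```python
from dataclasses import dataclass, field


@dataclass
class Equivalent:
    language: str      # "en" or "fr"
    term: str
    context: str = ""  # sentence in which the equivalent occurs


@dataclass
class TermRecord:
    term: str                   # Portuguese entry term
    grammatical_category: str
    definition: str = ""
    acronym: str = ""
    equivalents: list = field(default_factory=list)

    def equivalent(self, language):
        """Return the equivalent terms recorded for one language."""
        return [e.term for e in self.equivalents if e.language == language]


# Illustrative record; contexts and definition are left empty on purpose.
record = TermRecord(
    term="Assembleia da República",
    grammatical_category="noun",
    acronym="AR",
    equivalents=[
        Equivalent("en", "Assembly of the Republic"),
        Equivalent("fr", "Assemblée de la République"),
    ],
)
english = record.equivalent("en")
```

Keeping each equivalent together with its own context reflects the point that the English and French terms are tied to the Portuguese context rather than listed as free-standing translations.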
With respect to terminology, the database allows a simple search by term or by the
beginning of the term, as well as an advanced search for truncated terms, acronyms,
equivalents, or even terms contained in a definition. The terminological record provides
linguistic information such as grammatical category and the acronym, which may sometimes
be used more often in Parliament than the term itself. There is also a ‘Phraseology’ field that
includes, for example, verbs most often used with the particular term.
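The two kinds of search can be sketched as follows. This is a simplified illustration, assuming that records are plain dictionaries and that matching ignores case; the function names and sample records are hypothetical and do not reproduce the BDTT-AR search engine.

```python
def simple_search(records, query):
    """Match records whose term equals or begins with the query."""
    q = query.casefold()
    return [r["term"] for r in records if r["term"].casefold().startswith(q)]


def advanced_search(records, query,
                    fields=("term", "acronym", "equivalents", "definition")):
    """Match records where the query occurs anywhere in a chosen field."""
    q = query.casefold()
    hits = []
    for r in records:
        values = []
        for name in fields:
            value = r.get(name, "")
            values.extend(value if isinstance(value, list) else [value])
        if any(q in v.casefold() for v in values):
            hits.append(r["term"])
    return hits


# Illustrative records only.
records = [
    {"term": "Assembleia da República", "acronym": "AR",
     "equivalents": ["Assembly of the Republic"], "definition": ""},
    {"term": "Regimento da Assembleia da República", "acronym": "",
     "equivalents": ["Rules of Procedure of the Assembly of the Republic"],
     "definition": ""},
]
by_prefix = simple_search(records, "assembleia")
by_fragment = advanced_search(records, "assembleia", fields=("term",))
by_equivalent = advanced_search(records, "rules", fields=("equivalents",))
```

The simple search finds only the term that begins with the query, whereas the advanced search on a truncated form also finds the longer term that contains it.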
In addition to the English and French equivalents, the record also provides contexts for these
equivalents, derived from official translations of texts in Portuguese entered in the textual
database.
The textual database contains all the texts that constitute the corpus from which the terms
in the terminological database were extracted, showing the terms in a wider context, as well
as the number of occurrences of the term, the kind of occurrence in the text, etc.
A search of the textual database will give the full, reduced or truncated terms in one or more
texts, since it is possible to choose the text or texts, according to the desired search.
These two databases are inter-connected, which allows navigation from one to the other,
depending on the specific interests of each user, i.e., users can start from the term and
access the text or start from the text and access the term.
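Navigation in both directions can be illustrated with a minimal sketch. The function names, the context width and the sample texts are assumptions; the actual link between the two databases is not described here.

```python
import re


def occurrences(term, texts, width=30):
    """Term to text: list (title, count, contexts) for texts containing the term."""
    pattern = re.compile(re.escape(term), re.IGNORECASE)
    results = []
    for title, body in texts.items():
        contexts = [
            body[max(0, m.start() - width): m.end() + width]
            for m in pattern.finditer(body)
        ]
        if contexts:
            results.append((title, len(contexts), contexts))
    return results


def terms_in_text(body, terms):
    """Text to term: list the known terms that occur in one text."""
    lowered = body.casefold()
    return [t for t in terms if t.casefold() in lowered]


# Illustrative texts, not taken from the BDTT-AR corpus.
texts = {
    "Text A": "Each parliamentary group may table a motion. "
              "A parliamentary group has rights.",
    "Text B": "Members sit in plenary session.",
}
hits = occurrences("parliamentary group", texts)
found = terms_in_text(texts["Text B"], ["parliamentary group", "plenary session"])
```

The first function gives the number of occurrences and the surrounding context for each text, and the second gives the entry points back into the terminological database from a chosen text.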
Also provided are the sources, which are extremely important. All data found in the BDTT-AR
are accompanied by their documentary sources, i.e., the source from which the information
was extracted or which was used to check it. Thus, in addition to verifying the occurrence of
terms in context, users themselves can consult the sources used if further information about
a particular term is required.
8. Conclusion
This database was designed keeping in mind the requirements needed to ensure the quality
of the content, which is systematically validated by the various working groups that include
linguists, terminologists, translators, documentalists, jurists and other experts from various
parliamentary areas.
Internally, the BDTT-AR brings together an extensive range of parliamentary and legal
information in Portuguese. From the point of view of translation, the database means
greater certainty in the choice and justification of the terms used and, from the point of view
of text production, it offers greater consistency in spoken and/or written pieces.
Externally, through this database, the AR demonstrates its concern for the proper
dissemination of terminological information and for the integrity and
accuracy of its discourse; the BDTT-AR may also play the role of a reference database in the
parliamentary area within Portugal’s Public Administration itself.
The BDTT-AR is also quite important with regard to the international activity of the AR,
because it contributes to its active participation with speeches and written texts in various
conferences and/or meetings.
The BDTT-AR is a dynamic database, whose content is constantly updated.
José Manuel Araújo
Assistant to the Secretary-General