A UML causal ontology for medical instrumentality (Abstract TKE 2008)

1. Context

This paper deals with the elaboration of instrumentality as part of a medical ontology, which transforms the traditional structured terminological data of a specialized domain into a “formal, explicit (conceptual) model of object ranges in a computational representation” (Budin 2007). State-of-the-art applications like i-Term and its associated modeling module i-Model allow for the graphic representation of ontologies supported by terminography. Our paper explicitly tries to implement Budin’s methodological claim and calls for the combination, in terminology, of a conceptual linguistic theory, i.e. cognitive linguistics, and the Unified Modeling Language (UML), the formal standard for conceptual modeling used in IT engineering.

In this paper we focus on instrumentality, which is traditionally defined as an associative relation. Previous research has suggested integrating causal instrumentality in i-Term as a fundamental conceptual relation in terminography (Sambre & Wermuth, forthcoming). Instrumentality is part of a causality relation between actions, in which the causing action typically transforms an initial medical state into a second, altered state, called the caused action. Contrary to more traditional vertical or generic conceptual relations, associative relations involve a timeline or process, which calls into question the Wüsterian static conception of terminology (Sambre 2005). As such, this paper is part of a larger research project in which different associative relations, such as temporality, causality and instrumentality, are extensively described in emerging, dynamic fields of research as different as telecoms, medicine and nanotechnology. We develop the model proposed by Sambre & Wermuth (2005) and further explore Wermuth’s (2007) exploratory and multilayered typology of instrumentality in medical classification rubrics.
As a general tenet, we claim that instrumentality in this specific medical text type occurs in the form of a series of subtypes, each of which plays an important role in the event structure of rubrics.

2. Objectives

We have two linked theoretical objectives: (2.1) to provide a more refined and authentic usage-based typology of English subtypes of medical instrumentality, as attested in medical abstracts and titles of research papers in scientific journals, and (2.2) to develop a correlated UML concept model for this instrumental typology.

2.1 Linguistic corpus-driven typological objective

We propose a typology of instrumentality based on the description of an authentic usage-based medical corpus of abstracts in scientific papers. Not only has instrumentality scarcely been examined; terminological work also often relies on the atomistic description of medical terms rather than on the authentic textual or discourse sequences used by domain experts. Sambre & Wermuth (forthcoming) show how semantic relations like instrumentality appear massively in medical writing, both explicitly and implicitly. The prototypical definition restricts instrumentality to “the instrument or means used to achieve a particular end or purpose” (Oxford English Dictionary). This prototypical meaning corresponds to the cognitive Figure in the Gestalt or cognitive framework which directly functions as the instrument with respect to the entire causative situation. Linguistically speaking, the instrument function is prototypically realized by an instrumental preposition such as by means of, or by a with-phrase in agentive sentences like I cut the bread with a knife (Talmy 2003: 487). A closer look at our corpus of medical titles and abstracts quickly reveals that this prototypical realization of the instrument role does indeed occur, but that, in addition, a whole range of instrumental subtypes can be distinguished (such as instrument, device, means, cause, result, time and manner).
These subtypes are furthermore realized in different linguistic ways: as determiners in compounds, as adjectives, as deverbal nominalizations, and so on. Hence, in text types like abstracts, instrumentality should rather be defined as the convergence of several factors, such as the degree of involvement of the instrument in the action and the type of control the agent has over the instrument for the action (ranging from full to zero control). The descriptive analysis of our corpus is therefore based on an extended causal conception of instrumentality, which leads to a more accurate semantic typology of medical instrumentality, particularly in the associated linguistic surface structures in English, the lingua franca of medical research.

2.2 UML modeling objective

UML, or Unified Modeling Language, is a standard universal language for writing software blueprints. UML is used to visualize, specify, construct and document the artifacts of object-oriented software systems and the modeling of reality these systems require. Modeling is a basic engineering technique for producing a schematic representation of complex, full-scale real-world systems. UML uses this general conceptual stage in the engineering process not only to produce a visual representation of the conceptual analysis; UML models can also be used in so-called forward engineering, in which the diagrams are mapped directly onto specific object-oriented programming languages. UML is used in a wide array of applications, including medical electronics and scientific modeling. UML has a formal semantics (Cranefield & Purvis 1999). Its open standard has given rise to a growing user community. UML offers different diagram types which allow the functionality of a system to be visualized, for instance in clinical workflow analysis, and which display the dynamic relationships between actors, objects and the actions performed. It is therefore well suited to medical domain modeling (Toma et al. 2007).
Since instrumental subtyping is a recent and ongoing research trend, the fact that UML is an extensible language which can be used throughout the different stages of a consistent modeling process is an important methodological and theoretical advantage. UML could be considered a step towards the development of a standard upper ontology, and it facilitates data interoperability, information retrieval and natural language processing. These three elements are closely linked to the first objective of our research.

3. Method and corpus

3.1 Method

As a starting point, we use the exploratory analysis of instrumentality conducted on so-called classification rubrics (Wermuth 2007). Rubrics are short linguistic descriptions of numerical codes used in medical classifications for the representation of diagnoses and surgical procedures. In morphosyntactic terms, rubrics are reduced phrasal forms of (complex) sentences describing medical actions. In their simplest realization, rubrics consist of a nominalized verb or a neoclassical root which is complemented by a number of pre- and postmodifications. An example of such a rubric is Diagnostic procedures on external ear (ICD-9-CM). The nominal head may also be followed by a sequence of prepositional phrases, which leads to complex nominal structures like Revision of stapedectomy with incus replacement. Based on this typology, we distinguish at least 11 instrumental subtypes: means (artifact, body part, abstract), cause, manner, path, time, etc. For the purpose of this contribution we test the validity of this typology with respect to medical abstracts and titles. As these text types display more flexible and usage-based surface patterns than rubrics do, we elaborate a refined typology of instrumentality adapted to the textual features of the corpus.

3.2 Corpus

Medical written discourse typically requires abstraction and indexing.
As is well known, medical literature makes widespread use of the abstract in order to communicate complex research aptly. Among the many genres of medical texts (scientific papers, case reports, package inserts, patient brochures etc.), the abstracts and titles of research papers are important subgenres. A medical abstract is an essential part of larger text units such as biomedical papers and can be defined as the point of entry for the reading of medical articles. It gives a brief summary of the article by describing its main findings, thus providing the information the reader needs in order to ascertain the article's purpose quickly. Similarly, the title of a medical paper forms a subgenre of its own. Titles tell the reader what the article is about by further condensing the information given in the abstract into an elliptical description. Medical abstracts and titles are pivotal for automatic text retrieval, as the generation of data for medical databases such as MEDLINE (accessible through PubMed) is based on the titles and abstracts of biomedical papers. In practice, these parts are frequently the only parts of a biomedical paper that are read in order to select papers relevant to the researcher’s own research (cf. Reeves-Ellington 1998: 105-115) or to therapeutic decisions.

Our methodology furthermore takes a corpus-based approach: as required by cognitive corpus linguistics (Langacker 1999), we analyze authentic language data. We take these English data from specialized medical journals in the fields of cardiosurgery and microsurgery. The data are fed into a relational database which allows both the analytical subtyping of the data according to the type of instrumentality and a variation analysis according to the discipline.
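The database step described above can be sketched as follows. This is a minimal illustration only, assuming a single hypothetical table `expression` with columns `text`, `subtype` and `discipline`; the actual schema is not specified in this abstract, and the rows below are placeholders, not corpus data.

```python
import sqlite3


def build_corpus_db(rows):
    """Create an in-memory table of annotated expressions (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE expression ("
        " id INTEGER PRIMARY KEY,"
        " text TEXT NOT NULL,"        # expression taken from a title or abstract
        " subtype TEXT NOT NULL,"     # instrumental subtype assigned by the analyst
        " discipline TEXT NOT NULL)"  # e.g. cardiosurgery or microsurgery
    )
    conn.executemany(
        "INSERT INTO expression (text, subtype, discipline) VALUES (?, ?, ?)",
        rows,
    )
    return conn


def subtype_counts_by_discipline(conn):
    """Variation analysis: count expressions per (discipline, subtype) pair."""
    cur = conn.execute(
        "SELECT discipline, subtype, COUNT(*) FROM expression"
        " GROUP BY discipline, subtype"
        " ORDER BY discipline, subtype"
    )
    return cur.fetchall()


# Placeholder rows for illustration only; they are NOT taken from the corpus.
SAMPLE_ROWS = [
    ("placeholder expression 1", "means", "cardiosurgery"),
    ("placeholder expression 2", "cause", "cardiosurgery"),
    ("placeholder expression 3", "means", "microsurgery"),
    ("placeholder expression 4", "means", "microsurgery"),
]

conn = build_corpus_db(SAMPLE_ROWS)
counts = subtype_counts_by_discipline(conn)
```

Grouping on both columns at once gives the subtype distribution per discipline in a single query, which is the kind of comparison a variation analysis needs.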
Being usage-based, this method provides insights into the salience of the various instrumental subtypes, which eventually leads to an extended definition of (medical) instrumentality, particularly in multidisciplinary settings where medical, technical and IT teams meet in (the modeling of) clinical practice (Toma et al. 2007).

4. Results

We obtain three types of results, on three different levels of description, with a joint conceptual basis.

4.1 Causality as general conceptual template

Causality is the general template which serves as the conceptual background to the instrumental subtyping. On an abstract conceptual level, causality can be broken down into two (spatio)temporal settings (types) in which instrumental linguistic expressions (tokens) occur to perform the caused action. The conceptual level of temporal subdivision into two windows of attention, one for the causing and one for the caused action (Talmy 2003), then acts as the interface between a linguistic layer on the one hand and an ontological level on the other. The categories and relations of this ontological structure supply the predicates and elements which facilitate formalization in a logical format such as UML (Trautwein 2007: 404-407) or OODBMS.

4.2 Subtypes of instrumentality

A closer look at medical discourse shows that instrumentality is important as a semantic role. A major observation is that the prototypical meaning of instrumentality provided in traditional definitions is far from sufficient to cover the instrumental diversity attested in our corpus. In the medical context, instrumentality can first of all be defined as “an artifact, or a set of artifacts, that are instrumental (i.e. behave as instruments) in accomplishing some end (i.e. reaching some goal)” (cf. WordNet). A great deal of the medical data under investigation supports this definition, but there are also data which can be assumed to be instrumental in a non-artifactual way.
We illustrate this with some examples from the above-mentioned classification rubrics. In the rubric Microscopic examination of specimen from ear, the adjective microscopic refers to an artifact (a microscope) which is used in order to carry out the examination of the specimen. In this case, the instrumental meaning has indeed been narrowed to the function of an instrument. The lexicalization refers, in other words, to a material object, i.e. a microscope, which is instrumental in carrying out the action. By contrast, in the rubric Diagnostic procedures on external ear, the instrumental role is fulfilled by the cause of the procedure, realized as the adjective diagnostic (or, viewed from another perspective, by the purpose of the procedure). In other words, the procedure is carried out because the surgeon wants to confirm some assumed pathology or, put the other way round, the purpose of the procedure is to diagnose the patient with a specific assumed pathology. In the same way, examples of other subtypes of instrumentality, such as manner, path, time and metonymy, can be identified in rubrics, which underscores the necessity of a much more fine-grained definition of instrumentality, at least with respect to the medical data under investigation.

4.3 Temporal shifts

Temporal shifts between instrumental causes and their respective caused actions are necessarily coded on the linguistic level. We will illustrate this idea, which is fundamental for the new status of dynamic relations in terminology. The field is still very much dominated by static representations of concept systems, a criticism developed by Sambre (forthcoming).
In our view, this UML approach is compatible with other recent attempts to model dynamism in so-called eventities (Schalley 2007: 439-452) as the interrelation between change of state, participant structure (even if our abstracts do not explicitly display human participants, but rather inanimate tissue, medical substances, artifacts, tests and techniques) and the positions participants hold in a conceptual structure.

5. Towards a UML description

Object modeling languages are used for the static and dynamic description of complex systems. We reject the received idea that only static diagrams apply in concept modeling (ISO WI 24156). The fact that terminological concept structures as defined in ISO 1087-1 are mapped onto conceptual modeling in UML does not mean that these terminological concepts cannot contain aspects of dynamicity. We use the descriptive instrumental subtyping within causality as a test case for demonstrating this more abstract idea about concept modeling. For causality and its instrumental subpart(s) we offer a UML visual template. Dynamic aspects of systems are typically represented in UML diagrams other than those presented in ISO WI 24156. We will consider, for instance, activity diagrams, statechart diagrams, and sequence and collaboration diagrams, to name a few (for an overview, see Booch et al. 1998: chapters 15-19). Activity diagrams decompose activities into sets of (internal) atomic actions and computations very similar to the causal chains set up by instrumentality. Multidisciplinary medical modeling involves workflow modeling techniques which display subresponsibilities (for different medical departments using different instruments, like clinical tests or surgical techniques) within the overall workflow process. Sequencing implies iteration and/or succession of actions over time, whereas use cases specify not only which (external) actors (like patients and physicians) use instruments but also how they do so.
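The causal template with its instrumental subpart can be approximated in code, in the spirit of the forward engineering mentioned earlier. The sketch below is an assumption-laden illustration, not the UML template proposed in this paper: the class names, the `window` attribute (1 for the causing action, 2 for the caused action) and the label of the causing action are all hypothetical. Only the rubric Microscopic examination of specimen from ear and the listed subtypes come from the text.

```python
from dataclasses import dataclass
from enum import Enum


class InstrumentSubtype(Enum):
    """Subtypes named in the method section; the list there is open-ended."""
    MEANS_ARTIFACT = "means (artifact)"
    MEANS_BODY_PART = "means (body part)"
    MEANS_ABSTRACT = "means (abstract)"
    CAUSE = "cause"
    MANNER = "manner"
    PATH = "path"
    TIME = "time"


@dataclass(frozen=True)
class Instrument:
    expression: str              # linguistic surface form, e.g. an adjective
    subtype: InstrumentSubtype


@dataclass(frozen=True)
class Action:
    label: str
    window: int                  # hypothetical: 1 = causing, 2 = caused


@dataclass(frozen=True)
class CausalRelation:
    causing: Action
    caused: Action
    instrument: Instrument

    def __post_init__(self):
        # The temporal shift: the causing action must come first.
        if self.causing.window >= self.caused.window:
            raise ValueError("the causing action must precede the caused action")


# Rubric from the text: "Microscopic examination of specimen from ear".
microscopic = Instrument("microscopic", InstrumentSubtype.MEANS_ARTIFACT)
relation = CausalRelation(
    causing=Action("use of the microscope", window=1),  # illustrative label
    caused=Action("examination of specimen from ear", window=2),
    instrument=microscopic,
)
```

The ordering check in `__post_init__` is one simple way to make the dynamic, temporal aspect of the relation part of the model itself rather than leaving it implicit.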
These dynamic relations in UML have to be taken into account by terminological concept modeling in order to give formal weight to conceptual ontologies. UML is then the place where conceptual ontologies and formalist descriptions of authentic language data meet.

References

Booch, G., J. Rumbaugh & I. Jacobson. 1998. The Unified Modeling Language User Guide. Reading (Mass.): Addison Wesley.
Budin, G. 2007. From Terminologies to Ontologies. Advances in Knowledge Organization. Unpublished paper, Terminology Summer School Cologne.
Cranefield, S. & M. Purvis. 1999. UML as an Ontology Modelling Language. IJCAI-99 Workshop on Intelligent Information Integration.
Gries, St. & A. Stefanowitsch (eds.). 2006. Corpora in Cognitive Linguistics. Corpus-based Approaches to Syntax and Lexis. Berlin / New York: Mouton de Gruyter.
ICD-9-CM. 2006. International Classification of Diseases, 9th revision, Clinical Modification, Hospital Edition. Los Angeles: PMIC.
ISO. 2006. Guidelines for applying concept modelling in terminology work. ISO WI 24156, Draft 1 for CD ballot.
Langacker, R. W. 1999. Grammar and Conceptualization. Berlin / New York: Mouton de Gruyter (Cognitive Linguistics Research, 14).
Langacker, R. W. 2001. Discourse in Cognitive Grammar. Cognitive Linguistics 12(2): 143-188.
Reeves-Ellington, B. 1998. The biomedical paper: translation for publication. In Fischbach, H., Translation and Medicine, A(merican) T(ranslation) A(ssociation), volume X, 105-115. Amsterdam / Philadelphia: Benjamins.
Sambre, P. forthcoming. La futurità delle nanotecnologie: per una visione dinamica della definizione terminologica. Mediazioni, Rivista online di studi interdisciplinari su lingue e culture.
Sambre, P. & C. Wermuth. forthcoming. Instrumentality in cognitive concept modeling. In Steurs, F. & M. Theelen (eds.), Terminology in society [provisional title]. Amsterdam / Philadelphia: John Benjamins.
Schalley, A.C. 2007. Relating ontological knowledge and internal structure of eventity concepts. In Schalley, A.C. & D. Zaefferer (eds.), Ontolinguistics. How Ontological Status Shapes the Linguistic Coding of Concepts, 435-458. Berlin / New York: Mouton de Gruyter.
Talmy, L. 2000. Toward a Cognitive Semantics. Cambridge: MIT Press.
Toma, M. et al. 2007. UML based modeling of medical applications workflow in maxillofacial surgery. GMS CURAC 2007; 2(1): Doc 03.
Trautwein, M. 2007. On the ontological, conceptual, and grammatical foundations of verb classes. In Schalley, A.C. & D. Zaefferer (eds.), Ontolinguistics. How Ontological Status Shapes the Linguistic Coding of Concepts, 395-418. Berlin / New York: Mouton de Gruyter.
Wermuth, C. 2007. Instrumentality vs. pseudo-instrumentality in medical classification rubrics. Unpublished paper, New Directions in Cognitive Linguistics NDCL-2, Cardiff, August 2007.