
HEAP, David, page 10 of 20
Online Dynamic Iberian Linguistic Atlas (Attachment # 1: summary)
This collaborative project brings together a team of scholars from three continents to transform a unique legacy of linguistic data into a cutting-edge research tool that is poised to become a standard point of reference for researchers in Iberian dialectology and language cartography for generations to come. The University of Western Ontario is currently home to a resource which is like no other in the world: copies of the dialect survey notebooks from the Linguistic Atlas of the Iberian Peninsula (the Atlas Lingüístico de la Península Ibérica or ALPI). The original ALPI fieldwork transcriptions, made in the 1930s but largely ignored in Spain until quite recently, have been gathered here at UWO’s Theoretical and Applied Linguistic Laboratory, where we have the only complete collection of field data from this project (the holdings in Spain are divided between three different locations, not all of which are accessible to scholars). Since 2001 we have been electronically publishing the raw data from this long-awaited and unique resource for Iberian Romance dialectology, which provides researchers with an invaluable snapshot of how Spanish, Portuguese and Catalan were spoken in the Iberian Peninsula at a specific historical moment. This project has already attracted a great deal of scholarly interest from the international scientific community: our site www.alpi.ca receives some 150 unique visitors daily and has hundreds of registered users who access thousands of pages of data from dozens of different countries. In its current form, however, the ALPI database provides only raw data: scanned facsimiles of the original fieldworkers’ notebook pages, a format which only begins to tap the real potential of the ALPI data.
To go beyond the static hand-transcribed data of the original notebooks to online databases where linguistic forms can be searched electronically and used to create maps automatically according to individual researcher's interests and needs, we need to collaborate with international teams of scholars. The raw data can only be read rather painstakingly by specialists trained in the particularly detailed phonetic alphabet used in the ALPI fieldwork. In order to make the data more widely useable, the linguistic forms have to be retranscribed digitally and coded into databases which can then be accessed by a broad range of researchers, students and eventually the general public. This massive data transfer (over 36 000 pages of hand-transcribed field notes) cannot be done automatically but requires coordinated long-term efforts by research teams. At UWO we are developing the tools for web-based retranscription of the ALPI phonetic alphabet as well as discrete coding tools in order to tag grammatical and lexical variants. To test and refine these different web-based tools, we need to work closely with specialists in the different Iberian dialects represented in the ALPI data. The highly qualified personnel needed for this work are not to be found in sufficient numbers in any one centre: instead, the required linguistic expertise is spread through a number of different universities. As scholars in Spain and elsewhere realize the potential goldmine of linguistic data which the ALPI represents, existing research teams in dialectology can make available their specialized knowledge to help transfer the original data for their respective regions into digital format. As part of a network of scholars working on this project their own ongoing research will in turn benefit from improved access and readability of the ALPI data. 
The Principal Investigator at UWO (Heap) will facilitate and coordinate the contributions of two research teams in Spain (in Barcelona and Madrid) which are already leaders in the electronic publication of language variation data. The team will also build towards future collaborations with other regions in the Iberian Peninsula where there is growing interest in using ALPI data to publish regional linguistic atlases. Our co-applicant at the Université de Montréal, Enrique Pato, also brings his experience in creating ALPI databases and in online cartography. Since we already have the required expertise in data-coding and relational database infrastructure, these two researchers at Canadian institutions are well-positioned to take a leadership role in developing the network of researchers and the web-based tools needed for data entry and data management in the next stages of the ALPI scholarship. The Canadian team also forms a link between Spanish colleagues working with us on the ALPI databases and the expertise in automatic cartography from the VARILEX project based at Sophia University (Tokyo), which will be instrumental in creating dynamic linguistic maps on the internet from our data.
2. Online Dynamic Iberian Linguistic Atlas (Attachment # 2: detailed description)
UWO is already home to an unparalleled data resource for scholarship in Hispanic language variation which provides a unique opportunity for international collaboration to create a new interactive linguistic atlas online.
Background: from dialect survey to internet database
Linguistic atlases are a standard research tool for studying language variation, and most European languages have at least one which covers their entire national territory. Spain and Portugal are, however, exceptions in that no complete and uniform survey of data from all the Iberian Romance languages and dialects has ever been published.
The Linguistic Atlas of the Iberian Peninsula (Atlas Lingüístico de la Península Ibérica or ALPI) was proposed by philologist Ramón Menéndez Pidal in the early twentieth century and the dialect surveys were carried out under the direction of phonetician Tomás Navarro Tomás beginning in the 1930s. Just when most of the surveys were finished, the Spanish Civil War cut the project short and the ALPI materials, along with Navarro Tomás, went into exile in the U.S. In the 1950s the materials returned to Madrid and the surveys were completed, but only a single volume was ever published (ALPI, I. Madrid: CSIC, 1962), out of what would have been at least ten volumes if the publication had been completed. Most of this invaluable data remained unpublished and neglected for decades until uncovered by a Canadian scholar in 2001 (cf. Heap 2002). The ALPI data cover a network of 527 survey points from across the Iberian Peninsula with two field notebooks at each point (Notebook I Phonetics and Grammar, Notebook II Vocabulary). This unique collection of transcribed linguistic forms, with precious data on how Spanish, Catalan and Portuguese dialects were spoken at a specific historical moment, has no parallel in the world of Iberian dialectology; the ALPI remains the only complete language survey of this linguistic area. The most complete collection of these materials in existence today consists of the copied notebooks held at the University of Western Ontario’s Theoretical and Applied Linguistic Laboratory (TALL). While the original field notebooks in Spain are housed in three different archival locations, and not all of them are accessible to researchers, the ALPI data collection at Western began appearing as an Internet publication in 2002, in the form of scanned notebook images. Funded by SSHRC since 2003, the site www.alpi.ca has already made more than 70% of the original ALPI data available to the international research community. 
The site now averages more than 150 visitors per day, up from just 87 visitors daily during the preceding 12 months; the site’s total bandwidth for last year was over 20 GB, more than four times the figure for the preceding year. More significantly, this internet traffic comes from more than 30 different countries: mostly Canada, the U.S. and Spain, but also a variety of other Spanish-speaking countries, many different parts of Europe and east Asia, in particular Japan, evidence of the increasing international scholarly interest in the ALPI data. The online ALPI database has several hundred registered users, who continue to download thousands of pages of data for their research. Thanks to the internet, our relational databases and web interface will soon make all of the original survey data (some 36 000 pages in all) freely available, thus avoiding the practical limitations (i.e. production costs) that prevented the print publication of the ALPI from being completed decades ago. Despite this success, there are still serious limitations to how these data can be accessed and analysed in their current form (i.e. the current database can neither be searched nor mapped automatically), limitations which cannot be overcome by a single research team working at one institution.
Concrete objectives
This application seeks support for the initial phase of the Online Dynamic Iberian Linguistic Atlas project (henceforth ODILA), which will improve and expand the online delivery of ALPI data by building an international team of collaborators working in electronic dialectology. We will provide the scholarly community with a representative portion of the ALPI data from two language areas (Catalan and Castilian) in digitized format (a substantial achievement in itself), as well as establishing solid foundations for future work on the ALPI data, by setting up tools, protocols and guidelines for the coordination of research teams at different centres.
The web tools and common resources which we perfect and implement will then serve as ‘proof of concept’ for demonstrating the possibilities of dialectology research on the internet, and recruiting further research teams to participate in future ODILA collaborations. The specific and feasible outcomes of this project include:
• a representative sample (from at least 50 data points or about 10% of the whole ALPI survey) coded for variables selected from different sections of the ALPI questionnaire (phonetics, grammar, vocabulary);
• development and testing of web-tools and working protocols corresponding to each of these sections, including discrete coding tools for grammatical and lexical variables, as well as phonetic transcription routines using Unicode character sets (where appropriate, orthographic i.e. spelling transcriptions will also be used for lexical variables);
• recruitment of at least two more regional collaborators in other regions of the Iberian Peninsula, with resources and expertise to continue work on the ODILA in their respective areas.
In this way, a large-scale scientific undertaking can be reduced to a series of manageable subprojects, each of which can be addressed and completed individually in an efficient and focussed manner.
Proposed research: Interactive databases for online cartography
While the internet publication of the raw ALPI field materials has met a clear demand on the part of researchers (witness the number of visits to the www.alpi.ca site as well as the number of pages accessed and downloaded), in their current form the scanned original notebook pages still cannot exploit the full potential of these invaluable data. The notebook images can be accessed using a list-driven or map-driven search, but once located each page must be read individually to find the linguistic forms relevant for a given area of research.
Furthermore, these pages of raw data can only be read rather painstakingly by specialists trained in the particularly detailed phonetic alphabet used in the original fieldwork transcriptions; the ALPI’s original director (Navarro Tomás) chose not to use the International Phonetic Alphabet but rather his own adaptation of a traditional dialectologists’ phonetic alphabet favoured in Spain, to which he added a large number of extra symbols (diacritics) to represent fine phonetic nuances. The result is that the phonetic transcriptions are so detailed that the data are difficult for some scholars and many students of linguistics to access (we have already determined experimentally that optical character recognition software is not a viable option for accessing these data, given the type of phonetic transcription used, the different fieldworkers’ individual handwriting styles and the number of additional diacritic symbols employed in the notebook transcriptions). In order to make the ALPI data more widely accessible, the linguistic forms have to be retranscribed digitally and coded into databases in a form which can be accessed and used by a broad range of researchers, students and eventually the general public. This ambitious project involves the transfer of information from handwritten phonetic transcriptions in the original ALPI notebooks to specially-designed relational databases accessible online. Such a massive data transfer (over 36 000 pages of hand-transcribed field notes) cannot be accomplished at any one institution but rather requires coordinated long-term efforts by coordinated teams of researchers. International collaboration is the only conceivable way a project of this scale can be conducted: the size of the task necessarily entails a multicentre approach which brings together the specific regional expertise of different teams in a coordinated web-based network of researchers. 
At the UWO, we have been developing web-based tools for retranscribing the ALPI phonetic alphabet using Unicode character sets (the modern standard for internet usage) to replace the hand-transcribed phonetic notation system. By simplifying the Navarro alphabet to a semi-phonemic inventory of symbols, we will create a transcription system which is automatically translatable from the traditional dialectologists’ phonetic alphabet to the International Phonetic Alphabet, thus increasing the range of users who can access the ALPI data. In addition, some forms (for example, lexical items) will also be transcribed orthographically (i.e. in everyday spelling) where possible, making this part of the ALPI data accessible not only to specialists but also to students and to the general public. The discrete tagging and coding tools we are developing in order to code grammatical and lexical variants in our database also produce data in a format which can be accessed without knowledge of phonetic transcription systems. Crucially, none of the original transcriptions will be lost: all of the scanned images with the detailed phonetic notations will remain online for researchers to access should they wish to consult the original data directly. Since the searchable database with simplified Unicode transcriptions will be linked to the original scanned images, the online digital atlas will facilitate locating specific pages of the original ALPI notebook data, for those who want to see the full details at a given survey point. The specific online tools we envisage involve web-forms that display a given scanned page of data (i.e. a certain part of the ALPI questionnaire for a particular survey point) and present each data-enterer with various options: providing a phonetic transcription using the new Unicode character-set, providing an orthographic transcription where appropriate, and choosing between different coding options for grammatical or lexical items.
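The automatic translation from a simplified dialectological notation to the International Phonetic Alphabet described above can be illustrated with a minimal sketch. It is written in Python purely for illustration (the project's own tools are SQL- and PHP-based), and the handful of symbol correspondences in `TO_IPA` are placeholder assumptions, not the project's actual inventory, which this document does not specify. The function name `to_ipa` is likewise hypothetical.

```python
import unicodedata

# Placeholder correspondences (assumptions for illustration only): each key is
# a single Unicode code point in the simplified notation, each value its IPA
# equivalent. The real table would be established by the project team.
TO_IPA = {
    "\u0180": "\u03b2",   # b with stroke  -> beta (bilabial fricative)
    "\u0111": "\u00f0",   # d with stroke  -> eth (dental fricative)
    "\u01e5": "\u0263",   # g with stroke  -> gamma (velar fricative)
    "\u0109": "t\u0283",  # c circumflex   -> t + esh (palatal affricate)
}


def to_ipa(form: str) -> str:
    """Translate a transcribed form into IPA, one code point at a time.

    NFC normalization is applied first, so that a base letter followed by a
    combining diacritic is treated the same as its precomposed equivalent.
    Symbols without an entry in TO_IPA are passed through unchanged.
    This sketch does not handle symbols that need more than one code point.
    """
    normalized = unicodedata.normalize("NFC", form)
    return "".join(TO_IPA.get(ch, ch) for ch in normalized)
```

Because the mapping is a plain lookup table, coordinators could extend or correct it without touching the conversion routine, and the original scanned page remains the authority for any form in doubt.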
Depending on each team’s focus at a given time, the data-enterer may choose one or all of these options, while also having the choice of providing a meta-comment or other flag for their team coordinator: all of these entries populate different but linked tables within our relational database. In addition, each data entry must be time- and date-stamped for a particular logged-on individual, allowing us to track trends and maintain data integrity across and between group members. Team coordinators (initially Heap and Pato, although as the other research groups become more involved our co-investigators will take on this role as well) are responsible for setting access permissions for each data-enterer, i.e. which data-points they can ‘see’ and edit, checking individual data-entries, and establishing the discrete choices between grammatical and lexical variants. Since we cannot know in advance all of the variants which may come up, the team coordinators must also be able to create and modify the table entries corresponding to the linguistic variants the data-enterers choose. The whole process is one which requires thorough checking and testing at every stage, as the type of raw data found in the ALPI notebooks will always present us with surprises and new empirical challenges which have to be dealt with in principled and consistent ways in order to create data-tables which can then be searched and used by researchers. This development process, already begun by a team at Western as part of the post-doctoral research conducted by Pato under Heap’s supervision, will continue to its natural fruition under the research collaboration proposed here. Although this Canadian team is in the forefront of developing the internet tools and procedures for this project, we have reached a stage where international collaboration is essential for our work to progress further in a productive manner.
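The data-entry arrangement described above (linked tables, per-enterer access to survey points, coordinator-defined variants, and time- and date-stamped entries) can be sketched as follows. This is an illustration under stated assumptions, not the project's schema: all table names, column names and the `record_entry` helper are hypothetical, and Python's built-in `sqlite3` module stands in for the SQL and PHP infrastructure the project actually uses.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical schema: every name below is an assumption made for illustration.
SCHEMA = """
CREATE TABLE enterer (id INTEGER PRIMARY KEY, login TEXT UNIQUE NOT NULL);
CREATE TABLE permission (             -- which survey points an enterer may edit
    enterer_id   INTEGER NOT NULL REFERENCES enterer(id),
    survey_point INTEGER NOT NULL,
    PRIMARY KEY (enterer_id, survey_point)
);
CREATE TABLE variant (                -- discrete choices set up by coordinators
    id       INTEGER PRIMARY KEY,
    question TEXT NOT NULL,
    label    TEXT NOT NULL,
    UNIQUE (question, label)
);
CREATE TABLE entry (                  -- one coded form, stamped and attributed
    id           INTEGER PRIMARY KEY,
    enterer_id   INTEGER NOT NULL REFERENCES enterer(id),
    survey_point INTEGER NOT NULL,
    variant_id   INTEGER NOT NULL REFERENCES variant(id),
    comment      TEXT,
    entered_at   TEXT NOT NULL
);
"""


def open_db() -> sqlite3.Connection:
    """Create an in-memory database with foreign-key checks switched on."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def record_entry(conn, enterer_id, survey_point, variant_id, comment=None):
    """Store one coded form if the enterer may edit this survey point."""
    allowed = conn.execute(
        "SELECT 1 FROM permission WHERE enterer_id = ? AND survey_point = ?",
        (enterer_id, survey_point),
    ).fetchone()
    if allowed is None:
        raise PermissionError("enterer has no access to this survey point")
    stamp = datetime.now(timezone.utc).isoformat()  # time- and date-stamp
    cur = conn.execute(
        "INSERT INTO entry (enterer_id, survey_point, variant_id, comment, entered_at) "
        "VALUES (?, ?, ?, ?, ?)",
        (enterer_id, survey_point, variant_id, comment, stamp),
    )
    conn.commit()
    return cur.lastrowid
```

Keeping variants in their own table is what would let coordinators add or rename a choice once and have every linked entry follow; the stamp and enterer reference on each row are what make later auditing of systematic divergences possible.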
In order to test and perfect these different web-based tools, we need to work closely with researchers experienced in the different Iberian dialect areas represented in the ALPI data. The highly qualified personnel needed for this work are not to be found in sufficient numbers in any one centre: instead, the relevant linguistic expertise is spread through a number of different universities. As scholars in Spain and elsewhere start to realize the potential goldmine of linguistic data which the ALPI represents, existing research teams in dialectology can contribute their specialized knowledge to help transfer the original data for their respective regions into digital formats, as part of a network of scholars whose ongoing research can benefit from improved access and readability of the ALPI data. The overall architecture of our relational databases allows us to manage and coordinate the different yet linked data-sets from each of the regional projects and, crucially, to maintain data-integrity across multiple research sites through our server at the UWO. Our initial aims are moderate and achievable: the period covered by the support sought in this application envisages working with a representative subset of data from just two of the major language regions of the Iberian Peninsula, Castilian and Catalan. The role of the PI at UWO (Heap) is to facilitate and coordinate the contributions of two research teams in Spain which are already leaders in the electronic publication of language variation data (Perea in Barcelona and Fernández-Ordóñez in Madrid) while building towards future collaborations with other regions in the Iberian Peninsula where there is now a growing interest in using ALPI data to publish regional linguistic atlas projects. Successfully coordinated work in these two areas will allow us to work towards further collaboration with teams of scholars in other regions (Andalusia, Asturias, Galicia and Portugal, for example).
Since there have already been overtures from scholars in each of these regions who are interested in working with the ALPI data, the support of SSHRC’s International Opportunities Fund at this point will allow us to move quickly to establish collaborations in regions covering more of the Iberian Peninsula in the near future, and to obtain the necessary resource commitments from their respective regional governments. Some of the early reviews of and studies based on the one published volume of the ALPI (1962), while positive about the importance and usefulness of the data (Catalán 1964: 307, 1975: 97), stress that the same data could just as conveniently be presented as lists of forms without mapping each variable, as was done for example with the Survey of English Dialects (Orton et al. 1962-1971). While those observations predate the use of databases in dialect research, they clearly foreshadow current and future trends in this field: the future of dialectology lies not in print-based paper atlases but rather in open searchable databases where individual forms and classes of forms can be searched according to researchers’ interests and needs (Kretzschmar 1999), without reading them from predetermined maps. Of course, the traditional view of a linguistic atlas necessarily involves language data presented on maps: as useful as a searchable database might be, our project would be incomplete if we did not also provide a means to display the ALPI data in a way that shows spatial relations between different linguistic forms and (physical or human) geographic features. The beauty of a linguistic atlas in database format is that the results of a search can be either displayed as custom maps or analysed statistically and presented in tables and other formats.
In other words, rather than being stored as static maps in hard-to-consult (and costly to publish) printed volumes, linguistic data retrieved from such a database can be used to generate dynamic maps “on the fly” (Ruiz Tinoco 2002:6), or exported in tabular format, or both. Within the field of Spanish language variation, there already exists a successful prototype of this approach: the VARILEX project in Tokyo, which has studied lexical variation from more than 40 cities across the Spanish-speaking world since 1993. Since 1999, the VARILEX data have been published online (see http://gamp.c.u-tokyo.ac.jp/~ueda/varilex/index.html), and these results can now also be mapped on demand (via http://lingua.cc.sophia.ac.jp/varilex/php-atlas/lista3.php). The proposed ODILA research team includes Ruiz Tinoco from Sophia University, who brings to this project his expertise in the design and implementation of the PHP-driven web interfaces used by the VARILEX project. Coordinated collaboration via internet databases will not only facilitate researchers from different institutions working together, it also has the advantage of allowing cumulative and progressive contributions to the ongoing project. Data-entry in any format necessarily entails some degree of human error, but working with databases allows us to make corrections more easily: instead of having to wait for each printed volume before discovering the (inevitable) errata, as we would if we were producing a traditional ‘hardcopy’ linguistic atlas, our relational databases can be updated as errors are detected, and data-management protocols (authenticated user log-ins with specific localized authorizations, all entries time- and date-stamped) allow us to track and correct systematic errors or even divergences in data-entry choices, if and when they should occur. As a result, the data which end-users can access via the internet will potentially be more coherent and reliable than any single printing of an atlas could ever hope to be.
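The point made above, that a single search result can feed either a custom map or a table, can be shown with a short sketch. It assumes (hypothetically) that a database search returns one record per survey point with its coordinates and the variant attested there; the field names and the helper functions `to_table` and `to_geojson` are illustrative inventions. GeoJSON is used as one widely supported interchange format that web mapping tools can read; the proposal itself does not specify a map format, and Python is used here only for illustration.

```python
import json
from collections import Counter


def to_table(rows):
    """Tabular view: how many survey points attest each variant."""
    counts = Counter(row["variant"] for row in rows)
    return sorted(counts.items())


def to_geojson(rows):
    """Map view: the same rows as a GeoJSON FeatureCollection.

    GeoJSON positions are ordered longitude first, then latitude.
    """
    features = [
        {
            "type": "Feature",
            "geometry": {
                "type": "Point",
                "coordinates": [row["lon"], row["lat"]],
            },
            "properties": {"point": row["point"], "variant": row["variant"]},
        }
        for row in rows
    ]
    return {"type": "FeatureCollection", "features": features}


if __name__ == "__main__":
    # Invented sample records; coordinates and variants are not ALPI data.
    sample = [
        {"point": 1, "lat": 42.0, "lon": -8.0, "variant": "A"},
        {"point": 2, "lat": 41.5, "lon": 2.0, "variant": "B"},
        {"point": 3, "lat": 40.4, "lon": -3.7, "variant": "A"},
    ]
    print(to_table(sample))
    print(json.dumps(to_geojson(sample), indent=2))
```

Since both views are derived from the same rows at request time, a correction made in the database would appear in the next table and the next map alike, which is the practical advantage over a printed volume.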
Methodology and timelines
An important goal of the ODILA is to create methods which allow researchers to work as much as possible in a decentralized manner, with investigators each leading their own team at their home institution, and the regional teams working together via a centralized database on the ALPI server (www.alpi.ca) at the University of Western Ontario. The PI (Heap) will coordinate the different team efforts, assign the raw materials to each team and manage the databases: fortunately, he can draw on the infrastructure resources and technical support of the CFI-funded Theoretical and Applied Linguistics Lab at UWO for this project. The database design and webforms we are developing will continue to be based on open-source software (SQL for the databases and PHP for the interactive webforms): this decision not only helps in keeping costs under control by avoiding expensive updates, it ensures that the tools we develop can be freely adapted by other researchers. While most of the testing of prototype transcription and coding tools will naturally occur online, to ensure an efficient and productive start to this collaborative work, initial stages will require more direct personal contact than is possible via the internet. The most reliable way to ensure regular and accurate feedback in the development and refinement of the common toolset and procedural protocols (including web interfaces for data entry) is through in-person meetings and intensive work sessions with co-investigators in each of the target regions. Heap will be on sabbatical leave from the UWO, devoting himself full-time to research, from January to June 2007: he will be based at the Universitat de Barcelona where he will work closely with Perea’s research team.
He will also consult regularly with Fernández-Ordóñez’ research team in Madrid to ensure that the phonetic transcription and grammatical coding tools developed in Barcelona for Catalan-speaking areas are equally applicable in Castilian-speaking areas. Heap’s expertise in the electronic publishing of dialect data and his intimate knowledge of the ALPI data will be complemented by Perea’s experience in computerised dialectology and dialectometry (Perea & Viaplana 2003; Perea 2004) and by Fernández-Ordóñez’ track-record in internet delivery of language variation data (Fernández-Ordóñez 2004; Fernández-Ordóñez & Pato 2005). Heap will also maintain regular contact with Pato, whose reduced course-load (see accompanying letter from the Université de Montréal) will allow him to devote substantial time to testing the transcription and coding tools as well as involving students at the Université de Montréal in data-entry procedures. As enough verified data become available in our database, we will be able to implement PHP webforms for mapping, based on Ruiz Tinoco’s VARILEX model. The necessary web-formatted maps, with tabular coordinates for all the ALPI points, have already been prepared for this stage of the project. The methods and protocols developed in these consultations will then serve as demonstration models in order to recruit future ODILA collaborators. By visiting similar teams in different regions of the Iberian Peninsula, Heap will be able to show the advantages of working on such a coordinated project and recruit other researchers to collaborate with further stages of the ODILA: having established methodologies and preliminary results, we will be positioned to attract further institutional and regional government support for each team.
Canadian team members (Heap from UWO and Pato from Université de Montréal) will meet in person at the beginning of the project (November-December 2006) to discuss the implementation and refinement of work protocols, and will meet again following Heap’s return from Spain (July 2007) as well as remaining in regular e-mail contact during the intervening period. Heap and Perea will attend an international symposium on linguistic cartography (at the invitation of the National Institute of the Japanese Language in Tokyo) in August 2007, which will be preceded by consultations with Pato and Ruiz Tinoco on the PHP webforms for mapping from the database. By the fall of 2007 working versions of both the preliminary databases and the dynamic map-generation interface will be available online to the scholarly community. As PI, Heap will not only coordinate the work of the different team members, but also check the uniformity of the phonetic transcription and coding criteria and maintain consistent use of the web-forms for entering the data.
OUTCOMES
Once completed, this first stage of ODILA will give the international scientific community access to a substantial sub-sample of the ALPI data, a unique resource for Spanish and Catalan dialectology. Unlike the static maps of the traditional printed linguistic atlas, our dynamic internet atlas will facilitate the exploration of hypotheses anticipated neither by the original fieldworkers nor by those who are creating the database. The data will be delivered to the scientific community in less time than by a paper publication, and will be easier to edit and to rearrange from different points of view for different purposes. The ODILA will thus represent not only a new way to disseminate research results, but also a new way to collaborate as a team in linguistic research:
The key feature of the Web site is that it is an interactive resource.
It is abundantly cross-linked in addition to allowing the user to ask several different kinds of questions of the database. When we have more data, it will be possible to ask questions across several different projects at once.... The Web is the research tool of the future, and we have it now. (Kretzschmar 1999: 283)
Such a vision, of course, requires a data structure built on relational databases in which the information is distributed in different interconnected tables, with user-authentication and data-checks between tables. The dissemination of research results over the Internet has another characteristic that distinguishes it from traditional printed linguistic atlases: its easy access, not only for the scientific community but also for the public in general: “we need to accept as central to our purpose the goal of informing the public, not just the scholarly community, about the facts of language variation, especially as that information can affect education and public policy” (Kretzschmar 1999: 283). The lay public’s natural interest in spoken language, be it regional speech or the speech of a particular town, has rarely if ever been satisfied by academic dialectologists, as our work is too often difficult for non-specialists to access. With a dynamic linguistic atlas online we will take an important step towards the preservation and dissemination of real data from Iberian Romance varieties, in this way giving to all interested people an overall view of this fundamental element of their cultural heritage: their ways of speaking.
Results of this research will of course be presented at appropriate scholarly venues, including conferences such as New Ways of Analysing Variation (annual), Linguistic Symposium on Romance Languages (annual), the International Congress on Methods in Dialectology (Leeds, 2008) and the International Society for Dialectology and Geolinguistics, as well as in journal articles in, for example, Language Variation and Change, Revista de filología española or the Revue de linguistique romane. The ALPI data which will be made available by the ODILA will shed crucial new light on some recurring issues in Spanish and Catalan linguistics currently of interest to researchers, including variable pronoun usage, the distribution of different verb forms, and of course patterns of lexical and phonetic variants. In addition, the ODILA will provide a novel teaching resource: future students and teachers of Hispanic linguistics or of language variation will be able to access this goldmine of data via the internet, and extract either tabular data or custom maps which can be easily embedded in their research projects or in their classroom materials. In terms of timing, this proposal follows and builds on a series of international scholarly meetings involving the proposed team members: at the International Conference on Methods in Dialectology XII (Moncton, August 2005), we established our presence in the field of dialectology with a Workshop on New Methods in Iberian Dialectology. This was followed up by a Workshop on the Automatic Processing of Variation in Iberian Languages (Tokyo, July 2006). During the period of this award, the International Conference of Historical Linguistics (August 2007 in Montréal) occurs at a point in this project which will allow for Canadian and overseas team members to meet and discuss results with scholars keenly interested in the data we will be making available.
As other regional research groups in Spain (for example in Santiago de Compostela, Oviedo and Sevilla) move towards regional projects based on the ALPI data, we can establish Canadian researchers as leaders in the field and place our team advantageously with respect to these other emerging initiatives.

REFERENCES
HEAP, David, page 17 of 20 (Attachment # 3)

ALPI 1962. Atlas Lingüístico de la Península Ibérica. Madrid: Consejo Superior de Investigaciones Científicas.
Catalán, D. 1964. El ALPI y la estructuración dialectal de los dominios lingüísticos de la Ibero-romania. Archiv für das Studium der neueren Sprachen und Literaturen 201, 307-311.
Catalán, D. 1975. De Nájera a Salobreña, notas lingüísticas e históricas sobre un reino en estado latente. Studia Hispanica in Honorem R. Lapesa, III, Madrid, Seminario Menéndez Pidal y Gredos, 97-121.
Fernández-Ordóñez, I. & Pato, E. 2005. L’espagnol rural de la Péninsule Ibérique étudié dans une perspective grammaticale: le nouvel apport du Corpus Oral et Sonore de l’Espagnol rural (COSER). International Conference on New Methods in Dialectology XII. Moncton, New Brunswick, Canada.
Fernández-Ordóñez, I. 2004. Nuevas perspectivas en el estudio de la variación dialectal del español: El Corpus Oral y Sonoro del Español Rural (COSER). XXIV Congrès International de Linguistique et de Philologie Romanes. Aberystwyth, Wales.
Heap, David. 2002. “Segunda noticia histórica del ALPI a los cuarenta años de la publicación de su primer tomo.” Revista de filología española LXXXII:1-2, 5-19.
Kretzschmar, William. 1999. “The Future of Dialectology.” In Katie Wells & Clive Upton, eds. Proceedings of the Harold Orton Centennial Conference. Leeds Studies in English XXX, 271-287.
Orton, H. et al. 1962-1971. Survey of English Dialects. Leeds: Published for the University of Leeds by E. J. Arnold. 4 volumes.
Pato, E. 2004. La sustitución de cantara-cantase por cantaría y cantaba (en el castellano septentrional peninsular). Madrid: Universidad Autónoma de Madrid.
Perea, M.P. 2004. Mapes electrònics i mapes sonors. Dialectologia i recursos informàtics. Barcelona: Promociones y Publicaciones Universitarias. 135-152.
Perea, M.P., & J. Viaplana. 2003. Textos orals dialectals del català sincronitzats. Una selecció. In PPU Universitat de Barcelona. 1-167.

DESCRIPTION OF TEAM
HEAP, David, page 18 of 20 (Attachment # 4: description of team)

All of the researchers in this international team (Heap at Western, Fernández-Ordóñez in Madrid, Pato in Montréal, Perea in Barcelona, Ruiz Tinoco in Tokyo) have complementary expertise in the electronic publishing of linguistic data in the field of Iberian Romance language variation. All have also participated successfully in collaborative research teams and have international experience.

Principal Investigator: David Heap is the researcher who uncovered the ALPI notebooks in different archives in Spain after decades of neglect and has been publishing them online (www.alpi.ca). He is former director of the Theoretical and Applied Linguistics Lab at UWO, and has experience in managing project grants and teams of research assistants. Apart from his work in linguistic geography, his research deals primarily with formal approaches to morphosyntactic variation in Romance pronominal paradigms. He brings to this project an intimate knowledge of the ALPI materials and of the issues involved in digitizing these data, as well as an overall vision of the ODILA project’s long-term direction.

Co-Investigators: Inés Fernández-Ordóñez is a medievalist as well as a dialectologist, and has experience with publishing texts in both areas, as well as other scholarly works.
For more than a decade she has been conducting sociolinguistic interviews of contemporary rural Spanish, a selection of which are now available online (Corpus Oral y Sonoro del Español Rural, COSER http://www.uam.es/coser), and she is among the first scholars to use ALPI data to shed light on contemporary issues in dialect variation. Her research on Spanish dialects looks at non-standard pronoun systems, from both a synchronic and a diachronic perspective.

Enrique Pato’s doctoral thesis (Pato 2004) was the first modern work to exploit part of the potential of the ALPI data for linguistic research. He has worked on different research teams in Spain and Guatemala, and most recently as a postdoctoral fellow under Heap’s supervision at UWO’s Theoretical and Applied Linguistics Lab, where he took a leadership role in organizing a special session on Iberian Dialectology at the XII International Conference on Methods in Dialectology (Moncton 2005). His published research on historical and contemporary dialectology deals with non-standard verb forms, conditional structures, adverbs and other aspects of morphosyntactic variation across Spanish dialects.

Maria Pilar Perea has studied variation in Catalan dialects both from contemporary fieldwork data and from historical data, using philologists’ fieldwork notes from the 19th and early 20th centuries to create a database of linguistic forms that can be mapped and displayed automatically. Her experience with large-scale data entry from dialect survey materials as well as electronic linguistic cartography will be invaluable to this project. She has published widely in the area of morphosyntactic variation in Catalan, and participates in a research team which conducts surveys of modern Catalan dialects.

Antonio Ruiz Tinoco is general coordinator of the international Varilex project, which gathers lexical data from more than 50 cities in over 20 countries across the Spanish-speaking world, and publishes these data online.
Among other aspects of this project, he is responsible for the database structure and management (using open-source MySQL software) and the user interface, which uses PHP scripts to produce automatic maps (http://lingua.cc.sophia.ac.jp/varilex/php-atlas/lista3.php). His expertise with database management and automatic web-based linguistic cartography will be a crucial contribution to this project. In addition to studying lexical and morphosyntactic variation in Spanish, his research deals with applications of computing to linguistics.

ROLE OF STUDENTS
HEAP, David, page 19 of 20 (Attachment # 5: Role of Students)

Students form an integral part of the collaborative research team proposed here, and their involvement is crucial to the overall success of the ODILA project. They will be closely involved in many aspects of the research activities, including:
- planning and implementing the relational database structure underlying the storage and delivery of ALPI data via the internet;
- developing and refining the phonetic transcription and morphological tagging systems used to encode the ALPI data;
- entering data from the ALPI notebooks using the web-tools and interface protocols;
- helping develop and test the PHP scripts to create automatic maps ‘on the fly’ from the ALPI’s SQL relational databases;
- analyzing and developing presentations based on data from the ODILA project.

The continuous feedback which research assistants provide to the rest of the team regarding their experiences (difficulties and successes) with the web-tools and data-entry protocols will be particularly important as we test the system with a view to recruiting more research teams from different regions.
By interacting in this way not only with the Canadian team members but also with our international co-investigators, students will gain valuable exposure to top scholars in the field from around the world, as well as useful experience in linguistic research which will be applicable to their own research interests. Since the PI for this project (Heap) will be on sabbatical leave in Spain for a substantial part of the period covered by this award, he will depend crucially on graduate student Research Assistants to ensure that everything is running smoothly on the server at UWO, and to maintain contact with technical support services at this institution should the need arise. It is anticipated that two or more students will develop scholarly papers based on their work with the ODILA project (as has happened in the past with students working on ALPI data), leading to presentations at refereed international conferences, either alone or as co-authors with the Canadian researchers (Heap and Pato). The International Conference of Historical Linguistics (August 2007 in Montréal) is one particularly promising venue for such presentations, given its geographical proximity and the number of high-calibre scholars from different countries who will be in attendance.

STUDENT APPLICANT POOL

University of Western Ontario: since 2003, the ALPI project has employed more than 15 graduate students as Research Assistants, and some forty undergraduates through UWO’s Work-Study Program, making it a major employer of student assistants in our discipline at this institution. UWO’s offerings in linguistics continue to grow: French has both MA and PhD students specializing in linguistics, the Spanish program (recently expanded, adding a PhD to the existing MA) attracts an increasing number of students specifically interested in linguistics, and there is a proposal to introduce a ‘standalone’ two-year MA program in linguistics in the near future.
With this increasing pool of qualified students available for assistantships, we anticipate hiring one Ph.D. student and one M.A. student specializing in linguistics, at the standard SSHRC stipend amounts.

Université de Montréal: co-investigator Pato will be just beginning to develop student interest and expertise in the area of Spanish linguistics at this institution. As he is in the first year of his position there, it is anticipated that he will be able to hire at most a small number of (undergraduate or graduate) students on an hourly basis, in order to begin developing a group of potential Research Assistants.

BUDGET JUSTIFICATION
HEAP, David, page 20 of 20 (Attachment # 5: Budget Justification)

Personnel: Graduate Students: our budget allows for amounts equivalent to one Ph.D. stipend ($15 000) and one M.A. stipend ($12 000), because we hope to recruit graduate students through UWO’s growing linguistics program to work alongside us on this project in a collaborative fashion. Undergraduate students are to be paid an hourly wage, either at UWO or at Université de Montréal: average hourly wage of $18 (including benefits) for 500 hours, total $9000.

Travel: Canadian team members: Travel to Spain for Pato and Heap (two return flights, each $1200), and subsistence for two stays of 15 days each (30 days @ UWO rate of $125 / day), total $6150. International team members: Travel to Canada from Spain for Perea and Fernández-Ordóñez (two return flights, each $1200), travel from Tokyo for Ruiz Tinoco (return flight $2000), subsistence for three stays of 10 days each (30 days @ UWO rate of $125 / day), total $8150. Students: Travel to conference and team meeting in Montréal or other North American conference (2 trips at $500 each) and subsistence for 2 stays of 7 days each (14 days @ $125 / day), total $2750.
Professional / Technical consulting: For support in design and implementation of relational databases and web-forms, 100 hours @ $52 / hour (UWO Information Technology Services consulting rate): $5200.

Other supplies and related expenses: While every effort will be made to communicate via e-mail where possible, considerable postage, courier and telephone expenses will be incurred during team members’ travel. Modest amounts of computer consumables (paper) are needed, and at least one network laser printer cartridge will also be required for this project: total $2000.

Computer equipment: Very little is required: the Theoretical and Applied Linguistics Lab at UWO, like our co-applicants’ facilities at their institutions, has adequate computer equipment and software for this project. The only new requirement is a portable (notebook) computer for the PI to use in demonstrating the project during site visits ($1800) and related new software licenses ($500).

All funds are budgeted for one year only (2006-2007), total: $62 550.
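The line items in this budget justification can be cross-checked with a short script. The following is a minimal sketch in Python: the amounts are those stated in the justification, while the variable names and category labels are illustrative only.

```python
# Cross-check of the budget line items (all amounts in Canadian dollars,
# taken from the budget justification; labels are illustrative).
line_items = {
    # One Ph.D. stipend plus one M.A. stipend
    "Graduate students": 15_000 + 12_000,
    # 500 hours at an average of $18 / hour, benefits included
    "Undergraduate students": 18 * 500,
    # Two return flights to Spain plus 30 days of subsistence at $125 / day
    "Travel: Canadian team": 2 * 1_200 + 30 * 125,
    # Two flights from Spain, one from Tokyo, 30 days of subsistence
    "Travel: international team": 2 * 1_200 + 2_000 + 30 * 125,
    # Two trips at $500 each plus two 7-day stays at $125 / day
    "Travel: students": 2 * 500 + 2 * 7 * 125,
    # 100 hours of consulting at $52 / hour
    "Technical consulting": 100 * 52,
    # Postage, courier, telephone, paper, printer cartridge
    "Supplies and related expenses": 2_000,
    # Notebook computer plus software licenses
    "Computer equipment": 1_800 + 500,
}

total = sum(line_items.values())

if __name__ == "__main__":
    for label, amount in line_items.items():
        print(f"{label:<32}{amount:>8,}")
    print(f"{'Total':<32}{total:>8,}")
```

The subtotals computed this way agree with the figures stated for each category, and their sum matches the one-year total requested.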
