The ARC Project:
Creating Logical Models of Gothic
Cathedrals Using Natural Language
Processing
Charles Hollingsworth (cholling@gmail.com)
Stefaan Van Liefferinge (svlieffe@uga.edu)
Rebecca A. Smith (rsmith17@uga.edu)
Michael A. Covington (mc@uga.edu)
Walter D. Potter (potter@uga.edu)
This research benefited from the generous support of a Digital
Humanities Start-Up Level 1 Grant from the National Endowment for the
Humanities (Grant Number HD5110110), a University of Georgia
Research Foundation Grant, and from The University of Georgia
President's Venture Fund.
About ARC
• ARC (Architecture Represented Computationally) is a
collaborative project between architectural historians and
artificial intelligence researchers
• Our goal is to assist architectural historians (and others) with
the task of gathering and using information from
architectural descriptions
• Specifically, we aim to create a logical representation for
Gothic cathedrals, closely tied to the semantics of natural
language, that reflects the mental model historians have of
the "typical" Gothic cathedral.
• This model can then be used to create representations of
specific cathedrals based on verbal descriptions
Why Gothic?
• Gothic cathedrals are
major monuments of
cultural heritage
• Gothic is particularly
suited for logical analysis
• Structure follows a logical
form
• Many typical features,
such as pointed arches
and cruciform floor plan
• Much repetition of
elements, such as
columns and vaulting units
Basic Outline of ARC
Superuser Mode
• A small set of superusers create and edit the generic model
of a Gothic cathedral
• Consists of features all or most Gothic cathedrals have in
common
Administrator Mode
• Administrators input information about specific buildings
• Need only describe how they differ from the generic model
User Mode
• Cannot add new information, but can submit queries about
the model
• Can test models for completeness and consistency
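The split between a generic model and per-building differences can be pictured as a layered lookup. The following is only a minimal sketch, not the ARC implementation: the slot names (`floor_plan`, `elevation_stories`, and so on), the function `build_specific_model`, and the example building are all hypothetical.

```python
from collections import ChainMap

# Hypothetical generic model: features most Gothic cathedrals share.
# None marks a slot the generic model leaves open.
GENERIC_CATHEDRAL = {
    "floor_plan": "cruciform",
    "arch_type": "pointed",
    "elevation_stories": None,
    "nave_vaulting_units": None,
}


def build_specific_model(differences):
    """Layer a building's stated differences over the generic model."""
    # ChainMap looks in `differences` first, then falls back to the generic model.
    return dict(ChainMap(differences, GENERIC_CATHEDRAL))


# An invented building, for illustration only.
example = build_specific_model({"elevation_stories": 4})
print(example["floor_plan"])         # inherited from the generic model
print(example["elevation_stories"])  # supplied by the administrator
```

The administrator supplies only the differences; everything else falls through to the generic model.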
ARC English: An Architectural
Description Language
• At the superuser level, ARC is an exercise in natural
language programming
• Rather than enter information using Prolog or other
programming language syntax, the superuser will enter
information in "ARC English"
• This is a true subset of English that is expressive enough to
describe the necessary architectural entities, their
properties, and their relationships (spatial and functional) to
each other.
• It should allow for multiple ways of expressing the same
idea, rather than enforcing a strict syntax in the manner of
programming languages
Example of ARC English
A column is a type of support. Every column has a base,
a shaft, and a capital. Most columns have a plinth. The
base is above the plinth, the shaft is above the base, and
the capital is above the shaft. Some columns have a
necking. The necking is between the shaft and the capital.
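To make the idea concrete, the paragraph above can be turned into relational facts by a toy pattern reader. This is a sketch under strong assumptions, not the ARC parser: it recognizes only the four sentence shapes used in the example, the fact names (`is_a`, `has_part`, `above`, `between`) are invented for illustration, and it does not accept the multiple phrasings that ARC English is meant to allow.

```python
import re

TEXT = (
    "A column is a type of support. Every column has a base, "
    "a shaft, and a capital. Most columns have a plinth. The "
    "base is above the plinth, the shaft is above the base, and "
    "the capital is above the shaft. Some columns have a "
    "necking. The necking is between the shaft and the capital."
)


def singular(noun):
    # Crude singularization; enough for "columns" -> "column".
    return noun[:-1] if noun.endswith("s") else noun


def read_facts(text):
    """Translate each recognized sentence into one or more fact tuples."""
    facts = []
    for sentence in re.split(r"\.\s*", text):
        s = sentence.strip().lower()
        if not s:
            continue
        m = re.fullmatch(r"an? (\w+) is a type of (\w+)", s)
        if m:
            facts.append(("is_a", m.group(1), m.group(2)))
            continue
        m = re.fullmatch(r"(every|most|some) (\w+) ha(?:s|ve) (.+)", s)
        if m:
            quantifier, whole = m.group(1), singular(m.group(2))
            # Each "a X" / "an X" in the remainder names one part.
            for part in re.findall(r"\ban? (\w+)", m.group(3)):
                facts.append(("has_part", quantifier, whole, part))
            continue
        m = re.fullmatch(r"the (\w+) is between the (\w+) and the (\w+)", s)
        if m:
            facts.append(("between", m.group(1), m.group(2), m.group(3)))
            continue
        above = re.findall(r"the (\w+) is above the (\w+)", s)
        if above:
            facts.extend(("above", upper, lower) for upper, lower in above)
            continue
        raise ValueError(f"unrecognized sentence: {sentence!r}")
    return facts


facts = read_facts(TEXT)
for fact in facts:
    print(fact)
```

Keeping the quantifier ("every", "most", "some") inside each `has_part` fact preserves the difference between universal and defeasible statements for later reasoning.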
Some challenges
• Referring to unnamed entities: Skolem functions are used in
place of proper nouns, allowing us to describe properties of
hypothetical or nonspecific entities such as "each column's
base"
• Context sensitivity: When we say "the nave" or "the capital",
which one are we referring to? This depends on what was
said in previous sentences. Analysis takes place at the level
of discourse, not at the sentence level.
• Defeasible reasoning: "Most columns have a necking"
makes no definite universal claim; it allows for the possibility
that a particular column has no necking
• Partial ordering: If we're told only that the capital is above
the shaft, we don't know that it's immediately above it; a
necking may lie between them
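Three of these challenges can be illustrated in a few lines. The sketch below is an assumption-laden illustration rather than ARC's actual machinery: the helper names `skolem`, `has_necking`, and `is_above` are hypothetical, Skolem terms are represented as plain strings, and the defeasible rule is modeled as a default that recorded exceptions override.

```python
def skolem(function, *args):
    """Name an otherwise unnamed entity, e.g. base_of(column_1)."""
    return f"{function}({', '.join(args)})"


def has_necking(column, exceptions):
    """Defeasible default: assume a necking unless an exception is recorded."""
    return column not in exceptions


def is_above(upper, lower, above_facts):
    """True if `lower` is reachable from `upper` by chaining 'above' facts.

    The facts give only a partial order: a True result says nothing
    about whether `upper` sits immediately above `lower`.
    """
    frontier, seen = [upper], set()
    while frontier:
        current = frontier.pop()
        for u, l in above_facts:
            if u != current or l in seen:
                continue
            if l == lower:
                return True
            seen.add(l)
            frontier.append(l)
    return False


ABOVE = {("base", "plinth"), ("shaft", "base"), ("capital", "shaft")}

print(skolem("base_of", "column_1"))
print(has_necking("column_1", exceptions={"column_2"}))
print(is_above("capital", "plinth", ABOVE))
```

Because `is_above` only follows chains of facts, inserting a necking between shaft and capital later would not invalidate the earlier statement that the capital is above the shaft.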
From ARC English to real-world
descriptions
• No matter how carefully we design ARC English, it will never
capture the full range of English as used in scholarly articles
about architecture
• Real-world descriptions frequently contain information
irrelevant to ARC, for example historical background
• The task of the Administrator Mode software is more
information extraction than natural language programming
• The generic model tells us what hasn't been specified, and
the software can search real-world descriptions to fill in the
gaps (e.g. how many vaulting units are in the nave, whether
the columns have a necking, how many stories in the
elevation)
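The gap-filling idea can be sketched as follows: slots left open by the generic model drive a search of the description, and everything else in the text is ignored. This is a deliberately naive illustration using regular expressions; the slot names, the patterns in `SLOT_PATTERNS`, the function `fill_gaps`, and the sample description are all invented, and real information extraction would need far more robust language analysis.

```python
import re

NUMBER_WORDS = {"two": 2, "three": 3, "four": 4, "five": 5}

# Hypothetical extraction patterns, one per slot that may be left open.
SLOT_PATTERNS = {
    "elevation_stories": re.compile(
        r"\b(two|three|four|five)-stor(?:y|ey) elevation\b"
    ),
    "columns_have_necking": re.compile(
        r"\bcolumns? (?:have|has|with) (?:a )?necking\b"
    ),
}


def fill_gaps(model, description):
    """Return a copy of `model` with open slots filled where the text allows."""
    filled = dict(model)
    text = description.lower()
    for slot, pattern in SLOT_PATTERNS.items():
        if filled.get(slot) is not None:
            continue  # already specified; nothing to look for
        match = pattern.search(text)
        if match is None:
            continue  # still a gap
        # Patterns with a capture group yield a number; others yield a flag.
        filled[slot] = NUMBER_WORDS[match.group(1)] if match.groups() else True
    return filled


# Invented description: the first clause is irrelevant historical background.
description = (
    "Built after a fire destroyed the earlier church, "
    "the building has a four-story elevation."
)
open_model = {
    "elevation_stories": None,
    "columns_have_necking": None,
    "nave_vaulting_units": None,
}
filled = fill_gaps(open_model, description)
remaining = [slot for slot, value in filled.items() if value is None]
print(filled)
print(remaining)
```

Slots that remain `None` after the search are exactly the ones a description has left unspecified.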
Querying ARC
• User mode interaction with ARC recalls natural-language
database querying
• Sample queries might include "How many vaulting units are
in the nave at Saint-Denis?" or "Show me all cathedrals with
a four-story elevation."
• Whereas keyword-based web searches match only strings of
characters, the ARC software will be able to process queries on
a semantic level, returning more relevant results
• ARC queries can also tell us whether a given description is
underspecified (does not tell us all relevant information) or
contradictory (contains incompatible information)
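Both kinds of query can be sketched over a simple slot-and-value representation. The code below is a hypothetical illustration, not the ARC query engine: `REQUIRED_SLOTS`, `check_description`, `cathedrals_with`, and the buildings "Cathedral A" and "Cathedral B" are invented, and the natural-language front end that would translate a question into such a call is omitted.

```python
# Hypothetical list of slots a complete description must specify.
REQUIRED_SLOTS = ("elevation_stories", "nave_vaulting_units")


def check_description(assertions):
    """Report underspecified and contradictory slots.

    `assertions` is a list of (slot, value) pairs gathered from a description.
    """
    values = {}
    for slot, value in assertions:
        values.setdefault(slot, set()).add(value)
    underspecified = [s for s in REQUIRED_SLOTS if s not in values]
    contradictory = sorted(s for s, vs in values.items() if len(vs) > 1)
    return underspecified, contradictory


def cathedrals_with(models, slot, value):
    """Answer queries such as 'all cathedrals with a four-story elevation'."""
    return sorted(name for name, m in models.items() if m.get(slot) == value)


# Invented data, for illustration only.
models = {
    "Cathedral A": {"elevation_stories": 4},
    "Cathedral B": {"elevation_stories": 3},
}
missing, conflicts = check_description(
    [("elevation_stories", 3), ("elevation_stories", 4)]
)
print(cathedrals_with(models, "elevation_stories", 4))
print(missing, conflicts)
```

A slot with no recorded value is underspecified, while a slot with more than one distinct value is contradictory.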