Structured Annotation for Land Grant Research:
Interdisciplinary Collaboration and
Database Modeling for Historical GIS
By: Mary B. Ruvane
University of North Carolina
School of Information & Library Science
Date: October 22, 2004
Revised: November 2004
TABLE OF CONTENTS
INTRODUCTION
BACKGROUND
PROJECT SCOPE: LOCATING THE INDIAN TRADING PATH
LAND GRANT INFORMATION: SOURCES, CHARACTERISTICS & DATABASE CHALLENGES
North Carolina Land Grant System
Organization & Condition of Land Grant Records
18th Century British-American Handwriting
Surveying Techniques & Measurement Systems
Additional Information Sources for Locating Parcels
GEOGRAPHER’S INITIAL DATABASE DESIGN
DATABASE COLLABORATION
STEP 1: EVALUATION OF THE GEOGRAPHER’S ORIGINAL DATABASE MODEL
Limitations
Structure Evaluation
• Flat-File Model
• Data Fields: Entity Duplication
• Data Fields: Compound Attributes
• Data Fields: Commingled Information
STEP 2: UNDERSTANDING THE GEOGRAPHER’S INFORMATION NEEDS
Mapping a Tract of Land: Primary Clues
• General Clues - Area or Connecting: Land Office, Basin, County, Features
• Specific Clues - Vicinity or Adjacent: Features, People Names
• General & Specific Clues: Comparing Characteristics
• Parcel Shape Clues: Angles, Distances
Overlooked Clues: Structure related, bibliographic, annotations
• Structural: Multiple Variables, Commingled Data
• Additional: bibliographic reference
New Clues: Entries, Warrants, Deeds and More
STEP 3: CHANGES IMPLEMENTED - THE NEW DATABASE MODEL
The Relational Model
• Limit redundancy
The Data Entry Form
Parse and Import Geographer’s Data
REMAINING OBJECTIVES
COLLABORATION SUMMARY
CITED REFERENCES
APPENDIX A: DATA ENTRY FORM & ORIGINAL DATABASE FIELDS
APPENDIX B: ENTITY RELATIONSHIP DIAGRAM
INTRODUCTION
Last year a PhD candidate in the Department of Geography at UNC-CH, G. Rebecca
Dobbs, began researching the Indian Trading Path and the role it may have
played in the evolution of today’s North Carolina Piedmont urban centers. One objective of her
ongoing project, as an historical geographer, is to demonstrate the Path’s influence on the
settlement patterns of Europeans who migrated into North Carolina during the latter half of the
18th century. Using a Geographic Information System (GIS), Dobbs intends to build a multimedia
time-based map in support of her findings to illustrate the physical location of land
settled in relation to its date of occupation and proximity to the Trading Path. Her ultimate
purpose is to relate the Path’s influence on the early settlers’ site selection to the consequent
emergence of present day city hubs in central North Carolina. The majority of information
facilitating the task of building the GIS map is being culled from original 18th century land
grant documents.
Initially the geographer took it upon herself to design a simple Microsoft Access
database to store and organize the evidence discovered in each land grant record, but after
working with a sample of the records it became clear to Dobbs that modifications were in
order. After discussing the database’s shortcomings with me, a PhD student in information
science, we agreed to collaborate on reengineering the geographer’s original data model. As
of this writing some major improvements have been implemented while minor updates are
pending. As the project progresses additional enhancements will be incorporated as needed.
The first half of this paper provides an overview of the geographer’s historical research
project, her information needs, the information characteristics, and the shortcomings of her
initial database. Part of the original deficiencies centered around the ambiguous nature of
historical material, while others were more directly related to the geographer’s limited
experience with relational databases and the modeling of complex data. The remaining
sections discuss our collaboration process, the database modifications implemented by the
information scientist (me), the results, and implications for future research.
BACKGROUND
Project Scope: Locating the Indian Trading Path
The geographer’s study area is in North Carolina, limited to counties lying in the
Piedmont region of the state situated between the mountain region to the west and the coastal
plain to the east. It is through this land that a section of the Indian Trading Path ran, continuing
along its route in the NE into Virginia or along its southerly track crossing in the SW into
South Carolina (see: Figure 1). Land grant paperwork processed between 1748 and 1763,
coinciding with the most active years of the Granville District Land Office, was chosen by the
geographer as the most reliable source for evidence of the Path’s position in relation to the
tracts settled. As the project progresses the resources examined and the time period under
study will necessarily expand or contract as the study’s objectives dictate.
Several historic maps of regional or state scale, along with anecdotal materials, exist to
substantiate the Indian Trading Path’s importance during this Colonial settlement period [1-4].
Unfortunately there is no known map at the local-scale needed to illustrate the progression of
parcel occupancy as settlers moved in and began making claims to the land. Therefore to
effectively observe the influence the Indian Trading Path may have had on a settler’s site
selection requires that the geographer document each tract of land to determine its physical
size and location in conjunction with its relationship to neighboring properties, geographic
features and distance from the Trading Path; these findings will be presented in a digital GIS
format.
Figure 1. Route of the Indian Trading Path through NC Piedmont
Map Source: based on original by G. R. Dobbs; minor legend modifications by M.B. Ruvane.
Land Grant Information: Sources, Characteristics & Database Challenges
During the early settlement period of the NC Piedmont Area the distribution of land
was recorded, albeit inconsistently, by the administrative entities of the day. Assorted
documents exist, collectively referred to as land grant records, which provide information
ideally suited for geographically positioning individual land parcels within a GIS map.
Especially useful has been parcel measurement information contained in survey documents.
Originals of these land grant records are housed in the NC State Archives in Raleigh and are
viewable in microfilm format onsite, or reels may be purchased for viewing elsewhere. After
struggling with the logistics of travel to the State Archives, dealing with the inadequate
equipment, limited hours and the high price for poor quality copies, the latter option was
ultimately adopted by Dobbs.
North Carolina Land Grant System
Figure 2. 18th Century land administered by two systems
Map Source: G.R. Dobbs
There were two land grant systems in use in the NC Piedmont area during the later part
of the 18th century. If land was in the Granville District, which consisted of the northern half
of the State, the process involved Lord Granville’s¹ agents and was recorded in his books. If
land was south of Lord Granville’s Line, the process involved Colonial, and later State, agents
and their records. In either instance the documents typically fall into
one of four categories representing individual stages of the land grant process, which was
initiated by the recording of an entry document and culminated with the issuance of a deed.
The first stage in the land grant process involved an application for a certain tract of
land by a settler, resulting in an entry record of the request. The second stage was the issuance
of a warrant by the Land Official, a document authorizing a surveyor to go out and survey the
land applied for. Stage three was the survey, performed by an assigned surveyor who would
physically measure the land and prepare a document containing his findings. The final stage
was the grant of land (or deed), a document authorizing the grantee rights to the property. In
the Granville District processing of an individual request for land ideally should have
transpired over an 18-month period, the survey completed within 6 months of a recorded entry
and the deed issued within 12 months of the survey, but this goal was frequently not realized.
According to Mitchell, a ten-year lapse between the date of the survey and the date of the deed
was common [5].
¹ Lord Granville was also referred to as John Carteret, Earl of Granville.
The information contained in these land grant records provides essential clues needed
by the geographer to construct her GIS map depicting the locations and sequence of land
occupation. For instance the transaction dates, the county a tract resides in, parcel dimensions,
people relationships (e.g., grantee, assignee, neighbors, surveyor, witnesses, chain carriers),
and feature names and characteristics (e.g., rivers, paths, fields, a ridge) all play a role in
piecing the puzzle together. Collecting this evidence requires that each document be manually
reviewed by the geographer in order that pertinent data can be recorded in her database.
Typically an entry record provides a brief description of the vacant land, the county it resided
in (at the time), the estimated acreage, and the name of the person hoping to purchase the land.
A warrant supposedly repeats the parcel’s description “exactly” as written in the entry record
and is signed by the agent authorizing a survey.
Figure 3. Example: Plat of survey document
The ‘plat of survey’ documents contain a small map of the land and a written description
of the property including boundary measurements with directional indicators. In addition the
name(s) of the grantee(s) and surveyor are specified along with the
total acreage, which often varied considerably from the estimated acreage listed in the entry or
warrant. Frequently included in the narrative, or drawn on the plat, are references to
geographic and cultural features, neighbors, and chain carriers. The deed documents contain
the signature of the grantee, the indenture agreement² between the grantee and Land Official,
and in records from the Granville District are normally accompanied by a copy of the survey.
In most cases the date of each document’s transaction is clearly indicated. One other record,
less frequently encountered by the geographer, is a paper assigning the rights to the land to
another person; sometimes these transfers are simply noted on the back of a survey or warrant.
Organization & Condition of Land Grant Records
The majority of original land grant documents housed in the NC State Archives have
been permanently removed from circulation for preservation purposes. Instead, to view these
materials visitors are provided access to well-worn microfilmed copies of the paper work. The
reels are filed by record series based on the issuing Land Office, either the State’s or Lord
Granville’s, arranged alphabetically by county, then by the surname of the grantee. Warrant
and plat [6, 7] documents are collated together within one record series while the deeds [8, 9]
are grouped separately in another. In the Granville records entry documents are also included
in the former mentioned series. In some instances the alphabetical order by grantee is not
adhered to, especially in the deed records as opposed to the warrants and plats.
In the Granville District survey documents can be found in both record series: the
warrants and plats and the grant of deeds. In the deed series a survey frequently precedes its
related deed on the microfilm; this does not appear to be the case for State records. In the
warrant and plat series the documents are intermingled, and may or may not be related. While
an ‘exact’ duplicate of a survey found in a deed series could be included in the warrant and
plat series, the same is not always true in reverse. If a tract was surveyed but ultimately did
² A fee simple agreement that included an annual quit rent clause authorizing the grantee legal rights to the
property.
not convey to the person it was surveyed for, the completed survey would only be found in the
warrants and plats series.
In the Granville District a surveyor was expected to prepare three copies of each
survey: one for the Land Grant Office, the second for attachment to the grantee’s indented
deed, and a third for attachment to a duplicate of the deed (signed by the grantee) to be sent to
Lord Granville in London (although this did not always happen). Finding three surveys
together is a good indicator that the grant of deed transaction never occurred. It has not been
determined at this stage of the geographer’s research whether three copies of each survey were
required by the Colony’s or State’s Land Office.
The NC State Archives cautions researchers that although the entry, warrant, and
survey [6] records of the Granville District are filed together, there is no guarantee that
contiguous documents refer to the same parcel of land. They further warn that multiple entries
‘of the same date for the same person for land in the same county’ often exist, preventing a
precise match between these vague documents and other entry papers, warrants, plats or
grants. Additional pitfalls to be mindful of include land interests that may have been assigned
to another party and the frequent shifts in county boundaries. It was not uncommon for a
parcel to reside in one county at the beginning of the land grant process that midway through
became incorporated into another.
Figure 4. Example of damaged land grant document
The condition of land grant records varies. While many are intact, others over the course
of years and through physical use have faded or been damaged, making interpretation difficult.
For some, only
remnants of the original document remain, rendering them useless for this project. In a
number of cases only portions of the content can be deciphered due to smears and blemishes
that have obscured the writing. Adding to these readability challenges are the inherently poor
quality of microfilm images and limitations with the equipment available for reading them.
To date, the Land Entries, Warrants, and Plats of Survey [6] and the Grants of Deeds
[8] series related to the Granville District have provided the majority of evidence entered into
the geographer’s new³ database. These consist of 14 reels and 19 reels of material respectively,
representing thousands of land parcels. The State land records are pending data entry. There
are other record series that may be incorporated as the project progresses, but their suitability
has not been thoroughly explored.
18th Century British-American Handwriting
Deciphering hand-written land grant documents from the colonial days can be quite a
challenge, yet with a little practice it usually can be done according to Dobbs. In the 18th
century spelling was not standardized [10]. Words were often spelled phonetically,
³ The geographer’s original database was utilized for entering approximately 800 initial records before migrating
her data into the new database designed by the author (me).
abbreviated, or simply shortened with either superscript notation or no indication of the
missing letters at all. Within the same document different spellings of the same word can
often be found. The lower case s is frequently written in a style that today would be
interpreted as an f. Additionally, depending on the penman, certain upper case letters can look
similar such as K, P, and R or J and T.
An added complication to interpreting these documents, and especially in designing an
effective database for comparing similar terms, is that proper names were just as likely to be
distorted in this cryptic prose. A person’s first name was commonly abbreviated; for example
Jno could represent the proper name for John, Jonathan, or even Jonas. Last names
encountered the same imprecision; for instance there are multiple spellings recorded in
Dobbs’s database for Sherrill, such as Sherrill, Sherill, Sherrel, and Shirill. The most common
spelling found in today's NC phone directory is the first. Another case in point is the set of
phonetically equivalent spellings of a creek named Lyle, which according to the Getty
Thesaurus of Geographic Names [11] is the preferred spelling over the vernacular versions
Lyles or Liles. Currently the database includes potential matches such as Lyles, Liles, Lylles,
Lyleses, Lillis, Lilses, and Lileses. But whether any of these fuzzy similarities actually refer to
the same proper name is yet to be resolved.
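One way such potential matches could be surfaced for review is a simple string-similarity pass over the recorded spellings. The following is a minimal sketch, not the method used in the project: it assumes Python's standard `difflib` module, and the function names and the 0.6 threshold are hypothetical choices for illustration. It only flags candidates; deciding whether two spellings truly refer to the same name remains a judgment for the researcher.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio between two spellings, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def candidate_matches(target, spellings, threshold=0.6):
    """List recorded spellings whose similarity to `target` meets the threshold.

    The threshold is an assumed value; it would need tuning against real data.
    """
    return [s for s in spellings if similarity(target, s) >= threshold]


# Spellings taken from the examples in the text, plus one unrelated creek name.
creek_spellings = ["Lyles", "Liles", "Lylles", "Lyleses",
                   "Lillis", "Lilses", "Lileses", "Coddle"]

for spelling in candidate_matches("Lyle", creek_spellings):
    print(spelling, round(similarity("Lyle", spelling), 2))
```

A character-based ratio like this misses some phonetic equivalents, so a phonetic encoding or manual review would likely still be needed for the more distant variants.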
This ambiguous writing style creates a dilemma when transcribing information for use
within a database. While it was tempting from the geographer’s standpoint to ‘substitute on
the fly’ a modern translation for an apparently archaic or misspelled word found in a document
in order to standardize terms for searching, I recommended she enter information as it is
written leaving uncertain language for comparison with the larger body of data as it grows.
Any conscious revisions to original spellings presented in the document she agreed would be
noted in the future. Regardless of the approach taken a method should be determined in
advance on how to indicate illegible words, uncertain letters, and translations inserted by the
data entry person. Where possible the method chosen should be employed consistently and the
process clearly documented.
Surveying Techniques & Measurement Systems
Prior to the 18th century a property line might have been described by ‘the sweep of an
arm from a rock by a river to a distant tree’. By the 18th century the common practice in use
by surveyors in the 13 colonies and other parts of the eastern states was the more ‘precise’
metes and bounds surveying system [12, 13]. This system incorporated measurements taken
between landmarks to more accurately distinguish a property’s boundaries such as
‘…beginning at a red oak, running East 70 [chains] to a pine…’ [14]. The units of measure in
the metes and bounds system were based on chains (100 iron or steel links with a total length of
66 feet) or poles (a unit of length equal to 16.5 feet). Poles were also referred to interchangeably
as perches or rods.
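The unit relationships above can be written down as a small conversion table. This is an illustrative sketch only; the dictionary and function names are assumptions, and the link length is derived from the stated 100 links per 66-foot chain.

```python
# Feet per surveying unit, following the definitions in the text.
FEET_PER_UNIT = {
    "chain": 66.0,        # 100 links
    "link": 66.0 / 100,   # derived: one hundredth of a chain
    "pole": 16.5,
    "perch": 16.5,        # alternate name for a pole
    "rod": 16.5,          # alternate name for a pole
}


def to_feet(length: float, unit: str) -> float:
    """Convert a survey measurement to feet."""
    return length * FEET_PER_UNIT[unit.lower()]


# 'running East 70 [chains] to a pine' expressed in feet
print(to_feet(70, "chain"))
# Four poles make up one chain
print(to_feet(4, "pole"))
```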
Interpreting and recording these parcel measurements is the crux of the geographer’s
project, for without the size, shape, and geographic orientation of each tract it would be
impossible for her to map them. Chains were typically the unit of measurement employed.
Many of the plats in the study area consist of four measured sides forming a rectangle,
although numerous exceptions can be found where boundaries have eight or more measured
sides or a meandering stream forms one or more sides of the property line. This distinction
becomes important when working with GIS technology. Parcels constructed of enclosed linear
boundaries need to be interpreted and drawn as polygons in the system, whereas those with
borders defined by waterways initially must be recognized and drawn as lines. More about
this later.
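For the enclosed, straight-sided case, the step from recorded angle and distance pairs to a drawable polygon can be sketched as follows. This is a hedged illustration rather than the project's GIS workflow: the four survey calls are hypothetical, directions are assumed to be expressed as azimuths in degrees clockwise from north, and distances are assumed to be in chains. The acreage conversion uses the standard relationship of 10 square chains to the acre. Boundaries formed by waterways are not handled here.

```python
import math


def traverse(calls, start=(0.0, 0.0)):
    """Walk (azimuth_degrees, distance) calls and return the visited points.

    Azimuths are measured clockwise from north, so east is 90 degrees.
    """
    x, y = start
    points = [(x, y)]
    for azimuth, distance in calls:
        theta = math.radians(azimuth)
        x += distance * math.sin(theta)  # easting
        y += distance * math.cos(theta)  # northing
        points.append((x, y))
    return points


def shoelace_area(vertices):
    """Area of a simple polygon from its vertices (shoelace formula)."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(vertices, vertices[1:] + vertices[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# Hypothetical rectangular tract: East 70, South 40, West 70, North 40 (chains).
calls = [(90, 70), (180, 40), (270, 70), (0, 40)]
points = traverse(calls)

# Distance between the starting corner and the end of the last call;
# a large gap would suggest a transcription or survey error.
closure_gap = math.hypot(points[-1][0] - points[0][0],
                         points[-1][1] - points[0][1])

area_sq_chains = shoelace_area(points[:-1])  # drop the repeated closing point
acres = area_sq_chains / 10                  # 10 square chains per acre

print(round(closure_gap, 6), round(area_sq_chains, 2), round(acres, 2))
```

Comparing the computed acreage with the acreage written on the survey is one plausible consistency check on a transcribed record.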
Additional Information Sources for Locating Parcels
Numerous additional resources have been employed by the geographer to facilitate
interpretation of, or expand upon, the information contained in the land grant records. These
include but are not limited to cartographic material, books, and manuscripts. Most useful have
been maps depicting selected themes in the 18th century drawn at various scales. For example
Collet’s 1770 map [1] is a survey of the entire state of North Carolina representing geographic
features (e.g., rivers, trees, counties, cultural features, settlements) of the period, while
Markham’s maps [2] prepared in 1973 illustrate the location and ownership of land parcels in
old Orange County between 1743-1810. An Atlas of Historical County Boundaries [15]
details the shifts in NC administrative borders from the time of colonial settlement up to 1998.
Ramsey [3] published a book describing the settlement of Rowan County between
1747-1762, which also contains map illustrations. A prime reference is Powell’s North
Carolina gazetteer [16], a dictionary of NC names and places including variant historical
spellings and aliases. Additional material appears to be available at the Durham County
Library, in Durham, NC and in two special collections maintained by the Wilson Library at the
University of North Carolina, but as of this writing they have not been fully explored by
Dobbs. These are but a few of the more authoritative sources that add credence to, and assist
with the analysis of the evidence found in the land grant records.
Geographer’s Initial Database Design
At the outset it seemed clear to Dobbs that the survey records held the primary
content needed for constructing the overall GIS land parcel map. With this in mind the first
task she undertook was to become familiar with the survey documents, the evidence they
provided, and what data needed to be collected. The geographer then considered her study’s
objectives, sketched out the process, and proceeded to build on her own a database tool for
storing parcel measurements and relevant ‘incidental’ clues, such as the names of people,
features, and selected characteristics mentioned in each survey.
Dobbs’s initial design worked well, handily facilitating a method for storing and
organizing essential information contained in the documents. As expected, she made a few
modifications along the way to improve the data entry process and to incorporate new facts.
Unfortunately the first test of her database’s usefulness, after entering records from a sample
of the study area, offered insight into its limitations and problems: the database’s information
retrieval capabilities were ineffective for her project’s needs.
DATABASE COLLABORATION
Having previously collaborated with me [the Author], and encouraged by my
enthusiasm to jump right in and help, the geographer agreed to discuss the issues she was
having and to explore possible database modifications. Based on our first conversation it sounded
like the clues she needed for identifying likely parcel adjacencies had not been thoroughly
considered in terms of her database’s design. The first step was to evaluate her database’s
strengths and weaknesses. The second was to learn about her information needs and
understand the characteristics of the land grant documents she was working with (described
previously). The third was for me, as the information scientist, to implement changes based on
my area of expertise and the geographer’s needs.
Step 1: Evaluation of the Geographer’s Original Database Model
Dobbs’s original database consisted of 44 fields (see: Appendix A). It was evident that
the data being collected from the survey documents had been well thought out capturing the
essential evidence required to differentiate each tract, although certain limitations and structure
concerns needed to be addressed.
Limitations
As mentioned, the search capabilities of the geographer’s database had proven to be
ineffective during a trial run with sample data she had entered. The retrieval problems were
due to the variety of methods being employed for storing multiple values, as explained in the
next section. The second limitation was the inability to easily incorporate new types of data.
This was rooted in the original design’s singular focus on capturing information found in the
survey records, the documents the geographer initially deemed most significant to the
study.
The need to incorporate content from additional sources had become evident during the
process of entering sample survey data from one county. Dobbs discovered that many of the
surveys she was recording in her database provided little detail and other parcels known to
exist seemed to be missing entirely. Upon further review it appeared that the entry, warrant,
and deed documents held promising information to fill in these gaps, and potentially other
resources would be useful. Collecting content from different records had not been planned for
in her original database design. To do so would require new fields and a way to differentiate
each source. Therefore, the overall goal of our collaboration was to determine a method to
incorporate new document types and improve the database’s search capabilities.
Structure Evaluation
• Flat-File Model
The geographer’s initial database design was a flat-file, a simple database
management model for storing data in one table (see: figure 5). Dobbs had chosen to work
with MS Access, a relational database application intended for use with multiple tables.
The allure of Access was its feature for building customized forms to expedite data
collection, a characteristic not found in spreadsheet applications typically employed for
single table modeling. Although a flat-file provides a good method to store information it
offers less flexibility for posing queries, dealing with large disparate quantities of data, and
for customizing. As the content collected in each database field began to exceed its original
intent and new fields were added, the efficiency of the geographer’s single table design had
greatly diminished.
• Data Fields: Entity Duplication
Most of the 44 fields in Dobbs’s database were redundant: over half were devoted to
collecting paired survey measurements while other multiple fields were being used to store
people names and feature names (see: figures 5, 6, 7 and 8). Creating separate fields for
entities that share common attributes adds unnecessary complexity to a database’s search
criteria. For instance, when looking for people with the same name an advanced union
query would be necessary to join the seven individual ‘person name’ fields into one list for
comparison. To make matters more complicated, content entered into these fields was
inefficiently formatted, as described in the next two subsections.
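The cost of spreading one entity type across several columns can be illustrated with a small in-memory example. This is a sketch under stated assumptions, not the geographer's actual schema: it uses Python's standard `sqlite3` module instead of MS Access, only three of the person name fields are modeled, the table and column names are hypothetical, and the rows are invented for illustration using names that appear as examples in this paper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Flat-file style: one column per person role (hypothetical column names).
conn.execute(
    "CREATE TABLE survey_flat ("
    "survey_id INTEGER, surveyed_for TEXT, adjacent_to TEXT, chain_carrier TEXT)"
)
conn.executemany(
    "INSERT INTO survey_flat VALUES (?, ?, ?, ?)",
    [
        (1, "George Tate", "John Chew", "Moses Andrew"),
        (2, "John Chew", "Alexander Osborn", "George Tate"),
    ],
)

# A union query is needed to stack the separate name columns into one list.
union_sql = """
    SELECT survey_id, 'surveyed_for' AS role, surveyed_for AS name FROM survey_flat
    UNION ALL
    SELECT survey_id, 'adjacent_to', adjacent_to FROM survey_flat
    UNION ALL
    SELECT survey_id, 'chain_carrier', chain_carrier FROM survey_flat
"""
flat_hits = conn.execute(
    f"SELECT survey_id, role FROM ({union_sql}) WHERE name = ? ORDER BY survey_id",
    ("John Chew",),
).fetchall()

# Relational style: one row per person per role, so the search is a plain filter.
conn.execute("CREATE TABLE person_role (survey_id INTEGER, role TEXT, name TEXT)")
conn.execute(f"INSERT INTO person_role {union_sql}")
normalized_hits = conn.execute(
    "SELECT survey_id, role FROM person_role WHERE name = ? ORDER BY survey_id",
    ("John Chew",),
).fetchall()

print(flat_hits)
print(normalized_hits)
```

Both queries return the same rows, but the flat version must be extended every time another name column is added, whereas the single `person_role` table absorbs new roles as data.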
Figure 6. Redundant fields used for two entity types: angles and lengths
Survey pairs identified in 24 separate fields instead of two: angles and lengths (note: pairs 6 - 12 not shown).
• Data Fields: Compound Attributes
Figure 5. One table with 44 fields
Many of the geographer’s fields held compound
attributes in a single field. For example the seven ‘person
name’ fields contained full names such as “Robert Samuel
Barshear Jr.” or “Reverend John Thompson” (see figure 7)
instead of being divided into separate fields (e.g., suffix,
first, middle, last, and prefix). This method only allowed for
sorting on a person’s first name. For searching by a person’s
last name, in a one field entry such as this, the better method
would have been to enter the last name first separated by a
comma, such as “Barshear, Robert Samuel, Jr.”.
Figure 7. Redundant fields used for one entity type: people names
People names identified in eight separate fields instead of one (note: surveyor field not shown)
For the geographer’s project individual fields are likely to work better in facilitating
searches on imprecise data such as peoples’ names. Being able to compare first name, last
name, or any combination may assist in finding phonetic matches or possible spelling
alternatives. The same single field format was an issue in her feature name entries, but
these fields had additional troubles (see: figure 8). They not only included entity
duplication (e.g., two columns) but also were seriously compromised by commingled data
(discussed below). Fields for people names also contained commingled data, but less
often. Because of these formatting issues searching for parcels sharing similar features or
with related people was essentially futile in the geographer’s original database model.
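Dividing a full name into separate fields could be approximated with a few positional rules. The sketch below is a simplification and not the import routine used in the project: the prefix and suffix lists are assumed, the function name is hypothetical, and names are assumed to be written in "prefix first middle last suffix" order. Real records would need a much longer list of titles and manual review of exceptions.

```python
# Assumed, deliberately short lists of titles and suffixes.
PREFIXES = {"reverend", "rev."}
SUFFIXES = {"jr.", "jr", "sr.", "sr"}


def split_name(full_name: str) -> dict:
    """Split a 'prefix first middle last suffix' style name into fields."""
    tokens = full_name.replace(",", " ").split()
    parts = {"prefix": "", "first": "", "middle": "", "last": "", "suffix": ""}
    if tokens and tokens[0].lower() in PREFIXES:
        parts["prefix"] = tokens.pop(0)
    if tokens and tokens[-1].lower() in SUFFIXES:
        parts["suffix"] = tokens.pop()
    if tokens:
        parts["first"] = tokens.pop(0)
    if tokens:
        parts["last"] = tokens.pop()
    # Whatever remains between the first and last name is treated as middle.
    parts["middle"] = " ".join(tokens)
    return parts


print(split_name("Robert Samuel Barshear Jr."))
print(split_name("Reverend John Thompson"))
```

With the parts stored separately, sorting or comparing on the last name alone no longer depends on how the full name happened to be typed.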
• Data Fields: Commingled Information
Commingling of unique data within individual fields was causing the most
conspicuous impediment to the geographer’s ability to effectively search her database.
Multiple people, of the same type, were being entered together into the single field she had
allocated for capturing this information. For example a grantee’s name (field titled:
surveyed for) occasionally held two or more names formatted as follows: “George Tate
[and] John Chew”. Similarly a field for documenting neighbors (field titled: Adjacent to)
might contain three or more names such as: “John Beavard, Alexander Osborn [and] John
McConil.”
Clarification notes were also being added to the mix, especially in fields designed to
identify features, causing another snag (see: figure 8). For example, one feature field (field
titled: Location Keywords) combined the name of a creek, location information, and the
geographer’s clarification notes to herself: “head branch of Coddel Creek (now Coddle,
Codle on Collet map)” [17]. Fields designed for recording people’s names were not
immune to this unstructured annotation practice either. For instance, one neighbor’s field
contained “Moses Andrew (or near); George Davison must be somewhere near.” This
practice of entering several entity types along with observation notes into one
common field was prevalent throughout the original database, making its resolution a
high priority.
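The separation described above can be sketched in a few lines of Python. This is only an illustration, not the procedure used in the project: the helper name `split_people_field` is hypothetical, and the separators it recognizes ("[and]", commas, semicolons, and parenthetical remarks) are assumed from the quoted examples alone.

```python
import re

# Separators seen in the quoted examples: "[and]", commas, and semicolons.
SEPARATORS = re.compile(r"\s*(?:\[and\]|[,;])\s*")
# Parenthetical remarks such as "(or near)" are treated as annotation notes.
NOTE = re.compile(r"\(([^)]*)\)")


def split_people_field(raw):
    """Return a list of (name, note) pairs from one commingled field value."""
    entries = []
    for part in SEPARATORS.split(raw.strip().rstrip(".")):
        if not part:
            continue
        notes = NOTE.findall(part)
        # Remove the notes from the name and normalize the whitespace.
        name = " ".join(NOTE.sub("", part).split())
        entries.append((name, "; ".join(notes)))
    return entries


grantees = split_people_field("George Tate [and] John Chew")
neighbors = split_people_field("John Beavard, Alexander Osborn [and] John McConil.")
annotated = split_people_field(
    "Moses Andrew (or near); George Davison must be somewhere near."
)
for name, note in grantees + neighbors + annotated:
    print(repr(name), repr(note))
```

The last example shows the limit of such a rule: the free-text remark about George Davison is not in parentheses, so it stays attached to the name and would still need manual review.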
Figure 8. Commingled information in two feature fields: water and transportation
Step 2: Understanding the Geographer’s Information Needs
The second step was to understand Dobbs’s information needs and what clues the new
resources she was interested in including might contain. How did the evidence collected so far
assist with placing a parcel in the correct location on a GIS map? Were there clues in
the survey documents that had been overlooked in the first database? What additional clues
would the entries, warrants, and deeds provide? What other materials might contain
information worth recording? What types of database searches did she envision employing?
These questions sparked a great deal of dialog that ultimately provided the blueprint I used for
creating the current rendition of Dobbs’s new database tool.
Mapping a Tract of Land: Primary Clues
The purpose of this collaboration is to facilitate the geographer’s task of positioning
individual land parcels in real time and space. Although a comprehensive survey document
provides the necessary measurements to reconstruct a parcel’s size and shape, the tract’s
physical location often remains uncertain without further investigation. Locating a parcel in
real space is contingent upon comparing a variety of details across multiple land grant records.
Indications of how parcels relate geographically are found by identifying tracts that share
common characteristics, offering clues ranging from the more general to the specific. The greater
the number of shared characteristics, the greater the likelihood that those parcels are in the
same general area, and ideally adjacent.
 General Clues - Area or Connecting: Land Office, Basin, County, Features
General clues help to divide parcels into broad geographic areas. For example, the
land office involved tells you whether the property was in the northern or southern half of
the State. Knowing which water basin a tract resided in provides another clue, although the
associated basin is not always clear. The county further reduces a tract’s possible position
by limiting its location to within an administrative boundary, keeping in mind that border
shifts require careful interpretation. Features provide the remaining general clues and in
some instances fit the description of specific clues described in the next section.
Within the confines of a parcel’s designated land office, basin, and county, the
features identified become a crucial aid to further narrow down a tract’s general position.
At times a feature’s location is even illustrated on a survey’s plat. Unfortunately, not every
document identifies a parcel’s features, and many that do lack sufficient context for a
definitive placement in real space. Features predominantly cited include waterways and
transportation routes, followed by less frequently mentioned cultural features such as ‘a
mill’, ‘Indian old fields’, or ‘a courthouse’. By extracting the names of features such as
rivers, streams, creeks, paths, fords, and roads, the geographer can compare them not only
with present-day map locations and historical records but also with other parcels’
characteristics.
An especially valuable clue comes from connecting features, such as waterways,
transportation routes, and land office or county lines. These types of features traverse
multiple properties along a continuous route, an inextricable link that positions a tract along
a common reference. For illustration, one survey document [18] pinpoints a tract’s location
‘…on the N side of the Catawba River, straddling Buffalo Creek, bordering the Granville
Line…’. Unfortunately, most records are not as precise, offering only vague positions such as
‘on the south side of the Yadkin River,’ which could be anywhere along a 203-mile route
[19]. As with a broken strand of beads, restringing these loose pieces into their original
order can be virtually impossible without further indicators. Nonetheless, features can offer
valuable clues, especially in combination with other evidence.
 Specific Clues - Vicinity or Adjacent: Features, People Names
Specific clues help to determine a tract’s position in relation to other parcels, either
by inferring they are ‘in the vicinity of’ or by providing a clearly stated adjacency. The best
evidence for establishing nearby or neighboring properties comes from comparing the names of people
associated with each parcel, although at times adjacencies can be surmised based upon
adequately described or unique feature clues. The grantee and the surveyor are the two ‘types’
of people most consistently recorded within the land grant documents. Less frequently
mentioned are the names of bordering neighbor(s), near-neighbor(s), chain carrier(s),
assignee(s) and other minor relationships of less value for deciphering positions.
As an example of how people’s names aid in determining adjacent parcels, one
survey document points to two neighboring tracts as follows: ‘…to a black [oak on]
William Grant’s line then [straight] along said line 48 [poles] to a [red] oak on Hugh
Dixon’s line then [east] along said line 16 [poles] to Dixon’s corner…’ [18]. Another
combines feature relationships with a nearby neighbor’s name to suggest an
approximate location: ‘…On the N side of the Catawba River, Straddles Third Creek,
about 3 miles above Thomas Gilespy's property…’ [20].
Properties not necessarily adjacent, but perhaps in the vicinity of each other, may
also be uncovered by comparing the names of chain carriers who assisted a surveyor with a
parcel’s measurements. Because long-distance travel was not practical during this era, it is
assumed that volunteer chain carriers lived somewhere near the tract being surveyed. The
exception might be a chain carrier with the same last name as the grantee, for he
presumably was a resident member of the family the land was being surveyed for.
 General & Specific Clues: Comparing Characteristics
Aside from the rare document providing precise location information for a particular
parcel, pinpointing the actual position of the majority of tracts requires some detective
work. Starting with records containing the most productive evidence, whether general or
specific, features or people, the process entails frequent back-and-forth comparison, first
to cluster likely groups of properties and then to arrange them into their
original configuration. Although the results may be fuzzy at first, by analyzing those
parcels that share general and continuous features in conjunction with those likely to be
adjacent or near-neighbors, the tedious job of placing each tract onto a GIS map usually
begins to come together.
A relational database is ideally suited for generating these comparisons and
evaluating the evidence collected. For example, with my assistance, Dobbs could develop a
two-part query based on general clues. First, find all records issued by the Granville Land
Office that lie in the same water basin within a designated county and contain the Yadkin
River. Second, select from these records only those listing neighbors, containing one or
more additional common features, or meeting any other criteria deemed relevant to achieve the
results sought. Alternatively, a query could be written using specific clues to find all
documents that include neighbors and/or chain carriers, followed by a sub-query to
compare any general features they may have in common that indicate a possible adjacency.
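The two-part query on general clues might look roughly like the following, written here against an in-memory SQLite database through Python rather than MS Access. The table and column names (`document`, `doc_feature`, `doc_person`, and their fields) are hypothetical simplifications, and the three sample rows are invented purely so the queries return something.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE document (doc_id INTEGER PRIMARY KEY, land_office TEXT,
                       basin TEXT, county TEXT);
CREATE TABLE doc_feature (doc_id INTEGER, feature TEXT);
CREATE TABLE doc_person (doc_id INTEGER, person TEXT, relation TEXT);
-- Invented sample rows, not taken from the land grant records.
INSERT INTO document VALUES
  (1, 'Granville', 'Basin A', 'County A'),
  (2, 'Granville', 'Basin A', 'County A'),
  (3, 'Granville', 'Basin B', 'County A');
INSERT INTO doc_feature VALUES
  (1, 'Yadkin River'), (2, 'Yadkin River'), (3, 'Catawba River');
INSERT INTO doc_person VALUES
  (1, 'Person A', 'Neighbor-adjacent'), (2, 'Person B', 'Grantee');
""")

# Part one: same land office, basin, and county, and citing the same river.
PART_ONE = """
SELECT d.doc_id FROM document AS d
JOIN doc_feature AS f ON f.doc_id = d.doc_id
WHERE d.land_office = ? AND d.basin = ? AND d.county = ? AND f.feature = ?
"""

# Part two: keep only those records that also list a neighbor.
PART_TWO = f"""
SELECT DISTINCT p.doc_id FROM doc_person AS p
WHERE p.relation LIKE 'Neighbor%' AND p.doc_id IN ({PART_ONE})
ORDER BY p.doc_id
"""

params = ("Granville", "Basin A", "County A", "Yadkin River")
general = [r[0] for r in conn.execute(PART_ONE + " ORDER BY d.doc_id", params)]
narrowed = [r[0] for r in conn.execute(PART_TWO, params)]
print(general, narrowed)
```

Nesting part one inside part two keeps the general filter in one place, so the second-stage criteria can be swapped without rewriting it.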
 Parcel Shape Clues: Angles, Distances
The shape of a parcel offers a visual location clue. The angles and lengths provided
in each survey document are extracted from the database into an application tool that
generates GIS-compatible shapefiles [21] (see footnote 4), resulting in a ‘puzzle piece’ yet to be fitted into
the picture. Although a parcel is not as distinctive as a finely carved jigsaw piece, one can be certain
that a meandering river boundary does not adjoin a property whose border edge is linear.
Parcels with unusual angles or irregular boundaries provide similar incompatibility clues.
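To make the idea of reconstructing a shape from angles and lengths concrete, the sketch below walks a list of survey calls and returns the parcel's corner points. It is not the application tool referred to above. It assumes each call is recorded as a quadrant bearing (for example N 90 E) plus a distance in poles, and the square parcel used as input is invented.

```python
import math

POLE_FEET = 16.5             # one pole (rod, perch) is 16.5 feet
SQ_FEET_PER_ACRE = 43560.0


def bearing_to_azimuth(ns, angle, ew):
    """Convert a quadrant bearing to degrees clockwise from north."""
    if ns == "N":
        return angle if ew == "E" else 360 - angle
    return 180 - angle if ew == "E" else 180 + angle


def trace_parcel(calls, start=(0.0, 0.0)):
    """Walk (ns, angle, ew, poles) calls; return the vertices in feet."""
    x, y = start
    vertices = [(x, y)]
    for ns, angle, ew, poles in calls:
        azimuth = math.radians(bearing_to_azimuth(ns, angle, ew))
        distance = poles * POLE_FEET
        x += distance * math.sin(azimuth)   # easting
        y += distance * math.cos(azimuth)   # northing
        vertices.append((x, y))
    return vertices


def area_acres(vertices):
    """Polygon area by the shoelace formula, converted to acres."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(vertices, vertices[1:] + vertices[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2 / SQ_FEET_PER_ACRE


# Invented example: a square parcel, 40 poles on each side.
square = [("N", 0, "E", 40), ("N", 90, "E", 40),
          ("S", 0, "E", 40), ("N", 90, "W", 40)]
corners = trace_parcel(square)
closure_gap = math.dist(corners[0], corners[-1])   # feet between start and end
print(round(area_acres(corners), 3), round(closure_gap, 6))
```

A closure gap well above zero would suggest a transcription error or a missing call, which is itself a useful check before a shape is exported.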
Overlooked Clues: Structure related, bibliographic, annotations
In discussions surrounding the original database’s recorded evidence, Dobbs indicated
a few items that had been overlooked or were causing problems. Some of these issues related
to the structural inconsistencies discussed earlier; others were additional clues identified for
incorporation into the new database.
 Structural: Multiple Variables, Commingled Data
Multiple variables require special handling in the design of a database. In this
project, evidence such as the county a parcel resided in or the date of a survey at times falls
into more than one category. For example, some parcels were associated with multiple
counties, either because of boundary uncertainty or an official shift in administrative borders
during the land grant process. In other cases a document may have conflicting dates, where
the front of a record indicates one month, day and year, and the back another. Structural
anomalies such as these complicate access to the information recorded.
Footnote 4: A shapefile stores nontopological geometry and attribute information for the spatial features in a data set. The
geometry for a feature is stored as a shape comprising a set of vector coordinates.
Commingled data within one field essentially nullified the value of the primary
evidence recorded. As discussed in the earlier structure evaluation, several forms of this
problem existed in Dobbs’s original database. The primary names of features, such as
waterways or transportation routes, were typically intermingled with directional
information. Other fields included primary names intermixed with various types of
annotations, including personal reminders, citations to additional information, or notations
concerning transcription uncertainty such as spelling or legibility. It was clear these pieces
of evidence needed to be separated into new fields to improve their benefit to the goal of
building a GIS map.
 Additional: bibliographic reference
As the author, I wanted to expand upon the one field Dobbs was using to capture
bibliographic information to ensure each source and its location were properly identified.
The existing institution field needed to be capable of storing multiple organization names
and possibly instances where the same document might be owned by more than one institution, even
though the State Archives in Raleigh would likely provide most of the material. The
format of the material also needed to be documented, such as whether it was a microfilm
record, book, manuscript, or map. Equally vital was a field for
storing call numbers and related descriptions to validate the source and allow Dobbs to
return to the evidence at a later date.
New Clues: Entries, Warrants, Deeds and More
Our discussions included many conversations on what new information the entries,
warrants, deeds and potentially other documents might provide. Their content
seemed to overlap with most of the ‘entity types’ already being recorded in Dobbs’s original
database, with the exception of the parcel measurements. For instance, she wanted to record
the type of document, the associated date(s), an extract of selected content, the features noted
and people mentioned. This request could easily be handled by adding a ‘type’ field (e.g.,
document type, date type, person type, etc.) to differentiate the evidence pulled from the
anticipated new mix of documents.
Step 3: Changes Implemented - The New Database Model
The third step was to remodel Dobbs’s database to take advantage of the relational
database application, better structure the field content, and address her expanded information
needs. Based on our discussions, I initially created an entity relationship diagram to illustrate
the conceptual changes I proposed (see: Appendix B). After some back-and-forth discussions,
which prompted several modifications to the diagram, the final rendition was used as a blueprint
for designing the new database model described below.
The Relational Model
A relational database application, such as MS Access, typically employs multiple
related tables. This approach cuts down on data redundancy, is well suited for handling
multiple values that cause anomalies, and provides a method to maintain relational integrity.
To remove as much redundancy as possible and accommodate the variety of multiple values,
five primary tables (document, parcel, people, features, survey) were created with links to
13 related subcategory tables (see: figure 9). This clearly differs from Dobbs’s original
one-table model illustrated previously (see: figure 5). Although some tables purposely still contain
redundancy, to simplify the query building process from a geographer’s standpoint, the fields
involved are coupled with underlying integrity rules to ensure reliability.
Figure 9. New database’s multiple table model
 Limit redundancy
Multiple values were causing the majority of redundancy problems in Dobbs’s
original single-table model. In a flat file these values can only be handled in one of three
ways: by adding columns (as in figure 10a), by commingling all values in one cell (as in
figure 10b), or by using multiple rows (which Dobbs had not done). Each approach causes
problems or creates anomalies when updating, adding, or deleting subsequent data [22].
Figure 10. Multiple value redundancy inherent in Dobbs’s flat-file model
(a) Multiple columns in single row.
(b) All names in one column, in single row.
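A small Python illustration of why the flat-file layouts cause search and update anomalies follows. The three rows are invented in the style of the original table (the names are borrowed from the quotations earlier in the text), and the column names `adjacent_1` and `adjacent_2` are hypothetical.

```python
# Invented rows: one neighbor appears in different columns and in a
# commingled cell, as in the layouts of figure 10a and 10b.
flat_rows = [
    {"id": 1, "adjacent_1": "John McConil", "adjacent_2": "John Beavard"},
    {"id": 2, "adjacent_1": "John Beavard", "adjacent_2": ""},
    {"id": 3, "adjacent_1": "Alexander Osborn [and] John Beavard", "adjacent_2": ""},
]


def cells_mentioning(rows, name):
    """Every (row id, column) that would need editing to correct one name."""
    return [(row["id"], col) for row in rows for col, value in row.items()
            if col != "id" and name in value]


# An exact-match search silently misses the commingled cell in row 3.
exact = [row["id"] for row in flat_rows
         if "John Beavard" in (row["adjacent_1"], row["adjacent_2"])]

# Normalized alternative: each name is stored once and linked by ID.
people = {1: "John Beavard", 2: "John McConil", 3: "Alexander Osborn"}
adjacent = [(1, 2), (1, 1), (2, 1), (3, 3), (3, 1)]   # (row id, person id)
linked = sorted({row_id for row_id, person_id in adjacent if person_id == 1})

print(cells_mentioning(flat_rows, "John Beavard"))
print(exact, linked)
```

In the flat layout a spelling correction touches three cells in two different columns, while in the linked layout it is a single change to `people[1]`.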
Limiting these types of redundancy in the new database was a priority, especially
for fields containing critical clues such as people and feature names. In the geographer’s
original database, duplicated data was unlinked and appeared across and down columns, as
illustrated in the fields for people names and feature names (see: figures 7, 8, and 11a). In
the new database, several linked tables were employed for connecting people and feature
elements. For example, seven new tables were created for organizing information related to
people: people, people associated with, prefix, first, middle, last and suffix (see: figure 9).
For features, six new tables were created: feature, feature locators, locator terms, primary
name, suffix, and feature type (see: figure 9).
The new primary entity table for people, PEOPLE, stores one unique name per row
and separates the compound name elements into their smallest units (see: figure 11b). The
PEO_ASSOC_WITH table links the unique person (PerID) from the PEOPLE table with all
documents containing the same name, and incorporates new fields for relation type and
related comments. The relation type identifies what role a person played in the processing
of a particular parcel, and the related comments field eliminates the commingling of
annotation data (see: figure 11c). The five remaining tables associated with people names
(prefix, first, middle, last, suffix) store individual name elements and are called upon
to enforce integrity each time a new unique full name is required in the PEOPLE table.
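The relationship between the two people tables can be sketched with SQLite as a stand-in for MS Access. Only the table names and the PerID key come from the text. The remaining column names (`docID`, `RelType`, `Comment`) and the uniqueness rule are assumptions, and the example treats the original row IDs 446 and 447 quoted in the figure 11 caption as document IDs for William Morrison (PerID 189).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE PEOPLE (
    PerID  INTEGER PRIMARY KEY,
    Prefix TEXT NOT NULL DEFAULT '',
    First  TEXT NOT NULL DEFAULT '',
    Middle TEXT NOT NULL DEFAULT '',
    Last   TEXT NOT NULL DEFAULT '',
    Suffix TEXT NOT NULL DEFAULT '',
    UNIQUE (Prefix, First, Middle, Last, Suffix)   -- one row per unique name
);
CREATE TABLE PEO_ASSOC_WITH (
    PerID   INTEGER NOT NULL REFERENCES PEOPLE (PerID),
    docID   INTEGER NOT NULL,
    RelType TEXT NOT NULL,               -- e.g. Grantee, Chain Carrier
    Comment TEXT NOT NULL DEFAULT ''     -- annotations kept out of the name
);
""")

conn.execute(
    "INSERT INTO PEOPLE (PerID, First, Last) VALUES (189, 'William', 'Morrison')"
)
conn.executemany(
    "INSERT INTO PEO_ASSOC_WITH (PerID, docID, RelType) VALUES (?, ?, ?)",
    [(189, 446, "Grantee"), (189, 447, "Grantee")],
)

# One person row, many linked documents.
docs = [row[0] for row in conn.execute(
    "SELECT docID FROM PEO_ASSOC_WITH WHERE PerID = ? ORDER BY docID", (189,)
)]
print(docs)
```

With this layout, a second attempt to insert the same full name into PEOPLE is rejected by the uniqueness rule, which is one way to approximate the integrity checks described above.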
Figure 11. Original single table vs. New People & People Associated with Tables
a. Portion of geographer’s original database
b. New database: PEOPLE table
c. New database: PEO_ASSOC_WITH table
(a) Redundancies in “surveyed for”: Mordecai Mendenhall (IDs 436, 435), William Morrison (IDs 446, 447); and
between columns Alexander McCulloch is ID 619 in “surveyed for” then ID 426 in “adjacent to”.
(b) The name William Morrison is now associated with the unique ID 189.
(c) Note William Morrison’s relationship with multiple properties – these are but a few.
The new primary entity table for features, FEATURE, stores one unique feature per
row, separates compound feature elements into their smallest units, and uses a category
field to identify the type of feature (e.g., water, transportation, cultural, place name, etc.)
(see: figure 12b). The FEA_LOCATORS table links the unique feature (FeaID) from the
FEATURE table with all parcels containing the same feature and incorporates new fields
for description [location] terms and related comments. The description terms identify a
parcel’s position in relation to the feature identified and the comment field eliminates the
need to commingle annotation data (see: figure 12c). The four remaining tables associated
with feature names (locator terms, primary name, suffix, and feature type) store
individual name elements and are called upon to enforce integrity each time a new unique
term is required in the FEATURE or FEA_LOCATORS tables.
To illustrate the benefit of the new database's multiple-table model, consider
figure 12. Start with ID159 in figure 12a, Dobbs’s original single-table model, and
compare it to the new multi-table output shown in figure 12d. In the new database, ID159
has taken on a new role as the document ID (.docID) and in the query results displays on
three separate lines, which serves to associate a document with each unique feature it
contains, commonly referred to as a one-to-many relationship. This type of output
would have been impossible to produce using the original data structure.
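A join of this kind can be sketched as follows, again using SQLite in place of MS Access. The table names and FeaID come from the text, while the other column names are assumed. The three features are taken from the survey quotation cited earlier [18], but the docID of 1 and the category assigned to the Granville Line are placeholders, not values from the actual database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE FEATURE (FeaID INTEGER PRIMARY KEY, PrimaryName TEXT,
                      Suffix TEXT, FeaType TEXT);
CREATE TABLE FEA_LOCATORS (docID INTEGER,
                           FeaID INTEGER REFERENCES FEATURE (FeaID),
                           LocatorTerm TEXT, Comment TEXT);
INSERT INTO FEATURE VALUES
  (1, 'Catawba', 'River', 'water'),
  (2, 'Buffalo', 'Creek', 'water'),
  (3, 'Granville', 'Line', 'boundary');   -- category guessed for illustration
INSERT INTO FEA_LOCATORS VALUES
  (1, 1, 'N side of', ''),
  (1, 2, 'straddling', ''),
  (1, 3, 'bordering', '');
""")

# One document, one output line per feature it mentions (one-to-many).
rows = conn.execute("""
SELECT l.docID, l.LocatorTerm,
       f.PrimaryName || ' ' || f.Suffix AS FullName, f.FeaType
FROM FEA_LOCATORS AS l
JOIN FEATURE AS f ON f.FeaID = l.FeaID
WHERE l.docID = ?
ORDER BY f.FeaID
""", (1,)).fetchall()

for row in rows:
    print(row)
```

Because each feature is stored once in FEATURE, the same join run without the `WHERE` clause would also list every document that shares a given feature.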
Figure 12. Original single table vs. New Features & Fea_Locator Tables
a. Dobbs’s original feature fields; 2 separate columns with commingled data
b. New primary feature table
c. New linking feature table
d. Query joining several tables to display related feature characteristics
The other three new primary entity tables (DOCUMENT, PARCEL, and
SURVEY), designed for storing categories of related content, were handled in a similar fashion
(see: figure 9). Each is connected to tables that provide integrity checks specific to it and is
linked back to the other primary tables based on the relationships formed between
matching unique IDs.
The Data Entry Form
A new form was designed to facilitate Dobbs’s data entry process (see: figure 13). It
is similar to the one employed in her initial database (Appendix A), although the new form
addresses the variety of multiple values and is designed to handle additional source material.
Currently, evidence extracted from each land grant document is being entered using this form,
which automatically populates the underlying new tables.
Figure 13. New data entry form
Parse and Import Geographer’s Data
Finally, once I had created the form and Dobbs had tested it, the last and most difficult
step was to parse and import the original survey data she had entered into her old database.
There were over 300 records and 44 fields containing commingled data stored across and
down multiple columns. This was no easy task: while some of the process could be automated,
the majority required manual intervention. But with patience and perseverance the process
was slowly completed. Each field holding compound attributes was parsed into its simple
elements, commingled data was separated, and data spread across multiple columns were
joined into one field. From there the data could be imported into the new tables and fields
created just for them.
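The automated part of such parsing might resemble the sketch below, which breaks a full name into the five elements stored in the new people tables. The function `parse_name` and the short prefix and suffix lists are hypothetical. Names that do not fit the pattern would fall to the manual intervention described above.

```python
NAME_PARTS = ("prefix", "first", "middle", "last", "suffix")
PREFIXES = {"Mr.", "Capt.", "Col.", "Rev."}   # assumed, not from the source
SUFFIXES = {"Jr.", "Sr.", "Esq."}             # assumed, not from the source


def parse_name(full):
    """Split one full name into prefix, first, middle, last, and suffix."""
    tokens = full.split()
    parts = dict.fromkeys(NAME_PARTS, "")
    if tokens and tokens[0] in PREFIXES:
        parts["prefix"] = tokens.pop(0)
    if tokens and tokens[-1] in SUFFIXES:
        parts["suffix"] = tokens.pop()
    if len(tokens) == 1:
        # A lone token, as in "Dixon's corner", is treated as a surname.
        parts["last"] = tokens[0]
    elif tokens:
        parts["first"] = tokens[0]
        parts["last"] = tokens[-1]
        parts["middle"] = " ".join(tokens[1:-1])
    return parts


print(parse_name("Alexander McCulloch"))
print(parse_name("Dixon"))
```

Keeping the elements separate is what later allows searches on last name alone or on any combination of name parts.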
REMAINING OBJECTIVES
There are a few data entry modifications Dobbs would like to see implemented,
mostly features identified as “nice to have.” Where feasible these will be incorporated in the
near future. Additionally, due to the author’s time constraints, the new form temporarily
requires extra steps to complete certain entries, such as looking up previously entered unique
IDs for people and features. Future adjustments are in the works to address this inefficiency, as
well as other tasks that could be automated with the incorporation of additional programming
code.
At this time, queries for identifying parcels with shared characteristics have not been
automated for easy use by Dobbs. This is partly because the data entry process is still
proceeding and, as she becomes more familiar with the evidence, her requirements continue to
change. Additionally, several fields were added as placeholders in the database for linkage to
related fields once the majority of data has been entered, such as a parcel identifier to bind
multiple documents (e.g., entry, warrant, survey, deed) to an individual parcel and alias fields
for connecting the variant spellings of people and feature names.
A few other goals, not part of the original collaboration objective, include: determining
a method for automatically exporting parcel measurements from the database into the
application tool that generates the GIS shapefiles, linking images of each parcel’s shape to the
database, and adding fields to hold derived information based on content stored in
the database (e.g., Julian date conversions, measurement conversions, and fields for holding
concatenated data).
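Two of the derived fields mentioned could be computed along the following lines. The unit factors are standard (a pole or rod is 16.5 feet and a surveyor's chain is 66 feet). The date function converts a Julian calendar date to its Gregorian equivalent through the Julian Day Number. It deliberately ignores the Old Style convention of starting the year on March 25, which would need separate handling, and both function names are hypothetical.

```python
from datetime import date

FEET_PER_UNIT = {"pole": 16.5, "rod": 16.5, "chain": 66.0}


def to_feet(value, unit):
    """Convert a survey distance in poles, rods, or chains to feet."""
    return value * FEET_PER_UNIT[unit]


def julian_to_gregorian(year, month, day):
    """Convert a Julian calendar date to the equivalent Gregorian date."""
    a = (14 - month) // 12
    y = year + 4800 - a
    m = month + 12 * a - 3
    # Julian Day Number of the Julian calendar date.
    jdn = day + (153 * m + 2) // 5 + 365 * y + y // 4 - 32083
    # Python's proleptic Gregorian ordinal 1 corresponds to JDN 1721426.
    return date.fromordinal(jdn - 1721425)


print(to_feet(48, "pole"))
print(julian_to_gregorian(1752, 9, 2))
```

The second call reflects the eleven-day difference in force when Britain and its colonies changed calendars in September 1752.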
COLLABORATION SUMMARY
Since the new multi-table model was implemented, Dobbs has successfully entered
over 1500 new records. A key component of the successful implementation of this new
database model was the open and frequent communication between the author and Dobbs.
Even though at times our field-specific jargon, related to geography or information science,
created barriers to understanding each other’s processes and objectives, we both agree that the
results met the purpose of our collaboration as outlined at the outset. In fact, we’re looking
forward to continuing our partnership to tackle the remaining objectives.
CITED REFERENCES
1. Collet, C.J., A Compleat Map of North Carolina From an Actual Survey. 1770, S. Hooper: London, England.
2. Markham, A.B., Land grants to early settlers in old Orange County, North Carolina: parts of present Orange, Chatham, and Durham counties, period 1743-1810. 1973, A.B. Markham: Durham, NC.
3. Ramsey, R.W., Carolina Cradle: Settlement of the Northwest Carolina Frontier, 1747-1762. 1964, Chapel Hill, NC: University of North Carolina Press.
4. Humber, J.L., Transportation & Settlement [map], 1660-1775, in Atlas of North Carolina. 1967, University of North Carolina Press: Chapel Hill. p. 39-40.
5. Mitchell, T.W., The Granville District and Its Land Records. North Carolina Historical Review, 1993. 70: p. 103-129.
6. Granville Proprietary Land Office: Land Entries, Warrants, and Plats of Survey, in State Agency Records. Secretary of State Record Group. NC State Archives. 1748-1763: Raleigh, NC.
7. Land Office: Land Warrants, Plats of Survey, and Related Records, in State Agency Records. Secretary of State Record Group. NC State Archives. 1679-1959: Raleigh, NC.
8. Granville Proprietary Land Office: Granville Grants of Deed, in State Agency Records. Secretary of State Record Group. NC State Archives. 1748-1763: Raleigh, NC.
9. Land Office: Patent Books (Land Grant Record Books), in State Agency Records. Secretary of State Record Group. NC State Archives. 1693-1959: Raleigh, NC.
10. How to Read 18th Century British-American Writing. 2004, Film Study Center at Harvard University (developer) and the Center for History and New Media at George Mason University (host & maintainer).
11. [TGN] Getty Thesaurus of Geographic Names On Line. 2003, The J. Paul Getty Trust.
12. unknown, Changing Chains, The Virtual Museum of Surveying.
13. Broyles, S., Metes and Bounds Surveys, Direct Line Software.
14. Anson County. Feagly, Peter: Parcel survey. Granville Proprietary Land Office: Land Entries, Warrants, and Plats of Survey, in State Agency Records. Secretary of State Record Group. NC State Archives. 1748-1763: Raleigh, NC.
15. Long, J.H., Atlas of Historical County Boundaries: North Carolina, ed. J.H. Long and G.C. DenBoer. 1998, New York: Charles Scribner's Sons.
16. Powell, W.S., The North Carolina Gazetteer: A Dictionary of Tar Heel Places. 1968, Chapel Hill: University of North Carolina Press.
17. Anson County. Berrey, Thomas: Parcel survey. Granville Proprietary Land Office: Land Entries, Warrants, and Plats of Survey, in State Agency Records. Secretary of State Record Group. NC State Archives. 1748-1763: Raleigh, NC.
18. Anson County. Graham, Richard: Parcel survey. Granville Proprietary Land Office: Land Entries, Warrants, and Plats of Survey, in State Agency Records. Secretary of State Record Group. NC State Archives. 1748-1763: Raleigh, NC.
19. Yadkin-Pee Dee River Basin, Office of Environmental Education, Department of Environment and Natural Resources.
20. Anson County. Blain, George: Parcel survey. Granville Proprietary Land Office: Land Entries, Warrants, and Plats of Survey, in State Agency Records. Secretary of State Record Group. NC State Archives. 1748-1763: Raleigh, NC.
21. ESRI Shapefile Technical Description: An ESRI White Paper. 1998, Environmental Systems Research Institute, Inc.
22. Roman, S., Access Database Design & Programming. 2nd ed. 1999, Sebastopol, CA: O'Reilly. xx, 409.
APPENDIX A: Data Entry Form & Original Database Fields
Figure 14. Geographer’s Original Data Entry Form
Note: Several fields added by researcher at a later date (e.g., prior occupant, etc.)
Figure 15. Geographer’s Original Database Fields
APPENDIX B: Entity Relationship Diagram
[Figure not reproduced: entity relationship diagram linking the DOCUMENT/RECORD, LAND PARCEL, SURVEY, FEATURES, and PEOPLE entities. Relationships labeled in the diagram: parcel(s) MAY have 0:M feature(s); feature(s) MUST have 1:X parcel(s); survey MUST have 1 parcel; parcel MAY have 1:M surveys. Document types listed: Entry, Warrant, Shuck, Survey, Grant/deed. People relationship types listed: Assignee, Attestor, Grantee, Chain Carrier, Neighbor-adjacent, Neighbor-near.]
Appendix B: Entity relationship diagram