Automated Archeological Survey of Ancient Irrigation Canals 2009-11

Department of Computer Science & Engineering
2009-11
Automated Archeological Survey of Ancient Irrigation Canals
Authors: Joseph Izraelevitz
Abstract: In Ancient Southern Mesopotamia (now Iraq), rainfall was never high enough to sustain
agriculture, so for thousands of years people used a system of canals to irrigate their fields.
These canals, built up over generations, have long formed the basis of settlement patterns
and state relations in the region, making their location and mapping of key interest to
archaeologists. However, since 1991, western archaeologists have not been allowed into
Iraq, and consequently have been unable to conduct ground surveys.
This work focuses on automating the detection of these critical irrigation canals. Owing to a
process of silt deposition, the canals gradually raise their banks above the surrounding plain.
Using free elevation data collected by NASA in 2000, we use a combination of image filters
and tracing algorithms to detect and highlight the ancient canals, quickly scanning an area for
archaeologically relevant details. Our method was generally successful in detecting very large
and old canals, but data noise and resolution prevented the detection of smaller features.
Type of Report: MS Thesis
Department of Computer Science & Engineering - Washington University in St. Louis
Campus Box 1045 - St. Louis, MO - 63130 - ph: (314) 935-6160
WASHINGTON UNIVERSITY IN ST. LOUIS
School of Engineering and Applied Science
Department of Computer Science and Engineering
Thesis Examination Committee:
Robert Pless, Chair
Ron Cytron
Tao Ju
Automated Archeological Survey of Ancient Irrigation Canals
by
Joseph Herman Izraelevitz
A thesis presented to the School of Engineering
of Washington University in partial fulfillment of the
requirements for the degree of
MASTER OF SCIENCE
May 2009
Saint Louis, Missouri
ABSTRACT OF THE THESIS
Automated Archaeological Survey of Ancient Irrigation Canals
by
Joseph Herman Izraelevitz
Master of Science in Computer Science
Washington University in St. Louis, 2009
Research Advisor: Professor Robert Pless
Acknowledgments
I would like to thank the many people who helped not only in the writing of this thesis but
my entire college journey. Consequently, I would like to thank the people who made it
possible for me to come to Washington University in St. Louis, especially Andrea Heugatter
and Dean J. Chris Kroeger.
Another thanks goes to the cadre and staff at Gateway
Battalion, in particular MAJ Dave Owens and LTC Warren Griggs, who taught me all about
leadership and confidence. For their help on this project, I would like to thank Professor
Carrie Hritz, who inspired this project, Troy Ruths, who helped in the early stages, and my
mentor, Professor Robert Pless.
I would also like to thank my good friends who kept me happy through the rigors of school,
my rabbi, Tzvi Schwartz, and my grandparents, Shep and Faith Ellis. Finally, I would like to
thank my two brothers, Jacob and Adam, and my parents, for their love, care, and concern.
Joseph Herman Izraelevitz
Washington University in St. Louis
May 2009
Dedicated to my parents.
Contents
Abstract
Acknowledgments
List of Figures
1 Introduction
1.1 Problem
1.2 Approach
1.3 Results
2 Archaeological Background
2.1 Mesopotamia
2.2 Geography
2.3 Canals
3 Computerized Survey
3.1 Advantages of Computerized Survey
3.2 Disadvantages of Computerized Survey
3.3 Implementation
4 Data
4.1 SRTM Data
4.2 Other Sources of Data
4.3 Ground Truth
4.3.1 Ground Surveys
4.3.2 Aerial Surveys
4.4 Test Sets
5 Failed Approaches
5.1 Threshold
5.2 Threshold with Fitted Plane
5.3 Hough Transform
5.4 Watershed
6 Final Algorithm
6.1 Preprocessing
6.1.1 Failed Approaches
6.1.2 Fourier Transform
6.1.3 Tiling
6.2 Normalized Cross Correlation
6.3 Tracing Walk
6.3.1 Tracing Step
6.3.2 Wrapping the Step
7 Results and Analysis
7.1 Results
7.2 Comparison to Ground Truth
7.3 Analysis
8 Further Application
8.1 Data
8.2 Results and Analysis
9 Conclusion
References
Vita
List of Figures
Figure 2.1 Topographic Map of Iraq
Figure 4.1 Ground and Satellite Surveys in Area around Nippur
Figure 4.2 Shatt al-Gharraf Dataset
Figure 4.3 Nippur Dataset
Figure 5.1 Thresholded values on Nippur dataset
Figure 5.2 Blurred and thresholded values on Nippur dataset
Figure 5.3 Thresholded values above a linearly fitted plane in area around Nippur
Figure 5.4 Nippur dataset. Data was blurred, fitted to plane, then thresholded.
Figure 5.5 Hough transform on blurred, plane fitted, and thresholded Nippur dataset.
Figure 5.6 Watershed algorithm on Nippur dataset at blur filters of size 20 and 100 pixels.
Figure 6.1 Fourier transform of Shatt al-Gharraf dataset.
Figure 6.2 Comparison of original and Fourier adjusted data.
Figure 6.3 Normalized cross correlation filter.
Figure 6.4 Filter response at zero and ninety degrees on Nippur dataset.
Figure 6.5 Aggregate filter response on Nippur dataset.
Figure 6.6 Oriented filter response.
Figure 6.7 Scaled filter response on small section of the Nippur data set
Figure 6.8 Normalized cross correlation filter response at intersection.
Figure 6.9 Traced Canals of Nippur Dataset.
Figure 7.1 Traced Canals with Filter Response on Shatt al-Gharraf Dataset
Figure 7.2 Comparison of Final Result to Adams Ground Survey
Figure 7.3 Comparison of Final Result to Adams Satellite Survey
Figure 7.4 Comparison of Final Result to SRTM Data
Figure 7.5 Comparison of Final Result subsection to SRTM Data
Figure 8.1 Comparison of Visual Spectrum and Elevation Data near Zaozhuang
Figure 8.2 Canal Grid In Visual Spectrum
Figure 8.3 Comparison of Visual Spectrum and Filter Response near Zaozhuang
Figure 8.4 Final Result near Zaozhuang
Figure 8.5 Comparison of Visual Spectrum and Traced Canals near Zaozhuang
Chapter 1
Introduction
1.1 Problem
Mesopotamia, a region that now stretches across Iraq, Syria, and parts of Jordan, is the
home of the earliest civilizations. The region is bordered by two rivers, the Euphrates
to the south and the Tigris to the north, and forms a crescent from the Persian Gulf to
the Mediterranean. In ancient times Mesopotamia was blessed with extremely good soil
and the ancestor grains of both wheat and barley. However, only in the northern
sections of Mesopotamia can agriculture be sustained with rain. In the south, where the
most fertile soil is, there is not enough rain to grow crops.
The Euphrates is a slow moving, snow-fed, muddy river. The silt it carries is a major
reason for the good soil on its banks. The silt also has another effect. As the river
flows, it deposits silt along its bed, gradually raising itself. Periodically, when the river
floods, it also deposits silt along its sides, or levees. The combination of this silt
deposition means that the Euphrates, over time, has raised itself above the surrounding
plain. The Tigris, being a much faster river, experiences an opposite effect, and has cut
a bed for itself below the plain.
The silt deposition phenomenon of the Euphrates made it ideal for early agriculturalists.
By cutting small channels in the levees, they could bring water into their fields, without
needing any sort of pump. Indeed, until relatively modern times the Tigris was not used
for agriculture.
Consequently, Southern Mesopotamia, now the southeastern corner of Iraq, contained
some of the most elaborate irrigation works of the ancient world. Canals, dikes, and
ditches crisscrossed the open plain, providing water for crops and draining the flat
fields.
The social importance of these waterworks went beyond simple irrigation. The canals
influenced settlement patterns, state relations, and governmental control. The earliest
recorded war in human history was over water rights in Southern Mesopotamia.
Since the location of these canals is important to understanding the history
of this region, several archaeological ground surveys have been conducted, identifying sites and
waterways. Ground survey is especially useful in the area just south of Baghdad, which is
desert and easily traversable in vehicles. Unfortunately, since 1991, Iraq has been closed
to western archaeologists. Furthermore, much of the region of historical interest is now
covered by agricultural fields, which cannot be easily surveyed on the ground.
1.2 Approach
The idea behind this research was to automate the detection of Mesopotamian canals,
not only to speed the detection but also to apply the algorithm to other regions. If
successful, such an algorithm would be able to map out these critical archaeological
features and provide useful information to historians and archaeologists.
This goal is achievable because the same silt deposition that occurs along the Euphrates
also happens to the canals that feed off of it. Consequently, the canals also raise
themselves up above the fields. This phenomenon allows the canals to be seen in
terrain elevation data collected by satellite (Hritz and Wilkinson). Although its resolution is
lower than that of ordinary imagery, elevation data has the advantage of eliminating much of the
noise that would otherwise obscure the canals.
To extract the canals from the elevation data, we tried a number of image processing
techniques. In the end, a combination of techniques was required to eliminate noise,
filter the canals, and then identify the features.
1.3 Results
The final algorithm had some success in identifying canals in Mesopotamia. In
particular, it was able to find the large, wide, and old canals fairly easily. These were the
canals most easily visible in the elevation data. They also tended to be those canals that
had large numbers of archaeological sites next to them, implying they were occupied for
long periods and supplied a large number of fields. Smaller canals were lost in the noise
of the data or too small to be easily identified at the resolution of the data.
Chapter 2
Archaeological Background
2.1 Mesopotamia
Mesopotamia is the oldest civilized region in the world. The first domesticated grains,
first cities, and first waterworks are all located in this region. It is referred to both as the
“Fertile Crescent” and “the cradle of civilization.”
Since 6000 BC, the region has been inhabited by complex civilizations. Consequently,
there are layers and layers of archaeological material built up in the alluvial plain.
Various societies have come and gone, each leaving behind a rich archaeological deposit.
Evidence of static human settlement begins around 6000 BC during the Ubaid period
(Adams, 54). From this period until the end of the Uruk period in 3000 BC,
Mesopotamia experienced a gradual transformation from pastoral and subsistence
techniques to widespread agriculture. Around 3000 BC the Early Dynastic period
began, marking the beginning of true city-states. During this period Mesopotamia
experienced intense urbanization, the proliferation of writing, and the rise of kings.
The waterworks developed in the late Uruk period became increasingly important for
producing the food supply; the earliest recorded war was fought over control of a canal.
The earliest Mesopotamian state was formed when Sargon of Akkad took control
of most of southern Mesopotamia around 2300 BC, creating the Akkadian Empire and
achieving the first centralized control over the waterworks. After the fall of Akkad,
political centralization persisted under the powerful kings of Ur until 2004 BC (Van De
Mieroop, 59).
After the fall of Ur and throughout the early second millennium BC, competing city dynasties vied
for the Mesopotamian plain. Isin and Larsa, then Babylon, controlled swaths of the
region. The territorial states were suddenly ended by Hittite raids from Anatolia, leaving
behind a power vacuum and a dark age that lasted until 1500 BC (Van De Mieroop, 80-81, 112-117).
Around 1500 BC Mesopotamia was integrated into a much larger political system
incorporating much of the Middle East. The southern Kassite state and northern
Assyrian state were much larger than their predecessors, and interacted with polities in
the Mediterranean, Egypt, Anatolia, and Persia (Adams, 172-174). These were large
nations, yet still regional powers. They would fall apart, along with their contemporaries
in Egypt, Anatolia, and Greece, during the sweeping raids of the Sea Peoples around
1100 BC and the instability that followed (Van De Mieroop, 180-182).
Following the collapse of the regional powers, Mesopotamia was progressively acquired
by various world powers that used its now extensive irrigation system and good land as
a bread basket to feed their empires. The Assyrian, Neo-Babylonian, and Persian
Empires would conquer the area and incorporate it into their states. These empires
would greatly improve the canal system, creating a grid of small canals that connected
the old major waterworks and brought much more land under cultivation (Adams, 188).
This period of Mesopotamian-based empires lasted from Assyria’s rise in the 900s BC
until Alexander the Great’s defeat of Persia in 331 BC (Van De Mieroop, 279).
The imperial pattern continued as the region was controlled by the Greek Seleucids,
Persian Sassanians, and Islamic Umayyads and Abbasids. Under the Sassanians in
particular the irrigation works were expanded, creating a dense population that was very
susceptible to plague (Adams, 214). However, during Abbasid rule, the canal system
went into decline as its maintenance was disregarded, resulting in a gradual but steady
decline in arable land and population (Adams, 215).
2.2 Geography
Mesopotamia, meaning “the land between two rivers,” stretches across much of the
Middle East. From north to south it ranges from the base of the Taurus Mountains to
the Persian Gulf, and from west to east from the Arabian Desert to the Zagros
Mountains. The land slopes gently downward from the northwest, between the high
ground of the mountains and desert. The two main rivers, the Tigris and Euphrates,
flow along the slope into the Persian Gulf. In ancient times, these rivers and their
floods provided rich soil and water for the world's earliest farmers.
In modern times, however, much of this region is manmade desert, the result of intense
agriculture and the resulting salinization of the soil. Some areas are still under
cultivation, especially the lower regions near the Shatt al-Arab.
Figure 2.1 Topographic Map of Iraq (Sadalmelik)
2.3 Canals
The canals of southern Iraq were necessary to sustain agriculture, since the region does
not get enough rainfall. Fortunately, due to silt deposition, the Euphrates runs above
the surrounding plain, making irrigation simple. Silt deposited during floods built up
the levees on either side of the river, while silt deposition on the river bottom raised the
water level. To build a canal, a small cut was made in the levee, allowing water to spill
out into the surrounding land. The same process of levee creation that occurred along
the Euphrates happened on the banks of these canals.
Canal levees rise, on average, three to five meters above the plain. They gradually
slope away from the watercourse, and provide rich alluvial soil for farming. This slope
extends for a minimum of half a kilometer, often more. They are the dominant
features in an otherwise flat plain (Adams, 10).
Chapter 3
Computerized Survey
This project was an attempt to automate the time-consuming survey of large areas of
land for archaeological features. The canals of Mesopotamia were chosen because they
can be identified in elevation data and are relatively important in the study of the region.
While aerial survey is an accepted archaeological tool, automating the search has not yet
been done.
3.1 Advantages of Computerized Survey
Automating the survey of land has several advantages. Most obviously, it enables an
appropriate algorithm to survey very large areas, without requiring a trained person.
While the algorithm may be rather slow, as the one presented in this paper is, it still
eliminates the tedious work. Additionally, once the algorithm is written, it can be
tweaked to find similar features in other areas, thus increasing the amount of land that a
survey program can be applied to.
Less obviously, automating the survey adjusts for certain biases present in a human
search. In particular, the range of data is often large, much larger than the available
colors on a monitor. Mapping this range onto a monitor can both blur important
boundaries and introduce artifacts through the use of artificially chosen colors.
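The display bias can be made concrete with a small sketch. It is written in Python rather than the MATLAB used for this project, and it assumes a linear mapping onto 256 gray levels; the function name `to_display_levels` and the elevation values are hypothetical and chosen only for illustration.

```python
def to_display_levels(elevations, levels=256):
    """Linearly map elevations onto integer display levels 0..levels-1."""
    lo, hi = min(elevations), max(elevations)
    span = hi - lo
    if span == 0:
        return [0 for _ in elevations]
    # Clamp so the maximum elevation lands on the top level, not one past it.
    return [min(levels - 1, int((e - lo) / span * levels)) for e in elevations]

# Hypothetical elevations in meters: a 3 m levee crossing a 20 m plain,
# plus one distant 1000 m peak that stretches the data range.
elevations = [20.0, 21.0, 23.0, 21.0, 20.0, 1000.0]
shades = to_display_levels(elevations)
# The first five samples (plain and levee) all receive the same gray level,
# so the levee is invisible on screen even though it is present in the data.
```

An automated search works on the elevation values themselves and so is not subject to this particular loss.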
3.2 Disadvantages of Computerized Survey
There are, however, some disadvantages to automated survey. Writing the algorithm is
difficult and may not yield good results. Even if it does, a computerized survey is not
authoritative and would have to be validated by a trained person. Finally, the
algorithm itself may yield artifacts.
In order to overcome these difficulties, the results of a computerized search for features
should be used in conjunction with results obtained through all other means, including
aerial surveillance and ground survey if possible.
3.3 Implementation
To program the automated survey, I used MathWorks' MATLAB. MATLAB was chosen
based on its ability to handle large amounts of data and its rich library of image
processing tools. Furthermore, its use of matrices as base data types allowed certain
operations to be understood more intuitively. While the algorithms described here
could have been written in a faster language, such as C++, they could not have been
prototyped as easily in such a language.
Chapter 4
Data
Since the canal silt deposition process raises the levees of the watercourses above the
surrounding plain, ancient canals can be seen in terrain elevation data. This data can
then be used in an automated survey to capture the features.
4.1 SRTM Data
A widely available and free elevation dataset was collected by the Shuttle Radar
Topography Mission (SRTM). This mission, undertaken by NASA in February of 2000,
used sophisticated radar equipment to measure elevation across a wide swath of the
earth. The collection method is roughly analogous to taking a depth measurement over
a large grid. The resolution of the grid varies over the surface of the Earth, with certain
areas of interest, such as the United States, being covered more completely than others
(Farr et al., 3).
The data for Mesopotamia is laid out in a grid, with measurements taken every ninety
meters. According to the official paper published with the data, the error in the
elevation measurements is at most six meters at 90% confidence
(Farr et al., 3).
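To put these figures in context, the short calculation below compares them with the levee dimensions cited in Section 2.3. It is a back-of-the-envelope sketch in Python, not part of the survey code. It assumes that the half-kilometer slope applies to each side of the watercourse, and the constant and function names are illustrative.

```python
GRID_SPACING_M = 90.0        # SRTM sample spacing for Mesopotamia (from the text)
NOISE_BOUND_M = 6.0          # elevation error bound at 90% confidence (from the text)
LEVEE_HEIGHT_M = (3.0, 5.0)  # average levee rise above the plain (Section 2.3)
LEVEE_SLOPE_M = 500.0        # minimum slope extent, assumed here to be per side

def samples_across(width_m, spacing_m=GRID_SPACING_M):
    """Number of whole grid cells that fit across a feature of the given width."""
    return int(width_m // spacing_m)

pixels_per_side = samples_across(LEVEE_SLOPE_M)        # 5 cells
pixels_full_width = samples_across(2 * LEVEE_SLOPE_M)  # 11 cells
height_to_noise = [h / NOISE_BOUND_M for h in LEVEE_HEIGHT_M]  # 0.5 and about 0.83
```

Under these assumptions, a large levee is lower than the single-sample error bound but spans roughly ten samples, which suggests why wide canals can be recovered by filters that pool many samples while narrow ones cannot.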
SRTM data is available from a number of sources. I used two locations. The Global
Land Cover Facility provided the data for Mesopotamia, while the CGIAR Consortium
for Spatial Information provided the data for China. The sites were chosen based on
accessibility and ease of use, not on any innate differences in the data.
4.2 Other Sources of Data
While the SRTM elevation data was the most useful for this project, other sources of
data were looked at for their possible utility in identifying ancient watercourses.
Visible Spectrum
The most obvious data source would be overhead visible
spectrum images. These are available in several free forms, both aerial and satellite, at a
range of resolutions. The resolution on these images is fairly high, and identifying long
features should not be difficult. However, deciding which features are canals rather than
roads or hedges would be nearly impossible. It is difficult even for a human to pick out a
canal from these images.
False Spectrum
Less obvious are false-color images that measure the non-visible
spectrum. These wavelengths generally correspond to properties of the soil and
vegetation. While in theory canals have a different soil than the surrounding plain, in
practice the canals are no more visible in these images than in the visible spectrum, and
the images tend to be just as noisy.
4.3 Ground Truth
The results obtained from the automated survey need to be validated in some way. In
other words, we need some sort of ground truth map of where some canals are in order
to test the algorithm. Fortunately, a series of archaeological surveys, both ground and
air, were done in the decades prior to the First Gulf War by Robert McCormick Adams,
a professor at the University of Chicago. While they only cover a limited area, his maps
provide a good way to test and validate the performance of the algorithms developed.
4.3.1 Ground Surveys
Adams did two major ground surveys of Southern Mesopotamia. His first, published
in 1965, was titled Land Behind Baghdad and concentrated on a major tributary of the
Tigris, the Diyala River, and its surrounding area. The second, called Heartland of
Cities, was published in 1981 and surveyed the area just south of Baghdad between the
Tigris and Euphrates Rivers. Both provide immensely detailed maps of the
archaeological sites and waterworks in the region covered. However, the maps
do not indicate how old, wide, or big a canal is, since all watercourses are
drawn as uniform lines. As such, it is difficult to tell which canals it would be possible
to find in the satellite data, and which are tiny ditches that the SRTM data would
miss.
4.3.2 Aerial Surveys
To guide his search on the ground, Adams did detailed aerial surveys of the areas before
arriving in Iraq using data obtained from United States military spy planes. He also did
a quick search using primitive visual spectrum satellite images. The surveys done using
the satellite images provide a rough sketch of canals in Iraq. In particular, they show
the very large and wide canals. Combining this map with the more detailed ground
surveys, we can get a sense of where the major canals are and which ones are less likely
to be found. Figure 4.1 shows the area around the ancient city of Nippur. The top
image shows the detailed ground survey, while the bottom image shows the far less
detailed satellite survey. As mentioned, those found in the satellite image are larger and
easier to detect.
Figure 4.1 Ground and Satellite Surveys in Area around Nippur (Adams 34, insert)
4.4 Test Sets
Memory and speed issues quickly became a problem when dealing with such a large data
set. Consequently, I built some smaller test sets that were used to develop and test
algorithm prototypes. These sets had the added advantage of being within the Adams
survey region, so results could be compared to ground truth. Since these sets are used
to display various algorithms and results, they are being introduced here.
Shatt al-Gharraf Dataset
The Shatt al-Gharraf dataset covers the entire Heartland
of Cities survey region and the Shatt al-Gharraf river. This large dataset was used to
test some faster algorithms. It ranges from the Tigris River in the north to the
Euphrates in the south, and contains several ancient cities. The ancient cities, which
show up as sharp hills or tells in the elevation data, are useful for georeferencing and
homographies. Figure 4.2 shows the dataset with the area covered by the Adams
ground survey outlined.
Figure 4.2 Shatt al-Gharraf Dataset
Nippur Dataset
The Nippur dataset is the northwest quarter of the Shatt al-Gharraf
dataset and covers a completely surveyed region. It has three dominant canals,
two of which intersect along the western edge. The ancient city of Nippur is the
southern boundary, while the Tigris bounds it on the north.
Figure 4.3 Nippur Dataset
Chapter 5
Failed Approaches
Before trying any costly and complicated algorithms, it was useful to try simple and
built-in algorithms that had a decent chance of success. While none of the algorithms
described here worked particularly well, they did illuminate some of the issues with the
data set. Furthermore, had they worked, they would have provided computationally
quick solutions to the problem.
5.1 Threshold
The most obvious way to find height-differentiated features is to apply a
simple threshold above the mean value of the image. This algorithm can be quickly
tested, but its results are rather discouraging.
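As a minimal illustration of the idea, the Python sketch below marks every cell that lies above the image mean. It is not the MATLAB code used for this project; the function name `threshold_above_mean`, the optional `offset` parameter, and the toy tile are assumptions made for the example.

```python
def threshold_above_mean(grid, offset=0.0):
    """Return a boolean mask of the cells that exceed the image mean plus an offset."""
    cells = [v for row in grid for v in row]
    mean = sum(cells) / len(cells)
    return [[v > mean + offset for v in row] for row in grid]

# Hypothetical 3x4 elevation tile in meters: a flat 20 m plain with a
# north-south ridge 4 m high in the third column.
tile = [
    [20.0, 20.0, 24.0, 20.0],
    [20.0, 20.0, 24.0, 20.0],
    [20.0, 20.0, 24.0, 20.0],
]
mask = threshold_above_mean(tile)  # only the ridge column is marked
```

On clean, level data such as this tile the threshold isolates the ridge. The real data is neither clean nor level, which is where the method breaks down.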
Figure 5.1 Thresholded values on Nippur dataset
There are two issues of note in this image. Firstly, the data is very noisy, with many
small sharp peaks. Also, the top left of the image is much higher than the bottom right,
due to the general slope of the plain. In order to identify the canals in this image, we
need to fix at least some of these problems. To reduce the noise, we can
blur the image before thresholding. The blur used in Figure 5.2 was a Gaussian blur about
twenty pixels wide, but other blurs gave similar results. The blur suppresses some of the
small peaks, but it does not fix the slope issue.
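As a minimal sketch of the thresholding described above (written in Python rather than the MATLAB used in this project; the function name and the toy grid are illustrative assumptions, not the thesis code), the following marks every cell that lies above the image mean:

```python
def threshold_above_mean(grid, offset=0.0):
    """Return a binary mask: 1 where a cell exceeds the grid mean plus offset."""
    cells = [v for row in grid for v in row]
    cutoff = sum(cells) / len(cells) + offset
    return [[1 if v > cutoff else 0 for v in row] for row in grid]

# Toy elevation grid: a plain sloping down to the right,
# with one raised "levee" column at x == 2.
elevation = [[10 - x + (3 if x == 2 else 0) for x in range(6)] for _ in range(4)]
mask = threshold_above_mean(elevation)
```

On this toy grid the entire high side of the plain passes the threshold, not just the raised column, which is the slope problem noted above.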
Figure 5.2 Blurred and thresholded values on Nippur dataset
5.2 Threshold with Fitted Plane
To try to fix the slope problem, we linearly fitted a plane to the data, and subtracted it
out of the image. We then used a threshold to identify canals.
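The plane subtraction can be sketched as follows. This is an illustrative Python version, not the code used for the thesis; the function name is hypothetical, and the fit relies on the assumption that the data form a full rectangular grid, in which case the centered x and y coordinates are orthogonal and the two least-squares slopes can be computed independently.

```python
def subtract_fitted_plane(grid):
    """Least-squares fit z = a*x + b*y + c on a full grid, then subtract it."""
    rows, cols = len(grid), len(grid[0])
    x_mean, y_mean = (cols - 1) / 2, (rows - 1) / 2
    z_mean = sum(sum(row) for row in grid) / (rows * cols)
    # Centered coordinates are orthogonal on a full grid, so the slopes decouple.
    sxz = sum((x - x_mean) * grid[y][x] for y in range(rows) for x in range(cols))
    syz = sum((y - y_mean) * grid[y][x] for y in range(rows) for x in range(cols))
    sxx = rows * sum((x - x_mean) ** 2 for x in range(cols))
    syy = cols * sum((y - y_mean) ** 2 for y in range(rows))
    a = sxz / sxx if sxx else 0.0
    b = syz / syy if syy else 0.0
    return [[grid[y][x] - (z_mean + a * (x - x_mean) + b * (y - y_mean))
             for x in range(cols)] for y in range(rows)]

# Toy example: a tilted plain with a single raised cell.
plain = [[2 * x + 3 * y + 5 for x in range(5)] for y in range(4)]
plain[2][2] += 4
residual = subtract_fitted_plane(plain)
```

After the subtraction the raised cell is the largest value in the residual, regardless of where it sits on the slope, so a single threshold can be applied across the whole image.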
Figure 5.3 Thresholded values above a linearly fitted plane in area around Nippur
Features are now detected across the entire image, but the noise has not been eliminated,
nor are the canals clearly identified. Blurring as a preprocessing step removes some noise,
but the canals are still not well connected.
Figure 5.4 Nippur dataset. Data was blurred, fitted to plane, then thresholded.
5.3 Hough Transform
To connect the canals, we tried using the Hough Transform, a standard line finding
algorithm. Since MATLAB has a built-in implementation, this algorithm can be quickly
tested.
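For readers unfamiliar with the technique, the voting idea behind the Hough transform can be sketched in a few lines of Python. This is a generic illustration, not MATLAB's implementation and not the code used here; the ten-degree angular step and the function names are assumptions chosen for brevity.

```python
import math

def hough_votes(mask, theta_step_deg=10):
    """Accumulate votes for lines x*cos(theta) + y*sin(theta) = rho."""
    votes = {}
    for y, row in enumerate(mask):
        for x, value in enumerate(row):
            if not value:
                continue
            for theta in range(0, 180, theta_step_deg):
                t = math.radians(theta)
                rho = round(x * math.cos(t) + y * math.sin(t))
                votes[(theta, rho)] = votes.get((theta, rho), 0) + 1
    return votes

def strongest_line(votes):
    """Return the (theta, rho) bin with the most votes."""
    return max(votes, key=votes.get)

# Toy mask with one horizontal feature along row 3.
mask = [[1 if y == 3 else 0 for x in range(8)] for y in range(6)]
votes = hough_votes(mask)
```

Every feature pixel votes for all lines passing through it, so large solid blocks of thresholded pixels generate many spurious high-vote lines.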
Figure 5.5 Hough transform on blurred, plane fitted, and thresholded Nippur dataset.
Unfortunately, we are still limited by the simplicity of our feature algorithm. The simple
thresholding creates large blocks of features that confuse the line finding algorithm,
seen in the bottom left and top right. Furthermore, the inability to choose a correct
threshold means that some large features are broken, such as the long east-west canal
that bisects the Nippur dataset.
5.4 Watershed
The final simple algorithm to try is the watershed algorithm. This algorithm gradually
fills low intensity regions of the image, then marks edges where “puddles” merge. It is
an algorithm that is often used for ridge finding, and seems especially applicable to
elevation data.
Once again, MATLAB has a built-in implementation. Since the
watershed algorithm is especially sensitive to noise, we blur the image as a preprocessing
step. However, regardless of how much the image is blurred, noise is still a problem.
With a moderate blur (a Gaussian filter twenty pixels wide), the watershed algorithm
severely over-detects. A large blur (a Gaussian filter one hundred pixels wide)
blurs out major features without eliminating the over-detection. The watershed
algorithm also gives no indication of how certain a feature is. Figure 5.6 shows the watershed
algorithm run at two blur levels. Neither eliminates all noise, and over-blurring
eliminates the large horizontal canal in the center of the dataset and at the top. Note
especially the long, straight, vertical and horizontal features. These are all errors.
Figure 5.6 Watershed algorithm on Nippur dataset at blur filters of size 20 and 100 pixels.
Chapter 6
Final Algorithm
The algorithm described in this chapter is the best working algorithm we could write. It
works relatively well, but certainly does not achieve the level of detail seen in the Adams
ground surveys. Furthermore, it is computationally intense, and requires a good bit of
time to run. However, it does give a good overview of the watercourse landscape and
should be useful in assisting archaeologists of the region.
The algorithm consists of three steps. A preprocess step eliminates some noise and tiles
the image. The cross correlation step identifies features, and the tracing step traces the
canals.
6.1 Preprocessing
The first step in our final algorithm is a series of preprocessing filters. These filters
generally deal with noise issues. Several filters used early in the project were later
eliminated as unnecessary, and these are enumerated here.
6.1.1 Failed Approaches
Gaussian Blur
The Gaussian blur filter was the standard blurring algorithm
used in this project. It was done by convolving a small Gaussian matrix across the
image, with a standard deviation of one third the width of the filter. In general, the blur
eliminated some of the small noisy peaks in the data set, but also blurred out thin canals.
The preprocess blur was rendered unnecessary by the cross correlation step of the
algorithm, and was thus eliminated.
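The blur described above can be sketched as follows. This is an illustrative Python version rather than the project's MATLAB code; it assumes an odd filter width and handles image edges by clamping, neither of which is specified in the text.

```python
import math

def gaussian_kernel(width):
    """Square Gaussian kernel whose standard deviation is one third of its width."""
    sigma = width / 3.0
    center = (width - 1) / 2.0
    raw = [[math.exp(-((x - center) ** 2 + (y - center) ** 2) / (2 * sigma ** 2))
            for x in range(width)] for y in range(width)]
    total = sum(map(sum, raw))
    return [[v / total for v in row] for row in raw]  # weights sum to 1

def blur(grid, kernel):
    """Convolve grid with a symmetric kernel; edges are clamped (an assumption)."""
    rows, cols, k = len(grid), len(grid[0]), len(kernel)
    half = k // 2
    out = [[0.0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            for j in range(k):
                for i in range(k):
                    yy = min(max(y + j - half, 0), rows - 1)
                    xx = min(max(x + i - half, 0), cols - 1)
                    out[y][x] += kernel[j][i] * grid[yy][xx]
    return out

# A single noisy spike is spread out and lowered by the blur.
spike = [[0.0] * 5 for _ in range(5)]
spike[2][2] = 9.0
smoothed = blur(spike, gaussian_kernel(3))
```

The same spreading that lowers an isolated spike also lowers a thin ridge, which is why thin canals were lost.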
Resize
Resizing was a useful preprocess step in finding various sizes of canals.
The resize also acted as a type of blur, since small peaks were averaged out as the image
got smaller. However, the scaled filters step of the algorithm made this preprocess step
unnecessary.
Value Correction
Due to the instruments used to collect the elevation data, large
areas of water, like lakes or rivers, give extremely low values. To fix these values, we
tried a two step process. Since the highest water value was approximately ten percent of
the lowest land value, thresholding was the easiest method to identify water regions.
The water pixels were then filled in using the mean of their non-water neighbors.
Unfortunately, this method introduced additional artifacts that, while less noticeable in
the weight image, still affected the end result significantly. As a result, we left the noisy
data in to indicate where error values may have occurred.
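The two-step value correction that was tried can be sketched as below. This is a hypothetical Python illustration; the function name, the eight-neighbor window, and the single pass are assumptions, since the text does not give those details.

```python
def fill_water(grid, water_cutoff):
    """Replace cells below water_cutoff with the mean of their non-water neighbors."""
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for y in range(rows):
        for x in range(cols):
            if grid[y][x] >= water_cutoff:
                continue  # land pixel, leave unchanged
            neighbors = [grid[ny][nx]
                         for ny in range(max(0, y - 1), min(rows, y + 2))
                         for nx in range(max(0, x - 1), min(cols, x + 2))
                         if (ny, nx) != (y, x) and grid[ny][nx] >= water_cutoff]
            if neighbors:
                out[y][x] = sum(neighbors) / len(neighbors)
    return out

# One water pixel (value 1) surrounded by land at elevation 20.
patched = fill_water([[20, 20, 20], [20, 1, 20], [20, 20, 20]], water_cutoff=10)
```

In a single pass, water pixels with no land neighbors are left unchanged, so the interior of a wide lake would need repeated passes and would be filled with increasingly smoothed values.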
6.1.2 Fourier Transform
Due to the collection method, the SRTM data has periodic crisscrossing noise. The
noise was mostly eliminated using a Fourier transform. First, we took the Fourier
transform of the image. In Figure 6.1, the peaks of this periodic noise can be identified
radiating from the center. These peaks were set to zero and the inverse Fourier
transform was taken. Figure 6.2 shows the results.
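The idea of zeroing noise peaks in the frequency domain can be illustrated on a one-dimensional profile. The sketch below is a simplification in Python: it uses a direct discrete Fourier transform on a short list rather than a two-dimensional FFT on the image, and the noisy frequency bin is assumed to be known in advance rather than picked out of a plot such as Figure 6.1.

```python
import cmath
import math

def dft(values, inverse=False):
    """Direct discrete Fourier transform (O(n^2)), adequate for a short example."""
    n = len(values)
    sign = 1 if inverse else -1
    out = [sum(values[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
           for k in range(n)]
    return [v / n for v in out] if inverse else out

def remove_periodic_noise(profile, noisy_bins):
    """Zero the given frequency bins and their mirrored bins, then invert."""
    n = len(profile)
    spectrum = dft(profile)
    for k in noisy_bins:
        spectrum[k] = 0
        spectrum[(n - k) % n] = 0  # mirrored bin, so the result stays real
    return [v.real for v in dft(spectrum, inverse=True)]

# A flat profile at elevation 5 with a periodic stripe of period 4 samples.
n = 16
striped = [5.0 + math.cos(2 * math.pi * 4 * t / n) for t in range(n)]
cleaned = remove_periodic_noise(striped, noisy_bins=[4])
```

Any real terrain energy that happens to fall in the zeroed bins is removed along with the noise, so this works best when the noise peaks are narrow and well separated from the terrain signal.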
Figure 6.1 Fourier transform of Shatt al-Gharraf dataset.
Figure 6.2 Comparison of original and Fourier adjusted data. (USGS)
6.1.3 Tiling
The entire data set to be scanned is approximately 53 MB in a compressed TIFF
format. Consequently, it needs to be tiled before any memory intensive computation
is run. The data was tiled into windows of about 500 by 500 pixels, which were later
concatenated together. Most of the testing and verification of the algorithm was
done on a single tile.
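A tiling scheme of this kind can be sketched as a generator of window bounds. The Python below is illustrative only; it assumes non-overlapping tiles, and the text does not say whether the actual tiles overlapped.

```python
def tile_bounds(height, width, tile=500):
    """Yield (row_start, row_stop, col_start, col_stop) windows covering the image."""
    for r in range(0, height, tile):
        for c in range(0, width, tile):
            yield r, min(r + tile, height), c, min(c + tile, width)

# A hypothetical 1200 x 900 pixel image is split into six tiles.
tiles = list(tile_bounds(1200, 900))
```

Each tile can be processed independently and written back into the same bounds of the output image, which is one way to concatenate the results.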
6.2 Normalized Cross Correlation
After preprocessing to prepare the data, we used a normalized cross correlation filter to
identify the canals. The filter was square, with its height values shaped like a prism, as
shown in Figure 6.3. The height of the filter was based on examination of canal heights.
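The normalized cross correlation between one image window and the filter can be sketched as follows. This Python illustration assumes a triangular ridge profile for the prism-shaped filter; the exact shape and height used in the thesis are those of Figure 6.3, and the function names here are hypothetical.

```python
import math

def prism_template(size, height=1.0):
    """Square template with a triangular profile peaking along the central column."""
    center = (size - 1) / 2.0
    return [[height * (1 - abs(x - center) / (center + 1)) for x in range(size)]
            for _ in range(size)]

def ncc(window, template):
    """Normalized cross correlation of two equally sized grids, in [-1, 1]."""
    w = [v for row in window for v in row]
    t = [v for row in template for v in row]
    w_mean, t_mean = sum(w) / len(w), sum(t) / len(t)
    num = sum((a - w_mean) * (b - t_mean) for a, b in zip(w, t))
    den = math.sqrt(sum((a - w_mean) ** 2 for a in w) *
                    sum((b - t_mean) ** 2 for b in t))
    return num / den if den else 0.0  # flat windows give no response

template = prism_template(7)
levee = [[100 + 3 * v for v in row] for row in template]  # same ridge, offset and scaled
across = [list(col) for col in zip(*levee)]               # ridge rotated by ninety degrees
```

Because the means are subtracted and the result is divided by both standard deviations, the response does not depend on the absolute elevation or the height of the ridge. A ridge aligned with the template scores close to 1, while one perpendicular to it scores close to 0.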
Figure 6.3 Normalized cross correlation filter.
When conducting the pass, at each pixel, the filter was rotated through 180 degrees, with
the normalized cross correlation computed at increments of ten degrees. The output of
this step was an array of images, each containing the filter response for a given filter
orientation. Figure 6.4 shows the filter response for orientations of zero and ninety
degrees. Notice how each filter generally finds canals oriented along it.
Figure 6.4 Filter response at zero and ninety degrees on Nippur dataset.
To combine the array of filter responses, we take the max value across all orientations,
recording which orientation this is. This results in an aggregate weight image and a
corresponding vector field of the orientation along the canals.
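Combining the per-orientation responses can be sketched as a per-pixel maximum that also records the winning angle. This is an illustrative Python version; the dictionary layout and function name are assumptions.

```python
def aggregate_responses(responses):
    """responses maps an orientation in degrees to a 2-D response grid.

    Returns (weight, orientation): the per-pixel maximum response and the
    orientation at which that maximum occurred.
    """
    angles = sorted(responses)
    first = responses[angles[0]]
    rows, cols = len(first), len(first[0])
    weight = [[0.0] * cols for _ in range(rows)]
    orientation = [[0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            best = max(angles, key=lambda a: responses[a][y][x])
            weight[y][x] = responses[best][y][x]
            orientation[y][x] = best
    return weight, orientation

# Two orientations, one row of two pixels.
weight, orientation = aggregate_responses({0: [[0.2, 0.9]], 90: [[0.7, 0.1]]})
```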
Figure 6.5 Aggregate filter response on Nippur dataset.
Figure 6.6 Oriented filter response.
The final step in the normalized cross correlation step is to scale the filter. This allows
the program to identify many sizes of canals. For every size filter, we computed the
same rotated and oriented normalized cross correlation described above. Figure 6.7
shows the scaled responses, along with a comparison to the original SRTM data and the
Adams ground survey.
Figure 6.7 Scaled filter response on small section of the Nippur data set. (USGS) (Adams, insert)
6.3 Tracing Walk
While the filter response gives a good representation of the canals, it has some
problems.
First of all, it is not a binary representation, but rather a display of
approximate certainty. It is also poorly connected. This problem is the result of the
filter failing at intersections, where it correlates poorly with the data. This is visible in
Figure 6.8, which shows the filter response at several intersections.
Figure 6.8 Normalized cross correlation filter response at intersection.
To connect the canals, we use a tracing algorithm, which follows the orientations
obtained during the correlation step. Using these orientations, we can walk along the
canals, similar to a classic gradient descent.
6.3.1 Tracing Step
During the tracing walk, the state is stored as a location (l) and momentum (m) vector.
After each step, these variables are updated. The tracing walk is conducted using a
small window seven pixels wide, which is the approximate size of a small canal.
The step begins by determining the direction of the trace (Dtrace). The direction is found
by summing all directions (D) within the window, each weighted by its filter response
(W). Since the orientations in the cross correlation step are computed in the range of
zero to 170 degrees, continuous canals could have discontinuous orientations (e.g. a
generally horizontal canal will have orientations varying between values of zero, 170 and
10). Consequently, before summing, we compute the dot product of the
momentum and each orientation. If the dot product is negative, the orientation is flipped
before being summed.
(6.1)  $D_{trace} = \frac{1}{49} \sum_{i,j} W_{ij} \, D_{ij} \, \operatorname{sign}(m \cdot D_{ij})$
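Equation 6.1 can be sketched in Python as below. The representation of each orientation as a unit vector built from an angle in degrees, and the function name, are assumptions made for illustration; the division by the number of cells gives the factor 1/49 for a seven by seven window.

```python
import math

def trace_direction(weights, angles_deg, momentum):
    """Weighted mean of orientation vectors, each flipped to agree with momentum."""
    count = sum(len(row) for row in weights)
    dx = dy = 0.0
    for w_row, a_row in zip(weights, angles_deg):
        for w, angle in zip(w_row, a_row):
            ux, uy = math.cos(math.radians(angle)), math.sin(math.radians(angle))
            dot = momentum[0] * ux + momentum[1] * uy
            sign = (dot > 0) - (dot < 0)  # +1, -1, or 0 as in Equation 6.1
            dx += w * sign * ux
            dy += w * sign * uy
    return dx / count, dy / count

# A roughly horizontal canal whose recorded orientations jump between 0, 170 and 10.
angles = [[(0, 170, 10)[(x + y) % 3] for x in range(7)] for y in range(7)]
weights = [[1.0] * 7 for _ in range(7)]
direction = trace_direction(weights, angles, momentum=(1.0, 0.0))
```

Without the sign term, the 170 degree vectors would largely cancel the others. With it, the result points almost exactly along the momentum.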
The total direction has a weight, which is then linearly mapped to a step size in pixels.
The constants of this mapping are chosen to satisfy a set of conditions. The maximum
step size (smax) is chosen so that the walk is always continuous. This means the
maximum step is the width of the window, and is the step size that should result at the
maximum filter response. The minimum step is one pixel, and corresponds to a filter
response of .05, which is the value of the filter response below which we can be very
sure no canal exists. The step size in pixels (strace) is linear between these two bounds.
(6.2)  $\mathit{slope} = \frac{s_{max} - .5}{W_{max} - .05}$
(6.3)  $\mathit{intercept} = s_{max} - W_{max} \cdot \mathit{slope}$
(6.4)  $s_{trace} = \mathit{slope} \cdot D_{trace} + \mathit{intercept}$
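The linear mapping of Equations 6.2 to 6.4 can be sketched as a small Python function. Several values here are assumptions: the input is taken to be the scalar weight of the total direction, the maximum response defaults to 1.0, and the maximum step defaults to the seven pixel window width. The 0.5 and 0.05 constants are those printed in Equation 6.2.

```python
def step_size(weight, s_max=7.0, w_max=1.0, w_floor=0.05, s_floor=0.5):
    """Map a filter response weight linearly to a step length in pixels."""
    slope = (s_max - s_floor) / (w_max - w_floor)   # Equation 6.2
    intercept = s_max - w_max * slope               # Equation 6.3
    return slope * weight + intercept               # Equation 6.4

longest = step_size(1.0)    # step at the maximum response
shortest = step_size(0.05)  # step at the response floor
```

With these constants the mapping returns the full window width at the maximum response and half a pixel at the 0.05 floor, which presumably rounds to the one pixel minimum described in the text.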
Similar to a classic gradient walk, we introduce both a momentum term and a random
force to the trace. Momentum (m) is the last step vector, while the random force (r) is a
small vector in a random direction.
(6.5)  $s_{final} = c_1 s_{trace} + c_2 m + c_3 r$
However, these additional terms rarely improved the performance of the algorithm, and
were eliminated from the final program. The tracing walk will continue to move until it
hits the edge of the tile or stops moving.
6.3.2 Wrapping the Step
The tracing walk is seeded by choosing the maximum weight still untraced. The walk is
given two opposing momentums and allowed to move to completion.
The tracing walk is run on all scales of the correlation filter images. The sum image of
all traces on all scales is the final result.
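The seeding and two-way walk can be sketched in a simplified form. The Python below is not the thesis implementation: it follows the orientation of a single pixel instead of the windowed sum of Equation 6.1, takes unit steps instead of the scaled step of Equation 6.4, traces only from the single strongest seed, and assumes that orientation angles are measured from the x axis of the image grid.

```python
import math

def trace_from_seed(weight, orientation, floor=0.05, max_steps=1000):
    """Walk both ways from the strongest pixel, following the stored orientations."""
    rows, cols = len(weight), len(weight[0])
    traced = [[0] * cols for _ in range(rows)]
    sy, sx = max(((y, x) for y in range(rows) for x in range(cols)),
                 key=lambda p: weight[p[0]][p[1]])
    seed_angle = math.radians(orientation[sy][sx])
    for direction in (1, -1):  # two opposing initial momentums
        y, x = float(sy), float(sx)
        my, mx = direction * math.sin(seed_angle), direction * math.cos(seed_angle)
        for _ in range(max_steps):
            iy, ix = round(y), round(x)
            if not (0 <= iy < rows and 0 <= ix < cols) or weight[iy][ix] < floor:
                break  # left the tile or fell off the canal
            traced[iy][ix] = 1
            t = math.radians(orientation[iy][ix])
            uy, ux = math.sin(t), math.cos(t)
            if uy * my + ux * mx < 0:  # keep walking the way we were going
                uy, ux = -uy, -ux
            y, x = y + uy, x + ux
            my, mx = uy, ux
    return traced

# Toy tile: one horizontal canal along row 2, strongest at its midpoint.
weight = [[0.0] * 9 for _ in range(5)]
weight[2] = [0.8] * 9
weight[2][4] = 0.9
orientation = [[0] * 9 for _ in range(5)]
traced = trace_from_seed(weight, orientation)
```

Repeating the walk from the strongest pixel not yet marked in the traced image would cover the remaining canals in the tile.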
Figure 6.9 Traced Canals of Nippur Dataset.
Chapter 7
Results and Analysis
The final algorithm gives an approximate map of canals in the region scanned. While
the canals generally line up with ground truth, there are still artifacts and incorrect
values. In general, the algorithm is much better at finding large, wide canals, and is less
consistent when looking for smaller features.
7.1 Results
Figure 7.1 shows the final result for the Shatt al-Gharraf data set. The image took about
two hours to create. Note how the image is generally well connected and clearly
delineates the canals.
Figure 7.1 Traced Canals with Filter Response on Shatt al-Gharraf Dataset
7.2 Comparison to Ground Truth
Figure 7.2 shows a small section of the Shatt al-Gharraf solution that corresponds to the
Adams Heartland of Cities survey. In general, large canals dense with sites are found by
the proposed algorithm. However, the density of canals in the ground survey is far
greater than the density found by the algorithm, so many small canals are missing.
Figure 7.2 Comparison of Final Result to Adams Ground Survey (Adams, insert)
Comparing the result to the satellite surveys done by Adams, as in Figure 7.3, one can again see
that this algorithm tends to find large canals. The large canals are generally
found, even if not in their entirety.
Figure 7.3 Comparison of Final Result to Adams Satellite Survey (Adams, 34)
Comparing the results to the actual elevation data as in Figure 7.4 shows the same trend
of finding large canals. However, even small linear bumps hardly visible in the elevation
data are usually found by the tracing, as shown in Figure 7.5. This fact implies that
some linear mounds are not canals, that the ground survey missed some canals, or that
noisy values in the elevation data created these features.
Figure 7.4 Comparison of Final Result to SRTM Data (USGS)
Figure 7.5 Comparison of Final Result subsection to SRTM Data (USGS)
7.3 Analysis
As the various figures show, the program was moderately successful at finding linear
mounds in the elevation data. In particular, the program found large and long linear
features which usually corresponded to wide canals with a high density of archaeological
sites. However, in general, correlation with the ground truth surveys of Adams was not
very good. While many of the canals identified by satellite were found, many were not.
Furthermore, the algorithm could not come close to the detail of the ground survey, since
the resolution of the data did not allow it.
An interesting correlation emerges on a more careful study of the data. Areas where the
proposed algorithm fails to find a high density of features, like the center of the western
edge of the region, tend to have far fewer sites. This correlation implies that the canals
in the western region are newer, since they did not experience the necessary soil build
up to significantly raise the levees.
Finally, the algorithm corresponds very well to the elevation data in the area along the
Shatt al-Gharraf, an area that, due to increased cultivation, is nearly impossible to survey
on the ground.
Chapter 8
Further Application
Automated survey becomes far more useful when it can be applied to additional regions
outside of the original dataset. Fortunately, the same process of silt deposition that
forms the Mesopotamian canals is also active in China along the Yangtze River.
Consequently, we can apply our algorithm there and attempt to find the same kind of
features.
8.1 Data
The data for this area is processed SRTM elevation data found at the CGIAR
Consortium for Spatial Information. I chose a small section from a tributary of the
Yangtze River just west of Zaozhuang. For ground truth I used a visual spectrum
satellite image from Google Maps, which shows the canals fairly clearly since they are
still in use.
Figure 8.1 Comparison of Visual Spectrum and Elevation Data near Zaozhuang
(TeleAtlas et al.) (Jarvis)
8.2 Results and Analysis
Since the same silt deposition process occurs in China as in Mesopotamia, the canal
detection algorithm should work there as well. However, unlike the meandering canals
of Mesopotamia, the canals of China tend to fall along a grid pattern, as shown in
Figure 8.2. As shown in Figure 8.3, the filter response also follows this general pattern.
Figure 8.2 Canal Grid in Visual Spectrum (TeleAtlas et al.)
Figure 8.3 Comparison of Visual Spectrum and Filter Response near Zaozhuang
(TeleAtlas et al.)
While the filter response seems fairly reliable, the tracing algorithm mostly fails. This
problem seems to arise from the canal pattern. As noted above, the filter response fails
at intersections, especially those with right angles. The density and regularity of these
intersections means that the filter response tends to be lower on average than in
Mesopotamia. Consequently, the tracing algorithm is likely to fall off the canals or stop
prematurely. Figure 8.4 shows the final result, while Figure 8.5 shows the result in
comparison to the visual spectrum image.
Figure 8.4 Final Result near Zaozhuang
Figure 8.5 Comparison of Visual Spectrum and Traced Canals near Zaozhuang
(TeleAtlas et al.)
Chapter 9
Conclusion
The algorithm presented in this paper performed reasonably well on the original dataset,
but needed to be tweaked significantly when moved to another dataset. This adjusting
is likely a matter of changing certain constants, not fixing the entire process. In general,
this algorithm would be best suited for assisting archaeologists in finding hard-to-see features
in the dataset. However, other archaeological features might be easier for computers to
discern and, in this case, it is certainly conceivable that other algorithms could be
written to perform fully automated surveys of regions.
The particular algorithm described here is good at finding long quasi-linear features, a
problem generally known as ridge detection. These types of features are present not
only in elevation data, but also in certain medical images. Finding ridges in data has
been the study of several recent papers (Lopez et al., Cañero and Radeva), yet this
thesis presents a novel and unique approach to the problem.
References
[1] Robert McC. Adams. Heartland of Cities: Surveys of Ancient Settlement and Land Use on
the Central Floodplain of the Euphrates. University of Chicago Press, Chicago, 1981.
[2] C. Cañero and P. Radeva. “Vesselness enhancement diffusion,” Pattern
Recognition Letters, 24 (2003): 3141–3151.
[3] Tom G. Farr et al. “The Shuttle Radar Topography Mission,” Reviews of
Geophysics, 45, RG2004 (2007). http://www2.jpl.nasa.gov/srtm/SRTM_paper.pdf.
[4] Carrie Hritz and T.J. Wilkinson. “Using Shuttle Radar Topography to map
ancient water channels in Mesopotamia,” Antiquity, 80:308 (2006): 415–424.
[5] A. Lopez, D. Lloret, J. Serrat, and J. Villanueva. “Multilocal creaseness based on
the level-set extrinsic curvature,” Computer Vision and Image Understanding,
77:2 (2000): 111–144.
[6] A. Jarvis et al. Hole-filled SRTM for the Globe Version 4. 34.480015, 116.544342.
CSI-CGIAR, 2008. srtm.csi.cgiar.org/SRTMdataProcessingMethodology.asp.
[7] Sadalmelik. “Iraq Topography,” Wikipedia, 2008.
http://en.wikipedia.org/wiki/File:Iraq_Topography.png
[8] TeleAtlas, Digital Globe, TerreMetrics. Google Maps. 34.480015, 116.544342.
Google, 2008. maps.google.com.
[9] USGS. Shuttle Radar Topography Mission. 32.558389, 45.335083. Global Land
Cover Facility, University of Maryland, College Park, Maryland, 2004.
www.landcover.org.
[10] Marc Van De Mieroop. A History of the Ancient Near East ca. 3000-323 BC.
Blackwell Publishers Ltd, Oxford, 2004.
Vita
Joseph Herman Izraelevitz
Date of Birth: July 23, 1986
Place of Birth: Boston, Massachusetts
Degrees: B.S. Computer Science with Second Major in History, May 2009; M.S. Computer Science, May 2009
Professional Societies: Tau Beta Pi; Phi Alpha Theta
May 2009
Automated Archaeological Survey, Izraelevitz, M.S. 2009