Smart Data and Wicked Problems
Paul L. Borrill
Founder, REPLICUS Software
Abstract
Doug Lenat says we are plagued by the problem of our data not being smart enough. In this presentation, we first explore why we want smarter data, and what that means. We look behind the scenes at the frustrations that knowledge warriors experience in their digital lives: problems that are easy to fall victim to, such as information overload, the constant unfolding of additional tasks that get in the way of getting real work done (Shaving the Yak), and the seemingly endless toll on our time & attention required to manage our digital lives. We illuminate these problems with insights gained from design considerations for a 100PB distributed repository, and peel the onion on these problems to find that they go much, much deeper than we imagined, connecting to “wicked problems” in mathematics, physics, and philosophy: What is persistence? Why are time and space not real? Why is the notion of causality so profoundly puzzling? And why is it impossible to solve certain problems with a God’s Eye View? Finally, we propose a prime directive comprising three laws, and six principles for design, so that if our data becomes smart, it does so in ways that truly serve us: simple, secure, resilient, accessible, and quietly obedient.
1. Introduction.
1.1 Why make data smart?
“The ultimate goal of machine production – from which, it is true, we are as yet far removed – is a system in which everything uninteresting is done by machines and human beings are reserved for the work involving variety and initiative”
~ Bertrand Russell

“What information consumes is rather obvious: it consumes the attention of its recipients. Hence a wealth of information creates a poverty of attention, and a need to allocate that attention efficiently among the overabundance of information sources that might consume it”
~ Herbert Simon
As our commercial operations, intellectual assets, and professional and personal context progressively migrate to the digital realm, the need to simply, reliably and securely manage our data becomes paramount. However, managing data in enterprises, businesses, communities and even in our homes has become intolerably complex. This complexity has the potential to become the single most pervasive destroyer of productivity in our post-industrialized society. Addressing the mechanisms driving this trend, and developing systems & infrastructures that solve these issues, creates an unprecedented opportunity for scientists, engineers, investors and entrepreneurs to make a difference.
Human attention is, at least to our species, the ultimate scarce resource. Attention represents both a quantitative and a qualitative aspect of life, which is all we have. The more moments we are robbed of by the voracious appetite of our systems for our tending, the less life we have available to live, for whatever form our particular pursuit of happiness may take: serving others, designing products, creating works of art, scientific discovery, intellectual achievement, saving the earth, building valuable enterprises, or simply making a living.
In enterprises, massive budgets are consumed by
the people hired to grapple with this complexity, yet
the battle is still being lost. In small & medium
businesses, it is so difficult to hire the personnel
with the necessary expertise to manage these
chores, that many functions essential to the
continuation of the business, such as disaster
recovery, simply go unimplemented. Most
consumers don’t even back up their data, and even
for those who should know better, the answer to
“why not” is that it’s just too cumbersome, difficult
and error prone.
Why we want to make data smart is clear: so that our data, as far as possible, allows us to find it and use it freely without our having to constantly tend to its needs. Our systems should quietly manage themselves and become our slaves, instead of us becoming slaves to them.
What this problem needs is a cure; not more fractured and fragmented products, or an endless overlay of palliatives that mask the baggage of the storage industry’s failed architectural theories, which in turn rob human beings of the time and attention needed to manage the current mess of fragility and incompatibility called data storage systems.
1.2 Three Laws of Smart Data
“Men have become tools of their tools”
~ Henry David Thoreau
Now that we recognize we are living inside an
attention economy, we might ask what other
resources we can bring to bear on this problem. It
doesn’t take much to realize that there are rich
technological resources at our disposal that are
rather more abundant: CPU cycles, Memory,
Network Bandwidth and Storage Capacity.
We propose the following laws for Smart Data:
1. Smart Data shall not consume the attention of a human being, or through inaction allow a human being’s attention to be consumed, without that human being’s freely given concurrence that the cause is just and fair.

2. Smart Data shall obey and faithfully execute all requests of a human being, except where such requests would conflict with the first law.

3. Smart Data shall protect its own existence as long as such protection does not conflict with the first or second law.
1.3 Wicked problems
Rittel and Webber1 suggest the following criteria for recognizing a wicked problem:

• There is no definitive formulation of a wicked problem.
• Wicked problems have no stopping rule.
• Solutions to wicked problems are not true-or-false, but good-or-bad.
• There is no immediate and no ultimate test of a solution to a wicked problem.
• Every solution to a wicked problem is a "one-shot operation"; because there is no opportunity to learn by trial-and-error, every attempt counts significantly.
• Wicked problems do not have an enumerable (or exhaustively describable) set of solutions.
• Every wicked problem is essentially unique.
• Every wicked problem can be considered to be a symptom of another problem.
• Discrepancies in representing a wicked problem can be explained in numerous ways. The choice of explanation determines the nature of the problem's resolution.
• The designer has no right to be wrong.
Note that a wicked problem2 is not the same as an
intractable problem.
Related concepts include:

• Yak shaving: any seemingly pointless activity which is actually necessary to solve a problem which solves a problem which, several levels of recursion later, solves the real problem you're working on. The concept was named at MIT’s AI Lab and popularized by Seth Godin and others.

• Gordian Knots: some problems only appear wicked until someone solves them. The Gordian Knot is a legend associated with Alexander the Great, used frequently as a metaphor for an intractable problem solved by a bold action ("cutting the Gordian knot").
Wicked problems can be divergent or convergent, depending upon whether they get worse or better as we recursively explore the next level of problem to be solved.
1.4 Knowledge Warriors
If we apply our intelligence and creativity, we can conserve scarce resources by leveraging more abundant ones. Many of us devise personal strategies to counter this trend of incessant Yak shaving, to keep our data systems clean and to conserve our productivity. This is the zone of the Knowledge Warrior.
We begin each section with the daily activities and
reasonable expectations of knowledge warriors as
they interact with their data, and go on to explore
the connection to the deep issues related to design
of smart data. While we hope to extract useful
principles for designers of smart systems to follow,
we cannot hope in such a small space to provide
sufficient evidence or proofs for these assertions.
Therefore, connections to key references in the
literature are sprinkled throughout the document
and those reading it on their computers are
encouraged to explore the hyperlinks.
I make no apology for a sometimes-controversial tone, the breadth of different disciplines brought into play, the cognitive dislocations between each section, or the variability in depth and quality of the references. It is my belief that the necessary insights for making progress on this problem of data management complexity cannot be obtained by looking through the lens of a single discipline, and that the technology already exists for us to do radically better than the systems currently available on the market today.
Section 5 contains this paper’s central contribution.
1 Rittel, H. J., and M. M. Webber (1984). “Planning problems are wicked problems”.

2 When I began writing this paper and chose the concept of “wicked problems”, I thought I was being original. Google dissolved my hubris when I discovered that Rittel & Webber had defined a similar concept in 1984 (the year of Big Brother) in the context of social planning. They described that, in solving a wicked problem, the solution of one aspect may reveal another, more complex problem.
2. 100PB Distributed Repository
2.1 True Costs
The arithmetic for a 100PB distributed repository is
rather straightforward: 12 disks per vertical sled3, 8
sleds per panel, 6 panels per 19” rack yields >500
disks per rack. In current 2008 capacities this yields
>0.5PB per rack. So five datacenters containing 40
racks each are required for ~100PB raw capacity.
The quantitative picture above may be accurate in
disk drive costs, but anyone with experience in the
procurement and operational management of digital
storage will recognize it as a fantasy.
Alternatively, mobile data centers built from 20-foot shipping containers4 (8 racks/container) yield ~5PB per container, or 10PB in 40-foot containers. Thus 10 x 40-foot or 20 x 20-foot containers are required for a 2008 100PB deployment. It is not difficult to imagine a government-scale project contemplating a 100-container deployment yielding 1EB, even in 2008. Half this many containers will be needed in 2012, and a quarter (25 x 40-foot containers = 1EB) in 2016, just 8 years from now.
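To make the arithmetic concrete, the following sketch (a minimal example in Python, using the figures given above and in the footnoted sled packaging) reproduces the 2008 packing numbers; the round figures quoted in the text include some headroom.

```python
# A minimal sketch of the 2008 packing arithmetic described above.
# All figures come from the text; drive capacity is the 2008 7200 RPM value.

DISKS_PER_SLED = 12
SLEDS_PER_PANEL = 8
PANELS_PER_RACK = 6
TB_PER_DISK_2008 = 1              # 7200 RPM capacity drive, 2008
RACKS_PER_20FT_CONTAINER = 8

disks_per_rack = DISKS_PER_SLED * SLEDS_PER_PANEL * PANELS_PER_RACK   # 576
pb_per_rack = disks_per_rack * TB_PER_DISK_2008 / 1000                # ~0.58 PB
pb_per_container = pb_per_rack * RACKS_PER_20FT_CONTAINER             # ~4.6 PB

racks_for_100pb = 100 / pb_per_rack             # ~174 racks (200 with headroom)
containers_for_100pb = 100 / pb_per_container   # ~22 x 20-foot containers

print(f"{disks_per_rack} disks/rack, {pb_per_rack:.2f} PB/rack")
print(f"~{racks_for_100pb:.0f} racks or ~{containers_for_100pb:.0f} 20-ft containers for 100 PB")
```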
Table 1: Anticipated Disk Drive Capacities

                RPM     2008      2012      2016
Capacity        7200    1TB       4TB       8TB
Performance     10K     400GB     800GB     1.6TB
High-perf.      15K     300GB     600GB     1.2TB
What may surprise us is the cost of the disk drives alone in this arithmetic exercise: normalizing for a mix of performance and capacity 3-1/2” drives, and assuming an average of $200/disk – constant over time – yields, for a 100PB deployment, approximately $26M in 2008, $13M in 2012, and $6.5M in 2016. Table 2 below projects costs for Government (1EB and 100PB), Enterprise (10PB and 1PB), Small & Medium Business (SMB) (100TB), and Personal/Home (10TB) deployments from 2008 to 2016.
Table 2: Anticipated Disk Drive Cost

            Capacity    2008      2012      2016
Lg Gov      1EB         $260M     $130M     $65M
Sm Gov      100PB       $26M      $13M      $6.5M
Lg Ent.     10PB        $2.6M     $1.3M     $650K
Sm Ent.     1PB         $260K     $130K     $65K
SMB         100TB       $26K      $13K      $6.5K
Personal    10TB        $2K       $1K       $500
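The rows above can be roughly reproduced with the sketch below; the ~770GB average drive size is an assumption standing in for the text’s mix of capacity and performance drives, and each cost is simply drive count times the assumed constant $200 per drive.

```python
# A minimal sketch reproducing Table 2: drive count times an assumed constant
# $200/drive, with the average drive capacity doubling every four years.
# The 0.77 TB average for 2008 is an illustrative assumption, not from the text.

AVG_TB_PER_DRIVE = {2008: 0.77, 2012: 1.54, 2016: 3.08}   # assumed drive mix
PRICE_PER_DRIVE = 200                                      # constant, per the text

deployments_tb = {"Lg Gov (1EB)": 1_000_000, "Sm Gov (100PB)": 100_000,
                  "Lg Ent. (10PB)": 10_000, "Sm Ent. (1PB)": 1_000,
                  "SMB (100TB)": 100, "Personal (10TB)": 10}

for name, tb in deployments_tb.items():
    cells = []
    for year, avg_tb in AVG_TB_PER_DRIVE.items():
        cost = (tb / avg_tb) * PRICE_PER_DRIVE       # drives needed x unit price
        cells.append(f"{year}: ${cost:,.0f}")
    print(f"{name:16s} " + "  ".join(cells))
```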
Given the history of data growth and the voracious
appetite of governments, industry and consumers
for data storage, it is reasonable to assume that
scenarios such as the above are not just possible,
but inevitable in the years to come.
But this is not an accurate picture for stored data. While disk procurement costs are in the 20¢/GB range, the costs of fully configured, fully protected and disaster-recoverable data can be a staggering two or more orders of magnitude higher than this. For example, one unnamed but very large Internet company considers their class 1 (RAID 10, fully protected) storage costs to be in the range $35-$45/GB per year. In such a scenario, if the disk drive manufacturers gave their disks away for free (disk costs = $0), it would make hardly a dent in the total cost of managing storage.
Some of this cost comes understandably from the packaging (racks), power supplies and computers associated with the disks to manage access to the data: a simple example would be Network Attached Storage (NAS) controllers, which range from one per disk to one per 48 disks. Another factor is the redundancy overhead of parity and mirroring. At the volume level, RAID represents a 30% to 60% overhead on the usable capacity. This is doubled for systems with a single remote replication site. Disk space for D2D backups of the primary data consumes 2-30 times the size of the RAID set (daily/weekly backups done by block rather than by file5), and with volume utilizations as low as 25% on average, we must multiply the size of the RAID set by a factor of 4 to get to the true ratio of single-instance data to raw installed capacity.
All of this can be calculated, and making the tradeoffs in reliability, performance and power dissipation can be an art form. However, even with a worst-case
scenario, the cost of all the hardware (and software
to manage it) still leaves us a factor of five or more
away from the actual total cost of storage. If all
hardware and software vendors were to give their
products away for free, it might reduce the CIO’s
data storage budget by about 20%; and as bad as
this ratio is, it continues to get worse, year after
year, with no end in sight6.
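One possible way to compose the multipliers mentioned above is sketched below; the text does not specify the exact composition, so the values and the ordering here are illustrative assumptions only.

```python
# A rough, illustrative sketch of the raw-to-usable multipliers discussed above,
# using mid-range values; the composition and inputs are assumptions, not the
# author's model.

raid_overhead = 1.45        # 30%-60% overhead on usable capacity (midpoint)
remote_replica_sites = 1    # one remote site doubles the RAID footprint
d2d_backup_factor = 5       # D2D backups: 2-30x the RAID set (low-end example)
volume_utilization = 0.25   # ~25% average volume utilization

def raw_capacity_needed(single_instance_tb: float) -> float:
    """Raw installed TB needed to hold one TB of single-instance data."""
    volume_tb = single_instance_tb / volume_utilization   # under-utilized volumes
    raid_tb = volume_tb * raid_overhead                   # parity/mirroring
    replicated_tb = raid_tb * (1 + remote_replica_sites)  # remote replication
    backup_tb = raid_tb * d2d_backup_factor               # disk-to-disk backups
    return replicated_tb + backup_tb

print(raw_capacity_needed(1.0))   # tens of TB of raw disk per TB of unique data
```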
In order to satisfy Wall Street’s obsession with monotonically increasing quarterly returns, digital storage vendors are forced to ignore (and try to hide) the externalities their systems create: primarily the cost of human capital, in the form of administrative effort to install and manage data storage. This is not even counting the wasted attention costs for knowledge warriors using those systems.
3 High-density 3-1/2” drive packaging + NAS controllers in a vertical sled arrangement (Verari): 576-720 disks/rack.
4 Project Black Box (Sun).
5 Zachary Kurmas and Ann L. Chervenak. “Evaluating Backup Algorithms”. IEEE Mass Storage 2000. pp 235-242.
6 Economist Magazine: Andreas Kluth. “Make it Simple: Information Technology Survey”. October 28th 2004.
3. Identity & Individuality
"Those great principles of sufficient reason
and of the identity of indiscernibles change
the state of metaphysics. That science
becomes real and demonstrative by means of
these principles, whereas before it did
generally consist in empty words."
~ Gottfried Leibniz
How smart data “appears” to us is affected by how easy it is to identify and manipulate it. But what is “it”? And how do we get a handle on “it” without affecting it? Here the wickedness begins to reveal itself.
3.1 Getting a handle on “it”
The first question is to consider how we identify what “it” is, where its boundaries are and what its handles might be. Knowledge warriors prefer to identify data by namespace and filename. Administrators prefer to identify data by volume(s), paired with the array(s) and backup sets they manage. Storage system architects prefer not to identify data at all, but to aggregate block containers into pools that can be sliced, diced and allocated as fungible commodities.
Each optimizes the problem to make their own life easier. Unfortunately, the knowledge warrior has the least power in this hierarchy, and ends up with the leftover problems that the designers and administrators sweep under the rug.
When it comes to managing “changes” to the data
(discussed in detail in the next section), the
situation begins to degenerate: knowledge warriors
prefer to conceptualize change as versioning
triggered by file closes. This creates several
problems:
1. Administrators have difficulty distinguishing
changes to a volume by individual users, and
have no “event” (other than a periodic
schedule) to trigger backups, so whole
volumes must be replicated at once.
2. As the ratio between the size of the volume
and the size of the changed data grows:
increasing quantities of static data are copied
needlessly, until the whole thing becomes
intolerably inefficient.
3. As data sets grow, so does the time to do the
backup – this forces the administrators to favor
more efficient streaming backups on a block
basis to other disks, or worse still, tapes.
4. Users experience vulnerability windows, where
changes are lost between the recovery time
objective (RTO) and recovery point objective
(RPO) imposed on them by the system
administrators.
Palliatives are available for each of these problems: diffs instead of full backups, more frequent replication points, continuous data protection (CDP), de-duplication, etc. Each of these imposes complexity that translates into increased time and expertise required of already over-burdened administrators.
The traditional method of managing change is to identify a master copy of the data, of which all others are derivatives. Complexity creeps in when we consider what must be done when we lose the master, or when multiple users wish to share the data from different places. Trying to solve this problem in a distributed (purely peer-to-peer) fashion has its share of wickedness. But trying to solve it by extending the concept of a master copy, while seductively easier in the beginning, leads rapidly to problems which are not merely wicked but truly intractable: entangled failure models, bottlenecks and single points of failure, which lead to overall brittleness of the system.
Storage designers find disks to be a familiar and comforting concept: a disk is an entity that you can look at and hold in your hand. Its internal structure, and the operations we can perform on it, are as simple as a linear sequence of blocks, from 0 to N, with “block” sizes of 512 bytes to 4KB, where N gets larger with each generation of disk technology and the operations are defined by a simple set of rules called SCSI commands.

Disk drive designers, storage area network (SAN) engineers and computer interface programmers work hard to make this abstraction reliable. They almost succeed…
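The abstraction itself is trivially small, which is part of its appeal; a minimal sketch of the conceptual core (illustrative only, not a real SCSI implementation) looks like this:

```python
# A minimal sketch of the linear block abstraction described above: a disk
# appears as blocks 0..N-1 of a fixed size, with read/write as the only
# operations shown here (SCSI defines many more commands).

class BlockDevice:
    def __init__(self, num_blocks: int, block_size: int = 512):
        self.block_size = block_size
        self.blocks = [bytes(block_size) for _ in range(num_blocks)]

    def read(self, lba: int) -> bytes:
        return self.blocks[lba]               # fails "abruptly" past block N-1

    def write(self, lba: int, data: bytes) -> None:
        assert len(data) == self.block_size   # fixed block size, no more, no less
        self.blocks[lba] = data

disk = BlockDevice(num_blocks=1024)
disk.write(0, b"x" * 512)
print(disk.read(0)[:4])
```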
Give the disk abstraction to application programmers, however, and we soon see its warts: no matter how big disks get, they have a fixed size7, limited performance, and they fail (often unpredictably). These “abrupt” constraints make systems brittle: disks fail miserably and often at the most inopportune time. We get “file system full” messages that stop us in our tracks, massive slowdowns as the number of users accessing a particular disk goes beyond some threshold, or our data becomes lost or corrupted by bitrot8.
Fear not: our trusty storage system designers go one level higher in their bottom-up design process and invent “volumes” for us. In principle, volumes are resilient to individual disk failures (RAID), as fast as we would like (striping), and as large as we want (concatenation). We can even make them “growable” by extending their size while they are on-line (taking advantage of this requires a compatible file system technology).

7 Similar and sometimes more vexing problems occur with artificial constraints on file size. For example, Outlook pst files, which fail when they grow beyond 1GB.
8 Hidden data corruption on disks, often caused by poorly designed RAID controllers.
The invention of abstract volumes for data was a
powerful tool to help application programmers in
particular, and their users, be able to focus on the
problem they wished to solve by providing them
with a substrate they could rely on. A perfect
example of how an abstraction should work – at
least for application programmers and users.
Under the hood, Storage Arrays, SAN’s, switches
and Volume Managers employ highly complex
mechanisms to direct, indirect and interleave
commands from servers to disk arrays, in order to
provide an illusion of a reliable storage service.
Unfortunately, although this abstraction works well for application programmers, it comes at the cost of substantially higher complexity for administrators, partly because of conflicts between the abstractions users prefer when identifying their data and the abstractions the designers felt comfortable with.
3.2 Hiding “it”
A rather similar problem9 was solved back in the
early days of computer systems: the problem
known as virtual memory, where each program
(and thus each programmer) is given an illusion of
a reliable memory, which spans from page 0 to
page N (with “page” sizes of 32K or higher).
When we add new memory DIMMs to our computers, we simply turn them on and the additional memory becomes instantly available to the operating system and application programs. This is the way it should be. Unfortunately, storage designers were not as successful in architecting things as were the designers of virtual memory systems. When we add new disks to a computer, or to a RAID array, we have a whole series of administrative actions to take, and if we get any of them wrong, we end up with an unusable system or, worse still, the corruption of data on existing disks.
A whole Yak shaving industry has been built up
around storage management because of the frailty
of this block (and volume) view: “human beings”
were inadvertently designed into the system and
now have to be specially trained and certified in
Yak shaving. There are tools available for
administrators to count the hairs on their yaks, and
to style special tufts to go behind their ears and on
their posteriors. Yak shaving certification courses are available, and an entire industry of yak razors and yak shaving creams has been invented to help them do their job. These administration tasks
go by various names: disk administration, LUN
masking, storage provisioning, backup, restore,
archiving, monitoring, capacity planning, firmware
updates, multipathing, performance engineering,
availability analysis, installation planning, etc. Why
are all these functions necessary for managing
disks, but not necessary for managing memory? If
you think I am jesting, take a look at a typical
storage administrator’s job function by IBM,
Sun/VERITAS, or Oracle.
3.3 Users vs Administrators
In conventional storage systems, administrators
identify data by device and volume, and users
identify data by filename. Both the notion of what a
data object is, as well as the operations performed,
are vastly different between administrators and
users. Manipulations or “operations” on the data for
administrators means things like backup, restore,
replicate remotely. For a user this means edit, copy,
rename, etc. These different views of what
constitutes the “identity” of data creates
fundamental conflicts in mental models.
The file object abstraction provides isolation against data corruption, so that only one file at a time appears corrupted. Data corruption in a volume abstraction can mean that every file within the volume is inaccessible. By forcing volume-level constraints and behaviors on the system, we make our data more fragile overall. Volumes must be “mountable”, otherwise their data is inaccessible to the filesystem or database, and unavailable to the user. The larger volumes get, the more data is affected by these kinds of failures. But we are destined to have to deal with this problem in some respects anyway, as disk drive storage capacities increase over time.
Conventional restore from a backup can be costly
and time consuming because whole volumes must
be restored at once, whereas users often need only
a specific file to be restored. Users want their files
to be protected the instant they close or save them.
Administrators prefer to wait until the system can be quiesced before initiating more efficient streaming block backups from one device to another.
A significant contribution to the user view of data identity has been achieved by Apple with their recent introduction of Time Machine.
3.4 Zero Administration
Eliminating administrator chores means everything
must be made either automatic or commandable by
the user. If we are not to overburden the user with
the requirement for the kinds of skills and training
that current administrators require, then we must
present a simple, consistent framework to the user within which s/he can relate to their data without unnecessary conceptual or cognitive baggage: baggage designed by the designer for the convenience of the designer.
9 Example courtesy Jeff Bonwick, Sun Microsystems.
These problems are not only relevant for making
the life of the corporate knowledge warrior easier.
Eliminating the need for administrators is essential
for applications where there are no administrators,
such as high- or low-conflict military zones, Small-Medium Businesses, and consumers at home, where a great deal of storage is being sold10.
In order to do this, we need a theory for our understanding of objects as individuals with well-defined identity conditions. Users are interested in files and directories, not disks, volumes, RAID sets, backup sets or fragile containers such as pst files.
3.5 Indiscernibility
The biggest simplifying assumption for a user is a
single universal namespace; represented perhaps
by a single portal through which all the files a user
has ever created, shared, copied or deleted can be
viewed and manipulated without cognitive effort,
and unconstrained by system considerations. This
single namespace portal must be accessible from
anywhere, and behave as if the data is always
local, i.e. provides the user with a local access
latency experience and the file is consistently
available even when the network is down. The
natural way to do this is to replicate the data
objects. A degenerate, but perfectly valid case (in a
world of cheap disk capacity), is to replicate
everything to everywhere that the user may try to
access their data. A more economical method is to
replicate only the most frequently used data
everywhere, and pull other data when a user
requests it (but this is an optimization, orthogonal to
the issue of identity and individuality).
In our canonical distributed data repository, the
user can access a file from anywhere they may find
themselves using a local PC/Client. Just point and
click, drag and drop within the portal, or type some
simple command containing an operator and a
target file specification.
As far as possible, the user needs the illusion of a single file; the fact that replication is going on behind the scenes is irrelevant to them. It shouldn’t matter whether there are three replicas of the data or three thousand: the user doesn’t need or want to know. For the user, replicas should be indiscernible.
In order for a system to successfully provide this
illusion, it must be able to treat all the replicas as
substitutable, so that it can retrieve any one of the
replicas as a valid instance of the file. For the
system, replicas should be substitutable.
10 Many of us have become the family CTO/CIO at home. Even corporate CIOs who understand the issues and use the state-of-the-art solutions in their work environments would prefer not to have to deal with the complexity of those solutions when they go home to their families.
Thus, what constitutes the ‘principle’ of individuality is a wicked problem: users need replicas to be indiscernible, and the system needs them to be substitutable. But wait, there’s more!
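A minimal sketch of these two requirements, with illustrative names of our own (not the author’s design), might look as follows: the user addresses one logical file, while the system is free to satisfy a read from any replica.

```python
# A minimal sketch of the two views described above: the user sees one logical
# file (indiscernible replicas); the system may fetch any replica (substitutable).

import random

class SmartFile:
    def __init__(self, name: str):
        self.name = name          # the single entity the user attends to
        self.replicas = []        # locations known only to the system

    def add_replica(self, location: str, content: bytes) -> None:
        self.replicas.append((location, content))

    def read(self) -> bytes:
        """User-visible read: any replica is a valid instance of the file."""
        location, content = random.choice(self.replicas)   # substitutable
        return content                                      # indiscernible

f = SmartFile("report.pdf")
f.add_replica("laptop", b"...")
f.add_replica("office-server", b"...")
assert f.read() == b"..."
```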
3.6 Substitutability
Making replicas substitutable solves many of the
primary issues related to system scalability,
brittleness, and the illusion of reliability and
simplicity for the user. Recovery in the wake of
failures and disasters now becomes trivial because
any replica can be used for recovery, and not just
some fragile master copy or primary backup.
The statistics of file usage are rather interesting
with respect to this notion of substitutability. 90% of
all files are not used after initial creation; those that
are used are normally short-lived; if a file is not
used in some manner the day after it is created, it
will probably never be used, and only approximately
1% of all files are used daily11. This suggests that
the notion of substitutability is highly exploitable in
simplifying the design of Smart Data systems. For
example, the replicas of write-once-read-many files such as .jpg, .mp3 and most .pdf files are static, simplifying the design considerably because we never need to distinguish one replica from another: these kinds of files naturally exhibit the property we call substitutability.
However, a small fraction of files are modifiable and evolve over time: for example, by being edited by a user or updated by an application at different times, or worse, by multiple users or applications at the same time, in which case some mechanism is required to resynchronize the replicas so they can once again become substitutable.
A constant cycle of specialization to distinguish
replicas, followed by resynchronization to make
them substitutable (or indiscernible) once again,
represents a core mechanism in our distributed
repository that, if we can guarantee its correctness,
enables our data to appear smarter. This is
discussed in more detail in the next section on
persistence and change.
The issue is that users wish, as far as possible, to
make all replicas indiscernible, because that allows
them to focus their attention on only a single entity
(a singular file), treating all its replicas as one;
thereby reducing the toll on their attention and
cognitive energy when dealing with data.
Databases present yet another identity perspective.
11 Tim Gibson, Ethan Miller, Darrell Long. “Long-term File Activity and Inter-Reference Patterns”. Proc. 24th Intl Conf. on Technology Management and Performance Evaluation of Enterprise-Wide Information Systems, Anaheim, CA, December 1998, pp 976-987.
3.7 Implementation
We can take this one step further and, from the perspective of our local computer, treat the replica of the file that exists on this machine as the proxy, or representative, of all the replicas there are in the entire system, no matter how many there may be; because if we operate on (change) any one of them, the system takes care of making sure that all of them are automatically updated without having to disturb the user.
From the perspective of the designer trying to achieve this objective while also optimizing system scalability, it is useful for local data structures to maintain knowledge only of how many remote replicas exist in the system, rather than separate pointers to each, which would require unnecessary maintenance traffic as the system creates, deletes and migrates replicas around the system.
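A minimal sketch of this design choice (the names are illustrative, not the author’s implementation): local metadata records only a count of remote replicas, so creation, deletion and migration elsewhere in the system never force pointer maintenance here.

```python
# A minimal, illustrative sketch: track how many remote replicas exist, not where
# they are; the local replica acts as proxy for all of them.

from dataclasses import dataclass

@dataclass
class LocalReplicaRecord:
    object_id: str
    local_path: str               # this machine's replica, the user-facing proxy
    remote_replica_count: int = 0

    def replica_created(self) -> None:
        self.remote_replica_count += 1

    def replica_deleted(self) -> None:
        self.remote_replica_count -= 1

    # Migration of a replica between remote nodes changes nothing locally:
    # the count stays the same, so no maintenance traffic is needed here.

rec = LocalReplicaRecord("file-42", "/data/file-42")
rec.replica_created()              # a new remote replica appears somewhere
print(rec.remote_replica_count)    # 1
```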
Being able to identify and consistently refer to an
entity is an essential capability for system designers
who design them, and administrators who manage
them, but these needs are at odds with knowledge
warriors who associate names with things they wish
to manipulate, and need multiple namespaces only
to disambiguate the same name used for different
objects (usually in different work contexts).
Identifying entities in different ways, and at different
levels in a hierarchy may make a product easier to
design, but its subsequent operation may become
more difficult. Indiscernibility is to the user what information hiding [Parnas] is to the programmer.
3.8 Philosophy
Philosophy teaches us that this problem is already wicked, and is related to the Identity of Indiscernibles12. This principle states that no two
distinct objects or entities can exactly resemble
each other (Leibniz's Law) and is commonly
understood to mean that no two objects have
exactly the same properties. The Identity of
Indiscernibles is of interest because it raises
questions about the factors that individuate
qualitatively identical objects. This problem applies
to the identity of data and what it means to have
many substitutable replicas, as much as it does to
Quantum Mechanics13.
This issue is of great importance to the complexity of humans interacting with their data, because it is unnecessary for a human to expend attention (or cognition) on any more than a single entity, no matter how many copies exist, as long as s/he can assume that all the copies will eventually be made identical by the system. Thus (attention-conserving) information has gone up and (attention-consuming) entropy has gone down.
A related but independent principle, the “principle of indiscernibles”, states that if x is not identical to y,
then there is some non-relational property P such
that P holds of x and does not hold of y, or that P
holds of y and does not hold of x. Equivalently, if x
and y share all their non-relational properties, then
x is identical to y. In contrast, the “principle of the
indiscernibility of identicals” (Leibniz's Law), asserts
that if x is identical to y, then every non-relational
property of x is a property of y, and vice versa.
In its widest and weakest form, the properties
concerned include relational properties such as
spatiotemporal ones and self-identity. A stronger
version limits the properties to non-relational
properties (i.e., qualities), and would therefore imply
that there could not be, for example, two ball
bearings which are exactly similar. This is resolved
in our distributed repository by allowing the system
to maintain metadata that is not accessible to the
user, to enable the system to manage properties
that may be usefully different in different locations
between the cycles of specialization and
resynchronization
3.9 Properties of an Individual
In philosophy, there is a so-called ‘bundle’ view of
individuality, according to which an individual is
nothing but a bundle of properties. If all the replicas
were truly indistinguishable this would lead to a
violation of Leibniz’s principle of the Identity of
Indiscernibles, which expressed somewhat crudely,
insists that two things which are indiscernible, must
be, in fact, identical.
Electronically, it is possible to create many replicas
that have exactly the same set of bits, and are
therefore indistinguishable by looking at only the
object itself through different replicas. Thus,
distinguishability and individuality are conceptually
distinct from the perspective of the system, but
remain indiscernible to the user who prefers files to
be viewable and accessible as a singular entity.
There is one category of common error in which the user may wish to distinguish versions of files, so as to be able to select one of them to be “undeleted”. Versions of files represent a different (time) dimension of discernibility from replicas separated in space. Each version may have a multitude of replicas throughout the system which should still remain indiscernible, even though the versions may need to be temporarily distinguishable. As we can see, this problem is now starting to show signs of wickedness.
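A small illustrative sketch of the distinction (the structure is an assumption of ours, not the author’s): versions are discernible along the time dimension and may be exposed for undeletion, while the replicas of any one version remain indiscernible to the user.

```python
# A minimal, illustrative sketch: versions (time dimension) are user-visible,
# replicas of a given version (space dimension) are system-only and identical.

versions = {}          # version number -> content, shown to the user on undelete
replicas = {}          # (version number, location) -> content, system-only

def save(version: int, content: bytes, locations: list[str]) -> None:
    versions[version] = content
    for loc in locations:
        replicas[(version, loc)] = content   # all replicas of a version identical

def undelete(version: int) -> bytes:
    return versions[version]                 # the user picks a version, never a replica

save(1, b"draft", ["laptop", "server"])
save(2, b"draft v2", ["laptop", "server", "offsite"])
print(undelete(1))
```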
12 Gottfried Leibniz. Discourse on Metaphysics.
13 Stanford Encyclopedia of Philosophy. Identity and Individuality in Quantum Mechanics.
3.10 Indistinguishability
Can particles in quantum mechanics be regarded as individuals, just like tables, chairs and people? And what about their electromagnetic fields: are they individuals too?

According to the conventional quantum-mechanical view of physicists, they cannot: quantum particles, unlike their classical counterparts, must be regarded as indistinguishable 'non-individuals'. However, recent work has indicated that this may not be the whole story, and that some theories are consistent with the position that such particles can be taken to be individuals.

The metaphysical wickedness of the problem of identity and individuality applies to identity in quantum mechanics and to the identity of data in similar ways. In both cases, the history of the object must be taken into account in order to stand any chance of understanding the complete set of properties which define an individual. Just as in quantum mechanics, when the wave function collapses, history is erased, and all that exists is a single state which persists until the next event. The same is true of a file if the system doesn’t have versioning built in at a fundamental level. But keeping an indefinite number of versions of a file uses up a great deal of storage, in ways that create a great deal of unnecessary duplication. This explains the current fad for de-duplication.

If quantum particles are regarded as individuals, then Leibniz’s Identity of Indiscernibles is violated. For a detailed treatment of the problem of identity and individuality in physics, see the most recent book by Steven French and Décio Krause14.

4. Persistence & Change

Consider a static file that can be changed (e.g. edited) and once again become static. The sequence of events, from the perspective of a single user on a single computer, is:

1. A duration before any activity, during which the file F is closed (static)
2. An event which opens the file
3. The duration between when the file is opened, modified and closed (dynamic)
4. An event which closes the file
5. The file has now become F′ and enters another duration of inactivity (static)

Consider a single computer: the file is stored once on the main disk of the computer. A simple editing program opens the file, allows the user to make changes, and then re-writes that portion of the file back to the disk. The operation appears to the user to be irreversible: any data within the file that was deleted appears to be lost forever.

Consider two computers connected by a network: computer A and computer B. Assume all modifications are made on computer A. A replication program copies the file from A to B each time the file is modified and closed. Computer B can now be considered the “backup” of computer A. However, there may be a delay between when the file is closed on computer A and when the update is complete on computer B. This delay may be arbitrarily long if the network between the two computers is down.

Users ordinarily prefer to see a file only once, no matter how many times or where the system has it stored. However, if a file is stored by the user in multiple places (say in different organizational structures of some directory or namespace), they may prefer for it to behave as a single file (a kind of cross-reference): whichever location we pick the file up from, we can make operations on it15. This distinction, however, is orthogonal to making replicas in different locations indiscernible to the user. It may be necessary to seek guidance from a user as to whether such an object is merely a copy of an existing file or an entirely new file.

A similar semantic conflict can be seen when we drag and drop a file on a modern Operating System. If the drop is to another place on the same disk, it is considered a “move”. If it is to another disk, it is considered a “copy”. Having a single portal through which all a user’s files may be accessed, independent of any concept of the structure of disks behind it, eliminates this semantic conflict for the user, and enables a user to organize their logical namespace without regard to the organization of the physical storage devices.

What if a user created a crude “backup” by copying a folder to another disk? Would we want that backup copy to show up in the search results?
4.1 How do things persist?
Are digital objects, like material objects, spread out
through time just as they are spread out through
space? Or is temporal persistence quite different
from spatial extension? These questions lie at the
heart of metaphysical exploration of the material
world. Is the ship of Theseus the same ship if all the
planks were replaced as they decayed over
hundreds of years? Is George Washington’s axe
the same axe if the shaft was replaced five times
and the head twice?
14 Steven French and Décio Krause. Identity in Physics: A Historical, Philosophical, and Formal Analysis. Oxford University Press, 2006.
15 Filesystem hard-links and soft-links provide functionality for “connecting” references to the same file together.
4.2 Distributed Updates
The conventional view of change propagation is to take “the” primary copy of data and replicate it to a secondary copy. In a local environment this is called backup; over distance, to an alternative geographic site, it is called remote replication. Traditionally, there are two ways of doing this:
taking a snapshot of a data set (e.g. a volume) in a single indivisible action (often assisted through some copy-on-write (COW) scheme to minimize the time for which the primary data must be quiesced), or on a continuous basis, similar to the process of mirroring, but with a remote target.
Maintaining consistency of data in such an
environment is easy, providing we maintain in-order
(FIFO) delivery of the updates and there are no
alternate paths where packets related to some
operation on an object can overtake the others. A
single source to a single destination is
straightforward, providing the stream of updates
can be partitioned into disjoint atomic operations,
which is trivial for a single source and a single
destination, where the source has unique write
access privileges, and the link is 100% reliable.
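A minimal sketch of this single-writer case (illustrative, not the repository’s actual protocol): updates carry per-source sequence numbers and a destination applies them strictly in order, which keeps the replica consistent as long as delivery is FIFO and reliable.

```python
# A minimal, illustrative sketch of in-order (FIFO) update application for a
# single source and a single destination.

class Destination:
    def __init__(self):
        self.state = []            # the replica's contents (a log, for brevity)
        self.next_seq = 0          # next update expected from the single source
        self.pending = {}          # out-of-order arrivals held back (defensive)

    def receive(self, seq: int, update: str) -> None:
        self.pending[seq] = update
        while self.next_seq in self.pending:        # apply strictly in FIFO order
            self.state.append(self.pending.pop(self.next_seq))
            self.next_seq += 1

d = Destination()
for seq, u in [(0, "create"), (1, "append A"), (2, "append B")]:
    d.receive(seq, u)
print(d.state)    # ['create', 'append A', 'append B']
```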
The problem shows signs of wickedness when we combine our notion of substitutable replicas with a reliable method of change propagation. If the locking semantics of the file require it, only one user may have the file open for write access, and all other replicas of the (closed) file represent destinations. Many multicast schemes have been designed to tackle this problem.

However, if the file semantics allow multiple writers, then any replica may be open for write access and be a source of updates, with all others destinations. Now updates to the file can be created in any order, so the atomicity of the updates is critical.

This problem is often thought to be wicked, but is merely a Gordian knot: the core issue is to guarantee a FIFO path between all updaters (no overtaking!) and to build a mechanism that guarantees transactions will be atomic even through failure and healing of the system. The ordering problem can be expressed mathematically using lattice theory, and the healing with the help of some old, forgotten algorithms in network theory16.

4.3 Metaphysical Considerations

How do objects persist through change? In what sense is this file being edited now the same file as the one created yesterday, forming an earlier draft of this paper, even though it has been edited five times, copied twice, and there are three backups on other media?

According to Katherine Hawley17, there are three theories that attempt to account for the persistence of material objects: perdurance theory, endurance theory, and stage theory. The issues of identity and persistence in the definition of data objects may be even more complex than those in the material world: what happens when a data object has multiple digital (as opposed to material) replicas, and each of those replicas is bit-for-bit identical looking at the binary data in the files, but the physical blocks for each replica reside on different media, and in different logical blocks in the file systems which manage those media?

Perdurance theory suggests that objects persist through time in much the same way as they extend through space. Today’s file is the same as yesterday’s file because the file exists on the surface of a disk as a magnetically encoded entity through time, part of which existed yesterday and was 1000 words, and another part of which exists today, which is the same 1000 words with 5 of the original words modified and 250 more added.

By contrast, endurance theory suggests that the way objects persist through time is very different from the way they extend through space. Objects are three-dimensional, and they persist through time by being ‘wholly present’ at each time at which they exist. Today’s file is the same as yesterday’s shorter file because it is named the same and appears through the same namespace by which it is accessed: the file is ‘wholly present’ at each of the times at which it exists, and has different properties at some of those times.

The stage theory of persistence, proposed by Theodore Sider18, combines elements of the other two theories. It shares perdurance theory’s dependence on a four-dimensional framework, but denies that theory’s account of predication in favor of something like the endurance theory view of predication. It is not four-dimensional (three of space and one of time) objects which satisfy identity predicates like ‘is a file’ and which change by having parts that are the same as yesterday and other parts that are different today. Instead, it is the stable intermediate stages that make up the four-dimensional objects that are files with only 1000 words vs. files with 1250.

Stage theory could be a valuable model for a versioning file system that manages versions of a file as they change, like the VAX operating system OpenVMS: each version represents a stable “stage” of the file, which remains forever static; each time the file is edited, another “static” stage is created, and all the stages are saved, representing the history of that file.
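A minimal sketch of such a stage-flavored versioning store, with illustrative names of our own: every close creates a new immutable “stage”, the current file is simply the latest stage, and history is never erased.

```python
# A minimal, illustrative sketch of a stage-theory-flavored versioning store.

class VersionedFile:
    def __init__(self, name: str):
        self.name = name
        self.stages = []                      # immutable history of the file

    def close(self, content: bytes) -> int:
        """Each close appends a new static stage and returns its version number."""
        self.stages.append(bytes(content))    # copy: stages never change afterwards
        return len(self.stages) - 1

    def current(self) -> bytes:
        return self.stages[-1]                # "the file" is just the latest stage

    def at_version(self, v: int) -> bytes:
        return self.stages[v]                 # older stages remain recoverable

f = VersionedFile("paper.txt")
f.close(b"1000 words")
f.close(b"1250 words")
assert f.at_version(0) == b"1000 words" and f.current() == b"1250 words"
```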
16 Gafni & Bertsekas. Link Reversal Protocol.
17 Katherine Hawley. How Things Persist.
18 Theodore Sider. Four-Dimensionalism. Oxford University Press, 2001.
Stage theory’s perspective on the problem may help us design simpler, more reliable systems that preserve the consistency of data as it changes and reduce the cognitive overhead for its users.
The issues of persistence are somewhat orthogonal
to the issues of identity. There may be an arbitrary
number of replicas of each version. Indeed, we may
deliberately preserve more replicas of more recent
versions than older ones, which are less likely to be
needed. This hints at the need for a dynamic
mechanism for managing replicas, which enables
competition with other dimensions of importance in
an “information lifecycle” for Smart Data.
However, which theory is applicable may depend
on who is the “observer”. An administrator may
prefer an endurance theory view, whereas a user
the perdurance view. More interestingly, how do we manage the users’ desire for indiscernibility when it comes to versions? Under normal circumstances, all versions should be hidden behind one object: the file. However, when a user needs access to an
earlier history of a file, to recover from, for example,
the accidental deletion of the file or some portion of
it, how do we temporarily switch back to an
endurance view so that recovery can take place?
Maybe stage theory can provide an answer?
The whole concept of an “observer” is a major
wicked problem that delves into the very heart of
Quantum Mechanics, and the theory of causality,
but we don’t have time to deal with that here19.
4.4 Real Problems
In the modern digital age, we address this problem
somewhat less ontologically, but the complexities of
managing change tend to exhibit wickedness when
we try to solve real problems. A synchronization program (for example rsync) is able to synchronize directories on two machines in only one direction. Andrew Tridgell’s rsync20 algorithm
shows a masterful perspective on the notion of
identity in the context of Smart Data. Unison21 uses
rsync and does the best it can to synchronize in
both directions, but creates false positives which
require administrative or user intervention to
resolve. Harmony is an attempt to overcome these
human attention overheads, by using ideas from
programming languages to predefine how data can
be merged automatically.
19 Judea Pearl. Causality: Models, Reasoning and Inference. Cambridge University Press, 2000.
20 Andrew Tridgell. “Efficient Algorithms for Sorting and Synchronization”. Ph.D Thesis, Australian National University, February 1999.
21 Unison is a file-synchronization tool which allows a collection of files and directories, stored on different hosts and modified separately, to be brought up to date by synchronizing them at a later time. http://www.stanford.edu/~pgbovine/unison_guide.htm
There is no universal solution to the synchronization problem (the issue is what to do if a replica is modified differently on two sides of a partitioned network, and the results are not mergeable). However, there are solutions that can be associated with specific file types and applications. In general, applications have their own file structure semantics, and these may be proprietary to the company which owns the application.
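Version vectors are one widely used technique (not necessarily the author’s) for detecting exactly this situation; a minimal sketch:

```python
# A minimal, illustrative sketch of conflict detection with version vectors:
# if neither replica's vector dominates the other, the replicas diverged during
# the partition and cannot be merged without type-specific (or human) help.

def dominates(a: dict, b: dict) -> bool:
    """True if vector a has seen everything b has seen."""
    return all(a.get(node, 0) >= count for node, count in b.items())

def compare(a: dict, b: dict) -> str:
    if dominates(a, b) and dominates(b, a):
        return "identical"
    if dominates(a, b):
        return "a supersedes b"
    if dominates(b, a):
        return "b supersedes a"
    return "conflict: concurrent updates on both sides of the partition"

left  = {"laptop": 3, "server": 1}    # edited on the laptop while disconnected
right = {"laptop": 2, "server": 2}    # edited on the server at the same time
print(compare(left, right))           # conflict: concurrent updates ...
```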
Harmony provides an automated way of merging XML data, so that the need for human attention is diminished. This leads the way to a “plug-in” architecture that addresses the synchronization needs of each application/data type independently. This may overcome some of the Berlin walls of proprietary ownership, or encourage vendors to adopt public standards to enable interoperability. Either way, a plug-in architecture will help.
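A minimal sketch of such a plug-in registry, with illustrative names and a trivial placeholder standing in for a Harmony-style structural merge:

```python
# A minimal, illustrative sketch of a per-type merge plug-in architecture:
# synchronization consults a registry keyed by file type, falling back to the
# user only when no automatic merger is available.

MERGERS = {}    # file extension -> three-way merge function

def register_merger(extension: str):
    def wrap(fn):
        MERGERS[extension] = fn
        return fn
    return wrap

@register_merger(".xml")
def merge_xml(base: str, left: str, right: str) -> str:
    # Placeholder for a Harmony-style structural merge: trivially take whichever
    # side changed, and give up if both did.
    if left == base:
        return right
    if right == base:
        return left
    raise RuntimeError("both sides changed: needs a real structural merge")

def synchronize(extension: str, base: str, left: str, right: str) -> str:
    merger = MERGERS.get(extension)
    if merger is None:
        raise RuntimeError("no plug-in for this type: ask the user")
    return merger(base, left, right)

print(synchronize(".xml", "<a/>", "<a/>", "<a b='1'/>"))   # <a b='1'/>
```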
Synchronizing replicas post facto represents only
one method of propagating changes among
replicas. In a distributed shared memory system at
the processor/cache level, every single memory
transaction is interleaved/propagated in such a way
that real-time interleaving of memory locations can
be achieved. This can be done for files also, but
requires a more sophisticated technique than post
facto synchronization, which can be considered
analogous to continuous replication, as opposed to
snapshot backups. Although we don’t have time to
go into this here, the solutions are similar in
principle, and simply require finer-grained atomic
operations that can still be guaranteed in the
presence of failures and healing operations on the
system’s topology.
This is also the basis on which distributed
databases address their issues of data identity and
persistence through transactions and ACID
properties, which brings us to the wicked problem
of “time and causality” which is the subject of the
next section.
Identity, persistence, and substitutability (or
substitutivity) are three wicked problems which
have begun to appear in data storage, but which
have a distinguished philosophical history22.
22 Cartwright, R. “Identity and Substitutivity”, in M. K. Munitz (ed), Identity and Individuation (1971), New York University Press.
5. Time & Causality
5.1 What is Time?
“A measure of change”
~ Aristotle

“It is impossible to meditate on time without an overwhelming emotion at the limitations of human intelligence”
~ Alfred North Whitehead

“A stubbornly persistent illusion”
~ Albert Einstein

“Beyond all day-to-day problems in physics, in the profound issues of principle that confront us today, no difficulties are more central than those associated with the concept of time”
~ John Archibald Wheeler
A relationship with time is intrinsic to everything we
do in creating, modifying and moving our data; yet
an understanding of the concept of time among
computer scientists appears far behind that of the
physicists and philosophers. This state of affairs is
of concern because if fundamental flaws exist in the
time assumptions underlying the algorithms that
govern access to and evolution of our data, then
our systems will fail in unpredictable ways, and any
number of undesirable characteristics may follow.
For the majority of knowledge warriors going about their daily lives, failing to understand the subtleties of time is forgivable: our culture and language are very deeply biased toward temporal concepts23. Almost all sentences we utter encode time in tense. Our everyday experiences incessantly affirm a temporal ordering, implying both a singular direction of temporal processes and a seductive distinction between “before” and “after”.

It is not surprising, therefore, that Leslie Lamport’s seminal paper24,25 defined the “happened before” relation, and through that, a system of logical timestamps that many computer scientists use as a crutch to sweep their issues with time under the rug. Unfortunately, the notion of “happened before” is utterly meaningless unless it is intimately associated with “happened where”.

Computer scientists and programmers frequently base their designs for distributed algorithms on an implicit assumption of absolute (Archimedean) time. This, plus other implicit assumptions including the concepts of continuous time and the flow of time, merits closer examination; because if our data were to be elevated to a true level of “smartness”, then we should allow no excuse for designers to get it wrong. Time is the most critical area in which to ask hard questions of the computer scientists and programmers whose algorithms govern our relationship with our evolving data.

23 David Ruelle. “The Obsessions of Time”. Comm. Mathematical Physics, Volume 85, Number 1 (1982).
24 Leslie Lamport. “Time, Clocks and the Ordering of Events in a Distributed System”. Communications of the ACM, Vol. 21, No. 7, pp. 558-565, July 1978.
25 Rob R. Hoogerwoord. Leslie Lamport’s Logical Clocks: a Tutorial. 29-Jan-2002.

According to Bill Newton-Smith26, time may be defined as a system of “temporal items”, where “temporal items” are understood to be things like instants, moments and durations. This he describes as unenlightening since, even granting its truth, we are accounting for the obscure notion of time in terms of the equally if not more obscure notions of instants, moments and durations. Wicked!

Julian Barbour27 describes his informal poll at the 1991 international Workshop on Time Asymmetry:

The question posed to each of the 42 participants was as follows: “Do you believe that time is a truly basic concept that must appear in the foundations of any theory of the world, or is it an effective concept that can be derived from more primitive notions in the same way that a notion of temperature can be recovered in statistical mechanics?”

The results: 20 participants said that there was no time at a fundamental level, 12 declared themselves undecided or abstained, and 10 believed time did exist at the most basic level. However, among the 12 in the undecided/abstain category, 5 were sympathetic to or inclined toward the belief that time should not appear at the most basic level of theory.
Since then, many books, scientific papers and popular articles have appeared with a similar theme: that time is an emergent (or derived) property of the relationships between entities, rather than a fundamental aspect of reality – a concept referred to as the background-independent assumption of the universe (for both space and time)28,29.
Some of the most compelling arguments for a
reappraisal of time are set forth by Huw Price 30, a
philosopher from the University of Sydney, whose
penetrating clarity and ruthless logic puts many of
the world’s best physicists to shame.
In October 2007, Brian Greene hosted a conference at Columbia University to discuss this mystery of time. Telescope observations and new thinking about quantum gravity have convinced the participants that it is time to re-examine time31.
26 Bill Newton-Smith. The Structure of Time.
27 Julian Barbour. The End of Time. Oxford University Press, 1999.
28 Carlo Rovelli. Quantum Gravity. Cambridge University Press, 2005.
29 Lee Smolin. Three Roads to Quantum Gravity. London, 2001.
30 Huw Price. Time’s Arrow & Archimedes’ Point. 1997.
31 Scott Dodd. “Making Space for Time”. Scientific American, January 2008. pp 26-29.
One conference attendee, MIT’s Max Tegmark, said "we've answered classic questions about time by replacing them with other hard questions”. Wicked!
As far as computers and our data objects are concerned, the ordering of events and the atomicity of operations on those objects are critical to the correctness of our algorithms, and thus to the behavior and reliability we can expect from our data.
Distributed algorithms are difficult enough to
comprehend, but if they are based on fundamental
misconceptions about the nature of time, and those
misconceptions can undermine, for example, the
ordering of events that the algorithm depends upon,
then we can no longer trust these algorithms to
guard the safety or consistency of our data. Let us
now consider some common misconceptions:
5.2 Simultaneity is a Myth
We have heard that the notion of an absolute frame
of reference in space or time is fundamentally
inconsistent with the way the Universe works. In
1905 Einstein showed us that the concept of “now”
is meaningless except for events occurring “here”.
By the time they have been through graduate school, most physicists have the advantage of a visceral understanding, gained from Special Relativity (SR) and General Relativity (GR), that time necessarily flows at different rates for different observers. Those trained primarily in other disciplines, despite a passing acquaintance with SR, may not be so fortunate. Those with only a vague recollection of the Lorentz transformations or Minkowski space may find that insufficient to render a correct intuition when thinking about the relativity of simultaneity32.
Simplistically, what this means for processes which
cause changes to our data is that the same events
observed from different places will be seen in a
different order, or that “sets” of events originating
from different sources can be arbitrarily interleaved,
at potential destinations (preserving individual
source stream order if no packets are lost)33.
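As a concrete illustration of the preceding claim (a sketch of mine, not from the paper): two source streams that each preserve their own internal order can still arrive at a destination in any of several interleavings.

```python
# Sketch: all orders in which a destination may observe two independent
# source streams, preserving each source's internal order.
def interleavings(xs, ys):
    if not xs:
        return [list(ys)]
    if not ys:
        return [list(xs)]
    return [[xs[0]] + rest for rest in interleavings(xs[1:], ys)] + \
           [[ys[0]] + rest for rest in interleavings(xs, ys[1:])]

p = ["p1", "p2"]          # events from source P, in P's order
q = ["q1", "q2"]          # events from source Q, in Q's order
for order in interleavings(p, q):
    print(order)          # six valid observation orders at the destination
```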
What is frequently not understood, however, is that
the very nature of simultaneity, as an instant in time
in one place, has no meaning whatsoever at
another place in space34. Not only is our universe
devoid of any notion of a global flow of time, but we
cannot even synchronize our clocks, because the
notion of an inertial frame, which might cause us to believe our clocks can theoretically run at the same rate, is infeasible: our computers reside on the surface of a rotating sphere, in a gravitational field, orbiting a star, and are connected by networks whose latencies follow long-tailed stochastic distributions.

32. Rachel Scherr, Peter Shaffer, & Stamatis Vokos. “Understanding of time in special relativity: simultaneity and reference frames”. 2002. arXiv:physics/0207109v1.
33. Strictly speaking, we don’t need SR to understand this observer-dependent reordering of events at a destination; we need only a concept of a finite propagation velocity of communication. SR is what makes this problem wicked.
34. Kenji Tokuo. “Logic of Simultaneity”. October 2007. arXiv:0710.1398v1.
If we ascribe an event a “tag” of an instant in time,
then whether that specific instant is in the future or
past depends on the observer. SR alone denies the
possibility of universal simultaneity, and hence the
possibility of the identification of a specific universal
instant. GR nails this coffin shut.
Fundamentally, this means that an instant of time in
one computer can have no relation to another
instant of time in another computer, separated by
any arbitrary distance, no matter how small. In
practice, we make believe that our computers are
slow relative to the effects of SR and GR, that they
are all in a constant state of unaccelerated motion
with respect to each other, and that their
communication channels can be approximated to
Einstein’s light signals in the concept of an inertial
frame. But this is an assumption that merits serious
scrutiny. Lamport gets around this by tagging not
instants of time, but events in a linearly ordered
process. Unfortunately programmers do not
universally understand this. But what is an event?
We should get nervous any time we hear a
computer scientist discuss notions of synchronous
algorithms, simultaneous events, global time or
absolute time35.
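Lamport’s logical clocks, mentioned above, tag events rather than instants of time. The following is a minimal sketch of the standard rules; the class and method names are illustrative, mine rather than the paper’s.

```python
# Minimal sketch of Lamport's logical-clock rules: increment on each local
# event, and on receipt take max(local, received) + 1. Timestamps order
# events consistently with "happened before", with no global clock at all.
class LamportClock:
    def __init__(self):
        self.time = 0

    def local_event(self):
        self.time += 1
        return self.time

    def send(self):
        self.time += 1
        return self.time            # timestamp carried on the message

    def receive(self, msg_time):
        self.time = max(self.time, msg_time) + 1
        return self.time

p, q = LamportClock(), LamportClock()
p.local_event()                     # P: 1
t = p.send()                        # P: 2, message carries 2
q.local_event()                     # Q: 1
print(q.receive(t))                 # Q: max(1, 2) + 1 = 3
```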
5.3 Time is not continuous
Let’s now consider the Aristotelian view that time is empirically related to change. Change is a variation or sequence of occurrences36. In relativity, a sequence of occurrences is replaced by a sequence of spacetime events. An event is an idealization of a point in space and an instant in time. The concept of an instant (as well as that of duration or interval) is also wicked: Peter Lynds suggests there is no such thing as an indivisible moment in time37.
Duration is an ordered set of instants, not the sum of instants, because a duration is infinitely divisible into further durations, never into an instant. According to Zeno, between any two infinitesimally neighboring instants, an infinity of instants exists.
35. The term “absolute time” is used six times in Lamport’s paper: Leslie Lamport. “Using Time Instead of Timeout for Fault-Tolerant Distributed Systems”. ACM Transactions on Programming Languages and Systems, Vol. 6, No. 2, April 1984, pp. 254-280.
36. Francisco S.N. Lobo. “Nature of Time and Causality in Physics”. arXiv:0710.0428v1.
37. Peter Lynds, on Zeno’s paradoxes and the non-existence of instants.
This gives us our first hint that time may not be a linear continuum (unless we consider that the Planck time, ~10⁻⁴³ s, will come to our rescue as a theoretical limit to this infinite divisibility). But then we would need to get into quantum gravity (as if this problem were not already wicked enough).
5.4 Time does not flow
It appears to us that time flows: events change from
being indeterminate in the future to being
determinate in the past. But the passing of time cannot be perceived directly in the way matter and space can; one can perceive only irreversible physical, chemical, and biological changes in physical space, the space in which material objects exist. On the basis of human perception we can conclude that physical time exists only as a stream of change that runs through physical space. The important point is: change does not “happen” in physical time; change itself is physical time. This is different from the conventional perspective, in which spacetime is the theater or “stage” on which physical change happens.
Nothing in known physics corresponds to the
passage of time. Indeed, many physicists insist that
time doesn't flow at all; it merely is (what Julian
Barbour calls Platonia). Philosophers argue that the
very notion of the passage of time is nonsensical
and that talk of the river or flux of time is founded
on a misconception. Change without Time38 is the only known alternative free of these conflicts.
But what is change? An event? So a series of events constitutes a flow, which we can call time? Not quite, my dear Josephine: events reflect something that happens in spacetime (and space is just as important as time in this context). Events represent interactions between two entities (or is it more? That is an even more wicked problem) in which “something” is exchanged. Otherwise, events could “happen” all the time (forgive the pun) without any apparent consequence.
There is no more evidence for the existence of
anything real between one event and another than
there is for an aether to support the propagation of
electromagnetic waves in an empty space.
Time, like space, is now believed by the majority of professional physicists to be a derived concept, an emergent property of the universe, something like the temperature or pressure of a gas in equilibrium. Is it not time for computer science to catch up with this revolution in physics and philosophy?
38. Johannes Simon. “Change without Time: Relationalism and Field Quantization”. Ph.D. dissertation, Universität Regensburg, 2004. (Note from Paul: one of the best Ph.D. theses I have ever read!)
The time we think we measure, as well as our units of distance, are based on the speed of light as a universal constant. This can be hard to understand when we define “speed” as the derivative of distance with respect to time39. But then, the bureau of standards defines space (the metre) as “the distance traveled by light in vacuum during 1/299,792,458 of a second”.
Aristotle identified time with change. For time to
occur, we need a changing configuration of matter.
“In an empty Universe, a hypothetical observer
cannot measure time or length”40.
Intuitively, like Aristotle, we verify that a notion of
time emerges through an intimate relationship to
change, and subjectively may be considered as
something that flows in our day-to-day interactions
with other humans. However, this is ludicrously
imprecise when we discuss interactions of our
computers with multi-GHz processors & networks.
One way to usefully (for our purposes) interpret this
is to recognize “events” in spacetime as the causal
processes that create our reality. Setting the
direction of time aside for the time being, we could
call such events “interactions” because at the
fundamental level, an event is a process where two
things are interacting and exchanging “something”.
This corresponds to the theory of Conserved
Quantities (CQ)41 in the philosophical debate on
causality, or the theory of Exchanged Quantities
(EQ) described below.
5.5 Time has no Direction
Physical time is irreversible only at macroscopic
scales. Change A transforms into change B, B
transforms into C and so on. When B is in existence
A does not exist anymore, when C is in existence B
does not exist anymore. Here physical time is
understood as a stream of irreversible change that
runs through physical space. Physical space itself
is atemporal.
Irreversible processes that capture “change”, like a ‘probability ratchet’ that prevents a wheel from turning backwards, are the engines of our reality. They make changes persistent, even though at the most basic level time has no intrinsic direction.
Time can be formally defined in quantum
mechanics with respect to its conjugate: energy. A
photon with energy h (Planck’s constant) corresponds to an oscillation of one cycle per second. In quantum mechanics, as in classical mechanics, there is no intrinsic direction of time; all our equations are temporally symmetric. The direction of time that we perceive is based on information42, and can be seen only at macroscopic scales, in the form of the second law of thermodynamics (entropy43).

39. A bit of trivia: we use the word “speed” when referring to the speed of light, instead of “velocity”, because velocity is a vector with a direction, whereas light travels at the same maximum speed in all directions.
40. Francisco S.N. Lobo. “Nature of Time and Causality in Physics”. arXiv:0710.0428v1.
41. Phil Dowe. “The Conserved Quantity Theory of Causation and Chance Raising”.
An assumption of monotonically increasing global
time instants creates a problem for computer
scientists, who must now reconcile transactions,
concurrency and interactions at nanosecond scales
(about one foot at the speed of light), and yet our
computers already involve interactions where
picoseconds are relevant, and our computers must
necessarily be more than a foot apart.
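To put rough numbers on “about one foot at the speed of light” (a back-of-the-envelope check, not from the paper):

```python
# Back-of-the-envelope: how far light travels in one clock-relevant interval.
c = 299_792_458           # speed of light, metres per second
print(c * 1e-9)           # ~0.30 m per nanosecond (roughly one foot)
print(c * 1e-12 * 1000)   # ~0.30 mm per picosecond
```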
The reduction of the wave function is essentially time-asymmetric. And although it can be considered an irreversible process at the atomic level, it does not necessarily set a “direction of time” (a distinction that, unfortunately, escaped the Nobel laureate Ilya Prigogine44).
The idea that process interactions can be
symmetric with respect to time is independent of
the fact that sometimes an interaction may be
irreversible, in the sense that all knowledge of what
happened before it has been lost, and therefore the
operation cannot be undone in that direction again.
This idea relates closely to the concept of
“transactions” on data.
Atomic transactions (in the database and file
system sense) therefore, are to computer scientists
what irreversible processes are to chemists and
biologists. They represent ratchets on the change
of state. But built into the notion of transactions (the
rollback) there exists a concept of time reversal.
This is what makes the notions of atomicity,
linearizability and serializability a wicked problem.
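A minimal sketch (mine, not the paper’s design) of this ratchet-and-rollback view of transactions: an undo log keeps the change reversible until commit, and commit discards the record of “before”.

```python
# Sketch of the "ratchet" view of atomic transactions: an undo log makes a
# transaction reversible until commit; commit discards the log and the
# change becomes an irreversible (ratcheted) step. Names are illustrative.
class Transaction:
    def __init__(self, store):
        self.store = store
        self.undo_log = []           # prior values, most recent last

    def write(self, key, value):
        self.undo_log.append((key, self.store.get(key)))
        self.store[key] = value

    def rollback(self):
        # Bounded time reversal: replay the undo log backwards.
        for key, old in reversed(self.undo_log):
            if old is None:
                self.store.pop(key, None)
            else:
                self.store[key] = old
        self.undo_log.clear()

    def commit(self):
        # The ratchet clicks: the record of "before" is discarded.
        self.undo_log.clear()

store = {"balance": 100}
tx = Transaction(store)
tx.write("balance", 90)
tx.rollback()
print(store)                         # {'balance': 100} -- change reversed
```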
Even the notion of an “algorithm” (derived from the
same conceptual underpinning as the Turing
Machine), is incomplete if we don’t at least specify
a place where the time “event” occurs. But how do
we specify a place, when everything is relative?
Lamport did this implicitly by defining events within
a “process” (by which, we assume he means within
a single spatially constrained entity, even if it is in
constant relative motion). But then the whole
concept of a “distributed process” is flawed.
What this may mean is that we may be better off with a notion of interactive agents, which relate to
each other as independent and autonomous entities, responding only to locally received events.

42. Information is equal to the number of yes/no answers we can get out of a system.
43. Entropy is a measure of missing information.
44. Ilya Prigogine. “Time, Structure and Fluctuations”. Nobel Lecture, December 1977.
The now thoroughly discredited Archimedean view of absolute time is an attempt to preserve a “God’s Eye View” (GEV) of the processes in our world or among our computer systems.
If our concept of the flow of time is fundamentally
flawed, and we can no longer create a logically
consistent framework from notions of instants and
durations, then what can we count on? Events?
5.6 The Theory of Exchanged
Quantities (EQ)
Atoms interact with each other directly, through the
exchange of photons. In the words of Richard
Feynman “The sun atom shakes; the atom in my
eye shakes eight minutes later because of a direct
interaction across space”. However, from the
perspective of a photon, the connection between
the atoms in the sun and my eye is instantaneous –
proper time for a photon is always zero! No matter
how many billions of light years it travels through
our universe to reach our eyes or the detectors in
our telescopes. Radiation, it appears, requires both
a sender and a receiver. Maybe this is an event?
Events? But what are they? Maybe they are related
to what John Cramer45 describes in his theory of
transactional exchanges in Quantum Mechanics,
which is related to our concept of exchanged
quantities (EQ), described as follows:
Taking the notion of transactions all the way down to the atomic level, and then reflecting that notion back up to the way “interactions” work between computers, begins to shed some light on a possible new way to design reliable distributed systems. Always requiring some “quantity” to be exchanged (even if it is a packet in the buffer of a sending agent swapped for a “hole” in the buffer of a receiving agent) starts to give us the notion of time-reversible interactions, in the same way that photons tie together two atoms in space.
It is not hard to see that an interaction between
computers requires the presence of both a sender
and a receiver, just like radiation does. Exchanged,
or conserved quantities would appear to be exactly
what we need to deal with distributed system
problems such as locks, transactions, and
deadlocks.
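One way to read the packet-for-a-“hole” exchange described above is as credit-based flow control, where a packet moves in one direction only when a buffer slot (a credit) moves in the other. The sketch below is mine, under that reading; the Agent and exchange names are illustrative.

```python
# Sketch of an "exchanged quantity" between two agents: a packet moves one
# way only when a buffer "hole" (a credit) moves the other way, so every
# interaction requires both a sender and a receiver, and the total of
# packets delivered plus holes remaining is conserved.
from collections import deque

class Agent:
    def __init__(self, name, buffer_size):
        self.name = name
        self.inbox = deque()
        self.holes = buffer_size      # free slots this agent can offer

def exchange(sender_queue, receiver):
    """Transfer one packet iff the receiver can give back a hole."""
    if not sender_queue or receiver.holes == 0:
        return False                  # no interaction: nothing is exchanged
    receiver.holes -= 1               # hole flows from receiver to sender
    receiver.inbox.append(sender_queue.popleft())  # packet flows forward
    return True

a_out = deque(["pkt1", "pkt2", "pkt3"])
b = Agent("B", buffer_size=2)
while exchange(a_out, b):
    pass
print(list(b.inbox), b.holes)         # ['pkt1', 'pkt2'] 0 -- blocked, not lost
```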
A theory of exchanged quantities represents a
concrete way of modeling reality: directly applicable
to data storage, by connecting computer science to
current knowledge in physics and philosophy.
45. John G. Cramer. “The Transactional Interpretation of Quantum Mechanics”. Physical Review 22 (1980).
5.7 Consistency & In-order Delivery
Causal ordering of messages plays an essential
role both in ensuring the consistency of data
structures and unlocking the inherent parallelism
and availability in a distributed system. Ensuring
causal ordering without an acyclic communication
substrate has a long and checkered history.
Variations on Lamport’s logical timestamps are the
most frequently proposed mechanism to address
this issue. The fundamental problem is frequently
described as “maintaining causal order” of
messages. Logical timestamps attempt to do this by
tagging each relevant event in a system with a
monotonically increasing variable or a vector of perprocess variables, and comparing these variables
when messages are received in-order, to detect
“causal violations” when they have occurred, but
before they can affect the consistency of data.
Much of the theory and mechanism for logical
timestamps is common to version vectors used
when replicated file systems are partitioned.
The idea behind logical timestamps initially appears
reasonable, but every implementation so far has
proven to be intolerably complex when applied to
real systems that scale and can fail arbitrarily46,47.
Timestamp variables overflow and every method
attempted so far to resolve this issue simply adds
yet more wicked complexity. A single logical
(scalar) timestamp imposes a total order on the
distributed system that severely constrains the
available concurrency (and hence scalability) of the
system. Vector timestamps provide a preconfigured
per-process vector which constrains the dynamic
nature of the system. Transactions cannot be
undone (backed out) unless a history of timestamps
is maintained by each node and transmitted with
each message (matrix timestamps), the overhead
of which is so huge that this scheme is rarely used
except perhaps in simulations, where the system
under test can be severely bounded.
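For contrast with the scalar case, here is a minimal sketch (mine, not from the paper) of how vector timestamps are compared to distinguish causal order from concurrency; the fixed vector length is the preconfigured per-process limitation noted above.

```python
# Sketch: comparing vector timestamps. v1 happened-before v2 iff every
# component of v1 is <= the corresponding component of v2 and the vectors
# differ; if neither dominates, the events are concurrent. The fixed
# vector length is the preconfigured per-process constraint.
def happened_before(v1, v2):
    return all(a <= b for a, b in zip(v1, v2)) and v1 != v2

def concurrent(v1, v2):
    return not happened_before(v1, v2) and not happened_before(v2, v1)

e1 = [2, 1, 0]      # an event observed in a three-process system
e2 = [3, 1, 0]      # causally after e1
e3 = [2, 0, 1]      # neither before nor after e1
print(happened_before(e1, e2), concurrent(e1, e3))   # True True
```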
Logical timestamps represent an attempt to re-synthesize a God’s Eye View (GEV) of “logical time”. Our experience is that they introduce intolerable complexity and brittleness when deployed in scalable storage systems.
Just as wait-free programming48 represents an
alternative to conventional locks (which are prone
to loss and deadlock), the ordering of events can be
articulated much more simply in the network, rather
than attempting to recreate a sense of logical global
time at the message senders.
The figure above shows how they work: three processes (P, Q, R) are distributed in space and communicate by sending messages to each other over their respective communication channels (indicated by the diagonal lines, whose angles roughly represent the speed of transmission). Each process marks its own events with a monotonically increasing integer (the logical timestamp), and keeps track of the messages received from the other processes, which conveniently append their own logical timestamps. In this way, each process can ascertain what state the other processes are in, based on the messages it receives. The figure shows clearly that the causal history, and the future effect, of the two events labeled 32 and 24 represent distinctly different perspectives of what is going on.
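A small sketch in the spirit of the figure just described (the process and method names are illustrative, mine rather than the paper’s): each process stamps its own events, appends its timestamp to outgoing messages, and records the latest stamp heard from each peer, so no two processes hold the same view of the history.

```python
# Sketch in the spirit of the figure: each of P, Q, R stamps its own events
# with a local counter and records the latest counter value it has heard
# from each peer. No process sees the same "global" picture as another.
class Process:
    def __init__(self, name, peers):
        self.name = name
        self.counter = 0
        self.known = {p: 0 for p in peers}   # latest timestamp heard per peer

    def step(self):
        self.counter += 1                    # a local event

    def send(self):
        self.counter += 1
        return (self.name, self.counter)     # message carries (sender, stamp)

    def receive(self, message):
        sender, stamp = message
        self.counter = max(self.counter, stamp) + 1
        self.known[sender] = max(self.known[sender], stamp)

P = Process("P", ["Q", "R"])
Q = Process("Q", ["P", "R"])
R = Process("R", ["P", "Q"])

Q.step(); m = Q.send(); P.receive(m)         # Q tells P something
R.step(); m = R.send(); Q.receive(m)         # R tells Q something
print(P.known, Q.known, R.known)             # three different world views
```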
These problems with time and concurrency go far
beyond distributed storage. The processor industry
is in a crisis: Multicore processors have become
necessary because of the clock frequency wall
related to power dissipation. Now we have reached
the point that we “have to” go multi-core, and
nobody’s thought about the software49. Computer
scientists have failed to produce the theoretical
models for time and concurrency, as well as the
software tools needed to use these multiple cores.
There hasn’t been a breakthrough idea in parallel programming for a long time. We are now going to have to invest in the research on new models for time and concurrency that we should have made ten years ago, in order to be able to utilize those processors. This problem only gets worse as we add more cores per processor.
46. Reinhard Schwarz, Friedemann Mattern. “Detecting Causal Relationships in Distributed Computations: In Search of the Holy Grail”. Distributed Computing, Vol. 7, No. 3 (1994), pp. 149-174.
47. David R. Cheriton, Dale Skeen. “Understanding the Limitations of Causally and Totally Ordered Communication”.
48. Maurice Herlihy. “Wait-free synchronization”. ACM Transactions on Programming Languages and Systems, 13(1):124-149, January 1991.
49. John Hennessy. Keynote speech to the CTO Forum, Cisco, November 2007.
5.8 Causality
I wish, first, to maintain that the word “causality” is so inextricably bound up with misleading associations as to make its complete extrusion from the Computer Science vocabulary desirable; secondly, to inquire what principle, if any, is employed in physics in place of the supposed “law of causality” which computer scientists imagine to be employed; and thirdly, to exhibit certain confusions, especially in regard to teleology and determinism, which appear to me to be connected with erroneous notions as to causality.
Computer scientists imagine that causation is one of the fundamental axioms or postulates of physics,
yet, oddly enough, in real scientific disciplines
such as special and general relativity, and
quantum mechanics, the word “cause” never
occurs. To me it seems that computer science
ought not to assume such legislative functions, and
that the reason why physics has ceased to look for
causes is that in fact there are no such things. The
law of causality, I believe, like much that passes
muster among computer scientists, is a relic of a
bygone age, surviving, like a belief in God50, only
because it is erroneously supposed to do no harm.
~Paul Borrill (with apologies to Bertrand Russell)
The conventional notion of causality is: given any
two events, A and B, there are three possibilities:
either A is a cause of B, or B is a cause of A or
neither is a cause of the other. This “traditional”
distinction within the notion of causality is built into
Lamport’s happened before relation.
In philosophy, as in the management of data, cause
and effect are twin pillars on which much of our
thought seems based. But almost a century ago,
Bertrand Russell declared that modern physics
leaves these pillars without foundations. Russell's revolutionary conclusion was that ‘the law of causality is a relic of a bygone age, surviving, like the monarchy, only because it is erroneously supposed to do no harm’51.
Russell's famous challenge remains unanswered.
Despite dramatic advances in physics, this past
century has taken us no closer to an explanation of
how to find a place for causation in a world of the
kind that physics or computer science reveals. In
particular, we still have no satisfactory account of
the directionality of causation - the difference
between cause and effect, and the fact that causes
typically precede their effects.
This would appear to be an aspect of the
reversibility of time discussed in the previous
section.
50. Richard Dawkins. The God Delusion.
51. Bertrand Russell. “On the Notion of Cause”. In Russell on Metaphysics, ed. Stephen Mumford. Routledge, 2003.
For computer scientists, a sequence of events has
a determined temporal order: events are triggered
by causes, thus providing us with the notion of
causality. The causality principle is often defined
as: “every effect must have a proximate,
antecedent cause”.
While this simple comment seems eminently
sensible, and this style of thinking regarding causal
processes is widespread in the computer science
community, the fields of physics and philosophy
take a very different view of the validity of the whole
concept of causality52. If conceptual difficulties are
known to exist with the notion of causality in other
fields, then should we not pay a little more attention
to this in the assumptions underlying the design of
our data systems and our algorithms?
6. Curse of the God's Eye View
“All behind one pane of glass” is a frequent mantra of IT people trying to manage the complexity of their data systems: a metaphor for a single workstation view of an entire operation (datacenter or Network Operations Center), with everything viewable or accessible a few clicks away, so that an expert administrator can drill down through remote control into some part of the system, pattern-match the problem, diagnose it, and prescribe the solution.
“One throat to choke” is another: reflecting the desire to avoid dealing with vendor finger-pointing when systems fail to interoperate, or data becomes lost or corrupted. This perspective is easy to understand in an industry where anyone who has perceived the value of standardizing protocols, interfaces, or even operational procedures intuitively appreciates it.
These very human intuitions are understandable from the perspective of harried CIOs, IT managers, and even ourselves as knowledge warriors “trying to do our jobs”, but they inevitably lead to failure.
Is it possible that the God’s Eye View provides us with too much information, and that the system interactions required to maintain that information actively interfere with the architectural processes which create the subtle but essential behavioral robustness of systems?
Our intuition fails us when we try to solve what
appear to be immediate problems in front of us. Our
unconscious beliefs, and our impatience (often
driven by the downward causation of quarterly
profits, or Investor “time to return”), pressure us into
sweeping issues under the rug, where we forget
about them. After a while, the lumps under the rug start to become minor mounds, then termite hills, and eventually a mountain range. These lumps under the rug are our individual and our collective wicked problems. They are our first dim glimpses of awareness of the next series of problems that we need to address as individuals, organizations, an industry, a society or a civilization, in order to move to the next level of our collective consciousness.

52. Huw Price, Richard Corry (eds.). Causation, Physics, and the Constitution of Reality. Clarendon Press, Oxford, 2007.
Sooner or later, in order to make progress (instead of endlessly re-enacting Albert Camus’s Myth of Sisyphus), we must pull these problems back out from under the rug and work on their wickedness. Today, we have discussed a small, and hopefully manageable, set of wicked problems to be solved in order to address our prime directive of relieving human beings from their slavery to their data.
Richard Dawkins, in his 2005 TED talk at Oxford, illustrated the problem of how our beliefs and mental models blind us to the truth:
We are now so used to the idea that the Earth
spins rather than the Sun moves across the
sky. It's hard for us to realize what a shattering
mental revolution that must have been. After all
it seems obvious that the Earth is large and
motionless, the Sun small and mobile. But it is
worth recalling Wittgenstein's remark on the
subject. "Tell me", he asked a friend, "Why do
people always say it was natural for man to
assume that the Sun went around the Earth
rather than that the Earth was rotating?" His
friend replied, "Well obviously because it just
looks as though the Sun is going around the
Earth!". Wittgenstein replied ... "Well what
would it have looked like if it had looked as
though the Earth was rotating?"
So what would it look like if, as designers, we were not able to reach out and directly control things as the number, connectivity and diversity of things scale? Well, maybe it would look exactly like what we are experiencing now: the indefinable complexity of wicked problems, rising like termite hills under the carpet, and causing us to develop our Yak-shaving skills.
We are evolved denizens of a local environment
that we perceive. Our primitive mammalian brains,
selected over millennia, are trained to reach out
and touch, to chase after prey, to run away from
danger, to grasp our tools or our food. This notion of locality and spatial perception is deeply embedded in our perspective of the world. We can even reach out through our newspapers to other events in our world, and our screens, though they are but a few feet away, connect us with the lives of others on the other side of the world.
But this sense that humans can control everything in our environment, just because we can see it, is deeply flawed; it simply doesn’t scale. I call it the curse of the God’s Eye View. Just because our God-like design ego would like to see everything doesn’t mean that we should, and even if we can see something, it doesn’t mean we should try to control it from that same GEV perspective. The inductive transition from n to n+1 as systems scale will seduce us into believing that what works at small scale will work at larger scales. It simply isn’t true.
Ultimately, the God’s Eye View (GEV) represents
an Archimedean perspective: an attempt to gain
visibility to everything as if time and space were a
backdrop canvas for a theater on which we can
draw our Newtonian view of how our universe
“should” work. Are designers the enemy of design?
6.1 Reduction to Practice
He who loves practice without theory is like the
sailor who boards ship without a rudder and
compass and never knows where he may cast.
~Leonardo da Vinci
As we extend to more and more cores in our
processors, and more nodes in our distributed
systems, the problems of time, causality and
concurrent programming models become more
acute. There are some models that show promise of insight into this problem53; when they are combined with a relativistic view of time and space, a breakthrough may be forthcoming.
This is an area of challenge for computer scientists.
Until they give up their naïve models about time,
and their attempts to recreate a GEV, little progress
will be made. Designers are the enemy of design,
especially when they try to play God.
Turing machines and algorithms must completely specify all inputs before they start computing, while interaction machines54 can incorporate actions that occur during the course of the computation.
Is this shift in perspective, from the “God’s-Eye-View” of algorithms to a “neighbor to neighbor” interactive view of distributed computation, the shift we need to more correctly model our notions of time in a distributed computing environment?
Another major concept, which goes along with the notion of interactive computation, is self-stabilization.
53. Maurice Herlihy, Nir Shavit. “The Topological Structure of Asynchronous Computability”. Journal of the ACM, November 1999. Also see the forthcoming textbook: The Art of Multiprocessor Programming, by Maurice Herlihy and Nir Shavit, Morgan Kaufmann/Elsevier, March 2008.
54. Interactive Computation: The New Paradigm. Edited by Dina Goldin, Scott Smolka, and Peter Wegner.
Self-stabilization is a halfway house: an insight described by Dijkstra in 1974, when he began to recognize the limitations of the algorithmic paradigm based on Turing machines. Not much was done with it at the time, but Lamport’s later praise of the concept breathed life into it, and it has now become an active, if not major, area of research.
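Dijkstra’s 1974 K-state token ring is the canonical example; the Python rendering below is my own sketch of it. From any arbitrary initial state, the ring converges so that exactly one machine holds the privilege (the “token”) at a time.

```python
# Sketch of Dijkstra's K-state self-stabilizing token ring (1974). The bottom
# machine (index 0) is privileged when its state equals its left neighbour's;
# every other machine is privileged when its state differs from its left
# neighbour's. From ANY initial (even corrupted) state, the ring converges
# to a single circulating privilege.
import random

def privileged(states, i):
    if i == 0:
        return states[0] == states[-1]
    return states[i] != states[i - 1]

def fire(states, i, K):
    if i == 0:
        states[0] = (states[0] + 1) % K
    else:
        states[i] = states[i - 1]

def run(n=5, K=6, steps=200, seed=1):
    rng = random.Random(seed)
    states = [rng.randrange(K) for _ in range(n)]   # arbitrary starting state
    for _ in range(steps):
        holders = [i for i in range(n) if privileged(states, i)]
        fire(states, rng.choice(holders), K)        # scheduler picks any holder
    return sum(privileged(states, i) for i in range(n))

print(run())   # 1 -- exactly one privilege survives, whatever the start state
```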
An extension of the concept of self-stabilization is that of superstabilization [Dolev and Herman 1997]. The intent here is to cope with dynamic distributed systems that undergo topological changes, by adding a “passage predicate” to the conditions for self-stabilization. In classical self-stabilization theory, environmental events that cause (for example) topology changes are viewed as errors, and no guarantees are given until the system has stabilized again. In superstabilizing systems, the passage predicate is required to be satisfied even during the reconfiguration of the underlying topology.
Wegner argues that interactive computation is more powerful than algorithms. Interaction, as a distributed systems concept, ties in with the EQ and CQ notions described previously.
7. Conclusions
Throughout Computer Science, we see a culturally embedded concept of Archimedean time: the idea that there is some kind of master clock in the sky, and that time flows somehow independently of the interactions between our atoms and molecules, or between our computer systems as they communicate. We now know with some certainty that, without the intense mathematical training of a theoretical physicist, our intuition will fail us.
We conjecture that the majority of the complexity of storage systems (and much of the difficulty in building scalable, reliable, distributed systems) is due to the failure of designers to resist the idea that they are God when they design their systems.
God’s Eye View solutions are a fantasy. The only world that nature builds is built through self-organizing behavior55: simple rules and near-neighbor interactions (whether those interactions are by contact between electromagnetic fields, or by photons across space).
7.1 The Prime Directive
Every knowledge warrior knows that we are, as yet, far from the singularity of our data becoming smarter than us. But the writing is on the wall: if we are not to progressively become tools of our tools, then we must address the tax necessarily imposed upon our attention, as we unconsciously, but not inevitably, become slaves to our data.
When I began to write this paper, I had in mind that the prime directive would be of the form:
Smart data looks after itself,
or
Thou shalt not cause a human being to do work that a machine can do.
However, the issue of human beings steadily becoming slaves to machines was felt to be of such paramount importance to our future relationship with our data that we needed something a little stronger. Taking inspiration from Asimov and our cultural awareness of our relationship to machines, I felt it better to express the proposed prime directive for Smart Data in the form of the three laws presented in the introduction56.
The intent of these laws is clear: our technology abundances, whether in the form of CPU cycles, network bandwidth, or storage capacity, should be used to preserve the scarce resources, such as human attention and latency57.

55. Autopoietic File Systems: From Architectural Theory to Practical Implications. Paul L. Borrill, Vanguard Conference, San Francisco, January 2005.
56. See Roger Clarke’s overview of Asimov’s three laws of robotics, or the description in Wikipedia.
57. While bandwidth continues to improve, latency remains a constant due to the finite speed of light.
58. Absolute (Archimedean) time is deprecated, and should be banished from our designs and our thinking.

7.2 Six Principles of Smart Data
The following six principles are the result of insights obtained when thinking through and modeling the design of a 100PB Distributed File Repository.
1. The system shall forsake any God (a single coordinator, human or otherwise) that can fail and bring down the whole system, or prevent the system from returning itself to a fully operational state after perturbations due to failures, disasters or attacks.
2. Thou shalt use only a relative time assumption in any aspect of the design of a smart data system or its distributed algorithms58.
3. Each storage agent, in conjunction with its neighbors, will do everything within its power to preserve data. If choices must be made, then data shall be conserved according to the following priority classes:
Class 1 – Not important to operations
Class 2 – Important for Productivity
Class 3 – Business Important Information
Class 4 – Business Vital Information
Class 5 – Mission Critical Information
4. For users, all replicas of a file shall remain indiscernible; all versions shall remain indiscernible, until s/he needs to undelete.
5. For systems: replicas shall be substitutable.
6. Storage agents are individuals; don’t try to put their brains, hearts and intestines in different places.
7. The designer has no right to be wrong.
The motivation for these laws and principles is not only to make our lives as knowledge warriors more satisfying and productive, but to encourage the elimination of a whole set of administrative chores, so that those wonderful human beings called systems and storage administrators, who so valiantly try to make our lives easier, can move on to new roles in life involving variety and initiative.
The Archimedean or GEV perspective does not relieve complexity in design; it causes it.
7.3 Call to Action
To computer scientists, architects, programmers and entrepreneurs: lift up your heads from your keyboards, and look more broadly to other disciplines for inspiration and ideas on how to get us out of the rut we have gotten ourselves into with the intensely-wasteful-of-human-attention ways in which conventional storage systems are designed.
Look more broadly to other disciplines for
inspiration and insight into how nature works, and
model your systems accordingly. Be acutely aware
of assumptions about time, root out hidden
assumptions of simultaneity, treat everything as
interactions with conserved or exchanged quantities
(with nothing but a void in between), and respect
that time can (sometimes) go in reverse.
If we truly wish our data to be smart, and to be able to rely on it going forward, then the designers of our systems have no right to be wrong. We have the right to expect that our bridges remain standing during a storm, that our airplanes land safely even though faults may occur in flight, and that our smart data is there when and where we need it, without taxing our attention when we don’t need it.
Let the music begin.
Vanguard Feb 20-21, 2008: SMART(ER) DATA
8. References
Articles / Papers:
1. Economist Magazine. “Make It Simple”. Information Technology Survey, October 2004.
2. Scott Dodd. “Making Space for Time”. Scientific American, January 2008, pp. 26-29.
3. Peter Denning. “The Choice Uncertainty Principle”. Communications of the ACM, November 2007, pp. 9-14.
4. G.F.R. Ellis. “Physics in the real universe: time and space-time”. Gen. Rel. Grav. 39, 1797, 2006.
5. Rachel Scherr, Peter Shaffer, & Stamatis Vokos. “Student understanding of time in special relativity: simultaneity and reference frames”. arXiv:physics/0207109v1.
6. Kenji Tokuo. “Logic of Simultaneity”. arXiv:0710.1398v1.
7. Francisco S.N. Lobo. “Nature of Time and Causality in Physics”. 2007.
8. Donna J. Peuquet. “Making Space for Time: Issues in Space-Time Data Representation”. GeoInformatica, Volume 5, Number 1, March 2001.
9. David Ruelle. “The Obsessions of Time”. Communications in Mathematical Physics, Volume 85, Number 1 (1982).
Books:
10. Gottfried Leibniz, Discourse On
Metaphysics And Other Essays.
11. Steven French and Decio Krause, Identity
in Physics: A Historical, Philosophical,
and Formal Analysis. Oxford University
Press, 2006
12. Leslie Lamport. “Time, Clocks, and the Ordering of Events in a Distributed System”. Communications of the ACM, Vol. 21, No. 7, pp. 558-565, July 1978.
13. Bill Newton-Smith. The Structure of Time. Routledge & Kegan Paul, October 1984.
14. Huw Price and Richard Corry, Causation,
Physics, and the Constitution Of Reality,
Clarendon Press – Oxford, 2007
15. Julian Barbour, The End of Time, Oxford
University Press Inc., 1999
16. Lee Smolin, Three Roads To Quantum
Gravity, Basic Books, 2001
17. Huw Price. Time’s Arrow & Archimedes’
Point: New Directions for the Physics of
Time.
18. Dina Goldin, Scott A. Smolka, Peter Wegner (Eds.). Interactive Computation: The New Paradigm. Springer, 2006.
Additional Reading:
19. Victor J. Stenger. Timeless Reality: Symmetry, Simplicity, and Multiple Universes. Prometheus Books, 2000.
20. Murray Gell-Mann. The Quark and the Jaguar. Henry Holt & Company LLC, 1994.
21. Hyman Bass and Alexander Lubotzky,
Tree Lattices (Progress In Mathematics),
Birkhauser Boston, 2001
22. Hans Reichenbach. The Direction of Time. Dover Publications, 1999.
23. Davide Sangiorgi and David Walker. The Pi-Calculus: A Theory of Mobile Processes. Cambridge University Press, 2001.
24. Peter Atkins, Four Laws That Drive The
Universe. Oxford University Press, 2007.
25. Robin Milner. Communicating and Mobile Systems: The Pi-Calculus. Cambridge University Press, 1999.
26. George Gratzer. Lattice Theory: First Concepts and Distributive Lattices. W.H. Freeman & Company, 1971.
27. B.A. Davey and H.A. Priestley. Introduction to Lattices and Order, Second Edition. Cambridge University Press, 1990.
28. Neil Gershenfeld, The Physics Of
Information Technology, Cambridge
University Press, 2000
29. Judea Pearl. Causality: Models, Reasoning, and Inference. Cambridge University Press, 2000.
30. Lisa M. Dolling, Arthur F. Gianelli, Glenn
N. Statile, Editors. The Tests Of Time:
Readings In The Development Of Physical
Theory, Princeton University Press, 2003
31. Stephen Hawking (editor, with commentary). A Stubbornly Persistent Illusion: The Essential Scientific Works of Albert Einstein. Running Press Book Publishers, 2007.
32. Mark Burgess. Principles of Network and System Administration. John Wiley & Sons Ltd, 2000.
33. Paul Davies, About Time: Einstein’s
Unfinished Revolution, Orion Productions,
1995
34. Matthew Hennessy. A Distributed Pi-Calculus. Cambridge University Press, 2007.
35. Nancy A. Lynch, Distributed Algorithms
(The Morgan Kaufmann Series In Data
Management Systems), Morgan
Kaufmann Publishers, Inc., 1996.
36. Autopoietic File Systems: From
Architectural Theory to Practical
Implications. Paul L. Borrill, Vanguard
Conference, San Francisco, January
2005.
37. Carlo Rovelli. Quantum Gravity.
Cambridge University Press, 2005