
Putting Privacy in Context -- An Overview of the
Concept of Privacy and of Current Technologies
Dr. Ian Graham
Centre for Academic Technology
Information Commons
University of Toronto
ian.graham@utoronto.ca
Tel: (416) 978-4548
Table of Contents
Introduction
A Definition of Privacy
A Short History of Privacy
The Modern Age: Freedom of Information and Privacy Laws
The Modern Age: Going Digital
User-Centric Privacy
Personal Privacy and Relationships on the Web
Next-Generation: Negotiated Relationships
Summary
Important Technologies and Technical Issues
Future Technologies
Conclusions
References
Introduction
The worlds of information and communication are changing, in ways that some people do
not yet appreciate, and that none of us truly understand. The latter is so because, over the
next 20 years, digital technologies will replace or radically transform essentially every
technology of information storage, retrieval and dissemination currently in use. Indeed, it
is likely that, transported 20 years into the future, we would barely recognize the
everyday tools of communications and data processing.
This social revolution is taking place in part because the new technologies offer
unprecedented cost reductions in all three of these areas. A more important driving
force, however, is the new and largely unexplored opportunity to combine, reuse, and
repurpose digital data in ways that were simply not possible with the previous
technologies. This will allow us to better understand the underlying processes giving rise
to this information, and will (hopefully) give us a better understanding of the natural,
physical, and social worlds.
Indeed, the academic community is embracing these technologies as research tools, to
improve its ability to understand the world we live in. Similar opportunities exist--and
are being embraced--in the commercial and political realms. Increased amounts of digital
information, combined with new tools for combining, processing, and analyzing these
data, promise improved efficiencies in business and government processes, and a better
understanding of these processes as well.
But at what cost? All technologies have unanticipated side effects, and can be used for
good purposes or bad: information technologies are no exception. One possible cost is
privacy--the concern being that these new technologies will infringe on an individual's
ability, or right, to control their own exposure to the rest of the world, or to hide
knowledge about themselves. Many social and technical issues make this a growing
concern, and many technical and legal mechanisms have been, are being, and will be
proposed to protect information, to define acceptable uses of accumulated information,
and to control its flow and reuse. This, indeed, is the topic of today's conference.
This paper does not pretend to cover all these issues--subsequent papers in these
proceedings will discuss legal and technical mechanisms that can be, and are being,
constructed to understand and manage privacy issues, and the administrative processes by
which these can be designed and put in place. Furthermore, as I suggested above, I
believe that this whole question is still fluid, and that there is as yet no definitive
understanding of how privacy issues will be managed in the next century.
Rather, it is my intention here to provide a cultural and technical framework for thinking
about privacy, so that you can appreciate the reasons why privacy is such a compelling
issue, and understand the cultural and technical forces that drive the dramatic changes we
are seeing. I will attempt to do so by:

• reviewing the social history of privacy
• describing how this view changes over time, and across cultures
• outlining the competing issues that must be dealt with when discussing privacy issues
• reviewing the technical changes that have led to so much interest in this topic
• discussing some of the tools required to control privacy and information access, and how they are related to each other
Thus, the first part of my paper is a brief review of the 'history' of privacy, based on
existing work in this area (the References section at the end of the paper gives some
suggested readings). As I shall argue, privacy--as far as this conference is concerned--is a
relatively new concept and concern. It is also a complex concept, rife with overlapping
meanings that vary from society to society and from individual to individual. This has an
important impact on policies regarding privacy protection and control in
"internationalized" Web applications, and on the construction of appropriate tools for
supporting privacy in personalized electronic transactions.
In the next section I attempt to put privacy in context--privacy competes with other social
requirements, and can never be dealt with on its own. I will outline some of these
competing forces, and explain how they are related.
Finally, I look at the main technical and administrative issues associated with creating
"privacy-aware" Web applications.
Last, I wish to point out that there is an extensive literature on privacy issues, most of
which is quite recent--indeed, this paper is only a rough outline of thought on this issue.
To demonstrate this I performed a small experiment, accessing the University of
Toronto Library online database and requesting a list of all physical items (mostly books)
containing the word "privacy" in the title or item description. Figure 1 shows the number
of items published in each year since 1900. Note how almost all texts date from the 1970s
or later--almost nothing was written on this topic prior to 1960.
Figure 1. Books and other items published in each year since 1900 that are available at
the University of Toronto Library and that contain the word "privacy" in either the title
or keyword description.
A Definition of Privacy
What, exactly, is privacy? To answer this question, I began by taking a dictionary and
looking up the definition. This seemingly innocuous task is very useful, because it helps
highlight how complex the word "privacy" is, and how difficult it is to discuss "privacy"
without an explicit definition of the concept under discussion, and of other concepts
related to it.
For example, the Oxford English Dictionary, Second Edition (electronic database
version) contains the following definitions for the word privacy (the information listed is
somewhat simplified from the actual dictionary entry) [1]:
privacy (15c) (from private) The state or quality of being private.
1.a. The state or condition of being withdrawn from the society of others, or from public interest; seclusion.
1.b. The state or condition of being alone, undisturbed, or free from public attention, as a matter of choice or right; freedom from interference or intrusion. Also attrib., designating that which affords a privacy of this kind.
2.a. pl. Private or retired places; private apartments; places of retreat. Now rare.
2.b. A secret place, a place of concealment. (Obsolete)
3.a. Absence or avoidance of publicity or display; a condition approaching to secrecy or concealment. A synonym for secrecy.
3.b. Keeping of a secret, reticence. (Obsolete)
4.a. A private matter, a secret; pl. private or personal matters or relations. Now rare.
4.b. pl. The private parts. (Obsolete)
5. Intimacy, confidential relations. (Obsolete)
6. The state of being privy to some act; = privity. Rare.
It is interesting to note the several meanings: this is the beauty--and horror--of human
language! Most of these meanings are rarely used, or obsolete. The meanings of
particular interest to us are 1.b and 3.a.
Words generally take on multiple meanings, and privacy is no exception. Even if we
ignore the obsolete or rarely used meanings (generally of historical interest only), there
are two relevant meanings. It is interesting to look at these, and find examples of "first
use" -- that is, examples of the first instance at which the word was used with the
associated meaning. This context helps provide a better sense of meaning, and also
indicates when that meaning was introduced into the language.
Conveniently, this information is available in the OED, with the following result [1]:
1.b. The state or condition of being alone, undisturbed, or free from public attention, as a
matter of choice or right; freedom from interference or intrusion. Also attrib., designating
that which affords a privacy of this kind. <one's right to privacy>
1814 J. Campbell Rep. Cases King's Bench III. 81 Though the defendant
might not object to a small window looking into his yard, a larger one
might be very inconvenient to him, by disturbing his privacy, and enabling
people to come through to trespass upon his property.
1890 Warren & Brandeis in Harvard Law Rev. IV. 193 (title) The right to
privacy.
3.a. Absence or avoidance of publicity or display; a condition approaching to secrecy or
concealment. A synonym for secrecy.
1598 Shaks. Merry W. iv. v. 24 Let her descend: my Chambers are
honourable: Fie, priuacy? Fie.
1641 Wilkins (title) Mercury: or the Secret and Swift Messenger. Shewing
how a Man may with Privacy and Speed communicate his Thoughts to a
Friend at any Distance.
Thus, although the word came into existence in the 15th century, the meaning that is of
interest to us (1.b) did not arrive for another four hundred years. This tells us something
quite interesting--that privacy, as we understand it, is a relatively new concept. We also
note that the definitions depend on an appreciation of the concepts of "public", "secret",
and "freedom." As we shall see, it is impossible to separate these linked concepts--and,
indeed, no privacy policy can ignore issues of public versus private rights and
obligations, secrecy, and freedom of information. Figure 2 summarizes these four issues:
[Figure 2 appears here: a diagram placing "privacy" among the four related concepts of freedom, private, public, and secret.]
Figure 2. Illustration of the relationship between "privacy" and the four related concepts
of private, public, freedom (of information), and secret. Privacy has no meaning unless
information can be kept "secret," while concerns for privacy are not relevant unless there
are well-defined private and public realms.
These issues will be discussed in a later section. The questions for now are: how did this
new meaning evolve in a broad, social context; and how does this affect our current
appreciation of the issues surrounding privacy? These questions are tackled below.
A Short History of Privacy
Given a time frame and a working definition, we can now examine history. By looking
at the cultural and social history of privacy (see, for example, [2,3]), one finds that
"privacy" is simply not relevant in most pre-technical, non-democratic societies. For
example, in nonliterate societies there is no privacy in the modern sense: privacy is only
relevant in the sense of meaning 2.a defined above, as a physical retreat to a location
private from the community.
This is the case because the modern view of "privacy" requires a well-defined separation
between public and private realms. Pre-literate societies do not provide such a distinction:
such societies are essentially communal and largely unstructured--everybody knows
everybody else, and everything is everybody's business. As a result, there is no clear
social boundary between the "private" and the "public". Indeed, it is clear that "privacy"
has no meaning except with reference to a formally structured organization (such as a
society or company) that has a "public" component with power that can infringe upon an
individual's "private" realm.
The concepts of "public" and "private" were well developed in societies such as
ancient Greece or China, and there is a reasonable record of writings on the topic of
"public" and "private" behavior. However, "privacy" is never discussed as an important
issue of social policy, other than in terms of minimalist property rights.
As societies continued to become more structured, one might have expected privacy to
soon become an important social issue. However, this did not happen until very recently
(the 19th century). Instead, history shows most societies evolving more and more powerful
centralized "public" realms (the Catholic church, imperial authority in China, European
royalty) with broad powers of "public authority." These authorities served to provide
stability and order to the society, while also preserving the state itself.
But there was no equivalent push for social rights such as privacy. In part this was
because power was concentrated in the ruling classes, so that there was little opportunity
for "general" issues to evolve. At the same time, the generally-accepted social foundation
of most societies (e.g., the Church and royalty in the West) specifically denied rights to
the individual. Society had two strata: the state on top, and the peasants below, with
accepted dogma implying that truth and justice flowed, by divine nature, down from the
state. Thus there was no assumption of individual rights, and hence no reasons to even
think of a right to privacy.
The impetus for strong "private" realms likely arose from 18th century changes in the
power structure within Western society. In this context, there were two particularly
important changes that affected the relationship between individuals and the state: the
concept of individual freedom that arose out of the work of the philosophers of the
"Enlightenment" (late 17th century), and the growth of a wealthy mercantilist class (18th
century) with economic power independent of the state.
The "enlightenment" introduced an enormous change in the perception of an individual's
place within society. Although many ideas that came out of this era, perhaps the most
important from our perspective was the idea that human experience was itself the
foundation of human understanding of truth, with external authority being less important
than personal experience.
Note how this "breaks" the rationale of top-down "public authority" as the controlling
force in a society. Up until the enlightenment, the individual was morally subsumed, by
the underlying social assumptions of the day, beneath the society in which they lived-public authority had both the obligation and right to dictate any detail of the lives of
individuals. The philosophers of the enlightenment, however, say that each individual
had the unique ability and right to determine truth, and that authority should be
-6-
questioned. This then eliminates the underpinnings of the top-down social order of the
existing society, and hence the legitimacy of the existing social order.
The second blow was the appearance and growth of a wealthy mercantilist class, and of
the bourgeoisie. These groups, distinct from the church and the ruling royalty, grew over
time to control larger and larger portions of the economic power of society. With
economic power came political power, and with political power came privilege and
rights. Thus, to preserve their economic well-being, these groups pushed societies to
incorporate legal and political protection against unbridled actions by the "public" state
that would affect their economic power base. Political protection came via the inclusion
of the bourgeoisie within the political system, while legal protection came via laws for
arbitrating commercial disputes, and laws enforcing legal protection of property and other
private assets. In a sense, these property laws were the first manifestation of privacy
protection, providing freedom from arbitrary seizure of private assets, and the ability to
control activity on private property free of arbitrary state interference.
Of course, these issues were dealt with differently in different countries (and in some
countries, much later than others). In particular, each country's history led it to a different
interpretation of the appropriate boundary between personal and state rights. One can
even argue that some countries (such as the United States) were formed due to a dispute
over this boundary. Thus, it is not a surprise that laws and attitudes varied (and still vary)
widely between countries.
The Modern Age: Freedom of Information and Privacy Laws
With the majority of government and corporate assets consisting of things (property,
machinery, etc.), property and asset protection rules were largely sufficient. However,
during the 1960s, information started to become an important asset, as computer databases
began to archive large amounts of personal and general governmental data (for example,
for tax calculations or census analysis). Such databases led to unprecedented concentrations
of information in easy-to-access and easy-to-use forms.
At the same time, there was growing awareness of the amount of information held by
governments, and a growing sentiment that most of this information should be publicly
accessible--until then, such information was in most cases considered state property, not
open to public scrutiny. This concern led, in many countries, to freedom of information
legislation, which provided mechanisms for public access to government (and, in some
cases, corporate) information. Such legislation makes allowances for the natural
restriction of access to sensitive information, such as "state" secrets or private
deliberations. In particular, access is typically refused to information that would result in
an invasion of personal privacy.
Not all countries have freedom of information laws. (The US Freedom of Information
Act was passed in 1966, and similar legislation was passed in Canada in the early 1980s;
Sweden's Freedom of the Press Act, however, dates back to 1766.) Furthermore, each
piece of legislation has different access rules, owing to different nations' sensibilities
regarding what is considered public and private. For example, in Sweden, aggregate
information from an individual's tax return (i.e., gross and net salary, tax paid) is public
information, available to anyone. Citizens of many other countries would consider such
public access a distinct violation of their right to privacy!
At the same time, "invasion of privacy" concerns arose over the possible misuse of
personal information stored in these databases. This concern is actually implicit in most
freedom of information legislation, which largely forbids the release of information that
would violate an individual's right to privacy. However, freedom of information
legislation generally placed no specific restrictions on how information could be used
within government (or within a company), and as more and more information was
accumulated, concerns about possible misuse grew. Many countries saw a need for
additional legislation or regulation to govern how information could be gathered and
used, and to provide ways by which an individual could confirm that information about
them was accurate.
Legislation for protecting personal privacy is often discussed under the category of data
or privacy protection, and many governments have implemented relevant laws. Such
laws are designed to govern the appropriate use of collected information, by both
government and private organizations, and to specify access mechanisms by which an
individual can view information collected about them. In Canada, Bill C-54 is, among
other things, intended to be a major step in implementing these mechanisms with the
force of law.
It is important to note that such laws are new and evolving, while attitudes towards
privacy protection and freedom of information still vary significantly between nations.
Even within countries, laws regarding privacy often vary between regions, depending
both on the responsibilities afforded the different levels of government and the degree to
which the government decides to offer protections. For example, in Canada, provincial
privacy laws are quite different from province to province.
This last section briefly summarized some general trends. The next section will look in
more detail at how new digital technologies are affecting our understanding of privacy
issues.
The Modern Age: Going Digital
The recent explosive growth in interest in privacy issues is due to four related factors:

• Exponentially increasing quantities of low-cost, digital information about individuals in both corporate and government databases
• Increased ability to easily share this information with others (the Internet, open data standards)
• Increased ability to combine disparate databases, and to mine such databases
• The spread of digital databases throughout the corporate world
Digital records have been growing rapidly over the last twenty years, as more and more
business processes have been computerized. However, only in the last few years has
it become easy to share this information with others, either inside or outside an organization.
This is largely due to evolving "open" standards for data representation, which make it
much easier to interpret digital data provided by others, and to cost reductions in these
technologies. Last, new "data mining" tools make such combinations enormously
beneficial, as they let researchers or analysts view the information in new ways, and
determine trends or patterns that have important commercial applications.
The growth of corporate databases in the 1970s led to the realization that some forms of
government-imposed protection were in order. Indeed, there have been several stabs at
legislating the appropriate use of information gathered by private companies. For example,
the United States Cable Communications Privacy Act of 1984 prohibited a cable system
from gathering information that might be considered "personally identifiable," and
provided rules by which information could be released [4].
Interestingly enough, there have also been attempts by government to obtain information
from "private" corporate databases. An example of interest to academic institutions took
place in the early 1980s. The FBI, through a "Library Awareness Program," began asking
librarians to search through their records to locate library patrons who had requested
books considered dangerous--in this particular case, the FBI was searching for clues to
the identity of the so-called "Zodiac" killer. Librarians who refused to do so (note that
there was no legal obligation to comply with this request) were themselves added to a list
of those of concern to the FBI [5].
It is difficult to protect against such requests: moralistic arguments, combined with mild
threats, are usually sufficient to coerce compliance from most individuals or businesses.
Within the library community, a consensus arose that the best protection against such
requests was to simply not gather this information, thereby reducing the ways in which
staff could be pressured to provide it. For a library this is a simple decision, as
management does not need to know the identity of a reader once a book is returned. Thus,
most current "patron" databases discard this information when a book return is registered.
For cases where usage patterns are of interest, systems are designed to anonymize the
data. That is, once the book is returned, the identity of the reader is discarded, and only
the fact that the book was checked out is retained.
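As a minimal illustration of this design (the record layout here is hypothetical, not drawn from any real library system), a checkout table can be built so that returning a book destroys the identity link and retains only an anonymous circulation count:

```python
# Minimal sketch: discard patron identity when a book is returned,
# keeping only anonymous usage data. The record layout is hypothetical.

active_loans = {}        # book_id -> patron_id, only while checked out
circulation_count = {}   # book_id -> number of past checkouts (anonymous)

def check_out(book_id, patron_id):
    active_loans[book_id] = patron_id

def return_book(book_id):
    # The identity link is destroyed here; only the fact that the book
    # circulated once more is retained.
    active_loans.pop(book_id, None)
    circulation_count[book_id] = circulation_count.get(book_id, 0) + 1

check_out("QA76.9", "patron-1138")
return_book("QA76.9")
assert "QA76.9" not in active_loans
assert circulation_count["QA76.9"] == 1
```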
This story illustrates that systems can (and probably should) be designed to gather only
information that is needed, and in such a way that information is automatically discarded
when no longer required. Note also that in the case of library databases this was a
conscious design decision, made explicitly to preserve patron privacy and based on a
requirements analysis. For any organization, the software implementation will depend on
a definition and analysis of the gathered information, possible uses of this information,
and concerns about misuse and appropriate use. Note that the final software
implementation can be designed to explicitly discard possibly useful information--as with
user reading patterns at a library. What is kept or discarded will depend on the culture of
the company, imposed legal requirements, and the difficulty of the software design.
Note too that this is a policy issue, but one where significant implementation details
occur in software. That is, the software design merely implements higher-level policy
decisions regarding information gathering and usage. This means that policy issues
should be carefully thought through before software systems are developed, and
furthermore that the software design should allow for changes to privacy policy if public
demand (or legislation) calls for it. The implications for a company implementing
Web-based commerce are clear:

• Personal privacy policies should be designed as (or before) the software is being implemented.
• The software processing user data should be as independent of other business software as possible, so that changes required in this component (due, for example, to changes in legislation) can be implemented without affecting other portions of the system. This, indeed, would be one of the goals in any good object-oriented software design; a minimal sketch of this separation follows.
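Here is that sketch (the names are hypothetical, and this is only one of many possible designs):

```python
# Minimal sketch: all handling of personal data is kept behind one narrow
# interface, so that a privacy-policy or legislative change touches this
# single component rather than the whole commerce system. Names are
# illustrative, not taken from any real product.

class CustomerDataStore:
    """The only component allowed to touch personal data directly."""

    def __init__(self):
        self._records = {}  # customer_id -> dict of personal fields

    def save(self, customer_id, record):
        self._records[customer_id] = dict(record)

    def get_field(self, customer_id, field):
        return self._records[customer_id].get(field)

    def purge(self, customer_id):
        # If new legislation mandates deletion on request, only this
        # class changes--not the order, billing, or catalog code.
        self._records.pop(customer_id, None)

# The rest of the system calls save()/get_field()/purge() and never
# depends on how or where the personal data is actually stored.
```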
Of course, most projects are not implemented in this way. However, as systems become
larger, and as legislation changes, organizations run the risk of high-cost software and
data conversion efforts should the software and archived data not lend themselves to easy
modification.
User-Centric Privacy
The preceding discussion has focussed on privacy with respect to the relationship
between individuals and large institutions, such as government. In this case, the type of
information accumulated about people has traditionally been well defined (and, indeed, is
largely mandated by regulation). Thus it makes sense to talk of "one size fits all" privacy
policies and/or legislation, as has been the case to date.
For example, when applying for a bank loan or mortgage, the types of information
required are largely standardized from bank to bank, as are the allowed uses of the
information by the bank. Furthermore, an individual can negotiate certain aspects of the
loan (interest rates, time frames, etc.) before deciding which institution will provide the
loan, only then providing the required private information. Completing tax returns is an
even more uniform example--everyone fills out the same information, with the
knowledge that the information is supposedly well protected by government privacy
policies.
However, information disclosure and privacy are not just policy issues--they are often
also individual, personal decisions. In daily life, and in business or personal
relationships, each individual decides what type of information to reveal about
themselves. In other words, people invoke different levels of privacy depending on the
party they interact with, balancing the advantages gained by releasing information against
the possible risks. For example, we provide far more information about ourselves to the
government, a bank, or our spouse than to a pollster, a corner store, or a person we just
met in a bar. These choices are based on the perceived advantages of the exchange, and
on the risk of inappropriate use of the information we provide.
When providing personal or other "private" information, each individual develops and
evaluates a trust relationship with the other party. Via discussions in a (typically) private
environment, each individual will determine what information to reveal, based on the
perceived trustworthiness of the other party, and on the expected benefits derived from
revealing the information. Trust is a complex issue, and traditionally has been based on
the reputation of a company (or individual), and, in the case of individuals, on the
personal rapport that develops between them.
Such issues have traditionally been outside the scope of privacy regulation, since few
companies collected extensive information about customers, and those that did had
insufficient tools to make effective commercial use of it. Today, however, more and more
information is automatically incorporated into databases, to the point where social
scientists often refer to "digital personas"--the digital impression of people onto
electronic systems. Today such information is gathered via mechanisms such as
membership subscriptions (e.g., magazines), specialty service cards (e.g., Air Miles), or
reservations and bookings (e.g., hotel, air). This information is most commonly used for
traditional marketing, such as bulk mail, or for tracking product/service preferences.
In most cases individuals have a common understanding of firms that they trust, an
awareness of the information they provide (as they provide most of it directly) and a
perceived (if perhaps inaccurate) understanding of how the information will be used.
Web applications, on the other hand, tend to obscure the first two issues, while opening
up new opportunities for data use.
Personal Privacy and Relationships on the Web
On the Web, data mining and reuse have become central to most Web-based business
ventures. This is because a major strength of Web commerce is the ability to provide
service tailored to individual customers. Indeed, most Web-based businesses store and
use enormous amounts of information about their customers to provide effective and
competitive service. Furthermore, the customer information they gather, often stored in
easy-to-reuse formats, is itself a valuable asset that can be traded or sold.
However, all of this is contingent on gathering user information. Doing so requires
establishing a trust relationship between the company and its customers or
business partners. Open privacy policies, and the ability to deal effectively and promptly
with user inquiries, are critical to establishing this relationship.
There are four issues that come up when dealing with privacy and Web commerce:

• Confidentiality of communication--to ensure that communication between two parties is private or, if it is not private, that the way in which the information will be used is well known. When communicating with a customer, the boundary between public and private information must be made clear.
• Identifiability of participants--the ability to prove one's identity electronically (either on the part of an individual, or a company). The importance of knowing identities will vary depending on the nature of the information gathered, and on the degree of trust required for completing the transaction.
• Data security--to ensure that gathered data cannot be accessed by unauthorized parties. Again, higher levels of security (and hence privacy) are required depending on the nature of the relationship.
• Policy transparency--to ensure that customers understand how information they provide will and will not be used.
In the last section of this paper we review the relationship between these issues and
relevant technical and systems architectural components. Other papers in these
proceedings will investigate how these issues can be integrated into operational policy.
Next-Generation: Negotiated Relationships
The implicit long-term goal of Web site personalization is the establishment of negotiated
relationships with customers. A negotiated relationship is one in which the two parties
negotiate for a range of services based on the information and financial contributions of
each party. For example, a customer may choose to not divulge their mail address, in
exchange for non-customized service. Alternatively, they may offer to provide both an
email address and other identifying information, and perhaps pay a monthly fee, to obtain
additional, customizable services.
At present, this must be done by hand by each user--they must personally check the
service offered, and determine what level of service they wish to use. In many cases, this
is the most time-consuming and complex stage in attracting new customers, and hence
the most likely inhibitor to new subscribers. Thus, current systems have very little
flexibility to negotiate different relationships.
However, this is clearly the wave of the future, as it provides the greatest advantage over
traditional business operations, and the greatest advantage over Web sites not offering
equivalent service. Of course, such a range of service offerings makes privacy issues
more complex. For example, a company could offer additional service in exchange for
permission to use a customer's email address in marketing efforts. But what, specifically,
would that mean? As the range of possible "privacy policies" grows, it becomes more and
more difficult to understand the implications of specific choices.
We will revisit this issue in the last part of this paper, when we discuss some of the
technologies used to manage privacy in commercial Web sites.
Summary
This brief history leaves us with several interesting and useful observations.
1. Privacy is a cultural issue. Different societies define differing boundaries between
private and public information. For example, in Sweden, the aggregate information from
an individual's personal income tax record is public, not private.
2. Attitudes to privacy are governed by social norms. A culture tells people how to
behave, and defines standards for how individuals should interact, and by implication
how groups should interact with individuals. Thus, as a society evolves, attitudes
towards privacy change.
3. Privacy is often regulated by legal structures. There are two broad classes of legal
structures: freedom of information, and privacy protection, the latter to provide
enforceable rules for the collection and use of information. These two classes are
synergistic, and sometimes in conflict.
4. Privacy concerns depend on the parties involved, and on the trust relationship
between them. An individual will consciously decide to provide different
information to different parties (and hence preserve different levels of privacy),
depending on the trust between them and on the benefits to the individual of revealing
information.
5. Privacy is often fine-grained. Individuals or organizations need to be able to choose
which information they release to each other, or to businesses.
6. Trust relationships can depend on proof of identity of the parties. In complex
interactions, it is important that each party be able to prove the identity of the party
they are communicating with.
7. Privacy must be ensured in communication, as well as in storage. Privacy policies
may ensure that archived information is safe, but must also ensure that communicated
information is safe from interception. Furthermore, the communication mechanism
must be able to prove the identity of all parties, to ensure that information is not
revealed to an impostor.
8. Privacy policies must be well known. If information is being collected about
customers, it is important to define (and state) clear policies regarding the use of the
collected information. This can be a complex problem if a site offers a variety of
different services, each service requiring differing levels of information disclosure
from the client.
Every commerce system must be designed to deal with the issues relevant to a particular
implementation--the situation is simply more pressing for Web commerce, due to the
increased quantity of information and the ability to customize to individual users. For
example, an international commerce site must allow for national variation in attitudes to
privacy, as well as for differing legal requirements. Other sites may be particularly
concerned with proof of identity, so that confidential information is not divulged to other
parties. Finally, data usage policies must be well known and publicized, so that parties can
understand the relationship with their partners.
Important Technologies and Technical Issues
As mentioned earlier, Web commerce applications should take privacy issues into
account during the design phase of the application. This ensures that the designers are
aware of the constraints imposed on communication and information storage/retrieval by
legislation (if relevant) or by the details of the chosen privacy policy. This will determine
how data and communication encryption, security issues, and database design are
implemented--decisions that can be very expensive to change if not implemented
correctly at the outset.
Note that security and encryption are themselves complex technical topics. The material
here is simply a rough overview: additional useful information is found in [8].
Data Security: Firewalls, Data Encryption, and Communication Encryption. Once
information is gathered, an organization will need a way of keeping it secure. This task
has both technical and administrative components: technically, the system must provide
appropriate network (and physical) security and access control, to ensure that private
information is kept private. An example where this failed is found in the recent Air Miles
fiasco, but there have been many less publicized incidents. Indeed, just about every
institution can look inside itself to find examples where data security has been
compromised.
Typical data security technologies include firewalls (to exclude external users from your
internal network), data encryption (to make stolen data useless), and communication
encryption (so that data in transit cannot be intercepted). Behind these lies a well-defined
network security policy, designed by security experts. Such policies should take into
account concerns for privacy, so that particularly sensitive data is adequately
protected.
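As a rough, modern illustration of encryption at rest (a sketch using the third-party Python `cryptography` package; the stored field is invented for the example), data can be encrypted before it is written, so that a stolen database file is useless without the key:

```python
# Minimal sketch: encrypt a sensitive field before storage, so that
# stolen data is useless without the key. Requires the third-party
# "cryptography" package; the field is invented for illustration.
from cryptography.fernet import Fernet

key = Fernet.generate_key()   # in practice, held in a separate key store
cipher = Fernet(key)

# Encrypt before writing to the database or to disk.
stored_email = cipher.encrypt(b"customer@example.com")

# Only code holding the key can recover the plaintext.
assert cipher.decrypt(stored_email) == b"customer@example.com"
```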
At the system level, the computers and the application software must be designed to
restrict access to authorized users, or to authorized software agents (e.g., the agent that
assembles email addresses for bulk mailings). Note that systems supporting "negotiated
relationships" will need to provide different levels of access control and security
depending on the data. For example, if some users have authorized re-use of their email
addresses while others have not, the system must be designed to know of these options,
and handle them accordingly.
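A minimal sketch of such per-record options (the field names are invented): an agent assembling a bulk mailing sees only the addresses whose owners authorized that use.

```python
# Minimal sketch: per-record consent flags control what a software agent
# may see. Field names are invented for illustration.

customers = [
    {"email": "a@example.com", "email_reuse_ok": True},
    {"email": "b@example.com", "email_reuse_ok": False},
    {"email": "c@example.com", "email_reuse_ok": True},
]

def addresses_for_bulk_mailing(records):
    # The bulk-mail agent never sees addresses whose owners declined.
    return [r["email"] for r in records if r["email_reuse_ok"]]

print(addresses_for_bulk_mailing(customers))  # a@ and c@ only
```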
Systems should also have proper transaction logging and audit trails that monitor activity
of the database and of user (or agent) access. Something will inevitably go wrong, and
these tools will let the network and system administrators find out what went wrong, and
why.
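A sketch of what such an audit trail might look like (using Python's standard logging module; the function and log names are invented):

```python
# Minimal sketch: an append-only audit trail of data access, written with
# the standard logging module. Names are invented for illustration.
import logging

audit = logging.getLogger("audit")
audit.setLevel(logging.INFO)
audit.addHandler(logging.FileHandler("access_audit.log"))

def read_customer_record(agent_id: str, customer_id: str):
    # Record every access before returning data, so that administrators
    # can later reconstruct who saw what, and when.
    audit.info("agent=%s read customer=%s", agent_id, customer_id)
    ...  # the actual database lookup would go here
```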
Lastly, data security is a human resources/administrative task--most security breaches
occur due to unimplemented policies (procedures not being accurately followed), or to
intentional theft or sabotage. It is important that administrative policies and rules
reinforce and augment any software and hardware security tools.
Communications Security. Information is not private if it is communicated via publicly
accessible means--which is the case for all unencrypted Internet traffic. Thus, if there is
concern over information contained in communication between individuals and an
organization, or between individuals but mediated by a company, technology should be
employed to ensure that the communication is confidential.
There are several ways to do this. When sending or receiving Web pages or newsgroup
messages, low-level packet encryption (e.g., via Secure Sockets Layer (SSL) or Private
Communications Technology (PCT) software) protects the underlying communications
channel. This ensures that a message (a Web document, submitted HTML form data, a
newsgroup posting, or a mail message communicated to a mail server) cannot be
intercepted and read.
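As a rough illustration of channel-level encryption (a sketch using Python's standard ssl module and TLS, the modern successor to SSL; the host name is just an example), a client wraps an ordinary TCP socket so that everything sent over it is encrypted in transit:

```python
# Minimal sketch: wrapping a TCP connection in TLS so that data in
# transit cannot be read if intercepted. The host is an example only.
import socket
import ssl

context = ssl.create_default_context()  # also verifies the server certificate

with socket.create_connection(("www.example.com", 443)) as raw_sock:
    with context.wrap_socket(raw_sock, server_hostname="www.example.com") as tls:
        # Everything written here travels encrypted; an eavesdropper on
        # the network sees only ciphertext.
        tls.sendall(b"GET / HTTP/1.1\r\nHost: www.example.com\r\n\r\n")
        print(tls.recv(1024))
```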
This may be insufficient, however, if the message is stored somewhere (either at the
destination or in transit), as the message itself is unencrypted--only the communications
channel is protected. Thus, if messages are stored or cached, the application must make
sure that the received data is itself encrypted for storage, or destroyed.
Internet mail systems support message encryption--indeed, very little mail is sent using
low-level SSL-style encryption. Message encryption encrypts the message prior to
sending it via standard, possibly unencrypted Internet mail protocols. Thus, even if the
message is intercepted, it cannot be read. In this case, low-level encryption is not needed,
and the information can be sent via regular Internet mail systems. Note, however, that a
mail message could be intercepted and, in some cases, a "forged" message substituted in
its place. Low-level SSL encryption eliminates this possibility.
There are three common forms of email message encryption: Pretty Good Privacy (PGP),
Privacy Enhanced Mail (PEM), and Secure MIME (S/MIME). All of these mechanisms
allow reliable data encryption, but using different approaches.
Again, once received messages are decrypted, they can be read by anyone. Thus, if they
are to be stored locally, it may be important to use encryption to ensure that they are
unreadable if stolen. Alternatively, it may be sufficient to simply delete the raw data once
the transaction is complete. Failure to do so has caused several of the "security" failures
of common E-commerce systems.
Of course encrypting a message is not the same as proof of identity for the author--an
encrypted message can come from anybody. Proof of identity is the second important
aspect of any commercial transaction, and some digital ways of providing such proof are
discussed next.
Identity Verification: Digital Certificates. For many transactions it is important to
know the identity of the party you are dealing with. For example, if you are a consumer
about to purchase an expensive product via the Web, you will want to be sure of the
identity of the company you are dealing with. Conversely, the company may want proof
of identity of the customer--to verify, for example, that the customer has permission to
access certain information in the company's system, or to exceed a defined credit limit.
For low-security systems, identity of a Web-accessed resource is typically assumed from
the URL of the resource (for example, www.ibm.com is probably IBM Corporation).
User identity, on the other hand, is generally "proved" by the user logging in with defined
usernames and passwords. In general this username/password is originally generated by
the commerce system when the user creates an account.
Note that neither of these approaches provides actual "proof" of identity. A URL can be
spoofed, so that a malicious computer expert can redirect unsuspecting customers away
from the real location and to another machine. A user, on the other hand, can create an
account using any name or identity they choose--a company has, in general, few ways of
authenticating the identity of a person creating an account. Furthermore, it can be easy
for a third party to steal a person's username and password--particularly if the underlying
messages are not encrypted when sent via the Internet--and masquerade as someone else.
There are technologies that can help. Encryption technologies such as SSL include digital
certificates for each Web server (purchased with the server, renewable and with a finite
lifetime), that let a remote user verify that they are indeed talking with a particular Web
server. Every browser checks with a trusted third party to verify that the certificate is
valid, and informs the user if there is a problem. Thus a user can always be sure, if they
are using an SSL-secured connection, that they are communicating with a company that
bought and registered the indicated certificate.
The converse is also possible--browsers can have their own certificates, to verify the
origin of the communication. Similarly, the higher-level email encryption protocols
(PGP, S-MIME) support digital message signing. With this mechanism, the author can
digitally sign a document such that the recipient can ensure that the document was
composed under the authority of the signing certificate, and also that the message was not
tampered with.
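A rough sketch of the signing idea (using the third-party Python `cryptography` package, independent of any particular mail protocol): the sender signs with a private key, and any holder of the matching public key can detect tampering.

```python
# Minimal sketch of digital signing: a private key produces a signature
# that the public key can check; altering the message breaks verification.
# Requires the third-party "cryptography" package; names are illustrative.
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding, rsa

private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
public_key = private_key.public_key()

message = b"I agree to the stated terms."
signature = private_key.sign(
    message,
    padding.PSS(mgf=padding.MGF1(hashes.SHA256()),
                salt_length=padding.PSS.MAX_LENGTH),
    hashes.SHA256(),
)

# verify() raises InvalidSignature if message or signature was altered.
public_key.verify(
    signature,
    message,
    padding.PSS(mgf=padding.MGF1(hashes.SHA256()),
                salt_length=padding.PSS.MAX_LENGTH),
    hashes.SHA256(),
)
print("signature verified")
```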
Unfortunately, this mechanism is rarely used at present--each person needs to purchase a
certificate, and most do not bother to do so. Furthermore, such certificates are only as
trustworthy as the user's computer--if person A can digitally sign mail using their mail
client, but their machine is also accessible to person B, then person B can easily
masquerade as person A. Indeed, digital identifiers, as presently implemented, are only
as secure as the machines from which signed documents are sent. At present, this
somewhat limits the reliability of these technologies as "proof of authorship," although it
is certainly better than receiving unsigned data. It also limits masquerading to a single
machine, as opposed to any machine on the Internet, which is admittedly a significant
improvement.
Figure 3 illustrates in a general way the relationship between security/privacy issues
(which will be determined by policy) and the related technical/administrative issues that
appear in a real implementation.
[Figure 3 appears here as a diagram. On the left: the PRIVACY POLICY, feeding the issues of data security, communications security, and identification/non-repudiation. On the right, three interdependent groups of implementation decision points:
• Physical Access: access control; cabling protection; off-site backups; physical document policy (shredding/destruction).
• Network Architecture: internal vs. external; firewalls and rules; servers and locations; access control rules; auditing tools (logins, accesses, attacks); e-mail encryption; Web page encryption.
• Application Design: data model; data access rules; data encryption; Web page encryption; email encryption; server certificate management; user certificate management; alternate authentication tools; data deletion policies; cache protection.]
Figure 3. Schematic showing the relationships between privacy/security issues (left) and
implementation decision points (right). The arrows on the right illustrate the
interdependencies found in any implementation.
It is interesting to note that portions of Bill C-54 would make digital signatures a legal
means of identification (or, better stated, non-repudiability) of the sender.
Future Technologies
When a visitor contacts a resource requiring "private" information, they generally decide,
based on a number of factors, which information they feel comfortable providing. At
present, such information is provided via fill-in forms. But, as mentioned previously, this
stage of negotiation is the most complex "entry" point to a service. Unfortunately,
complex transactions tend to scare away potential customers, simply because of the
tediousness of wading through page after page of privacy policy statements, personal
information questions and fill-in forms.
Much of the information requested, and statements about the intended use of the provided
information, are relatively straightforward. Indeed, there is a lot of repetition from site to
site, with most sites asking for the same sorts of information, and in return offering
similar privacy/information use policies.
Ideally, it would be nice if much of this "negotiation" could be automated. For example,
upon accessing a Web site, a Web browser could receive a machine-readable "privacy
policy" statement, describing the privacy policy and the types of information the site is
requesting. The browser could then read a set of user-defined privacy preferences, and
determine which information can be immediately sent to the server, which should not be
sent, and which should be sent at the option of the user.
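A toy sketch of this triage (the policy format below is invented for illustration and is not the actual P3P vocabulary):

```python
# Toy sketch of automated privacy negotiation: compare a site's declared
# data requests against stored user preferences. The data structures are
# invented for illustration and are NOT the real P3P vocabulary.

user_preferences = {
    "email":  "ask",    # prompt the user before sending
    "name":   "allow",  # may be sent automatically
    "income": "deny",   # never send
}

site_policy_requests = ["name", "email", "income", "phone"]

def triage(requests, prefs):
    """Split a site's requests into send-now, ask-user, and refuse."""
    send, ask, refuse = [], [], []
    for field in requests:
        action = prefs.get(field, "ask")  # unknown fields: ask the user
        {"allow": send, "ask": ask, "deny": refuse}[action].append(field)
    return send, ask, refuse

send, ask, refuse = triage(site_policy_requests, user_preferences)
print("send automatically:", send)    # ['name']
print("prompt the user:   ", ask)     # ['email', 'phone']
print("never send:        ", refuse)  # ['income']
```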
To make this work requires a well-defined language for expressing privacy preferences
and policies. This is the long-term goal of a World Wide Web Consortium working group
known as the Platform for Privacy Preferences (P3P) project: to establish an
application-level language for publicizing privacy policies and for negotiating the release
of personal information to sites, depending on the site policies. The stated goal of P3P is:
The P3P specification will enable Web sites to express their privacy practices
and users to exercise preferences over those practices. P3P products will allow
users to be informed of site practices, to delegate decisions to their computer
when possible, and allow users to tailor their relationship to specific sites. Sites
with practices that fall within the range of a user's preference could, at the
option of the user, be accessed "seamlessly," otherwise users will be notified of
a site's practices and have the opportunity to agree to those terms or other
terms and continue browsing if they wish [6]
Unfortunately, P3P is still a work in progress. Furthermore, many commercial projects
and services are already underway that are designed to address some of the issues
targeted by P3P [7]. However, understanding the design parameters of P3P is useful
for understanding the issues of software-mediated privacy negotiation, independent of the
biases of any current software implementation.
Conclusions
This short paper has provided a brief history of the concept of privacy, and has tried to
show how this rather new concept has evolved rapidly in the latter half of the 20th
century. In doing so, it demonstrated that there are several competing issues to address in
understanding privacy, including culture, law, proof of identity, and trust. It also
demonstrated that the negotiation of relationships requires a clear understanding of
privacy policies and a fine-grained approach to information exchange (so that the
exchange can be tailored to the individual).
The second part of the paper briefly reviewed some of the technical issues that come up
when dealing with privacy and information security. This review touched on issues such
as data and communication encryption, network and application architecture, and the
important role of policy decisions in shaping the underlying design.
At the beginning of this paper, I noted how much has recently been written on privacy
issues. Reference [9] provides a useful summary of some printed and Web-accessible
resources, should you wish to read further on this most interesting issue.
References
1. Oxford English Dictionary, Second Edition, Oxford University Press,
http://www.chass.utoronto.ca/chass/oed/oedpage.html (University of Toronto access
only).
2. Moore, Barrington, Jr., Privacy: Studies in Social and Cultural History, Pantheon
Books, 1984. An excellent review of the history and sociology of privacy prior to the
20th century.
3. Agre, Philip E., and Rotenberg, Marc, eds., Technology and Privacy: The New
Landscape, MIT Press, 1998. A collection of papers summarizing some of the
ramifications of new technology for privacy and freedom of information issues.
4. http://www.conwaycorp.com/electronics/services/privacy_notice.html (Conway Corp.
statement regarding compliance with the Cable Communications Privacy Act).
5. American Libraries, 21(3), pp. 245-248, March 1990.
6. http://www.w3.org/P3P/ (Platform for Privacy Preferences overview).
7. http://www.w3.org/P3P/implementations (a listing of privacy and personal profiling
software services, such as DigitalMe, Firefly, and others).
8. Ghosh, Anup K., E-Commerce Security: Weak Links, Best Defenses, John Wiley and
Sons, 1998.
9. A list of other privacy-related books, articles, and Web resources:
http://www.utoronto.ca/ian/privacy/intro.html