Blogs: A privacy perspective
Karen Mc Cullagh
CCSR, University of Manchester

Privacy in the Information Society
• Blog data may provide exciting new possibilities for research, e.g. on public privacy attitudes and expectations.
• Individuals choose this medium to disclose a wide variety of information about themselves.
• Blogging provides a provocative challenge, since the act of publication, prima facie, implies a waiver of privacy.
• There is a need to consider the ethical and legal problems posed by these resources.

Blogs: A source of Social Data
• Online journal: links and postings in reverse chronological order. Blogs are “post-centric”. They typically link to other websites and blogs, and many allow readers to comment on the original post.
• Persistence and archives (blogging for > 2 years)
• In Nov 2006 Technorati was tracking more than 67 million blogs

Methodology
• Survey of bloggers from around the world; 1258 were selected for data analysis
• Non-random sample (variant of snowball sampling)
• Results cannot be generalized to the entire blogging population.

Survey questions
4 topics:
1) Blogging practices
2) Privacy Expectations
3) Blog content and Privacy Attitudes
4) Questions about other people's privacy

Results: Study population
• 49.1% were female
• Over half were between 19 and 34 years old
• 39.5% were from the UK
• 60.7% were working, though only a minority (31.5%) claimed to be the main earner.

Findings - Blogging practices
• 60.1% characterized their entries as "My life (personal diary/journal)."
• Main reason (62.6%): to “document their personal experiences and share them with others.”
• 88.3% said making money was not a reason.
• The majority of respondents had text and photographs on their blogs, with fewer than 1/4 posting audio.

Issues of Social Importance
Socially Important?
Issue                                                 Very Concerned (%)
Preventing crime                                      22.5%
Improving standards in education                      41.3%
Protecting people's personal information              33.3%
Protecting freedom of speech                          51.9%
Equal rights for everyone                             53.2%
Unemployment                                          13.4%
Environmental issues                                  28.4%
Access to information held by public authorities      25.4%
Providing health care                                 31.6%
National security                                     15.4%

Improper handling of information by organisations
Consequences                                          Very Concerned (%)
Threat to personal safety                             36.7%
Threat to your health                                 24.3%
Financial loss                                        29.8%
Indignity                                             18.7%
Loss of liberty                                       36.8%
Annoyance or inconvenience                            20.6%
Invasion of privacy                                   39.2%
Personal distress                                     26.2%

Categories of sensitive data (Art 8)
Legally recognised categories             Not legally recognised categories
Trade-union membership                    Employment history
Religious or philosophical beliefs        Education qualifications
Political opinions                        Membership of political party / organisation
Data concerning race or ethnic origin     Clickstream data (e.g. record of web pages visited)
Criminal records                          Personal contact details
Sexual life information                   Genetic information
Health information                        Biometric information (e.g. iris scans, facial scans and fingerprints)
                                          Financial data

[Figure: Sensitivity of data types. Bar chart, percentage (0% to 60%) by data type.]

Blog survey: sensitivity of data
(rating columns run from "Not at all sensitive" to "Extremely sensitive")

Legally recognised categories
Data type                                     No answer  Not at all A little   Sensitive  Very       Extremely  Total
Trade-union membership                        12.7%      28.5%      16.7%      24.0%      11.8%      6.4%       100.0%
Religious or philosophical beliefs            12.1%      24.1%      16.9%      22.6%      15.3%      9.1%       100.0%
Political opinions                            12.2%      21.6%      16.8%      26.4%      14.5%      8.5%       100.0%
Data concerning race or ethnic origin         12.4%      23.0%      15.7%      23.4%      16.4%      9.1%       100.0%
Criminal records                              12.6%      9.1%       9.1%       20.9%      22.7%      25.6%      100.0%
Sexual life information                       12.4%      4.1%       4.0%       9.9%       19.4%      50.3%      100.0%
Health information                            12.6%      4.4%       4.1%       12.8%      20.6%      45.5%      100.0%

Not legally recognised categories
Data type                                     No answer  Not at all A little   Sensitive  Very       Extremely  Total
Employment history                            12.3%      22.3%      16.1%      26.2%      15.5%      7.6%       100.0%
Education qualifications                      12.4%      22.7%      17.1%      25.5%      15.2%      7.1%       100.0%
Membership of political party / organisation  12.4%      22.8%      16.5%      25.6%      14.2%      8.4%       100.0%
Clickstream data                              12.2%      7.9%       11.0%      18.4%      24.7%      25.8%      100.0%
Personal contact details                      12.2%      3.2%       4.5%       10.6%      22.0%      47.5%      100.0%
Genetic information                           12.7%      7.2%       5.2%       11.0%      16.0%      47.9%      100.0%
Biometric information                         12.9%      3.1%       4.1%       6.9%       12.9%      60.2%      100.0%
Financial data                                12.2%      1.6%       1.8%       3.7%       13.5%      67.2%      100.0%

Personal identification
• 42.4% posted their real name.
• 16.3% exercise restraint, e.g. “I use my first name, but always leave out my surname. I also try not to mention by name where I work or where I grew up.
This isn't so much because I don't want my audience knowing these details, but rather that I am aware that including such details makes it much more likely an employer, former acquaintance or anyone I wouldn't want reading might accidentally 'google' their way onto my site. Despite these safeguards, some friends have still managed to google their way to my blog, so I think my concerns are well founded. If I were to start blogging afresh, I would give serious consideration to adopting a pseudonym.”

Personal identification
1) First name only, or first name and an initial of their surname, or first name and maiden name, but not their legal surname
2) Pseudonym, nickname, pen name or alias
3) First name and geographical data, e.g. state or town
4) First name on the blog, but full name details in the URL
5) No name, but a photograph
6) Full name, but only because it is very common, e.g. Mike Martin (many Google hits, so effective anonymity)
7) Full name because it is a legal requirement, e.g. in Germany

Knowledge of audience
• Widespread variation was expressed:

How well do you feel you know your blog's audience?   Percentage
Extremely well                                        8.1%
Very well                                             23.1%
Quite well                                            32.1%
A little                                              18.4%
Not at all                                            7.6%
Prefer not to answer                                  2.1%
It is more complicated                                8.5%
Total                                                 100%

Audience knowledge
“It really does vary. I have made good friends with a handful of people through blogging who I have gone on to meet. In fact, I had a year long relationship with someone who 'met' me by initially reading my blog. And there are other people who I have had a degree of contact with for 18 months or so who I may not have met, emailed or spoken to, but over such a time it is hard not to form some sort of bond - real or imagined - with such people. However, there is another section of my audience who I don't know much about.
Some people read regularly, and from reading their comments and blogs I decide I don't want to get to know them any further and pay them scant attention, and yet they continue to return, getting to know me better by the day whilst I remain purposely oblivious to them. Finally, there is the section of readers who never interact, and yet return on a frequent basis.”

Audience knowledge (2)
• “The audience changes frequently. Some remain faithful readers and some drift away from you. Some you 'know' better than others.”
• “You don't really "KNOW" your audience, it could be anyone, preacher, teacher, convict, sexual predator, or anyone in between. You never truly "know" who is watching or what their motives are.”

Limit blog readership
Over ¼ limit readership, e.g.
• LiveJournal allows locking of posts to be read by ‘friends only’
• “My blog is password-protected, so even though I identify myself by name on the blog, only about a dozen people even have the URL and a password to see it.”
• “I tend to make more personal posts friends only, sometimes even limiting posts to people I don't know in real life as I sometimes prefer to limit my depressed/suicidal musings to people who can't do much about them.”

Frequency of posting personal information
• 24.8% of respondents said they had posted highly personal information “All the time”.
• Only 2% of respondents said they had “never” posted anything highly personal on their blogs.
• Most respondents (65.6%) said they had considered certain topics too personal to write about on their blogs.
• This suggests that the question of where to draw the boundary between publishable and non-publishable material is of concern to bloggers at present.

Types of information “too personal/private” to publish on blog
• “I don't give specific details and names. I don't talk about sex. I am describing events and my feelings about them. There is a difference between personal and intimate.”
• “Personal/romantic relationships.
I also try not to vent about gripes I have with specific people, lest they ever read it.”
• “Financial and health issues.”

Traditional Diary v Blog
• 1/5th kept a traditional diary as well as an online blog.
• 21.5% of respondents indicated a decision not to post certain information on their blog.
Reasons:
“My traditional diary is for my eyes only. It's more personal to me. I can put what I like without worrying about being read. I can speak about people by their real names”
“If I'm particularly embarrassed about something I'm more likely to put it in my private diary than my blog, even though my blog is anonymous. Some things you just don’t share.”
“The two will never be the same. The reader of my diary will only be me.”

Blog practices
• Bloggers are not complying with laws, e.g.:
• Over half the respondents never seek permission to post copyrighted material!
• 10.2% of respondents do not spend time verifying the facts of their posts.
• 10.3% do not post corrections.

Privacy invasion
More than 1/10th had experienced privacy invasion through the activities of other bloggers:
“some of my friends call me by my first name when they comment”
“Personal health information was babbled to the world by an ex.”
“A white supremacist named me and gave sufficient details about my home address”

Privacy Invasion (2)
“putting my picture on their blog without asking me, though i have done the same”
“A friend linked to me and when she talked about me in her blog, she used my real name”
“I have discovered the blogs of people that I know, and found mentions of myself.”

Blogging about others
• Bloggers write not only about themselves but often also about other people whom they know personally.
• 61.8% did not seek permission, and only 15.4% always asked permission first when blogging about others.
• Only 8.3% never blogged about people they knew personally.
• Thus the great majority of respondents write about people they know, but most of them never ask permission to do so.
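
A rough check of the arithmetic behind the last bullet, as a minimal sketch only. It assumes that all 1258 respondents from the Methodology slide answered these questions, which the slides do not state, so the head counts are approximations and not reported results. The names `N_RESPONDENTS`, `reported` and `approx_count` are illustrative.

```python
# Percentages as reported on the "Blogging about others" slide.
# ASSUMPTION: every one of the 1258 analysed respondents answered these
# questions; the slides do not give per-question denominators.
N_RESPONDENTS = 1258

reported = {
    "never seek permission": 61.8,
    "always ask permission first": 15.4,
    "never blog about people they know": 8.3,
}


def approx_count(percent: float, n: int = N_RESPONDENTS) -> int:
    """Approximate head count implied by a percentage of n respondents."""
    return round(percent / 100 * n)


# Share who blog about people they know at least some of the time:
# everyone except those who never do.
blog_about_known = round(100 - reported["never blog about people they know"], 1)

print(f"Blog about people they know: {blog_about_known}%")
for label, pct in reported.items():
    print(f"{label}: {pct}% (about {approx_count(pct)} respondents)")
```

On these figures, the share who write about people they know is simply everyone outside the 8.3% who never do, which is the basis for the "great majority" wording.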

Identity of others
• 7.4% revealed full names; 71.1% did not.
• Most bloggers are sensitive to issues of privacy when blogging about friends and family.
• Over half (51.8%) of bloggers used an identifier instead of a name when blogging about someone they know personally.
• Contrast: almost 2/3rds reveal the full name of any person they don’t personally know.
• This suggests that bloggers are not concerned about potential legal action arising from blogs about celebrities.

Blog about work: trouble!
• 54.8% blogged about work.
• 19.6% had gotten into trouble (themselves, or friends)
Examples:
“I posted something about a terrible boss I had and he found the blog and threatened legal action even though I hadn't mentioned his name or the name of his business. I removed the post and found another job but it taught me not to give a wider berth to other people's stories on my blog”
“I almost got fired from my last job, so I deleted it and started a new one. I work at home now, so what ever I say is only about how much I work because I cannot divulge any information on a public (even password protected) forum or blog. I signed a contract, and to do so, and get caught would be breach of contract and termination”

Concluding remarks
It is the authors' subjective sense of privacy and liability that is revealed. This self-disclosure approach has 3 important implications:
(1) There can be disparities between stated privacy attitudes and actions.
(2) Participants' perceptions of their blogs might differ from those of outside observers and researchers.
(3) Because of the self-reporting nature of this study, accuracy is difficult to verify, e.g. no external validation was conducted.

Concluding remarks (2)
• Respondents described tactics for keeping certain information private even when it is publicly published, e.g. using friends' initials instead of their full names.
• Bloggers reported having difficulty negotiating privacy boundaries in certain circumstances.
• Bloggers' privacy boundaries in the workplace have not yet been clearly established, either socially or legally.

Thank You
Karen.McCullagh@postgrad.manchester.ac.uk