Science and the Web
Daniel Swan
Bioinformatics Support Unit (http://twitter.com/nclbsu)
d.c.swan@ncl.ac.uk (http://twitter.com/d_swan)
(yes, even Twitter is a useful scientific tool!)

In the bad old days
- There was the internet
- But there was no web
  - No Facebook
  - No iPlayer
  - No Spotify
  - No Google
- But we still managed to use it for awesome science

Yes, this really was the view of the world's sequence databases that we used to have!

But we had social resources
- Email newsgroups
- MUDs (kind of like a text-only Second Life)

Usenet Newsgroups
- A kind of forerunner of today's message boards
- Often un-moderated
- Full of spam
- All the science stuff of interest was under bio.* or sci.*

How the web was won
- The internet has been designed by scientists for scientists pretty much since inception
- The internet ‘as we know it today’ arrived in the 1990s
  - OK that’s just not true, but it’s what most people associate with the internet
  - Aka ‘The Web’

And this WAS about science
- (Sir) Tim Berners-Lee is a scientist
- The World-Wide Web (W3) was developed to be a pool of human knowledge, which would allow collaborators in remote sites to share their ideas and all aspects of a common project
- Data and pages could be linked together for the first time with images, sounds, text
- As the data grew, it needed to be searched
- Search engines collated the data

And so Web 1.0 began
- We got used to:
  - Online shopping
  - Having a homepage
  - Getting our news online
  - Guestbooks on websites
  - Animated GIFs
  - Wonderful colourschemes:

So wtf is Web 2.0?
- 30% marketing nonsense to attract investors
- 20% buzzwords designed to annoy old people
  - ‘mashup’ etc.
- 50% useful
  - Specifically refers to ‘user generated content’
  - High availability and access to data
  - Networks of people and network effects
  - Openness
  - Tapping into the ‘wisdom of the crowds’

Concepts
- One of the social concepts most people are familiar with now is tagging:
- And the ubiquitous upvote:
- How can we take this out of Facebook and into science?
We have a data overload
- How can you define what is important?
- And how can you deal with it subsequently?

Tagging is very useful
- Search and retrieval tool
- Can be applied to just about anything
  - Browser bookmarks
  - Uploaded YouTube videos
  - Academic papers
  - Uploaded Powerpoint presentations
- And most importantly they can be SHARED with other users
- We call these tags ‘folksonomies’
  - “a system of classification derived from the practice and method of collaboratively creating and managing tags to annotate and categorize content”

Social bookmarking
- If you’re going to share your folksonomies, why not share your bookmarks too?
- Maybe you don’t want to share everything you have bookmarked in your browser…
- But what about the stuff that’s related to work?
- Many services exist for this, but the one that most people know about is http://delicious.com/

Why share?
- Well you might as well ask
  - Why go to a conference?
    - “At a conference the most important things happen in the coffee break” – Hans Ulrich Obrist
  - Why talk to colleagues?
- The internet doesn’t have to be a distraction, it can be an extension of your peer group. A place to find and exchange relevant information, build your profile and do your job more efficiently

Case study 1: Publications
- Most people’s workflow for papers consists of
  - Download
  - Print
  - Stick in directory called ‘Papers’
  - Forget
  - Maybe it might make it into Endnote!
- However online… you can do so much more
  - Connotea
  - Cite-U-Like
  - Mendeley
  - Zotero

Case study 2: finding things
- No, not via Google
- One of the issues of information overload is having to go to multiple sites in order to collate the day-to-day information you might want to read
- This is already a solved problem
  - RSS
  - ‘Really Simple Syndication’

Who has RSS?
- You can have RSS feeds from
  - Journal publishers
  - Search results
  - Blogs
  - Connotea, Cite-U-Like, etc.
  - Actually just about any dynamically updated web resource.
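
To give a feel for why RSS makes this a solved problem: a feed is just a small XML document listing recent items, so a few lines of code (or any feed reader) can collate it for you. The sketch below is only an illustration. The feed content, journal name and example.org URLs are made up, and the helper name `read_items` is hypothetical. It uses Python's standard library to pull the title and link out of each item.

```python
import xml.etree.ElementTree as ET

# Made-up RSS 2.0 document standing in for a journal's "latest articles" feed.
# The journal name and URLs are placeholders, not a real feed.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Journal</title>
    <link>http://example.org/journal</link>
    <description>Latest articles (sample data)</description>
    <item>
      <title>First example paper</title>
      <link>http://example.org/journal/1</link>
    </item>
    <item>
      <title>Second example paper</title>
      <link>http://example.org/journal/2</link>
    </item>
  </channel>
</rss>"""


def read_items(feed_xml):
    """Return a list of (title, link) pairs, one per <item> in an RSS 2.0 feed."""
    root = ET.fromstring(feed_xml)
    items = []
    # In RSS 2.0 the items live under <rss><channel>.
    for item in root.findall("./channel/item"):
        title = item.findtext("title", default="")
        link = item.findtext("link", default="")
        items.append((title, link))
    return items


if __name__ == "__main__":
    for title, link in read_items(SAMPLE_FEED):
        print(f"{title}: {link}")
```

In practice you would let a feed reader do this for every feed you subscribe to; the point is only that the format is simple enough for any tool to consume.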
RSS is everywhere
- Pervasive, Useful, Centralised, Shareable

Case study 3: presentations
- Even the humble Powerpoint slide has gone all Web 2.0
- Scientists give talks, upload their slides
- Feedback given
- Even if you don’t share, it can be good to consume
- Get ideas for presentations, formats etc.

Case study 4: your work
- Yes, you can even share some of your work
- Protocols, early results, triumphs and tribulations
- Some scientists, keen to move from Web 2.0 to Science 2.0, are using blogs, wikis and the internet to publish their information outside of the limited sphere of journals
- One of the largest sites for this in the world is OpenWetWare

Science – being Open
- Already noted that biologists are good at openly sharing primary data, in increasingly standardised databases
  - DNA sequence, protein sequence, microarray experiments
- Increasingly happening with publications too
- PLoS (Public Library of Science) and BioMedCentral (BMC) are two big players
- Rather than charging users for subscriptions, the authors 'pay to publish'

Why is Open good?
- Your work is not behind a 'paywall'
  - It can be very frustrating trying to get hold of papers from journals or publishers that the University does not have a site licence for
- Your work can be more widely read
  - Hard to argue that this is not desirable!
- The full text of your article is preserved and can even be analysed computationally to derive even more knowledge

Publishing and Web 2.0
- Many publishers now offer the ability to comment and ask for clarification on papers, especially in the Open Access journals
- Authors can respond to comments and issue rebuttals at the site where the paper is downloaded
- Increases discourse on published papers, which is very hard to do in a paper journal
- Certain publishers are building up quite significant social networks around their published content. Nature Group in particular with their 'Nature Network'

I can't keep track of it all!
- As we have alluded to already, it's hard to keep track of all the data that is out there and available
- Whilst we can have email alerts and RSS feeds, the issue is often one of separating out the signal from the noise
- Whilst a regular review of the sources you take in is a good way to do this, there are other ways to try and filter out what is coming to you

FOAF approaches
- FOAF is 'friend of a friend': the online equivalent of having someone come up and tell you about a cool paper that might interest you, one they have just read or that someone has just told them about
- The key to success with FOAF approaches is being in the right setting, so that you are interacting with the right set of people
- There have been many MANY attempts to build 'social networking for scientists'
- I'm only going to present one, and that's the one that currently works for me

FriendFeed
- FriendFeed is like Google Reader, it sucks in RSS you generate or pull in
- This display is updated in real time
- For instance I import Cite-U-Like, Google Reader 'shared items', activity on software development sites, Flickr photos
- They appear for everyone who has subscribed to me, a la Twitter, only with threaded discussions where things percolate up to the top the more they are 'liked' or discussed
- There are 'rooms' that you can subscribe to, where you will also get content on specific topics
- This is a GREAT way of seeing what other people are reading, watching, viewing, talking about, and in a way that you can leap right in and start discussing it with them.
Crowdsourcing & collaboration
- One of the coolest things I have seen FriendFeed used for is 'crowdsourcing'
- This is where someone posts up a request for help, and the community jumps in to help out
- Virtual collaborations spring up around the community, between people who have never met each other, and yet result in real scientific publications
- Don't be afraid to ask for help online, and don't be afraid to offer it either

Postgenomic – blog filter

From consumer to publisher
- The essence of Web 2.0 is that you provide the content
  - Whether you're uploading a video of an experimental technique
  - Whether you're releasing your slides onto Slideshare
  - Whether you're sharing what you're reading on Connotea
- YOU are generating the content

Why should I produce content?
- It's a valid question
- Why should I blog?
- My blog is a mix of the personal and of work – that is my decision
- Many people just blog about their work, on the assumption that your work life and your private life should be kept very separate
- I think this is increasingly hard to do
- As soon as you have work colleagues and friends mixed up on Facebook or Twitter, or whatever other social networks you use, the streams are polluted

Blogging work
- Blogging provides an opportunity to hone your writing skills, rarely troubled outside of writing papers, end of year reports and a thesis
- Sharpen your analytical skills by posting about the papers you're presenting at your lab meetings, and use http://ResearchBlogging.org to reach an audience
- Part of your network building activities
- It's a continually updating, evolving CV – marketing YOU
  - It helps you build a brand
  - It gets your name up there in Google
  - All of these are 'informal esteem factors'

Case study 5: Allyson
- Allyson adopted the following technologies as part of her social workflow: blog, Twitter, FriendFeed
- She started to attend conferences and blogged the talks, which fed into FriendFeed
- People on FriendFeed appreciated being kept up to date, and she became well known for it
- She has published 3 papers on using Web 2.0 to cover conferences
- She has been paid to fly to the USA and attend conferences
- EVERYONE in her field, and some beyond, knows who she is and this is PRICELESS

Why should I care?
- I heard a question on a podcast recently: someone who had published some stuff online in earlier, more headstrong times, which was now at the top of Google, wanted to know the best way to push it out of Google's memory (off the front page effectively!)
- The suggestion was to generate content – good content with your name attached, content that people want to read, and enough of it to ensure that you come up top every time someone searches for your name
- What happens if someone decides to bad mouth you online in the future?
- Just make sure they don’t have enough Google traction to matter

My experience
- Two years ago the first thing you got when googling my name was an entertaining set of news stories about a racist redneck called Daniel Swan who had, in a drunken state one evening, decided to plant a giant burning cross in the front garden of a black family in his neighbourhood
- For various legal reasons, his conviction was overturned and his stupid grinning face was on every page you hit, baseball cap on and leaning on his pickup
- 2 years later, the only person you will hit on the front page is me and my grinning face instead – just regular online interaction and content production means that I get found first when my name is Googled

Your online brand
- As I alluded to earlier, if you're going to publish content, in this day and age a future employer is going to Google you
- What you say and do in public forums is a matter of public record; it’s OK to engage and discuss, but be wary of getting mad
- You might want to have a look at what your Facebook profile exposes to the world. It may be more than you think.
- Whilst FB has many scientific groups, I don't think it's a great platform for scientific discussion.
- Best to keep your most personal life there, and your professional life elsewhere.
- LinkedIn is a great place for an online CV and networking

LinkedIn

Licensing your content
- OK this is a dry topic and I won't go into it in too much detail
- Make sure that when you produce content online you state which licence it is under
- I would advise people to look into Creative Commons licensing for blog posts or anything extensive that you have written and posted online
- Clear guidelines on what to use so that you can control downstream what people do with your work
- Otherwise people may not have the right to reuse or redistribute your work or, at worst, will have the right to do so but won't be required to credit you for it
- And you do want to be credited for your work!

In summary
- Please do share and tag your content with the world
- Think about how you can use the tools that are out there to support your scientific workflow, so that the data is there for YOU when YOU need it most
- Try to seek out like-minded people in your field online and interact with them; the University isn't your scientific world, the world is your scientific world!
- Consider publishing your work in open access journals to widen its readership
- Build a recognisable, trusted and centralised online identity so when a potential employer starts to search for information about you what comes back is positive, intelligent and encouraging!
URLs
- Delicious (social bookmarking) http://delicious.com/
- JoVE (Journal of Visualised Experiments) http://www.jove.com/
- Connotea (social reference manager) http://connotea.org/
- PLoS (Open Access journal) http://www.plos.org/
- Cite-U-Like (social reference manager) http://www.citeulike.org/
- BioMedCentral (Open Access journal) http://www.biomedcentral.com/
- Mendeley (social reference manager) http://www.mendeley.com/
- Nature Network (aggregation, scientific social networking) http://network.nature.com/
- Zotero (research tool, reference manager) http://www.zotero.org/
- FriendFeed (aggregation, scientific social networking) http://friendfeed.com/
- Google Reader (RSS reader) http://reader.google.com/
- SlideShare (share Powerpoint slides) http://www.slideshare.net/
- ResearchBlogging (scientific blog aggregation) http://www.researchblogging.org/
- BioScreenCast (share videos on scientific software) http://bioscreencast.com/
- LinkedIn (your online CV) http://www.linkedin.com/
- CreativeCommons (licensing online content) http://creativecommons.org/
- OpenWetWare (protocols, lab wikis and more) http://openwetware.org/wiki/Main_Page
- Postgenomic (scientific blog aggregation) http://www.postgenomic.com/