Supporting distributed scientific communities: Making CiteSeer collaborative

John M. Carroll, Umer Farooq, and Craig H. Ganoe
Center for Human Computer Interaction
School of Information Sciences and Technology
The Pennsylvania State University
cganoe@ist.psu.edu

Motivation
• Science as a social enterprise
• Examples
  – Watson-Crick
  – Marie-Pierre Curie
  – …
• Trends in scientific collaboration
  – Increasingly distributed
  – Increasingly on the Internet

Objective
• Support distributed scientific communities
• Build CiteSeer into a collaboratory

CiteSeer
• Academic search engine for Computer and Information Science and Engineering (CISE)
  – http://citeseer.ist.psu.edu
  – 700,000 full-text papers and 10 million citations
  – About a million hits per day

Collaboratories
• NSF/NRC push in the 1990s
• Centers for geographically distributed scientists
  – Matter of resource access and logistics
• Facilitate greater opportunities for scientific collaboration

The CiteSeer Collaboratory
• Collaboration around intellectual resources
  – Social networks
  – Neighbour-based discussion forums
  – Synchronous collaborative spaces

CiteSeer survey
• Opportunity sample over two weeks
• 301 respondents
• Data collection
  – Quantitative data: Likert scales
  – Qualitative data: Follow-up questions

Survey design
• Professional interaction
  – Face-to-face and remote collaboration
  – Types of collaborative support
• CiteSeer use
  – Nature of queries
• Background information

Overall results
• 42% graduate students
• 42% master’s degree, 32% PhD degree
• 79% with at least a computer science background
• Mean use of CiteSeer: 3.7 years
• 45% downloaded more than 100 papers
• 40% use CiteSeer once or twice per week

Collaboration with whom
[Figure: Distribution of whom to collaborate with on a modified scale (N=268). Stacked percentage bars, one per category: Who look for similar papers; Who read my papers; Whose papers I read; Who cite my papers; Whom I cite in my papers; Who cite similar papers; Who are recruiting for jobs; Who want jobs. Legend: 5-7 (More than sometimes), 4 (Sometimes), 1-3 (Less than sometimes).]

Collaboration with whom (cont.)
• “[I want to] collaborate with CiteSeer users who are looking for similar papers as [I am] and who cite similar papers as [I do]…the reason is I can save more time to find a good paper worth reading and can touch more ideas in my research area by collaboration.”
• “[I] would definitely like to collaborate with people whose papers I cite and who cite my papers.”

Collaborative activities
[Figure: Types of collaborative activities (N=281). Stacked percentage bars, one per activity: Strengthen social connections; Brainstorm new ideas; Plan joint projects; Write joint papers; Make my work visible; Get feedback on my work; Learn about their work. Legend: 7-point scale from 1 (Never) through 4 (Sometimes) to 7 (Very often).]

Collaborative activities (cont.)
• “I’d love to participate in forums or discussions about my field, to see what is going on, and what other people think.”
• “I think the online discussions and brainstorming could be useful. For paper writing and project planning, I’d imagine that the team would be cohesive and we’d just use email or a wiki to coordinate.”

Awareness
[Figure: Agreement on difficulty in staying aware (N=277). Stacked percentage bars, one per item: Recent papers published in my area; Who reads my paper; New colleagues in my area; Who cites my papers. Legend: 5-7 (Agree), 4 (Neither disagree nor agree), 1-3 (Disagree).]

Awareness (cont.)
• “Knowing who reads my papers would be neat, as it could help smooth an introduction.”
• “I’d like to know who has started a new discussion thread related to my area of interest, because I want to be aware what is going on outside my lab, and what other researchers are thinking or focusing on.”

Conclusion
• CiteSeer users want social networks
  – Based on multiple social matching criteria
• CiteSeer users want support for divergent collaborative activities
  – Because of privacy and security concerns for convergent activities (e.g., writing a journal paper)
• CiteSeer users want awareness support
  – Through notification systems (RSS)

Future work
• http://bridgetools.sourceforge.net

Questions/comments?
cganoe@ist.psu.edu