Recreating Popular User-Generated Tags Effectively and Efficiently by Utilizing Crowdsourcing

A Thesis Submitted to the Faculty of Drexel University by Deima Elnatour in partial fulfillment of the requirements for the degree of Doctor of Philosophy, June 2011

© Copyright 2011 Deima Elnatour. All Rights Reserved

Dedications

To my parents, who have given me the security to question and the liberty to seek my own answers.

Acknowledgements

It took a village; without the help of so many people I would not be writing these words today. I want to thank the faculty and staff members of the College of Information Science and Technology at Drexel University for all their support and guidance throughout this arduous, yet joyful, process. Without your help and guidance this work would not have been possible. I am grateful for this opportunity and am fortunate to have worked with the best in our field. Xiaohua (Tony) Hu, Ph.D., my advisor, thank you so much for your guidance and support. I am especially thankful to you for introducing me to my life passion, the field of data mining and user-generated content. Your patience and support helped me overcome the impossible. My committee members Jiexun Jason Li, Ph.D., Yuan An, Ph.D., Christopher C. Yang, Ph.D., and Zeka Berhane, Ph.D., thank you so much for being there for me every step of the way. Your feedback and advice have been invaluable to this line of work. To my mom, Itaf Elnatour, I am grateful for your kindness, endless encouragement, and for teaching me the value of education and continuous learning. To my dad, Tahsein Elnatour, I wish you were with us today. To my sister, Jumana, and her beautiful family, thank you for keeping me humble. To my brothers Naill, Hazim, and Abdulla, thank you for pushing me forward and expecting more of me. Last but not least, I want to thank my lifelong friends and colleagues for being there for me during this journey. Thanks everyone.

Table of Contents

CHAPTER 1: INTRODUCTION TO THE STUDY
1.1 Introduction
1.2 Background of the Problem
1.3 Statement of the Problem
1.4 Purpose of the Study
1.5 Significance of the Study
1.6 Nature of the Study
1.7 Research Questions and Hypotheses
1.8 Assumptions
1.9 Limitations
1.10 Delimitations
1.11 Summary
CHAPTER 2: LITERATURE REVIEW
2.1 Folksonomies
2.2 Use of Tags in Web Search
2.3 Motivations of Using Tags
2.4 Types of Tags
2.5 Metadata
2.6 Crowdsourcing through Mechanical Turk
2.7 Information Retrieval Systems and Evaluation Models

CHAPTER 3: METHODOLOGY
3.1 Introduction
3.2 Research Design
3.3 Appropriateness of Design
3.4 Research Questions
3.5 Population
3.6 Sampling
3.7 Instrumentation and Data Collection
3.8 Operationalization of Variables
3.9 Data Analysis
3.9.1 Descriptive Statistics
3.9.2 ANOVA
3.9.3 Multiple Linear Regression
3.10 Summary

CHAPTER 4: RESULTS
4.1 Introduction
4.2 Collected Data and Overview of Sample Population
4.2.1 Mechanical Turk Population and Survey Descriptive Statistics
4.2.2 Popacular and Delicious Data for Most Tagged Sites
4.3 Hypothesis Data Analysis
4.4 Summary

CHAPTER 5: CONCLUSIONS AND RECOMMENDATIONS
5.1 Scope, Limitations, Delimitations
5.2 Findings and Implications
5.3 Recommendations
5.4 Scope and Limitations of the Study
5.5 Significance of the Study
5.6 Summary and Conclusions

References
Appendix A: The Survey Tool
Appendix B: Popacular Top 100 Most Tagged Sites on Delicious – All-Time
Appendix C: Popacular List of Most Tagged Sites – One Month
Appendix D: Popacular List of Most Tagged Sites – One Week
Appendix E: Popacular List of Most Tagged Sites – One Day
Appendix F: Popacular List of Most Tagged Sites – 8 Hours
Vita

List of Tables

1. Cost-Benefit Assessment for All Participants
2. Descriptive Statistics of Study Sample
3. Data for the All-Time Top 5 Most Tagged Sites
4. Mapping Between Tag Classification Schemes
5. Site 1 - YouTube Tagging Data from Delicious
6. Site 2 - Flickr Tagging Data from Delicious
7. Site 3 - Pandora Tagging Data from Delicious
8. Site 4 - Facebook Tagging Data from Delicious
9. Site 5 - Digg Tagging Data from Delicious
10. Pairwise Comparisons of Tag Creation Effectiveness among Sites
11. Comparison of Tag Creation Effectiveness by Tag Types at Sites 1, 2, 3 and 5
12. Regression Results for Site 1
13. Regression Results for Site 2
14. Regression Results for Site 3
15. Regression Results for Site 4
16. Regression Results for Site 5

List of Figures

1. Taxonomy of Tagging Motivations
2. IR Evaluation Model
3. Real-time Processing Pipeline
4. Iterative Experimental Design Approach Used in This Study
5. Age Distribution of Mechanical Turk Workers
6. Education Distribution of Mechanical Turk Workers
7. Primary Reasons for Participation

ABSTRACT

Recreating Popular User-Generated Tags Effectively and Efficiently by Utilizing Crowdsourcing
Deima Elnatour
Tony Hu, Ph.D.

It is well known today that not all user-generated tags provide additional information that can be used to further improve web search beyond traditional methods. A number of studies established that popular tags are most useful for web search. Popular tags are common tags provided independently by a large number of users to describe web sources of interest. However, the same studies concluded that incorporating these tags will not make a measurable impact on search engine performance given the size of the web and the scarcity and distribution of popular user-generated tags across the extended web. This dissertation is focused on finding a way to create social bookmarking tags efficiently and effectively by utilizing crowdsourcing systems. Crowdsourcing platforms allow tasks to be posted and then completed at a stated price or reward. They aim to attract users to complete relatively simple tasks that are easy for humans and hard for machines. Crowdsourcing has been widely used by companies and researchers to source micro-tasks requiring human intelligence such as identifying objects in images, finding or verifying relevant information, or natural language processing. The purpose of the study is to determine whether popular internet bookmarking tags can be recreated through crowdsourcing. Amazon Mechanical Turk, the work marketplace, was used as a means to conduct an experiment regarding the reproduction of popular tags for a variety of websites using Delicious, a service for storing and sharing bookmarked pages on the internet. The key research questions for the study examined a number of factors regarding tag creation, including the effectiveness of crowdsourcing in reproducing popular tags, the categories of tags that can be recreated most effectively, and the relationship of worker characteristics and demographics to the effectiveness of producing popular tags. The results of the study suggest that popular internet bookmarking tags can be recreated effectively through crowdsourcing. Moreover, tag creation effectiveness was significantly higher for tag type "Factual and Subjective" (F & S) than for tag type "Factual" (F). Additionally, other variables were tested to assess their relationship with tag creation effectiveness. Interest in the site, familiarity with the site, tag creation experience and tag usage experience were significantly related to tag creation effectiveness for some of the sites, although the direction and significance of these relationships were not consistent across all sites included in this study.
This study provides a promising new direction for cheap, fast and effective creation of user-generated tags that would be useful in indexing more of the extended web and consequently help improve web search. Furthermore, it informs future experimental and micro-task design for creating high-quality tags reliably using crowdsourcing platforms.

CHAPTER 1: INTRODUCTION TO THE STUDY

1.1 Introduction

Web-based tagging systems, which include social bookmarking systems such as Delicious, allow participants to annotate or tag a particular resource. Historically, annotations have been used in several ways. Students annotate or tag their books to emphasize interesting sections, to summarize ideas and to comment on what they have read (Wolfe, 2000). Davis and Huttenlocher (1995) suggested that shared annotations in the educational context can serve as a communication tool among students and between students and instructors. There can be threaded discussions around class materials that directly and specifically link to the class material. Farzan and Brusilovsky (2005, 2006) made use of annotation as an indicator of page relevance for a group of learners in an online learning system. Web systems that allow for social annotation can provide useful information for various purposes. Dmitriev et al. (2006) explored the use of social annotation to improve the quality of enterprise search. Freyne et al. (2007) made use of social annotation to rerank research paper search results. Hotho et al. (2006) proposed a formal model and a new search algorithm for folksonomies. Bao et al. (2007) explored the use of social annotation to improve web search. Social annotations have the potential to improve searching for resources (Marlow et al., 2006). However, published research on using social annotations to improve web search is sparse.

Rowley (1988, p. 43) explained that "the indexing process creates a description of a document or information, usually in some recognized and accepted style or format" where the term "document" is used to reflect a container of information or knowledge. Therefore, a document could take any combination of form and medium. Document indices could also be viewed as a structured form of content representation of a document. This leads to the notion that indexing is actually the activity of creating surrogates of documents that summarize their contents (Fidel, 1994). With the new era of the internet and the growing popularity of online social software, ordinary people are becoming human indexers with no policies, rules or training. This calls for the need to focus on social indexing and understand its characteristics as it stands in online social networks today. It is also critical to understand the nature of user-generated tags (folksonomies) and see if such tags can be leveraged to improve the search function and address some of the long-standing issues in the IR field. This dissertation is focused on developing a better understanding of user-generated tags and finding a way to generate tags that can help improve web search. A number of empirical studies concluded that social bookmarking tags can provide additional data to search engines that were not provided by other sources and can consequently improve web search. These studies, however, concluded that there was a lack of availability and distribution of the tags that can improve search. This study was focused on finding a way to create social bookmarking tags efficiently and effectively using crowdsourcing.
1.2 Background of the Problem

Tags, also known as user-contributed metadata, are viewed as a tool for digital information management. Most of these tags are created during the use of social computing software (e.g. Delicious and Flickr) and web 2.0 systems. Social computing systems allow users to be both producers and consumers of information. They also allow users to find and save interesting documents and web pages that were produced by others. In either case, social computing systems have one important and common feature: they provide the user with the ability to tag documents. A tag is a one-word descriptor that best describes the content of the tagged document. Users have the flexibility to assign no tags at all or to assign multiple tags to best describe the content of the document. This phenomenon has spread rapidly throughout the internet, where most organizations had no choice but to incorporate some form of social network features into their home pages and online services. It is believed that social features such as tags enhance the user experience and are usually associated with higher user satisfaction. Researchers turned their attention to this phenomenon and tried to figure out why tagging is a popular feature and increasingly in high demand. There are two main incentives to tagging: i) private, mainly to address Personal Information Management (PIM) needs, or ii) public, where the tags are driven by the need to collaborate and share information with others. Pirolli (2005) suggests that tags provide the "information scent" that connects users with information.

Social network tags are also known as user-generated tags. Researchers find that tags have both private and public benefits. Private benefits are mainly focused on enhancing Personal Information Management (PIM), where users' main incentive is saving this information for personal use in the future. Public benefits, on the other hand, are usually driven by the desire to collaborate and discover new information of interest through navigating tags. Tagging requires a lesser cognitive burden than categorizing, as stated by Sinha's cognitive analysis of tagging (Sinha, 2005). Social tagging offers a flexible solution when compared to traditional hierarchical tagging, as it allows a user to express the various dimensions of a document by applying multiple tags to one document. Some studies show that users choose tags based on the words that they are likely to use when searching for these documents in the future (Wash, 2006). Therefore, social network tags open new doors and provide new hope for solving some of the long-standing problems in the field of IR, especially those related to the mismatch between search query words and words in the document. The hope is that tags offer quality indices for documents that cannot be achieved by machine indexing alone. This is believed to be true since social network tags harness the social and collaborative wisdom of the crowds, and they are likely to be more effective as indices that can enhance the IR function.

1.3 Statement of the Problem

A number of studies concluded that there was a lack of availability and distribution of the tags that can improve web search. This study focused on finding a way to create social bookmarking tags efficiently and effectively using crowdsourcing.
Furthermore, this study examines tag creation effectiveness across systems and tag types while exploring the relationships between tag creation effectiveness and a number of user-related factors such as interest in the website, familiarity with the website, tagging experience (both usage and creation), experience with search engines, and time spent on the internet.

1.4 Purpose of the Study

The purpose of the study was to determine whether popular internet bookmarking tags can be recreated through crowdsourcing. Amazon Mechanical Turk, the work marketplace, was used as a means to conduct an experiment regarding the reproduction of popular tags for a variety of websites using Delicious, a service for storing and sharing bookmarked pages on the internet. The key research questions for the study examined a number of factors regarding tag creation, including the effectiveness of crowdsourcing in reproducing popular tags, the categories of tags that can be recreated most effectively, and the relationship of worker characteristics and demographics to the effectiveness of producing popular tags.

1.5 Significance of the Study

The significance of this study is two-fold. First, the findings of this study could help develop a better understanding of how social bookmarking tags are created and what can be done to effectively improve their availability and distribution towards an efficient web search. Second, the findings of this study provide new insights on indexing, which is the basis of information retrieval. Numerous variations on indexing have been tried over the years. Modern search engines use several methods to find additional metadata to improve resource indexing and enhance the performance of similarity ranking. Craswell et al. (2001) and Westerveld et al. (2002) explored the use of links and anchors for web resource retrieval. They pointed out that anchor text helps improve the quality of search results significantly. The anchor text can be viewed as a web page creator's annotation. This suggests that annotation can be used to support document indexing. Social tagging systems, e.g. Delicious, allow participants to add keywords, that is, tags, to a web resource. These tags can be viewed as user annotations of a web resource. Dmitriev et al. (2006) explored the use of user annotation as intranet document indexes. Yanbe et al. (2007) converted tags and their frequencies into a vector that represents a page's content. These findings suggest that there is potential value in investigating and finding a reliable method for producing popular tags through crowdsourcing effectively and efficiently. This would allow for on-demand indexing of web resources, which would then lead to enhanced web search.

1.6 Nature of the Study

The research method selected for this study was a quantitative, quasi-experimental, correlational research design. The dependent variable for the study is tag creation effectiveness, which is a continuous variable. The independent variables for the study are tag type, system type, interest in the website topic, experience with the website or a similar website, tagging experience, search engine experience, and average daily time spent on the internet. Tag type and system type are categorical variables, average daily time spent on the internet is a continuous variable, and the rest are ordinal variables. An analysis of variance (ANOVA) was conducted to determine the relationship among the independent and the dependent variables; ANOVA was used for the second and third research questions. Furthermore, a multiple regression was conducted to support the ANOVA.
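To make the planned analysis concrete, the following is a minimal sketch of how such an ANOVA and multiple regression could be run in Python with pandas and statsmodels. The column names and the small illustrative dataset are hypothetical and are not the study's actual data or code.

```python
# Hedged sketch: hypothetical records, one row per completed tagging task.
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

df = pd.DataFrame({
    "effectiveness": [0.62, 0.48, 0.71, 0.55, 0.80, 0.43, 0.66, 0.59],
    "site": ["YouTube", "Flickr", "Pandora", "Facebook",
             "YouTube", "Digg", "Flickr", "Pandora"],
    "interest": [3, 4, 2, 5, 4, 1, 3, 4],        # ordinal 1-5 survey item
    "familiarity": [4, 4, 2, 5, 5, 2, 3, 3],     # ordinal 1-5 survey item
    "hours_online": [2.5, 6.0, 3.0, 8.0, 5.5, 1.0, 4.0, 3.5],
})

# One-way ANOVA: does tag creation effectiveness differ across sites?
anova_model = ols("effectiveness ~ C(site)", data=df).fit()
print(sm.stats.anova_lm(anova_model, typ=2))

# Multiple linear regression: worker-related predictors of effectiveness.
regression = ols("effectiveness ~ interest + familiarity + hours_online",
                 data=df).fit()
print(regression.summary())
```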
The use of a quasi-experimental research design allowed the determination of whether there were statistically significant differences between groups (Cozby, 2007), which for this study are the different sites and tag types. The quasi-experimental design was appropriate to assess these differences because it allowed the researcher to compare the levels or categories of the independent variables with regard to the dependent variable in order to determine whether there was a difference between the groups (Broota, 1989). Moreover, this quasi-experimental, correlational, quantitative study specifically investigated the relationship of the participants' tagging experience, search engine experience, and average daily time spent on the internet. With such an objective, a correlational design was appropriate. In the context of social and educational research, correlational research is used to determine the degree to which one factor may be related to one or more factors under study (Leedy & Ormrod, 2005).

1.7 Research Questions and Hypotheses

The research questions and hypotheses that guided this study were:

RQ1: Are there statistically significant differences in tag creation effectiveness for popular tags among the sites included in this study?
H10: There are no statistically significant differences in tag creation effectiveness for popular tags among the sites included in this study.

RQ2: Are there statistically significant differences in tag creation effectiveness across tag types?
H20: There is no statistically significant difference in tag creation effectiveness across tag types.

RQ3: What is the relationship between tag creation effectiveness and interest in the website topic, experience with the website or a similar website, tag creation experience, tag usage experience, experience with search engines, and time spent on the internet?
H30: None of the independent variables of interest in the website topic, experience with the website or a similar website, tagging experience, experience with search engines, and time on the internet have a statistically significant effect on tag creation effectiveness.

1.8 Assumptions

The conclusions and interpretations developed as a result of this study will be based on a number of assumptions guiding the study. First, it is assumed that the participants in this study understand the posted tagging tasks to be evaluated on Mechanical Turk and are good representatives of the worker population. Second, since this study will be based on the answers to the survey instruments used to collect data, it will be assumed that the instrument is valid and reliable with respect to the collection of tag creation effectiveness.

1.9 Limitations

Limitations are factors that limit the study, such as weaknesses, problems, and reservations, which impact the research. It is assumed that all participants in the study will participate voluntarily and will fill out all survey questions honestly and completely. Therefore, respondents will be limited to the number of consenting participants that choose to participate in the research study and complete the survey. Individuals may decide not to participate for various reasons (Leedy & Ormrod, 2003; Creswell, 2009). The study will also be limited by the potential for malingering by the participants and the time limits placed on the conduct of the study.
The validity of this quantitative, quasi-experimental, correlational study will be limited to the reliability of the instruments used to gather and interpret research data. The methodology of the study might present a limitation because it does not allow for variable manipulation. The lack of variable manipulation prevents the opportunity for determining causality with regard to the research relationships.

1.10 Delimitations

Delimitations are factors that the researcher decided not to include in the study. These factors limit the ability to generalize the results of the study to the actual population. This study will be confined to surveying participants selected for the posted tagging tasks on Mechanical Turk. This study will focus only on determining whether popular internet bookmarking tags can be recreated through crowdsourcing. The findings of this study are limited to the use of Mechanical Turk, the work marketplace. Mechanical Turk was used as a means to conduct an experiment regarding the reproduction of popular tags for a variety of websites using Delicious, a service for storing and sharing bookmarked pages on the internet.

1.11 Summary

This chapter provided an introduction to the study regarding the determination of whether popular internet bookmarking tags can be recreated through crowdsourcing. The background was discussed along with the problem, purpose, and nature of the study. Research questions and hypotheses were presented to guide the quantitative correlational study. Assumptions, limitations, and delimitations were discussed. Chapter 2 will contain a more detailed discussion of the literature review and Chapter 3 will include the methodological specification for the proposed study.

CHAPTER 2: LITERATURE REVIEW

Web 2.0 applications focused on users' interactions and encouraged them to generate content. As a result, online communities flourished and users started sharing their ideas, actively generating new content and collaborating with others to satisfy their curiosity and needs. With all this content, the need for organizing, finding and sharing such content became an important focus for most users. Web 2.0 applications kept up with the rising needs around managing and accessing user-generated content by introducing a number of features to help classify and explore content. Tagging was one of these features. Tagging allowed users to assign self-defined words that would effectively describe and organize their content. This presented users with a platform to explore available content and connect by finding similarities in their collections. As online communities continued to grow in importance and user base, prominent search engines decided to invest in these systems, with Yahoo acquiring Flickr and Del.icio.us and Google acquiring YouTube (Bischoff, Firan, Nejdl, & Paiu, 2008). By annotating their respective content with tags, users' connections can be established. Flickr and other photo-sharing websites provide users with an option to provide tags that best describe their photographs. Tags in this case could be a description of the photo, the place or setting, the subject of the picture or any other distinguishing characteristic such as color or action. Meanwhile, for music-sharing websites like Last.fm, songs are tagged based on their artist, album, genre, mood, or any other classification specified by the user.
The emergence of tagging and the widespread adoption of tagging applications caused it to become a main topic of research in the field of information science (Figueiredo et al., 2009). Problems arise from the lack of an accepted standard methodology for evaluation and from the distinct textual features available on web 2.0 systems. Large-scale samples to be used for such evaluations are also restricted in access, in effect hindering the comparison of performance between social networks or of content features on these networks. Thus, different techniques have been formulated to perform these tasks. Website crawls or non-public snapshots are used in analysis, but because of their innate characteristics and distinct methodologies, the comparison between them has become challenging, if not inaccurate (Crecelius & Schenkel, 2009).

2.1 Folksonomies

Existing taxonomies and predefined dictionaries have been found to lack flexibility and to be expensive to create and maintain. Tagging has become an alternative to these top-down categorization techniques, allowing users to choose their own labels based on their real needs, tastes, language or anything that would reduce their required cognitive effort. Community services that offer tagging of resources are called folksonomies, where thousands or millions of users share their interests, preferences and contributions. Folksonomies come in two forms, depending on the tagging rights that are given. Narrow folksonomies restrict the bookmarking and tagging of resources to only a limited number of users – such as the owner or other users he/she specifies. Broad folksonomies, on the other hand, are open to the entire community, enabling each individual to relate to the activity of other users (Wetzker, Bauckhage, Zimmermann, & Albayrak, 2010).

Aside from the existing indices built by search engines for search queries, tags could be used to complement those indices and produce enhanced results. Tags and annotations can provide additional information about the sources they describe. These tags and annotations include keywords or phrases that can be linked with other related and relevant sources. In doing so, semantic web relationships are developed, leading to improved retrieval and review ratings that attract other users. The objective of the semantic web is to make online resources more understandable to humans and machines. This has ushered in the emergence of web applications such as weblogs, social annotations and social networks. Research in this field has been centered on discovering latent communities, detecting topics from temporal text streams and the retrieval of highly dynamic information (Zhou, Bian, Zheng, Zha, & Giles, 2008).
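As a concrete illustration of the structure behind these systems, the sketch below represents a folksonomy as a set of (user, tag, resource) assignments, loosely in the spirit of the formal folksonomy model of Hotho et al. (2006) cited in Chapter 1. The users, tags and URLs are hypothetical.

```python
# Minimal sketch of a (broad) folksonomy as (user, tag, resource) triples.
from collections import defaultdict

assignments = [
    ("alice", "video",  "http://www.youtube.com"),
    ("bob",   "video",  "http://www.youtube.com"),
    ("bob",   "music",  "http://www.last.fm"),
    ("carol", "photos", "http://www.flickr.com"),
    ("carol", "video",  "http://www.youtube.com"),
]

# For each resource, count how many distinct users applied each tag;
# high counts correspond to the "popular tags" studied in this dissertation.
users_per_tag = defaultdict(lambda: defaultdict(set))
for user, tag, resource in assignments:
    users_per_tag[resource][tag].add(user)

for resource, tags in users_per_tag.items():
    print(resource, {tag: len(users) for tag, users in tags.items()})
    # e.g. http://www.youtube.com {'video': 3}
```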
2.2 Use of Tags in Web Search

Tags have also been utilized as a way of bookmarking and giving brief, concise summaries of web pages for search engines. As mentioned earlier, this could be used in an algorithm that measures the popularity of a page or its contents. It is also used as an alternative means of determining customer preference, as in the case of Last.fm: aside from associating the track lists of similar users, it uses descriptive tags to recommend new songs to existing users. It has been shown that these tag-based search algorithms produce better results than track-based collaborative filtering methods (Bischoff, Firan, Nejdl, & Paiu, 2008).

The text in a hyperlink that is most visible and accessible to the user is referred to as the Anchor Text (AT) or link label. Anchor texts are used heavily by web search engines. As they are able to describe the content of a linked object, they are used as a measure of similarity among objects in various web pages and aid in query refinement (Bischoff, Firan, Nejdl, & Paiu, 2008).

As most personalization algorithms still work on text, the documents in a dataset for evaluating personalization should be primarily textual social web content. The documents should be equipped with full-text information, but even more important is the basic bibliographic information such as author, title, abstracts and keywords. The dataset should explicitly contain users and their search tasks for evaluating personalization. Because the algorithms rely on the history of behavior and previously adapted results, there should be a sufficient sample. The person who proposed the search task is also encouraged to provide relevance annotations. The dataset should include as many extra features as possible, such as hyperlinks, tags, categories/topic labels and defined virtual communities. This optional user profile enables personalized results by identifying the users' interests and other document similarities, helping online communities make it easier to identify user expertise and interest (Yue et al., 2009).

Social bookmarking systems could also help in the detection or identification of trends in tagging, popularity and content. Del.icio.us is growing fast because of its ability to centrally collect and share bookmarks among users. It shares information through two channels of the website. The first channel is through bookmarks or tags, where users subscribe to others' content and are updated whenever items of interest are added. The second channel is the main webpage, where the front page is the primary means of sharing information. As it is the first point of contact, it attracts the attention of all visitors of the site (Wetzker, Zimmermann, & Bauckhage, 2008).

2.3 Motivations of Using Tags

A number of studies have focused on understanding the motivation behind tagging systems. Studies have shown that among the primary goals of tags is to serve the needs of individual users – such as browsing, categorizing and finding items (Bischoff, Firan, Nejdl, & Paiu, 2008). Tags can also be used for information discovery, sharing, community ranking, search, navigation and information extraction. There are two aspects of motivation in using tags – organizational and personal (Suchanek, Vojnovic, & Gunawardena, 2008). The organizational aspect is for the community – to provide context to others or describe characteristics of a certain object. The personal aspect is done by the user for his/her own use, for better organization and classification of information.

An example of the use of these tags in social networking sites is Flickr. The site has helped in the annotation of photos and enables users to share them with others. A photo is made searchable and tags are generated to further increase its exposure to the community. Another service, ZoneTag, is a mobile phone application that encourages annotation immediately after taking a picture. Aside from personal organization, another motivation for tagging is to convey information and opinions about the photo itself. Yet another motivation is to share contextual information with other people (presumably relatives and friends) whenever they tag their photos.
A taxonomy of annotation motivations was developed by Ames and Naaman (2007) and is summarized in Figure 1. It states that there are two dimensions with different incentives whenever photographs are tagged. The first dimension is sociality: whether the tag was intended for use by the person who took the shot, for friends/family, or for the general public. The second dimension, function, refers to the intended uses of the tag.

Figure 1. Taxonomy of Tagging Motivations (Ames & Naaman, 2007)

From the observations and analysis of user motivations, several implications were drawn for tagging systems in general (Ames & Naaman, 2007). First, the annotation should be pervasive and multi-functional, incorporating all the categories in the taxonomy. Second, information captured should be easy to annotate right away. Easy annotation at the point of capture would ensure that the tagging activity is done, and in a more precise manner. Third, users should not be forced to annotate. Even if this might be a more efficient way of tagging, it is still up to the discretion of the user when to annotate. Fourth, annotation should be allowed in both desktop and mobile settings. Mobile annotation supports the in-the-moment aspect of annotation, while the desktop/web-based component allows more descriptive or bulk notation. Lastly, relevant tag suggestions can encourage tagging and give users ideas about possible tags. These suggestions should be clearly defined in order to prevent confusion or ambiguity. It should be ensured that tags are accurate and not just entered because they are made available.

2.4 Types of Tags

Tags are classified along eight different dimensions (Bischoff, Firan, Nejdl, & Paiu, 2008): topic, time, location, type, author/owner, opinions/qualities, usage context and self-reference. The topic provides a description of the item under consideration – the subject of a picture, the title and lyrics of a song, and so on. While the theme of a written piece can be extracted from its content, this is not as easily done for pictures and songs. The time category specifies the month, year, season or any other periodical indicator. The location describes the setting – a country or city, its sights and attractions, landmarks or the hometowns of artists/writers. The type of file defines which kind of media is used, such as the type of web page presented. For music, it defines the accompaniment of instruments or the genre. For pictures, the camera settings and styles used are identified. The classification by author/owner defines the user and the rights to the object. Tags could also be made as subjective descriptions, such as the quality of the object and the common opinion shared by different users. Usage context states the purpose of the object, along with how it is collected or classified. Lastly, self-reference tags are highly personalized tags for personal use (Bischoff, Firan, Nejdl, & Paiu, 2008).

In an analysis of tag types, it was found that 50% of the tags in Del.icio.us and Flickr, as well as Anchor Text (AT), are topic-related keywords (Bischoff, Firan, Nejdl, & Paiu, 2008). This was mainly because of the convention that pictures and web pages are classified according to topic. For music on Last.fm, type was the most prominent classification, because this category comprises the song format, instrumentation and genre for music-related media. It was followed by Opinion/Quality and Author/Owner, showing how users refer to their content when it comes to music. Another finding in this research was that more than half of the existing tags provide additional information to the resources they annotate.
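A minimal sketch of what such a tag-type scheme might look like in code is shown below. The keyword lists and example tags are hypothetical; a real classifier over these eight dimensions would require far richer lexicons or manual coding.

```python
# Hypothetical, illustrative keyword lists for the eight tag dimensions
# described by Bischoff et al. (2008); not a validated classifier.
DIMENSIONS = {
    "topic":           {"music", "photography", "news", "python"},
    "time":            {"2011", "summer", "80s"},
    "location":        {"paris", "nyc", "beach"},
    "type":            {"video", "mp3", "blog", "jazz"},
    "author_owner":    {"bbc", "nasa"},
    "opinion_quality": {"awesome", "boring", "funny"},
    "usage_context":   {"toread", "work", "research"},
    "self_reference":  {"me", "mystuff", "todo"},
}

def classify_tag(tag: str) -> str:
    """Return the first dimension whose example list contains the tag."""
    tag = tag.lower()
    for dimension, examples in DIMENSIONS.items():
        if tag in examples:
            return dimension
    return "unclassified"

for tag in ["jazz", "toread", "awesome", "quantum"]:
    print(tag, "->", classify_tag(tag))
```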
2.5 Metadata

Metadata is essential for the organization and search of information resources. There are professional metadata creators who base their work on standards or controlled vocabularies. However, the high quality of this data also entails high cost, which limits its production and the scale to which it can expand. With the increasing volume of digital resources on the internet, alternative methods of metadata creation are desired. Even where automatic or semi-automatic generation of metadata is explored, its capabilities are limited in comparison to metadata created through human intelligence. Through the transition from Web 1.0 to Web 2.0, web users who annotate web resources through social tagging systems have become another class of metadata creators. The social tags created by users provide a special type of metadata that can be used for classifying and searching for resources. All web users who are able to access the content can also be taggers. The main difference lies in the enforcement of strict indexing standards: in Web 2.0 there are fewer rules and more freedom for users to choose whatever they see as a good description. Folksonomies introduced a big improvement compared to controlled vocabularies or large-scale ontologies – an appropriate set of required resources or information – that would have to be upgraded and would entail high maintenance cost (Lu, Park, Hu, & Song, 2010).

While innovations in technology have brought greater attention to photography and other related fields, semantic metadata about photo content is not readily available. Thus, photo collections need some form of annotation to improve usefulness, as well as to help recall and support search. However, the burden of this semantic interpretation and annotation still falls to the owner of the collection. Therefore, tools for annotation have been a constant topic of research (Ames & Naaman, 2007).

Two approaches to metadata creation in the web environment have been studied: user-created and author-created metadata (Lu, Park, Hu, & Song, 2010). The user-created metadata are those applied by users to annotate web pages, such as social tags. The author-provided metadata are the keywords and descriptions placed in the head part of the document. The overlap of metadata with the page title and text/body is examined to gauge how much these tags contain additional information beyond the page content. It was found that both types of metadata add to the existing page content, and more than 50% of the tags and keywords are not present in the title and content of the pages. Authors are also more likely to use terms from the page content to annotate the pages. Data analysis also showed that users and authors agree on only a small portion of terms that can be used in describing the web pages (Lu, Park, Hu, & Song, 2010). Clustering methods were then used to evaluate whether social tags or author-provided keywords and descriptions are effective in discovering web resources. The results showed that both tags and author-provided data could be used to improve performance significantly, with tags being the more effective independent information source. Lastly, it was found that tags can be more effectively utilized as the links connecting pages with related topics and with regard to the social/user property (Lu, Park, Hu, & Song, 2010).
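A minimal sketch of the overlap measurement described above – what fraction of a page's social tags already appear in its title or body – is given below. The sample page and tags are hypothetical and serve only to illustrate the idea.

```python
import re

def term_set(text: str) -> set:
    """Lowercased word tokens of a text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def tag_overlap(tags, title, body) -> float:
    """Fraction of tags that already occur in the page title or body."""
    page_terms = term_set(title) | term_set(body)
    present = [t for t in tags if t.lower() in page_terms]
    return len(present) / len(tags) if tags else 0.0

tags = ["photography", "sharing", "community", "flickr", "inspiration"]
title = "Flickr: photo sharing"
body = "Upload, organize and share your photos with friends and the community."

# Tags missing from the page text (here 2 of 5) are the ones that add
# information beyond the page content.
print(tag_overlap(tags, title, body))   # 0.6
```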
2.6 Crowdsourcing through Mechanical Turk

Some applications require general, common-sense knowledge in a form that is suitable for reasoning. Text corpora are mined to create useful, high-quality collections of this knowledge under a methodology referred to as Open Knowledge Extraction (OKE) (Gordon, Van Durme, & Schubert, 2010). The whole process of encoding the knowledge is quite cumbersome and entails additional cost in expert labor. OKE creates logical formulas from forms of human knowledge – books, newspapers and websites. It extracts insights and information from important sources, and it differs from information extraction in that it focuses on everyday, common-sense knowledge rather than specific facts. It also offers logical interpretability of its outputs. The knowledge base of these OKE systems is aimed at addressing the gap in the quality of automatically acquired knowledge; this requires an easy method for evaluating the quality of results. This is where the use of the Mechanical Turk comes into play. From this research, it was found that an inexpensive and fast evaluation of the output could be a way to measure incremental improvements in output quality coming from the same source.

Based on these issues, new ways have been developed to collect input from users online, such as surveys, online experiments and remote usability testing. With these tools, potential users can easily be reached by anyone with internet access. A study has been done on the micro-task market (Kittur, Chi, & Suh, 2008), where small tasks are entered into a common system that users select and complete for some reward (monetary or non-monetary). The micro-task market offers convenience, as tasks can be completed within a few seconds or minutes. It presents instant gratification, as there is quick access to a large user pool to collect data and workers are compensated immediately. Amazon's Mechanical Turk (MTurk) is one of the platforms on which tasks can be posted and then completed at a stated price. This aims to attract human users to complete relatively simple tasks that are easier for humans and harder for machines. The Mechanical Turk has been widely used by companies to source micro-tasks requiring human intelligence such as identifying objects in images, finding relevant information or natural language processing.

The Mechanical Turk works by converting each annotation task into a Human Intelligence Task (HIT). The core tasks for a researcher are: (1) define an annotation protocol and (2) determine what data needs to be annotated (Sorokin & Forsyth, 2008). These tasks require only minimal time and effort, and today the system employs over 100,000 users from about 100 countries. However, it is inevitable that the system faces some serious challenges. First, it goes against the conventional way of assigning participation and relies solely on people accepting and completing the tasks. Second, it requires a bona fide answer that cannot be quickly monitored in case workers do not cooperate properly. Lastly, the diverse user base draws workers from different areas without taking into consideration their demographic information, expertise and other important user-specific data.
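As an illustration of how a tagging HIT might be posted programmatically, the sketch below uses the present-day MTurk requester API through the boto3 client. This is not the setup used in the 2011 experiment, which predates boto3, and the title, reward and question form are hypothetical.

```python
# Hedged sketch of posting a tagging HIT via boto3 (assumes AWS credentials
# are configured); the values below are illustrative, not the study's settings.
import boto3

mturk = boto3.client(
    "mturk",
    region_name="us-east-1",
    # Sandbox endpoint; drop endpoint_url to post to the live marketplace.
    endpoint_url="https://mturk-requester-sandbox.us-east-1.amazonaws.com",
)

question_xml = """<?xml version="1.0" encoding="UTF-8"?>
<HTMLQuestion xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2011-11-11/HTMLQuestion.xsd">
  <HTMLContent><![CDATA[
    <html><body>
      <p>Visit http://www.youtube.com and enter up to five one-word tags
         that best describe the site.</p>
      <!-- A real HIT form must also submit the assignmentId back to MTurk. -->
    </body></html>
  ]]></HTMLContent>
  <FrameHeight>450</FrameHeight>
</HTMLQuestion>"""

response = mturk.create_hit(
    Title="Suggest tags for a website",
    Description="Provide up to five one-word tags that best describe the site.",
    Keywords="tagging, bookmarking, website",
    Reward="0.05",                       # USD per assignment
    MaxAssignments=30,                   # number of distinct workers
    LifetimeInSeconds=7 * 24 * 3600,
    AssignmentDurationInSeconds=600,
    Question=question_xml,
)
print("Posted HIT:", response["HIT"]["HITId"])
```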
It should be noted that during the initial operation of the Mechanical Turk, only workers with US bank accounts were accepted. However, workers from India and other countries have more recently been allowed to receive payment as well (Ipeirotis, 2010).

Several designs have been recommended to address these issues after experiments were made (Kittur, Chi, & Suh, 2008). First, it is important to have explicitly verifiable questions as part of the task. This makes workers aware that their answers are being monitored and checked, promoting better participation. Second, the task should be designed to require as little effort as possible in order to prevent wrongful completion. Third, there should be multiple ways to detect suspect responses, for example through task durations or repeated answers. These are similar to the three distinct aspects of quality assurance identified by Sorokin and Forsyth (2008): (a) ensuring that the workers understand the requested task and try to perform it well; (b) cleaning up occasional errors; and (c) detecting and preventing cheating in the system. The basic strategy is to collect multiple annotations per image. This identifies the natural variability of human performance and how occasional errors influence the results. While it allows malicious users to be caught, it entails additional cost. Another strategy is to perform a separate grading task, scanning annotated images and scoring each one, which results in cheaper quality assessments. The third strategy is to build a gold standard with the use of images with trusted annotations. This detects performance immediately, as feedback is also provided to the worker. It is also a cheaper alternative, as only a few images are needed to establish the gold standard. It has been found that it is important to turn the annotation process into a utility, making it easy to determine which data to annotate and what type of annotation should be applied.

Another study (Snow, O'Connor, Jurafsky, & Ng, 2008) explored whether non-expert labelers can provide reliable natural language annotations. Similar to the gold-label strategy, the authors chose five natural language understanding tasks that were easy to learn and understand even for non-experts: affect recognition, word similarity, recognizing textual entailment, event temporal ordering and word sense disambiguation. Each task was processed through MTurk to annotate data and to measure the quality of annotations in comparison with the gold-standard labels. From the experiments conducted on the different tasks, the conclusion was that only a small number of non-expert annotations per item are necessary to equal the performance of an expert annotator.

One thing the previous study was missing was the inclusion of machine translation (Callison-Burch, 2009). MTurk provides the requesters of tasks three ways to ensure quality. First, multiple workers can complete each HIT, which allows requesters to select higher-quality labels among respondents. Second, requesters can specify a particular set of qualifications for the workers. Third, they have the option to reject the work, in which case they are not required to pay. This keeps the level of participation high, even if the incentive is relatively small. The low cost is used to generate multiple redundant annotations, which in turn are used for ensuring translation quality. The judgments extracted from non-experts achieved quality equivalent to that of experts. The study also showed other capabilities of MTurk, such as creating, administering and grading a reading comprehension test with minimal intervention.
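A minimal sketch of the redundancy-based quality control described above – combining several workers' labels by majority vote and screening the result against a few gold-standard items – is shown below. All item ids, labels and gold answers are hypothetical.

```python
from collections import Counter

# Redundant worker labels per item (hypothetical).
labels = {
    "item1": ["video", "video", "music"],
    "item2": ["photos", "photos", "photos"],
    "item3": ["news", "blog", "news"],
}

# A few items with trusted answers, used only to estimate annotation quality.
gold = {"item2": "photos"}

def majority(votes):
    """Most common label among the redundant annotations for one item."""
    return Counter(votes).most_common(1)[0][0]

consensus = {item: majority(votes) for item, votes in labels.items()}
print(consensus)   # {'item1': 'video', 'item2': 'photos', 'item3': 'news'}

# Accuracy on the gold items gives a cheap signal of overall quality.
gold_accuracy = sum(consensus[i] == a for i, a in gold.items()) / len(gold)
print("gold accuracy:", gold_accuracy)
```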
The Amazon Mechanical Turk has also been investigated to determine how built-in qualifications can deter spammers. From an investigation of worker performance, it was found that a low qualification constraint for a group attracts more spammers. It was also found that there was no improvement in annotator reliability over time; thus, consistent annotations cannot easily be expected. However, it was observed that the workers could be reliable in subjectivity word sense annotation. This provides great benefit, as it enables annotations to be collected at low cost and over short time periods. Thus, a large-scale, general subjectivity word sense disambiguation component could possibly be implemented, helping with various subjectivity and sentiment analysis tasks (Akkaya, Conrad, Wiebe, & Mihalcea, 2010).

2.7 Information Retrieval Systems and Evaluation Models

The content and organization of information presented in screen displays is critical to the successful performance of online catalogues. It is imperative that the presentation of information is clear and effective in order to be helpful to the users. Thus, classification methods have been used to summarize the contents of retrieved records into one or two screens instead of displaying long lists. An information retrieval system should be carefully evaluated with regard to the basis on which it is designed and the structure it follows. This could also take into account the perspective of the system designer or a content-centered design based on user behavior (Carlyle, 1999). Moving away from a static process – a sequence of unrelated events – evaluation now includes users, tasks and contexts in a dynamic setting (Belkin, Cole, & Liu, 2009). The current systems used in evaluating information retrieval are not appropriate for many circumstances considered in research. An evaluation model was proposed (as depicted in Figure 2) to address these needs, with a particular focus on usefulness.

Figure 2. IR Evaluation Model (Belkin, Cole, & Liu, 2009)

Information seeking occurs whenever there is an information need. The performance of the system is measured by how it supports users in meeting their goal or completing the task that led them to information seeking. The proposed evaluation model conducts IR evaluation at three levels. The first level is to evaluate the information seeking with regard to what the user wants to accomplish. The second level assesses each interaction and its contribution to the overall accomplishment of the user. The third level then assesses each interaction against the information seeking strategy (ISS) being used. The usefulness of each level is measured by how it contributes to the outcome per interaction and to accomplishing the whole task.

Other evaluations of search engines measure their accuracy and completeness in returning relevant information, quantified through variables like recall and precision. However, these measures are not sufficient to evaluate the whole system; accuracy and completeness alone do not capture the system's full impact on the user. Separately, it was found that the vocabulary used by del.icio.us is highly standardized, attributed to the tag recommendation mechanisms provided to the users. It was also observed that users' attention to new URLs lasts only a short period, making them disappear after just a short while.
This could be caused by spam posted by automated mechanisms. The presence of spam highly distorts any analysis, and it was seen that 19 among the top 20 super users are automated. Thus, characteristics such as very high activity, few domains, very high or very low tagging rate, bulk posts, or any combination of these are identified to improve detection and filtering (Wetzker, Zimmermann, & Bauckhage, 2008).

To understand how a system of information retrieval works, participants and roles are clearly defined. First, there is the information seeker, usually the end-user or consumer of the services offered by the system. Second is the information provider, the entity responsible for the content to be searched, explored and delivered. Third, there are the information intermediaries, which can be categorized as either resource builders or exploration partners. Lastly, there is the system provider, responsible for the development and maintenance of the technology (Paris, Colineau, Thomas, & Wilkinson, 2009). While these roles aim to differentiate between all entities, they may not be appropriate for all situations. A summary of the cost-benefit assessment for these participants is shown in Table 1.

Table 1. Cost-Benefit Assessment for All Participants (Paris, Colineau, Thomas, & Wilkinson, 2009)
- Information seeker: benefits include task effectiveness, knowledge gained and satisfaction; costs include the time to complete the task, cognitive load and learning time.
- Information provider: benefits include audience reach, audience modeling and message accuracy; costs include metadata provision, maintaining a structured resource, currency of information and data capture.
- Information intermediaries: resource builders gain reliability, accuracy and correctness, while exploration partners gain task effectiveness; costs include the time to create and integrate the information and to capture contextual factors.
- System provider: benefits include system usage, knowledge creation and context modeling, response time and accuracy of exploration; costs include implementation (hardware and software), system maintenance and system integration.

The Cranfield or "batch mode" style of evaluation has been a cornerstone of IR progress for over 40 years and serves as a complement to manual user studies (Smucker, 2009). In this evaluation style, a retrieval system produces a ranked list of documents in response to each query, and the list is then evaluated against a pre-existing set of relevance judgments. The caveat in the process is that it does not take into consideration the wide range of user behavior present in interactive IR systems. To address this, three ideas are proposed: (a) evaluation should be predictive of user performance, (b) evaluation should concern itself with both the user interface and the underlying retrieval engine, and (c) evaluation should measure the time required for users to satisfy their information needs.
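To make the Cranfield-style measures concrete, the sketch below scores a ranked result list against a pre-existing set of relevance judgments using precision and recall at a cutoff. The document ids and judgments are hypothetical.

```python
def precision_recall_at_k(ranked, relevant, k):
    """Precision and recall of the top-k documents in a ranked list."""
    top_k = ranked[:k]
    hits = sum(1 for doc in top_k if doc in relevant)
    precision = hits / k
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

ranked_results = ["d3", "d7", "d1", "d9", "d4"]   # system output for one query
judged_relevant = {"d1", "d3", "d8"}              # pre-existing relevance judgments

for k in (1, 3, 5):
    p, r = precision_recall_at_k(ranked_results, judged_relevant, k)
    print(f"P@{k}={p:.2f}  R@{k}={r:.2f}")
```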
It would also observe the impact of retrieved results that are not used on user satisfaction from retrieval effectiveness. Almost half of the searches conducted do not result on a single click for the results, and these might fall under two categories: intentional-cause and unintentional-cause. The difference is that the unintentional is when the user does not get what is expected and the intentional is used for instant information or updates (Stamou & Efthimiadis, 2009). This is where the importance of tags is highlighted, as it would also be evaluated on the following measures developed by Stamou and Efthimiadis: (a) query refinement probability, (b) query-results usefulness, and (c) update search probability. The three methods undergo a probabilistic approach in which query refinement probability finds the effectiveness of consecutive searches and how they are refined through identification of overlapping terms. A threshold is set for these refinements, and when it is not met, the query-results usefulness is examined. It calculates the amount of time spent on the results as well as the activity that is done on them. It would then lead to update search probability, the probability that the user intention is only to obtain new information about a previous search. These measures enable feedback into the system, using it to determine user satisfaction from searches. User-Independent Ground Truth. Using the DMOZ (Open Project) catalogue, queries and ground truth are extracted. Each category would be given respective labels to 28 be used for keyword query. The set of relevant results for this query was formed by the URLs in that category which was also present in the crawl made in del.icio.us. This requires a large test collection while completely disregarding the user who submits the query (Crecelius & Schenkel, 2009). Context-based Ground Truth. A set of relevant answers were developed and assumed to be more relative to the querying user. The set of relevant answers for a keyword query is computed through the sets of items from friends of the user that match the query. This is not entirely reliable as some bias could be found whenever those within close proximity to the user are prioritized (Crecelius & Schenkel, 2009). Temporal Ground Truth. A snapshot or group of snapshots from the social network would be used to gauge the change experienced by the network over time. This would be used to come up with relevant answers for a query. This may lack relevance also as a user may just list down an item out of lack of knowledge and not actual interest (Crecelius & Schenkel, 2009). User Study. A set of topics was defined for each user, and results for each topic from different methods are gathered. Each group (pool) is assessed by the user who defined the topic. Although the queries are made public, a snapshot of the network is not available, making it hard to reuse and evaluate other approaches (Crecelius & Schenkel, 2009). Community-driven evaluation venues have successfully distributed the load of defining queries and assessing evaluation results among the participating organizations (Crecelius & Schenkel, 2009). Thus, it is preferable to apply it in social tagging networks. Each organization would be required to define several topics, along with a description of 29 the information need, a corresponding keyword query and example results. Each topic must be partnered with a user from the organization, one who has been a member of the social network or has experienced being part of it. 
As the topics are established, a snapshot of the network including the users is taken. This would be the data set that would be submitted and compiled per topic and assessed by its original author. This approach will then enable an evaluation that would incorporate all the peculiarities of social networks. The success of such an initiative would be dependent on the cooperation of companies and institutions who own social network data, along with others who would want to participate in the project. The perspective of test collection has truly shifted – from the use of a single judge (topic author) before letting samples of the user population make explicit judgments, or just analyzed to infer relevance (Kazai & Milic-Frayling, 2009). Google also applies Crowdsourcing for its Image Labeler game and Yahoo has its Answers portal. While both offer no incentive, Yahoo rewards the members with points that would raise their status in the community. This is also referred to as Community Question Answering. The incentive system has been found as a critical factor that motivates the workers to provide relevant answers. The establishment of trust is further strengthened when multiple assessors agree upon a common judgment. This translates to a better-defined topic, subsequently leading to similar interpretations among judges. Care must be applied as this may also lead to collusion by these workers just so they could increase their score. Meanwhile, disagreement can indicate that a topic is ambiguous and there are difference in the workers’ knowledge and criteria. Thus, the trust weight will depend on the ability to differentiate between the two. From the experiments made, the observed levels of 30 agreement are relatively high. This suggests two things – collusion between the workers was present or there was bias in their work. As relevance labels were already showed on their tasks, it could have affected their opinions. It was also found that background knowledge or topic familiarity does contribute to differences of opinions. Annotations are also given importance as it may be more trustworthy since workers spend extra time and effort in adding them. Three out of every four (76%) comments were explanations of relevance decisions or short summaries, while around 15% were qualitative statements about the relevance of the content (Kazai & Milic-Frayling, 2009). These comments may have been added as suggestions to the reviewers and may signal ambiguous content or be a measure of relevance. Lastly, it provides clues on the user background and task (Kazai & Milic-Frayling, 2009). A common problem encountered by search engines is vocabulary mismatch. Existing work on information retrieval has been categorized into two classes: query expansion and document expansion (Chen & Zhang, 2009). Query expansion is executed at query running time, and terms related to the original query are added. Meanwhile, document expansion modifies the documents as the system adds words related to the document at indexing time. Document expansion is seen as the more desirable form as it will not affect the query response time due to the long list of expanded query terms (Chen & Zhang, 2009). 
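To make the distinction between the two expansion strategies concrete, the following sketch contrasts them on a toy in-memory index. The related-term dictionary, documents, and queries are invented for illustration and are not drawn from the cited work.

```python
# Toy contrast between query expansion (related terms added at query time)
# and document expansion (related terms added at indexing time).
# The related-term dictionary and documents are invented examples.

RELATED_TERMS = {
    "photo": ["picture", "image"],
    "video": ["clip", "movie"],
}

DOCUMENTS = {
    "doc1": "funny cat video compilation",
    "doc2": "photo sharing community",
}


def build_index(documents, expand=False):
    """Build an inverted index; optionally expand each document's terms."""
    index = {}
    for doc_id, text in documents.items():
        terms = set(text.lower().split())
        if expand:  # document expansion: done once, while indexing
            for term in list(terms):
                terms.update(RELATED_TERMS.get(term, []))
        for term in terms:
            index.setdefault(term, set()).add(doc_id)
    return index


def expand_query(query):
    """Query expansion: related terms are added every time a query runs."""
    terms = query.lower().split()
    for term in list(terms):
        terms.extend(RELATED_TERMS.get(term, []))
    return terms


def search(index, terms):
    """Return the documents that match any of the given terms."""
    hits = set()
    for term in terms:
        hits |= index.get(term, set())
    return hits


plain_index = build_index(DOCUMENTS)
expanded_index = build_index(DOCUMENTS, expand=True)

# Query expansion: extra terms, and extra lookups, on every query.
print(search(plain_index, expand_query("photo")))   # {'doc2'}

# Document expansion: a plain query against the already-enriched index.
print(search(expanded_index, ["image"]))            # {'doc2'}
```

Either way the vocabulary gap is narrowed; the difference is only whether the extra terms are paid for once at indexing time or again on every query.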
The effectiveness of the search engine can be measured via inferring classical precision-recall based on the click-through rates mined from other websites of the main link, inferred relevance of the different information facets from the click-through rates mined from weblogs, and user studies to determine user satisfaction of the retrieved 31 information on the web via navigation. Thus, issues such as redundancy and effort to navigate should be evaluated. It has been found that the Enhanced Web Retrieval Task can be applied to numerous, active areas in web IR including semantic relationships, opinions, sponsored content, geo-spatially localized results, personalization of search, and multilingual support in search results (Ali & Consens, 2009). The textual features comprise the self-contained textual blocks that are associated with an object, usually with a well-defined topic or functionality. This type of analysis uses four features: title, description, tags and comments. For scientific publications, description is referred to as abstract and comments are reviews. Textual features may also be categorized according to the level of user collaboration allowed by the application. The textual features can either be collaborative or restrictive. Collaborative features are those that may be altered or appended by any user, while restrictive only allows the user to apply changes. This is also referred to as tagging/annotation rights. Usually, the title is restrictive while the comments are collaborative. These textual features are characterized in four aspects: feature usage, amount of content, descriptive and discriminative power and content diversity. Feature usage shows that the title offers the best quality of all features in all applications. It provides an understanding for the other objects and whether they may be a reliable source of information or not. The amount of content would determine if a feature is sufficient to be effective for IR. Heuristics are then used to assess the descriptive and discriminative power of each feature, on whether they offer a reasonably accurate description of the object content and/or discriminate objects into different pre-defined categories. This measure of efficiency classifies the categories into levels of relevance. Lastly, the content diversity across different features is measured to 32 come up with feature combination strategies. It was found that restrictive features seem to be more often explored than collaborative ones. However, there is a higher amount of content for collaborative features. Also, title and tags both exhibit higher descriptive and discriminative power. Lastly, there is significant content diversity among features associated with the same object, indicating that each feature possesses various kinds of information about it (Figueiredo, et al., 2009). Traditionally, there were only three ways search engines were able to access data describing pages. These were page content, link structure and query or clickthrough log data (Figueiredo, et al., 2009). An emerging field is the fourth type of data made available: the user-generated content that uses tags or bookmarks to describe pages directly. There are two different strategies in gathering datasets for the website. The first one is through the monitoring of the recent feed, a real-time tracking but without including older posts. There is also the crawl method, where tags are used to identify similar URLs that are subsequently added to the queue. 
The crawl method provides a relatively unfiltered view of the data but can be biased towards popular tags, users and URLs. The two methods complement each other and are represented in Figure 3, which shows (1) where the post metadata is acquired, (2 and 4) where the page text and forward-link page text are acquired, and (3) where the backlink page text is acquired. A significant finding from the research (Heymann, Koutrika, & Garcia-Molina, 2008) was that social bookmarking, as a data source for search, contributes URLs that are often actively updated and prominent in search results. Tags were found to be overwhelmingly relevant and objective; however, these tags are often functionally determined by context. Almost one in six tags are found in the title of the annotated page, and more than half are found in the page text. Other tags are not determined by the page itself, but the volume of such tagging data is still too small for it to be more effective than full-text search on its own. Improvement could be made through user interface features that increase the quality of tags. Figure 3. Real-time Processing Pipeline (Heymann, Koutrika, & Garcia-Molina, 2008) User-generated tags can carry substantial semantic noise, more so than terms drawn from page content and search queries. Tags become more meaningful when they are created by more users. Popular tags for a document incorporate terms useful for queries better than the most frequent content terms do, and the share of such useful terms (titles, categories, search keywords and descriptions) among the tags grows with their popularity. While tag suggestions may bias users towards the more popular tags, it was found that users did not prefer this kind of suggestion method; they were more interested in the larger set of data than in its popularity. This, in turn, encourages users to suggest a few tags so that those tags gain popularity as well (Suchanek, Vojnovic, & Gunawardena, 2008). A study was made on tagging within folksonomies from a user-centric perspective (Wetzker, Bauckhage, Zimmermann, & Albayrak, 2010). It was seen that users who tag for content categorization develop distinct tag vocabularies over time. While this promotes heterogeneity, the heterogeneity is reduced when the tags of many users are aggregated; this is how characteristic tag distributions are formed. A novel approach to tag translation was introduced, mapping user tags to the global folksonomy vocabulary using the labeled resources as intermediates. These mappings were used as a basis for inferring the meaning of user tags and as a predictor of which tags a user will assign to new content. Tag translation under this approach improves prediction accuracy for both tag recommendation and tag-based social search. Expanding this approach to narrow folksonomies would help in understanding how the interests of users shift and change over time; by creating accurate user models, quality of service could be improved. Incorporating social annotations with document content is a natural idea, especially for IR applications. A framework was proposed to combine information retrieval modeling with the social annotations associated with documents (Zhou, Bian, Zheng, Zha, & Giles, 2008). Therein, users' domain interests are modeled based on their social annotations. Language models derived from the tags are combined with the document models, and user expertise is evaluated based on activity intensity.
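One simple way to picture the combination of tag-derived and document-derived language models is linear interpolation. The sketch below is only a schematic, Jelinek-Mercer-style reading of that idea, with invented token counts; it is not the cited framework's exact formulation.

```python
# Schematic interpolation of a document language model with a language model
# built from the document's social annotations (tags). Token counts are
# invented for illustration.
from collections import Counter


def unigram_model(tokens):
    """Maximum-likelihood unigram probabilities from a token list."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def mixed_probability(word, doc_model, tag_model, lam=0.5):
    """P(word | document, tags) as a linear mixture of the two models."""
    return lam * doc_model.get(word, 0.0) + (1 - lam) * tag_model.get(word, 0.0)


doc_tokens = "streaming internet radio station playing personalized music".split()
tag_tokens = ["music", "radio", "pandora", "streaming", "audio"]

doc_lm = unigram_model(doc_tokens)
tag_lm = unigram_model(tag_tokens)

for query_word in ["music", "pandora"]:
    # 'pandora' never occurs in the page text, but the tag model recovers it.
    print(query_word, round(mixed_probability(query_word, doc_lm, tag_lm, lam=0.6), 3))
```

The mixing weight `lam` is exactly the kind of parameter whose setting has to be chosen and whose effect has to be checked.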
The study suggests that the effect of such parameter settings should be examined, especially their impact on user experience. CHAPTER 3: METHODOLOGY 3.1 Introduction The purpose of the study was to determine whether popular internet bookmarking tags can be recreated through crowdsourcing. Amazon Mechanical Turk, the work marketplace for tasks that require human intelligence, was used as a means of conducting the study. The study comprised multiple iterative experiments that were designed to achieve the highest possible quality in popular tag reproduction. Delicious, an online service for tagging, saving, and sharing bookmarks from a centralized location, supplied the golden set: its most tagged websites and their tags were the tags to be ultimately reproduced in this study. Key research questions for the study were examined along with a number of factors regarding tag creation, including the effectiveness of crowdsourcing in reproducing popular tags, which categories of tags can be recreated most effectively, and the relationship of worker characteristics and demographics to the effectiveness of producing popular tags. Based on these criteria, a quantitative quasi-experimental research design was deemed to be appropriate. This chapter presents a discussion of the following specifications: (a) the research design, (b) sample size, (c) research questions/hypotheses, (d) variables, and finally (e) the data analysis that was conducted in order to comprehensively address the research objectives. A summary concludes the chapter. 3.2 Research Design This quantitative approach with a quasi-experimental correlational research design primarily examined whether or not popular bookmarking tags can be recreated through crowdsourcing. The main purpose of the research design is to provide a method that allows for effective and efficient reproduction of popular tags using crowdsourcing. To this end a number of experiments were conducted. Each experiment provided useful data that suggested modifications to the experimental design of the study, which helped improve the tag recreation activity. Figure 4. Iterative experimental design approach used in this study The effectiveness of crowdsourcing in reproducing popular tags was examined using (a) quantitative data derived from online surveys and (b) popular tags for the most tagged websites on Delicious. Participants were gathered by posting tagging tasks on Mechanical Turk. Each participant was required to go through a qualification survey before he or she was trusted to take part in the research study. This quality assurance step was necessary to protect against automated scripts and workers who were trying to game the system. There were three main objectives to the quality assurance step: (a) verifying that the participants understood the task and what was requested of them; (b) identifying incomplete or nonsensical responses; and (c) identifying cheaters and preventing them from participating in the study. Five websites were considered for tagging tasks: YouTube, Flickr, Pandora, Facebook, and Digg. Those sites were chosen because they are the all-time most tagged sites on Delicious according to popacular.com. Popacular.com is an online service that tracks the most tagged web pages on Delicious over the following intervals: hourly, 8 hours, daily, weekly, monthly, and all time. The top 10 most popular tags for each of these sites were used in this study along with data collected from the study participants' survey responses.
The top 10 popular tags were used as a golden set to measure the participant’s ability to reproduce the same tags and exploring tag creation effectiveness with a number of user related factors. The 38 analysis of these variables with respect to the objectives of the study was completed by employing analysis of variance (ANOVA) and multiple linear regression. 3.3 Appropriateness of Design The use of a quasi-experimental research design allowed the determination of whether there were statistically significant differences between groups (Cozby, 2001) in which for this study are the different tag and websites. The quasi-experimental design was appropriate to assess these differences because it allowed the researcher to compare the levels or categories of the independent variables with regard to the dependent variable in order to determine whether there was a difference between the groups (Broota, 1989). More so, this quasi experimental correlational quantitative study specifically investigated the relationship of tagging experience (both usage and creation), search engine experience, interest in the website, and average daily time spent on the Internet of the participants. With such objective then a correlational design was appropriate. In the context of social and educational research, correlational research is used to determine the degree to which one factor may be related to one or more factors under study (Leedy & Ormrod, 2005). The research design is quantitative for the reason that a comparison was made between an independent variable and dependent variable (Creswell, 2009). This means 39 that the researcher was able to quantitatively assign numerical values to the independent and dependent variables so that a comparison was possible. The quantitative research approach was more appropriate for this research study than a qualitative design because with a qualitative design the researcher would not be able to assess a direct relationship between two variables as a result of the open-ended questions (Creswell, 2009). Qualitative design is more appropriate for observational or exploratory research that requires open ended questions and possibly ethnographic procedures. This study however follows a traditional deductive approach by building on existing theories and operationalizing variables derived from previous empirical studies. In this study quantitative research methods are most appropriate since the researcher was able to measure the variables needed for this study and define specific research questions derived from existing research. Therefore, the quasi-experimental design was used since this would allow the researcher to determine whether there was a difference in the different tags and websites based on the dependent variables. In order to determine whether there was a difference between the tag creation effectiveness and the various sites in terms of the tagging experience (both creation and usage), search engine experience, and average daily time spent on the Internet, an analysis of variance (ANOVA) was implemented. The ANOVA was appropriate because 40 the purpose was to determine whether there was a statistically significant difference between two independent populations (treatment vs. control) (Moore & McCabe, 2006). In addition, a multiple linear regression analysis was used to determine the relationship between the independent and dependent variables. The dependent variable would be tag creation effectiveness. 
The independent variables were interest in the website, familiarity with website, previous tag usage experience, previous tag creation experience, experience with search engines, time spend on the internet, and tag types. A multiple linear regression is appropriate because there would be multiple independent variables and only one dependent variable (Moore & McCabe, 2006). 3.4 Research Questions A number of empirical studies concluded that social bookmarking tags can provide additional data to search engines that were not provided by other sources and consequently improve web search. The same studies however concluded that there was a lack of availability and distribution of the tags that can improve search. This study was focused on finding a way to create social bookmarking tags efficiently and effectively using crowdsourcing. The research questions and hypothesis that guided this study were: 41 RQ1: Are there statistically significant differences in tag creation effectiveness for popular tags among the sites included in this study? H10: There are no statistically significant differences in tag creation effectiveness for popular tags among the sites included in this study. RQ2: Are there statistically significant differences in tag creation effectiveness across tag types? H20: There is no statistically significant difference in tag creation effectiveness across tag types. RQ3: What is the relationship among tag creation effectiveness and Interest in the Website Topic, Experience with Website or Similar Website, Tagging Creation Experience, Tag Usage Experience, Experience with Search Engine, and Time spent on the Internet? H30: None of the independent variables of Interest in the Website Topic, Experience with Website or Similar Website, Tagging Experience, Experience with Search Engine, and Time on the Internet have a statistically significant effect on tag creation effectiveness. 42 3.5 Population The participants for this study were selected by posting tagging tasks on Mechanical Turk. All participants were subjected to an initial qualification survey before being allowed to participate in this study. Information related to tagging tasks was collected from the participants and were subjected for analysis. 3.6 Sampling When calculating the sample size for the study, there are several factors that have to be taken into consideration. These factors include the power, the effect size, and the level of significance of the study. The statistical power is based on the probability of rejecting a false null hypothesis. As a general rule of thumb, the minimum power of a study that would be necessary to reject a false null hypothesis would be equal to 80% (Keuhl, 2000). The next important factor is the effect size. The effect size is a measurement of the strength of the relationship between the independent and dependent variables in the analysis (Cohen, 1988). In most instances, the effect size of the study can be divided into three different categories: small, medium, and large. Finally, the last two important considerations for the correct calculation of the sample size are the level of significance and the statistical procedure. The level of significance is usually set at an alpha equal to a 5%, which is typically the standard for 43 statistical significance. The statistical procedure must also be taken into account. Simple t-tests require a smaller sample than multiple regressions and, as a result, the most complicated method determines the sample size. 
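To illustrate how these inputs combine, the sketch below searches for the smallest sample size at which the overall F test of a multiple regression reaches a target power. The use of Cohen's f², the assumed medium effect of 0.15, and the assumed seven predictors are illustrative choices rather than a restatement of the study's exact calculation, and different tools or conventions can return somewhat different minimums.

```python
# Minimal sample-size search for the overall F test of a multiple regression,
# assuming Cohen's f^2 = 0.15 (a medium effect), alpha = 0.05, and seven
# predictors. The assumptions are illustrative; other tools or effect-size
# conventions (e.g., G*Power) may report a somewhat different minimum N.
from scipy.stats import f as f_dist, ncf


def regression_power(n, n_predictors, f2, alpha=0.05):
    """Power of the overall regression F test with n observations."""
    df_num = n_predictors
    df_den = n - n_predictors - 1
    if df_den <= 0:
        return 0.0
    critical_f = f_dist.ppf(1 - alpha, df_num, df_den)
    noncentrality = f2 * n
    return 1 - ncf.cdf(critical_f, df_num, df_den, noncentrality)


def required_sample_size(n_predictors=7, f2=0.15, alpha=0.05, target_power=0.95):
    """Smallest n whose power meets the target."""
    n = n_predictors + 2
    while regression_power(n, n_predictors, f2, alpha) < target_power:
        n += 1
    return n


print(required_sample_size())
```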
In this case, multiple linear regression was used. Based on this information, the minimum sample size required for this study was 74 (specified with a medium effect size, a power of 95%, and a level of significance equal to 5%). However, in this study the overall number of participants gathered and used for the analysis was 107. 3.7 Instrumentation and Data Collection The information that was used for this study came from two sources: 1. Popacular.com was used to obtain the top 5 most tagged web pages on Delicious. In this study the researcher used the all-time data for most tagged sites; other options include hourly, 8 hours, daily, weekly, and monthly. 2. A survey that was presented on Mechanical Turk (see Appendix A). The survey gathered key demographic information about the participants along with information pertaining to the tagging tasks. The information gathered from this instrument included age, gender, education level, the participant's interest in the site, familiarity with the site, experience with search engines, time typically spent on the Internet, and tag creation and usage experience, if any. The collection of data was administered through the Mechanical Turk system. The researcher used an iterative survey research design and kept updating the survey and qualification requirements until the desired quality was achieved. There were three total iterations of this survey. Each iteration provided tags that overlapped more with the golden set of popular tags gathered from popacular.com. The researcher found that a fourth iteration did not provide any tag quality benefit and decided to lock in the design and instructions of the third survey. Mechanical Turk allowed workers to comment on tasks and provide feedback to requesters. The researcher found this feature to be very useful, as it helped to quickly identify ambiguous questions and task instructions and to improve them in a relatively short period of time. The initial survey design yielded low-quality responses because some workers tried to game the system by attempting to complete a high number of human intelligence tasks (HITs) in the shortest possible time. The original task that was given to the selected participants was priced at $0.02, or 2 cents. The initial survey did not have a qualification requirement, so in the second iteration the researcher added a qualification requirement for the available HITs. The qualification requirements were mainly geared towards ensuring that the workers were invested in the task and intended to perform it well. Some of these qualification requirements included questions about the "about us" section of the sites included in the study. The questions were brief but ranged from asking the worker how many images were present on a certain web page to finding a sentence on the page and filling in its missing words in the survey. In this second iteration a number of workers provided feedback regarding some of the questions and tasks. In the third iteration, the researcher introduced a survey with improved instructions and clearly stated questions. This was the last iteration and it provided the highest quality results (later iterations did not add any significant improvement). The researcher at that point finalized the survey design and launched the actual study. The final survey contained the enhanced version of the instructions, the qualification requirement, and the questions related to the 5 websites. The final survey HIT was priced at $0.02, or 2 cents.
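For readers interested in how such a qualified, low-priced HIT can be posted programmatically, the snippet below sketches the idea using the present-day boto3 MTurk client. The survey URL, the qualification-type placeholder, and the parameter values are illustrative only and do not reproduce the study's actual configuration.

```python
# Sketch of posting a qualified, 2-cent survey HIT with the present-day
# boto3 MTurk client. The URL, qualification-type placeholder, and most
# parameter values are illustrative and do not reproduce the study's setup.
import boto3

mturk = boto3.client("mturk", region_name="us-east-1")

EXTERNAL_QUESTION = """
<ExternalQuestion xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2006-07-14/ExternalQuestion.xsd">
  <ExternalURL>https://example.com/tagging-survey</ExternalURL>
  <FrameHeight>600</FrameHeight>
</ExternalQuestion>
"""

response = mturk.create_hit(
    Title="Suggest tags for five popular websites",
    Description="Short survey: provide the tags you would use to bookmark each site.",
    Keywords="tagging, bookmarking, survey",
    Reward="0.02",                        # 2 cents per assignment, as in the study
    MaxAssignments=100,
    AssignmentDurationInSeconds=30 * 60,
    LifetimeInSeconds=5 * 60 * 60,
    Question=EXTERNAL_QUESTION,
    QualificationRequirements=[
        {
            # Placeholder for a custom qualification, e.g. one built from the
            # "about us" screening questions described above.
            "QualificationTypeId": "REPLACE_WITH_QUALIFICATION_TYPE_ID",
            "Comparator": "EqualTo",
            "IntegerValues": [1],
            "ActionsGuarded": "Accept",
        }
    ],
)
print(response["HIT"]["HITId"])
```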
The average time for a participant to complete the survey was 15 minutes; the shortest time was 12 minutes and the longest was 19 minutes. The responses were very reliable, and the final survey responses were completed within 5 hours. The raw data from Mechanical Turk was then downloaded for statistical analyses. A unique identification number was assigned to each of the participants so that no personal information was revealed or exposed (Cozby, 2001). This identification number was used to identify each participant in the study. 3.8 Operationalization of Variables The following variables and their specifications were used in the analysis. Tag Creation Effectiveness (TCE): Dependent continuous variable. TCE was calculated as the proportion of the participant-created tags that are listed on the popular tag list generated by the social network users. Ten popular tags were used for each site. Each tag was given a value that represents the usage frequency of the tag among Delicious users. For example, if 100,000 users used tag1 and 50,000 users used tag2, then tag1 is assigned a higher score than tag2, reflecting its frequency of use. Therefore, more popular tags, i.e. tags employed by more users, provide greater variance in this variable and thus a more robust analysis. Tag Type (TT): Independent categorical variable. Tag type was designed to categorize the type of tags created. In this case the researcher used the tag classification schema provided by Bischoff et al. (2008), which includes: Topic, Time, Location, Type, Author/Owner, Opinion/Qualities, Usage Context, and Self-Reference. Interest in the Website Topic (Interest): Independent ordinal variable. Interest was assessed through a 2-point Likert-scale question with 1 being most interested and 0 being least interested. Experience with Website or Similar Website (Experience): Independent ordinal variable. Experience was assessed through a 5-point Likert-scale question with 4 being most experienced and 0 being least experienced. Previous Tag Creation Experience (TCX): Independent ordinal variable. This variable was assessed through a 5-point Likert-scale question with 4 being most experienced and 0 being least experienced. Previous Tag Usage Experience (TUX): Independent ordinal variable. This variable was assessed through a 5-point Likert-scale question with 4 being most experienced and 0 being least experienced. Experience with Search Engine (ESE): Independent ordinal variable. This variable was assessed through a 5-point Likert-scale question with 4 being most experienced and 0 being least experienced. Average Daily Time Spent on the Internet: Independent ordinal variable. This variable was assessed through a 4-point Likert-scale question with 3 being most time and 0 being least time. 3.9 Data Analysis The data analysis used in this study comprised descriptive statistics, analysis of variance (ANOVA), and multiple linear regression. Each of these analyses was conducted in SPSS Version 16.0®. 3.9.1 Descriptive Statistics The descriptive statistics comprised frequency distributions as well as measures of central tendency. For the frequency distributions, the number and percentage of each occurrence were presented for the categorical variables in the study.
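To make the scoring of the dependent variable concrete, the following sketch computes a weight-based agreement score for one hypothetical participant against a popular-tag list. The tag counts are invented and only mirror the weighting idea defined in Section 3.8.

```python
# Weight-based agreement score for a single hypothetical participant,
# mirroring the tag creation effectiveness idea: each popular tag carries a
# weight proportional to how many users applied it. Counts are invented and
# do not reproduce the study's data.

popular_tag_counts = {      # hypothetical usage counts for one site
    "music": 100_000,
    "radio": 50_000,
    "streaming": 25_000,
    "audio": 15_000,
    "free": 10_000,
}


def tag_creation_effectiveness(participant_tags, tag_counts):
    """Share of total tag weight covered by the participant's tags."""
    total = sum(tag_counts.values())
    weights = {tag: count / total for tag, count in tag_counts.items()}
    matched = {tag.lower() for tag in participant_tags} & set(weights)
    return sum(weights[tag] for tag in matched)


print(tag_creation_effectiveness(["Music", "radio", "podcasts"], popular_tag_counts))
# 'music' and 'radio' account for 150,000 of the 200,000 weighted uses -> 0.75
```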
The measures of central tendency included the presentation of the mean, standard deviation, and minimum and maximum values for the continuous variables in the study such as the age of the participant. 3.9.2 ANOVA As a subsequent analysis, an ANOVA was conducted for the first and second hypothesis. The ANOVA is a statistical method that is used in order to determine whether an independent variable(s) has a significant impact on a single dependent 49 variable. An advantage of the ANOVA is that it allows the researcher to be able to include more than one independent variable in the model at the same time in order to determine the effect of each variable or to control for specific variables (Tabachnick & Fidell, 2001). In other words, the researcher is not limited to only including one variable in the analysis. This is important since this allows the researcher to control for a number of variables that may be related to the dependent variable. When the variables have been included in the ANOVA model, the results would indicate whether an individual or several independent variables contribute to the explanation in the variation of the dependent variable (Tabachnick & Fidell, 2001). What this means is that if a variable is found to be significant then it could be concluded that this variable significantly contributes to the explanation in the variation of the dependent variable (Keuhl, 2000). The significance of the test is based on an F-statistic that is from the F-distribution (Keuhl, 2000). Therefore, if the F-statistic exceeds this critical value then one would be able to conclude that there is a relationship between the independent and dependent variables. 3.9.3 Multiple Linear Regression A multiple linear regression model was used specifically for the third research question. The dependent variable would be tag creation effectiveness. The independent 50 variables were Interest in the Website Topic, Experience with Website or Similar Website, Tag Creation Experience, Tag Usage Experience, Search Engine Experience, and Time on the Internet. A multiple linear regression is appropriate because there would be multiple independent variables and only one dependent variable. This would be the most complex of the analyses because there would have to be more assumptions made in order to make valid inferences about the target population. The one limitation to this multivariate analysis is that the regression residuals must be normally distributed. Statistically significant parameter estimates for the multiple linear regression at the 0.05 significance level would be sufficient evidence to reject the null hypothesis. 3.10 Summary This chapter presented the type of research design that was used which is a quasiexperimental correlational design. This was chosen because it is the objective of the study to determine whether there are significant relationships between or among tag creation effectiveness and a number of independent variables. Mechanical Turk workers were surveyed and used as participants for this study. In terms of the statistical analysis, three separate statistical tests were used. Descriptive analysis, ANOVA, and multiple linear regression were deemed to be the most appropriate methodologies for testing the hypotheses of the study. This chapter also discussed the source of the data, research 51 questions and procedures, hypotheses and data collection. 
The data analysis and results will be discussed in Chapters 4 and 5 52 CHAPTER 4: RESULTS In this chapter, the results of the statistical analyses that were conducted to address the objectives of the study are presented. The chapter is organized in the following manner: 4.1 Introduction 4.2 Collected Data and Overview of Sample Population 4.3 Hypothesis Data Analysis 4.4 Summary 4.1 Introduction At a high level, this study examined if popular tags can be reproduced using crowdsourcing systems. To that end, there were three research questions – two primary ones and one secondary. The first research question (RQ1) examined the tag creation effectiveness of popular tags across the sites included in our study. The second research question (RQ2) examined those relationships by tag type to find out if certain types of tags are easier to reproduce by employing crowdsourcing workers. The third research question (RQ3) was mainly concerned with exploring the relationship between tag creation effectiveness and the following user specific factors: time spent on the Internet, experience with search engines, interest in the site, familiarity with the site or similar sites, previous tag creation experience, and previous tag usage experience. 4.2 Collected Data and Overview of Sample Population This study included data sets from three main sources: a) survey responses from Mechanical Turk study participants, b) popacular.com for the top most tagged sites on Delicious, c) Delicious sites for the golden set of the top 10 popular tags used in this study. 53 4.2.1 Mechanical Turk Population and Survey Descriptive Statistics Amazon Mechanical Turk is an online marketplace that was launched in 2005 to facilitate the completion of tasks that require human intelligence. This service provided requestors with a diverse, on-demand, scalable workforce while giving workers the flexibility to work from anyplace anytime and the freedom to choose from thousands of tasks. Mechanical Turk was based on the simple idea that there were many tasks that human beings can do much better and more effectively than computers. Tasks in this marketplace ranged from identifying objects in a photo or a video, performing data inspection and clean-up, translation, transcription of audio recordings, or researching data and validating its accuracy. During the time of this study the Amazon Mechanical Turk marketplace had about 85,000 tasks available to workers. At that time Mechanical Turk was viewed as a sweatshop that takes advantage of people by making them do tedious tasks in exchange for pennies. Many people wondered about the workers population and their demographics. In 2008 Panos Ipeirotis, a researcher at Stern School of Business of New York University, conducted an extensive survey that revealed data regarding the demographics of Mechanical Turk workers and proved that the early ideas about who these people were was far from accurate. To that end and according the 2008 survey about 76% of the workers were from the US, 8% were from India, 3.5% were from the United Kingdom, 2.5% were from Canada, and the remaining 9% were distributed across a large number of counties. The survey also revealed that about 59% of workers were females. Age distribution data was favoring the age group of 21 years of age to 40 years of age. Figure 5 shows the details of age distribution. 54 Figure 5: Age Distribution of Mechanical Turk Workers For education level about 52% of the workers have a bachelor’s degree. 
Figure 6 shows the details of the education distribution of the Mechanical Turk population: Figure 6: Education Distribution of Mechanical Turk Workers The survey also provided information about why workers participate in the Mechanical Turk marketplace, i.e. what motivates them to complete these tasks. Figure 7 shows that money only; money and fun; and money, fun, and killing time are the three primary reasons for participation. Figure 7: Primary Reasons for Participation In summary, it was concluded that the Mechanical Turk population is a good representation of online users. The study sample included responses from 107 total Mechanical Turk participants, and all of the responses were used for the statistical analysis. Sixty-four of these participants (59.8%) were female. The age of the participants ranged from 18 to 66 years (M = 42.21, SD = 10.92). Table 2 presents descriptive statistics on the participants' education, experience with search engines, time spent on the internet, tagging usage experience, and tag creation effectiveness (agreement scores between the participant-provided tags and the popular tags) for each of the five sites included in the study. As can be gleaned from this table, the average tag creation effectiveness ranged from .5665 (Digg) to .7561 (Facebook).
Table 2
Descriptive Statistics of Study Sample
Variable | Minimum | Maximum | Mean | Std. Deviation
Age | 18.00 | 66.00 | 42.2150 | 10.92170
Education | 1.00 | 7.00 | 3.3271 | 1.62397
ESE | .00 | 4.00 | 3.0187 | .85761
Time Spent on the Internet | .00 | 3.00 | 2.0374 | .86793
Previous Tag Usage Exp | .00 | 4.00 | 2.1215 | 1.37848
Previous Tag Creation Exp | .00 | 4.00 | 1.5981 | 1.18050
Agreement score – Youtube | .35 | .86 | .6577 | .18398
Agreement score – Flickr | .36 | .83 | .5713 | .18366
Agreement score – Pandora | .27 | .85 | .6084 | .20645
Agreement score – Facebook | .47 | .85 | .7561 | .14264
Agreement score – Digg | .36 | .71 | .5665 | .13884
4.2.2 Popacular and Delicious Data for Most Tagged Sites Popacular is a site that offered data about popular Delicious bookmarks and related user tagging activity, including a list of the 100 most tagged sites on Delicious, how many users tagged each of these sites, and tagging activity over several durations: hourly, daily, weekly, monthly, and an all-time category of most tagged sites. The all-time list reflected the least fluctuation in activity and changes to the sites over time, while the hourly lists reflected the most change in the list of sites and in the frequency of tagging. The all-time list of 100 sites had been tagged by a total of 3,328,778 users. The top-ranked site was tagged by 91,345 users and the last site on the list was tagged by 21,370 users (M = 33,287.78, SD = 12,720.73). Appendix B provides the complete list of the 100 most tagged sites. The top 5 most tagged sites were chosen for this study. Table 3 provides detailed information about the number of users that tagged each of these sites.
Table 3
Data for the All-Time Top 5 Most Tagged Sites
Site | Brief Description | No. of Unique Taggers
Youtube | Video sharing | 91,347
Flickr | Photo sharing | 79,982
Pandora | Music sharing | 62,186
Facebook | Relationships | 62,007
Digg | News sharing | 58,237
Delicious.com provided the list of tags for the five sites included in this study. Tables 5 through 9 below provide more data regarding the tags used and frequency of use for each site. Tag type was determined using the classification schemes developed by Sen et al. (2006). Table 4 shows the tag type classification schemas available and the mapping between them.
Table 4 Mapping between tag classification schemes Bischoff et al. Golder et al. Topic What or who it is about Time Refining Categories Location Type What it is Author/Owner Who owns it Qualities and Opinion/Qualities characteristics Usage Context Task organization Self-Reference Self-Reference Xu et al. Content-based Context-based # of Taggers Tag Weight Factual Attribute Subjective Subjective Organizational Personal Table 5 Site 1 - YouTube Tagging Data from Delicious Tag Sen et al. Tag Type (Category) 58 Video youtube Videos entertainment Media web2.0 Social Fun Music Community Total 26,000 18,280 16,906 9,221 7,559 6,747 4,649 4,626 3,391 3,141 100,520 0.258654994 0.181854357 0.168185436 0.091732988 0.075198965 0.067120971 0.046249503 0.046020692 0.03373458 0.031247513 1 F F F F F F S S F F Table 6 Site 2 - Flickr Tagging Data from Delicious Tag Photos Flickr photography Photo Sharing Images web2.0 Community social Pictures Total # of Taggers 22,755 19,077 15,990 15,256 10,670 9,650 9,542 4,586 3,805 3,805 115,136 Tag Weight 0.197635839 0.165691009 0.138879238 0.132504169 0.092673013 0.083813924 0.082875903 0.039831156 0.033047874 0.033047874 1 Table 7 Site 3 - Pandora Tagging Data from Delicious # of Tag Taggers Tag Weight Music 41,731 0.369291081 Radio 24,403 0.215950019 Pandora 8,181 0.072396308 streaming 8,149 0.07211313 Audio 7,560 0.066900879 Free 6,210 0.054954293 web2.0 6,010 0.053184429 mp3 4,908 0.043432475 Tag Type (Category) F F F F F F F F S F Tag Type (Category) F F F F F F F F 59 recommendations Social Total 3,000 2,851 113,003 0.026547968 0.025229419 1 Table 8 Site 4 - Facebook Tagging Data from Delicious # of Tag Taggers Tag Weight Facebook 16,466 0.231335525 Social 15,174 0.213183849 Networking 9,711 0.136432606 Friends 8,272 0.116215685 Community 6,590 0.092584787 socialnetworking 4,732 0.066481216 web2.0 4,448 0.062491219 network 3,104 0.04360898 Blog 1,443 0.020273118 Personal 1,238 0.017393015 Total 71,178 1 F S Tag Type (Category) F F F F F F F F S S Table 9 Site 5 - Digg Tagging Data from Delicious # of Tag Taggers Tag Weight Tag Type (Category) News 25,629 0.297665505 F Technology 12,263 0.14242741 F Blog 9,405 0.109233449 F web2.0 9,041 0.105005807 F Social 7,090 0.082346109 F tech 6,947 0.08068525 F Daily 5,445 0.063240418 S community 4,920 0.057142857 F Links 2,732 0.031730546 F web 2,628 0.030522648 F Total 86,100 1 4.3 Hypothesis Data Analysis Hypothesis 1 60 Null hypothesis 1 stated “There are no statistically significant differences in tag creation effectiveness for popular tags among the sites included in this study.” To assess whether there was a statistically significant difference in tag creation effectiveness among the sites, a repeated-measures ANOVA was conducted. The dependent variables in this analysis were the tag creation effectiveness scores for all five sites. The results are presented in Table 9. Results from the repeated-measures ANOVA showed that there were indeed significant differences in tag creation effectiveness across the sites (F (1, 106) = 70.597, p < 0.001). In order to assess which sites were significantly different from other ones, multiple pairwise comparisons were conducted, using a Bonferroni correction. The results are presented in Table 10. 61 Table 10 Pairwise Comparisons of Tag Creation Effectiveness among Sites (I) Site 1 2 3 4 5 (J) Site Mean Difference (I-J) Std. Error Sig. 
2 .086* .011 .000 3 .049* .012 .001 4 -.098* .014 .000 5 .091* .011 .000 1 -.086* .011 .000 3 -.037 .014 .109 4 -.185* .015 .000 5 .005 .012 1.000 1 -.049* .012 .001 2 .037 .014 .109 4 -.148* .015 .000 5 .042 .016 .082 1 .098* .014 .000 2 .185* .015 .000 3 .148* .015 .000 5 .190* .011 .000 1 -.091* .011 .000 2 -.005 .012 1.000 3 -.042 .016 .082 4 -.190* .011 .000 62 As can be gleaned from this table, most of the pairwise comparisons were statistically significant. Site 4 had the highest average tag creation effectiveness (M = 0.756), and it was significantly higher than all other sites. Site 1 had the second highest average tag creation effectiveness (M = 0.657), and it was also significantly different from all other sites. The lowest average tag creation effectiveness was observed for Site 5 (M = 0.566), although its average was not significantly different from that of sites 2 or 3. Based on these results, Null Hypothesis 1 was rejected. Hypothesis 2 Null hypothesis 2 stated “There is no statistically significant difference in tag creation effectiveness across tag types.” To assess whether there was a statistically significant difference in tag creation effectiveness across tag types, a series of ANOVAs were conducted. Specifically, one ANOVA for each site was used. The dependent variable in this analysis was the participants’ tag creation effectiveness for the site, whereas the grouping variable was the tag type for each participant for that site. There were two categories of tag types: “F” and “F and S.” It is important to note that, for Site 4, all participants had the same tag type (F). Therefore, no comparison was possible for this site. The analyses were thus limited to Sites 1, 2, 3 and 5. The results are presented in Table 11. As can be gleaned from this table, tag creation effectiveness was significantly higher (p < 0.001 in all cases) for the “F and S” tag types (with average tag creation effectiveness ranging from .715 to .838) than for “F” tag types (with average tag creation effectiveness ranging from .455 to .558). Therefore, Null Hypothesis 2 was rejected. 63 Table 11 Comparison of Tag Creation Effectiveness by Tag Types at Sites 1, 2, 3 and 5 F F and S M SD M SD F(1, 105) P Site 1 .558 .156 .838 .017 120.780 <.001 Site 2 .535 .166 .833 .000 41.693 <.001 Site 3 .525 .175 .833 .073 85.533 <.001 Site 5 .455 .067 .715 .000 681.526 <.001 Note: The F statistic corresponds to the test statistics of the ANOVA comparing tag creation effectiveness between the “F” and “F and S” group. The p value is the one associated to that test. Hypothesis 3 Null Hypothesis 3 stated: “None of the independent variables of Interest in the Website Topic, Experience with Website or Similar Website, Tagging Experience, Search Engine Experience, and Time on the Internet have a statistically significant effect on tag creation effectiveness.” To determine the relationship between tag creation effectiveness and a set of predictor variables, a series of multiple linear regression analysis procedures were used. Specifically, five regressions were estimated; one for each site. The dependent variable in these analyses was tag creation effectiveness for the site. The predictor variables were: interest in the site, familiarity with site, previous tag usage experience, previous tag creation experience, experience with search engines (ESE), time spent on the internet, and tag types. 
It is important to note that some of the predictor variables were constant for some of the sites, and thus had to be removed from the analysis. For example, as explained previously, it was not possible to use tag type as a predictor variable for Sites 3 and 4. 64 Additionally, for Site 4, the variable “interest in the site” had to be dropped for the same reason. The regression results are presented in tables 12 through 16. Table 12 Regression Results for Site 1 Variable (Constant) B Std. Error .480 .025 -.108 .018 Familiarity with Site .098 Previous Tag Usage Exp Beta t Sig. 18.870 .000 -.234 -6.021 .000 .011 .476 8.816 .000 .067 .007 .506 10.062 .000 Previous Tag Creation Exp .098 .010 .629 10.290 .000 ESE .008 .007 .037 1.188 .238 Time Spent on the Internet .000 .012 -.001 -.016 .987 -.189 .023 -.493 -8.374 .000 Interest in Site Participant Tag Types (=F and S) R2 = .951; F(7, 99) = 272.131, p < 0.001 Table 13 Regression Results for Site 2 Variable B Std. Error (Constant) .318 .017 Interest in Site .228 .016 Familiarity with Site .034 Previous Tag Usage Exp Beta t Sig. 18.530 .000 .619 13.967 .000 .009 .167 3.658 .000 .008 .005 .057 1.605 .112 Previous Tag Creation Exp .031 .007 .199 4.497 .000 ESE .002 .005 .009 .400 .690 -.012 .008 -.057 -1.457 .148 .052 .014 .093 3.714 .000 Time Spent on the Internet Participant Tag Types (=F and S) R2 = .972; F(7, 99) = 491.780, p < 0.001 65 Table 14 Regression Results for Site 3 Variable B (Constant) Std. Error .427 .021 -.036 .010 Familiarity with Site .186 Previous Tag Usage Exp Beta t Sig. 20.780 .000 -.073 -3.633 .000 .007 1.072 26.393 .000 .007 .006 .045 1.106 .271 Previous Tag Creation Exp -.007 .008 -.042 -.903 .369 ESE -.003 .006 -.011 -.456 .650 Time Spent on the Internet .007 .010 .029 .682 .497 Participant Tag Types (=F and S) -.054 .013 -.117 -4.191 .000 Interest in Site R2 = .968; F(7, 99) = 423.770, p < 0.001 Table 15 Regression Results for Site 4 Variable B Std. Error (Constant) .379 .032 Interest in Site .183 .014 Previous Tag Usage Exp .014 Previous Tag Creation Exp ESE Time Spent on the Internet R2 = .825; F(5, 101) = 95.043, p < 0.001 Beta t Sig. 11.842 .000 .965 13.071 .000 .010 .132 1.427 .157 -.008 .013 -.066 -.634 .527 .002 .009 .012 .213 .832 -.028 .016 -.169 -1.749 .083 66 Table 16 Regression Results for Site 5 Variable B Std. Error (Constant) .131 .013 Interest in Site .134 .008 -.015 Previous Tag Usage Exp Beta t Sig. 9.791 .000 .379 16.035 .000 .006 -.125 -2.664 .009 .007 .003 .068 2.135 .035 Previous Tag Creation Exp .000 .004 .002 .052 .959 ESE .003 .003 .016 .868 .387 -.001 .005 -.006 -.175 .862 .232 .011 .832 20.499 .000 Familiarity with Site Time Spent on the Internet Participant Tag Types (=F and S) R2 = .981; F(7, 99) = 716.785, p < 0.001 As can be gleaned from these tables, the predictive power of the models was high in all five cases, with R2 statistics ranging from .825 (for Site 4) through .981 (for Site 5). This suggests that the chosen set of predictor variables was enough to explain a very large proportion of the variability in tag creation effectiveness. The following conclusions can be derived from the regression results. First, it is apparent that experience with search engines and time spent on the internet did not have a significant effect on tag creation effectiveness for any of the sites. 
For Site 1, tag creation effectiveness was significantly and negatively related with interest in site and positively related with familiarity with site, previous tag usage experience, and previous tag creation experience. 67 For Site 2, tag creation effectiveness was significantly and positively related with interest in site, familiarity with site, previous tag creation experience. However, tag usage experience was not significantly related to tag creation effectiveness. For Site 3, tag creation effectiveness was significantly and negatively related with interest in site and positively related with familiarity with site. Moreover, for Site 4, tag creation effectiveness was significantly and positively related only with interest in site. Finally, for Site 5, tag creation effectiveness was significantly and positively related with interest in sit and previous tag usage experience. Additionally, it was significantly and negatively related with familiarity with site. 4.4 Summary The purpose of the study was to determine whether popular internet bookmarking tags can be recreated through crowdsourcing. Based on the results from the statistical analysis, it was found that Sites 4 and 1 had the highest average tag creation effectiveness, while the lowest one was associated with Site 5. Moreover, it appears that tag creation effectiveness was significantly higher for tag type “F and S” than for tag type “F.” Additionally, other variables were tested to assess their relationship with tag creation effectiveness. Interest in site, familiarity with site, tag creation experience and tag usage experience were significantly related to tag creation effectiveness for some of the sites, although the direction and significance of these relationships was not consistent across sites. 68 CHAPTER 5: CONCLUSIONS AND RECOMMENDATIONS The purpose of this experimental study was to determine whether popular internet bookmarking tags can be recreated through crowdsourcing. Using the Amazon Mechanical Turk as a means to conduct an experiment, the reproduction of popular Delicious tags for a variety of websites was successfully achieved. Additional objectives of the study was to examine a number of factors regarding tag creation including the effectiveness of crowdsourcing in reproducing popular tags, learn about what tags can be recreated most effectively, and the relationship of worker characteristics and demographics on the effectiveness of producing popular tags. The dependent variable for the study is tag creation effectiveness while the independent variables for the study are tag type, interest in the website topic, experience with website or similar website, tagging creation or usage experience, search engine experience, and average daily time spent on the internet. An analysis of variance (ANOVA) was conducted to determine the relationship among the independent and the dependent variables. Chapter 5 provides interpretations of the findings found in Chapter 4 as it relates to the research questions and literature reviewed. Chapter 5 also provides recommendations in terms of the significance of the study. Recommendations for future research and a brief summary conclude the chapter. 5.1 Scope, Limitations, Delimitations The scope of the present study was limited to the participants selected by post tagging tasks on Mechanical Turk. 
Mechanical Turk workers are believed to be acceptably representative of online users according to the surveys conducted by Panos Ipeirotis, an associate professor in the IOMS Department at the Stern School of Business of New York University, in 2008 and 2010. However, online users are quite dynamic and are always shifting and changing in demographics and interests. The same applies to the Amazon Mechanical Turk worker communities. Since one community is a subset of the other (Mechanical Turk workers are also online users), it is reasonable to assume that these two communities would have similar characteristics to some degree. However, the study may not be generalizable beyond the scope of these participants, as they may not represent the total user population of the sites under consideration or the population that uses search engines to find information online. Limitations include the nonrandomization of the participants selected and the truthfulness of the answers given by the participants, which could limit the analysis and interpretations of the study's results. 5.2 Findings and Implications A total of 107 participants were used to gather data, and the results were subjected to statistical analysis to answer the study's research questions. This section outlines the findings and their implications for bookmarking tags and crowdsourcing, organized by research question. Research Question 1: Are there statistically significant differences in tag creation effectiveness for popular tags among the sites included in this study? H10: There are no statistically significant differences in tag creation effectiveness for popular tags among the sites included in this study. Results from the repeated-measures ANOVA showed that there were indeed significant differences in tag creation effectiveness across the sites. In order to assess which sites were significantly different from other ones, multiple pairwise comparisons were conducted, using a Bonferroni correction. This analysis revealed that Site 4 had the highest average tag creation effectiveness, Site 1 ranked second, and Site 5 had the lowest average amongst the five sites included in our study. One possible reason for this is the popularity and wide membership of Sites 4 and 1, especially when compared to Site 5. At the time of this study Sites 4, 1 and 5 had 500+ million, 70+ million, and 2.7+ million active members respectively. This finding is consistent with what has been found in the literature. It has been said by Bischoff, Firan, Nejdl, and Paiu (2008) that popular tags have been utilized as a way of bookmarking and giving out brief, concise summaries about web pages for search engines. Thus, tags could be used in an algorithm that measures the popularity of a page or its contents. Further, in terms of social bookmarking systems the finding strengthens the idea that tags could also help in the detection or identification of trends in tagging, popularity and content. For example, del.icio.us grew quickly because of its ability to centrally collect and share bookmarks among users. It follows a format that shares information through two channels of the website. The first channel is through bookmarks or tags; this is where users subscribe to others' content and are updated whenever content matching their interests is added. The second channel is through the main webpage, where the front page is the primary means of sharing information.
Many have believed that popular tags could be used to improve web search, but that integrating them would not yield a noticeable difference because of their limited availability and distribution across the web. This study provides a way to generate popular tags in an efficient and scalable way, which opens the door to incorporating existing popular tags and to creating them for sites where they do not yet exist.

Research Question 2: Are there statistically significant differences in tag creation effectiveness across tag types?

H20: There is no statistically significant difference in tag creation effectiveness across tag types.

A series of ANOVA analyses was conducted to test the above hypothesis. The dependent variable was tag creation effectiveness and the independent variable was the tag type recorded for each participant for that site. There were two categories of tag types: “F” and “F and S.” It is important to note that, for Site 4, all participants had the same tag type (F), so no comparison was possible for this site; the analyses were therefore limited to Sites 1, 2, 3, and 5. Based on the series of ANOVA results, tag creation effectiveness was significantly higher (p < 0.001) for the “F and S” tag type than for the “F” tag type in all cases. There is thus sufficient statistical evidence of a significant difference in tag creation effectiveness across tag types, which suggests that tag creation effectiveness will differ as the tag type changes. Bischoff, Firan, Nejdl, and Paiu (2008) identified eight dimensions of tag types: topic, time, location, type, author/owner, opinions/qualities, usage context, and self-reference. Given these dimensions, the level of tag creation effectiveness can be expected to differ. In particular, the “type” dimension refers to the kind of media involved, such as the type of web page presented; a site that uses different media than another site can therefore also be expected to show a different tag creation effectiveness.

Research Question 3: Do the independent variables of interest in the website topic, experience with the website or a similar website, tagging experience, search engine experience, and time on the internet have a statistically significant effect on tag creation effectiveness?

H30: None of the independent variables of interest in the website topic, experience with the website or a similar website, tagging experience, search engine experience, and time on the internet has a statistically significant effect on tag creation effectiveness.

A series of multiple linear regression analyses was used to test the above hypothesis. The dependent variable was tag creation effectiveness and the independent variables were interest in the site, familiarity with the site, previous tag usage experience, previous tag creation experience, experience with search engines (ESE), time spent on the internet, and tag type. Five regressions were estimated, one for each site. The predictive power of the models was high in all five cases, with R² statistics ranging from .825 (Site 4) to .981 (Site 5). This suggests that the chosen set of predictor variables was sufficient to explain a very large proportion of the variability in tag creation effectiveness.
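The per-site analyses for Research Questions 2 and 3 amount to a one-way ANOVA on tag type and an ordinary least squares regression of tag creation effectiveness on the worker-level predictors. The sketch below is an illustration under assumed variable names (tag_type, interest, familiarity, tag_usage_exp, tag_creation_exp, search_engine_exp, time_on_internet), not the study's actual specification or code.

```python
# Illustrative sketch: per-site one-way ANOVA on tag type (RQ2) and per-site
# multiple linear regression (RQ3). File and column names are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

data = pd.read_csv("tag_effectiveness_long.csv")  # hypothetical file

for site, group in data.groupby("site"):
    # RQ2: one-way ANOVA of effectiveness across tag types ("F" vs. "F and S").
    # Skipped where only one tag type occurs (Site 4 in the study).
    if group["tag_type"].nunique() > 1:
        anova_model = smf.ols("effectiveness ~ C(tag_type)", data=group).fit()
        print(site, anova_lm(anova_model, typ=2))

    # RQ3: regression of effectiveness on worker characteristics for this site.
    predictors = ("interest + familiarity + tag_usage_exp + tag_creation_exp"
                  " + search_engine_exp + time_on_internet")
    if group["tag_type"].nunique() > 1:   # add tag type only where it varies
        predictors += " + C(tag_type)"
    reg = smf.ols("effectiveness ~ " + predictors, data=group).fit()
    print(site, "R-squared:", round(reg.rsquared, 3))
    print(reg.params)
    print(reg.pvalues)
```

The per-site R² values printed by such a model correspond to the range (.825 to .981) reported above; the coefficient signs and p-values correspond to the per-site relationships summarized in Chapter 4.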
5.3 Recommendations

The findings from the study revealed several significant themes regarding ways to create social bookmarking tags efficiently and effectively using crowdsourcing. This objective was measured through tag creation effectiveness. First, tag creation effectiveness was measured against popular tags for the sites considered in the study. The findings suggest that popular tags can indeed be recreated using crowdsourcing and can thus be made available through this method to improve web search. Creating popular tags, or tags useful for search engines, through crowdsourcing is a reliable, effective, and efficient approach; it addresses the scarcity and limited distribution of popular tags, which have been shown to be most useful for improving web search. Second, tag creation effectiveness was measured against tag types. The results suggest that effectiveness differs from one tag type to another, so tag type should be a consideration for any site where different tag types occur. Lastly, tag creation effectiveness was measured across user-specific characteristics. Experience with search engines and time spent on the internet did not have a significant effect on tag creation effectiveness for any of the five sites considered in the study. Specifically, tag creation effectiveness for Site 1 was significantly and negatively related to interest in site and positively related to familiarity with site, previous tag usage experience, and previous tag creation experience. Tag creation effectiveness for Site 2 was significantly and positively related to interest in site, familiarity with site, and previous tag creation experience. Tag creation effectiveness for Site 3 was significantly and negatively related to interest in site and positively related to familiarity with site. For Site 4, tag creation effectiveness was significantly and positively related only to interest in site. Lastly, tag creation effectiveness for Site 5 was significantly and positively related to interest in site and previous tag usage experience.

5.4 Scope and Limitations of the Study

The scope of this study was limited to user-generated content only, more specifically content from social networking sites. It does not cover or address other types of content created through traditional channels with proper content management and control processes and procedures. Therefore, the findings of this study should apply only to systems of user-generated content; for example, they may not be extended to business or institutional sites without further research and examination. The rationale for this focus derives from the basic idea that user-generated content presents a bigger problem when it comes to information organization and indexing. Given the speed at which user-generated content is created, it is important to find new and creative ways to quickly index this content so it can be accessible to users through web search. Furthermore, the study focused on measuring Tag Creation Effectiveness for the top 5 most tagged sites on Delicious. Tag Creation Effectiveness is, in essence, the level of agreement between the tags produced by the study participants and the popular tags found on Delicious for the 5 sites included in this research. A question arose regarding the use of popular Delicious tags as a golden set for measuring Tag Creation Effectiveness. They were used because popular tags are believed to be a good set of tags that can help search; since a large number of unique users chose these tags, they can reasonably serve as the golden set for the study.
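The notion of agreement against a golden set can be illustrated with a simple overlap measure. The sketch below is one plausible example only, with hypothetical tag lists; it is not necessarily the exact scoring formula used to compute Tag Creation Effectiveness in this study, which is defined in Chapter 3.

```python
# Illustrative sketch: one simple way to score agreement between a
# participant's tags and a golden set of popular Delicious tags.
# This is a hypothetical overlap measure, not the study's exact formula.
def tag_overlap(participant_tags, golden_tags):
    """Fraction of a participant's distinct tags that appear in the golden set."""
    produced = {t.strip().lower() for t in participant_tags if t.strip()}
    golden = {t.strip().lower() for t in golden_tags}
    if not produced:
        return 0.0
    return len(produced & golden) / len(produced)

# Hypothetical example for a video-sharing site.
golden = ["video", "youtube", "videos", "media", "entertainment", "streaming"]
worker = ["Video", "music", "streaming", "fun"]
print(tag_overlap(worker, golden))  # 0.5 -> two of the four tags match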
Since the process of producing tags with Amazon Mechanical Turk showed that Tag Creation Effectiveness was strong across all sites, it is believed that this same process can now be employed to generate tags for less popular user-content sites. New sites are usually not very accessible through web search because it takes time for users to adopt new features and use them to generate good descriptive data about the new content. The process presented in this research study can be employed at any time to address this gap in the search and accessibility of such information – what some call the cold start problem.

Lastly, the efficiency aspect of this research study refers to the process of producing tags through the use of crowdsourcing platforms, in this case Amazon Mechanical Turk. This process is a key feature of this research and should be viewed as a repeatable way to generate a good set of tags that can describe web pages, and more specifically pages that contain user-generated content. The process addresses the efficiency aspect of creating tags because it is fast, cheap, and reliable.

5.5 Significance of the Study

This experimental study is significant because it adds to the body of knowledge by providing a reliable, inexpensive, and fast method of recreating user-generated tags, useful for search engines, via crowdsourcing. The results identified the potential benefits of crowdsourcing in tag creation and provided a specific process and design for tag creation tasks. This method is most useful for new web sources that are not yet popular or have not yet been tagged by a large number of users. The study revealed that tag creation effectiveness differs across sites, tag types, and user-related characteristics. Web sites could use this information when creating their social bookmarking tags to improve their web search effectively and efficiently and to make themselves more accessible to potential users. The results also provide insight into the role of crowdsourcing in generating social tags and ultimately improving web search.

5.6 Summary and Conclusions

The present study employed a quantitative experimental research design to explore whether popular internet bookmarking tags can be recreated through crowdsourcing. Participants were recruited by posting tagging tasks on Mechanical Turk, and all participants completed an initial qualification step before they were allowed to take part in the actual study. Information related to the tagging tasks was collected from the participants and subjected to analysis. The results revealed that popular bookmarking tags can be recreated effectively and efficiently through crowdsourcing. Moreover, the analysis revealed that tag creation effectiveness generally differs across sites and tag types.

References

1. Broota, K. D. (1989). Experimental design in behavioral research. Daryaganj, New Delhi: New Age International.
2. Cohen, J. (1988). Statistical power analysis for the behavioral sciences. Hillsdale, NJ: Erlbaum.
3. Cozby, P. C. (2007). Methods in behavioral research (12th ed.). New York, NY: McGraw Hill.
4. Creswell, J. W. (2009). Research design: Qualitative, quantitative, and mixed methods approaches. Thousand Oaks, CA: Sage Publications. doi:10.1177/1558689808325771
5. Fidel, R. (1994). Human-centered indexing. Journal of the American Society for Information Science, 45(8), 572-578.
6. Leedy, P., & Ormrod, J. (2001).
Practical research: Planning and design (7th ed.). Upper Saddle River, NJ: Merrill Prentice Hall.
7. Pirolli, P. (2005). Rational analysis of information foraging on the Web. Cognitive Science, 29(3), 343-373.
8. Rowley, J. E. (1988). Abstracting and indexing (2nd ed.). London: Clive Bingley.
9. Sinha, R. (2005). A cognitive analysis of tagging. Retrieved from http://www.rashmisinha.com
10. Cozby, P. C. (2001). Methods in behavioral research. New York: McGraw Hill.
11. Kuehl, R. O. (2000). Design of experiments: Statistical principles of research design and analysis. Pacific Grove, CA: Duxbury Press.
12. Moore, D. S., & McCabe, G. P. (2006). Introduction to the practice of statistics. New York: W.H. Freeman.
13. Tabachnick, B. G., & Fidell, L. S. (2001). Using multivariate statistics. Needham Heights, MA: Allyn and Bacon.
14. Bischoff, K., Firan, C. S., Nejdl, W., & Paiu, R. (2008). Can all tags be used for search? Conference on Information and Knowledge Management (pp. 203-212). California, USA: Association for Computing Machinery.
15. Gordon, J., Van Durme, B., & Schubert, L. K. (2010). Evaluation of commonsense knowledge with Mechanical Turk. NAACL HLT 2010 Workshop on Creating Speech and Language Data with Amazon's Mechanical Turk (pp. 159-162). Los Angeles, CA: Association for Computational Linguistics.
16. Kittur, A., Chi, E. H., & Suh, B. (2008). Crowdsourcing user studies with Mechanical Turk. 26th Annual CHI Conference on Human Factors in Computing Systems. Florence, Italy: Association for Computing Machinery.
17. Sorokin, A., & Forsyth, D. (2008). Utility data annotation with Amazon Mechanical Turk. IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops. Anchorage, AK: IEEE.
18. Ipeirotis, P. (2010). Demographics of Mechanical Turk. Retrieved from New York University: http://archive.nyu.edu/bitstream/2451/29585/2/CeDER-10-01.pdf
19. Snow, R., O'Connor, B., Jurafsky, D., & Ng, A. Y. (2008). Cheap and fast - but is it good? Evaluating non-expert annotations for natural language tasks. Retrieved from Stanford University: http://www.stanford.edu/~jurafsky/amt.pdf
20. Callison-Burch, C. (2009). Fast, cheap and creative: Evaluating translation quality using Amazon's Mechanical Turk. Conference on Empirical Methods in Natural Language Processing (pp. 286-295). Singapore: ACL.
21. Belkin, N. J., Cole, M., & Liu, J. (2009). A model for evaluation of interactive information retrieval. SIGIR 2009 Workshop on the Future of IR Evaluation (pp. 7-8). Boston, MA: IR Publications.
22. Paris, C. L., Colineau, N. F., Thomas, P., & Wilkinson, R. G. (2009). Stakeholders and their respective costs-benefits in IR evaluation. SIGIR 2009 Workshop on the Future of IR Evaluation (pp. 9-10). Boston, MA: IR Publications.
23. Smucker, M. D. (2009).
A plan for making information retrieval evaluation synonymous with human performance prediction. SIGIR 2009 Workshop on the Future of IR Evaluation (pp. 11-12). Boston, MA: IR Publications.
24. Stamou, S., & Efthimiadis, E. N. (2009). Queries. SIGIR 2009 Workshop on the Future of IR Evaluation (pp. 13-14). Boston, MA: IR Publications.
25. Crecelius, T., & Schenkel, R. (2009). Evaluating network-aware retrieval in social networks. SIGIR 2009 Workshop on the Future of IR Evaluation (pp. 17-18). Boston, MA: IR Publications.
26. Kazai, G., & Milic-Frayling, N. (2009). On the evaluation of the quality of relevance assessments collected through crowdsourcing. SIGIR 2009 Workshop on the Future of IR Evaluation (pp. 21-22). Boston, MA: IR Publications.
27. Yue, Z., Harpale, A., He, D., Grady, J., Lin, Y., Walker, J., et al. (2009). CiteEval for evaluating personalized social web search. SIGIR 2009 Workshop on the Future of IR Evaluation (pp. 23-24). Boston, MA: IR Publications.
28. Ali, M. S., & Consens, M. P. (2009). Enhanced web retrieval task. SIGIR 2009 Workshop on the Future of IR Evaluation (pp. 35-36). Boston, MA: IR Publications.
29. Figueiredo, F., Almeida, J., Belém, F., Gonçalves, M., Pinto, H., Fernandes, D., et al. (2009). Evidence of quality of textual features on the web 2.0. 18th ACM Conference on Information and Knowledge Management (pp. 909-918). New York: Association for Computing Machinery.
30. Wetzker, R., Zimmermann, C., & Bauckhage, C. (2008). Analyzing social bookmarking systems: A del.icio.us cookbook. ECAI Mining Social Data Workshop (pp. 26-30). Patras, Greece: ECAI.
31. Marge, M., Banerjee, S., & Rudnicky, A. I. (2010). Using the Amazon Mechanical Turk for transcription of spoken language. IEEE International Conference on Acoustics, Speech and Signal Processing (pp. 5270-5273). Dallas, TX: IEEE.
32. Heymann, P., Koutrika, G., & Garcia-Molina, H. (2008). Can social bookmarking improve web search? WSDM International Conference on Web Search and Web Data Mining (pp. 195-205). New York: Association for Computing Machinery.
33. Suchanek, F. M., Vojnovic, M., & Gunawardena, D. (2008). Social tags: Meaning and suggestions. 17th ACM Conference on Information and Knowledge Management. New York: Association for Computing Machinery.
34. Ames, M., & Naaman, M. (2007). Why we tag: Motivations for annotation in mobile and online media. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. New York: Association for Computing Machinery.
35. Wetzker, R., Bauckhage, C., Zimmermann, C., & Albayrak, S. (2010). I tag, you tag: Translating tags for advanced user models. Third ACM International Conference on Web Search and Data Mining (pp. 71-80). New York: Association for Computing Machinery.
36. Zhou, D., Bian, J., Zheng, S., Zha, H., & Giles, C. (2008). Exploring social annotations for information retrieval. 17th International Conference on World Wide Web. New York: Association for Computing Machinery.
37. Akkaya, C., Conrad, A., Wiebe, J., & Mihalcea, R. (2010). Amazon Mechanical Turk for subjectivity word sense disambiguation. NAACL HLT 2010 Workshop on Creating Speech and Language Data with Amazon's Mechanical Turk (pp. 195-203). Los Angeles, CA: Association for Computational Linguistics.
38. Chen, S.-Y., & Zhang, Y. (2009). Improve web search ranking with social tagging. 1st International Workshop on Mining Social Media. Sevilla, Spain: CAEPIA-TTIA.
39. Lu, C., Park, J.-r., Hu, X., & Song, I.-Y. (2010).
Metadata effectiveness: A comparison between user-created social tags and author-provided metadata. 43rd Hawaii International Conference on System Sciences (pp. 1-10). Hawaii: IEEE Computer Society.
40. Carlyle, A. (1999). User categorisation of works toward improved organisation of online catalogues. Journal of Documentation, 55(2), 184-208.

Appendix A: The Survey Tool

Survey Starts Here

1. How old are you? ------ years

2. What is your gender? Please check one selection from the choices below:
-- Male
-- Female

3. What is your education level? Please check one selection from the choices below:
-- Less than high school
-- High school
-- Associates degree
-- Some college, no degree
-- 4 year college degree (Bachelor's degree)
-- Masters degree
-- Some grad school, no degree
-- Ph.D., MD, JD, or other advanced degree

4. What is your experience with using search engine services such as Google, Yahoo or Bing?
-- Not at all experienced (I use it rarely and only when instructed by someone else).
-- Novice (I use it regularly but I am not always successful at finding the information I need).
-- Average (I rely on it regularly to find what I need online and it works in most cases).
-- Above average (I use it very often and can find what I need with very little trouble if any at all - rely on it very heavily).
-- Expert (I use it all the time and can find anything I need with no trouble at all - I can not live without it)

5. How much time do you spend on the Internet on average?
-- I rarely spend time on the Internet
-- I use it at least once a week
-- I use it at least once a day
-- I use it more than once a day

6. Tag Usage Experience: Do you use tags for any purpose (finding, sharing, or storing information)?
-- Never: I know nothing about tags
-- I don't use tags but I know about them
-- Sometimes I use them
-- I use them frequently
-- I use tags all the time

7. Tag Creation Experience: Do you create tags for any purpose (finding, sharing or storing information)?
-- Never: I know nothing about tags
-- I don't create tags but I know about them
-- Sometimes I create them
-- I create them frequently
-- I create tags all the time

8. Follow the instructions provided below and answer questions about 5 different websites.

Website 1: http://www.youtube.com/
Click on the hyperlink provided for website 1 and answer the following questions:
a) What do you think this website is about – you can use the “About Us” section to provide this information?
------------------------------------------------------------
b) Are you familiar with this site or a similar site?
-- Never saw it before – I am not familiar with it.
-- Seen it before or heard about it but I did not use it
-- I use this site sometimes so I am somewhat familiar with it
-- I use this site all the time so I am familiar with it
c) Are you interested in this site or interested in what it is about (the topic it covers)? (Do you like this site?)
-- Yes
-- No
d) What words would you use as tags to describe this site?
------------------------------------------------------------

Website 2: http://www.flickr.com/
Click on the hyperlink provided for website 2 and answer the following questions:
a) What do you think this website is about – you can use the “About Us” section to provide this information?
------------------------------------------------------------
b) Are you familiar with this site or a similar site?
-- Never saw it before – I am not familiar with it.
-- Seen it before or heard about it but I did not use it
-- I use this site sometimes so I am somewhat familiar with it
-- I use this site all the time so I am familiar with it
c) Are you interested in this site or interested in what it is about (the topic it covers)? (Do you like this site?)
-- Yes
-- No
d) What words would you use as tags to describe this site?
------------------------------------------------------------

Website 3: http://www.pandora.com/
Click on the hyperlink provided for website 3 and answer the following questions:
a) What do you think this website is about – you can use the “About Us” section to provide this information?
------------------------------------------------------------
b) Are you familiar with this site or a similar site?
-- Never saw it before – I am not familiar with it.
-- Seen it before or heard about it but I did not use it
-- I use this site sometimes so I am somewhat familiar with it
-- I use this site all the time so I am familiar with it
c) Are you interested in this site or interested in what it is about (the topic it covers)? (Do you like this site?)
-- Yes
-- No
d) What words would you use as tags to describe this site?
------------------------------------------------------------

Website 4: http://www.facebook.com/
Click on the hyperlink provided for website 4 and answer the following questions:
a) What do you think this website is about – you can use the “About Us” section to provide this information?
------------------------------------------------------------
b) Are you familiar with this site or a similar site?
-- Never saw it before – I am not familiar with it.
-- Seen it before or heard about it but I did not use it
-- I use this site sometimes so I am somewhat familiar with it
-- I use this site all the time so I am familiar with it
c) Are you interested in this site or interested in what it is about (the topic it covers)? (Do you like this site?)
-- Yes
-- No
d) What words would you use as tags to describe this site?
------------------------------------------------------------

Website 5: http://digg.com/
Click on the hyperlink provided for website 5 and answer the following questions:
a) What do you think this website is about – you can use the “About Us” section to provide this information?
------------------------------------------------------------
b) Are you familiar with this site or a similar site?
-- Never saw it before – I am not familiar with it.
-- Seen it before or heard about it but I did not use it -- I use this site sometimes so I am somewhat familiar with it -- I use this site all the time so I am familiar with it 87 c) Are you interested in this site or interested in what it is about (the topic it covers (do you like this site)? -- Yes -- No d) What words would you use as tags to describe this site? ------------------------------------------------------------------------------------------------------------------------------------------------------------------- Survey Ends Here 88 Appendix B: Popacular Top 100 Most Tagged Sites on Delicious – All-Time Rank 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 Site YouTube - Broadcast Yourself Flickr Pandora Radio - Listen to Free Internet Radio, Find New Music Welcome to Facebook! | Facebook Digg.com Wordle - Beautiful Word Clouds All News, Videos, & Images Google stock.xchng - the leading free stock photography site TED: Ideas worth spreading Lifehacker, the Productivity and Software Guide Zamzar - Free online file conversion dafont.com COLOURlovers :: Color Trends + Palettes Web 2.0 Tools and Applications - Go2web20 The Internet Movie Database (IMDb) Scribd Upload & Share PowerPoint presentations and documents Smashing Magazine Slashdot - News for nerds, stuff that matters Wikipedia, the free encyclopedia Install Bookmarklets on Delicious Instructables - Make, How To, and DIY Tw deviantART: where ART meets application! W3Schools Online Web Tutorials Technorati: Front Page Etsy :: Your place to buy and sell all things handmade Browsershots kuler Internet Archive - Suchmaschine die u.a. alte Versionen von Websiten findet The New York Times Yahoo! Last.fm - Listen to internet radio and the largest music catalogue online Prezi - The zooming presentation editor Number of Unique Taggers 91,347 79,982 62,186 62,007 58,237 58,847 58,019 55,835 55,220 55,041 53,157 45,945 45,075 44,311 40,766 40,295 39,716 38,976 38,652 38,261 37,831 37,477 37,228 36,751 36,696 35,573 35,063 35,003 34,807 34,740 34,144 33,515 33,332 32,903 32,864 89 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 MySpace Index/Left on MySpace Music - Free Streaming MP3s, Pictures & Music Downloads script.aculo.us - web 2.0 javascript Netvibes FFFFOUND! Ajaxload - Ajax loading gif generator Web Developer's Handbook | CSS, Web Development, Color Tools, SEO, Usability etc... Mininova : The ultimate BitTorrent source! Color Scheme Generator Engadget Speedtest.net - The Global Broadband Speed Test Boing Boing CNN A List Apart: A List Apart KeepVid: Download and save any video from Youtube ... Hulu - Watch your favorites. Anytime. For free. TechCrunch HowStuffWorks Wolfram|Alpha Mashable Animoto - the end of slideshows jQuery: The Write Less, Do More, JavaScript Library TeacherTube - Teach the World | Teacher Videos | Lesson Plan Videos ... teachertube Stock Photography: Search Royalty Free Images & Photos Royalty-Free Stock Photography at iStockphoto.com The FWA: Favourite Website Awards - Web awards at the cutting edge Picnik: edita fotos fácilmente y en línea en tu explorador css Zen Garden: The Beauty in CSS Design ZOHO Email Hosting, CRM, Project Management, Office Suite, Document Management, ... Email Hosting, CRM, Project Management, Office Suite, Document Management, Remot... Zoho Email Hosting, CRM, Project Management, Database Software, Office Suite, Documen... 
Online Diagram Software - Gliffy Academic Earth - Video lectures from the world's top scholars 31,991 31,991 31,501 31,357 31,283 31,186 30,946 30,460 30,449 30,312 29,945 29,879 29,730 29,555 29,448 29,130 28,991 28,937 28,870 28,668 28,146 27,983 27,883 27,883 27,084 27,084 26,873 26,847 26,719 26,155 26,155 26,155 26,155 25,680 25,481 90 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 Urban Dictionary Homepage | Dictionary.com LibraryThing | Catalog your books online Threadless graphic t-shirt designs PortableApps.com - Portable software for USB drives Open Source Web Design - Download free web design templates. Project Gutenberg Main Page - Gutenberg popurls® | the genuine aggregator for the latest web buzz xkcd - A Webcomic - Blockbuster Mining Gizmodo, the Gadget Guide Download music, movies, games, software! The Pirate Bay - The world's largest Bi... Ning lets you create and join new social networks for your interests and passion... 960 Grid System LogoPond - Identity Inspiration Wikipedia Vimeo, Video Sharing For You MiniAjax.com / Highlighting Rich Experiences on the Web Iconfinder | Icon search made easy 53 CSS-Techniques You Couldn't Live Without | CSS | Smashing Magazine 53 CSS-Techniques You Couldn't Live Without « Smashing Magazine Remember The Milk: Online to do list and task management Jing | Add visuals to your online conversations Wired News Digital Camera Reviews and News: Digital Photography Review: Forums, Glossary, F... Learn to Read at Starfall - teaching comprehension and phonics Bugmenot.com - login with these free web passwords to bypass compulsory registra... Amazon.com: Online Shopping for Electronics, Apparel, Computers, Books, DVDs & m... Khan Academy Facebook | Home 25,231 24,851 24,737 24,345 24,307 24,253 24,016 24,015 24,012 23,930 23,882 23,747 23,642 23,388 23,110 22,981 22,821 22,769 22,725 22,639 22,639 22,464 22,420 21,944 21,779 21,696 21,596 21,431 21,405 21,280 91 Appendix C: Popacular List of Most Tagged Sites – One Month Rank 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 Site Javascript PC Emulator Subtle Patterns | High quality patterns for your next web project SpyBubble Review Innovative Techniques To Simplify Sign-Ups and Log-Ins Smashing Magazine Cool, but obscure unix tools :: KKovacs Microjs: Fantastic Micro-Frameworks and Micro-Libraries for Fun and Profit! delicious/register/bookmarklets Angry Birds The Architecture of Open Source Applications National Jukebox LOC.gov Front End Development Guidelines On TermKit | Steven Wittens - Acko.net SLR Camera Simulator | Simulates a digital SLR camera Affordable Link Building Services Layer Styles Clean Up Your Mess - A Guide to Visual Design for Everybody Home Based Business Dictionary of Algorithms and Data Structures CSS3 Generator - By Eric Hoffman & Peter Funk Stolen Camera Finder - find your photos, find your camera Layer Styles YouTube - Broadcast Yourself. 
lovely ui Data Mining Map LogicalDOC Document Management - Document Management Software, Open Source DMS Number of Unique Taggers 1778 1734 1459 1244 1201 1052 1038 966 958 957 900 872 811 764 733 678 677 621 618 599 584 565 536 517 508 92 Appendix D: Popacular List of Most Tagged Sites – One Week Rank 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 Site SpyBubble Review The Architecture of Open Source Applications Cool, but obscure unix tools :: KKovacs Data Mining Map Link Building Service at Diamond Links | Leyden Energy Develops Durable Laptop batteries Hewlett-Packard Updated Their Mini Note, Notebook Lines AT&T Lifts Android Application Confinements Google Correlate Hype Hivelogic - Top 10 Programming Fonts Samsung Galaxy Tab 10.1 With Android 3.1 Coming in a Few Days Android Security Fix Will Enter Market In Coming Few Days, Says Google Boy or Girl? Gender Reveal Parties Let the Cat Out of the Box The History Of Car Accidents Kung Fu Panda 2 Preview: The Awesomeness is back Number of Unique Taggers 1302 958 628 517 404 389 380 379 371 361 330 294 294 285 253 250 93 Appendix E: Popacular List of Most Tagged Sites – One Day Rank 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 Site Google Correlate Samsung Galaxy Tab 10.1 With Android 3.1 Coming in a Few Days Styling ordered list numbers The History Of Car Accidents The Best & Worst James Bond Themes Of All Time SpyBubble Review Advanced Google Analytics for Startups | Think Vitamin The Architecture of Open Source Applications Good News For Sri Lankan Auto Lovers: Tata Nano To be Sold in Sri Lanka Soon !!!... Hivelogic - Top 10 Programming Fonts Better Image Management With WordPress - Smashing Magazine 10 Types of Videos That YouTube Should Simply Ban Better Light Effect For Design Inspiration Kung Fu Panda 2 Review: WOWsomeness!!! 17 Futuristic Eco-Homes 5 Best Free File Compression Software Sheetalbhabhi.com Preview What is Internet Marketing? Is it for your business?? 7 Unique Jquery Navigation Menus for Everyones Needs The Success Story of Mycroburst Introduction to DNS: Explaining The Dreaded DNS Delay Smashing Magazine Press Brakes-Mechanical press Brake-Hydraulic Press Brake The Only Way to Get Important Things Done - Tony Schwartz Harvard Business Rev... Sustaining Continuous Innovation through Problem Solving ... 5 Cities in the U.S. with Excellent Public WiFi Number of Unique Taggers 314 294 239 233 227 219 171 169 166 150 139 139 129 124 118 117 101 91 91 89 87 87 87 87 86 94 Appendix F: Popacular List of Most Tagged Sites – 8 Hours Rank 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 Site Better Image Management With WordPress - Smashing Magazine The Architecture of Open Source Applications Samsung Galaxy Tab 10.1 With Android 3.1 Coming in a Few Days Kung Fu Panda 2 Review: WOWsomeness!!! Styling ordered list numbers The Success Story of Mycroburst Google Correlate What is Internet Marketing? Is it for your business?? Good News For Sri Lankan Auto Lovers: Tata Nano To be Sold in Sri Lanka Soon !!!... 17 Futuristic Eco-Homes Car Rental deals PhotoSwipe - The web image gallery for your mobile device Best Cloud Based Invoicing Software & Applications Kickoff - Coming soon Romnatic Dating Tips!! Want To Get Paid Faster? Top 5 Cloud-Based Financial Tools To Speed Up Your Rece... SpyBubble Review 70 Free PSD Web UI Elements For Designers | Free and Useful Online Resources for... Mercedes SLK With New Look loads.in - test how fast a webpage loads in a real browser from over 50 location... 
Top 5 Fastest Bikes of 2010 Disaster in Little Community Better Light Effect For Design Inspiration Press Brakes-Mechanical press Brake-Hydraulic Press Brake Sustaining Continuous Innovation through Problem Solving ... Number of Unique Taggers 139 135 130 113 92 79 77 72 60 59 55 55 54 52 52 50 48 47 45 44 44 44 43 42 42 95 Vita Highlights Over 13 years of experience in managing global teams, developing state of the art ITSM and data management solutions. Strong history of driving innovations at scale and making a significant difference. Strong leadership skills and proven ability to recruit the best, build solid teams, and set a clear vision. Data management and integration expert. Strong in data analytics, measurements definitions and representation. Outstanding problem solving and critical thinking skills. Architecture and creator of original ITSM solutions for various processes including; Incident, Change, Asset, Release, Procurement, Knowledge, and Business Continuity Management Processes. Process redesign and improvement expert including maturity road maps planning and implementations. ITIL and ITAM (IT Asset Management) certified. Experience Amazon.com Seattle, WA October 2009 – Present Catalog and Data Quality Ops and Program Mgt Responsible for managing Amazon’s catalog quality global team and build processes to enable fast based innovation and delivery of solutions on all our sites world-wide. Design and implement processes that improves customer experience on our site and especially when interacting with the data in the catalog. Partner with senior management to define the vision for catalog quality efforts at Amazon and sponsor key projects to make this vision a reality. Optimize the current team and use strategic sourcing to respond to high and un-predictable market and customer demands. Implement a set of light-weight project management practices for the catalog quality team to enable them to drive continuous improvement projects successfully. This work is empowering the team to break organizational barriers and innovate beyond the day-to-day responsibilities. Serving as a principal for operations research and process management for the internal Amazon community of businesses and companies. 96 Key contributor to Amazon’s hiring practices and raising the bar program with every new hire. Active mentor and contributor to Amazon’s leadership principles. We focus on identifying candidates and employees of high potential and mentor them into leadership positions. Pepperweed Consulting LLC Sewickley, PA March 2008 – October 2009 Sr. Management Consultant (on September 2009 Pepperweed Consulting became Cognizant Technologies due to acquisition) Responsible for delivering management consulting services to fortune 500 companies in the areas of ITSM, PPM, and ITAM. The following is a list of key clients: Boeing, Western Union, T-Mobile, Catholic Health West Health System, Cook Children Hospital Health Systems, and Great American Financial Resources Insurance. Responsible for managing complex engagements to address key problems by designing and implementing best practice processes. Help clients in transforming IT and making it more transparent to the business. Contributed to the development of the new practice area of IT Governance and Project Portfolio Management processes. Work closely with the leadership teams in large organizations to design and implement marketing programs geared to promote process improvement initiatives. 
Expert in designing process performance metrics programs to increase organizational awareness and present opportunities for improvements. Design and implement new organizational structures required to support PPM and ITSM programs needed for continual success. Dow Jones & Co. Inc. Employment History (10 years) Princeton, NJ (on December 2007 Dow Jones became News Corporation due to acquisition) August 2005 – February 2008 Process & Control Manager Responsible for engineering and launching critical enterprise processes such as Incident Management, Problem Management, Change Management, Configuration Management (CMDB), Knowledge Management, IT Asset Management, Procurement, Vendor Relationship Management, Request Management …etc. Process maturity planning includes the creation of the Capability Maturity 97 Model (CMM) road maps for the various enterprise processes while accounting for dependencies across processes. Overseeing the design and implementation of IT outsourcing processes along with implementation and compliance. Involved in designing SLA with outsourcing vendors. Provide consulting services for process implementations to internal departments including international and domestic divisions. Strategic planning of how to utilize social computing tools to support organizational objectives. Responsible to promoting process standards and increase compliance across all departments and acquired companies. June 2003 – August 2005 Production Control Manager Manage change coordinators team that oversees all enterprise infrastructure changes and related day-to-day activities. Lead the design and implementation of enterprise processes needed to logically and physically secure the production environment and data centers of mission critical applications and services. Serve as an internal consultant to assist other groups in solving chronic problems related to workflow, process or procedural communications. October 2001 – May 2003 Line of Business Supervisor (LOB) for Electronic Publishing and .com products (including Factiva.com) Serve as the Operational services Liaison for all the electronic publishing systems and the .com products in Dow Jones & Co. Provide post mortem disruption reports for all critical production issues. Track and approve new business projects and provide necessary training to operations staff. Address operational and technical exposures in the LOB technologies and provide solutions to mitigate risks. Analyze performance against set SLA levels and follow up with senior management on corrective actions when needed. May 2000 – September 2001 Systems Administration Consultant Achieve significant savings and reduce headcount by leading monitoring tools integrations efforts and automation of manual tasks. Analyzed systems performance and provided recommendations to enhance its efficiency. May 1998 – May 2000 Senior Operations Analyst 98 Led large automation efforts for all manual check of online commerce systems. Tested and evaluated new enterprise management tools software and provided recommendations of possible application and usage. Education 2004 – Present Drexel University Philadelphia, PA Ph.D. 
in Information Science & Technology Current GPA: 3.9/4.0 Research interests include information retrieval (search) on the Web, how organizations use information to solving problems, Web mining, online social network analysis and informal structures, as well as search algorithm and methods for user-generated content and use of tags in search 1999 – 2001 University of Maryland UC College Park, MD Masters of Science in Information Technology Management and Telecommunications GPA: 3.5/4.0 1997 - 1999 Rider University Lawrenceville, NJ Bachelor of Science in Business Administration with concentration in Computer Information Systems GPA: 3.92/4.0 Affiliation Industry Recognition Member of the following organizations: International Association of IT Asset Management (IAITAM), Information Systems Audit and Control Association (ISACA), and International Institute of Business Analysis (IIBA). Honorary President and Founder of the Computer Information Systems Society at Rider University Member, Phi Beta Delta, International Honor Society for International Scholars Member, Beta Gamma Sigma, National Honor Society for the Best Business Scholars Leader of the World Health Organization Committee of the Model United Nations team in 1999 for Rider University Invited Faculty for the HDI Service Management Annual Conference and Expo: Annually I get invited to present new and interesting discoveries related to ITSM processes and practices. Workshops are half day in duration and they attract top industry leaders seeking new and exciting ideas to take their ITSM practices and processes to the next level. List of workshops delivered through this venue so far: October 2010: Power to the People: harnessing the power of crowds 99 to succeed and thrive. For more info click here. November 2009: ITSM Organizational Structures and Leadership Requirements for Success. For more info click here. November 2009: Process Maturity Models and Self Assessment Tools. For more info click here. November 2009: Virtualization Demystified: All you need to know to ensure you survive and thrive. For more info click here. Columnist in the IT Asset Knowledge (ITAK) monthly magazine – Column Title: Process Demystified January 2008 International Association of IT Asset Managers Columnist in the IT Asset Knowledge (ITAK) monthly magazine – Column Title: Process Demystified In this column I take my readers on a journey to demystify IT processes. The focus is the ITAM domain and I try to discuss various aspects of the ITAM processes each month and provide useful tips to enable a successful ITAM program implementation. January 2008 International Association of IT Asset Managers Editorial Board Member of the Best Practice Library for IAITAM Practice I am one of six industry leaders that have been chosen to edit and comment on the IAITAM best practice library which is an industry standards for ITAM program implementation. May 2000 Assurance 2000 Conference Las Vegas, NV Presenter of Dow Jones Best Practices for Operations Management and 24X7 availability Assurance 2000 is an annual technical executive conference hosted by BMC Software. In this event about 3,500 industry leaders gather from all around the world to share knowledge and strategize for the future.