P. Nieuwenhuysen
Libraries in the age of the Internet: NO to obsolescence and YES to synergy
(research paper, presented in the session on Cloud Computing and Libraries)
Published: In The Proceedings of the Fifth Shanghai (Hangzhou) International Library Forum = SILF2010 with the theme “City life and library service” hosted by the Hangzhou library in Hangzhou, China, 24-27 August 2010, Shanghai Scientific and Technological Literature Publishing House, http://www.sstlp.com/ 2010, ISBN 978-7-5439-4415-2, 518 pp., pp. 452-463, http://www.libnet.sh.cn/silf2010/english/index.htm

Libraries in the age of the Internet: NO to obsolescence and YES to synergy

Paul Nieuwenhuysen
University Library
Vrije Universiteit Brussel = VUB
Pleinlaan 2
B-1050 Brussel
Belgium
Tel 32 2 629 2436
Paul.Nieuwenhuysen@vub.ac.be
http://www.vub.ac.be/BIBLIO/nieuwenhuysen/professional/

Abstract

Introduction / background: Libraries are active in a world where the Internet and the WWW offer information services that increase in number, size and efficiency. Therefore it is increasingly important that libraries embrace these expanding services in their work and activities. Realizing this efficiently in practice should be based on investigations and assessments of services and methods.
Problem statements: The purpose of this work was to evaluate the efficiency and effectiveness of embracing the Internet as an additional tool to make information available to potential users. More concretely, we wanted to find out if a purely informative WWW site developed as an information source can be discovered effectively by end users, mainly through WWW search engines, even though many other sites compete for visibility. Further outcomes should be recommendations on how to create an efficient WWW presence and on how to assess the visibility or impact of a WWW site.
Methods: A specific WWW site has been set up and developed in a particular subject domain.
After stabilization, several methods have been applied to analyze the visibility of the created WWW pages.
Results / findings: The developed WWW pages are well visible and used. In particular, they can be discovered efficiently using a WWW image search engine.
Discussion: The various methods applied in the analysis yield a consistent view.
Conclusion / recommendations: The positive results indicate that the view of the Internet as a difficult and highly competitive place to offer information is realistic, but that this should not lead to pessimism and a passive attitude without creative actions. With a simple approach, contributions to the Internet can reach users and be useful. So an optimistic view is justified. Recommendations are offered on how to optimize and how to assess the visibility of your WWW pages.

Introduction / background / context

This conference contribution is a report of ongoing investigations by a scientific, academic librarian and active user of scientific information. To start with, the framework is presented. The broad aim is to find out how to adapt, change and optimize the library services that we offer as an important part of the information landscape that is evolving quite fast in this dynamic Internet age.

Information can be found and accessed increasingly through the Internet and WWW. As a consequence, nowadays librarians should evaluate, select, offer and recommend information discovery services on the WWW to their clients. Therefore we assess, test and evaluate the performance of public access information services that are offered through the Internet and the WWW. Due to the constraints in the time available, we focus on a few information services that seem valuable and important in the framework of libraries, so that they deserve to be evaluated in a quantitative way. The aim is to provide a basis for decision making in the library concerning the implementation of these services for our users.
The background of this work is shaped by information storage and retrieval systems through the Internet that have made spectacular progress, while practical searching for information still confronts us with retrieval systems that are far from perfect.

The various investigations have in common the aims as mentioned, but also a well-defined subject domain (theme, topic) of the information contents, which is the same in all cases:
1. In an evaluation of book search services on the WWW (Nieuwenhuysen, 2008, 2010a): the subject domain of most of the book titles that are searched.
2. In investigations of WWW image searching (Nieuwenhuysen, 2010b): the subject domain of the WWW images and documents that are searched.
3. In an investigation of how effectively a created WWW site can be discovered and used (this report): the subject domain of that WWW site.

The common and narrow subject domain is probably not essential for the feasibility of this kind of investigation, but an advantage is that it allows the investigator to exploit a relatively high level of subject expertise and to increase this level even further during the ongoing investigation of the various information systems. Not the subject itself, but the expertise in some (any) subject is probably a necessity in order to carry out this kind of research in a reasonable, efficient and meaningful way.

Problem statements

1. A broad research area is how relevant classical library collections and services still can be, in the face of the exploding quantity of information that people can discover and access through the Internet and WWW, independent of libraries. This kind of investigation has become a necessity since the birth of the Internet and becomes even more relevant, since classical printed documents from libraries are digitized and made searchable and available on the Internet at an impressive speed. Here we want to report on our quantitative analysis in a very specific subject domain.
Information that is held by libraries or other players in the information world can be made accessible through the Internet. However, can potential users discover this digitized or born-digital information easily and efficiently? If not, then efforts to bring information to the Internet to enrich its contents make less sense. Neither the answer to this simple question nor the optimal way to realize such projects efficiently is straightforward, since billions of documents are already available and searchable, all competing for a high ranking in the prominent Internet search engines.
2. The experience gained in tackling the problem above should yield the following types of recommendations:
a) How to create a WWW site that is well visible?
b) How to analyze the visibility of a WWW site?

Nomenclature

In this text we use mainly the words
online “visibility” as a synonym for “footprint” or “presence”
“to analyze” as a synonym for “to test”, “to assess”, “to evaluate”
“indicator” of website visibility, as a synonym for words used by other authors, such as “index”, “measure”, “metric”, “proxy”

Methods and findings

The WWW site that is analyzed

The test subject domain for all investigations reported here is the domain of classical, ethnic, tribal art objects created in Africa. A specific WWW site has been set up and developed over the years in the chosen subject domain: http://www.vub.ac.be/BIBLIO/nieuwenhuysen/african-art/ This site is used now to test the feasibility of “exporting” information from an information intensive organization to the Internet. Most of the site consists of a bibliography of books/monographs in the chosen subject domain. This bibliography consists of classical WWW pages, each one about the books published in a particular year.
An example is http://www.vub.ac.be/BIBLIO/nieuwenhuysen/african-art/african-art-books-2009.html

Similar, competing WWW sites

Finding WWW sites that are similar or related to a known WWW page or site can be useful to discover additional interesting information or --in the context of this paper-- to discover pages and sites that compete with your own site. Then these competing pages can be inspected, for instance
to detect overlapping content so that duplication of efforts can be avoided,
to get ideas on how to improve your own site,
to compare the visibility of your site with the competing sites, by using methods described below.

To identify similar sites in practice, a WWW search engine like Google web search can be exploited free of charge, using the function that is offered for this purpose. The algorithms used to determine similarity are not described, but Aguillo (2009) writes that “Google associates websites according to their link pattern, assuming two pages are closer if the overlap between their in- and out-links is high. This provides a hypertextual neighborhood…” This method can be taken one step further by applying the system http://www.touchgraph.com/TGGoogleBrowser.html as recommended by Espadas et al. (2008) and by Aguillo (2009). This service first finds similar sites through Google and afterwards adds value by visualizing those sites on the computer display with their similarity relations, and in clusters shown in different colours, which have been created on the basis of the subject of their content; the number of clusters can be decided by the user.

The figure shows the identification of similar WWW sites. Some of these were inspected more closely for comparison with the analyzed site. More concretely, several indicators have been determined as described further in this paper.

Figure: WWW sites similar to the analyzed WWW site http://www.vub.ac.be/BIBLIO/nieuwenhuysen/african-art/.
Note that the WWW site URLs are not shown fully in this static screenshot, but they do show up in the working system, when the cursor is moved over a URL. Furthermore colours are relevant, but of course these cannot be appreciated in a version of the figure that is printed in black and white.

Visibility of the WWW site, as reflected by links received

The number of links from other WWW pages to a particular WWW page is related to the concept of visibility, as explained for instance by Espadas et al. (2008), Aguillo (2009) and De Andrés et al. (2009). Therefore we have also performed a link analysis. The number of links received may seem simple and well defined, but in practice this is not so. Even in the simpler case of the number of citations received by scholarly publications, complications blur quantitative analysis and conclusions. Furthermore, measuring the number of links received is not straightforward.

In practice we have used two query methods:
In normal, simple search mode, queries were submitted to the classical Google WWW text search engine, simply searching for a part of the URL of the WWW site that is unambiguous, within quotation marks. An example is “nieuwenhuysen/african-art”.
In “Advanced Search” | “Page-specific tools” | “Find pages that link to the page”, the same queries were submitted. This corresponds to searching in the command mode with the link operator. An example is link:nieuwenhuysen/african-art.

Google text WWW search was used to search for the part of the URL that occurs in all the pages that form the analyzed WWW site: “nieuwenhuysen/african-art”. In various tests, this gave about 11000 or 12000 hits. In “Advanced Search” | “Page-specific tools” | “Find pages that link to the page”, the same query was submitted. This yielded in various tests about 400 or 1500 hits. This is a lower number, as expected.

A search for one page only uses the query "nieuwenhuysen/african-art/african-art-collection-masks".
This yields about 900 hits, which is lower than when links to any page of the site are searched, as expected. In “Advanced Search” | “Page-specific tools” | “Find pages that link to the page”, the same query yields about 200 hits. This is a lower number, as expected. In “Advanced Search” | “Page-specific tools” | “Find pages that link to the page”, almost the same query nieuwenhuysen/african-art/african-art-collection-masks.htm, but with the file name extension .htm added explicitly at the end, yields about 110 hits. This is a lower number, as expected.

The number of hits found through “Advanced Search” | “Page-specific tools” | “Find pages that link to the page”, or in an equivalent way by using the link: operator, is more relevant than the number found when a simple URL query is submitted: fewer links are found, but these are similar to real, classical citations.

For comparison, some queries have been executed to get an idea of the links to other WWW sites in the same subject domain, as follows.
The query link:www.dapper.com.fr yields about 250 hits. This is the home page of one of the top museums dedicated to tribal art, but it receives fewer links than our analyzed WWW site; this is a pleasant surprise.
The query link:www.hamillgallery.com yields about 100 hits. This is the home page of a big informative and commercial site dedicated to African art, which shows many photos of objects, but it receives fewer links than our analyzed WWW site; this is again a pleasant surprise.

Visibility of the WWW site as reflected by Google PageRank

The Google PageRank of a WWW page is continuously calculated by Google, the prominent company that is specialized in searching, as an indicator of the importance and impact of this WWW page. The PageRank data are then used by Google to improve the ranking of search results. PageRank values range from 0 to 10. See for instance “PageRank” in Wikipedia. The PageRank value of a WWW page is mainly based on the links received from external WWW pages.
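To illustrate how a link-based score of this kind depends on links received, the following is a minimal sketch of the textbook PageRank power iteration on a toy graph. It is not Google's production computation, and it does not reproduce the coarse 0 to 10 scale reported by public checking tools, whose mapping is not documented here. The page names, the damping factor of 0.85 (the value commonly quoted from the original PageRank description) and the iteration count are assumptions made for illustration only.

```python
from typing import Dict, List


def pagerank(links: Dict[str, List[str]],
             damping: float = 0.85,
             iterations: int = 50) -> Dict[str, float]:
    """Textbook PageRank by power iteration on a small link graph.

    `links` maps each page to the pages it links to. The returned scores
    are probabilities that sum to 1; they are not the 0-10 toolbar values.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page receives a small baseline share (the "random jump").
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its score on, divided equally over its out-links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page without out-links spreads its score over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical toy graph: two pages link to "masks", which links back to "home".
toy_links = {"home": ["masks"], "blog": ["masks"], "masks": ["home"]}
scores = pagerank(toy_links)
for page in sorted(scores, key=scores.get, reverse=True):
    print(page, round(scores[page], 3))
```

In this toy graph the page that receives two links ends up with the highest score, and the page that receives none with the lowest, which is the qualitative behaviour that the link analysis above relies on.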
So determining the number of links received (cfr. above) is related to inspecting the PageRank value. We have determined the PageRank value of pages of our WWW site, as well as of other WWW pages for comparison. To do this in practice we used PageRank Checker: http://www.prchecker.info/check_page_rank.php This is “a free tool to check Google page ranking of any web site pages”. Results are shown in table format.

Table: Google PageRank values (between 0 and 10; higher is better).
http://www.vub.ac.be/BIBLIO/
The home page of the VUB university library site gets a higher value than the underlying and much more specific sub-site that is analyzed here. This is expected as the whole library site and the starting page in particular receive many links from all over the WWW.
http://www.vub.ac.be/BIBLIO/nieuwenhuysen/african-art/
The home page of the sub-site analyzed here gets a value that lies among values for various underlying pages.
http://www.vub.ac.be/BIBLIO/nieuwenhuysen/african-art/african-art-collection-masks.htm/
The value for the page on masks is relatively high and this subject is indeed popular.
http://www.vub.ac.be/BIBLIO/nieuwenhuysen/african-art/african-art-collection-statues.htm
8
http://www.vub.ac.be/BIBLIO/nieuwenhuysen/african-art/african-art-collection-textiles.htm
The value for the page on textiles is relatively low and indeed this is a smaller page on a subject that is not as popular as masks for instance.
http://users.telenet.be/african-shop/
The figure above on similar sites reveals this site as a near neighbor and thus as very similar and worthy of a close look. This is the home page of a dealer in old African art.
http://www.brucefrankprimitiveart.com/
The figure above on similar sites shows this site as a near neighbor and thus as very similar and worthy of a close look. It is the site of a dealer in old African art. Pages show photos of many objects. Exceptionally its PageRank value could not be given by the system that was applied.
http://www.hamillgallery.com/
The figure above on similar sites shows this site. Thus it is worthy of a close look. Furthermore the site is included in the WWW subject directory of the Open Directory Project. It is the home page of the large Hamill Gallery in Boston, USA, which sells African art objects. The site is well structured, easy to use, large and informative, with many photos of objects. This makes it attractive for many people interested in African art.
http://www.hamillgallery.com/MOSSI/MossiDolls/MossiDolls.html
This is an example of a page in the site of Hamill Gallery. The value of the PageRank is lower than the value of the home page. This is understandable, because links made on external WWW sites normally point to a home page and not to a more specific, underlying page.
http://www.hamillgallery.com/SITE/MasksandHeads.html
The PageRank value for the entry page to information about masks is relatively high. This corresponds well with the case of our analyzed site, as described above.
http://www.tribal-art-auktion.de/en/home/
This is the home page of the famous auction house that is specialized in tribal art including of course ethnic, African art. It is located in Germany and offers a very informative WWW site with illustrated catalogs of previous and coming auctions and with results of past auctions.
http://africa.si.edu/collections/index.htm
This is the home page of the top level, world famous National Museum of African Art, in Washington, USA. As expected, the PageRank value is relatively high.
http://anthro.amnh.org/anthropology/databases/africa_public/africa_public.htm
This is a gateway page that gives access to the searchable database of African art and artifacts in the great collection of the famous American Museum of Natural History in New York, USA.
3 4 5 4 2 / 5 3 4 4 6 4
http://www.dapper.com.fr/
This is the home page of the small top level museum on tribal, ethnic art in the center of Paris, France, which regularly organizes top level exhibitions and which publishes well documented books with contributions by the greatest experts, as catalogs to these exhibitions.
5

Comparing the values for the PageRank of the selected pages related to African art shows that those values fall in the range 2 to 6 out of 10. The PageRank values of the selected pages in the analyzed WWW site are relatively high. This is satisfactory, perhaps even surprising, considering the fame of the other organizations. This success corresponds well with the satisfactory results on retrieval through image searching, which are presented below.

Visibility of the WWW site, as reflected by pages indexed for WWW searching

Formulated in a negative way, WWW pages that have not been indexed by a search engine do not appear as a result of any query through that search engine. Formulated in a more positive way, the inclusion of your WWW pages in the database index of WWW search engines forms a basis for a high visibility in searches (see for instance Espadas et al. 2008). Another fact is that Google Web Search has been the leading, most popular, general WWW text search engine for a few years already. The competing products offered by Yahoo! and Microsoft are less stable: Microsoft has changed its search engine several times in recent years and both companies have announced a cooperation in the form of a switch to one common WWW search engine database in 2010. Therefore we have checked if the pages of the analyzed WWW site have been indexed by this Google search engine. In practice we have submitted the query
site:www.vub.ac.be/BIBLIO/nieuwenhuysen/african-art/
The search result shows the number of pages found and ideally this is close to the number of pages in the site.
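The three query forms used in this paper (a quoted URL fragment, the link: operator and the site: operator) can be written down as small helper functions. This is only a sketch of how the query strings are composed; it does not contact any search engine. The function names and the page count and hit count in the usage example are hypothetical, and hit counts reported by a search engine are estimates in any case.

```python
def quoted_url_query(url_fragment: str) -> str:
    """Plain text search for an unambiguous part of a URL, within quotation marks."""
    return f'"{url_fragment}"'


def link_query(url_fragment: str) -> str:
    """Command-mode query for pages that link to the given URL."""
    return f"link:{url_fragment}"


def site_query(site_prefix: str) -> str:
    """Command-mode query that lists indexed pages under the given site prefix."""
    return f"site:{site_prefix}"


def indexed_fraction(reported_hits: int, pages_in_site: int) -> float:
    """Rough share of the site's pages that appear to be indexed.

    Capped at 1.0 because a search engine may report more hits than the
    webmaster counts as pages.
    """
    if pages_in_site <= 0:
        raise ValueError("pages_in_site must be positive")
    return min(1.0, reported_hits / pages_in_site)


print(quoted_url_query("nieuwenhuysen/african-art"))
print(link_query("nieuwenhuysen/african-art"))
print(site_query("www.vub.ac.be/BIBLIO/nieuwenhuysen/african-art/"))
# Hypothetical numbers, for illustration only: 45 hits reported, 50 pages in the site.
print(indexed_fraction(45, 50))
```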
We found that all or almost all the relevant pages of the analyzed WWW site have been indexed by the classical WWW text search engine offered by Google. This is satisfactory, because search engines do not index the whole WWW completely. The inclusion in the database index forms a basis for a high visibility in searches.

In an analogous way, we have tried to assess the WWW image search engine of Google. However, this turned out not to function in a straightforward way. The result showed a number much lower than the thousands of images present in the WWW site. This was disappointing at first. However:
1. As a further test we submitted the same search query, but we included an extra word that occurs in the WWW site, in what seems like an AND relation; in this way we expected an even lower number of hits.
2. Surprisingly the number of hits did not decrease, but increased.
3. This was tested with various other additional query words and the same pattern emerged.

All this indicates the following:
With images.google.com, a simple test with one simple query does not work as well as in the normal, classical Google web search. Here some clarification would be welcome.
The initial disappointment could be replaced by a more optimistic view, because many more images of the analyzed WWW site have been indexed by images.google.com than the first simple test wrongly suggested.

Visibility of the WWW site, as reflected by image search results

A very direct aspect of visibility of a WWW site is the appearance of a page from the site in a WWW search engine results list, in the case of a query with keywords in the subject domain of the site. This is explained in more detail by Espadas et al. (2008). Visual information is important in the created WWW site that is analyzed here, so that it is suitable to perform this analysis by searching for images on the Internet. Several image search engines are available on the WWW.
Google’s system has been applied in this analysis, for several reasons, as explained in Nieuwenhuysen (2010b); in summary, its coverage is good, it performs relatively well and it is very popular.

Image searches have been carried out that were not targeted only at this site, but in the way that most users search for information, not aware of the existence of our particular analyzed site. Most searches were carried out with a query that consists of 1, 2 or 3 words. Using only 1 word is not the best approach, because 1 word is not sufficient to express a real information need, so that the relevance of retrieved documents, in other words precision, will be low; furthermore ambiguity of meaning always hinders information retrieval and is certainly important in the case of queries with only 1 word, as a context is lacking. On the other hand, using more than 3 words in a query can narrow down retrieved documents to just those that contain those words, such as the pages in the WWW site analyzed here. Using only a few words simulates common usage of search engines well, because most users formulate short queries, as shown by research that is reviewed for instance by Lewandowski and Höchstötter (2008) and Machil et al. (2008). Afterwards, the result set for each query was inspected for occurrences of hits that point to the analyzed WWW site.

The famous and popular Google Web Search can apply personalization in the sense that Google can store earlier queries and other user behavior on a Google server computer or as a cookie on the client computer; then Google can take this “older” information into account in presenting “newer” retrieval results to the user. So a test of retrieval like the one carried out here can be influenced or hindered by this mechanism. This personalization can be excluded by the user as explained by Google on their WWW site. The Google WWW site shows no indication that Google Image search also works with personalization.
Anyway:
Signing in to Google with a user id was avoided.
Some queries were repeated on a separate, independent client computer, working with a different IP network address and a different user id, to check if similar results were obtained. This was indeed the case.

The images and corresponding pages that were retrieved from the analyzed WWW site ranked remarkably high. Is this perhaps still due to some kind of weak personalization? This could be based for instance on the IP address used, as this corresponds in many cases to a particular country. A problem here is that it is not made completely clear to users how exactly the popular search engines function.

Google Image Search gives results in the form of small images (named thumbnails), each one annotated with the corresponding URL of the WWW page that contains the original image. In this analysis, we noted the rank of the first thumbnail image in such a result set that originates from the analyzed WWW site. In this way, a low number reflects successful retrieval. The findings are shown in the Table.
Query
Basalampasu mask Salampasu masks Basalampasu masks Bamana iron mask Bamana iron masks Bambara iron mask Bambara iron masks Bambara iron Bambara ntomo mask Bambara ntomo masks Bambara ntomo Mask Kanaga Suku mask Namji doll doll Namji doll dowayo singiti Hemba Mossi Boulsa mask Boulsa mask mask Boulsa masque Boulsa mask Toussian Toussian mask Toussian masque Bamana iron mask Suku Dowayo doll Hemba singiti masque Toussian masque Toussian 2 books African art Ngere mask
Google Image search rank
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 3
Suku hemba sculpture Hemba Wee mask Bamana ntomo masks Kanaga dogon Hemba sculpture African art books Bamana ntomo mask Namchi doll doll Namchi Kanaga mask Wee mask Ngere Dogon kanaga Kanaga mask Kanaga masque Guere mask Bamana ntomo mask Guere African masks African masks 2 Fante doll Fante doll 2 Baule blolo Baoule blolo Salampasu mask Salampasu mask 2 Salampasu mask 3 Salampasu doll Mossi Dogon door lock doll Fanti doll AND Fanti biiga Mossi puppet Mossi Fanti doll Fanti doll 2 biiga Basalampasu
4 4 4 5 5 5 5 7 8 8 9 9 9 10 10 10 11 14 23 28 30 1 of 600 1 of 18000 1 of 18000 1 of 280 1 of 280 1 of 400 1 of 400 1 of 400 1 of 700 2 of 10000 2 of 16000 2 of 700 2 of 700 3 of 200 3 of 7000 4 of 700 4 of 700 7 of 3500

To check reproducibility, some queries were submitted not just once, but twice, after a few minutes or after a few weeks. These are indicated in the Table with “2”. The results were satisfactory.

We observed that the sequence of words in the query can have a significant influence on the search results (see examples in the Table). Blakeman (2010) noticed this also. We find that the ranking within results can be different, as well as the number of results. Most users are not aware of this and do not notice it in their applications. The phenomenon can complicate quantitative investigations and it can be exploited in practice to obtain alternative results with the same query words.
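The indicator used in the Table, namely the rank of the first hit that originates from the analyzed site, together with the word-order variants of a query, can be sketched as follows. This is an illustration under stated assumptions and not the procedure the author automated: the searches reported here were inspected by hand. The function names and the result list in the usage example are hypothetical.

```python
from itertools import permutations
from typing import List, Optional, Sequence

# URL prefix of the analyzed site, as given in this paper.
SITE_PREFIX = "http://www.vub.ac.be/BIBLIO/nieuwenhuysen/african-art/"


def first_rank(result_page_urls: Sequence[str],
               site_prefix: str = SITE_PREFIX) -> Optional[int]:
    """Return the 1-based rank of the first result hosted under `site_prefix`.

    `result_page_urls` is the ordered list of page URLs attached to the
    thumbnails of one result set. Returns None if the site does not occur.
    """
    for rank, url in enumerate(result_page_urls, start=1):
        if url.startswith(site_prefix):
            return rank
    return None


def word_order_variants(query: str) -> List[str]:
    """All distinct orderings of the words in a short query."""
    words = query.split()
    variants = (" ".join(order) for order in permutations(words))
    return list(dict.fromkeys(variants))  # removes duplicates, keeps order


# Hypothetical result list, for illustration only.
results = [
    "http://example.org/gallery/mask-1.html",
    SITE_PREFIX + "african-art-collection-masks.htm",
    "http://example.com/shop/masks.html",
]
print(first_rank(results))
print(word_order_variants("Boulsa mask"))
```

Submitting each variant returned by `word_order_variants` and recording `first_rank` for each would reproduce the kind of word-order comparison shown in the Table.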
The data of our analysis in the Table show that the WWW image search engine presents a page from the test site among the first 20 retrieved pages that are shown on the first page of results, for almost all queries. The significance of these observations depends of course on the total number of hits given by the search engine. For instance, if the total number of hits is lower than 20, then it would be expected and meaningless to see that the WWW page that contains the query words features among those pages. Therefore it must be mentioned here that the total number of hits was generally much higher than the rank number given in the Table. For instance:
The query ‘Salampasu mask’ gives a hit from the analyzed WWW site, which is ranked number 1, while more than 400 hits are reported by Google image search.
The query ‘Dogon door lock’ gives a hit from the analyzed WWW site, which is ranked number 2, while more than 16 000 hits are reported by Google image search.

In conclusion, this has demonstrated that the contents of our specific analyzed WWW site can be discovered quite well through image searching.

Visibility of the WWW site through subject directories

When a subject directory on the WWW has selected a WWW page or site and has included a descriptive entry about it, then this indicates an appreciation by the creators of the directory. Furthermore this increases the web presence and visibility of the included site in several ways:
Users can discover and access the site through the directory.
Search engines based on crawlers can easily discover the included site and will probably harvest it and include its contents in their search engine database.
The existence of the link in the directory probably increases the Google PageRank or analogous values used in other search engines.

Therefore it was checked if the analyzed WWW site has been included in some famous online subject directories.
This can be considered as an important part of the more general link analysis presented above.
The classical Yahoo! general subject directory does not include the site. Probably the site must be manually suggested online to Yahoo! for consideration and human editorial review, to have any chance of being included (cfr. ‘Search engine optimization’ in Wikipedia).
The general subject directory Open Directory Project (ODP) http://www.dmoz.org/ includes a link to the WWW site http://www.vub.ac.be/BIBLIO/nieuwenhuysen/ that includes the analyzed WWW subsite.
The general Google Directory http://www.google.com/dirhp is based on data copied from the Open Directory Project, but shows these in a different sequence that is based on the Google PageRank of the included WWW sites. Indeed, this directory shows the same link as ODP. Furthermore, in the section of the directory where the analyzed site is included, the site has ranked among the top 3 for a few years already.
The general academic, scholarly subject directory of the University of California, named ‘Infomine’, includes an entry about one of the WWW pages that are included in the analyzed WWW site: http://www.vub.ac.be/BIBLIO/nieuwenhuysen/african-art/african-art-links.html
The general academic, scholarly subject directory created in the United Kingdom, named ‘Intute’, includes the entry http://www.intute.ac.uk/cgi-bin/fullrecord.pl?handle=artifact7726 about a particular page of the WWW site, namely the page devoted to African masks http://www.vub.ac.be/BIBLIO/nieuwenhuysen/african-art/african-art-collection-masks.htm. The fact that this page was selected corresponds well with the high value for PageRank that is observed for this page, as reported elsewhere in this paper.
The subject directory of a university library in the USA includes also a link to the analyzed WWW site: http://library.agnesscott.edu/help/subjects/african_art.htm

Besides the links from directories, a link is also present in the popular free encyclopedia Wikipedia: http://it.wikipedia.org/wiki/Maschere_tradizionali_africane

In conclusion, the analyzed WWW site has been discovered, appreciated and included in several significant academic subject directories. Therefore the visibility through subject directories on the WWW is pleasing.

Visibility of the WWW site, as reflected by usage

Many statistical data related to usage of the WWW site can be collected and inspected in various ways, as explained for instance by Espadas et al. (2008) and Aguillo (2009). To realize this in practice, several WWW based statistical analysis systems have been incorporated in most pages of the analyzed WWW site and have been used over the past years, all free of charge. Of those systems, one that has become available recently is Google Analytics http://www.google.com/analytics/. This system has been evaluated as relatively powerful and easy to use by many colleagues (see for instance Aguillo, 2009) and also by myself. So this system was applied to collect the data related to usage, which are reported here.

During the last months of 2009 and the first months of 2010, usage was fairly constant. On average, during a week 2000 to 3000 visitors were counted, who looked at 1-2 pages. Almost half of the visits came from the USA and only 3 % from Africa; this reflects of course not only interest in African art, but also the penetration of Internet technology into the population. One WWW page received about half of all visits and these lasted on average 3 minutes: http://www.vub.ac.be/BIBLIO/nieuwenhuysen/african-art/african-art-collection-masks.htm.
The usage analysis has revealed also that the sources of usage are mainly WWW search engines (about 90 %) and that most of the search queries that generate a visit include the words “African masks”. Usage of the pages that offer bibliographical information about books published on African art is lower, as expected, but many of these pages do receive about 10 visits per week. A comparison with usage of other sites is not straightforward, as data can be collected from Google Analytics only by the webmaster of a site.

From these statistical usage data and from the more individual, personal contacts with users of the informative analyzed WWW site, we conclude that the site is successful enough to justify its maintenance and further development.

Discussion

The various aspects of the analysis yield various indicators and these shape a coherent, consistent picture. For instance, one particular page of the site receives most attention and usage, http://www.vub.ac.be/BIBLIO/nieuwenhuysen/african-art/african-art-collection-masks.htm and this agrees with various considerations and observations that are reported above:
This page offers relatively rich information content in the form of text as well as images. The information provided is not only interesting for amateurs of African art, but also for children and others who want to make a mask for instance.
The number of hyperlinks received from other pages on the WWW is relatively high.
The Google PageRank value of this page is relatively high.
This page and not the whole site or another page has been selected and included by the subject directory ‘Intute’.
This page received about half of all visits, as shown by Google Analytics. Also the sources of usage are mainly WWW search engines and most of the search queries that generate a visit include the words “African masks”.

Conclusions and recommendations

1. The results show the following.
The view of the Internet as a difficult and highly competitive place to offer information is justified, but this should not lead to pessimism and a passive attitude without actions. Even with a simple and cheap approach, contributions to the Internet can reach users and be useful, even for users who apply common, simple search methods. So an optimistic view is justified for all information providers and for libraries in particular. The fact that the WWW site built by the author is well visible and used indicates that this aim can also be reached by more professional and well-funded digital libraries that hire specialized personnel to develop their site. Ideally, good digital libraries should come to the WWW and should take a more prominent rank in the lists of search results. Authors, publishers and librarians who add information to the WWW that includes a significant part of visual information should ideally do this in such a way that image searching can be used as one of the possible methods to unlock the information sources.

2.a. Corresponding to this conclusion, it is recommended that other webmasters build and improve their sites in similar ways, i.e. according to most of the guidelines that are published in the form of articles, books and documents on the WWW, most of which have the words “Search Engine Optimization” or the abbreviation “SEO” in their title. Examples are:
the books by Thurow (2003) and Lieb (2009);
the brief, concentrated, clear Google’s Search Engine Optimization Starter Guide (2008) that is available online free of charge;
the brief list given by Espadas et al. (2008).
The focus of such guides is primarily on search engines, but most guidelines are also applicable and useful to increase the quality of the real user experience. The following is a selection of guidelines that are relatively easy to apply and probably efficient:
Offer unique content or services.
Consider creating pages in English or another important language that is used by your target audience, instead of the language used in the country where the WWW site is created and hosted.
Host the site on a server in a well-respected organization, a server that is harvested regularly by search engines and that functions fast and quasi-continuously.
Ideally, create an HTML title for each page that is unique, clear, brief, descriptive, significant, and accurate.
See that each page corresponds with a clear, brief, descriptive and user-friendly URL.
Place your web pages in a simple, hierarchical folder/directory structure that is easy for users to navigate.
Use mostly text for navigation (and not drop-down menus, images or animations).
Avoid deep nesting of subdirectories.
See that a user can navigate successfully by removing a part from the URL, in the hope of finding more general content.
Each anchor text (the text that you use to give users an idea about the target of a link) should be brief but clear, descriptive and significant, and not generic and meaningless.
Format links in such a way that they are easy to spot. Users should be able to distinguish between regular text and anchor text.
Use HTML heading tags appropriately.
Use a WWW site format that does not hinder visibility. The analyzed WWW site consists of static web pages linked together. So the site belongs to the well visible web and not to the obscure, invisible web that is not covered and harvested well by the popular, classical WWW search engines (Sherman and Price, 2001). Storing information as a digital library in a database management system, often named content management system, may not be the optimal method in this context. In that case, visibility will depend on the particular management software used.
Offer significant content on your pages. In the analyzed WWW site, most pages are much longer than can be displayed on a screen and include a lot of information.
In other words, the information that is offered is not scattered over numerous smaller pages, which would probably lead to a lower PageRank for each individual page.

Some web development guidelines are specific to images:
Offer each image as a separate file, and not hidden in a larger container file in a format such as the popular Adobe PDF or Microsoft Word DOC or DOCX or Microsoft PowerPoint PPT or PPS or PPTX.
Use a file format for each image file that can be interpreted by common, popular Internet browsing software.
Use a brief but meaningful, descriptive file name for each image file.

2.b. The visibility of the resulting WWW pages can be analyzed in various ways, as described for instance in Espadas et al. (2008), Aguillo (2009) and in this paper. Applying all methods together yields a useful view on the performance in terms of visibility and presence of your WWW site. If such a view is also needed for your web site, then it is recommended to perform a similar analysis.

References

Aguillo, Isidro (2009) Measuring the institution’s footprint on the web. Library Hi Tech, Vol. 27, No. 4, pp. 540-556. DOI 10.1108/073788309.
Blakeman, Karen (2010) On the net: All change on the search front. Online Magazine, March/April 2010, pp. 44-47.
De Andrés, Javier, Pedro Lorca, and Ana B. Martínez (2009) Economic and financial factors for the adoption and visibility effects of Web accessibility: The case of European banks. Journal of the American Society for Information Science and Technology, Vol. 60, No. 9, pp. 1769-1780. DOI 10.1002/asi.21103.
Espadas, Javier, Coral Calero, and Mario Piattini (2008) Web site visibility evaluation. Journal of the American Society for Information Science and Technology, Vol. 59, No. 11, September 2008, pp. 1727-1742. DOI 10.1002/asi.20865.
Google’s Search Engine Optimization Starter Guide (2008) [online] Available free of charge from: http://google.com/ in the form of one PDF file.
Lewandowski, D., and N. Höchstötter (2008) Web Searching: A Quality Measurement Perspective. In Web Search (edited by A. Spink and M. Zimmer). Berlin Heidelberg : Springer, ISBN 978-3-540-75828-0 (Print) 978-3-540-75829-7 (Online), DOI 10.1007/978-3-540-75829-7_16, pp. 309-340.
Lieb, Rebecca (2009) The Truth about Search Engine Optimization. Que Publishing, ISBN 0789738317, 9780789738318, 208 pp.
Machill, Marcel, Markus Beiler and Martin Zenker (2008) Search-engine research: a European-American overview and systemization of an interdisciplinary and international research field. Media, Culture & Society, Vol. 30, No. 5, pp. 591-608.
Nieuwenhuysen, Paul (2008) Internet federated search engines for bookseller databases: a comparative evaluation. In Intelligence, Innovation and Library Services, Proceedings of the Fourth Shanghai International Library Forum = SILF2008 = Shanghai Library, Shanghai, China, October 20-22, 2008. Shanghai Scientific and Technological Literature Publishing House, 2008. 371 pp. ISBN 978-7-5439-3671-3, pp. 340-348.
Nieuwenhuysen, Paul (2010a) Printed books and the WWW. In the proceedings of the annual BOBCATSSS international conference on library and information science, in 2010 hosted by the Università degli Studi di Parma, Italia/Italy, 25-27 January 2010. Available online free of charge from: http://dspace-unipr.cilea.it/handle/1889/1273 or http://hdl.handle.net/1889/1273 PDF file, 9 pp.
Nieuwenhuysen, Paul (2010b) Information retrieval via WWW image searching: a reality check. In the proceedings of the 2010 International Conference on Information Retrieval and Knowledge Management, CAMP’10 “Exploring the Invisible World”, at the Shah Alam Convention Centre, in Malaysia, 16-18 March 2010, hosted by the Universiti Teknologi MARA and supported by the IEEE Computer Society, edited by Zainah Abu Bakar et al., 2010, pp. 73-78.
PageRank. [online] Available free of charge from: http://en.wikipedia.org/wiki/PageRank
Search engine optimization. [online] Available free of charge from: http://en.wikipedia.org/
Sherman, Chris, and Price, Gary (2001) The invisible web: uncovering information sources search engines can’t see. Medford : Information Today, Cyberage Books, 2001, 439 pp.
Thurow, Shari (2003) Search engine visibility. Indianapolis : New Riders, 2003, 297 pp.

Author / presenter:

Since 1983, Paul Nieuwenhuysen has been a full-time member of the academic staff at the Vrije Universiteit Brussel, nowadays as professor. These days his functions include: member of the management board of the University Library, science and technology librarian, teacher of courses on online information retrieval and presentation. In the inter-university postgraduate program in Information and library science at the University of Antwerp, he was guest professor responsible for courses on information technology and on the information market until 2009. At the University of Antwerp he received the degrees of Licentiaat in Physics in 1974, Doctor in Science in 1979, the Belgian post-doctoral degree in 1983, and the inter-university postgraduate degree in “Documentation and library science” in 1986. He has been project leader of a 10-year co-operation with the National Agricultural Library of Tanzania and he organizes international training courses on management of information in science and technology. He is single author or co-author of more than 40 refereed publications in international scientific/technical journals and conference proceedings. In the area of information science, he has been a consultant for various international agencies, and he is a member of several societies, of the program committee of several international conferences, and of the editorial board of several academic and professional journals.