Hackfest 3: THIS TIME IT'S PERSONAL
(or) Can I Get A Metasearch?
A Cast of Dozens And the O'Reilly Lithographic Spirit Gods
Access 2004 – Halifax – 13 October 2004

First, A Poem
"Guinness, Murphy’s, Harp Hops is bitter fruit. All Good from Dublin Core"
-A Librarian (Unknown)
Found in a bar in Windsor, ONT, in 2002

A Definition
• Hack \Hack\. noun:
  – “A quick job that produces what is needed, but not well.” (Jargon File 4.3.0)
  – “One who works hard at boring tasks [syn: drudge, hacker.]” (WordNet (r) 1.7)
• Hack \Hack\. verb:
  – “To use frequently and indiscriminately, so as to render commonplace.” (Webster’s, 1913)
• Fest \Fest\, Feste \Fes”te\, noun:
  – “A feast. [Obs.] –Chaucer.” (Webster’s 1913)

Objectives
• Solving problems or developing new ideas
• Sharing a temporary, non-competitive, non-work, no-pressure, no-strings-attached, collaborative, social, and educational environment
• Learning about contemporary issues in libraries and tech
• Learning from each other
• Having fun

The Event
• 40+ signups, 35+ attendees, many newcomers
• Held at St. Mary's University
• Sixteen project ideas suggested in advance (but kept private until Hackfest)
• One big lab, one big meeting room
• Two dedicated remote hackfest servers
• 30+ minutes of discussing projects
• Group up and go!

The People

The Suggestions
• Sixteen (16!) project suggestions
• Several metasearch ideas (framework, training)
• Several personal library ideas (RSS from repositories, tables of contents, integrating with external systems)
• Others: cobrowsing, harvesting, conversions, "bitter date" normalization

Project: Tables of Contents
• Who: Richard Baer, John Dobson, Grant GelinasBrown, Todd Holbrook, Sherri Vokey, William Wueppelmann
• What: To discover a method for harvesting e-journal table of contents information from freely-accessible publisher web sites (without having to enter into negotiations with the publishers).
Library users would be able to save a list of favorite journals in a web-based personal account and receive notification of updates. The ability to link to the full text (with an active subscription) would be an additional feature, as would searching within the personal database.

Project: Tables of Contents
• Screen scraping: recovery of a document's underlying data structure by parsing its source code
  – inference of boundaries between records and fields through examination of patterns in the tag structure
  – inference of what data elements are represented through examination of table headings, field labels, and other clues contained within tags, such as name or class attributes

Project: Personal Library
• Who: Nancy Hoebelheinrich, Tracy Seneca, Brandon Uhlman, Lisa Yeo
• What: To work out ways and means by which our library systems can talk to personal library systems from Apple, Google, etc.

Project: Personal Library
• What is a personal library?
  – More than a simple list of the bibliographic (or sales) info about items I own or have read.
  – I should have access to the full text, and to related full text wherever feasible.
    • Reviews of the work
    • Materials by same author (or auto-link to federated search)
    • Recommended (related) reading (see project 8)

Project: Personal Library
• Not limited to items documented in databases, but can include scanned items and my own personal papers.
• I should be able to navigate by methods meaningful to me, not just by info about the item (personal timeline, categories I create).
• A personal library should grow richer over the years, not just because I add items, but by learning how I use its contents. Not just the data about the item, but also data about how I used it.
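One way to picture "data about how I used it" is a record that carries a usage log next to the descriptive data. This is only a minimal sketch, not anything built at Hackfest: the `Item` class, its fields, and the `rank_by_use` ordering are all hypothetical names and choices made for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Item:
    """One personal-library entry: descriptive data plus a usage log (hypothetical model)."""
    title: str
    categories: set[str] = field(default_factory=set)    # categories the owner creates
    uses: list[datetime] = field(default_factory=list)   # each time the owner used the item

    def record_use(self, when: datetime) -> None:
        """Append one usage event to the item's log."""
        self.uses.append(when)


def rank_by_use(items: list[Item]) -> list[Item]:
    """Order items by number of uses, breaking ties by most recent use."""
    def key(item: Item) -> tuple[int, datetime]:
        last = max(item.uses) if item.uses else datetime.min
        return (len(item.uses), last)

    return sorted(items, key=key, reverse=True)
```

Counting uses is the simplest possible signal; a real system might weight recency or the kind of interaction differently.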
Project: Personal Library
• Our bookshelves
  – The books
  – Our ephemera: folders, notepads, papers, photocopied articles
• Bibliographic Management (ProCite, RefWorks, Citation Manager, Online Portfolios)
• Browser bookmarks, our own web pages/sites, blogs
• Accounts: Amazon, Netflix, AllMusic, iTunes
• Hard drive: downloaded articles, directories for classes, projects

Project: Personal Library
• LibDB: Emphasis on different interfaces for different user types
• Citation Manager (Simon Fraser University): Integration with research sources; data entry not a separate task
• Delicious Library: Visual interface; easy to enter, gather related information with physical item in hand
• Library Lookup: Gather information from your library catalog while browsing Amazon
• Project 8: Blog book recommendation activity
• Stuff I've Seen (Susan Dumais, Microsoft): Automatically index items you’re interested in; emphasis and ranking based on how you interact with the item, not just the item itself
• Federated Searching: Enhance the information you have by pulling in related material. Your personal library should grow on its own.

Project: Personal Library
• Interesting issues:
  – Copyright / Authorization
  – This can’t “belong to” an institution: it shouldn’t go away when you finish school, etc.
  – How does your access to related materials change as you move from place to place?

Project: Metasearch Considerations
• Who: Julie Arie, Roy Tennant, Kent Weaver
• What: Document to "highlight issues to consider when reviewing metasearch software applications."
• Definition: "an application that performs simultaneous searching of two or more different types of resources and effectively presents results, with appropriate machine-level communication between related applications."
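The "simultaneous searching" part of that definition can be sketched in a few lines. This is an illustration under stated assumptions, not the group's design: `metasearch` is a hypothetical function, each target is assumed to be a plain callable that takes the query and returns a list of titles, and duplicates are dropped by a naive case-insensitive title match.

```python
from concurrent.futures import ThreadPoolExecutor


def metasearch(query, targets):
    """Send one query to every target at once and merge the results.

    `targets` maps a label to a callable returning a list of titles;
    both are stand-ins for real connectors (Z39.50, HTTP, etc.).
    Returns (merged, failed): merged is a list of (label, title) pairs,
    failed is the list of labels whose search raised an error.
    """
    merged, seen, failed = [], set(), []
    with ThreadPoolExecutor(max_workers=max(1, len(targets))) as pool:
        # Submit every search before collecting, so the targets run concurrently.
        futures = {label: pool.submit(search, query) for label, search in targets.items()}
        for label, future in futures.items():
            try:
                titles = future.result()
            except Exception:
                failed.append(label)  # one dead target should not sink the whole search
                continue
            for title in titles:
                key = title.casefold()  # naive dedup: same title, ignoring case
                if key not in seen:
                    seen.add(key)
                    merged.append((label, title))
    return merged, failed
```

Results are merged in the order the targets were listed, so the first target to report a title "wins" the duplicate; real merging and ranking criteria would be a per-installation decision.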
Project: Metasearch Considerations
• Local: configure/control, compatibility, licensing, political/privacy/administrative
• Application: protocols, syntax, authentication, configuration, target parameters (presentation, configuration, technical), deployment, results, interface, management, consortial support, hardware, interoperability
• Vendor: implementation costs, maintenance costs, support, roadmap/vision, selection process

Project: Metasearch Architecture
• Who: Walter Lewis, calvin mah, Art Rhyno
• What: How do you design an architecture for metasearch that can be used in different environments?
• Artifact: design docs, sample profiles

Project: Metasearch Architecture
• Design Layers:
  – Targets
  – Instances
  – Branding
  – Application space
• How do you model/define schemas for each?
• SETH: "Search Everything 'Till it Hurts"

Project: Metasearch Arch.: Target
<?xml version="1.0"?>
<targets>
  <target>
    <user_agent> <default/> </user_agent>
    <host/>
    <HTTP_parms>
      <steps no="">
        <base_HREF />
        <method> GET|POST </method>
        <nvp name=""/>
        <extract name="" type="regexp/Xpath"/>
        <cookie name="" path="" domain="" age="" secure=""> PASS_FORWARD COLLECT </cookie>
      </steps>
    </HTTP_parms>
    <Z_parms>
      <database_name/>
      <port/>
      <result_set_naming_required> </result_set_naming_required>
      <Z_RecordSyntaxes> <Z_Syntax/><Z_Syntax/> </Z_RecordSyntaxes>
    </Z_parms>
    <wsdl>URI</wsdl>
    <target_URL/><last_updated/>
  </target>
</targets>

Project: Metasearch Arch.: Instance
<?xml version="1.0"?>
<target_sets>
  <target>
    <search_label />
    <search_descriptor />
    <result_label />
    <preferred_record_syntax> mergeability criteria dedup </preferred_record_syntax>
    <result_ranking />
    <transformer />
    <timeout />
    <authentication>
      <user>machine login/pass</user>
      <target> SAML?
      referrer apache_style form certificates</target>
    </authentication>
    <resolver_url />
    <search_types>
      <search_type> SUBJECT|TITLE| <transform type="HTML|SCREEN|PDF" /> </search_type>
    </search_types>
    <generator />
    <target_hints> <!-- hand off when search fails --> URI|text </target_hints>
    <!-- if failed -->
    <alt_target/>
    <meta>
      <instance_name />
      <form_type/><form_label />
      <help_files />
    </meta>
  </target>
</target_sets>

Project: Metasearch Guide
• Who: Joyce Wong, Simon Lloyd, Lissa Potter
• What: An interactive tool that incorporates critical thinking processes from library tutorials and help guides with metasearch functions
• Artifact: A new prototype user interface

Project: Metasearch Guide
• Major changes between the prototype and the new draft include:
  – a more structured approach in which the tool is presented as a series of steps.
  – emphasis on examples to guide students through the critical thinking
  – the user is asked for backup search words near the beginning of the tool. The backup search words are then included as alternate search strategies later on.
  – demos in the form of videos

Project: Metasearch Guide
• The team also discussed more advanced features such as:
  – "test run" options by which users can test their search statements through a preliminary result screen that provides both qualitative and quantitative evaluations.
  – the option to save their search history in some form of personalized

Project: Rakoon
• Who: Peter Binkley, Corey Davis, John Durno, Kenton Good, Michael Hohner, Ross Singer, Steve Zinck
• What: a co-browser for RAKIM (virtual reference tool) using Cocoon
• Artifact: a working prototype

Project: Rakoon
• Co-browsing:
  – Bandwidth-intensive session w/shared screen, mouse, etc. (e.g. QuestionPoint)
  – Co-proxy with regular screen refreshes from shared cache (e.g.
24x7)
• Used latter as model, creating proxy using Cocoon, built on Art Rhyno's Hackfest I project

Project: Rakoon
• Future development:
  – Proxies HTML well, but not other media types
  – Security audit
  – PATRIOT Act issues
  – Actual integration w/RAKIM (!)

Project: RSS From Repositories
• Who: Kristina Aston, Cameron Metcalf, Pat Moore, Miles Poindexter
• What:
  – getting an RSS feed out of a digital repository classified in their field of study
  – alerting users when new items are added
  – an RSS feed of new book acquisitions on a library's homepage instead of statically-generated HTML pages.
• Artifact: Prototypes! See http://rockies.med.yale.edu/~group8/

Project: Mirroring Weblogs
• Who: Dan Chudnov, Brian Tingle
• What: How to enable a LOCKSS-like "lots of copies" of weblog data? If an "important" weblog "goes dark," how can we re-light it elsewhere?
• Artifact: Simple design, diagram

Summary
• 30+ people
• 8+ projects
• Wide range of activities:
  – Focus on metasearch and personal library
  – New service models, new models for existing services
  – Working demos
  – Whiteboard-only hacking
  – Building on previous years' work

Thoughts on Process
• Still not sure whether to share suggestions beforehand; good reasons for/against
• Quick re-assessment of ideas, skill balance soon after project groups assemble
• Post-lunch-ish reassembly, quick reports
• Whole-day pre-conf, single location, format works
• Wiki helpful for organizing projects, perhaps we can do more with it

New for 2004: Hackfest Awards
The "There's More Than One Way To Hack It" Duct Tape

Hackfest Awards
Art Rhyno: Access Pimp (self-proclaimed!) (honest!!)

Hackfest Awards
2002, 2003, 2004
Peter Binkley: Lifetime Achievement

Acknowledgements
• Tamsin, Steven, Peter, et al. at Acadia, Saint Mary's for everything, esp.
logistics, support
• Saint Mary's for facilities
• John for co-coordinating
• SFU (calvin), YCMI for servers
• Roy for the hype
• All the participants!