Eco (Bio)informatics Website Development 101
A Primer on Creating Biologically Based Websites

Introduction
Modern, biologically oriented websites have evolved rapidly in the last ten years, and will continue to evolve at least as rapidly for the foreseeable future.
• Web 2.0: wikis, crowdsourcing, blogs, Flickr, YouTube
• Interactive queries and graphing
• Static tables and ‘click here to download’ interactivity
• Flash- and Java-powered interactivity and effects
• The next big thing…?

Fundamentals
• Know your site’s mission
• Know your audience
• Taxonomy
• Discovery
• Web 2.0
• Copyright and ownership issues
• Getting Started

What is Your Mission?
• What is the purpose (or purposes) of your website?
  • Outreach
  • Research
  • Communication
  • Data portal
  • Analytical tool
  • Education
• Which leads us to the next question…

Who is your Audience?
• Identify your audience(s)
• Major audience groups include:
  • Elementary school
  • High school
  • University
  • Researcher
  • General public
  • Decision makers

Who is your Audience?
• Each group requires different language sets, different assumptions about prior knowledge, different font/color styles, and different tools
• Targeting multiple audiences with one website is possible, but difficult to do well
• Multiple entrance paths and templates can be used

Taxonomy
• The single most difficult issue in managing biological data sets
• Species name ≠ species concept
• A species name is a particular label that someone has applied to a particular species concept
• A given species concept may have many names (synonyms)

Taxonomy
A species name is defined by a Latin binomial and an authority. The authority is critical because it identifies who originally described the species.
Currently accepted name: Incilius melanochlorus Cope, 1877
Synonyms:
  Bufo melanochlora Cope, 1877
  Ollotis melanochlora Cope, 1877

Taxonomy
A species name may also include lower-rank names that define a variety and/or subspecies.
The lower-rank names also have authorities.
Swartzia simplex (Sw.) Spreng. var. continentalis Urb.

Taxonomy
Ideally, there should be a one-to-one match between name and concept. Unfortunately, the world is not ideal…
• Many names are being revised due to misspellings of the original Latin name
• Disagreements about what constitutes a species (lumpers vs. splitters, geneticists vs. naturalists, ecologists vs. taxonomists)
• Disagreements among major players such as ITIS, GBIF, Species 2000 and group-specific sites

Taxonomy
DNA and the new Taxonomy
• DNA analysis has led to major revisions of all major kingdoms
• Changes are being made at all taxonomic levels, from phylum on down
• The Angiosperm Phylogeny Group (APG) aims to reorganize the entire angiosperm group
• The next 10 years will see major shake-ups of many major taxonomic groups

Taxonomy
Out of Chaos, order…
• Rule #1. Accept that there is no universal agreement and move on
• Rule #2. Pick a source or sources and stick with them
• Rule #3. Manage taxonomy separately from metadata
• Rule #4. Use species catalog numbers rather than names to link objects together

Taxonomy
Synonyms can be handled using four variables:
• Spnumber: Species name catalog #
• Taxstat: Taxonomic status of name
  • Accepted
  • Synonym
  • Excluded
  • Incomplete
• Synof: Spnumber of the accepted name for this species synonym
• Synonyms: Synonym(s) for this accepted name

Taxonomy
Request: I want a picture of Algus grenus

Taxonomic Database
Name           Spnumber   Taxstat   Synof   Synonyms
Algus grenus   5212       SYN       1234    .
Algus verdus   1234       ACC       .       5212

The Photo Database is then searched with Spnumber = 5212, 1234

Discovery
Build for Discovery…
Putting something on the web has little value if no one can find it
• Metadata
• Optimizing site navigation
• Understand how search engines work
• Understand how your audience thinks

Discovery
Metadata
It’s more than just information about photos…
• It’s information about every object that you want people to know about in the future
• It’s the primary method to rigorously document the who, what, when, where and how of an object and to make it machine searchable
• The best metadata takes the available information and atomizes it as much as possible

Discovery
This thing found here doing this on this date by this person & verified by

Discovery
Only atomized information can be searched efficiently. Reserve free-text information for unsearched titles and comments.
Control the vocabulary of the information used in databases. Any spelling difference, no matter how minor, will be interpreted as a different value.
Controlled vocabulary makes it easier for the user to search for and discover information.

Discovery
Navigation: Multiple access routes
Traditional Linear Navigation Design
[Diagram: Home Page linked to Project 1, Project 2, Project 3 and Project 4]
"He's intelligent, but not experienced. His pattern indicates two-dimensional thinking…"

Discovery
Navigation: Multiple access routes
Multidimensional Navigation Design
• Use persistent tabs and menus
• Search boxes
• Embedded hyperlinks
• Anticipate user navigation behavior
[Diagram: Homepage]

Discovery
Navigation: Minimizing Clicks
Always aim to minimize the average number of clicks that a user needs to go from any page on your website to any other. Ideally, a user should not need more than 3-4 clicks to go from anywhere to anywhere else.

Web 2.0
What is this Web 2.0 thing?
The answer depends on who you ask.

Web 2.0
“Web 2.0” refers to the second generation of web development and web design that facilitates information sharing and collaboration on the World Wide Web. Examples include social-networking sites, video-sharing sites, wikis, blogs, mashups and folksonomies. (Wikipedia)

Web 2.0
• Web 2.0 websites allow users to do more than just retrieve information.
• Users can own the data on a Web 2.0 site and exercise control over that data.
• These sites may have an "architecture of participation" that encourages users to add value to the application as they use it. This stands in contrast to traditional websites, the sort that limited visitors to viewing and whose content only the site's owner could modify.
• Web 2.0 sites often feature richer, user-friendly interfaces.

Web 2.0
Popular examples of Web 2.0 websites include:
• Wikipedia
• Flickr
• YouTube
• eBuddy
• Digg
• TravBuddy

Web 2.0
• Search. The ease of finding information through keyword search.
• Links. Ad-hoc guides to other relevant information.
• Authoring. The ability to create constantly updating content over a platform that is shifted from being the creation of a few to being constantly updated, interlinked work. In wikis, the content is iterative in the sense that users undo and redo each other's work. In blogs, content is cumulative in that posts and comments of individuals are accumulated over time.

Web 2.0
• Tags. Categorization of content by creating tags: simple, one-word user-determined descriptions to facilitate searching and avoid rigid, pre-made categories.
• Extensions. Powerful algorithms that leverage the Web as an application platform as well as a document server.
• Signals. The use of RSS* technology to rapidly notify users of content changes.
*(most commonly expanded as "Really Simple Syndication," but sometimes "Rich Site Summary")

Copyright & Ownership
• Copyright is an important issue
• Copyright law is complex, often vague, and varies considerably between countries
• Ignorance is not an excuse: get informed

Copyright & Ownership
Who owns this file?
• Works created by US federal government employees as part of their official duties are in the Public Domain and not subject to US copyright. For other federally funded work, ownership depends on the terms of the funding agreement.
• Otherwise, copyright is automatic (under US law)

Copyright & Ownership
Objects can be re-copyrighted by others only if and when ‘significant new’ artistic content has been added.
Contrast enhancement, color corrections, sharpening, etc., do NOT constitute new artistic content.

Copyright & Ownership
Creative Commons Licenses
Creative Commons is a nonprofit corporation dedicated to making it easier for people to share and build upon the work of others, consistent with the rules of copyright.
CC provides free licenses and other legal tools to mark creative work with the freedom the creator wants it to carry, so others can share, remix, use commercially, or any combination thereof.

Copyright & Ownership
There are six current license agreements:
1. Attribution
2. Attribution, No derivatives
3. Attribution, Non-commercial, No derivatives
4. Attribution, Non-commercial
5. Attribution, Non-commercial, Share-alike
6. Attribution, Share-alike

Copyright & Ownership
Attribution: You let others copy, distribute, display, and perform your copyrighted work - and derivative works based upon it - but only if they give you credit.
Noncommercial: You let others copy, distribute, display, and perform your work - and derivative works based upon it - but for noncommercial purposes only.
No Derivative Works: You let others copy, distribute, display, and perform only verbatim copies of your work, not derivative works based upon it.
Share Alike: You allow others to distribute derivative works only under a license identical to the license that governs your work.

Copyright & Ownership
Fair use is a doctrine in United States copyright law that allows limited use of copyrighted material without requiring permission from the rights holders, such as use for scholarship or review. It provides for the legal, non-licensed citation or incorporation of copyrighted material in another author's work under a four-factor balancing test. The term "fair use" originated in the United States, but has been added to Israeli law as well; a similar principle, fair dealing, exists in some other common law jurisdictions. Civil law jurisdictions have other limitations and exceptions to copyright. (Wikipedia)

Copyright & Ownership
In determining whether the use made of a work in any particular case is a fair use, the factors to be considered include:
1. the purpose and character of the use, including whether such use is of a commercial nature or is for nonprofit educational purposes;
2. the nature of the copyrighted work;
3. the amount and substantiality of the portion used in relation to the copyrighted work as a whole; and
4. the effect of the use upon the potential market for or value of the copyrighted work.

Getting Started
Once you know your site’s mission and have an idea of who your audience is, do the following:
1. Sketch out the major logical blocks of your site
2. Search the web for similar sites and make a list of what works and what doesn’t
3. Imitate and copy the good stuff (people usually like it when you ‘steal’ their design ideas)
4. Avoid their mistakes
5. If you are creating a website for someone else, check frequently with the PI regarding design and programming decisions

Getting Started
If you are going to make many websites:
1. Invest time in creating tools that can be shared between sites
2. Invest time in adopting a Content Management System (CMS)
3. Design your websites so that they can share data
4. Select a style and stick with it (CSS)

Getting Started
No matter how many websites you have:
1. Document what you are doing
  • Internal programming comments (you can never have too many)
  • External programming documentation listing all major program blocks, procedure calls, parameters passed, etc.
  • Database documentation: variables (types & definitions) and general content
2. Back up often or bad things will happen to you
3. For really big projects, consider implementing rollback technology

Getting Started
What tools should you use?
The most common suite of tools for low-budget, non-commercial operations includes:
• MySQL databases
• PHP programming language (and/or Perl)
• Flash and/or JavaScript
• Linux operating system
• Apache server

Getting Started
What other tools might you use?
There are many good tools available, both commercial and non-commercial (open source):
• ArcGIS by ESRI, GRASS, Minnesota MapServer
• Drupal CMS
• Wiki software
• Blogging software
• Graphing applications
There are lots of arguments for and against commercial and open source software. There is also the possibility of creating your own software tools. Mixed models often work well.

Questions and Comments
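
Supplementary sketch: synonym handling
The Taxonomy slides describe linking objects by species catalog number and resolving synonyms through the Spnumber, Taxstat, Synof and Synonyms variables. The Python sketch below is one possible way to express that lookup, not the presenter's implementation. It reuses the Algus grenus / Algus verdus example; the in-memory tables, the photo file names, and the function names linked_spnumbers and photos_for are hypothetical, chosen only for illustration. A real site would more likely keep these tables in a database such as MySQL.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class TaxonName:
    """One row of the taxonomic table (field names follow the slides)."""
    name: str
    spnumber: int                 # species name catalog number
    taxstat: str                  # "ACC" (accepted) or "SYN" (synonym)
    synof: Optional[int] = None   # Spnumber of the accepted name, if a synonym
    synonyms: List[int] = field(default_factory=list)  # Spnumbers of synonyms


# Hypothetical example data mirroring the slide's table.
TAXA: Dict[int, TaxonName] = {
    5212: TaxonName("Algus grenus", 5212, "SYN", synof=1234),
    1234: TaxonName("Algus verdus", 1234, "ACC", synonyms=[5212]),
}

# Hypothetical photo table: photos are linked by catalog number, not by name.
PHOTOS = [
    {"file": "photo_a.jpg", "spnumber": 5212},
    {"file": "photo_b.jpg", "spnumber": 1234},
    {"file": "photo_c.jpg", "spnumber": 9999},
]


def linked_spnumbers(name: str, taxa: Dict[int, TaxonName]) -> List[int]:
    """Return the Spnumbers of the name, its accepted name and all synonyms."""
    wanted = name.strip().lower()
    match = next((t for t in taxa.values() if t.name.lower() == wanted), None)
    if match is None:
        return []
    # A synonym points to its accepted name through Synof.
    if match.taxstat == "SYN" and match.synof in taxa:
        accepted = taxa[match.synof]
    else:
        accepted = match
    numbers = [match.spnumber, accepted.spnumber, *accepted.synonyms]
    return list(dict.fromkeys(numbers))  # drop duplicates, keep order


def photos_for(name: str, taxa: Dict[int, TaxonName], photos: list) -> List[str]:
    """Find photo files filed under any Spnumber linked to the given name."""
    numbers = set(linked_spnumbers(name, taxa))
    return [p["file"] for p in photos if p["spnumber"] in numbers]


if __name__ == "__main__":
    print(linked_spnumbers("Algus grenus", TAXA))
    print(photos_for("Algus grenus", TAXA, PHOTOS))
```

Because the photo table stores only catalog numbers, a later taxonomic revision changes rows in the taxonomic table alone; photo records do not need to be edited.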