An Introduction to Treejack
Out on a limb with your IA
Dave O'Brien, Optimal Usability

Welcome
• Dave O'Brien, Optimal Usability – Wellington, New Zealand
• 22 Jan 2010
• 36 attendees (USA, CA, UK, NZ, AU, BR, CO)

Agenda
• Quickie Treejack tour
• What is tree testing?
• Planning a tree test
• Setting up Treejack
• Running a test
• High-level results
• Detailed results
• Lessons learned
• (Q&A throughout)

Poll – have you used Treejack yet?
• No, haven't tried it yet = 20%
• Yes, but only a practice test = 60%
• Yes, have run a "real" test = 20%

Tree testing – the 5-minute tour
• You're creating a medium or large website – does your top-down structure make sense?
• Does your structure work?
– Can users find particular items in the tree?
– Can they find them directly, without having to backtrack?
– Can they choose between topics quickly, without having to think too much?
– Which parts of your tree work well? Which fall down?
• The process:
– 1. Create a site tree
– 2. Write some tasks
– 3. Put these into Treejack
– 4. Invite participants
– 5. Participants do the test
– 6. You see the results
• Live demo for participants*

What is tree testing, really?
• Testing a site structure for:
– Findability
– Labelling

What's it good for?
• Improving the organisation of your site
• Improving top-down navigation
• Improving your structure's terminology (labels)
• Comparing structures (before/after, or A vs. B)
• Isolating the structure itself
• Getting user data early (before the site is built)
• Making it cheap & quick to try out ideas

What it's NOT
• NOT testing other navigation routes
• NOT testing page layout
• NOT testing visual design
• NOT a substitute for full user testing
• NOT a replacement for card sorting

Origin
• Paper tree testing – "card-based classification" – Donna Spencer
• Show lists of topics on index cards
• In person, score manually, analyse in Excel

Make it faster & easier
• Create a web tool for remote testing
• Quick for a designer to learn and use
• Simple for participants to do the test
• Able to handle a large sample of users
• Able to present clear results
• Quick turnaround for iterating

But I already do card sorting!
• Open card sorting is generative
– Suggests how your users mentally group content
– Helps you create new structures
• Closed card sorting comes close – but not quite
• Tree testing is evaluative
– Tests a given site structure
– Shows you where the structure is strong & weak
– Lets you compare alternative structures

A useful IA approach
• Run a baseline tree test (existing structure)
– What works? What doesn't?
• Run an open card sort on the content
– How do your users classify things?
• Come up with some new structures
• Run tree tests on them (same tasks)
– Compare them to each other
– Compare them to the baseline results

Planning a tree test
• Stakeholder interview – find out who, what, when, etc.
– Fill in the "planning questions" template
• Get the tree(s) in digital format
– Use the Excel tree-import template, etc.

Getting the tree
• Import from a digital format
– Excel
– Text file
– Word
• Or enter it directly in Treejack

Poll – how big are your trees?
• Small (fewer than 50 items) = 25%
• Medium (50–150 items) = 39%
• Large (150–250 items) = 22%
• Huge (more than 250 items) = 14%

Tree tips
• Recommend <1000 items
• Bigger? Cut it down by:
– Using the top N levels (e.g. 3 or 4)
– Testing subtrees separately*
– Pruning branches that are unlikely to be visited
• Remove "helper" topics
– e.g. Search, Site Map, Help, Contact Us
• Watch for implicit topics!
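The "cut it down" tips above lend themselves to a bit of scripting before you import anything. Here is a minimal sketch, assuming a tree held as nested Python dicts (a made-up representation for illustration, not Treejack's import format), that keeps only the top N levels and strips out the usual helper topics:

# A rough sketch (plain Python, not a Treejack feature) of trimming a big tree
# before import: keep only the top N levels and drop "helper" topics.
# The example tree, labels, and depth limit are made up for illustration.

HELPER_TOPICS = {"Search", "Site Map", "Help", "Contact Us"}

# A site tree as nested dicts: {label: {child label: {...}, ...}}
site_tree = {
    "Home": {
        "Products": {"Widgets": {"Blue Widget": {}, "Red Widget": {}}},
        "Support": {"FAQs": {}, "Downloads": {}},
        "Contact Us": {},
        "Search": {},
    }
}

def prune(tree, max_depth, depth=1):
    """Return a copy cut to max_depth levels, with helper topics removed."""
    trimmed = {}
    for label, children in tree.items():
        if label in HELPER_TOPICS:
            continue                     # drop Search, Site Map, Help, etc.
        if depth >= max_depth:
            trimmed[label] = {}          # keep the topic, discard deeper levels
        else:
            trimmed[label] = prune(children, max_depth, depth + 1)
    return trimmed

print(prune(site_tree, max_depth=3))
# {'Home': {'Products': {'Widgets': {}}, 'Support': {'FAQs': {}, 'Downloads': {}}}}

In practice you would do the same trimming in whatever format your tree already lives in (Excel, Word, a text file) before pasting it into Treejack.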
Implicit topics
• Create your tree based on the content, not just the page structure.
• [Example slides: mock pages whose navigation and body copy link to Home, Products, Support, Contact Us and to North America, South America, Europe – links like these that appear in page content may need to become explicit topics in the tree.]

User groups and tasks
• Identify your user groups
• Draft representative tasks for each group
– Tasks must be "real" for those users!
• ~10 tasks per participant
– Beware the learning effect
– Small tree ~8, large tree ~12
– More tasks? Limit the number per participant
– Randomise the task order

Drafting tasks
• What parts of the tree do you want to test?
– Coverage should reflect importance
• Each task must:
– Be specific
– Be clearly worded
– Use the customer's language
– Be concise
• Beware "give-away" words!
• Review now, preview before the real test

Setting up a Treejack project
• Creating a Treejack project
• Entering your tree
• Entering the tasks and answers
• Less on mechanics, more on tips

Creating a project
• New vs. Duplicate
• Survey name vs. address
• Identification
– The "Other" option
– Passing an argument in the URL, e.g.
https://demo.optimalworkshop.com/treejack/survey/test1?i=12345

Entering your tree
• Paste from Excel, Word, a text file, etc.
• "Top" – how to replace it
• Randomising
– Not the same as randomising tasks
• Changing the tree after entering answers
• Lesson learned:
– Edit, review, and finalise the tree elsewhere before putting it into Treejack

Entering tasks and answers
• Preview is surprisingly useful
• Multiple correct answers
– The "main" answer is usually not enough
– Check the entire tree yourself
• Must choose bottom-level topics
– Workaround: mark all subtopics correct
– Workaround: remove the subtopics
• Choose answers LAST

Task options
• Randomising tasks – almost always
• Limiting the # of tasks
– 20–30 tasks = 10 per participant
– Increase the # of participants to get enough results per task
• Skip limit
– Eliminates users who didn't really try
– Defaults to 50%

Testing the test
• Not previewing/piloting is just plain dumb
– Spot mistakes before launch
• Preview the entire test yourself
• Pilot it with stakeholders and sample users
– Launch it, get feedback, duplicate, revise
• Look for:
– Task wording (unclear, ambiguous, typos)
– Unexpected "correct" answers
– Misc. problems (e.g. instructions)

Poll – how many participants do you get per test?
• 1–20 = 44%
• 21–40 = 20%
• 41–100 = 24%
• Over 100 = 12%

Running the tree test
• Invite participants
– Website-page invitations
– Email invitations (see the link sketch below)
• Recommend >30 users per user group/test
• Monitor early results for problems
– Low # of surveys started? Email invitation not clear? Subject looks like spam? Not engaging?
– Low completion rate? Email didn't set expectations? Test too long? Too hard?
• Generally less taxing than card sorting
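A small aside on those email invitations: if you use the identifier argument shown on the "Creating a project" slide (the ?i= parameter on the survey address), you can generate a personalised link per participant ahead of time. A minimal sketch in plain Python; the survey address is the demo one from that slide, and the participant IDs are made up:

# Sketch: build one invitation link per participant using the ?i= identifier
# argument. The survey address is the demo one shown earlier; the IDs are
# made-up examples, not real participants.
from urllib.parse import urlencode

SURVEY_URL = "https://demo.optimalworkshop.com/treejack/survey/test1"
participant_ids = ["12345", "12346", "12347"]

for pid in participant_ids:
    link = f"{SURVEY_URL}?{urlencode({'i': pid})}"
    print(link)   # paste each link into that participant's email invitation

Each completed session can then be tied back to the invitation it came from, without asking participants to identify themselves inside the survey.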
Skimming high-level results
• 10/100/1000 level of detail
• Middling overall score?
– Often many highs with a few lows
• Inspect tasks with low scores (low total or low sub-scores)
• Inspect the pie charts

Success
• % who chose a correct answer (directly or indirectly)
• Low Success score?
– Check the spreadsheet to see where they went wrong
– Destinations tab, Paths tab

Directness
• % of successful users who did not backtrack
– Coming soon: making this independent of success
• Low Directness score?
– Check the spreadsheet for patterns in their wandering
– Paths tab

Speed
• % who completed this task at about the same speed as their other tasks
– i.e. % who completed the task within 2 standard deviations of their average time across all their tasks
• A 70% Speed score means:
– 7/10 users went at their "normal" speed
– 3/10 users took substantially longer than is normal for them
• Low Speed score?
– Indicates that users hesitated when making choices
– e.g. choices are not clear, or not easily distinguishable from each other
• Wish: add the raw times to the spreadsheet, so you can do your own crunching as needed (a rough sketch of that kind of crunching is in the appendix at the end of this deck)
• The Overall score uses a grid to combine these scores in a semi-intelligent fashion

Detailed results – destinations
• Where did people end up?
• # who chose a given topic as the answer
• Wrong answers
– High totals = a problem with that topic (perhaps in relation to its siblings)
– Clusters of totals = a problem with the parent level
• Ignore outliers
– For >30 sessions, ignore topics that get <3 clicks

Detailed results – destinations (continued)
• Look for high "indirect success" rates (>20%)
– Check paths for patterns of wandering
• Look for high "failure" rates (>25%)
– Check the wrong answers above
• Look for high skip rates (>10%)
– Check paths for where they bailed out
• Look for "evil attractors"
– Topics that get clicks across several seemingly unrelated tasks
– Usually a vague term that needs tightening up

Detailed results – first clicks
• Where they went on their first click
– Important for task success
• Which sections they visited overall
– Did they visit the right section but back out?

Detailed results – paths
• Click-by-click paths that they took through the tree
• Useful when asking:
– How the heck did they get way over there?
– Did a lot of them take the same detour?
• No web UI for removing participants
– Email Support and we'll fix you up

Some lessons learned
• Test new against old
• Revise and test again – quick cycles
• Test a few alternatives at the same time
• Cover the sections according to their importance
• Analysis is easier than for card sorting
• Use in-person testing to get the "why"
– Paper is still effective (and free!) for this
• Tree testing is only part of your IA work

What's coming
• Better scoring for Directness and Speed
• Improved results (10/100/1000)
• General enhancements across Treejack, OptimalSort, and Chalkmark
• Whatever you yell loudest for…
– GetSatisfaction lets you "vote" for issues

Tree testing – more resources
• Boxes & Arrows article on tree testing
http://www.boxesandarrows.com/view/tree-testing
• Donna Spencer's article on paper tree testing
http://www.boxesandarrows.com/view/card_based_classification_evaluation
• Treejack website – webinars, slides, articles, user forum
http://www.optimalworkshop.com

Getting your input
• Specific issues/questions
– support@optimalworkshop.com
• Feature requests
– Check the support forum (GetSatisfaction)
– "Feedback" button

Thanks!
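Appendix – crunching raw task times (a sketch)

The Speed slide defines the score as the percentage of participants who completed a task within 2 standard deviations of their own average task time. If the raw times were in the spreadsheet (the "wish" above), you could do that crunching yourself. This is a minimal plain-Python sketch of the idea only: the data layout, names, and numbers are made up, it is not Treejack's export format or its exact calculation, and only the slower-than-normal side is flagged.

from statistics import mean, stdev

# times[participant] = seconds spent on each of their tasks, in task order
# (made-up numbers: three participants, eight tasks each)
times = {
    "p1": [12, 15, 14, 11, 16, 13, 12, 55],
    "p2": [10, 11, 13,  9, 12, 10, 11, 14],
    "p3": [30, 28, 25, 33, 27, 29, 31, 26],
}
TASK = 7   # index of the task being scored (the eighth task)

def speed_score(times, task_index):
    """% of participants who did this task at roughly their normal speed."""
    normal = 0
    for ts in times.values():
        avg, sd = mean(ts), stdev(ts)        # this participant's own baseline
        if ts[task_index] <= avg + 2 * sd:   # not substantially slower than usual
            normal += 1
    return 100 * normal / len(times)

print(f"Speed score for task {TASK + 1}: {speed_score(times, TASK):.0f}%")

With these made-up numbers the eighth task scores about 67%: p2 and p3 finish it at their usual pace, while p1's 55 seconds falls outside their own 2-standard-deviation band.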