Images, Alternative Text, and Artificial Intelligence

SSB BART Group
IT Accessibility Problem - Solved™
  Silicon Valley (415) 975-8000 sales@ssbbartgroup.com
  Washington DC (703) 442-5023 sales@ssbbartgroup.com

Agenda
  About Us
  About Me
  The Project
  What’s Next
  http://amp.ssbbartgroup.com/public/research/Automatic_Image_Classification_090707.doc
  http://amp.ssbbartgroup.com/public/research/SSB_BART_Group_Image_Alt_CSUN_2008.ppt

Silicon Valley (415) 975-8000 www.ssbbartgroup.com Washington DC (703) 637-8955

Corporate Overview
  History
    Founded in 1997 by engineers with disabilities
    750 commercial and government customers
    1,500 enterprise projects successfully completed
    Pioneers of commercial accessibility validation tools
    Fifty percent staffing mix of individuals with disabilities
  Approach
    Violation profiling across 5.5M human validated accessibility issues
    Data driven and scalable
    Appropriately mixed automated, human and code level validation
  Scalable Solutions
    One to one million developers
    One to one thousand production systems

Supported Platforms
  Web
    HTML
    XML
    JavaScript
    CSS
    AJAX
    Adobe Flash and Flex
    Adobe Acrobat Documents
    Streaming Audio and Video
  Compiled Software
    JFC and SWT Java Applications
    .Net Applications
    MFC Windows Native Applications
    Macintosh Applications
    BMC Remedy Applications
  Standalone Systems
    Telecommunications Hardware
    IVR Systems
    Agent Systems
    Digital Imaging

Industry Solutions
  Public Sector, Federal Solutions, United States, Manufacturers, European Union,
  Education, K-12, Government, System Integrators, Healthcare, Software, Hardware,
  Web Based Service Providers, Mass Transit, Financial Services, State and Local,
  Universities, Information Technology, Consumer Banking, Insurance, Legal,
  Web Based Service Providers, Primary Care Providers, Insurance

Accessibility Management Platform
  Requirements            Implementation          Certification
  Baseline Audit          Development Audit       Maintenance Audit
  Standards Development   Standards Maintenance   VPAT Creation
  eLearning               Developer Support       Certification
  InFocus™ Suite

  AMP - SSB’s web based platform for managing all aspects of the accessibility process
  Benefits
    Single point for tracking compliance over time
    Scalable solutions from one to one million developers across multiple domestic markets
    Support for all aspects of a successful accessibility initiative

About Me
  General Story
    Founder and Managing Director of SSB BART Group
      Also Known As President and CEO
    Professional web site developer for 13 years
      Started in 1994 at the dawn of the Web
    BS Computer Science
      Leland Stanford Junior University (AKA Stanford)
    Odds on Brad Pitt to play me in the movie
  Accessibility Work
    Involved in Web Accessibility activities, validation and education since 1999
    Architected and developed first commercial accessibility testing and fixing tool
      InSight and InFocus 1.x -> 4.x
      Initial release in mid-200
      Next release in a few months
    Architected and developed Accessibility Management Platform (AMP)
      Current Version - 2008 R1
    Personal work with fifty enterprise class software vendors

Project Overview
  Project Description
    Create a decision tree to classify images into one of eight types
    Image types are organized by alternative text requirements
    Once an image is classified, the validity of its alternative text can be tested via straightforward heuristics
  Project Utility
    Alternative text provides a textual description of an image
    Alternative text validity
      Ensures access to content for people with disabilities
      Allows pages to be adapted effectively - low resolution, alternative browsers
      Increases search engine relevance for pages
    Bottom Line - Good alternative text is good for society and good for profits

Automated Testing Tools
  A brief note on automated testing tools
    The first generation of automated testing tools, which is where we are now, can test about 25% of requirements accurately
    Another 25% with so-so accuracy
    And the rest need to be checked manually
    We think the next generation of tools can double this efficacy through better AI, more complex page models and better leveraging of human judgment…
    …but ultimately tools can only facilitate the process of human review; they cannot replace it

Image Types
  Layout Element - The image is used solely to lay out elements on the page
  Decorative Picture - The image is a picture that is used solely to make the page more visually appealing; it provides no information
  Text - The image is used to stylize text on the page but is not used as an active element on the page
  Picture - The image is a picture that contains information important to the use of the page
  Hidden Link - The image provides a “hidden” link on a page for search engine optimization or screen reader users
  Linked Text - The image is used to stylize text and provide a link to another page
  Skip Link - The image is the root of an inner-document link that provides a means of skipping past page content that is not relevant
  Linked Picture - The image is a picture that provides a link to another page

Variables
  Width                 The width of the image
  Height                The height of the image
  Edge Count            The number of vertical and horizontal edges in the image
  Size                  The rectangular size of the image, or width times height
  File Size             The size of the file in bytes
  Link                  Whether or not the image is a link
  Inner-document Link   Whether or not the image is a link within the current document
  Color Depth           The number of unique colors that the image has

Project Functionality
  Challenge
    No database of relevant image classifications exists
    Subject Matter Experts (SMEs) use experience to determine form of alternative text
    Without a good data set the decision tree isn’t going to decide much
  Solution
    Build a spider to crawl sites and gather sample data
    Classify the images using a basic interface
    Store the image classification and additional variables in a database
    Build a decision tree from the database rather than a live site
    Repeat using updated tree
  Result
    Created an image database of 1000 images with about an hour of actual data entry

Project Functionality
  Challenge
    Build the decision tree
    …which became build the decision tree before the end of time
    …which became build the decision tree once and store it for later use
  Discussion
    Building the tree is fairly straightforward and involves splitting on variables and analyzing remaining sets
    Implementation uses the Russell and Norvig algorithm
    More on the tricky parts later
    The “catch” - a lot of the queries involve eliminating groups of images
    SQL doesn’t have good concepts for handling unordered sets of keys so you enumerate out elements for queries…

Project Functionality
  Discussion (Continued)
    This results in lots of nasty queries and a fair amount of time to build the tree
    This more or less grows exponentially as you add variables and quanta
  Solution
    Build the tree once and persist to disk
    Limit quanta for variables and require minimum information gain
  Result
    Creation of the tree takes about forty minutes
    Reading in the tree takes about forty milliseconds
    Resolving against the tree takes about forty nanoseconds
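
A minimal sketch of the tree-building loop described in the slides above, written in Python purely for illustration. The deck does not show its implementation, which ran against a SQL database, so treat everything here as an assumption: the row layout, the "type" key, the attribute names and the sample rows are made up. The sketch splits on the attribute with the highest information gain, returns the modal image type when the best gain falls under 0.05 bits (the threshold the deck quotes under The Tricky Parts), and round-trips the tree through JSON to mimic "build once and persist".

```python
import json
import math
from collections import Counter

MIN_GAIN_BITS = 0.05  # minimum information gain required to split


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(rows, attribute):
    """Entropy reduction from splitting rows on one quantized attribute."""
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row["type"])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy([row["type"] for row in rows]) - remainder


def modal_type(rows):
    """Most common image type in the remaining set."""
    return Counter(row["type"] for row in rows).most_common(1)[0][0]


def build_tree(rows, attributes):
    """Return a nested dict; leaves are image-type strings."""
    if not attributes or len({row["type"] for row in rows}) == 1:
        return modal_type(rows)
    best = max(attributes, key=lambda a: information_gain(rows, a))
    if information_gain(rows, best) < MIN_GAIN_BITS:
        return modal_type(rows)  # not worth splitting any further
    remaining = [a for a in attributes if a != best]
    branches = {}
    for value in sorted({row[best] for row in rows}):
        subset = [row for row in rows if row[best] == value]
        # Keys are stored as strings so the tree survives a JSON round trip.
        branches[str(value)] = build_tree(subset, remaining)
    return {"split": best, "default": modal_type(rows), "branches": branches}


def classify(tree, image):
    """Walk the tree; unseen attribute values fall back to the node's modal type."""
    while isinstance(tree, dict):
        tree = tree["branches"].get(str(image[tree["split"]]), tree["default"])
    return tree


# Hypothetical, hand-made sample rows (already quantized to integers).
rows = [
    {"link": 1, "edges": 2, "type": "Linked Text"},
    {"link": 1, "edges": 2, "type": "Linked Text"},
    {"link": 1, "edges": 0, "type": "Linked Picture"},
    {"link": 0, "edges": 2, "type": "Text"},
    {"link": 0, "edges": 0, "type": "Layout Element"},
    {"link": 0, "edges": 0, "type": "Layout Element"},
]
tree = build_tree(rows, ["link", "edges"])
restored = json.loads(json.dumps(tree))  # stand-in for writing to and reading from disk
```

Storing the tree as plain nested dictionaries keeps the persisted form trivial to read back, which is in the spirit of the build-slowly, load-quickly trade-off the slides describe. A real build over 1000 images and eight variables would still pay the cost of computing the gain for every candidate split.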

Project Functionality
  Challenge
    Test the decision tree for accuracy
    Avoid peeking at the data set
  Solution
    Always test on new data [Tank!]
    Don’t store the test set so we avoid any temptation to peek

  Name                                                           Accuracy
  Hi5 - www.hi5.com                                              94.7%
  Hillary Clinton for President - http://www.hillaryclinton.com/ 98.6%
  Department of Defense - http://www.defenselink.mil/            86.84%
  Engadget - www.engadget.com                                    91.45%
  Gamespot.com - www.gamespot.com                                91.57%
  Average                                                        92.63%

The Tricky Parts
  Information Gain
    Successful classification provides 2.391 bits of information
    Which means, what, exactly?
      Technically - You have enough information to answer 2.391 yes/no questions
      Practically - You can order nodes to split on by information gain
    At each split choose the node that provides the highest information gain
    Note - The amount of information provided by an attribute will change as you move through the tree
    Solution
      Calculate information gain for each split
      This is where the nasty set queries occur
  Overfitting
    Observe
      Permutations of Variable Quanta - 460,800
      Sample Data Size - 1000
      460,800 >> 1000
      Thus the risk of overfitting is significant
    Solution
      Require that we gain at least .05 bits to split - otherwise just return the modal value for the remaining set

The Tricky Parts
  Variable Quantification
    Strategy
      Make everything an integer
      Define ranges for all variables
      Initially picked quanta divisions based on guesses
      These turned out to be wildly inaccurate
    Solution
      Picked variables based on image type grouping and average
      SQL AVG and COUNT make this easy
  Edge Detection
    Used Sobel edge detection and a Java convolution application for images
    Count the number of edges in the image
    Solution
      Count vertical and horizontal edges
      Turns out to be a great proxy for text in the image
      Lots of images have edges
      Accuracy goes from 78.23% to 92.63% with this type of edge detection

Future Features
  Second Order Variables
    First order variables are primary data from images
    Second order variables are derived from one or more primary variables
    Specifically, edge_count and color_depth have much more relevance as ratios to size
    height is more relevant as a ratio to width
  Classification Tightening
    Current classifications have some overlap which could be refined out
    Certain classifications evolved over the course of the project and the data set should be updated to reflect the final classification

Future Features
  Safe Failure
    Better to require alternative text when it is not necessary than to not require it when it is necessary…
    …or is it??
  Celebrity Endorsement
    If K-Fed uses it, wouldn’t you?

InFocus 5.0

For More Information
            Silicon Valley                 Washington DC
  Phone     (415) 975-8000                 (703) 637-8955
  E-mail    sales@ssbbartgroup.com         sales@ssbbartgroup.com
  Fax       (415) 624-2708                 (703) 734-8381
  Address   300 Brannan Street             1489 Chain Bridge Road
            Suite 608                      Suite 204
            San Francisco, CA 94107-1876   McLean, VA 22101
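
As a rough illustration of the edge-count variable from The Tricky Parts and the ratio-to-size idea under Future Features, here is a sketch in Python. The deck says only that Sobel edge detection and a Java convolution were used to count vertical and horizontal edges. The counting rule below (one count per interior pixel whose Sobel response reaches a threshold), the threshold of 128, the list-of-lists grayscale input and all function names are assumptions for this example, not the original method.

```python
# Standard 3x3 Sobel kernels.
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))   # responds to vertical edges
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))   # responds to horizontal edges


def kernel_response(pixels, kernel, row, col):
    """Weighted sum of the 3x3 neighbourhood centred on (row, col).

    The kernel is applied without flipping; for Sobel kernels that only
    changes the sign of the result, and only the magnitude is used here.
    """
    return sum(
        kernel[i][j] * pixels[row + i - 1][col + j - 1]
        for i in range(3)
        for j in range(3)
    )


def edge_counts(pixels, threshold=128):
    """Count interior pixels with a strong vertical or horizontal edge response."""
    vertical = horizontal = 0
    for row in range(1, len(pixels) - 1):
        for col in range(1, len(pixels[0]) - 1):
            if abs(kernel_response(pixels, SOBEL_X, row, col)) >= threshold:
                vertical += 1
            if abs(kernel_response(pixels, SOBEL_Y, row, col)) >= threshold:
                horizontal += 1
    return vertical, horizontal


def edge_ratio(pixels, threshold=128):
    """Second order variable: total edge count as a ratio to size (width times height)."""
    vertical, horizontal = edge_counts(pixels, threshold)
    size = len(pixels) * len(pixels[0])
    return (vertical + horizontal) / size


# Hypothetical 5x5 grayscale image: dark on the left, bright on the right.
stripe = [[0, 0, 255, 255, 255] for _ in range(5)]
flat = [[200] * 5 for _ in range(5)]
```

With these inputs, `edge_counts(stripe)` finds vertical edge pixels only, and `edge_counts(flat)` finds none. Dividing by size keeps a large photograph from outscoring a small text button just because it has more pixels.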