Image Classification for the Purposes of

Images,
Alternative Text,
and Artificial Intelligence
SSB BART Group
Silicon Valley
(415) 975-8000
sales@ssbbartgroup.com
IT Accessibility Problem - Solved™
SSB BART Group
Washington DC
(703) 442-5023
sales@ssbbartgroup.com
Agenda

About Us

About Me

The Project

What’s Next
http://amp.ssbbartgroup.com/public/research/Automatic_Image_Classification_090707.doc
http://amp.ssbbartgroup.com/public/research/SSB_BART_Group_Image_Alt_CSUN_2008.ppt
Silicon Valley (415) 975-8000
www.ssbbartgroup.com
Washington DC (703) 637-8955
Corporate Overview
History

Founded in 1997 by engineers with disabilities

750 commercial and government customers

1,500 enterprise projects successfully completed

Pioneers of commercial accessibility validation tools
Approach

Violation profiling across 5.5M human validated accessibility issues

Fifty percent staffing mix of individuals with disabilities

Appropriately mixed automated, human and code level validation
Scalable Solutions

Data driven and scalable

One to one million developers

One to one thousand production systems
Supported Platforms
Web

HTML

XML

JavaScript

CSS

AJAX

Adobe Flash and Flex

Adobe Acrobat Documents

Streaming Audio and Video
Compiled Software

JFC and SWT Java Applications

.Net Applications

MFC Windows Native Applications

Macintosh Applications

BMC Remedy Applications
Standalone Systems

Telecommunications Hardware

IVR Systems

Agent Systems

Digital Imaging
Industry Solutions

Public Sector


Federal Solutions

United States

European Union

Manufacturers
Education
K-12
Government System Integrators

Healthcare

Software

Hardware

Web Based Service Providers
Mass Transit

Financial Services
State and Local




Universities

Information Technology

Consumer Banking

Insurance

Legal

Web Based Service Providers
Primary Care Providers
Insurance
Accessibility Management Platform
Requirements
Implementation
Certification
Baseline Audit
Development Audit
Maintenance Audit
Standards Development
Standards Maintenance
VPAT Creation
eLearning
Developer Support
Certification
InFocus™ Suite
AMP – SSB’s web-based platform for managing all aspects of the
accessibility process
Benefits

Single point for tracking compliance over time

Scalable solutions from one to one million developers
across multiple domestic markets

Support for all aspects of a successful accessibility initiative
About Me
General Story

Founder and Managing Director of SSB BART Group

Also Known As President and CEO

Professional web site developer for 13 years

Started in 1994 at the dawn of the Web

BS Computer Science Leland Stanford Junior University (AKA Stanford)

Odds on Brad Pitt to play me in the movie
Accessibility Work

Architected and developed first commercial accessibility testing and fixing tool

InSight and InFocus 1.x -> 4.x

Initial release in mid-200

Next release in a few months

Involved in Web Accessibility activities, validation and education since 1999

Architected and developed Accessibility Management Platform (AMP)

Current Version – 2008 R1

Personal work with fifty enterprise class software vendors
Project Overview
Project Description

Create a decision tree to classify images into one of eight types

Image types are organized by alternative text requirements

Upon classification, alternative text validity can then be tested via straightforward heuristics
Project Utility

Alternative text provides a textual description of an image

Alternative text validity


Ensures access to content for people with disabilities

Allows pages to be adapted effectively - low resolution, alternative browsers

Increases search engine relevance for pages
Bottom Line – Good alternative text is good for society and good for profits
Automated Testing Tools

A brief note on automated testing tools

First generation of automated testing tools, where we
are now, can test about 25% of requirements accurately

Another 25% with so-so accuracy

And the rest need to be checked manually

We think the next generation of tools can double this
efficacy through better AI, more complex page models
and better leveraging of human judgment…

…but ultimately tools can only facilitate the process of
human review; they cannot replace it
Image Types

Layout Element – The image is used solely to lay out elements on the page

Decorative Picture – The image is a picture that is used solely for the purpose
of making the page more visually appealing and it provides no information

Text – The image is used to stylize text on the page but is not used as an active
element on the page

Picture – The image is a picture that contains information important to the use
of the page

Hidden Link – The image provides a “hidden” link on a page for search engine
optimization or screen reader users

Linked Text – The image is used to stylize text and provide a link to another
page

Skip Link – The image is the root of an inner-document link that provides a
means of skipping past page content that is not relevant

Linked Picture – The image is a picture that provides a link to another page
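
Once an image has a type, the alternative text check can be a very small rule. The sketch below is only an illustration: the slides do not spell out the heuristics, so the rule used here (layout and decorative images should carry empty alternative text, every other type should carry non-empty text) is an assumption, as are the names ImageType and alt_text_is_plausible.

```python
from enum import Enum


class ImageType(Enum):
    """The eight image types listed above."""
    LAYOUT_ELEMENT = "Layout Element"
    DECORATIVE_PICTURE = "Decorative Picture"
    TEXT = "Text"
    PICTURE = "Picture"
    HIDDEN_LINK = "Hidden Link"
    LINKED_TEXT = "Linked Text"
    SKIP_LINK = "Skip Link"
    LINKED_PICTURE = "Linked Picture"


# Assumption: images that carry no information should have empty alt text.
EMPTY_ALT_TYPES = {ImageType.LAYOUT_ELEMENT, ImageType.DECORATIVE_PICTURE}


def alt_text_is_plausible(image_type, alt):
    """Return True when the alt attribute fits the assumed rule for the type.

    alt is None when the attribute is missing altogether, which is
    treated as a failure for every type.
    """
    if alt is None:
        return False
    if image_type in EMPTY_ALT_TYPES:
        return alt.strip() == ""
    return alt.strip() != ""
```

A real validator would likely add further checks per type (for example, rejecting file names used as alternative text), which are left out here.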
Variables
Width – The width of the image

Height – The height of the image

Edge Count – The number of vertical and horizontal edges in the image

Size – The rectangular size of the image, or width times height

File Size – The size of the file in bytes

Link – Whether or not the image is a link

Inner-document Link – Whether or not the image is a link within the current document

Color Depth – The number of unique colors that the image has
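
As a rough illustration, the variables above could be carried around as one record per image. The class name ImageFeatures and the field names are hypothetical; only the meaning of each variable comes from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageFeatures:
    """One record per image; field names are illustrative assumptions."""
    width: int                    # pixels
    height: int                   # pixels
    edge_count: int               # vertical plus horizontal edges
    file_size: int                # bytes
    is_link: bool                 # the image is a link
    is_inner_document_link: bool  # the link targets the current document
    color_depth: int              # number of unique colors

    @property
    def size(self) -> int:
        """Rectangular size: width times height."""
        return self.width * self.height


# A 1x1 spacer image as an example record (values made up for illustration).
spacer = ImageFeatures(width=1, height=1, edge_count=0, file_size=43,
                       is_link=False, is_inner_document_link=False,
                       color_depth=1)
```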
Project Functionality
Challenge

No database of relevant image classifications exists

Subject Matter Experts (SMEs) use experience to
determine form of alternative text

Without a good data set the decision tree isn’t going to decide
much
Solution

Build a spider to crawl sites and gather sample data

Classify the images using a basic interface

Store the image classification and additional variables in a
database

Build a decision tree from the database rather than a live site

Repeat using updated tree
Result

Created an image database of 1000 images with about an
hour of actual data entry
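
The storage step might look something like the following sketch, which uses SQLite through Python's sqlite3 module. The table name image_sample, the column names, and the helper functions are assumptions made for illustration; the slides do not say which database or schema was used.

```python
import sqlite3

# Hypothetical schema: one row per classified image.
SCHEMA = """
CREATE TABLE IF NOT EXISTS image_sample (
    id            INTEGER PRIMARY KEY,
    url           TEXT NOT NULL,
    image_type    TEXT NOT NULL,
    width         INTEGER,
    height        INTEGER,
    edge_count    INTEGER,
    file_size     INTEGER,
    is_link       INTEGER,
    is_inner_link INTEGER,
    color_depth   INTEGER
)
"""

COLUMNS = ("width", "height", "edge_count", "file_size",
           "is_link", "is_inner_link", "color_depth")


def open_store(path=":memory:"):
    """Open (or create) the sample database."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def save_sample(conn, url, image_type, features):
    """Store one human classification plus its measured variables."""
    values = [features[name] for name in COLUMNS]
    conn.execute(
        "INSERT INTO image_sample (url, image_type, " + ", ".join(COLUMNS) + ") "
        "VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)",
        [url, image_type, *values],
    )
    conn.commit()


def count_by_type(conn):
    """How many samples exist for each image type."""
    rows = conn.execute(
        "SELECT image_type, COUNT(*) FROM image_sample GROUP BY image_type"
    ).fetchall()
    return dict(rows)
```

Building the tree from a table like this, rather than from a live site, keeps the training data fixed between runs.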
Project Functionality
Challenge

Build the decision tree

…which became build the decision tree before the end of time

…which became build the decision tree once and store it for
later use
Discussion

Building the tree is fairly straightforward and involves splitting
on variables and analyzing remaining sets

Implementation uses the decision tree learning algorithm from Russell and Norvig


More on the tricky parts later
The “catch” - a lot of the queries involve eliminating groups of
images

SQL doesn’t have good concepts for handling unordered
sets of keys so you enumerate out elements for queries…
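
To make the "enumerate out elements" point concrete, here is a minimal sketch of a query that excludes an arbitrary group of images by listing every key. The table image_sample and the function count_types_excluding are hypothetical, and SQLite is used only because it ships with Python.

```python
import sqlite3


def count_types_excluding(conn, excluded_ids):
    """Count remaining images per type after removing a set of image ids.

    The set is not a first-class value in the query, so each id is
    written out as its own bound parameter.
    """
    excluded = list(excluded_ids)
    where = ""
    if excluded:
        placeholders = ", ".join("?" for _ in excluded)
        where = f"WHERE id NOT IN ({placeholders})"
    sql = (f"SELECT image_type, COUNT(*) FROM image_sample {where} "
           "GROUP BY image_type")
    return dict(conn.execute(sql, excluded).fetchall())


# Tiny in-memory example with made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE image_sample (id INTEGER PRIMARY KEY, image_type TEXT)")
conn.executemany(
    "INSERT INTO image_sample (id, image_type) VALUES (?, ?)",
    [(1, "Picture"), (2, "Text"), (3, "Picture"), (4, "Skip Link")],
)
remaining = count_types_excluding(conn, [2, 3])
```

Queries of this shape get long quickly as the excluded group grows, and databases generally cap the number of bound parameters per statement.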
Project Functionality
Discussion (Continued)

This results in lots of nasty queries and a fair amount of time
to build the tree

This more or less grows exponentially as you add variables
and quanta
Solution

Build the tree once and persist to disk

Limit quanta for variables and require minimum information
gain
Result

Creation of the tree takes about forty minutes

Reading in the tree takes about forty milliseconds

Resolving against the tree takes about forty nanoseconds
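
A minimal sketch of the build-once idea follows: the tree is written to disk as JSON, read back, and then resolved with a plain dictionary walk. The tree layout (split_on, branches, default, label), the variable names in it, and the sample tree itself are all assumptions for illustration; the slides do not describe the actual on-disk format.

```python
import json
import os
import tempfile

# Hypothetical, hand-written tree. Branch keys are strings because JSON
# object keys are always strings.
TREE = {
    "split_on": "is_link",
    "default": "Picture",
    "branches": {
        "0": {
            "split_on": "edge_bucket",
            "default": "Picture",
            "branches": {
                "0": {"label": "Layout Element"},
                "1": {"label": "Text"},
            },
        },
        "1": {"label": "Linked Picture"},
    },
}


def save_tree(tree, path):
    """Persist the tree so it only has to be built once."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(tree, handle)


def load_tree(path):
    """Read a previously built tree back from disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def resolve(tree, features):
    """Walk from the root to a leaf; fall back to the node default
    when a value was never seen while building the tree."""
    node = tree
    while "label" not in node:
        child = node["branches"].get(str(features[node["split_on"]]))
        if child is None:
            return node["default"]
        node = child
    return node["label"]


with tempfile.TemporaryDirectory() as folder:
    tree_path = os.path.join(folder, "tree.json")
    save_tree(TREE, tree_path)
    loaded = load_tree(tree_path)
```

Resolving is cheap because it touches at most one node per variable, which is consistent with the large gap between build time and lookup time reported above.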
Project Functionality
Challenge

Test the decision tree for accuracy

Avoid peeking at the data set
Solution

Always test on new data [Tank!]

Don’t store the test set so we avoid any temptation to
peek
Name                                                             Accuracy
Hi5 – www.hi5.com                                                94.7%
Hillary Clinton for President - http://www.hillaryclinton.com/   98.6%
Department of Defense - http://www.defenselink.mil/              86.84%
Engadget – www.engadget.com                                      91.45%
Gamespot.com – www.gamespot.com                                  91.57%
Average                                                          92.63%
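
The average can be checked directly from the per-site accuracies listed above. The accuracy helper shows one plausible way a per-site figure could be computed (share of held-out images whose predicted type matches the human label); that definition is an assumption, since the slides only report the results.

```python
from statistics import mean


def accuracy(predicted, actual):
    """Percentage of images whose predicted type matches the human label."""
    matches = sum(1 for p, a in zip(predicted, actual) if p == a)
    return 100.0 * matches / len(actual)


# Per-site accuracies as reported on the slide.
site_accuracy = {
    "Hi5": 94.7,
    "Hillary Clinton for President": 98.6,
    "Department of Defense": 86.84,
    "Engadget": 91.45,
    "Gamespot.com": 91.57,
}

average = round(mean(site_accuracy.values()), 2)
```

Note that this is an unweighted mean across sites, so a site with few images counts as much as a site with many.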
The Tricky Parts
Information Gain

Successful classification provides 2.391 bits of information

Which means, what, exactly?

Technically – You have enough information to answer 2.391 yes/no questions

Practically – You can order nodes to split on by information gain

At each split choose node that provides highest information gain

Note - The amount of information provided by an attribute will change as you move through the tree
Solution

Calculate information gain for each split

This is where the nasty set queries occur
Overfitting

Observe

Permutations of Variable Quanta 460,800

Sample Data Size – 1000

460,800 >> 1000

Thus the risk of over fitting is significant
Solution

Require that we gain at least .05 bits to split – otherwise just return the modal value for the remaining set
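
The two ideas on this slide, choosing the split with the highest information gain and refusing to split below 0.05 bits, can be sketched in a few lines. This is an in-memory illustration in the spirit of the textbook decision tree learning procedure, not the project's SQL-backed implementation; the row format (dictionaries with a "label" key), the attribute names, and the tree layout are assumptions. For scale: eight equally likely image types would carry log2(8) = 3 bits, so 2.391 bits indicates an uneven mix of types.

```python
from collections import Counter
from math import log2

MIN_GAIN_BITS = 0.05  # threshold from the slide


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, attribute):
    """Entropy before the split minus the weighted entropy after it."""
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row["label"])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy([row["label"] for row in rows]) - remainder


def build_tree(rows, attributes):
    """Recursively split on the attribute with the highest gain."""
    labels = [row["label"] for row in rows]
    modal = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not attributes:
        return {"label": modal}
    best = max(attributes, key=lambda a: information_gain(rows, a))
    if information_gain(rows, best) < MIN_GAIN_BITS:
        return {"label": modal}  # not worth splitting: return the modal value
    remaining = [a for a in attributes if a != best]
    branches = {
        value: build_tree([r for r in rows if r[best] == value], remaining)
        for value in sorted({r[best] for r in rows})
    }
    return {"split_on": best, "default": modal, "branches": branches}


# Made-up sample rows with already-quantized variables.
rows = [
    {"is_link": 1, "edge_bucket": 0, "label": "Linked Picture"},
    {"is_link": 1, "edge_bucket": 1, "label": "Linked Text"},
    {"is_link": 0, "edge_bucket": 0, "label": "Picture"},
    {"is_link": 0, "edge_bucket": 1, "label": "Text"},
]
tree = build_tree(rows, ["is_link", "edge_bucket"])
```

Because the gain is recomputed on the rows that remain at each node, the same attribute can be worth a lot near the root and almost nothing further down.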
The Tricky Parts
Variable Quantification
Strategy

Make everything an integer

Define ranges for all variables

Initially picked quanta divisions based on guesses

These turned out to be wildly inaccurate
Solution

Picked variables based on image type grouping and average

SQL AVG and COUNT make this easy
Edge Detection
Strategy

Used Sobel Edge detection and Java convolution application for images

Count the number of edges in the image
Solution

Count vertical and horizontal edges

Turns out to be a great proxy for text in the image

Lots of images have edges

Accuracy goes from 78.23% to 92.63% with this type of edge detection
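
The original work used Java for the Sobel convolution; the sketch below shows the same kind of calculation in plain Python on a grayscale image given as a list of rows. What counts as "an edge" is an assumption here: each interior pixel is counted as vertical or horizontal when the corresponding Sobel response reaches a threshold and dominates the other direction. The threshold value and the function names are illustrative.

```python
# Sobel kernels: SOBEL_X responds to vertical edges (left/right contrast),
# SOBEL_Y responds to horizontal edges (top/bottom contrast).
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def kernel_response(pixels, row, col, kernel):
    """Apply a 3x3 kernel centred on (row, col).

    The kernel is applied without flipping; only the absolute value of
    the response is used below, so the sign convention does not matter.
    """
    return sum(kernel[dr + 1][dc + 1] * pixels[row + dr][col + dc]
               for dr in (-1, 0, 1) for dc in (-1, 0, 1))


def count_edge_pixels(pixels, threshold=128):
    """Return (vertical, horizontal) edge-pixel counts for a grayscale image.

    pixels is a list of equal-length rows of 0..255 intensities.
    Border pixels are skipped because the kernel does not fit there.
    """
    vertical = horizontal = 0
    for row in range(1, len(pixels) - 1):
        for col in range(1, len(pixels[0]) - 1):
            gx = abs(kernel_response(pixels, row, col, SOBEL_X))
            gy = abs(kernel_response(pixels, row, col, SOBEL_Y))
            if gx >= threshold and gx > gy:
                vertical += 1
            elif gy >= threshold and gy > gx:
                horizontal += 1
    return vertical, horizontal


# A 5x5 test image: dark on the left two columns, bright on the right three.
left_right = [[0, 0, 255, 255, 255] for _ in range(5)]
# The same pattern rotated: dark on top, bright below.
top_bottom = [list(column) for column in zip(*left_right)]
```

Rendered text is dense with short vertical and horizontal strokes, which is one plausible reason the directional counts separate text images from other pictures.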
Future Features
Second Order Variables

First order variables are primary data from images

Second order variables are derived from one or more primary variables

Specifically

edge_count, color_depth have much more relevance as ratios to size

height is more relevant as a ratio to width
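
A minimal sketch of the proposed second order variables, derived from the first order ones. The function name and the output key names are hypothetical; the ratios themselves are the ones named above.

```python
def second_order_variables(width, height, edge_count, color_depth):
    """Derive ratio variables from primary image variables."""
    size = width * height
    if size <= 0:
        raise ValueError("width and height must both be positive")
    return {
        "edge_density": edge_count / size,    # edge_count as a ratio to size
        "color_density": color_depth / size,  # color_depth as a ratio to size
        "aspect_ratio": height / width,       # height as a ratio to width
    }


# Example with made-up values: a 200 x 50 banner.
banner = second_order_variables(width=200, height=50,
                                edge_count=500, color_depth=16)
```

Ratios like these would still need to be quantized into integer ranges before they could be used as split variables in the tree.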
Classification Tightening

Current classifications have some overlap which could be refined out

Certain classifications evolved over the course of the project and the data set
should be updated to reflect the final classification
Future Features
Safe Failure

Better to require alternative text when not necessary
than to not require text when necessary…

…or is it??
Celebrity Endorsement

If K-Fed uses it, wouldn’t you?
InFocus 5.0
For More Information
Silicon Valley
Phone (415) 975-8000
E-mail sales@ssbbartgroup.com
Fax (415) 624-2708
300 Brannan Street
Suite 608
San Francisco, CA 94107-1876

Washington DC
Phone (703) 637-8955
E-mail sales@ssbbartgroup.com
Fax (703) 734-8381
1489 Chain Bridge Road
Suite 204
McLean, VA 22101