• Real-life websites usually show less-than-perfect accessibility, even those that strive to be accessible.
• WCAG Techniques and Failures use binary (pass/fail) tests, which makes minor flaws difficult to handle: neglect them, or be too strict?
• The German BITV-Test (www.bitvtest.eu) uses a 5-point graded rating scale to address this problem.
• When rating individual instances, results can often be somewhere between pass and fail.
• Some ratings will apply not to instances but to patterns. What level of deficiency will constitute a failure?
• Some instances can be critical, others minor
• Often, some instances on a page pass while others fail. Should the page then pass or fail a particular success criterion?
• BITV-Test has 50 checkpoints mapping to WCAG level AA, each with a weight of 1, 2 or 3 points (adding up to 100 points)
• A full “pass” contributes 100% of the checkpoint weight. Further grades: 75%, 50%, 25%, 0%
• Ratings reflect both the frequency and criticality of flaws
• Results per page are aggregated to a site score (X of 100 points) based on the page sample
• Reliability can be expressed as the degree of replicability in an independent test by another tester
• The BITV conformance test is conducted as an independent tandem test followed by an arbitration phase
• Arbitration corrects oversights and rectifies ratings that are too lenient or too strict
• Experience shows that the 5-point graded rating scale is quite reliable. A statistics function has been added to quantify inter-evaluator reliability
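
The weighted, graded scoring described in the points above can be illustrated with a minimal Python sketch. It is not the BITV-Test implementation: the checkpoint names (cp_a, cp_b, cp_c), the toy weights, and the function names are hypothetical, and aggregating page results by a plain arithmetic mean is an assumption, since the points above only say that page results are aggregated to a site score.

```python
# The five grades of the rating scale, as a share of the checkpoint weight.
GRADES = (0.0, 0.25, 0.5, 0.75, 1.0)


def page_score(weights, ratings):
    """Return the points one page earns: sum of weight * grade per checkpoint."""
    if weights.keys() != ratings.keys():
        raise ValueError("every checkpoint needs exactly one rating")
    for checkpoint, grade in ratings.items():
        if grade not in GRADES:
            raise ValueError(f"invalid grade for {checkpoint}: {grade}")
    return sum(weights[cp] * ratings[cp] for cp in weights)


def site_score(weights, pages):
    """Aggregate page scores into a site score.

    Assumption: a plain mean over the page sample; the real aggregation
    rule may differ.
    """
    if not pages:
        raise ValueError("the page sample must not be empty")
    return sum(page_score(weights, p) for p in pages) / len(pages)


# Toy example with three hypothetical checkpoints (the real test has 50
# checkpoints whose weights add up to 100 points).
weights = {"cp_a": 3, "cp_b": 2, "cp_c": 1}
page_1 = {"cp_a": 1.0, "cp_b": 0.75, "cp_c": 0.5}
page_2 = {"cp_a": 0.5, "cp_b": 1.0, "cp_c": 0.0}

print(page_score(weights, page_1))            # 5.0 of 6 possible points
print(site_score(weights, [page_1, page_2]))  # 4.25 of 6 possible points
```

With the full set of 50 checkpoints, the same calculation would yield a result directly comparable to the "X of 100 points" site score.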