Karl Groves

Tech Accessibility Consultant

Affirming the Consequent

Today I came across a post by Simon Harper titled Web Accessibility Evaluation Tools Only Produce a 60-70% Correctness, which is essentially a response to my earlier critique of a seriously flawed academic paper. I submitted a response on Simon’s site, but I want to copy it here for my regular readers. One thing that specifically bothers me: why do the responses continue to dodge the specific challenges I raise? You cannot claim something without evidence, and you cannot supply data for one thing and claim that it leads to additional, wholly unrelated conclusions. So, here goes:


Good post, and thank you for the response. It is unfortunate, however, that you didn’t read or respond to what I wrote. It is also unfortunate that the paper’s authors have similarly chosen not to respond directly to my statements. The blanket response “well, just replicate it” is an attempt at dodging my response and my [specific] criticisms of the paper (which again, you admittedly haven’t read). Furthermore, there’s little use in attempting to perform the same experiments when the conclusions presented have nothing to do with the data.

You said:
“Web accessibility evaluation can be seen as a burden which can be alleviated by automated tools.”
Actually, they don’t say that.

“In this case the use of automated web accessibility evaluation tools is becoming increasingly widespread.”
No data is supplied for this at all.

“Many, who are not professional accessibility evaluators, are increasingly misunderstanding, or choosing to ignore, the advice of guidelines by missing out expert evaluation and/or user studies.”
No data is supplied for this at all.

“This is because they mistakenly believe that web accessibility evaluation tools can automatically check conformance against all success criteria.”
No data is supplied for this at all.

“This study shows that some of the most common tools might only achieve between 60 and 70% correctness in the results they return, and therefore makes the case that evaluation tools on their own are not enough.”

Of all the things you said, this is the only thing actually backed by the data from the paper. Literally everything else is a case of affirming the consequent.

The data that they do present is very compelling and matches my own experience. The significant amount of variation between the tools tested was pretty shocking as well, and once you get past the unproven, hyperbolic claims, it is very interesting.

If this paper’s authors were to gather and present actual data regarding usage patterns (re: the claim that “the use of automated web accessibility evaluation tools is becoming increasingly widespread”) then I wouldn’t be so critical. There is no question that the data needed to substantiate this and similar statements simply isn’t supplied.

Finally, I’d like to address the statement “evaluation tools on their own are not enough”. As I say in my blog post, this is so obvious that it is hardly worth mentioning. No legitimate tool vendor claims otherwise. I’ve been working as an accessibility consultant for a decade. I’ve worked for, alongside, or in competition with all of the major tool vendors and have never heard any of them say that using their tool alone is enough. Whether end users think this or not is another matter. Again, it’d be great if the paper’s authors had data to show this happening, since they claim that it is.

The implication from this paper is that because tools do not provide complete coverage, they should not be used. This is preposterous and, I believe, born from a lack of experience outside of accessibility and a lack of experience in a modern software development environment. Automated testing, ranging from things like basic static code linting, to unit testing, to automated penetration testing is the norm and for good reason: it helps increase quality. But ask *any* number of skilled developers whether “passing” a check by JSHint means their JavaScript is good and you’ll get a universal “No”. That doesn’t stop contrib-jshint from being the most downloaded Grunt plugin (http://gruntjs.com/plugins). Ask any security specialist whether using IBM’s Rational Security is enough to ensure a site is secure, and they’ll say “No”. That doesn’t diminish its usefulness as a *tool* in a mature security management program.

Perhaps what we need most in terms of avoiding an “over-reliance” on tools is for people to stop treating them like they’re all-or-nothing.

I’m available for accessibility consulting, audits, VPATs, training, and accessible web development, email me directly at karl@karlgroves.com or call me at +1 443-875-7343

Tutorial: Creating a PHP class to use with Tenon.io


Just wanna get the code? All of the code for this tutorial is available at an open repository on BitBucket

Tenon.io is an API that facilitates quick and easy JavaScript-aware accessibility testing. The API accepts a large number of request parameters that allow you to customize how Tenon does its testing and returns its results. Full documentation for client developers is available in a public repository on Bitbucket. As an API, getting Tenon to do your accessibility testing requires a little bit of work. Tenon users have to do relatively minimal work to submit their request and deal with the response. This blog post shows an example of how to do that with a simple PHP class and also provides a method of generating a CSV file of results.

Despite the fact that it is an API, you can create a simple app very easily. First thing, of course, is that you need a Tenon.io API key. Go to Tenon.io to get one. Right now, Tenon is in Private Beta. If you’re interested in getting started right away, email karl@tenon.io to get your key. The second thing is you need a PHP-enabled server with cURL. Most default installs of PHP on web hosts will have it. If not, installation is easy.
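Before going further, it can help to confirm that the server really has what the class needs. The snippet below is a minimal sketch of such a check. The function name tenonMissingExtensions is made up for this example; it is not part of Tenon or of the class built in this post.

```php
<?php
/**
 * Hypothetical helper (not part of Tenon): returns the names of any
 * required PHP extensions that are not loaded on this server.
 */
function tenonMissingExtensions(array $required = array('curl', 'json'))
{
    $missing = array();
    foreach ($required as $extension) {
        if (!extension_loaded($extension)) {
            $missing[] = $extension;
        }
    }
    return $missing;
}

$missing = tenonMissingExtensions();
if (count($missing) > 0) {
    echo 'Missing PHP extensions: ' . implode(', ', $missing) . "\n";
} else {
    echo "cURL and JSON support are available.\n";
}
```

If the list comes back non-empty, install or enable those extensions before trying the rest of the tutorial.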

How to use this class

Using this class is super easy. In the code chunk below, we’re merely going to pass some variables to the class and get the response. This is not production-ready code. There are a lot of areas where this can be improved. Use this as a starting point, not an end point.

define('TENON_API_KEY', 'this is where you enter your api key');
define('TENON_API_URL', 'http://www.tenon.io/api/');
define('DEBUG', false);

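require_once 'tenon.class.php'; // the class file we build below

$opts = array();
$opts['key'] = TENON_API_KEY;
$opts['url'] = 'http://www.example.com'; // enter a real URL here, of course
$tenon = new tenon(TENON_API_URL, $opts);
$tenon->submit(DEBUG); // fires the request and populates $tenon->tenonResponse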

Using the code chunk above, you now have a variable, $tenon->tenonResponse, formatted according to the Tenon response format (read the docs for full details.)

That’s it! From there, all you need to do is massage that JSON response into something useful for your purposes.

Let’s walk through a class that can help us do that.

Give it a name

First, create a file called tenon.class.php. Then start your file like so:

<?php
class tenon
{

Declare some variables

Now, at the top of the file we want to declare some variables:

  • $url – this will be the URL to the Tenon.io API itself.
  • $opts – this will be an array of your request parameters
  • $tenonResponse – this will be populated by the JSON response from Tenon
  • $rspArray – this will be a multidimensional array of the decoded response.

    protected $url, $opts;
    public $tenonResponse, $rspArray;

Class Constructor

Time to get to our actual class methods. First up is our class constructor. Since constructors in PHP cannot return a value, we just set up some instance variables to be used by other methods. The arguments are the $url and $opts variables discussed above.

    /**
     * Class constructor
     *
     * @param   string $url  the API url to post your request to
     * @param    array $opts options for the request
     */
    public function __construct($url, $opts)
    {
        $this->url = $url;
        $this->opts = $opts;
        $this->rspArray = null;
    }

Submit your request to Tenon

Next up is the method that actually fires the request to the API. This function is nothing more than a wrapper around some cURL stuff. PHP’s functionality around cURL is excellent and makes it perfect for this type of purpose.

This method passes through our request parameters (from the $tenon->opts array) to the API as a POST request and returns a variable, $tenon->tenonResponse, populated with the JSON response from Tenon.

    /**
     * Submits the request to Tenon
     *
     * @param   bool $printInfo whether or not to print the output from curl_getinfo (usually for debugging only)
     * @return    string    the results, formatted as JSON
     */
    public function submit($printInfo = false)
    {
        if (true === $printInfo) {
            echo '<h2>Options Passed To TenonTest</h2><pre><br>';
            var_dump($this->opts);
            echo '</pre>';
        }

        //open connection
        $ch = curl_init();

        curl_setopt($ch, CURLOPT_URL, $this->url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
        curl_setopt($ch, CURLOPT_POST, true);
        curl_setopt($ch, CURLOPT_FAILONERROR, true);
        curl_setopt($ch, CURLOPT_POSTFIELDS, $this->opts);

        //execute post and get results (false if the request failed)
        $result = curl_exec($ch);

        if (true === $printInfo) {
            echo 'ERROR INFO (if any): ' . curl_error($ch) . '<br>';
            echo '<h2>Curl Info </h2><pre><br>';
            print_r(curl_getinfo($ch));
            echo '</pre>';
        }

        //close connection
        curl_close($ch);

        //the test results
        $this->tenonResponse = $result;

        return $this->tenonResponse;
    }

Decode the response

From here, how you deal with the JSON is up to you. Most programming languages have ways to deal with JSON. PHP has some native functionality, albeit simple, to decode and encode JSON. Below, we use json_decode to turn the JSON into a multidimensional array. This gives us the $tenon->rspArray to use in other methods later.

    /**
     * Decodes the JSON response into $this->rspArray
     *
     * @return bool false if there is no response or the JSON is malformed
     */
    public function decodeResponse()
    {
        if ((false !== $this->tenonResponse) && (!is_null($this->tenonResponse))) {
            $result = json_decode($this->tenonResponse, true);
            if (!is_null($result)) {
                $this->rspArray = $result;
                return true;
            } else {
                return false;
            }
        } else {
            return false;
        }
    }
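As a standalone illustration of the decoding step, here is a small sketch of how json_decode behaves with its second argument set to true, and what happens when the JSON is malformed. The sample string is invented for the example and only borrows a few key names mentioned in this post; it is not real Tenon output.

```php
<?php
// Made-up sample that borrows a few key names from this post;
// the values are illustrative only, not real Tenon output.
$sampleJson = '{"status":200,"message":"OK","resultSet":[{"tID":"1","errorTitle":"Example issue"}]}';

// Passing true as the second argument returns associative arrays
// instead of stdClass objects.
$decoded = json_decode($sampleJson, true);
echo $decoded['resultSet'][0]['errorTitle'] . "\n";

// Malformed JSON decodes to null. json_last_error() distinguishes a
// failed decode from a document that is literally the JSON value null.
$broken = json_decode('{"status":200,', true);
if (is_null($broken) && json_last_error() !== JSON_ERROR_NONE) {
    echo 'Decode failed: ' . json_last_error_msg() . "\n";
}
```

This is why the method above treats a null result as a failure: an incomplete or garbled response never makes it into the response array.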

Make some sense of booleans

Tenon returns some of its information as ‘1’ or ‘0’. We’re going to want that to be more useful for human consumption, so we convert those to ‘Yes’ and ‘No’. Because of some weirdness with json_decode and PHP’s loose typing, sometimes digits are actually strings, so that’s why we’re not using strict comparison.

    /**
     * Converts Tenon's '1' / '0' values to 'Yes' / 'No'
     *
     * @param $val
     * @return string
     */
    public static function boolToString($val){
        if($val == '1'){
            return 'Yes';
        } else {
            return 'No';
        }
    }
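To make the point about loose typing concrete, the sketch below compares the two operators on the kinds of values a decoded response might contain. The function names isOnLoose and isOnStrict are hypothetical and exist only for this illustration.

```php
<?php
// Loose comparison (==) converts types before comparing.
function isOnLoose($val)
{
    return $val == '1';
}

// Strict comparison (===) also requires the types to match.
function isOnStrict($val)
{
    return $val === '1';
}

// The same flag may arrive as an integer or as a string.
foreach (array(1, '1', 0, '0') as $val) {
    echo var_export($val, true)
        . ' loose: ' . var_export(isOnLoose($val), true)
        . ' strict: ' . var_export(isOnStrict($val), true) . "\n";
}
```

With strict comparison, an integer 1 would be reported as 'No', which is exactly the surprise the loose comparison avoids here.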

Create a summary

OK, now it is time to start doing something useful with the response array. The first thing we need is a summary of how our request went and the status of our document. This method creates a string of HTML showing the following details:

  • Your Request – Tenon echoes back your request to you. This section reports the request that Tenon uses, which may include items set to their defaults.
  • Response Summary – This section gives a summary of the response, such as response code, response type, execution time, and document size.
  • Global Stats – This section gives some high level stats on error rates across all tests run by Tenon. When compared against your document’s density (below), this is useful for getting an at-a-glance idea of your document’s accessibility
  • Density – Tenon calculates a statistic called ‘Density’ which is, basically, how many errors you have, compared to how big the document is. In other words how dense are the issues on the page?
  • Issue Counts – This section gives raw issue counts for your document
  • Issues By Level – This section provides issue counts according to WCAG Level
  • Client Script Errors – one of the things that may reduce the ability of Tenon to test your site is JavaScript errors and uncaught exceptions. A cool feature of Tenon is that it reports these to you.

    /**
     * Builds an HTML summary of the request, response, and issue statistics
     *
     * @return mixed
     */
    public function processResponseSummary()
    {
        if ((false === $this->rspArray) || (is_null($this->rspArray))) {
            return false;
        }

        $output = '';
        $output .= '<h2>Your Request</h2>';
        $output .= '<ul>';
        $output .= '<li>DocID: ' . $this->rspArray['request']['docID'] . '</li>';
        $output .= '<li>Certainty: ' . $this->rspArray['request']['certainty'] . '</li>';
        $output .= '<li>Level: ' . $this->rspArray['request']['level'] . '</li>';
        $output .= '<li>Priority: ' . $this->rspArray['request']['priority'] . '</li>';
        $output .= '<li>Importance: ' . $this->rspArray['request']['importance'] . '</li>';
        $output .= '<li>Report ID: ' . $this->rspArray['request']['reportID'] . '</li>';
        $output .= '<li>System ID: ' . $this->rspArray['request']['systemID'] . '</li>';
        $output .= '<li>User-Agent String: ' . $this->rspArray['request']['uaString'] . '</li>';
        $output .= '<li>URL: ' . $this->rspArray['request']['url'] . '</li>';
        $output .= '<li>Viewport: ' . $this->rspArray['request']['viewport']['width'] . ' x ' . $this->rspArray['request']['viewport']['height'] . '</li>';
        $output .= '<li>Fragment? ' . self::boolToString($this->rspArray['request']['fragment']) . '</li>';
        $output .= '<li>Store Results? ' . self::boolToString($this->rspArray['request']['store']) . '</li>';
        $output .= '</ul>';

        $output .= '<h2>Response</h2>';
        $output .= '<ul>';
        $output .= '<li>Document Size: ' . $this->rspArray['documentSize'] . ' bytes </li>';
        $output .= '<li>Response Code: ' . $this->rspArray['status'] . '</li>';
        $output .= '<li>Response Type: ' . $this->rspArray['message'] . '</li>';
        $output .= '<li>Response Time: ' . date("F j, Y, g:i a", strtotime($this->rspArray['responseTime'])) . '</li>';
        $output .= '<li>Response Execution Time: ' . $this->rspArray['responseExecTime'] . ' seconds</li>';
        $output .= '</ul>';

        $output .= '<h2>Global Stats</h2>';
        $output .= '<ul>';
        $output .= '<li>Global Density, overall: ' . $this->rspArray['globalStats']['allDensity'] . '</li>';
        $output .= '<li>Global Error Density: ' . $this->rspArray['globalStats']['errorDensity'] . '</li>';
        $output .= '<li>Global Warning Density: ' . $this->rspArray['globalStats']['warningDensity'] . '</li>';
        $output .= '</ul>';

        $output .= '<h3>Density</h3>';
        $output .= '<ul>';
        $output .= '<li>Overall Density: ' . $this->rspArray['resultSummary']['density']['allDensity'] . '%</li>';
        $output .= '<li>Error Density: ' . $this->rspArray['resultSummary']['density']['errorDensity'] . '%</li>';
        $output .= '<li>Warning Density: ' . $this->rspArray['resultSummary']['density']['warningDensity'] . '%</li>';
        $output .= '</ul>';

        $output .= '<h3>Issue Counts</h3>';
        $output .= '<ul>';
        $output .= '<li>Total Issues: ' . $this->rspArray['resultSummary']['issues']['totalIssues'] . '</li>';
        $output .= '<li>Total Errors: ' . $this->rspArray['resultSummary']['issues']['totalErrors'] . '</li>';
        $output .= '<li>Total Warnings: ' . $this->rspArray['resultSummary']['issues']['totalWarnings'] . '</li>';
        $output .= '</ul>';

        $output .= '<h3>Issues By WCAG Level</h3>';
        $output .= '<ul>';
        $output .= '<li>Level A: ' . $this->rspArray['resultSummary']['issuesByLevel']['A']['count'];
        $output .= ' (' . $this->rspArray['resultSummary']['issuesByLevel']['A']['pct'] . '%)</li>';
        $output .= '<li>Level AA: ' . $this->rspArray['resultSummary']['issuesByLevel']['AA']['count'];
        $output .= ' (' . $this->rspArray['resultSummary']['issuesByLevel']['AA']['pct'] . '%)</li>';
        $output .= '<li>Level AAA: ' . $this->rspArray['resultSummary']['issuesByLevel']['AAA']['count'];
        $output .= ' (' . $this->rspArray['resultSummary']['issuesByLevel']['AAA']['pct'] . '%)</li>';
        $output .= '</ul>';

        $output .= '<h3>Client Script Errors, if any</h3>';
        $output .= '<p>(Note: "NULL" or empty array here means there were no errors.)</p>';
        $output .= '<pre>' . var_export($this->rspArray['clientScriptErrors'], true) . '</pre>';

        return $output;
    }

Output the issues

The most important part of Tenon is obviously the issues. The method below gets the issues and loops through them to print them out in a human-readable format. Each issue is presented to show what the issue is and where the issue is. For a full description of Tenon’s issue reports, read the Tenon.io Documentation.

    /**
     * Builds an HTML string listing each issue in the response
     *
     * @return   string
     */
    public function processIssues()
    {
        $issues = $this->rspArray['resultSet'];
        $output = '';

        $count = count($issues);

        if ($count > 0) {
            $i = 0;
            for ($x = 0; $x < $count; $x++) {
                $i++; // human-friendly issue number, starting at 1
                $output .= '<div class="issue">';
                $output .= '<div>' . $i .': ' . $issues[$x]['errorTitle'] . '</div>';
                $output .= '<div>' . $issues[$x]['errorDescription'] . '</div>';
                $output .= '<div><pre><code>' . trim($issues[$x]['errorSnippet']) . '</code></pre></div>';
                $output .= '<div>Line: ' . $issues[$x]['position']['line'] . '</div>';
                $output .= '<div>Column: ' . $issues[$x]['position']['column'] . '</div>';
                $output .= '<div>xPath: <pre><code>' . $issues[$x]['xpath'] . '</code></pre></div>';
                $output .= '<div>Certainty: ' . $issues[$x]['certainty'] . '</div>';
                $output .= '<div>Priority: ' . $issues[$x]['priority'] . '</div>';
                $output .= '<div>Best Practice: ' . $issues[$x]['resultTitle'] . '</div>';
                $output .= '<div>Reference: ' . $issues[$x]['ref'] . '</div>';
                $output .= '<div>Standards: ' . implode(', ', $issues[$x]['standards']) . '</div>';
                $output .= '<div>Issue Signature: ' . $issues[$x]['signature'] . '</div>';
                $output .= '<div>Test ID: ' . $issues[$x]['tID'] . '</div>';
                $output .= '<div>Best Practice ID: ' . $issues[$x]['bpID'] . '</div>';
                $output .= '</div>';
            }
        }
        return $output;
    }

Full Usage Example

So now that we have the full class in place, let's put it all together. In the example below, we're taking our request parameters from a $_POST array, such as that which we'd get from a form submission.

define('TENON_API_KEY', 'this is where you enter your api key');
define('TENON_API_URL', 'http://www.tenon.io/api/');
define('DEBUG', false);

$expectedPost = array('src', 'url', 'level', 'certainty', 'priority',
    'docID', 'systemID', 'reportID', 'viewport',
    'uaString', 'importance', 'ref',
    'fragment', 'store', 'csv');

// keep only the expected, non-empty request parameters
$opts = array();
foreach ($_POST AS $k => $v) {
    if (in_array($k, $expectedPost)) {
        if (strlen(trim($v)) > 0) {
            $opts[$k] = $v;
        }
    }
}

$opts['key'] = TENON_API_KEY;

require_once 'tenon.class.php';
$tenon = new tenon(TENON_API_URL, $opts);
$tenon->submit(DEBUG);

if (false === $tenon->decodeResponse()) {
    $content = '<h1>Error</h1><p>No Response From Tenon API, or JSON malformed.</p>';
    $content .= '<pre>' . var_export($tenon->tenonResponse, true) . '</pre>';
} else {
    $content = $tenon->processResponseSummary();
    $content .= '<h2>Issues</h2>';
    $content .= $tenon->processIssues();
    // append the raw JSON response for reference
    $content .= '<pre>' . htmlspecialchars($tenon->tenonResponse) . '</pre>';
}
echo $content;

That's it! You now have an HTML output of Tenon's response summary and issue details!
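The whitelist loop in the usage example can also be pulled out into a small reusable function, which makes it easier to try out without a real form submission. This is only a sketch: filterRequestParams is a made-up name, and it assumes every accepted value arrives as a plain string, so a field submitted as an array would be dropped.

```php
<?php
/**
 * Hypothetical helper: keeps only whitelisted, non-empty string values.
 */
function filterRequestParams(array $input, array $allowed)
{
    $opts = array();
    foreach ($allowed as $name) {
        if (isset($input[$name])
            && is_string($input[$name])
            && strlen(trim($input[$name])) > 0
        ) {
            $opts[$name] = $input[$name];
        }
    }
    return $opts;
}

// Stand-in for $_POST: one good value, one blank value, one unexpected key.
$fakePost = array(
    'url'   => 'http://www.example.com',
    'level' => '  ',
    'evil'  => 'x',
);

$opts = filterRequestParams($fakePost, array('url', 'level', 'certainty'));
print_r($opts);
```

Only the url entry survives: the blank level is skipped and the unexpected key never reaches the API request.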

Screw it, just gimme the issues

OK, what if you just want the issues and none of that output-to-HTML stuff? Getting the issues into a CSV file is ridiculously easy with PHP. Add this method to your PHP class:

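    /**
     * @param $pathToFolder
     * @return bool
     */
    public function writeResultsToCSV($pathToFolder)
    {
        $url = $this->rspArray['request']['url'];
        $issues = $this->rspArray['resultSet'];
        $name = htmlspecialchars($this->rspArray['request']['docID']);
        $count = count($issues);

        if ($count < 1) {
            return false;
        }

        $rows = array();
        for ($x = 0; $x < $count; $x++) {
            // columns follow the header row below; keys match those used in processIssues()
            $rows[$x] = array(
                $url,
                $issues[$x]['tID'],
                $issues[$x]['resultTitle'],
                $issues[$x]['errorTitle'],
                $issues[$x]['errorDescription'],
                implode(', ', $issues[$x]['standards']),
                $issues[$x]['errorSnippet'],
                $issues[$x]['position']['line'],
                $issues[$x]['position']['column'],
                $issues[$x]['xpath'],
                $issues[$x]['certainty'],
                $issues[$x]['priority'],
                $issues[$x]['ref'],
                $issues[$x]['signature']
            );
        }

        // Put a row of headers up on the beginning
        array_unshift($rows, array('URL', 'testID', 'Best Practice', 'Issue Title', 'Description',
            'WCAG SC', 'Issue Code', 'Line', 'Column', 'xPath', 'Certainty', 'Priority', 'Reference', 'Signature'));

        if (!file_exists($pathToFolder . $name . '.csv')) {
            $fp = fopen($pathToFolder . $name . '.csv', 'w');
            foreach ($rows as $fields) {
                fputcsv($fp, $fields);
            }
            fclose($fp);
            return true;
        }
        return false;
    }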
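If you want to see what fputcsv produces before writing anything to disk, one option is to point it at an in-memory stream. The sketch below does that; rowsToCsvString is a hypothetical helper name, and the sample rows are invented for the example.

```php
<?php
/**
 * Hypothetical helper: turns a list of rows into a CSV string using
 * fputcsv() on an in-memory stream, so no file is created on disk.
 */
function rowsToCsvString(array $rows)
{
    $fp = fopen('php://memory', 'w+');
    foreach ($rows as $fields) {
        // separator, enclosure and escape are passed explicitly
        fputcsv($fp, $fields, ',', '"', '\\');
    }
    rewind($fp);
    $csv = stream_get_contents($fp);
    fclose($fp);
    return $csv;
}

// Invented sample rows: a header row and one issue.
$rows = array(
    array('URL', 'Issue Title', 'Line'),
    array('http://www.example.com', 'Example issue, with a comma', '12'),
);
echo rowsToCsvString($rows);
```

Because fputcsv quotes fields where needed, values containing commas or quotes survive the trip into a spreadsheet, which is the main reason to use it rather than joining values by hand.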

Then all you need to do is call it like this:

define('TENON_API_KEY', 'this is where you enter your api key');
define('TENON_API_URL', 'http://www.tenon.io/api/');
define('DEBUG', false);
define('CSV_FILE_PATH', $_SERVER['DOCUMENT_ROOT'] . '/csv/');

$expectedPost = array('src', 'url', 'level', 'certainty', 'priority',
    'docID', 'systemID', 'reportID', 'viewport',
    'uaString', 'importance', 'ref',
    'fragment', 'store', 'csv');

// keep only the expected, non-empty request parameters
$opts = array();
foreach ($_POST AS $k => $v) {
    if (in_array($k, $expectedPost)) {
        if (strlen(trim($v)) > 0) {
            $opts[$k] = $v;
        }
    }
}

$opts['key'] = TENON_API_KEY;

require_once 'tenon.class.php';
$tenon = new tenon(TENON_API_URL, $opts);
$tenon->submit(DEBUG);

if (false === $tenon->decodeResponse()) {
    $content = '<h1>Error</h1><p>No Response From Tenon API, or JSON malformed.</p>';
    $content .= '<pre>' . var_export($tenon->tenonResponse, true) . '</pre>';
    echo $content;
} else {
    if (false !== $tenon->writeResultsToCSV(CSV_FILE_PATH)) {
        echo 'CSV file written!';
    }
}

Now what?

This blog post shows how easy it is to create a PHP implementation that will submit a request to Tenon, do some testing, and return results. We want to see what you can do with it. Register at Tenon.io and get started!


Accessibility Consulting is Broken

I’ve had an epiphany. Accessibility Consulting, that process where a client hires us to go through their system, test it for accessibility issues, and submit a report to them, is fundamentally broken. My personal interpretation of our goal, as professionals, is to make money doing Good. Our advanced level of knowledge, skills, and experience can and should drive higher amounts of money while allowing us to do a greater amount of good, like a snowball of awesomeness.

The client hires the consultant to help ease pain. That pain may have an array of causes and may have a varying degree of acuteness, but it always has the same root cause: ignorance. Organizationally there is systemic ignorance surrounding the topic of accessibility. The client understands neither what accessibility is nor how to manage it, and that results in a failure to effectively modify their ICT development or procurement processes to ensure an accessible outcome. Recognizing this, they turn to the services of a consultant. In all likelihood the resulting engagement involves an audit of the client’s problematic ICT system(s), the deliverable for which is a report outlining the findings of this audit process.

No matter how detailed and no matter how perfect the guidance, the delivery of the report fails to ease the customer’s pain. It does not directly address either the symptoms or the cause of the disease. In fact, the more extensive the testing and the greater its scope, the higher the likelihood that the client will be paralyzed by the magnitude of what they’ve been told. This paralysis is often made worse in cases where other technical debt has been incurred through bad architectural choices and longstanding legacy front-end code.

During in-person training, I use the following story to illustrate this paralysis:

Two years ago, my wife and I decided to make a large number of renovations to our house:

  • Paint all 3 bedrooms, including ceilings
  • Paint the great room
  • Create custom closet shelving
  • Replace badly stained wood floors in hallway
  • Sand, stain, and refinish wood floors in all 3 bedrooms
  • New chair molding, baseboard moldings, and wood trim throughout the house.
  • Replace wood floor in the great room.
  • New stairs

Excited by the new beauty I envisioned for our house, I dove right in and started doing the necessary demolition. I ripped out all of the carpet that was covering the wood floors, removed all the old molding, and skillfully removed the wood floor in the hallway. Removing the wood floor in the hallway was quite easy. I even skillfully staggered the removal of the boards so that the new boards would blend in without looking like they were replaced. Then the paralysis started.

Demolition complete, I was faced with the exact extent of what was ahead of me. Everywhere I went in the house I was reminded of everything I had to do to finish the house – some of which I really had no clue what to do. For instance, properly installing the new wood floor so it blended into the existing wood floor was far beyond my existing skill set. It scared me and, as a dependency for so many other things on the list, I knew I had to do it but had no idea where to start. So I didn’t start on it. I kept a long list of excuses prepared for why I couldn’t do it.

In hindsight, the real reason I didn’t dive right in to start the work was that I viewed the work ahead of me as a single massive job: Renovate the house. It wasn’t until I changed my outlook and saw the work as a series of small, distinct tasks that I could tackle it. This is the same type of overwhelmed feeling clients tend to get when they’re delivered a huge accessibility audit report. The lower their existing willingness to address accessibility and/or the higher their level of pain & distress (such as threat of litigation), the more likely and more severely they’ll feel paralyzed by the bad news in the report.

Ideally, the client would take the report, read it in its entirety, absorb the excellent guidance contained therein, and jump in with both feet to start fixing their broken system. The consultant feels that report is more than a report. It is a learning document. When the final spelling and grammar check is run in Microsoft Word and saved to its final version, the consultant proudly ships it to the client. Idealistically, the consultant thinks their masterful wording, illustrative screenshots, and skillful code examples will trigger a revolution in the management and development of more accessible solutions by the client. Unfortunately the more likely outcome is confusion. None of the client’s pain is alleviated. Their long-term effectiveness and success with accessibility is not improved and, at best, the report becomes the basis for a series of new issues in the client’s issue tracking system. In practice, the issue reports created by the client are often lacking an acceptable level of detail for the issues to be properly and expeditiously repaired, further reducing the usefulness of the report.

How do we ease the client’s pain and ensure the client is successful in improving their system(s) and reducing their risk? If the delivery of the audit report doesn’t do it and the client’s repurposing of the report’s content doesn’t do it, where does this leave us? How can we more effectively help ensure client success? By becoming directly involved in alleviating their pain. By becoming the client. Remember, the root cause of the client’s pain is ignorance. The more closely the consultant works with the client as an internal stakeholder, a subject matter expert, and a mentor throughout the development lifecycle, the more directly involved the consultant is in ensuring the client’s success. Integrated into the process as a member of the team, the consultant has direct access to help steer better process and practices. This is the exact opposite of what happens when simply throwing the report over the fence, as it were. The days of the comprehensive audit cum mammoth report should come to an end, replaced with actual guidance.

This guidance can take the form of generating and delivering internal use assets for procuring, designing, developing, and maintaining ICT systems or it can take the form of direct involvement in the development and QA processes. Let’s take, for example, the QA process since it so closely resembles the audit report process in spirit.

A client with even marginally mature development processes has some system for keeping track of bugs and feature requests. This may be as simple as a spreadsheet or as complex as a standalone system which keeps track of not only the issues and feature requests but also various additional metadata related to the issues and feature requests which assist in managing, tracking, and reporting. In any case where such a system doesn’t exist, the consultant’s first order of business should be to assist the client in choosing and standing up such a system. In either case, the consultant needs to have direct access to this system equivalent to that of any other team member.

It is in this issue tracking system where the consultant must do their work, if they’re to be effective in facilitating actual improvement of the client’s systems. Working within a robust issue tracking system, the consultant can immediately log the issues they find. This is where the QA and development staff does their work and it is appropriate that, as a part of the team, so should the consultant. Here, the consultant can log the issues they find and take part in the ensuing discussion among development staff regarding the nature, severity, and priority of the issue. Will it take a long time to fix? Will it be difficult? What are the dependencies? How will the user benefit? How will the business logic or presentation logic be affected? How can the repair be verified? These are among the many questions that the development staff might ask that require the input and collaboration of the consultant. They also require a level of discussion not available in a long, one-sided report, no matter how detailed. Direct access to and use of the client’s issue tracking facilitates this seamless collaboration.

Merely replacing the mammoth report deliverable with direct issue logging obviously isn’t enough to address systemic ignorance of accessibility; it only eliminates a significant roadblock to accessibility in current systems. In practice, this is particularly true because the issues are seen only as a series of issues and not a series of learning experiences. Testing of new work will show this to be true as the consultant is likely to discover issues identical in nature to those they had already reported in the earlier test effort(s). As mentioned in the second paragraph, the root cause of customer pain is ignorance and, while short-term pain is more effectively addressed with direct issue logging, the long-term plan must aggressively address ignorance. This is the domain of training. All persons involved in the management, design, development, and content of ICT systems should be trained to understand the need for accessibility, the specific challenges people with disabilities face when using ICT products & services, and how that client staff person’s role impacts the deployment of accessible ICT. Through this role-based training the client staff person’s ignorance can be systematically eliminated. In the case of training, the more permanence the training materials have, the better, up to and including LMS and/or video-based materials that can be used as part of an onboarding process for new employees.

Last among the mechanisms which should be employed by the consultant to address ignorance is the generation of internal assets to be used in the procurement, design, and development of ICT systems. This should include things like policies and procedures, style guides, checklists, and other job aids to be used by management and staff. These assets should serve as guidance and reference materials, success criteria, and performance measures whenever new ICT work is undertaken or existing ICT systems are improved.

The days of the monolithic accessibility audit report are numbered, as it is an outdated medium that fails to directly address the actual problems faced by the consultant’s clients. Clients, often driven by pain based in ignorance, want and deserve a more direct and proactive approach to solving the root causes of their pain. The proscriptive nature of an audit report should give way to the close involvement and leadership of a skilled consultant.

I’m available for accessibility consulting, audits, VPATs, training, and accessible web development, email me directly at karl@karlgroves.com or call me at +1 443-875-7343

Measuring the harm of flawed academic papers

For several years I’ve been interested in finding and reading academic work in the field of web accessibility. I have a very strong belief that the things we say regarding web accessibility must be based on a significant amount of rigor and I hold in higher esteem those who base their statements on fact rather than opinion and conjecture. Unfortunately I often find much of the academic work in web accessibility to be deficient in many ways, likely because of a lack of experiential knowledge of the professional web development environment. Web development practices change at such a lightning fast pace that even professional developers have trouble keeping up on what’s new. Academics who themselves aren’t really developers in the first place are likely to have even greater trouble in understanding not only the causes of accessibility issues in a web-based system but how to test for those causes. I deal specifically with those topics 8-10 hours a day and sometimes I still have to turn to coworkers for advice and collaboration.

This matters because out-of-date knowledge and experience lead to research methods that are also out of date. The most obvious evidence of this is when web accessibility researchers perform automated testing with tools that are out of date and/or technically incapable of testing the browser DOM. Testing the DOM is a vital feature for any accessibility testing tool, especially when used in academic research, because the DOM is what the end user actually experiences. It matters even more when studying accessibility because the DOM is interpreted by the accessibility APIs which pass information about content and controls to the assistive technology employed by the user. Performing research with a tool that does not test the DOM is like measuring temperature with a thermometer you know to be broken. You have no chance of being accurate.
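To make the difference concrete, here is a minimal sketch. It assumes a hypothetical page whose gallery images are injected by a script, and it uses a deliberately crude regular-expression check to count images without an alt attribute. This is not how any of the tools discussed here work, and the markup strings and function name are invented purely for illustration.

```javascript
// Count <img> tags that have no alt attribute in a string of markup.
// A regular expression is a crude stand-in for a real parser.
function countImagesMissingAlt(markup) {
  const imgTags = markup.match(/<img\b[^>]*>/gi) || [];
  return imgTags.filter(function (tag) {
    return !/\balt\s*=/i.test(tag);
  }).length;
}

// Hypothetical markup as delivered by the server: the gallery is empty.
const servedSource =
  '<div id="gallery"></div><script src="gallery.js"></script>';

// Hypothetical DOM, serialized after gallery.js has run in the browser.
const renderedDom =
  '<div id="gallery"><img src="a.jpg"><img src="b.jpg" alt="Harbor"></div>';

console.log(countImagesMissingAlt(servedSource)); // 0
console.log(countImagesMissingAlt(renderedDom));  // 1
```

A tool that only ever sees the first string reports a clean page, while the user's assistive technology is handed the second.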

Recently I’ve been reading a paper titled “Benchmarking Web Accessibility Evaluation Tools: Measuring the Harm of Sole Reliance on Automated Tests”. This compellingly titled paper fails to show any instances of “sole reliance” on automated tests and further it fails to demonstrate where such sole reliance caused actual “harm” to anyone or anything. Instead, the paper reads as if it was research performed to validate a pre-determined conclusion. In doing so, the paper’s authors missed an opportunity at a much more compelling discussion: the vast performance differences between well-known accessibility testing tools. The title alludes to this, saying “Benchmarking Web Accessibility Evaluation Tools” and then proceeds to instead focus on these ideas of “harm” and “sole reliance” while using bad results from bad tools as its evidence.

This point – that testing with automated tools only is bad – is so obvious that it almost seems unnecessary to mention. I’ve worked in accessibility and usability for a decade and many of those years were as an employee of companies which make automated testing tools. I’ve also developed my own such tools and count among my friends those who have also developed such tools. Not once do I recall the employees, owners, or developers of any such tools claiming that their automated testing product provides complete coverage. Training materials delivered by SSB BART Group and Deque Systems disclose clearly that automated testing is limited in its capability to provide coverage of all accessibility best practices. So, if “sole reliance” on automated testing is actually an issue, a better title for this paper would be “Measuring the Harm of Incomplete Testing Methodologies.” Instead, the reader is presented with what amounts to an either-or proposition by constant mention of the things that the automated tools couldn’t find vs. what human evaluators found. Thus the paper implies that either you use an automated tool and miss a bunch of stuff or you have an expert evaluate it and find everything.

This implication begins even in the first paragraph of the Introduction by stating:

The fact that webmasters put compliance logos on non-compliant websites may suggest that some step is skipped in the development process of accessible websites. We hypothesise that the existence of a large amount of pages with low accessibility levels, some of them pretending to be accessible, may indicate an over-reliance on automated tests.

Unfortunately, nowhere else in the paper is any data presented that suggests the above comments have any merit. The fact that “…webmasters put compliance logos on non-compliant websites” could mean the sites’ owners are liars. It could mean the site was at one time accessible but something changed to harm accessibility. It could mean the sites’ owners don’t know what accessibility means or how to measure it. In fact, it could mean almost anything. Without data to back it up, it really means nothing and is certainly no more likely to be evidence of “over-reliance on automated tests” as it is any of the other possibilities. Instead the reader is left with the implied claim that it is this “over-reliance on automated tests” that is the culprit.

Further unproved claims include:

With the advent of WCAG 2.0 the use of automated evaluation tools has become even more prevalent.

This claim is backed up by no data of any kind. The reader is given no data from surveys of web developers, no sales figures of tools, no download numbers of free tools, not even anecdotal evidence. Instead, it continues:

In the absence of expert evaluators, organizations increasingly rely on automated tools as the primary indicator of their stated level.

And again no data is supplied to substantiate this claim. In fact, my empirical data, gained from dealing with over seven dozen clients over the last decade, suggests that organizations often don’t do any testing of any kind, much less automated testing. These organizations also tend to lack any maturity of process regarding accessibility in general, much less accessible development, manual accessibility testing, or usability testing. My experience is that organizations don’t “do accessibility” in any meaningful way, automated or not. The true smoking gun, as it were, for this so-called harm by “sole reliance” on automated testing could be made simply by supplying the reader with actual data surrounding the above claim. It is not supplied and there is no evidence that such data was even gathered.

Another issue with this paper is its nearly myopic discussion of accessibility as a topic concerned only with users who are blind. The most egregious example comes in the claim, referencing prior work (from 2005), that “Findings indicate that testing with screen readers is the most thorough, whilst automated testing is the least”. Later the paper states that, during the expert evaluation process, “If no agreement was reached among the three judges a legally blind expert user was consulted.” While this is followed by a claim that this person is also a web accessibility expert, the paper states that “This protocol goes further and establishes a debate between judges and last resort consultation with end users.” I don’t consider the experience of a single blind user to be the same as “users” and further do not consider it likely that this single expert user’s opinion would represent the broad range of other blind users much less all users with all disabilities. In the United States, the overall rate of disability for vision impairment and hearing impairment is roughly equal, while those with mobility impairments are more than double both of those combined. Cognitive disabilities account for a larger population than the previous three types combined. Clearly the opinion, however skilled, of a single person who is blind is in no way useful as a means of measuring the accessibility of a website for all users with disabilities.

Further problems with the expert evaluation have to do with the ad-hoc nature of the expert evaluation process:

The techniques used to assess the accessibility of each web page are diverse across judges: evaluation tools that diverge from the ones benchmarked (WAVE2), markup validators, browser extensions for developers (Firebug, Web Developer Toolbar, Web Accessibility Toolbar), screen readers (VoiceOver, NVDA) and evaluation tools based on simulation such as aDesigner[24]

The above passage betrays two rather significant flaws in both the paper itself and the evaluation process. The first is the rather humorous irony that some of the tools listed are by their nature automated testing tools. Both the WAVE web service and WAVE toolbar provide visual representation of automated test results for the page being tested. Markup validators are automated evaluation tools which access the code and automatically assess whether the markup itself is valid. In other words, the expert evaluation process used automated tools. In the process, the point is made that no skilled evaluator would solely rely on the results from automated tools. Adding to the irony, there is no discussion of any other evaluation methods other than testing with screen readers. This further adds to my argument that this paper has a myopic focus on blindness. The second and more important flaw is that there appears to have been no predefined methodology in place for their evaluation. Instead it appears to be assumed that either the reader will trust that the reviewers’ expertise speaks for itself or that a rigorous methodology is unnecessary. Regardless of why, the fact that the paper doesn’t supply a detailed description of the expert evaluation methodology is cause to question the accuracy and completeness of, at the very least, the results of such evaluation.

If the purpose of the paper is to evaluate what is found by machines measured against the results uncovered by expert evaluators, then it is critical that the human evaluation methods be disclosed in explicit detail. Based on the information provided, it would appear that the expert evaluation happened in a much more ad hoc fashion, with each expert performing testing in whatever fashion they deem fit. The problem with this practice is that regardless of the level of expertise of the evaluators, there will always be differences in what & how the testing was done. The importance of this cannot be overstated. This is a frequent topic of discussion at every accessibility consulting firm I’ve worked for. The number and kind(s) of problems discovered can vary significantly depending upon who does the testing and the looser the consulting firm’s methodology (or lack thereof in some cases), the more variance in what is reported. In fact, at a previous job one client once remarked “I can tell who wrote which report just based on reading it”. This, to me, is a symptom of a test methodology that lacks rigor. On the upside, the paper does describe a seemingly collaborative workflow where the judges discuss the issues found, but this is still not the same as having and following a predefined rigorous methodology. The presence of a rigorous methodology of manual testing would be even further strengthened by the judges’ collaboration.

In this same section on Expert Evaluations, the paper states that “Dynamic content was tested conducting usability walkthroughs of the problematic functionalities…” and yet the process of conducting these “usability walkthroughs” was not discussed. The paper does not discuss how many participants (if any)  took part in these usability walkthroughs and does not disclose any details on any of the participants, their disabilities, their assistive technologies, and so on. Again, the reader is expected to assume this was performed with rigor.

Exacerbating the above, the paper does not provide any details on what the expert evaluation discovered. Some of this data is posted at http://www.markelvigo.info/ds/bench12 but the data provided only discloses raw issue counts and not specific descriptions of what, where, and why the issues existed. There is also no discussion of the severity of the issues found. While I realize that listing this level of detail in the actual paper would be inappropriate, sharing the results of each tool and the results of each expert evaluator at the URL mentioned above would be helpful in validating the paper’s claims. In fact, the expert evaluation results are invalidated as a useful standard against which the tools are measured by stating that:

Even if experts perform better than tools, it should be noted that experts may also produce mistakes so the actual number of violations should be considered an estimation…

If the experts make mistakes and the likelihood of such mistakes is so high that “…the actual number of violations should be considered an estimation…” then the results discovered by these persons is in no way useful as a standard for the subsequent benchmarking of the tools. Remember, the purpose of this paper is to supply some form of benchmark. You can’t measure something against an inaccurate benchmark and expect reliable or useful data.

The description of the approach for automated testing does not disclose the specific versions of each tool used or the dates of the testing. The paper also does not disclose what level of experience the users of the tools had with the specific tools or what, if any, configuration settings were made to the tool(s). The tool version can, at times, be critical to the nature and quality of the results. For instance, Deque’s Worldspace contained changes in Version 5 that were significant enough to make a huge difference between the results of testing with it and its predecessor. Similarly, SSB BART Group’s AMP is on a seasonal release schedule which has in the past seen big differences in testing. Historically, automated testing tools are well-known for generating false positives. The more robust tools can be configured to avoid or diminish this but whether this was done is not known. Not disclosing details on the testing tools makes it difficult to verify the accuracy of the results. Were the results they found (or did not find) due to flaws in the tool(s), flaws in the configuration of the tools, or flawed use of the tools?  It isn’t possible to know whether any of these possible factors influenced the results without more details.

To that point, it also bears mentioning that some of the tools used in this paper do not test the DOM. Specifically, I’m aware that TAW, TotalValidator, and aChecker do not test the DOM. SortSite and Worldspace do test the DOM and it is alleged that the latest versions of AMP do as well. This means that there is a built-in discrepancy between what the tools employed actually test. This discrepancy in what the tools test quite obviously leads to significant differences in the results delivered and, considering the importance of testing the DOM, calls into question the reason for including half of the tools in this study. On one hand it makes sense in this case to include popular tools no matter what, but on the other hand it seems that using tools that are known to be broken sets up a case for a pre-determined conclusion to the study. This skews the results and ensures that more issues are missed than should be.

The numerous flaws discussed above do not altogether make this paper worthless. The data gathered is very useful in providing a glimpse into the wide array of performance differences between automated testing tools. The issues I’ve discussed above certainly invalidate the paper’s claim that it was a “benchmark” study, but it is nonetheless compelling to see differences between each tool, especially in discussing that while a tool may out-perform its peers in one area, it may under-perform in other ways even more significantly. The findings paint a picture of an automated testing market where tool quality differs in wild and unpredictable ways which a non-expert customer may be unprepared to understand. Unfortunately the data that leads to these various stated conclusions isn’t exposed in a way that facilitates public review. As mentioned, some of this data is available at http://www.markelvigo.info/ds/bench12. It is interesting to read the data on the very significant disparities between the tools and also sad that it has to be presented in a paper that is otherwise seriously flawed and obviously biased.

An unbiased academic study into the utility and best practices of automated testing is sorely needed in this field.  I’ve attempted my own personal stab at what can be tested and how and I stand by that information. I’ve attempted to discuss prioritizing remediation of accessibility issues. I’ve recommended a preferred order for different testing approaches. At the same time, none of this is the same as a formal academic inquiry into these topics.  We’ll never get there with academic papers that are clearly driven by bias for or against specific methodologies.


Markel Vigo has updated the URL I’ve cited at which you can find some of the data from the paper with a response to this blog post. Like him, I encourage you to read the paper. In his response, he says:

We do not share our stimuli used and data below by chance, we do it because in Science we seek the replicability of the results from our peers.

My comments throughout this blog post remain unchanged. The sharing of raw issue counts isn’t enough to validate the claims made in this paper. Specifically:

  1. There is no data to substantiate the claim of “sole reliance” on tools
  2. There is no data to substantiate the claim of “harm” done by the supposed sole reliance
  3. There is no data shared on the specific methodology used by the human evaluators
  4. There is no data shared regarding the exact nature and volume of issues found by human evaluators
  5. There is no data shared regarding the participants of the usability walkthroughs
  6. There is no data shared regarding the exact nature and volume of issues found by the usability walkthroughs
  7. There is no information shared regarding the version(s) of each tool and specific configuration settings of each
  8. There is no data shared regarding the exact nature and volume of issues found by each tool individually
  9. There is no data shared which explicitly marks the difference between exact issues found/ not found by each tool vs. human evaluators

It is impossible to reproduce this study without this information.

In his response, Markel states that this blog post makes "…serious accusations of academic misconduct…". I have no interest in making any such accusations against any person. Throughout my life I’ve apparently grown the rare ability to separate the person from their work. I realize that my statement about this paper’s bias can be interpreted as a claim of academic misconduct, but that’s simply an avenue down which I will not travel. Markel Vigo has contributed quite a bit to the academic study of web accessibility and I wouldn’t dare accuse him or the other authors of misconduct of any kind. Nevertheless, the paper does read as though it were research aimed at a predetermined conclusion. Others are welcome to read the paper and disagree.

Finally, the response states:

Finally, the authors of the paper would like to clarify that we don’t have any conflict of interest with any tool vendor (in case the author of the blog is trying to cast doubt on our intentions).

Let me be clear to my readers: Nowhere in this blog post do I state or imply that there’s any conflict of interest with any tool vendor.



Recently, I saw someone Tweet that “…ARIA should be last” when working to make a website accessible. As you learn in Logic 101, generalized statements are particularly false. Such a broad statement, though mostly correct at least in spirit, is wholly incorrect in certain situations. ARIA is clearly the right choice in cases where native semantics do not exist at all or are poorly supported. Still, there are some parts of ARIA that I think are just plain silly and ill-advised – namely roles which are intended to behave exactly like elements in native HTML.


There used to be a time when creating pseudo-buttons, like a link styled to look like a button, made sense. Styling the <button> element was incredibly difficult. These days that’s not the case. As I understand it, any browser that will support the ‘button’ role will also reliably support CSS on the <button> element, making the use of this role pretty silly.


I’m completely unable to find a use case for the ‘heading’ role. The heading role, as the name implies, can function as a substitute for <h1>, <h2>, etc. and the WAI-ARIA spec says “If headings are organized into a logical outline, the aria-level attribute can be used to indicate the nesting level.” In other words, you could do something like this:

<div role='heading' aria-level='3'>Foo</div>

I cannot imagine a scenario where this is at all a suitable alternative to HTML’s native heading elements. It is far more markup  than necessary and, I suspect, more prone to errors by uninformed devs.


The ‘link’ role is another role that is ripe for developer error. Actual links – that is, an <a> element with an href attribute pointing to a valid URI – have specific methods and properties available to them, as I described in an earlier post titled Links are not buttons…. Adding a role of ‘link’ on something that is not a link now requires you to ensure that your code behaves the same way as a link. For instance, it should be in the tab order, should react to the appropriate events via keyboard, and should actually navigate to a new resource when acted upon. These are all things an actual, properly marked up link can do, making this role silly as well.
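A minimal sketch of that extra work is below. It assumes an element-like object with setAttribute and addEventListener; makePseudoLink and the navigate callback are hypothetical names made up for illustration, not part of any library.

```javascript
// Sketch: redoing by hand what a native <a href> does for free.
// `el` is a DOM element (or any object exposing the same two methods).
// `navigate` is a hypothetical callback, for example:
//   function (url) { window.location.href = url; }
function makePseudoLink(el, url, navigate) {
  el.setAttribute('role', 'link');   // expose it to assistive technology as a link
  el.setAttribute('tabindex', '0');  // put it in the tab order
  el.addEventListener('click', function () {
    navigate(url);
  });
  el.addEventListener('keydown', function (event) {
    // Native links are activated with Enter, not with the spacebar.
    if (event.key === 'Enter') {
      navigate(url);
    }
  });
}
```

Even with all of this in place, the element still lacks things a real link gets from the browser, such as the option to open it in a new tab.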

role=list / role=listitem

Given the WAI-ARIA descriptions of the list Role and listitem Role I can’t see anything that these roles offer that can’t be handled by plain HTML. The former is described as “A group of non-interactive list items” while the latter is “A single item in a list or directory.” In other words, these things are the same as a regular ole HTML list.


The radio Role is “A checkable input in a group of radio roles, only one of which can be checked at a time.” Of all of the roles listed here this is the only one I could justify using. Unlike all of the other roles listed, the native element this replaces cannot be styled with much flexibility. It is far easier to style something else and give it a role of ‘radio’. At the same time I must admit to wondering: Why? At the risk of sounding like I’m against “design”, it just doesn’t seem worth it to forego the reliability of a native control just for visual styling purposes. There are several JavaScript libraries, jQuery plugins, or whole front-end frameworks aimed at the styling of forms and almost universally they fail to meet accessibility requirements in at least one of the following ways.

  • The design itself has poor contrast
  • The styling doesn’t work in Windows High Contrast Mode
  • The styling would be incompatible with user-defined styles
  • The custom elements are not keyboard accessible or, at least, visual state change doesn’t work via keyboard

In the case of custom radio buttons, merely adding a role of ‘radio’ is not enough and the costs of doing it right should be strongly considered against the reliability and robustness of just using native radio buttons.
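To give a sense of what "doing it right" involves, here is a rough sketch of only the state-keeping part of a custom radio group. It assumes the common pattern in which arrow keys move the selection (wrapping at the ends) and only the checked item sits in the tab order. The function names are hypothetical, and a real widget would also need focus handling, labelling, and the styling concerns listed above.

```javascript
// Which radio should be checked after a key press?
// Arrow keys move the selection and wrap around at either end.
function nextRadioIndex(currentIndex, count, key) {
  switch (key) {
    case 'ArrowDown':
    case 'ArrowRight':
      return (currentIndex + 1) % count;
    case 'ArrowUp':
    case 'ArrowLeft':
      return (currentIndex - 1 + count) % count;
    default:
      return currentIndex;
  }
}

// Attributes each pseudo-radio needs for a given checked index.
// Only the checked item is tabbable; the rest are reached with arrows.
function radioAttributes(count, checkedIndex) {
  const result = [];
  for (let i = 0; i < count; i++) {
    result.push({
      role: 'radio',
      'aria-checked': i === checkedIndex ? 'true' : 'false',
      tabindex: i === checkedIndex ? '0' : '-1'
    });
  }
  return result;
}
```

All of this bookkeeping comes built in with native radio buttons that share a name.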


Although the roles discussed above are, in my opinion, just plain silly in HTML, WAI-ARIA wasn’t created just for making HTML documents accessible. Ostensibly, it can be used for any web content and, in fact, the role attribute was added to SVG Tiny 1.2 all the way back in 2008. SVG would otherwise have no way of exposing the same name, state, role, and value information without ARIA and it has been incorporated directly into SVG 2.
Meme: ARIA All The Things!
So on the topic of “Use ARIA first” vs. “Use ARIA last”, neither is right. The right answer is to use ARIA whenever ARIA is the best tool for the task at hand. That might be for a progressive enhancement scenario when the user’s browser doesn’t support a specific feature, or to enhance accessibility under certain use cases, or to create an accessible widget that doesn’t exist in native semantics. Blanket statements don’t help, but constructive guidance does.


The little button that could

The original issue

A link is used to perform the action of a button. The code below is used to show a region of the page which is hidden by default. Screen readers will read this code as a link and expect that it will navigate. Instead, upon activating this link, focus remains on the link and performs an action of a button.

<a id="x_123" class="niftyNuBtn" href="javascript:;">Do Something</a>

As a consequence, we recommend using an actual BUTTON element:

<button id="x_123" class="niftyNuBtn">Do Something</button>

The response

We can’t use a BUTTON because it would not look right with our CSS. Our stylesheets all reference this as a.niftyNuBtn. Why is this a problem anyway?

The follow-up

Well, there are two primary issues, the first of which is admittedly a little about semantic purity in that a button is a button and a link is a link. But there’s a bit more to it: users who can see an affordance which looks like a button will intuit immediately that it will (or should) behave like a button. And, were it to look like a link, they would intuit that it is a link. A user who cannot see, or whose vision is very poor, may be using an assistive technology which reads out the properties of an underlying object. In short, a BUTTON will be announced via text-to-speech as "button". A button’s behavior and a link’s behavior are distinctly different – a button initiates an action in the current context whereas a link changes the context by navigating to a new resource. In order to meet users’ expectations of how this affordance will perform, it should be a button.

The follow-up’s response

Our engineer said we can use WAI-ARIA for this. He said that we can give this thing a button role which will mean that JAWS will announce this as a button and that will alleviate your earlier concerns. So, how about this:

<a id="x_123" class="niftyNuBtn" role="button" href="javascript:;">Do Something</a>

Almost there, I think

Yes. This will cause aria-supporting assistive technologies to announce this link as a button. Unfortunately, there's the issue of focus management and this impacts more than just users who are blind. A link is understood to change focus to a new resource. Buttons may or may not change focus, depending on the action being performed. In this specific button's case, focus should stay on the button. At first glance, you may think that this pseudo-button is doing what it needs to be doing because you're keeping focus on the button when the user clicks it. That's true. What's also true is focus stays on it when you hit the enter key, which is also fine. Unfortunately, activating it with the spacebar causes the page to scroll. Users who interact with their computer using only the keyboard will expect that they can activate the button with the spacebar as well. Overall the best option is to just use a button.

Digging in

Crap, you're right. Our engineer added the button role and everything was great, but then I hit the spacebar and the page scrolled! How do we stop this?!?

Prevent Default

Actually, stopping the scrolling is pretty easy. You can use event.preventDefault() like so:

$('.niftyNuBtn').on('click keypress', function (event) {
    if (event.type === 'click') {
        event.preventDefault();
    } else if (event.type === 'keypress') {
        var code = event.charCode || event.keyCode;
        if ((code === 32) || (code === 13)) { // 32 = spacebar, 13 = Enter
            event.preventDefault();
        }
    }
});

Keep in mind, you'll need to do this event.preventDefault(); on every instance where you have code that acts like a button.
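One way to avoid repeating that check everywhere is to wrap it once, as in the sketch below. The helper names isButtonActivation and asButton are hypothetical, as is the showRegion action in the usage note. The logic simply mirrors the click, spacebar, and Enter handling described above.

```javascript
// True for a mouse click, or a keypress of the spacebar (32) or Enter (13).
function isButtonActivation(event) {
  if (event.type === 'click') {
    return true;
  }
  if (event.type === 'keypress') {
    var code = event.charCode || event.keyCode;
    return code === 32 || code === 13;
  }
  return false;
}

// Wrap an action so it only runs on button-like activation and always
// suppresses the default behavior (navigation, page scroll) first.
function asButton(action) {
  return function (event) {
    if (isButtonActivation(event)) {
      event.preventDefault();
      action(event);
    }
  };
}
```

With jQuery on the page, this could be attached as $('.niftyNuBtn').on('click keypress', asButton(showRegion)); for each pseudo-button. That is still more code than a native button needs, which is zero.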


Turns out we've decided to use a button. All we needed to do was change a few CSS declarations. Thanks so much for the help.

Note: no, this isn't from a real client, but it is reminiscent of multiple real situations.


Everything you know about accessibility testing is wrong (Part 4)

…how many bigger issues have we missed wasting our time fixing this kind of crap? @thebillygregory

Literally every single audit report I’ve ever done includes issues relating to the following:

  • Missing alt attributes for images
  • Missing explicit relationships between form fields and their labels
  • Tables without headers or without explicit relationships between header cells and data cells

I also frequently find these others:

  • Use of deprecated, presentational elements and attributes
  • Events bound to items that are not discoverable via keyboard
  • Poor color contrast
  • Blank link text
  • Missing/ inaccurate/ incomplete name, state, role, and value information on custom UI widgets

The sheer volume of these types of errors is, to put it lightly, frustrating. In fact, the title of my presentation “What is this thing and what does it do” is actually born from an inside joke. During one audit where the system I was testing was particularly bad, I joked to some coworkers that analyzing the code was a bit like a game to figure out, “what is this thing and what does it do?”. I only later decided to put a positive spin on it.

As I mentioned in the previous post in this series, there are an average of 54 automatically detectable errors per page on the Internet. The thing about automated testing is that, even though it is somewhat limited in the scope of what it can find, some of the errors it does find are pretty high impact for the user. Think about it: missing alt text for images and missing labels for form fields have a huge impact on users. While the total number of accessibility best practices that are definitively testable by automated means is small, they tend to have a huge impact on whether people with disabilities can use the system.
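These checks are mechanical, which is exactly why a machine can do them. The sketch below is a toy illustration rather than a real testing tool: it assumes the page has already been reduced to a flat array of plain objects (a made-up representation), and it only knows two rules. Real tools also account for wrapping labels, aria-label, aria-labelledby, and much more.

```javascript
// Each node is a plain object such as { tag: 'img', attrs: { src: 'a.png' } }.
function findDetectableIssues(nodes) {
  // Collect every id that some <label for="..."> points at.
  const labelledIds = new Set(
    nodes
      .filter(function (n) { return n.tag === 'label' && n.attrs.for; })
      .map(function (n) { return n.attrs.for; })
  );

  const issues = [];
  nodes.forEach(function (n) {
    // An empty alt ("") is allowed; a missing alt attribute is not.
    if (n.tag === 'img' && !('alt' in n.attrs)) {
      issues.push('img missing alt attribute: ' + (n.attrs.src || '(no src)'));
    }
    if (n.tag === 'input' && n.attrs.type !== 'hidden' &&
        !labelledIds.has(n.attrs.id)) {
      issues.push('input has no associated label: ' + (n.attrs.id || '(no id)'));
    }
  });
  return issues;
}

// Hypothetical page fragment.
const page = [
  { tag: 'img', attrs: { src: 'logo.png' } },
  { tag: 'img', attrs: { src: 'spacer.gif', alt: '' } },
  { tag: 'label', attrs: { for: 'email' } },
  { tag: 'input', attrs: { id: 'email', type: 'text' } },
  { tag: 'input', attrs: { id: 'phone', type: 'text' } }
];

console.log(findDetectableIssues(page));
```

For this fragment the function flags logo.png and the phone field, and leaves the decorative spacer and the labelled email field alone.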

Automatically detectable issues should never see the light of day

The reason why some people are against automated testing is that for such a long time we in the accessibility world haven’t really understood where the testing belongs. People have long regarded the applicability of automated accessibility testing as being a QA process and, even worse, it often exists as the only accessibility-related QA testing that occurs. If your approach to accessibility testing begins and ends with the use of an automated tool, you’re doing it wrong. This concept of automated-tool-or-nothing seems at times to be cooperatively perpetuated both by other tool vendors and by accessibility advocates who decry automated testing as not effective. We must turn our back – immediately and permanently – on this either-or mentality. We must adopt a new understanding that automated testing has an ideal time & place where it is most effective.

Automated accessibility testing belongs in the hands of the developer. It must be part of normal development practices and must be regarded as part of the workflow of checking one’s own work. All developers do basic checking of their work along the way, be it basic HTML & CSS validation, or checking that it displays right across browsers. Good developers take this a step further, by using code inspection tools like JSLint, JSHint, PHP Mess Detector, PHP_CodeSniffer and the like. In fact, IDEs like WebStorm, NetBeans, Aptana, and Eclipse have plugins to enable developers to do static code analysis. Excellent developers perform automated unit testing on their code and do not deploy code that doesn’t pass. What prevents accessibility from being part of this? Existing toolsets.

The revolution in workflow that will change accessibility

Last week I created a new WordPress theme for this site. I’m not the world’s best designer, but I hope it looks better than before. I created it from scratch using my Day One theme as a base. It also includes FontAwesome and Bootstrap. I use Grunt to manage a series of tasks while I build and modify the template’s design:

  • I use grunt-contrib-sass to compile 11 different SASS files to CSS
  • I use grunt-contrib-concat to combine my JS files into one JS file and my CSS files into one CSS file
  • I use grunt-contrib-uglify to minify the JS file and grunt-contrib-cssmin to minify the CSS file
  • I use grunt-uncss to eliminate unused CSS declarations from my CSS file.
  • I use grunt-contrib-clean to clear out certain folders during the above processes to ensure any cruft left behind is wiped and that the generated files are always the latest & greatest
  • I use grunt-contrib-jshint to validate quality of my JS work – even on the Gruntfile itself.
  • I use grunt-contrib-watch to watch my SASS files and compile them as I go so I can view my changes live on my local development server.
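
A minimal sketch of how a Gruntfile wiring these tasks together might look. The plugin names come from the list above; every file path, target name, and the task order are assumptions made for illustration, not the actual configuration of this theme.

```javascript
// Gruntfile.js (illustrative sketch; every path and target name is hypothetical)
function gruntfile(grunt) {
  grunt.initConfig({
    // Wipe generated folders so stale files never survive a build
    clean: { build: ['build/'] },

    // Lint the JS sources and the Gruntfile itself
    jshint: { all: ['Gruntfile.js', 'js/src/**/*.js'] },

    // Compile SASS to CSS
    sass: { dist: { files: { 'build/css/main.css': 'sass/main.scss' } } },

    // Combine the JS sources into a single file
    concat: { js: { src: ['js/src/**/*.js'], dest: 'build/js/site.js' } },

    // Drop CSS rules that the listed HTML never uses
    uncss: { dist: { files: { 'build/css/tidy.css': ['index.html'] } } },

    // Minify the combined JS and the tidied CSS
    uglify: { dist: { files: { 'build/js/site.min.js': ['build/js/site.js'] } } },
    cssmin: { dist: { files: { 'build/css/site.min.css': ['build/css/tidy.css'] } } },

    // Recompile whenever a SASS file changes
    watch: { sass: { files: ['sass/**/*.scss'], tasks: ['sass'] } }
  });

  [
    'grunt-contrib-clean',
    'grunt-contrib-jshint',
    'grunt-contrib-sass',
    'grunt-contrib-concat',
    'grunt-uncss',
    'grunt-contrib-uglify',
    'grunt-contrib-cssmin',
    'grunt-contrib-watch'
  ].forEach(function (plugin) {
    grunt.loadNpmTasks(plugin);
  });

  // Running plain "grunt" performs one full build; "grunt watch" keeps compiling as you save
  grunt.registerTask('default', ['clean', 'jshint', 'sass', 'concat', 'uncss', 'uglify', 'cssmin']);
}

module.exports = gruntfile;
```

Because Grunt stops at the first failing task, putting jshint early in the default task means a lint failure prevents anything from being compiled or minified.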

All of my projects use Grunt, even the small Ajax Chat demo I’m giving at CSUN. Some of the projects do more interesting things. For instance, the Ajax Chat pulls down an external repo. Tenon automatically performs unit testing on its own code. When something goes wrong, Grunt stops and yells at you. You can even tie Grunt to pre-commit hooks. In such a workflow nothing goes live without all your Grunt tasks running successfully.

Imagine, an enterprise-wide tool that can be used in each phase, that works directly as part of your existing workflows and toolsets. Imagine tying such a tool to everything from the very lowest level tasks all the way through to the build and release cycles and publication of content. That’s why I created Tenon.

While Tenon has a web GUI, the web GUI is actually a client application of the real Tenon product. In fact, internally Asa and I refer to and manage Tenon as a series of different things: Tenon Admin, Tenon UI, and Tenon (the API). The real deal, the guts, the muscle of the whole thing is the Tenon API, which allows direct command line access to testing your code. This is fundamental to what we believe makes a good developer tool. When used from the command line, Tenon can play happily with any *nix-based system. So a developer can open a command prompt and run:

$ tenon http://www.example.com

and get results straight away.

By using Tenon as a low-level command it becomes possible to integrate your accessibility testing into virtually any build system, such as make, bash, Ant, Maven, etc. As I mentioned above, one possibility is to tie Tenon to a git pre-commit hook, which would prevent a developer from committing code that could not pass Tenon’s tests. As with JSHint, you can customize the options to match your local development environment and the level of strictness to apply to such a pre-commit hook.

A typical workflow with Tenon might look a bit more relaxed for, say, a front-end developer working on a CMS and using Grunt to compile SASS to CSS and minify JS. We will be introducing a Grunt plugin as a Node.js module. Once grunt-tenon is added to your Gruntfile.js file, you can use grunt-contrib-watch to watch your work. Every time you save, Grunt will run your normal tasks and test the page you’re working on for accessibility.

Processing: http://www.example.com
Options: {"timeout":3000,"settings":{},"cookies":[],"headers":{},"useColors":true,"info":true}
Injecting scripts:

>>  /var/folders/mm/zd8plqb15m38j4dzf3yf9pjw0000gn/T/1394733486320.647
>>  client/assets.js
>>  client/utils.js

Errors: 10
Issues: 10
Warnings: 0
Total run time: 3.27 sec
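
The summary lines at the end of a run like the one above are easy to turn into a pass/fail decision. The following is only a sketch: it parses the "Errors", "Issues", and "Warnings" lines as plain text, and the zero-errors policy is a hypothetical choice, not a documented Tenon or grunt-tenon feature.

```javascript
// Parses summary lines such as "Errors: 10" into { errors: 10, ... }
function parseSummary(output) {
  const summary = {};
  for (const line of output.split('\n')) {
    const match = /^(Errors|Issues|Warnings):\s*(\d+)$/.exec(line.trim());
    if (match) {
      summary[match[1].toLowerCase()] = Number(match[2]);
    }
  }
  return summary;
}

// Hypothetical policy: fail the build when more errors are reported than allowed
function shouldFailBuild(summary, maxErrors = 0) {
  return (summary.errors || 0) > maxErrors;
}

const sample = [
  'Errors: 10',
  'Issues: 10',
  'Warnings: 0',
  'Total run time: 3.27 sec'
].join('\n');

const summary = parseSummary(sample);
console.log(summary, shouldFailBuild(summary));
```

Raising maxErrors lets a team adopt the check on an existing codebase and tighten the limit over time.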

The same Gruntfile can also be run on your Jenkins, Travis-CI or Bamboo build server. Let’s say we’re using Jira for bug tracking and have it connected to our Bamboo build server. A developer on our team makes an accessibility mistake and commits that mistake with a Jira key (ISSUE-1234) into our repo. As part of the Bamboo build, Tenon will return the test results in JUnit format. The Bamboo build will fail and we can see in Jira that the commit against ISSUE-1234 was the cause for the red build. It will link directly to the source code in which the error originated. Because we’re using a CI build system, from our developers’ standpoint all this can happen many times a day without requiring anything more than a simple commit!
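
JUnit-style XML is the format build servers such as Bamboo and Jenkins read to decide whether a build is red. As a sketch of the idea, here is how a list of issues could be serialized into that shape. The issue fields used here (title and xpath) are hypothetical and are not Tenon's actual response format.

```javascript
// Escapes the characters that are unsafe inside XML text and attribute values
function escapeXml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Builds a minimal JUnit-style report: one failing <testcase> per issue,
// or a single passing <testcase> when no issues were found.
function toJUnit(url, issues) {
  const page = escapeXml(url);
  const cases = issues.length === 0
    ? ['  <testcase classname="' + page + '" name="no automatically detectable issues"/>']
    : issues.map(function (issue) {
        const title = escapeXml(issue.title);
        return '  <testcase classname="' + page + '" name="' + title + '">\n' +
          '    <failure message="' + title + '">' + escapeXml(issue.xpath) + '</failure>\n' +
          '  </testcase>';
      });
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<testsuite name="accessibility" tests="' + cases.length + '" failures="' + issues.length + '">',
    cases.join('\n'),
    '</testsuite>'
  ].join('\n');
}

console.log(toJUnit('http://www.example.com', [
  { title: 'Image is missing an alt attribute', xpath: '/html/body/img[1]' }
]));
```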

Proper management of accessibility necessitates getting ahead of accessibility problems as soon as possible. Effectively, there is no earlier place to do that than before the code is committed. As a pre-commit hook or, at least, as a Grunt task before committing, accessibility problems are caught before they’re created. Automated testing is not the end, but the beginning of a robust accessibility testing methodology.

The next post is the last post in this series, where we’ll put it all together.

I’m available for accessibility consulting, audits, VPATs, training, and accessible web development, email me directly at karl@karlgroves.com or call me at +1 443-875-7343

Looking forward to CSUN 2014

I’m currently wrapping up the rest of my work for the week and getting ready for the annual pilgrimage to San Diego for the International Technology and Persons with Disabilities Conference, otherwise known as “CSUN”. Unlike previous years, I have relatively few presentations. I’m glad about that, really, because it means I can spend more time meeting people. If this is your first year at CSUN, you should read John Foliot’s CSUN For Newbies.

Preconference Workshop

On Tuesday, March 18, 2014, at 1:30 PST Billy Gregory and I will be assisting Steve Faulkner and Hans Hillen in a Pre-Conference Workshop titled “Implementing ARIA and HTML5 into Modern Web Applications (Part Two)”.

My Presentations

  1. Thursday, March 20, 2014 – 3:10 PM PST
    Roadmap For Making WordPress Accessible – WordPress Accessibility Team members demonstrate progress and challenges and a roadmap for making WordPress accessible. Location: Balboa B, 2nd Floor, Seaport Tower
  2. Friday, March 21, 2014 – 1:50 PM PST
    No Beard Required. Mobile Testing With the Viking & the Lumberjack – Testing Mobile accessibility can be as daunting as it is important. This session will demystify and simplify mobile testing using live demonstrations and straightforward techniques. Location: Balboa A, 2nd Floor, Seaport Tower

Demonstrations of Tenon

If you’re interested in finding out more about Tenon, email me or just stop me in the hall and I’ll give you a demo.

If you’re going to CSUN I want to meet you

I love CSUN’s family-reunion-like atmosphere and getting to catch up with the many people I already know. But what I like more is meeting people I hadn’t already met. If you’re new to accessibility or we just don’t know each other yet, please just walk up and say hello. This is how I met many of the people I count among my best friends in accessibility!

Something more formal?

If you want to set up something more formal, especially for a one-on-one conversation, I strongly recommend emailing me directly. Typically what happens is that something intended to be a simple informal one-on-one get together winds up being a big group outing, so if you want to set up a private time to talk, here are some ideas.

  • Morning – I’m available all week before 8am. I’m open Tuesday and Thursday before 9.
  • Afternoon – As the day gets later, openings get more scarce. I’m currently open for lunches all week.
  • Evening – Evenings are often filled with impromptu group activities, so I won’t schedule something during the evening.

So, given the above, email me at karl@karlgroves.com to set something up!

Everything you know about accessibility is wrong (Part 3)

In the previous post in this series, I ended with a discussion that “current automatic accessibility testing practices take place at the wrong place and wrong time and is done by the wrong people” but really this applies to all accessibility testing. Of course every organization is different, but my experience substantiates the statement quite well. The “…by the wrong people” part is especially true. The wrong people are QA staff.

While QA practices vary, one nearly universal trait among QA staff is that they lack any training in accessibility. Further, they often lack the technical skill necessary to decipher the reports generated by automated tools. When you combine their inexperience in both accessibility and development, you’re left with significant growing pains when you thrust an automated testing tool at them. As I’ve said in previous posts, these users will trust the automated tool’s results implicitly. Regardless of the quality of the tool, this increases the opportunity for mistakes because there are always limitations to what can be found definitively, and it is very likely that some interpretation is needed. There are also things that are too subjective or too complex for an automated tool to catch.

Irrespective of tool quality, truly getting the most out of an automated web accessibility tool requires three things:

  • Technical knowledge in that which is being tested
  • Knowledge and understanding of the tool itself
  • Knowledge around accessibility and how people with disabilities use the web

The first two points above apply to any tool of any kind. Merely owning a bunch of nice tools certainly hasn’t made me an expert woodworker. Instead, my significant expense in tools has allowed me to make the most of what little woodworking knowledge and skill I have. But, if I had even more knowledge and skill, these tools would be of even more benefit. Even the fact that I have been a do-it-yourselfer since I was a child helping my dad around the house only helps marginally when it comes to a specialized domain like fine woodworking.

The similar lack of knowledge on the part of QA staff is the primary reason why they’re the wrong users for automated testing tools – at least until they get sufficient domain knowledge in development and accessibility. Unfortunately learning-by-doing is probably a bad strategy in this case, due to the disruptive nature of erroneous issue reports that’ll be generated along the way.

So who should be doing the testing? That depends on the type of testing being performed. Ultimately, everyone involved in the final user-interface and content should be involved.

  • Designers who create mockups should test their work before giving it to developers to implement
  • Developers should test their work before it is submitted to version control
  • Content authors should test their work before publishing
  • QA staff should run acceptance tests using assistive technologies
  • UX Staff should do usability tests with people with disabilities.

At every step is an opportunity to discover issues that had not been previously discovered, but there’s also a high likelihood that, as the code gets closer and closer to being experienced by a user, the issues found won’t be fixed. Among the test opportunities listed above, developers’ testing of their own work is the most critical piece. QA staff should never have functional acceptance tests that fail due to an automatically-detectable accessibility issue. Usability test participants should never have a failed task due to an automatically-detectable accessibility issue. It is entirely appropriate that the developer take on such testing of their own work.

Furthering the accessibility of the Web requires a revolution in how accessibility testing is done

Right now we’re experiencing a revolution in the workflow of the modern web developer. More developers are beginning to automate some or all of their development processes, whether this includes things like dotfiles or SASS / LESS or the use of automated task runners like Grunt and Gulp. Automated task management isn’t the exception on the web, it is the rule and it stems from the improvement in efficiency and quality I discussed in the first post in this series.

Of the top 24 trending projects on GitHub as of this writing:

  • 21 of them include automated unit testing
  • 18 of them use Grunt or Gulp for automated task management
  • 16 of them use JSHint as part of their automated task management
  • 15 of them use Bower for package management
  • 15 of them use Travis (or at least provide Travis files)
  • 2 of them use Yeoman

The extent to which these automated toolsets are used varies pretty significantly. On smaller projects you tend to see file concatenation and minification, but the sky is the limit, as evidenced by this Gruntfile from Angular.js. The extensive amount of automated unit testing Angular does is pretty impressive as well.

I and others often contend that part of the problem that exists with accessibility on the web is the fact that it is seen as a distinctly separate process from everything else in the development process. Each task that contributes to the final end product impacts people’s ability to use the system. Accessibility is usability for persons with disabilities. It is an aspect of the overall quality of the system, and a very large part of what directly impacts accessibility is purely technical in nature. The apparent promise made by automated accessibility testing tool vendors is that they can find these technical failings. Historically, however, they’ve harmed their own credibility by being prone to the false positives I discussed in the second post in this series. Finding technical problems is one thing. Flagging things that aren’t problems is another.

Automated accessibility testing can be done effectively, efficiently, and accurately and with high benefit to the organization. Doing so requires two things:

  • It is performed by the right people at the right time. That is, it must be done by developers during their normal automated processes.
  • The tools stop generating inaccurate results. Yes, this means that perhaps we need to reduce the overall number of things we test for.

It may seem somewhat non-intuitive to state that we should do less testing with automated tools. The thing is, the state of web accessibility in general is rather abysmal. As I get ready for the official release of Tenon, I’ve been testing the homepages of the most popular sites listed in Alexa. As of this writing, Tenon has tested 84,956 pages and logged 1,855,271 issues. Among the most interesting findings:

  • 27% of issues relate to the use of deprecated, presentational elements or attributes
  • 19% of issues are missing alt attributes for images
  • 10% of issues are data tables with no headers
  • 5% of issues relate to binding events to non-focusable elements.
  • 2% of issues relate to blank link text (likely through the use of CSS sprites for the link)

85,000 tested pages is a large sample. At that size, the estimates carry a high confidence level and a narrow confidence interval. In fact, it is more than enough.
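
As a rough check on the sample size, the usual margin-of-error formula for a proportion can be worked through. This is a sketch under one loud assumption: it treats the tested pages as a simple random sample, which a crawl of popular homepages is not, so read the result as an order of magnitude only. The 1.96 multiplier corresponds to a 95% confidence level, and 0.5 is the worst-case proportion.

```javascript
// Margin of error for a proportion, assuming a simple random sample.
// z = 1.96 corresponds to a 95% confidence level; proportion = 0.5 is the worst case.
function marginOfError(sampleSize, proportion = 0.5, z = 1.96) {
  return z * Math.sqrt((proportion * (1 - proportion)) / sampleSize);
}

const pagesTested = 84956; // figure quoted in the text
const margin = marginOfError(pagesTested);
console.log((margin * 100).toFixed(2) + ' percentage points');
```

For 84,956 pages this comes to roughly a third of a percentage point either way, which is why a larger sample would add very little.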

There is an average of 54 definitively testable issues per page on the web. These are all development-related issues that could be caught by developers if they tested their work prior to deployment. Developers need a toolset that lets them avoid these high-impact issues up front. This is the promise of Tenon.

In Part 4 I’ll talk about our need to move away from standalone, monolithic toolsets and toward integrating more closely with developers’ workflows.

Woodshop tour

I posted this to Facebook but wanted to share on my site, too. This is where I spend my weekends when it is cold outside:

Note: alt attribute on each image is blank. Visible text under the image describes the image.

Looking into the entrance-way. Drill press and lathe straight ahead. Chalkboard paint along the left wall. Dust collection system viewable overhead. Pipe clamps clamped to support beam.

View from right past the entrance-way. Glue-ups in progress on my work table. There’s no such thing as too many clamps. Ahead and to the left you can see the band saw. To the left of that is my crappy little router table. Dust collection system also in view.

View straight ahead past entrance way. Drill press right in front. Just past that is the lathe with a rosewood bowl blank mounted. Further ahead is my new grinder. Various supplies are on the shelves along the wall. The lower shelf is all finishing supplies such as wipe-on poly, glue, sandpaper, etc. while the upper shelf is mostly misc. On the floor ahead are various scraps of wood. Most scraps are thrown away but I occasionally save stuff that may be useful later, such as for experimenting with a joint before doing the final piece.

View of the side of the room that has the drill press and lathe. The lathe is a 42-inch Delta Rockwell. Right behind the lathe is a dust collection box. Unfortunately my dust collector doesn’t have enough horsepower to make the box useful. On top of the dust collection box is a DIY air filter powered by a high-powered computer fan. To the left of that is another box that holds drills and various drill-related stuff like a Kreg jig, drill bits and forstner bits. Not shown: On this side, the workbench is actually a cabinet. Inside the cabinet are 6 glass carboys fermenting beer.

Corner of the room by the lathe. Parts bin on the wall. Shelves with finishing supplies, sharpening supplies, and sanding supplies as well as two grinders on the bench. Dust collection hoses along the top.

Dust collection hoses are also prominent in this picture as is the band saw.

Back wall showing 12-inch compound miter saw. Behind that is pegboard wall holding various tools. Hammers, chisels, screw drivers, files, pliers, and more. A shelf holds various router-related items.

Right corner of the back wall, from a little greater distance. Shows router table, router bits and various router related items. This is also where hand saws are stored as well as safety related stuff like safety glasses, face shield, and air masks. Underneath the router table is a wet-dry vac. It doesn’t get much use now that I have the dust collector, but this is such a good place to store it. On the side of the work bench is a pencil sharpener.

The “back room” of the shop holding more than 200 board-feet of walnut, about 100 board feet of cedar, and 5 walnut slabs. Some other misc. pieces are shown such as a right-angle jig, a spline jig, table saw miter jig, and box-joint jig. Barely in the foreground is a jointer.

View from the very back looking in toward the entrance. Upper left shows a filtered box fan. On the lower left is a new table saw, and beyond that is a downdraft sanding table. Like the dust collection box, the downdraft sanding table isn’t as useful as it could be because the dust collector doesn’t really have enough oomph. On the right side foreground is the jointer. Further ahead is the bandsaw on the right and beyond that is the worktable. On a shelf under the worktable are my 13-inch Dewalt planer, Bosch circular saw, and Porter Cable Dovetail jig. Eventually I’ll have to make a stand for the planer. It isn’t a big deal to pick it up and down right now but when I’m older it’s gonna be difficult, for sure, because the damn thing is heavy.

Special closeup view of my table saw. While I got a lot of use out of my Dewalt portable table saw, this thing is a thousand times more useful. Behind the table saw are a ton of empty jars ready for Jennifer Groves to put food in this summer!

Not really shown elsewhere in this photo album is a “closet”. This is the other side of the wall on the right of the entranceway. Inside of this room is a dust collector, shown prominently in this picture. To the lower left in this picture is a dust separator which basically separates the big chips before they make their way into the dust bag. Under the dust collector but not shown is a small air compressor.