Karl Groves

Tech Accessibility Consultant
  • Web
  • Mobile
  • Software
  • Hardware
  • Policy
+1 443.875.7343

Accessibility Consulting is Broken

I’ve had an epiphany. Accessibility Consulting, that process where a client hires us to go through their system, test it for accessibility issues, and submit a report to them, is fundamentally broken. My personal interpretation of our goal, as professionals, is to make money doing Good. Our advanced knowledge, skills, and experience can and should command higher fees while allowing us to do a greater amount of good, like a snowball of awesomeness.

The client hires the consultant to help ease pain. That pain may have an array of causes and may have a varying degree of acuteness, but it always has the same root cause: ignorance. Organizationally there is systemic ignorance surrounding the topic of accessibility. The client understands neither what accessibility is nor how to manage it, and that results in a failure to effectively modify their ICT development or procurement processes to ensure an accessible outcome. Recognizing this, they turn to the services of a consultant. In all likelihood the resulting engagement involves an audit of the client’s problematic ICT system(s), the deliverable for which is a report outlining the findings of this audit process.

No matter how detailed and no matter how perfect the guidance, the delivery of the report fails to ease the customer’s pain. It does not directly address either the symptoms or the cause of the disease. In fact, the more extensive the testing and the greater its scope, the higher the likelihood that the client will be paralyzed by the magnitude of what they’ve been told. This paralysis is often made worse in cases where other technical debt has been incurred through bad architectural choices and longstanding legacy front-end code.

During in-person training, I use the following story to illustrate this paralysis:

Two years ago, my wife and I decided to make a large number of renovations to our house:

  • Paint all 3 bedrooms, including ceilings
  • Paint the great room
  • Create custom closet shelving
  • Replace badly stained wood floors in hallway
  • Sand, stain, and refinish wood floors in all 3 bedrooms
  • New chair molding, baseboard moldings, and wood trim throughout the house
  • Replace wood floor in the great room
  • New stairs

Excited by the new beauty I envisioned for our house, I dove right in and started the necessary demolition. I ripped out all of the carpet that was covering the wood floors, removed all the old molding, and pulled up the wood floor in the hallway. That last task was quite easy; I even staggered the removal of the boards so that the new boards would blend in without looking like they were replaced. Then the paralysis started.

Demolition complete, I was faced with the exact extent of what was ahead of me. Everywhere I went in the house I was reminded of everything I had to do to finish it – some of which I really had no clue how to do. For instance, properly installing the new wood floor so it blended into the existing wood floor was far beyond my existing skill set. It scared me and, as a dependency for so many other things on the list, I knew I had to do it but had no idea where to start. So I didn’t start on it. I kept a long list of excuses prepared for why I couldn’t do it.

In hindsight, the real reason I didn’t dive right in to start the work was that I viewed the work ahead of me as a single massive job: renovate the house. It wasn’t until I changed my outlook and saw the work as a series of small, distinct tasks that I could begin to tackle it. This is the same type of overwhelmed feeling clients tend to get when they’re delivered a huge accessibility audit report. The lower their existing willingness to address accessibility and/or the higher their level of pain & distress (such as threat of litigation), the more likely and more severely they’ll feel paralyzed by the bad news in the report.

Ideally, the client would take the report, read it in its entirety, absorb the excellent guidance contained therein, and jump in with both feet to start fixing their broken system. The consultant feels that the report is more than a report. It is a learning document. When the final spelling and grammar check has been run in Microsoft Word and the document saved to its final version, the consultant proudly ships it to the client. Idealistically, the consultant thinks their masterful wording, illustrative screenshots, and skillful code examples will trigger a revolution in the management and development of more accessible solutions by the client. Unfortunately, the more likely outcome is confusion. None of the client’s pain is alleviated. Their long-term effectiveness and success with accessibility is not improved and, at best, the report becomes the basis for a series of new issues in the client’s issue tracking system. In practice, the issue reports created by the client often lack the level of detail needed for the issues to be properly and expeditiously repaired, further reducing the usefulness of the report.

How do we ease the client’s pain and ensure the client is successful in improving their system(s) and reducing their risk? If the delivery of the audit report doesn’t do it and the client’s repurposing of the report’s content doesn’t do it, where does this leave us? How can we more effectively help ensure client success? By becoming directly involved in alleviating their pain. By becoming the client. Remember, the root cause of the client’s pain is ignorance. The more closely the consultant works with the client as an internal stakeholder, a subject matter expert, and a mentor throughout the development lifecycle, the more directly involved the consultant is in ensuring the client’s success. Integrated into the process as a member of the team, the consultant has direct access to help steer better process and practices. This is the exact opposite of what happens when simply throwing the report over the fence, as it were. The days of the comprehensive audit cum mammoth report should come to an end, replaced with actual guidance.

This guidance can take the form of generating and delivering internal use assets for procuring, designing, developing, and maintaining ICT systems or it can take the form of direct involvement in the development and QA processes. Let’s take, for example, the QA process since it so closely resembles the audit report process in spirit.

A client with even marginally mature development processes has some system for keeping track of bugs and feature requests. This may be as simple as a spreadsheet or as complex as a standalone system which keeps track of not only the issues and feature requests but also various additional metadata related to the issues and feature requests which assist in managing, tracking, and reporting. In any case where such a system doesn’t exist, the consultant’s first order of business should be to assist the client in choosing and standing up such a system. In either case, the consultant needs to have direct access to this system equivalent to that of any other team member.

It is in this issue tracking system that the consultant must do their work if they’re to be effective in facilitating actual improvement of the client’s systems. This is where the QA and development staff do their work, and it is appropriate that, as part of the team, the consultant work there too. Within a robust issue tracking system, the consultant can immediately log the issues they find and take part in the ensuing discussion among development staff regarding the nature, severity, and priority of each issue. Will it take a long time to fix? Will it be difficult? What are the dependencies? How will the user benefit? How will the business logic or presentation logic be affected? How can the repair be verified? These are among the many questions from the development staff that require the input and collaboration of the consultant. They also require a level of discussion not available in a long, one-sided report, no matter how detailed. Direct access to and use of the client’s issue tracker facilitates this seamless collaboration.

Merely replacing the mammoth report deliverable with direct issue logging obviously isn’t enough to address systemic ignorance of accessibility; rather, it eliminates a significant roadblock to accessibility in current systems. In practice, this is because the issues are seen only as a series of issues and not as a series of learning experiences. Testing of new work will show this to be true, as the consultant is likely to discover issues identical in nature to those they had already reported in the earlier test effort(s). As mentioned in the second paragraph, the root cause of customer pain is ignorance and, while short-term pain is more effectively addressed with direct issue logging, the long-term plan must aggressively address ignorance. This is the domain of training. All persons involved in the management, design, development, and content of ICT systems should be trained to understand the need for accessibility, the specific challenges a person with a disability faces in using ICT products & services, and how that staff person’s role impacts the deployment of accessible ICT. Through this role-based training the client staff person’s ignorance can be systematically eliminated. In the case of training, the more permanence the training materials have, the better – up to, and including, LMS and/or video-based materials that can be used as part of an onboarding process for new employees.

Last among the mechanisms the consultant should employ to address ignorance is the generation of internal assets to be used in the procurement, design, and development of ICT systems. This should include things like policies and procedures, style guides, checklists, and other job aids to be used by management and staff. These assets should serve as guidance and reference materials, success criteria, and performance measures whenever new ICT work is undertaken or existing ICT systems are improved.

The days of the monolithic accessibility audit report are numbered, as it is an outdated medium that fails to directly address the actual problems faced by the consultant’s clients. Clients, often driven by pain rooted in ignorance, want and deserve a more direct and proactive approach to solving the root causes of that pain. The prescriptive nature of an audit report should give way to the close involvement and leadership of a skilled consultant.

I’m available for accessibility consulting, audits, VPATs, training, and accessible web development, email me directly at karl@karlgroves.com or call me at +1 443-875-7343

Measuring the harm of flawed academic papers

For several years I’ve been interested in finding and reading academic work in the field of web accessibility. I have a very strong belief that the things we say regarding web accessibility must be based on a significant amount of rigor, and I hold in higher esteem those who base their statements on fact rather than opinion and conjecture. Unfortunately, I often find much of the academic work in web accessibility to be deficient in many ways, likely caused by a lack of experiential knowledge of the professional web development environment. Web development practices change at such a lightning-fast pace that even professional developers have trouble keeping up with what’s new. Academics who themselves aren’t really developers in the first place are likely to have even greater trouble understanding not only the causes of accessibility issues in a web-based system but how to test for those causes. I deal specifically with those topics 8-10 hours a day and sometimes I still have to turn to coworkers for advice and collaboration.

This matters because out-of-date knowledge and experience leads to research methods that are also out of date. The most obvious evidence of this is when web accessibility researchers perform automated testing with tools that are out of date and/or technically incapable of testing the browser DOM. Testing the DOM is a vital feature for any accessibility testing tool, especially one used in academic research, because the DOM is what the end user actually experiences. It matters even more when studying accessibility because the DOM is what gets interpreted by the accessibility APIs, which pass information about content and controls to the assistive technology employed by the user. Performing research with a tool that does not test the DOM is like measuring temperature with a thermometer you know to be broken. You have no chance of being accurate.
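
To make the distinction concrete, here is a contrived sketch (element names and the class are mine) of a page whose static source would satisfy a source-only checker while the DOM delivered to the user, and thus to the accessibility APIs, tells a different story:

```html
<!-- The static source contains a real, accessible button: -->
<button id="save">Save</button>

<script>
  // On load, a script swaps the button for a click-handling <div>.
  // A tool that only parses the source still "sees" the <button>;
  // the user's DOM, and therefore their assistive technology, does not.
  var btn = document.getElementById('save');
  var div = document.createElement('div');
  div.className = 'pretty-button'; // hypothetical class
  div.textContent = 'Save';
  div.onclick = function () { /* save logic */ };
  btn.parentNode.replaceChild(div, btn);
</script>
```

A DOM-aware tool evaluates the page after scripts have run, so it sees the inaccessible `<div>` the user actually gets.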

Recently I’ve been reading a paper titled “Benchmarking Web Accessibility Evaluation Tools: Measuring the Harm of Sole Reliance on Automated Tests”. This compellingly titled paper fails to show any instances of “sole reliance” on automated tests and further it fails to demonstrate where such sole reliance caused actual “harm” to anyone or anything. Instead, the paper reads as if it was research performed to validate a pre-determined conclusion. In doing so, the paper’s authors missed an opportunity at a much more compelling discussion: the vast performance differences between well-known accessibility testing tools. The title alludes to this, saying “Benchmarking Web Accessibility Evaluation Tools” and then proceeds to instead focus on these ideas of “harm” and “sole reliance” while using bad results from bad tools as its evidence.

This point – that testing with automated tools alone is bad – is so obvious that it almost seems unnecessary to mention. I’ve worked in accessibility and usability for a decade, and many of those years were as an employee of companies that make automated testing tools. I’ve also developed my own such tools and count among my friends those who have also developed such tools. Not once do I recall the employees, owners, or developers of any such tools claiming that their automated testing product provides complete coverage. Training materials delivered by SSB BART Group and Deque Systems disclose clearly that automated testing is limited in its capability to provide coverage of all accessibility best practices. So, if “sole reliance” on automated testing is actually an issue, a better title for this paper would be “Measuring the Harm of Incomplete Testing Methodologies.” Instead, the reader is presented with what amounts to an either-or proposition through constant mention of the things that the automated tools couldn’t find vs. what human evaluators found. Thus the paper implies that either you use an automated tool and miss a bunch of stuff or you have an expert evaluate it and find everything.

This implication begins even in the first paragraph of the Introduction by stating:

The fact that webmasters put compliance logos on non-compliant websites may suggest that some step is skipped in the development process of accessible websites. We hypothesise that the existence of a large amount of pages with low accessibility levels, some of them pretending to be accessible, may indicate an over-reliance on automated tests.

Unfortunately, nowhere else in the paper is any data presented that suggests the above comments have any merit. The fact that “…webmasters put compliance logos on non-compliant websites” could mean the sites’ owners are liars. It could mean the site was at one time accessible but something changed to harm accessibility. It could mean the sites’ owners don’t know what accessibility means or how to measure it. In fact, it could mean almost anything. Without data to back it up, it really means nothing and is certainly no more likely to be evidence of “over-reliance on automated tests” as it is any of the other possibilities. Instead the reader is left with the implied claim that it is this “over-reliance on automated tests” that is the culprit.

Further unproved claims include:

With the advent of WCAG 2.0 the use of automated evaluation tools has become even more prevalent.

This claim is backed up by no data of any kind. The reader is given no data from surveys of web developers, no sales figures of tools, no download numbers of free tools, not even anecdotal evidence. Instead, it continues:

In the absence of expert evaluators, organizations increasingly rely on automated tools as the primary indicator of their stated level.

And again no data is supplied to substantiate this claim. In fact, my empirical data, gained from dealing with over seven dozen clients over the last decade, suggests that organizations often don’t do any testing of any kind, much less automated testing. These organizations also tend to lack any maturity of process regarding accessibility in general, much less accessible development, manual accessibility testing, or usability testing. My experience is that organizations don’t “do accessibility” in any meaningful way, automated or not. The true smoking gun, as it were, for this so-called harm by “sole reliance” on automated testing could be produced simply by supplying the reader with actual data surrounding the above claim. It is not supplied, and there is no evidence that such data was even gathered.

Another issue with this paper is its nearly myopic discussion of accessibility as a topic concerned only with users who are blind. The most egregious example comes in the claim, referencing prior work (from 2005), that “Findings indicate that testing with screen readers is the most thorough, whilst automated testing is the least”. Later the paper states that during the expert evaluation process, “If no agreement was reached among the three judges a legally blind expert user was consulted.” While this is followed by a claim that this person is also a web accessibility expert, the paper states that “This protocol goes further and establishes a debate between judges and last resort consultation with end users.” I don’t consider the experience of a single blind user to be the same as “users”, and I further do not consider it likely that this single expert user’s opinion would represent the broad range of other blind users, much less all users with all disabilities. In the United States, the overall rate of disability for vision impairment and hearing impairment is roughly equal, while those with mobility impairments are more than double both of those combined. Cognitive disabilities account for a larger population than the previous three types combined. Clearly the opinion, however skilled, of a single person who is blind is in no way useful as a means of measuring the accessibility of a website for all users with disabilities.

Further problems with the expert evaluation have to do with the ad-hoc nature of the expert evaluation process:

The techniques used to assess the accessibility of each web page are diverse across judges: evaluation tools that diverge from the ones benchmarked (WAVE2), markup validators, browser extensions for developers (Firebug, Web Developer Toolbar, Web Accessibility Toolbar), screen readers (VoiceOver, NVDA) and evaluation tools based on simulation such as aDesigner[24]

The above passage betrays two rather significant flaws in both the paper itself and the evaluation process. The first is the rather humorous irony that some of the tools listed are by their nature automated testing tools. Both the WAVE web service and the WAVE toolbar provide visual representations of automated test results for the page being tested. Markup validators are automated evaluation tools which access the code and automatically assess whether the markup itself is valid. In other words, the expert evaluation process used automated tools. In doing so, it demonstrates the very point that no skilled evaluator would rely solely on the results from automated tools. Adding to the irony, there is no discussion of any evaluation methods other than testing with screen readers, which further supports my argument that this paper has a myopic focus on blindness. The second and more important flaw is that there appears to have been no predefined methodology in place for the evaluation. Instead it appears to be assumed either that the reader will trust that the reviewers’ expertise speaks for itself or that a rigorous methodology is unnecessary. Regardless of why, the fact that the paper doesn’t supply a detailed description of the expert evaluation methodology is cause to question the accuracy and completeness of, at the very least, the results of that evaluation.

If the purpose of the paper is to evaluate what is found by machines measured against the results uncovered by expert evaluators, then it is critical that the human evaluation methods be disclosed in explicit detail. Based on the information provided, it would appear that the expert evaluation happened in a much more ad hoc fashion, with each expert performing testing in whatever fashion they deemed fit. The problem with this practice is that regardless of the level of expertise of the evaluators, there will always be differences in what & how the testing was done. The importance of this cannot be overstated. This is a frequent topic of discussion at every accessibility consulting firm I’ve worked for. The number and kind(s) of problems discovered can vary significantly depending upon who does the testing, and the looser the consulting firm’s methodology (or lack thereof in some cases), the more variance in what is reported. In fact, at a previous job one client once remarked, “I can tell who wrote which report just based on reading it”. This, to me, is a symptom of a test methodology that lacks rigor. On the upside, the paper does describe a seemingly collaborative workflow where the judges discuss the issues found, but this is still not the same as having and following a predefined, rigorous methodology. The judges’ collaboration would have been strengthened even further by the presence of such a methodology.

In this same section on Expert Evaluations, the paper states that “Dynamic content was tested conducting usability walkthroughs of the problematic functionalities…” and yet the process of conducting these “usability walkthroughs” is not discussed. The paper does not state how many participants (if any) took part in these usability walkthroughs and does not disclose any details on any of the participants, their disabilities, their assistive technologies, and so on. Again, the reader is expected to assume this was performed with rigor.

Exacerbating the above, the paper does not provide any details on what the expert evaluation discovered. Some of this data is posted at http://www.markelvigo.info/ds/bench12 but the data provided only discloses raw issue counts and not specific descriptions of what, where, and why the issues existed. There is also no discussion of the severity of the issues found. While I realize that listing this level of detail in the actual paper would be inappropriate, sharing the results of each tool and the results of each expert evaluator at the URL mentioned above would be helpful in validating the paper’s claims. In fact, the expert evaluation results are invalidated as a useful standard against which the tools are measured by the statement that:

Even if experts perform better than tools, it should be noted that experts may also produce mistakes so the actual number of violations should be considered an estimation…

If the experts make mistakes, and the likelihood of such mistakes is so high that “…the actual number of violations should be considered an estimation…”, then the results discovered by these persons are in no way useful as a standard for the subsequent benchmarking of the tools. Remember, the purpose of this paper is to supply some form of benchmark. You can’t measure something against an inaccurate benchmark and expect reliable or useful data.

The description of the approach for automated testing does not disclose the specific versions of each tool used or the dates of the testing. The paper also does not disclose what level of experience the users of the tools had with the specific tools or what, if any, configuration settings were made to the tool(s). The tool version can, at times, be critical to the nature and quality of the results. For instance, Deque’s Worldspace contained changes in Version 5 that were significant enough to make a huge difference between its results and those of its predecessor. Similarly, SSB BART Group’s AMP is on a seasonal release schedule which has in the past seen big differences in testing. Historically, automated testing tools are well known for generating false positives. The more robust tools can be configured to avoid or diminish this, but whether this was done is not known. Not disclosing details on the testing tools makes it difficult to verify the accuracy of the results. Were the results they found (or did not find) due to flaws in the tool(s), flaws in the configuration of the tools, or flawed use of the tools? It isn’t possible to know whether any of these factors influenced the results without more details.

To that point, it also bears mentioning that some of the tools used in this paper do not test the DOM. Specifically, I’m aware that TAW, TotalValidator, and aChecker do not test the DOM. SortSite and Worldspace do test the DOM, and it is alleged that the latest versions of AMP do as well. This means that there is a built-in discrepancy between what the tools employed actually test. This discrepancy quite obviously leads to significant differences in the results delivered and, considering the importance of testing the DOM, calls into question the reason for including half of the tools in this study. On one hand it makes sense to include popular tools no matter what, but on the other hand using tools that are known to be broken sets up a case for a pre-determined conclusion to the study. This skews the results and ensures that more issues are missed than should be.

The numerous flaws discussed above do not altogether make this paper worthless. The data gathered is very useful in providing a glimpse into the wide array of performance differences between automated testing tools. The issues I’ve discussed above certainly invalidate the paper’s claim that it was a “benchmark” study, but it is nonetheless compelling to see differences between each tool, especially in discussing that while a tool may out-perform its peers in one area, it may under-perform in other ways even more significantly. The findings paint a picture of an automated testing market where tool quality differs in wild and unpredictable ways which a non-expert customer may be unprepared to understand. Unfortunately, the data that leads to these various stated conclusions isn’t exposed in a way that facilitates public review. As mentioned, some of this data is available at http://www.markelvigo.info/ds/bench12. It is interesting to read the data on the very significant disparities between the tools, and also sad that it has to be presented in a paper that is otherwise seriously flawed and obviously biased.

An unbiased academic study into the utility and best practices of automated testing is sorely needed in this field. I’ve attempted my own personal stab at what can be tested and how, and I stand by that information. I’ve attempted to discuss prioritizing remediation of accessibility issues. I’ve recommended a preferred order for different testing approaches. At the same time, none of this is the same as a formal academic inquiry into these topics. We’ll never get there with academic papers that are clearly driven by bias for or against specific methodologies.


Markel Vigo has updated the URL I’ve cited at which you can find some of the data from the paper with a response to this blog post. Like him, I encourage you to read the paper. In his response, he says:

We do not share our stimuli used and data below by chance, we do it because in Science we seek the replicability of the results from our peers.

My comments throughout this blog post remain unchanged. The sharing of raw issue counts isn’t enough to validate the claims made in this paper. Specifically:

  1. There is no data to substantiate the claim of “sole reliance” on tools
  2. There is no data to substantiate the claim of “harm” done by the supposed sole reliance
  3. There is no data shared on the specific methodology used by the human evaluators
  4. There is no data shared regarding the exact nature and volume of issues found by human evaluators
  5. There is no data shared regarding the participants of the usability walkthroughs
  6. There is no data shared regarding the exact nature and volume of issues found by the usability walkthroughs
  7. There is no information shared regarding the version(s) of each tool and specific configuration settings of each
  8. There is no data shared regarding the exact nature and volume of issues found by each tool individually
  9. There is no data shared which explicitly marks the difference between exact issues found/ not found by each tool vs. human evaluators

It is impossible to reproduce this study without this information.

In his response, Markel states that this blog post makes “…serious accusations of academic misconduct…”. I have no interest in making any such accusations against any person. Throughout my life I’ve apparently developed the rare ability to separate the person from their work. I realize that my statement about this paper’s bias can be interpreted as a claim of academic misconduct, but that’s simply an avenue down which I will not travel. Markel Vigo has contributed quite a bit to the academic study of web accessibility, and I wouldn’t dare accuse him or the other authors of misconduct of any kind. Nevertheless, the paper does read as though it were research aimed at a predetermined conclusion. Others are welcome to read the paper and disagree.

Finally, the response states:

Finally, the authors of the paper would like to clarify that we don’t have any conflict of interest with any tool vendor (in case the author of the blog is trying to cast doubt on our intentions).

Let me be clear to my readers: Nowhere in this blog post do I state or imply that there’s any conflict of interest with any tool vendor.



Recently, I saw someone tweet that “…ARIA should be last” when working to make a website accessible. As you learn in Logic 101, generalized statements are almost always false. Such a broad statement, though mostly correct at least in spirit, is wholly incorrect in certain situations. ARIA is clearly the right choice in cases where native semantics do not exist at all or are poorly supported. Still, there are some parts of ARIA that I think are just plain silly and ill-advised – namely, roles which are intended to behave exactly like elements in native HTML.

role=button

There was a time when creating pseudo-buttons, like a link styled to look like a button, made sense, because styling the <button> element was incredibly difficult. These days that’s not the case. As I understand it, any browser that supports the ‘button’ role will also reliably support CSS on the <button> element, making the use of this role pretty silly.
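
A quick sketch of the comparison (the class name is invented): both can be styled to look identical, but only one is focusable, keyboard-operable, and announced as a button with no extra work:

```html
<!-- The pseudo-button: needs role, tabindex, and scripted
     keyboard handling to approximate the real thing: -->
<a href="#" role="button" tabindex="0" class="btn">Save</a>

<!-- The native element: focusable, operable with Enter and Space,
     and exposed as a button by default. It also takes CSS happily: -->
<button type="button" class="btn">Save</button>
```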

role=heading

I’m completely unable to find a use case for the ‘heading’ role. The heading role, as the name implies, can function as a substitute for <h1>, <h2>, etc., and the WAI-ARIA spec says, “If headings are organized into a logical outline, the aria-level attribute can be used to indicate the nesting level.” In other words, you could do something like this:

<div role='heading' aria-level='3'>Foo</div>

I cannot imagine a scenario where this is at all a suitable alternative to HTML’s native heading elements. It is far more markup than necessary and, I suspect, more prone to errors by uninformed devs.


role=link

This is another role that is ripe for developer error. Actual links – that is, an <a> element with an href attribute pointing to a valid URI – have specific methods and properties available to them, as I described in an earlier post titled Links are not buttons…. Adding a role of ‘link’ on something that is not a link now requires you to ensure that your code behaves the same way as a link. For instance, it should be in the tab order, should react to the appropriate events via keyboard, and should actually navigate to a new resource when acted upon. These are all things an actual, properly marked up link can do, making this role silly as well.
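To illustrate the extra work, here is a rough sketch of what a span with a role of ‘link’ would need before it even approaches parity with a real <a href>. The id, URL, and handlers are purely illustrative:

```html
<!-- Illustrative only: everything below comes for free with <a href="…"> -->
<span role="link" tabindex="0" id="fauxLink">Read more</span>
<script>
  var fauxLink = document.getElementById('fauxLink');
  // Must navigate on click, like a real link
  fauxLink.addEventListener('click', function () {
    window.location.href = '/read-more'; // placeholder URL
  });
  // Must also respond to the Enter key, like a real link
  fauxLink.addEventListener('keydown', function (event) {
    if (event.keyCode === 13) {
      window.location.href = '/read-more';
    }
  });
</script>
```

Even this sketch omits things a real link does for free, such as appearing in the browser’s link lists, supporting open-in-new-tab, and exposing its target URL to assistive technologies.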

role=list / role=listitem

Given the WAI-ARIA descriptions of the list Role and listitem Role, I can’t see anything that these roles offer that can’t be handled by plain HTML. The former is described as “A group of non-interactive list items” while the latter is “A single item in a list or directory.” In other words, these things are the same as a regular ole HTML list.


role=radio

The radio Role is “A checkable input in a group of radio roles, only one of which can be checked at a time.” Of all of the roles listed here, this is the only one I could justify using. Unlike all of the other roles listed, the native element this replaces cannot be styled with much flexibility. It is far easier to style something else and give it a role of ‘radio’. At the same time I must admit to wondering: Why? At the risk of sounding like I’m against “design”, it just doesn’t seem worth it to forego the reliability of a native control just for visual styling purposes. There are several JavaScript libraries, jQuery plugins, and whole front-end frameworks aimed at the styling of forms, and almost universally they fail to meet accessibility requirements in at least one of the following ways:

  • The design itself has poor contrast
  • The styling doesn’t work in Windows High Contrast Mode
  • The styling would be incompatible with user-defined styles
  • The custom elements are not keyboard accessible or, at least, visual state change doesn’t work via keyboard

In the case of custom radio buttons, merely adding a role of ‘radio’ is not enough and the costs of doing it right should be strongly considered against the reliability and robustness of just using native radio buttons.
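As a sketch of those costs, a custom radio needs, at minimum, its role, a script-managed aria-checked state, roving tabindex, and arrow-key navigation between options. The markup below only hints at the work involved and omits the required scripting entirely; all names are illustrative:

```html
<!-- Illustrative sketch only. Each "radio" needs role, aria-checked,
     and roving tabindex kept in sync by script; arrow-key navigation
     between options must also be implemented by hand. -->
<div role="radiogroup" aria-labelledby="shipLabel">
  <span id="shipLabel">Shipping method</span>
  <div role="radio" aria-checked="true" tabindex="0">Standard</div>
  <div role="radio" aria-checked="false" tabindex="-1">Express</div>
</div>

<!-- Versus the native version, which needs none of that: -->
<fieldset>
  <legend>Shipping method</legend>
  <label><input type="radio" name="ship" checked> Standard</label>
  <label><input type="radio" name="ship"> Express</label>
</fieldset>
```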


Although the roles discussed above are, in my opinion, just plain silly in HTML, WAI-ARIA wasn’t created just for making HTML documents accessible. Ostensibly, it can be used for any web content and, in fact, the role attribute was added to SVG Tiny 1.2 all the way back in 2008. SVG would otherwise have no way of exposing the same name, state, role, and value information without ARIA, and it has been incorporated directly into SVG 2.
Meme: ARIA All The Things!
So on the topic of “Use ARIA first” vs. “Use ARIA last”, neither is right. The right answer is to use ARIA whenever ARIA is the best tool for the task at hand. That might be for a progressive enhancement scenario when the user’s browser doesn’t support a specific feature, or to enhance accessibility under certain use cases, or to create an accessible widget that doesn’t exist in native semantics. Blanket statements don’t help, but constructive guidance does.


The little button that could

The original issue

A link is used to perform the action of a button. The code below is used to show a region of the page which is hidden by default. Screen readers will announce this as a link, and users will expect that it will navigate. Instead, upon activation, focus remains on the link and it performs the action of a button.

<a id="x_123" class="niftyNuBtn" href="javascript:;">Do Something</a>

As a consequence, we recommend using an actual BUTTON element:

<button id="x_123" class="niftyNuBtn">Do Something</button>

The response

We can’t use a BUTTON because it would not look right with our CSS. Our stylesheets all reference this as a.niftyNuBtn. Why is this a problem anyway?

The follow-up

Well, there are two primary issues, the first of which is admittedly a little about semantic purity in that a button is a button and a link is a link. But there’s a bit more to it: users who can see an affordance which looks like a button will intuit immediately that it will (or should) behave like a button. And, were it to look like a link, they would intuit that it is a link. A user who cannot see, or whose vision is very poor, may be using an assistive technology which reads out the properties of an underlying object. In short, a BUTTON will be announced via text-to-speech as "button". A button’s behavior and a link’s behavior are distinctly different – a button initiates an action in the current context whereas a link changes the context by navigating to a new resource. In order to meet users’ expectations of how this affordance will perform, it should be a button.

The follow-up’s response

Our engineer said we can use WAI-ARIA for this. He said that we can give this thing a button role which will mean that JAWS will announce this as a button and that will alleviate your earlier concerns. So, how about this:

<a id="x_123" class="niftyNuBtn" role="button" href="javascript:;">Do Something</a>

Almost there, I think

Yes. This will cause aria-supporting assistive technologies to announce this link as a button. Unfortunately, there's the issue of focus management and this impacts more than just users who are blind. A link is understood to change focus to a new resource. Buttons may or may not change focus, depending on the action being performed. In this specific button's case, focus should stay on the button. At first glance, you may think that this pseudo-button is doing what it needs to be doing because you're keeping focus on the button when the user clicks it. That's true. What's also true is focus stays on it when you hit the enter key, which is also fine. Unfortunately, activating it with the spacebar causes the page to scroll. Users who interact with their computer using only the keyboard will expect that they can activate the button with the spacebar as well. Overall the best option is to just use a button.

Digging in

Crap, you're right. Our engineer added the button role and everything was great, but then I hit the spacebar and the page scrolled! How do we stop this?!?

Prevent Default

Actually, stopping the scrolling is pretty easy. You can use event.preventDefault() like so:
$('.niftyNuBtn').on('click keypress', function(event){
    if(event.type === 'click'){
        doSomething(); // whatever this button does; name is illustrative
    } else if(event.type === 'keypress'){
        var code = event.charCode || event.keyCode;
        if((code === 32) || (code === 13)){
            event.preventDefault(); // stops the spacebar from scrolling the page
            doSomething();
        }
    }
});
Keep in mind, you'll need to do this event.preventDefault(); on every instance where you have code that acts like a button.


The resolution

Turns out we've decided to use a button. All we needed to do was change a few CSS declarations. Thanks so much for the help.

Note: no, this isn't from a real client but actually reminiscent of multiple situations.


Everything you know about accessibility testing is wrong (Part 4)

…how many bigger issues have we missed wasting our time fixing this kind of crap? @thebillygregory

Literally every single audit report I’ve ever done includes issues relating to the following:

  • Missing alt attributes for images
  • Missing explicit relationships between form fields and their labels
  • Tables without headers or without explicit relationships between header cells and data cells

I also frequently find these others:

  • Use of deprecated, presentational elements and attributes
  • Events bound to items that are not discoverable via keyboard
  • Poor color contrast
  • Blank link text
  • Missing/ inaccurate/ incomplete name, state, role, and value information on custom UI widgets

The sheer volume of these types of errors is, to put it lightly, frustrating. In fact, the title of my presentation “What is this thing and what does it do” is actually born from an inside joke. During one audit where the system I was testing was particularly bad, I joked to some coworkers that analyzing the code was a bit like a game to figure out, “what is this thing and what does it do?”. I only later decided to put a positive spin on it.

As I mentioned in the previous post in this series, there is an average of 54 automatically detectable errors per page on the Internet. The thing about automated testing is that, even though it is somewhat limited in the scope of what it can find, some of the errors it does find have a pretty high impact on the user. Think about it: missing alt text for images and missing labels for form fields are a huge barrier for users. While the total number of accessibility best practices that are definitively testable by automated means is small, they tend to have a huge impact on whether people with disabilities can use the system.

Automatically detectable issues should never see the light of day

The reason why some people are against automated testing is that, for a long time, we in the accessibility world haven’t really understood where the testing belongs. Automated accessibility testing has long been regarded as a QA process and, even worse, it often exists as the only accessibility-related QA testing that occurs. If your approach to accessibility testing begins and ends with the use of an automated tool, you’re doing it wrong. This concept of automated-tool-or-nothing seems at times to be cooperatively perpetuated both by other tool vendors and by accessibility advocates who decry automated testing as ineffective. We must turn our back – immediately and permanently – on this either-or mentality. We must adopt a new understanding that automated testing has an ideal time & place where it is most effective.

Automated accessibility testing belongs in the hands of the developer. It must be part of normal development practices and must be regarded as part of the workflow of checking one’s own work. All developers do basic checking of their work along the way, be it basic HTML & CSS validation or checking that it displays correctly across browsers. Good developers take this a step further by using code inspection tools like JSLint, JSHint, PHP Mess Detector, PHP_CodeSniffer, and the like. In fact, IDEs like WebStorm, NetBeans, Aptana, and Eclipse have plugins that enable developers to do static code analysis. Excellent developers perform automated unit testing on their code and do not deploy code that doesn’t pass. What prevents accessibility from being part of this? Existing toolsets.

The revolution in workflow that will change accessibility

Last week I created a new WordPress theme for this site. I’m not the world’s best designer, but I hope it looks better than before. I created it from scratch using my Day One theme as a base. It also includes FontAwesome and BootStrap. I use Grunt to manage a series of tasks while building and modifying the template’s design:

  • I use grunt-contrib-sass to compile 11 different SASS files to CSS
  • I use grunt-contrib-concat to combine my JS files into one JS file and my CSS files into one CSS file
  • I use grunt-contrib-uglify to minify the JS file and grunt-contrib-cssmin to minify the CSS file
  • I use grunt-uncss to eliminate unused CSS declarations from my CSS file.
  • I use grunt-contrib-clean to clear out certain folders during the above processes to ensure any cruft left behind is wiped and that the generated files are always the latest & greatest
  • I use grunt-contrib-jshint to validate quality of my JS work – even on the Gruntfile itself.
  • I use grunt-contrib-watch to watch my SASS files and compile them as I go so I can view my changes live on my local development server.
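A trimmed-down sketch of a Gruntfile wiring tasks like these together might look like the following. The file paths and options are illustrative placeholders, not this theme’s actual configuration:

```javascript
// Illustrative Gruntfile sketch; paths and options are placeholders.
module.exports = function (grunt) {
  grunt.initConfig({
    sass:   { dist: { files: { 'build/style.css': 'scss/style.scss' } } },
    concat: { js:   { src: ['js/*.js'], dest: 'build/site.js' } },
    uglify: { js:   { files: { 'build/site.min.js': ['build/site.js'] } } },
    cssmin: { css:  { files: { 'build/style.min.css': ['build/style.css'] } } },
    jshint: { all:  ['Gruntfile.js', 'js/*.js'] },
    watch:  { sass: { files: ['scss/*.scss'], tasks: ['sass'] } }
  });

  grunt.loadNpmTasks('grunt-contrib-sass');
  grunt.loadNpmTasks('grunt-contrib-concat');
  grunt.loadNpmTasks('grunt-contrib-uglify');
  grunt.loadNpmTasks('grunt-contrib-cssmin');
  grunt.loadNpmTasks('grunt-contrib-jshint');
  grunt.loadNpmTasks('grunt-contrib-watch');

  // Running `grunt` lints first, then builds; a failure stops the chain.
  grunt.registerTask('default', ['jshint', 'sass', 'concat', 'uglify', 'cssmin']);
};
```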

All of my projects use Grunt, even the small Ajax Chat demo I’m giving at CSUN. Some of the projects do more interesting things. For instance, the Ajax Chat pulls down an external repo. Tenon automatically performs unit testing on its own code. When something goes wrong, Grunt stops and yells at you. You can even tie Grunt to pre-commit hooks. In such a workflow nothing goes live without all your Grunt tasks running successfully.

Imagine, an enterprise-wide tool that can be used in each phase, that works directly as part of your existing workflows and toolsets. Imagine tying such a tool to everything from the very lowest level tasks all the way through to the build and release cycles and publication of content. That’s why I created Tenon.

While Tenon has a web GUI, the web GUI is actually a client application of the real Tenon product. In fact, internally Asa and I refer to and manage Tenon as a series of different things: Tenon Admin, Tenon UI, and Tenon (the API). The real deal, the guts, the muscle of the whole thing is the Tenon API which allows direct command line access to testing your code. This is fundamental to what we believe makes a good developer tool. When used from the command line Tenon can play happily with any *nix based systems. So a developer can open a command prompt and run:

$ tenon http://www.example.com

and get results straight away.

By using Tenon as a low-level command, it becomes possible to integrate your accessibility testing into virtually any build system, such as make, bash, ANT, Maven, etc. As I mentioned above, one possibility is to tie Tenon to a git pre-commit hook, which would prevent a developer from committing code that could not pass Tenon’s tests. Like JSHint, you can customize the options to match your local development environment and the level of strictness to apply to such a pre-commit hook.
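As a sketch of that pre-commit idea, assuming the tenon command shown earlier is on the PATH and exits non-zero when it finds errors (the URL is a placeholder):

```shell
#!/bin/sh
# .git/hooks/pre-commit -- a hypothetical sketch, assuming a `tenon`
# CLI is on the PATH and returns a non-zero exit code on failure.
# Blocks the commit if Tenon reports accessibility errors.

STAGING_URL="http://localhost:8080/"   # illustrative local dev URL

if ! tenon "$STAGING_URL"; then
  echo "Accessibility errors found; commit aborted." >&2
  exit 1
fi
```

Because git hooks run before the commit object is created, failing code never enters the repository’s history in the first place.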

A typical workflow with Tenon might look a bit more relaxed for, say, a front-end developer working on a CMS and using Grunt to compile SASS to CSS and minify JS. Since Tenon is a Node.js module, we will be introducing a Grunt plugin. Once grunt-tenon is introduced into your Gruntfile.js file, you can add grunt-contrib-watch to watch your work. Every time you save, Grunt will perform your normal tasks and test the page you’re working on for accessibility.

Processing: http://www.example.com
Options: {"timeout":3000,"settings":{},"cookies":[],"headers":{},"useColors":true,"info":true}
Injecting scripts:

>>  /var/folders/mm/zd8plqb15m38j4dzf3yf9pjw0000gn/T/1394733486320.647
>>  client/assets.js
>>  client/utils.js

Errors: 10
Issues: 10
Warnings: 0
Total run time: 3.27 sec

The same Gruntfile can also be run on your Jenkins, Travis-CI, or Bamboo build server. Let’s say we’re using Jira for bug tracking and have it connected to our Bamboo build server. A developer on our team makes an accessibility mistake and commits that mistake with a Jira key — ISSUE-1234 — into our repo. As part of the Bamboo build, Tenon will return the test results in JUnit format. The Bamboo build will fail, and we can see in Jira that the commit against ISSUE-1234 was the cause of the red build. It will link directly to the source code in which the error originated. Because we’re using a CI build system, from our developers’ standpoint all of this can happen many times a day without requiring anything more than a simple commit!

Proper management of accessibility necessitates getting ahead of accessibility problems as early as possible, and effectively there is no earlier point than before the code is committed. As a pre-commit hook or, at least, as a Grunt task run before committing, accessibility problems are caught before they enter the codebase. Automated testing is not the end, but the beginning of a robust accessibility testing methodology.

The next post is the last post in this series, where we’ll put it all together.


Looking forward to CSUN 2014

I’m currently wrapping up the rest of my work for the week and getting ready for the annual pilgrimage to San Diego for the International Technology and Persons with Disabilities Conference, otherwise known as “CSUN”. Unlike previous years, I have relatively few presentations. I’m glad about that, really, because it means I can spend more time meeting people. If this is your first year at CSUN, you should read John Foliot’s CSUN For Newbies.

Preconference Workshop

On Tuesday, March 18, 2014, at 1:30 PST Billy Gregory and I will be assisting Steve Faulkner and Hans Hillen in a Pre-Conference Workshop titled “Implementing ARIA and HTML5 into Modern Web Applications (Part Two)”.

My Presentations

  1. Thursday, March 20, 2014 – 3:10 PM PST
    Roadmap For Making WordPress Accessible WordPress Accessibility Team members demonstrate progress and challenges and a roadmap for making WordPress accessible. Location: Balboa B, 2nd Floor, Seaport Tower
  2. Friday, March 21, 2014 – 1:50 PM PST
    No Beard Required. Mobile Testing With the Viking & the Lumberjack – Testing Mobile accessibility can be as daunting as it is important. This session will demystify and simplify mobile testing using live demonstrations and straightforward techniques. Location: Balboa A, 2nd Floor, Seaport Tower

Demonstrations of Tenon

If you’re interested in finding out more about Tenon, email me or just stop me in the hall and I’ll give you a demo.

If you’re going to CSUN I want to meet you

I love CSUN’s family-reunion-like atmosphere and getting to catch up with the many people I already know. But what I like more is meeting people I hadn’t already met. If you’re new to accessibility or we just don’t know each other yet, please just walk up and say hello. This is how I met many of the people I count among my best friends in accessibility!

Something more formal?

If you want to set up something more formal, especially for a one-on-one conversation, I strongly recommend emailing me directly. Typically what happens is that something intended to be a simple informal one-on-one get together winds up being a big group outing, so if you want to set up a private time to talk, here are some ideas.

  • Morning – I’m available all week before 8am. I’m open Tuesday and Thursday before 9.
  • Afternoon – As the day gets later, openings get more scarce. I’m currently open for lunches all week.
  • Evening – Evenings are often filled with impromptu group activities, so I won’t schedule something during the evening.

So, given the above, email me at karl@karlgroves.com to set something up!


Everything you know about accessibility testing is wrong (Part 3)

In the previous post in this series, I ended with the statement that “current automatic accessibility testing practices take place at the wrong place and wrong time and are done by the wrong people,” but really this applies to all accessibility testing. Of course every organization is different, but my experience substantiates the statement quite well. The “…by the wrong people” part is especially true. The wrong people are QA staff.

While QA practices vary, one nearly universal trait among QA staff is that they lack any training in accessibility. Further, they often lack the technical skill necessary to skillfully decipher the reports generated by automated tools. When you combine their inexperience in both accessibility and development, you’re left with significant growing pains when you thrust an automated testing tool at them. As I’ve said in previous posts, these users will trust the automated tool’s results implicitly. Regardless of the quality of the tool, this increases the opportunity for mistakes, because there are always limitations to what can be found definitively, and it is very likely that some interpretation is needed. There are also things that are too subjective or too complex for an automated tool to catch.

Irrespective of tool quality, truly getting the most out of an automated web accessibility tool requires three things:

  • Technical knowledge in that which is being tested
  • Knowledge and understanding of the tool itself
  • Knowledge around accessibility and how people with disabilities use the web

The first two points above apply to any tool of any kind. Merely owning a bunch of nice tools certainly hasn’t made me an expert woodworker. Instead, my significant investment in tools has allowed me to make the most of what little woodworking knowledge and skill I have. But if I had even more knowledge and skill, these tools would be of even more benefit. Even the fact that I have been a do-it-yourselfer since I was a child helping my dad around the house only helps marginally when it comes to a specialized domain like fine woodworking.

The similar lack of knowledge on the part of QA staff is the primary reason why they’re the wrong users for automated testing tools – at least until they get sufficient domain knowledge in development and accessibility. Unfortunately learning-by-doing is probably a bad strategy in this case, due to the disruptive nature of erroneous issue reports that’ll be generated along the way.

So who should be doing the testing? That depends on the type of testing being performed. Ultimately, everyone involved in the final user-interface and content should be involved.

  • Designers who create mockups should test their work before giving it to developers to implement
  • Developers should test their work before it is submitted to version control
  • Content authors should test their work before publishing
  • QA staff should run acceptance tests using assistive technologies
  • UX Staff should do usability tests with people with disabilities.

At every step there is an opportunity to discover issues that had not been previously discovered, but there’s also a high likelihood that, as the code gets closer and closer to being experienced by a user, the issues found won’t be fixed. Among the test opportunities listed above, developers’ testing of their own work is the most critical piece. QA staff should never have functional acceptance tests that fail due to an automatically-detectable accessibility issue. Usability test participants should never fail a task due to an automatically-detectable accessibility issue. It is entirely appropriate that developers take on such testing of their own work.

Furthering the accessibility of the Web requires a revolution in how accessibility testing is done

Right now we’re experiencing a revolution in the workflow of the modern web developer. More developers are beginning to automate some or all of their development processes, whether this includes things like dotfiles or SASS / LESS or the use of automated task runners like Grunt and Gulp. Automated task management isn’t the exception on the web, it is the rule and it stems from the improvement in efficiency and quality I discussed in the first post in this series.

Of the top 24 trending projects on Github as of this writing:

  • 21 of them include automated unit testing
  • 18 of them use Grunt or Gulp for automated task management
  • 16 of them use jshint as part of their automated task management
  • 15 of them use Bower for package management
  • 15 of them use Travis (or at least provide Travis files)
  • 2 of them use Yeoman

The extent to which these automated toolsets are used varies pretty significantly. On smaller projects you tend to see file concatenation and minification, but the sky is the limit, as evidenced by this Gruntfile from Angular.js. The extensive amount of automated unit testing Angular does is pretty impressive as well.

Others and I often contend that part of the problem with accessibility on the web is that it is seen as a distinctly separate process from everything else in the development process. Each task that contributes to the final end product impacts the ability for people to use the system. Accessibility is usability for persons with disabilities. It is an aspect of the overall quality of the system, and a very large part of what directly impacts accessibility is purely technical in nature. The apparent promise made by automated accessibility testing tool vendors is that they can find these technical failings. Historically, however, they’ve harmed their own credibility by being prone to the false positives I discussed in the second post in this series. Finding technical problems is one thing. Flagging things that aren’t problems is another.

Automated accessibility testing can be done effectively, efficiently, and accurately and with high benefit to the organization. Doing so requires two things:

  • It is performed by the right people at the right time; that is, it must be done by developers during their normal automated processes.
  • The tools stop generating inaccurate results. Yes, this means that perhaps we need to reduce the overall number of things we test for.

It may seem somewhat non-intuitive to state that we should do less testing with automated tools. The thing is, the state of web accessibility in general is rather abysmal. As I get ready for the official release of Tenon, I’ve been testing the home pages of the most popular sites listed in Alexa. As of this writing, Tenon has tested 84,956 pages and logged 1,855,271 issues. Among the most interesting findings:

  • 27% of issues relate to the use of deprecated, presentational elements or attributes
  • 19% of issues are missing alt attributes for images
  • 10% of issues are data tables with no headers
  • 5% of issues relate to binding events to non-focusable elements.
  • 2% of issues relate to blank link text (likely through the use of CSS sprites for the link)

A sample of 85,000 tested pages is statistically significant, with a very narrow margin of error. In fact, it is more than enough.

There is an average of 54 definitively testable issues per page on the web. These are all development-related issues that could be caught by developers if they tested their work prior to deployment. Developers need a toolset that allows them to avoid these high-impact issues up front. This is the promise of Tenon.

In Part 4 I’ll talk about our need to move away from standalone, monolithic toolsets and toward integrating more closely with developers’ workflows.


Woodshop tour

I posted this to Facebook but wanted to share on my site, too. This is where I spend my weekends when it is cold outside:

Note: alt attribute on each image is blank. Visible text under the image describes the image.

Looking into the entrance-way. Drill press and lathe straight ahead. Chalkboard paint along the left wall. Dust collection system viewable overhead. Pipe clamps clamped to support beam.

View from right past the entrance-way. Glue-ups in progress on my work table. There’s no such thing as too many clamps. Ahead and to the left you can see the band saw. To the left of that is my crappy little router table. Dust collection system also in view.

View straight ahead past entrance way. Drill press right in front. Just past that is the lathe with a rosewood bowl blank mounted. Further ahead is my new grinder. Various supplies are on the shelves along the wall. The lower shelf is all finishing supplies such as wipe-on poly, glue, sandpaper, etc. while the upper shelf is mostly misc. On the floor ahead are various scraps of wood. Most scraps are thrown away but I occasionally save stuff that may be useful later, such as for experimenting with a joint before doing the final piece.

View of the side of the room that has the drill press and lathe. The lathe is a 42-inch Delta Rockwell. Right behind the lathe is a dust collection box. Unfortunately my dust collector doesn’t have enough horsepower to make the box useful. On top of the dust collection box is a DIY air filter powered by a high-powered computer fan. To the left of that is another box that holds drills and various drill-related stuff like a Kreg jig, drill bits and forstner bits. Not shown: On this side, the workbench is actually a cabinet. Inside the cabinet is 6 glass carboys fermenting beer.

Corner of the room by the lathe. Parts bin on the wall. Shelves with finishing supplies, sharpening supplies, and sanding supplies as well as two grinders on the bench. Dust collection hoses along the top.

Dust collection hoses are also prominent in this picture as is the band saw.

Back wall showing 12-inch compound miter saw. Behind that is pegboard wall holding various tools. Hammers, chisels, screw drivers, files, pliers, and more. A shelf holds various router-related items.

Right corner of the back wall, from a little greater distance. Shows router table, router bits and various router related items. This is also where hand saws are stored as well as safety related stuff like safety glasses, face shield, and air masks. Underneath the router table is a wet-dry vac. It doesn’t get much use now that I have the dust collector, but this is such a good place to store it. On the side of the work bench is a pencil sharpener.

The “back room” of the shop holding more than 200 board-feet of walnut, about 100 board feet of cedar, and 5 walnut slabs. Some other misc. pieces are shown such as a right-angle jig, a spline jig, table saw miter jig, and box-joint jig. Barely in the foreground is a jointer.

View from the very back looking in toward the entrance. Upper left shows a filtered box fan. On the lower left is a new table saw, and beyond that is a downdraft sanding table. Like the dust collection box, the downdraft sanding table isn’t as useful as it could be because the dust collector doesn’t really have enough oomph. On the right side foreground is the jointer. Further ahead is the bandsaw on the right and beyond that is the worktable. On a shelf under the worktable is my 13-inch Dewalt planer, Bosch circular saw, and Porter Cable Dovetail jig. Eventually I’ll have to make a stand for the planer. It isn’t a big deal to pick it up and down right now but when I’m older its gonna be difficult, for sure, because the damn thing is heavy.

Special closeup view of my table saw. While I got a lot of use out of my Dewalt portable table saw, this thing is a thousand times more useful. Behind the table saw are a ton of empty jars ready for Jennifer Groves to put food in this summer!

Not really shown elsewhere in this photo album is a “closet”. This is the other side of the wall on the right of the entranceway. Inside of this room is a dust collector, shown prominently in this picture. To the lower left in this picture is a dust separator which basically separates the big chips before they make their way into the dust bag. Under the dust collector but not shown is a small air compressor.

“Should we detect screen readers?” is the wrong question

The recent release of WebAIM’s 5th Screen Reader User Survey has heated up a recently simmering debate over whether or not it should be possible to detect screen readers. Currently there are no reliable means of determining whether a user with disabilities is visiting your site and, specific to screen readers, this is because that information isn’t available as part of the standard information used in tracking users, such as user-agent strings. Modern development best practice has shied away from clunky user-agent detection and toward feature detection. The thought, then, is that it should be possible to detect whether or not a visitor is using a screen reader. This has drawn sharply negative reactions from the accessibility community, including from those whom I’d have thought would be in favor of the approach. In all cases, people seem to be ignoring a more obvious shortcoming of this idea: accessibility isn’t just about blind people. Accessibility is about all people.

Getting data at a sufficient level of granularity is a bit difficult, but the conventional wisdom around disability statistics is:

  • There are more people who are low-vision than who are blind
  • There are more people who are hard of hearing than who are visually impaired
  • There are more people who are motor impaired than who are hard of hearing
  • There are more people who are cognitively impaired than all of the above

These proportions vary somewhat by age group, but Census Bureau data does validate the claim that, across all age groups, the percentage of people who are visually impaired is consistently the smallest of all disability types. In other words, if your approach to accessibility hinges on detecting screen readers, you’ve clearly misunderstood accessibility.

But let’s skip that for a moment. Let’s assume you could detect a screen reader as easily as including Modernizr on your site. Now what? What do you do differently? No matter what you do, your approach “solves” accessibility issues for less than 2% of the working-age population. Put another way, whatever money or time you’ve spent on detecting and adapting to screen reader users has only gotten you 1/5 of the way toward being “accessible”. Instead of asking whether it should be possible to detect screen readers, the question should be “how do we make our site more usable for all users?”.
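To make the Modernizr comparison concrete, here is a minimal sketch of how feature detection works, using a plain object as a stand-in for the browser’s `window` (the property names, and `screenReaderActive` in particular, are hypothetical). The point is that feature detection can only find capabilities the platform actually exposes, and no such signal exists for screen readers.

```javascript
// Feature detection asks "does this capability exist?" on the global object.
function hasFeature(globalObj, property) {
  return typeof globalObj[property] !== "undefined";
}

// A stand-in "window" from a hypothetical browser session:
const fakeWindow = {
  localStorage: {},           // storage API present
  requestAnimationFrame() {}  // animation API present
  // note: no property ever announces "a screen reader is running"
};

console.log(hasFeature(fakeWindow, "localStorage"));       // true
console.log(hasFeature(fakeWindow, "screenReaderActive")); // false: nothing to detect
```

Even if such a property existed, acting on it would still only address one slice of the disability population described above.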

I’m available for accessibility consulting, audits, VPATs, training, and accessible web development, email me directly at karl@karlgroves.com or call me at +1 443-875-7343

Everything you know about accessibility testing is wrong (part 2)

In Everything you know about accessibility testing is wrong (part 1) I left off talking about automated accessibility testing tools. It is my feeling that a tool of any kind absolutely must deliver on its promise to make the user more effective at the task they need the tool to perform. As a woodworker, I have every right to expect that my miter saw will deliver straight cuts. Further, if I set the miter saw’s angle to 45 degrees, I have the right to expect that the cut I make with the saw will be at an exact 45 degrees. If the saw does not perform the tasks it was designed to do, accurately and reliably, then the tool’s value is lost and my work suffers as a consequence. This is the case for all the tools and technologies we use, and it has been the biggest failing of automated testing tools of any kind, not just those related to accessibility; security scanning tools, for instance, generate false results at times too.

I’m not sure if this is their exact motivation, but it often seems as though accessibility testing tool vendors interpret their tool’s value as being measured by the total number of issues it can report, regardless of whether those issues are accurate. In fact, nearly all tools on the market will tell you about things that may not actually be issues at all. In this 2002 evisceration of Bobby, Joe Clark says, “And here we witness the nonexpert stepping on a rake.” He goes on to highlight examples of wholly irrelevant issues Bobby had reported. From this type of experience came the term “false positives”, describing reported issues that are inaccurate or irrelevant, and it remains a favorite whipping post for accessibility testing tools.

It would be easy to dismiss false positives as the product of a young industry, since nearly all tools of the era suffered from the same shortcoming. Unfortunately, the practice persists even today. For example, in the OpenAjax Alliance Rulesets, merely having an object, embed, applet, video, or audio element on a page will generate nearly a dozen error reports telling you things like “Provide text alternatives to live audio” or “Live audio of speech requires realtime captioning of the speakers.” This practice is ridiculous. The tool has no way of knowing whether the media has audio at all, let alone whether that audio is live or prerecorded. Instead of reporting on actual issues found, the tool’s developer would rather saddle the end user with almost a dozen possibly irrelevant issues to sort out on their own. This type of overly ambitious reporting does more harm than good, both at the individual website level and for the accessibility of the web as a whole.
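To illustrate the difference, here is a hedged sketch contrasting the two reporting styles, using plain objects as stand-in parsed elements rather than a real DOM; `noisyRule` and `evidenceRule` are invented names, not rules from any actual tool.

```javascript
// Stand-in parsed elements (no real DOM):
const elements = [
  { tag: "video", attrs: {} },                       // media; contents unknowable to the tool
  { tag: "img",   attrs: {} },                       // missing alt: provable from markup
  { tag: "img",   attrs: { alt: "Workshop photo" } } // has alt: fine
];

// Noisy style: flag every media element "just in case".
function noisyRule(els) {
  return els
    .filter(e => ["video", "audio", "object", "embed"].includes(e.tag))
    .map(e => `Possible issue: <${e.tag}> may need captions`);
}

// Evidence-based style: report only what the markup actually proves.
function evidenceRule(els) {
  return els
    .filter(e => e.tag === "img" && !("alt" in e.attrs))
    .map(() => "Issue: <img> is missing an alt attribute");
}

console.log(noisyRule(elements));    // one speculative report
console.log(evidenceRule(elements)); // one provable report
```

The speculative report hands the user homework; the provable one hands them a fix.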

No automated testing tool should ever report an issue that it cannot provide evidence for. Baseless reports like those I mentioned from the OpenAjax Alliance are no better than someone randomly pointing at the screen and saying “Here are a dozen issues you need to check out!” then walking out of the room. An issue report is a statement of fact. Like a manually entered issue report, a tool should be expected to answer very specifically what the issue is, where it is, why it is an issue, and who is affected by it. It should be able to tell you what was expected and what was found instead. Finally, if a tool can detect a problem then it should also be able to make an informed recommendation of what must be done to pass a retest.
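As a sketch of what such an evidence-backed issue report could look like, with illustrative field names of my own rather than any real tool’s schema:

```javascript
// One complete, evidence-backed issue report (field names are hypothetical):
const issueReport = {
  what:     "Image has no text alternative",
  where:    { selector: "img#logo", line: 42 },
  why:      "Screen reader users receive no information about the image",
  who:      ["blind users", "low-vision users relying on speech output"],
  expected: 'An alt attribute describing the image, or alt="" if decorative',
  found:    '<img id="logo" src="logo.png">',
  recommendation: "Add descriptive alt text; a retest passes once it is present"
};

// Crude completeness check: every field must be present and non-empty
// before the tool is allowed to emit the issue at all.
const complete = Object.values(issueReport).every(v => v !== undefined && v !== "");
console.log(complete); // true
```

A tool that cannot populate every one of those fields has no evidence, and therefore no issue to report.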

False positives (or false negatives, or whatever we call these inaccurate reports) do everything but that. By reporting issues that don’t exist, they confuse developers and QA staff, cause unnecessary work, and harm the organization’s overall accessibility efforts. I’ve observed several incidents where inaccurate test results caused rifts between QA testers and developers. In these cases, the QA testers trust the tool’s results implicitly; after all, why shouldn’t they expect the tool’s results to be accurate? As a consequence, QA testers log issues into internal issue tracking systems based on the results of their automated accessibility testing tool. Developers then must sift through each one, determine where the issue exists in the code, attempt to decipher the issue report, and figure out what needs to be fixed. In cases where the issue report is bogus, whether through inaccuracy or irrelevance, it generates, at the very least, unnecessary work for all involved. Worse, I’ve seen numerous cases where bugs get opened, closed as invalid by the developer, and reopened after the QA tester retests because the tool has again reported it as an issue. Every minute developers and QA testers spend arguing over whether an issue is real is a minute that could be spent remediating issues that are valid.

Consequently, it is best either to avoid tools prone to such false reports or to invest the time required to configure the tool so that the tests generating them are squelched. By doing so, the system(s) under development are likely to get more accessible and developers less likely to brush off accessibility. In fact, I envision a gamification-type impact from this approach of only reporting and fixing real issues. A large number of these “definitively testable” accessibility best practices can be quick to fix, with minimal impact on the user interface. Over time, developers will instinctively avoid those errors as accessible markup becomes part of their coding style, and automated accessibility testing can remain part of standard development and QA practice, catching anomalous mistakes rather than instances of bad practice. That possibility can never exist while developers are trying to decipher which issues are or are not real problems; instead they’re left feeling like they’re chasing their tails.
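A hypothetical tool configuration illustrating that squelching approach might look something like the following; the rule IDs and severity levels are invented for the sketch, not taken from any real product.

```javascript
// Keep only tests whose failures are provable from the markup; turn off
// the speculative ones the tool cannot actually have evidence for.
const config = {
  rules: {
    "img-alt-missing":         "error", // provable from markup
    "form-label-missing":      "error", // provable from markup
    "live-audio-captioning":   "off",   // tool cannot know if audio is live
    "video-audio-description": "off"    // tool cannot inspect media content
  }
};

const active = Object.entries(config.rules).filter(([, level]) => level !== "off");
console.log(active.length); // 2 provable rules remain
```

The payoff is that every issue the tool now reports is one a developer can fix without first debating whether it is real.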

Current automated accessibility testing takes place at the wrong place and time and is done by the wrong people

Automated testing tools can only test that which they can access. Historically, this has meant that the content to be tested has to exist at a URL the tool can reach; the tool performs a GET request for the URL, receives the response and, if the response is successful, tests the document at that URL. Implicitly this means that work on the tested document has progressed to the point where it is, in all likelihood, close to being (or in fact is) finished, unless the tested URL is a “mockup” and the tool resides in, or has access to, the same environment as the development environment. Historically, the experience has been that the tested documents have already been deployed. This is the worst possible place and time for accessibility testing to happen, because by that point in the development cycle a vast array of architectural, design, workflow, and production decisions have been made that directly impact the team’s ability to fix many of the issues that will be found. This is especially true when selecting things like front-end JavaScript frameworks, MVC frameworks, or colors, or when creating templates and other assets to be presented via a Content Management System. In each case, early pre-deployment testing could help determine whether additional work is needed or whether different products should be selected. Post-deployment remediation is always more costly, more time consuming, and less likely to succeed. In all cases, accessibility testing performed late in the development lifecycle has a very real and very negative impact, including lost sales, lost productivity, and increased risk to project success. Late testing also increases the organization’s risk of litigation.

The best way to avoid this is, as the popular refrain goes, to “test early and test often”. Usability and accessibility consultants worldwide frequently lament that their clients don’t. This website, for instance, happens to perform very well in search engines for the term “VPAT”, and about once a week I get a call from a vendor attempting to sell to a US government agency that has asked for a VPAT. The vendor needs the VPAT “yesterday”, and unfortunately at that point any VPAT they get from me is going to contain some very bad news that could have been avoided had they gotten serious about accessibility much earlier in the product lifecycle. In fact, as early as possible: when the first commit is submitted to version control, and when the first pieces of content are entered into the content management system. Testing must happen before deployment and before content is published.
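As a toy illustration of testing at commit time, here is a sketch that scans raw template source for a provable issue (a missing alt attribute) long before anything is deployed to a URL. The regex-based check is deliberately simplistic and stands in for a real tool wired into version control or the CMS publishing step.

```javascript
// Scan template source for <img> tags written without an alt attribute.
// This can run on a commit hook or a CMS save, with no deployed URL needed.
function lintTemplate(html) {
  const issues = [];
  for (const tag of html.match(/<img\b[^>]*>/gi) || []) {
    if (!/\balt\s*=/.test(tag)) issues.push(`Missing alt: ${tag}`);
  }
  return issues;
}

const template =
  '<h1>Products</h1><img src="hero.png"><img src="logo.png" alt="Acme">';
console.log(lintTemplate(template)); // one provable issue, caught pre-deployment
```

The same check run post-deployment would find the same bug, but only after the template had already stamped it across every published page.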

Stay tuned for part 3 where I talk about critical capabilities for the next generation of testing tools.