From Zero to Five, to Whatever Number Ensures Sample Representativeness

Goodman et al. (2012: 304) assert that “… usability is a means of directed product evaluation, not scientific inquiry.” What do they mean?

One reason usability tests are not scientific inquiry appears just two pages after that assertion: “Usability tests are not statistically representative” (authors’ emphasis). In science, statistical representativeness requires strict sampling procedures to ensure that the sample represents a well-defined target population. Random sampling and an appropriate sample size are among the ways to ensure sample representativeness.

“Random” here does not mean what lay people usually mean by the word. Only when everyone in the target population has an equal chance of being included in the sample can we say the sample is randomly selected. Therefore, sitting at a coffee shop and testing willing customers does not give us a random sample.
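
The distinction can be sketched in a few lines of Python. This is a minimal illustration, not a sampling protocol: the population of 1,000 user IDs, the subset of coffee shop visitors, and the sample size of 20 are all made-up assumptions.

```python
import random

def simple_random_sample(population, k, seed=None):
    """Draw k members so that every member has the same chance (k / N) of inclusion."""
    rng = random.Random(seed)
    return rng.sample(population, k)

def convenience_sample(population, reachable, k):
    """Take the first k members who happen to be reachable (e.g., in the coffee shop)."""
    return [person for person in population if person in reachable][:k]

# Hypothetical target population: 1,000 user IDs.
population = list(range(1000))
# Hypothetical assumption: only IDs 0-49 ever visit the coffee shop.
coffee_shop_visitors = set(range(50))

random_sample = simple_random_sample(population, k=20, seed=42)
cafe_sample = convenience_sample(population, coffee_shop_visitors, k=20)

# Anyone who never visits the coffee shop has zero chance of entering cafe_sample.
print(sorted(random_sample))
print(cafe_sample)
```

In the convenience sample, 950 of the 1,000 hypothetical users can never be selected, no matter how willing the coffee shop customers are.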

“Being scientific” may be the gold standard of research, but in the context of design and usability tests this standard is not feasible in most cases. Even in the rare cases where it is feasible, it is most likely not desirable.

On the feasibility front, the sample size required to achieve representativeness may be beyond what the design project can afford, and the tests may take longer to complete than the timeline of the product development they serve allows.
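
To give a sense of scale, the sketch below uses a standard textbook formula for the sample size needed to estimate a proportion from a simple random sample of a large population, n = z² p(1 − p) / e². The formula does not come from Goodman et al. or Nielsen; the 95% confidence level (z = 1.96), the margins of error, and the worst-case proportion p = 0.5 are illustrative assumptions.

```python
import math

def sample_size_for_proportion(margin_of_error, z=1.96, p=0.5):
    """Minimum n to estimate a proportion within +/- margin_of_error (large population)."""
    return math.ceil(z ** 2 * p * (1 - p) / margin_of_error ** 2)

# Assumed 95% confidence (z = 1.96) and worst-case proportion p = 0.5.
print(sample_size_for_proportion(0.05))  # margin of error of 5 percentage points
print(sample_size_for_proportion(0.03))  # margin of error of 3 percentage points
```

Under these assumptions the answers are 385 and 1,068 participants, which shows why such a sample can outrun the budget and schedule of a typical design project.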

As for desirability, the resources that scientific methods require can be better spent on not-so-scientific usability tests. For example, Nielsen (2000) finds that the marginal benefit of testing more than five users drops significantly, as the graph below shows. Later and more numerous tests (Nielsen 2012) confirm this finding.



[Graph: usability problems found versus number of users tested. Source: Nielsen (2000)]
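
The shape of this curve can be approximated with the formula Nielsen (2000) gives in the same article: the share of usability problems found with n users is 1 − (1 − L)^n, where L is the proportion of problems a single user uncovers. The sketch below uses his typical value of L = 31%; the function name and the chosen values of n are illustrative assumptions.

```python
def share_of_problems_found(n_users, single_user_rate=0.31):
    """Expected share of usability problems found after testing n_users."""
    return 1 - (1 - single_user_rate) ** n_users

# Tabulate the curve for a few hypothetical study sizes.
for n in (0, 1, 2, 3, 5, 10, 15):
    found = share_of_problems_found(n)
    print(f"{n:>2} users: {found:6.1%} of problems found")
```

With L = 31%, zero users find nothing, one user already uncovers 31% of the problems, and five users uncover about 84%. Each additional user adds less than the one before.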


Another non-scientific aspect of usability tests is the difficulty of replicating findings. Studies that adhere to scientific methods are expected to produce the same findings when the same methods are repeated, but exact replication is not the forte of usability tests. Does it make sense, though, to try to make their findings replicable? Nielsen (2011) argues against attempting to find every usability issue. One reason is the marginal benefit relative to cost, as demonstrated in the graph above. The other is his finding that most websites, applications, and mobile apps have serious usability issues. By focusing on the big issues, usability tests can significantly improve the key performance of the website or application. This is the 80/20 argument, which makes good sense from a pragmatic viewpoint.

Heuristic evaluation is arguably an even more cost-effective way to evaluate usability than usability tests, because it has been shown to find the majority of the problems, major and even minor, that usability tests find (Nielsen 1995). However, heuristic evaluation sometimes misses problems that usability tests on the same design uncover. One such scenario is when the experts conducting the heuristic evaluation lack the specific domain knowledge (Nielsen 1995). In this case, it would be appropriate to base design decisions on usability tests.

In all, usability tests are not scientific inquiry, but they have an important place in the design process. As Nielsen (2000) indicates, a critical takeaway from the graph is that when the number of tested users is zero, we find zero usability problems. Therefore, one is much better than zero, and five can hit the sweet spot. That doesn’t seem nearly as daunting as the kind of sample size required to qualify as scientific inquiry, does it?


References
  1. Goodman, E., M. Kuniavsky, and A. Moed. 2012. Chapter 11: Usability Tests. In E. Goodman, M. Kuniavsky, and A. Moed, Observing the User Experience: A Practitioner's Guide to User Research (2nd Edition), pp. 273-326. Saint Louis, MO: Morgan Kaufmann. Retrieved from http://www.ebrary.com.
  2. Nielsen, J. 1995. Characteristics of Usability Problems Found by Heuristic Evaluation. Accessed via http://www.nngroup.com/articles/usability-problems-found-by-heuristic-evaluation/ on January 30, 2015.
  3. Nielsen, J. 2000. Why You Only Need to Test with 5 Users. Accessed via http://www.nngroup.com/articles/why-you-only-need-to-test-with-5-users/ on January 30, 2015.
  4. Nielsen, J. 2011. Accuracy vs. Insights in Quantitative Usability. Accessed via http://www.nngroup.com/articles/accuracy-vs-insights-quantitative-ux/ on January 30, 2015.
  5. Nielsen, J. 2012. How Many Test Users in a Usability Study? Accessed via http://www.nngroup.com/articles/how-many-test-users/ on January 30, 2015.


Generative Efforts

In their 2011 article “Understanding Interaction Design Practices,” Goodman, Stolterman, and Wakkary call for interaction design researchers to pay more attention to design practice. At first I thought the article was somewhat research-centric, especially in the section on the ideal theory of interaction design practice. Upon reflection, however, I realize that following what these researchers advocate could bring welcome developments to design practitioners as well.

Even though it is very difficult, if not impossible, to reduce the complexity of design practice to simple design principles or rules, researchers’ strengths in synthesis will yield interesting insights from interaction design research. These insights, coupled with “reflection in action” and “reflection on action” as advocated by Schön (1983, 1987), will help practitioners learn to become more designerly. As Goodman, Stolterman, and Wakkary state in the article, “empirically grounded descriptions and critical analyses of design practice activities will offer frameworks for reflection on practices that designers can find useful.”

Indeed, it is not hard to see that academic inquiry into design practice will help extract knowledge from it, so that instead of relying on their direct experience case by case, designers can broaden the scope of experiences from which they develop what Kolko (2014) calls “higher-order organizing principles.” Research with a locus on actual design practices will also help designers recognize the patterns and structure in what Dewey (1934) characterizes as “experiences” and distill meaning from their own significant design experiences.

Therefore, beyond their own reflexivity, designers can tap into academic research as one of their sources of knowing, learning, and making sense of the activities, experiences, and contexts of their design practice, thereby opening up alternative perspectives on past practice, current trends, and future possibilities. Even though there is no substitute for their own experience, designers can leverage that experience with the aid of the research direction Goodman, Stolterman, and Wakkary call for.

A closer connection between design theorists and practitioners will not only embody the intertwined nature of the intellectual and practical parts of experience that Dewey identifies, but also encourage mutual learning and enrich each other’s viewpoints. It can also have significant pedagogical implications: it can help interaction design students gain a better understanding of design practice before they enter the field, and potentially generate interesting sparks in the research-practice-teaching triangle.

As Goodman, Stolterman, and Wakkary posit, the “mutual intelligibility of language” between interaction design researchers and practitioners is notable. My concern is that even when HCI researchers attempt to bridge the gap between research on interaction design practice and the designers concerned, they may find themselves in the odd position of balancing publication requirements on the one hand against making the knowledge accessible and easily digestible to practitioners on the other. However, there are already publication venues that make this balancing act less demanding. Hopefully, as this line of research gains momentum, more venues will open up.

In all, as a researcher-turned-designer, I appreciate Goodman, Stolterman, and Wakkary’s efforts to connect the research and practice of interaction design. I believe the resulting benefits will feed back to both parties and be generative, with a positive externality for teaching and learning in design schools as well.

References
  1. Dewey, J. 1934. Having an Experience. In J. Dewey, Art as Experience, pp. 35-57. New York, NY: Minton, Balch and Company.
  2. Goodman, E., E. Stolterman, and R. Wakkary. 2011. Understanding Interaction Design Practices. Paper presented at CHI 2011, May 7-12, Vancouver, BC, Canada.
  3. Kolko, J. 2014. Why I teach Theory. Interactions 21(6): 22-23. http://doi.acm.org/10.1145/2663294
  4. Schön, D.A. 1983. The Reflective Practitioner: How Professionals Think in Action. New York, NY: Harper Collins.
  5. Schön, D.A. 1987. Educating the Reflective Practitioner: Toward a New Design for Teaching and Learning in the Professions. San Francisco, CA: Jossey-Bass Publishers.