By Frank Spillers

3 circles- first circle and last circle many people in it, middle circle has one person in it

Summary: Inclusive Design starts with including users. Sadly, the five-user sample size does not scale to inclusion efforts, so you will need to include more users. However, it is important that you do not include a single token user from an underrepresented community (e.g., one wheelchair user). Instead, take an intersectional approach to user recruiting (e.g., a wheelchair user who is Trans and Black). Finally, dedicate your audience research to full inclusiveness (versus adding a couple of users to check the inclusive box), so that you understand the depth of experience of the impacted community and the issues you need to know about to avoid bias, harm, or exclusion in your product.

The shift toward Inclusive Design sees us adding more representation to visual search results, product strategy, and more (see Nike inclusive sizing below).

Nike plus-sized model clothing mannequin display

Above: Nike plus-sized workout clothing for women.

One of the starting points of exclusion is who you invite to the party. If your assumptions and biases limit who you do user research with, you can sabotage your inclusive design efforts. Therefore, inclusive user recruiting is critical for your Inclusive Design strategy. This post will discuss inclusive user recruiting in the context of sample size and emphasize the problems with the five-user sample for user testing, which has become an unchallenged ‘default’ in the industry.

Inclusion breaks with the “5 users” model

With inclusive recruiting, just having five users to test with will get you in trouble.

First of all, the five-user sample is a User Testing “shortcut” proposed by Jakob Nielsen (NNGroup), who holds a lot of weight on this topic because he evangelized “quick, cheap” user testing in the 1990s and has continued to advocate for five-user samples to this day. Nielsen defends “5 users” with economic theory (the law of diminishing returns), which translates to, “there’s nothing to see beyond five users, don’t waste your time.” It is essential to recognize that the original 1992 study, on which he and Landauer based the five-user recommendation, called for multiple iterations of testing, even though Nielsen does not seem to convey this when he advocates for five users. It also specified a sample of 5 users per audience segment. Most products have 2–3+ segments, and I have seen 15 segments or more. Following that guidance means running 3–4 rounds of user testing and, for three segments (5 x 3), testing with 15 users per round! Do you know anyone who actually does that? (Universities not included.)
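The arithmetic behind that guidance can be sketched in a few lines of Python. This is only an illustration, not a formula taken from the original study: the function name `sessions_needed` and its parameters are hypothetical, and the figures are the ones used above (5 users per segment, 3 segments, 3 to 4 rounds).

```python
def sessions_needed(segments: int, rounds: int, users_per_segment: int = 5) -> dict:
    """Return users per round and total test sessions across all rounds.

    Hypothetical helper: it simply multiplies users per segment by the
    number of segments, then by the number of testing rounds.
    """
    if segments < 1 or rounds < 1 or users_per_segment < 1:
        raise ValueError("segments, rounds and users_per_segment must be positive")
    per_round = segments * users_per_segment
    return {"per_round": per_round, "total_sessions": per_round * rounds}


# Three audience segments tested over three rounds
plan = sessions_needed(segments=3, rounds=3)
print(plan)  # {'per_round': 15, 'total_sessions': 45}
```

With four rounds instead of three, the same three segments add up to 60 sessions, which shows how quickly the “5 users” shortcut grows once segments and iterations are counted.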

In reality, very few practicing UX professionals do more than 1 or 2 iterations of testing, and everyone seems to have missed the detail about five users per audience segment. Oddly, Nielsen does not mention this but instead focuses on the “5 users is all you need” mantra, which has become a meme outside the wider UX community (aka your stakeholders and your colleagues not reading this blog post). I feel the strong message of ‘don’t waste your time with users’ has done some damage. For one, spending time with users makes UX designers smarter. If you limit your user contact, you will not learn the patterns of user behavior and thinking that are so essential to good UX ROI. Jared Spool spoke of this intangible value when he said, “the quality of your product is proportional to how much time you spend with your users.”

Note: If you are looking for more advice, criticism, and discussion on sample sizes for user testing, check The 5 User Sample Size myth: how many users should you really test your UX with?

Now let’s apply this guidance to inclusive user recruiting. Let’s assume an under-represented community is an “audience segment.” For example, Black, Brown, LGBTQIA+, Poor, People with a disability, and more…

Now you are looking at an audience sample size of 20–25 users, assuming 5 users per community. The good news is that if you take an intersectional lens, e.g., Black and with a Disability, you can overlap and keep the sample size under control.

 

black barbie doll in wheelchair

Above: the new Barbie Doll featuring a black female with a disability offers an excellent example of intersectionality or how to understand “double discrimination”.

How to prioritize your recruiting of under-represented user groups?

 
This will depend on your business, research, and equity goals. First, you need some idea of who you need to understand most. In Inclusive Design, we ask the question, “Who else did we leave out?” Next, you need to carefully decide which identity takes the lead in your recruiting. For example, LGBTQIA+ is a big category. Perhaps you want to more deeply understand and design for the needs of Transgender individuals. In this case, you would recruit 3 Transgender users out of that sample of 5, so that their representation or voice is stronger in the sample. For disability, at Experience Dynamics, we typically weight toward Blind/Low-Vision users since they overlap with keyboard use (physical impairments) and can have multiple disabilities. Also, screen readers are a major Assistive Technology you are trying to optimize for, so it stands to reason. However, in choosing inclusive recruiting sample sizes, it is important that you avoid tokenism traps…
 

Avoid Tokenism in Recruiting users

One thing you have to be careful of in doing inclusive recruiting is unintended tokenism.

See more on Avoiding User Tokenism or excluding users when your intent was Inclusive Design

The risk of this is very high in our “5 users is all we have time for” world of enterprise, institutional, and start-up cultures of UX. What I mean by tokenism is the following:

  • You want to increase representation, so you recruit one Black woman for your user test (yep, that’s tokenism).
  • You want to cover your accessibility bases, so you recruit one user with a disability and check that box (that’s tokenism too).

Instead, don’t treat the task of recruiting a diverse user sample as a simple quota exercise or a gesture of inclusion. At Experience Dynamics, we recommend doing justice to that underrepresented group to minimize tokenism, weak insights, or, worse, undermining your representation efforts. In other words, design your study with the recruiting goal of reaching that underrepresented community as a whole, instead of inviting one or two voices for ‘tickbox representation’. If you don’t know where to start, recruit intersectionally for better insights.

Intersectional Recruiting can help scale inclusive user recruiting efforts

Remember that users do not live in just one community; their communities criss-cross or intersect. This means you can dedicate your entire sample, e.g., 20 users (4 segments of 5 users each), to the single community or ‘segment’ you want to impact or understand, e.g., the LGBTQIA+ community. If you also wanted to understand how disability impacts that community, you would recruit from both of those segments. So with 20 users, you could get a good understanding of the LGBTQIA+ community’s needs plus the disability experience.

Taking an intersectional approach helps us scale inclusive recruiting. For example, say you want to understand Black women’s experience in general but would also like to include LGBTQIA+ and Disability. You would recruit 20 users with the following breakdown:

  • Black = 100% (20 users)
  • Female = 100% (20 users)

        +

  • Disability = 50% (10 users, with an emphasis on blind users), also Black and female.
  • LGBTQIA+ = 50% (10 users), also Black and female.

Such recruiting criteria would give you representation and inclusion of the Black female experience plus an understanding of disability and sexual orientation within that community. For a smaller user testing scope, you could recruit 10 Black women, of whom 5 have a disability and another 5 come from the LGBTQIA+ community.
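One way to keep a recruit honest against a breakdown like this is to check the roster against target shares before scheduling sessions. The Python sketch below is a hypothetical illustration, not a tool we ship: the names `QUOTAS` and `quota_gaps`, the identity labels, and the placeholder roster are all assumptions made for the example. It also assumes the disability and LGBTQIA+ groups do not overlap, which the breakdown above does not require.

```python
import math
from collections import Counter

# Target share of the sample for each identity (figures from the breakdown above).
QUOTAS = {"Black": 1.0, "female": 1.0, "disability": 0.5, "LGBTQIA+": 0.5}


def quota_gaps(roster: list[set[str]], quotas: dict[str, float]) -> dict[str, int]:
    """Return how many more participants each identity still needs (0 = met)."""
    # Count how many participants carry each identity label.
    counts = Counter(tag for person in roster for tag in person)
    gaps = {}
    for identity, share in quotas.items():
        required = math.ceil(share * len(roster))
        gaps[identity] = max(0, required - counts[identity])
    return gaps


# Placeholder 20-person roster: everyone is Black and female; the first 10
# also have a disability and the last 10 are also LGBTQIA+ (assumed split).
roster = [{"Black", "female", "disability"} for _ in range(10)] + [
    {"Black", "female", "LGBTQIA+"} for _ in range(10)
]

print(quota_gaps(roster, QUOTAS))
# {'Black': 0, 'female': 0, 'disability': 0, 'LGBTQIA+': 0}
```

A non-zero gap for any identity is a prompt to keep recruiting in that community rather than settling for one or two voices.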

While the five-user sample issue focuses on user testing, it is important to note that User Research has two sides: needs analysis and user testing. When you do the former (user interviews, contextual inquiry, or ethnography with your users to deep-dive into their ‘problem space’), you can apply the same model of inclusive user recruiting. To start, note that five users is a sample recommended only for user testing. User interviews and observations require larger samples (e.g., 10 users per audience segment). Our field studies typically run 30–40 users, while our user testing runs 10–20 users.

In Conclusion

In this post, we have focused on sample size and on growing out of the five-user bind to accommodate the need for inclusive user recruiting. We have looked at the limits of testing with only five users and at how to correctly apply the original sample size advice to an intersectional approach to inclusive user recruiting. The aim is to get closer to a user group without tokenizing it, rather than including people for the sake of being seen to include. In addition, we have seen how an intersectional approach can bring you closer to understanding even more underrepresented user perspectives.

Best wishes,

Frank Spillers

CXO @ Experience Dynamics
