
WHY LAB-BASED USER TESTING IS FLAWED

Sean Patterson, Experience Director

I observed my first lab-based user testing session 10 years ago. The facility was equipped with several observation rooms, multiple cameras, microphones, eye-tracking kit, a large selection of computers (no touch devices back then) and two facilitators who co-ran the test sessions.

The observation rooms were hidden behind one-way mirrors and were like mini cinemas – comfortable tiered seating that offered a clear view into the test room. The test room was mic’d up and monitors allowed us to see what was happening on the test computer.

We were testing a website concept for a well-known global brand to see how it would perform with real users. Was it intuitive? Would they be able to find key features and information? Would they be able to complete the core journeys without issue? We really wanted to validate our solutions and this lab was the perfect, professional setting in which to do it.

This one-off exercise cost a hefty 5-figure sum and I’m sure the agency I was consulting for back then would have re-charged this to the client with a significant mark-up. We’d spent several weeks planning the tests, putting together the prototype and recruiting the right test candidates.

We got there early on the day of the test so that we could load up our prototype and run through last-minute changes with the facilitators before the test candidates arrived. I recall the arrival of our first user. He was greeted in reception and given a form to fill out by the test agency. I can only guess that it was a release form or an agreement detailing his compensation (or both).

He was taken into the test room, briefed on what it was we were trying to achieve and reassured that there were no right or wrong answers but just to do whatever felt intuitive to him. The facilitators did their best to put him at ease but I couldn’t help but notice a distinct air of nervousness. Who could blame him – he was in an unfamiliar setting – the people, the location, the equipment – all alien to him. This particular user was in the 50+ demographic and did not appear comfortable with the computer set-up in front of him.

The test got underway and we all leaned forward with bated breath – willing him to intuitively understand the interface we’d spent months working on.

“I’ve done it wrong again, haven’t I?”

He appeared to struggle with the most basic of tasks. I wanted to reach through the glass and show him how easy it was – how could he not get it? A similar story played out throughout the day with other test users.

Many sought assurances after each step: “Do you want me to click this?”, “Did I select the right one?”, “Did I do that right?”, “I’ve done it wrong again, haven’t I?” The facilitator fought hard not to lead the candidates, but caved on several occasions and offered some guidance, which felt… wrong.

The first user, the one who appeared uncomfortable with the computer setup, actually moved the mouse to the left side of the keyboard towards the end of the session – meaning he’d been riding goofy for the majority of the test.

After a morning of testing, I considered our concept a failure. In my mind, the majority of candidates had not successfully completed key tasks. We’d have to go back to the drawing board. A week or so later we received a glossy formal report, complete with heat maps and all kinds of pretty pictures. To my amazement, the test agency had deemed the test a success – bar a few recommendations.

Our solution had been validated and there was much backslapping. The client was happy and felt reassured given it had been independently tested and validated by an ‘impartial’ third party.

This did not sit well with me for the following reasons:

  1. It wasn’t really impartial
    The agency I was working for was the actual client of the test agency – not the big brand whose site we were testing. The test agency had pitched for this work and knew how keen we were for the test to be a success. I wouldn’t consider these ideal conditions for an impartial test.

  2. Incentivised behaviour
    The test candidates had been incentivised with payments. This changes the dynamic of the relationship. If they’re getting paid, a test candidate might feel the need to please or perform well – telling you what they think you want to hear. I know it’s not always possible to find free volunteers – particularly when searching for very specific candidates – but we always try in the first instance.

  3. Unnatural environment
    The test lab environment can be very intimidating. The candidate is unfamiliar with the equipment and the people, and it’s a new social setting, which can make people anxious – and, again, eager to please and make a good impression.

  4. Artificial responses
    The test candidates were very aware of the fact that they were being observed and recorded. Again, candidates can become self-conscious when they know they are being recorded, which can skew their behaviour.

So how did that project end? The agency pushed ahead with the ‘validated’ solution. As with all websites it continued to evolve – bugs were ironed out, new sections were added, various features were improved – but I was never quite satisfied that the solution we launched with was the best one.

The Learnings.

I’ve since sat in on or conducted close to 300 separate user testing sessions (not projects). After each session I critically analyse how I planned and facilitated it so that my next session can yield even better insights. There’s always room for improvement, but I feel the following are my key learnings from over the years. These are not hard rules – every project is different and requires the flexibility to work under certain constraints.

  1. Always, always, always observe the user contextually
    This means travelling to their home, to their place of work or place of leisure. Somewhere they feel comfortable and able to speak their mind.

  2. Conduct these sessions on a 1-to-1 basis
    Attending a test session mob-handed does not put the test candidate at ease.

  3. Don’t incentivise your test candidate
    Depending on who you need, this may not always be possible, however generally speaking, you can find users in your target demographic that are happy to spare some time if you make the effort to travel to them.

  4. Make your recording methods discreet
    It’s taken practice, but I’ve learned to note down salient observations with pen & paper. I try to avoid video or audio recording where possible, but if you absolutely have to record, try your best to do this discreetly (with the candidate’s knowledge, of course). In other words, make it easy for them to forget they are being recorded. Leave Evernote recording audio on an iPad instead of sticking a shotgun mic in their face, for example.

  5. Conduct more informal tests earlier in the process
    Rather than spending long periods of time working on what you feel is the perfect solution, sketch, discuss, do quick mock-ups, try hallway testing – just get it in front of potential users no matter what state it’s in and observe.

  6. Separate signal from noise (what they say may not always be what they mean)
    I’ve lost count of the times I’ve seen a candidate fail to complete a task and then, in the post-test interview, explain how they really liked the solution and found it easy to use. Or perhaps they’ve become hung up on a non-functional element of the prototype. What matters is observing what they are doing and assessing whether this is in line with what they are saying.

  7. Use the user’s own equipment
    A user knows how to use their own computer or phone. Shove another device in front of them and they are suddenly not so sure – certain shortcuts don’t work as expected, hot corners throw them off, they use a mouse and not a trackpad, perhaps their mouse speed is set differently… we want to remove these potential distractions from the equation, so we try, where possible, to use their own device.

I find that these informal, contextual user testing sessions yield far better insights than I’ve ever obtained through formal lab-based testing. They are quick to plan, cheap to conduct and deliver results much faster.

So what are the downsides?

Perception and trust.

Perception in that clients may feel a lab-based test sounds scientific and professional, and will therefore yield better results. Contrast that with observing a housewife trying to order groceries online from the sofa in her front room. The test lab doesn’t show you that she actually goes to her fridge several times during the order and calls her husband to check on his whereabouts that week to see if she needs to cater for him on certain nights too. If a client actually cares about genuine insights, then informal contextual enquiry is far superior.

Trust in that many clients want to observe the user for themselves and don’t trust that we’d report everything or that we’d somehow lead the user. This is a legitimate concern. I know I’d prefer to be present when watching someone use my product. As a client, you’ll need to trust that we WANT to find the issues and pain points – warts and all.

If you would like us to test your product, software, website or service with real users and identify the actual problems, then get in touch.