Making the most of MaxDiff

How to prioritise concepts & features with a powerful survey technique

  • Why MaxDiff is more suitable for prioritisation than other survey methods

  • How you can generate data that you trust

  • Things to consider to ensure your study is a success



MaxDiff: A brief introduction

In the world of quantitative market research, the ubiquity of MaxDiff is a bit of a running joke. Think of it as a victim of its own success. Be it weeding out the killer ideas and concepts following generative research, or leaning on user views, opinions and preferences to shape the development of a proposition, there is a lot it can do to cut through the noise and provide clarity and certainty about how you can meet the requirements of your intended audience. However, in the world of product management it remains an under-utilised tool for mitigating risk, supporting quick decision making and understanding what your users want before you have built anything.

There are already some great pieces written on how MaxDiff works as a method, and on how you can use MaxDiff to support feature prioritisation, so we are going to sidestep a retread of that here. But we do want to share some thoughts on the underlying principles.

Firstly, to convey why it is a great method (and better than other survey-based question types) for providing direction and delivering good-quality user data, and to flag some pitfalls to watch out for when setting up a study or experiment.

Beyond this, in a companion piece, we want to provide some examples of how you can take your MaxDiff analysis a step or two further: from the prosaic to the profound (possibly).

1. A practical example

So, let's work with a practical example: imagine a product manager for a music platform wants to be confident about what to develop next to meet the needs of users. Qualitative research with an online community of these users or potential users has surfaced a list of features the audience would like to see.

As the community is qualitative, involving maybe 30–40 users, lots of feedback has been gathered on the pre-existing concepts that were tested. But with such a small sample it is hard to place a priority order on what (from a user perspective) you should build first.

So the next logical step is to conduct some survey research amongst a larger group of users and potential users, to find out how appealing these new features are via a rating scale question (How appealing do you find this feature…).

Unfortunately, what the research turned up wasn't too helpful. Taking this absolute view of appeal, we learn that all the features above are very appealing, and whilst there is a bit of differentiation, even with a large sample size it is hard to parse what should and shouldn't make the cut, let alone divine where on the roadmap features should go based on user appetite.

(It is worth noting this isn't an unexpected scenario: great minds within brands cook up broadly interesting concepts for things they think users want and then ask them how they feel about them. Unsurprisingly, the things that get tested in research tend to find an audience and exhibit high levels of appeal!)

In despair, they may try to cut the data by different key audiences to look for some clarity. But to no avail. So, what next? Maybe just do another survey and ask people to rank the features? As quantitative researchers, we recoil in horror at this. Yes, it may give a rough approximation of what people are more or less likely to prefer. But if someone asked you which one of 20 things you prefer and gave you ten seconds to answer, are you sure you would give a reliable answer?

2. An elegant solution

This is where MaxDiff comes in. The beauty of the approach is that it makes a hard question (Which of these 20 features do you prefer?) easy to answer by chunking it down into lots of smaller questions (Which of these three features do you most and least prefer?) as part of a balanced design where every feature is tested against every other feature.
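To make the chunking idea concrete, here is a minimal sketch of how a long feature list might be split into small screens. It uses a simple repeated shuffle rather than the formally balanced designs a survey platform would generate, and the feature names and parameters are purely illustrative.

```python
import random

def build_maxdiff_screens(items, items_per_screen=5, appearances_per_item=3, seed=42):
    """Split a long list of items into small MaxDiff screens.

    Each item appears `appearances_per_item` times across the design, so a
    participant only ever answers "which of these few do I most / least
    prefer?". A real study would use a formally balanced (incomplete block)
    design; the random shuffle here is only meant to illustrate the chunking.
    """
    rng = random.Random(seed)
    pool = []
    for _ in range(appearances_per_item):
        batch = items[:]
        rng.shuffle(batch)
        pool.extend(batch)

    screens = [pool[i:i + items_per_screen] for i in range(0, len(pool), items_per_screen)]
    # Drop any screen that ended up short or containing a duplicate item.
    return [s for s in screens if len(set(s)) == items_per_screen]

features = [f"Feature {i}" for i in range(1, 21)]  # 20 hypothetical features
screens = build_maxdiff_screens(features)
print(f"{len(screens)} screens of {len(screens[0])} features each")  # 12 screens of 5
```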

The benefits

1. Ratings questions are great at giving you an overall level of appeal, but regardless of the number of points on the scale they tend to offer poor differentiation, and certainly cannot be relied upon to provide it.

2. Smaller questions with fewer options are easier to answer accurately, meaning better quality data than showing tens of options that are impossible to compare and contrast, especially on a mobile screen, which is where the majority of surveys are taken.

3. The approach is gamified (a horrible word, but true): a rapid-fire trade-off of options in a survey with a fixed rule set encourages an instinctive response.

4. The repetition of the MaxDiff exercise provides far more data than a single preference question, so you can have more confidence in the results.

5. The biases present in rating scales, where at a cultural or individual level people rate higher or lower than others, are controlled for because the comparisons are always relative.

In a typical example, if you have twenty or so features, participants should see about a dozen screens with five features per screen. The output will provide a relative view of preference which will almost certainly be more discriminating than an absolute metric such as appeal. Here’s a hypothetical example:


And what it reveals is that what appeared to be the fifth most popular option based on an appeal rating question is actually the most preferred feature of the lot once the features have been traded off against each other. A far more reliable metric of what users most want!

This relative data can have (and if you are not using it yet, should have) a profound impact on how you prioritise features, propositions, claims, or any range of (typically) text-based ideas to clarify the way ahead with evidence before you start building or investing.
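If you are curious how the relative scores behind an output like this come about, below is a minimal sketch of simple best-minus-worst counting. Commercial analysis usually estimates utilities with a multinomial logit or hierarchical Bayes model, but counting already illustrates the key difference from a rating question: every score is relative to the other features shown. The responses and feature names here are invented purely for illustration.

```python
from collections import defaultdict

def count_scores(responses):
    """Best-minus-worst counting for MaxDiff responses.

    `responses` is a list of (shown, best, worst) tuples, one per screen per
    participant. Real studies typically fit a multinomial logit or
    hierarchical Bayes model instead, but the counting version already yields
    a relative score per feature rather than an absolute rating.
    """
    best, worst, shown = defaultdict(int), defaultdict(int), defaultdict(int)
    for items, chosen_best, chosen_worst in responses:
        for item in items:
            shown[item] += 1
        best[chosen_best] += 1
        worst[chosen_worst] += 1

    # Net score: (times picked as best - times picked as worst) / times shown.
    return {item: (best[item] - worst[item]) / shown[item] for item in shown}

# Hypothetical answers from one participant across three screens.
responses = [
    (["Offline mode", "Lyrics view", "Smart playlists"], "Offline mode", "Lyrics view"),
    (["Smart playlists", "Hi-res audio", "Offline mode"], "Offline mode", "Hi-res audio"),
    (["Lyrics view", "Hi-res audio", "Smart playlists"], "Smart playlists", "Lyrics view"),
]

for feature, score in sorted(count_scores(responses).items(), key=lambda kv: -kv[1]):
    print(f"{feature:16} {score:+.2f}")
```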

3. Some considerations

On the flip side, a few watch-outs as promised to ensure you get what you need from your MaxDiff study:

  • Be nice to your survey participants!

MaxDiff exercises can be fun to take part in… for a while, but if designed badly they can really drag. There are three key interrelated variables to consider: the number of features or concepts you want to test; the number of features/concepts per question; and the number of questions you show. It is a balancing act, but try to keep the number of features to 20 or so and use a screen-count calculator (a rough version is sketched after this list) to understand the implications of the number of features you show per screen for your users.

  • Don’t overwhelm them with stimulus

A related point is to think about the demands you can place on participants in an experiment. If your features are short and snappy, feel free to show five (or more) on screen at once. If they are longer, show fewer. As MaxDiff is a partial ranking exercise, keep in mind that the more features you show at once, the more missing data there will be: your output will be highly differentiated at the extremes and less so in the middle, as you will have collected less data on the things people are ambivalent about.

  • Try to keep your survey tight

If you can avoid it, try not to load a MaxDiff exercise into an already long survey. Your participants won't thank you for it, and frustration and fatigue will lead to poor-quality, unreliable data (keep an eye on completion times and repetitive patterns in responses as an easy indicator of this).

  • Use the survey as an educational tool

This is especially the case if your feature descriptions are long and detailed. It may be worth warming people up by introducing the features one at a time first. A great way to do this is with a rating scale question, regardless of whether you use that data in your analysis.

  • Combine MaxDiff with absolute rating questions

As a counterpoint to the above, consider whether it's sensible to rely on a MaxDiff output alone. As mentioned, rating scale data is a good sense check to see if you are on the right path: MaxDiff won't tell you whether all your ideas are great or terrible, only which ones are more or less so.

  • Ensure your inputs are comparable

This is an art not a science, but try to ensure that your inputs are clear, concise and comparable. Make sure you provide equal levels of detail about each feature you test and try to use a similar format, e.g. name of feature | who it's for | to be used in what situation | what the benefit is.

  • Be clear up front and as things go

MaxDiff exercises can be long, so make sure to prime your research participants before the exercise begins. Explain how the question type works and warn them that they will see lots of similar questions (otherwise they may think the survey is broken). Also, from a UX perspective it is great to have a progress bar within surveys generally, but especially so within a MaxDiff exercise, to show a defined endpoint and avoid confusion and frustration.

  • Finally, make sure your MaxDiff exercise can be answered

Lots of survey platforms have not got their heads around how to use advanced question types or what makes a good UX. We won't name and shame anyone, but ensure, as per the example above, that all the information can be compared at a glance with a minimum of scrolling or hunting around for radio buttons. Colour coding, icons and contextual shading of responses that can no longer be selected are all best practice too.
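As promised in the first consideration above, here is a rough back-of-the-envelope version of the screen-count calculation. It assumes the common rule of thumb that each feature should be shown to each participant around three to five times; the numbers are indicative only, and a proper design tool should have the final say.

```python
import math

def screens_needed(n_features, features_per_screen, appearances_per_feature=3):
    """Estimate how many MaxDiff screens each participant will see, assuming
    every feature should appear `appearances_per_feature` times per person
    (a common rule of thumb is roughly 3-5 exposures each)."""
    return math.ceil(n_features * appearances_per_feature / features_per_screen)

for per_screen in (3, 4, 5):
    print(f"{per_screen} features per screen -> "
          f"{screens_needed(20, per_screen)} screens for 20 features")
# 3 -> 20 screens, 4 -> 15 screens, 5 -> 12 screens
```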

We hope this has been of use. Our next piece on the subject will focus more on analysis and getting beyond a one-dimensional output, as there is lots more you can do with MaxDiff data to help support prioritisation decisions.

