How do I reduce a large amount of data into something more meaningful? (Factor analysis)

  1. How do I appeal to the largest number of consumers? (TURF analysis)
  2. How do I prioritise marketing messages or product attributes? (Max Diff)
  3. How do I find out what people value in my (new) product / service? (Conjoint)
  4. How do I identify what drives a desired behaviour or outcome? (Key driver analysis)
  5. How do I know what to prioritise to meet strategic goals? (Gap analysis)
  6. How do I build consumer loyalty? (Consumer journey mapping)
  7. How do I use behavioural science to improve my research? (Cognitive biases)
  8. How do I live without you? (LeAnn Rimes)
  9. How do I know how many people will buy my product at a given price? (Van Westendorp’s price sensitivity meter)
  10. How do I assess the impact of my advertising? (Ad effectiveness)
  11. How do I turn data into clear findings? (Data visualisation)
  12. How do I tap into the unconscious perceptions that influence decision-making? (Implicit response testing)
  13. How do I reduce a large amount of data into something more meaningful? (Factor analysis)
  14. How do I group people together based on shared characteristics? (Segmentation)
  15. How do I forecast market share at a given price point? (Brand price trade off)
  16. How do I account for cultural differences when surveying across markets? (ANOVA)
  17. How do I judge brand performance relative to competitors? (Correspondence analysis / brand mapping)

Consumers are complicated

Avid readers of the ‘How do I…’ series might have spotted a theme across many of the business challenges and research techniques we’ve been exploring. In consumer research, we’re often trying to unpick complex human behaviour to understand where our focus should be. This could be by identifying what drives a behaviour or outcome through key driver analysis, understanding where to prioritise our efforts for maximum impact with gap analysis, or ensuring we’re accounting for cognitive biases in questionnaire design and analysis. Arguably, if human behaviour weren’t complex, we wouldn’t need to research it!

How can factor analysis help?

Factor analysis reduces the many measured attributes that influence an outcome into a smaller set of underlying dimensions (called factors). This process allows us to explore behaviours, attitudes or opinions that aren’t easy to isolate and measure, by understanding the underlying factors at play. For this reason, it’s a particularly useful tool for prioritising which data to use in analysis, for example as inputs to a segmentation or when identifying KPIs.

Factor analysis assumes that the relationships between data points are linear and that the data points are correlated with one another, though not so strongly that any of them is redundant. It is therefore best suited to analysis where patterns of response in the data are thought to be due to something that isn’t directly measured (known as a latent variable). For example, let’s say your survey captures barriers to sign-up for your new subscription service. Your answer options include ‘I haven’t heard of the service before today’, ‘I don’t know what’s available on the service’, and ‘I don’t know how I would access the service’. Similar responses to these three answer options are associated with the latent variable ‘lack of understanding’, which wasn’t asked about directly but is the real barrier that marketing needs to address.
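To make the idea of a latent variable concrete, here is a minimal sketch in Python using simulated (not real) survey data. It assumes a hidden ‘lack of understanding’ score that drives three answer options, plus one unrelated item. The variable names, weights and sample size are all illustrative assumptions.

```python
import random

def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

rng = random.Random(0)  # fixed seed so the sketch is repeatable
n = 2000                # hypothetical number of respondents

# The latent variable: never asked in the survey, only simulated here.
lack_of_understanding = [rng.gauss(0, 1) for _ in range(n)]

def item(weight):
    """A survey item driven partly by the latent variable, partly by noise."""
    noise_scale = (1 - weight ** 2) ** 0.5
    return [weight * z + noise_scale * rng.gauss(0, 1)
            for z in lack_of_understanding]

not_heard = item(0.8)    # 'I haven't heard of the service before today'
no_content = item(0.8)   # 'I don't know what's available on the service'
no_access = item(0.7)    # 'I don't know how I would access the service'
unrelated = [rng.gauss(0, 1) for _ in range(n)]  # item with no shared driver

print("not_heard vs no_content:", round(pearson(not_heard, no_content), 2))
print("not_heard vs no_access: ", round(pearson(not_heard, no_access), 2))
print("not_heard vs unrelated: ", round(pearson(not_heard, unrelated), 2))
```

The three barrier items correlate with each other because they share a hidden driver, while the unrelated item does not. Those shared patterns of response are what factor analysis picks up on.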

What does factor analysis tell me?

Each factor explains a proportion of the overall variance in the data. The aim is to reduce your original large number of data points to fewer factors that collectively explain a large proportion of that variance. In the example above, this might mean taking a long list of barriers to sign-up and reducing them to five factors (underlying themes preventing sign-up), such as:

  • lack of understanding (explaining the responses: ‘I haven’t heard of the service before today’, ‘I don’t know what’s available on the service’, and ‘I don’t know how I would access the service’)
  • inertia (explaining the responses: ‘never really thought about it’ and ‘my existing services meet my needs’)
  • price sensitivity (explaining the responses: ‘it’s too expensive’, ‘I already spend enough on entertainment services’, and ‘I don’t agree with paying for content’)

The number of factors to include is decided by looking at the eigenvalue of each factor, individually and cumulatively. A factor with an eigenvalue greater than 1 explains more variance than a single data point captured in your survey, and is therefore potentially useful for analysis. Plotting the eigenvalues of the factors in descending order then helps you to identify how many factors are meaningful to use: you keep the factors that sit before the point where the curve levels off. This chart is called a scree plot, after the geological term scree slope, which describes where rocks have fallen down and accumulated on the side of a mountain.
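The arithmetic behind the eigenvalue check can be sketched in a few lines of Python. The eigenvalues below are made-up numbers for a hypothetical survey with eight barrier items; they are not results from any real study. The sketch relies on one known property: when the analysis is run on a correlation matrix, the eigenvalues add up to the number of items, so each eigenvalue divided by that number is the share of variance the factor explains.

```python
# Hypothetical eigenvalues for 8 survey items (illustrative, not real data).
# For a correlation matrix, eigenvalues sum to the number of items.
eigenvalues = [3.2, 1.9, 1.1, 0.7, 0.4, 0.3, 0.25, 0.15]
n_items = len(eigenvalues)

# Rule of thumb: keep factors that explain more than one item's worth of variance.
retained = [ev for ev in eigenvalues if ev > 1.0]

# Running total of the share of variance explained.
cumulative = []
running = 0.0
for ev in eigenvalues:
    running += ev / n_items
    cumulative.append(running)

for i, (ev, cum) in enumerate(zip(eigenvalues, cumulative), start=1):
    decision = "keep" if ev > 1.0 else "drop"
    print(f"Factor {i}: eigenvalue={ev:.2f}  cumulative variance={cum:.1%}  {decision}")

print("Factors retained:", len(retained))
```

Plotting the `eigenvalues` list against the factor number would give the scree plot for this made-up example.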

Each factor has a score against every data point that you included in the analysis (in our example, each barrier to sign-up). This score is called a factor loading, and analysing the factor loadings allows you to understand what characterises each factor. The further a factor loading is from zero, the more strongly that data point is associated with that factor.
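As a rough illustration of where loadings come from, the sketch below simulates responses driven by two hidden themes and computes loadings with NumPy. It makes several assumptions: the data is simulated, the item names and weights are invented, and the extraction is a simple principal-component approach with no rotation. Dedicated statistics packages typically offer other extraction methods and rotations, so treat this as a sketch of the idea and not as the method any particular tool uses.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000  # hypothetical number of respondents

# Two simulated latent themes (assumptions for illustration only).
understanding = rng.normal(size=n)
price = rng.normal(size=n)

def noisy(latent, weight):
    """Item score = weighted latent theme + independent noise (unit variance)."""
    return weight * latent + np.sqrt(1 - weight ** 2) * rng.normal(size=n)

items = {
    "haven't heard of it": noisy(understanding, 0.8),
    "don't know what's available": noisy(understanding, 0.8),
    "don't know how to access": noisy(understanding, 0.7),
    "too expensive": noisy(price, 0.8),
    "already spend enough": noisy(price, 0.7),
}
names = list(items)
data = np.column_stack([items[name] for name in names])

# Correlation matrix of the items, then its eigenvalues and eigenvectors.
corr = np.corrcoef(data, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(corr)   # eigh returns ascending order
order = np.argsort(eigvals)[::-1]         # re-sort to descending
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Keep factors with eigenvalue above 1, then scale eigenvectors into loadings.
k = int(np.sum(eigvals > 1))
loadings = eigvecs[:, :k] * np.sqrt(eigvals[:k])

# The sign of a whole factor is arbitrary, so compare absolute values.
dominant = np.argmax(np.abs(loadings), axis=1)

for name, row, factor in zip(names, loadings, dominant):
    print(f"{name:30s} loadings={np.round(row, 2)}  strongest factor={factor + 1}")
```

Items that load most strongly on the same factor are the ones you would read together when deciding what to call that factor.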

By looking at which data points are most associated with a factor, the factor can be named and this can be used to help simplify the data. In the examples above, lack of understanding, inertia and price sensitivity are the labels given to each factor after reviewing the factor loadings. The factor labels then become shorthand themes for understanding the key barriers to tackle.