If usability could be defined by a single, standardized metric, it would be much easier to measure and manage, and easier to put on the agenda of boardroom meetings. A single metric would also make usability more accessible to managers. But is it possible?

There are, however, several definitions of usability. Shneiderman, for example, identifies five usability measures: time to learn, speed of performance, rate of errors by users, retention over time, and subjective satisfaction. According to ISO-9241-11, usability consists of three aspects: effectiveness, efficiency, and subjective satisfaction. Jakob Nielsen defines usability by five components: learnability, efficiency, memorability, errors, and satisfaction.

Some researchers, such as McGee, and Sauro and Kindlund, believe that measuring usability with a single, standardized metric is possible. They argue that the different aspects of usability convey overlapping information to some degree and can therefore be combined.
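To make the idea of combining measures concrete, here is a minimal sketch of one way to do it: standardize each measure as a z-score, flip the sign where lower values are better, and average the results per participant. The equal weighting, the function names and the sample numbers are all assumptions made for illustration; this is not the procedure published by any of the researchers named above.

```python
from statistics import mean, pstdev


def z_scores(values, higher_is_better=True):
    """Standardize values so that a higher z-score always means 'more usable'."""
    m = mean(values)
    s = pstdev(values)
    if s == 0:
        # No variation in this measure: it cannot separate participants.
        return [0.0 for _ in values]
    sign = 1.0 if higher_is_better else -1.0
    return [sign * (v - m) / s for v in values]


def combined_scores(task_times, error_counts, satisfaction):
    """Average three standardized measures into one score per participant.

    Equal weighting is an assumption of this sketch, not an established rule.
    """
    zt = z_scores(task_times, higher_is_better=False)    # faster is better
    ze = z_scores(error_counts, higher_is_better=False)  # fewer errors is better
    zs = z_scores(satisfaction, higher_is_better=True)   # higher rating is better
    return [mean(triple) for triple in zip(zt, ze, zs)]


# Hypothetical data for three participants on one task.
task_times = [30.0, 45.0, 60.0]   # seconds
error_counts = [0, 1, 2]
satisfaction = [5, 4, 3]          # 1-5 rating

scores = combined_scores(task_times, error_counts, satisfaction)
```

A score like this is only meaningful if the underlying measures really do move together, which is the question the rest of this post turns to.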

To find out whether the different aspects of usability correlate strongly enough for this to work, Hornbaek and Lai-Chong Law performed a Meta-Analysis of Correlations Among Usability Measures.
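For readers unfamiliar with the statistic, the strength of the relationship between two measures is commonly expressed as a correlation coefficient r between -1 and 1. The sketch below computes Pearson's r for two lists of observations. The choice of Pearson's r, the function name and the sample data are assumptions made for illustration, not a reproduction of the meta-analysis.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return Pearson's correlation coefficient for two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length, at least 2 points each")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a sample has no variation")
    return cov / sqrt(var_x * var_y)


# Hypothetical data: task time in seconds and a 1-5 satisfaction rating.
task_times = [30.0, 45.0, 60.0, 50.0]
satisfaction = [5, 4, 2, 4]

r = pearson_r(task_times, satisfaction)
```

A value of r near zero would mean that knowing one measure tells you little about the other, in which case folding both into one number hides information.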

In total, 73 datasets were analyzed. The study shows only small to medium correlations between typical measures of usability, which makes defining usability in a single, standardized metric extremely difficult. Until the holy grail has been found, it is better to use one of the definitions given by Shneiderman, ISO-9241-11, or Nielsen.


  1. Jeff Sauro on Friday 1, 2010

    Hi, interesting post.

    You should also take a look at: Sauro, J. & Lewis J.R. (2009) “Correlations among Prototypical Usability Metrics: Evidence for the Construct of Usability.” (CHI 2009) which addresses the data from Hornbaek and Law. It found correlations between usability metrics from 90 distinct usability tests were strong when measured at the task-level (r between .44 and .60), but lower at the test-level (r between .16 and .24).

    It discusses the implications for combining data into single scores and the agreements and discrepancies between the Hornbaek and Law dataset (e.g. they looked at a broader range of HCI metrics from published papers, whereas this paper only looked at Summative Usability tests).