Abstract
Every day, people are confronted with devices they have to control for all kinds of reasons.
Designers hope that their creations can be controlled easily. For that purpose, they can turn
to user-system interaction theories to guide them in design and evaluation. However, so far,
no empirical usability evaluation methods have been developed that correspond well
with the increasingly popular approach of component-based software engineering. Instead
of building a device from scratch, this approach
focuses on building artefacts from ready-made components (e.g. pop-up menus, radio
buttons, and list boxes). The usability of components has so far not been assessed individually,
but only through its impact on the overall usability (e.g. number of keystrokes, task
duration, or questionnaires about the overall ease of use and satisfaction).
The Layered Protocol Theory (LPT) regards interaction as an exchange of messages between
components and the user. LPT decomposes the user-system interaction into different
layers that can be designed and analysed separately, and it claims that component-specific
usability evaluation is possible. This is very welcome, since the creation
and deployment of components are allocated to different processes in the component-based
software engineering approach. Evaluating a component's usability in its creation process
would be more efficient than testing the usability of the component each time it is deployed
in an application. Usability evaluation in the deployment process is not even necessary
if the usability of an entire application depends only on the usability of the individual
components. According to LPT, the latter is the case, because layers are unaffected when
lower-level layers are replaced, as long as the replacements provide the same message services
to the layer above them.
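LPT's independence claim mirrors programming against an interface. The sketch below (Python, with hypothetical class names; not code from the thesis) illustrates the idea: a higher-level layer keeps working when a lower-level layer is replaced, provided both versions offer the same message services.

```python
from abc import ABC, abstractmethod

class LowerLayer(ABC):
    """Message service contract offered to the layer above (illustrative)."""
    @abstractmethod
    def send(self, message: str) -> str: ...

class ButtonPad(LowerLayer):
    def send(self, message: str) -> str:
        return f"ButtonPad handled {message!r}"

class TouchScreen(LowerLayer):
    def send(self, message: str) -> str:
        return f"TouchScreen handled {message!r}"

class HigherLayer:
    """Depends only on the message service, not on its implementation."""
    def __init__(self, lower: LowerLayer) -> None:
        self.lower = lower

    def perform_task(self) -> str:
        return self.lower.send("select item")

# Swapping the lower-level layer leaves the higher-level layer untouched.
print(HigherLayer(ButtonPad()).perform_task())
print(HigherLayer(TouchScreen()).perform_task())
```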
Until now, LPT has only been used to evaluate the user interface of products analytically.
However, LPT has also been suggested as a basis for evaluating the usability of separate
components empirically. For this, the claim about the independence of the layers is essential;
that claim, however, has not yet been examined empirically. The thesis therefore has the
following main research question: Is usability compositional? This question has
two underlying questions:
1. Whether and how the usability of components can be tested empirically.
2. Whether and how the usability of components can be affected by other components.
The research was conducted in a series of laboratory experiments in which subjects had
to perform tasks with prototypes of various user interfaces. The first experiment
searched for an objective component-specific performance measure. In this
explorative experiment, 80 university students operated a fictitious user interface. In a
training session, the subjects received one out of eight instruction sets, which were created
by providing or withholding information about three components. Before and after the
subjects performed the tasks, their knowledge about the components was tested. Furthermore,
the message exchange between the components was recorded in a log file during
task execution. The results showed that the subjects' knowledge about a component
affected the number of messages it received. This suggests that the number of messages a
user interface component receives can be taken as an objective component-specific performance
measure. The measure indicates the users' effort to control their perception of the
component: each message expresses that users are dissatisfied with the state of the
system as they perceive it, and that they are spending effort to change it into a perception they desire.
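As an illustration, the following sketch derives this measure from a recorded message exchange by counting the messages each component receives. The comma-separated log format, the field order, and the file name are assumptions made for the example, not the actual format used in the experiment.

```python
from collections import Counter

def messages_received(log_path: str) -> Counter:
    """Count messages per receiving component.

    Assumes one message per line: 'timestamp,sender,receiver,message'
    (an illustrative format, not the thesis's actual log layout).
    """
    received: Counter = Counter()
    with open(log_path) as log:
        for line in log:
            _, _, receiver, _ = line.rstrip("\n").split(",", 3)
            received[receiver] += 1
    return received

# Components that receive many messages demanded more user effort:
# print(messages_received("interaction.log"))
```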
Based on the findings from this explorative experiment, a framework has been established
for testing the usability of components. The testing framework supports two
testing paradigms, a single version and a multiple versions testing paradigm. In the single
version testing paradigm, only one version of each component is tested. The focus is on
identifying components that hamper the overall usability of the user interface. In the
multiple versions testing paradigm, different versions of components are compared with
each other. The question in this case is which version has the highest usability.
For the single version testing paradigm, the number of messages a component receives is
compared with the performance of an ideal user and corrected for control effects of lower
and higher-level components. An effort value is assigned to each message received. These
effort values are derived from the effort values of the lowest-level messages that are linked to
the higher-level messages. At the lowest-level layer, weight factors are assigned to the messages;
each weight factor represents the user effort of sending a single lowest-level message.
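A minimal sketch of this effort calculation, under the assumption that each observed higher-level message is linked to the lowest-level messages that realise it; the message names and weight values are purely illustrative, and the correction for control effects of other layers is omitted.

```python
# Weight factors for lowest-level messages: the user effort of sending
# a single message of that kind (illustrative values).
weights = {"key_press": 1.0, "scroll_step": 0.5}

# Observed higher-level messages, each linked to its lowest-level messages.
observed = [
    {"message": "open_menu", "linked": ["key_press", "key_press"]},
    {"message": "pick_item", "linked": ["scroll_step", "scroll_step", "key_press"]},
]

# The message sequence an ideal user would need for the same task.
ideal = [
    {"message": "open_menu", "linked": ["key_press"]},
    {"message": "pick_item", "linked": ["key_press"]},
]

def effort(messages) -> float:
    """Total effort: sum the weights of all linked lowest-level messages."""
    return sum(weights[m] for msg in messages for m in msg["linked"])

# Excess effort relative to the ideal user points at a usability problem.
print(effort(observed) - effort(ideal))
```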
The subjective component-specific measures for ease of use and satisfaction were obtained
through a standard usability questionnaire. These measures were expected to be
more powerful than subjective overall measures, because the specific questions could assist
users in recalling their experience of controlling the individual components.
In a second experiment, the framework was evaluated by comparing overall and component-specific
usability measures on their ability to identify usability problems that had been created
in advance. Eight different prototypes of a mobile telephone were constructed by designing
two versions of each of three components. The versions of each component were designed to differ
clearly with respect to their usability. Eight groups of ten university students had to
complete a set of tasks with one specific mobile telephone prototype. In the multiple
versions testing paradigm, the results did not show subjective component-specific measures
to be more effective than their overall counterparts in detecting variations in the perceived
ease of use or the satisfaction between versions of a component. This,
however, could be an artefact of the experiment, because both component-specific and
overall questions were given in a random order; the recall prompted by the component-specific
questions could have affected the answers to the overall questions as well.
The results of the second experiment did, however, show that an objective component-specific
performance measure is more effective than overall performance measures in determining
usability variations between versions of both lower and higher-level components,
in cases where components operate independently. The power of this component-specific
measure comes from the reduction in statistical variance achieved by limiting the focus to one component
and thereby locking out the variance caused by the users' effort to control
other components. For the single version testing paradigm, the results showed that the objective
component-specific performance measure correlates well with overall and subjective
component-specific usability measures and allows evaluators to rank the components
according to their potential to improve the usability.
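The variance-reduction argument can be illustrated with simulated data; all distributions and effect sizes below are invented for the illustration. A difference in one component's message counts is detected far more easily in that component's own counts than in the overall total, to which the unchanged components add only noise.

```python
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(0)
n = 40  # subjects per component version

# Messages to the component under test: version B needs ~3 more on average.
comp_a = rng.normal(20, 4, n)
comp_b = rng.normal(23, 4, n)

# Messages to the other, unchanged components add unrelated variance.
other_a = rng.normal(100, 20, n)
other_b = rng.normal(100, 20, n)

print(ttest_ind(comp_a, comp_b).pvalue)                      # component-specific
print(ttest_ind(comp_a + other_a, comp_b + other_b).pvalue)  # overall
```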
LPT's claim about the independence of a component was examined in two further experiments,
which studied two factors that cause components to influence one another:
consistency and mental effort. The first of these experiments explored the effects of inconsistency
at various layers of the user-system interaction. In the experiment, 48 university students
operated PC simulations of room thermostats, web-enabled TV sets, microwaves and
radio alarm clocks. The effect of inconsistency was examined between components on the
same or on different layers, and between components and the application domain. The
results of the experiment showed that components in a user interface can significantly affect each
other's usability. Components in the same layer or in other layers can activate
an inappropriate component-specific mental model, which users then apply to understand
the feedback of another component. Inconsistency between the application domain
and a component's feedback was not found to affect the component's usability; whether
this holds only for this experiment or can be generalised is a topic for further research.
The study did show, however, that the application domain had an effect on the
users' understanding of the functionality the component provides.
Another experiment showed that mental effort can link the control of higher-level
components to lower-level components: a poor implementation of the lower-level layers
can force users to adopt less efficient higher-level control strategies to cope with a task
that is mentally over-demanding. In this experiment, 24 university students had to solve
equations with calculators that were composed of different versions of a component
operating on a low-level layer. One calculator was equipped with a small display, which
could only show one value; the other was equipped with a large display, which
could show 5 lines of 34 symbols each. The subjects' cardiovascular activity was recorded,
as was the message exchange between the components. Furthermore, after solving an
equation, the subjects rated the effort experienced on the Rating Scale Mental Effort and,
at the end of the experiment, they filled out a questionnaire with ease-of-use and
satisfaction questions about the calculators and their specific components. The results
showed a significant interaction effect between equation difficulty and display size
on heart-rate variability, which is regarded as an index of mental effort. A significant
interaction effect between the same variables was also found in the number of times subjects
stored intermediate outcomes in the calculator, which is regarded as a change in the control
strategy of the higher-level component.
Based on the conducted research, the following answers can be given to the question: Is
usability compositional? A first answer is yes: LPT indeed provides a basis for empirical
evaluation of user interface components. A second answer is no: the usability of the
entire user interface cannot always be predicted solely from the usability of the individual
components. Inconsistency and mental effort are factors that allow one component to
reduce the users' ability to control another component. Components should therefore not
be designed, deployed and evaluated entirely independently of other components in the
user interface. Since the creation and deployment processes are separate in the
component-based software engineering approach, two kinds of usability evaluations are
advised: one focusing on components in isolation when they are created, and the other
focusing on the components within the context of the particular application in which they
are deployed.
| Original language | English |
| --- | --- |
| Qualification | Doctor of Philosophy |
| Awarding Institution | |
| Supervisors/Advisors | |
| Award date | 15 Jan 2003 |
| Place of Publication | Eindhoven |
| Publisher | |
| Print ISBNs | 90-386-1947-2 |
| DOIs | |
| Publication status | Published - 2003 |