Abstract:
Multimodal interaction has created exciting new opportunities for the future
of HCI. Current interfaces are, however, becoming congested owing to the
easy transformations of digitised information from one application to
another. For this reason, it is increasingly difficult to realise new
potentials clearly, with computer assistance often adding to rather than
reducing complexity. At the core of the problem lies the fact that the different
requirements of time and space in multimodal interfaces are not compatible.
The dimensional variations of image, sound, text and gesture recognition - to
name but some of the data permutations possible - create complexities that
are difficult to manage, especially when they operate side by side in the same
machine. What is required, therefore, is new modelling to resolve these very
difficult design issues. The paper will define a number of characteristics of
multimodal interaction through the understanding that dramatic action, that
is, performance, is a primary human activity which the computer is now
capable of amplifying as a technological extension of a spatial activity.
Secondary to this is reading, the decoding of text, including images, which
reactivates the body from the storage of the written. These two elements, the
performed and the read, are respectively spatial and temporal forms present
in computing and an understanding of their differences forms a matrix for
multimodal interaction design. A discussion of previous modelling is
included.