Abstract:
With recent research advances in recognition technologies such as voice and handwriting recognition,
multimodal interaction is an emerging technology that can revolutionize mobile device input/output
capabilities, which are otherwise limited by small keypads and screens. One important resource
that can be accessed from mobile devices is the World Wide Web, but Web pages, being visual and often
requiring typed input, are not always well suited to mobile device access. While there are technology initiatives
addressing the authoring of new multimodal-enabled Web pages, the generation of multimodal
versions of existing Web pages has not been addressed. To help address this problem, we propose in this paper
an architecture and approach to auto-generating multimodal representations of existing Web pages and
services, thereby helping to enable the Mobile Web.