Abstract:
As the desire to see robots become ubiquitous in society grows, so does the need to give those robots the means to build awareness of any humans with whom they share an environment. This paper presents a system, suitable for real-world use, that enables robots to robustly perceive the presence of people acoustically. The proposed binaural system first identifies voiced signals by means of a novel approach to Voice Activity Detection that exploits the spectral signature and characteristics of speech without relying on a priori knowledge. Bearing estimates for each speaker are then made using a multi-track particle filter whose belief update function combines a cross-correlation bearing estimate with an estimate of the speaker's fundamental frequency. Results are presented and discussed for an evaluation of each major system component and for a full-system evaluation in which the robot successfully built human-centric situational awareness of the three humans with whom it shared an office lunch-room containing typical background noises.
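
For readers unfamiliar with cross-correlation bearing estimation, the following is a minimal illustrative sketch in Python and not the system's actual implementation. It assumes two microphones, a far-field source, and a speed of sound of 343 m/s. The function name `estimate_bearing` and its parameters are hypothetical, and the sign convention (positive bearing toward the right microphone) is an assumption made for illustration.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def estimate_bearing(left, right, sample_rate, mic_spacing):
    """Return (delay_seconds, bearing_degrees) for one two-microphone frame.

    A positive delay means the sound reached the right microphone first,
    which this sketch maps to a positive bearing.
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)

    # Full cross-correlation; output index len(right) - 1 is zero lag.
    corr = np.correlate(left, right, mode="full")
    lags = np.arange(-(len(right) - 1), len(left))

    # Keep only lags that are physically possible for this spacing.
    max_lag = int(np.floor(mic_spacing / SPEED_OF_SOUND * sample_rate))
    valid = np.abs(lags) <= max_lag
    best_lag = lags[valid][np.argmax(corr[valid])]

    delay = float(best_lag) / sample_rate
    # Far-field model: delay = (mic_spacing / c) * sin(bearing).
    sin_theta = np.clip(delay * SPEED_OF_SOUND / mic_spacing, -1.0, 1.0)
    return delay, float(np.degrees(np.arcsin(sin_theta)))
```

In a tracker of the kind described above, a per-frame bearing such as this would serve as only one observation; in the proposed system it is combined with a fundamental-frequency estimate inside the particle filter's belief update.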