Abstract: | Auditory Gestalt perception, the grouping of species-specific vocalizations into a perceptual stream with a defined meaning, is typical of human speech perception but has not yet been studied in non-human mammals. Here we use synthesized models of vocalizations (series of wriggling calls) of mouse pups (Mus domesticus) and show that their mothers perceive the call series as a meaningful Gestalt for the release of instinctive maternal behavior if the inter-call intervals have durations of 100–400 ms. Shorter or longer inter-call intervals significantly reduce maternal responsiveness. We also show that series of natural wriggling calls have inter-call intervals mainly in the range of 100–400 ms. Thus, series of natural wriggling calls of pups match the time-domain auditory filters of their mothers, so that they are optimally perceived and recognized. A similar time window exists for the production of human speech and for the perception of series of sounds by humans. Neural mechanisms for setting the boundaries of the time window are discussed. |