English Spontaneous IVR
About the Dataset
This audio dataset contains 1566 hours of English speech data across four domains (Banking, Insurance, Retail, and Telecommunication), recorded by speakers from the US, UK, and India.
IVR stands for Interactive Voice Response. The Spontaneous IVR data is created by having a human respond to an IVR system. The human, playing a customer, follows a simplified real-life scenario and makes a spontaneous query on the given topic. The IVR system then asks the speaker to repeat the query in two more ways. The speech is then transcribed. In rare cases, another human plays the role of the IVR system. Recordings are made over telephony and saved at 8 kHz, 16 bits per channel.
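As a rough planning aid, the recording format above can be turned into a storage estimate. The sketch below assumes uncompressed PCM and a single channel per file (the channel count is not stated here), and it ignores WAV header overhead; `pcm_bytes` is an illustrative helper, not part of any delivered tooling.

```python
# Rough size estimate for uncompressed PCM audio.
# Assumptions (not stated in the dataset description): mono files, no compression.
SAMPLE_RATE_HZ = 8_000   # 8 kHz, as stated for this dataset
BYTES_PER_SAMPLE = 2     # 16 bits per sample


def pcm_bytes(hours: float, channels: int = 1) -> int:
    """Return the approximate number of raw PCM bytes for `hours` of audio."""
    seconds = hours * 3600
    return int(seconds * SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * channels)


if __name__ == "__main__":
    total = pcm_bytes(1566)
    print(f"~{total / 1e9:.1f} GB of raw PCM for 1566 hours (mono assumed)")
```

If the files turn out to be stereo, pass `channels=2` and the estimate doubles.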
The dataset is covered by Defined.ai's standard license agreement. The license agreement is perpetual and allows for the commercialization of all models built on the data.
There are 300 hours of English (India) IVR with the following domain distribution per dataset:
- 75.48 hours of Banking
- 77.67 hours of Insurance
- 71.62 hours of Retail
- 76.13 hours of Telecommunication
There are 502 hours of English (UK) IVR with the following domain distribution per dataset:
- 119.03 hours of Banking
- 130.95 hours of Insurance
- 132.57 hours of Retail
- 119.48 hours of Telecommunication
There are 764 hours of English (US) IVR with the following domain distribution per dataset:
- 173.85 hours of Banking
- 183.38 hours of Insurance
- 195.27 hours of Retail
- 209.1 hours of Telecommunication
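For anyone budgeting training or evaluation splits, the per-domain figures above can be tallied per locale with a short script. This is only a sketch: the dictionary simply copies the hours listed above, and the names `DOMAIN_HOURS` and `locale_totals` are illustrative. The per-domain figures are reported to two decimals, so the sums may not match the rounded locale totals exactly.

```python
# Per-domain hours copied from the lists above.
DOMAIN_HOURS = {
    "English (India)": {
        "Banking": 75.48, "Insurance": 77.67,
        "Retail": 71.62, "Telecommunication": 76.13,
    },
    "English (UK)": {
        "Banking": 119.03, "Insurance": 130.95,
        "Retail": 132.57, "Telecommunication": 119.48,
    },
    "English (US)": {
        "Banking": 173.85, "Insurance": 183.38,
        "Retail": 195.27, "Telecommunication": 209.1,
    },
}


def locale_totals(hours_by_locale):
    """Sum the domain hours for each locale."""
    return {locale: sum(domains.values())
            for locale, domains in hours_by_locale.items()}


if __name__ == "__main__":
    for locale, total in locale_totals(DOMAIN_HOURS).items():
        print(f"{locale}: {total:.2f} hours")
```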
Other characteristics:
- Audio format: WAV
- Recording environment: noisy, silent
- Communication band: narrowband
- Sample rate: 8kHz
- Device type: mobile
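Once files are in hand, the characteristics above can be spot-checked with Python's standard `wave` module. The sketch below assumes the files are plain PCM WAV; `describe_wav` and the path in the usage line are hypothetical names, not part of the dataset delivery.

```python
import wave

# Expected values taken from the characteristics listed above.
EXPECTED_RATE_HZ = 8_000
EXPECTED_SAMPLE_WIDTH_BYTES = 2  # 16 bits


def describe_wav(path):
    """Read basic header fields from a PCM WAV file and compare them to the spec."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        width = wav.getsampwidth()
        channels = wav.getnchannels()
        frames = wav.getnframes()
    return {
        "sample_rate_hz": rate,
        "sample_width_bytes": width,
        "channels": channels,
        "duration_s": frames / rate if rate else 0.0,
        "matches_spec": (rate == EXPECTED_RATE_HZ
                         and width == EXPECTED_SAMPLE_WIDTH_BYTES),
    }
```

For example, `describe_wav("sample.wav")["matches_spec"]` should be `True` for a file recorded at 8 kHz with 16-bit samples.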
Metadata Distribution
English (UK)
English (US)
Samples
Check this 5-minute audio sample of English (US) Banking here. A transcription for the sample is also available.