Arabic Spontaneous IVR

Banking
Arabic
Telecommunication
Retail
Spontaneous IVR
Insurance

About the Dataset

900 hours

This audio dataset contains 900 hours of Arabic speech data across the banking, insurance, retail, and telecommunication domains, recorded by speakers from Egypt, Jordan, Yemen, and the United Arab Emirates.

IVR stands for Interactive Voice Response. The Spontaneous IVR data is created by having a human respond to an IVR system. The human, playing a customer, follows a simplified real-life scenario and makes a spontaneous query on the given topic. The IVR system then asks the speaker to repeat the query in two more ways. The speech is then transcribed. In rare cases, the IVR system is mimicked by another human. Recordings are made over telephony and saved at 8 kHz, 16 bits per channel.

The dataset is covered by Defined.ai's standard license agreement. The license agreement is perpetual and allows for the commercialization of all models built on the data.

There are 269 hours of Arabic (Egypt) IVR with the following domain distribution per dataset:

  • 72.3 hours of Banking
  • 69.02 hours of Insurance
  • 64.1 hours of Retail
  • 64.15 hours of Telecommunication

There are 262 hours of Arabic (Jordan) IVR with the following domain distribution per dataset:

  • 72.73 hours of Banking
  • 63.17 hours of Insurance
  • 62.88 hours of Retail
  • 63.4 hours of Telecommunication

There are 97 hours of Arabic (UAE) IVR with the following domain distribution per dataset:

  • 23.88 hours of Banking
  • 23.8 hours of Insurance
  • 24.45 hours of Retail
  • 24.83 hours of Telecommunication

There are 272 hours of Arabic (Yemen) IVR with the following domain distribution per dataset:

  • 68.23 hours of Banking
  • 68.92 hours of Insurance
  • 66.45 hours of Retail
  • 68.78 hours of Telecommunication
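The per-domain figures above can be cross-checked with a little arithmetic. The following is a minimal Python sketch that tallies the hours per country and overall; the `HOURS` dictionary and the `country_totals` helper are illustrative names, not part of the dataset or of any Defined.ai tooling, and the values are copied from the lists above.

```python
# Per-domain hours as listed on this page, keyed by country.
# The structure and names here are illustrative assumptions.
HOURS = {
    "Egypt": {"Banking": 72.3, "Insurance": 69.02,
              "Retail": 64.1, "Telecommunication": 64.15},
    "Jordan": {"Banking": 72.73, "Insurance": 63.17,
               "Retail": 62.88, "Telecommunication": 63.4},
    "UAE": {"Banking": 23.88, "Insurance": 23.8,
            "Retail": 24.45, "Telecommunication": 24.83},
    "Yemen": {"Banking": 68.23, "Insurance": 68.92,
              "Retail": 66.45, "Telecommunication": 68.78},
}


def country_totals(hours):
    """Sum the domain hours for each country, rounded to two decimals."""
    return {country: round(sum(domains.values()), 2)
            for country, domains in hours.items()}


totals = country_totals(HOURS)
grand_total = round(sum(totals.values()), 2)

for country, total in totals.items():
    print(f"{country}: {total:.2f} hours")
print(f"All countries: {grand_total:.2f} hours")
```

The listed figures sum to roughly 901 hours overall, in line with the 900-hour headline figure.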

Other characteristics:

  • Audio format: WAV
  • Recording environment: noisy, silent
  • Communication band: narrowband
  • Sample rate: 8 kHz
  • Device type: mobile
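To confirm that a delivered file matches the stated characteristics, the WAV header can be inspected directly. The sketch below uses Python's standard `wave` module and assumes the files are uncompressed PCM WAV, which this page does not state explicitly; `describe_wav`, `matches_listing`, and the file name in the usage comment are hypothetical.

```python
import wave

EXPECTED_RATE_HZ = 8000     # 8 kHz narrowband, per the listing
EXPECTED_SAMPLE_BITS = 16   # 16 bits per sample, per the listing


def describe_wav(path):
    """Read basic header fields from a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        return {
            "sample_rate_hz": rate,
            "bits_per_sample": wav.getsampwidth() * 8,
            "channels": wav.getnchannels(),
            "duration_s": wav.getnframes() / rate if rate else 0.0,
        }


def matches_listing(info):
    """Check sample rate and bit depth against the listed values."""
    return (info["sample_rate_hz"] == EXPECTED_RATE_HZ
            and info["bits_per_sample"] == EXPECTED_SAMPLE_BITS)


# Usage (hypothetical file name):
#   info = describe_wav("sample.wav")
#   print(info, matches_listing(info))
```

Summing `duration_s` over all files in a delivery gives a further check against the listed hours.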

Samples

A 5-minute audio sample of Arabic (United Arab Emirates) Banking is available, along with its transcription.

Defined.ai hosts the leading online marketplace for buying and selling AI data, tools and models, and offers professional services to help deliver success in complex machine learning projects. Defined.ai is a community of AI professionals building fair, accessible and ethical AI of the future.
1201 3rd Avenue, STE 2200, Seattle WA
[email protected]

© 2023 DefinedCrowd. All rights reserved.