## About the Dataset
### 434 hours

This audio dataset contains 434 hours of English speech recorded by native speakers of French, Arabic, Spanish, and other languages:
- **74.73 hours recorded by native French speakers**
- **50 hours recorded by native Arabic speakers**
- **40 hours recorded by native Spanish speakers**
- **269.8 hours recorded by native speakers of other languages**

Speakers are shown a prompt (script) and asked to read it aloud while recording. Clients receive the audio recording, the prompt, and information about the speaker. The audio is recorded on-device, typically at 16 kHz, 16-bit. We also provide information on which device each recording was made with.
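Because the recordings are described as typically 16 kHz, 16-bit, it can be useful to confirm the format of each delivered file before feeding it to a pipeline. The following is a minimal sketch using Python's standard `wave` module; the function name `describe_wav` and the returned field names are illustrative assumptions, not part of the dataset or any Defined.ai tooling, and it assumes the files are uncompressed PCM WAV.

```python
import wave

# Typical format stated in the dataset description.
EXPECTED_RATE_HZ = 16_000    # 16 kHz
EXPECTED_SAMPLE_WIDTH = 2    # 16 bit = 2 bytes per sample


def describe_wav(path):
    """Return basic format facts for a PCM WAV file (hypothetical helper)."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        width = wav.getsampwidth()
        frames = wav.getnframes()
        channels = wav.getnchannels()
    return {
        "sample_rate_hz": rate,
        "bits_per_sample": width * 8,
        "channels": channels,
        "duration_s": frames / rate if rate else 0.0,
        # True only when the file matches the typical 16 kHz, 16-bit format.
        "matches_typical_format": (
            rate == EXPECTED_RATE_HZ and width == EXPECTED_SAMPLE_WIDTH
        ),
    }
```

Calling `describe_wav` on each file and filtering on `matches_typical_format` would flag recordings that deviate from the typical format, which is possible since the description says "typically" rather than "always".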

The dataset is covered by [Defined.ai's standard license agreement](https://www.defined.ai/dataset/data-license-agreement). The license agreement is perpetual and allows for the commercialization of all models built on the data.

## Metadata Distribution
### Arabic Accent 

![Scripted_English_Accented_Arabic_Age.png](https://prdstrapimediastorage.blob.core.windows.net/prdstrapimediastorage/assets/Scripted_English_Accented_Arabic_Age_4d162d1656.png)
![Scripted_English_Accented_Arabic_Gender.png](https://prdstrapimediastorage.blob.core.windows.net/prdstrapimediastorage/assets/Scripted_English_Accented_Arabic_Gender_5174d40f39.png)

## Samples
- [Single-file sample: English, Global Accents](https://defineddata.blob.core.windows.net/samples/TOF_en-us%3Bacc%3Dxx-xx_single-scripted_generic_30m_v01_Sample/Audio/183028607.wav?se=2024-06-15T15%3A34%3A57Z&sp=r&sv=2020-06-12&ss=b&srt=o&sig=qNwwompEswW0rEdovYDjN393fTOz9/53161f9M1S1Xc%3D). A transcription of the sample is also [available](https://prdstrapimediastorage.blob.core.windows.net/prdstrapimediastorage/assets/Scripted_English_Accented_Global_Short_Transcription_b570d64762.tsv).
- [Single-file sample: English, French Accent](https://defineddata.blob.core.windows.net/samples/MAI_en-us%3Bacc%3Dfr-xx_single-scripted_generic_30m_v01_Sample/Audio/162702285.wav?se=2024-06-15T15%3A38%3A37Z&sp=r&sv=2020-06-12&ss=b&srt=o&sig=76yW1mS3jn/p/MFrYUZhKzlLgcL2JJLb7/4IUG7nTzI%3D). A transcription of the sample is also [available](https://prdstrapimediastorage.blob.core.windows.net/prdstrapimediastorage/assets/Scripted_English_Accented_French_Short_Transcription_ea8204e019.tsv).
- [Single-file sample: English, Arabic Accent](https://defineddata.blob.core.windows.net/samples/MAI_en-us%3Bacc%3Dar-xx_single-scripted_generic_30m_v01_Sample/Audio/174361611.wav?se=2024-06-15T15%3A17%3A59Z&sp=r&sv=2020-06-12&ss=b&srt=o&sig=%2BFCbU4PhSfcl28AkHcUTaenFO8g/kz3VbveRge1DXI8%3D). A transcription of the sample is also [available](https://prdstrapimediastorage.blob.core.windows.net/prdstrapimediastorage/assets/Scripted_English_Accented_Arabic_Short_Transcription_9d977c9ed1.tsv).

- [30-minute sample of English, Global Accents](https://defineddata.blob.core.windows.net/samples/TOF_en-us%3Bacc%3Dxx-xx_single-scripted_generic_30m_v01_Sample/TOF_en-us%3Bacc%3Dxx-xx_single-scripted_generic_30m_v01_Sample.zip?se=2024-06-15T15%3A34%3A57Z&sp=r&sv=2020-06-12&ss=b&srt=o&sig=qNwwompEswW0rEdovYDjN393fTOz9/53161f9M1S1Xc%3D)
- [30-minute sample of English, French Accent](https://defineddata.blob.core.windows.net/samples/MAI_en-us%3Bacc%3Dfr-xx_single-scripted_generic_30m_v01_Sample/MAI_en-us%3Bacc%3Dfr-xx_single-scripted_generic_30m_v01_Sample.zip?se=2024-06-15T15%3A38%3A37Z&sp=r&sv=2020-06-12&ss=b&srt=o&sig=76yW1mS3jn/p/MFrYUZhKzlLgcL2JJLb7/4IUG7nTzI%3D)
- [30-minute sample of English, Arabic Accent](https://defineddata.blob.core.windows.net/samples/MAI_en-us%3Bacc%3Dar-xx_single-scripted_generic_30m_v01_Sample/MAI_en-us%3Bacc%3Dar-xx_single-scripted_generic_30m_v01_Sample.zip?se=2024-06-15T15%3A17%3A59Z&sp=r&sv=2020-06-12&ss=b&srt=o&sig=%2BFCbU4PhSfcl28AkHcUTaenFO8g/kz3VbveRge1DXI8%3D)
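
The sample transcriptions are distributed as `.tsv` files. As a rough sketch of how one might load such a file, the snippet below reads tab-separated rows with Python's standard `csv` module. It assumes the file is UTF-8 and that its first row is a header; the column names are not documented here, so the code takes them from the file rather than hard-coding them. The function name `load_transcriptions` is hypothetical.

```python
import csv


def load_transcriptions(tsv_path):
    """Read a tab-separated transcription file into a list of dicts.

    Assumes (not confirmed by the dataset page) a UTF-8 file whose
    first row holds the column names.
    """
    with open(tsv_path, newline="", encoding="utf-8") as handle:
        # QUOTE_NONE keeps quotation marks inside transcripts as literal text.
        reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        return list(reader)
```

Inspecting the keys of the first returned row is a quick way to see which columns a given transcription file actually provides.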
