## About the Dataset
### 382 hours

This audio dataset contains 382 hours of Mandarin Chinese speech data in the generic domain.

Speakers are presented with a prompt (script) and asked to read it aloud while recording. Clients receive the audio recording, the prompt, and information about the speaker. The audio is recorded on-device, typically at 16 kHz, 16-bit. We also provide information on the device used for each recording.
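Because the format is described as "typically" 16 kHz, 16-bit, it can be useful to confirm the parameters of each file before feeding it to a training pipeline. The following is a minimal sketch, assuming the recordings are delivered as uncompressed WAV files (the description does not state the container format); the function name `check_recording` and the returned field names are hypothetical and chosen only for illustration.

```python
import wave

# Values taken from the dataset description ("16 kHz, 16-bit").
EXPECTED_RATE_HZ = 16_000
EXPECTED_SAMPLE_WIDTH_BYTES = 2  # 16 bits = 2 bytes per sample


def check_recording(path):
    """Read a WAV header and compare it with the typical dataset format.

    Assumes `path` points to an uncompressed PCM WAV file (an assumption,
    not something the dataset description confirms).
    Returns (matches, details).
    """
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        width = wav.getsampwidth()
        channels = wav.getnchannels()
        frames = wav.getnframes()

    details = {
        "sample_rate_hz": rate,
        "bit_depth": width * 8,
        "channels": channels,
        "duration_s": frames / rate if rate else 0.0,
    }
    matches = rate == EXPECTED_RATE_HZ and width == EXPECTED_SAMPLE_WIDTH_BYTES
    return matches, details
```

A file that does not match is not necessarily faulty, since the format is only stated as typical; the check simply flags recordings that may need resampling or conversion.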

The dataset is covered by [Defined.ai's standard license agreement](https://www.defined.ai/dataset/data-licence-agreement). The license agreement is perpetual and allows for the commercialization of all models built on the data.



You might also be interested in these audio datasets:

Mandarin Chinese Spontaneous IVR

Mandarin Chinese (PRC) Scripted Monologue

Korean Spontaneous Dialogue

Japanese Spontaneous Dialogue