Bengali Spontaneous Dialogue

Spontaneous Dialogue

About the Dataset

256 hours

This audio dataset contains 256 hours of Bengali speech data in the Banking, Telecommunication, Insurance, and Retail domains, recorded by native Bengali speakers from India.

Domain distribution per dataset:

  • 102.93 hours of Banking
  • 86.98 hours of Insurance
  • 29.32 hours of Retail
  • 36.93 hours of Telecommunication

We create scenarios for our crowd members to follow, which they study beforehand. They then record a conversation, with one speaker playing the agent and the other “playing out” the scenario with spontaneous content. The recording is made via telephony and saved at 8 kHz, 16 bits per channel. The content is then transcribed.
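
As a quick sanity check on the domain breakdown listed above, the per-domain hours can be summed and compared with the headline figure of 256 hours. The following is a minimal Python sketch; the variable names are illustrative and not part of the dataset or its documentation.

```python
# Hours per domain, copied from the domain distribution list.
domain_hours = {
    "Banking": 102.93,
    "Insurance": 86.98,
    "Retail": 29.32,
    "Telecommunication": 36.93,
}

# Total duration and each domain's share of that total.
total_hours = sum(domain_hours.values())
shares = {name: hours / total_hours for name, hours in domain_hours.items()}

print(f"Total: {total_hours:.2f} hours")
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

The four figures add up to 256.16 hours, which matches the advertised 256 hours after rounding.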

The dataset is covered by our standard license agreement. The license agreement is perpetual and allows for the commercialization of all models built on the data.

Other characteristics:

  • Audio format: WAV
  • Recording environment: noisy, silent
  • Bits per sample: 16
  • Communication band: broadband
  • Sample rate: 8kHz
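
One way to confirm that a downloaded file matches the characteristics listed above is to read its WAV header. The sketch below uses Python's standard `wave` module; the function name `check_wav` and the file name in the usage comment are hypothetical, and the number of channels is reported rather than checked, since the listing does not state it.

```python
import wave

# Expected values taken from the characteristics list.
EXPECTED_SAMPLE_RATE_HZ = 8000   # "Sample rate: 8kHz"
EXPECTED_BITS_PER_SAMPLE = 16    # "Bits per sample: 16"


def check_wav(path):
    """Read a WAV header and compare it with the listed format.

    Returns (ok, info): ok is True when the sample rate and bit depth
    match; info holds the values read from the file.
    """
    with wave.open(path, "rb") as wav:
        info = {
            "channels": wav.getnchannels(),
            "sample_rate_hz": wav.getframerate(),
            "bits_per_sample": wav.getsampwidth() * 8,
            "duration_s": wav.getnframes() / wav.getframerate(),
        }
    ok = (
        info["sample_rate_hz"] == EXPECTED_SAMPLE_RATE_HZ
        and info["bits_per_sample"] == EXPECTED_BITS_PER_SAMPLE
    )
    return ok, info


# Usage (hypothetical file name):
# ok, info = check_wav("sample.wav")
# print(ok, info)
```

The `wave` module only reads uncompressed PCM files, so a file that fails to open may be in a different encoding rather than damaged.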

Metadata Distribution

Figures: metadata distribution by accent, gender, and age.


A 5-minute audio sample is available, along with its transcription.

Download Free 30-minute Sample

Tell us about yourself, and get access to a 30-minute sample of Bengali Spontaneous Dialogue


By downloading, installing, accessing, and/or using this data sample, you consent to receive communications from us and affirm your acceptance of our Privacy Policy, Terms of Use, and Data License Agreement. Consent can be revoked at your discretion.


© 2024 DefinedCrowd. All rights reserved.