BVC Gender & Age from Voice Challenging Set

About BVC Gender & Age from Voice Challenging Dataset

The Biometrics Visions and Computing (BVC) Gender & Age from Voice dataset comprises voice utterances from 526 individuals (336 male and 190 female), with one to five voice recordings per individual.

It is a challenging, noisy dataset: white noise is present in all the voice samples and needs to be filtered out before use.
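
As a rough illustration of what such filtering can look like, the sketch below smooths a synthetic noisy signal with a centered moving average. It is not the dataset authors' method. The sample rate, the tone frequency, the noise level and the helper names (`moving_average`, `mean_squared_error`) are all assumptions made for the example, and real recordings would more likely call for a band-pass filter or spectral subtraction.

```python
import math
import random


def moving_average(samples, window=9):
    """Centered moving average; the window shrinks near the edges."""
    half = window // 2
    smoothed = []
    for i in range(len(samples)):
        lo = max(0, i - half)
        hi = min(len(samples), i + half + 1)
        smoothed.append(sum(samples[lo:hi]) / (hi - lo))
    return smoothed


def mean_squared_error(a, b):
    """Average squared difference between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Synthetic stand-in for a recording (NOT data from the BVC set):
# one second of a 100 Hz tone with additive white Gaussian noise.
SAMPLE_RATE = 8000  # Hz, assumed for illustration only
rng = random.Random(0)
clean = [math.sin(2 * math.pi * 100 * n / SAMPLE_RATE) for n in range(SAMPLE_RATE)]
noisy = [s + rng.gauss(0.0, 0.3) for s in clean]
denoised = moving_average(noisy, window=9)

print("MSE before filtering:", mean_squared_error(noisy, clean))
print("MSE after filtering: ", mean_squared_error(denoised, clean))
```

A moving average acts as a crude low-pass filter: it averages out the broadband noise while leaving a low-frequency tone largely intact. A wider window removes more noise but also blurs higher speech frequencies, so the window length would need tuning on the actual recordings.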

Content

The total number of voice utterances is 3,964, consisting of 2,149 male and 1,815 female voice utterances. Five different speeches in English and their translated native-language equivalents were acquired from the subjects in the first and second sessions. One to five voice recordings in English and in native languages were acquired from each subject. Twenty-eight (28) different native languages make up the native language set.
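
The counts above imply a gender imbalance that is stronger at the speaker level than at the utterance level. The short sketch below checks the totals and computes the class shares, along with inverse-frequency class weights, a common heuristic for imbalanced classification. The variable names and the weighting scheme are illustrative assumptions, not part of the dataset or the cited paper.

```python
# Counts quoted in the dataset description.
speakers = {"male": 336, "female": 190}
utterances = {"male": 2149, "female": 1815}

total_speakers = sum(speakers.values())
total_utterances = sum(utterances.values())


def shares(counts):
    """Return each class's fraction of the total count."""
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


speaker_share = shares(speakers)
utterance_share = shares(utterances)

# Inverse-frequency weights: total / (number_of_classes * class_count).
# This is an assumed heuristic for illustration, not the authors' setup.
class_weights = {
    label: total_utterances / (len(utterances) * n)
    for label, n in utterances.items()
}

print("Speakers:", total_speakers, speaker_share)
print("Utterances:", total_utterances, utterance_share)
print("Utterance class weights:", class_weights)
```

Because several utterances come from the same speaker, any train/test split for gender classification would normally be made by speaker rather than by utterance, so that one person's voice does not appear on both sides.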

Acknowledgements / Citation for the BVC Challenging Voice dataset

If you use the BVC Voice dataset in any form of research, please cite the following paper:

O. Iloanusi, U. Ejiogu, I. Okoye, I. J. F. Ezika, S. Ezichi, C. Osuagwu and E. Ejiogu, “Voice Recognition and Gender Classification in the Context of Native Languages and Lingua Franca,” in the 6th IEEE International Conference on Soft Computing & Machine Intelligence (ISCMI), Johannesburg, South Africa, November 19-20, 2019, pp. 175–179. https://ieeexplore.ieee.org/document/9004306

The paper can be accessed from https://ieeexplore.ieee.org/document/9004306

More Information on the BVC Challenging Voice Set

The total number of subjects with complete information in the BVC voice data collection is 526, of which 336 are male and 190 are female.

Request for Datasets