The total number of subjects with complete information in the BVC voice data collection is 526, of which 336 are male and 190 are female. The total number of voice utterances is 3,964, consisting of 2,149 male and 1,815 female voice utterances.
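As a quick consistency check on the figures above, the following minimal sketch recomputes the totals from the per-gender counts and derives each gender's share of subjects and utterances. The variable and function names (`subjects`, `utterances`, `summarize`) are illustrative assumptions and are not part of the dataset or of any accompanying tooling.

```python
# Per-gender counts as stated in the dataset description.
subjects = {"male": 336, "female": 190}
utterances = {"male": 2149, "female": 1815}

# Totals recomputed from the per-gender counts.
total_subjects = sum(subjects.values())
total_utterances = sum(utterances.values())


def summarize():
    """Return each gender's share of subjects and utterances,
    plus the mean number of utterances per subject."""
    summary = {}
    for gender in subjects:
        summary[gender] = {
            "subject_share": subjects[gender] / total_subjects,
            "utterance_share": utterances[gender] / total_utterances,
            "utterances_per_subject": utterances[gender] / subjects[gender],
        }
    return summary


if __name__ == "__main__":
    print(f"subjects: {total_subjects}, utterances: {total_utterances}")
    for gender, stats in summarize().items():
        print(
            f"{gender}: {stats['subject_share']:.1%} of subjects, "
            f"{stats['utterance_share']:.1%} of utterances, "
            f"{stats['utterances_per_subject']:.2f} utterances per subject"
        )
```

The recomputed totals should match the stated 526 subjects and 3,964 utterances; the derived shares are simple ratios of the published counts and carry no information beyond them.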
Five different speeches in English and their equivalent native-language translations were acquired from the subjects in the first and second sessions.
Between one and five voice recordings in English and in native languages were acquired from each subject. The native language set comprises twenty-eight (28) different native languages.
The recordings in this dataset contain some background noise, which, however, does not seriously affect the audio.
If the BVC Voice dataset is used in any form of research, please cite the following paper:
O. Iloanusi, U. Ejiogu, I. Okoye, I. J. F. Ezika, S. Ezichi, C. Osuagwu and E. Ejiogu, “Voice Recognition and Gender Classification in the Context of Native Languages and Lingua Franca,” in the 6th IEEE International Conference on Soft Computing & Machine Intelligence (ISCMI), Johannesburg, South Africa, November 19-20, 2019, pp. 175–179. https://ieeexplore.ieee.org/document/9004306