PCVC Speech Dataset: Difference between revisions

Sabemalek (talk | contribs)
mNo edit summary
Declining submission: nn - Submission is about a topic not yet shown to meet general notability guidelines (be more specific if possible) (AFCH 0.9)
Line 1:
{{AFC submission|d||ts=20180330123128nn|u=Sabemalek|ns=118|decliner=Joe Decker|declinets=20180330161206|ts=20180330123128}} <!-- Do not remove this line! -->
 
{{AFC comment|1=The dataset may very well meet our notability requirements, but the article will need references to multiple, reliable, independent sources in order to show that before we can accept.
 
It is often the case, in fact nearly universally the case, that new editors find the specifics of what we need there tough to understand; you may wish to consider asking for assistance at the [[WP:Teahouse|Wikipedia Teahouse]]. Thanks. [[User:Joe Decker|joe decker]][[User talk:Joe Decker|<sup><small><i>talk</i></small></sup>]] 16:12, 30 March 2018 (UTC)}}
 
----
 
The '''PCVC Speech Dataset''' is a [[Modern Persian]] [[speech corpus]] for [[speech recognition]]. It contains sound samples, recorded from different speakers, of combinations of Modern Persian [[vowel]] and [[consonant]] phonemes. Each sound sample contains exactly one consonant followed by one vowel, so the dataset is effectively labeled at the phoneme level. The dataset covers 23 Persian consonants and 6 vowels, and the samples span all possible consonant-vowel combinations (23 × 6 = 138 samples per speaker). All recordings have a sample rate of 48,000 Hz, that is, 48,000 audio samples per second. Each sample is 2 seconds long, so the recording for one speaker totals 276 seconds (138 two-second samples). On average, about 0.5 second of each 2-second sample is speech and the rest is silence; the first 0.25 s and the last 0.25 s of every sample are always silence. All sound samples are denoised with an "Adaptive noise reduction" algorithm.<ref>Saber Malekzadeh, Mohammad Hossein Gholizadeh, Seyed Naser Razavi {{cite paper |title=Full Persian Vowel recognition with MFCC and ANN on PCVC speech dataset |url=http://bayanbox.ir/download/2723849504007807268/Full-Persian-Vowel-recognition-with-MFCC-and-ANN-on-PCVC-speech-dataset.pdf }} 5th International conference of electrical engineering, computer science and information technology, Iran, Tehran, 2018.</ref>
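
The figures in the description can be checked with a short calculation. The following Python sketch is illustrative only and is not part of the dataset's own tooling: it assumes that one speaker's recording is available as a single contiguous sequence of audio samples, and the function names <code>split_speaker_recording</code> and <code>trim_edge_silence</code> are hypothetical. The order of the consonant-vowel pairs within a recording is not stated in the description, so the sketch does not attempt to label the clips.

```python
# Constants taken from the dataset description.
SAMPLE_RATE_HZ = 48_000        # audio samples per second
N_CONSONANTS = 23
N_VOWELS = 6
CLIP_SECONDS = 2.0             # length of one consonant-vowel sample
EDGE_SILENCE_SECONDS = 0.25    # guaranteed silence at each end of a sample

CLIPS_PER_SPEAKER = N_CONSONANTS * N_VOWELS             # 138
CLIP_LEN = int(SAMPLE_RATE_HZ * CLIP_SECONDS)           # 96,000 audio samples
EDGE_LEN = int(SAMPLE_RATE_HZ * EDGE_SILENCE_SECONDS)   # 12,000 audio samples
SPEAKER_SECONDS = CLIPS_PER_SPEAKER * CLIP_SECONDS      # 276.0 seconds


def split_speaker_recording(signal):
    """Split one speaker's recording into 138 two-second clips.

    Assumption: `signal` is any sliceable sequence (list, array, ...)
    holding the full 276 s recording at 48 kHz.
    """
    expected = CLIPS_PER_SPEAKER * CLIP_LEN
    if len(signal) != expected:
        raise ValueError(f"expected {expected} audio samples, got {len(signal)}")
    return [
        signal[i * CLIP_LEN:(i + 1) * CLIP_LEN]
        for i in range(CLIPS_PER_SPEAKER)
    ]


def trim_edge_silence(clip):
    """Drop the first and last 0.25 s of a clip, which are always silence."""
    return clip[EDGE_LEN:len(clip) - EDGE_LEN]
```

Trimming only removes the guaranteed silence, leaving 1.5 s per clip; since speech occupies about 0.5 s on average, locating the phonemes precisely would still require a separate voice activity detection step.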
Line 18 ⟶ 24:
{{Corpus linguistics}}
 
[[:Category:Corpora]]
[[:Category:Datasets in machine learning]]
[[:Category:Persian language]]