:::

Title: Pitch Marking Based on an Adaptable Filter and a Peak-Valley Estimation Method
Journal: International Journal of Computational Linguistics & Chinese Language Processing
Authors: Chen, Jau-hung; Kao, Yung-an
Publication date: 2001
Volume/Issue: 6:2
Pages: 31-42
In a text-to-speech (TTS) conversion system based on the time-domain pitch-synchronous overlap-add (TD-PSOLA) method, accurate estimation of pitch periods and pitch marks is necessary for pitch modification to ensure optimal quality of the synthetic speech. Pitch marking involves two major tasks: pitch detection and location determination. This paper proposes an adaptable filter, which serves as a bandpass filter, for use in pitch detection; it transforms voiced speech into a sine-like wave, and its pass band is adapted to the fundamental frequency. Based on the sine-like wave, a peak-valley decision method is proposed to determine which part of the voiced speech (positive or negative) is appropriate for pitch mark estimation. In each pitch period, two candidate peaks/valleys are searched, and dynamic programming is performed to obtain the pitch marks. Experimental results indicate that the proposed method performs very well provided that correct pitch information is estimated.
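The final step described in the abstract, choosing one of two candidates per pitch period by dynamic programming, can be illustrated with a minimal sketch. The abstract does not give the cost function, so the one used here (deviation of the spacing between consecutive marks from an expected pitch period) is an assumption made for illustration, and the function name `select_pitch_marks` and its parameters are hypothetical rather than taken from the paper.

```python
def select_pitch_marks(candidates, period):
    """Pick one candidate per pitch period with dynamic programming.

    candidates: list of lists of sample indices, one inner list per
                pitch period (e.g. two candidate peaks/valleys each).
    period:     expected pitch period in samples.

    ASSUMPTION: the transition cost is |spacing - period|; the paper's
    actual cost function is not stated in the abstract.
    """
    if not candidates:
        return []

    # cost[j]: best accumulated cost of a path ending at candidate j
    # of the pitch period currently being processed.
    cost = [0.0] * len(candidates[0])
    # back[i][j]: best predecessor (index into period i) for
    # candidate j of period i + 1.
    back = []

    for prev, cur in zip(candidates, candidates[1:]):
        new_cost, pointers = [], []
        for c in cur:
            options = [cost[k] + abs((c - p) - period)
                       for k, p in enumerate(prev)]
            best = min(range(len(options)), key=options.__getitem__)
            new_cost.append(options[best])
            pointers.append(best)
        cost = new_cost
        back.append(pointers)

    # Trace back from the cheapest final candidate.
    j = min(range(len(cost)), key=cost.__getitem__)
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    path.reverse()

    return [candidates[i][j] for i, j in enumerate(path)]


if __name__ == "__main__":
    # Hypothetical candidates (sample indices), two per pitch period.
    marks = select_pitch_marks(
        [[10, 50], [110, 160], [210, 255], [310, 350]], period=100)
    print(marks)
```

In this sketch the path through the first candidate of every period has evenly spaced marks and therefore zero accumulated cost, so it is the one selected. A fuller implementation would presumably also weigh waveform amplitude at each candidate and use a per-period pitch estimate instead of a single `period` value.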
 
 
 
 