[Image not available]
a.
qsize = 2^3 = 8
b = -(qsize-1)/2 : (qsize-1)/2 = [-3.5 -2.5 … 2.5 3.5]
del = 2*15/qsize = 3.75
b = b*del = [-13.125 -9.375 -5.625 -1.875 1.875 5.625 9.375 13.125]
b is the uniform quantisation table.
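As a quick check of the table above, here is a minimal Python sketch of the same construction. The function name `uniform_table` and its parameters are illustrative, not from the original notes; `step` plays the role of `del`.

```python
def uniform_table(bits, s_max):
    """Return the output levels of a uniform quantiser spanning -s_max..s_max."""
    qsize = 2 ** bits            # number of levels, 2^3 = 8 in the notes
    step = 2 * s_max / qsize     # 'del' in the notes, 3.75 here
    # Half-integer indices -(qsize-1)/2 ... (qsize-1)/2, scaled by the step size
    return [(k - (qsize - 1) / 2) * step for k in range(qsize)]


table = uniform_table(bits=3, s_max=15)
print(table)
```

The levels are symmetric about zero and there is no zero level, so small inputs such as 0.8 are mapped to ±1.875.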
b.
E0 = |x - y0| = |0.8 - (-13.125)| = 13.925
…
E4 = |x - y4| = |0.8 - 1.875| = 1.075 is the minimum value.
The quantiser index is 4, so Q(0.8) = 1.875.
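The nearest-level search can be sketched in Python as follows. The helper `quantise` is a hypothetical name, and indices are counted from 0 to match E0 … E4 in the notes.

```python
def quantise(x, table):
    """Return (index, level) of the table entry with the smallest |x - level|."""
    errors = [abs(x - y) for y in table]
    index = errors.index(min(errors))
    return index, table[index]


# Uniform table from part a (3 bits, maximum amplitude 15)
table = [-13.125, -9.375, -5.625, -1.875, 1.875, 5.625, 9.375, 13.125]
index, level = quantise(0.8, table)
print(index, level)
```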
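c.
Use the warping equation:
Here s is the input 0.8 and Smax is the maximum amplitude 15, which gives F(0.8) = 7.2523.
Since E5 = |F(x) - y5| = |7.2523 - 5.625| = 1.6273 is the minimum, the quantiser index is 5.
Q(F(0.8)) = 5.625. Note that the value substituted for F(s) in the next step is 5.625.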
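The warping equation itself is not reproduced in these notes. The value F(0.8) = 7.2523 is consistent with μ-law companding with μ = 255, so the sketch below assumes that form; both the formula and the value of μ are assumptions, and `mu_law_compress` is an illustrative name.

```python
import math


def mu_law_compress(s, s_max, mu=255.0):
    """mu-law warping F(s). mu = 255 is assumed, not stated in the notes."""
    magnitude = s_max * math.log(1 + mu * abs(s) / s_max) / math.log(1 + mu)
    return math.copysign(magnitude, s)


table = [-13.125, -9.375, -5.625, -1.875, 1.875, 5.625, 9.375, 13.125]
warped = mu_law_compress(0.8, s_max=15)
# Quantise the warped value with the same uniform table as in part a
index = min(range(len(table)), key=lambda k: abs(warped - table[k]))
print(round(warped, 4), index, table[index])
```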
[Image not available]
d.
Eq1 = |1.875 - 0.8| = 1.075 (the absolute value of output minus input)
Eq2 = |0.8 - 0.4118| = 0.3882
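The value 0.4118 matches what the inverse of a μ-law warping with μ = 255 gives for the quantised value 5.625. The sketch below reproduces both errors under that assumption; the inverse formula and `mu_law_expand` are not taken from the notes.

```python
import math


def mu_law_expand(y, s_max, mu=255.0):
    """Inverse mu-law warping. mu = 255 is assumed, not stated in the notes."""
    magnitude = (s_max / mu) * ((1 + mu) ** (abs(y) / s_max) - 1)
    return math.copysign(magnitude, y)


x = 0.8
err_uniform = abs(1.875 - x)               # Eq1: uniform quantiser output vs input
decoded = mu_law_expand(5.625, s_max=15)   # undo the warping of Q(F(0.8))
err_companded = abs(x - decoded)           # Eq2: companded output vs input
print(round(err_uniform, 4), round(decoded, 4), round(err_companded, 4))
```

For this small input the companded quantiser gives the lower error, which is the point of the comparison.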
[Image not available]
The mixing decision selects periodic excitation during voiced speech and random excitation during unvoiced speech. This excitation is input to a filter that models the vocal tract, the output of which is the synthesised speech.
[Image not available]
a.
The first 0.1 s, where the signal is periodic, is voiced and the remaining section is unvoiced.
b.
The pitch period of the waveform is the distance between the large peaks.
Remember the formula for E: differentiate it with respect to λ and set the derivative to 0 to obtain the value of λ.
[Image not available]
Take the partial derivative of E with respect to x.
The tradeoff is that the computational complexity of the AbS (analysis-by-synthesis) coder is much higher than that of the direct quantisation method, because it requires a closed-loop search of a codebook.
MPE coding encodes the excitation as a series of impulses. The information to be encoded is the position (time sample) and amplitude of each impulse.
[Image not available]
The solution above should have 3 impulses. In the figure, the approximate values of (amplitude, position) are (2.5, 15), (2.5, 26) and (3, 36).
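To make the (amplitude, position) representation concrete, here is a small Python sketch that rebuilds an excitation frame from the three impulses. The frame length of 50 samples and the 0-based positions are assumptions for illustration, and `mpe_excitation` is a hypothetical helper.

```python
def mpe_excitation(pulses, frame_len):
    """Build an excitation frame that is zero except at the given impulses."""
    excitation = [0.0] * frame_len
    for amplitude, position in pulses:
        excitation[position] = amplitude   # position is a sample index (assumed 0-based)
    return excitation


pulses = [(2.5, 15), (2.5, 26), (3.0, 36)]            # values read off the figure
excitation = mpe_excitation(pulses, frame_len=50)     # frame length is assumed
print([(n, v) for n, v in enumerate(excitation) if v != 0.0])
```

Only the pulse positions and amplitudes need to be transmitted. The decoder fills in the zeros.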
[Image not available]
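For 50 samples, the position of each pulse needs 6 bits, since 2^6 = 64 > 50.
The question specifies 5 bits for the gain of each pulse, so 11 bits per pulse are needed in total.
The bit rate is therefore 11*5/50 * 8k = 8.8 kbps.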
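The arithmetic can be checked with a short Python sketch. It reads the expression 11*5/50 * 8k as 5 pulses per 50-sample frame at a sampling rate of 8 kHz; that reading, and the function name `mpe_bit_rate`, are assumptions.

```python
import math


def mpe_bit_rate(frame_len, pulses, gain_bits, sample_rate):
    """Bit rate in bits per second for a multi-pulse excitation frame."""
    position_bits = math.ceil(math.log2(frame_len))   # 6 bits cover 50 positions
    bits_per_pulse = position_bits + gain_bits        # 6 + 5 = 11
    bits_per_frame = bits_per_pulse * pulses          # 11 * 5 = 55
    return bits_per_frame * sample_rate / frame_len   # frames per second = fs / frame_len


rate = mpe_bit_rate(frame_len=50, pulses=5, gain_bits=5, sample_rate=8000)
print(rate)
```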
[Image not available]
Excitation generator: determines the excitation signal to use in the speech synthesis stage.
Inverse long-term predictor: processes the excitation to produce a linear prediction residual signal.
Inverse short-term predictor: applies an LPC synthesis filter to the residual signal to produce synthesised speech.
Summing stage: subtracts the original speech from the synthesised version to produce an error signal.
Error minimisation: determines parameters, based on the error signal, that are used to adaptively choose an alternative excitation signal such that a new version of the synthesised speech has a lower error than the previous version.
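The closed loop formed by these blocks can be illustrated with a toy Python sketch. It is a simplification under stated assumptions, not the coder from the question: the long-term predictor is omitted, a one-pole filter with a made-up coefficient stands in for the LPC synthesis filter, and the codebook entries are invented.

```python
def synthesise(excitation, a=0.9):
    """Toy synthesis filter y[n] = x[n] + a*y[n-1] (stand-in for LPC synthesis)."""
    out, prev = [], 0.0
    for x in excitation:
        prev = x + a * prev
        out.append(prev)
    return out


def abs_search(target, codebook):
    """Closed-loop search: synthesise each candidate and keep the lowest error."""
    best_index, best_error = None, float("inf")
    for i, excitation in enumerate(codebook):
        synth = synthesise(excitation)
        # Summing stage and error measure: squared difference to the target speech
        error = sum((t - s) ** 2 for t, s in zip(target, synth))
        if error < best_error:
            best_index, best_error = i, error
    return best_index, best_error


# Hypothetical codebook of candidate excitations
codebook = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [1.0, 0.0, -1.0, 0.0]]
target = synthesise(codebook[2])   # pretend the original speech came from entry 2
best_index, best_error = abs_search(target, codebook)
print(best_index, best_error)
```

Every candidate has to be passed through the synthesis filter before it can be scored, which is where the extra computational cost of the closed-loop approach comes from.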
Stationary noise: remains unchanged over time, e.g. noise from fans.
Non-stationary noise: the spectral characteristics of the noise change over time, e.g. noise on the street.
[Image not available]
a.
At approximately 1500 to 2000 Hz and 2750 to 3000 Hz, the noise magnitude exceeds the speech magnitude.
b.
Musical noise occurs when the spectral subtraction algorithm results in a negative output magnitude, in which case the half-wave rectification function sets the output to 0. This would result in no output in the two regions above and would cause significant distortion of the speech signal.
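The rectification step described here can be sketched in Python as follows. The magnitude values are invented for illustration, and `spectral_subtract` is a hypothetical helper operating on per-bin magnitudes.

```python
def spectral_subtract(noisy_mag, noise_mag):
    """Magnitude spectral subtraction with half-wave rectification."""
    # A negative difference (noise estimate above the noisy magnitude) is set to 0
    return [max(y - n, 0.0) for y, n in zip(noisy_mag, noise_mag)]


noisy = [5.0, 3.0, 1.0, 4.0]   # made-up noisy-speech magnitudes per frequency bin
noise = [1.0, 1.0, 2.0, 1.5]   # made-up noise estimate per frequency bin
clean = spectral_subtract(noisy, noise)
print(clean)
```

In the third bin the noise estimate exceeds the noisy magnitude, so the output there is forced to 0 rather than going negative.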
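Assumptions of the spectral subtraction algorithm:
The noise is additive.
The noise is stationary or slowly varying.
The noise spectrum does not change significantly between update periods.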
From this, it can be seen that microphone B provides the highest level of attenuation at the sides and […] of the microphone.