2021-04-14_Quantisation intervals of the MATLAB quantizer

Author: 笔触狂放9 | 2024-07-20 11:31:55

Multimedia exam topics

1.* Quantising with a 3-bit quantiser over the amplitude range -15 to 15

qsize = 2^3;                       % 8

b = -(qsize-1)/2 : (qsize-1)/2;    % [-3.5 -2.5 … 2.5 3.5]

del = 2*15/qsize;                  % 3.75

b = b*del;                         % [-13.125 -9.375 -5.625 -1.875 1.875 5.625 9.375 13.125]

b is the uniform quantisation table.

E0 = |x - y0| = |0.8 - (-13.125)| = 13.925

…

E4 = |x - y4| = |0.8 - 1.875| = 1.075, which is the minimum value.

The quantiser index is 4 (counting from 0), so Q(0.8) = 1.875.
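The table construction and the nearest-level search above can be sketched in a few lines of Python. This is only an illustration of the same arithmetic, not the MATLAB code from the notes; the function names `uniform_levels` and `quantise` are made up for this example, and the index is 0-based as in E0 … E4.

```python
def uniform_levels(bits, s_max):
    """Reconstruction levels of a uniform quantiser covering [-s_max, s_max]."""
    qsize = 2 ** bits                 # number of levels (8 for 3 bits)
    step = 2 * s_max / qsize          # step size, called del in the notes
    # Half-integer multiples of the step: -(qsize-1)/2 ... (qsize-1)/2
    return [(k - (qsize - 1) / 2) * step for k in range(qsize)]


def quantise(x, levels):
    """Return (index, level) of the level closest to x; the index is 0-based."""
    index = min(range(len(levels)), key=lambda k: abs(x - levels[k]))
    return index, levels[index]


levels = uniform_levels(bits=3, s_max=15)
index, q = quantise(0.8, levels)
print(levels)     # [-13.125, -9.375, -5.625, -1.875, 1.875, 5.625, 9.375, 13.125]
print(index, q)   # 4 1.875
```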

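Using the warping equation:

Here S is the input 0.8 and Smax is the maximum amplitude 15, which gives F(0.8) = 7.2523.

Since E5 = |x - y5| = |7.2523 - 5.625| = 1.6273 is the minimum value, the quantiser index is 5.

Q(F(0.8)) = 5.625. Note that the value of F(s) substituted in the later step is 5.625.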

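Eq1 = |1.875 - 0.8| = 1.075 (the absolute value of output minus input)

Eq2 = |0.8 - 0.4118| = 0.3882, where 0.4118 is the inverse-warped value of 5.625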
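The warping equation itself did not survive in these notes. The quoted numbers (F(0.8) = 7.2523 with Smax = 15, and 0.4118 in Eq2) are reproduced by μ-law companding with μ = 255, so the sketch below assumes that form. The value of μ and the names `warp`, `unwarp` and `nearest` are assumptions for illustration, not something the notes state.

```python
import math

MU = 255.0      # assumed: not given in the notes, chosen because it reproduces 7.2523
S_MAX = 15.0    # maximum amplitude from the question


def warp(s, s_max=S_MAX, mu=MU):
    """mu-law warping F(s) (assumed form of the warping equation)."""
    return math.copysign(
        s_max * math.log(1 + mu * abs(s) / s_max) / math.log(1 + mu), s)


def unwarp(y, s_max=S_MAX, mu=MU):
    """Inverse of warp(): maps a warped value back to the signal domain."""
    return math.copysign(s_max / mu * ((1 + mu) ** (abs(y) / s_max) - 1), y)


# Same 3-bit uniform table as before: half-integer multiples of the step 3.75.
levels = [(k - 3.5) * 3.75 for k in range(8)]


def nearest(x):
    """Closest reconstruction level to x."""
    return min(levels, key=lambda level: abs(x - level))


x = 0.8
warped = warp(x)              # about 7.2523
q_warped = nearest(warped)    # 5.625
x_hat = unwarp(q_warped)      # about 0.4118

eq1 = abs(nearest(x) - x)     # error of plain uniform quantisation, about 1.075
eq2 = abs(x - x_hat)          # error with warping, about 0.3882
print(round(warped, 4), q_warped, round(x_hat, 4), round(eq1, 4), round(eq2, 4))
```

Under this assumption the warped quantiser gives the smaller error for a low-amplitude input, which is the point of the comparison between Eq1 and Eq2.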

2. Describe the source-filter model of speech production and explain it

The mixing decision chooses periodic excitation during voiced speech and random excitation during unvoiced speech. This excitation is input to a filter that models the vocal tract, the output of which is synthesised speech.
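A toy version of this model, as a rough sketch only: the excitation is either an impulse train (voiced) or random noise (unvoiced), and a one-pole filter stands in for the vocal tract. The pitch period of 40 samples, the filter coefficient 0.95 and the function names are all illustrative assumptions; a real model would use a higher-order LPC filter.

```python
import random


def excitation(voiced, length, pitch_period=40, seed=0):
    """Periodic impulse train for voiced speech, random noise for unvoiced speech."""
    if voiced:
        return [1.0 if n % pitch_period == 0 else 0.0 for n in range(length)]
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(length)]


def vocal_tract(signal, a=0.95):
    """One-pole filter y[n] = x[n] + a * y[n-1], a stand-in for the vocal tract filter."""
    out, prev = [], 0.0
    for x in signal:
        prev = x + a * prev
        out.append(prev)
    return out


voiced_speech = vocal_tract(excitation(True, 120))     # periodic output
unvoiced_speech = vocal_tract(excitation(False, 120))  # noise-like output
```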

3. Interpreting the time-domain waveform of a speech signal

The first 0.1 s (0 to 0.1 s) is periodic and therefore voiced; the remaining section is unvoiced.

The pitch period of the waveform is the distance between the large peaks.
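As a rough illustration of reading the pitch off the large peaks, the sketch below measures the average spacing between peaks above a threshold. The toy waveform, the threshold and the 8 kHz sampling rate are assumptions for the example and are not taken from the figure.

```python
def pitch_period(samples, threshold):
    """Average spacing, in samples, between local maxima that exceed threshold."""
    peaks = [n for n in range(1, len(samples) - 1)
             if samples[n] > threshold
             and samples[n] > samples[n - 1]
             and samples[n] >= samples[n + 1]]
    if len(peaks) < 2:
        return None   # not enough large peaks to measure a period
    gaps = [b - a for a, b in zip(peaks, peaks[1:])]
    return sum(gaps) / len(gaps)


# Toy voiced segment: a large peak every 80 samples followed by a decaying tail.
samples = [0.9 ** (n % 80) for n in range(400)]

period = pitch_period(samples, threshold=0.5)
print(period, 8000 / period)   # 80.0 100.0  (period in samples, pitch in Hz at 8 kHz)
```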

4.* Derivation of the long-term prediction (LTP) formula

Remember the formula for E, differentiate it with respect to λ, set the derivative to 0, and solve for λ.

Differentiating E with respect to x (taking the partial derivative)
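The formula for E was in a figure that is no longer available, so the following is only a sketch under an assumed, commonly used single-tap form: E = Σ (r[n] - g·r[n-T])², where r is the residual, T the lag and g the gain. Setting dE/dg = 0 gives g = Σ r[n]·r[n-T] / Σ r[n-T]². The symbols r, T and g and the function names are assumptions for this example.

```python
def ltp_gain(residual, lag):
    """Gain g minimising sum((r[n] - g * r[n - lag])**2) over n >= lag."""
    num = sum(residual[n] * residual[n - lag] for n in range(lag, len(residual)))
    den = sum(residual[n - lag] ** 2 for n in range(lag, len(residual)))
    return num / den if den else 0.0


def ltp_error(residual, lag, gain):
    """Squared prediction error E for a given gain."""
    return sum((residual[n] - gain * residual[n - lag]) ** 2
               for n in range(lag, len(residual)))


# Toy residual: a 4-sample pattern that repeats with half the amplitude each period.
pattern = [1.0, -2.0, 0.5, 3.0]
residual = [v * 0.5 ** period for period in range(4) for v in pattern]

gain = ltp_gain(residual, lag=4)
print(gain)                               # 0.5
print(ltp_error(residual, 4, gain))       # 0.0
```

Any other gain gives a larger error on this toy signal, which is what the zero of the derivative expresses.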

5. The drawback of AbyS coders in achieving high-quality speech

The tradeoff is that the computational complexity of an AbyS coder is much higher than that of direct quantisation methods, because it requires a closed-loop search of a codebook.

6.* Multi-Pulse Excitation (MPE) coding and bit-rate calculation

MPE coding is where the excitation is encoded as a series of impulses; the information to be encoded is the position (time sample) and amplitude of each impulse.

The solution above should have 3 impulses. In the figure, the approximate values of (amplitude, position) are (2.5, 15), (2.5, 26) and (3, 36).

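For a frame of 50 samples, the pulse position needs 6 bits, since 2^6 = 64 > 50.

The question specifies 5 bits for the pulse gain, so each pulse needs 11 bits in total.

The bit rate is therefore 11*5/50 * 8k = 8.8 kbps (5 pulses per 50-sample frame at 8 kHz sampling).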
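The same calculation as a small Python sketch. The function name and parameter names are made up for this example, and the 5 pulses per frame and 8 kHz sampling rate are read off the factors in the formula above.

```python
import math


def mpe_bit_rate(frame_len, pulses_per_frame, gain_bits, sample_rate):
    """Bit rate in bits per second for multi-pulse excitation coding."""
    position_bits = math.ceil(math.log2(frame_len))    # 50 samples -> 6 bits
    bits_per_pulse = position_bits + gain_bits          # 6 + 5 = 11
    bits_per_frame = bits_per_pulse * pulses_per_frame  # 11 * 5 = 55
    return bits_per_frame * sample_rate / frame_len


rate = mpe_bit_rate(frame_len=50, pulses_per_frame=5, gain_bits=5, sample_rate=8000)
print(rate)   # 8800.0 bits per second, i.e. 8.8 kbps
```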

7.* Key stages of Analysis-by-Synthesis (AbyS) coding

Excitation generator: determines the excitation signal to use in the speech synthesis stage.

Inverse long-term predictor: processes the excitation to produce a linear prediction residual signal.

Inverse short-term predictor: applies an LPC synthesis filter to the residual signal to produce synthesised speech.

Summing stage: subtracts the original speech from the synthesised version to produce an error signal.

Error minimisation: determines parameters, based on the error signal, that are used to adaptively choose an alternative excitation signal such that a new version of the synthesised speech has a lower error than the previous version.
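The closed loop formed by these stages can be sketched with a toy codebook search. This is a heavily simplified illustration: a one-pole filter stands in for the LPC synthesis filter, the inverse long-term predictor is omitted, and the codebook entries, filter coefficient and function names are all made up for the example.

```python
def synthesise(excitation, a=0.9):
    """One-pole synthesis filter y[n] = x[n] + a * y[n-1] (stand-in for the LPC filter)."""
    out, prev = [], 0.0
    for x in excitation:
        prev = x + a * prev
        out.append(prev)
    return out


def abys_search(target, codebook):
    """Synthesise every candidate excitation and keep the one with the lowest squared error."""
    best_index, best_error = None, float("inf")
    for index, excitation in enumerate(codebook):
        synthetic = synthesise(excitation)                           # synthesis stage
        error = sum((s - t) ** 2 for s, t in zip(synthetic, target))  # summing stage
        if error < best_error:                                       # error minimisation
            best_index, best_error = index, error
    return best_index, best_error


# Hypothetical excitation codebook and a target close to the third entry.
codebook = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, -1.0, 0.0],
]
target = synthesise([0.9, 0.1, -1.1, 0.0])

best_index, best_error = abys_search(target, codebook)
print(best_index)   # 2
```

Every candidate has to pass through the synthesis filter before it can be scored, which is why the closed-loop search costs far more than quantising the excitation directly.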

8. The two categories of noise

Stationary noise: remains unchanged over time, e.g. noise from fans.

Non-stationary noise: the spectral characteristics of the noise change over time, e.g. noise on the street.

9. Given a speech frame and a noise frame: the spectral subtraction algorithm and musical noise

At approximately 1500 to 2000 Hz and 2750 to 3000 Hz, the noise magnitude exceeds the speech magnitude.

Musical noise occurs when the spectral subtraction algorithm results in a negative output magnitude, in which case the half-wave rectification function sets the output to 0. This would leave no output in the two bands above and would cause significant distortion of the speech signal.

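Assumptions of the spectral subtraction algorithm:

The noise is additive.

The noise is stationary or slowly varying.

The noise spectrum does not change significantly between update periods.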
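A minimal sketch of magnitude spectral subtraction with half-wave rectification, to show where the zeroed bins come from. The magnitudes are invented for five frequency bins and are not the values from the figure; the function name is also just for this example.

```python
def spectral_subtract(speech_mag, noise_mag):
    """Subtract the noise magnitude per bin; negative results are clipped to 0."""
    return [max(s - n, 0.0) for s, n in zip(speech_mag, noise_mag)]


# Hypothetical magnitude spectra (one value per frequency bin).
noisy_speech = [4.0, 3.0, 1.0, 2.5, 0.5]
noise_estimate = [1.0, 1.0, 1.5, 1.0, 2.0]

cleaned = spectral_subtract(noisy_speech, noise_estimate)
print(cleaned)   # [3.0, 2.0, 0.0, 1.5, 0.0]
```

The third and fifth bins, where the noise estimate exceeds the speech magnitude, end up at 0; isolated zeroed bins like these are what the notes describe as the source of musical noise.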

10. Deciding which microphone is best

From this, it shows that microphone B provides the highest level of attenuation at the sides and of the microphone
