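When working with audio and video data, we can first read the data into an array, which makes operations such as cutting, trimming, and merging very convenient.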
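As a rough illustration of why arrays are convenient here, the sketch below trims and joins a synthetic signal with plain NumPy slicing. The helper names `trim` and `join` and the 16 kHz rate are made up for this example; they are not part of the functions in this post.

```python
import numpy as np


def trim(samples, rate, start_s, end_s):
    """Return the part of a 1-D signal between start_s and end_s (seconds)."""
    return samples[int(start_s * rate):int(end_s * rate)]


def join(*clips):
    """Concatenate 1-D clips end to end."""
    return np.concatenate(clips)


rate = 16000  # assumed sample rate for this example
signal = np.arange(2 * rate, dtype=np.int16)  # 2 seconds of dummy samples

head = trim(signal, rate, 0.0, 0.5)  # first half second
tail = trim(signal, rate, 1.5, 2.0)  # last half second
merged = join(head, tail)            # 1 second in total
print(len(head), len(tail), len(merged))
```

Because every operation is just an index range or a concatenation, the same idea carries over to multi-channel arrays by slicing along the sample axis.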
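The following helper function reads audio data into an array:
It supports reading the sample array and the sample rate from wav, pcm, and raw files.
Dependencies: os, array, numpy, scipy.io.wavfile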
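    import array
    import os

    import numpy as np
    from scipy.io import wavfile


    def read_audio_data(audio_file, pcm_rate=16000, chs=1, bit_depth='int16'):
        # Read an audio file into an array; returns (data, sample_rate).
        if not os.path.exists(audio_file):
            raise IOError
        if audio_file.endswith('.wav'):
            wave_rate, audio_data = wavfile.read(audio_file)
            # Transpose so multi-channel data has shape (channels, samples).
            return audio_data.T, wave_rate
        elif audio_file.endswith('.pcm') or audio_file.endswith('.raw'):
            # Headerless files: the caller supplies rate, channel count and sample format.
            data_array = array.array('h')
            if bit_depth == 'float32':
                data_array = array.array('f')
            elif bit_depth == 'int16':
                data_array = array.array('h')
            with open(audio_file, 'rb') as f:
                data_array.frombytes(f.read())
            if chs > 1:
                # Samples are interleaved, so split them into one row per channel.
                mono_length = len(data_array[0::int(chs)])
                data_array = np.array(data_array).reshape(mono_length, int(chs))
                return data_array.T, pcm_rate
            elif chs == 1:
                return np.array(data_array), pcm_rate
            else:
                raise ValueError('please check the file chs')
        else:
            # Previously fell through and returned None for other extensions.
            raise ValueError('unsupported audio file type')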
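The multi-channel branch above relies on raw PCM storing its samples interleaved (frame by frame, one sample per channel). The sketch below shows that reshape step in isolation on an in-memory buffer. It uses `np.frombuffer` rather than the `array` module used in the post, and `deinterleave` is a hypothetical name introduced only for this example.

```python
import numpy as np


def deinterleave(raw_bytes, chs, dtype=np.int16):
    """Turn interleaved PCM bytes into an array of shape (chs, samples)."""
    samples = np.frombuffer(raw_bytes, dtype=dtype)
    if samples.size % chs != 0:
        raise ValueError('sample count is not a multiple of the channel count')
    # Each row of the reshaped array is one frame; transpose to get one row per channel.
    return samples.reshape(-1, chs).T


# Two channels, three frames: L0 R0 L1 R1 L2 R2
interleaved = np.array([1, -1, 2, -2, 3, -3], dtype=np.int16).tobytes()
channels = deinterleave(interleaved, chs=2)
print(channels.shape)  # (2, 3)
print(channels[0])     # [1 2 3]
print(channels[1])     # [-1 -2 -3]
```

Checking that the sample count divides evenly by the channel count catches a wrong `chs` value early, instead of failing inside `reshape`.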
Normalize the data and convert its type, turning an integer array into float32:
    import numpy as np


    def rescale_to_float32(audio_data, signal_dtype=np.float32):
        # Scale audio into roughly [-1, 1] and return it as a floating-point array.
        if not np.issubdtype(signal_dtype, np.floating):
            raise ValueError('signal_dtype not support')
        if np.issubdtype(audio_data.dtype, np.floating):
            # Float input: only rescale when the peak exceeds 1.
            max_num_audio_signal = np.max(np.absolute(audio_data))
            if max_num_audio_signal > 1:
                return audio_data.astype(signal_dtype) / max_num_audio_signal
            else:
                return audio_data.astype(signal_dtype)
        elif np.issubdtype(audio_data.dtype, np.integer):
            # Integer input: divide by the largest value the dtype can hold.
            return (audio_data / np.iinfo(audio_data.dtype).max).astype(signal_dtype)
        else:
            raise ValueError('audio_data dtype not supported')
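To make the integer branch concrete, here is a small worked example of the same division applied to a few int16 samples. The sample values are arbitrary and chosen only for illustration.

```python
import numpy as np

pcm = np.array([0, 16384, -16384, 32767], dtype=np.int16)

# Same scaling as the integer branch: divide by the dtype's maximum (32767 for int16).
full_scale = np.iinfo(pcm.dtype).max
scaled = (pcm / full_scale).astype(np.float32)

print(scaled.dtype)  # float32
print(scaled[3])     # 1.0
```

Since the divisor is 32767, the most negative int16 value (-32768) maps to just below -1.0, which is usually harmless but worth knowing if the result is clipped later.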
Compute the RMS level of the audio:
    import math

    import numpy as np


    def eva_rms(audio_data, chs=1, mics=1):
        # audio_data: audio array
        # chs: number of channels
        # mics: target channel (only checked to be >= 1; all channels are averaged)
        new_audio_data = rescale_to_float32(audio_data)
        if len(audio_data.shape) == 1:
            total_rms = np.sqrt(np.mean(np.square(new_audio_data)))
            if total_rms == 0 or math.isinf(total_rms):
                total_rms = 1e-10  # avoid log10(0) on silent input
            total_rms_db = 20 * np.log10(total_rms) + 3
            return total_rms_db
        elif len(audio_data.shape) > 1:
            if audio_data.shape[0] == int(chs):
                # Layout is (channels, samples).
                if int(mics - 1) >= 0:
                    total_rms = np.sqrt(np.mean(np.square(new_audio_data), axis=1))
                    total_rms_db = 20 * np.log10(total_rms) + 3
                    # Skip channels below -90 dB so silent channels do not skew the mean.
                    total_rms_db_list = [i for i in total_rms_db if i > -90]
                    return np.mean(total_rms_db_list)
                else:
                    raise ValueError('mics error')  # was `return ValueError(...)`
            elif audio_data.shape[1] == int(chs):
                # Layout is (samples, channels): transpose first.
                if int(mics - 1) >= 0:
                    total_rms = np.sqrt(np.mean(np.square(new_audio_data.T), axis=1))
                    total_rms_db = 20 * np.log10(total_rms) + 3
                    total_rms_db_list = [i for i in total_rms_db if i > -90]
                    return np.mean(total_rms_db_list)
                else:
                    raise ValueError('mics error')
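The post does not explain the `+ 3` in the level formula. One plausible reading is that it offsets the roughly -3.01 dB RMS of a full-scale sine wave, so that such a tone reads about 0 dB. The sketch below checks that arithmetic on a synthetic 1 kHz tone; the tone frequency and sample rate are assumptions chosen for the example.

```python
import numpy as np

rate = 16000                          # assumed sample rate
t = np.arange(rate) / rate            # 1 second of sample times
sine = np.sin(2 * np.pi * 1000 * t)   # full-scale 1 kHz tone

rms = np.sqrt(np.mean(np.square(sine)))  # about 1 / sqrt(2)
level_db = 20 * np.log10(rms)            # about -3.01 dB
level_with_offset = level_db + 3         # close to 0 dB

print(round(float(level_db), 2), round(float(level_with_offset), 2))
```

If plain dB relative to full scale is wanted instead, dropping the offset gives the unadjusted RMS level.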