A tall building rises from level ground: start by setting up the environment and testing what Wav2Lip can do.
Wav2Lip is a deep-learning-based algorithm for speech-driven facial animation. Its core idea is to map information in the speech signal onto facial animation parameters so as to produce realistic facial animation. The Wav2Lip pipeline has two stages: feature extraction and animation generation. In the feature-extraction stage, the algorithm extracts speech-related feature representations from the input audio. In the animation-generation stage, it uses those features to predict facial animation parameters and render the animated face.
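As a rough illustration of the feature-extraction stage, audio is typically converted into a mel spectrogram before being fed to the lip-sync model. This is only a minimal sketch using librosa (which the repo depends on); the parameters here are illustrative, not Wav2Lip's exact hyperparameters:
# minimal sketch: turn a wav file into a mel spectrogram (feature-extraction stage)
# parameter values are examples, not the repo's own audio.py settings
import librosa
import numpy as np

wav, sr = librosa.load("speech.wav", sr=16000)                     # load audio at 16 kHz
mel = librosa.feature.melspectrogram(y=wav, sr=sr,                 # mel spectrogram
                                     n_fft=800, hop_length=200, n_mels=80)
mel_db = librosa.power_to_db(mel, ref=np.max)                      # log scale for the network
print(mel_db.shape)                                                # (80, num_frames)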
git clone https://github.com/Rudrabha/Wav2Lip
Cloning into 'Wav2Lip'...
remote: Enumerating objects: 381, done.
remote: Counting objects: 100% (3/3), done.
remote: Compressing objects: 100% (3/3), done.
remote: Total 381 (delta 0), reused 0 (delta 0), pack-reused 378
Receiving objects: 100% (381/381), 538.67 KiB | 941.00 KiB/s, done.
Resolving deltas: 100% (209/209), done.
conda create -n wav2lib python=3.7
conda activate wav2lib
//install PyTorch
pip install torch==1.7.1+cu110 torchvision==0.8.2+cu110 torchaudio==0.7.2 -f https://download.pytorch.org/whl/torch_stable.html
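Optionally, verify that the CUDA build of PyTorch is picked up before continuing:
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"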
cd Wav2Lip
pip install -r requirements.txt //first comment out the pinned torch and torchvision versions in the file (see the sed one-liner after the install log)
//output as follows
Collecting cffi>=1.0
Using cached cffi-1.15.1-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (427 kB)
Collecting pycparser
Using cached pycparser-2.21-py2.py3-none-any.whl (118 kB)
Building wheels for collected packages: librosa
Building wheel for librosa (setup.py) ... done
Created wheel for librosa: filename=librosa-0.7.0-py3-none-any.whl size=1598349 sha256=a92ac1ebb2dac233b3fea89810abf827ff483be9980b3bfe426b9cc33c5f9fa8
Stored in directory: /home/ps/.cache/pip/wheels/78/fc/20/f0576a7fe176fa34e400f46fd92ae9663cc65c2d01cddb85aa
Successfully built librosa
Installing collected packages: llvmlite, tqdm, threadpoolctl, six, pycparser, numpy, joblib, decorator, audioread, scipy, opencv-python, opencv-contrib-python, numba, cffi, soundfile, scikit-learn, resampy, librosa
Attempting uninstall: numpy
Found existing installation: numpy 1.21.6
Uninstalling numpy-1.21.6:
Successfully uninstalled numpy-1.21.6
Successfully installed audioread-3.0.1 cffi-1.15.1 decorator-5.1.1 joblib-1.3.2 librosa-0.7.0 llvmlite-0.31.0 numba-0.48.0 numpy-1.17.1 opencv-contrib-python-4.9.0.80 opencv-python-4.1.0.25 pycparser-2.21 resampy-0.3.1 scikit-learn-1.0.2 scipy-1.7.3 six-1.16.0 soundfile-0.12.1 threadpoolctl-3.1.0 tqdm-4.45.0
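For the "comment out torch/torchvision" step mentioned above, a sed one-liner can be run before pip install -r requirements.txt (assuming the pins sit at the start of their own lines in requirements.txt):
sed -i 's/^torch==/# torch==/; s/^torchvision==/# torchvision==/' requirements.txt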
python inference.py --checkpoint_path <ckpt> --face <video.mp4> --audio <an-audio-source>
//view the result file
ffplay -autoexit filename.mp4 //play an mp4 file on Ubuntu
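For a concrete run (the input file names here are just examples; wav2lip_gan.pth is the pretrained checkpoint linked in the repo README, and the output lands in results/result_voice.mp4 by default):
python inference.py --checkpoint_path checkpoints/wav2lip_gan.pth --face input_face.mp4 --audio input_audio.wav
ffplay -autoexit results/result_voice.mp4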
//overall result
Overall, the result is mediocre.
Points to improve: the mouth region is fairly blurry and the teeth are indistinct.
The actual generated video:
result_voice.mp4
Two problems came up along the way; the tracebacks and fixes follow.
Traceback (most recent call last):
  File "inference.py", line 3, in <module>
    import scipy, cv2, os, sys, argparse, audio
  File "/devdata/anaconda3/envs/wav2lib/lib/python3.7/site-packages/cv2/__init__.py", line 181, in <module>
    bootstrap()
  File "/devdata/anaconda3/envs/wav2lib/lib/python3.7/site-packages/cv2/__init__.py", line 175, in bootstrap
    if __load_extra_py_code_for_module("cv2", submodule, DEBUG):
  File "/devdata/anaconda3/envs/wav2lib/lib/python3.7/site-packages/cv2/__init__.py", line 28, in __load_extra_py_code_for_module
    py_module = importlib.import_module(module_name)
  File "/devdata/anaconda3/envs/wav2lib/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "/devdata/anaconda3/envs/wav2lib/lib/python3.7/site-packages/cv2/mat_wrapper/__init__.py", line 39, in <module>
    cv._registerMatType(Mat)
AttributeError: module 'cv2' has no attribute '_registerMatType'
//fix
pip install --upgrade opencv-python
pip install --upgrade opencv-contrib-python
pip install --upgrade opencv-python-headless
opencv-python needs to be version >= 4.5.4
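To confirm the OpenCV upgrade took effect:
python -c "import cv2; print(cv2.__version__)"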
Traceback (most recent call last):
  File "inference.py", line 283, in <module>
    main()
  File "inference.py", line 255, in main
    model = load_model(args.checkpoint_path)
  File "inference.py", line 174, in load_model
    checkpoint = _load(path)
  File "inference.py", line 165, in _load
    checkpoint = torch.load(checkpoint_path)
  File "/devdata/anaconda3/envs/wav2lib/lib/python3.7/site-packages/torch/serialization.py", line 595, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "/devdata/anaconda3/envs/wav2lib/lib/python3.7/site-packages/
//fix
This error usually means the pickle module hit an end-of-file (EOF) marker while reading the file. Possible fixes:
1. Check that the file path and name are correct and that the downloaded file is complete. (In this case the checkpoint weights had not been fully downloaded; see the check below.)
2. Make sure the file exists and you have sufficient permission to read it.
3. If you are reading from a data stream, make sure the stream is open and not empty; data_stream.readable() can be used to check that it is readable.
4. Try regenerating the pickle file with a different Python version or pickle protocol to make sure it is compatible with the Python version you are running.
5. If the pickle file is very large, memory can become an issue; memory-mapped loading (e.g. numpy.load with mmap_mode for array data) reduces memory usage.
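A quick way to check whether the checkpoint file is intact before running inference (a minimal sketch; the checkpoint path is an example, adjust it to your download location):
# try loading the checkpoint on CPU; a truncated download raises EOFError or a RuntimeError
import torch

ckpt_path = "checkpoints/wav2lip_gan.pth"   # example path
try:
    ckpt = torch.load(ckpt_path, map_location="cpu")
    print("checkpoint loaded OK, top-level keys:", list(ckpt.keys()))
except (EOFError, RuntimeError) as e:
    print("checkpoint appears incomplete or corrupted, re-download it:", e)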
Wav2Lip can generate a talking video from a face image/video plus an audio clip. For now, the clarity of the mouth region needs improvement, and the teeth show missing detail and distortion.