Audio2Face - a project that transforms audio into blendshape weights and drives the digital human, Xiaomei, in an Unreal Engine (UE) project

Audio2Face

Notice

The test assets and the UE project for Xiaomei created by FACEGOOD are not available for commercial use; they are for testing purposes only.

Description


ue

We created a project that transforms audio into blendshape weights and drives the digital human, Xiaomei, in a UE project.

Base Module


figure1

figure2

The framework we use contains three parts. In the formant network step, we perform fixed-function analysis of the input audio clip. In the articulation network, we concatenate an emotional state vector to the output of each convolution layer after the ReLU activation. The fully connected layers at the end expand the 256+E abstract features to blendshape weights.
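The final stage described above can be sketched in numpy. This is purely illustrative: the emotion vector length E=16, the hidden width 150 (a size that appears in a user's training log below), and the 116 blendshape outputs are assumptions, not the exact FACEGOOD architecture.

```python
import numpy as np

# Minimal sketch: concatenate the emotional state vector onto the abstract
# features, then expand the 256+E features to blendshape weights with
# fully connected layers. All sizes here are illustrative assumptions.
rng = np.random.default_rng(0)
E = 16
features = rng.random((1, 256))   # abstract features from the formant/articulation stages
emotion = rng.random((1, E))      # learned emotional state vector

x = np.concatenate([features, emotion], axis=1)   # shape (1, 256 + E)

W1, b1 = rng.random((256 + E, 150)), np.zeros(150)
W2, b2 = rng.random((150, 116)), np.zeros(116)
blendshape_weights = np.tanh(x @ W1 + b1) @ W2 + b2
print(blendshape_weights.shape)   # (1, 116)
```

Each output element would drive one blendshape channel on the rig; the real network learns W1/W2 from the paired audio and animation frames.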

Usage


pipeline

This pipeline shows how we use FACEGOOD Audio2Face.

Test video 1 | Test video 2 (Ryan Yun from columbia.edu)

Prepare data

  • Step 1: Record voice and video, and create the animation from the video in Maya. Note: the recording must contain vowels, exaggerated talking, and normal talking, and the dialogue should cover as many pronunciations as possible.
  • Step 2: Process the voice with LPC to split it into segment frames corresponding to the animation frames in Maya.
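Step 2 can be sketched as follows: chop the audio into hops aligned with the animation frame rate, then compute LPC coefficients per windowed frame (autocorrelation method via the Levinson-Durbin recursion). The sample rate, fps, window size, and LPC order here are illustrative assumptions, not the exact step1_LPC.py settings.

```python
import numpy as np

def lpc(frame, order):
    """Autocorrelation-method LPC coefficients via the Levinson-Durbin recursion."""
    n = len(frame)
    r = np.correlate(frame, frame, mode="full")[n - 1:n + order]  # r(0)..r(order)
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + a[1:i] @ r[i - 1:0:-1]
        k = -acc / err                      # reflection coefficient
        a_prev = a.copy()
        for j in range(1, i):
            a[j] = a_prev[j] + k * a_prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a[1:]

# Assumed settings: 16 kHz audio, 30 fps animation, order-32 LPC.
rate, fps, order = 16000, 30, 32
signal = np.random.default_rng(1).standard_normal(rate)  # stand-in for 1 s of speech
hop = rate // fps                      # audio samples per animation frame
win = 2 * hop                          # analysis window spanning two frames
frames = [signal[i * hop:i * hop + win] for i in range(fps - 1)]
feats = np.array([lpc(f * np.hamming(win), order) for f in frames])
print(feats.shape)                     # (29, 32): one LPC row per animation frame
```

Stacking these per-frame LPC rows is what yields 2D inputs aligned one-to-one with the Maya animation frames.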

Input data

Use ExportBsWeights.py to export the weight files from Maya. This produces BS_name.npy and BS_value.npy.

Use step1_LPC.py to process the wav file and get lpc_*.npy, which holds the wav preprocessed into 2D data.
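Before training, it is worth sanity-checking that the LPC features and blendshape labels pair up frame for frame. The stand-in shapes below, (N, 32, 64, 1) inputs and (N, 90000) labels, are assumptions taken from shapes reported in a user's training log in the comments, not from a spec.

```python
import numpy as np

# Hypothetical stand-ins for np.load("lpc_train.npy") and np.load("BS_value.npy").
n_frames = 8
lpc_feats = np.random.rand(n_frames, 32, 64)
bs_value = np.random.rand(n_frames, 90000)

# The convolutional network expects a trailing channel axis.
train_data = lpc_feats[..., np.newaxis]
print(train_data.shape)                    # (8, 32, 64, 1)
assert len(train_data) == len(bs_value)    # one weight vector per audio frame
```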

Train

We recommend using FACEGOOD Avatary to produce training data; it is fast and accurate. http://www.avatary.com

The training data is stored in dataSet1.

    python step14_train.py --epochs 8 --dataSet dataSet1

Test

In the folder /test, we supply a test application named AiSpeech, along with a pretrained model, zsmeif.pb.
In the folder /example/ueExample, we provide a packaged UE project that contains a digital human created by FACEGOOD, which can be driven by /AiSpeech/zsmeif.py.

You can follow the steps below to use it:

  1. Make sure a microphone is connected to the computer.
  2. Run the script in a terminal:

    python zsmeif.py

  3. When the terminal shows the message "run main", run FaceGoodLiveLink.exe, which is located in the /example/ueExample/ folder.
  4. Click and hold the left mouse button on the screen in the UE project; you can then talk to the AI model and wait for the voice and animation response.

Dependencies

tensorflow-gpu 1.15, CUDA 10.0

Python libs: pyaudio, requests, websocket, websocket-client

Note: the test can run on CPU.

Data


The testing data, Maya model, and UE4 test project can be downloaded from the links below.

data_all (extraction code: n6ty)

GoogleDrive

Update

Uploaded the LPC source into the code folder.

Reference


Audio-Driven Facial Animation by Joint End-to-End Learning of Pose and Emotion

Contact

Wechat: FACEGOOD_CHINA
Email:[email protected]
Discord: https://discord.gg/V46y6uTdw8

License

Audio2Face Core is released under the terms of the MIT license. See COPYING for more information, or see https://opensource.org/licenses/MIT.

Owner
FACEGOOD
Make a World of Avatars
Comments
  • Difference between dataSet1 and dataSetx?

    Hi, what is the difference between dataSet1 and dataSetx?

    Does it mean different people? Could we combine all data to train and get a person-independent model ?

    Thanks!

  • Using this with cloud rendering

    Hello, I have taken a preliminary look at your project and think there is considerable room for cooperation with our product.

    We focus on cloud rendering technology: moving 3D applications such as UE4 and Unity3D to the cloud and accessing them through thin clients such as web browsers.

    Our cloud rendering product already integrates voice input and intelligent speech interaction (Speech). Combined with our cloud rendering, your side could focus on the algorithms and 3D rendering.

    For high-fidelity digital human scenarios, cloud rendering removes the dependence on client-side computing power.

    Our integration demo: click here

    I plan to run a preliminary test first; if you are interested in deeper cooperation, please contact me.

  • Problem downloading the data from Baidu cloud

    Hi! I am a student from India researching audio2face and came across your project, which is really incredible! However, I am unable to open the Baidu cloud link from India to download the training and testing data due to restrictions on the website here. Is there any way I can access the data? It would be of great help! Thanks

  • Questions about the Audio2Face pipeline

    I have not yet run the whole pipeline, so I have a few questions. (1) The whole pipeline shown in that image (ASR, TTS, FACEGOOD Audio2Face) is essentially a spoken dialogue interaction system, right? (2) The blendshape coefficients produced at the end are predicted from the speech generated by the TTS dialogue module, and are unrelated to the voice initially recorded by the microphone, correct? (3) If I want to drive the avatar with my own voice, do I need to retrain the model with my own voice data?

  • GPU out of memory during training & could you upload a trained model.ckpt?

    Problem summary

    When I run the command python step14_train.py --epochs 8 --dataSet dataSet1, the program terminates with an error. The console shows (full error message at the end): (0) Internal: Blas GEMM launch failed : a.shape=(32, 272), b.shape=(272, 150), m=32, n=150, k=272 [[node dense/MatMul (defined at C:\ProgramData\Anaconda3\envs\py37_tensorflow\lib\site-packages\tensorflow_core\python\framework\ops.py:1748) ]]

    Initial investigation

    Searching for this error online, I found it is mostly attributed to insufficient GPU memory or the GPU being occupied by other processes. After checking, only this program was running on the GPU. Following suggestions found online, I added allow_growth=True to tf.GPUOptions and also tried lowering per_process_gpu_memory_fraction, but neither helped (tested individually and in combination).

    System configuration

    OS: Win10. GPU: device:GPU:0 with 9830 MB memory -> physical GPU (device: 0, name: NVIDIA GeForce RTX 3080 Ti, pci bus id: 0000:01:00.0, compute capability: 8.6). Python: 3.7.11. CUDA: cuda_10.0.130_411.31_win10. cuDNN: cudnn-10.0-windows10-x64-v7.6.5.32

    GPU memory usage during the run

    Memory usage while the program runs: after loading cublas64_100.dll, GPU memory jumped from 437 MB straight to 10358 MB (of 12288 MB total), i.e. 84.3% usage, which should already exceed the per_process_gpu_memory_fraction=0.8 limit. After loading cudnn64_7.dll it reached 10424 MB, peaking at 10645 MB, and then the program crashed.

    Full error message

    2022-03-04 09:34:16.145502: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll WARNING:tensorflow:From step14_train.py:30: The name tf.set_random_seed is deprecated. Please use tf.compat.v1.set_random_seed instead.

    WARNING:tensorflow:From step14_train.py:37: The name tf.reset_default_graph is deprecated. Please use tf.compat.v1.reset_default_graph instead.

    (2200, 32, 64, 1) (2200, 90000) (1000, 32, 64, 1) (1000, 90000) WARNING:tensorflow:From step14_train.py:86: The name tf.summary.FileWriter is deprecated. Please use tf.compat.v1.summary.FileWriter instead.

    WARNING:tensorflow:From step14_train.py:86: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead.

    WARNING:tensorflow:From step14_train.py:90: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.

    WARNING:tensorflow:From D:\Ningxin\Coding\Voice2Face-main\code\train\model_paper.py:21: conv2d (from tensorflow.python.layers.convolutional) is deprecated and will be removed in a future version. Instructions for updating: Use tf.keras.layers.Conv2D instead. WARNING:tensorflow:From C:\ProgramData\Anaconda3\envs\py37_tensorflow\lib\site-packages\tensorflow_core\python\layers\convolutional.py:424: Layer.apply (from tensorflow.python.keras.engine.base_layer) is deprecated and will be removed in a future version. Instructions for updating: Please use layer.__call__ method instead. WARNING:tensorflow:From D:\Ningxin\Coding\Voice2Face-main\code\train\model_paper.py:49: flatten (from tensorflow.python.layers.core) is deprecated and will be removed in a future version. Instructions for updating: Use keras.layers.flatten instead. WARNING:tensorflow:From D:\Ningxin\Coding\Voice2Face-main\code\train\model_paper.py:51: dense (from tensorflow.python.layers.core) is deprecated and will be removed in a future version. Instructions for updating: Use keras.layers.Dense instead. WARNING:tensorflow:From D:\Ningxin\Coding\Voice2Face-main\code\train\model_paper.py:52: calling dropout (from tensorflow.python.ops.nn_ops) with keep_prob is deprecated and will be removed in a future version. Instructions for updating: Please use rate instead of keep_prob. Rate should be set to rate = 1 - keep_prob. WARNING:tensorflow:From step14_train.py:98: The name tf.summary.scalar is deprecated. Please use tf.compat.v1.summary.scalar instead.

    WARNING:tensorflow:From step14_train.py:103: The name tf.train.exponential_decay is deprecated. Please use tf.compat.v1.train.exponential_decay instead.

    WARNING:tensorflow:From step14_train.py:105: The name tf.get_collection is deprecated. Please use tf.compat.v1.get_collection instead.

    WARNING:tensorflow:From step14_train.py:105: The name tf.GraphKeys is deprecated. Please use tf.compat.v1.GraphKeys instead.

    WARNING:tensorflow:From step14_train.py:106: The name tf.train.AdamOptimizer is deprecated. Please use tf.compat.v1.train.AdamOptimizer instead.

    WARNING:tensorflow:From step14_train.py:108: The name tf.summary.merge_all is deprecated. Please use tf.compat.v1.summary.merge_all instead.

    WARNING:tensorflow:From step14_train.py:111: The name tf.GPUOptions is deprecated. Please use tf.compat.v1.GPUOptions instead.

    WARNING:tensorflow:From step14_train.py:122: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.

    WARNING:tensorflow:From step14_train.py:122: The name tf.ConfigProto is deprecated. Please use tf.compat.v1.ConfigProto instead.

    2022-03-04 09:34:25.346984: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 2022-03-04 09:34:25.351173: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll 2022-03-04 09:34:25.389703: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties: name: NVIDIA GeForce RTX 3080 Ti major: 8 minor: 6 memoryClockRate(GHz): 1.71 pciBusID: 0000:01:00.0 2022-03-04 09:34:25.389860: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll 2022-03-04 09:34:25.469703: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_100.dll 2022-03-04 09:34:25.566662: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_100.dll 2022-03-04 09:34:25.589486: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_100.dll 2022-03-04 09:34:25.662159: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_100.dll 2022-03-04 09:34:25.713455: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_100.dll 2022-03-04 09:34:25.799900: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll 2022-03-04 09:34:25.800359: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0 2022-03-04 09:37:34.818241: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix: 2022-03-04 09:37:34.818380: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165] 0 2022-03-04 09:37:34.818619: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0: N 2022-03-04 
09:37:34.819527: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 9830 MB memory) -> physical GPU (device: 0, name: NVIDIA GeForce RTX 3080 Ti, pci bus id: 0000:01:00.0, compute capability: 8.6) WARNING:tensorflow:From step14_train.py:126: The name tf.train.Saver is deprecated. Please use tf.compat.v1.train.Saver instead.

    WARNING:tensorflow:From step14_train.py:127: The name tf.global_variables_initializer is deprecated. Please use tf.compat.v1.global_variables_initializer instead.

    2022-03-04 09:37:35.857089: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_100.dll 2022-03-04 09:38:32.908660: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll 2022-03-04 09:49:39.479241: W tensorflow/stream_executor/cuda/redzone_allocator.cc:312] Internal: Invoking ptxas not supported on Windows Relying on driver to perform ptx compilation. This message will be only logged once. 2022-03-04 09:49:39.610678: E tensorflow/stream_executor/cuda/cuda_blas.cc:428] failed to run cuBLAS routine: CUBLAS_STATUS_EXECUTION_FAILED Traceback (most recent call last): File "C:\ProgramData\Anaconda3\envs\py37_tensorflow\lib\site-packages\tensorflow_core\python\client\session.py", line 1365, in _do_call return fn(*args) File "C:\ProgramData\Anaconda3\envs\py37_tensorflow\lib\site-packages\tensorflow_core\python\client\session.py", line 1350, in _run_fn target_list, run_metadata) File "C:\ProgramData\Anaconda3\envs\py37_tensorflow\lib\site-packages\tensorflow_core\python\client\session.py", line 1443, in _call_tf_sessionrun run_metadata) tensorflow.python.framework.errors_impl.InternalError: 2 root error(s) found. (0) Internal: Blas GEMM launch failed : a.shape=(32, 272), b.shape=(272, 150), m=32, n=150, k=272 [[{{node dense/MatMul}}]] [[Adam/update/_38]] (1) Internal: Blas GEMM launch failed : a.shape=(32, 272), b.shape=(272, 150), m=32, n=150, k=272 [[{{node dense/MatMul}}]] 0 successful operations. 0 derived errors ignored.

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last): File "step14_train.py", line 190, in train() File "step14_train.py", line 136, in train train_op = sess.run(train_step, feed_dict={data: train_data, label: train_label, keep_pro: 0.5}) File "C:\ProgramData\Anaconda3\envs\py37_tensorflow\lib\site-packages\tensorflow_core\python\client\session.py", line 956, in run run_metadata_ptr) File "C:\ProgramData\Anaconda3\envs\py37_tensorflow\lib\site-packages\tensorflow_core\python\client\session.py", line 1180, in _run feed_dict_tensor, options, run_metadata) File "C:\ProgramData\Anaconda3\envs\py37_tensorflow\lib\site-packages\tensorflow_core\python\client\session.py", line 1359, in _do_run run_metadata) File "C:\ProgramData\Anaconda3\envs\py37_tensorflow\lib\site-packages\tensorflow_core\python\client\session.py", line 1384, in _do_call raise type(e)(node_def, op, message) tensorflow.python.framework.errors_impl.InternalError: 2 root error(s) found. (0) Internal: Blas GEMM launch failed : a.shape=(32, 272), b.shape=(272, 150), m=32, n=150, k=272 [[node dense/MatMul (defined at C:\ProgramData\Anaconda3\envs\py37_tensorflow\lib\site-packages\tensorflow_core\python\framework\ops.py:1748) ]] [[Adam/update/_38]] (1) Internal: Blas GEMM launch failed : a.shape=(32, 272), b.shape=(272, 150), m=32, n=150, k=272 [[node dense/MatMul (defined at C:\ProgramData\Anaconda3\envs\py37_tensorflow\lib\site-packages\tensorflow_core\python\framework\ops.py:1748) ]] 0 successful operations. 0 derived errors ignored.

    Original stack trace for 'dense/MatMul': File "step14_train.py", line 190, in train() File "step14_train.py", line 95, in train output, emotion_input = net(data,output_size,keep_pro) File "D:\Ningxin\Coding\Voice2Face-main\code\train\model_paper.py", line 51, in net fc1 = tf.layers.dense(inputs=flat, units=150 , activation=None) #activation=None表示使用线性激活器 File "C:\ProgramData\Anaconda3\envs\py37_tensorflow\lib\site-packages\tensorflow_core\python\util\deprecation.py", line 324, in new_func return func(*args, **kwargs) File "C:\ProgramData\Anaconda3\envs\py37_tensorflow\lib\site-packages\tensorflow_core\python\layers\core.py", line 187, in dense return layer.apply(inputs) File "C:\ProgramData\Anaconda3\envs\py37_tensorflow\lib\site-packages\tensorflow_core\python\util\deprecation.py", line 324, in new_func return func(*args, **kwargs) File "C:\ProgramData\Anaconda3\envs\py37_tensorflow\lib\site-packages\tensorflow_core\python\keras\engine\base_layer.py", line 1700, in apply return self.call(inputs, *args, **kwargs) File "C:\ProgramData\Anaconda3\envs\py37_tensorflow\lib\site-packages\tensorflow_core\python\layers\base.py", line 548, in call outputs = super(Layer, self).call(inputs, *args, **kwargs) File "C:\ProgramData\Anaconda3\envs\py37_tensorflow\lib\site-packages\tensorflow_core\python\keras\engine\base_layer.py", line 854, in call outputs = call_fn(cast_inputs, *args, **kwargs) File "C:\ProgramData\Anaconda3\envs\py37_tensorflow\lib\site-packages\tensorflow_core\python\autograph\impl\api.py", line 234, in wrapper return converted_call(f, options, args, kwargs) File "C:\ProgramData\Anaconda3\envs\py37_tensorflow\lib\site-packages\tensorflow_core\python\autograph\impl\api.py", line 439, in converted_call return _call_unconverted(f, args, kwargs, options) File "C:\ProgramData\Anaconda3\envs\py37_tensorflow\lib\site-packages\tensorflow_core\python\autograph\impl\api.py", line 330, in _call_unconverted return f(*args, **kwargs) File 
"C:\ProgramData\Anaconda3\envs\py37_tensorflow\lib\site-packages\tensorflow_core\python\keras\layers\core.py", line 1050, in call outputs = gen_math_ops.mat_mul(inputs, self.kernel) File "C:\ProgramData\Anaconda3\envs\py37_tensorflow\lib\site-packages\tensorflow_core\python\ops\gen_math_ops.py", line 6136, in mat_mul name=name) File "C:\ProgramData\Anaconda3\envs\py37_tensorflow\lib\site-packages\tensorflow_core\python\framework\op_def_library.py", line 794, in _apply_op_helper op_def=op_def) File "C:\ProgramData\Anaconda3\envs\py37_tensorflow\lib\site-packages\tensorflow_core\python\util\deprecation.py", line 507, in new_func return func(*args, **kwargs) File "C:\ProgramData\Anaconda3\envs\py37_tensorflow\lib\site-packages\tensorflow_core\python\framework\ops.py", line 3357, in create_op attrs, op_def, compute_device) File "C:\ProgramData\Anaconda3\envs\py37_tensorflow\lib\site-packages\tensorflow_core\python\framework\ops.py", line 3426, in _create_op_internal op_def=op_def) File "C:\ProgramData\Anaconda3\envs\py37_tensorflow\lib\site-packages\tensorflow_core\python\framework\ops.py", line 1748, in init self._traceback = tf_stack.extract_stack()
