
STIT - Stitch it in Time


[Project Page]

Stitch it in Time: GAN-Based Facial Editing of Real Videos
Rotem Tzaban, Ron Mokady, Rinon Gal, Amit Bermano, Daniel Cohen-Or

Abstract:
The ability of Generative Adversarial Networks to encode rich semantics within their latent space has been widely adopted for facial image editing. However, replicating their success with videos has proven challenging. Sets of high-quality facial videos are lacking, and working with videos introduces a fundamental barrier to overcome - temporal coherency. We propose that this barrier is largely artificial. The source video is already temporally coherent, and deviations from this state arise in part due to careless treatment of individual components in the editing pipeline. We leverage the natural alignment of StyleGAN and the tendency of neural networks to learn low frequency functions, and demonstrate that they provide a strongly consistent prior. We draw on these insights and propose a framework for semantic editing of faces in videos, demonstrating significant improvements over the current state-of-the-art. Our method produces meaningful face manipulations, maintains a higher degree of temporal consistency, and can be applied to challenging, high quality, talking head videos which current methods struggle with.

Requirements

PyTorch (tested with 1.10; should work with 1.8/1.9 as well) + torchvision

For the rest of the requirements, run:

pip install Pillow imageio imageio-ffmpeg dlib face-alignment opencv-python click wandb tqdm scipy matplotlib clip lpips 

Pretrained models

To use this project, you first need to download the pretrained models from the following Link.

Unzip it inside the project's main directory.

You can use the download_models.sh script (requires installing gdown with pip install gdown)
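For example, assuming you run from the repository root:

pip install gdown
bash download_models.sh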

Alternatively, you can unzip the models to a location of your choice and update configs/path_config.py accordingly.

Splitting videos into frames

Our code expects videos in the form of a directory with individual frame images. To produce such a directory from an existing video, we recommend using ffmpeg:

ffmpeg -i "video.mp4" "video_frames/out%04d.png"
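After editing, frames can likewise be reassembled into a video with ffmpeg; the frame rate below is an assumption, so match it to your source video:

ffmpeg -framerate 25 -i "video_frames/out%04d.png" -c:v libx264 -pix_fmt yuv420p "output.mp4"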

Example Videos

The videos used to produce our results can be downloaded from the following Link.

Inversion

To invert a video run:

python train.py --input_folder /path/to/images_dir \ 
 --output_folder /path/to/experiment_dir \
 --run_name RUN_NAME \
 --num_pti_steps NUM_STEPS

This pipeline includes alignment, cropping, e4e encoding, and PTI.

For example:

python train.py --input_folder /data/obama \ 
 --output_folder training_results/obama \
 --run_name obama \
 --num_pti_steps 80

Weights & Biases logging is disabled by default. To enable it, add --use_wandb

Naive Editing

To run edits without stitching tuning:

python edit_video.py --input_folder /path/to/images_dir \ 
 --output_folder /path/to/experiment_dir \
 --run_name RUN_NAME \
 --edit_name EDIT_NAME \
 --edit_range EDIT_RANGE

edit_range determines the strength of the edits applied. It should be in the format RANGE_START RANGE_END RANGE_STEPS.
For example, if we use --edit_range 1 5 2, we will apply edits with strengths 1, 3 and 5, as the sketch below illustrates.
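As an illustration only (this np.arange-style expansion with an inclusive end is an assumption, not the project's exact code):

import numpy as np

# --edit_range START END STEPS -> strengths START, START+STEPS, ..., END (inclusive)
start, end, step = 1, 5, 2
strengths = np.arange(start, end + 1, step)
print(strengths)  # [1 3 5]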

For young Obama use:

python edit_video.py --input_folder /data/obama \ 
 --output_folder edits/obama/ \
 --run_name obama \
 --edit_name age \
 --edit_range -8 -8 1

Editing + Stitching Tuning

To run edits with stitching tuning:

python edit_video_stitching_tuning.py --input_folder /path/to/images_dir \ 
 --output_folder /path/to/experiment_dir \
 --run_name RUN_NAME \
 --edit_name EDIT_NAME \
 --edit_range EDIT_RANGE \
 --outer_mask_dilation MASK_DILATION

We support early stopping of the stitching-tuning process when the loss reaches a specified threshold.
This enables us to perform more iterations for difficult frames while maintaining a reasonable running time.
To use this feature, add --border_loss_threshold THRESHOLD to the command (shown in the Jim and Kamala Harris examples below).
For videos with a simple background to reconstruct (e.g. Obama, Jim, Emma Watson, Kamala Harris), we use THRESHOLD=0.005.
For videos where a more exact reconstruction of the background is required (e.g. Michael Scott), we use THRESHOLD=0.002.
Early stopping is disabled by default.

For young Obama use:

python edit_video_stitching_tuning.py --input_folder /data/obama \ 
 --output_folder edits/obama/ \
 --run_name obama \
 --edit_name age \
 --edit_range -8 -8 1 \  
 --outer_mask_dilation 50

For gender editing on Obama use:

python edit_video_stitching_tuning.py --input_folder /data/obama \ 
 --output_folder edits/obama/ \
 --run_name obama \
 --edit_name gender \
 --edit_range -6 -6 1 \  
 --outer_mask_dilation 50

For young Emma Watson use:

python edit_video_stitching_tuning.py --input_folder /data/emma_watson \ 
 --output_folder edits/emma_watson/ \
 --run_name emma_watson \
 --edit_name age \
 --edit_range -8 -8 1 \  
 --outer_mask_dilation 50

For smile removal on Emma Watson use:

python edit_video_stitching_tuning.py --input_folder /data/emma_watson \ 
 --output_folder edits/emma_watson/ \
 --run_name emma_watson \
 --edit_name smile \
 --edit_range -3 -3 1 \  
 --outer_mask_dilation 50

For Emma Watson lipstick editing (done with the StyleCLIP global direction), use:

python edit_video_stitching_tuning.py --input_folder /data/emma_watson \ 
 --output_folder edits/emma_watson/ \
 --run_name emma_watson \
 --edit_type styleclip_global \
 --edit_name lipstick \
 --neutral_class "Face" \
 --target_class "Face with lipstick" \
 --beta 0.2 \
 --edit_range 10 10 1 \  
 --outer_mask_dilation 50

For Old + Young Jim (with early stopping), use:

python edit_video_stitching_tuning.py --input_folder datasets/jim/ \
 --output_folder edits/jim \
 --run_name jim \
 --edit_name age \
 --edit_range -8 8 2 \
 --outer_mask_dilation 50 \ 
 --border_loss_threshold 0.005

For smiling Kamala Harris:

python edit_video_stitching_tuning.py \
 --input_folder datasets/kamala/ \ 
 --output_folder edits/kamala \
 --run_name kamala \
 --edit_name smile \
 --edit_range 2 2 1 \
 --outer_mask_dilation 50 \
 --border_loss_threshold 0.005

Example Results

With stitching tuning: [out.mp4]

Without stitching tuning: [out.mp4]

Gender editing: [out.mp4]

Young Emma Watson: [out.mp4]

Emma Watson with lipstick: [out.mp4]

Emma Watson smile removal: [out.mp4]

Old Jim: [out.mp4]

Young Jim: [out.mp4]

Smiling Kamala Harris: [out.mp4]

Out of domain video editing (Animations)

For editing out-of-domain videos, some different parameters are required during training. First, dlib's face detector doesn't detect all animated faces, so we use a different face detector provided by the face_alignment package (--use_fa). Second, we reduce the smoothing of the alignment parameters with --center_sigma 0.0. Third, OOD videos require more training steps, as they are more difficult to invert.

To train, we use:

python train.py --input_folder datasets/ood_spiderverse_gwen/ \
 --output_folder training_results/ood \
 --run_name ood \
 --num_pti_steps 240 \
 --use_fa \
 --center_sigma 0.0

Afterwards, editing is performed the same way:

python edit_video.py --input_folder datasets/ood_spiderverse_gwen/ \
 --output_folder edits/ood --run_name ood \
 --edit_name smile --edit_range 2 2 1

[out.mp4]

python edit_video.py --input_folder datasets/ood_spiderverse_gwen/ \
 --output_folder edits/ood \
 --run_name ood \
 --edit_type styleclip_global \
 --edit_range 10 10 1 \
 --edit_name lipstick \
 --target_class 'Face with lipstick'

[out.mp4]

Credits:

StyleGAN2-ada model and implementation:
https://github.com/NVlabs/stylegan2-ada-pytorch Copyright © 2021, NVIDIA Corporation.
Nvidia Source Code License https://nvlabs.github.io/stylegan2-ada-pytorch/license.html

PTI implementation:
https://github.com/danielroich/PTI
Copyright (c) 2021 Daniel Roich
License (MIT) https://github.com/danielroich/PTI/blob/main/LICENSE

LPIPS model and implementation:
https://github.com/richzhang/PerceptualSimilarity
Copyright (c) 2020, Sou Uchida
License (BSD 2-Clause) https://github.com/richzhang/PerceptualSimilarity/blob/master/LICENSE

e4e model and implementation:
https://github.com/omertov/encoder4editing Copyright (c) 2021 omertov
License (MIT) https://github.com/omertov/encoder4editing/blob/main/LICENSE

StyleCLIP model and implementation:
https://github.com/orpatashnik/StyleCLIP Copyright (c) 2021 orpatashnik
License (MIT) https://github.com/orpatashnik/StyleCLIP/blob/main/LICENSE

StyleGAN2 Distillation for Feed-forward Image Manipulation - for editing directions:
https://github.com/EvgenyKashin/stylegan2-distillation
Copyright (c) 2019, Yandex LLC
License (Creative Commons NonCommercial) https://github.com/EvgenyKashin/stylegan2-distillation/blob/master/LICENSE

face-alignment Library:
https://github.com/1adrianb/face-alignment
Copyright (c) 2017, Adrian Bulat
License (BSD 3-Clause License) https://github.com/1adrianb/face-alignment/blob/master/LICENSE

face-parsing.PyTorch:
https://github.com/zllrunning/face-parsing.PyTorch
Copyright (c) 2019 zll
License (MIT) https://github.com/zllrunning/face-parsing.PyTorch/blob/master/LICENSE

Citation

If you make use of our work, please cite our paper:

@misc{tzaban2022stitch,
      title={Stitch it in Time: GAN-Based Facial Editing of Real Videos},
      author={Rotem Tzaban and Ron Mokady and Rinon Gal and Amit H. Bermano and Daniel Cohen-Or},
      year={2022},
      eprint={2201.08361},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
Comments
  • ImportError: No module named 'upfirdn2d_plugin'

    I got this warning...

    (stitenv) [email protected]:/mnt/95d2aa3d-99e9-4600-91b1-2fcecff0dec5/AI_Tools/STIT$ python train.py --input_folder ./data/obama --output_folder ./training_results/obama --run_name obama --num_pti_steps 80
    Number of images: 200
    Aligning images
100%|██████████| 200/200 [00:01<00:00, 117.42it/s]
    100%|██████████| 200/200 [00:04<00:00, 47.81it/s]
    Aligning completed
    Loading e4e over the pSp framework from checkpoint: ./pretrained_models/e4e_ffhq_encode.pt
    Setting up [LPIPS] perceptual loss: trunk [alex], v[0.1], spatial [off]
    Loading model from: /home/hari/anaconda3/envs/stitenv/lib/python3.7/site-packages/lpips/weights/v0.1/alex.pth
    Calculating initial inversions
100%|██████████| 200/200 [00:37<00:00,  5.30it/s]
    Fine tuning generator
  0%|          | 0/80 [00:00<?, ?it/s]Setting up PyTorch plugin "bias_act_plugin"... Failed!
    /mnt/95d2aa3d-99e9-4600-91b1-2fcecff0dec5/AI_Tools/STIT/torch_utils/ops/bias_act.py:50: UserWarning: Failed to build CUDA kernels for bias_act. Falling back to slow reference implementation. Details:
    
    Traceback (most recent call last):
      File "/home/hari/anaconda3/envs/stitenv/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1673, in _run_ninja_build
        env=env)
      File "/home/hari/anaconda3/envs/stitenv/lib/python3.7/subprocess.py", line 512, in run
        output=stdout, stderr=stderr)
    subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.
    
    The above exception was the direct cause of the following exception:
    
    Traceback (most recent call last):
      File "/mnt/95d2aa3d-99e9-4600-91b1-2fcecff0dec5/AI_Tools/STIT/torch_utils/ops/bias_act.py", line 48, in _init
        _plugin = custom_ops.get_plugin('bias_act_plugin', sources=sources, extra_cuda_cflags=['--use_fast_math'])
      File "/mnt/95d2aa3d-99e9-4600-91b1-2fcecff0dec5/AI_Tools/STIT/torch_utils/custom_ops.py", line 110, in get_plugin
        torch.utils.cpp_extension.load(name=module_name, verbose=verbose_build, sources=sources, **build_kwargs)
      File "/home/hari/anaconda3/envs/stitenv/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1091, in load
        keep_intermediates=keep_intermediates)
      File "/home/hari/anaconda3/envs/stitenv/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1302, in _jit_compile
        is_standalone=is_standalone)
      File "/home/hari/anaconda3/envs/stitenv/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1407, in _write_ninja_file_and_build_library
        error_prefix=f"Error building extension '{name}'")
      File "/home/hari/anaconda3/envs/stitenv/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1683, in _run_ninja_build
        raise RuntimeError(message) from e
    RuntimeError: Error building extension 'bias_act_plugin': [1/2] /usr/bin/nvcc --generate-dependencies-with-compile --dependency-output bias_act.cuda.o.d -DTORCH_EXTENSION_NAME=bias_act_plugin -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /home/hari/anaconda3/envs/stitenv/lib/python3.7/site-packages/torch/include -isystem /home/hari/anaconda3/envs/stitenv/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /home/hari/anaconda3/envs/stitenv/lib/python3.7/site-packages/torch/include/TH -isystem /home/hari/anaconda3/envs/stitenv/lib/python3.7/site-packages/torch/include/THC -isystem /home/hari/anaconda3/envs/stitenv/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_61,code=compute_61 -gencode=arch=compute_61,code=sm_61 --compiler-options '-fPIC' --use_fast_math -std=c++14 -c /mnt/95d2aa3d-99e9-4600-91b1-2fcecff0dec5/AI_Tools/STIT/torch_utils/ops/bias_act.cu -o bias_act.cuda.o 
    FAILED: bias_act.cuda.o 
    /usr/bin/nvcc --generate-dependencies-with-compile --dependency-output bias_act.cuda.o.d -DTORCH_EXTENSION_NAME=bias_act_plugin -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /home/hari/anaconda3/envs/stitenv/lib/python3.7/site-packages/torch/include -isystem /home/hari/anaconda3/envs/stitenv/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /home/hari/anaconda3/envs/stitenv/lib/python3.7/site-packages/torch/include/TH -isystem /home/hari/anaconda3/envs/stitenv/lib/python3.7/site-packages/torch/include/THC -isystem /home/hari/anaconda3/envs/stitenv/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_61,code=compute_61 -gencode=arch=compute_61,code=sm_61 --compiler-options '-fPIC' --use_fast_math -std=c++14 -c /mnt/95d2aa3d-99e9-4600-91b1-2fcecff0dec5/AI_Tools/STIT/torch_utils/ops/bias_act.cu -o bias_act.cuda.o 
    nvcc fatal   : Unknown option '-generate-dependencies-with-compile'
    ninja: build stopped: subcommand failed.
    
    
      warnings.warn('Failed to build CUDA kernels for bias_act. Falling back to slow reference implementation. Details:\n\n' + traceback.format_exc())
    Setting up PyTorch plugin "upfirdn2d_plugin"... Failed!
    /mnt/95d2aa3d-99e9-4600-91b1-2fcecff0dec5/AI_Tools/STIT/torch_utils/ops/upfirdn2d.py:34: UserWarning: Failed to build CUDA kernels for upfirdn2d. Falling back to slow reference implementation. Details:
    
    Traceback (most recent call last):
      File "/home/hari/anaconda3/envs/stitenv/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1673, in _run_ninja_build
        env=env)
      File "/home/hari/anaconda3/envs/stitenv/lib/python3.7/subprocess.py", line 512, in run
        output=stdout, stderr=stderr)
    subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.
    
    The above exception was the direct cause of the following exception:
    
    Traceback (most recent call last):
      File "/mnt/95d2aa3d-99e9-4600-91b1-2fcecff0dec5/AI_Tools/STIT/torch_utils/ops/upfirdn2d.py", line 32, in _init
        _plugin = custom_ops.get_plugin('upfirdn2d_plugin', sources=sources, extra_cuda_cflags=['--use_fast_math'])
      File "/mnt/95d2aa3d-99e9-4600-91b1-2fcecff0dec5/AI_Tools/STIT/torch_utils/custom_ops.py", line 110, in get_plugin
        torch.utils.cpp_extension.load(name=module_name, verbose=verbose_build, sources=sources, **build_kwargs)
      File "/home/hari/anaconda3/envs/stitenv/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1091, in load
        keep_intermediates=keep_intermediates)
      File "/home/hari/anaconda3/envs/stitenv/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1302, in _jit_compile
        is_standalone=is_standalone)
      File "/home/hari/anaconda3/envs/stitenv/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1407, in _write_ninja_file_and_build_library
        error_prefix=f"Error building extension '{name}'")
      File "/home/hari/anaconda3/envs/stitenv/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1683, in _run_ninja_build
        raise RuntimeError(message) from e
    RuntimeError: Error building extension 'upfirdn2d_plugin': [1/2] /usr/bin/nvcc --generate-dependencies-with-compile --dependency-output upfirdn2d.cuda.o.d -DTORCH_EXTENSION_NAME=upfirdn2d_plugin -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /home/hari/anaconda3/envs/stitenv/lib/python3.7/site-packages/torch/include -isystem /home/hari/anaconda3/envs/stitenv/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /home/hari/anaconda3/envs/stitenv/lib/python3.7/site-packages/torch/include/TH -isystem /home/hari/anaconda3/envs/stitenv/lib/python3.7/site-packages/torch/include/THC -isystem /home/hari/anaconda3/envs/stitenv/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_61,code=compute_61 -gencode=arch=compute_61,code=sm_61 --compiler-options '-fPIC' --use_fast_math -std=c++14 -c /mnt/95d2aa3d-99e9-4600-91b1-2fcecff0dec5/AI_Tools/STIT/torch_utils/ops/upfirdn2d.cu -o upfirdn2d.cuda.o 
    FAILED: upfirdn2d.cuda.o 
    /usr/bin/nvcc --generate-dependencies-with-compile --dependency-output upfirdn2d.cuda.o.d -DTORCH_EXTENSION_NAME=upfirdn2d_plugin -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /home/hari/anaconda3/envs/stitenv/lib/python3.7/site-packages/torch/include -isystem /home/hari/anaconda3/envs/stitenv/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /home/hari/anaconda3/envs/stitenv/lib/python3.7/site-packages/torch/include/TH -isystem /home/hari/anaconda3/envs/stitenv/lib/python3.7/site-packages/torch/include/THC -isystem /home/hari/anaconda3/envs/stitenv/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_61,code=compute_61 -gencode=arch=compute_61,code=sm_61 --compiler-options '-fPIC' --use_fast_math -std=c++14 -c /mnt/95d2aa3d-99e9-4600-91b1-2fcecff0dec5/AI_Tools/STIT/torch_utils/ops/upfirdn2d.cu -o upfirdn2d.cuda.o 
    nvcc fatal   : Unknown option '-generate-dependencies-with-compile'
    ninja: build stopped: subcommand failed.
    
    
      warnings.warn('Failed to build CUDA kernels for upfirdn2d. Falling back to slow reference implementation. Details:\n\n' + traceback.format_exc())
    Setting up PyTorch plugin "upfirdn2d_plugin"... Failed!
    /mnt/95d2aa3d-99e9-4600-91b1-2fcecff0dec5/AI_Tools/STIT/torch_utils/ops/upfirdn2d.py:34: UserWarning: Failed to build CUDA kernels for upfirdn2d. Falling back to slow reference implementation. Details:
    
    Traceback (most recent call last):
      File "/mnt/95d2aa3d-99e9-4600-91b1-2fcecff0dec5/AI_Tools/STIT/torch_utils/ops/upfirdn2d.py", line 32, in _init
        _plugin = custom_ops.get_plugin('upfirdn2d_plugin', sources=sources, extra_cuda_cflags=['--use_fast_math'])
      File "/mnt/95d2aa3d-99e9-4600-91b1-2fcecff0dec5/AI_Tools/STIT/torch_utils/custom_ops.py", line 110, in get_plugin
        torch.utils.cpp_extension.load(name=module_name, verbose=verbose_build, sources=sources, **build_kwargs)
      File "/home/hari/anaconda3/envs/stitenv/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1091, in load
        keep_intermediates=keep_intermediates)
      File "/home/hari/anaconda3/envs/stitenv/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1317, in _jit_compile
        return _import_module_from_library(name, build_directory, is_python_module)
      File "/home/hari/anaconda3/envs/stitenv/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1699, in _import_module_from_library
        file, path, description = imp.find_module(module_name, [path])
      File "/home/hari/anaconda3/envs/stitenv/lib/python3.7/imp.py", line 296, in find_module
        raise ImportError(_ERR_MSG.format(name), name=name)
    ImportError: No module named 'upfirdn2d_plugin'
    
  • TypeError: Got secondary option for non boolean flag.

Traceback (most recent call last):
      File "train.py", line 60, in <module>
        @click.option('--use_wandb/--no_wandb', default=False)
      File "/home/xwh/anaconda3/lib/python3.7/site-packages/click/decorators.py", line 173, in decorator
        _param_memo(f, OptionClass(param_decls, **option_attrs))
      File "/home/xwh/anaconda3/lib/python3.7/site-packages/click/core.py", line 1601, in __init__
        raise TypeError('Got secondary option for non boolean flag.')
    TypeError: Got secondary option for non boolean flag.

I got this error. Does anyone know how to fix it?

  • ValueError: Image must be a numpy array.

    File "edit_video.py", line 137, in _main
        imageio.mimwrite(os.path.join(folder_path, 'out.mp4'), frames, fps=18, output_params=['-vf', 'fps=25'])
      File "/opt/conda/lib/python3.8/site-packages/imageio/core/functions.py", line 338, in mimwrite
        raise ValueError('Image must be a numpy array.')
    

From the imageio documentation: imageio.mimwrite(uri, ims, format=None, **kwargs) writes multiple images to the specified file; ims is a sequence of numpy arrays, and each array must be NxM, NxMx3 or NxMx4.

    So there is something wrong.
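A likely fix, assuming frames holds PIL images rather than numpy arrays (frames and folder_path are the script's own variables), is to convert each frame before writing:

import numpy as np

# imageio.mimwrite expects numpy arrays, so convert any PIL frames first
frames = [np.asarray(frame) for frame in frames]
imageio.mimwrite(os.path.join(folder_path, 'out.mp4'), frames, fps=18, output_params=['-vf', 'fps=25'])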

  • Aligning images

Hi! Your work is amazing!

    But there is a problem with image alignment. (https://github.com/rotemtzaban/STIT/blob/ef851d3f0ecc1839cbd307fdfdc7fa6f981f6228/train.py#L81)

The next two rows are the input and output respectively, and the result in the second row does not look well aligned (the angle of the face pose is different). Do you have any advice for me? [image 164]

  • not an issue - question on beta value

    I've been playing around with styleclip -

    python edit_video_stitching_tuning.py --input_folder data/obama \
     --output_folder edits/obama/ \
     --run_name obama \
     --edit_type styleclip_global \
     --edit_name aids \
     --neutral_class "Face" \
     --target_class "Face with sores" \
     --beta 0.1 \ 
     --edit_range 10 10 1 \
     --outer_mask_dilation 50 \
     --start_frame 0 \
     --end_frame 100
    

When changing the target class I get: ValueError: Beta value 0.15 is too high for mapping from Face to Face with sores, try setting it to a lower value

I have to bump it down to 0.1 for the program to run, but then the results look no different from the original video. Is there an obvious difference between the two targets, 'Face with sores' vs 'Face with lipstick'?

Also, StyleCLIP has pretrained mappers which take >10 hrs to train, but they're fast at inference. https://github.com/orpatashnik/StyleCLIP

    "mapper/pretrained/afro.pt": "https://drive.google.com/uc?id=1i5vAqo4z0I-Yon3FNft_YZOq7ClWayQJ",
    "mapper/pretrained/angry.pt": "https://drive.google.com/uc?id=1g82HEH0jFDrcbCtn3M22gesWKfzWV_ma",
    "mapper/pretrained/beyonce.pt": "https://drive.google.com/uc?id=1KJTc-h02LXs4zqCyo7pzCp0iWeO6T9fz",
    "mapper/pretrained/bobcut.pt": "https://drive.google.com/uc?id=1IvyqjZzKS-vNdq_OhwapAcwrxgLAY8UF",
    "mapper/pretrained/bowlcut.pt": "https://drive.google.com/uc?id=1xwdxI2YCewSt05dEHgkpmmzoauPjEnnZ",
    "mapper/pretrained/curly_hair.pt": "https://drive.google.com/uc?id=1xZ7fFB12Ci6rUbUfaHPpo44xUFzpWQ6M",
    "mapper/pretrained/depp.pt": "https://drive.google.com/uc?id=1FPiJkvFPG_y-bFanxLLP91wUKuy-l3IV",
    "mapper/pretrained/hilary_clinton.pt": "https://drive.google.com/uc?id=1X7U2zj2lt0KFifIsTfOOzVZXqYyCWVll",
    "mapper/pretrained/mohawk.pt": "https://drive.google.com/uc?id=1oMMPc8iQZ7dhyWavZ7VNWLwzf9aX4C09",
    "mapper/pretrained/purple_hair.pt": "https://drive.google.com/uc?id=14H0CGXWxePrrKIYmZnDD2Ccs65EEww75",
    "mapper/pretrained/surprised.pt": "https://drive.google.com/uc?id=1F-mPrhO-UeWrV1QYMZck63R43aLtPChI",
    "mapper/pretrained/taylor_swift.pt": "https://drive.google.com/uc?id=10jHuHsKKJxuf3N0vgQbX_SMEQgFHDrZa",
    "mapper/pretrained/trump.pt": "https://drive.google.com/uc?id=14v8D0uzy4tOyfBU3ca9T0AzTt3v-dNyh",
    "mapper/pretrained/zuckerberg.pt": "https://drive.google.com/uc?id=1NjDcMUL8G-pO3i_9N6EPpQNXeMc3Ar1r",
    
    "example_celebs.pt": "https://drive.google.com/uc?id=1VL3lP4avRhz75LxSza6jgDe-pHd2veQG"
    

Did you experiment with these? They may shorten the render time... or is the time mostly spent on PTI?

  • missing import line in styleclip_global_utils.py?

Hi, thank you so much for sharing this amazing work. I just wanted to ask whether there's a missing import clip line in the file editings/styleclip_global_utils.py, because I got the following error when trying to run it with StyleCLIP, and it was fixed by adding that import.

    Thanks!

  • How to use the pre-trained StyleGAN2 for another architecture?

Hi, I want to replace the pre-trained StyleGAN2 model ffhq.pkl with rosinality's StyleGAN2 implementation. It seems that your StyleGAN2 implementation is currently based on StyleGAN2-ada, and the two implementations have different weight parameters & architectures. Besides changing the generator network, is there any way to convert between them? Thanks for your great work!

  • fix bug of train & edit_video

1. Fix a type-mismatch bug in transforms.Resize() in the training part. The original code would result in TypeError: img should be PIL Image. Got <class 'torch.Tensor'>. I replace the Resize op with F.interpolate while the image is still a tensor, so the type does not need to be converted twice (see the sketch below).

2. Fix a normalization bug in transforms.Normalize in edit_video.py.
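A minimal sketch of the first fix (the CHW tensor layout and 256×256 target size are illustrative assumptions, not the PR's exact code):

import torch
import torch.nn.functional as F

# img is assumed to be a CHW float tensor; resize it directly with
# F.interpolate instead of converting to a PIL image for transforms.Resize
def resize_tensor(img: torch.Tensor, size=(256, 256)) -> torch.Tensor:
    return F.interpolate(img.unsqueeze(0), size=size, mode='bilinear',
                         align_corners=False).squeeze(0)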

  • help

warnings.warn('Failed to build CUDA kernels for upfirdn2d. Falling back to slow reference implementation. Details:\n\n' + traceback.format_exc())
    Setting up PyTorch plugin "upfirdn2d_plugin"... Failed!
    D:\dongzuoqianyi\STIT-main\torch_utils\ops\upfirdn2d.py:34: UserWarning: Failed to build CUDA kernels for upfirdn2d. Falling back to slow reference implementation. Details:

    Traceback (most recent call last):
      File "D:\dongzuoqianyi\STIT-main\torch_utils\ops\upfirdn2d.py", line 32, in _init
        _plugin = custom_ops.get_plugin('upfirdn2d_plugin', sources=sources, extra_cuda_cflags=['--use_fast_math'])
      File "D:\dongzuoqianyi\STIT-main\torch_utils\custom_ops.py", line 110, in get_plugin
        torch.utils.cpp_extension.load(name=module_name, verbose=verbose_build, sources=sources, **build_kwargs)
      File "C:\ProgramData\Anaconda3\envs\stit\lib\site-packages\torch\utils\cpp_extension.py", line 1136, in load
        keep_intermediates=keep_intermediates)
      File "C:\ProgramData\Anaconda3\envs\stit\lib\site-packages\torch\utils\cpp_extension.py", line 1362, in _jit_compile
        return _import_module_from_library(name, build_directory, is_python_module)
      File "C:\ProgramData\Anaconda3\envs\stit\lib\site-packages\torch\utils\cpp_extension.py", line 1752, in _import_module_from_library
        module = importlib.util.module_from_spec(spec)
      File "<frozen importlib._bootstrap>", line 583, in module_from_spec
      File "<frozen importlib._bootstrap_external>", line 1043, in create_module
      File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
    ImportError: DLL load failed: The specified module could not be found.

    warnings.warn('Failed to build CUDA kernels for upfirdn2d. Falling back to slow reference implementation. Details:\n\n' + traceback.format_exc())
    Setting up PyTorch plugin "upfirdn2d_plugin"...

  • Duration of out.mp4 is messed up

    Hi

The out.mp4 duration ends up longer than the original video because of this line: imageio.mimwrite(os.path.join(folder_path, 'out.mp4'), frames, fps=18, output_params=['-vf', 'fps=25'])

18 must be changed to 25 to fix this issue.
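With the fix applied (same variables as in edit_video.py), the line becomes:

# writer fps now matches the output filter's fps, so the duration is preserved
imageio.mimwrite(os.path.join(folder_path, 'out.mp4'), frames, fps=25, output_params=['-vf', 'fps=25'])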

  • Training my own InterfaceGAN boundary

    Thanks for your great work, the results were impressive!

I'd like to train my own InterfaceGAN w_direction, and I noticed that your w_direction vectors (such as 'age.npy') have size (18, 512) rather than (1, 512).

My previous experience was to sample a lot of images together with their w counterparts of size (1, 512), then train a binary classification model. So I wonder whether you're using w+ of size (18, 512) for training InterfaceGAN?

    Refer to https://github.com/omertov/encoder4editing/issues/9#issuecomment-806139382

  • Improper reconstruction of view by Stitching Tuning Method (edit_video_stitching_tuning.py)

Hi, I ran the code on the following mp4 file: https://user-images.githubusercontent.com/75319437/187859899-f8d52a47-cb55-4cff-903f-24b157916969.mp4 The reconstructed video after stitching tuning was: https://user-images.githubusercontent.com/75319437/187860422-43cb399f-03e2-4192-a983-ef518a26ec7c.mp4 The stitching-tuning result has a black mask over the face of the person in the video. Please do look into it.

    Thanks

  • Result video different duration

    Hi

I know this problem was fixed (18 -> 25 fps in the code). I reinstalled STIT fresh yesterday, but I'm still getting the duration problem.

It's noticeable, and I use this command to get the exact duration: ffprobe -v error -show_entries format=duration -of default=noprint_wrappers=1:nokey=1 out.mp4

  • Image glitches

    I have been using the program and I have come across images with visual glitches.

These are the images that look correct to me. [frame 00139]

And here is one of them in which I see a visual glitch. [frame 00148]

    Any idea why this is happening? It could be because in that exact frame the face is seen with the eyes closed.

I also know that the original image is low resolution, and that doesn't help.

    Thanks
