mirror of https://github.com/KwaiVGI/LivePortrait.git synced 2024-12-22 20:42:38 +00:00

Bring portraits to life!

face-animation image-animation video-editing video-generation

Komiljon Mukhammadiev 1ee0aa497e Update readme.md		2024-07-10 15:50:00 +09:00
.vscode	feat: launch LivePortrait	2024-07-04 04:32:47 +08:00
assets	Added new inference code for webcam	2024-07-10 12:48:10 +09:00
pretrained_weights	feat: launch LivePortrait	2024-07-04 04:32:47 +08:00
src	Added new inference code for webcam	2024-07-10 12:48:10 +09:00
.gitignore	feat: launch LivePortrait	2024-07-04 04:32:47 +08:00
app.py	chore: slightly refine the codebase	2024-07-05 15:09:43 +08:00
inference.py	Added new inference code for webcam	2024-07-10 12:48:10 +09:00
LICENSE	Initial commit	2024-07-04 00:15:03 +08:00
readme.md	Update readme.md	2024-07-10 15:50:00 +09:00
requirements.txt	chore: update readme and remove timm	2024-07-05 14:16:03 +08:00
speed.py	feat: launch LivePortrait	2024-07-04 04:32:47 +08:00
video2template.py	feat: launch LivePortrait	2024-07-04 04:32:47 +08:00

Webcam Live Portrait

🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥

https://github.com/Mrkomiljon/Webcam_Live_Portrait/assets/92161283/4e16fbc7-8c13-4415-b946-dd731ac00b6e

🔥 Updates

2024/07/10: 🔥 I released the initial version of the webcam inference code. More updates are on the way, stay tuned!

Introduction

This repo, named Webcam Live Portrait, contains the official PyTorch implementation of the original authors' paper LivePortrait: Efficient Portrait Animation with Stitching and Retargeting Control, with added inference code for webcam input. I am actively updating and improving this repository. If you find any bugs or have suggestions, feel free to raise an issue or submit a pull request (PR) 💖.

🔥 Getting Started

1. Clone the code and prepare the environment

git clone https://github.com/Mrkomiljon/Webcam_Live_Portrait.git
cd Webcam_Live_Portrait

# create env using conda
conda create -n LivePortrait python==3.9.18
conda activate LivePortrait
# install dependencies with pip
pip install -r requirements.txt

2. Download pretrained weights

Download the pretrained LivePortrait weights and the InsightFace face detection models from Google Drive or Baidu Yun. All weights are packed in one directory 😊. Unzip it and place the contents in ./pretrained_weights, ensuring the directory structure is as follows:

pretrained_weights
├── insightface
│   └── models
│       └── buffalo_l
│           ├── 2d106det.onnx
│           └── det_10g.onnx
└── liveportrait
    ├── base_models
    │   ├── appearance_feature_extractor.pth
    │   ├── motion_extractor.pth
    │   ├── spade_generator.pth
    │   └── warping_module.pth
    ├── landmark.onnx
    └── retargeting_models
        └── stitching_retargeting_module.pth
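
Before running inference, it can help to confirm that the files landed where the tree above shows them. The following is a minimal sketch and not part of the repo: the helper name `missing_weights` is hypothetical, and the file list is simply copied from the tree above.

```python
from pathlib import Path

# Relative paths copied from the directory tree above.
EXPECTED_FILES = [
    "insightface/models/buffalo_l/2d106det.onnx",
    "insightface/models/buffalo_l/det_10g.onnx",
    "liveportrait/base_models/appearance_feature_extractor.pth",
    "liveportrait/base_models/motion_extractor.pth",
    "liveportrait/base_models/spade_generator.pth",
    "liveportrait/base_models/warping_module.pth",
    "liveportrait/landmark.onnx",
    "liveportrait/retargeting_models/stitching_retargeting_module.pth",
]


def missing_weights(root="pretrained_weights"):
    """Return the expected files that are absent under `root` (hypothetical helper)."""
    root = Path(root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_weights()
    if missing:
        print("Missing files:")
        for rel in missing:
            print("  " + rel)
    else:
        print("All expected weight files are present.")
```

Run it from the repository root; an empty result means the layout matches the tree, but it does not verify that the files themselves are intact.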

3. Inference 🚀

python inference.py

If the script runs successfully, it writes an output MP4 file named animations/s6--d0_concat.mp4. This file contains the driving video, the input image, and the generated result.

https://github.com/Mrkomiljon/Webcam_Live_Portrait/assets/92161283/7c4daf41-838d-4eb8-a762-9188cd337ee6
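
The default output name above appears to combine the source and driving file stems. The sketch below reproduces that pattern. It is inferred from this single example only: the function `concat_output_path` is hypothetical, the example input paths are assumptions, and the actual naming logic in inference.py may differ (for instance when the driving input is a webcam).

```python
from pathlib import Path


def concat_output_path(source, driving, out_dir="animations"):
    """Guess the output path as <source stem>--<driving stem>_concat.mp4.

    Assumption: the pattern is inferred from the example
    animations/s6--d0_concat.mp4, not taken from the repo's code.
    """
    name = f"{Path(source).stem}--{Path(driving).stem}_concat.mp4"
    return (Path(out_dir) / name).as_posix()


# Assumed example inputs; only the stems "s6" and "d0" come from the README.
print(concat_output_path("assets/examples/source/s6.jpg",
                         "assets/examples/driving/d0.mp4"))
```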

Alternatively, change the input with the -s (source) and -d (driving) arguments. In the first example below only -s is given, and the driving input comes from the webcam:

python inference.py -s assets/examples/source/MY_photo.jpg 

# or disable pasting back
python inference.py -s assets/examples/source/s9.jpg -d assets/examples/driving/d0.mp4 --no_flag_pasteback

# see more options
python inference.py -h
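
When the commands above need to be launched from another Python script, the argument list can be assembled programmatically. This is a minimal sketch: the helper `build_inference_cmd` is hypothetical, and it uses only the flags shown above (-s, -d, --no_flag_pasteback). Run `python inference.py -h` for the authoritative list of options.

```python
import sys


def build_inference_cmd(source=None, driving=None, pasteback=True):
    """Build an argument list for inference.py (hypothetical helper)."""
    cmd = [sys.executable, "inference.py"]
    if source is not None:
        cmd += ["-s", str(source)]
    if driving is not None:
        cmd += ["-d", str(driving)]
    if not pasteback:
        # Flag taken from the example above; it disables pasting back.
        cmd.append("--no_flag_pasteback")
    return cmd


cmd = build_inference_cmd(
    source="assets/examples/source/s9.jpg",
    driving="assets/examples/driving/d0.mp4",
    pasteback=False,
)
print(" ".join(cmd[1:]))
# To actually launch it from the repository root:
# import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with paths that contain spaces.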

4. Gradio interface

A Gradio interface is also provided for a better experience. Launch it with:

python app.py

5. Inference speed evaluation 🚀🚀🚀

We have also provided a script to evaluate the inference speed of each module:

python speed.py

Below are the results of inferring one frame on an RTX 4090 GPU using the native PyTorch framework with torch.compile:

Model	Parameters(M)	Model Size(MB)	Inference(ms)
Appearance Feature Extractor	0.84	3.3	0.82
Motion Extractor	28.12	108	0.84
Spade Generator	55.37	212	7.59
Warping Module	45.53	174	5.21
Stitching and Retargeting Modules	0.23	2.3	0.31

Note: the values listed for the Stitching and Retargeting Modules represent the combined parameter count and the total sequential inference time of three MLP networks.
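
To put the table in perspective, the per-module figures can be added up. The sketch below uses only the numbers from the table and rests on an assumption that is not stated in this README: that every module runs once per frame, one after another. In practice the appearance feature extractor may only need to run once per source image, and face detection, cropping, and paste-back are not included, so treat the result as a rough indication rather than a measured frame rate.

```python
# (parameters in millions, inference time in ms), copied from the table above.
MODULES = {
    "Appearance Feature Extractor": (0.84, 0.82),
    "Motion Extractor": (28.12, 0.84),
    "Spade Generator": (55.37, 7.59),
    "Warping Module": (45.53, 5.21),
    "Stitching and Retargeting Modules": (0.23, 0.31),
}

# Assumption: modules run sequentially, once per frame.
total_params_m = sum(p for p, _ in MODULES.values())
total_ms = sum(t for _, t in MODULES.values())
fps = 1000.0 / total_ms

print(f"total parameters: {total_params_m:.2f} M")
print(f"summed latency:   {total_ms:.2f} ms")
print(f"implied rate:     {fps:.1f} frames/s")
```

Under that assumption, the Spade Generator and the Warping Module account for most of the summed time.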

Acknowledgements

I would like to thank the contributors of the FOMM, Open Facevid2vid, SPADE, and InsightFace repositories for their open research, as well as the main authors of LivePortrait.