# EX-4D
**Repository Path**: fengshenmeng/EX-4D
## Basic Information
- **Project Name**: EX-4D
- **Description**: No description available
- **Primary Language**: Python
- **License**: Apache-2.0
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No
## Statistics
- **Stars**: 0
- **Forks**: 0
- **Created**: 2025-07-03
- **Last Updated**: 2025-07-03
## Categories & Tags
**Categories**: Uncategorized
**Tags**: None
## README
# EX-4D: EXtreme Viewpoint 4D Video Synthesis via Depth Watertight Mesh

[📄 Paper](https://arxiv.org/abs/2506.05554) | [🎥 Homepage](https://tau-yihouxiang.github.io/projects/EX-4D/EX-4D.html) | [💻 Code](https://github.com/tau-yihouxiang/EX-4D)
## 🌟 Highlights
- **🎯 Extreme Viewpoint Synthesis**: Generate high-quality 4D videos with camera movements ranging from -90° to 90°
- **🔧 Depth Watertight Mesh**: Novel geometric representation that models both visible and occluded regions
- **⚡ Lightweight Architecture**: Only 1% trainable parameters (140M) of the 14B video diffusion backbone
- **🎭 No Multi-view Training**: Innovative masking strategy eliminates the need for expensive multi-view datasets
- **🏆 State-of-the-art Performance**: Outperforms existing methods, especially on extreme camera angles
## 🎬 Demo Results
*EX-4D transforms monocular videos into camera-controllable 4D experiences with physically consistent results under extreme viewpoints.*
## 🏗️ Framework Overview
Our framework consists of three key components:
1. **🔺 Depth Watertight Mesh Construction**: Creates a robust geometric prior that explicitly models both visible and occluded regions
2. **🎭 Simulated Masking Strategy**: Generates effective training data from monocular videos without multi-view datasets
3. **⚙️ Lightweight LoRA Adapter**: Efficiently integrates geometric information with pre-trained video diffusion models
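As a back-of-the-envelope check on the "only 1% trainable parameters" claim, a rank-r LoRA adapter on a frozen `d_in × d_out` linear layer adds `r(d_in + d_out)` trainable parameters against the layer's `d_in · d_out` frozen ones. A sketch with illustrative layer sizes (hypothetical numbers, not the actual EX-4D backbone dimensions):

```python
def lora_trainable_fraction(d_in: int, d_out: int, rank: int) -> float:
    """Fraction of trainable parameters a rank-r LoRA adapter
    (A: d_in x r, B: r x d_out) adds relative to the frozen
    d_in x d_out weight it wraps."""
    base_params = d_in * d_out            # frozen weight matrix
    lora_params = rank * (d_in + d_out)   # the two low-rank factors
    return lora_params / base_params

# Illustrative sizes only: a 5120x5120 layer with rank-32 adapters
frac = lora_trainable_fraction(5120, 5120, 32)
print(f"{frac:.2%}")  # → 1.25%
```

With dimensions and rank in this ballpark, the adapter lands at roughly 1% of the backbone, consistent with the 140M-of-14B figure above.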
## 🚀 Quick Start
### Installation
```bash
# Clone the repository
git clone https://github.com/tau-yihouxiang/EX-4D.git
cd EX-4D
# Create conda environment
conda create -n ex4d python=3.10
conda activate ex4d
# Install PyTorch (2.x recommended)
pip install torch==2.4.1 torchvision==0.19.1 torchaudio==2.4.1 --index-url https://download.pytorch.org/whl/cu124
# Install Nvdiffrast
pip install git+https://github.com/NVlabs/nvdiffrast.git
# Install dependencies and diffsynth
pip install -e .
# Install DepthCrafter for depth estimation (follow DepthCrafter's installation instructions to prepare its checkpoints)

git clone https://github.com/Tencent/DepthCrafter.git
```
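After installation, a quick sanity check that the key packages resolve can save a failed run later. The module names below follow the install steps above; `diffsynth` is assumed to be the module provided by `pip install -e .`:

```python
import importlib.util

def missing_packages(pkgs):
    """Return the subset of pkgs that cannot be resolved in this environment."""
    return [p for p in pkgs if importlib.util.find_spec(p) is None]

# Module names taken from the install steps above
missing = missing_packages(["torch", "torchvision", "nvdiffrast", "diffsynth"])
print("all good" if not missing else f"missing: {missing}")
```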
### Download Pretrained Model
```bash
huggingface-cli download Wan-AI/Wan2.1-I2V-14B-480P --local-dir ./models/Wan-AI
huggingface-cli download yihouxiang/EX-4D --local-dir ./models/EX-4D
```
### Example Usage
#### 1. DW-Mesh Reconstruction
```bash
# --cam sets the orbit angle; supported values are 30, 60, 90, and 180
python recon.py --input_video examples/flower/input.mp4 --cam 30 --output_dir examples/flower
```
#### 2. EX-4D Generation (48GB VRAM required)
```bash
python generate.py --color_video examples/flower/render_180.mp4 --mask_video examples/flower/mask_180.mp4 --output_video examples/output.mp4
```
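The two steps above can also be driven from Python. The helpers below simply assemble the command lines shown in this README (flag names are taken from the examples above; the restriction to 30/60/90/180 orbit angles is an assumption based on the reconstruction step):

```python
import shlex

def build_recon_cmd(input_video: str, cam: int, output_dir: str) -> list[str]:
    """Assemble the recon.py command line from this README's example."""
    assert cam in (30, 60, 90, 180), "orbit angles shown in the README"
    return ["python", "recon.py",
            "--input_video", input_video,
            "--cam", str(cam),
            "--output_dir", output_dir]

def build_generate_cmd(color_video: str, mask_video: str, output_video: str) -> list[str]:
    """Assemble the generate.py command line from this README's example."""
    return ["python", "generate.py",
            "--color_video", color_video,
            "--mask_video", mask_video,
            "--output_video", output_video]

cmd = build_recon_cmd("examples/flower/input.mp4", 180, "examples/flower")
print(shlex.join(cmd))
```

Pass each list to `subprocess.run` to execute the corresponding stage.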
*Input video ➜ output video comparisons are available on the project homepage.*
## 📊 User Study Results
- **70.7%** of participants preferred EX-4D over baseline methods
- Superior performance in physical consistency and extreme viewpoint quality
- Significant improvement as camera angles become more extreme
## 🎯 Applications
- **🎮 Gaming**: Create immersive 3D game cinematics from 2D footage
- **🎬 Film Production**: Generate novel camera angles for post-production
- **🥽 VR/AR**: Create free-viewpoint video experiences
- **📱 Social Media**: Generate dynamic camera movements for content creation
- **🏢 Architecture**: Visualize spaces from multiple viewpoints
## ⚠️ Limitations
- **Depth Dependency**: Performance relies on monocular depth estimation quality
- **Computational Cost**: Requires significant computation for high-resolution videos
- **Reflective Surfaces**: Challenges with reflective or transparent materials
## 🔮 Future Work
- [ ] Real-time inference optimization (3DGS / 4DGS)
- [ ] Support for higher resolutions (1K, 2K)
- [ ] Neural mesh refinement techniques
## 🙏 Acknowledgments
We would like to thank the [DiffSynth-Studio v1.1.1](https://github.com/modelscope/DiffSynth-Studio/tree/v1.1.1) project for providing the foundational diffusion framework.
## 📚 Citation
If you find our work useful, please consider citing:
```bibtex
@misc{hu2025ex4dextremeviewpoint4d,
  title={EX-4D: EXtreme Viewpoint 4D Video Synthesis via Depth Watertight Mesh},
  author={Tao Hu and Haoyang Peng and Xiao Liu and Yuewen Ma},
  year={2025},
  eprint={2506.05554},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2506.05554},
}
```