# HiMoR: Monocular Deformable Gaussian Reconstruction with Hierarchical Motion Representation (CVPR 2025)

### [Project Page](https://pfnet-research.github.io/himor/) | [arXiv](https://arxiv.org/abs/2504.06210)

![video](assets/mochi.gif)

## Installation

Follow the instructions below to set up the environment:

```bash
# Create a new conda environment
conda create -n himor python=3.10
conda activate himor

# Install dependencies
conda install -c "nvidia/label/cuda-11.8.0" cuda-toolkit
pip install -r requirements.txt
pip install git+https://github.com/nerfstudio-project/gsplat.git
pip install "git+https://github.com/facebookresearch/pytorch3d.git"
pip install git+https://github.com/rahul-goel/fused-ssim/ --no-build-isolation
```

## Data preparation

### iPhone Dataset

Download the preprocessed iPhone dataset from [here](https://github.com/vye16/shape-of-motion?tab=readme-ov-file#evaluation-on-iphone-dataset) and place it under `./data/iphone/`, the path used by the training commands. Pretrained checkpoints are available [here](https://drive.google.com/file/d/1s8pTSbUrfhrADYdsdB1X2C-Hle-k0ZzB/view?usp=sharing).

### Nvidia Dataset

We use the dataset provided by [Gaussian Marbles](https://github.com/coltonstearns/dynamic-gaussian-marbles), with foreground masks recomputed using the preprocessing scripts from [Shape of Motion](https://github.com/vye16/shape-of-motion). Download the preprocessed dataset from [here](https://github.com/coltonstearns/dynamic-gaussian-marbles?tab=readme-ov-file#downloading-data) and place it under `./data/nvidia/`.

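Before training, it can help to confirm that the downloaded data sits where the commands in this README expect it. The snippet below is a minimal sketch, not part of the repository: `missing_paths` is a hypothetical helper, and the directory names are copied from the training and preprocessing commands in this README, so adjust them to the scenes you actually downloaded.

```python
from pathlib import Path


def missing_paths(root, relative_paths):
    """Return the entries of relative_paths that do not exist under root."""
    root = Path(root)
    return [rel for rel in relative_paths if not (root / rel).exists()]


if __name__ == "__main__":
    # Directory names are taken from the example commands in this README
    # (an assumption about your local layout, not a check done by HiMoR).
    expected = [
        "iphone/paper-windmill",
        "iphone/paper-windmill/depth/1x",
        "iphone/paper-windmill/flow3d_preprocessed/depth_anything/1x",
        "nvidia/Balloon1",
    ]
    missing = missing_paths("./data", expected)
    if missing:
        print("Missing under ./data:")
        for rel in missing:
            print("  " + rel)
    else:
        print("All expected dataset directories were found.")
```
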
### Custom Dataset

To train on a custom dataset, follow the preprocessing instructions provided by [Shape of Motion](https://github.com/vye16/shape-of-motion). Note that the data must be formatted following the iPhone dataset structure.

## Visualization

To visualize results in an interactive viewer, first download the pretrained checkpoints, then run the following command, supplying the checkpoint path after `--ckpt-path`:

```bash
python run_rendering.py --ckpt-path
```

## Training

### iPhone Dataset

For better reconstruction, especially of the background, train with the refined camera poses:

```bash
python run_training.py \
    --work-dir ./outputs/paper-windmill \
    --port 8888 \
    data:iphone \
    --data.data-dir ./data/iphone/paper-windmill \
    --data.depth_type depth_anything_colmap \
    --data.camera_type refined
```

In the paper, we report results using the original camera poses:

```bash
# First, align monocular depth with LiDAR depth.
python preproc/align_monodepth_with_lidar.py \
    --lidar_depth_dir ./data/iphone/paper-windmill/depth/1x/ \
    --input_monodepth_dir ./data/iphone/paper-windmill/flow3d_preprocessed/depth_anything/1x \
    --output_monodepth_dir ./data/iphone/paper-windmill/flow3d_preprocessed/aligned_depth_anything_lidar/1x \
    --matching_pattern "0*"

# Then, run training.
python run_training.py \
    --work-dir ./outputs/paper-windmill \
    --port 8888 \
    data:iphone \
    --data.data-dir ./data/iphone/paper-windmill \
    --data.depth_type depth_anything_lidar \
    --data.camera_type original
```

### Nvidia Dataset

Train with the following command:

```bash
python run_training.py \
    --work-dir ./outputs/Balloon1 \
    --num_fg 20000 \
    --num_bg 40000 \
    --num_epochs 800 \
    --port 8888 \
    data:nvidia \
    --data.data-dir ./data/nvidia/Balloon1 \
    --data.depth_type lidar \
    --data.camera_type original
```

## Evaluation

Ensure that the checkpoint file `outputs//checkpoints/last.ckpt` is available. You can obtain it either by training the model or by downloading the provided checkpoints.

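The checkpoint requirement above can be verified with a few lines of Python before launching evaluation. This is a minimal sketch under stated assumptions: `find_last_checkpoint` is a hypothetical helper that is not part of the repository, and `./outputs/paper-windmill` is simply the `--work-dir` used in the training example.

```python
from pathlib import Path


def find_last_checkpoint(work_dir):
    """Return <work_dir>/checkpoints/last.ckpt if it is a file, else None."""
    ckpt = Path(work_dir) / "checkpoints" / "last.ckpt"
    return ckpt if ckpt.is_file() else None


if __name__ == "__main__":
    # The work directory is an assumption taken from the training example.
    ckpt = find_last_checkpoint("./outputs/paper-windmill")
    if ckpt is None:
        print("No checkpoint found; train the model or download one first.")
    else:
        print(f"Found checkpoint: {ckpt}")
```
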
### Render Images

Use the checkpoint to render images:

```bash
python run_evaluation.py \
    --work-dir outputs/paper-windmill/ \
    --ckpt-path outputs/paper-windmill/checkpoints/last.ckpt \
    data:iphone \
    --data.data-dir ./data/iphone/paper-windmill
```

### Compute Metrics

Evaluate the rendered images to compute quantitative metrics:

```bash
# For the iPhone dataset
PYTHONPATH="." python scripts/evaluate_iphone.py \
    --data_dir ./data/iphone/paper-windmill \
    --result_dir ./outputs/paper-windmill/

# For the Nvidia dataset
PYTHONPATH="." python scripts/evaluate_nvidia.py \
    --data_dir ./data/nvidia/Balloon1/ \
    --result_dir ./outputs/Balloon1/
```

## Citation

```
@inproceedings{liang2025himor,
  author    = {Liang, Yiming and Xu, Tianhan and Kikuchi, Yuta},
  title     = {{H}i{M}o{R}: Monocular Deformable Gaussian Reconstruction with Hierarchical Motion Representation},
  booktitle = {CVPR},
  year      = {2025},
}
```

## Acknowledgement

Our implementation builds on [Shape of Motion](https://github.com/vye16/shape-of-motion). We thank the authors for open-sourcing their code.