# Back on Track: Bundle Adjustment for Dynamic Scene Reconstruction **[ICCV 2025, Oral]**

This repository contains the official implementation of [BA-Track](https://wrchen530.github.io/projects/batrack/). Our method achieves dynamic scene reconstruction via motion decoupling, bundle adjustment, and global refinement.

> **Back on Track: Bundle Adjustment for Dynamic Scene Reconstruction**
> [Weirong Chen](https://wrchen530.github.io/), [Ganlin Zhang](https://ganlinzhang.xyz/), [Felix Wimbauer](https://fwmb.github.io/), [Rui Wang](https://rui2016.github.io/), [Nikita Araslanov](https://arnike.github.io/), [Andrea Vedaldi](https://www.robots.ox.ac.uk/~vedaldi/), [Daniel Cremers](https://cvg.cit.tum.de/members/cremers)
> ICCV 2025

**[[Paper](https://arxiv.org/abs/2504.14516)] [[Project Page](https://wrchen530.github.io/projects/batrack/)]**

Demo

## Todo

- [x] Initial release with demo
- [x] Release pre-trained checkpoints
- [x] Add scripts for evaluation
- [ ] Add visualization for motion decoupling
- [ ] Add scripts for training data preparation

## Setting Up the Environment

### Requirements

The code was tested on Ubuntu 22.04, PyTorch 2.1.1, and CUDA 11.8 with an NVIDIA A40. Follow the steps below to set up the environment.

### Clone the repository

```
git clone https://github.com/wrchen530/batrack.git
cd batrack
```

### Create a conda environment and install dependencies

```
conda env create -f environment.yml
conda activate batrack
pip install -r requirements.txt
```

### Install the batrack package

```
wget https://gitlab.com/libeigen/eigen/-/archive/3.4.0/eigen-3.4.0.zip
unzip eigen-3.4.0.zip -d thirdparty
pip install .
```

### Install xformers for UniDepth

To install xformers for the UniDepth model, follow the instructions at https://github.com/facebookresearch/xformers. If you encounter installation issues, we recommend installing from a prebuilt package. For example, for Python 3.10 + CUDA 11.8 + PyTorch 2.1.1:

```
wget https://anaconda.org/xformers/xformers/0.0.23/download/linux-64/xformers-0.0.23-py310_cu11.8.0_pyt2.1.1.tar.bz2
conda install xformers-0.0.23-py310_cu11.8.0_pyt2.1.1.tar.bz2
```

## Demo with DAVIS

We follow [MegaSAM](https://github.com/mega-sam/mega-sam) to extract monocular depth priors from UniDepthV2 and DepthAnythingV2. Then we run our method in two stages: (1) sparse SLAM and (2) dense global alignment.

### Download sample sequence

- Download the sample DAVIS sequence from [Google Drive](https://drive.google.com/file/d/1hlHyxqW0AaPrv6NwJya0P2UJpcVNYbBU/view?usp=drive_link) and save it to `data/davis`.

### Download checkpoints

- Download the DepthAnythingV2 checkpoint from [this link](https://huggingface.co/depth-anything/Depth-Anything-V2-Large/resolve/main/depth_anything_v2_vitl.pth) and save it to `batrack/Depth-Anything/checkpoints/depth_anything_v2_vitl.pth`.
- Download our tracker checkpoint from [Google Drive](https://drive.google.com/file/d/1wWK_ur0Pr4jivqDUdyRUFzHPaF-f1clC/view?usp=sharing) and save it to `batrack/checkpoints/md_tracker.pth`.

### Step 1: Monocular Depth Estimation

Compute monocular depth priors from UniDepthV2 and DepthAnythingV2, and align their scales:

```
bash scripts/demo/run_mono_depth.sh
```

### Step 2: Sparse SLAM

Run the sparse SLAM pipeline to perform motion decoupling and bundle adjustment for pose estimation and initial sparse reconstruction:

```
bash scripts/demo/run_sparse.sh
```

### Step 3: Dense Global Alignment

Perform dense global alignment to refine the reconstruction using monocular depth priors:

```
bash scripts/demo/run_dense.sh
```

### Step 4: Visualization (Optional)

Visualize reconstruction results with Rerun:

```
bash scripts/demo/run_vis.sh
```

## Evaluations

We provide evaluation scripts for MPI-Sintel and TartanAir-Shibuya.

### MPI-Sintel

Download MPI-Sintel from [MPI-Sintel](http://sintel.is.tue.mpg.de/) and place it in the `data` folder at `data/sintel`. For evaluation, also download the [ground-truth camera pose data](http://sintel.is.tue.mpg.de/depth). The folder structure should look like:

```
sintel
└── training
    ├── final
    └── camdata_left
```

**Precomputed depths.** To avoid environment/dependency conflicts, we provide precomputed ZoeDepth results at [this link](https://drive.google.com/file/d/1y8zPOMlwRzeP43RBKgA6gg8-_EjlurCy/view?usp=drive_link). Download and place the folder at `data/Monodepth/sintel/zoedepth_nk`.
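Pose accuracy on MPI-Sintel is typically reported as absolute trajectory error (ATE) after aligning the estimated camera trajectory to the ground truth. The snippet below is a minimal, self-contained sketch of that metric using a similarity (Umeyama) alignment on hypothetical `(N, 3)` camera-position arrays; it is for illustration only and is not the repository's evaluation code (use the evaluation scripts below for the official numbers).

```python
import numpy as np

def umeyama_alignment(src, dst):
    """Closed-form similarity transform (s, R, t) mapping src onto dst, both (N, 3).

    Standard Umeyama solution, commonly used to align trajectories before ATE.
    """
    mu_src, mu_dst = src.mean(0), dst.mean(0)
    src_c, dst_c = src - mu_src, dst - mu_dst
    cov = dst_c.T @ src_c / src.shape[0]
    U, D, Vt = np.linalg.svd(cov)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1.0  # fix reflections
    R = U @ S @ Vt
    var_src = (src_c ** 2).sum() / src.shape[0]
    s = np.trace(np.diag(D) @ S) / var_src
    t = mu_dst - s * R @ mu_src
    return s, R, t

def ate_rmse(pred_xyz, gt_xyz):
    """Root-mean-square ATE between Sim(3)-aligned predicted and ground-truth positions."""
    s, R, t = umeyama_alignment(pred_xyz, gt_xyz)
    aligned = (s * (R @ pred_xyz.T)).T + t
    return float(np.sqrt(((aligned - gt_xyz) ** 2).sum(axis=1).mean()))

if __name__ == "__main__":
    # Toy example with a synthetic trajectory (hypothetical data, not Sintel).
    rng = np.random.default_rng(0)
    gt = rng.normal(size=(50, 3))
    pred = 2.0 * gt + 0.1  # scaled and shifted copy of the ground truth
    print(f"ATE-RMSE: {ate_rmse(pred, gt):.4f}")  # near zero after alignment
```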
Run pose evaluation:

```
bash scripts/eval_sintel/eval_sintel_pose.sh
```

Run depth evaluation:

```
bash scripts/eval_sintel/eval_sintel_depth.sh
```

### TartanAir-Shibuya

Download TartanAir-Shibuya following the instructions at [TartanAir-Shibuya](https://github.com/haleqiu/tartanair-shibuya) and place it in the `data` folder at `data/shibuya`.

For `RoadCrossing07/image_0`, skip the first 5 images (000000.png to 000004.png) because there is no depth ground truth. You can delete these files with:

```bash
# Delete the first 5 images (000000.png to 000004.png) for RoadCrossing07/image_0
rm data/shibuya/RoadCrossing07/image_0/00000{0,1,2,3,4}.png
```

**Precomputed depths.** To avoid environment/dependency conflicts, we provide precomputed ZoeDepth results at [this link](https://drive.google.com/file/d/14XHNH9WNDf3fMm5rGNDH-NP1n00eUpHF/view?usp=drive_link). Download and place the folder at `data/Monodepth/shibuya/zoedepth_nk`.

Run pose evaluation:

```
bash scripts/eval_shibuya/eval_shibuya_pose.sh
```

Run depth evaluation:

```
bash scripts/eval_shibuya/eval_shibuya_depth.sh
```

## Citations

If you find this repository useful, please consider citing our paper:

```
@InProceedings{chen2025back,
  title={Back on Track: Bundle Adjustment for Dynamic Scene Reconstruction},
  author={Chen, Weirong and Zhang, Ganlin and Wimbauer, Felix and Wang, Rui and Araslanov, Nikita and Vedaldi, Andrea and Cremers, Daniel},
  booktitle={IEEE/CVF International Conference on Computer Vision (ICCV)},
  year={2025}
}
```

## Acknowledgements

We adapted code from several excellent repositories, including:

- [CoTracker](https://github.com/facebookresearch/co-tracker)
- [SpaTracker](https://github.com/henry123-boy/SpaTracker)
- [DPVO](https://github.com/princeton-vl/DPVO)
- [LEAP-VO](https://github.com/wrchen530/leapvo)
- [MegaSAM](https://github.com/mega-sam/mega-sam)

We sincerely thank the authors for open-sourcing their work.

## Concurrent Efforts

Several exciting concurrent works explore related aspects of dynamic scene reconstruction and point tracking! Check them out:

- **[SpaTrackerV2](https://github.com/henry123-boy/SpaTrackerV2)** - SpatialTrackerV2: 3D Point Tracking Made Easy
- **[MVTracker](https://github.com/ethz-vlg/mvtracker)** - Multi-View 3D Point Tracking
- **[C4D](https://littlepure2333.github.io/C4D/)** - C4D: 4D Made from 3D through Dual Correspondences

## Limitations

This project attempts to disentangle camera-induced and object motion via point tracking. The model was trained on a relatively small, domain-specific dataset (Kubric), which may limit its generalization to challenging or novel scenes. Future directions include expanding the training data and refining the tracker architecture to improve robustness and efficiency.