# SC-SfMLearner

This codebase implements the system described in the paper:

**Unsupervised Scale-consistent Depth and Ego-motion Learning from Monocular Video**

[Jia-Wang Bian](https://jwbian.net/), Zhichao Li, Naiyan Wang, Huangying Zhan, Chunhua Shen, Ming-Ming Cheng, Ian Reid

**NeurIPS** 2019

See the paper on [[arXiv](https://arxiv.org/abs/1908.10553)] and the [[project webpage](https://jwbian.net/sc-sfmlearner/)] for more details.

## Video demo of dense reconstruction using estimated depth

[![reconstruction demo](https://jwbian.net/Data/reconstruction.png)](https://www.youtube.com/watch?v=i4wZr79_pD8)

## Highlighted Features

1. A geometry consistency loss that enforces scale-consistent depth predictions between consecutive frames (a minimal sketch of this loss and the mask is given at the end of the Training section below).
2. A self-discovered mask for detecting moving objects and occlusions.
3. Enabling the unsupervised estimator (learned from monocular videos) to perform visual odometry on long videos.

## Preamble

This codebase was developed and tested with Python 3.6, PyTorch 1.0.1, and CUDA 10.0 on Ubuntu 16.04. It is based on [Clement Pinard's SfMLearner implementation](https://github.com/ClementPinard/SfmLearner-Pytorch), to which we make only minor modifications and add our proposed losses.

## Prerequisite

```bash
pip3 install -r requirements.txt
```

or install the following packages manually:

```
torch >= 1.0.1
imageio
matplotlib
scipy
argparse
tensorboardX
blessings
progressbar2
path.py
evo
```

It is also advisable to install the Python 3 bindings for OpenCV for the tensorboard visualizations.

## Preparing training data

See "scripts/run_prepare_data.sh" for examples covering KITTI Raw, Cityscapes, and KITTI Odometry.

For the [KITTI Raw dataset](http://www.cvlibs.net/datasets/kitti/raw_data.php), download the data using the [script](http://www.cvlibs.net/download.php?file=raw_data_downloader.zip) provided on the official website.

For [Cityscapes](https://www.cityscapes-dataset.com/), download the following packages: 1) `leftImg8bit_sequence_trainvaltest.zip`, 2) `camera_trainvaltest.zip`. You will probably need to contact the administrators to get access.

For the [KITTI Odometry dataset](http://www.cvlibs.net/datasets/kitti/eval_odometry.php), download the dataset with color images.

## Training

The "scripts" folder provides several examples for training and testing.

You can train the depth model on KITTI Raw by running

```bash
sh scripts/train_resnet_256.sh
```

or train the pose model on KITTI Odometry by running

```bash
sh scripts/train_posenet_256.sh
```

Then you can start a `tensorboard` session in this folder by

```bash
tensorboard --logdir=checkpoints/
```

and visualize the training progress by opening [http://localhost:6006](http://localhost:6006) in your browser.
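For intuition about the two proposed terms from the feature list above, here is a minimal PyTorch-style sketch of the geometry consistency loss and the self-discovered mask. It assumes the usual view-synthesis step has already produced the two warped depth tensors; the function and variable names are illustrative, not the repository's actual API.

```python
# Minimal sketch of the geometry consistency loss and self-discovered mask.
# Assumed inputs (names are hypothetical, produced by the warping step):
#   computed_depth:  depth of the reference frame transformed into the
#                    target view with the predicted pose (B x 1 x H x W)
#   projected_depth: the target view's predicted depth sampled at the
#                    corresponding projected pixel coordinates (B x 1 x H x W)
import torch


def geometry_consistency(computed_depth: torch.Tensor,
                         projected_depth: torch.Tensor):
    """Return (geometry consistency loss, self-discovered weight mask)."""
    # Normalized depth inconsistency in [0, 1): identical depths give 0,
    # strong disagreement (moving objects, occlusions) approaches 1.
    diff = (computed_depth - projected_depth).abs() / (
        computed_depth + projected_depth)
    loss = diff.mean()   # penalizes scale-inconsistent depth predictions
    mask = 1.0 - diff    # high where the two views agree geometrically
    return loss, mask


if __name__ == "__main__":
    d1 = torch.rand(4, 1, 64, 64) + 0.1  # dummy positive depth maps
    d2 = torch.rand(4, 1, 64, 64) + 0.1
    loss, mask = geometry_consistency(d1, d2)
    print(loss.item(), mask.shape)
```

In the paper, the mask then weights the photometric loss, so pixels whose depths disagree across views (typically moving objects and occlusions) contribute less to training.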
## Evaluation

You can evaluate depth on Eigen's split by running

```bash
sh scripts/run_depth_test.sh
```

and test visual odometry by running

```bash
sh scripts/run_vo_test.sh
```

You can evaluate the visual odometry results using the official KITTI C++ evaluation code, or using the Python code at this [repo](https://github.com/Huangying-Zhan/kitti_odom_eval).

You can also evaluate 5-frame pose in the same way as SfMLearner by running

```bash
sh scripts/run_pose_test.sh
```

## Pretrained Models

[Available here](https://1drv.ms/u/s!AiV6XqkxJHE2g2LA8enHaQQOg0jZ?e=FNbH3c)

Note that the depth models are trained on the KITTI Raw dataset and the pose models are trained on the KITTI Odometry dataset; they are not coupled.

### Depth Results (KITTI Eigen's split)

| Models     | Abs Rel | Sq Rel | RMSE  | RMSE(log) | Acc.1 | Acc.2 | Acc.3 |
|------------|---------|--------|-------|-----------|-------|-------|-------|
| k_depth    | 0.137   | 1.089  | 5.439 | 0.217     | 0.830 | 0.942 | 0.975 |
| cs+k_depth | 0.128   | 1.047  | 5.234 | 0.208     | 0.846 | 0.947 | 0.976 |

### Visual Odometry Results (trained on KITTI Odometry 00-08)

Here, t_err is the average translational drift (%) and r_err is the average rotational drift (degrees per 100 m), as defined by the official KITTI odometry evaluation.

| Models     |                     | Seq. 09 | Seq. 10 |
|------------|---------------------|---------|---------|
| k_pose     | t_err (%)           | 11.2    | 10.1    |
|            | r_err (degree/100m) | 3.35    | 4.96    |
| cs+k_pose  | t_err (%)           | 8.24    | 10.7    |
|            | r_err (degree/100m) | 2.19    | 4.58    |

## If you use this work, please cite our paper

```
@inproceedings{bian2019depth,
  title={Unsupervised Scale-consistent Depth and Ego-motion Learning from Monocular Video},
  author={Bian, Jia-Wang and Li, Zhichao and Wang, Naiyan and Zhan, Huangying and Shen, Chunhua and Cheng, Ming-Ming and Reid, Ian},
  booktitle={Thirty-third Conference on Neural Information Processing Systems (NeurIPS)},
  year={2019}
}
```
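For reference, the columns of the depth results table above are the standard monocular depth metrics of Eigen et al. Below is a minimal NumPy sketch of how these metrics are commonly computed; it is illustrative only, and the repository's own evaluation script may differ in details such as the depth cap and the evaluation crop.

```python
# Standard depth metrics: Abs Rel, Sq Rel, RMSE, RMSE(log), and the
# delta < 1.25^k accuracies (Acc.1-3). Illustrative sketch, not the
# repository's evaluation code.
import numpy as np


def depth_metrics(gt: np.ndarray, pred: np.ndarray):
    # Monocular predictions are defined only up to scale, so the median
    # of the prediction is usually aligned to the ground truth first.
    pred = pred * np.median(gt) / np.median(pred)

    thresh = np.maximum(gt / pred, pred / gt)
    acc1 = (thresh < 1.25).mean()        # Acc.1
    acc2 = (thresh < 1.25 ** 2).mean()   # Acc.2
    acc3 = (thresh < 1.25 ** 3).mean()   # Acc.3

    abs_rel = np.mean(np.abs(gt - pred) / gt)
    sq_rel = np.mean((gt - pred) ** 2 / gt)
    rmse = np.sqrt(np.mean((gt - pred) ** 2))
    rmse_log = np.sqrt(np.mean((np.log(gt) - np.log(pred)) ** 2))
    return abs_rel, sq_rel, rmse, rmse_log, acc1, acc2, acc3
```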