# FBNet **Repository Path**: shipxu/FBNet ## Basic Information - **Project Name**: FBNet - **Description**: No description available - **Primary Language**: Unknown - **License**: Not specified - **Default Branch**: master - **Homepage**: None - **GVP Project**: No ## Statistics - **Stars**: 0 - **Forks**: 0 - **Created**: 2020-10-31 - **Last Updated**: 2024-11-26 ## Categories & Tags **Categories**: Uncategorized **Tags**: None ## README # **FBNet** This repository reproduces the results of the following paper: [**FBNet: Hardware-Aware Efficient ConvNet Design via Differentiable Neural Architecture Search**](https://arxiv.org/pdf/1812.03443.pdf) Bichen Wu1, Xiaoliang Dai, Peizhao Zhang, Yanghan Wang, Fei Sun, Yiming Wu, Yuandong Tian, Peter Vajda, Yangqing Jia, Kurt Keutzer (Fasebook Research) Layers to Search are from a [FacebookResearch repository](https://github.com/facebookresearch/maskrcnn-benchmark/tree/master/maskrcnn_benchmark/modeling/backbone) Utils stuff is taken from [DARTS repository](https://github.com/quark0/darts/blob/master/cnn/utils.py) # Advantages * Building blocks (searched layers) was taken from the FacebookResearch repository (Quick Note: their repo consists files with names fbnet*, but doesn't include any fbnet architecture from their paper) * Latency Calculation Code * Successfully Tested on Cifar10 * Logging. You can find all my logs and tensorboards into *SAVED_LOGS/architectures_training* (for architectures training) # Disadvantages * Loss : $CE(a, w_a) · α β log(LAT(a))$ instead of $CE(a, w_a) · α log(LAT(a))^β$ (occasionally) * SAVED_LOGS/supernet_training* (for supernet training) - logs with validation on the training data to thetas optimization (bug, code was fixed) * No MultiGPU Support yet * Training only CIFAR10 # Good News! FacebookResearch published weights for the resulted architectures: FBNet-A, FBNet-B & FBNet-C (trained on imagenet) https://github.com/facebookresearch/mobile-vision # Results, Cifar10 > The architectures are not SOTAs: we **search only for filters' sizes** (these numbers are good for the simple architecture) and the goal is to **Reduce Inference Time for Your Device** > FacebookResearch didn't share latencies for their test machines, so, I couldn't prove their latencies results, but I have builded and trained theier proposed architectures: | FBNet Architecture | **top1** validation accuracy | **top3** validation accuracy | | ------ | ------ | ------ | | FBNet-A | 78.8% | 95.4% | | FBNet-B | 82% | 96% | | FBNet-C | 79.9% | 95.6% | | FBNet-s8 (for Samsung Galaxy S8) | 79.6% | 95.7% | | FBNet-iPhoneX | 76.2% | 94.3% | | ------ | ------ | ------ | | fbnet_cpu_sample1 | 82.8% | 98.9% | | fbnet_cpu_sample2 | 80.6% | 95.7% | **Note: be cautious!** these numbers are just validation's the bests (without confidence intervals, measured in a single run). Do not use these numbers to make decisions. They are here to compliment the tensorboards in the `SAVED_LOGS` directory. The reason why I don't split data into validation and test is *in the next note*. **Note**: as it was stated in the paper and according to my results, if we train with small images (as cifar's 32x32), we can see a lot of 'skip' layers into the resulting architecture. I feel, for cifar10 we should search for less number of layers # FBNet Optimization performance We have no theoretical guarantees of converjence. So, I run a distinct checking experiment to compare the method with the pioneer of gradient NASes in application to search a part of a NN. See **[DARTS VS FBNet.md](https://github.com/AnnaAraslanova/FBNet/blob/master/DARTS_VS_FBNET.md)** for results # Code Structure and Training Pipeline, Cifar10 The repository consists of 2 Neural Net Models: **(1)** FBNet Searched Architectures. All tools in the *architecture_functions* folder **(2)** Stochastic SuperNet to search for new architectures. All tools in the *supernet_functions* folder > They use different functions and architectures specification. Functions used by both Nets are in the folders: *general_functions* (utilities) and *fbnet_building_blocks* (modified code of facebookresearch team) I encourage you to visit **[TRAINIG_DETAILS.md](https://github.com/AnnaAraslanova/FBNet/blob/master/TRAINIG_DETAILS.md) in this folder** for details and instructions. # Dependencies I have tested the code with the following dockerfile: [DOCKERFILE](https://github.com/facebookresearch/maskrcnn-benchmark/blob/master/docker/Dockerfile) (Pytorch 0.4.1 Nightly) btw I think it should work well with Pytorch 0.4.0+ # License MIT