# AlphaGOZero-python-tensorflow **Repository Path**: bobhu2020/AlphaGOZero-python-tensorflow ## Basic Information - **Project Name**: AlphaGOZero-python-tensorflow - **Description**: Congratulation to DeepMind! This is a reengineering implementation (on behalf of many other git repo in /support/) of DeepMind's Oct19th publication: [Mastering the Game of Go without Human Knowledge]. The supervised learning approach is more practical for individuals. (This repository has single purpose of education only) - **Primary Language**: Unknown - **License**: MIT - **Default Branch**: master - **Homepage**: None - **GVP Project**: No ## Statistics - **Stars**: 0 - **Forks**: 1 - **Created**: 2020-07-21 - **Last Updated**: 2020-12-19 ## Categories & Tags **Categories**: Uncategorized **Tags**: None ## README # AlphaGOZero (python tensorflow implementation) This is a trial implementation of DeepMind's Oct19th publication: [Mastering the Game of Go without Human Knowledge](https://www.nature.com/articles/nature24270.epdf?author_access_token=VJXbVjaSHxFoctQQ4p2k4tRgN0jAjWel9jnR3ZoTv0PVW4gB86EEpGqTRDtpIz-2rmo8-KG06gqVobU5NSCFeHILHcVFUeMsbvwS-lxjqQGg98faovwjxeTUgZAUMnRQ). **DeepMind release [AlphaZero Teaching Go](https://alphagoteach.deepmind.com)**. It's a lot of fun! --- # From Paper Pure RL has outperformed supervised learning+RL agent ![](/figure/rl_vs_sl.png) # SL evaluation ![](/figure/Nov20large20eval.png) ## Download trained model 1. [https://drive.google.com/drive/folders/1Xs8Ly3wjMmXjH2agrz25Zv2e5-yqQKaP?usp=sharing](https://drive.google.com/drive/folders/1Xs8Ly3wjMmXjH2agrz25Zv2e5-yqQKaP?usp=sharing) 2. Place under ./savedmodels/large20/ --- # Set up ## Install requirement python 3.6 tensorflow/tensorflow-gpu (version 1.4, version >= 1.5 can't load trained models) ``` pip install -r requirement.txt ``` ## Download Dataset (kgs 4dan) Under repo's root dir ``` cd data/download chmod +x download.sh ./download.sh ``` ## Preprocess Data *It is only an example, feel free to assign your local dataset directory* ``` python preprocess.py preprocess ./data/SGFs/kgs-* ``` ## Train A Model ``` python main.py --mode=train ``` ## Play Against An A.I. ``` python main.py --mode=gtp —-gtp_poliy=greedypolicy --model_path='./savedmodels/your_model.ckpt' ``` ## Play in Sabaki ![](/figure/Sabaki.png) 1. In console: ``` which python ``` add result to the headline of ```main.py``` with ```#!``` prefix. 2. Add the path of ```main.py``` to Sabaki's manage Engine with argument ```--mode=gtp``` # TODO: - [x] AlphaGo Zero Architecture - [x] Supervised Training - [x] Self Play pipeline - [x] Go Text Protocol - [x] Sabaki Engine enabled - [ ] *Tabula rasa* (failed) - [x] Distributed learning # Credit (orderless): *Brain Lee *Ritchie Ng *Samuel Graván *森下 健 *yuanfengpang