Chess reinforcement learning by AlphaGo Zero methods.

This project is based on these main resources:

1) DeepMind's Oct 19th publication: Mastering the Game of Go without Human Knowledge.
2) The great Reversi development of the DeepMind ideas, done in its author's repo.
3) DeepMind just released a new version of AlphaGo Zero (now named AlphaZero) in which they master chess from scratch. In fact, in chess AlphaZero outperformed Stockfish after just 4 hours (300k steps). Wow!

Note: This project is still under construction!!

New results, after a great number of modifications: using supervised learning on about 10k games, I trained a model (7 residual blocks of 256 filters) to a guesstimate of 1200 Elo with 1200 sims/move. In one example game, I (black) played against the model in the repo (white). One of the strengths of MCTS is that it scales quite well with computing power.
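To make the point about MCTS and computing power concrete, here is a minimal sketch of the PUCT selection rule used in AlphaZero-style search, reduced to a single root node. It is not the code from the repo: the function name `puct_visits`, the priors, the move values, and the constant `c_puct` are all hypothetical, and the fixed values stand in for what a network evaluation would return. The priors are deliberately misleading, so the sketch shows how more simulations let the search correct a bad first guess.

```python
import math

# Hypothetical numbers for three candidate moves (not taken from the repo).
PRIORS = [0.6, 0.1, 0.3]   # policy prior: favours move 0
VALUES = [0.1, 0.5, -0.2]  # evaluation in [-1, 1]: move 1 is actually best


def puct_visits(priors, values, n_sims, c_puct=1.5):
    """Run a root-only PUCT search and return the visit count per move."""
    n_moves = len(priors)
    visits = [0] * n_moves
    value_sum = [0.0] * n_moves

    for _ in range(n_sims):
        total = sum(visits)

        def score(a):
            # Q: mean value seen so far (0 for an unvisited move).
            q = value_sum[a] / visits[a] if visits[a] else 0.0
            # U: exploration bonus, large for high-prior, rarely visited moves.
            u = c_puct * priors[a] * math.sqrt(total + 1) / (1 + visits[a])
            return q + u

        best = max(range(n_moves), key=score)
        visits[best] += 1
        # Stand-in for expanding the move and evaluating the position.
        value_sum[best] += values[best]

    return visits


for sims in (10, 100, 1200):
    print(sims, puct_visits(PRIORS, VALUES, sims))
```

With only 10 simulations the most-visited move is still the one the prior favours; with 1200 simulations the visits concentrate on the move with the higher value. A full implementation would apply the same rule at every node of the tree and back up values along the visited path.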