How to Train Your ResNet 8: Bag of Tricks

Note: this post is also available as a Colab notebook here.

Whilst we’ve been otherwise occupied – investigating hyperparameter tuning, weight decay and batch norm – our entry for training CIFAR10 to 94% test accuracy has slipped five (!) places on the DAWNBench leaderboard. The top six entries all use 9-layer ResNets which are cousins – or twins – of the network …