Great work, this is really interesting. I've only read the README so far; I hope to find some time to look at the code later, but there are so many interesting hobby projects to work on!
About the global pooling: in the examples, you listed several channels that don't seem global to me. Last-move, urgency, etc. will have different values in different parts of the board. How would max-pooling and rebroadcasting those channels help make a local decision?
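
To make the question concrete, here is a minimal sketch of how I understand "max-pool and rebroadcast", assuming a NumPy feature array of shape (channels, height, width). The function name and the toy last-move plane are my own illustration, not taken from the repo, so the actual implementation may well differ.

```python
import numpy as np

def global_max_pool_and_broadcast(features):
    """Replace every channel with its global maximum, repeated at every point.

    features: array of shape (channels, height, width).
    This is my reading of the operation, not the project's actual code.
    """
    # One max value per channel, kept as shape (channels, 1, 1).
    pooled = features.max(axis=(1, 2), keepdims=True)
    # Copy that single value back to every board position.
    return np.broadcast_to(pooled, features.shape).copy()

# Hypothetical one-channel 3x3 "last-move" plane: a single 1 where the move was.
last_move = np.zeros((1, 3, 3))
last_move[0, 1, 2] = 1.0

out = global_max_pool_and_broadcast(last_move)
print(out[0])
```

In this toy case every position ends up with the same value, so the location of the last move is gone after pooling, which is exactly why I don't see how it helps a local decision.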