I was doing roughly the same thing, but with ZeroMQ. ZMQ port-binding issues made me drop it, though ZeroMQ itself was GREAT for communication.
What I ended up doing was way simpler, but I haven't pushed it to GitHub yet because it doesn't quite work:
- A master script that runs things, keeps track of win/loss, and invokes
- A regular bot, but it reads its parameters from a named file passed in on
ARGV (a very simple text file with a parameter per line)
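For anyone wanting to try the same thing, here is a minimal sketch of the bot-side parameter loading. It assumes one float per line in a fixed order. The names in `PARAM_NAMES` are hypothetical placeholders, not the actual parameters my bot uses.

```python
import argparse

# Hypothetical parameter names, for illustration only.
PARAM_NAMES = ["attack_threshold", "expand_weight", "wait_turns"]


def load_params(path):
    """Read one float per line, in the order given by PARAM_NAMES."""
    with open(path) as f:
        # Skip blank lines so a trailing newline does not break parsing.
        values = [float(line) for line in f if line.strip()]
    if len(values) != len(PARAM_NAMES):
        raise ValueError(
            f"expected {len(PARAM_NAMES)} values, got {len(values)}"
        )
    return dict(zip(PARAM_NAMES, values))


def parse_args(argv=None):
    """Parse the --paramsfile option used on the bot's command line."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--paramsfile", required=True)
    return parser.parse_args(argv)
```

Inside the bot you would call `load_params(parse_args().paramsfile)` once at startup and pass the resulting dict to whatever strategy functions you are tuning.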
The master script "creates" 4 bots, with random initial parameters, and invokes them in games like
./halite -q -t "python MyBot.py --paramsfile 1.txt" "python MyBot.py --paramsfile 2.txt" (a 4-player game just adds 3.txt and 4.txt)
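Roughly, the master script can build that command like the sketch below. The function names are made up for this example, and `run_game` only returns the raw output: I'm not showing how to pick the winner out of it, since that depends on what your halite build prints.

```python
import subprocess


def build_command(param_files, halite="./halite", bot="MyBot.py"):
    """Build the halite command line, one bot per parameter file."""
    cmd = [halite, "-q", "-t"]
    for path in param_files:
        # Each bot is passed to halite as a single quoted command string.
        cmd.append(f"python {bot} --paramsfile {path}")
    return cmd


def run_game(param_files):
    """Run one game and return halite's raw stdout (parsing is left out)."""
    result = subprocess.run(
        build_command(param_files),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

A 2-player game is `run_game(["1.txt", "2.txt"])`, and a 4-player game passes all four files.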
After each "round" of training (in which many 2- and 4-player games are run to see how bots 1, 2, 3, and 4 stack up), the best parameters are picked and mutated (this is almost like genetic programming, I suppose), and the next round runs 4 mutated variants of the last generation's winner. I thought about adding some fresh random parameters every few generations, but perturbing the previous winner's values seems to add enough. (I'm not doing any crossover like genetic programming might use.) The whole point is that nudging parameters randomly up or down may help keep us out of local maxima. I am not sure if that is true.
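The select-and-mutate step looks something like this sketch. It assumes additive Gaussian noise and that all 4 bots in the next round are mutated copies of the winner. The `sigma` value and the function names are illustrative, not what my script actually uses.

```python
import random


def mutate(params, sigma=0.1, rng=random):
    """Return a copy of params with Gaussian noise added to each value."""
    return [value + rng.gauss(0.0, sigma) for value in params]


def next_generation(population, wins, size=4, sigma=0.1, rng=random):
    """Pick the parameter set with the most wins and mutate it `size` times."""
    best_index = max(range(len(population)), key=lambda i: wins[i])
    best = population[best_index]
    # No crossover: every child descends from the single winner.
    return [mutate(best, sigma, rng) for _ in range(size)]
```

Here `population` is a list of parameter lists and `wins` is the matching list of win counts from the round. A larger `sigma` explores more but makes it easier to lose a good winner.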
This isn't ML in the deep learning sense. I'm using an approach much like @arjunvis's policy gradients / gradient descent to find good parameters for the functions that I'm tuning. (I also decided not to use @arjunvis's vector field, elegant as it is, in favor of something more akin to "hand-rolled strategies, enacted based on tuned parameters.")
I've yet to upload this version of the bot; I haven't had much time over the holidays to babysit training and play with starting parameters. The bot itself (which the parameters would be tuned for) isn't doing very well against my local stable of previous bots or against my ML bot that is descended from the ML Python/TensorFlow starter kit. (The current version of that ML TensorFlow bot is what is on my profile right now.)