A simple way to do this is to use a counter to keep track of each request, and only process a response if it's newer than any other processed so far.
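A minimal sketch of that counter idea, assuming responses arrive via callbacks and using a lock for thread safety; the class and method names here are illustrative, not from the post:

```python
import threading

class LatestOnly:
    """Process a response only if it is newer than any response handled so far."""

    def __init__(self):
        self._lock = threading.Lock()
        self._last_sent = 0     # counter stamped onto each outgoing request
        self._last_handled = 0  # highest counter whose response we processed

    def next_request_id(self):
        """Stamp a new outgoing request with a monotonically increasing id."""
        with self._lock:
            self._last_sent += 1
            return self._last_sent

    def handle(self, request_id, process):
        """Run process() only if no newer response has already been handled."""
        with self._lock:
            if request_id <= self._last_handled:
                return False  # stale response: a newer one already arrived
            self._last_handled = request_id
        process()
        return True
```

With this in place, an out-of-order older response is simply dropped rather than clobbering newer state.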
Continue reading "In-order responses for asynchronous work"
Averages are used, in some form or other, in many machine-learning algorithms. Stochastic gradient descent is a great example of an average in disguise, thin though the disguise may be.
Picking the right kind of average can be critical. As learning algorithms explore sub-optimal choices, the resulting negative impact on backed-up state values can persist over epochs, hampering performance. Conversely, some kinds of average never converge, preventing the algorithm from settling on optimal outcomes.
Here, I officially release a paper on a particular kind of average that's adaptable like an exponential moving average, but has guaranteed convergence like a simple average.
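To illustrate the trade-off the paper addresses (this is not the paper's method, just the two standard endpoints it sits between): a simple incremental average converges to the true mean but adapts ever more slowly, while an exponential moving average adapts at a constant rate but never settles.

```python
def simple_average(xs):
    """Incremental simple average: step size 1/n, guaranteed to converge."""
    mean = 0.0
    for n, x in enumerate(xs, start=1):
        mean += (x - mean) / n
    return mean

def exponential_average(xs, alpha=0.1):
    """Exponential moving average: constant step size alpha.

    Tracks a drifting signal well, but keeps bouncing by roughly
    alpha * noise forever, so it never converges on stationary data.
    """
    mean = 0.0
    for x in xs:
        mean += alpha * (x - mean)
    return mean
```

The paper's average, as described above, aims to keep the adaptability of the constant-alpha form while retaining the convergence guarantee of the 1/n form.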
Continue reading "Better averages for online machine-learning"
Implementing a Q-table reinforcement-learner is in many ways simple and straightforward, yet also somewhat tricky. The basic concept is easy to grasp; but, as many have noted, reinforcement-learners almost want to work, despite whatever bugs or sub-optimal math might be in the implementation.
Here are some quick notes about the approach I've come to use, specifically about action-selection (e.g. epsilon-greedy versus UCB) and managing learning-rates. They've helped my learners converge to good strategies faster and more reliably. Hopefully they can help you, too!
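For reference, here is a minimal sketch of the two pieces mentioned above: epsilon-greedy action-selection and a tabular Q-learning update with an explicit learning rate. The function names and hyperparameter values are illustrative, not taken from the post:

```python
import random

def epsilon_greedy(q_values, epsilon=0.1, rng=random):
    """With probability epsilon explore a random action; otherwise exploit."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])

def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.99):
    """One tabular Q-learning backup with learning rate alpha.

    q maps each state to a list of per-action value estimates.
    """
    best_next = max(q[next_state])
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
```

UCB action-selection would replace `epsilon_greedy` with a rule that adds an uncertainty bonus to each action's value; the update rule stays the same.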
Continue reading "Action-selection and learning-rates in Q-learning"
I recently worked on a small project simulating random events that were far too numerous to enumerate. In such cases, every bit of speed matters.
The project in this case was similar to determining the likelihood of five-card Poker hands in seven-card draws.
Simulation of shuffling the deck and drawing cards can take a large part of the runtime if not done well, but there's a trick that makes it almost trivial.
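One trick along these lines (possibly not the exact one in the post) is a partial Fisher-Yates shuffle: to draw k cards you only need to shuffle the first k positions, so a 7-card draw costs 7 swaps instead of a full 52-card shuffle.

```python
import random

def draw(deck, k, rng=random):
    """Draw k distinct cards via a partial Fisher-Yates shuffle.

    Shuffles the deck in place, but only the first k positions, which is
    all that is needed for the draw to be uniformly random.
    """
    n = len(deck)
    for i in range(k):
        j = rng.randrange(i, n)       # pick uniformly from the undrawn cards
        deck[i], deck[j] = deck[j], deck[i]
    return deck[:k]
```

Between simulated hands, there is no need to re-sort or fully re-shuffle the deck: the next partial shuffle re-randomizes everything the draw depends on.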
Continue reading "Simulating deck-shuffling"