
In-order responses for asynchronous work

Sometimes we end up calling an asynchronous function several times in a row when we only need the result of the latest call. The difficulty is that an earlier invocation may finish after the latest one. I encounter this most often in JavaScript, when calling an API in response to ongoing user input, such as looking up an address. Debouncing reduces the problem (and should be done anyway to lighten the load), but it does not eliminate it.

A simple way to handle this is to tag each request with a counter, and only process a response if it's newer than any processed so far.
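
A minimal sketch of that idea in TypeScript (the /api/address endpoint and the renderSuggestions helper are hypothetical stand-ins):

```typescript
// Monotonically increasing id; each call claims the next one.
let latestRequestId = 0;

function renderSuggestions(suggestions: string[]): void {
  console.log(suggestions); // stand-in for a real UI update
}

async function lookupAddress(query: string): Promise<void> {
  const requestId = ++latestRequestId; // tag this request

  const response = await fetch(`/api/address?q=${encodeURIComponent(query)}`);
  const suggestions: string[] = await response.json();

  // A newer request started while this one was in flight; drop this result.
  if (requestId !== latestRequestId) return;

  renderSuggestions(suggestions);
}
```

Because latestRequestId only moves forward, a stale response can never overwrite a newer one, no matter what order the network delivers them in.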

Continue reading "In-order responses for asynchronous work"

Better averages for online machine-learning

Averages are used, in some form or other, in many machine-learning algorithms. Stochastic gradient descent is a great example of an average in disguise, thin though it may be.
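
To see through the disguise, note that estimating a mean by SGD on the squared loss reproduces the textbook running-average update (a standard identity, not a result from the paper):

```latex
\[
  \ell_t(\mu) = \tfrac{1}{2}(\mu - x_t)^2
  \quad\Longrightarrow\quad
  \mu_{t+1} = \mu_t - \alpha_t(\mu_t - x_t)
            = (1 - \alpha_t)\,\mu_t + \alpha_t x_t .
\]
```

With a step size of 1/t this is exactly the incremental simple average; with a constant step size it is an exponential moving average.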

Picking the right kind of average can be critical. When a learning algorithm explores sub-optimal choices, the resulting damage to backed-up state values can persist across epochs, hampering performance. Conversely, some kinds of average never converge, preventing the algorithm from settling on optimal outcomes.

Here, I officially release a paper on a particular kind of average that's adaptable like an exponential moving average, but has guaranteed convergence like a simple average.
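
For context, here are the two endpoints of that trade-off as incremental updates (background only, not the algorithm from the paper):

```typescript
// Simple average: the 1/n weight shrinks over time, so the estimate
// converges, but it adapts ever more slowly to recent samples.
function simpleAverage(mean: number, sample: number, n: number): number {
  return mean + (sample - mean) / n;
}

// Exponential moving average: a fixed weight keeps it adaptable,
// but the estimate keeps fluctuating and never converges.
function expMovingAverage(mean: number, sample: number, alpha: number): number {
  return mean + alpha * (sample - mean);
}
```

The paper's average aims to sit between these: early adaptability with eventual convergence.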

Continue reading "Better averages for online machine-learning"

Action-selection and learning-rates in Q-learning

Implementing a Q-table reinforcement learner is in many ways simple and straightforward, yet somewhat tricky in places. The basic concept is easy to grasp; but, as many have noted, reinforcement learners almost want to work, despite whatever bugs or sub-optimal math might be lurking in the implementation.

Here are some quick notes about the approach I've come to use, specifically about action-selection (e.g. epsilon-greedy versus UCB) and managing learning-rates. They've helped my learners converge to good strategies faster and more reliably. Hopefully they can help you, too!
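
As a rough sketch of the two pieces in question (standard epsilon-greedy selection and a visit-count learning rate; the post's specific recommendations may differ):

```typescript
// Epsilon-greedy: explore with probability epsilon, otherwise pick the
// action with the highest estimated value.
function selectAction(q: number[], epsilon: number): number {
  if (Math.random() < epsilon) {
    return Math.floor(Math.random() * q.length); // random exploration
  }
  return q.indexOf(Math.max(...q)); // greedy exploitation
}

// Q-learning update with a learning rate that decays with the visit
// count of this state-action pair, so estimates settle as data accumulates.
function updateQ(
  q: number[],      // action-values for the current state
  visits: number[], // visit counts for the same actions
  action: number,
  reward: number,
  nextMax: number,  // max action-value of the next state
  gamma: number     // discount factor
): void {
  visits[action] += 1;
  const alpha = 1 / visits[action]; // decaying learning rate
  q[action] += alpha * (reward + gamma * nextMax - q[action]);
}
```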

Continue reading "Action-selection and learning-rates in Q-learning"

Simulating deck-shuffling

I recently worked on a small project simulating random events that were far too numerous to enumerate. In such cases, every bit of speed matters.

The project in this case was similar to determining the likelihood of five-card Poker hands in seven-card draws.

Simulating the shuffle and the draw can consume a large share of the runtime if done naively, but there's a trick that makes it almost trivial.
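
One well-known trick of this sort is a partial Fisher-Yates shuffle (sketched below; the post's trick may differ). To draw k cards you only need k swap steps, not a full shuffle of the deck:

```typescript
// Draw k distinct cards from the deck using only k steps of a
// Fisher-Yates shuffle. The deck is any array of 52 distinct cards.
function drawCards(deck: number[], k: number): number[] {
  for (let i = 0; i < k; i++) {
    // Pick uniformly from the not-yet-drawn portion [i, deck.length).
    const j = i + Math.floor(Math.random() * (deck.length - i));
    [deck[i], deck[j]] = [deck[j], deck[i]]; // swap it into position i
  }
  return deck.slice(0, k); // first k entries are a uniform random draw
}
```

Since each pass leaves the array a permutation of the same cards, the deck can be reused across simulations without re-initializing; each call still produces a uniform draw.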

Continue reading "Simulating deck-shuffling"