My Attempt at Outperforming DeepMind’s Atari Results – UPDATE 5

Hey everybody!

Another update!

The reinforcement learning portion is done, so now I am turning my attention to the image recognition portion so that I can scale up to the ALE (Arcade Learning Environment). I originally tried to go straight to the ALE, but it takes so long to find out whether a change is working that I decided to stay with pole balancing instead. However, instead of feeding values such as the position and velocity of the cart directly to the bottom-most HTM region, I am now feeding it an encoding of an image of the experiment. This gives me an intermediate step on the path to scaling up to the ALE.
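To make "an image of the experiment" a bit more concrete, here is a minimal Python sketch of how a pole-balancing state could be rasterized into a small greyscale grid. This is only an illustration, not the code from the repository: the function name `render_cartpole`, the 32x16 resolution, the brightness values, and the track half-width of 2.4 are all assumptions made for the example.

```python
import math

def render_cartpole(cart_x, pole_angle, width=32, height=16, track_half_width=2.4):
    """Rasterize a cart-pole state into a greyscale grid (rows of floats in [0, 1]).

    All parameters and brightness levels are illustrative assumptions.
    """
    image = [[0.0] * width for _ in range(height)]

    # Map the cart position from [-track_half_width, track_half_width] to a pixel column.
    t = (cart_x + track_half_width) / (2.0 * track_half_width)
    t = min(max(t, 0.0), 1.0)
    cart_col = int(round(t * (width - 1)))

    # Draw the cart as a dim 3-pixel-wide block on the bottom row.
    for col in range(cart_col - 1, cart_col + 2):
        if 0 <= col < width:
            image[height - 1][col] = 0.5

    # Draw the pole as a bright line starting above the cart; angle 0 means upright.
    pole_length = height - 2
    for step in range(1, pole_length + 1):
        col = int(round(cart_col + step * math.sin(pole_angle)))
        row = int(round((height - 1) - step * math.cos(pole_angle)))
        if 0 <= col < width and 0 <= row < height:
            image[row][col] = 1.0

    return image

upright = render_cartpole(0.0, 0.0)
tilted = render_cartpole(0.0, 0.5)
```

The point of a setup like this is that the learner no longer sees position and angle as numbers; it has to recover them from pixels, which is the same situation it will face on the ALE, just at a much smaller scale.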

At this point, I could dump the HTM stuff and just do ConvNets, but that feels like a cop-out. I have found an interesting paper that compares CLA (HTM) to the state of the art. It reports that, on tasks where information is presented as sequences, CLA outperforms things like convolutional neural networks (they used LeNet), mostly thanks to the time signal. I have such a task, so I will continue working on and improving my HTM implementation. I am currently reading that paper (available here: http://bias.csr.unibo.it/maltoni/HTM_TR_v1.0.pdf) to see how they adapted it for classification, which I can in turn adapt to function approximation.

EDIT: Upon further examination, the paper’s validity appears questionable. The “HTM” they use is also very different from Numenta’s HTM (the one I am using).

I am still not quite certain whether I am encoding the greyscale images properly for the HTM; right now the regions are way too stable. The previously mentioned paper describes feedback signals from higher regions in the hierarchy that let lower levels adapt using context gathered at the higher levels. My current HTM model does not have this, so I am very interested in seeing how it affects reinforcement learning.
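One quick sanity check on the encoding side, sketched below in Python, is to binarize two consecutive frames and measure how many active input bits they share. This is a guess at a useful diagnostic, not something taken from the repository or the paper: the helper names `binarize` and `overlap`, the 0.5 threshold, and the tiny example frames are all hypothetical. If consecutive frames overlap almost completely, the regions may simply not be seeing enough change in their input.

```python
def binarize(image, threshold=0.5):
    """Flatten a greyscale grid (values in [0, 1]) into a list of 0/1 input bits."""
    return [1 if pixel >= threshold else 0 for row in image for pixel in row]

def overlap(bits_a, bits_b):
    """Jaccard overlap of two equal-length binary vectors: shared active bits / all active bits."""
    shared = sum(1 for a, b in zip(bits_a, bits_b) if a and b)
    active = sum(1 for a, b in zip(bits_a, bits_b) if a or b)
    # Two empty vectors are treated as identical.
    return shared / active if active else 1.0

# Two made-up 3x2 frames in which one bright pixel moves one column to the right.
frame_a = [[0.0, 0.9, 0.0],
           [0.0, 0.8, 0.0]]
frame_b = [[0.0, 0.0, 0.9],
           [0.0, 0.8, 0.0]]

bits_a = binarize(frame_a)
bits_b = binarize(frame_b)
frame_overlap = overlap(bits_a, bits_b)
```

A plain threshold is the simplest possible choice here; it throws away intermediate grey levels, so a real encoder would likely need something richer.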

Right after this post, I will continue reading the paper, and start coding a new HTM model. Besides incorporating any improvements I can gather from the paper, I am moving everything to the GPU using OpenCL. I used to be mainly a graphics developer, so this shouldn’t take too long to do 🙂

For those seeing this for the first time, the source code is available here, under the directory “htmrl”: https://github.com/222464/AILib. It uses the CMake build system.

That’s it for this update, until next time!
