structured_memory_for_neural_turing_machines [2015/12/17 21:59] (current)
 +{{tag>"Talk Summaries" "Neural Networks" "NIPS 2015"}}
 +====== Structured Memory for Neural Turing Machines ======
 +| Presenter | Wei Zhang |
 +| Context | NIPS 2015 Reasoning, Attention, and Memory Workshop |
 +| Date | 12/12/15 |
 +Neural Turing Machines try to simulate human working memory using a linearly structured memory, which has proven successful for simple algorithms.  The linear memory structure may be insufficient for more complicated tasks, such as question answering.  From a neurological perspective, the Neural Turing Machine makes sense because our "controller" and "memory" are jointly learned from our environment.  This advantage is also a disadvantage, because it means that the interaction between the parameters and the memory content can be hard to control.  For some tasks, such as copy, recall, and question answering, this can prevent or slow convergence.  The human brain does not necessarily have a linear structure; it may be more tree-structured, with branches and conditionals.  This suggests that mimicking structured memory in the Neural Turing Machine may improve its results.
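For orientation, the "linearly structured memory" can be pictured as a flat list of slots that the controller reads through soft attention. The sketch below shows only content-based addressing as it is usually described for the standard NTM (cosine similarity to a key, sharpened by a strength `beta`, then a softmax). The talk summary does not describe the addressing scheme, so the function names, the toy memory, and the omission of location-based shifting are assumptions made for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-8)  # small constant avoids division by zero


def content_read(memory, key, beta=1.0):
    """Soft read from a flat (linear) memory.

    memory: list of slots, each a list of floats
    key:    query vector emitted by the controller (hypothetical here)
    beta:   key strength; larger values give sharper attention
    """
    scores = [beta * cosine_similarity(slot, key) for slot in memory]
    # Numerically stable softmax over the slots.
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # The read vector is the attention-weighted sum of the slots.
    width = len(memory[0])
    read_vector = [
        sum(w * slot[j] for w, slot in zip(weights, memory))
        for j in range(width)
    ]
    return weights, read_vector


# Toy memory with three slots of width two (illustrative values only).
memory = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, read_vector = content_read(memory, key=[1.0, 0.0], beta=10.0)
```

Every slot sits at the same level and competes in one softmax, which is the flat structure that the talk proposes to extend.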
 +===== Structure =====
 +A simple hierarchy was introduced, with an upper and a lower layer, where signals from the upper layer are accumulated into the lower memory.  Signals are simply sent from one layer to the other, rather than going through read/write operations.  Three different ways of communicating between the upper and lower memory layers were considered: one where the lower memory is "hidden" and cannot be read from, and the upper layer only receives content from the lower memory ("hidden memory"); one where there is a second read head to write to the upper layer ("double-controlled"); and one that uses two separate LSTMs for writing into each memory ("tightly coupled").  The idea is that the upper layer is a short-term memory which remembers immediate input, whereas the lower layer is a longer-term memory.
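The accumulation idea can be illustrated with a toy two-layer memory. This is a minimal sketch and not the presenter's model. The summary does not give the update rule, so the exponential-average form and the `keep` and `mix` coefficients are assumptions, as are the function names and the choice to give both layers the same shape. Only the erase-then-add write to the upper layer follows the usual NTM write.

```python
def write_upper(upper, weights, erase, add):
    """Standard NTM-style erase-then-add write into the upper memory.

    weights: one attention weight per slot
    erase, add: vectors with one entry per memory column
    """
    return [
        [cell * (1.0 - w * e) + w * a for cell, e, a in zip(slot, erase, add)]
        for slot, w in zip(upper, weights)
    ]


def accumulate_lower(lower, upper, keep=0.9, mix=0.1):
    """Assumed accumulation rule: the lower memory is a running blend of
    its own content and whatever the upper memory currently holds."""
    return [
        [keep * l + mix * u for l, u in zip(lower_slot, upper_slot)]
        for lower_slot, upper_slot in zip(lower, upper)
    ]


# Two memories with three slots of width two, initially empty.
upper = [[0.0, 0.0] for _ in range(3)]
lower = [[0.0, 0.0] for _ in range(3)]

# Step 1: store [1, 1] in slot 0 of the upper (short-term) memory.
upper = write_upper(upper, weights=[1.0, 0.0, 0.0], erase=[1.0, 1.0], add=[1.0, 1.0])
lower = accumulate_lower(lower, upper)

# Step 2: slot 0 of the upper memory is overwritten with zeros.
upper = write_upper(upper, weights=[1.0, 0.0, 0.0], erase=[1.0, 1.0], add=[0.0, 0.0])
lower = accumulate_lower(lower, upper)
```

After the second step the upper slot is empty again, while the lower slot still holds a decayed trace of the earlier write. This matches the short-term versus longer-term roles described above, under the assumed update rule.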
 +===== Experiments =====
 +These models were tested on three tasks (copy, recall, and bAbI question answering).  The "vanilla" NTM was observed to have more trouble determining the correct procedure for the problem quickly during training.  Overfitting was observed for the vanilla NTM with two memories, and for the tightly coupled NTM.  On the supporting-fact(s)-only bAbI tasks, the hidden-memory, double-controlled, and tightly coupled NTMs converged quickly, but the vanilla NTM was not able to converge.  The success may be due to the additional memory structures having a stabilizing or smoothing effect.  Adding a more complicated structure (e.g. graphs rather than a depth-two tree) may improve things.  It may also be valuable to study a real question answering dataset.