Morphology Optimizing Tempotron
Tempotron. Binary Synapses. Spiking Neuron. Active dendrites. Structural Plasticity.
Optimizing memory capacity for pattern recognition
Our first hardware-efficient machine learning algorithm, the Liquid State Machine with Dendritically Enhanced Readout (LSM-DER), was designed for real-time operation. It is an anytime algorithm, i.e. unlike Turing machines or attractor neural networks, it can be prompted at any time to return its current best answer. LSM-DER achieves this by inherently subdividing each pattern into multiple subpatterns and identifying them independently. As a result, it has a high memory requirement and can be overkill for pattern recognition tasks in which the network needs to provide an output only after an entire pattern has been applied. The Morphology Optimizing Tempotron (MOT) was designed with this in mind and requires approximately half the memory of LSM-DER.
Architecture and Theory
The structure is characterized by m dendritic branches and k excitatory synaptic contacts per branch. Each synaptic contact on a branch is formed with one of the d input afferents, where d >> k. At the times governed by incoming spikes, the synapses are activated and the membrane voltage V(t) is calculated as a weighted sum of postsynaptic potentials (PSPs) as follows:
V(t) = Σj b(vj(t)), with vj(t) = Σi wij Σf K(t - tif)
where wij is the weight of the ith synapse formed on the jth branch, vj(t) is the input to the jth dendritic nonlinearity, b() is the nonlinear activation function of the dendritic branch, K denotes the postsynaptic potential kernel and tif are the times of incoming spikes on the ith afferent. We consider binary synapses, hence wij is either 0 or 1. The neuron fires at least one spike if V(t) crosses a firing threshold Vthr; otherwise it remains quiescent.
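The voltage computation described above can be sketched in a few lines of NumPy. This is only an illustrative sketch under stated assumptions: the double-exponential kernel shape, the time constants tau and tau_s, the squaring nonlinearity b(v) = v^2 / x_thr, and the names psp_kernel, membrane_voltage and conn are all choices made here for illustration and are not specified in the text above. Binary weights are represented implicitly: conn[j, i] holds the index of the afferent connected to the ith synapse of branch j, so every listed connection has weight 1.

```python
import numpy as np

def psp_kernel(t, tau=15.0, tau_s=3.75):
    """Assumed double-exponential PSP kernel, zero for t <= 0, peak value 1."""
    t = np.asarray(t, dtype=float)
    # Time of the kernel maximum, used to normalise the peak to 1.
    t_peak = tau * tau_s / (tau - tau_s) * np.log(tau / tau_s)
    v0 = 1.0 / (np.exp(-t_peak / tau) - np.exp(-t_peak / tau_s))
    tp = np.clip(t, 0.0, None)  # avoid overflow of exp() for negative times
    return np.where(t > 0, v0 * (np.exp(-tp / tau) - np.exp(-tp / tau_s)), 0.0)

def membrane_voltage(spike_times, conn, t_grid, x_thr=1.0):
    """V(t) on t_grid.

    spike_times: list of d arrays, spike times of each afferent (ms)
    conn:        (m, k) integer array of afferent indices (binary synapses)
    x_thr:       scale of the assumed squaring nonlinearity b(v) = v**2 / x_thr
    """
    # Summed PSP trace of every afferent: shape (d, len(t_grid)).
    psp = np.stack([
        psp_kernel(t_grid[None, :] - np.asarray(s, dtype=float)[:, None]).sum(axis=0)
        for s in spike_times
    ])
    v_branch = psp[conn].sum(axis=1)          # v_j(t), shape (m, len(t_grid))
    return (v_branch ** 2 / x_thr).sum(axis=0)  # sum of b(v_j(t)) over branches

t_grid = np.arange(0.0, 100.0, 0.1)
spikes = [np.array([10.0]), np.array([30.0, 60.0])]
conn = np.array([[0, 1], [1, 1]])             # m = 2 branches, k = 2 synapses each
v = membrane_voltage(spikes, conn, t_grid)
v_max = v.max()                               # compared against Vthr to decide firing
```

With a supralinear b(), two synapses placed on the same branch contribute more than the same two synapses placed on different branches, which is why the placement of connections matters even though each weight is binary.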
Learning the connections
MOT is a binary classifier: after training, the network should produce a single spike for one class of patterns and remain quiescent for the other, as shown in the accompanying figure. Because the synapses are binary, we need a morphological learning rule that modifies the connections between afferent lines and synapses rather than the weights. Inspired by the Tempotron learning rule, we start with a cost function that measures the deviation between Vthr and the maximum membrane voltage (Vmax) generated by misclassified patterns. Applying gradient descent to this cost function yields a correlation term cij that guides the swapping of connections. At every iteration of the learning process, the synapse with the lowest cij (averaged over an entire batch of patterns) within a randomly chosen target set is replaced by the candidate with the highest cij from a candidate replacement set.
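A single swap step of this kind might look as follows. The sketch assumes that the batch-averaged correlation values have already been computed; how cij is derived from the cost function is not reproduced here. The function name swap_connection, its arguments and the array layouts are hypothetical and chosen only for illustration.

```python
import numpy as np

def swap_connection(conn, c_syn, target, cand_afferents, c_cand):
    """Rewire one synapse and return the new connection matrix.

    conn:           (m, k) int array, afferent index of each existing synapse
    c_syn:          (m, k) float array, batch-averaged c_ij of existing synapses
    target:         sequence of (branch, slot) pairs, the random target set
    cand_afferents: (n,) int array, afferents in the candidate replacement set
    c_cand:         (m, n) float array; c_cand[j, q] is the batch-averaged
                    correlation candidate q would have on branch j (assumed
                    to be precomputed)
    """
    target = np.asarray(target)
    rows, cols = target[:, 0], target[:, 1]
    # Poorest-performing synapse within the target set.
    worst = np.argmin(c_syn[rows, cols])
    j, i = int(rows[worst]), int(cols[worst])
    # Best candidate afferent for that branch.
    best = int(np.argmax(c_cand[j]))
    new_conn = conn.copy()
    new_conn[j, i] = cand_afferents[best]
    return new_conn, (j, i)

conn = np.array([[0, 1], [2, 3]])
c_syn = np.array([[0.5, -0.2], [0.1, 0.9]])
target = [(0, 0), (1, 0)]                     # synapse (0, 1) is not in the target set
cand_afferents = np.array([7, 8, 9])
c_cand = np.array([[0.1, 0.2, 0.3], [0.9, 0.0, 0.4]])
new_conn, replaced = swap_connection(conn, c_syn, target, cand_afferents, c_cand)
```

Only the wiring changes; the number of synapses per branch stays fixed at k, so the memory footprint of the network is constant throughout training.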
Learning the threshold
Since our neuron model has no arbitrary multiplicative weights, the range of maximum voltages obtainable in response to a fixed temporal spike pattern is limited. An improperly chosen threshold can therefore severely degrade classification performance; for example, a very large threshold may never be crossed by V(t). To counter this, we propose an automatic mechanism for adapting Vthr during training: after each iteration, Vthr is updated according to ΔVthr = η (wfp FP - wfn FN), where FP and FN are the numbers of false positives and false negatives, wfp and wfn are the weights associated with false positive and false negative errors, and η > 0 is the threshold learning rate.
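The threshold update can be written directly from the formula. In the sketch below the default values of eta, w_fp and w_fn and the helper names count_errors and update_threshold are illustrative assumptions; labels marks the patterns for which the neuron is supposed to fire.

```python
import numpy as np

def count_errors(v_max, labels, v_thr):
    """Count false positives and false negatives for one batch.

    v_max:  (P,) maximum membrane voltage reached for each pattern
    labels: (P,) boolean, True where the neuron should fire
    """
    fired = v_max > v_thr
    n_fp = int(np.sum(fired & ~labels))   # fired but should have stayed silent
    n_fn = int(np.sum(~fired & labels))   # stayed silent but should have fired
    return n_fp, n_fn

def update_threshold(v_thr, n_fp, n_fn, eta=0.01, w_fp=1.0, w_fn=1.0):
    """Apply delta V_thr = eta * (w_fp * FP - w_fn * FN)."""
    return v_thr + eta * (w_fp * n_fp - w_fn * n_fn)

v_max = np.array([1.2, 0.4, 0.9, 1.5])
labels = np.array([True, True, False, False])
n_fp, n_fn = count_errors(v_max, labels, v_thr=1.0)
v_thr = update_threshold(1.0, n_fp, n_fn)
```

The sign of the update follows from the error types: an excess of false positives raises the threshold, and an excess of false negatives lowers it.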
Classifying Random Latency Patterns
To evaluate MOT, the first task we consider is binary classification of single-spike random latency patterns. P spike patterns were generated and randomly assigned to one of two classes, P+ (Class 1) or P- (Class 2). Each spike pattern consists of d afferents, each of which spikes exactly once at a time drawn independently from a uniform distribution between 1 and T ms. The accompanying figure compares the performance of MOT with that of Tempotron. The results show that the classification performance of Tempotron becomes comparable to that of MOT with 1-bit (binary) weights only at the 4-bit quantization level, where Tempotron reaches 93.05% (SD = 0.55%) and 89.14% (SD = 0.7%) accuracy for 500 and 1000 patterns, respectively.
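A data set of this kind could be generated along the following lines. The function name random_latency_patterns is hypothetical, and the independent coin-flip class assignment is an assumption: the text only says that patterns are assigned to the classes at random, so class balance is not enforced here.

```python
import numpy as np

def random_latency_patterns(P, d, T, rng=None):
    """Generate P single-spike latency patterns over d afferents.

    Returns
    times:  (P, d) array; times[p, i] is the spike time (ms) of afferent i
            in pattern p, drawn uniformly between 1 and T
    labels: (P,) boolean array; True for class P+, False for class P-
    """
    rng = np.random.default_rng(rng)
    times = rng.uniform(1.0, T, size=(P, d))
    labels = rng.integers(0, 2, size=P).astype(bool)  # assumed assignment scheme
    return times, labels

times, labels = random_latency_patterns(P=500, d=100, T=500.0, rng=0)
```

The values d = 100 and T = 500.0 in the usage line are placeholders, not the settings used for the reported results.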
Classifying Pairwise Synchrony Patterns
To examine the ability of our algorithm to learn correlations across multiple spikes, we generated another data set that consists of pairwise synchrony events in each pattern. In this data set, the d afferents are grouped into d/2 pairs, and the afferents in a given pair fire single spikes synchronously. The synchronous events occur at random, uniformly distributed times in both pattern categories, so class information is embedded solely in the patterns of synchrony; neither spike counts nor spike timings of individual neurons carry any information relevant to the classification task. This task mimics spike synchrony-based sensory processing. The performance of MOT with binary weights is comparable to that of Tempotron at 4-bit quantization, with about 100% and 98.12% (SD = 0.12%) accuracy for 500 and 1000 patterns, respectively.
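One way to construct such a pattern is sketched below. It assumes that each pattern is defined by its own random pairing of afferents and that event times are drawn uniformly between 1 and T ms, as in the latency task; neither detail is stated explicitly above, and the function name pairwise_synchrony_pattern is hypothetical.

```python
import numpy as np

def pairwise_synchrony_pattern(d, T, rng=None):
    """Generate one pairwise synchrony pattern over d afferents.

    Returns
    times: (d,) array of spike times (ms), one spike per afferent
    pairs: (d // 2, 2) int array; each row lists two afferents that
           fire at the same time
    """
    if d % 2 != 0:
        raise ValueError("d must be even to form d/2 pairs")
    rng = np.random.default_rng(rng)
    pairs = rng.permutation(d).reshape(d // 2, 2)   # assumed random pairing
    pair_times = rng.uniform(1.0, T, size=d // 2)   # one event time per pair
    times = np.empty(d)
    times[pairs[:, 0]] = pair_times
    times[pairs[:, 1]] = pair_times
    return times, pairs

times, pairs = pairwise_synchrony_pattern(d=100, T=500.0, rng=0)
```

Because every afferent fires exactly once at a uniformly distributed time, the spike count and the marginal timing of any single afferent are the same in both classes; only the information about which afferents fire together distinguishes one pattern from another.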
Application Example: Tactile Sensing
To check whether the algorithm works on real-world problems, we used it to classify tactile information. The task is to identify objects of different shapes and sizes from recordings made by sensors in the palm of a robotic arm. The performance of the proposed method on this application is compared with Tempotron at different quantization levels. The results, averaged over 10 independent trials, show that the proposed algorithm achieves an accuracy of 96.54% (SD = 0.6543%). Although Tempotron without quantization performs slightly better than MOT (97.06%, SD = 0.5123%), after quantization Tempotron needs at least 6 bits of weight resolution to match our performance with 1-bit weights.
Check out our paper to learn more
S. Roy, P. P. San, S. Hussain, L. W. Wei and A. Basu, "Learning Spike time codes through Morphological Learning with Binary Synapses," IEEE Transactions on Neural Networks and Learning Systems, 2015. [pdf]