From SnOwy - Ed's Wiki Notebook
ART continued
- competitive learning classification
- hidden nodes compete to remain active
- there is only one layer of processing -- clusters are conceptually both hidden and output nodes
- highest activation is the winner
- it has been proven that if there are not too many input patterns and clusters relative to the number of nodes ...
- then learning will stabilize
- best distribution of the long term memory traces (weights) is guaranteed
- competitive learning not always stable -- encodings can change
- response to the same input patterns can be different (may never stabilize)
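A minimal sketch of the winner-take-all competition described above, in plain Python. The dot-product activation, the `rate` parameter and the function names are assumptions made for illustration; the notes do not specify them.

```python
def activations(weights, pattern):
    """Dot product of each cluster node's weight vector with the input."""
    return [sum(w * x for w, x in zip(node, pattern)) for node in weights]


def winner(weights, pattern):
    """Index of the node with the highest activation (ties go to the lowest index)."""
    acts = activations(weights, pattern)
    return max(range(len(acts)), key=lambda j: acts[j])


def competitive_step(weights, pattern, rate=0.5):
    """Move only the winning node's weights toward the input pattern."""
    j = winner(weights, pattern)
    weights[j] = [w + rate * (x - w) for w, x in zip(weights[j], pattern)]
    return j


# Two cluster nodes, two inputs (illustrative values only).
weights = [[0.9, 0.1], [0.2, 0.8]]
j = competitive_step(weights, [1.0, 0.0])
# Node 0 wins and its weights move toward the pattern; node 1 is unchanged.
```

Nothing in this plain scheme stops a later pattern from dragging a winner's weights somewhere else, which is one way to see why the encodings can keep changing.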
ART 1
- can learn to recognize binary input patterns
- until the memory is full
- stable
- encodes new patterns by changing weights
- bottom-up weights are an adaptive filter -- generates a classification
- this behaviour is actually more like a clustering
- top-down weights are used to test if the classification is correct
- compares what the class represents with the input pattern
- uses traditional set theory and logic to describe operations
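The top-down test can be sketched with ordinary logical AND on binary vectors. The ratio form of the test and the `vigilance` threshold follow the commonly published ART 1 formulation and are assumptions here; the function names are hypothetical.

```python
def binary_and(pattern, prototype):
    """Element-wise logical AND (set intersection) of two binary vectors."""
    return [p & w for p, w in zip(pattern, prototype)]


def match_ratio(pattern, prototype):
    """Fraction of the input's active bits that the prototype also has."""
    active = sum(pattern)
    if active == 0:
        return 1.0  # assumption: an all-zero input trivially matches
    return sum(binary_and(pattern, prototype)) / active


def classification_ok(pattern, prototype, vigilance):
    """Top-down test: accept the class only if the match is good enough."""
    return match_ratio(pattern, prototype) >= vigilance


pattern = [1, 1, 0, 1]
prototype = [1, 0, 0, 1]                 # top-down weights of the winning class
ratio = match_ratio(pattern, prototype)  # 2 of the 3 active bits are shared
loose = classification_ok(pattern, prototype, vigilance=0.6)
strict = classification_ok(pattern, prototype, vigilance=0.9)
```

With these values the class is accepted at the looser threshold and rejected at the stricter one.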
ART 2
- works with analogue patterns
- many variations -- these are all complicated
- not impossible to implement, but difficult
ART 3
- biologically inspired -- used to develop biological theory
Fuzzy ART
- same as ART 1, but it replaces the logic operators with fuzzy logic
- allows it to work with continuous (analogue) values
- this is probably the best one to work with
- easy to program
- keeps all of the original characteristics of ART -- still handles binary data, stability, etc.
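A small illustration of the swap from logical AND to fuzzy AND, assuming the usual definition of fuzzy AND as the element-wise minimum. On 0/1 values the minimum gives the same result as logical AND, which is why binary data still works.

```python
def fuzzy_and(x, y):
    """Fuzzy AND: element-wise minimum of two vectors."""
    return [min(a, b) for a, b in zip(x, y)]


# Binary vectors: identical to logical AND.
binary_result = fuzzy_and([1, 1, 0, 1], [1, 0, 0, 1])

# Analogue vectors: the same operator works without any change.
analogue_result = fuzzy_and([0.1, 0.7], [0.4, 0.5])
```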
ART-MAP and Fuzzy ART MAP
- combines two unsupervised networks with an associative memory
- creates a supervised classification learning system
- ART-A clusters the independent variables
- ART-B clusters the dependent variables
- ART-A and ART-B are connected together with an associative memory
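A rough sketch of the associative memory only, assuming ART-A and ART-B have already produced cluster indices. `MapField` and its methods are hypothetical names. In the published ARTMAP a conflicting link makes ART-A search for a different cluster; that step is left out here.

```python
class MapField:
    """Hypothetical stand-in for the associative memory between ART-A and ART-B."""

    def __init__(self):
        self.links = {}  # ART-A cluster index -> ART-B cluster index

    def associate(self, a_cluster, b_cluster):
        """Link the two clusters; return False if a_cluster already predicts another one."""
        known = self.links.setdefault(a_cluster, b_cluster)
        return known == b_cluster

    def predict(self, a_cluster):
        """ART-B cluster predicted for an ART-A cluster, or None if it has no link yet."""
        return self.links.get(a_cluster)


memory = MapField()
first = memory.associate(0, 1)     # new link: ART-A cluster 0 predicts ART-B cluster 1
conflict = memory.associate(0, 2)  # conflicts with the existing link
```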
ART and Fuzzy ART processing
- two layers, unsupervised
- use a vigilance parameter (a similarity threshold between 0 and 1) which controls how many clusters will be created
- high vigilance creates more clusters (with fewer patterns in each)
- low vigilance creates fewer, larger clusters
- learning ...
- can be fast -- means cluster centres are immediately formed to represent input patterns
- weights are set immediately
- slow learning means weights are changed gradually to create cluster centres
- weights leading to a cluster node encode the prototype (centre) values
- slow is better if the data is noisy
- the network will not bother learning noise (it should be smoothed over amongst the other exemplars)
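The effect of vigilance and of fast versus slow learning can be sketched as below. This assumes the commonly published Fuzzy ART rules: a choice function with a small constant `alpha`, a vigilance test on the fuzzy AND of input and weights, and a learning rate `rate` where 1.0 means fast learning. It also assumes inputs are already complement coded and that a new cluster is always committed at once. All names are illustrative.

```python
def fuzzy_and(x, y):
    """Fuzzy AND: element-wise minimum."""
    return [min(a, b) for a, b in zip(x, y)]


def present(weights, pattern, vigilance, rate=1.0, alpha=0.001):
    """Present one pattern; return the index of the cluster that learns it."""
    overlap = [sum(fuzzy_and(pattern, w)) for w in weights]
    # Rank clusters by the choice function |I AND w| / (alpha + |w|).
    order = sorted(range(len(weights)),
                   key=lambda j: overlap[j] / (alpha + sum(weights[j])),
                   reverse=True)
    for j in order:
        if overlap[j] / sum(pattern) >= vigilance:  # vigilance test passed
            matched = fuzzy_and(pattern, weights[j])
            # rate = 1.0 sets the weights immediately; smaller rates change them gradually.
            weights[j] = [rate * m + (1 - rate) * w
                          for m, w in zip(matched, weights[j])]
            return j
    weights.append(list(pattern))  # no cluster is close enough: create a new one
    return len(weights) - 1


def cluster(patterns, vigilance, rate=1.0):
    """Present every pattern once; return the cluster labels and final weights."""
    weights = []
    labels = [present(weights, p, vigilance, rate) for p in patterns]
    return labels, weights


# Complement-coded one-dimensional values 0.1, 0.2, 0.8, 0.9 (made-up data).
patterns = [[0.1, 0.9], [0.2, 0.8], [0.8, 0.2], [0.9, 0.1]]
low_labels, fast_weights = cluster(patterns, vigilance=0.5)
high_labels, _ = cluster(patterns, vigilance=0.95)
_, slow_weights = cluster(patterns, vigilance=0.5, rate=0.5)
```

On this toy data the low vigilance run groups the patterns into two clusters and the high vigilance run gives every pattern its own cluster. With `rate=0.5` the weights of an existing cluster move only part of the way toward each new pattern.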
Network Architecture -- Fuzzy ART
- a layer is called a field (for ARTs)
F2 [_____] -- single active node which represents a cluster or category (receives bottom-up processed activations)
W 〈 Weight Layer -- completely connected 〉
F1 [_____] -- receives input vector (bottom-up copy operation) and input from F2 (top-down)
F0 [_____] -- ← A = (a, ac)
- the exemplar A is fed into F0
- A presents the pattern to F0 twice simultaneously: as a itself and as its complement ac = 1 - a (element by element)
- a = (0.1, 0.7); ac = (0.9, 0.3); → A = (0.1, 0.7, 0.9, 0.3)
- by using complements, we ensure that the elements of every input A sum to the same value: the number of elements in a (here 0.1 + 0.7 + 0.9 + 0.3 = 2)
- F0 .. F2 -- fields (of activity)
- F1 -- receives inputs from F0
- F2 -- active cluster / category
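The complement coding applied at F0 can be sketched as a single function, reusing the example values from the notes. The function name and the range check are assumptions for illustration.

```python
def complement_code(a):
    """Return A = (a, ac), where each element of ac is 1 minus the matching element of a."""
    if any(x < 0.0 or x > 1.0 for x in a):
        raise ValueError("inputs are assumed to be scaled to the range 0..1")
    return list(a) + [1.0 - x for x in a]


A = complement_code([0.1, 0.7])  # approximately (0.1, 0.7, 0.9, 0.3)
total = sum(A)                   # equals the length of a, up to floating-point rounding
```

Whatever values a holds, the elements of A always add up to the number of elements in a, so every pattern reaches F1 with the same total activity.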
Next Time