Notes 20110118 CIS 6050 Neural Networks
From SnOwy - Ed's Wiki Notebook
Assignments
keywords: projects, evaluation
- February 4 -- very likely BP network
- February 13 -- Literature Review
- March 4 -- likely SOM or ART
- March 25 -- small and large paper
- Week of March 28 ~ April 1 -- 10 minute paper presentations
Literature Review
- review three or more papers
- default topic: ANN
- other topics: architectures?
- aim for 15 pages (12 ~ 20)
- applications type paper ...
- what type ANN is used
- what it is
- what it does
- describe the problem (application, data, problem)
- results -- what worked and what didn't work
- complexity (time, space)
- data characteristics -- temporal, non-stationary, non-linear
- data source
- hint -- older papers are easier because there are fewer technical steps
- possibly requires using the printed library
Learning
- methods of making systems learn
- learning is the primary ability of the ANN
- the defining characteristic of ANNs
- learn and improve performance
- requires time to learn
- involves adjusting the weight values
- the learning algorithm is the set of rules for changing the weights
- learning algorithms differ in how weights are adjusted
Error Correction Learning
- used in back propagation learning
- used in adaline
- minimizes error between target output and actual output by adjusting weights
- derived from optimum filtering
- for a feedforward network, node k is driven by input vector x(n) at time n
- the output of the node is y_k(n)
- the desired output is d_k(n)
- the error is given by e_k(n) = d_k(n) - y_k(n)
- the error drives the mechanism which adjusts the weights to node k
- makes y_k(n) approach d_k(n)
- a cost function ε is minimized
- given as ...
- ε(n) = 0.5 e_k^2(n)
- weights are adjusted until the system reaches a steady state
- error starts high and decreases over time
- start with a fairly unstable network
- minimizing the cost function leads to the delta rule
- aka the Widrow-Hoff rule
- weight adjustment is defined by ...
- Δw_kj(n) = η e_k(n) x_j(n)
- where η is the learning rate (learning constant)
- the weight adjustment is proportional to the product of the error and the input signal
- requires that the error is directly measurable
- need desired response d_k(n) -- need the right answer
- error correction is done for each node k
- stability is controlled by learning rate η
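The error-correction loop above can be sketched for a single linear node. This is a minimal illustration rather than the course's implementation: the function name `train_delta_rule`, the zero initial weights, the learning rate of 0.1, and the toy data are all assumptions made for the example.

```python
def train_delta_rule(samples, eta=0.1, epochs=200):
    """Train one linear node k with the delta (Widrow-Hoff) rule.

    samples: list of (x, d) pairs; x is a list of floats, d the desired output.
    Returns the learned weights and the summed cost for each epoch.
    """
    n_inputs = len(samples[0][0])
    w = [0.0] * n_inputs  # assumed starting point: all weights zero
    cost_history = []
    for _ in range(epochs):
        total_cost = 0.0
        for x, d in samples:
            y = sum(wj * xj for wj, xj in zip(w, x))  # node output y_k(n)
            e = d - y                                  # e_k(n) = d_k(n) - y_k(n)
            total_cost += 0.5 * e * e                  # cost 0.5 * e_k^2(n)
            # delta rule: w_kj <- w_kj + eta * e_k(n) * x_j(n)
            w = [wj + eta * e * xj for wj, xj in zip(w, x)]
        cost_history.append(total_cost)
    return w, cost_history


# Hypothetical noise-free toy data generated by d = 2*x1 - 1*x2
samples = [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0), ([1.0, 1.0], 1.0)]
weights, costs = train_delta_rule(samples)
```

On this toy data the weights should move toward (2, -1) while the per-epoch cost shrinks. Choosing a much larger `eta` can make the updates overshoot and diverge, which is the stability issue noted above.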
Memory Based Learning
- radial basis function -- RBFs
- training ...
- past experiences (data patterns) are explicitly stored
- stored as a memory of input-output pairs
- {(x_i, d_i)}, i = 1, ..., N
- the network literally stores these items
- store and retrieve scheme
- x_i -- input vector
- d_i -- desired output vector -- corresponding output
- memories: literally just store things away
- testing ...
- given a new pattern, ask how this new pattern relates to the previous patterns
- the test pattern x_test is compared with the stored patterns in a local neighbourhood around x_test
- requires -- definition for neighbourhood
- what's near x_test
- requires a learning rule
- i.e. how are these pairs of vectors stored?
- nearest neighbour rule
- x_i ←→ x_test: x_test takes the class of the nearest stored pattern x_i
- the k-nearest neighbour rule
- identifies the k stored patterns nearest to x_test
- assigns x_test to the class most frequent among those neighbours
- for k = 3, the starred points below are labelled with the classes of the three nearest neighbours of the test point
- k is user defined
*1
*1
*test
*0
- in the above example, the three nearest neighbours have d_i = {0, 1, 1}
- so x_test is assigned to class 1 (majority vote)
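The store-and-retrieve scheme and the k-nearest neighbour vote can be sketched as follows. The notes leave the definition of "neighbourhood" open, so Euclidean distance is an assumption here, as are the function name `knn_classify` and the sample coordinates, which are chosen so that the three nearest neighbours have classes {0, 1, 1} as in the example above.

```python
from collections import Counter
import math


def knn_classify(memory, x_test, k=3):
    """Classify x_test by majority vote among its k nearest stored patterns.

    memory: list of (x_i, d_i) pairs, stored exactly as presented in training.
    Distance is Euclidean (an assumption; any neighbourhood definition works).
    """
    by_distance = sorted(memory, key=lambda pair: math.dist(pair[0], x_test))
    votes = Counter(d for _, d in by_distance[:k])
    return votes.most_common(1)[0][0]


# "Training" is just storing the pairs (hypothetical coordinates).
memory = [
    ((1.0, 1.0), 1),
    ((1.5, 0.5), 1),
    ((0.5, 0.0), 0),
    ((5.0, 5.0), 0),
]
x_test = (1.0, 0.5)
label = knn_classify(memory, x_test, k=3)  # neighbours' classes: 1, 1, 0
```

Setting `k=1` reduces this to the plain nearest neighbour rule. With an even `k`, ties are possible; this sketch simply returns whichever tied class was counted first.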
Hebbian Learning
- if a cell A is near enough to cell B to encourage B's activation ...
- then a change will occur that will allow A ...
- to more efficiently activate B
- if A triggers B
         w
A* --------> *B
- change the synapse value between A and B to encourage B to fire given A has fired
- developed by Donald Hebb (neuroscientist)
- can be restated as ...
- if two nodes connected by a weight are simultaneously activated ...
- then the value of that weight is increased
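One common way to write the restated rule is Δw = η · a · b, where a and b are the activities of nodes A and B. The notes only say that the weight increases when both nodes are active, so the product form, the learning rate η = 0.1, and the name `hebbian_update` are assumptions for this sketch.

```python
def hebbian_update(w, a_activity, b_activity, eta=0.1):
    """Return the weight after one Hebbian step: w + eta * a * b.

    With binary activities (0 or 1) the weight grows only when
    A and B are active at the same time.
    """
    return w + eta * a_activity * b_activity


w = 0.0
# Hypothetical activity pairs (A, B); only the (1, 1) pairs strengthen w.
for a, b in [(1, 1), (1, 0), (0, 1), (1, 1)]:
    w = hebbian_update(w, a, b)
```

Unlike error-correction learning, no desired output is needed here: the update depends only on the two nodes' own activity.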
Aside
- other memories: content addressable, associative memory