2 1 Introduction to Neural Networks

41
Neural Networks NN 1 1 Neural Networks Teacher: Elena Marchiori R4.47 [email protected] .nl Assistant: Marius Codrea S4.16 [email protected]

Transcript of 2 1 Introduction to Neural Networks

Neural Networks NN 1 1

Neural Networks

Teacher:Elena

MarchioriR4.47

[email protected]

Assistant: Marius Codrea

S4.16 [email protected]

Neural Networks NN 1 2

Course OutlineThe course is divided in two parts:

theory and practice. 1. Theory covers basic topics in

neural networks theory and application to supervised and unsupervised learning.

2. Practice deals with basics of Matlab and application of NN learning algorithms.

Neural Networks NN 1 3

Course Information• Register for practicum: send

email to [email protected] with:1. Subject: NN practicum2. Content: names, study numbers, study

directions (AI,BWI,I, other)• Course information, plan, slides

and links to on-line material are available at

http://www.cs.vu.nl/~elena/nn.html

Neural Networks NN 1 4

Course Evaluation• Course value: 6 ects• Evaluation is based on the

following two parts: – theory (weight 0.5): final exam at

the end of the course consisting of questions about theory part. (Dates to be announced)

– practicum(weight 0.5): Matlab programming assignments to be done in couples (Available during the course at http://www.few.vu.nl/~codrea/nn)

Neural Networks NN 1 5

What are Neural Networks?

• Simple computational elements forming a large network– Emphasis on learning (pattern recognition)

– Local computation (neurons)

• Definition of NNs is vague– Often | but not always | inspired by biological brain

Neural Networks NN 1 6

History• Roots of work on NN are in:• Neurobiological studies (more than one century

ago):• How do nerves behave when stimulated by

different magnitudes of electric current? Is there a minimal threshold needed for nerves to be activated? Given that no single nerve cel is long enough, how do different nerve cells communicate among each other?

• Psychological studies:• How do animals learn, forget, recognize and

perform other types of tasks?• Psycho-physical experiments helped to understand

how individual neurons and groups of neurons work.• McCulloch and Pitts introduced the first

mathematical model of single neuron, widely applied in subsequent work.

Neural Networks NN 1 7

HistoryPrehistory: • Golgi and Ramon y Cajal study the nervous system

and discover neurons (end of 19th century)History (brief):• McCulloch and Pitts (1943): the first artificial

neural network with binary neurons• Hebb (1949): learning = neurons that are together

wire together• Minsky (1954): neural networks for reinforcement

learning• Taylor (1956): associative memory• Rosenblatt (1958): perceptron, a single neuron for

supervised learning

Neural Networks NN 1 8

History• Widrow and Hoff (1960): Adaline• Minsky and Papert (1969): limitations of single-

layer perceptrons (and they erroneously claimed that the limitations hold for multi-layer perceptrons)

Stagnation in the 70's:• Individual researchers continue laying foundations• von der Marlsburg (1973): competitive learning and

self-organizationBig neural-nets boom in the 80's• Grossberg: adaptive resonance theory (ART)• Hopfield: Hopfield network• Kohonen: self-organising map (SOM)

Neural Networks NN 1 9

History

• Oja: neural principal component analysis (PCA)• Ackley, Hinton and Sejnowski: Boltzmann machine• Rumelhart, Hinton and Williams: backpropagationDiversification during the 90's:• Machine learning: mathematical rigor, Bayesian

methods, infomation theory, support vector machines (now state of the art!), ...

• Computational neurosciences: workings of most subsystems of the brain are understood at some level; research ranges from low-level compartmental models of individual neurons to large-scale brain models

Neural Networks NN 1 10

Course Topics Learning Tasks

Supervised UnsupervisedData:Labeled examples (input , desired output)

Tasks:classificationpattern recognition regressionNN models:perceptron adalinefeed-forward NN radial basis functionsupport vector machines

Data:Unlabeled examples (different realizations of the input)

Tasks:clusteringcontent addressable memory

NN models:self-organizing maps (SOM)Hopfield networks

Neural Networks NN 1 11

NNs: goal and design– Knowledge about the learning task is given in the form of a set of examples (dataset) called training examples.

– A NN is specified by:• an architecture: a set of neurons and links connecting neurons. Each link has a weight,

• a neuron model: the information processing unit of the NN,

• a learning algorithm: used for training the NN by modifying the weights in order to solve the particular learning task correctly on the training examples.

The aim is to obtain a NN that generalizes well, that is, that behaves correctly on new examples of the learning task.

Neural Networks NN 1 12

Example: AlvinnAutonomous driving at 70 mph on a public highway

Camera image

30x32 pixelsas inputs

30 outputsfor steering 30x32 weights

into one out offour hiddenunit

4 hiddenunits

Neural Networks NN 1 13

Dimensions of a Neural Network

• network architectures• types of neurons• learning algorithms• applications

Neural Networks NN 1 14

Network architectures

• Three different classes of network architectures

– single-layer feed-forward neurons are organized– multi-layer feed-forward in acyclic layers– recurrent

• The architecture of a neural network is linked with the learning algorithm used to train

Neural Networks NN 1 15

Single Layer Feed-forward

Input layerof

source nodes

Output layerof

neurons

Neural Networks NN 1 16

Multi layer feed-forward

Inputlayer

Outputlayer

Hidden Layer

3-4-2 Network

Neural Networks NN 1 17

Recurrent Network with hidden neuron: unit delay operator z-1

is used to model a dynamic system

z-1

z-1

z-1

Recurrent network

inputhiddenoutput

Neural Networks NN 1 18

The Neuron

Inputvalues

weights

Summingfunction

Biasb

ActivationfunctionLocal

Fieldv Output

y

x1

x2

xm

w2

wm

w1

)(

………….

Neural Networks NN 1 19

Input Signal and Weights

Input signalsAn input may be either a raw / preprocessed signal or

image. Alternatively, some specific features can also be

used.If specific features are used

as input, their number and selection is crucial and application dependent

WeightsWeights are connectedbetween an input and asumming node. These affect

to the summing operation. The quality of network can

be seen from weightsBias is a constant input

with certain weight. Usually the weights are randomized in the beginning

Neural Networks NN 1 20

The Neuron• The neuron is the basic information processing unit of a NN. It consists of:1 A set of links, describing the neuron inputs, with weights W1, W2, …, Wm

2 An adder function (linear combiner) for computing the weighted sum of the inputs (real numbers):

3 Activation function (squashing function) for limiting the amplitude of the neuron output.

m

1jj xwu

j

) (u y b

Neural Networks NN 1 21

Bias of a Neuron • The bias b has the effect of applying an affine transformation to the weighted sum u

v = u + b• v is called induced field of the neuron

x2x1 u x1-x2=0

x1-x2= 1

x1

x2

x1-x2= -1

Neural Networks NN 1 22

Bias as extra input

Inputsignal

Synapticweights

Summingfunction

ActivationfunctionLocal

Fieldv Output

y

x1

x2

xm

w2

wm

w1

)(

w0x0 = +1

• The bias is an external parameter of the neuron. It can be modeled by adding an extra input.

bw

xwv jm

jj

0

0

…………..

Neural Networks NN 1 23

Activation Function There are different activation functions used in

different applications. The most common ones are:

Hard-limiter

Piecewise linear

Sigmoid

Hyperbolic tangent

0001

vifvif

v

2102121211

vifvifv

vifv )exp(1

1av

v

vv tanh

Neural Networks NN 1 24

Neuron Models• The choice of determines the neuron model. Examples:• step function:

• ramp function:

• sigmoid function: with z,x,y parameters

• Gaussian function:

2

21exp

21)(

vv

)exp(11)(

yxvzv

otherwise ))/())(((

if if

)(cdabcva

dvbcva

v

cvbcva

v if if )(

Neural Networks NN 1 25

Learning Algorithms

Depend on the network architecture:• Error correcting learning (perceptron)

• Delta rule (AdaLine, Backprop)• Competitive Learning (Self Organizing Maps)

Neural Networks NN 1 26

Applications• Classification:

– Image recognition – Speech recognition– Diagnostic– Fraud detection– …

• Regression:– Forecasting (prediction on base of past history)– …

• Pattern association:– Retrieve an image from corrupted one– …

• Clustering: – clients profiles– disease subtypes– …

Neural Networks NN 1 27

Supervised learning

Non-linear classifiers

Linear classifiersPerceptron Adaline

Feed-forward networksRadial basis function networks

Support vector machines

Unsupervised learning

Clustering

Content addressable memoriesOptimization Hopfield networks

Self-organizing maps K-means

Neural Networks NN 1 28

Vectors: basics (slides by Hyoungjune Yi:

[email protected])• Ordered set of numbers: (1,2,3,4)

• Example: (x,y,z) coordinates of pt in space.

runit vecto a is ,1 If12

),,2,1(

vv

n

iixv

nxxxv

Neural Networks NN 1 29

Vector Addition

),(),(),( 22112121 yxyxyyxx wv

vvww

V+wV+w

Neural Networks NN 1 30

Scalar Product

),(),( 2121 axaxxxaa v

vvavav

Neural Networks NN 1 31

Operations on vectors• sum• max, min, mean, sort, …• Pointwise: .^

Neural Networks NN 1 32

Inner (dot) Productvv

ww

22112121 .),).(,(. yxyxyyxxwv

The inner product is a The inner product is a SCALAR!SCALAR!

cos||||||||),).(,(. 2121 wvyyxxwv

wvwv 0.

Neural Networks NN 1 33

Matrices

nmnn

m

m

m

mn

aaa

aaaaaaaaa

A

21

33231

22221

11211

mnmnmn BAC Sum:Sum:

ijijij bac

A and B must have the A and B must have the same dimensionssame dimensions

Neural Networks NN 1 34

Matrices

pmmnpn BAC Product:Product:

m

kkjikij bac

1

A and B must have A and B must have compatible dimensionscompatible dimensions

nnnnnnnn ABBA

Identity Matrix:

AAIIAI

100

010001

Neural Networks NN 1 35

Matrices

mnT

nm AC Transpose:Transpose:

jiij ac TTT ABAB )(

TTT BABA )(

IfIf AAT A is symmetricA is symmetric

Neural Networks NN 1 36

Matrices

IAAAA nnnnnnnn

11

Inverse:Inverse: A must be squareA must be square

1121

1222

12212211

1

2221

1211 1aaaa

aaaaaaaa

Neural Networks NN 1 37

2D Translation

ttPP

P’P’

Neural Networks NN 1 38

2D Translation Equation

PP

xx

yy

ttxx

ttyyP’P’

tt

tPP ),(' yx tytx

),(),(

yx ttyx

tP

Neural Networks NN 1 39

2D Translation using Matrices

PP

xx

yy

ttxx

ttyyP’P’

tt),(),(

yx ttyx

tP

11

001' y

x

tt

tytx

y

x

y

xP

tt PP

Neural Networks NN 1 40

Scaling

PP

P’P’

Neural Networks NN 1 41

Scaling Equation

PP

xx

yy

s.xs.x

P’P’s.ys.y

),('),(sysx

yx

PP

PP s'

yx

ss

sysx

00'P

SPSP '