The goal of this assignment is to design and implement a program that recognizes people’s handwriting.
We will implement this using neural networks in Matlab. Matlab has an extensive neural network toolbox that largely removes the need to build the nuts-and-bolts components of neural networks.
This assignment is divided into three parts. The first part is an introduction to neural nets in Matlab. Part two uses neural nets to recognize computer-generated characters with and without noise. Part three attempts to use neural nets to recognize handwritten characters.
Part 2 – Character Recognition
Source Code
[alphabet,targets] = prprob;
net = newff(alphabet,targets,25);
net1 = net;
net1.divideFcn = '';
[net1,tr] = train(net1,alphabet,targets);
numNoisy = 10;
alphabet2 = [alphabet repmat(alphabet,1,numNoisy)+randn(35,26*numNoisy)*0.2];
targets2 = [targets repmat(targets,1,numNoisy)];
net2 = train(net,alphabet2,targets2);
noise_range = 0:.05:.5;
max_test = 100;
network1 = [];
network2 = [];
for i = 0:4
    figure
    % Add Gaussian noise with standard deviation 0.2*i to the letter R (column 18)
    noisyR = alphabet(:,18)+randn(35,1) * 0.2*i;
    plotchar(noisyR);
    A2 = sim(net2,noisyR);
    [maxOut,answer] = max(A2);   % index of the winning output neuron
    figure
    plotchar(alphabet(:,answer));
    % Mean squared error between the true letter and the recognized letter
    mserr(i+1) = mean((alphabet(:,18)-alphabet(:,answer)).^2);
end
x = 0:.2:.8;
plot(x,mserr)
title('Mean Squared Error for Noisy "R"')
xlabel('Noise Level')
ylabel('Mean Squared Error')
% The network usually stops recognizing the R at a noise level of around .6.
% Beyond that, it recognizes the R as a P, a B, or a K, all of which are
% similar in structure to the R. Based on the results of the original network
% tests, I would expect a higher percentage of error than has been shown. The
% original noise-trained neural network experienced a 60% error rate at a
% noise level of just .4, yet the network that we're using seems to perform
% much more reliably at .4.
for i = 0:4
    figure
    % Add Gaussian noise with standard deviation 0.2*i to the letter A (column 1)
    noisyA = alphabet(:,1)+randn(35,1) * 0.2*i;
    plotchar(noisyA);
    A2 = sim(net2,noisyA);
    [maxOut,answer] = max(A2);   % index of the winning output neuron
    figure
    plotchar(alphabet(:,answer));
    % Mean squared error between the true letter and the recognized letter
    mserr(i+1) = mean((alphabet(:,1)-alphabet(:,answer)).^2);
end
x = 0:.2:.8;
plot(x,mserr)
title('Mean Squared Error for Noisy "A"')
xlabel('Noise Level')
ylabel('Mean Squared Error')
for i = 0:4
    figure
    % Add Gaussian noise with standard deviation 0.2*i to the letter O (column 15)
    noisyO = alphabet(:,15)+randn(35,1) * 0.2*i;
    plotchar(noisyO);
    A2 = sim(net2,noisyO);
    [maxOut,answer] = max(A2);   % index of the winning output neuron
    figure
    plotchar(alphabet(:,answer));
    % Mean squared error between the true letter and the recognized letter
    mserr(i+1) = mean((alphabet(:,15)-alphabet(:,answer)).^2);
end
x = 0:.2:.8;
plot(x,mserr)
title('Mean Squared Error for Noisy "O"')
xlabel('Noise Level')
ylabel('Mean Squared Error')
for i = 0:4
    figure
    % Add Gaussian noise with standard deviation 0.2*i to the letter Z (column 26)
    noisyZ = alphabet(:,26)+randn(35,1) * 0.2*i;
    plotchar(noisyZ);
    A2 = sim(net2,noisyZ);
    [maxOut,answer] = max(A2);   % index of the winning output neuron
    figure
    plotchar(alphabet(:,answer));
    % Mean squared error between the true letter and the recognized letter
    mserr(i+1) = mean((alphabet(:,26)-alphabet(:,answer)).^2);
end
x = 0:.2:.8;
plot(x,mserr)
title('Mean Squared Error for Noisy "Z"')
xlabel('Noise Level')
ylabel('Mean Squared Error')
% The noise level affects the more distinctive letters less. For example,
% A and Z are both easily detected even at noise level .6 because they have
% very distinctive shapes that are not shared by other letters. However, the
% O, which is very similar to letters like C and G, is much harder to
% distinguish at high noise levels.
eight = [0;1;1;1;0;1;0;0;0;1;1;0;0;0;1;0;1;1;1;0;1;0;0;0;1;1;0;0;0;1;0;1;1;1;0];
for i = 0:4
    figure
    % Add Gaussian noise with standard deviation 0.2*i to the hand-built 8
    noisy8 = eight+randn(35,1) * 0.2*i;
    plotchar(noisy8);
    A2 = sim(net2,noisy8);
    [maxOut,answer] = max(A2);   % index of the winning output neuron
    figure
    plotchar(alphabet(:,answer));
end
%When an input that the NN wasn't trained on is loaded, the neural network
%will of course not be able to tell what it is. However, it should be able
%to tell you which of the letters it is close to. This means that if you
%input an 8, you would expect to get back a B, since they have only four
%squares of difference between them. In test, though, the network had a
%tendency to relate the 8 to a P.
Output Images
Correlation
Letter R
The network usually stops recognizing the R at a noise level of around .6. Beyond that, it recognizes the R as a P, a B, or a K, all of which are similar in structure to the R. Based on the results of the original network tests, I would expect a higher percentage of error than has been shown. The original noise-trained neural network experienced a 60% error rate at a noise level of just .4, yet the network that we’re using seems to perform much more reliably at .4.
Letter A
Letter O
Letter Z
The noise level affects the more distinctive letters less. For example, A and Z are both easily detected even at noise level .6 because they have very distinctive shapes that are not shared by other letters. However, the O, which is very similar to letters like C and G, is much harder to distinguish at high noise levels.
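To get a feel for what these noise levels do to a 35-square pattern, here is a minimal sketch that uses only base Matlab and no trained network. It adds the same kind of Gaussian noise as the loops above and measures how many squares would change if the noisy pattern were thresholded at 0.5. The threshold, the trial count `numTrials`, and the choice of the "8" vector as the pattern are assumptions made for illustration. The network itself does not threshold its inputs.

```matlab
% Binary pattern: the 35-element "8" vector from the Part 2 source.
template = [0;1;1;1;0;1;0;0;0;1;1;0;0;0;1;0;1;1;1;0;1;0;0;0;1;1;0;0;0;1;0;1;1;1;0];

levels = 0:0.2:0.8;        % same noise levels as the loops above (0.2*i, i = 0..4)
numTrials = 200;           % hypothetical number of noisy copies per level
flipFrac = zeros(size(levels));

for k = 1:numel(levels)
    clean = repmat(template, 1, numTrials);
    noisy = clean + randn(35, numTrials) * levels(k);
    % A square counts as flipped when thresholding at 0.5 changes its value.
    flipped = (noisy > 0.5) ~= (clean > 0.5);
    flipFrac(k) = mean(double(flipped(:)));
end

% First row: noise level. Second row: fraction of flipped squares.
disp([levels; flipFrac]);
```

The fraction of flipped squares grows with the noise level, which is one rough way to see why letters that differ from their neighbors in only a few squares are the first to be confused.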
Number 8
When the network is given an input that it wasn’t trained on, it will of course not be able to tell what it is. However, it should be able to tell you which of the letters the input is closest to. If you input an 8, you would expect to get back a B, since only four squares differ between them. In testing, though, the network tended to relate the 8 to a P.
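The "closest letter" expectation can be checked by simply counting differing squares. The sketch below does this in base Matlab for the 8 against two letter grids. The `B` and `P` grids are hand-drawn for this example and are not copied from `prprob`, so the counts they produce need not match the four squares quoted above.

```matlab
% The 35-element "8" vector from the Part 2 source (7 rows of 5 squares).
eight = [0;1;1;1;0;1;0;0;0;1;1;0;0;0;1;0;1;1;1;0;1;0;0;0;1;1;0;0;0;1;0;1;1;1;0];

% Hypothetical hand-drawn 7x5 letter grids (assumption: not the prprob templates).
B = [1 1 1 1 0; 1 0 0 0 1; 1 0 0 0 1; 1 1 1 1 0; 1 0 0 0 1; 1 0 0 0 1; 1 1 1 1 0];
P = [1 1 1 1 0; 1 0 0 0 1; 1 0 0 0 1; 1 1 1 1 0; 1 0 0 0 0; 1 0 0 0 0; 1 0 0 0 0];

% Flatten each grid row by row so it lines up with the layout of eight.
templates = [reshape(B', 35, 1), reshape(P', 35, 1)];
names = 'BP';

% Number of squares in which each template differs from the 8.
diffCount = sum(abs(templates - repmat(eight, 1, size(templates, 2))), 1);
[fewest, idx] = min(diffCount);
closest = names(idx);

fprintf('Closest template: %s (%d differing squares)\n', closest, fewest);
```

With these particular grids the B comes out closer than the P, which matches the expectation stated above. The trained network can still prefer P because it does not classify by counting squares.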
Part 3 – Handwritten Digits Recognition
Source Code
clear all
MNIST = load('MNIST_data.mat');
train_samp = MNIST.train_samples';
temp = MNIST.train_samples_labels';
train_labl = zeros(10,4000);            % one column of targets per training sample
for ii=1:4000
    train_labl(temp(ii)+1,ii) = 10;     % target value 10 in the row of the true digit
end
net = newff(train_samp,train_labl,[784,196,49,24],...
{'logsig','logsig','logsig','logsig'},'trainrp');
net.divideFcn = 'dividerand';
net.trainParam.lr = .25;
net.trainParam.epochs = 300;
net.trainParam.goal = 0;
net.trainParam.show = 50;
net = train(net,train_samp,train_labl);
confusionmat = zeros(10,10);
for ii=1:1000
    result = sim(net,MNIST.test_samples(ii,:)');
    label = MNIST.test_samples_labels(ii);
    [maxOut,index] = max(result);   % strongest output; ties resolve to the first maximum
    % Rows are predicted digits, columns are true digits
    confusionmat(index,label+1) = confusionmat(index,label+1) + 1;
end
confusionmat
numCorrect = trace(confusionmat);   % correct predictions lie on the diagonal
errorpct = (1-(numCorrect/1000)) * 100
We implemented a 4-layer net. Our error rate was always between 10% and 20%. We mostly used the ‘logsig’ transfer function for our neurons because it yielded better performance.
The training process was slow, taking about 20 minutes per run. We then set the divideFcn parameter to ‘dividerand’, which cut down the training time significantly by stopping training when results started getting worse. We also tried representing the images as weighted sums, weighted averages, and weighted sums by row and column. All of these methods failed: none of them brought the error rate below 90%.
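As a rough illustration of how the error rate follows from the confusion matrix, the sketch below uses the matrix recorded for one of our Part 3 runs (the one reported with errorpct = 19.9000). Rows are predicted digits and columns are true digits, as in the source code. The per-digit accuracy is an extra summary that is not part of the original script, and the variable names are our own.

```matlab
% Confusion matrix from one recorded Part 3 run (rows: predicted, columns: true).
confusionmat = [ ...
    74   0   0   1   0   2   2   0   0   1; ...
     0 120   2   0   0   1   0   6   0   1; ...
     0   0  85   1   1   0   1   2   3   1; ...
     0   1   2  93   2   3   0   0   7   1; ...
     2   0   1   2  87   1   1   2   6  16; ...
     7   0   1   5   4  74   3   1   3   2; ...
     1   0   2   1   2   2  76   0   3   0; ...
     0   0   2   4   0   0   0  74   0   4; ...
     0   1  13   4   5   9   3   1  53   1; ...
     2   0   5   4   7   0   1  13  11  65];

numSamples = sum(confusionmat(:));          % total test samples
numCorrect = trace(confusionmat);           % correct predictions are on the diagonal
errorPct = (1 - numCorrect/numSamples) * 100;

% Fraction of each true digit that was classified correctly.
perDigitAcc = diag(confusionmat)' ./ sum(confusionmat, 1);
[worstAcc, worstIdx] = min(perDigitAcc);
worstDigit = worstIdx - 1;                  % column k holds digit k-1

fprintf('Error: %.1f%%, hardest digit: %d\n', errorPct, worstDigit);
```

Using `trace` instead of a variable named `sum` also avoids shadowing Matlab's built-in `sum` function.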
Part 4 – Recognizing My Own Handwriting
Scanned Image
This is the scanned image of the sample handwriting. The image was cropped by number and inverted. Here is an example number:
Code
clear all
% Samples are named samples/<set>-<digit> for sets 0-2 and digits 0-9
for ii = 1:30
    fname = sprintf('samples/%d-%d',floor((ii-1)/10),mod(ii-1,10));
    test(ii,:,:,:) = imread(fname,'jpg');
end
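for ii=1:30
    imgs(ii,:,:) = test(ii,:,:,1);      % keep the first color channel only
    labels(ii) = mod((ii-1),10);        % digit label encoded in the file name
end
for ii=1:30
    vecs(:,ii) = reshape(imgs(ii,:,:),784,1);       % flatten the image to 784x1
    samples(:,ii) = double(vecs(:,ii))./255.0;      % scale pixel values to [0,1]
    for jj=1:784
        if samples(jj,ii) > .15
            samples(jj,ii) = samples(jj,ii) * 10;   % boost pixels above the upper threshold
        elseif samples(jj,ii) < .08
            samples(jj,ii) = samples(jj,ii) * .5;   % damp faint background pixels
        end
        samples(jj,ii) = samples(jj,ii) / 10;
    end
end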
MNIST = load('MNIST_data.mat');
train_samp = MNIST.train_samples';
temp = MNIST.train_samples_labels';
train_labl = zeros(10,4000);            % one column of targets per training sample
for ii=1:4000
    train_labl(temp(ii)+1,ii) = 10;     % target value 10 in the row of the true digit
end
net = newff(train_samp,train_labl,[784,196,49,24],...
{'logsig','logsig','logsig','logsig'},'trainrp');
net.divideFcn = 'dividerand';
net.trainParam.lr = .25;
net.trainParam.epochs = 300;
net.trainParam.goal = 1e-5;
net.trainParam.show = 50;
net = train(net,train_samp,train_labl);
confusionmat = zeros(10,10);
for ii=1:30
    result = sim(net,samples(:,ii));
    label = labels(ii);
    [maxOut,index] = max(result);   % strongest output; ties resolve to the first maximum
    % Rows are predicted digits, columns are true digits
    confusionmat(index,label+1) = confusionmat(index,label+1) + 1;
end
confusionmat
numCorrect = trace(confusionmat);   % correct predictions lie on the diagonal
errorpct = (1-(numCorrect/30)) * 100
When we ran the simulation with the scanned numbers, our neural net got confused. We think this is because the test sample size (30) is too small to give accurate results. Other factors that affected the accuracy of our neural net are the quality of the scan and the normalization procedures, which might not have been the same as those used for the training images.
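One way to make scanned digits more comparable to the training images would be to stretch every scan to the same intensity range before flattening it. The sketch below shows a simple min-max normalization in base Matlab. It is only an assumption that the training samples in MNIST_data.mat span the range 0 to 1, and the input image here is a synthetic stand-in, not one of our scans.

```matlab
% Hypothetical stand-in for one cropped, inverted 28x28 grayscale scan.
img = uint8(magic(28) / 784 * 255);

pixels = double(img);
lo = min(pixels(:));
hi = max(pixels(:));

if hi > lo
    scaled = (pixels - lo) / (hi - lo);   % stretch to the full [0,1] range
else
    scaled = zeros(size(pixels));         % blank image: avoid dividing by zero
end

% Flatten to a 784x1 column, the shape the network expects as input.
sample = reshape(scaled, 784, 1);
```

Whether the pixel order produced by `reshape` matches the order used in the training set is a separate question that this sketch does not settle.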
Neural Networks
Part 1 - A Simple Neural Network
Source Code
p1 = [-1 2 -1 2; 2 2 5 5];
t1 = [-1 -1 1 1];
p2 = [2 0 -2 -4; 0 1 2 3];
t2 = [-1 -1 1 1];
p3 = [1 0;5 5];
t3 = [-1 1];
p4 = [1 0;0 0];
t4 = [-1 1];
net = newff(minmax(p1),[3,1],{'tansig','purelin'},'traingd');
net.trainParam.lr = .05; %Learning Rate
net.trainParam.epochs = 300; %Max Epochs
net.trainParam.goal = 1e-5; %Training Goal in Mean Squared Error
net.trainParam.show = 50; %# of epochs in display
[net,tr1] = train(net,p1,t1);
o1 = sim(net,p1)
net = newff(minmax(p2),[3,1],{'tansig','purelin'},'traingd');
[net,tr2] = train(net,p2,t2);
o2 = sim(net,p2);
net = newff(minmax(p3),[3,1],{'tansig','purelin'},'traingd');
[net,tr3] = train(net,p3,t3);
o3 = sim(net,p3);
net = newff(minmax(p4),[3,1],{'tansig','purelin'},'traingd');
[net,tr4] = train(net,p4,t4);
o4 = sim(net,p4);
Output Images
Output 1
Output 2
Output 3
Output 4
Analysis
Switching the symmetry of the inputs also changes the signs of the outputs. For example, in the training vector the second and third numbers are -1 and 2 respectively, while in p1 they are 2 and -1. This reversal causes the second output to be negative. Also, in the o and o1 output vectors the value -1.0080 repeats, because the third and fourth numbers are now -1 and 2. In the end, however, the outputs depend on how the network is trained.
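To make clearer what a 2-input network with 3 ‘tansig’ hidden neurons and 1 ‘purelin’ output computes, here is a minimal forward-pass sketch in base Matlab. The weights and biases are hypothetical values picked for illustration, not the ones `train` found, and the sketch ignores any input or output preprocessing the toolbox may apply.

```matlab
% Inputs copied from the Part 1 source: one column per sample.
p1 = [-1 2 -1 2; 2 2 5 5];

% Hypothetical weights and biases, chosen only for illustration.
W1 = [0.5 -0.3; -0.2 0.4; 0.1 0.1];   % 3 hidden neurons x 2 inputs
b1 = [0.1; -0.1; 0];
W2 = [1 -1 0.5];                      % 1 output neuron x 3 hidden neurons
b2 = 0.2;

numSamples = size(p1, 2);

% Hidden layer: tansig(n) is mathematically the same as tanh(n).
hidden = tanh(W1*p1 + repmat(b1, 1, numSamples));

% Output layer: purelin is the identity, so the output is a weighted sum.
output = W2*hidden + b2;

disp(output);
```

Because the output is a fixed function of the weights, two different training runs that end with different weights can give different outputs for the same input, which is the dependence on training noted above.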
Part 3 – Handwritten Digits Recognition
Output
confusionmat =
    74     0     0     1     0     2     2     0     0     1
     0   120     2     0     0     1     0     6     0     1
     0     0    85     1     1     0     1     2     3     1
     0     1     2    93     2     3     0     0     7     1
     2     0     1     2    87     1     1     2     6    16
     7     0     1     5     4    74     3     1     3     2
     1     0     2     1     2     2    76     0     3     0
     0     0     2     4     0     0     0    74     0     4
     0     1    13     4     5     9     3     1    53     1
     2     0     5     4     7     0     1    13    11    65
errorpct = 19.9000
Part 4 – Recognizing My Own Handwriting
Output
confusionmat =
     0     0     0     0     0     0     0     0     0     0
     1     2     1     2     1     0     0     0     1     0
     0     0     0     0     0     0     0     1     0     0
     0     0     0     0     0     0     2     1     0     1
     2     1     2     1     2     2     1     1     2     2
     0     0     0     0     0     0     0     0     0     0
     0     0     0     0     0     1     0     0     0     0
     0     0     0     0     0     0     0     0     0     0
     0     0     0     0     0     0     0     0     0     0
     0     0     0     0     0     0     0     0     0     0
errorpct = 86.6667
Old Page
Source Code