Convolutional Neural Network (LeNet 5). Construction of C3, C5 layers - networking

http://i60.tinypic.com/no7tye.png
Fig. 1 Convolutional Neural Network (LeNet5)
In the Convolutional Neural Network (LeNet-5) shown in Fig. 1, the convolution layer (C1) and the max pooling (subsampling) layers (S2, S4) are computed in an iterative manner. But I do not understand how to compute the C3 (convolution) layer correctly.
http://tinypic.com/r/fvzp86/8
Fig. 2 Proceeding C1 layer
First, as input we receive a 32*32 grayscale MNIST image of a digit, treated as a 32*32 array of bytes. In the C1 layer we have 6 distinct kernels filled with small random values. Each kernel is used to build one of 6 feature maps (one kernel per feature map). We move a 5*5 receptive field across the image from left to right with a stride of 1 pixel, multiply each value in the image array by the corresponding kernel value, sum the products, add the bias, and pass the result through a sigmoid function. The result is element i,j of the feature map currently being constructed. Once we reach the end of the image array, the current feature map is finished.
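The C1 step described above can be sketched in plain Python. This is only a minimal illustration under assumptions: the image is random stand-in data rather than a real MNIST digit, the kernel range of [-0.1, 0.1] and the zero biases are arbitrary choices, and the function names are made up for this example.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def conv_feature_map(image, kernel, bias):
    """Slide a k x k kernel over a square image with stride 1 (no padding)."""
    k = len(kernel)
    out_size = len(image) - k + 1  # 32 - 5 + 1 = 28 for LeNet-5's C1
    fmap = [[0.0] * out_size for _ in range(out_size)]
    for i in range(out_size):
        for j in range(out_size):
            total = bias
            for u in range(k):
                for v in range(k):
                    total += image[i + u][j + v] * kernel[u][v]
            fmap[i][j] = sigmoid(total)
    return fmap


random.seed(0)
# Stand-in for a 32x32 grayscale digit (assumption: values scaled to [0, 1]).
image = [[random.random() for _ in range(32)] for _ in range(32)]
# Six 5x5 kernels with small random values, one per C1 feature map.
kernels = [[[random.uniform(-0.1, 0.1) for _ in range(5)] for _ in range(5)]
           for _ in range(6)]
biases = [0.0] * 6

c1 = [conv_feature_map(image, k, b) for k, b in zip(kernels, biases)]
print(len(c1), len(c1[0]), len(c1[0][0]))  # 6 28 28
```

Each of the six kernels yields one 28*28 feature map, since a 5*5 window fits in 32 - 5 + 1 = 28 positions per axis.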
http://i57.tinypic.com/rk0jk9.jpg
Fig. 3 Proceeding S2 layer
Next we produce the S2 layer. Again there are 6 feature maps, because we apply a 2*2 receptive field to each of the 6 feature maps of C1 individually (max pooling, i.e. selecting the maximum value in each 2*2 receptive field). C1, S2 and S4 are all computed in this iterative manner.
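A minimal sketch of the 2*2 pooling step as the question describes it, with non-overlapping windows and the maximum taken in each. Note that the original LeNet-5 paper instead sums the four inputs and applies a trainable coefficient, a bias and a sigmoid; the max-pooling variant below follows the question, and the function name is an assumption.

```python
def max_pool_2x2(fmap):
    """Non-overlapping 2x2 max pooling; halves each dimension."""
    rows, cols = len(fmap) // 2, len(fmap[0]) // 2
    return [
        [
            max(
                fmap[2 * i][2 * j], fmap[2 * i][2 * j + 1],
                fmap[2 * i + 1][2 * j], fmap[2 * i + 1][2 * j + 1],
            )
            for j in range(cols)
        ]
        for i in range(rows)
    ]


example = [
    [1, 2, 5, 6],
    [3, 4, 7, 8],
    [9, 1, 2, 3],
    [0, 2, 4, 1],
]
print(max_pool_2x2(example))  # [[4, 8], [9, 4]]
```

Applied to a 28*28 map from C1 this gives a 14*14 map in S2, and applied to a 10*10 map from C3 it gives a 5*5 map in S4.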
http://i58.tinypic.com/ifsidu.png
Fig. 4 Connection list of C3 layer
Next we need to compute the C3 layer. According to various papers there is a connection map. What exactly is meant by the connection list? Does it mean that we still use a 5*5 receptive field, as in the C1 layer?
For example, in the first row the marked cells correspond to columns 0, 4, 5, 6, 9, 10, 11, 12, 14 and 15. Does this mean that to construct feature maps 0, 4, 5, 6, 9, 10, 11, 12, 14 and 15 of the C3 layer we apply the convolution operation to the first feature map of S2 with a 5*5 receptive field? Which kernel is used in that convolution, or do we again randomly generate 16 kernels filled with small numbers, as we did for the C1 layer?
If so, feature maps 0, 4, 5, 6, 9, 10, 11, 12, 14 and 15 of C3 are colored light grey, light grey, dark grey, light grey, dark grey, light grey, dark grey, light grey, light grey and dark grey. The first feature map of S2 is clearly light grey, but only maps 0, 4, 6, 10, 12 and 14 are colored light grey. So maybe the 16 feature maps of C3 are built in a different way.
Could you also explain how to produce the C5 layer? Does it have a connection list of its own?

Disclaimer: I have just started with this topic so please do point out mistakes in my concept!
In the original LeNet paper, on page 8, you can find a connection map that links the feature maps (layers) of S2 to those of C3. This connection list tells us which feature maps of S2 are convolved with the kernel (details coming up) to produce each feature map of C3.
You will notice that each feature map of S2 is involved in producing exactly 10 (not all 16) feature maps of C3. With 6 feature maps in S2 this gives 6 x 10 = 60 kernels of size 5x5, which I write as a kernel of size (5x5x6) x 10.
In C1 we had a (5x5) x 6 kernel, i.e. six 5x5 kernels, one per feature map. This is 2D convolution. In C3 we have a (5x5x6) x 10 kernel, i.e. a "kernel-box" in which each of the 6 S2 feature maps carries 10 kernels. Because S2 and C3 are not fully connected, these kernels combine to produce 16 feature maps rather than 6.
Regarding generation of the kernel weights, it depends on the algorithm. They can be random, predefined, or set by some scheme, e.g. Xavier initialization in Caffe.
What confused me is that the kernel details are not well defined and have to be derived from the given information.
Update:
How is C5 Produced?
Layer C5 is a convolutional layer with 120 feature maps. The C5 feature maps have size 1x1 because a 5x5 kernel is applied to the 5x5 feature maps of S4. For a 32x32 input, we can therefore also say that S4 and C5 are fully connected. The size of the kernel applied to S4 to get C5 is (5x5x16) x 120 (bias not shown). The paper does not describe in detail how these 120 kernel-boxes connect to S4, but it does mention that S4 and C5 are fully connected.

The key point in the paper concerning C5 seems to be that the 5x5 kernel is applied to ALL 16 of S4's feature maps, i.e. it is a fully connected layer.
"Each unit is connected to a 5x5 neighborhood on all 16 of S4's feature maps".
Since we have 120 output units, we should have 120 bias connections (or else the architecture details don't tally).
Each output unit is then connected to all 25x16 input units.
So in total we have
num_connections = (25x16+1)x120 = 48000+120 = 48120
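The count above can be checked with a few lines of Python. The variable names are only illustrative; the numbers are the ones given in the answer.

```python
kernel_area = 5 * 5   # each C5 unit sees a 5x5 neighbourhood ...
s4_maps = 16          # ... on all 16 feature maps of S4
c5_units = 120        # number of C5 output units

weights = kernel_area * s4_maps * c5_units  # 25 * 16 * 120
biases = c5_units                           # one bias per output unit
num_connections = weights + biases
print(weights, biases, num_connections)  # 48000 120 48120
```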

I have understood the forward pass from S2 to C3 to have 60*(5x5) + 16*1 = 1516 trainable parameters. Here I've kept x-times separate from *-times, since 5x5 is the dimension of each 2D kernel. Since there are 60 X's in the table, there are 60 such kernels:
In columns 0 to 5 of the table each column has 3 kernels of size 5x5, which are convolved (actually cross-correlated) with the specified feature maps of S2. So for each of the feature maps 0 to 5 of C3 you get three 10x10 images, since 14x14 - 5x5 + 1x1 = 10x10. These are then summed together with a scalar bias to form the final 10x10 feature map in C3.
In columns 6 to 14 each column has 4 kernels of size 5x5, which are "convolved" with the specified feature maps of S2 and then combined as before into feature maps 6 to 14 of C3.
Finally, column 15 has 6 kernels of size 5x5.
Together this is (6*3 + 9*4 + 1*6)*(5x5) = 60*(5x5), i.e. 60 kernels of size 5x5. Adding the 16 scalar biases gives 60*5*5 + 16 = 1516 trainable parameters, which agrees with the number specified in the paper.
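To make the bookkeeping concrete, here is a small Python sketch. It encodes the S2-to-C3 connection table as I understand it from the paper (three contiguous S2 maps for C3 maps 0 to 5, four contiguous for 6 to 11, four non-contiguous for 12 to 14, all six for map 15), recomputes the parameter count, and builds one C3 map by summing the per-input cross-correlations. The names C3_INPUTS, cross_correlate and c3_pre_activation are made up for this example, and the activation function is left out.

```python
# C3_INPUTS[c] lists the S2 feature maps that feed C3 feature map c.
C3_INPUTS = (
    [[(c + d) % 6 for d in range(3)] for c in range(6)]    # C3 maps 0-5
    + [[(c + d) % 6 for d in range(4)] for c in range(6)]  # C3 maps 6-11
    + [[0, 1, 3, 4], [1, 2, 4, 5], [0, 2, 3, 5]]           # C3 maps 12-14
    + [[0, 1, 2, 3, 4, 5]]                                 # C3 map 15
)

num_kernels = sum(len(inputs) for inputs in C3_INPUTS)
num_params = num_kernels * 5 * 5 + len(C3_INPUTS)  # 5x5 weights + one bias per map
print(num_kernels, num_params)  # 60 1516


def cross_correlate(fmap, kernel):
    """Valid 2D cross-correlation of a square map with a square kernel."""
    k = len(kernel)
    n = len(fmap) - k + 1
    return [
        [
            sum(fmap[i + u][j + v] * kernel[u][v]
                for u in range(k) for v in range(k))
            for j in range(n)
        ]
        for i in range(n)
    ]


def c3_pre_activation(s2_maps, kernels, bias, inputs):
    """Sum one cross-correlation per connected S2 map, plus a scalar bias."""
    n = len(s2_maps[0]) - len(kernels[0]) + 1
    out = [[bias] * n for _ in range(n)]
    for s2_index, kernel in zip(inputs, kernels):
        part = cross_correlate(s2_maps[s2_index], kernel)
        for i in range(n):
            for j in range(n):
                out[i][j] += part[i][j]
    return out


# Toy data: six 14x14 S2 maps of ones and three 5x5 kernels of ones.
s2 = [[[1.0] * 14 for _ in range(14)] for _ in range(6)]
ks = [[[1.0] * 5 for _ in range(5)] for _ in range(3)]
c3_map0 = c3_pre_activation(s2, ks, 0.5, C3_INPUTS[0])
print(len(c3_map0), len(c3_map0[0]), c3_map0[0][0])  # 10 10 75.5
```

Each C3 map has its own kernel for every S2 map it is connected to, so the kernels are not shared between C3 maps even when they read the same S2 map.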
Hope this helps.

Related

how to plot specific segment from density.lpp

I use density.lpp for kernel density estimation. I want to pick a specific segment and plot the estimate along that segment. As an example, I have a road made up of two segments. The segments have different lengths, so I don't know how many pieces each of them is divided into.
Here are the locations of the vertices and the road segment ids.
https://www.dropbox.com/s/fmuul0b6lus279c/R.csv?dl=0
Below is the code I used to create the spatial lines data frame, generate random points on the network and get the density estimate.
Is there a way to know how many pieces each segment is divided into? Or, if I want to plot location vs. estimate for a chosen segment, how can I do that? Using dimyx=100 created 199 estimation points, but I don't know how many of them belong to Swid=1 or Swid=2.
One approach I tried was gDistance. It works fine in this problem because these segments are connected in one direction. However, when there is a 4-way connection, some of the lambda values get attached to another segment that they do not belong to. I provided a picture and circled 2 points; when I used gDistance, those points were attached to other segments. Any ideas?
library(sp)        # coordinates, CRS, Lines, SpatialLines, SpatialLinesDataFrame
library(dplyr)     # select
library(maptools)  # as.linnet.SpatialLines
library(spatstat)  # runiflpp, lpp, bw.scott, density.lpp
# read the vertices and turn them into spatial points
R=read.csv("R.csv",header=T,sep=",")
R2.1=dplyr::select(R, X01,Y01,Swid)
coordinates(R2.1) = c("X01", "Y01")
proj4string(R2.1)=CRS("+proj=utm +zone=17 +datum=NAD83 +units=m +no_defs +ellps=GRS80 +towgs84=0,0,0")
plot(R2.1,main="nodes on the road")
## one Lines object per road segment id (Swid)
LineXX <- lapply(split(R2.1, R2.1$Swid), function(x) Lines(list(Line(coordinates(x))), x$Swid[1L]))
##
linesXY <- SpatialLines(LineXX)
data <- data.frame(Swid = unique(R2.1$Swid))
rownames(data) <- data$Swid
lxy <- SpatialLinesDataFrame(linesXY, data)
proj4string(lxy)=proj4string(R2.1) # reuse the CRS set on the nodes
W.1=as.linnet.SpatialLines(lxy)
# random points on the network
Rand1=runiflpp(250, W.1)
Rand1XY=coords(Rand1)[,1:2]
W2=owin(xrange=c(142751.98, 214311.26), yrange=c(3353111, 3399329))
Trpp=ppp(x=Rand1XY$x, y=Rand1XY$y, window=W2) ### planar point object
L.orig=lpp(Trpp,W.1) # discrete
plot(L.orig,main="Original with accidents")
S1=bw.scott(L.orig)[1] # in case the bandwidth needs to change
Try274=density(L.orig,S1,distance="path",continuous=TRUE,dimyx=100)
L=as.linnet(L.orig)
# number of non-missing estimation points
length(Try274[!is.na(Try274$v)])
[1] 199
This is a question about the spatstat package.
The result of density.lpp is an object of class linim. For any such object, you can use as.data.frame to extract the data. This yields a data frame with one row for each sample point on the network. For each sample point, the data are xc, yc (coordinates of the nearest pixel centre), x, y (exact coordinates of the sample point on the network), seg (identifier of the segment), tp (relative position along the segment) and values (the density value). If you split the data frame by the seg column, you will get the data for the individual segments of the network.
However, it seems that you may want information about the internal workings of density.lpp. In order to achieve adequate accuracy during the computation phase, density.lpp subdivides each network segment into many short segments (using a complex set of rules). This information is lost when the final results are discretised into a linim object and returned. The attribute "dx" reports the length of the short segments that were used in the computation phase, but that's all.
If you email me directly I can show you how to extract the internal information.

Count outside edges of adjacent cells in a matrix in R

I'm working on some gridded temperature data, which I have categorised into a matrix where each cell can be one of two classes - let's say 0 or 1 for simplicity. For each class I want to calculate patch statistics, taking inspiration from FRAGSTATS, which is used in landscape ecology to characterise the shape and size of habitat patches.
For my purposes, a patch is a cluster of adjacent cells of the same class. Here's an example matrix, mat:
mat <-
matrix(c(0,1,0,
1,1,1,
1,0,1), nrow = 3, ncol = 3,
byrow = TRUE)
0 1 0
1 1 1
1 0 1
All the 1s in mat form a single patch (we'll ignore the 0s), and in order to calculate various different shape metrics I need to be able to calculate the perimeter (i.e. number of outside edges).
EDIT
Sorry I apparently can't post an image because I don't have enough reputation, but you can see in the black lines of G5W's answer below that the outside borders of 1's represent the outside edges I'm referring to.
Manually I can count that the patch of 1s has 14 outside edges and I know the area (i.e. number of cells) is 6. Based on a paper by He et al. and this other question I've figured out how to calculate the number of inside edges (5 in this example), but I'm really struggling to do the same for the outside edges! I think it's something to do with how the patch shape compares to the largest integer square that has a smaller area (in this case, a 2 x 2 square), but so far my research and pondering have been to no avail.
N.B. I am aware of the package SDMTools, which can calculate various FRAGSTATS metrics. Unfortunately the metrics returned are too processed e.g. instead of just Aggregation Index, I need to know the actual numbers used to calculate it (number of observed shared edges / maximum number of shared edges).
This is my first post on here so I hope it's detailed enough! Thanks in advance :)
If you know the area and the number of inside edges, it is simple to calculate the number of outside edges. Every cell has four edges, so at first glance the total number of edges is 4 * area. But that is not quite right, because every inside edge is shared between two cells and has been counted twice. So the right number of total edges is
4*area - inside
The number of outside edges is the total edges minus the inside edges, so
outside = total - inside = (4*area- inside) - inside = 4*area - 2*inside.
You can see that the area is made up of 6 squares each of which has 4 sides. The inside edges (the red ones) are shared by two adjacent squares.
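A quick way to check the formula on the example matrix is the sketch below. It is written in Python rather than the R used in the question, purely to verify the arithmetic, and the function name patch_edges is made up. It counts every cell of the given class in the matrix, so it assumes the class forms a single patch, as it does here.

```python
def patch_edges(grid, cls=1):
    """Return (area, inside edges, outside edges) for all cells equal to cls."""
    rows, cols = len(grid), len(grid[0])
    area = sum(cell == cls for row in grid for cell in row)
    inside = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != cls:
                continue
            # Count each shared edge once: look only right and down.
            if c + 1 < cols and grid[r][c + 1] == cls:
                inside += 1
            if r + 1 < rows and grid[r + 1][c] == cls:
                inside += 1
    outside = 4 * area - 2 * inside
    return area, inside, outside


mat = [
    [0, 1, 0],
    [1, 1, 1],
    [1, 0, 1],
]
print(patch_edges(mat))  # (6, 5, 14)
```

This reproduces the area of 6, the 5 inside edges and the 14 outside edges counted by hand in the question.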

Describe state space in reinforcement learning

I'm working on a reinforcement learning task with an environment (consisting of grass, forest, dirt and water), a predator and a prey. The prey tries to keep away from the predator for as long as possible while consuming water and grass to survive. There are 2 functions I must edit: getStateDesc <- function(simData, preyId) and getReward <- function(oldstate, action, newstate). Some states and rewards are already implemented by default, and my state space records c(distance to predator, direction to predator, whether the prey is on the border) for the qlearning algorithm. In the reward function, the prey is penalized based on its distance to the predator and for trying to move onto the border. I now want to add a state that checks whether the prey is in the forest (so it can hide), for which I have implemented the function isPreyInForest. I want two values for this state: if isPreyInForest==TRUE then state<-1, otherwise state<-2, and later I want to reward my agent based on it. The problem is that I cannot change the dimension of the state space c(distance, direction, border): when I add the new state, giving c(distance, direction, border, state), and later run the simulation with qlearning(c(30, 4, 5, 2), maxtrials=100)
(here 30 is the maximum distance from the predator, 4 is the number of directions, and 5 is the border state, where values 1 to 4 are the borders and 5 means the agent is not on a border) I get Error in apply(Q, len + 1, "[", n) : dim(X) must have a positive length. Any idea how to expand the state space and pass the right argument to the qlearning function?

Graph Theory: number of connected triples

In order to find the Global Clustering Coefficient I need to find the number of connected triples. About this graph:
these are the triples that I found:
7-6-5
5-3-7
5-3-1
4-3-7
4-3-1
4-5-6
3-7-6
3-5-6
2-3-4
2-1-7
2-3-5
1-7-6
total: 12 triples.
Moreover, there are 3 triangles, and 1 triangle is equal to 3 triples. So in total there are 12 + 3*3 = 21 triples. Is that correct? And is it possible to find a rule or a method to find all the connected triples in a graph without doing it manually?
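Sum k*(k-1)/2 over all nodes, where k is the degree of each node. Every pair of neighbours of a node forms one connected triple centred on that node, so this sum counts all connected triples, including the three that belong to each triangle.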
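The degree formula can be sketched in Python as follows. The graph from the question is not reproduced here, so the edge list below is a hypothetical example (a triangle 1-2-3 with an extra edge 3-4), and the function names are assumptions. The sketch also assumes a simple undirected graph with each edge listed once.

```python
from collections import defaultdict


def connected_triples(edges):
    """Sum k*(k-1)/2 over all nodes, where k is the node's degree."""
    degree = defaultdict(int)
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return sum(k * (k - 1) // 2 for k in degree.values())


def count_triangles(edges):
    """Count triangles via common neighbours of each edge's endpoints."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    # Every triangle is found once from each of its three edges.
    return sum(len(adj[a] & adj[b]) for a, b in edges) // 3


edges = [(1, 2), (2, 3), (1, 3), (3, 4)]  # hypothetical example graph
triples = connected_triples(edges)
tri = count_triangles(edges)
print(triples, tri, 3 * tri / triples)  # 5 1 0.6
```

The last printed value is the global clustering coefficient, 3 * triangles / connected triples, for this toy graph.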

find the mean for points of binary features

I have groups of binary strings in which each bit represents one feature of a variable. E.g. I have a color variable whose features are red, blue and green, so 010 means I have a blue object.
I need to get the center of these objects by calculating a weighted mean. For example, if 010 has weight 0.5, 100 has weight 0.4 and 001 has weight 0.8, the mean is [010*0.5 + 100*0.4 + 001*0.8]/[1.7].
Is it possible to get a point that represents the center of those points and has the same properties as the other points (binary on 3 bits)?
Thank you in advance for your help.
I guess you can use the following approach from cluster analysis: choose a metric for your object space (Euclidean, taxicab or something else), and then for every object in the group (or, if the cardinality of the set is small, for every possible object) calculate the average distance to all objects in the group. You can then take the object with the smallest average distance as the center of the group.
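A minimal Python sketch of this idea, using the three strings and weights from the question. Two things here are assumptions beyond the answer: the Hamming distance is chosen as the metric (for bit strings it coincides with the taxicab distance), and the average is weighted by the question's weights. The function names are made up.

```python
from itertools import product


def hamming(a, b):
    """Number of positions at which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))


def weighted_center(points, weights, candidates):
    """Candidate with the smallest weighted average distance to the points."""
    total = sum(weights)

    def avg_dist(candidate):
        return sum(w * hamming(candidate, p)
                   for p, w in zip(points, weights)) / total

    return min(candidates, key=avg_dist)


points = ["010", "100", "001"]
weights = [0.5, 0.4, 0.8]

# Center restricted to the objects of the group.
print(weighted_center(points, weights, points))  # 001

# Center chosen among all possible 3-bit objects.
all_objects = ["".join(bits) for bits in product("01", repeat=3)]
print(weighted_center(points, weights, all_objects))  # 000
```

Restricting the candidates to the group keeps the center a valid observed object, while searching all 3-bit strings may return an object such as 000 that does not encode any single color.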
