I tried to implement a simple 2D single layer perceptron and ended up with this solution:
perceptron <- function(featureVec, classVec, wStart=matrix(c(0,0,0)), eta=1, limit = 50) {
plot(x=featureVec[,1],y=featureVec[,2])
# Extending dimensions
dimension <- dim(featureVec)[1]
featureVec <- cbind(featureVec,rep(1,dimension))
# Inverting 2. class
index <- classVec == -1
featureVec[index,] <- apply(matrix(featureVec[index]),1,prod,-1)
wTemp <- wStart
y <- featureVec %*% wTemp
iteration = 0
while (T) {
y <- featureVec %*% wTemp
delta <- as.matrix(featureVec[y <= 0,])
for(i in 1:nrow(delta)) {
wTemp <- wTemp + eta*delta[i,]
}
result <- featureVec %*% wTemp
if (sum(result <= 0) == 0) {
break
}
if (iteration >= limit) {
stop("Maximum count of iterations reached!")
}
iteration = iteration + 1
}
if(wTemp[2] != 0) {
abline(-wTemp[3]/wTemp[2],-wTemp[1]/wTemp[2])
} else if(wTemp[2] == 0) {
abline(v=wTemp[1])
} else if(wTemp[1] == 0) {
abline(h=wTemp[2])
}
return(wTemp)
}
The feature vector works row-wise (one sample per row); the class vector needs values of 1 and -1 as a column.
For most of my tests it works correctly, but when I have samples like (0,0) (0,1) with classes (1,-1) I get no result. That happens with some of my examples where two points lie on a straight line parallel to a coordinate axis. When I choose different start vectors it sometimes works correctly (so I guess I have no deterministic behaviour here right now). Is that correct behaviour, or is my implementation wrong?
Thanks for your help, Meiner.
EDIT: Some changes to the initial post.
Bad Dataset:
featureTest <- matrix(c(0,0,0,1),byrow=T,nrow=2)
classTest <- matrix(c(1,-1),nrow=2)
perceptron(featureTest,classTest)
featureTest <- matrix(c(0,1,0,2),byrow=T,nrow=2)
classTest <- matrix(c(1,-1),nrow=2)
perceptron(featureTest,classTest)
Good Dataset:
featureTest <- matrix(c(0,0,0,2),byrow=T,nrow=2)
classTest <- matrix(c(1,-1),nrow=2)
perceptron(featureTest,classTest)
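Not a full answer, but one thing that may explain the failing cases: when only one sample is misclassified, featureVec[y <= 0,] drops to a plain vector, and as.matrix() turns that into a 3 x 1 column, so delta[i,] is a single number that gets added to every weight. Below is a minimal sketch of the same update rule with drop = FALSE. The name perceptron_sketch is made up, the plotting is left out, and I have only traced it by hand on the two "bad" datasets above.

```r
# What happens to a single selected row
X <- rbind(c(0, 0, 1), c(0, -1, -1))
dim(as.matrix(X[c(TRUE, FALSE), ]))       # 3 1: a column, not a row
dim(X[c(TRUE, FALSE), , drop = FALSE])    # 1 3

# Hypothetical helper (not the original function): same update rule, no plotting
perceptron_sketch <- function(X, cls, w = c(0, 0, 0), eta = 1, limit = 50) {
  X <- cbind(X, 1)             # append the bias column
  X <- X * as.vector(cls)      # flip the sign of every row with class -1
  for (iteration in seq_len(limit)) {
    wrong <- as.vector(X %*% w) <= 0
    if (!any(wrong)) return(w)
    # drop = FALSE keeps a single misclassified row as a 1 x 3 matrix
    delta <- X[wrong, , drop = FALSE]
    w <- w + eta * colSums(delta)
  }
  stop("Maximum count of iterations reached!")
}

perceptron_sketch(matrix(c(0, 0, 0, 1), byrow = TRUE, nrow = 2), c(1, -1))
```

Summing the misclassified rows with colSums() gives the same update as the inner for loop, as long as delta really is a matrix with one sample per row.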
I am working on a project in which I am simulating 8 classroom social networks over 6 weeks, so 30 iterations. Students will nominate each other based on a number of factors, and I plan to simulate a number of conditions in which I remove or add some of these factors. In other words, I'm going to be repeating a lot of code, so I'd rather use functions than cut and paste wherever possible.
Right now, I'm trying to create a function that adjusts the probability of one student selecting another based on the similarity of their emotions. When I include it in a set of nested for loops, this works just fine:
num_students <- 5
names_students <- letters[1:num_students]
student_emotion <- sample(round(runif(5, min = -5, max = 5), digits = 1))
student_emotion_df <- cbind.data.frame(names_students, student_emotion)
probs <- rep(1/num_students, 5)
row_prob <- vector(length = 5)
for(i in 1:num_students){
for(q in 1:num_students){
if(abs(student_emotion[i]-student_emotion[q]) >= 0 &
abs(student_emotion[i]-student_emotion[q]) <= .5){
row_prob[q] <- 1*probs[q]
} else if(abs(student_emotion[i]-student_emotion[q]) > .5 &
abs(student_emotion[i]-student_emotion[q]) <= 1) {
row_prob[q] <- .75 * probs[q]
}
else {
row_prob[q] <- .5 * probs[q]
}
}
}
The row_prob object is a vector of the probabilities that student i (in the column) will select student q (in the rows).
I've created a user-defined function based on the same code, and that works:
emotion_difference_fun <- function(probs){
for(q in 1:num_students){
if(abs(student_emotion[i]-student_emotion[q]) >= 0 &
abs(student_emotion[i]-student_emotion[q]) <= .5){
row_prob[q] <- 1*probs[q]
} else if(abs(student_emotion[i]-student_emotion[q]) > .5 &
abs(student_emotion[i]-student_emotion[q]) <= 1) {
row_prob[q] <- .75 * probs[q]
}
else {
row_prob[q] <- .5 * probs[q]
}
}
return(row_prob)
}
emotion_difference_fun(probs)
But when I try to embed that function within the for loop iterating through the columns, row_prob returns as an empty vector:
for(i in 1:num_students){
emotion_difference_fun(probs)
}
Any thoughts on how I can get this to work?
Thanks for any help you're able to offer.
If I understood your question properly, then you need to assign the results in your last 'for' loop:
for(i in 1:num_students){
if(i == 1) out <- NULL
out <- c(out, emotion_difference_fun(probs))
}
out
Is that what you are looking for?
What I am unclear about, though, is why in your second code section you are not looking for a 5*5 matrix. As written, it doesn't matter that you loop over all 5 students, because row_prob only keeps the last iteration (student 5).
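To get the 5*5 matrix mentioned above, one option (a sketch, not necessarily what you intended) is to pass i to the function instead of reading it from the global environment, and to collect one column per student. The names emotion_difference_i and prob_matrix are made up; the cut-offs are the ones from the question.

```r
set.seed(1)  # only so the sketch is reproducible
num_students <- 5
student_emotion <- round(runif(num_students, min = -5, max = 5), digits = 1)
probs <- rep(1 / num_students, num_students)

# Hypothetical variant of emotion_difference_fun(): i is an argument, not a global
emotion_difference_i <- function(i, probs, emotion) {
  d <- abs(emotion[i] - emotion)   # distance from student i to every student q
  weight <- ifelse(d <= 0.5, 1, ifelse(d <= 1, 0.75, 0.5))
  weight * probs
}

# One column per student i, one row per candidate q
prob_matrix <- sapply(seq_len(num_students), emotion_difference_i,
                      probs = probs, emotion = student_emotion)
dim(prob_matrix)
```

Because nothing inside the function depends on a loop variable defined elsewhere, the same call can be reused for each classroom and each condition.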
You can use replicate to repeat the function emotion_difference_fun num_students times.
result <- replicate(num_students, emotion_difference_fun(probs))
You can also set simplify = FALSE to get the output as a list.
result <- replicate(num_students, emotion_difference_fun(probs),simplify = FALSE)
I am trying to write a function to calculate the h-point. The function is defined over a rank-frequency data frame.
Consider the following data.frame:
DATA <-data.frame(frequency=c(49,48,46,38,29,24,23,22,15,12,12,10,10,9,9), rank=c(seq(1, 15)))
and the formula for the h-point is:
if there is an r with r = f(r): h-point = r
otherwise: h-point = (f(i)*j - f(j)*i) / (j - i + f(i) - f(j))
where f(i) and f(j) are the frequencies for the ith and jth ranks, and i and j are adjacent ranks such that i < f(i) and j > f(j).
Now, I have tried the following code:
fr <-function(x){d <-DATA$frequency[x]
return(d)}
for (i in 1:length(DATA$rank)) {
j <- i+1
if (i==fr(i))
return(i)
else(i<fr(i) && j>fr(j)) {
s <-fr(i)*j-fr(j)*i/j-i+fr(i)-fr(j)
return(s)
}}
I also tried:
for (i in 1:length(DATA$rank)) {
j <- i+1
if (i==fr(i))
return(i)
if (i<fr(i) while(j>fr(j))) {
s <-fr(i)*j-fr(j)*i/j-i+fr(i)-fr(j)
return(s)
}}
Neither of them works. For DATA, the desired result would be i=11 and j=12, so:
h-point = (12×12 - 10×11) / (12 - 11 + 12 - 10)
Can you please tell me what I'm doing wrong here?
You could do:
h_point <- function(data){
  x <- seq(nrow(data))
  f_x <- data[["frequency"]][x]
  # ranks where r == f(r), if any
  h <- which(x == f_x)
  if(length(h) >= 1) h
  else{
    i <- which(x<f_x)
    j <- which(x>f_x)
    # keep only the adjacent pair, i.e. j == i + 1
    s <- which(outer(i,j,"-") == -1, TRUE)
    i <- i[s[,1]]
    j <- j[s[,2]]
    cat("i: ",i, "j: ", j,"\n")
    # numerator and denominator are each grouped as a whole
    (f_x[i]*j - f_x[j]*i) / (j-i + f_x[i]-f_x[j])
  }
}
h_point(DATA)
i: 11 j: 12
[1] 11.33333
I think I have figured out what you are trying to achieve. My loop will go through DATA and break at any point where rank == frequency for a given row. It might be more prudent to test this explicitly with DATA$rank[i] == fr(i) rather than relying on i, in case of tied ranks etc.
The second if statement calculates h-point (s) for rows i and j if row i has rank that is lower than freq and row j has a rank that is higher.
Is this what you wanted?
DATA <-data.frame(frequency=c(49,48,46,38,29,24,23,22,15,12,12,10,10,9,9), rank=c(seq(1, 15)))
fr <-function(x){d <-DATA$frequency[x]
return(d)}
for(i in 1:nrow(DATA)){
j <- i+1
if (i==fr(i)){
s <- list(ij=c(i=i,j=j), h=i)
break
}else if(i <fr(i) && j>fr(j)){
# assumes f(i)*j - f(j)*i is the whole numerator and j-i+f(i)-f(j) the whole denominator
s <-list(ij=c(i=i,j=j),h=(fr(i)*j-fr(j)*i)/(j-i+fr(i)-fr(j)))
}}
I am not sure the formula is correct: in your loop you had j-i, but in the explanation it was i-j. I am also not sure whether the entire j-i+fr(i)-fr(j) is the denominator, and similarly for the numerator. These are simple fixes.
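One way to check the grouping, assuming the h-point is meant to be the place where the straight line joining the two adjacent ranks crosses r = f(r) (that reading is my assumption, not something stated in the question): compute it both ways for i = 11 and j = 12 and compare. The names h_formula and h_line are made up for the check.

```r
freq <- c(49, 48, 46, 38, 29, 24, 23, 22, 15, 12, 12, 10, 10, 9, 9)
i <- 11
j <- 12

# Formula with the whole numerator and the whole denominator grouped
h_formula <- (freq[i] * j - freq[j] * i) / (j - i + freq[i] - freq[j])

# Cross-check: the line through (i, f(i)) and (j, f(j)) intersected with y = x
slope <- (freq[j] - freq[i]) / (j - i)
h_line <- (freq[i] - slope * i) / (1 - slope)

c(h_formula, h_line)
isTRUE(all.equal(h_formula, h_line))
```

Both give 34/3, which lies between ranks 11 and 12, so the grouped version with j - i in the denominator looks like the intended one.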
I was wondering about how to find the smallest circumcircle of an irregular polygon. I've worked with spatial polygons in R.
I want to reproduce some of the fragstats metrics in vector mode because I had a hard time with the package 'landscapemetrics' on a huge amount of data. Specifically, I would like to implement the CIRCLE metric (http://www.umass.edu/landeco/research/fragstats/documents/Metrics/Shape%20Metrics/Metrics/P11%20-%20CIRCLE.htm). So far, I have not been able to find a formula or script for the smallest circumcircle.
All your comments are more than welcome.
Thank you
As I mentioned in a comment, I don't know of existing R code for this, but a brute force search should be fast enough if you don't have too many points that need to be in the circle. I just wrote this one. The center() function is based on code from Wikipedia for drawing a circle around a triangle; circumcircle() is the function you want, found by brute force search through all circles that pass through 2 or 3 points in the set. On my laptop it takes about 4 seconds to handle 100 points. If you have somewhat bigger sets, you can probably get tolerable results by translating to C++, but it's an n^4 growth rate, so you'll need a better solution for a really large set.
center <- function(D) {
if (NROW(D) == 0)
matrix(numeric(), ncol = 2)
else if (NROW(D) == 1)
D
else if (NROW(D) == 2) {
(D[1,] + D[2,])/2
} else if (NROW(D) == 3) {
B <- D[2,] - D[1,]
C <- D[3,] - D[1,]
Dprime <- 2*(B[1]*C[2] - B[2]*C[1])
if (Dprime == 0) {
drop <- which.max(c(sum((B-C)^2), sum(C^2), sum(B^2)))
center(D[-drop,])
} else
c((C[2]*sum(B^2) - B[2]*sum(C^2))/Dprime,
(B[1]*sum(C^2) - C[1]*sum(B^2))/Dprime) + D[1,]
} else
center(circumcircle(D))
}
radius <- function(D, U = center(D))
sqrt(sum((D[1,] - U)^2))
circumcircle <- function(P) {
n <- NROW(P)
if (n < 3)
return(P)
P <- P[sample(n),]
bestset <- NULL
bestrsq <- Inf
# Brute force search
for (i in 1:(n-1)) {
for (j in (i+1):n) {
D <- P[c(i,j),]
U <- center(D)
rsq <- sum((D[1,] - U)^2)
if (rsq >= bestrsq)
next
failed <- FALSE
for (k in (1:n)[-j][-i]) {
Pk <- P[k,,drop = FALSE]
if (sum((Pk - U)^2) > rsq) {
failed <- TRUE
break
}
}
if (!failed) {
bestset <- c(i,j)
bestrsq <- rsq
}
}
}
# Look for the best 3 point set
for (i in 1:(n-2)) {
for (j in (i+1):(n-1)) {
for (l in (j+1):n) {
D <- P[c(i,j,l),]
U <- center(D)
rsq <- sum((D[1,] - U)^2)
if (rsq >= bestrsq)
next
failed <- FALSE
for (k in (1:n)[-l][-j][-i]) {
Pk <- P[k,,drop = FALSE]
if (sum((Pk - U)^2) > rsq) {
failed <- TRUE
break
}
}
if (!failed) {
bestset <- c(i,j,l)
bestrsq <- rsq
}
}
}
}
P[bestset,]
}
showP <- function(P, ...) {
plot(P, asp = 1, type = "n", ...)
text(P, labels = seq_len(nrow(P)))
}
showD <- function(D) {
U <- center(D)
r <- radius(D, U)
theta <- seq(0, 2*pi, len = 100)
lines(U[1] + r*cos(theta), U[2] + r*sin(theta))
}
n <- 100
P <- cbind(rnorm(n), rnorm(n))
D <- circumcircle(P)
showP(P)
showD(D)
This shows the output (a plot of the numbered points with the circle drawn around them).
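As a quick sanity check of the three-point formula used in center() (not part of the search itself), here it is applied to a right triangle, where the circumcentre should be the midpoint of the hypotenuse. The triangle tri is made up for the illustration.

```r
# Right triangle with legs 4 and 3; the hypotenuse runs from (4, 0) to (0, 3)
tri <- rbind(c(0, 0), c(4, 0), c(0, 3))

B <- tri[2, ] - tri[1, ]
C <- tri[3, ] - tri[1, ]
Dprime <- 2 * (B[1] * C[2] - B[2] * C[1])

# Same expression as the three-point branch of center()
U <- c((C[2] * sum(B^2) - B[2] * sum(C^2)) / Dprime,
       (B[1] * sum(C^2) - C[1] * sum(B^2)) / Dprime) + tri[1, ]
U                                        # 2.0 1.5

# All three vertices should be at the same distance from U
dists <- sqrt(rowSums(sweep(tri, 2, U)^2))
dists
```

The same kind of distance check can be run on the result of circumcircle() to confirm that no point lies outside the returned circle.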
So I've created a while loop to give a coordinate point, Xm and Ym, which follows a uniform distribution, where the point must be within a circle of radius 25. Below is the code for that:
outcome<-function()
{
done=0
while(done==0){
Xm<-runif(1,-25,25)
Ym<-runif(1,-25,25)
if (Xm^2+Ym^2<=25^2){
z<-c(Xm,Ym)
done=1
}
}
z
}
outcome()
I now need to do the same thing, 250 times. I've changed the code to this:
plotoutcome<-function()
{
done=0
while(done==0){
Xm2<-runif(250,-25,25)
Ym2<-runif(250,-25,25)
if (Xm2^2+Ym2^2<=25^2){
z<-c(Xm2,Ym2)
done=1
}
}
z
}
plotoutcome()
However when I run the second code, I get this error message: In if (Xm2^2 + Ym2^2 <= 25^2) { :
the condition has length > 1 and only the first element will be used.
Any ideas on how to fix this?
The simplest way is to take advantage of outcome:
replicate(250, outcome())
But if a new function is needed, here is a plotoutcome function:
plotoutcome<-function()
{
total <- 0
done <- FALSE
Xtmp <- numeric(250)
Ytmp <- numeric(250)
while(!done){
Xm2 <- runif(1, -25, 25)
Ym2 <- runif(1, -25, 25)
i <- Xm2^2 + Ym2^2 <= 25^2
if(i){
total <- total + 1
Xtmp[total] <- Xm2
Ytmp[total] <- Ym2
}
done <- total == 250
}
list(X = Xtmp, Y = Ytmp)
}
do.call(cbind, plotoutcome())
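If speed matters, a batch version of the same rejection idea is also possible. This is only a sketch: sample_in_circle and its radius argument are names I made up, and it returns a 250 x 2 matrix rather than a list.

```r
# Hypothetical helper: draw candidates in batches and keep those inside the circle
sample_in_circle <- function(n, radius = 25) {
  pts <- matrix(numeric(0), ncol = 2)
  while (nrow(pts) < n) {
    cand <- cbind(runif(n, -radius, radius), runif(n, -radius, radius))
    inside <- rowSums(cand^2) <= radius^2   # one TRUE/FALSE per candidate
    pts <- rbind(pts, cand[inside, , drop = FALSE])
  }
  pts[seq_len(n), , drop = FALSE]           # trim any surplus points
}

set.seed(42)  # only so the sketch is reproducible
pts <- sample_in_circle(250)
dim(pts)
```

The comparison here is used for subsetting, which accepts a whole logical vector, whereas if() needs a single TRUE or FALSE; that is what the message in the question is complaining about.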
I am working with a time-series raster brick. The brick has 365 layers representing a value for each day of the year.
I want to create a new layer in which each cell holds the number of day of year in which a certain condition is met.
My current approach is the following (APHRO being the raster brick), but returns the error message below:
r <- raster(ncol=40, nrow=20)
r[] <- rnorm(n=ncell(r))
APHRO <- brick(x=c(r, r*2, r))
NewLayer <- calc(APHRO, fun=FindOnsetDate(APHRO))
Returning this error:
Error in .local(x, ...) : not a valid subset
And the function being passed:
FindOnsetDate <- function (s) {
x=0
repeat {
x+1
if(s[[x]] >= 20 | s[[x]] + s[[x+1]] >= 20 & ChkFalseOnset() == FALSE)
{break}
}
return(x);
}
With the function for the 3rd condition being:
ChkFalseOnset <- function (x) {
for (i in 0:13){
if (sum(APHRO[[x+i:x+i+7]]) >= 5)
{return(FALSE); break}
return(TRUE)
}
}
Thank you in advance!!!!
And please let me know if I should provide more information - tried to keep it parsimonious.
The problem is that your function does not run on its own: x+1 never changes x, so the first lookup is s[[0]]:
FindOnsetDate <- function (s) {
x=0
repeat {
x+1
if(s[[x]] >= 20 | s[[x]] + s[[x+1]] >= 20)
{break}
}
return(x);
}
FindOnsetDate(1:100)
#Error in s[[x]] :
# attempt to select less than one element in get1index <real>
Perhaps something like this:
FindOnsetDate <- function (s) {
j <- s + c(s[-1], 0)
sum(j > 20 | s > 20)
# if all values are positive, just do sum(j > 20)
}
FindOnsetDate(1:20)
#10
This works now:
r <- calc(APHRO, FindOnsetDate)
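If what is wanted is the first day on which the condition holds rather than the number of such days (that is my reading of the name FindOnsetDate, so treat it as an assumption), a variant along these lines might do. The name first_onset_day and the threshold argument are made up, the false-onset check is left out, and I have only tried it on plain vectors, not on a brick.

```r
# Hypothetical helper; s is the vector of daily values for one cell
first_onset_day <- function(s, threshold = 20) {
  two_day <- s + c(s[-1], 0)    # value on day x plus the value on day x + 1
  hit <- which(s >= threshold | two_day >= threshold)
  if (length(hit) == 0) NA_integer_ else hit[1]
}

first_onset_day(1:20)           # 10
first_onset_day(rep(0, 5))      # NA: the condition is never met
```

Since it takes a vector and returns a single value, it should be usable with calc() in the same way as the function above.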
I would suggest a basic two-step process. With a 365-day example:
set.seed(123)
r <- raster(ncol=40, nrow=20)
r_list <- list()
for(i in 1:365){
r_list[[i]] <- setValues(r,rnorm(n=ncell(r),mean = 10,sd = 5))
}
APHRO <- brick(r_list)
Use a basic logic test for each iteration:
r_list2 <- list()
for(i in 1:365){
if(i != 365){
r_list2[[i]] <- APHRO[[i]] >= 20 | APHRO[[i]] + APHRO[[i+1]] >= 20
}else{
r_list2[[i]] <- APHRO[[i]] >= 20
}
}
Compute sum by year:
NewLayer <- calc(brick(r_list2), fun=sum)
plot(NewLayer)