For loop over selected rows - r

I am new to R (and to programming in general). I want to run a for loop over selected rows of a matrix, say rows 3, 5, 6 and 8. I know how to do it for a continuous range. How can I do it for an arbitrary set of rows?

Try this:
my_mat <- matrix(1:20, ncol = 2)
my_seq <- c(3, 5, 6, 8)
for (i in my_seq) {
  print(my_mat[i, ])
}
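If the loop body only needs the rows themselves, the same selection can also be made by indexing with the vector of row numbers. A minimal sketch, reusing my_mat and my_seq from the answer above (selected and row_sums are names introduced here purely for illustration):

```r
my_mat <- matrix(1:20, ncol = 2)
my_seq <- c(3, 5, 6, 8)

# Index by the vector of row numbers to pull all selected rows at once
selected <- my_mat[my_seq, ]

# apply() with MARGIN = 1 then runs a function on each selected row
row_sums <- apply(selected, 1, sum)
row_sums
```

Whether this is preferable to the explicit loop depends on what the loop body does; for side effects such as printing, the for loop is perfectly fine.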

Related

Failure in Calling a Function in R

I'm trying to create a function that compares two matrices. It should compare the elements of both matrices at each position and return "greater than", "equal to" or "less than". Below is the code I have right now. However, when I call the function, R does not return anything, not even an error message. I'm wondering why that is the case. Any suggestions would be helpful. Thanks.
fxn <- function(x, y) {
  emptymatrix <- matrix( , nrow = dim(x)[1], ncol = dim(x)[2])
  for (i in 1:dim(emptymatrix)[1]) {
    for (j in 1:dim(emptymatrix)[2]) {
      if (x[i, j] < y[i, j]) {
        emptymatrix[i, j] <- "Less Than"
      } else if (x[i, j] == y[i, j]) {
        emptymatrix[i, j] <- "Equal to"
      } else {
        emptymatrix[i, j] <- "Greater than"
      }
    }
  }
}
#trying to test the function
vecc1 <- c(1, 2, 3, 4, 5, 6, 7, 8, 9)
vecc2 <- c(4, 5, 2, 3, 1, 1, 8, 9, 10)
matrix1 <- matrix(vecc1, nrow = 3, byrow = T)
matrix2 <- matrix (vecc2, nrow=3, byrow = T)
fxn(matrix1, matrix2)
As SamR pointed out in his comment, your function doesn't return anything because its last expression is the for loop rather than the result object. He is also right that the loop is unnecessary: R is designed around vectors and matrices, so it can do a lot of this work for you under the hood. This is a good example of some of R's design principles. First, we don't need a for loop, because the comparisons ==, < and > are vectorized and are evaluated on all elements at once. The result of each comparison is a matrix of the same dimensions containing TRUE / FALSE. We can use that logical matrix to index the new matrix at every TRUE position. Then we just assign a single string, "Equal", "Less Than" or "Greater Than", which is recycled to fill all of the selected positions.
vecc1 <- c(1, 2, 3, 4, 5, 6, 7, 8, 9)
vecc2 <- c(4, 5, 2, 3, 1, 1, 8, 9, 10)
matrix1 <- matrix(vecc1, nrow = 3, byrow = T)
matrix2 <- matrix (vecc2, nrow=3, byrow = T)
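# run this to see how the comparison works
matrix1 == matrix2
foo <- function(x, y) {
  # result matrix with the same dimensions as x
  m_new <- matrix(NA, nrow = nrow(x), ncol = ncol(x))
  m_new[x == y] <- "Equal"
  m_new[x < y] <- "Less Than"
  m_new[x > y] <- "Greater Than"
  m_new # the last evaluated expression is returned
  # return(m_new) also works; it only adds a negligible extra function call
}
foo(matrix1, matrix2)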
You missed returning emptymatrix from your function.
In R, the value of the last expression evaluated in a function is returned automatically. In the original function, the last expression was the for loop, whose value is NULL. It was returned, marked "invisible", so it didn't print.
The usual convention in R is to put the name of the object you want to return on the last line when it isn't already the last value produced. So just add one line at the end of your function, containing emptymatrix.
You can also call return(emptymatrix), but that is marginally less efficient because of the extra function call.
And if you like returning things invisibly, as for loops do, you can call invisible(emptymatrix) as the last line. Then it won't automatically print, but you can still assign it to another variable.
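Putting that advice together, a corrected version of the loop-based function might look like the sketch below. The names compare_matrices, result and out are illustrative, not from the original question; the only substantive change is the final line that names the result.

```r
compare_matrices <- function(x, y) {
  # character matrix with the same shape as x
  result <- matrix(NA_character_, nrow = nrow(x), ncol = ncol(x))
  for (i in seq_len(nrow(x))) {
    for (j in seq_len(ncol(x))) {
      if (x[i, j] < y[i, j]) {
        result[i, j] <- "Less Than"
      } else if (x[i, j] == y[i, j]) {
        result[i, j] <- "Equal to"
      } else {
        result[i, j] <- "Greater than"
      }
    }
  }
  result # last expression: this is what the function returns
}

matrix1 <- matrix(c(1, 2, 3, 4, 5, 6, 7, 8, 9), nrow = 3, byrow = TRUE)
matrix2 <- matrix(c(4, 5, 2, 3, 1, 1, 8, 9, 10), nrow = 3, byrow = TRUE)
out <- compare_matrices(matrix1, matrix2)
out
```

Replacing the last line with invisible(result) would keep the value assignable while suppressing the automatic printing.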

Apply an index command on a matrix of lists

I have a matrix that contains lists containing shortest path sequences of an igraph object.
I want to turn this matrix into an igraph.es (edge sequence).
sample:
library(igraph)
data <- data.frame(from =c(1, 2, 3, 4, 5, 1),
to =c(4, 3, 4, 5, 6, 5),
weight=c(0.2,0.1,0.5,0.7,0.8,0.2))
g <- graph.data.frame(data, directed=FALSE)
sp <- sapply(data, function(x){shortest_paths(g, from = x, to = V(g)[x],output = "epath")})
sp is now a matrix. We can subset it with indexing:
x<-sp[[2]][[2]]
This makes x an igraph edge sequence (class igraph.es).
I'm looking for an apply command to turn all path_sequences of sp into edge_sequences. Thank you in advance.
EDIT:
I managed to unlist the first layer of the list.
sp<-flatten(sp)
So we just need a simple index.
Can I just use a for loop now?
Something like:
for(i in sp){ result[i]<- sum(E(g)$weight[sp[[i]])}
Unfortunately, this doesn't give me the desired output.

For Loop in R to Transform Each Column Using Box Cox from the car Package

I am working on an assignment for school. I need to transform the columns in a data frame using a for loop and the bcPower function from the car package. My data frame, named bb2.df, consists of 13 columns of baseball statistics for 337 players. The data is from:
http://ww2.amstat.org/publications/jse/datasets/baseball.dat.txt
I read the data in using:
bb.df <- read.fwf("baseball.dat.txt",widths=c(4,6,6,4,4,3,3,3,4,4,4,3,3,2,2,2,2,19))
And then I created a second data frame just for the numeric stats using:
bb2.df <- bb.df[,1:13]
library(car)
Then I unsuccessfully tried to build the for loop.
> bb2.df[[i]] <- bcPower(bb2.df[[i]],c)
> for (i in 1:ncol(bb2.df)) {
+ c <- coef(powerTransform(bb2.df[[i]]))
+ bb2.df[[i]] <- bcPower(bb2.df[[i]],c)
+ }
Error in bc1(out[, j], lambda[j]) :
First argument must be strictly positive.
The loop seems to transform the first three columns but stops.
What am I doing wrong?
This solution tests whether a column appears to contain logical values and omits such columns from the transformation; replaces zero values in the vectors with a small number, outside the range of the actual values; and stores the transformed values in a new data frame, retaining the column and row names.
I have also tested all of the variables for normality before and after the transformation. I looked for an "interesting" variable: one whose transformed version has a large p-value in the Shapiro-Wilk test and whose p-value also changed a lot. Finally, the interesting variable is scaled in both the original and transformed versions, and the two versions are overlaid on a density plot.
library(car); library(ggplot2); library(reshape2)
# see this link for column names and type hints
# http://ww2.amstat.org/publications/jse/datasets/baseball.txt
# add placeholder column for opening quotation mark
bb.df <-
read.fwf(
"http://ww2.amstat.org/publications/jse/datasets/baseball.dat.txt",
widths = c(4, 6, 6, 4, 4, 3, 3, 3, 4, 4, 4, 3, 3, 2, 2, 2, 2, 2, 17)
)
# remove placeholder column
bb.df <- bb.df[,-(ncol(bb.df) - 1)]
names(bb.df) <- make.names(
c(
'Salary', 'Batting average', 'OBP', 'runs', 'hits', 'doubles', 'triples',
'home runs', 'RBI', 'walks', 'strike-outs', 'stolen bases', 'errors',
"free agency eligibility", "free agent in 1991/2" ,
"arbitration eligibility", "arbitration in 1991/2", 'name'
)
)
# test for boolean/logical values... don't try to transform them
logicals.test <- apply(
  bb.df,
  MARGIN = 2,
  FUN = function(one.col) {
    # non-numeric columns (e.g. name) become NA here, with a coercion warning
    asnumeric <- as.numeric(one.col)
    aslogical <- as.logical(asnumeric)
    renumeric <- as.numeric(aslogical)
    # a 0/1 column survives the round trip through logical unchanged
    matchflags <- renumeric == asnumeric
    cant.be.logical <- any(!matchflags)
    cant.be.logical
  }
)
logicals.test[is.na(logicals.test)] <- FALSE
probably.numeric <- bb.df[, logicals.test]
result <- apply(probably.numeric, MARGIN = 2, function(one.col) {
  # can't transform vectors containing non-positive values
  # replace zeros with something small
  non.zero <- one.col[one.col > 0]
  small <- min(non.zero) / max(non.zero)
  zeroless <- one.col
  zeroless[zeroless == 0] <- small
  c <- coef(powerTransform(zeroless))
  transformation <- bcPower(zeroless, c)
  return(transformation)
})
result <- as.data.frame(result)
row.names(result) <- bb.df$name
cols2test <- names(result)
normal.before <- sapply(cols2test, function(one.col) {
print(one.col)
temp <- shapiro.test(bb.df[, one.col])
return(temp$p.value)
})
normal.after <- sapply(cols2test, function(one.col) {
print(one.col)
temp <- shapiro.test(result[, one.col])
return(temp$p.value)
})
more.normal <- cbind.data.frame(normal.before, normal.after)
more.normal$more.normal <-
more.normal$normal.after / more.normal$normal.before
more.normal$interest <-
more.normal$normal.after * more.normal$more.normal
interesting <-
rownames(more.normal)[which.max(more.normal$interest)]
data2plot <-
cbind.data.frame(bb.df[, interesting], result[, interesting])
names(data2plot) <- c("original", "transformed")
data2plot <- scale(data2plot)
data2plot <- melt(data2plot)
names(data2plot) <- c("Var1", "dataset", interesting)
ggplot(data2plot, aes(x = data2plot[, 3], fill = dataset)) +
geom_density(alpha = 0.25) + xlab(interesting)
Original, incomplete answer:
I believe you're trying to do illegal power transformations: some vectors include non-positive values (specifically zeros), and some have no variance.
The fact that you are copying bb.df into bb2.df and then overwriting it is a sign that you should really be using apply.
This doesn't create a useful data frame, but it should get you started:
library(car)
bb.df <-
read.fwf(
"baseball.dat.txt",
widths = c(4, 6, 6, 4, 4, 3, 3, 3, 4, 4, 4, 3, 3, 2, 2, 2, 2, 19)
)
bb.df[bb.df == 0] <- NA
# skip last (text) col
for (i in 1:(ncol(bb.df) - 1)) {
  print(i)
  # use comma to indicate indexing by column
  temp <- bb.df[, i]
  temp[temp == 0] <- NA
  temp <- temp[complete.cases(temp)]
  if (length(unique(temp)) > 1) {
    c <- coef(powerTransform(bb.df[, i]))
    print(bcPower(bb.df[i], c))
  } else {
    print(paste0("column ", i, " is invariant"))
  }
}
# apply solution
result <- apply(bb.df[, -ncol(bb.df)], MARGIN = 2, function(one.col) {
  temp <- one.col
  temp[temp == 0] <- NA
  temp <- temp[complete.cases(temp)]
  if (length(unique(temp)) > 1) {
    c <- coef(powerTransform(temp))
    transformation <- bcPower(temp, c)
    return(transformation)
  } else {
    print("skipping invariant column")
    return(NULL)
  }
})
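To see why zeros trigger the "strictly positive" error, it may help to write the transformation out by hand. The sketch below is a plain base R version of the one-parameter Box-Cox formula, not car's bcPower implementation; box_cox and the toy vectors are names made up for illustration.

```r
# Box-Cox: (x^lambda - 1) / lambda for lambda != 0, log(x) for lambda == 0
box_cox <- function(x, lambda) {
  if (any(x <= 0)) {
    stop("x must be strictly positive")
  }
  if (lambda == 0) log(x) else (x^lambda - 1) / lambda
}

positive <- c(1, 4, 9)
box_cox(positive, 0.5) # equals 2 * (sqrt(x) - 1)
box_cox(positive, 0)   # equals log(positive)

# a single zero is enough to make the call fail
with_zero <- c(0, 4, 9)
failed <- inherits(try(box_cox(with_zero, 0.5), silent = TRUE), "try-error")
failed
```

The restriction is not arbitrary: log(0) is -Inf, and negative values raised to a fractional power give NaN, so the data have to be shifted or cleaned before the transformation is applied.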

Creating a data frame in R which is an interaction of a row and subsequent rows of another data frame

This isn't the exact problem, but it's the general idea:
test<-data.frame(rbind(c(1,2,3), c(4, 5, 6), c(7, 8, 9), c(1, 2, 3)))
names(test)[1] <- "one"
names(test)[2] <- "two"
names(test)[3] <- "three"
So I basically want to create a new data frame containing one column, where each row is row(i) - row(i+1) of the test data frame,
such as:
test[1,1]-test[2,1]
test[2,1]-test[3,1]
test[3,1]-test[4,1]...
I've tried:
k<-nrow(test)
for(i in 1:k-1){
test2<-data.frame(test[i,1]-test[i+1,1])
}
but this just produces one value.
I'd also like to add a final row containing 0, so that there are k rows in total.
For your row-to-row subtraction, you need to loop through your rows and columns:
k <- nrow(test)
n <- ncol(test)
for (j in 1:n) {
  # parentheses matter: 1:k-1 is (1:k) - 1, i.e. 0:(k - 1)
  for (i in 1:(k - 1)) {
    test[i, n + j] <- test[i, j] - test[i + 1, j]
  }
}
To add a new row of zeros:
test[k + 1, ] <- 0
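For the single-column result the question asks for, a loop may not be needed at all. The following is only a sketch using base R's diff(), under the assumption that the goal is row(i) - row(i+1) for the first column with a trailing 0; test2 and diff_one are illustrative names.

```r
test <- data.frame(one = c(1, 4, 7, 1), two = c(2, 5, 8, 2), three = c(3, 6, 9, 3))

# diff() returns row(i + 1) - row(i), so negate it to get row(i) - row(i + 1);
# append 0 so the result has as many rows as test
test2 <- data.frame(diff_one = c(-diff(test$one), 0))
test2
```

This also avoids the original attempt's problem of overwriting test2 on every iteration.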

Extract an increasing subsequence

I wish to extract an increasing subsequence of a vector, starting from the first element. For example, from this vector:
a = c(2, 5, 4, 0, 1, 6, 8, 7)
...I'd like to return:
res = c(2, 5, 6, 8).
I thought I could use a loop, but I want to avoid it. Another attempt with sort:
a = c(2, 5, 4, 0, 1, 6, 8, 7)
ind = sort(a, index.return = TRUE)$ix
mat = (t(matrix(ind))[rep(1, length(ind)), ] - matrix(ind)[ , rep(1, length(ind))])
mat = ((mat*upper.tri(mat)) > 0) %*% rep(1, length(ind)) == (c(length(ind):1) - 1)
a[ind][mat]
Basically I sort the input vector and check if the indices verify the condition "no indices at the right hand side are lower" which means that there were no greater values beforehand.
But it seems a bit complicated and I wonder if there are easier/quicker solutions, or a pre-built function in R.
Thanks
One possibility would be to find the cumulative maxima of the vector, and then extract unique elements:
unique(cummax(a))
# [1] 2 5 6 8
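Looking at the intermediate result may make it clearer why this works. A short sketch with the same vector (running_max is a name introduced here for illustration):

```r
a <- c(2, 5, 4, 0, 1, 6, 8, 7)

# running maximum: each position holds the largest value seen so far
running_max <- cummax(a)
running_max
# [1] 2 5 5 5 5 6 8 8

# a new value appears in running_max only when an element exceeds
# everything before it, so the distinct values form the subsequence
res <- unique(running_max)
res
# [1] 2 5 6 8
```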
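The other answer is better, but I made this iterative function, which works as well. It repeatedly drops every element that is not greater than its predecessor until all consecutive differences are > 0.
increasing <- function(input_vec) {
  while (!all(diff(input_vec) > 0)) {
    # always keep the first element; keep the rest only where the step is positive
    input_vec <- input_vec[c(1, diff(input_vec)) > 0]
  }
  input_vec
}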
