I have multiple columns in a table called "Gr1","Gr2",...,"Gr10".
I want to convert their class from character to integer, and I want to do it dynamically. I'm trying this, but it doesn't work:
for (i in 1:10) {
Col <- paste0('Students1$Gr',i)
Col <- as.integer(Col)
}
My objective is to learn how to build a column name dynamically from the loop variable. Something like:
for (i in 1:10) {
Students1$Gr(i) <- as.integer(Students1$Gr(i))
}
Any idea is welcome.
Thank you very much,
Matias
# Example matrix
xm <- matrix(as.character(1:100), ncol = 10);
colnames(xm) <- paste0('Gr', 1:10);
# Example data frame
xd <- as.data.frame(xm, stringsAsFactors = FALSE);
# For matrices, this works
xm <- apply(X = xm, MARGIN = 2, FUN = as.integer);
# For data frames, this works
for (i in 1:10) {
xd[ , paste0('Gr', i)] <- as.integer(xd[ , paste0('Gr', i)]);
}
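The reason Students1$Gr(i) cannot work is that $ expects a literal column name, while [[ (like the [ , name] form above) accepts a name built at run time. Here is a minimal sketch of that idea; the contents of Students1 are made up for illustration, only the table and column names come from the question.

```r
# Hypothetical stand-in for the asker's Students1 table
Students1 <- as.data.frame(matrix(as.character(1:100), ncol = 10),
                           stringsAsFactors = FALSE)
names(Students1) <- paste0("Gr", 1:10)

# `[[` accepts a column name built at run time, unlike `$`
for (i in 1:10) {
  col <- paste0("Gr", i)
  Students1[[col]] <- as.integer(Students1[[col]])
}

# Loop-free alternative: convert all ten columns in one assignment
cols <- paste0("Gr", 1:10)
Students1[cols] <- lapply(Students1[cols], as.integer)
```

Either form should be enough on its own; the lapply version is just shorter when the list of columns is known up front.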
I have a for loop and a while loop, each of which produces data on every iteration.
I want to collect all of that data in one data frame, but I'm finding it difficult: only the last value created by each loop ends up in the result (as can be seen in the following picture: output code).
Here is the code. Please suggest how to fix it:
df = data.frame(matrix(nrow = 350, ncol = 12))
kol<-1
for (x in 1:350) {
output <- c(paste0(x))
df[,1] = output
}
while (kol <= 223) {
if(kol < 224){
rowd1 <- c(paste("gen ",kol))
}
df[,2] = rowd1
kol = kol+1
}#while
while (kol <= 446) {
if(kol < 447){
rowd2 <- c(paste("gen ",kol))
}
df[,3] = rowd2
kol = kol+1
}
colnames(df) <- c("Kromosom", "A","B","C","D","E","F","G","H","I","J","K")
df
So, I will update the question I posed.
What if the code becomes like this?
The problem: row problem
...
for (x in 1:350) {
output <- c(paste0(x))
df[x,1] = output
}
for (x2 in 1:223) {
output2 <- c(paste("Gen ",x2))
df[1,2:224] = output2
} # Why does only the value 223 come out, as in the output in the picture 'row problem', that is, "Gen 223"?
...
For a data.frame df, df[,n] is the entire n-th column. So you are setting the entire column at each step. In your code, use
df[x, 1] = output
for example, to set the value for a single row.
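To make the difference concrete, here is a small sketch on a 3-row data frame (the size and the variable names whole_column and single_cells are just for illustration):

```r
df <- data.frame(matrix(nrow = 3, ncol = 2))

# Assigning to df[, 1] inside a loop overwrites the whole column each time,
# so only the last value survives
for (x in 1:3) df[, 1] <- paste0(x)
whole_column <- df[, 1]

# Assigning to df[x, 2] touches one cell per iteration
for (x in 1:3) df[x, 2] <- paste0(x)
single_cells <- df[, 2]
```

After this runs, whole_column holds the last value repeated in every row, while single_cells holds a different value in each row.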
SOLVED
Special thanks to @JamesHirschorn, I really appreciate your help in resolving the problem. And thanks to everyone who gave feedback.
To Do
I would probably stay away from while here. Why not just use for instead? – GuedesBF
Use df[x, 1] = output to set the value for each row in the 1st column.
Code
df = data.frame(matrix(nrow = 350, ncol = 224))
for (x in 1:350) {
output <- c(paste(x))
df[x,1] = output
}
for (x2 in 2:224) {
output2 <- c(paste("Gen ",x2-1))
df[1,x2] = output2
}
colnames(df) <- c("Kromosom", "A","B","C","D","E","F","G","H","I","J","K", ...)
df
Output: here
So, wish me luck in the future.
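For what it's worth, both loops can probably be dropped, because data frame assignment accepts a whole vector at once. This is only a sketch that assumes the same 350 x 224 layout as the code above; it leaves the column names alone.

```r
df <- data.frame(matrix(nrow = 350, ncol = 224))

# One value per row of column 1
df[, 1] <- as.character(1:350)

# One value per column of row 1 (columns 2 to 224)
df[1, 2:224] <- paste("Gen ", 1:223)
```

The remaining cells stay NA, just as they do in the loop version.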
I want to perform an operation on each data frame in a list: the Kolmogorov–Smirnov (KS) test on one column of each data frame. I am using the code below, but it is not working:
PDF_mean <- matrix(nrow = length(siteNumber), ncol = 4)
PDF_mean <- data.frame(PDF_mean)
names(PDF_mean) <- c("station","normal","gamma","gev")
listDF <- mget(ls(pattern="DSF_moments_"))
length(listDF)
i <- 1
for (i in length(listDF)) {
PDF_mean$station[i] <- siteNumber[i]
PDF_mean$normal[i] <- ks.test(list[i]$mean,"pnorm")$p.value
PDF_mean$gev[i] <- ks.test(list[i]$mean,"pgev")$p.value
PDF_mean$gamma[i] <- ks.test(list[i]$mean,"gamma")$p.value
}
Any help?
The loop should not iterate over length(listDF); use seq_along(listDF) or 1:length(listDF) instead (seq_along is the safer choice). length(listDF) is a single value, so the loop body runs only once, for the last index. Also index the list with listDF[[i]] rather than list[i]:
for(i in seq_along(listDF)) {
PDF_mean$station[i] <- listDF[[i]]$siteNumber
PDF_mean$normal[i] <- ks.test(listDF[[i]]$mean,"pnorm")$p.value
PDF_mean$gev[i] <- ks.test(listDF[[i]]$mean,"pgev")$p.value
PDF_mean$gamma[i] <- ks.test(listDF[[i]]$mean,"gamma")$p.value
}
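A quick way to see the difference between the two loop headers is to record which indices each one visits. The list below is made-up data (three data frames with a numeric mean column), and visited_wrong, visited_right and p_normal are names chosen only for this sketch. It tests against "pnorm" only: as far as I can tell, the string passed to ks.test has to name a cumulative distribution function, and "gamma" names the gamma function rather than a CDF (the gamma CDF is pgamma, which also needs a shape argument), so that line may need another look.

```r
set.seed(1)
# Hypothetical list of data frames, each with a numeric `mean` column
listDF <- list(a = data.frame(mean = rnorm(50)),
               b = data.frame(mean = rnorm(50)),
               c = data.frame(mean = rnorm(50)))

# `for (i in length(listDF))` visits only the single value 3
visited_wrong <- c()
for (i in length(listDF)) visited_wrong <- c(visited_wrong, i)

# `seq_along(listDF)` visits 1, 2, 3
visited_right <- c()
for (i in seq_along(listDF)) visited_right <- c(visited_right, i)

# Loop-free version: one p-value per data frame
p_normal <- sapply(listDF, function(d) ks.test(d$mean, "pnorm")$p.value)
```

The sapply form returns a named vector with one p-value per list element, which can then be assigned to a column of the result table in one step.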
What I want is to create 60 data frames with 500 rows in each. I tried the below code and, while I get no errors, I am not getting the data frames. However, when I do a View on the as.data.frame, I get the view, but no data frame in my environment. I've been trying for three days with various versions of this code:
getDS <- function(x){
for(i in 1:3){
for(j in 1:30000){
ID_i <- data.table(x$ID[j: (j+500)])
}
}
as.data.frame(ID_i)
}
getDS(DATASETNAME)
We can use outer (on a small example)
out1 <- c(outer(1:3, 1:3, Vectorize(function(i, j) list(x$ID[j:(j + 5)]))))
lapply(out1, as.data.table)
--
The issue in the OP's function is that ID_i is overwritten on every iteration inside the loop, i.e. no result is stored. To keep the results, we can initialize a list and store each one in it:
getDS <- function(x) {
ID_i <- vector('list', 3)
for(i in 1:3) {
for(j in 1:3) {
ID_i[[i]][[j]] <- data.table(x$ID[j:(j + 5)])
}
}
ID_i
}
do.call(c, getDS(x))
data
x <- data.table(ID = 1:50)
I'm not sure the description matches the code, so I'm a little unsure what the desired result is. That said, it is usually not helpful to split a data.table because the built-in by-processing makes it unnecessary. If for some reason you do want to split into a list of data.tables you might consider something along the lines of
getDS <- function(x, n=5, size = nrow(x)/n, column = "ID", reps = 3) {
x <- x[1:(n*size), ..column]
index <- rep(1:n, each = size)
replicate(reps, split(x, index),
simplify = FALSE)
}
getDS(data.table(ID = 1:20), n = 5)
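If the goal really is 60 consecutive pieces of 500 rows each, a plain base R split may be enough. This is a sketch under that assumption: the 30000-row input and the names chunk_size, chunk_id and pieces are invented for illustration.

```r
# Hypothetical data: 30000 rows, to be cut into 60 pieces of 500 rows
x <- data.frame(ID = 1:30000)

chunk_size <- 500
# Rows 1-500 get id 1, rows 501-1000 get id 2, and so on
chunk_id <- ceiling(seq_len(nrow(x)) / chunk_size)

# Named list of data frames: "1", "2", ..., "60"
pieces <- split(x, chunk_id)
```

The pieces stay together in one list, which is usually easier to work with than 60 separate objects in the environment.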
I thought that the following problem must have been answered or a function must exist to do it, but I was unable to find an answer.
I have a nested loop that takes a row from one 3-col. data frame and copies it next to each of the other rows, to form a 6-col. data frame (with all possible combinations). This works fine, but with a medium sized data set (800 rows), the loops take forever to complete the task.
I will demonstrate on a sample data set:
Sdat <- data.frame(
x = c(10,20,30,40),
y = c(15,25,35,45),
ID =c(1,2,3,4)
)
compar <- data.frame(matrix(nrow=0, ncol=6)) # to contain all combinations
names(compar) <- c("x","y", "ID", "x","y", "ID")
N <- nrow(Sdat) # how many different points we have
for (i in 1:N)
{
for (j in 1:N)
{
Temp1 <- Sdat[i,] # data from 1st point
Temp2 <- Sdat[j,] # data from 2nd point
C <- cbind(Temp1, Temp2)
compar <- rbind(C,compar)
}
}
These loops provide exactly the output that I need for further analysis. Any suggestion for vectorizing this section?
You can do:
ind <- seq_len(nrow(Sdat))
grid <- expand.grid(ind, ind)
compar <- cbind(Sdat[grid[, 1], ], Sdat[grid[, 2], ])
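A slightly longer sketch of the same expand.grid idea, with the second set of columns renamed so the result has no duplicate column names. The _2 suffix and the names first and second are my own choice. Note that the row order will not necessarily match the nested-loop version, which prepends each new row.

```r
Sdat <- data.frame(x = c(10, 20, 30, 40),
                   y = c(15, 25, 35, 45),
                   ID = c(1, 2, 3, 4))

# All (first, second) index pairs; the first index varies fastest
ind <- seq_len(nrow(Sdat))
grid <- expand.grid(first = ind, second = ind)

first  <- Sdat[grid$first, ]
second <- Sdat[grid$second, ]
names(second) <- paste0(names(second), "_2")  # avoid duplicate column names

compar <- cbind(first, second)
rownames(compar) <- NULL
```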
A naive solution using rep (assuming you are happy with a data frame output):
# N is the number of rows, as defined in the question: N <- nrow(Sdat)
compar <- data.frame(x = rep(Sdat$x, each = N),
                     y = rep(Sdat$y, each = N),
                     id = rep(Sdat$ID, each = N),
                     x1 = rep(Sdat$x, N),
                     y1 = rep(Sdat$y, N),
                     id_1 = rep(Sdat$ID, N))
I am trying to generate a data frame based on a user-defined function. My problem is that in the output only the first row is being filled.
Here is an example of the function I am using:
df <- data.frame(cs=rep(c("T1","T2","T3","T4"),each=16),yr=rep(c(1:4), times = 4, each = 4))
sp.df <- data.frame(matrix(sample.int(100, size = 20*64,replace=T), ncol = 20, nrow = 64))
myfunc<-function(X, system, Title)
{
for(i in 1:4){
Col_T <- data.frame(matrix(NA, ncol = length(X), nrow = 4))
Col_T[i,] <- colSums(X[which(df$yr==i & df$cs==system),])
return (Col_T)}}
myfunc(X=sp.df, system="T1", Title="T1")
I would welcome any suggestion to resolve this issue.
Thank you very much.
There are two problems with the function:
1. You're overwriting Col_T with all NAs as the first statement inside the for loop.
2. You're returning from the function inside the for loop.
Rewrite it as follows:
myfunc <- function(X, system, Title ) {
Col_T <- data.frame(matrix(NA, ncol=length(X), nrow=4 ));
for (i in 1:4)
Col_T[i,] <- colSums(X[which(df$yr==i & df$cs==system),]);
return(Col_T);
};
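As an aside, the loop can probably be avoided altogether with rowsum, which sums rows by group. The sketch below rebuilds the example data from the question (with a seed added so it is reproducible) and compares the loop result with the grouped result; loop_sums, keep and grouped_sums are names made up for this example.

```r
set.seed(42)
df <- data.frame(cs = rep(c("T1", "T2", "T3", "T4"), each = 16),
                 yr = rep(1:4, times = 4, each = 4))
sp.df <- data.frame(matrix(sample.int(100, size = 20 * 64, replace = TRUE),
                           ncol = 20, nrow = 64))

# Loop version, allocating the result once before the loop
loop_sums <- data.frame(matrix(NA, ncol = length(sp.df), nrow = 4))
for (i in 1:4) loop_sums[i, ] <- colSums(sp.df[df$yr == i & df$cs == "T1", ])

# Loop-free version: sum the rows of one system, grouped by year
keep <- df$cs == "T1"
grouped_sums <- rowsum(sp.df[keep, ], group = df$yr[keep])
```

Both objects should hold the same 4 x 20 table of sums, one row per year.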