I want to apply the same operation, removing the first row, to several data frames.
lab1 <- lab1[-c(1),]
lab2 <- lab2[-c(1),]
lab3 <- lab3[-c(1),]
lab4 <- lab4[-c(1),]
lab5 <- lab5[-c(1),]
lab6 <- lab6[-c(1),]
lab7 <- lab7[-c(1),]
lab8 <- lab8[-c(1),]
lab9 <- lab9[-c(1),]
lab10 <- lab10[-c(1),]
...
I want to use a loop like this:
for(i in 2:19){ labi <- labi[-c(1),]}
However, labi is treated as a literal data frame name instead of being expanded to lab2, lab3, and so on.
I need to do this for many data frames. Can someone help?
Maybe you can try list2env, as below:
list2env(
lapply(mget(ls(pattern = "lab\\d+")), function(x) x[-1, ]),
envir = .GlobalEnv
)
How do you create the data frames? Would it be possible to build a list and use lab[[i]] instead?
Otherwise, you can write the expression as a string and then evaluate it, but this is hacky and best avoided:
cmd <- paste0("lab", 1, " <- lab", 1, "[-1, ]")  # builds "lab1 <- lab1[-1, ]"
eval(str2expression(cmd))
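The list-based alternative suggested above could look like the following. This is a minimal sketch, not the asker's actual data: the three small data frames and the list name labs are made up for illustration.

```r
# Hypothetical stand-ins for lab1, lab2, lab3, kept in one named list
labs <- list(
  lab1 = data.frame(id = 1:3, value = c(10, 20, 30)),
  lab2 = data.frame(id = 1:4, value = c(5, 6, 7, 8)),
  lab3 = data.frame(id = 1:2, value = c(0.1, 0.2))
)

# Drop the first row of every data frame in a single call;
# drop = FALSE keeps the result a data frame even with one column
labs <- lapply(labs, function(df) df[-1, , drop = FALSE])

# Row counts after the removal
sapply(labs, nrow)
```

Individual data frames are then reached as labs[["lab2"]] or labs$lab2, so no names have to be built from strings.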
Related
I would like to reference a data frame by a name stored in an object, such as:
dfName <- 'mydf1'
dfName <- data.frame(c(x = 5)) #want dfName to resolve to 'mydf1', not create a dataframe named 'dfName'
mydf1
Instead, I get: Error: object 'mydf1' not found
CORRECTED SCENARIO:
olddf <- data.frame(c(y = 8))
mydf1 <- data.frame(c(x = 5))
assign('dfName', mydf1)
dfName <- olddf #why isn't this the same as doing "mydf1 <- olddf"?
I don't want to reference an actual dataframe named "dfName", rather "mydf1".
UPDATE
I have found a clunky workaround for what I wanted to do. The code is:
olddf <- data.frame(x = 8)
olddfName <- 'olddf'
newdfName <- 'mydf1'
statement <- paste(newdfName, "<-", olddfName, sep = " ")
writeLines(statement, "mycode.R")
source("mycode.R")
Anyone have a more elegant way, especially without resorting to a write/source?
I am guessing you want to store multiple data frames in a loop or similar. In that case, it is more efficient and cleaner to store them in a named list. However, you can achieve your goal with assign:
assign('mydf1', data.frame(x = 5))
mydf1
x
1 5
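To connect this to the scenario in the question, here is a small sketch where the target name lives in a variable. It reuses olddf and newdfName from the question; the list dfs is a made-up name used to illustrate the named-list alternative.

```r
olddf <- data.frame(x = 8)
newdfName <- "mydf1"

# assign() creates (or overwrites) the object whose name is in the string
assign(newdfName, olddf)

# get() looks an object up by its name
get(newdfName)

# Named-list alternative: the name is simply a list key
dfs <- list()
dfs[[newdfName]] <- olddf
dfs[["mydf1"]]
```

With the list, no write/source round trip or string evaluation is needed, and all the data frames can be processed together with lapply.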
Imagine I have three dataframes:
data.frame1 <- data.frame(x=c(1:10))
data.frame2 <- data.frame(x=c(11:20))
data.frame3 <- data.frame(x=c(21:30))
I could bind them together by explicitly naming each of them:
res.data.frame <- cbind(data.frame1, data.frame2, data.frame3)
However, I am looking for more dynamic ways to do so, e.g. with placeholders.
This somehow saves the three data frames in a new data frame, but not in a usable format:
res.data.frame1 <- as.data.frame(mapply(get, grep("^data.frame.$", ls(), value=T)))
This command would only save the three names:
res.data.frame2 <- grep(pattern = "^data.frame.$", ls(), value=T)
This one only gives an error message:
res.data.frame3 <- do.call(cbind, lapply(ls(pattern = "^data.frame.$")), get)
Does anyone know the right way to do this?
Something like this maybe?
Assuming ls() returns only these objects:
# [1] "data.frame1" "data.frame2" "data.frame3"
as.data.frame(Reduce("cbind", sapply(ls(), function(i) get(i))))
Based on @akrun's comment, this can be simplified to
as.data.frame(Reduce("cbind", mget(ls())))
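The error from the third attempt in the question most likely comes from a misplaced parenthesis: lapply is closed before get, so get is handed to do.call instead of lapply. A possible variant is sketched below. It assumes the workspace may contain other objects, so the pattern is anchored and the dot is escaped; the names df_names and res are illustrative.

```r
# Data frames from the question
data.frame1 <- data.frame(x = 1:10)
data.frame2 <- data.frame(x = 11:20)
data.frame3 <- data.frame(x = 21:30)

# Anchor the pattern and escape the dot so only these objects match
df_names <- ls(pattern = "^data\\.frame\\d+$")

# mget() returns a named list of the objects; unname() stops do.call()
# from passing the object names on as argument names
res <- do.call(cbind, unname(mget(df_names)))

# All columns are called "x", so label each one after its source object
names(res) <- df_names
dim(res)
```

Using do.call(cbind, ...) keeps the result a data frame directly, so the as.data.frame wrapper is not needed here.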
I'm trying to do this simple task, all the variables are initialized properly, but for some reason this isn't working. What am I doing wrong?
for(i in 1:117)
{x = runif(1,0,1)
if(x<0.5)
testframe = rbind(utilities[i,])
else
trainframe = rbind(utilities[i,])}
In your loop, you overwrite both testframe and trainframe in each run of the loop. You could use testframe <- rbind(testframe, utilities[i, ]), but this would be quite inefficient.
Here's another approach without loops:
x <- sample(c(TRUE, FALSE), 117, replace = TRUE)
testframe <- utilities[x, ]
trainframe <- utilities[!x, ]
You can also create a list including the two subsets (based on vector x):
split(utilities, x)
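Since the utilities data is not shown, here is a sketch with a made-up stand-in (the columns id and usage are assumptions) showing how the pieces returned by split can be retrieved. split names the list elements after the values of x, here "FALSE" and "TRUE".

```r
set.seed(1)

# Hypothetical stand-in for the utilities data
utilities <- data.frame(id = 1:117, usage = rnorm(117))

# One random TRUE/FALSE flag per row
x <- sample(c(TRUE, FALSE), nrow(utilities), replace = TRUE)

# split() returns a list with one data frame per value of x
parts <- split(utilities, x)
testframe <- parts[["TRUE"]]
trainframe <- parts[["FALSE"]]

# Every row ends up in exactly one of the two subsets
nrow(testframe) + nrow(trainframe)
```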
If you insist on using a for loop, you can initialize an empty list for each outcome and add items to it:
Here's an untested example (as I do not have the "utilities" data):
testframe <- list()
trainframe <- list()
for (i in 1:117) {
  x <- runif(1, 0, 1)
  if (x < 0.5) {
    testframe[[i]] <- utilities[i, ]  ## whatever you want to save here
  } else {
    trainframe[[i]] <- utilities[i, ]
  }
}
Hope this helps
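A runnable version of the loop idea might look like this. It is only a sketch: utilities is replaced by a made-up data frame, and the collected rows are bound into data frames at the end with do.call(rbind, ...), a step the answer above does not show.

```r
set.seed(123)

# Hypothetical stand-in for the utilities data
utilities <- data.frame(id = 1:117, usage = runif(117))

test_rows <- list()
train_rows <- list()

for (i in seq_len(nrow(utilities))) {
  if (runif(1) < 0.5) {
    # Append to the end of the list so no empty slots are created
    test_rows[[length(test_rows) + 1]] <- utilities[i, ]
  } else {
    train_rows[[length(train_rows) + 1]] <- utilities[i, ]
  }
}

# Bind the collected one-row data frames once, after the loop
testframe <- do.call(rbind, test_rows)
trainframe <- do.call(rbind, train_rows)
```

Binding once at the end avoids calling rbind inside the loop, which copies the growing data frame on every iteration.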
This should be a simple one, I hope. I have several data frames loaded into the workspace, labelled df01 to df100, with not all numbers represented. I'd like to plot a specific column across all datasets, for example in a box plot. How do I refer to all objects starting with df, using globbing, i.e.:
boxplot(df00$col1, df02$col1, df04$col1)
=
boxplot(df*$col1)
The idiomatic approach is to work with lists, or to use a separate environment.
You can create this list using ls and pattern:
df.names <- ls(pattern = '^df')
# note
# ls(pattern ='^df[[:digit:]]{2,}')
# may be safer if there are objects starting with df you don't want
df.list <- mget(df.names)
# note if you are using a version of R prior to R 3.0.0
# you will need `envir = parent.frame()`
# mget(ls(pattern = 'df'), envir = parent.frame())
# use `lapply` to extract the relevant columns
df.col1 <- lapply(df.list, '[[', 'col1')
# call boxplot
boxplot(df.col1)
Try this:
nums <- sprintf("%02d", 0:100)
dfs.names <- Filter(exists, paste0("df", nums))
dfs.obj <- lapply(dfs.names, get)
dfs.col1 <- lapply(dfs.obj, `[[`, "col1")
do.call(boxplot, dfs.col1)
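To try either approach without the real data, a small self-contained sketch follows. The three data frames are invented (note the gap at df03), and plot = FALSE is used only so the example returns the box plot statistics without opening a graphics device; drop it to draw the figure.

```r
set.seed(1)

# Hypothetical data frames with a shared column col1
df01 <- data.frame(col1 = rnorm(20))
df02 <- data.frame(col1 = rnorm(20, mean = 1))
df04 <- data.frame(col1 = rnorm(20, mean = 2))

# Match only names of the form df followed by two or more digits
df_names <- ls(pattern = "^df[[:digit:]]{2,}$")

# Named list of the col1 vectors, one element per data frame
col1_list <- lapply(mget(df_names), `[[`, "col1")

# Compute the box plot statistics; the list names become the group labels
bp <- boxplot(col1_list, plot = FALSE)
bp$names
```

Because mget returns a named list, the boxes are labelled with the data frame names, which the unnamed list from lapply(dfs.names, get) does not provide.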
I have 9880 records in a data frame. I am trying to split it into 9 groups of 1000 records each, plus a last group of 880 records, and name them accordingly. I used a for loop for the first 9 groups and handled the last 880 records manually, but I am sure there are better ways to achieve this:
library(sqldf)
for (i in 0:8)
{
assign(paste("test",i,sep="_"),as.data.frame(final_9880[((1000*i)+1):(1000*(i+1)), (1:53)]))
}
test_9<- num_final_9880[9001:9880,1:53]
Also, I am unable to append all the parts in one for loop!
#append all parts
all_9880<-rbind(test_0,test_1,test_2,test_3,test_4,test_5,test_6,test_7,test_8,test_9)
Any help is appreciated, thanks!
A small variation on this solution:
parts <- split(final_9880, rep(0:9, each = 1000, length.out = 9880)) # edited to Roman's suggestion
for (i in 1:10) assign(paste("test", i, sep = "_"), parts[[i]])
Your binding command should work once it refers to test_1 through test_10, the names created here.
Edit
If you have many dataframes you can use a parse-eval combo. I use the package gsubfn for readability.
library(gsubfn)
nms <- paste("test", 1:10, sep="_", collapse=",")
eval(fn$parse(text='do.call(rbind, list($nms))'))
How does this work? First I create a string containing the comma-separated list of the dataframes
> paste("test", 1:10, sep="_", collapse=",")
[1] "test_1,test_2,test_3,test_4,test_5,test_6,test_7,test_8,test_9,test_10"
Then I use this string to construct the list
list(test_1,test_2,test_3,test_4,test_5,test_6,test_7,test_8,test_9,test_10)
using parse and eval with string interpolation.
eval(fn$parse(text='list($nms)'))
String interpolation is implemented via the fn$ prefix of parse; its effect is to intercept and substitute $nms with the string contained in the variable nms. Parsing and evaluating the string "list($nms)" creates the list needed. In the solution, the rbind is included in the parse-eval combo.
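For comparison, the same parse-eval idea can be sketched in base R, with sprintf standing in for the fn$ interpolation. This is an illustration under assumptions, not part of the gsubfn answer: test_1 and test_2 are tiny made-up data frames.

```r
# Hypothetical pieces to bind
test_1 <- data.frame(v = 1)
test_2 <- data.frame(v = 2)

# Comma-separated object names: "test_1,test_2"
nms <- paste("test", 1:2, sep = "_", collapse = ",")

# Build the command text, then parse and evaluate it
cmd <- sprintf("do.call(rbind, list(%s))", nms)
res <- eval(parse(text = cmd))
res$v
```

As with the gsubfn version, this evaluates code assembled from strings, so it is best reserved for cases where a list of data frames is not available.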
EDIT 2
You can collect all variables with a certain pattern, put them in a list and bind them by rows.
do.call("rbind", sapply(ls(pattern = "test_"), get, simplify = FALSE))
ls finds all variables with a pattern "test_"
sapply retrieves all those variables and stores them in a list
do.call passes the elements of that list to rbind, which binds them row-wise.
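One thing to watch with the ls-based approach: ls() returns names in sorted order, which typically places test_10 right after test_1, so the rows may not come back in their original sequence. A sketch that builds the names explicitly instead is shown below; the one-row pieces are made up for illustration.

```r
# Hypothetical pieces named test_1 ... test_10, one row each
for (i in 1:10) assign(paste0("test_", i), data.frame(part = i))

# Sorted as strings, not numerically
ls(pattern = "^test_\\d+$")

# Build the names in the intended order and fetch them with mget()
nms <- paste0("test_", 1:10)
combined <- do.call(rbind, mget(nms))
combined$part
```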
No for loop required -- use split
data <- data.frame(a = 1:9880, b = sample(letters, 9880, replace = TRUE))
splitter <- (data$a-1) %/% 1000
.list <- split(data, splitter)
lapply(0:9, function(i){
assign(paste('test',i,sep='_'), .list[[(i+1)]], envir = .GlobalEnv)
return(invisible())
})
all_9880<-rbind(test_0,test_1,test_2,test_3,test_4,test_5,test_6,test_7,test_8,test_9)
identical(all_9880,data)
## [1] TRUE
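If the separate test_0 to test_9 objects are not strictly required, the pieces can stay in the list that split returns, which removes the need for assign altogether. A minimal sketch, assuming a made-up one-column stand-in for final_9880:

```r
# Hypothetical stand-in for the 9880-row data frame
final_9880 <- data.frame(id = 1:9880)

# Group index 0..9: rows 1-1000 get 0, ..., rows 9001-9880 get 9
grp <- (seq_len(nrow(final_9880)) - 1) %/% 1000
parts <- split(final_9880, grp)
names(parts) <- paste0("test_", seq_along(parts) - 1)

# Nine pieces of 1000 rows and one of 880
sapply(parts, nrow)

# Recombine all pieces in one call; unname() keeps the list names
# from being prefixed to the row names
all_9880 <- do.call(rbind, unname(parts))
```

A single piece is then available as parts[["test_3"]] or parts$test_3.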