View data frames by pasting their names in R

Is there any way to View data frames in R while referring to them with another variable? Say I have 10 data frames named df1 to df10; is there a way I can View them while using i instead of 1:10?
Example:
df1 = as.data.frame(c(1:20))
i = 1
View(paste("df", i, sep =""))
I would like this last piece of code to do the same as View(df1). Is there any command or similar in R that allows you to do that?

The answer to your immediate question is get:
df1 <- data.frame(x = 1:5)
df2 <- data.frame(x = 6:10)
> get(paste0("df",1))
x
1 1
2 2
3 3
4 4
5 5
But having multiple similar objects with names like df1, df2, etc in your workspace is considered fairly bad practice in R, and instead experienced R folks will prefer to put related objects in a named list:
df_list <- setNames(list(df1,df2),paste0("df",1:2))
> df_list[[paste0("df",1)]]
x
1 1
2 2
3 3
4 4
5 5
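As a rough sketch of how the two ideas combine (the data frames and the index i below are made up for illustration): mget() fetches several objects by name into a named list in one call, and the list can then be indexed with a constructed name. head() stands in for View() here so the snippet also runs non-interactively.

```r
# Hypothetical example data following the df1, df2, ... naming from the question
df1 <- data.frame(x = 1:5)
df2 <- data.frame(x = 6:10)

# Build the object names and collect the objects into a named list
df_names <- paste0("df", 1:2)
df_list <- mget(df_names)

# Look up one element by a constructed name
i <- 2
selected <- df_list[[paste0("df", i)]]
head(selected)  # in an interactive session, View(selected) would open the viewer
```

Once the data frames live in a list, lapply(df_list, head) or a loop over names(df_list) covers all of them without building names by hand.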

Related

Rbind data frames with names in a list [duplicate]

I have an issue that I thought would be easy to solve, but I have not managed to find a solution.
I have a large number of data frames that I want to bind by rows. To avoid listing the names of all data frames, I used "paste0" to quickly create a vector of the data frame names. The problem is that I cannot get the rbind function to identify the data frames from this vector of names.
More explicitly:
df1 <- data.frame(x1 = sample(1:5,5), x2 = sample(1:5,5))
df2 <- data.frame(x1 = sample(1:5,5), x2 = sample(1:5,5))
idvec <- noquote(c(paste0("df",c(1,2))))
> [1] df1 df2
What I would like to get:
dftot <- rbind(df1,df2)
x1 x2
1 4 1
2 5 2
3 1 3
4 3 4
5 2 5
6 5 3
7 1 4
8 2 2
9 3 5
10 4 1
What I get instead:
dftot <- rbind(idvec)
> [,1] [,2]
> idvec "df1" "df2"
If there are multiple objects in the global environment with the pattern df followed by digits, one option is using ls to find all those objects with the pattern argument. Wrapping it with mget gets the values in the list, which we can rbind with do.call.
v1 <- ls(pattern='^df\\d+')
`row.names<-`(do.call(rbind,mget(v1)), NULL)
If we know the objects, another option is paste to create a vector of object names and then do as before.
v1 <- paste0('df', 1:2)
`row.names<-`(do.call(rbind,mget(v1)), NULL)
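A small sketch of why the row names are reset, assuming two made-up data frames with fixed values: binding a named list prefixes each row name with the list element's name, and assigning NULL restores the default 1..n numbering.

```r
# Hypothetical example data (fixed values so the result is reproducible)
df1 <- data.frame(x1 = 1:5, x2 = 5:1)
df2 <- data.frame(x1 = 6:10, x2 = 10:6)

# Fetch the objects by name and bind them by rows
v1 <- paste0("df", 1:2)
combined <- do.call(rbind, mget(v1))
rownames(combined)  # row names carry the list names as a prefix

# Reset to the default row names
rownames(combined) <- NULL
combined
```

The explicit rownames(combined) <- NULL assignment does the same job as the `row.names<-`(..., NULL) call above; it is only spelled out as a separate step.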
This should give the result:
dfcount <- 2
dftot <- df1 #initialise
for(n in 2:dfcount){dftot <- rbind(dftot, eval(as.name(paste0("df", as.character(n)))))}
eval(as.name(variable_name)) reads the data frames from strings matching their names.

How to conditionally combine data.frame objects in a list in a more elegant way?

I have a list of data.frames and want to combine specific elements conditionally: first merge the second and third data.frames without duplicate rows, then bind the result to the first data.frame. I used rbind for this, but my approach is not elegant. Can anyone help me improve the solution? How can I arrive at a more general solution that can be used in dynamic, functional-style code and still produces the desired output?
reproducible example:
dfList <- list(
DF.1 = data.frame(red=c(1,2,3), blue=c(NA,1,2), green=c(1,1,2)),
DF.2 = data.frame(red=c(2,3,NA), blue=c(1,2,3), green=c(1,2,4)),
DF.3 = data.frame(red=c(2,3,NA,NA), blue=c(1,2,NA,3), green=c(1,2,3,4))
)
dummy way to do it:
rbind(dfList[[1L]], unique(rbind(dfList[[2L]], dfList[[3L]])))
My attempt does not lend itself well to functional programming. How can I do this more elegantly?
desired output :
red blue green
1 1 NA 1
2 2 1 1
3 3 2 2
11 2 1 1
21 3 2 2
31 NA 3 4
6 NA NA 3
How can I make my solution more elegant and efficient? Thanks in advance.
The best (easiest and fastest) way to do this is data.table::rbindlist.
It would work like this:
library(data.table)
dfList <- list(
DF.1 = data.table(red=c(1,2,3), blue=c(NA,1,2), green=c(1,1,2)),
DF.2 = data.table(red=c(2,3,NA), blue=c(1,2,3), green=c(1,2,4)),
DF.3 = data.table(red=c(2,3,NA,NA), blue=c(1,2,NA,3), green=c(1,2,3,4))
)
# part 1: list element 1
dt_1 <- dfList[[1]]
# part 2: all other list elements (in your case 2 and 3)
dt_2 <- unique(rbindlist(dfList[-1]))
# use rbindlist to bind the rows together
dt_all <- rbindlist(list(dt_1, dt_2))
Comment.
My solution is pretty close to your proposed solution. I think the "ugliness" of this approach is that detaching the first element and treating it differently from the rest is an edge case when merging datasets. The best solution would probably be to step back, think about the underlying idea, and solve it with an additional variable in the datasets (i.e., one value for df1 and another for df2 and df3), which I would consider the R way.
Something along this thought would look like this:
myList2 <- list(
DF.1 = data.table(red=c(1,2,3), blue=c(NA,1,2), green=c(1,1,2), var = "df1"),
DF.2 = data.table(red=c(2,3,NA), blue=c(1,2,3), green=c(1,2,4), var = "other"),
DF.3 = data.table(red=c(2,3,NA,NA), blue=c(1,2,NA,3), green=c(1,2,3,4), var = "other")
)
dt <- rbindlist(myList2)
unique(dt)
# red blue green var
# 1: 1 NA 1 df1
# 2: 2 1 1 df1
# 3: 3 2 2 df1
# 4: 2 1 1 other
# 5: 3 2 2 other
# 6: NA 3 4 other
# 7: NA NA 3 other
A way of rbinding a list of data.frames with only base R is do.call(rbind, list) (see this question that also presents some alternatives).
If you then want only unique rows, you can follow up with unique:
unique(do.call(rbind, dfList))
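For the specific case in the question (keep the first element as it is, de-duplicate only the remaining ones), a base R sketch could look like the following. The helper name combine_keep_first is hypothetical and not part of any package.

```r
dfList <- list(
  DF.1 = data.frame(red = c(1, 2, 3), blue = c(NA, 1, 2), green = c(1, 1, 2)),
  DF.2 = data.frame(red = c(2, 3, NA), blue = c(1, 2, 3), green = c(1, 2, 4)),
  DF.3 = data.frame(red = c(2, 3, NA, NA), blue = c(1, 2, NA, 3), green = c(1, 2, 3, 4))
)

# Hypothetical helper: bind all elements after the first, drop duplicate rows,
# then put the untouched first element on top
combine_keep_first <- function(lst) {
  rest <- unique(do.call(rbind, lst[-1]))
  out <- rbind(lst[[1]], rest)
  rownames(out) <- NULL  # discard the prefixed row names
  out
}

result <- combine_keep_first(dfList)
result
```

Because the function takes the whole list, it works unchanged when more data frames are appended after the first one.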

How to change values in a column of a data frame based on conditions in another column?

I would like to have an equivalent of the Excel function "if". It seems basic enough, but I could not find relevant help.
I would like to assign "NA" to specific cells if two consecutive cells in a different column are not identical. In Excel, the command would be the following (say in C1): if(A1 = A2, B1, "NA"). I would then just need to fill it down the rest of the column.
But in R, I am stuck!
Here is an equivalent of my R code so far.
df = data.frame(Type = c("1","2","3","4","4","5"),
File = c("A","A","B","B","B","C"))
df
To get the following Type of each Type in another column, I found a useful function on StackOverflow that does the job.
# determines the following Type of each Type
shift <- function(x, n){
c(x[-(seq(n))], rep(6, n))
}
df$TypeFoll <- shift(df$Type, 1)
df
Now, I would like to keep TypeFoll in a specific row when the File for this row is identical to the File on the next row.
Here is what I tried. It failed!
for(i in 1:length(df$File)){
df$TypeFoll2 <- ifelse(df$File[i] == df$File[i+1], df$TypeFoll, "NA")
}
df
In the end, my data frame should look like:
aim = data.frame(Type = c("1","2","3","4","4","5"),
File = c("A","A","B","B","B","C"),
TypeFoll = c("2","3","4","4","5","6"),
TypeFoll2 = c("2","NA","4","4","NA","6"))
aim
Oh, and by the way, if someone would know how to easily put the columns TypeFoll and TypeFoll2 just after the column Type, it would be great!
Thanks in advance
I would do it as follows (not keeping the result from the shift function)
df = data.frame(Type = c("1","2","3","4","4","5"),
File = c("A","A","B","B","B","C"), stringsAsFactors = FALSE)
# This is your shift function
len=nrow(df)
A1 <- df$File[1:(len-1)]
A2 <- df$File[2:len]
# Why do you save the result of the shift function in the df?
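Then apply the equivalent of if(A1 = A2, B1, "NA"). As akrun mentioned, ifelse is vectorised. By the way, this is also how you append a column to a data.frame:
# df$Type[2:len] is the Type of the following row
df$TypeFoll2 <- c(ifelse(A1 == A2, df$Type[2:len], NA), 6) # Why 6?
Since 6 is hardcoded here, something like
# Type is stored as character, so convert it before adding 1
df$TypeFoll2 <- c(ifelse(A1 == A2, df$Type[2:len], NA), max(as.numeric(df$Type)) + 1)
is more generic.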
First off, 'for' loops are often slow in R compared with vectorised operations, so try to think of this as vector manipulation instead.
df = data.frame(Type = c("1","2","3","4","4","5"),
File = c("A","A","B","B","B","C"));
Create shifted types and files and put it in new columns:
df$TypeFoll = c(as.character(df$Type[2:nrow(df)]), "NA");
df$FileFoll = c(as.character(df$File[2:nrow(df)]), "NA");
Now, df looks like this:
> df
Type File TypeFoll FileFoll
1 1 A 2 A
2 2 A 3 B
3 3 B 4 B
4 4 B 4 B
5 4 B 5 C
6 5 C NA NA
Then, create TypeFoll2 by combining these:
df$TypeFoll2 = ifelse(df$File == df$FileFoll, df$TypeFoll, "NA");
And you should have something that looks a lot like what you want:
> df;
Type File TypeFoll FileFoll TypeFoll2
1 1 A 2 A 2
2 2 A 3 B NA
3 3 B 4 B 4
4 4 B 4 B 4
5 4 B 5 C NA
6 5 C NA NA NA
If you want to remove the FileFoll column:
df$FileFoll = NULL;
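The steps above can also be sketched with real NA values instead of the string "NA", plus the column reordering the question asked about. This is only one possible variant: the helper vectors next_type and next_file are names chosen here, and the last row ends up as NA rather than the placeholder 6 from the question.

```r
df <- data.frame(Type = c("1", "2", "3", "4", "4", "5"),
                 File = c("A", "A", "B", "B", "B", "C"),
                 stringsAsFactors = FALSE)

# Shift each column up by one position, padding the end with NA
next_type <- c(df$Type[-1], NA)
next_file <- c(df$File[-1], NA)

df$TypeFoll <- next_type
# Keep the following Type only where the next row has the same File
df$TypeFoll2 <- ifelse(df$File == next_file, next_type, NA)

# Put the new columns directly after Type by indexing with the desired order
df <- df[, c("Type", "TypeFoll", "TypeFoll2", "File")]
df
```

Using NA rather than "NA" means functions such as is.na() recognise the missing entries.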

Assigning one variable to another based on (and within) macro

I am using RStudio 0.98.1062.
What I am trying to do is, within a macro, create a new variable based on another one (which already has a suffix defined by me) in the same data frame. The name of the data frame and the index (suffix) are macro variables.
Here is my code:
read_data <- defmacro(fileName, monthIndex, dfName,
expr = {
dfName <- read.table(fileName, head=TRUE,sep = ",")
#add suffix to the variables for the corresponding month
colnames(dfName) <- paste(colnames(dfName),monthIndex, sep = "_")
#dfName["EasyClientMerge"]<-numeric()
within(dfName, assign("EasyClientMerge", paste("dfName$EasyClientNumber",monthIndex,sep="_"))
})
If the macro parameters are (..., monthIndex=6, dfName= m201309), I expect the following variable to be created:
m201309$EasyClientMerge<-m201309$EasyClient_6
First, no new variable is created within the data frame; second, it seems that the string "m201309$EasyClient_6" is used rather than a reference to the data frame and variable name.
Thanks a lot in advance, because I am kind of stuck!
If you really insist on producing hard coded data.frames within a function (in my opinion a bad choice), you can do it like so.
> dfName <- "new.df"
> assign(dfName, value = list(clientMerge = 1:10, clientMerge2 = 1:10))
> as.data.frame(new.df)
clientMerge clientMerge2
1 1 1
2 2 2
3 3 3
4 4 4
5 5 5
6 6 6
7 7 7
8 8 8
9 9 9
10 10 10
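An alternative that avoids assign and macros altogether is an ordinary function that returns the modified data frame and uses [[ ]] to refer to a column by a constructed name. The sketch below is an assumption about what the question is after: add_merge_column and the example data are made up, and the source column is taken to be EasyClientNumber as in the question's code.

```r
# Hypothetical helper: suffix all columns, then copy one suffixed column
add_merge_column <- function(df, monthIndex) {
  colnames(df) <- paste(colnames(df), monthIndex, sep = "_")
  source_col <- paste("EasyClientNumber", monthIndex, sep = "_")
  # [[ ]] accepts a column name held in a string, unlike $
  df[["EasyClientMerge"]] <- df[[source_col]]
  df
}

# Made-up data standing in for the result of read.table()
raw <- data.frame(EasyClientNumber = c(101, 102, 103), Amount = c(5, 7, 9))
m201309 <- add_merge_column(raw, 6)
names(m201309)
```

The caller chooses the name of the result by ordinary assignment, so the data frame name no longer has to be passed in as a parameter.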

Why is as.data.frame doing this in R?

First of all, I would like to say that I am new to R programming. I was experimenting with some R code and ran into behaviour that I did not expect. I hope someone can help me figure it out.
I ran the following code to read data from a CSV file:
normData= read.csv("normData.csv");
and my normData looks like:
But when I ran the following code to form a data frame:
datExpr0 = as.data.frame(t(normData));
I get the following data:
Can someone please tell me where the extra row (v1,v2,v3,v4,v5,v6) is coming from?
Try using:
setNames(as.data.frame(t(normData[-1])), normData[[1]])
However, it might be better to see if you can use the row.names argument in read.table to directly read your "X" as the row names. Then you should be able to directly use as.data.frame(t(...)).
Here's a small example to show what's happening:
Start with a data.frame with characters as the first column:
df <- data.frame(A = letters[1:3],
B = 1:3, C = 4:6)
df
# A B C
# 1 a 1 4
# 2 b 2 5
# 3 c 3 6
When you transpose the entire thing, you also transpose that first column (thereby also creating a character matrix).
as.data.frame(t(df))
# V1 V2 V3
# A a b c
# B 1 2 3
# C 4 5 6
So, we drop the column first, and use the values from the column to replace the "V1", "V2"... names.
setNames(as.data.frame(t(df[-1])), df[[1]])
# a b c
# B 1 2 3
# C 4 5 6
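The row.names route mentioned earlier can be sketched as follows. The CSV content is made up and passed through the text argument so the example does not depend on a file; with a real file, the file name would take its place.

```r
# Made-up CSV content whose first column holds the labels
csv_text <- "X,B,C\na,1,4\nb,2,5\nc,3,6"

# row.names = 1 turns the first column into row names instead of data
normData <- read.csv(text = csv_text, row.names = 1)

# With only numeric columns left, the transpose stays numeric
transposed <- as.data.frame(t(normData))
transposed
```

Since the label column never enters the matrix, t() does not coerce the numbers to character and no V1, V2, ... names appear.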
