My goal is to set the column names on a data frame. The name of this data frame is stored in a variable name_of_table.
name_of_table<-"table_13"
assign(name_of_table,read.csv("table_13_air_vehicle_risks_likelihood_and_cost_effects.csv", header=FALSE))
# This works fine, like table_13 <- read.csv(...)
first_level_header <- c("one","two","three","four","five")
colnames(get(name_of_table)) <- first_level_header
# Throws error:
#Error in colnames(get(name_of_table)) <- first_level_header :
# could not find function "get<-"
Obviously if I substitute table_13 for get(name_of_table) this works.
If instead I try:
colnames(names(name_of_table)) <- first_level_header
#Throws error: Error in `colnames<-`(`*tmp*`, value = c("one", "two", "three", "four", : attempt to set
#'colnames' on an object with less than two dimensions
I was pointed to this post earlier: R using get() inside colnames
But eval(parse(paste0("colnames(",name_of_table,")<- first_level_header"))), besides being hideous, does not work either: Error in file(filename, "r") : cannot open the connection
I don't understand the suggestion involving SetNames.
I apologize if get/assign is not the right approach. Of course I want to do this the "right" way, and I appreciate the guidance.
You could use library(data.table)
table_13 = data.table(1:5, 1:5, 1:5, 1:5, 1:5)
setnames(get(name_of_table), first_level_header) # N.B. also works for a data.frame
# one two three four five
# 1: 1 1 1 1 1
# 2: 2 2 2 2 2
# 3: 3 3 3 3 3
# 4: 4 4 4 4 4
# 5: 5 5 5 5 5
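If you would rather stay in base R, here is a minimal sketch of one way to do it. It assumes the table is small enough that copying it is not a concern, and it uses a made-up data frame in place of the read.csv() call from the question. Because get() has no replacement form, the idea is to fetch the object, rename the copy, and assign it back under the same name.

```r
name_of_table <- "table_13"
first_level_header <- c("one", "two", "three", "four", "five")

# Stand-in for the read.csv() call in the question (made-up data).
assign(name_of_table, data.frame(1:5, 1:5, 1:5, 1:5, 1:5))

# get() cannot appear on the left of <-, so rename a copy and assign it back.
tmp <- get(name_of_table)
colnames(tmp) <- first_level_header
assign(name_of_table, tmp)

colnames(get(name_of_table))
```

As for the eval(parse(...)) attempt in the question, the first argument of parse() is file, so the pasted string was treated as a file name. Passing it as text = ... would avoid that particular error, though the version above is probably easier to read.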
I have the following data.table called D.
ngram
1 in_the_years
2 the_years_thereafter
3 years_thereafter_most
4 he_wasn't_home
5 how_are_you
6 thereafter_most_of
I need to add a few variables.
1.queryWord (the requirement is to extract the first 2 words)
the following is my code
D[,queryWord:=strsplit(ngram,"_[^_]+$")[[1]],by=ngram]
ngram queryWord
1 in_the_years in_the
2 the_years_thereafter the_years
3 years_thereafter_most years_thereafter
4 he_wasn't_home he_wasn't
5 how_are_you how_are
6 thereafter_most_of thereafter_most
2.predict. The requirement is to extract the last word.
The following is desired output
ngram queryWord predict
1 in_the_years in_the years
2 the_years_thereafter the_years thereafter
3 years_thereafter_most years_thereafter most
4 he_wasn't_home he_wasn't home
5 how_are_you how_are you
6 thereafter_most_of thereafter_most of
For this purpose I wrote the following function
getLastTerm<-function(x){
y<-strsplit(x,"_")
y[[1]][length(y[[1]])]
}
getLastTerm("in_the_years") returns "years"; however, it is not working inside the data.table object D.
D[,predict:=getLastTerm(ngram)[[1]],by=ngram]
Please I need help
Before addressing your actual question, you can simplify your first step to:
# option 1
D[, queryWord := strsplit(ngram,"_[^_]+$")][]
# option 2
D[, queryWord := sub('(.*)_.*$','\\1',ngram)][]
To get the predict-column, you don't need to write a special function. Using a combination of strsplit, lapply and last:
D[, predict := lapply(strsplit(D$ngram,"_"), last)][]
Or an even easier solution is using only sub:
D[, predict := sub('.*_(.*)$','\\1',ngram)][]
Both approaches give the following final result:
> D
ngram queryWord predict
1: in_the_years in_the years
2: the_years_thereafter the_years thereafter
3: years_thereafter_most years_thereafter most
4: he_wasn't_home he_wasn't home
5: how_are_you how_are you
6: thereafter_most_of thereafter_most of
Used data:
D <- fread("ngram
in_the_years
the_years_thereafter
years_thereafter_most
he_wasn't_home
how_are_you
thereafter_most_of", header = TRUE)
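To see what the two sub() patterns do without a data.table, here is a small base R sketch on a plain character vector. The vector is a shortened copy of the question's data, and the patterns are slight variations of the ones above (deleting text rather than capturing it), so treat it as an illustration rather than the exact code from this answer.

```r
ngram <- c("in_the_years", "he_wasn't_home", "thereafter_most_of")

# Drop the final "_word" to keep everything before the last underscore.
queryWord <- sub("_[^_]*$", "", ngram)

# Drop everything up to and including the last underscore to keep the last word.
predict <- sub(".*_", "", ngram)

data.frame(ngram, queryWord, predict)
```

Both calls are vectorised, so no by = ngram grouping is needed when the same expressions are used inside D[...].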
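Your getLastTerm function only selects the last word of the first list element. Try the version below, which loops over every element.
getLastTerm <- function(x){
y <- strsplit(x,"_")
# loop over every element instead of a hard-coded 1:6
for (i in seq_along(y)) {
x[i] <- y[[i]][length(y[[i]])]
}
x
}
D$new <- getLastTerm(D$ngram)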
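The same idea can be written without an explicit loop. This is only a sketch, and it assumes every input string is non-empty (an empty string would make vapply() stop with an error, because there is no last word to return).

```r
getLastTerm <- function(x) {
  parts <- strsplit(x, "_", fixed = TRUE)
  # Take the last piece of each split string; character(1) enforces one word per input.
  vapply(parts, function(p) p[length(p)], character(1))
}

getLastTerm(c("in_the_years", "how_are_you"))
```

Because the function now returns one value per input, it can be used on a whole column at once.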
Is there any way to View data frames in R while referring to them through another variable? Say I have 10 data frames named df1 to df10; is there a way I can View them using i instead of 1:10?
Example:
df1 = as.data.frame(c(1:20))
i = 1
View(paste("df", i, sep =""))
I would like this last piece of code to do the same as View(df1). Is there any command or similar in R that allows you to do that?
The answer to your immediate question is get:
df1 <- data.frame(x = 1:5)
df2 <- data.frame(x = 6:10)
> get(paste0("df",1))
x
1 1
2 2
3 3
4 4
5 5
But having multiple similar objects with names like df1, df2, etc in your workspace is considered fairly bad practice in R, and instead experienced R folks will prefer to put related objects in a named list:
df_list <- setNames(list(df1,df2),paste0("df",1:2))
> df_list[[paste0("df",1)]]
x
1 1
2 2
3 3
4 4
5 5
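If the data frames already exist as separate objects, mget() can collect them into such a named list in one step. The sketch below assumes the objects live in the environment where the code runs. The View() call is left commented out because it only makes sense in an interactive session.

```r
df1 <- data.frame(x = 1:5)
df2 <- data.frame(x = 6:10)

# Collect the existing objects into a named list by building their names.
df_list <- mget(paste0("df", 1:2), envir = environment())

for (nm in names(df_list)) {
  # View(df_list[[nm]], title = nm)  # interactive sessions only
  print(head(df_list[[nm]], 2))
}
```

From then on, i can be used directly, e.g. df_list[[paste0("df", i)]] or simply df_list[[i]].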
I am using RStudio 0.98.1062.
What I am trying to do, within a macro, is create a new variable based on another one (which already has a suffix defined by me) in the same data frame. The name of the data frame and the index (suffix) are macro variables.
Here is my code:
read_data <- defmacro(fileName, monthIndex, dfName,
expr = {
dfName <- read.table(fileName, head=TRUE,sep = ",")
#add suffix for the variables for the corresponding month
colnames(dfName) <- paste(colnames(dfName),monthIndex, sep = "_")
#dfName["EasyClientMerge"]<-numeric()
within(dfName, assign("EasyClientMerge", paste("dfName$EasyClientNumber",monthIndex,sep="_")))
})
If the macro parameters are (..., monthIndex=6, dfName= m201309), I expect the following variable to be created:
m201309$EasyClientMerge<-m201309$EasyClient_6
First, no new variable is created within the data frame. Second, it seems that the string "m201309$EasyClient_6" is used rather than a reference to the data frame and variable name.
Thanks a lot in advance, because I am kind of stuck!
If you really insist on producing hard coded data.frames within a function (in my opinion a bad choice), you can do it like so.
> dfName <- "new.df"
> assign(dfName, value = list(clientMerge = 1:10, clientMerge2 = 1:10))
> as.data.frame(new.df)
clientMerge clientMerge2
1 1 1
2 2 2
3 3 3
4 4 4
5 5 5
6 6 6
7 7 7
8 8 8
9 9 9
10 10 10
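An alternative that avoids both the macro and assign() is an ordinary function that returns the modified data frame, with the column looked up by a name built from the suffix. This is only a sketch: add_merge_column, raw, and the Balance column are hypothetical names, and the read.table() step is replaced by a small made-up data frame.

```r
add_merge_column <- function(df, month_index) {
  # Add the month suffix to every existing column name.
  names(df) <- paste(names(df), month_index, sep = "_")
  # Build the source column name as a string and look it up with [[ ]].
  source_col <- paste("EasyClientNumber", month_index, sep = "_")
  df[["EasyClientMerge"]] <- df[[source_col]]
  df
}

# Made-up stand-in for the file read in the question.
raw <- data.frame(EasyClientNumber = c(101, 102), Balance = c(5, 7))
m201309 <- add_merge_column(raw, 6)
names(m201309)
```

The caller chooses the name of the result (m201309 here), so the function never needs to create objects by name.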
I have the following two data frames (the real data frames are much larger). I would like to determine at which position(s) both data frames contain a 3 (in this example spot [2,1] only).
a<-c(1,3,5)
b<-c(2,3,4)
c<-c(3,3,4)
d<-c(2,4,7)
e<-cbind(a,b)
f<-cbind(c,d)
colnames(e)<-c("a","b")
colnames(f)<-c("a","b")
Results:
e
## a b
## 1 2
## 3 3
## 5 4
f
## a b
## 3 2
## 3 4
## 4 7
I've tried to use the following function, but it doesn't work.
fun<-function(x)
{ifelse(e$x==3 & f$x==3, "yes","no")}
Vs<-c("a","b")
lapply(Vs, fun)
Does anyone have any ideas, specifically on how to use a variable as the name after the extraction operator ($) in a user-written function?
Here you go:
fun <- function(x) { ifelse(e[,x] == 3 & f[,x] == 3, "yes", "no") }
Vs <- c("a", "b")
lapply(Vs, fun)
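The reason this works is that $ takes the name after it literally, whereas [ , x] evaluates x first. If the goal is only to find the positions where both objects hold a 3, a whole-matrix comparison is another option. The sketch below rebuilds e and f from the question and is meant as an illustration, not a replacement for the function above.

```r
e <- cbind(a = c(1, 3, 5), b = c(2, 3, 4))
f <- cbind(a = c(3, 3, 4), b = c(2, 4, 7))

# Element-wise comparison gives a logical matrix of the same shape.
both_three <- e == 3 & f == 3

# arr.ind = TRUE reports row and column indices of the TRUE cells.
hits <- which(both_three, arr.ind = TRUE)
hits
```

Here hits has a single row pointing at row 2, column 1, which matches the spot [2,1] mentioned in the question.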
I have the following sample data frame:
x<-c(1:4)
y<-c(9:12)
z<-c("a","b","c","d")
data<-data.frame(x,y,z) # as data:
x y z
1 1 9 a
2 2 10 b
3 3 11 c
4 4 12 d
I want to extract the column 2 or 3 using the function (note: I am using column names to extract). My code is as follows:
data_frame<-function(col){
cols<-c("y","z")
# column x is already there; it is not in a vector of col.
if (col %in% cols){
kk<-data[,c("x","col")]
return (kk)}
}
Now, I want the output for data_frame("y"). However, R gives me the following error:
data_frame("y")
Error in `[.data.frame`(data, , c("x", "col")) :
undefined columns selected.
I was wondering why R is not using my argument col, which is "y" here. I am a bit confused about why R treats the argument col as the literal name of the column. Your valuable suggestions in this regard would be highly appreciated.
This part: kk<-data[,c("x","col")] should be kk<-data[,c("x",col)]
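With the quotes, "col" is the literal string col, so R looks for a column of that name. Without them, col is evaluated to whatever was passed in. A sketch of the corrected function follows. The name select_with_x is hypothetical, the data frame is passed in as an argument rather than taken from the global environment, and the stop() for unknown columns is an addition (the original silently returned NULL in that case).

```r
data <- data.frame(x = 1:4, y = 9:12, z = c("a", "b", "c", "d"))

select_with_x <- function(df, col) {
  cols <- c("y", "z")
  if (!col %in% cols) {
    stop("col must be one of: ", paste(cols, collapse = ", "))
  }
  # col is unquoted here, so its value ("y" or "z") is used as the column name.
  df[, c("x", col)]
}

select_with_x(data, "y")
```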