I'm making a function and I'd like to call a column in a particular way.
Initialize data
a <- c(1,2,3,4,5)
b <- c(6,7,8,9,10)
c <- c(1,2,3,4,5)
d <- c(6,7,8,9,10)
df <- as.data.frame(cbind(a,b,c,d))
Call column for the table function
Func <- function(df){
X <- df
Y <- names(X)
for(i in 1:2){
table(X$___,X$___)
}}
The trouble is I don't know how to call the columns.
I'd like it to be the equivalent to table(X$a, X$b) as it iterates through the loop.
I tried this and it didn't work
for(i in 1:2){
Q <- Y[i]
W <- Y[j]
table(X$Q,X$W)
}}
The function I'm using requires a table of the form table(X$a, X$b), and I don't know how to produce that inside a for loop.
Instead of referring to the columns by name, you can use their column indices, so you don't have to worry about how to call the columns.
Replace your for loop with
table(df[1:2])
which would give you the expected result.
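$ does not evaluate its argument, so X$Q looks for a column literally named Q. Use double brackets [[ ]] to get the content of a column by index:
df <- datasets::mtcars
for (i in 1:2) print(df[[i]]) # print() is needed to see output inside a loop
This also works with column names:
for (i in names(df)) print(df[[i]])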
Not sure what you are trying to achieve though. You can also just do:
lapply(df[1:2], table)
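For the original question, here is a minimal sketch of building table(X$a, X$b)-style tables inside a loop, using [[ ]] with column names held in variables. The helper name pair_tables and the pairing of columns (a with b, c with d) are assumptions for illustration, since the question does not say which pairs are wanted.

```r
a <- c(1, 2, 3, 4, 5)
b <- c(6, 7, 8, 9, 10)
df <- data.frame(a, b, c = a, d = b)

# pair_tables() is a hypothetical helper, not part of any package.
# Assumption: columns are paired in order (1 with 2, 3 with 4, ...).
pair_tables <- function(df, n_pairs = 2) {
  nms <- names(df)
  out <- vector("list", n_pairs)
  labels <- character(n_pairs)
  for (i in seq_len(n_pairs)) {
    first <- nms[2 * i - 1]
    second <- nms[2 * i]
    # df[[first]] is equivalent to df$a when first == "a"
    out[[i]] <- table(df[[first]], df[[second]])
    labels[i] <- paste(first, second, sep = "_")
  }
  names(out) <- labels
  out
}

tabs <- pair_tables(df)
print(tabs$a_b)
```

The tables are collected in a list because a bare table() call inside a for loop computes a result and then discards it.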
You can also loop over columns by index. The following code loops over the columns of the iris dataset:
for (i in seq_len(ncol(iris) - 1)) {
print(iris[, i]) # a single column
print(iris[, c(i, i + 1)]) # column i and the next one; the loop stops one column early so i + 1 stays in range
}
I am putting together a summary table from a larger data frame. I noticed that I was reusing the following code, changing only the %like% pattern:
# This code creates a df of values where the row name matches the character
df <- (data[which(data$`col_name` %like% "Total"),])
df <- df[3:ncol(df)]
df[is.na(df)] <- 0
# This creates a row composed of the sum of each column
for (i in seq_along(df)) {
df[10, i] <- sum(df[i])
}
# This inserts the resulting values into a separate summary table
summary[1, 2:ncol(summary)] <- df[nrow(df),]
To keep the code DRY, I thought it would be best to turn this into a custom function that I could then call with different strings:
create_row <- function(x) {
df <- (data[which(data$`Crop year` %like% as.character(x)),])
df <- df[3:ncol(df)]
df[is.na(df)] <- 0
for (i in seq_along(df)) {
df[10, i] <- sum(df[i])
}
}
# Then populate the summary table as before with the results
total <- create_row("Total")
summary[1, 2:ncol(summary)] <- total[nrow(total),]
However when attempting to run this, it simply returns an empty variable.
Through trial and error, I have found that the line of code causing this is:
df[is.na(df)] <- 0
The code works absolutely fine when run line by line outside of this custom function.
As mentioned in the comments, adding return(df) at the end of the function makes it work. A function returns the value of its last evaluated expression, and here that is the for loop, which evaluates to NULL. Without an explicit return(df), the caller therefore gets NULL.
Moreover, as @alan mentioned in the comments, you can use colSums to get the sum of each column directly instead of looping over the columns.
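Both suggestions can be combined in a short sketch. The data frame below is a made-up stand-in for the asker's data, and grepl() is swapped in for %like% only so that the example runs without extra packages; neither comes from the original post.

```r
# Hypothetical stand-in for the question's `data`
data <- data.frame(
  `Crop year` = c("Total 2019", "Total 2020", "Other"),
  region = c("x", "y", "z"),
  v1 = c(1, NA, 5),
  v2 = c(2, 3, 7),
  check.names = FALSE
)

create_row <- function(x) {
  # grepl() replaces %like% here (assumption, to avoid a package dependency)
  df <- data[grepl(x, data$`Crop year`, fixed = TRUE), ]
  df <- df[3:ncol(df)]
  df[is.na(df)] <- 0
  # colSums() replaces the for loop; as the last expression,
  # its value is what the function returns
  colSums(df)
}

total <- create_row("Total")
print(total)
```

Because the function now returns the column sums directly, the result can be written into the summary table without indexing the last row.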
I need to create several copies of the same df, each named after one of the values stored in a vector.
For example:
z <- c("A-1", "B-2", "C-2", ...)
for (i in z) {
i <- already_existing_df
}
Manual hard-coding would be something like:
`A-1` <- df
`B-2` <- df
# ...and so on
Of course I would want to automate this, and not hardcode it... also 'cause it will change every month, and we are talking about many dfs...
Now, I know that to pass i as a variable name, you can simply do:
df[i]
but I don't know how to pass i as a df name.
Thank you in advance for any help!
Another approach is to use replicate to repeat the data frame length(z) times and then assign the names to the resulting list:
z <- c("A-1", "B-2", "C-2")
list_df <- setNames(replicate(length(z), df, simplify = FALSE), z)
You can then keep the list of data frames as it is, or turn its elements into separate data frames:
list2env(list_df, .GlobalEnv)
Depending on your desired result you could define a list (or an Environment):
z <- vector("list", 3)
for (i in seq_along(z)) {
z[[i]] <- already_existing_df
}
names(z) <- c("A-1", "B-2", "C-2")
Inside the loop over z you can call assign(i, already_existing_df), which assigns the data frame in your environment under the name given by the current element i of z.
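A minimal sketch of the assign approach follows. The small data frame and the separate environment env are assumptions for illustration; assigning into a dedicated environment keeps the example from cluttering the global workspace.

```r
# Hypothetical stand-in for the asker's data frame
already_existing_df <- data.frame(a = 1:3, b = 4:6)
z <- c("A-1", "B-2", "C-2")

# A dedicated environment (assumption); drop `envir = env` to assign
# into the current environment instead
env <- new.env()
for (i in z) {
  # The first argument of assign() is the name, the second the value
  assign(i, already_existing_df, envir = env)
}

print(ls(env))
print(get("A-1", envir = env))
```

Names such as A-1 are not syntactically valid R names, so they have to be accessed with get() or with backticks.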
I made a function in R that I would like to call in a loop. The function works in a single case, but I can't get the loop to keep the vectors of numbers that the function returns.
vec_fun5 <- function(x,y){
Vec <- c(round(mean(x[[y]],na.rm=T),2),nrow(na.omit(x[,y])),length(which(x[,y]==1)),length(which(x[,y]==2)),length(which(x[,y]==3)),length(which(x[,y]==4)),length(which(x[,y]==5)))
return(Vec)
}
for(i in 20:24){
vec_fun5(x,i)
}
I would like to produce a data frame with all of the vectors produced by the loop.
Maybe you can try putting the objects created by the function in a list:
vec_save <- list()
ii <- 1
for(i in 20:24){
vec_save[[ii]] <- vec_fun5(x,i)
ii <- ii+1
}
Following this, if you would like to cbind or rbind the vectors of interest to obtain a single dataframe, you can just run:
df <- do.call("cbind", vec_save) #assuming that you want to bind them by column
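As a self-contained illustration of the list-then-bind pattern, here is a sketch with made-up data. The function vec_fun is a simplified stand-in for vec_fun5, and the column names given to the result are assumptions; rbind is used so that each loop iteration becomes one row.

```r
# Hypothetical data: two columns scored 1 to 5
x <- data.frame(q1 = c(1, 2, 2, NA), q2 = c(5, 5, 3, 1))

# Simplified stand-in for vec_fun5: mean, non-missing count,
# then how often each score from 1 to 5 occurs
vec_fun <- function(x, y) {
  col <- x[[y]]
  counts <- sapply(1:5, function(k) sum(col == k, na.rm = TRUE))
  c(round(mean(col, na.rm = TRUE), 2), sum(!is.na(col)), counts)
}

# Collect one vector per column, then bind them row-wise
rows <- lapply(1:2, function(i) vec_fun(x, i))
result <- as.data.frame(do.call(rbind, rows))
names(result) <- c("mean", "n", paste0("count_", 1:5))
print(result)
```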
I want to create a new data frame and keep adding variables to it within a for loop in R. Here is pseudocode for what I want:
X <- 100
For(i in 1:X)
{
#do some processing and store it a variable named "temp_var"
temp_var[,i] <- z
#make it as a dataframe and keep adding the new variables until the loop completes
}
After completion of the above loop the vector "temp_var" by itself should be a dataframe containing 100 variables.
I would avoid using for loops as much as possible in R. If you know your columns beforehand, lapply is what you want to use.
However, if your program constrains you to a loop, you can do something like this:
X <- 100
tmp_list <- list()
for (i in 1:X) {
tmp_list[[i]] <- your_input_column
}
data <- data.frame(tmp_list)
names(data) <- paste0("col", 1:X)
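The lapply route mentioned above might look like the sketch below. The function make_column is a hypothetical placeholder for whatever processing each iteration performs, and X is reduced to 5 to keep the output small.

```r
X <- 5

# make_column() is a hypothetical placeholder for the per-column processing
make_column <- function(i) i * 1:3

# Build all columns at once, name them, then convert to a data frame
cols <- lapply(seq_len(X), make_column)
names(cols) <- paste0("col", seq_len(X))
result <- as.data.frame(cols)
print(result)
```

Naming the list before the conversion avoids the automatically generated column names that data.frame would otherwise produce.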
I have multiple data frames named y1 to y13 - one column each. They all have a column name that I would like to change to "Date.Code". I've tried the following in a for loop:
for(i in 1:13){
colnames(get(paste("y", i, sep=""))) <- c("Date.Code")
}
That didn't work.
I also tried:
for(i in 1:13){
assign(("Date.Code"), colnames(get(paste("y", i, sep=""))))
}
Also with no luck.
Any suggestions?
Thanks,
E
The difficulty here is that you cannot use get directly on the left-hand side of an assignment;
e.g., get(nm) <- value will not work. You can use assign, as you're trying to, but in a slightly different fashion.
Assuming cn is the column number whose name you would like to change:
for(i in 1:13){
nm <- paste0("y", i)
tmp <- get(nm)
colnames(tmp)[cn] <- c("Date.Code")
assign(nm, tmp)
}
That being said, a cleaner approach is to collect all of your data frames into a single list; you can then easily use lapply to operate on them. For example:
# collect all of your data.frames into a single list.
df.list <- lapply(seq(13), function(i) get(paste0("y", i)))
# now you can easily change column names. Note the `x` at the end of the function which serves
# as a return of the value. It then gets assigned back to an object `df.list`
df.list <-
lapply(df.list, function(x) {colnames(x)[cn] <- "Date.Code"; x})
Lastly, search these boards for [r] data.table and you will see many options for changing values by reference and setting attributes more directly.
Here is a one-line solution:
list2env(lapply(mget(ls(pattern='y[0-9]+')),
function(x) setNames(x,"Date.Code")),.GlobalEnv)
Of course, it is better to keep all of your data frames in a single list.
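To try the list-based renaming without touching real data, here is a small sketch. The three one-column data frames and the environment env are made up for illustration; with real objects y1 to y13 in the global environment, the envir arguments could be dropped.

```r
# Hypothetical stand-ins for y1..y3, kept in their own environment
env <- new.env()
for (i in 1:3) {
  assign(paste0("y", i), data.frame(old = i), envir = env)
}

nms <- paste0("y", 1:3)

# mget() fetches several objects by name into a named list
df.list <- mget(nms, envir = env)

# Rename the first (and only) column, returning the modified data frame
df.list <- lapply(df.list, function(x) {
  colnames(x)[1] <- "Date.Code"
  x
})

# Write the renamed data frames back under their original names
list2env(df.list, envir = env)
print(colnames(get("y2", envir = env)))
```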