Code inside a function works, but the function itself doesn't - r

Let's assume I have this dataframe:
df <- data.frame(A = letters[1:5],
B = letters[6:10],
stringsAsFactors = FALSE)
A B
1 a f
2 b g
3 c h
4 d i
5 e j
Where I'm looking for this output:
A B
1 e j
2 d i
3 c h
4 b g
5 a f
With this function:
f_Order <- function(df){
  df$Order <- as.integer(row.names(df))
  df <- arrange(df, desc(Order))[,c("A","B")]
}
Though the function above doesn't work, the code inside the function works perfectly:
df$Order <- as.integer(row.names(df))
df <- arrange(df, desc(Order))[,c("A","B")]
> x
A B
1 e j
2 d i
3 c h
4 b g
5 a f
Why? How do I make the function work?
EDIT:
To clarify, the problem is not how to change the order of the df, but how to make the function f_Order work. The code does what I want when run on its own, but it doesn't do what I want inside that function. I need to know why, and how I can make the function work.
EDIT2:
This is exactly the code I'm running, and none of the solutions work.
x <- data.frame(A = letters[1:5],
B = letters[6:10],
stringsAsFactors = FALSE)
f_Order <- function(df){
  df$Order <- as.integer(row.names(df))
  df <- arrange(df, desc(Order))
  return(df)
}
f_Order(x)

What if you have a return() at the end of your function? Something like this:
f_Order <- function(df){
  df$Order <- as.integer(row.names(df))
  df <- arrange(df, desc(Order))[,c("A","B")]
  return(df)
}
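Basically, an R function returns the value of its last expression. In your version the last expression is an assignment, and an assignment returns its value invisibly, so calling f_Order(df) at the console prints nothing. The reordering also only happens to the copy of df inside the function, not to the df in the wider environment. Ending the function with return(df) (or just df) makes the result visible.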
Output:
> f_Order(df)
A B
1 e j
2 d i
3 c h
4 b g
5 a f
If you want to update df, then run df <- f_Order(df).
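To see the invisible return in isolation, here is a minimal sketch in base R (no dplyr needed). The names f_silent and f_visible are made up for this illustration, and the row reversal is done with plain indexing rather than arrange.

```r
# f_silent and f_visible are hypothetical names used only for this illustration.
x <- data.frame(A = letters[1:5], B = letters[6:10], stringsAsFactors = FALSE)

f_silent <- function(df) {
  # The last expression is an assignment, so its value is returned invisibly.
  df <- df[rev(seq_len(nrow(df))), ]
}

f_visible <- function(df) {
  df <- df[rev(seq_len(nrow(df))), ]
  df  # the last expression is the object itself, so the result is visible
}

f_silent(x)          # prints nothing at the console
res <- f_silent(x)   # the value is still returned and can be captured
(f_silent(x))        # wrapping the call in parentheses forces printing
f_visible(x)         # prints the reversed data frame

withVisible(f_silent(x))$visible   # FALSE
withVisible(f_visible(x))$visible  # TRUE
```

So the original function was computing the right thing all along; only the printing was suppressed.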

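Continuing with dplyr (the row names are converted to integers so that, for example, row 10 sorts after row 9 rather than being compared as text):
library(dplyr)
f_Order <- function(df){
  df %>%
    mutate(Order = as.integer(row.names(.))) %>%
    arrange(desc(Order))
}
If we don't want to keep Order:
f_Order <- function(df){
  df %>%
    arrange(desc(as.integer(row.names(.))))
}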
Result:
f_Order(df)
A B
1 e j
2 d i
3 c h
4 b g
5 a f
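One caveat when ordering by row names: row.names() returns character strings, so sorting them directly compares text. With five rows that makes no difference, but with ten or more it does. A small sketch of the effect (the variable names are only for illustration):

```r
rn <- as.character(1:12)   # what row.names() returns for a 12-row data frame

as_text <- sort(rn, decreasing = TRUE)              # compared as strings
as_int  <- sort(as.integer(rn), decreasing = TRUE)  # compared as numbers

as_text[1]  # "9": as text, "9" sorts ahead of "12"
as_int[1]   # 12
```

Converting with as.integer() before sorting avoids this.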

Related

Take last n columns with value

I have a basic R question. Imagine the following code:
a <- c("A","B","C")
b <- c("A","B","C")
c <- c("A","X","C")
x <- c("A","B","C")
y <- c("","B","C")
z <- c("","","C")
frame <- data.frame(a,b,c,x,y,z)
Now I want to get the content of the last 3 columns, but only counting cells that contain a value. The output should look like this:
new1 <- c("A","X","C")
new2 <- c("A","B","C")
new3 <- c("A","B","C")
frame2 <- data.frame(new1,new2,new3)
I'd be grateful for any help.
Using apply from base R
as.data.frame(t(apply(frame, 1, FUN = function(x) tail(x[nzchar(x)], 3))))
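The one-liner above can be unpacked as follows. This is a sketch that assumes every row has at least three non-empty values (otherwise apply returns a list and t() fails). The helper name last_n_nonempty is made up for the illustration.

```r
a <- c("A","B","C")
b <- c("A","B","C")
c <- c("A","X","C")
x <- c("A","B","C")
y <- c("","B","C")
z <- c("","","C")
frame <- data.frame(a, b, c, x, y, z)

# Hypothetical helper: keep the last n non-empty values of one row.
last_n_nonempty <- function(row, n = 3) {
  vals <- row[nzchar(row)]  # drop empty strings
  tail(vals, n)             # keep the last n of what is left
}

# apply() returns one column per input row, so transpose back to rows.
res <- as.data.frame(t(apply(frame, 1, last_n_nonempty)))
names(res) <- c("new1", "new2", "new3")
res
```

Each row is handled separately, so an empty cell in one row does not remove that column for the other rows.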
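Alternatively, you can drop every column that contains an empty string and then keep the last three columns of what remains: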
new_frame <- frame[colSums(frame == '') == 0]
new_frame[tail(seq_along(new_frame), 3)]
b c x
1 A A A
2 B X B
3 C C C

Special reshape in R

Consider a 3x3 char dataframe:
example <- data.frame(one = c("a","b","c"),
two = c("a","b","b"),
three = c ("c","a","b"))
I want to resize these data to 6x2 and add the following content:
desired <- data.frame(one = c("a","a","b","b",
"c","b"),
two = c("a","c","b","a","b","b"))
For the original example dataframe, I want to rbind() the contents of example[,2:3] beneath each row index.
This can be achieved by:
ex <- as.matrix(example)
des <- as.data.frame(rbind(ex[,1:2], ex[,2:3]))
Maybe using library(tidyverse) for an arbitrary number of columns would be nicer?
For each pair of columns, transpose the sub-data.frame defined by them and coerce to vector. Then coerce to data.frame and set the result's names.
The code that follows should scale, since it does not hard-code the number of columns.
desired2 <- as.data.frame(
lapply(seq(names(example))[-1], \(k) c(t(example[(k-1):k])))
)
names(desired2) <- names(example)[-ncol(example)]
identical(desired, desired2)
#[1] TRUE
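To make the transpose-and-flatten step concrete, here is what happens for the first pair of columns. This is only an illustration of the c(t(...)) idiom; the intermediate names pair, m and interleaved are not part of the answer's code.

```r
example <- data.frame(one   = c("a","b","c"),
                      two   = c("a","b","b"),
                      three = c("c","a","b"))

pair <- example[1:2]   # columns "one" and "two"
m <- t(pair)           # 2 x 3 character matrix, one column per original row
interleaved <- c(m)    # flattening is column-major: one[1], two[1], one[2], two[2], ...
interleaved
# "a" "a" "b" "b" "c" "b"
```

Because matrices are stored column by column, flattening the transposed pair interleaves the two columns row by row, which is exactly the first column of the desired result.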
The code above rewritten as a function.
reformat <- function(x){
  y <- as.data.frame(
    lapply(seq(names(x))[-1], \(k) c(t(x[(k-1):k])))
  )
  names(y) <- names(x)[-ncol(x)]
  y
}
reformat(example)
example %>% reformat()
Another example, with 6 columns input.
ex1 <- example
ex2 <- example
names(ex2) <- c("fourth", "fifth", "sixth")
ex <- cbind(ex1, ex2)
reformat(ex)
ex %>% reformat()
A tidyverse approach using tidyr::pivot_longer may look like so:
library(dplyr)
library(tidyr)
pivot_longer(example, -one, values_to = "two") %>%
select(-name)
#> # A tibble: 6 × 2
#> one two
#> <chr> <chr>
#> 1 a a
#> 2 a c
#> 3 b b
#> 4 b a
#> 5 c b
#> 6 c b
A base-R solution with Map:
# Iterate over example$one, example$two, and example$three at the same
# time, creating the output you need.
mylist <- Map(function(x, y, z) {
  data.frame(one = c(x, y), two = c(y, z))
},
example$one,    # x
example$two,    # y
example$three)  # z
do.call(rbind, mylist)
one two
a.1 a a
a.2 a c
b.1 b b
b.2 b a
c.1 c b
c.2 b b

R - assign element names to column names inside a list of lists of dataframes in tidyverse

I have a list of lists of dataframes, each having one column, like this:
list(list(A = data.frame(X = 1:5),
B = data.frame(Y = 6:10),
C = data.frame(Z = 11:15)),
list(A = data.frame(X = 16:20),
B = data.frame(Y = 21:25),
C = data.frame(Z = 26:30)),
list(A = data.frame(X = 31:35),
B = data.frame(Y = 36:40),
C = data.frame(Z = 41:45))) -> dflist
I need to make it so that the column names X, Y and Z inside of each dataframe are changed to A, B and C. An important thing to add is that the names A, B and C are not known beforehand, but must be extracted from the list element names. I have a simple script that can accomplish this:
for(i in 1:3){
  for(j in 1:3){
    colnames(dflist[[i]][[j]]) <- names(dflist[[i]])[[j]]
  }
}
However, I need to do this in tidyverse style. I have found similar questions on here, however, they only deal with lists of dataframes and not with lists of lists of dataframes and I can't find a way to make it work.
Using combination of map and imap -
library(dplyr)
library(purrr)
map(dflist, function(x)
imap(x, function(data, name)
data %>% rename_with(function(y) name)))
#[[1]]
#[[1]]$A
# A
#1 1
#2 2
#3 3
#4 4
#5 5
#[[1]]$B
# B
#1 6
#2 7
#3 8
#4 9
#5 10
#[[1]]$C
# C
#1 11
#2 12
#3 13
#4 14
#5 15
#...
#...
Also possible without purrr, using lapply and mapply (the latter with SIMPLIFY=FALSE). If dflist is your list of lists:
lapply(dflist, function(x){
  mapply(function(y,z){
    `colnames<-`(y, z)
  }, y=x, z=names(x), SIMPLIFY=FALSE)
})
#or on one line:
lapply(dflist, function(x) mapply(function(y,z) `colnames<-`(y, z), y=x, z=names(x), SIMPLIFY=FALSE))
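A slightly shorter base R variant of the same idea, as a sketch: Map() is mapply() with SIMPLIFY = FALSE, and setNames() can stand in for the `colnames<-` call. The smaller dflist below is a cut-down version of the one in the question, used only to keep the example short.

```r
# Cut-down version of the question's data, for illustration only.
dflist <- list(
  list(A = data.frame(X = 1:2), B = data.frame(Y = 3:4)),
  list(A = data.frame(X = 5:6), B = data.frame(Y = 7:8))
)

# For each inner list, pair every data frame with its element name
# and use that name as the (single) column name.
renamed <- lapply(dflist, function(x) Map(setNames, x, names(x)))

names(renamed[[1]]$A)  # "A"
names(renamed[[2]]$B)  # "B"
```

Map() keeps the element names of the inner list, so the result has the same structure as the input.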
A solution with purrr::walk:
library(tidyverse)
walk(1:length(dflist),
function(x)
walk(names(dflist[[x]]), ~ {names(dflist[[x]][[.x]]) <<- .x}))

Function with IF ELSE doesn't work

I have a simple function:
new_function <- function(x)
{
  letters <- c("A","B","C")
  new_letters<- c("D","E","F")
  if (x %in% letters) {"Correct"}
  else if (x %in% new_letters) {"Also Correct"}
  else {x}
}
I make a dataframe with letters:
df <- as.data.frame(LETTERS[seq( from = 1, to = 10 )])
names(df)<- c("Letters")
I want to apply the function on the dataframe:
df$result <- new_function(df$Letters)
And it doesn't work (the function only writes "Correct")
I get this warning:
Warning message:
In if (x %in% letters) { :
the condition has length > 1 and only the first element will be used
You can use lapply:
df$result <- lapply(df$Letters,new_function)
Output:
df
Letters result
1 A Correct
2 B Correct
3 C Correct
4 D Also Correct
5 E Also Correct
6 F Also Correct
7 G 7
8 H 8
9 I 9
10 J 10
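The 7, 8, 9, 10 in the output above suggest that Letters was a factor: the elements that fall through to the final else come back as one-element factors, which the list column shows as integer codes. The underlying issue is that if() tests a single condition rather than one per element (and since R 4.2.0 a condition of length greater than one is an error, not a warning). A sketch that keeps the element-by-element approach but returns a plain character column, assuming a factor input:

```r
new_function <- function(x) {
  x <- as.character(x)  # guard against factor input
  if (x %in% c("A", "B", "C")) {
    "Correct"
  } else if (x %in% c("D", "E", "F")) {
    "Also Correct"
  } else {
    x
  }
}

# stringsAsFactors = TRUE is set here on purpose, to mimic the factor case.
df <- data.frame(Letters = LETTERS[1:10], stringsAsFactors = TRUE)

# vapply() calls the function once per element and checks that each
# result is a single string, giving a character vector instead of a list.
df$result <- vapply(df$Letters, new_function, character(1), USE.NAMES = FALSE)
df
```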
I would rewrite your new_function with ifelse, as @akrun suggested. ifelse is vectorized, so it tests every element of x rather than only the first. as.character converts x to character in case it is a factor:
new_function <- function(x){
  ifelse(x %in% c("A","B","C"), "Correct",
         ifelse(x %in% c("D","E","F"), "Also Correct", as.character(x)))
}
df$result <- new_function(df$Letters)
or with case_when from dplyr:
library(dplyr)
new_function <- function(x){
case_when(x %in% c("A","B","C") ~ "Correct",
x %in% c("D","E","F") ~ "Also Correct",
TRUE ~ as.character(x))
}
df %>%
mutate(result = new_function(Letters))
Result:
Letters result
1 A Correct
2 B Correct
3 C Correct
4 D Also Correct
5 E Also Correct
6 F Also Correct
7 G G
8 H H
9 I I
10 J J
Data:
df <- as.data.frame(LETTERS[seq( from = 1, to = 10 )])
names(df)<- c("Letters")

finding unique combination of vars and create a new var if found a unique combination

I have this dataframe as toy example
aski = data.frame(A = c("x","y","z","x","z","z"),
B = c("a","b","c","a","b","c"))
Now I want to label each unique combination of A and B: add a new variable to the dataframe whose value increments (e.g. r1, r2, ...) each time a new combination is found, with repeated combinations reusing the label they were first given.
The output dataframe should look something like this:
aski2 = data.frame(A = c("x","y","z","x","z","z"),
B = c("a","b","c","a","b","c"),
output = c("r1","r2","r3","r1","r4","r3"))
Try this:
aski2 <- data.frame(A = c("x","y","z","x","z","z"),
B = c("a","b","c","a","b","c"))
ref <- do.call(paste, aski2)
aski2$output <- paste("r", as.numeric(factor(ref, levels = unique(ref))),
sep = "")
aski2
Another option is to use group_indices from dplyr: group by columns A and B, and it generates a unique id for each group (see ?group_indices):
library(dplyr)
aski2 <- data.frame(A = c("x","y","z","x","z","z"),
B = c("a","b","c","a","b","c"),
C = c("s","v","g","v","g","d"))
aski2 %>% mutate(output = sprintf("r%s", group_indices(., A, B)))
# A B C output
#1 x a s r1
#2 y b v r2
#3 z c g r4
#4 x a v r1
#5 z b g r3
#6 z c d r4
Another option is data.table's .GRP, which numbers the groups in order of first appearance:
library(data.table)
setDT(aski2)[, output := paste0("r", .GRP), .(A, B)]
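The answers above number the groups in two different ways: in order of first appearance (which matches the requested output) or in sorted order (as in the group_indices output, where z/c gets r4). A base R sketch of the difference, using a pasted key; the names key, by_appearance and by_sorted are only for illustration:

```r
aski <- data.frame(A = c("x","y","z","x","z","z"),
                   B = c("a","b","c","a","b","c"))

key <- paste(aski$A, aski$B)   # one string per A/B combination

# Position of each key among the unique keys, in order of first appearance.
by_appearance <- match(key, unique(key))

# factor() sorts its levels, so the ids follow alphabetical order instead.
by_sorted <- as.integer(factor(key))

aski$output <- paste0("r", by_appearance)
aski
```

Here by_appearance is 1 2 3 1 4 3 and by_sorted is 1 2 4 1 3 4, so only the first reproduces r1, r2, r3, r1, r4, r3.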
