I have many datasets in tibble format, with variables as rows. I want to change the layout and wrangle each dataset. To save myself repetitive work and reduce the risk of making mistakes, I wrote this function in R:
library(tidyverse)
change_data_layout<- function(data_df){
data_df_2 <- data_df %>% mutate(samples = colnames()) %>% t()
colnames(data_df_2) <-data_df_2[1,]
rownames <- rownames(data_df_2) [2:nrow(data_df_2)]
data_df_3 <- data_df_2[1:nrow(data_df_2),] %>% as_tibble() %>% mutate(samples = rownames)
colnames(data_df_3) <- data_df_3 [1,]
data_df_4 <- data_df_3[2:nrow(data_df_3),]
data_final <- data_df_4 %>%
mutate_each(funs(type.convert)) %>% mutate_if(is.factor, as.character)
return(data_final)
}
However, when I run this function as:
dataset1_final <- change_data_layout(dataset1)
I got this error message:
Error: argument "x" is missing, with no default
Called from: mutate_impl(.data, dots)
Any help and suggestions?
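A likely cause, judging from the message: colnames() inside the first mutate() is called with no argument, and colnames() requires its x argument. If the goal is simply to turn variables-as-rows into variables-as-columns, a pivot-based version may be easier to maintain. The following is only a sketch under assumptions: it assumes the first column holds the variable names and that all sample columns share one type; the toy dataset1 and the column names variable, s1 and s2 are made up for illustration.

```r
library(dplyr)
library(tidyr)

# Sketch: transpose a tibble whose rows are variables and whose
# remaining columns are samples. Assumes the first column holds the
# variable names and the sample columns all have the same type.
change_data_layout <- function(data_df) {
  var_col <- names(data_df)[1]
  data_df %>%
    # one row per (variable, sample) pair
    pivot_longer(cols = -all_of(var_col),
                 names_to = "samples", values_to = "value") %>%
    # spread the variables out as columns, one row per sample
    pivot_wider(names_from = all_of(var_col), values_from = value)
}

# Hypothetical toy data, not the original dataset1
dataset1 <- tibble(
  variable = c("height", "weight"),
  s1 = c(1.5, 60),
  s2 = c(1.8, 80)
)

dataset1_final <- change_data_layout(dataset1)
```

If the sample columns have mixed types, they would need to be converted to a common type (for example character) before pivoting, and converted back afterwards.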
I need to create a line ID column within a dataframe for further pre-processing steps. The code worked fine up until yesterday. Today, however, I am getting this error message:
"Error in mutate():
ℹ In argument: line_id = (function (x, y) ....
Caused by error:
! Can't convert y to match type of x ."
Here is my code - the dataframe consists of two character columns:
split_text <- raw_text %>%
mutate(text = enframe(strsplit(text, split = "\n", ))) %>%
unnest(cols = c(text)) %>%
unnest(cols = c(value)) %>%
rename(text_raw = value) %>%
select(-name) %>%
mutate(doc_id = str_remove(doc_id, ".txt")) %>%
# removing empty rows + add line_id
mutate(line_id = row_number())
Besides row_number(), I also tried rowid_to_column(), and even c(1:1000), where 1000 is the number of rows in the dataframe. The error message stays the same.
Try explicitly specifying the data type of the "line_id" column as an integer using the as.integer() function, like this:
mutate(line_id = as.integer(row_number()))
This code works but is not fully satisfying, since I have to break the pipe:
split_text$line_id <- as.integer(c(1:nrow(split_text)))
(I am new to R.)
I am trying to convert columns of the data frame members to factors if their names appear in a list, to_factors_list.
I have tried some code using mutate(across()), but it gives errors.
Data prep.:
library(tidyverse)
# tidytuesday himalayan data
members <- read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-09-22/members.csv")
# creating list of names
to_factors_list <- members %>%
map_df(~(data.frame(n_distinct = n_distinct(.x))),
.id = "var_name") %>%
filter(n_distinct < 15) %>%
select(var_name) %>% pull()
to_factors_list
############### output ###############
'season' 'sex' 'hired' 'success' 'solo' 'oxygen_used' 'died' 'death_cause' 'injured' 'injury_type'
Both of the attempts below give errors:
members %>%
mutate(across(~.x %in% to_factors_list, factor))
members %>%
mutate_if( ~.x %in% to_factors_list, factor)
I am not sure what is wrong. How can I make this work?
In base R, this can be done with lapply:
members[to_factors_list] <- lapply(members[to_factors_list], factor)
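The correct syntax is (wrapping the character vector in all_of(), which recent tidyselect versions ask for when selecting with an external vector):
members %>% mutate(across(all_of(to_factors_list), factor))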
Or if you prefer an older-version dplyr syntax:
members %>% mutate_at(vars(to_factors_list), factor)
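To see why the selection has to be by name: the first argument of across() picks columns, whereas a formula such as ~.x %in% to_factors_list compares each column's values with the list. A minimal sketch on a made-up tibble (demo and to_factors_demo are hypothetical stand-ins for members and to_factors_list):

```r
library(dplyr)

# Hypothetical stand-in for the members data
demo <- tibble(
  season = c("Spring", "Autumn", "Spring"),
  sex    = c("M", "F", "M"),
  age    = c(30, 41, 25)
)

# Hypothetical stand-in for to_factors_list
to_factors_demo <- c("season", "sex")

# all_of() selects the columns whose names are in the character vector
demo_factored <- demo %>%
  mutate(across(all_of(to_factors_demo), factor))
```

Only the named columns are converted; age stays numeric.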
I need to create a subset of my main data frame (mydata1) in R.
The Date column in mydata1 has already been converted to class Date using the following code:
mydata1$Date = as.Date(mydata1$Date)
I am currently running this code to create the subset of my data:
mydata3 <- mydata1 %>%
filter(Total.Extras.Per.GN >= 100) %>%
filter(Original.Meal.Plan.Code %in% target) %>%
filter(Date, between ("2017-01-01"), ("2017-06-01")) %>%
select(PropertyCode, Date, Market, Original.Meal.Plan.Code, GADR, Total.Extras.Per.GN)
However, the line filter(Date, between ("2017-01-01"), ("2017-06-01")) %>% is giving me an error. How do I write it properly so that it filters my Date column with the dates specified therein?
Error message:
Error in filter_impl(.data, dots) :
argument "left" is missing, with no default
Simply pass Date as the first argument of between() and wrap the date strings in as.Date() for the comparison:
mydata3 <- mydata1 %>%
filter(Total.Extras.Per.GN >= 100) %>%
filter(Original.Meal.Plan.Code %in% target) %>%
filter(between(Date, as.Date("2017-01-01"), as.Date("2017-06-01"))) %>%
select(PropertyCode, Date, Market, Original.Meal.Plan.Code, GADR, Total.Extras.Per.GN)
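A small self-contained check of the between() step, on made-up data (date_demo and its values are hypothetical, not taken from mydata1). Note that between() is inclusive at both ends:

```r
library(dplyr)

# Hypothetical toy data with a Date column
date_demo <- tibble(
  Date = as.Date(c("2016-12-31", "2017-01-01", "2017-03-15",
                   "2017-06-01", "2017-06-02")),
  GADR = c(10, 20, 30, 40, 50)
)

# Keep rows whose Date lies in [2017-01-01, 2017-06-01]
date_subset <- date_demo %>%
  filter(between(Date, as.Date("2017-01-01"), as.Date("2017-06-01")))
```

The rows dated 2017-01-01 and 2017-06-01 are both kept because the bounds are inclusive.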
library(tidyverse)
data(mtcars)
mtcars <- rownames_to_column(mtcars,var = "car")
mtcars$make <- map_chr(mtcars$car,~strsplit(.x," ")[[1]][1])
mt2 <- mtcars %>% select(1:8,make) %>% nest(-make,.key = "l")
mt4<-mt2[1:5,]
mt4[c(1,5),"l"] <- list(list(NULL))
Now, I'd like to run the following function for each make of car:
fun_mt <- function(df){
a <- df %>%
filter(cyl<8) %>%
arrange(mpg) %>%
slice(1) %>%
select(mpg,disp)
return(a)
}
mt4 %>% mutate(newdf=map(l,~possibly(fun_mt(.x),otherwise = "NA"))) %>% unnest(newdf)
However, the NULL entries fail to evaluate, with this error:
Error: no applicable method for 'filter_' applied to an object of class "NULL"
I also tried using the safely and possibly approach, but still I get an error msg:
Error: Don't know how to convert NULL into a function
Any good solutions to this?
The problem is that NULL gets passed into the function fun_mt(). You wanted to catch this with possibly(). But possibly() is a function operator, i.e. you pass it a function and it returns a function. So, your call should have been
~ possibly(fun_mt, otherwise = "NA")(.x)
But this doesn't yet work with unnest(). Instead of a character "NA" (a bad idea anyway, rather use a proper NA) you would have to default to a data frame:
~ possibly(fun_mt, otherwise = data.frame(mpg = NA, disp = NA))(.x)
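Put together, the approach could look like the sketch below. It uses a small made-up nested tibble (nested, with makes "A", "B" and "C") rather than mt4, and wraps fun_mt once outside the pipe; safe_fun_mt is a name introduced here for illustration.

```r
library(dplyr)
library(purrr)
library(tidyr)

# Hypothetical nested data: the second entry is NULL, as in mt4
nested <- tibble(
  make = c("A", "B", "C"),
  l = list(
    tibble(cyl = c(4, 6), mpg = c(30, 21), disp = c(100, 160)),
    NULL,
    tibble(cyl = c(4, 8), mpg = c(25, 15), disp = c(120, 300))
  )
)

fun_mt <- function(df) {
  df %>%
    filter(cyl < 8) %>%
    arrange(mpg) %>%
    slice(1) %>%
    select(mpg, disp)
}

# possibly() takes a function and returns a function; the fallback is a
# one-row data frame so that unnest() can still bind the results
safe_fun_mt <- possibly(
  fun_mt,
  otherwise = tibble(mpg = NA_real_, disp = NA_real_)
)

result <- nested %>%
  mutate(newdf = map(l, safe_fun_mt)) %>%
  select(-l) %>%
  unnest(newdf)
```

Creating the wrapped function once, instead of calling possibly() inside the lambda, avoids rebuilding it for every row.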
I have a data frame that contains a list column.
The data frame holds a JSON response in one column, and a second column is a list converted from that JSON using the following code:
vectorize_fromJSON <- Vectorize(fromJSON, USE.NAMES=FALSE)
z <- vectorize_fromJSON(data_df$json_response)
I am using rowwise() with do() to extract information from the list.
However, I am not able to use a conditional with it.
Working code:
t <- data_df %>% rowwise %>% do(
test = class(.$json_list$cbas$dslscc)
)
I want something like the following:
t <- data_df %>% rowwise %>% do(
test = ifelse(class(.$json_list$cbas$dslscc)=="list", TRUE,
.$json_list$cbas$dslscc)
)
This is the error:
Error in
.$json_list$clear_bank_attributes$days_since_last_successful_check_cashed$nil
: $ operator is invalid for atomic vectors