I have a data frame where I had to convert all variables to the character class in order to bind_rows(). Now I want to identify the columns that contain numbers and convert them back to class numeric. I have 41 columns, so I don't want to have to mutate each of them separately.
Preferably the tidyverse way.
library(dplyr)
tibble(number_var = as.character(rnorm(26)),
character_var = LETTERS)
You could use parse_guess() from the readr package:
library(dplyr)
library(readr)
df <- tibble(number_var = as.character(rnorm(26)),
character_var = LETTERS)
df %>%
mutate(across(everything(), parse_guess)) # guess column type for each column
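For comparison, here is a minimal base R sketch of the same idea, assuming that base R's type.convert() guesses the column types acceptably for your data. The name `converted` is only illustrative.

```r
# Rebuild the example with plain character columns
df <- data.frame(number_var = as.character(rnorm(26)),
                 character_var = LETTERS,
                 stringsAsFactors = FALSE)

# type.convert() guesses a type for every column;
# as.is = TRUE keeps non-numeric columns as character instead of factor
converted <- type.convert(df, as.is = TRUE)
str(converted)
```

Check the result with str() before relying on it, since the guessing rules are not identical to those of parse_guess().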
In a data.frame, I would like to add a column that identifies groups of consecutive days.
I think I need to start by converting my strings to date format...
Here's my example :
mydf <- data.frame(
var_name = c(rep("toto",6),rep("titi",5)),
date_collection = c("09/12/2022","10/12/2022","13/12/2022","16/12/2022","16/12/2022","17/12/2022",
"01/12/2022","03/11/2022","04/11/2022","05/11/2022","08/11/2022")
)
Expected output :
Convert the column to Date class, take the difference between adjacent dates to create a logical vector (TRUE where the gap is more than one day), and take the cumulative sum:
library(dplyr)
library(lubridate)
mydf %>%
mutate(id = cumsum(c(0, abs(diff(dmy(date_collection)))) > 1)+1)
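The same steps can be sketched in base R with the intermediate results spelled out. The names `dates` and `gaps` are only illustrative, and like the pipeline above this does not restart the counter for each var_name.

```r
mydf <- data.frame(
  var_name = c(rep("toto", 6), rep("titi", 5)),
  date_collection = c("09/12/2022", "10/12/2022", "13/12/2022", "16/12/2022",
                      "16/12/2022", "17/12/2022", "01/12/2022", "03/11/2022",
                      "04/11/2022", "05/11/2022", "08/11/2022")
)

# Parse day/month/year strings into Date objects
dates <- as.Date(mydf$date_collection, format = "%d/%m/%Y")

# Absolute gap in days between each row and the previous one
gaps <- abs(as.numeric(diff(dates)))

# A new group starts wherever the gap is larger than one day
mydf$id <- cumsum(c(0, gaps) > 1) + 1
mydf
```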
Is there a simple way to add columns to a data.frame with known vector's values?
I have a dataframe "Now"
Now<-data.frame(1:4)
Vect<-c(A,B,C)
Vect_name<-c("x1","x2","x3")
I want a dataframe result like this:
Result<-data.frame(c(1:4),"A","B","C")
colnames(Result)<-Vect_name
and I want the code to also work when the lengths of "Vect" and "Vect_name" vary.
I mean Vect could be c(A,B,C,D...) and Vect_name could be c("x1","x2","x3","x4"...).
Thank you.
In base R, we can convert 'Vect' to a list and assign it to the columns named in 'Vect_name':
Now[Vect_name] <- as.list(Vect)
Or if we don't want to change the original object, use cbind
Now1 <- cbind(Now, t(setNames(Vect, Vect_name)))
Or with tidyverse, create a named list column and then use unnest_wider
library(dplyr)
library(tidyr)
Now %>%
mutate(col = list(as.list(setNames(Vect, Vect_name)))) %>% # setNames() is base R; set_names() would need purrr
unnest_wider(col)
data
Vect <- c("A", "B", "C")
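To illustrate that the base R assignment does not depend on the number of values, here is a small sketch with four values instead of three. The column name `id` and the way `Vect_name` is generated are assumptions made for this example.

```r
Now <- data.frame(id = 1:4)
Vect <- c("A", "B", "C", "D")

# Build one name per value, whatever the length of Vect
Vect_name <- paste0("x", seq_along(Vect))

# Each list element becomes a column, recycled over all rows
Now[Vect_name] <- as.list(Vect)
Now
```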
I have two columns (class: character) in a data.frame that contain large numbers (e.g. column A: 999967258082532415; column B: 999967258082532415). I want a new column C that combines the two numbers: 999967258082532415999967258082532415
I use:
data_1$visit_id <- do.call(paste, c(data_1[c("post_visid_high", "post_visid_low")], sep = ""))
But my new column gets converted to a factor, and I want it to stay a character. What can I do?
I created a sample dataset that resembles yours, storing the IDs as character strings so that no digits are lost to floating-point rounding:
df <- data.frame(col_A = c("2314325435454354", "123098213728903214", "12329042374094"),
col_B = c("9034832054097390485", "30945743504375043", "234903284304"),
stringsAsFactors = FALSE)
Using dplyr, create a new column (col_C) that concatenates the other two with paste0(), which returns a character vector:
library(dplyr)
df <- df %>%
mutate(col_C = paste0(col_A, col_B)) # paste0() joins the strings; `+` would add the numbers
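Applied to the column names from the question, a base R sketch might look like the following. It assumes both columns are already character, and the one-row `data_1` is made up for illustration.

```r
data_1 <- data.frame(post_visid_high = "999967258082532415",
                     post_visid_low = "999967258082532415",
                     stringsAsFactors = FALSE)

# paste0() concatenates the strings and returns a character vector;
# assigning it with $<- stores it as character, not factor
data_1$visit_id <- paste0(data_1$post_visid_high, data_1$post_visid_low)
class(data_1$visit_id)
```

If the column still ends up as a factor, the conversion is probably happening in a later step, such as rebuilding the data frame with stringsAsFactors = TRUE.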
I have two separate datasets: one has the column headers and another has the data.
The first one looks like this:
where I want to make the 2nd column as the column headers of the next dataset:
How can I do this? Thank you.
In general you can use colnames(), which returns the column names of your data frame or matrix as a character vector. You can rename all the columns with:
colnames(df) <- *listofnames*
Also it is possible just to rename one name by using the [] brackets.
This would rename the first column:
colnames(df2)[1] <- "name"
For your example, we take the values of the second column of the first dataset. Try this:
colnames(df2) <- as.character(df1[,2])
Make sure that the number of names is identical to the number of columns.
Equivalent for rows is rownames()
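Since the original datasets are not shown, here is a self-contained sketch with made-up data. The contents of `df1` and `df2` are assumptions for illustration only.

```r
# First dataset: its second column holds the desired headers
df1 <- data.frame(id = 1:3,
                  header = c("height", "weight", "age"),
                  stringsAsFactors = FALSE)

# Second dataset: the data, with placeholder column names
df2 <- data.frame(V1 = c(170, 180), V2 = c(65, 80), V3 = c(30, 40))

# Guard against a mismatch between names and columns
stopifnot(nrow(df1) == ncol(df2))

colnames(df2) <- as.character(df1[, 2])
df2
```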
dplyr way w/ reproducible code:
library(dplyr)
df <- tibble(x = 1:5, y = 11:15)
df_n <- tibble(x = 1:2, y = c("col1", "col2"))
names(df) <- df_n %>% select(y) %>% pull()
I think the select() %>% pull() syntax is easier to remember than list indexing. Also I used names over colnames function. When working with a dataframe, colnames simply calls the names function, so better to cut out the middleman and be more explicit that we are working with a dataframe and not a matrix. Also shorter to type.
You can simply do this :
names(data)[3]<- 'Newlabel'
Where names(data)[3] is the column you want to rename.
I have a dataframe containing observations for two sets of data (A,B), with dataset and observation type given by the column names :
mydf <- data.frame(meta1=paste0("a",1:2), meta2=paste0("b",1:2),
A_var1 = c(11:12), A_var2 = c("p","r"),
B_var1 = c(21:22), B_var2 = c("x","z"))
I would like to reshape this dataframe so that each row contains observations on one set only. In this long format, set and column names should be given by splitting the original column names at the '_':
mydf2 <- data.frame(meta1=rep(paste0("a",1:2),2),
meta2=rep(paste0("b",1:2),2),
set=c("A","B","A","B"),
var1 = c(11:12),
var2 = c("a","b","c","d"))
I have tried using 'gather' in combination with 'str_split' and 'sub', but unfortunately without success. Could this be done using tidyverse functions?
Yes you can do this with tidyverse !
You were close, you need to gather, then separate, then spread.
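library(tidyr)
new_df <- mydf %>%
gather(set, vars, 3:6) %>% # stack the four A_/B_ columns (values become character)
separate(set, into = c('set', 'var'), sep = "_") %>% # split "A_var1" into "A" and "var1"
spread(var, vars, convert = TRUE) # one column per variable, types restored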
hope this helps!
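As an alternative sketch, assuming a version of tidyr that provides pivot_longer() (1.0.0 or later), the three steps can be collapsed into one call. The special name ".value" tells pivot_longer() that the part of each column name after the "_" should become a column of its own.

```r
library(tidyr)

mydf <- data.frame(meta1 = paste0("a", 1:2), meta2 = paste0("b", 1:2),
                   A_var1 = 11:12, A_var2 = c("p", "r"),
                   B_var1 = 21:22, B_var2 = c("x", "z"),
                   stringsAsFactors = FALSE)

# "A_var1" is split at "_": "A" goes to `set`, "var1" becomes a column name
long <- pivot_longer(mydf,
                     cols = -c(meta1, meta2),
                     names_to = c("set", ".value"),
                     names_sep = "_")
long
```

Because var1 and var2 are never stacked into a single column, their original types are kept without a conversion step.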