Change column class if column consists of numbers - r

I have a data frame where I had to convert all variables to the character class in order to bind_rows(). Now I want to identify and convert the columns that have numbers in them back to class numeric. I have 41 variables so I don't want to have to mutate each of them separately.
Preferably the tidyverse way.
library(dplyr)
tibble(number_var = as.character(rnorm(26)),
       character_var = LETTERS)

You could use parse_guess from the readr package:
library(dplyr)
library(readr)
df <- tibble(number_var = as.character(rnorm(26)),
             character_var = LETTERS)
df %>%
  mutate(across(everything(), parse_guess)) # guess the type of each column
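If adding readr is not an option, base R's type.convert() does a similar per-column guess. This is only a sketch of an alternative, not part of the answer above; the seed and the name `converted` are chosen here for illustration.

```r
# Same toy data as above, built as a plain data.frame of character columns.
set.seed(1)
df <- data.frame(number_var = as.character(rnorm(26)),
                 character_var = LETTERS,
                 stringsAsFactors = FALSE)

# type.convert() re-guesses the type of every column;
# as.is = TRUE keeps non-numeric columns as character instead of factor.
converted <- type.convert(df, as.is = TRUE)
str(converted)
```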


How to add a column that identifies groups of consecutive days

In a data.frame, I would like to add a column that identifies groups of consecutive days.
I think I need to start by converting my strings to date format...
Here's my example :
mydf <- data.frame(
var_name = c(rep("toto",6),rep("titi",5)),
date_collection = c("09/12/2022","10/12/2022","13/12/2022","16/12/2022","16/12/2022","17/12/2022",
"01/12/2022","03/11/2022","04/11/2022","05/11/2022","08/11/2022")
)
Expected output :
Convert to Date class, take the difference between adjacent dates, flag gaps of more than one day as a logical vector, and take the cumulative sum:
library(dplyr)
library(lubridate)
mydf %>%
mutate(id = cumsum(c(0, abs(diff(dmy(date_collection)))) > 1)+1)
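The one-liner can be unpacked step by step. The sketch below uses base R's as.Date() in place of lubridate's dmy(), which is a substitution made here so it runs without extra packages; the intermediate names `d` and `gap` are illustrative.

```r
mydf <- data.frame(
  var_name = c(rep("toto", 6), rep("titi", 5)),
  date_collection = c("09/12/2022", "10/12/2022", "13/12/2022", "16/12/2022",
                      "16/12/2022", "17/12/2022", "01/12/2022", "03/11/2022",
                      "04/11/2022", "05/11/2022", "08/11/2022")
)

# 1. Parse the day/month/year strings into Date objects.
d <- as.Date(mydf$date_collection, format = "%d/%m/%Y")

# 2. Absolute gap in days to the previous row (0 for the first row).
gap <- c(0, abs(diff(as.numeric(d))))

# 3. A gap of more than one day starts a new group; cumsum() numbers the groups.
mydf$id <- cumsum(gap > 1) + 1
mydf
```

Note that, like the answer above, this compares each row with the previous one regardless of var_name; it separates the two names here only because the dates at the boundary are far apart.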

R function transform a vector to columns

Is there a simple way to add columns to a data.frame with known vector's values?
I have a dataframe "Now"
Now<-data.frame(1:4)
Vect<-c("A","B","C")
Vect_name<-c("x1","x2","x3")
I want a dataframe result like this:
Result<-data.frame(c(1:4),"A","B","C")
colnames(Result)[-1]<-Vect_name
I also want the code to work when the length of "Vect" and "Vect_name" varies.
I mean Vect could be c(A,B,C,D...) and Vect_name could be c("x1","x2","x3","x4"...).
Thank you.
In base R, convert 'Vect' to a list and assign it to the columns named in 'Vect_name'
Now[Vect_name] <- as.list(Vect)
Or if we don't want to change the original object, use cbind
Now1 <- cbind(Now, t(setNames(Vect, Vect_name)))
Or with tidyverse, create a named list column and then use unnest_wider
library(dplyr)
library(tidyr)
Now %>%
mutate(col = list(as.list(setNames(Vect, Vect_name)))) %>%
unnest_wider(col)
data
Vect <- c("A", "B", "C")
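To check that the base R assignment also handles a vector of any length, here is a small sketch with four values. The column name `id` and the use of paste0() to generate the names are assumptions made for the example.

```r
Now <- data.frame(id = 1:4)

# Any number of values; the names are generated to match the length.
Vect <- c("A", "B", "C", "D")
Vect_name <- paste0("x", seq_along(Vect))

# Each list element becomes one new column, recycled over all rows.
Now[Vect_name] <- as.list(Vect)
Now
```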

R combine / merge two columns to one column

I have two columns (class: character) in a data.frame that contain large numbers (e.g. column A: 999967258082532415; column B: 999967258082532415). I want a new column C that combines the two numbers: 999967258082532415999967258082532415
I use:
data_1$visit_id <- do.call(paste, c(data_1[c("post_visid_high", "post_visid_low")], sep = ""))
My new column gets converted to factor, but I want it to stay character. What can I do?
I created a sample dataset that resembles yours. The values are stored as character strings, because numbers of this size can exceed 2^53, beyond which a numeric column cannot represent every integer exactly:
df <- data.frame(col_A = c("2314325435454354", "123098213728903214", "12329042374094"),
                 col_B = c("9034832054097390485", "30945743504375043", "234903284304"),
                 stringsAsFactors = FALSE)
Using dplyr, create a new column (col_C) that concatenates the other two columns with paste0, which always returns a character vector:
library(dplyr)
df <- df %>%
  mutate(col_C = paste0(col_A, col_B))
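A quick base R sketch shows why the IDs should stay as character and be pasted rather than treated as numbers. It assumes the column names from the question; the single-row data frame is made up for illustration.

```r
# Doubles cannot distinguish neighbouring integers above 2^53.
precision_lost <- (2^53 + 1) == 2^53

data_1 <- data.frame(post_visid_high = "999967258082532415",
                     post_visid_low  = "999967258082532415",
                     stringsAsFactors = FALSE)

# paste0() concatenates the strings and returns a character vector.
data_1$visit_id <- paste0(data_1$post_visid_high, data_1$post_visid_low)
class(data_1$visit_id)
```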

How to Rename Column Headers in R

I have two separate datasets: one has the column headers and another has the data.
The first one looks like this:
where I want to make the 2nd column as the column headers of the next dataset:
How can I do this? Thank you.
In general you can use colnames, which returns a character vector of the column names of your data frame or matrix. You can then rename the columns with:
colnames(df) <- *listofnames*
It is also possible to rename a single column by indexing with [].
This would rename the first column:
colnames(df2)[1] <- "name"
For your example, take the values from the second column of the first dataset. Try this:
colnames(df2) <- as.character(df1[,2])
Make sure the number of names matches the number of columns.
The equivalent for rows is rownames().
dplyr way w/ reproducible code:
library(dplyr)
df <- tibble(x = 1:5, y = 11:15)
df_n <- tibble(x = 1:2, y = c("col1", "col2"))
names(df) <- df_n %>% select(y) %>% pull()
I think the select() %>% pull() syntax is easier to remember than list indexing. Also I used names over colnames function. When working with a dataframe, colnames simply calls the names function, so better to cut out the middleman and be more explicit that we are working with a dataframe and not a matrix. Also shorter to type.
You can simply do this :
names(data)[3]<- 'Newlabel'
Where names(data)[3] is the column you want to rename.
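Putting the pieces together, a minimal base R sketch of the original task could look like this. The data frames `header_df` and `data_df` and their contents are hypothetical stand-ins for the two datasets in the question, which are not shown.

```r
# Hypothetical header table: the second column holds the desired names.
header_df <- data.frame(id = 1:3,
                        label = c("height", "weight", "age"),
                        stringsAsFactors = FALSE)

# Hypothetical data table with placeholder column names.
data_df <- data.frame(V1 = c(170, 182), V2 = c(65, 80), V3 = c(31, 45))

# Guard against a mismatch between the number of names and columns.
stopifnot(nrow(header_df) == ncol(data_df))

names(data_df) <- as.character(header_df[, 2])
data_df
```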

R: reshape dataframe from wide to long format based on compound column names

I have a dataframe containing observations for two sets of data (A,B), with dataset and observation type given by the column names :
mydf <- data.frame(meta1=paste0("a",1:2), meta2=paste0("b",1:2),
A_var1 = c(11:12), A_var2 = c("p","r"),
B_var1 = c(21:22), B_var2 = c("x","z"))
I would like to reshape this dataframe so that each row contains observations on one set only. In this long format, set and column names should be given by splitting the original column names at the '_':
mydf2 <- data.frame(meta1=rep(paste0("a",1:2),2),
meta2=rep(paste0("b",1:2),2),
set=c("A","A","B","B"),
var1 = c(11:12, 21:22),
var2 = c("p","r","x","z"))
I have tried using 'gather' in combination with 'str_split' and 'sub', but unfortunately without success. Could this be done using tidyverse functions?
Yes, you can do this with the tidyverse!
You were close: you need to gather, then separate, then spread.
new_df <- mydf %>%
gather(set, vars, 3:6) %>%
separate(set, into = c('set', 'var'), sep = "_") %>%
spread(var, vars)
hope this helps!
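In more recent versions of tidyr, the same reshape can be written in one step with pivot_longer(). This is a sketch of an alternative to the gather/separate/spread chain, not the answer's own method; it relies on the special ".value" name in names_to, and `long_df` is a name chosen here.

```r
library(tidyr)

mydf <- data.frame(meta1 = paste0("a", 1:2), meta2 = paste0("b", 1:2),
                   A_var1 = c(11:12), A_var2 = c("p", "r"),
                   B_var1 = c(21:22), B_var2 = c("x", "z"),
                   stringsAsFactors = FALSE)

# Split each column name at "_": the first part goes into a new "set" column,
# the second part (".value") becomes the name of the output column.
long_df <- pivot_longer(mydf,
                        cols = -c(meta1, meta2),
                        names_to = c("set", ".value"),
                        names_sep = "_")
long_df
```

Unlike the gather-based approach, this keeps var1 as integer and var2 as character, because values of different types are never stacked into a single column.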
