Keep leading zeros with colsplit in R [duplicate] - r

This question already has answers here:
How to avoid: read.table truncates numeric values beginning with 0
(3 answers)
Closed 6 years ago.
I'm using colsplit to split a large string into columns.
There are numbers in the string with leading zeros.
How can I prevent colsplit from converting them to numeric values?
Example:
value in string: 0000122517
after colsplit this becomes: 122517
I need the leading zeros, and the values can be of any length, so I
cannot add the zeros afterwards.
Kind regards,
Oene Douma

We can use read.table
read.table(text=str, sep="~", header=FALSE, colClasses = c("character", "character"))
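A minimal sketch of this, assuming the input is a character vector of "~"-separated lines (the sample values below are made up for illustration). Passing a single "character" to colClasses recycles it across all columns, so no field is converted to numeric:

```r
# Hypothetical input: each element is one line with two "~"-separated fields.
str <- c("0000122517~abc", "0042~def")

# Read every column as character so leading zeros are kept as-is.
res <- read.table(text = str, sep = "~", header = FALSE,
                  colClasses = "character")

res$V1
```

Because the values stay character, they keep their original width whatever their length is.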

Related

Is there an R function to format a character string pattern? [duplicate]

This question already has an answer here:
Split delimited single value character vector
(1 answer)
Closed 5 years ago.
I have a string in R in the following form:
"AAAAA","BBBBB","CCCCC",..
I want to convert it to a standard R character vector containing the same string elements ("AAAAA", "BBBBB", etc.):
vector<-c("AAAAA","BBBBB","CCCCC",..)
I've read that strsplit could do it, but haven't managed to achieve it.
strsplit gives you back a list of the character vectors, so if you want it in a single vector, use unlist as well.
So,
unlist(strsplit(string, ","))
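Note that if the string literally contains the double-quote characters, strsplit keeps them in each piece. A small sketch, assuming a made-up three-element input, that strips the quotes afterwards, with scan as an alternative that handles the quotes while reading:

```r
# Hypothetical input: one string that contains quotes and commas.
string <- '"AAAAA","BBBBB","CCCCC"'

# Split on the comma, then drop the literal quote characters.
parts <- unlist(strsplit(string, ",", fixed = TRUE))
parts <- gsub('"', "", parts, fixed = TRUE)

# Alternative: scan() treats the quotes as quoting characters and removes them.
parts_scan <- scan(text = string, what = "character", sep = ",", quiet = TRUE)
```

Both parts and parts_scan should be a plain character vector of length 3.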

Split column names in R using a separator [duplicate]

This question already has answers here:
Extracting numbers from vectors of strings
(12 answers)
Closed 2 years ago.
I have a dataframe X with column names such as
1_abc,
2_fgy,
27_msl,
936_hhq,
3_hdv
I want to keep just the numbers as the column names (so instead of 1_abc, just 1). How do I remove the rest of each name while keeping the data intact?
All column names use an underscore as the separator between the numeric and character parts. There are about 400 columns, so I want to be able to code this without using specific column names.
You may use sub here for a base R option:
names(df) <- sub("^(\\d+).*$", "\\1", names(df))
Another option might be:
names(df) <- sub("_.*", "", names(df))
This would just strip off everything from the first underscore until the end of the column name.
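A quick check of both patterns on a toy data frame (the data frame X and its values are made up here; only the column names follow the question):

```r
# Hypothetical data frame with names in the "<number>_<text>" form.
X <- data.frame(a = 1:2, b = 3:4, c = 5:6)
names(X) <- c("1_abc", "27_msl", "936_hhq")

# Option 1: capture the leading digits and discard the rest.
by_digits <- sub("^(\\d+).*$", "\\1", names(X))

# Option 2: delete everything from the first underscore onwards.
by_underscore <- sub("_.*", "", names(X))

names(X) <- by_underscore
names(X)
```

Only the names attribute is reassigned, so the column contents are untouched.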

Changing Columns 22-300 from double (or other type) into numeric [duplicate]

This question already has answers here:
How to convert a data frame column to numeric type?
(18 answers)
Closed 3 years ago.
I have a huge dataset and some columns are text. Upon importing the Excel file and using preview, I can manually change the first 50 columns to numeric, if this applies. But, there are still 250 more columns I need to change to numeric. How would I use R code to change all columns from column 22 through column 300 to numeric?
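We can use type.convert (assuming the columns are character class):
df1[22:300] <- type.convert(df1[22:300], as.is = TRUE)
Or with mutate_at from dplyr, applying type.convert to each column (readr's type_convert expects a whole data frame, so it cannot be applied per column here):
library(dplyr)
df1 <- df1 %>%
  mutate_at(22:300, type.convert, as.is = TRUE)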
One way:
mydf[22:300] <- lapply(mydf[22:300], as.numeric)
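To see the lapply approach on something small, here is a sketch with a made-up three-column data frame, converting columns 2 to 3 instead of 22 to 300. It assumes the columns are character; for factor columns, as.numeric would return the internal level codes, so they would need as.character first.

```r
# Hypothetical data: numbers stored as text in columns 2 and 3.
mydf <- data.frame(id = c("a", "b"),
                   x  = c("1.5", "2"),
                   y  = c("3", "4"),
                   stringsAsFactors = FALSE)

cols <- 2:3  # stands in for 22:300 in the question

# Convert each selected column in place; the other columns are left alone.
mydf[cols] <- lapply(mydf[cols], as.numeric)

classes <- sapply(mydf, class)
classes
```

Values that cannot be parsed as numbers become NA with a warning, which is worth checking for after the conversion.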

R: keep leading zero [duplicate]

This question already has answers here:
How to avoid: read.table truncates numeric values beginning with 0
(3 answers)
Closed 8 years ago.
I have a dataset in .csv format. One column contains values with a leading zero, such as "05" and "02". I am importing the file with read.csv in R. It reads successfully, but the leading zeros are removed.
Thanks in Advance.
If all the data in the column are of the same length, you can do paste0("0", NAME).
If variable length, try formatC like so: formatC(NAME, width = 2, format = "d", flag = "0").
In the latter example, 'd' refers to 'integer' and 'width' can be changed as desired.
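Both suggestions above re-pad the values after import. As a sketch of the alternative, the column can be read as character in the first place, so the zeros are never dropped. The column names code and value and the inline CSV text are assumptions for illustration:

```r
# Hypothetical CSV content, supplied inline instead of a file path.
csv_text <- "code,value\n05,1\n02,2"

# Read only the "code" column as character; other columns are guessed as usual.
dat <- read.csv(text = csv_text, colClasses = c(code = "character"))
dat$code

# If the values were already read as integers, re-pad them to a fixed width.
padded <- formatC(c(5L, 2L, 12L), width = 2, format = "d", flag = "0")
padded
```

Re-padding with formatC only recovers the original values when every value is known to have the same width, which is why reading as character is the safer route when widths vary.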
