This question already has answers here:
How to avoid: read.table truncates numeric values beginning with 0
(3 answers)
Closed 8 years ago.
I have a dataset in .csv format. One column contains values with leading zeros, such as "05" and "02". I am trying to import the .csv file using read.csv in R. It reads the file successfully, but it removes the leading zeros.
Thanks in Advance.
If all the data in the column are of the same length, you can do paste0("0", NAME).
If variable length, try formatC like so: formatC(NAME, width = 2, format = "d", flag = "0").
In the latter example, 'd' refers to 'integer' and 'width' can be changed as desired.
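As a rough illustration of both routes, the sketch below uses a small made-up CSV (the column names `id` and `code` are assumptions, not from the question). It first shows the zeros being dropped, then keeps them at import time with `colClasses`, and finally restores them afterwards with `formatC`, which only works when the intended width is known.

```r
# Hypothetical CSV content standing in for the real file
csv_text <- "id,code\n1,05\n2,02\n3,11"

# Default import: `code` is parsed as integer, so the zeros are lost
raw <- read.csv(text = csv_text)

# Keep the zeros by reading the column as character
kept <- read.csv(text = csv_text, colClasses = c(code = "character"))

# Or pad the integer values back to a fixed width of 2
padded <- formatC(raw$code, width = 2, format = "d", flag = "0")

print(kept$code)
print(padded)
```

Reading the column as character avoids having to guess the original width, so it is usually the safer choice when the file is still available.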
This question already has answers here:
How to sort a character vector where elements contain letters and numbers?
(6 answers)
Sort columns numerically in R [duplicate]
(2 answers)
Closed 9 months ago.
I have 100 files, each named "ABC - Day - 1(to 100).csv".
When I read them into R, it is ordered like this: Day1, Day10, Day100, etc. (see figure 1). I know R does this because it is sorting it by character, not by number. Is there a way that I could reorder the path in numerically correct order (Day1, Day2, Day3, ...) without me actually having to manually change my raw file names?
Here is what I have so far:
filenames <- list.files(path = "../STEP_ONE/Test_raw",
                        pattern = "ADD_Day+.*sav",
                        full.names = TRUE) # Reads in the paths of the 100 files
Let's suppose you have a vector v with the names of your files (according to what you said, ___Day__.sav). You can extract the day number and reorder the names with the following code:
# Load library
library(stringr)
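# Data frame with the matched part of each file name and the captured day
# (the dot before "sav" is escaped so it matches a literal ".")
tab <- as.data.frame(str_match(v, "Day\\s*(.*?)\\s*\\.sav"), stringsAsFactors = FALSE)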
# Column names
colnames(tab) <- c("file.name", "day")
# Day as numeric
tab$day <- as.numeric(tab$day)
# Reorder the file names in `v` and the rows of `tab` according to $day
ord <- order(tab$day)
v <- v[ord]
tab <- tab[ord, ]
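The same idea can be sketched in base R without stringr. The file names below are invented for illustration and assume the day number sits directly before the `.sav` extension; the real names may need a different pattern.

```r
# Hypothetical file names, in the order list.files() would return them
files <- c("ADD_Day1.sav", "ADD_Day10.sav", "ADD_Day100.sav", "ADD_Day2.sav")

# Capture the digits after "Day" and convert them to numbers
day <- as.numeric(sub(".*Day\\s*(\\d+)\\.sav$", "\\1", files))

# Reorder the names by the numeric day rather than alphabetically
sorted_files <- files[order(day)]
print(sorted_files)
```

Because only the index vector from `order()` is used, the full paths returned by `list.files(full.names = TRUE)` can be reordered the same way.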
This question already has answers here:
Create a numeric vector with names in one statement?
(6 answers)
Closed 10 months ago.
How can I assign values to strings when the data is very large?
Currently I assign values to character vectors manually as illustrated below, however, when the amount of data is very large it becomes tedious to do that process manually. Is there a function that allows me to do it?
c("a" = 100, "b" = 200, "c"=300, ..., "aaaaaa" = n)
Is there any particular meaning to your numbering? If you just need numerical values for strings, you can use as.factor and as.numeric as outlined in this post:
R: Encode character variables into numeric
However, if you need a specific encoding, you will have to specify the associated labels yourself; there isn't enough information in your question to help with this further, but the documentation is here:
https://www.rdocumentation.org/packages/base/versions/3.6.2/topics/factor
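A minimal sketch of both options, assuming the strings and their values already exist as two vectors of equal length (`labels` and `values` are made-up names): `setNames` builds the named vector in one call, and `as.factor` followed by `as.numeric` gives arbitrary integer codes when the actual numbers do not matter.

```r
# Hypothetical inputs: one vector of names, one of values
labels <- c("a", "b", "c")
values <- c(100, 200, 300)

# Build the named numeric vector without typing each pair
named <- setNames(values, labels)

# If any consistent code will do, let factor() assign it
# (codes follow the sorted order of the unique strings)
codes <- as.numeric(as.factor(c("b", "a", "c", "a")))

print(named)
print(codes)
```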
This question already has answers here:
R regex find last occurrence of delimiter
(4 answers)
Closed 1 year ago.
I have a matrix with thousands of columns whose names are as shown below:
Z41_5_tes_ACGTTCCATAGCCGTA
Z41_5_ACGTTCCAGAGCGGTA
Z53_5_ACGTTCCAGAGCCGTA
Z53_5_ACGTTCCAGATCTGTA
Z41_5_ACGTTGCATAGCGGTA
Z41_5_tes_ACGTTCGCTAGCCGTA
I would like to create a vector of names that contains the beginning of each column name, as shown below:
Z41_5_tes
Z41_5
Z53_5
Z53_5
Z41_5
Z41_5_tes
I have tried the following, but it does not capture Z41_5_tes:
names <- gsub("^([^_]*_[^_]*)_.*$", "\\1", colnames(x@data))
Z41_5
Z53_5
Remove everything after the last underscore.
sub('_[^_]*$', '', x)
#[1] "Z41_5_tes" "Z41_5" "Z53_5" "Z53_5" "Z41_5" "Z41_5_tes"
Extract everything before last underscore.
sub('(.*)_.*', '\\1', x)
#[1] "Z41_5_tes" "Z41_5" "Z53_5" "Z53_5" "Z41_5" "Z41_5_tes"
data
x <- c("Z41_5_tes_ACGTTCCATAGCCGTA", "Z41_5_ACGTTCCAGAGCGGTA",
"Z53_5_ACGTTCCAGAGCCGTA", "Z53_5_ACGTTCCAGATCTGTA",
"Z41_5_ACGTTGCATAGCGGTA", "Z41_5_tes_ACGTTCGCTAGCCGTA")
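To connect this back to the matrix in the question, here is a small sketch that applies the first pattern to column names directly. The matrix `m` and its shortened column names are invented for illustration.

```r
# Hypothetical matrix with column names in the same style as the question
m <- matrix(0, nrow = 1, ncol = 3,
            dimnames = list(NULL, c("Z41_5_tes_ACGT", "Z41_5_ACGA", "Z53_5_ACGG")))

# Drop the last underscore and everything after it
prefix <- sub("_[^_]*$", "", colnames(m))
print(prefix)
```

The pattern is anchored at the end of the string and `[^_]*` cannot cross an underscore, so only the final field is removed no matter how many underscores precede it.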
This question already has answers here:
How to convert a data frame column to numeric type?
(18 answers)
Closed 3 years ago.
I have a huge dataset and some columns are text. Upon importing the Excel file and using preview, I can manually change the first 50 columns to numeric, if this applies. But, there are still 250 more columns I need to change to numeric. How would I use R code to change all columns from column 22 through column 300 to numeric?
We can use type.convert (assuming the columns are character class)
df1[22:300] <- type.convert(df1[22:300], as.is = TRUE)
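Also, with mutate_at from dplyr (using type.convert, since readr's type_convert expects a whole data frame rather than a single column)
library(dplyr)
df1 %>%
  mutate_at(22:300, type.convert, as.is = TRUE)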
One way:
mydf[22:300] <- lapply(mydf[22:300], as.numeric)
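Here is the `lapply` approach on a toy data frame, with columns 2 to 3 standing in for columns 22 to 300. The data frame `df` and its contents are assumptions made up for the example.

```r
# Hypothetical data frame where the numeric columns were imported as text
df <- data.frame(id = c("a", "b"),
                 x  = c("1", "2"),
                 y  = c("3.5", "4"),
                 stringsAsFactors = FALSE)

# Convert a range of columns to numeric in one step
df[2:3] <- lapply(df[2:3], as.numeric)

str(df)
```

Values that cannot be parsed as numbers become `NA` with a warning, so it is worth checking for unexpected `NA`s afterwards.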
This question already has answers here:
How to avoid: read.table truncates numeric values beginning with 0
(3 answers)
Closed 6 years ago.
I'm using colsplit to split a large string into columns.
There are numbers in the string with leading zeros.
How can I prevent colsplit from converting them to numeric values?
Example:
value in string: 0000122517
after colsplit this becomes: 122517
I need the leading zeros, and the values can be of any length, so I
cannot add the zeros afterwards.
Kind regards,
Oene Douma
We can use read.table
read.table(text=str, sep="~", header=FALSE, colClasses = c("character", "character"))
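A self-contained version of that call might look like the sketch below. The input string and the `~` separator are assumptions, since the question does not show the actual delimiter. A single `"character"` value for `colClasses` is recycled across all columns, so the number of fields does not need to be known in advance.

```r
# Hypothetical delimited string with leading zeros in both fields
str_in <- "0000122517~0042"

# Read every field as character so nothing is converted to a number
parts <- read.table(text = str_in, sep = "~", header = FALSE,
                    colClasses = "character")

print(parts$V1)
print(parts$V2)
```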