In R, how do I read a file that uses the vertical bar "|" (ASCII 124) as its delimiter, or convert that delimiter to something else? I need to split on the whole numbers inside the file, so strsplit() does not help me.
I have R code that reads the csv file, but the result still contains the vertical bar "|" character. The file uses "|" as the separator between fields. When I try to read it with read.table(), I get a comma "," between every individual character. I also tried tab_spanner_delim(delim = "|") to convert the vertical bar after reading the file with read.delim("file.csv", sep="|"), but even read.delim() does not split the fields. I am new to handling special characters in R.
read.table(text = gsub("|", ",", readLines("file.csv")))
dat_csv <- read.delim("file.csv", sep="|")
x <- dat_csv %>% tab_spanner_delim(delim = "|")
dput() from read.table(text = gsub("|", ",", readLines("file.csv")))
",\",R,D,|,I,|,7,8,|,0,1,0,|,0,0,1,2,|,8,8,1,0,1,|,1,|,7,|,1,0,5,|,1,1,6,|,1,9,9,9,1,2,2,0,|,0,0,:,0,0,|,|,A,M,|,6,|,|,|,|,|,|,|,|,|,|,|,|,|,\",",
",\",R,D,|,I,|,7,8,|,0,1,0,|,0,0,1,2,|,8,8,1,0,1,|,1,|,7,|,1,0,5,|,1,1,6,|,1,9,9,9,1,2,2,6,|,0,0,:,0,0,|,4,.,9,|,|,6,|,|,|,|,|,|,|,|,|,|,|,|,|,\","
dput() from dat_csv <- read.delim("file.csv", sep="|")
"RD|I|78|010|0012|88101|1|7|105|116|19991220|00:00||AM|6|||||||||||||",
"RD|I|78|010|0012|88101|1|7|105|116|19991226|00:00|4.9||6|||||||||||||"
dput(dat_csv)
"RD|I|78|010|0012|88101|1|7|105|116|19991220|00:00||AM|6|||||||||||||",
"RD|I|78|010|0012|88101|1|7|105|116|19991226|00:00|4.9||6|||||||||||||"
We can read the data line by line with readLines(), strip the unwanted characters (quotes, commas, and spaces) from both ends of each line with trimws(), paste the lines into a single string with the newline character (\n) as the collapse argument, and pass that string to read.table() to read the data as a data frame.
data <- read.table(text = paste0(trimws(readLines('file.csv'),
whitespace = '[", ]'), collapse = '\n'), sep = '|')
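For context, the comma-between-every-character output in the question most likely comes from gsub() treating "|" as the regular-expression alternation operator rather than as a literal character; passing fixed = TRUE (or escaping it as "\\|") makes it literal. The sketch below shows that on a toy string and then applies the readLines()/trimws() approach to two shortened stand-in lines. The shortened lines and the colClasses = "character" choice (to keep leading zeros such as 010) are assumptions for illustration, not part of the original answer. The whitespace argument of trimws() needs R 3.6.0 or later.

```r
# "|" is a regex metacharacter; fixed = TRUE treats it as a literal character.
literal <- gsub("|", ",", "RD|I|78", fixed = TRUE)
literal
# [1] "RD,I,78"

# Shortened stand-ins for what readLines("file.csv") returns
# (assumption: each line is wrapped in quotes, some with a trailing comma).
lines <- c(
  '"RD|I|78|010|19991220|00:00||AM|6||",',
  '"RD|I|78|010|19991226|00:00|4.9||6||"'
)

# Strip quotes, commas, and spaces from both ends of every line.
cleaned <- trimws(lines, whitespace = '[", ]')

# colClasses = "character" is an optional choice that keeps leading zeros.
dat <- read.table(text = paste0(cleaned, collapse = "\n"), sep = "|",
                  colClasses = "character")
dim(dat)
```

Dropping colClasses lets read.table() guess the column types, which turns fields such as 010 into the integer 10.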
I need to import a bunch of .csv files into R. I do this using the following code:
Dataset <- read.csv(paste0("./CSV/State_level/",file,".csv"),header = F,sep = ";",dec = "," , stringsAsFactors = FALSE)
The input is a .csv file that uses "," as the decimal separator. Unfortunately, quite a few entries look like this: 20,012,054.
This should really be 20012,054. Such entries sometimes lead to NAs, but usually the whole df is imported as character rather than numeric, which is what I'd like to have.
How do I get rid of the first "," (reading from left to right), but only if the number has more than 3 digits in front of the decimal comma?
Here is a sample of how the data looks in the .csv-file:
A data.frame might look like this:
df<-data.frame(a=c(0.5,0.84,12.25,"20,125,25"), b=c("1,111,054",0.57,105.25,0.15))
I used "." as the decimal separator here to make the values numbers; in the .csv it is a ",". Numbers in the format 123,45 are not the issue.
Thank you for your ideas & help!
We can use sub() with a lookahead to remove the first , only when another , follows it:
df[] <- lapply(df, function(x) sub(",(?=.*,)", "", x, perl = TRUE))
Just to show that it leaves the , alone if there is only a single , in the string:
sub(",(?=.*,)", "", c("0,5", "20,125,25"), perl = TRUE)
#[1] "0,5" "20125,25"
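If the goal is numeric columns, one possible follow-up is to turn the remaining decimal comma into a point and call as.numeric(). The sketch below is an extension of the answer, not part of it: the helper name to_numeric is hypothetical, and it uses gsub() instead of sub() so that every comma except the last is dropped, on the assumption that some entries may contain more than two commas.

```r
# Hypothetical helper: clean comma-formatted strings and convert to numeric.
to_numeric <- function(x) {
  # Drop every comma that is followed by another comma later in the string.
  x <- gsub(",(?=.*,)", "", x, perl = TRUE)
  # The one remaining comma (if any) is the decimal mark; make it a point.
  as.numeric(sub(",", ".", x, fixed = TRUE))
}

res <- to_numeric(c("0,5", "20,125,25", "1,111,054", "0.57"))
res
```

It could be applied to a whole data frame in the same way as above, for example df[] <- lapply(df, to_numeric).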
I have a txt file (remove.txt) with this kind of data (RGB hex colors):
"#DDDEE0", "#D8D9DB", "#F5F6F8", "#C9CBCA"...
These are colors I don't want in my analysis.
I also have an R object (nacreHEX) with the same kind of data, which contains both the good colors and the colors I don't want in my analysis. So I use this code to remove them:
nacreHEX <- nacreHEX[!nacreHEX %in% remove]
It works when remove is an R object like remove <- c("#DDDEE0", "#D8D9DB"...), but it doesn't work when remove comes from a txt file and is read into a data.frame, nor when I try remove2 <- as.vector(t(remove)).
So here is my code:
remove <- read.table("remove.txt", sep=",")
remove2 <-as.vector(t(remove))
nacreHEX <- nacreHEX [! nacreHEX %in% remove2]
head(nacreHEX)
With this, there are no commas in the as.vector() result, so maybe that's why it doesn't work.
How can I make an R vector with commas from this kind of data?
Which step did I forget?
The problem is that your txt file is separated by ", " (comma plus space), not ",". The spaces end up in your strings:
rr = read.table(text = '"#DDDEE0", "#D8D9DB", "#F5F6F8", "#C9CBCA"', sep = ",")
(rr = as.vector(t(rr)))
# [1] "#DDDEE0" " #D8D9DB" " #F5F6F8" " #C9CBCA"
You can see the leading spaces before the #. We can trim these spaces with trimws().
trimws(rr)
# [1] "#DDDEE0" "#D8D9DB" "#F5F6F8" "#C9CBCA"
Even better, you can use the argument strip.white to have read.table do it for you:
rr = read.table(text = '"#DDDEE0", "#D8D9DB", "#F5F6F8", "#C9CBCA"',
sep = ",", strip.white = TRUE)
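Putting the pieces together, a minimal sketch of the whole filtering step might look like this. The contents of nacreHEX are made up for illustration, and the text = argument stands in for reading remove.txt from disk; stringsAsFactors = FALSE is set explicitly so the result is character on older R versions too.

```r
# Stand-in for read.table("remove.txt", ...): same content passed as text.
rr <- read.table(text = '"#DDDEE0", "#D8D9DB", "#F5F6F8", "#C9CBCA"',
                 sep = ",", strip.white = TRUE, stringsAsFactors = FALSE)

# Flatten the one-row data frame into a plain character vector.
remove <- as.vector(t(rr))

# Hypothetical colors: two that should be dropped and one that should stay.
nacreHEX <- c("#DDDEE0", "#112233", "#C9CBCA")

# Keep only the colors that are not in the removal list.
nacreHEX <- nacreHEX[!nacreHEX %in% remove]
nacreHEX
# [1] "#112233"
```

The commas shown by c(...) are only R syntax for building a vector; the printed vector never contains them, so their absence is not the cause of the mismatch.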
I need to export a pipe-delimited ("|") txt file from an R script.
Thanks!
Scottieie
The function write.table provides the argument sep to define a separator.
Use sep = "|" to separate cells by a | character, e.g.:
write.table(data.frame(a=1:3,b=3:1), file = "output.txt", sep = "|")
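By default write.table() also writes row names and puts quotes around character values and column names, which is often not wanted in a pipe-delimited export. The sketch below turns both off and reads the file back to show the result; the use of tempfile() instead of "output.txt" is only so the example leaves no file behind.

```r
# Write to a temporary file so the example does not touch the working directory.
out <- tempfile(fileext = ".txt")

# quote = FALSE drops the surrounding quotes; row.names = FALSE drops the
# leading row-name column.
write.table(data.frame(a = 1:3, b = 3:1), file = out, sep = "|",
            quote = FALSE, row.names = FALSE)

written <- readLines(out)
written
# [1] "a|b" "1|3" "2|2" "3|1"
```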
A Tidyverse solution would be to use write_delim():
write_delim(my_table, "data/my_table.txt", delim = "|")
See the readr documentation for write_delim() for details.