How to print double quotes (") in R

How to print double quotes (") in R - r

I want to print to the screen double quotes (") in R, but it is not working. Typical regex escape characters are not working:
> print('"')
[1] "\""
> print('\"')
[1] "\""
> print('/"')
[1] "/\""
> print('`"')
[1] "`\""
> print('"xml"')
[1] "\"xml\""
> print('\"xml\"')
[1] "\"xml\""
> print('\\"xml\\"')
[1] "\\\"xml\\\""
I want it to return:
" "xml" "
which I will then use downstream.
Any ideas?

Use cat:
cat("\" \"xml\" \"")
OR
cat('" "','xml','" "')
Output:
" "xml" "
Alternative using noqoute:
noquote(" \" \"xml\" \" ")
Output :
" "xml" "
Another option using dQoute:
dQuote(" xml ")
Output :
"“ xml ”"

With the help of the print parameter quote:
print("\" \"xml\" \"", quote = FALSE)
> [1] " "xml" "
or
cat('"')

Related

Regex for matching between a colon and last newline prior to next colon

I am trying to parse a string with regex to pull out information between a colon and the last newline prior to the next colon. How can I do this?
string <- "Name: Al's\nPlace\nCountry:\nState\n/ Province: RI\n"
stringr::str_extract_all(string, "(?<=:)(.*)(?:\\n)")
but I get:
[[1]]
[1] " Al's\n" " \n" " RI\n"
when I want:
[[1]]
[1] " Al's\nPlace\n" " \n" " RI\n"

I'm not sure if this is what you're after as your wanted output looks a bit different.
:((?:.*\\n?)+?)(?=.*:|$)
: match a colon
((?:.*\n?)+?) match and capture lazily any lines (to optional \n)
(?=.*:|$) until there is a line with colon ahead
See this demo at regex101

R - Converting a text file with white spaces / tabs and newline to list

Curious if you might offer advice on the following.
I have data in a text file in this form:
"var1"
" var1a"
" var1a_descrp1"
" thing"
" var1b"
" var1b_descrp2"
" thing"
" var1b_descrp3"
" thing1"
" thing2"
" var1b_descrp4"
"poobarvar"
" var2a"
" var2a_descrp1"
" var2b"
" var2b_descrp1"
" thing"
" var2b_descrp1"
" thing1"
" thing2"
" thing3"
White spaces go a max depth of 12 spaces, or "three levels" deep.
And I'd love to cleanly parse this into a list structure of something like the following structure:
$var1
$var1$var1a
$var1$var1a$var1a_descrp1
$var1$var1a$var1a_descrp1[[1]]
[1] "thing"
$var1$var2a
$var1$var2a$var2a_descrp2
$var1$var2a$var2a_descrp2[[1]]
[1] "thing"
$var1$var2a$var2a_descrp3
$var1$var2a$var2a_descrp3[[1]]
[1] "thing1"
$var1$var2a$var2a_descrp3[[2]]
[1] "thing2"
$poobarvar
$poobarvar$var2a
list()
$poobarvar$var2b
$poobarvar$var2b$var2b_descrp1
$poobarvar$var2b$var2b_descrp1[[1]]
[1] "thing1"
$poobarvar$var2b$var2b_descrp1[[2]]
[1] "thing2"
$poobarvar$var2b$var2b_descrp1[[3]]
[1] "thing3"
I have a pretty convoluted set of while loops and if-else statements I'd love to clean up.

Replacing + by 5 in R

I have a dataset called Price which is supposed to be numeric but is generated as a string because all 5 is replaced by +.
It looks like this:
"99000" "98300" "98300" "98290" "98310" " 9831+ " "98310" " 9830+ " " 9830+ " " 9830+ " " 9829+ " " 9828+ " " 9827+ " "98270"
I used the gsub function in R to try and replace + by 5. The code I wrote is:
finalPrice<-gsub("+",5,Price)
However, the output is just a bunch of numbers which doesn't make sense for what I intended:
"59595050505,5 59585350505,5 59585350505,5 59585259505,5 59585351505,5 5 5 595853515+5 5,5 59585351505,5 5 5 595853505+5 5,5 5 5 595853505+5
How can I fix this?

The + sign should be escaped. Try this:
finalPrice<-gsub("\\+",5, Price)

Besides using double-escapes to force a literal-x to be matched by the pattern argument, you can also use either the fixed=TRUE parameter or use a character-class defined by the "[.]"-operation. See the ?regex page for more details:
> gsub("+", "5", txt, fixed=TRUE)
[1] "99000" "98300" "98300" "98290" "98310"
[6] " 98315 " "98310" " 98305 " " 98305 " " 98305 "
[11] " 98295 " " 98285 " " 98275 " "98270"
> gsub("[+]", "5", txt)
[1] "99000" "98300" "98300" "98290" "98310"
[6] " 98315 " "98310" " 98305 " " 98305 " " 98305 "
[11] " 98295 " " 98285 " " 98275 " "98270"

When writing regex, + means match the preceeding group one or more times. As the preceeding character is in your regex before the + is empty, gsub matches every empty string in the target.
The result is that 5 is inserted into each of these positions.
To avoid this, escape the +, which needs to be done with double backslash in R:
finalPrice<-gsub("\\+",5,Price)

Non alphanumeric characters in R

For uppercase, lowercase letters and 10-digits I can generate a vector that contains all letters or 10-digit number as follow:
A <- LETTERS[0:26]
B <- letters[0:26]
C <- seq(0,9)
I wonder whether there is a similar function for non-alphanumeric characters.
~!##$%^&*_-+=`|\(){}[]:;"'<>,.?/
I tried
D <- c("~","!","#","#","$","%","^", "&","*","_","-","+","=","`","|","\","(",")","{","}","[","]",":",";",""","'","<",">",",",".","?","/")
Thanks

This is another option. Generate all ascii characters, then filter out the non punctuation with regular expressions.
ascii <- rawToChar(as.raw(0:127), multiple=TRUE)
ascii[grepl('[[:punct:]]', ascii)]
# [1] "!" "\"" "#" "$" "%" "&" "'" "(" ")" "*" "+" "," "-" "." "/" ":" ";" "<" "=" ">" "?" "#"
# [23] "[" "\\" "]" "^" "_" "`" "{" "|" "}" "~"

This might be useful . . The ASCII character set is arranged in ranges of similar types of characters (letters, etc).
http://datadebrief.blogspot.com/2011/03/ascii-code-table-in-r.html

It's a bit drawn out, and there's probably a better website (and a better way to get the same result), but
library(XML); library(RCurl)
doc <- htmlParse(getURL("https://wci.llnl.gov/codes/basis/manual/node161.html"))
xp <- xpathSApply(doc, "//tr/td", xmlValue, trim = TRUE)
xp[nzchar(xp) & nchar(xp) == 1]
# [1] "!" "[" "%" "," "]" "&" "-" "|" "'" "." "=" "~" "("
# [14] "/" ")" "*" "=" "{" "?" "`" "}" "#" ":" ";" "^" " "
Also, using the website from the other answer yields a more complete result
> URL <- "http://datadebrief.blogspot.com/2011/03/ascii-code-table-in-r.html"
> r <- readLines(URL, warn = FALSE)[780:874]
> s <- sapply(strsplit(r, "\\s+"), "[", 1)
> s[!s %in% c(letters, LETTERS, 0:9)]
# [1] "" "!" "\"" "#" "$" "%" "&" "'" "("
# [10] ")" "*" "+" "," "-" "." "/" ":" ";"
# [19] "<" "=" ">" "?" "#" "[" "\\\\" "]" "^"
# [28] "_" "`" "{" "|" "}" "~"
...or yeah, just use rawToChar(as.raw(...)) like MrFlick said :-)

This answer is only for amusement, list the characters you want and use strsplit to generate your vector.
> D <- strsplit('!"#$%&\'()*+,-./\\:;<=>?#[]^_`{|}~', '(?=.)', perl=T)[[1]]
## [1] "!" "\"" "#" "$" "%" "&" "'" "(" ")" "*" "+" "," "-" "." "/"
## [16] "\\" ":" ";" "<" "=" ">" "?" "#" "[" "]" "^" "_" "`" "{" "|"
## [31] "}" "~"
Or filter the characters you want.
> D <- gsub('[^\\pP\\pS]', '', rawToChar(as.raw(1:127), multiple=T), perl=T)
> D[D != ""]
## [1] "!" "\"" "#" "$" "%" "&" "'" "(" ")" "*" "+" "," "-" "." "/"
## [16] ":" ";" "<" "=" ">" "?" "#" "[" "\\" "]" "^" "_" "`" "{" "|"
## [31] "}" "~"

extracting first value from a list

I would like to extract the first value from this list:
[[1]]
[1] " \" 0.0337302" " -0.000248016" " -0.000496032" " -0.000744048"
[5] " -0.000992063" " -0.00124008" " -0.0014881" " -0.00173611"
[9] " -0.00198413" " -0.00223214" " -0.00248016" " -0.00272817"
[13] " -0.00297619" " -0.00322421" " -0.00347222" " -0.00372024"
[17] " -0.00396825" " -0.00421627" " -0.00446429" " -0.0047123"
[21] " -0.00496032" " -0.00520833" " -0.00545635" " -0.00570437"
the name of this test is M, I have tested this M[1] and M[[1]] but I don't get the correct answer.
How can I do that?

You need to subset the list, and then the vector in the list:
M[[1]][1]
In other words, M is a list of 1 element, a character vector of length 24.
You may want to use unlist M to convert it to just a vector.
M <- unlist(M)
Then you can just use M[1].
To remove the \" you can use sub:
sub("\"","",M[1])
[1] " 0.0337302"

The first element in the list you've shown is the entire vector shown by
[1] " \" 0.0337302" " -0.000248016" " -0.000496032" " -0.000744048"
[5] " -0.000992063" " -0.00124008" " -0.0014881" " -0.00173611"
[9] " -0.00198413" " -0.00223214" " -0.00248016" " -0.00272817"
[13] " -0.00297619" " -0.00322421" " -0.00347222" " -0.00372024"
[17] " -0.00396825" " -0.00421627" " -0.00446429" " -0.0047123"
[21] " -0.00496032" " -0.00520833" " -0.00545635" " -0.00570437"
you get that vector by doing M[[1]]
To further get the first element of this vector just recognize that M[[1]] is the vector you want the first element of so use normal subsetting to get that: M[[1]][1]
> M[[1]][1]
[1] " \" 0.0337302"

Develop Reference

r css asp.net wordpress firebase qt symfony nginx http apache-flex

How to print double quotes (") in R - r

Use cat: cat("\" \"xml\" \"") OR cat('" "','xml','" "') Output: " "xml" " Alternative using noqoute: noquote(" \" \"xml\" \" ") Output : " "xml" " Another option using dQoute: dQuote(" xml ") Output : "“ xml ”"

With the help of the print parameter quote: print("\" \"xml\" \"", quote = FALSE) > [1] " "xml" " or cat('"')

Related

Regex for matching between a colon and last newline prior to next colon

R - Converting a text file with white spaces / tabs and newline to list

Replacing + by 5 in R

Non alphanumeric characters in R

extracting first value from a list

Categories

Resources