Plotting a sequence of strings using sprintf in R - r

I want to start using sprintf to plot a series of two strings in R for a title of a figure. Can anyone show me how to do it correctly? The values from HS and score should be plotted as characters behind the terms in quotes.
title = sprintf ("HS %s", as.character(HS), "Score %s", as.character(score))

With sprintf, we can pass multiple arguments, as the usage is
sprintf(fmt, ...)
That means there is a single fmt string followed by any number of inputs, one for each %s placeholder:
sprintf("HS %s Score %s", as.character(HS), as.character(score))
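As a minimal sketch of the single-format-string pattern described above, the snippet below builds a title from two values. The values assigned to HS and score are made up for illustration, and plot_title is a hypothetical variable name.

```r
# Hypothetical values; in the question these come from the user's data
HS <- 12.5
score <- 87

# One format string, then one argument per %s placeholder.
# %s also accepts numbers, which are converted as if by as.character().
plot_title <- sprintf("HS %s Score %s", HS, score)
print(plot_title)
```

The result can then be passed on as a figure title, for example plot(1:10, main = plot_title).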

Related

Add thousand separator to levels in cut function

My x axis labels look like [100000,250000], which makes the number hard to read at first sight. I want them to look like [100.000,250.000]. I know that the cut2 function has a formatfun parameter, but I don't think I know how to use it properly.
Try using the "formatC" function on your cut data. e.g.
formatC(my_cuts, big.mark = ".", decimal.mark = ",")
Let's create an example to work on:
x <- cut(seq(0,1,length.out=8) + 1e6, 3)
This is a factor. Although at bottom it's a numeric array, you don't want to format its values; you want to format its levels, which are the strings associated with its values. This is what the levels look like in the example (calling head to prevent lots of printing in case x has many distinct levels):
(head(levels(x)))
[1] "(1000000,1000000.3]" "(1000000.3,1000000.7]" "(1000000.7,1000001]"
To format the levels, we need to pick them apart into their numeric components (which are separated by a comma ","), format each component, and reassemble the results.
Here's the picking-apart-and-formatting step in one go, using only base R functionality. It calls gsub and strsplit on the first line (for cleaning out the "(" and "]" characters and splitting each pair of numeric strings into two strings) and employs prettyNum on the second line (for the formatting), which conveniently will format any character string that looks like a number:
s <- lapply(strsplit(gsub("]|[(]", "", levels(x)), ","),
prettyNum, big.mark=".", decimal.mark=",", input.d.mark=".", preserve.width="individual")
(You might not need the input.d.mark argument, but I did because my locale uses "." for a decimal point, as you could see above. The docs say "individual" is the default for setting the output width, but that just isn't the case on my system: I had to specify it explicitly.)
The paste* functions will perform the reassembly, whose results we simply re-assign to the levels of x:
levels(x) <- paste0("(", sapply(s, function(a) paste0(a, collapse="; ")), "]")
(Since each number potentially already includes "," and "." delimiters, I have specified a third punctuation mark, ";", to separate the numbers themselves -- but you may use what you wish, of course.)
Let's display the new levels to verify the results:
(head(levels(x)))
[1] "(1.000.000; 1.000.000,3]" "(1.000.000,3; 1.000.000,7]" "(1.000.000,7; 1.000.001]"
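The pick-apart, format and reassemble steps above could be wrapped in a small helper. This is only a sketch under some assumptions: format_cut_levels is a hypothetical name, the levels are assumed to have the usual "(a,b]" shape with whole-number breaks, and formatC(..., format = "d") is swapped in for prettyNum, so fractional breaks would be rounded.

```r
# Hypothetical helper: re-format the numeric parts of cut() levels
format_cut_levels <- function(f, big.mark = ".", decimal.mark = ",") {
  lv <- levels(f)
  left  <- substr(lv, 1, 1)                   # opening bracket, e.g. "("
  right <- substr(lv, nchar(lv), nchar(lv))   # closing bracket, e.g. "]"
  inner <- substr(lv, 2, nchar(lv) - 1)       # "100000,250000"
  parts <- strsplit(inner, ",", fixed = TRUE)
  fmt <- vapply(parts, function(p) {
    # format = "d" assumes integer-valued breaks
    nums <- formatC(as.numeric(p), format = "d",
                    big.mark = big.mark, decimal.mark = decimal.mark)
    paste(nums, collapse = "; ")
  }, character(1))
  levels(f) <- paste0(left, fmt, right)
  f
}

# dig.lab = 10 keeps cut() from abbreviating the breaks in the labels
x2 <- cut(c(120000, 180000, 300000),
          breaks = c(100000, 250000, 500000), dig.lab = 10)
out <- format_cut_levels(x2)
print(levels(out))
```

As in the answer, ";" separates the two numbers so that it cannot be confused with the "." and "," used inside them.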

bquote, parsing, expression to get multiple lines labels in ggplot with greek letters and variables as subscripts

Let's say I have
paste0("Year = ",index,"\nN = ",length((dfGBD %>% filter(year==index))[[vbl]]),
" Bandwidth = ",round(stats::bw.nrd(log((dfGBD %>% filter(year == index))[[vbl]])),2),
"\nSkewness:", round(e1071::skewness(log((dfGBD %>% filter(year==index))[[vbl]])), 2),
" Kurtosis:",round(e1071::kurtosis(log((dfGBD %>% filter(year==index))[[vbl]])),2),
"\nmu[",vbl,"] = ", round(mean((dfGBD %>% filter(year==index))[[vbl]]),2),
" sigma[",vbl,"] = ",round(sd((dfGBD %>% filter(year==index))[[vbl]]),2)
)
inside a sapply through index years. Further, vbl is a string with the name of a variable. The sapply produces a vector of labels for a factor variable.
Applying ggplot I obtain labels similar to the next:
Year = 2000
N = 195 Bandwidth = 0.09
Skewness: 0 Kurtosis: -0.56
mu[Mortality] = 7750.85 sigma[Mortality] = 1803.28
Up to here, all is fine. I wrote mu[vbl] and sigma[vbl] deliberately, with parsing and subscript notation in mind, to get the Greek letters with the variable name stored in vbl as a subscript.
First I tried facet_wrap with the option labeller = "label_parsed". I got an error that I could only fix by wrapping the string in backticks ``, but then \n has no effect. I tried many options using bquote and/or parse and/or expression and/or atop etc. to get the multi-line result described above, but I only get a single line, very ugly output or, mostly, errors, and I have not yet seen the Greek letters.
So what/how should I do?
PS: as answered in other Stack Overflow questions, \n does not work in this context, so a list of bquote calls, one per line, is suggested. I tried it, but then I got an error that I think is due to a mismatch between the number of elements in the lists and the number of labels of a factor (perhaps a label cannot be a list?).
Thank you!

replace parts of a string with a vector

I am having problems with replacing parts of a single string with a set of vector replacements, to result in a vector.
I have a string tex which is intended to tell a diagram what text to put as the node (and other) labels.
So if tex is "!label has frequency of !frequency"
and T has columns label with values c("chickens","ducks",...) and frequency with values c(35,12,...) amongst others,
the function returns a vector like c("Chickens has frequency of 35","Ducks has frequency of 12",...)
More formally, the problem is:
Given a tibble T and a string tex,
return a vector with length nrow(T), of which each element = tex but in which each occurrence within tex of the pattern !pattern is replaced by the vectorised contents of T$pattern
I looked at
Replace string in R with patterns and replacements both vectors
and
String replacement with multiple string but they don't fit my use case.
stringr::str_replace() doesn't do it either.
A possible base R solution using sprintf():
animals = c("chickens","ducks")
frequency = c(35,12)
sprintf( "%s has frequency of %s", animals, frequency)
[1] "chickens has frequency of 35" "ducks has frequency of 12"
Also,
tex = "%s has frequency of %s"
sprintf( tex, animals, frequency )
will give the same result.
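If the template has to keep the !name placeholders from the question rather than %s, one possible base R approach is to loop over the columns and substitute each placeholder row by row. This is a sketch, not a tested general solution: fill_template is a hypothetical name, and it assumes no column name is a prefix of another (otherwise !label would also match inside !labelx).

```r
# Hypothetical helper: replace every "!colname" in tex with the values of df$colname
fill_template <- function(tex, df) {
  out <- rep(tex, nrow(df))            # one copy of the template per row
  for (nm in names(df)) {
    out <- mapply(
      function(s, v) gsub(paste0("!", nm), v, s, fixed = TRUE),
      out, as.character(df[[nm]]),
      USE.NAMES = FALSE
    )
  }
  out
}

df <- data.frame(label = c("chickens", "ducks"), frequency = c(35, 12))
res <- fill_template("!label has frequency of !frequency", df)
print(res)
```

Because a tibble is also a data frame, the same function should accept the tibble T from the question.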

Use scientific notation with xtable in R

I pass a data.frame to xtable
dat.table <- xtable(dat[1:20,] ,digits=10)
Instead of displaying digits like that, I would prefer to use scientific notation. How would I do that?
I had a look, but all I found was "R: formatting the digits in xtable", which doesn't seem to be the answer.
Try:
dat.table <- xtable(dat[1:20,] ,digits=-10)
From the xtable documentation: "If values of digits are negative, the corresponding values of x are displayed in scientific format with abs(digits) digits."
If you want x10^ notation, try combining print and xtable, something like:
print(xtable(dat[1:10,1:7], display=c("s","s", "s","s","g","g","g","g")), math.style.exponents = TRUE)
where s means string and g means scientific notation (used only when it saves space); the math.style.exponents argument of print converts the output to x10^ format.

R: How do I write "≥2: n=nrow(x)" in plot legend?

I am drawing boxplots and have a problem with the legend. Specifically, I want to write "≥2: n=formatC(nrow(x))" but cannot combine the ≥ symbol, the call that calculates nrow(x), and formatC(nrow(x), big.mark=","), which should give the row count with a thousands separator.
What I tried so far:
smoke <- matrix(c(1:1200),ncol=1,byrow=TRUE)
colnames(smoke) <- c("High")
smoke <- as.table(smoke)
pdf('test.pdf')
plot(NA,xlim=c(0,100),ylim=c(0,100))
legend(10,70,bquote(paste(NA>=2, ": n=", .(formatC(nrow(smoke)), big.mark=","))))
dev.off()
which gives: ≥ 2: n=1200
I would like to have: ≥2: n=1,200
It seems that formatC does not work under bquote and I would also like to remove the space after the ≥ symbol.
I also tried:
legend(x,y, legend=c(expression(NA>=2), paste(": n=", formatC(nrow(smoke)), sep="")))
which gives the legend in two lines:
≥ 2
: n=1200
Putting paste before expression gives one line but does not convert the >= to ≥.
I am exporting the graph as pdf, which currently works for the ≥ symbol. I would prefer to keep that. Unicode does not work with pdf in my hands.
Thanks in advance,
Philipp
You have a ) in the wrong place, right after smoke, so the big.mark argument ends up outside the formatC call (it is passed to .() instead) and has no effect. Try this:
legend(10,70,bquote(paste(NA>=2, ": n=", .(formatC(nrow(smoke), big.mark=",")))))
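One way to make the parenthesis placement harder to get wrong is to compute the formatted count first and splice only the finished string into bquote. The sketch below assumes a plain matrix in place of the question's table; n_txt and lab are hypothetical names.

```r
# Stand-in for the question's data: 1200 rows, one column
smoke <- matrix(1:1200, ncol = 1)

# Format the row count on its own, so big.mark clearly belongs to formatC
n_txt <- formatC(nrow(smoke), big.mark = ",")

# Splice the finished string into the plotmath expression
lab <- bquote(paste(NA >= 2, ": n=", .(n_txt)))
print(n_txt)
print(lab)
```

The label can then be drawn as before with legend(10, 70, legend = lab) on an open plot.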
