Alternatives to paste( ) to concatenate formatted text expressions on a graph? - r

I'm using pretty10exp() from the sfsmisc package to make the scientific notation look better. For example:
library(sfsmisc)
a <- 0.000392884
The output of pretty10exp() looks like this:
> pretty10exp(a, digits.fuzz=3) #round to display a certain number of digits
expression(3.93 %*% 10^-4)
I can use this to display the pretty version of a on a graph's title or axes as described in this post:
Force R to write scientific notations as n.nn x 10^-n with superscript
However, things get ugly again when I try to combine it with paste() to write a sequence of character strings like this:
# some data
x <- seq(1, 100000, len = 10)
y <- seq(1e-5, 1e-4, len = 10)
# default plot
plot(x, y)
legend("topleft", bty="n",legend=paste("p =", pretty10exp(a, digits.fuzz=3)))
Which gives me the following graph, so I suppose paste() is not able to handle formatted expressions of the kind that can be found in the output of pretty10exp():
Is there an alternative to paste() that I could use to combine the expressions "p =" and the scientific notation of pretty10exp()?

One solution is just to copy what pretty10exp() does, which for a single numeric, a, and the options you set/defaults, is essentially:
a <- 0.00039288
digits.fuzz <- 3
eT <- floor(log10(abs(a)) + 10^-digits.fuzz)
mT <- signif(a/10^eT, digits.fuzz)
SS <- substitute(p == A %*% 10^E, list(A = mT, E = eT))
plot(1:10)
legend("topleft", bty = "n", legend = SS)
The equivalent using bquote() would be
SS <- bquote(p == .(mT) %*% 10^.(eT))
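As for why paste() fails here: it coerces the expression to a plain character string, so the plotmath structure is lost before legend() ever sees it. Below is a minimal sketch that wraps the steps above in a reusable function using only base R. Note that sci_label() is a hypothetical helper name (not part of sfsmisc), and it leaves out the digits.fuzz adjustment that pretty10exp() applies to the exponent.

```r
# Hypothetical helper, not part of sfsmisc: builds the plotmath call
#   <label> == <mantissa> %*% 10^<exponent>
sci_label <- function(value, digits = 3, label = "p") {
  exponent <- floor(log10(abs(value)))             # power of ten
  mantissa <- signif(value / 10^exponent, digits)  # leading digits
  bquote(.(as.name(label)) == .(mantissa) %*% 10^.(exponent))
}

lab <- sci_label(0.000392884)

plot(1:10)
# legend() accepts the unevaluated call and renders it as plotmath
legend("topleft", bty = "n", legend = lab)
```

The same object should also work for title(main = lab) or as an axis label, since those arguments accept plotmath as well.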

Definitely not a precise answer, but I just play around with things for a while. bquote is nice once you get a feel for it.
> call("rep", 10, 7)
## rep(10, 7)
> bquote(.(call("rep", 10, 7)) * q^5)
## rep(10, 7) * q^5
> sprintf("paste('%smm'^'%s')", 5, 5)
## [1] "paste('5mm'^'5')"
And my personal favorite, which will of course return TRUE...
just as soon as someone writes is.awesome.
> bquote( f <- function(x) { is.awesome(R) })
## f <- function(x) {
## is.awesome(R)
## }

Related

Ryacas substitute but don't simplify or evaluate

I am using the Ryacas package in R and am trying to substitute variables for values but not simplify or solve the equation to show the working out. I have searched Stackoverflow, read the Ryacas documentation and have attempted to find this in the Yacas manual but have so far come up short. I think I am looking to turn simplification off to get the substituted equation and then on again to provide the final result.
Here is an example which provides only the result rather than the working:
library(Ryacas)
# a <- 2
# b <- 3
# c <- 4
eq <- ysym(('(a * b) / c'))
solution <- with_value(with_value(with_value(eq, 'a', 2), 'b', 3), 'c', 4)
tex(solution)
# "\\frac{3}{2}"
What I am trying to get as an output is:
# working out
# "\\frac{2 \\times 3}{4}"
as well as the actual solution:
# solution
# "\\frac{3}{2}"
Does anyone know whether there is a solution to this problem, such as passing a command to yacas through yac_str or similar? I have tried translating to LaTeX with tex() and substituting afterwards, but the multiplication operators are removed, so I would need to find and reinsert them, which becomes nasty when dealing with symbolics:
tex(eq)
# "\\frac{a b}{c}"
# substituting string values using stringi requires additional
# work to deal with the missing `*` between `a` and `b`
# "\\frac{2 3}{4}"
Whilst this can be done for simple expressions, there are numerous exceptions such as the variables a and c being present in \\frac etc.
I have also tried the TeXForm command and substitute in various guises but am still not able to capture the unsimplified and unevaluated equation:
y_fn(substitute(with_value(eq, 'a', 2)), "TeXForm")
# "\\frac{2 b}{c}"
eval(substitute(with_value(eval(substitute(with_value(eq, 'a', 2))), 'b', 3)))
# y: 6/c
Any help appreciated.
You might need to customize this some more but this general approach can work if you are willing to do that.
We translate words to their values using gsubfn and handle the \times part at the end with gsub. Note that frac is not modified because we are matching whole words, and frac is not one of the names in the list given as the second argument to gsubfn.
library(magrittr)
library(gsubfn)
library(Ryacas)
eq <- ysym(('(a * b) / c'))
eq %>%
tex %>%
gsubfn("(\\w+)", list(a = 2, b = 3, c = 4), .) %>%
gsub("(\\d) (\\d)", "\\1 \\\\times \\2", .)
## [1] "\\frac{2 \\times 3}{4}"
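If you would rather not depend on gsubfn, roughly the same idea can be sketched in base R. This is only an illustration under some assumptions: it starts from the string that tex(eq) returned in the question instead of calling Ryacas, and it substitutes one variable at a time, so it would misbehave if a substituted value itself contained another variable's name.

```r
# TeX string as returned by tex(eq) in the question
tex_string <- "\\frac{a b}{c}"
values <- c(a = 2, b = 3, c = 4)

out <- tex_string
for (nm in names(values)) {
  # \b anchors the match to whole words, so the letters inside "frac" are untouched
  out <- gsub(paste0("\\b", nm, "\\b"), as.character(values[[nm]]), out, perl = TRUE)
}

# reinsert the multiplication sign between adjacent numbers
out <- gsub("(\\d) (\\d)", "\\1 \\\\times \\2", out)
out
```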
Added
I took the first three code examples returned by searching Stack Overflow for [r] Ryacas, and the approach worked on all of them. Note that these use the original version of Ryacas, which is now called Ryacas0, so I used that.
library(Ryacas0)
library(gsubfn)
library(magrittr)
tex_sub <- function(.x, ...) {
.x %>%
gsubfn("(\\w+)", list(...), .) %>%
gsub("(\\d) (\\d)", "\\1 \\\\times \\2", .)
}
# https://stackoverflow.com/questions/21858668/symbolic-matrix-multiplication-by-ryacas
x <- Sym("x")
mat1 <- List(
List(x, 2),
List(x^3, x))
mat2 <- List(
List(x, x),
List(3, 6 * x))
tt <- TeXForm(mat1 * mat2)
tex_sub(tt, x = 1)
###
# https://stackoverflow.com/questions/22739173/matrix-transpose-in-ryacas
u=Sym("u")
v=Sym("v")
w=Sym("w")
DG=List(List(w-v), List(u-w), List(v-u))
tt2 <- TeXForm(DG)
tex_sub(tt2, u = 2, v = 3, w = 4)
###
# https://stackoverflow.com/questions/49572184/how-to-derivate-using-ryacas
x <- Sym("x")
P <- Sym(1)
for (k in 1:3) {
P <- Simplify((1+k*x)*P + x*(1-x)*deriv(P, x))
print(P)
}
tt3 <- TeXForm(P)
tex_sub(tt3, x = 10)

Assign single output for multiple-output function to new function

I have a function that gives me a single output which is however composed of two elements. Example for it would be:
example <- function(x){
sin <- sin(x)
cos <- cos(x)
output <- cbind(sin, cos)
return(output)
}
Now my idea is to plot separately sin and cos, each as functions of x. I would like to avoid writing a separate function in this context since the two objects are better to be calculated all at once.
If I try :
x_grid = seq(0,1,0,0.05)
plot(x_grid, sapply(x_grid, FUN = example[1]))
I get the following error message :
Error in example[1] : object of type 'closure' is not subsettable
How to proceed then? (notice that I use sapply because I need my function to deal with more than a single value of x in my real case).
If you're looking for a non-base graphics solution:
library(ggplot2)
example3 <- function(x){
data.frame(
x = x,
sin = sin(x),
cos = cos(x)
)
}
x_grid=seq(0,1,0.05)
ggplot(data = example3(x_grid),
aes(x=x)) +
geom_line(aes(y = sin), color = "blue") +
geom_line(aes(y = cos), color = "red")
With the output:
Your function is vectorized so you can input a vector and extract each column by example(x_grid)[, "sin"] or example(x_grid)[, "cos"].
example(x_grid)
# sin cos
# [1,] 0.000000000 1.000000000
# [2,] 0.049979169 0.998750260
# [3,] 0.099833417 0.995004165
example(x_grid)[, "sin"]
# [1] 0.000000000 0.049979169 0.099833417 0.149438132 0.198669331
# [6] 0.247403959 0.295520207 0.342897807 0.389418342 0.434965534
Note: In this case, sapply is not recommended because the function itself has been vectorized. sapply will make it inefficient. Here is an illustration by benchmark:
library(microbenchmark)
bm <- microbenchmark(
basic = example(x_grid)[, 1],
sapply = sapply(x_grid, function(x) example(x)[1]),
times = 1000L
)
ggplot2::autoplot(bm)
If you want to plot both functions, matplot() can plot each column of a matrix.
x_grid <- seq(0, 10, 0.05)
matplot(x_grid, example(x_grid), type = "l")
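The error in the question arises because example[1] tries to subset the function object itself, not its result. A minimal sketch of the fix: call the function first and then pick the column. The wrapper name sin_only is made up for illustration.

```r
# Same function as in the question, written compactly
example <- function(x) cbind(sin = sin(x), cos = cos(x))

x_grid <- seq(0, 1, 0.05)

# Call the function first, then subset its result by column name
vals <- example(x_grid)

# Hypothetical wrapper, in case a single-output function is really needed
sin_only <- function(x) example(x)[, "sin"]

plot(x_grid, sin_only(x_grid), type = "l", ylim = range(vals), ylab = "value")
lines(x_grid, vals[, "cos"], lty = 2)
```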
Appears to be an extra parameter to seq
x_grid <- seq(0, 1, 0.05)
Slight modification to pass variable to function and then subset
plot(x_grid, sapply(x_grid, function(x) example(x)[1]))
Another approach is a function that returns a list, whose elements can then be extracted by name
example2 <- function(x) {
within(list(), {
sin <- sin(x)
cos <- cos(x)
})
}
plot(x_grid, sapply(x_grid, function(x) example2(x)$sin))
Unless the example is simplified, the following works without sapply
plot(x_grid, example2(x_grid)$sin)
Plotting both results
lapply(example2(x_grid), function(y) plot(x_grid, y))

R: Change Vector Output to Several Ranges

I am using Jenks Natural Breaks via the BAMMtools package to segment my data in RStudio Version 1.0.153. The output is a vector that shows where the natural breaks occur in my data set, as such:
[1] 14999 41689 58415 79454 110184 200746
I would like to take the output above and create the ranges inferred by the breaks. Ex: 14999-41689, 41690-58415, 58416-79454, 79455-110184, 110185-200746
Are there any functions that I can use in R Studio to accomplish this? Thank you in advance!
Input data
x <- c(14999, 41689, 58415, 79454, 110184, 200746)
If you want the ranges as characters you can do
y <- x; y[1] <- y[1] - 1 # First range given in question doesn't follow the pattern. Adjusting for that
paste(head(y, -1) + 1, tail(y, -1), sep = '-')
#[1] "14999-41689" "41690-58415" "58416-79454" "79455-110184" "110185-200746"
If you want a list of the actual sets of numbers in each range you can do
seqs <- Map(seq, head(y, -1) + 1, tail(y, -1))
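If you also want to assign observations to those ranges, one possible sketch is to keep the bounds in a small data frame and reuse the labels in cut(). The two test values below are made up, and include.lowest = TRUE is what keeps 14999 itself inside the first range.

```r
breaks <- c(14999, 41689, 58415, 79454, 110184, 200746)

# The first range starts at the first break; later ranges start one past the previous break
lower <- c(breaks[1], head(breaks[-1], -1) + 1)
upper <- breaks[-1]

ranges <- data.frame(lower, upper, label = paste(lower, upper, sep = "-"))
ranges

# Classify some made-up values with the same labels
binned <- cut(c(15000, 60000), breaks = breaks,
              labels = ranges$label, include.lowest = TRUE)
binned
```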
You can definitely create your own function that produces the exact output you're looking for, but you can use the cut function that will give you something like this:
# example vector
x = c(14999, 41689, 58415, 79454, 110184, 200746)
# use the vector and its values as breaks
ranges = cut(x, x, dig.lab = 6)
# see the levels
levels(ranges)
#[1] "(14999,41689]" "(41689,58415]" "(58415,79454]" "(79454,110184]" "(110184,200746]"

Avoiding a loop when populating data frames in R

I have an empty data frame T_modelled with 2784 columns and 150 rows.
T_modelled <- data.frame(matrix(ncol = 2784, nrow = 150))
names(T_modelled) <- paste0("t=", t_sec_ERT)
rownames(T_modelled) <- paste0("z=", seq(from = 0.1, to = 15, by = 0.1))
where
t_sec_ERT <- seq(from = -23349600, to = 6706800, by = 10800)
z <- seq(from = 0.1, to = 15, by = 0.1)
I filled T_modelled by column with a nested for loop, based on a formula:
for (i in 1:ncol(T_modelled)) {
col_tmp <- colnames(T_modelled)[i]
for (j in 1:nrow(T_modelled)) {
z_tmp <- z[j]-0.1
T_tmp <- MANSRT+As*e^(-z_tmp*(omega/(2*K))^0.5)*sin(omega*t_sec_ERT[i]-((omega/(2*K))^0.5)*z_tmp)
T_modelled[j ,col_tmp] <- T_tmp
}
}
where
MANSRT <- -2.051185
As <- 11.59375
omega <- (2*pi)/(347.875*24*60*60)
c <- 790
k <- 0.00219
pb <- 2600
K <- (k*1000)/(c*pb)
e <- exp(1)
I do get the desired results but I keep thinking there must be a more efficient way of filling that data frame. The loop is quite slow and looks cumbersome to me. I guess there is an opportunity to take advantage of R's vectorized way of calculating. I just cannot see myself how to incorporate the formula in an easier way to fill T_modelled.
Anyone got any ideas how to get the same result in a faster, more "R-like" manner?
I believe this does it.
Run this first instruction right after creating T_modelled, it will be needed to test that the results are equal.
Tm <- T_modelled
Now run your code then run the code below.
z_tmp <- z - 0.1
for (i in 1:ncol(Tm)) {
T_tmp <- MANSRT + As*exp(-z_tmp*(omega/(2*K))^0.5)*sin(omega*t_sec_ERT[i]-((omega/(2*K))^0.5)*z_tmp)
Tm[ , i] <- T_tmp
}
all.equal(T_modelled, Tm)
#[1] TRUE
You don't need the inner loop, that's the only difference.
(I also used exp directly but that is of secondary importance.)
Much like your previous question's solution which you accepted, consider simply using sapply, iterating through the vector, t_sec_ERT, which is the same length as your desired dataframe's number of columns. But first adjust every element of z by 0.1. Plus, there's no need to create empty dataframe beforehand.
z_adj <- z - 0.1
T_modelled2 <- data.frame(sapply(t_sec_ERT, function(ert)
MANSRT+As*e^(-z_adj*(omega/(2*K))^0.5)*sin(omega*ert-((omega/(2*K))^0.5)*z_adj)))
colnames(T_modelled2) <- paste0("t=", t_sec_ERT)
rownames(T_modelled2) <- paste0("z=", z)
all.equal(T_modelled, T_modelled2)
# [1] TRUE
Rui is of course correct, I just want to suggest a way of reasoning when writing a loop like this.
You have two numeric vectors. Functions for numerics in R are usually vectorized. By which I mean you can do stuff like this
x <- c(1, 6, 3)
sum(x)
not needing something like this
x_ <- 0
for (i in x) {
x_ <- i + x_
}
x_
That is, there is no need for an explicit loop in R. Of course looping takes place nonetheless; it just happens in the underlying C, Fortran etc. code, where it can be done more efficiently. This is usually what we mean when we call a function vectorized: looping takes place "under the hood", as it were. The output of Vectorize() thus isn't strictly vectorized by this definition.
When you have two numeric vectors you want to loop over you have to first see if the constituent functions are vectorized, usually by reading the docs.
If they are, you continue by constructing that central vectorized compound function and start testing it with one vector and one scalar. In your case it would be something like this (testing with just the first element of t_sec_ERT).
z_tmp <- z - 0.1
i <- 1
T_tmp <- MANSRT + As *
exp(-z_tmp*(omega/(2*K))^0.5) *
sin(omega*t_sec_ERT[i] - ((omega/(2*K))^0.5)*z_tmp)
Looks OK. Then you start looping over the elements of t_sec_ERT.
T_tmp <- matrix(nrow=length(z), ncol=length(t_sec_ERT))
for (i in 1:length(t_sec_ERT)) {
T_tmp[, i] <- MANSRT + As *
exp(-z_tmp*(omega/(2*K))^0.5) *
sin(omega*t_sec_ERT[i] - ((omega/(2*K))^0.5)*z_tmp)
}
Or you can do it with sapply() which is often neater.
f <- function(x) {
MANSRT + As *
exp(-z_tmp*(omega/(2*K))^0.5) *
sin(omega*x - ((omega/(2*K))^0.5)*z_tmp)
}
T_tmp <- sapply(t_sec_ERT, f)
I would prefer to put the data in a long format, with all combinations of z and t_sec_ERT as two columns, in order to take advantage of vectorization. Although I usually prefer tidyr for switching between long and wide formats, I've tried to keep this as a base solution:
t_sec_ERT <- seq(from = -23349600, to = 6706800, by = 10800)
z <- seq(from = 0.1, to = 15, by = 0.1)
v <- expand.grid(t_sec_ERT, z)
names(v) <- c("t_sec_ERT", "z")
v$z_tmp <- v$z-0.1
v$T_tmp <- MANSRT+As*e^(-v$z_tmp*(omega/(2*K))^0.5)*sin(omega*v$t_sec_ERT-((omega/(2*K))^0.5)*v$z_tmp)
T_modelled <- data.frame(matrix(v$T_tmp, nrow = length(z), ncol = length(t_sec_ERT), byrow = TRUE))
names(T_modelled) <- paste0("t=", t_sec_ERT)
rownames(T_modelled) <- paste0("z=", seq(from = 0.1, to = 15, by = 0.1))
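One more loop-free option, not used in the answers above, is outer(), which evaluates a vectorized function over every combination of two vectors and returns the matrix directly. A minimal sketch under the constants from the question; the names temp_at and damping are made up for illustration.

```r
# Constants from the question
MANSRT <- -2.051185
As <- 11.59375
omega <- (2 * pi) / (347.875 * 24 * 60 * 60)
K <- (0.00219 * 1000) / (790 * 2600)

t_sec_ERT <- seq(from = -23349600, to = 6706800, by = 10800)
z <- seq(from = 0.1, to = 15, by = 0.1)

damping <- sqrt(omega / (2 * K))

# Vectorized in both arguments, which is what outer() requires
temp_at <- function(depth, time) {
  MANSRT + As * exp(-depth * damping) * sin(omega * time - damping * depth)
}

# Rows follow depth, columns follow time
T_mat <- outer(z - 0.1, t_sec_ERT, temp_at)
dimnames(T_mat) <- list(paste0("z=", z), paste0("t=", t_sec_ERT))
dim(T_mat)
```

The result is a matrix, not a data frame; as.data.frame(T_mat) converts it if the data frame form is needed downstream.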

Wavemulcor package - wave.multiple.cross.correlation function - replacement has length zero

I would like to use the wavemulcor package and in particular the wave.multiple.cross.correlation function to perform a wavelet multiple cross correlation on my data.
I am following the example as per the package manual but using my data instead. The function works with the example data but throws an error when I try it with mine. The error refers to "replacement has length zero" but I am unsure what this exactly means.
I've googled the error but there are many examples of the same issue for different functions and generally they all have something to do with loops in code.
I then googled how to troubleshoot the problem and read about debugging. I tried debugging the code but I can't figure out where it's breaking down, I am still at the early stages of learning to code. I think it might be this section of code in the wave.multiple.cross.correlation function that is causing the problem:
xy.cor.vec <- matrix(unlist(xy.cor), l, dd)
xy.mulcor <- matrix(0, l, 2 * lm + 1)
YmaxR <- vector("numeric", l)
for (i in 1:l) {
r <- xy.cor.vec[i, ]
P <- diag(d)/2
P[lower.tri(P)] <- r
P <- P + t(P)
Pidiag <- diag(solve(P))
if (is.null(ymaxr)) {
YmaxR[i] <- Pimax <- which.max(Pidiag)
}
Are there any other ways I can determine why this is not working?
The actual error is as follows:
> Lst <- wave.multiple.cross.correlation(xx, lag.max = NULL, ymaxr = NULL)
Error in YmaxR[i] <- Pimax <- which.max(Pidiag) :
replacement has length zero
I have tried to trace all the variables in the original code but I just can't figure it out.
This is the code I am trying to use and for completeness sake here is a link to a dput() of xx which is a list of the variables I wish to use, see the code below for details of xx
library(wavemulcor)
library(readxl)
rm(list = ls()) # clear objects
graphics.off() # close graphics windows
RNC20_30Hourly <- read_excel("RNC20_30Hourly.xlsx")
RNC20_30Hourly <- RNC20_30Hourly[-1]
RNC20_30TS <- ts(RNC20_30Hourly, start = 1, frequency = 23)
wf <- "d4"
J <- 6
lmax <- 36
n <- nrow(RNC20_30TS)
CK0158U09A3.modwt <- modwt(RNC20_30TS[,"CK0158U09A3"], wf, J)
CK0158U09A3.modwt.bw <- brick.wall(CK0158U09A3.modwt, wf)
CK0158U21A1.modwt <- modwt(RNC20_30TS[,"CK0158U21A1"], wf, J)
CK0158U21A1.modwt.bw <- brick.wall(CK0158U21A1.modwt, wf)
CK0158U21A2.modwt <- modwt(RNC20_30TS[,"CK0158U21A2"], wf, J)
CK0158U21A2.modwt.bw <- brick.wall(CK0158U21A2.modwt, wf)
CK0158U21A3.modwt <- modwt(RNC20_30TS[,"CK0158U21A3"], wf, J)
CK0158U21A3.modwt.bw <- brick.wall(CK0158U21A3.modwt, wf)
xx <- list(CK0158U09A3.modwt.bw,
CK0158U21A1.modwt.bw,
CK0158U21A2.modwt.bw,
CK0158U21A3.modwt.bw)
Lst <- wave.multiple.cross.correlation(xx, lag.max = 13, ymaxr = NULL)
CK0158.RTWP.cross.cor <- as.matrix(Lst$xy.mulcor[1:J,])
YmaxR <- Lst$YmaxR
cell.names <- c("CK0158U09A3", "CK0158U21A1", "CK0158U21A2", "CK0158U21A3")
rownames(CK0158.RTWP.cross.cor)<-rownames(CK0158.RTWP.cross.cor,
do.NULL = FALSE, prefix = "Level ")
lags <- length(-lmax:lmax)
lower.ci <- tanh(atanh(CK0158.RTWP.cross.cor) - qnorm(0.975) /
sqrt(matrix(trunc(n/2^(1:J)), nrow=J, ncol=lags)- 3))
upper.ci <- tanh(atanh(CK0158.RTWP.cross.cor) + qnorm(0.975) /
sqrt(matrix(trunc(n/2^(1:J)), nrow=J, ncol=lags)- 3))
par(mfrow=c(3,2), las=1, pty="m", mar=c(2,3,1,0)+.1, oma=c(1.2,1.2,0,0))
for(i in J:1) {
matplot((1:(2*lmax+1)),CK0158.RTWP.cross.cor[i,], type="l", lty=1, ylim=c(-1,1),
xaxt="n", xlab="", ylab="", main=rownames(CK0158.RTWP.cross.cor)[[i]][1])
if(i<3) {axis(side=1, at=seq(1, 2*lmax+1, by=12),
labels=seq(-lmax, lmax, by=12))}
#axis(side=2, at=c(-.2, 0, .5, 1))
lines(lower.ci[i,], lty=1, col=2) ##Add Connected Line Segments to a Plot
lines(upper.ci[i,], lty=1, col=2)
abline(h=0,v=lmax+1) ##Add Straight horiz and vert Lines to a Plot
text(1,1, labels=cell.names[YmaxR[i]], adj=0.25, cex=.8)
}
par(las=0)
mtext('Lag (hours)', side=1, outer=TRUE, adj=0.5)
mtext('Wavelet Multiple Cross-Correlation', side=2, outer=TRUE, adj=0.5)
Any help to resolve/troubleshoot this issue would be greatly appreciated.
Having reviewed the code with a fine-tooth comb, I have spotted the error in my code. According to the manual the usage is as follows:
wave.multiple.cross.correlation(xx, lag.max = NULL, ymaxr = NULL)
yet in the example provided, the author uses a different variable called lmax which specifies the lag to be used.
Lst <- wave.multiple.cross.correlation(xx, lmax)
As you can see from my example above I specified two different arguments for the lag, lmax = 36 and lag.max = 13
Ah well it's only cost me 100 reputations!
It seems that, given the length of your time series, you set the max wavelet level J too high (line 14 of your code):
14 J <- 6
Try J <- 4 instead and run
Lst <- wave.multiple.cross.correlation(xx, lag.max = lmax, ymaxr = NULL)
where lmax is the max lag you previously set to 36 (or any other sensible value of your choice).
In general, the recommended value for the max wavelet level is J = trunc(log2(T))-3 (to be on the safe side), with T = length of time series.
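As a quick illustration of that rule of thumb, here is a small sketch. The function name max_wavelet_level and the series length of 200 are made up, since the actual length of the series in the question is not shown.

```r
# Rule of thumb from the text: J = trunc(log2(T)) - 3, with T the series length
max_wavelet_level <- function(n_obs) trunc(log2(n_obs)) - 3

n_obs <- 200                     # hypothetical series length
J <- max_wavelet_level(n_obs)
J
```

With the real data, n_obs would be nrow(RNC20_30TS).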
In fact, the reported error was a consequence of your original xx list, where levels d6 and s6 are populated with NaNs.
Note that this is not part of the wavemulcor package but of the earlier wavelet transform calculations, done before calling the wavemulcor function of your choice.
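To see how NaN-filled levels can surface as this particular error message, here is a minimal base R reproduction that does not use wavemulcor at all. The vector all_nan is made-up data standing in for a Pidiag that came out entirely NaN.

```r
# Stand-in for a Pidiag vector in which every entry is NaN
all_nan <- c(NaN, NaN, NaN)

# which.max() drops NaN/NA values, so nothing is left to pick from
idx <- which.max(all_nan)
length(idx)   # 0, i.e. integer(0)

# Assigning a zero-length value to a single element raises the error
YmaxR <- numeric(1)
msg <- tryCatch({
  YmaxR[1] <- idx
  "no error"
}, error = function(e) conditionMessage(e))
msg
```

So a quick check for NaN in each element of xx before calling the function is a reasonable first debugging step.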
Hope this helps.
