Reinitializing variables in R and having them update globally

I'm not sure how to pose this question with the right lingo, and the related questions weren't about the same thing. I wanted to plot a function and noticed that R wasn't updating the plot when I changed a coefficient.
a <- 2
x <- seq(-1, 1, by=0.1)
y <- 1/(1+exp(-a*x))
plot(x,y)
a <- 4
plot(x,y) # no change
y <- 1/(1+exp(-a*x)) # redefine function
plot(x,y) # now it updates
Just in case I didn't know what I was doing, I followed the syntax on this R basic plotting tutorial. The only difference was the use of = instead of <- for assignment of y = 1/(1+exp(-a*x)). The result was the same.
I've actually never just plotted a function with R, so this was the first time I experienced this. It makes me wonder if I've seen bad results in other areas if re-defined variables aren't propagated to functions or objects initialized with the initial value.
1) Am I doing something wrong and there is a way to have variables sort of dynamically assigned so that functions take into account the current value vs. the value it had when they were created?
2) If not, is there a common way R programmers work around this when tweaking variable assignments and making sure everything else is properly updated?

You are not, in fact, plotting a function. Instead, you are plotting two vectors. Since you haven't updated the values of the vector before calling the next plot, you get two identical plots.
To plot a function directly, you need to use the curve() function:
f <- function(x, a)1/(1+exp(-a*x))
Plot:
curve(f(x, 1), -1, 1, 100)
curve(f(x, 4), -1, 1, 100)

R is not Excel, or MathCAD, or any other application that might lead you to believe that changing an object's value will update other vectors that used that value at some time in the past. When you did this
a <- 4
plot(x,y) # no change
There was no change in 'x' or 'y'.
Try this:
curve( 1/(1+exp(-a*x)) )
a <- 10
curve( 1/(1+exp(-a*x)) )
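To make the point above concrete, here is a minimal sketch of the difference between a stored vector and a function that is re-evaluated on demand. The name sigmoid is a hypothetical helper introduced only for this illustration; the formula is the one from the question.

```r
# Hypothetical helper: the formula from the question, wrapped in a function
sigmoid <- function(x, a) 1 / (1 + exp(-a * x))

x <- seq(-1, 1, by = 0.1)

a <- 2
y <- sigmoid(x, a)        # y is a plain numeric vector computed with a = 2

a <- 4                    # changing a does not touch the stored vector
y_stale <- y
y_fresh <- sigmoid(x, a)  # recomputing picks up the current value of a

plot(x, y_stale)          # same picture as before the change
plot(x, y_fresh)          # steeper curve, reflecting a = 4
```

Keeping the computation in a function and calling it right before plotting is one common way to avoid stale values after tweaking a parameter.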

Related

uniroot gives multiple answers to equation with 1 unknown

I want to create a column in a data frame in which each row is the solution to an equation with 1 unknown (x). The other variables in the equation are provided in the other columns. In another Stack Overflow question, @flodel provided a solution, which I have tried to adapt. However, the output data frame omits some observations entirely, and others have "duplicates" with two different solutions to the same equation.
Sample of my data frame:
Time    id     V1         V2          V3            V4
199304  79330    259.721    224.5090   0.040140442  0.08100474
201004  77520   5062.200   3245.6921   0.037812662  0.08509553
196804  23018    202.897    842.6852   0.154956206  0.12982818
197804  12319    181.430    341.4415   0.052389156  0.14196588
199404  18542  14807.000  16537.0873  -0.001394388  0.08758791
Code with the equation I want to solve. I have simplified the equation, but the issue relates to this simple equation too.
library(plyr)
library(rootSolve)
set.seed(1)
df <- adply(df, 1, summarize,
x = uniroot.all(function(x) V1 * ((V4-V3)/(x-V3)) - V2,
interval = c(-10,10)))
How can I achieve this? If possible, it would be great to do this in an efficient manner, as my actual data frame has >1,000,000 rows.
The previous answer by @StefanoBarbi was pointing in the right direction.
Here are the plots of the functions implied by each row of your example data frame, with the solution superimposed as a red vertical line (so that we can see that yes, you're right that there is a root in the interval ...) [code below]
The problem is that the algorithm underlying uniroot() is only guaranteed to find the root of a function that is continuous on the interval. Your functions have discontinuities/singularities. (Even for a continuous function I'm sure that the algorithm could be broken with a function that was sufficiently weird to cause problems with floating-point math ...)
Even a bisection algorithm, which is more robust than Brent's method (the algorithm underlying uniroot) since it makes fewer assumptions about continuity of the derivative, could easily fail on this kind of discontinuous function. (It could be made to work for a function that is discontinuous but monotonic, but your example is neither continuous nor monotonic ...)
Obviously your real problem is more complex than this (or you would just be using the easy analytical solution you referred to); what this means is that you need to find some way to "tame" your function. In this example, if you rearrange the function to avoid dividing by x-V3 (but without completely solving the equation) then uniroot() should work ...
f1 <- function(L) with(L, (V1/V2)*(V4-V3) + V3)
f1(df[1,])
png("badfit.png")
par(mfrow = c(2,3), bty = "l", las = 1)
for (i in 1:nrow(df)) {
with(df[i,],
curve(V1 * ((V4-V3)/(x-V3)) - V2,
from = -10, to = 10,
ylab = "", xlab = ""))
abline(v=f1(df[i,]), col = 2)
abline(h=0, col = 4)
}
dev.off()
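As a rough sketch of the rearrangement suggested above: multiplying the simplified equation through by (x - V3) gives V1*(V4-V3) - V2*(x-V3) = 0, which has no singularity in x. The helper name solve_row and the tolerance are assumptions made for this illustration, and the data frame simply retypes the five sample rows from the question. Your real equation will need its own rearrangement.

```r
# The five sample rows from the question
df <- data.frame(
  V1 = c(259.721, 5062.200, 202.897, 181.430, 14807.000),
  V2 = c(224.5090, 3245.6921, 842.6852, 341.4415, 16537.0873),
  V3 = c(0.040140442, 0.037812662, 0.154956206, 0.052389156, -0.001394388),
  V4 = c(0.08100474, 0.08509553, 0.12982818, 0.14196588, 0.08758791)
)

# Hypothetical helper: solve the rearranged (singularity-free) equation
# for one row with base R's uniroot()
solve_row <- function(V1, V2, V3, V4) {
  g <- function(x) V1 * (V4 - V3) - V2 * (x - V3)
  uniroot(g, interval = c(-10, 10), tol = 1e-10)$root
}

# One root per row
df$x <- mapply(solve_row, df$V1, df$V2, df$V3, df$V4)

# Cross-check against the analytical solution used in f1 above
closed_form <- with(df, (V1 / V2) * (V4 - V3) + V3)
```

For the simplified equation the closed form is obviously faster than any root finder on a million rows; the numerical version is only worth it when no closed form exists.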

Graphing a polynomial output of calc.poly

I apologize first for bringing what I imagine to be a ridiculously simple problem here, but I have been unable to glean from the help file for package 'polynom' how to solve this problem. For one out of several years, I have two vectors of x (d for day of year) and y (e for an index of egg production) data:
d=c(169,176,183,190,197,204,211,218,225,232,239,246)
e=c(0,0,0.006839425,0.027323127,0.024666883,0.005603878,0.016599262,0.002810977,0.005603878,0,0.002810977,0.002810977)
I want to, for each year, use the poly.calc function to create a polynomial function that I can use to interpolate the timing of maximum egg production. I want then to superimpose the function on a plot of the data. To begin, I have no problem with the poly.calc function:
egg1996<-poly.calc(d,e)
egg1996
3216904000 - 173356400*x + 4239900*x^2 - 62124.17*x^3 + 605.9178*x^4 - 4.13053*x^5 +
0.02008226*x^6 - 6.963636e-05*x^7 + 1.687736e-07*x^8
I can then simply
plot(d,e)
But when I try to use the lines function to superimpose the function on the plot, I get confused. The help file states that the output of poly.calc is an object of class polynomial, and so I assume that "egg1996" will be the "x" in:
lines(x, len = 100, xlim = NULL, ylim = NULL, ...)
But I cannot seem to, based on the example listed:
lines (poly.calc( 2:4), lty = 2)
Or based on the arguments:
x an object of class "polynomial".
len size of vector at which evaluations are to be made.
xlim, ylim the range of x and y values with sensible defaults
Come up with a command that successfully graphs the polynomial "egg1996" onto the raw data.
I understand that this question is beneath you folks, but I would be very grateful for a little help. Many thanks.
I don't work with the polynom package, but the resultant data set is on a completely different scale (both X & Y axes) than the first plot() call. If you don't mind having it in two separate panels, this provides both plots for comparison:
library(polynom)
d <- c(169,176,183,190,197,204,211,218,225,232,239,246)
e <- c(0,0,0.006839425,0.027323127,0.024666883,0.005603878,
0.016599262,0.002810977,0.005603878,0,0.002810977,0.002810977)
egg1996 <- poly.calc(d,e)
par(mfrow=c(1,2))
plot(d, e)
plot(egg1996)

R, graph of binomial distribution

I have to write my own function to draw the density function of the binomial distribution and then draw the appropriate graphs when n = 20 and p = 0.1, 0.2, ..., 0.9. I also need to comment on the graphs.
I tried this:
graph <- function(n,p){
x <- dbinom(0:n,size=n,prob=p)
return(barplot(x,names.arg=0:n))
}
graph(20,0.1)
graph(20,0.2)
graph(20,0.3)
graph(20,0.4)
graph(20,0.5)
graph(20,0.6)
graph(20,0.7)
graph(20,0.8)
graph(20,0.9)
#OR
graph(20,scan())
My first question: is there any way to avoid writing the line graph(20,p) several times, other than using scan()?
My second question :
I want to see the graph in one device or want to hit ENTER to see the next graph. I wrote
par(mfcol=c(2,5))
graph(20,0.1)
graph(20,0.2)
graph(20,0.3)
graph(20,0.4)
graph(20,0.5)
graph(20,0.6)
graph(20,0.7)
graph(20,0.8)
graph(20,0.9)
but the graphs are too tiny. How can I present the graphs nicely, with a headline giving n=20 and the value of p used to draw each graph? [It can be done by calling mtext() after the function graph, but then I have to write a similar line several times, so I want to do this inside the function graph.]
My last question :
About the comments. The graphs show that as the probability of success, p, increases, the graph tends to the right; that is, the graph is right skewed.
Is there any way to comment on the graph using program?
This is a job for mapply, since you loop over two variables.
graph <- function(n,p){
x <- dbinom(0:n,size=n,prob=p)
barplot(x,names.arg=0:n,
main=sprintf(paste('bin. dist. ',n,p,sep=':')))
}
par(mfcol=c(2,5))
mapply(graph,20,seq(0.1,1,0.1))
Plotting with base graphics is one of the times you often want to use a for loop. Most of the plotting functions return an object invisibly, but you're not interested in these; all you want is the side effect of plotting. A loop ignores the returned objects, whereas the *apply family will waste effort collecting and returning them.
par(mfrow=c(2, 5))
for(p in seq(0.1, 1, len=10))
{
x <- dbinom(0:20, size=20, p=p)
barplot(x, names.arg=0:20, space=0)
}
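A possible way to cover the headline and the "comment by program" parts of the question is sketched below. The names plot_binom and binom_skewness are hypothetical helpers made up for this example, and the 3 x 3 layout is just one choice that gives each of the nine panels more room. The skewness formula (1 - 2p) / sqrt(n p (1 - p)) is the standard one for a binomial distribution; a positive value means a longer right tail.

```r
# Hypothetical helper: bar plot of the binomial pmf with n and p in the title
plot_binom <- function(n, p) {
  probs <- dbinom(0:n, size = n, prob = p)
  barplot(probs, names.arg = 0:n,
          main = sprintf("n = %d, p = %.1f", n, p))
  invisible(probs)
}

# Hypothetical helper: theoretical skewness of a binomial distribution
binom_skewness <- function(n, p) (1 - 2 * p) / sqrt(n * p * (1 - p))

ps <- seq(0.1, 0.9, by = 0.1)

old_par <- par(mfrow = c(3, 3))   # nine panels, one per value of p
for (p in ps) plot_binom(20, p)
par(old_par)                      # restore the previous layout

# A numeric "comment" on each graph: the sign gives the direction of the skew
skew <- binom_skewness(20, ps)
data.frame(p = ps, skewness = round(skew, 3))
```

With these numbers you can check what the pictures suggest: the skewness is positive for p below 0.5, essentially zero at p = 0.5, and negative above it.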

Save heatmap.2 in variable and plot again

I use heatmap.2 from gplots to make a heatmap:
library(gplots)
# some fake data
m = matrix(c(0,1,2,3), nrow=2, ncol=2)
# make heatmap
hm = heatmap.2(m)
When I do 'heatmap.2' directly I get a plot that I can output to a device. How can I make the plot again from my variable 'hm'? Obviously this is a toy example, in real life I have a function that generates and returns a heatmap which I would like to plot later.
There are several alternatives, although none of them are particularly elegant. It depends on if the variables used by your function are available in the plotting environment. heatmap.2 doesn't return a proper "heatmap" object, although it contains the necessary information for plotting the graphics again. See str(hm) to inspect the object.
If the variables are available in your environment, you could just re-evaluate the original plotting call:
library(gplots)
# some fake data (adjusted a bit)
set.seed(1)
m = matrix(rnorm(100), nrow=10, ncol=10)
# make heatmap
hm = heatmap.2(m, col=rainbow(4))
# Below fails if all variables are not available in the global environment
eval(hm$call)
I assume this won't be the case though, as you mentioned that you are calling the plot command from inside a function and I think you're not using any global variables. You could just re-construct the heatmap drawing call from the fields available in your hm-object. The problem is that the original matrix is not available, but instead we have a re-organized $carpet-field. It requires some tinkering to obtain the original matrix, as the projection has been:
# hm2$carpet = t(m[hm2$rowInd, hm2$colInd])
At least in the case when the data matrix has not been scaled, the below should work. Add extra parameters according to your specific plotting call.
func <- function(mat){
h <- heatmap.2(mat, col=rainbow(4))
h
}
# eval(hm2$call) does not work, 'mat' is not available
hm2 <- func(m)
# here hm2$carpet = t(m[hm2$rowInd, hm2$colInd])
# Finding the projection back can be a bit cumbersome:
revRowInd <- match(c(1:length(hm2$rowInd)), hm2$rowInd)
revColInd <- match(c(1:length(hm2$colInd)), hm2$colInd)
heatmap.2(t(hm2$carpet)[revRowInd, revColInd], Rowv=hm2$rowDendrogram, Colv=hm2$colDendrogram, col=hm2$col)
Furthermore, I think you may be able to work your way to evaluating hm$call in the function's environment. Perhaps with-function would be useful.
You could also make mat available by attaching it to the global environment, but I think this is considered bad practice, as too eager use of attach can result in problems. Notice that in my example every call to func creates the original plot.
I would do some functional programming:
create_heatmap <- function(...) {
plot_heatmap <- function() heatmap.2(...)
}
data = matrix(rnorm(100), nrow = 10)
show_heatmap <- create_heatmap(x = data)
show_heatmap()
Pass all of the arguments you need to send to plot_heatmap through the .... The outer function call sets up an environment in which the inner function looks first for its arguments. The inner function is returned as an object and is now completely portable. This should produce the exact same plot each time!
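One detail worth checking with this pattern: arguments passed through ... are evaluated lazily, so if the data object changes before the first call of the returned function, the plot may use the changed data. Below is a minimal sketch of a variant that captures the arguments immediately. It uses base R's image() instead of heatmap.2 only so that it runs without gplots; make_plotter is a hypothetical name for this example.

```r
# Hypothetical helper: capture a plotting function and its arguments now,
# and return a function that redraws the plot later
make_plotter <- function(plot_fun, ...) {
  args <- list(...)              # evaluates the arguments immediately
  function() invisible(do.call(plot_fun, args))
}

set.seed(1)
m <- matrix(rnorm(100), nrow = 10)
m_original <- m                  # kept only to check the captured copy

show_plot <- make_plotter(image, z = m)

m[] <- 0       # later changes to m do not affect the stored copy
show_plot()    # draws the original data
show_plot()    # and can be called again
```

The same wrapper should work with heatmap.2 as plot_fun, assuming gplots is loaded, since it only relies on do.call().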

How to draw lines on a plot in R?

I need to draw lines from the data stored in a text file.
So far I am only able to draw points on a graph, and I would like to have them as lines (line graph).
Here's the code:
pupil_data <- read.table("C:/a1t_left_test.dat", header=T, sep="\t")
max_y <- max(pupil_data$PupilLeft)
plot(NA,NA,xlim=c(0,length(pupil_data$PupilLeft)), ylim=c(2,max_y));
for (i in 1:(length(pupil_data$PupilLeft) - 1))
{
points(i, y = pupil_data$PupilLeft[i], type = "o", col = "red", cex = 0.5, lwd = 2.0)
}
Please help me change this line of code:
points(i, y = pupil_data$PupilLeft[i], type = "o", col = "red")
to draw lines from the data.
Here is the data in the file:
PupilLeft
3.553479
3.539469
3.527239
3.613131
3.649437
3.632779
3.614373
3.605981
3.595985
3.630766
3.590724
3.626535
3.62386
3.619688
3.595711
3.627841
3.623596
3.650569
3.64876
By default, R will plot a single vector as the y coordinates, and use a sequence for the x coordinates. So to make the plot you are after, all you need is:
plot(pupil_data$PupilLeft, type = "o")
You haven't provided any example data, but you can see this with the built-in iris data set:
plot(iris[,1], type = "o")
This does in fact plot the points as lines. If you are actually getting points without lines, you'll need to provide a working example with your data to figure out why.
EDIT:
Your original code doesn't work because of the loop. You are in effect asking R to plot a line connecting a single point to itself each time through the loop. The next time through the loop R doesn't know that there are other points that you want connected; if it did, this would break the intended use of points, which is to add points/lines to an existing plot.
Of course, the line connecting a point to itself doesn't really make sense, and so it isn't plotted (or is plotted too small to see, same result).
Your example is most easily done without a loop:
PupilLeft <- c(3.553479 ,3.539469 ,3.527239 ,3.613131 ,3.649437 ,3.632779 ,3.614373
,3.605981 ,3.595985 ,3.630766 ,3.590724 ,3.626535 ,3.62386 ,3.619688
,3.595711 ,3.627841 ,3.623596 ,3.650569 ,3.64876)
plot(PupilLeft, type = 'o')
If you really do need to use a loop, then the coding becomes more involved. One approach would be to use a closure:
makeaddpoint <- function(firstpoint){
## firstpoint is the y value of the first point in the series
lastpt <- firstpoint
lastptind <- 1
addpoint <- function(nextpt, ...){
pts <- rbind(c(lastptind, lastpt), c(lastptind + 1, nextpt))
points(pts, ... )
lastpt <<- nextpt
lastptind <<- lastptind + 1
}
return(addpoint)
}
myaddpoint <- makeaddpoint(PupilLeft[1])
plot(NA,NA,xlim=c(0,length(PupilLeft)), ylim=c(2,max(PupilLeft)))
for (i in 2:(length(PupilLeft)))
{
myaddpoint(PupilLeft[i], type = "o")
}
You can then wrap the myaddpoint call in the for loop with whatever testing you need to decide whether or not you will actually plot that point. The function returned by makeaddpoint will keep track of the plot indexing for you.
This is normal programming for Lisp-like languages. If you find it confusing you can do this without a closure, but you'll need to handle incrementing the index and storing the previous point value 'manually' in your loop.
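There is a strong aversion among experienced R coders to using for-loops when not really needed. This is an example of a loop-less use of a vectorized function named segments that takes 4 vectors as arguments: x0, y0, x1, y1. Note that segments adds to an existing plot, so a plot (for example the empty one created with plot(NA, NA, ...) in the question) must already be open.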
npups <-length(pupil_data$PupilLeft)
segments(1:(npups-1), pupil_data$PupilLeft[-npups], # the starting points
2:npups, pupil_data$PupilLeft[-1] ) # the ending points
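For completeness, here is a self-contained sketch of the segments approach that also sets up the plotting region first. It uses the values posted in the question directly instead of reading the file; the variable names x0, y0, x1, y1 and the axis labels are just choices for this example.

```r
# Data from the question, typed in directly
PupilLeft <- c(3.553479, 3.539469, 3.527239, 3.613131, 3.649437, 3.632779,
               3.614373, 3.605981, 3.595985, 3.630766, 3.590724, 3.626535,
               3.62386, 3.619688, 3.595711, 3.627841, 3.623596, 3.650569,
               3.64876)
n <- length(PupilLeft)

# Empty plot with the right axis ranges (type = "n" draws nothing)
plot(seq_len(n), PupilLeft, type = "n", xlab = "Index", ylab = "PupilLeft")

# Each segment joins point i to point i + 1
x0 <- 1:(n - 1); y0 <- PupilLeft[-n]   # starting points
x1 <- 2:n;       y1 <- PupilLeft[-1]   # ending points
segments(x0, y0, x1, y1, col = "red", lwd = 2)

# Optionally mark the points themselves
points(seq_len(n), PupilLeft, col = "red", cex = 0.5)
```

If you do not need per-segment control (for example a different colour per segment), plot(PupilLeft, type = "o") from the earlier answer gives the same picture in one call.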

Resources