R function doesn't do what I want

I don't understand why my function doesn't work. Can you help me find my error?
VehiculeFunction <- function(data){
my.data <- data[data$GAMME =="M1",]
ma.col = rgb(red = 0.1,blue = 1,green = 0.1, alpha = 0.2)
X <- my.data$GMF.24
Y <- my.data$Cout.24
X11()
plot(X, Y, pch=20, las = 1, col = ma.col, xlim = c(0, 10), ylim = c(0,10))
identify(X, Y, labels = my.data$NITG, cex = 0.7)
}
This one works perfectly, but when I add two arguments for the column names it returns "numeric(0)":
VehiculeFunction <- function(data, x, y){
my.data <- data[data$GAMME =="M1",]
ma.col = rgb(red = 0.1,blue = 1,green = 0.1, alpha = 0.2)
X <- my.data$x
Y <- my.data$y
X11()
plot(X, Y, pch=20, las = 1, col = ma.col, xlim = c(0, 10), ylim = c(0,10))
identify(X, Y, labels = my.data$NITG, cex = 0.7)
}
VehiculeFunction(data.vehicule, GMF.24, Cout.24)
numeric(0)
Thank you, Oleg. What if I want to add a condition on another variable of my data frame? I want to keep all rows where "RANG_NITG_PROJET_K" is one of the first three ranks (1, 2 or 3):
my.data1 <- my.data[my.data$RANG_NITG_PROJET_K == 1|2|3,] ?
but I think it's wrong, because when I do this
my.data1 <- data.vehicule[data.vehicule$RANG_NITG_PROJET_K == 1,]
my.data2 <- data.vehicule[data.vehicule$RANG_NITG_PROJET_K == 2,]
my.data3 <- data.vehicule[data.vehicule$RANG_NITG_PROJET_K == 3,]
my.data <- rbind(my.data1, my.data2, my.data3)
the two approaches give me two different data frames?
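A note on this follow-up: in R, == binds more tightly than |, so x == 1|2|3 is read as (x == 1) | 2 | 3, and the non-zero numbers 2 and 3 count as TRUE. The condition is therefore TRUE for every row. Here is a minimal sketch using %in% instead, on a made-up data frame (the names df, rang and id are placeholders, not columns of data.vehicule):

```r
# Toy data: five rows with ranks 1..5 (hypothetical, for illustration only)
df <- data.frame(rang = c(1, 2, 3, 4, 5), id = letters[1:5])

# Parsed as (df$rang == 1) | 2 | 3, which is TRUE everywhere
wrong <- df[df$rang == 1 | 2 | 3, ]

# Keeps only the rows whose rank is 1, 2 or 3
right <- df[df$rang %in% c(1, 2, 3), ]

nrow(wrong)
nrow(right)
```

The rbind() version in the question should select the same rows as %in%, only grouped by rank rather than in their original order.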

Try the following. The $ operator takes the name after it literally, so my.data$x looks for a column (or list element) named "x" rather than the column whose name is stored in x. Use [[ ]] instead and pass the column names for x and y as strings:
VehiculeFunction <- function(data, x, y, gamme){
my.data <- data[data$GAMME == gamme,]
ma.col = rgb(red = 0.1,blue = 1,green = 0.1, alpha = 0.2)
X <- my.data[[x]] # <- change from $ to [[]]
Y <- my.data[[y]] # <- change from $ to [[]]
X11()
plot(X, Y, pch=20, las = 1, col = ma.col, xlim = c(0, 10), ylim = c(0,10))
identify(X, Y, labels = my.data$NITG, cex = 0.7)
}
VehiculeFunction(data.vehicule, "GMF.24", "Cout.24", "M1") # <- change to strings
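The difference between $ and [[ ]] can be seen on a small example. This is only a sketch; the toy data frame below is made up and simply reuses the column names from the question:

```r
# Toy data frame (values are made up)
df <- data.frame(GMF.24 = c(1.5, 2.5), Cout.24 = c(3, 4))

x <- "GMF.24"            # column name stored in a variable

via_dollar   <- df$x     # looks for a column literally named "x": NULL
via_brackets <- df[[x]]  # looks up the column whose name is stored in x

is.null(via_dollar)
via_brackets
```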

Related

Using xyplot within custom function with panel function in R

I'm trying to generate a custom xyplot in the lattice graphics package. It is a plot of two lines with range limits plotted.
# Raw dataframe in long format
df <- data.frame(
Response = c(runif(100), rnorm(100)),
Trial = c(rep("A", 100), rep("C", 100)),
Year = 1:10,
Rep = rep(1:10, each = 10))
# Aggregate the data (take mean/min/max across "Rep" variable
gdf <- do.call(data.frame, aggregate(Response ~ Year + Trial, data = df,
FUN = function(x) c(avg = mean(x), mini = min(x), maxi = max(x))))
# Plot using xyplot (without making this a function)
my.panel.bands <- function(x, y, upper, lower, fill, col,
subscripts, ..., font, fontface){
upper <- upper[subscripts]
lower <- lower[subscripts]
panel.polygon(c(x, rev(x)), c(upper, rev(lower)),
col = fill, alpha = 0.2, border = FALSE, ...)
}
f1 <- formula(Response.avg ~ Year)
p <- xyplot(x = f1, data = gdf, groups = Trial,
col=c("red", "blue"), pch = 16,
scales = list(x = list(rot = 45)),
xlab = 'Year', ylab = 'Response',
layout = c(1, 1),
ylim = c(min(gdf$Response.mini), max(gdf$Response.maxi)),
upper = gdf$Response.maxi,
lower = gdf$Response.mini,
panel = function(x, y, ...){
panel.superpose(x, y, panel.groups = my.panel.bands,
type = 'l', fill = c("red", "blue"),...)
panel.xyplot(x, y, type = 'b', cex = 0.6, lty = 1, ...)
}
)
png("panel_plot.png")
print(p)
dev.off()
However, if I try to make a custom function out of this xyplot command then I get a very different plot to what I was expecting. I'm assuming I'm doing something incorrect with passing the grouping variable or in using the panel function.
panel_plot <- function(f, df, grouper, xlabel, ylabel,
ylim, upper_border, lower_border, mfcol){
p <- xyplot(x = f, data = df, groups = eval(grouper),
col = c("red", "blue"), pch = 16,
scales = list(x = list(rot = 45)),
xlab = xlabel, ylab = ylabel,
ylim = ylim,
layout = mfcol,
upper = upper_border,
lower = lower_border,
panel = function(x, y, ...){
panel.superpose(x, y, panel.groups = my.panel.bands,
type = 'l', fill = c("red", "blue"),...)
panel.xyplot(x, y, type = 'b', cex = 0.6, lty = 1, ...)
}
)
return(p)
}
f1 <- formula(Response.avg ~ Year)
p <- panel_plot(f1, gdf, grouper = Trial,
xlabel = "Year", ylabel = "Response",
ylim = c(min(gdf$Response.mini),max(gdf$Response.maxi)),
upper_border = gdf$Response.maxi,
lower_border = gdf$Response.mini,
mfcol = c(1, 1))
png("panel_plot_asfunction.png")
print(p)
dev.off()
Finally, if I pass the name of the variable to the group argument as a string and modify the panel_plot() function to redefine a new variable in the data.frame, then it works as expected but this seems like a strange way to do things.
panel_plot <- function(f, df, grouper, xlabel, ylabel,
ylim, upper_border, lower_border, mfcol){
df$grouper <- df[, grouper]
p <- xyplot(x = f, data = df, groups = grouper,
...
p <- panel_plot(f1, gdf, grouper = "Trial",
...
How do I define the panel_plot function so that I don't have to create a dummy column and can pass the variable name (Trial) to this function so that it is not passed as a string?
I've tried using the suggestion here but using eval() on the variable name provided the unexpected figure above.
Consider combining both solutions from your linked SO post: the match.call() list and eval(call(), ...). On their own, neither worked on my end.
panel_plot <- function(f, df, grouper, xlabel, ylabel,
ylim, upper_border, lower_border, mfcol){
ll <- as.list(match.call(expand.dots = FALSE)[-1])
my_panel <- function(x, y, ...){
panel.superpose(x, y, panel.groups = my.panel.bands,
type = 'l', fill = c("red", "blue"),...)
panel.xyplot(x, y, type = 'b', cex = 0.6, lty = 1, ...)
}
p <- eval(call("xyplot",
x = ll$f,
data = ll$df,
groups = ll$grouper,
xlab = ll$xlabel, ylab = ll$ylabel,
ylim = ll$ylim,
layout = ll$mfcol,
upper = ll$upper_border,
lower = ll$lower_border,
panel = my_panel
)
)
return(p)
}
f1 <- formula(Response.avg ~ Year)
p <- panel_plot(f1, gdf, grouper = Trial,
xlabel = "Year", ylabel = "Response",
ylim = c(min(gdf$Response.mini), max(gdf$Response.maxi)),
upper_border = gdf$Response.maxi,
lower_border = gdf$Response.mini,
mfcol = c(1, 1))
p
However, the aesthetics are not exactly the same, for example the red and blue lines and points. Possibly this is because eval(call(...)) requires all arguments to be specified? Note that the call above does not pass col, pch or scales, which may account for the difference. Try adjusting the arguments or my_panel.
Alternatively, keep your dummy column but still pass an unquoted name, and evaluate the column name with eval. Here the aesthetics render as in the non-function version.
panel_plot <- function(f, df, grouper, xlabel, ylabel,
ylim, upper_border, lower_border, mfcol){
df$grouper <- eval(as.name(deparse(substitute(grouper))), df, .GlobalEnv)
p <- xyplot(x = f, data = df, groups = grouper,
...
}
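The capture step on its own, without lattice, can be tried in isolation. This is a minimal sketch and not the answerer's exact code: get_group is a hypothetical helper, and it uses substitute() with eval() directly to look up an unquoted column name in the data frame.

```r
# Hypothetical helper: return the column of `df` named by the unquoted `grouper`
get_group <- function(df, grouper) {
  # substitute() captures the expression typed by the caller (e.g. Trial);
  # eval() then looks it up in df first, falling back to the caller's frame
  eval(substitute(grouper), df, parent.frame())
}

gdf_small <- data.frame(Year = c(1, 2, 1, 2),
                        Trial = c("A", "A", "C", "C"))

g <- get_group(gdf_small, Trial)   # unquoted, as in panel_plot(..., grouper = Trial)
g
```

Inside panel_plot() the result could then be assigned to the dummy column, as in the variant above.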

Add the new regression line but keep the regression lines from previous runs in R

Background
I have a function called TPN (R code is below the picture). When you run this function, it produces two plots (see picture below). The bottom-row plot samples from the top-row plot and then adds a red regression line. Each time you run the TPN function, the bottom-row plot produces a new red-colored regression line.
Question
In the bottom-row plot, I was wondering if there is a way I could KEEP the regression lines from previous runs each time I run the TPN function (see picture below)?
That is, each time I run the TPN function, the regression line from the previous run is kept in place (probably in a color other than "red" to tell them apart), and the new regression line is simply added to the bottom-row plot?
############## Input Values #################
TPN = function( each.sub.pop.n = 150,
sub.pop.means = 20:10,
predict.range = 10:0,
sub.pop.sd = .75,
n.sample = 2 ) {
#############################################
par( mar = c(2, 4.1, 2.1, 2.1) )
m = matrix( c(1, 2), nrow = 2, ncol = 1 ); layout(m)
set.seed(2460986)
Vec.rnorm <- Vectorize(function(n, mean, sd) rnorm(n, mean, sd), 'mean')
y <- c( Vec.rnorm(each.sub.pop.n, sub.pop.means, sub.pop.sd) )
set.seed(NULL)
x <- rep(predict.range, each = each.sub.pop.n)
plot(x, y, ylim = range(y)) ## Top-Row Plot
sample <- lapply(split(y, x), function(z) sample(z, n.sample, replace = TRUE))
sample <- data.frame(y = unlist(sample),
x = as.numeric(rep(names(sample), each = n.sample)))
x = sample$x ; y = sample$y
plot(x, y, ylim = range(y)) #### BOTTOM-ROW PLOT
abline(lm(y ~ x), col = 'red') # Regression Line
}
## TEST HERE:
TPN()
It isn't that easy. I wrote a second function and edited the first one as well.
To summarize what I have done:
The first function now sets par(new = TRUE) at the end. I also set the color of the points in the bottom-row plot to white, purely for formatting; you can drop col = 'white', bg = 'white' if you wish.
In the second function, the top-row plot is not drawn and the axes are not redrawn on the bottom-row plot at each "test".
Look below:
############## Input Values #################
TPN = function( each.sub.pop.n = 150,
sub.pop.means = 20:10,
predict.range = 10:0,
sub.pop.sd = .75,
n.sample = 2 ) {
#############################################
par( mar = c(2, 4.1, 2.1, 2.1) )
m = matrix( c(1, 2), nrow = 2, ncol = 1 ); layout(m)
set.seed(2460986)
Vec.rnorm <- Vectorize(function(n, mean, sd) rnorm(n, mean, sd), 'mean')
y <- c( Vec.rnorm(each.sub.pop.n, sub.pop.means, sub.pop.sd) )
set.seed(NULL)
x <- rep(predict.range, each = each.sub.pop.n)
par(new = FALSE)
plot(x, y, ylim = range(y)) ## Top-Row Plot
sample <- lapply(split(y, x), function(z) sample(z, n.sample, replace = TRUE))
sample <- data.frame(y = unlist(sample),
x = as.numeric(rep(names(sample), each = n.sample)))
x = sample$x ; y = sample$y
plot(x, y, ylim = range(y), col = 'white', bg = 'white') #### BOTTOM-ROW PLOT
abline(lm(y ~ x), col = 'red') # Regression Line
par(new = TRUE)
}
The second function does not draw the top-row plot:
############## Input Values #################
TPN2 = function( each.sub.pop.n = 150,
sub.pop.means = 20:10,
predict.range = 10:0,
sub.pop.sd = .75,
n.sample = 2 ) {
#############################################
par( mar = c(2, 4.1, 2.1, 2.1) )
m = matrix( c(1, 2), nrow = 2, ncol = 1 ); layout(m)
set.seed(2460986)
Vec.rnorm <- Vectorize(function(n, mean, sd) rnorm(n, mean, sd), 'mean')
y <- c( Vec.rnorm(each.sub.pop.n, sub.pop.means, sub.pop.sd) )
set.seed(NULL)
x <- rep(predict.range, each = each.sub.pop.n)
#par(new = FALSE) #comment-out
#plot(x, y, ylim = range(y)) ##Top-Row Plot #comment-out
sample <- lapply(split(y, x), function(z) sample(z, n.sample, replace = TRUE))
sample <- data.frame(y = unlist(sample),
x = as.numeric(rep(names(sample), each = n.sample)))
x = sample$x ; y = sample$y
plot(x, y, ylim = range(y), axes = FALSE, col = 'white', bg = 'white') ##BOTTOM-ROW PLOT
abline(lm(y ~ x), col = 'blue') # Regression Line
par(new = TRUE)
}
Then your test would be like this:
## TEST HERE:
TPN()
TPN2()
TPN2()
TPN2()
This is the output:
A simple way to do what you want is to change your main effect (currently none) to return an accumulation of previous regressions and your side effect (plotting) to loop through these previous regressions (in blue) in addition to the current one (in red).
Another tip: you can use the abline(reg=lm(y~x)) argument and just accumulate the lm objects in a list. It's not necessary to store coefficients and intercepts separately as suggested in the other answer. Keeping the lm objects is also a good idea in case you want to go back and look at average R-squared, etc. -- you couldn't do that using only the coefficients.
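As a small illustration of that last point, here is a sketch of what a list of stored lm objects allows afterwards. The data are simulated and the names fits and r2 are made up for the example:

```r
set.seed(1)

# Three made-up regressions, standing in for the accumulated history
fits <- lapply(1:3, function(i) {
  x <- 1:20
  y <- 2 * x + rnorm(20)
  lm(y ~ x)
})

# R-squared of each stored fit, and their average
r2 <- sapply(fits, function(m) summary(m)$r.squared)
mean(r2)

# The coefficients remain available too, one column per fit
sapply(fits, coef)
```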
Your new function could look like:
TPN.accum <- function( each.sub.pop.n = 150,
sub.pop.means = 20:10,
predict.range = 10:0,
sub.pop.sd = .75,
n.sample = 2,
lm.history = list() # the accumulator
){
par( mar = c(2, 4.1, 2.1, 2.1) )
m <- matrix( c(1, 2), nrow = 2, ncol = 1 ); layout(m)
set.seed(2460986)
Vec.rnorm <- Vectorize(function(n, mean, sd) rnorm(n, mean, sd), 'mean')
y <- c( Vec.rnorm(each.sub.pop.n, sub.pop.means, sub.pop.sd) )
set.seed(NULL)
x <- rep(predict.range, each = each.sub.pop.n)
plot(x, y, ylim = range(y)) ### Top-Row Plot
sample <- lapply(split(y, x), function(z) sample(z, n.sample, replace = TRUE))
sample <- data.frame(y = unlist(sample),
x = as.numeric(rep(names(sample), each = n.sample)))
x <- sample$x ; y <- sample$y
lm.current <- lm(y~x) # the current regression
plot(x, y, ylim = range(y)) ### Bottom-Row Plot
abline(reg = lm.current, col = 'red') # plot current regression (red)
for( i in seq_along(lm.history) ){
abline(reg=lm.history[[i]], col='blue') # plot any previous regressions (blue)
}
return(c(lm.history, list(lm.current))) # append current regression to accumulator
}
To initialize it and then run it repeatedly, just do something like:
tpn.history <- TPN.accum()
for (i in 1:5) tpn.history <- TPN.accum(lm.history=tpn.history)
And your output will look like:
I propose two possibilities:
Use par(mfg) to define on which panel to draw, so that you can add new points or lines on any of the two. For the color, I propose to add options saying if this is the first plot or the last plot of the series.
Store the coefficients of the abline to be used on other plots.
Use par(mfg)
I used a transparent color so that the points from every iteration are not all drawn on top of each other. Depending on what you want to achieve, you can modify this.
############## Input Values #################
TPN <- function(each.sub.pop.n = 150,
sub.pop.means = 20:10,
predict.range = 10:0,
sub.pop.sd = .75,
n.sample = 2,
plot = TRUE,
first = FALSE,
last = FALSE) {
#############################################
if (plot & first) {
plot.new()
m <- matrix( c(1, 2), nrow = 2, ncol = 1 ); layout(m)
par( mar = c(2, 4.1, 2.1, 2.1) )
}
set.seed(2460986)
Vec.rnorm <- Vectorize(function(n, mean, sd) rnorm(n, mean, sd), 'mean')
y <- c( Vec.rnorm(each.sub.pop.n, sub.pop.means, sub.pop.sd) )
set.seed(NULL)
x <- rep(predict.range, each = each.sub.pop.n)
if (plot) {
par(mfg = c(1,1)) ## Top-Row Plot
if (first) {
plot(x, y, ylim = range(y), col = "transparent")
} else if (last) {
plot(x, y, ylim = range(y))
}
}
sample <- lapply(split(y, x), function(z) sample(z, n.sample, replace = TRUE))
sample <- data.frame(y = unlist(sample),
x = as.numeric(rep(names(sample), each = n.sample)))
x = sample$x ; y = sample$y
if (plot) {
par(mfg = c(2,1)) #### BOTTOM-ROW PLOT
if (first) {
plot(x, y, ylim = range(y), col = "transparent")
}
if (last) {
points(x, y)
}
abline(lm(y ~ x), col = c('blue', 'red')[(last) + 1]) # Regression Line
}
}
## TEST HERE:
n <- 10
for (i in 1:n) {
TPN(first = ifelse(i == 1, TRUE, FALSE), last = ifelse(i == n, TRUE, FALSE))
}
Store the abline coefficients
No transparent color is needed here, because a new plot is created at each iteration.
############## Input Values #################
TPN <- function(each.sub.pop.n = 150,
sub.pop.means = 20:10,
predict.range = 10:0,
sub.pop.sd = .75,
n.sample = 2,
plot = TRUE,
coefs = FALSE,
coefsup = NULL) {
#############################################
if (plot) {
m <- matrix( c(1, 2), nrow = 2, ncol = 1 )
layout(m)
par( mar = c(2, 4.1, 2.1, 2.1) )
}
set.seed(2460986)
Vec.rnorm <- Vectorize(function(n, mean, sd) rnorm(n, mean, sd), 'mean')
y <- c( Vec.rnorm(each.sub.pop.n, sub.pop.means, sub.pop.sd) )
set.seed(NULL)
x <- rep(predict.range, each = each.sub.pop.n)
if (plot) {
plot(x, y, ylim = range(y))
}
sample <- lapply(split(y, x), function(z) sample(z, n.sample, replace = TRUE))
sample <- data.frame(y = unlist(sample),
x = as.numeric(rep(names(sample), each = n.sample)))
x = sample$x ; y = sample$y
if (plot) {
plot(x, y, ylim = range(y))
# Add the previous lines, if any exist
if (!is.null(coefsup)) {
apply(coefsup, 1, function(x) abline(a = x[1], b = x[2], col = "blue"))
}
abline(lm(y ~ x), col = 'red') # Regression Line
}
if (coefs) {return(coef(lm(y ~ x)))}
}
# TEST with coefs
n <- 10
coefsup <- NULL
for (i in 1:n) {
coefsup <- rbind(coefsup, TPN(coefs = TRUE, coefsup = coefsup))
}
In both cases, the output is what you expect:

Plot vectors of gradient descent in R

I've coded a gradient descent algorithm in R and now I'm trying to "draw" the path of the vectors.
I managed to draw the points on my contour plot, but that's not enough because nobody can tell which point came first.
In my algorithm I always have a previous state P = (Xi, Yi) and a later state L = (Xi+1, Yi+1). How can I draw the vector PL on a contour or a persp plot?
So far I only have this with contour, where the red point is the point of convergence:
The same for persp:
Thanks all!
EDIT:
The plots can be reproduced with:
f<-function(u,v){
u*u*exp(2*v)+4*v*v*exp(-2*u)-4*u*v*exp(v-u)
}
x = seq(-2, 2, by = 0.5)
y = seq(-2, 2, by = 0.5)
z <- outer(x,y,f)
#Contour plot
contour(x,y,z)
#Persp plot
persp(x, y, z, phi = 25, theta = 55, xlim=c(-2,2), ylim=c(-2,2),
xlab = "U", ylab = "V",
main = "F(u,v)", col="yellow", ticktype = "detailed"
) -> res
Taking Himmelblau's function as a test example:
f <- function(x, y) { (x^2+y-11)^2 + (x+y^2-7)^2 }
Its partial derivatives:
dx <- function(x,y) {4*x**3+4*x*y-42*x+2*y**2-14}
dy <- function(x,y) {4*y**3+2*x**2-26*y+4*x*y-22}
Running the gradient descent:
# gradient descent parameters
num_iter <- 100
learning_rate <- 0.001
x_val <- 6
y_val <- 6
updates_x <- vector("numeric", length = num_iter)
updates_y <- vector("numeric", length = num_iter)
updates_z <- vector("numeric", length = num_iter)
# parameter updates
for (i in 1:num_iter) {
dx_val = dx(x_val,y_val)
dy_val = dy(x_val,y_val)
x_val <- x_val-learning_rate*dx_val
y_val <- y_val-learning_rate*dy_val
z_val <- f(x_val, y_val)
updates_x[i] <- x_val
updates_y[i] <- y_val
updates_z[i] <- z_val
}
Plotting:
x <- seq(-6, 6, length = 100)
y <- x
z <- outer(x, y, f)
plt <- persp(x, y, z,
theta = -50-log(i), phi = 20+log(i),
expand = 0.5,
col = "lightblue", border = 'lightblue',
axes = FALSE, box = FALSE,
ltheta = 60, shade = 0.90
)
points(trans3d(updates_x[1:i], updates_y[1:i], updates_z[1:i],pmat = plt),
col = c(rep('white', num_iter-1), 'blue'),
pch = 16,
cex = c(rep(0.5, num_iter-1), 1))
There's a trick to plotting points using persp, as mentioned in ?persp. By employing the power of trans3d, you can successfully put points and lines on a perspective plot.
f<-function(u,v){
u*u*exp(2*v)+4*v*v*exp(-2*u)-4*u*v*exp(v-u)
}
x = seq(-2, 2, by = 0.5)
y = seq(-2, 2, by = 0.5)
z <- scale(outer(x,y,f))
view <- persp(x, y, z, phi = 30, theta = 30, xlim=c(-2,2), ylim=c(-2,2),
xlab = "X", ylab = "Y", zlab = "Z", scale = FALSE,
main = "F(u,v)", col="yellow", ticktype = "detailed")
set.seed(2)
pts <- data.frame(x = sample(x, 3),
y = sample(y, 3),
z = sample(z, 3))
points(trans3d(x = pts$x, y = pts$y, z = pts$z, pmat = view), pch = 16)
lines(trans3d(x = pts$x, y = pts$y, z = pts$z, pmat = view))
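For the contour version of the question, base graphics' arrows() can draw each step from P to L, so the order of the iterations is visible. The following is only a sketch: the path coordinates are made up for illustration (they are not the output of an actual descent), and steps is a hypothetical helper data frame.

```r
f <- function(u, v) {
  u*u*exp(2*v) + 4*v*v*exp(-2*u) - 4*u*v*exp(v-u)
}
x <- seq(-2, 2, by = 0.1)
y <- seq(-2, 2, by = 0.1)
z <- outer(x, y, f)

# Hypothetical sequence of states visited by the algorithm
path_x <- c(1.5, 1.0, 0.6, 0.3, 0.1)
path_y <- c(1.5, 1.1, 0.8, 0.5, 0.3)
n <- length(path_x)

# One row per step: previous state P = (x0, y0), later state L = (x1, y1)
steps <- data.frame(x0 = path_x[-n], y0 = path_y[-n],
                    x1 = path_x[-1], y1 = path_y[-1])

contour(x, y, z)
arrows(steps$x0, steps$y0, steps$x1, steps$y1, length = 0.08, col = "blue")
points(path_x[n], path_y[n], col = "red", pch = 16)  # final state
```

On a persp plot the same idea should work by passing both endpoints through trans3d() first and drawing the arrows between the projected coordinates.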

Plot a line that connects the outer points of a plot

I would like to plot a line that connects the outer points of the plot
plot(rnorm(1000), rnorm(1000), xlim=c(-5,5),ylim=c(-5,5))
and thus "bags" all points of the plot
The function to use here is chull. Line 4 of the code repeats the first hull point so that the outline is closed. For more examples, see here.
x <- data.frame(x = rnorm(100), y = rnorm(100))
plot(x)
chx <- chull(x)
chx <- rbind(x = x[chx, ], x[chx[1], ])
lines(chx)
a <- rnorm(1000)
b <- rnorm(1000)
Plot_ConvexHull<-function(xcoord, ycoord, lcolor){
hpts <- chull(x = xcoord, y = ycoord)
hpts <- c(hpts, hpts[1])
lines(xcoord[hpts], ycoord[hpts], col = lcolor)
}
(xrange <- range(c(a)))
(yrange <- range(c(b)))
par(tck = 0.02, mgp = c(1.7, 0.3, 0))
plot(a, b, type = "p", pch = 1, col = "black", xlim = c(xrange), ylim = c(yrange))
Plot_ConvexHull(xcoord = a, ycoord = b, lcolor = "black")
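To see what chull() returns, here is a tiny example with made-up points: it gives the indices of the points lying on the hull, which is why both answers index the original coordinates with its result and repeat the first index to close the outline.

```r
# Four corners of a unit square plus one interior point (made-up data)
px <- c(0, 1, 1, 0, 0.5)
py <- c(0, 0, 1, 1, 0.5)

hull <- chull(px, py)         # indices of the hull points; point 5 is inside
closed <- c(hull, hull[1])    # repeat the first index to close the polygon

plot(px, py)
lines(px[closed], py[closed])
```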

Plotting deviations from regression line

I want to plot a regression line with (a = 0 and b = 1) and add the individual point deviations from this along with identifying the data point with name.
set.seed(123)
namelab <- paste ("ET", 1:10, sep = "")
xvar <- 1:10
yvar <- rnorm(10, 5, 5)
myd <- data.frame(namelab, xvar, yvar)
plot(xvar, yvar)
abline (a= 0, b = 1, col = "red", lty = 2)
This is just a manual sketch of my intention; I labelled a single point as an example. The line drawn needs to be slim.
dev.new(width=4, height=4)
plot(xvar, yvar, asp=1)
a = 0
b = 1
abline (a, b, col = "red", lty = 2)
myd$xint = with(myd, (b*yvar + xvar - b*a) / (b^2 + 1))
myd$yint = with(myd, (b^2*yvar + b*xvar + a) / (b^2 + 1))
with(myd, segments(xvar, yvar, xint, yint))
with(myd, text(xvar, yvar, labels=namelab, pos=3, cex=0.5))
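The foot of the perpendicular from a point (x0, y0) to the line y = a + b*x can be checked numerically. This is a sketch with an arbitrary point and an arbitrary line; foot is a hypothetical helper, not part of the answer above.

```r
# Hypothetical helper: foot of the perpendicular from (x0, y0) to y = a + b*x
foot <- function(x0, y0, a, b) {
  xint <- (x0 + b * (y0 - a)) / (b^2 + 1)
  yint <- a + b * xint          # the foot lies on the line by construction
  list(x = xint, y = yint)
}

a <- 2; b <- 0.5                # arbitrary line
p <- foot(3, 7, a, b)           # arbitrary point (3, 7)

# The offset (x0 - xint, y0 - yint) should be orthogonal to the
# line's direction vector (1, b), i.e. their dot product is zero
dot <- (3 - p$x) * 1 + (7 - p$y) * b
all.equal(dot, 0)
```

With b = 1 and a = 0, as in the question, the helper reduces to xint = yint = (xvar + yvar) / 2.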
...and if you did want vertical as opposed to perpendicular offsets, here is a pretty straightforward option:
set.seed(123)
namelab <- paste ("ET", 1:10, sep = "")
xvar <- 1:10
yvar <- rnorm(10, 5, 5)
plot(xvar, yvar)
abline (a= 0, b = 1, col = "red", lty = 2)
segments(xvar,yvar,xvar,xvar)
text(xvar,yvar,namelab,pos=3)
For this to work for any value of a and b, you would use:
segments(xvar,yvar,xvar,((xvar*b)+a))
