Is there any way to create lines in R connecting two points?
I am aware of the lines() function, but it draws a line segment; what I am looking for is a line of infinite length.
Here's an example of Martha's suggestion:
set.seed(1)
x <- runif(2)
y <- runif(2)
# function
segmentInf <- function(xs, ys){
fit <- lm(ys~xs)
abline(fit)
}
plot(x,y)
segmentInf(x,y)
#define x and y values for the two points
x <- rnorm(2)
y <- rnorm(2)
slope <- diff(y)/diff(x)
intercept <- y[1]-slope*x[1]
plot(x, y)
abline(intercept, slope, col="red")
# repeat the above as many times as you like to satisfy yourself
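One caveat with the slope/intercept approach above: if the two points share the same x value, diff(x) is zero and the slope is infinite, so abline(intercept, slope) cannot draw the line. Here is a minimal sketch of a guard for that case. The helper name infinite_line is made up for illustration and is not part of base R.

```r
# Hypothetical helper: draw an infinite line through two points,
# falling back to a vertical line when both x values are equal.
infinite_line <- function(x, y, ...) {
  stopifnot(length(x) == 2, length(y) == 2)
  if (x[1] == x[2]) {
    abline(v = x[1], ...)                     # vertical line: slope is undefined
    return(invisible(c(intercept = NA, slope = Inf)))
  }
  slope <- diff(y) / diff(x)
  intercept <- y[1] - slope * x[1]
  abline(a = intercept, b = slope, ...)
  invisible(c(intercept = intercept, slope = slope))
}

plot(c(1, 3), c(2, 6))
coefs <- infinite_line(c(1, 3), c(2, 6), col = "red")   # line through (1,2) and (3,6)

plot(c(2, 2), c(0, 5))
vert <- infinite_line(c(2, 2), c(0, 5), col = "blue")   # vertical line at x = 2
```

The function returns the coefficients invisibly, so you can inspect them if needed without cluttering the console.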
Use the segments() function.
#example
x1 <- stats::runif(5)
x2 <- stats::runif(5)+2
y <- stats::rnorm(10)
plot(c(x1,x2), y)
segments(x1, y[1:5], x2, y[6:10], col= 'blue')
Related
I have a cloud of points given as (x, y, z) triplets.
I would like to bin them and average their values in each bin. So far, I have used the binning function, which works fine, as shown in my test code:
library(sm) # assumption: binning() here is the one from the sm package
n <- 1000
x <- rnorm(n)
y <- 0.2*x + 0.5*rnorm(n)
z <- y + 0.5*rnorm(n)
x1 <- cbind(x,y,z)
xb<- binning(x1, nbins=12)
mz1<-apply(xb$table.freq, c(1,2), sum)
dim1 <- ncol(mz1)
dim2 <- length(mz1[,1])
image(1:dim1, 1:dim2, mz1, axes = TRUE, xlab="", ylab="")
However, instead of frequencies I would like to get the average z value in each bin, and instead of plotting against bin indices I would like to plot against the average x and y values of each bin.
Can this be done with the binning command, or by extending it?
Thank you!
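One possible approach, sketched with base R only rather than by extending binning(): use cut() to assign each point to an x bin and a y bin, then tapply() to average z within each cell. The names nbins, xbin, ybin, mz, mx and my are illustrative. Bin midpoints are used for the image axes because a per-bin average can be NA when a bin is empty.

```r
set.seed(1)
n <- 1000
x <- rnorm(n)
y <- 0.2 * x + 0.5 * rnorm(n)
z <- y + 0.5 * rnorm(n)

nbins <- 12
xbreaks <- seq(min(x), max(x), length.out = nbins + 1)
ybreaks <- seq(min(y), max(y), length.out = nbins + 1)
xbin <- cut(x, breaks = xbreaks, include.lowest = TRUE)
ybin <- cut(y, breaks = ybreaks, include.lowest = TRUE)

# Mean of z in each (x, y) cell; empty cells come back as NA
mz <- tapply(z, list(xbin, ybin), mean)

# Mean x and mean y of the points in each marginal bin (NA if a bin is empty)
mx <- tapply(x, xbin, mean)
my <- tapply(y, ybin, mean)

# Bin midpoints give strictly increasing axis positions for image()
xmid <- (head(xbreaks, -1) + tail(xbreaks, -1)) / 2
ymid <- (head(ybreaks, -1) + tail(ybreaks, -1)) / 2
image(xmid, ymid, mz, xlab = "x", ylab = "y")
```

If every bin is populated, mx and my could be passed to image() in place of the midpoints.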
I want to plot linear-model-lines for each ID.
How can I create predictions for multiple lms (or glms) using sequences of different length? I tried:
#some fake data
res<-runif(60,1,20)
var<-runif(60,10,50)
ID<-rep(c("A","B","B","C","C","C"),10)
data<- data.frame(ID,res,var)
#lm
library(data.table)
dt <- data.table(data,key="ID")
fits <- lapply(unique(data$ID),
function(z)lm(res~var, data=dt[J(z),], y=T))
#sequence for each ID of length var(ID)
mins<-matrix(with(data, tapply(var,ID,min)))
mins1<-mins[,1]
maxs<-matrix(with(data,tapply(var,ID,max)))
maxs1<-maxs[,1]
my_var<-list()
for(i in 1:3){
my_var[[i]]<- seq(from=mins1[[i]],to=maxs1[[i]],by=1)
}
# predict on sequences
predslist<- list()
predslist[[i]] <- for(i in 1:3){
dat<-fits[[i]]
predict(dat,newdata= data.frame("var"= my_var,type= "response", se=TRUE))
}
The predict step results in an error.
Plotting lm lines only for var[i] ranges works in ggplot:
library(ggplot2)
# create ID, x, y as coded by Matt
p <- qplot(x, y)
p + geom_smooth(aes(group=ID), method="lm", size=1, se=F)
Is something like this what you're after?
# generating some fake data
ID <- rep(letters[1:4],each=10)
x <- rnorm(40,mean=5,sd=10)
y <- as.numeric(as.factor(ID))*x + rnorm(40)
# plotting in base R
plot(x, y, col=as.factor(ID), pch=16)
# calling lm() and adding lines
lmlist <- lapply(sort(unique(ID)), function(i) lm(y[ID==i]~x[ID==i]))
for(i in 1:length(lmlist)) abline(lmlist[[i]], col=i)
Don't know if the plotting part is where you're stuck, but the abline() function will draw a least-squares line if you pass in an object returned from lm().
If you want the least-squares lines to begin & end with the min & max x values, here's a workaround. It's not pretty, but seems to work.
plot(x, y, col=as.factor(ID), pch=16)
IDnum <- as.numeric(as.factor(ID))
for(i in 1:length(lmlist)) lines(x[IDnum==i], predict(lmlist[[i]]), col=i)
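If you want predictions on a separate sequence for each ID, as the question asks, here is a sketch that may help. Two things in the question's attempt look problematic: the result of a for loop is NULL, so assigning it to predslist[[i]] stores nothing, and type and se were placed inside data.frame() instead of being passed to predict(). The sketch below fits each model with data = so that newdata can supply a column named x. The names df, fits, preds and newx are illustrative.

```r
set.seed(1)
ID <- rep(letters[1:4], each = 10)
x <- rnorm(40, mean = 5, sd = 10)
y <- as.numeric(as.factor(ID)) * x + rnorm(40)
df <- data.frame(ID, x, y)

# One model per ID; fitting with data = keeps the predictor named "x"
fits <- lapply(split(df, df$ID), function(d) lm(y ~ x, data = d))

plot(x, y, col = as.factor(ID), pch = 16)
preds <- list()
for (i in seq_along(fits)) {
  id <- names(fits)[i]
  xr <- range(df$x[df$ID == id])
  newx <- seq(xr[1], xr[2], length.out = 50)   # grid within this ID's own x range
  preds[[id]] <- predict(fits[[i]], newdata = data.frame(x = newx), se.fit = TRUE)
  lines(newx, preds[[id]]$fit, col = i)        # line spans only this ID's data
}
```

Each element of preds holds the fitted values in $fit and their standard errors in $se.fit.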
For example, let's say:
x <- rnorm(20)
y <- rnorm(20) + 1
n <- seq(1,20,1)
data <- data.frame(n, x, y)
Is it possible to plot y~x with the indexed value of each pair at the top of the plot?
Can it be done with the base graphics, not ggplot?
It may be simple, but I am struggling to find help via Google. My guess is I'm using a poor selection of words.
Any help is much appreciated!
plot(x,y)
text(x = x, y = y, n, pos = 3)
#Adds text 'n' at co-ordinate (x,y)
# "pos = 3" means the text will be just above the co-ordinates
#See ?text for more
If you want to plot all the indices on the same line above the plot boundary, you can specify the appropriate value for y when using text. However, you will first have to set par(xpd=TRUE) to be able to draw outside the plot boundary.
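A minimal sketch of that idea, assuming the labels should sit just above the top edge of the plot region. The variable y_top is illustrative; par("usr")[4] gives the upper y limit of the plot region in user coordinates.

```r
set.seed(1)
x <- rnorm(20)
y <- rnorm(20) + 1
n <- seq_along(x)

plot(x, y)
old_par <- par(xpd = TRUE)     # allow drawing outside the plot region
y_top <- par("usr")[4]         # top edge of the plot region
text(x = x, y = y_top, labels = n, pos = 3, cex = 0.7)
par(old_par)                   # restore the previous clipping setting
```

Labels for points with similar x values will overlap; a smaller cex or a subset of the labels may help in that case.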
Yes, we can add labels. Try this code:
x <- rnorm(20)
y <- rnorm(20) + 1
n <- seq(1,20,1)
data <- data.frame(n, x, y)
plot(y~x)
with(data, text(y~x, labels = row.names(data)))
I've created a nice plot using scatter3d() and Rcmdr. That plot contains two nice surface smooths. Now I'd like to add to this plot one more surface, the truth (i.e. the surface defined by the function generating my observations minus the noise component).
Here is my code so far:
library(car)
set.seed(1)
n <- 200 # number of observations (x,y,z) to be generated
sd <- 0.3 # standard deviation for error term
x <- runif(n) # generate x component
y <- runif(n) # generate y component
r <- sqrt(x^2+y^2) # used to compute z values
z_t <- sin(x^2+3*y^2)/(0.1+r^2) + (x^2+5*y^2)*exp(1-r^2)/2 # calculate values of true regression function
z <- z_t + rnorm(n, sd = sd) # overlay normally distributed 'noise'
dm <- data.frame(x=x, y=y, z=z) # data frame containing (x,y,z) observations
dm_t <- data.frame(x=x,y=y, z=z_t) # data frame containing (x,y) observations and the corresponding value of the *true* regression function
# Create 3D scatterplot of:
# - Observations (this includes 'noise')
# - Surface given by Additive Model fit
# - Surface given by bivariate smoother fit
scatter3d(dm$x, dm$y, dm$z, fit=c("smooth","additive"), bg="white",
axis.scales=TRUE, grid=TRUE, ellipsoid=FALSE, xlab="x", ylab="z", zlab="y")
The solution given in another thread is to then define a function:
my_surface <- function(f, n=10, ...) {
ranges <- rgl:::.getRanges()
x <- seq(ranges$xlim[1], ranges$xlim[2], length=n)
y <- seq(ranges$ylim[1], ranges$ylim[2], length=n)
z <- outer(x,y,f)
surface3d(x, y, z, ...)
}
f <- function(x, y)
sin(x^2+3*y^2)/(0.1+r^2) + (x^2+5*y^2)*exp(1-r^2)/2
my_surface(f, alpha=0.2)
This however yields an error, saying (translated from German since this is my system language, I apologize):
Error in outer(x, y, f) :
Dimension [Product 100] does not match the length of the object [200]
I then tried an alternative approach:
x <- seq(0,1,length=20)
y <- x
z <- outer(x,y,f)
surface3d(x,y,z)
This does add a surface to my plot but it doesn't look right at all (i.e. the observations are not even close to it). Here's what the supposed true surface looks like (this is obviously wrong):
Thanks!
I think the problem may in fact be scaling. Here I created a couple of points that sit on the plane z = x+y. Then I proceeded to try to plot that plane using my method above:
library(car)
n <- 50
x <- runif(n)
y <- runif(n)
z <- x+y
scatter3d(x,y,z, surface = FALSE)
f <- function(x,y)
x + y
x_grid <- seq(0,1, length=20)
y_grid <- x_grid
z_grid <- outer(x_grid, y_grid, f)
surface3d(x_grid, y_grid, z_grid)
This gives me the following plot:
Maybe one of you can help me out with this?
The scatter3d function in car rescales data before plotting it, which makes it incompatible with essentially all rgl plotting functions, including surface3d.
You can get a plot something like what you want by using all rgl functions, e.g. plot3d(x, y, z) in place of scatter3d, but of course it will have rgl-style axes rather than car-style axes.
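Here is a rough sketch of the rgl-only route for the original example. It also addresses the outer() error from the question, which appears to arise because f refers to the global vector r (length 200) rather than computing r from its own arguments, so the result no longer matches the 10 x 10 grid. The grid names x_grid, y_grid and z_grid follow the question; the alpha and col values are arbitrary choices.

```r
set.seed(1)
n <- 200
x <- runif(n)
y <- runif(n)

# True regression function; r^2 is computed from the arguments, so the
# result always has the same length as x and y
f <- function(x, y) {
  r2 <- x^2 + y^2
  sin(x^2 + 3 * y^2) / (0.1 + r2) + (x^2 + 5 * y^2) * exp(1 - r2) / 2
}
z <- f(x, y) + rnorm(n, sd = 0.3)

x_grid <- seq(0, 1, length.out = 20)
y_grid <- x_grid
z_grid <- outer(x_grid, y_grid, f)   # 20 x 20 matrix of true surface heights

# Draw with rgl only, so points and surface share the same unscaled coordinates.
# Guarded so the script still runs without rgl or outside an interactive session.
if (interactive() && requireNamespace("rgl", quietly = TRUE)) {
  rgl::plot3d(x, y, z)
  rgl::surface3d(x_grid, y_grid, z_grid, alpha = 0.4, col = "grey")
}
```

Note that plot3d puts z on the vertical axis, unlike scatter3d, which treats its second argument as the vertical one.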
Using R, I would like to plot a linear relationship between two variables, but I would like the fitted line to be present only within the range of the data.
For example, if I have the following code, I would like the line to exist only from x and y values of 1:10 (with default parameters this line extends beyond the range of data points).
x <- 1:10
y <- 1:10
plot(x,y)
abline(lm(y~x))
In addition to using predict with lines or segments you can also use the clip function with abline:
x <- 1:10
y <- 1:10
plot(x,y)
clip(1,10, -100, 100)
abline(lm(y~x))
Instead of using abline(), (a) save the fitted model, (b) use predict.lm() to find the fitted y-values corresponding to x=1 and x=10, and then (c) use lines() to add a line between the two points:
f <- lm(y~x)
X <- c(1, 10)
Y <- predict(f, newdata=data.frame(x=X))
plot(x,y)
lines(x=X, y=Y)
You can do this using predict.
You can predict on specific values of x (see ?predict)
x<-1:10
y<-1:10
plot(x,y)
new <- data.frame(x = seq(1, 5, 0.5))
lines(new$x, predict(lm(y~x), new))
The plotrix package has the ablineclip() function for just this:
library(plotrix)
x <- 1:10
y <- 1:10
plot(x,y)
ablineclip(lm(y~x),x1=1,x2=5)
An alternative is to use the segments function (see ?segments).
Say you estimated the line, and you got an intercept of a and a slope of b. Thus, your fitted function is y = a + bx.
Now, say you want to show the line for x between x0 and x1. Then, the following code plots your line:
# inputs
a <- 0.5
b <- 2
x0 <- 1
x1 <- 5
# graph
plot(c(0,5), c(0,5), type = "n", xlab = "", ylab = "", bty='l')
segments(x0, a+b*x0, x1, a+b*x1)
Simply replace the values of a, b, x0, x1 with those of your choosing.
For those like me who came to this question wanting to plot a line for an arbitrary pair of numbers (and not those that fit a given regression), the following code is what you need:
plot(c(0,5), c(0,5), type = "n", xlab = "", ylab = "", bty='l')
segments(x0, y0, x1, y1)
Simply replace the values of x0, y0, x1, y1 with those of your choosing.