Plotting 2 out of 3 columns of a data frame - r

I have a data frame consisting of 400 rows of x, y and z values, and I would like to plot the x column against the y column.
This is what I have got so far:
sample3d = function(n)
{
df = data.frame()
while(n>0)
{
X = runif(1,-1,1)
Y = runif(1,-1,1)
Z = runif(1,-1,1)
a = X^2 + Y^2 + Z^2
if( a < 1 )
{
b = (X^2+Y^2+Z^2)^(0.5)
vector = data.frame(X = X/b, Y = Y/b, Z = Z/b)
df = rbind(vector,df)
n = n- 1
}
}
df
}
sample3d(400)

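If you need to draw random points on the surface of a sphere, you can generate them directly from spherical coordinates instead of rejecting points in a loop: draw the azimuth phi uniformly on [0, 2*pi] and cos(theta) uniformly on [-1, 1]. Something like this: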
sample3d_v2 <- function(n) {
phi<-runif(n,0,2*pi)
cost<-runif(n,-1,1)
sint<-sqrt(1-cost^2)
data.frame(X=sint*cos(phi),Y=sint*sin(phi),Z=cost)
}
Just some tests:
system.time(old<-sample3d(4000))
# user system elapsed
# 3.895 0.000 3.879
system.time(new<-sample3d_v2(4000))
# user system elapsed
# 0.000 0.000 0.002
As you can see, the vectorized version is faster by about three orders of magnitude. To check that the results look right:
require(rgl)
plot3d(old)
open3d();plot3d(new)
Regarding your question: just assign the result of your function to a name and plot its X and Y components.
data<-sample3d_v2(400)
plot(data$X,data$Y)
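A quick sanity check for the spherical-coordinate approach: every generated point should sit at distance 1 from the origin. The sketch below repeats sample3d_v2 from the answer so that it runs on its own; the seed and the tolerance 1e-12 are arbitrary choices for illustration.

```r
# sample3d_v2 as defined in the answer above
sample3d_v2 <- function(n) {
  phi  <- runif(n, 0, 2 * pi)   # azimuth, uniform on [0, 2*pi]
  cost <- runif(n, -1, 1)       # cos(theta), uniform on [-1, 1]
  sint <- sqrt(1 - cost^2)
  data.frame(X = sint * cos(phi), Y = sint * sin(phi), Z = cost)
}

set.seed(42)                    # arbitrary seed, for reproducibility only
pts <- sample3d_v2(400)
radius <- sqrt(rowSums(pts^2))  # distance of each point from the origin
all(abs(radius - 1) < 1e-12)    # TRUE when all points lie on the unit sphere
```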

I named the data frame your function creates data (data <- sample3d(400)).
Then, if you need a scatterplot (points at x and y coordinates) of the X and Y columns, you can use base R's plot() function, e.g. plot(x = data$X, y = data$Y). If you are like me and don't want to type data$ before every column, you can make the columns available to functions as vectors by wrapping everything inside with(data = data, {...}).
If you want to create a scatterplot of all three coordinates, you might want to check out the packages in the following two examples:
library(scatterplot3d)
with(data = data, {
scatterplot3d(x = X, y = Y, z = Z)
})
library(rgl)
with(data, {
plot3d(X, Y, Z)
})
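For the original X-against-Y question, the with() idiom could look like the following sketch. The data frame here is a made-up stand-in for the output of sample3d(), and asp = 1 is an optional choice that keeps the projected sphere from looking stretched.

```r
set.seed(1)
# stand-in data: random points normalized onto the unit sphere
# (not the asker's actual data)
m <- matrix(rnorm(400 * 3), ncol = 3)
m <- m / sqrt(rowSums(m^2))
data <- data.frame(X = m[, 1], Y = m[, 2], Z = m[, 3])

# inside with(), the columns X and Y are visible as plain vectors
with(data, plot(X, Y, asp = 1))
```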

Related

Finding z value for given x,y pair in by interp() returned list (rlang)

I am using in R the akima interp() function to smooth GPS coordinates regarding the measured altitude.
s = interp(x, y, z, nx=100, ny=100)
X are longitude values
Y are latitude values
Z the corresponding altitude
I want the Z value corresponding to a given X,Y pair, using the returned list s.smooth.
How should this be implemented?
Currently, I am only able to use
df1<-data.frame(s.smooth)
df1[which.min(abs(x1-df1$x))]
to get the nearest x value for one value x1.
I need a function like z_i=f(x_i,y_i) with given x_i and y_i. This position pair is not part of the initial lists x,y,z.
To get the interpolated value at specific values of x and y, you can use the xo and yo arguments of interp. To vectorize this and return the results in a data frame, you can write a little wrapper function:
library(akima)
interp_at <- function(x, y, z, xout, yout) {
do.call(rbind, lapply(seq_along(xout), function(i) {
as.data.frame(t(unlist(interp(x, y, z, xo = xout[i], yo = yout[i]))))
}))
}
So, if we had the following data:
set.seed(1)
x <- rnorm(500)
y <- rnorm(500)
z <- 10 - x^2 - y^2
and we want to know the value at [0, 0], [1, 0.5], and [2, 1] we can just do:
interp_at(x, y, z, xout = c(0, 1, 2), yout = c(0, 0.5, 1))
#> x y z
#> 1 0 0.0 9.992207
#> 2 1 0.5 8.727516
#> 3 2 1.0 4.982223
Created on 2022-06-13 by the reprex package (v2.0.1)
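If you would rather look values up in a grid that interp() has already returned (a list with components x, y and z), bilinear interpolation between the four surrounding grid nodes is one option. The following is a base-R sketch under that assumption; grid_lookup is a hypothetical helper, not part of akima, and the example grid is synthetic. Points outside the grid are extrapolated from the nearest cell, and NA cells in z propagate to the result.

```r
# Hypothetical helper: bilinear lookup in a list with ascending x, y
# and a matrix z of dimension length(x) by length(y)
grid_lookup <- function(s, x0, y0) {
  # index of the grid cell containing each query point
  i <- findInterval(x0, s$x, all.inside = TRUE)
  j <- findInterval(y0, s$y, all.inside = TRUE)
  # relative position of the point inside its cell (0..1 within the grid)
  tx <- (x0 - s$x[i]) / (s$x[i + 1] - s$x[i])
  ty <- (y0 - s$y[j]) / (s$y[j + 1] - s$y[j])
  # weighted average of the four corner values
  (1 - tx) * (1 - ty) * s$z[cbind(i, j)] +
    tx * (1 - ty) * s$z[cbind(i + 1, j)] +
    (1 - tx) * ty * s$z[cbind(i, j + 1)] +
    tx * ty * s$z[cbind(i + 1, j + 1)]
}

# synthetic grid on the plane z = 2x + 3y (stands in for interp() output)
gx <- seq(0, 1, length.out = 5)
gy <- seq(0, 2, length.out = 9)
s <- list(x = gx, y = gy, z = outer(gx, gy, function(x, y) 2 * x + 3 * y))
res <- grid_lookup(s, x0 = c(0.3, 1), y0 = c(1.1, 0.5))
```

Because the synthetic surface is a plane, the lookup should reproduce 2 * x0 + 3 * y0 up to floating-point error, which makes it easy to check the helper before pointing it at real data.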

Generating 3D data with cube as a decision surface

I am new to R. My task is to write a function that simulates 500 observations of three standard normal variables, x, y and z.
I am to use a cube as a decision surface to categorize the observations based on whether they fall inside or outside the cube.
Below is my code. I am able to plot the 3D data, but I am not sure of how to categorize the datasets into two classes.
library(scatterplot3d)
set.seed (1234)
nObs <- 500
x <- matrix (rnorm (1.25*nObs), ncol =2)
y <- matrix (rnorm (1.25*nObs), ncol =2)
z <- matrix (rnorm (1.25*nObs), ncol =2)
mSample <- function(nObs,x,y,z){
x1 <- rnorm(1,x)
x1[y==1,] <- x[y==1,] + 1
mSample <- as_tibble(rbind(mvnfast::rmvn(x,y = y1,z = z1), mvnfast::rmvn(x,y = y1,z = z1)))
mSample[1:x1, 1.25] <- 0
mSample[(x1 + 1):(x1 + 1), 1.25] <- 1
mSample <- mSample[sample(nrow(mSample)), ]
colnames(mSample <- c("x", "y", "class"))
mSample
}
spl <- scatterplot3d(x,y,z)
spl <- scatterplot3d(x,y,z,pch=16,highlight.3d=TRUE)
I had a similar question to this recently. Basically, to know whether a given point is inside or outside a cube centered at the origin, you first need to know the half side length of the cube (cubeLength below).
Then simply iterate over all the points (nObs) and apply an if statement to each one:
if (x > -cubeLength && x < cubeLength && y > -cubeLength && y < cubeLength && z > -cubeLength && z < cubeLength) {
# classify positive (inside the cube)
} else {
# classify negative (outside the cube)
}
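In R the same test can be written without an explicit loop. The sketch below assumes a cube centered at the origin with half side length 1; both the half side length and the class labels are illustrative choices, not part of the original task.

```r
set.seed(1234)
n_obs <- 500
# three independent standard normal variables
pts <- data.frame(x = rnorm(n_obs), y = rnorm(n_obs), z = rnorm(n_obs))

half_side <- 1  # assumed half side length of the cube

# a point is inside when all three coordinates are within the half side length
inside <- with(pts, abs(x) < half_side & abs(y) < half_side & abs(z) < half_side)
pts$class <- ifelse(inside, "inside", "outside")

table(pts$class)  # number of observations in each class
```

The resulting class column can then be mapped to a color when plotting, for example via the color argument of scatterplot3d().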

Contour - Plot - Ascending order

I have a matrix containing positions (X, Y, elevation). I add a column to the matrix that I call "index". I create an X and a Y vector from the matrix; both include the index column. I then sort the X and Y vectors in ascending order. Next I construct a Z matrix containing the elevation and relate it to the position using the index. When I then call contour (I want to plot a contour graph), I get an error saying that X and Y should be in ascending order... which is what I just did! What did I do wrong?
noeud<-read.table("position.out")
Matrice_Noeud<-matrix(ncol = ncol(noeud), nrow=nrow(noeud))
for (i in 1:nrow(noeud)) {
for (j in 1:ncol(noeud)) {
Matrice_Noeud[i,j]<-noeud[i,j]
}
}
Matrice_Noeud <- cbind(Matrice_Noeud, c(seq(1,nrow(noeud),1)))
x<-data.frame(x=Matrice_Noeud[,1],Index=Matrice_Noeud[,4])
y<-data.frame(y=Matrice_Noeud[,2],Index=Matrice_Noeud[,4])
X<-x[order(x$x),]
Y<-y[order(y$y),]
Z<-matrix(NA, ncol=nrow(noeud),nrow=nrow(noeud))
for (x_i in 1:nrow(noeud)) {
for (y_i in 1:nrow(noeud)) {
if (Y$Index[y_i]==X$Index[x_i]) {
niveau<-which(Matrice_Noeud[,4]==Y$Index[y_i])
Z[x_i,y_i]<-Matrice_Noeud[niveau,3]
}
}
}
Xx<-array(X[,1])
Yy<-array(Y[,1])
Zz<-data.frame(Z)
contour(Xx,Yy,Zz)
OK, since I'd started doing it, I've done it.
#### making example data
## assumptions: length(unique(x))=19, length(unique(y))=12, nrow(data)=121
## (They mean the number of grid points is 19 * 12 = 228, but z.value is only 121.)
xyz.f <- function(m, n) - m + (n - 7)^2 + 16 # make z from x and y (it means nothing special)
xyz <- cbind( xyz <- expand.grid(x = round(seq(11,15,,19), 2), y = round(seq(6,10,,12), 2)),
z = apply(xyz, 1, function(k) xyz.f(k[1], k[2])) )
set.seed(1); ind <- sample(19*12, 121) # decide to use the 121 z of 19*12
noeud <- as.matrix(xyz[ind,]) # example data is ready
#### making contour()'s arguments
Xx <- sort(unique(noeud[,1]))
Yy <- sort(unique(noeud[,2])) # nrow(noeud); length(Xx); length(Yy) # OK (121, 19, 12)
Zz <- matrix(NA, ncol=length(Yy), nrow=length(Xx)) # make 19 x 12 Z matrix (empty)
# For each row, find the position of its x (y) value in Xx (Yy) (= its index in the Z matrix)
X0 <- as.numeric( factor( noeud[,1] ) ) # (edit) using Mr.Tufte's code in R help mailing.
Y0 <- as.numeric( factor( noeud[,2] ) )
apply(cbind(X0, Y0, noeud[,3]), 1, function (a) Zz[ a[1], a[2] ] <<- a[3])
## contour()'s arguments ( Xx, Yy, Zz ) are ready
contour(Xx, Yy, Zz, xlab="including NAs") # length(Zz); length(Zz[!is.na(Zz)]) # OK (228,121)
#### interpolating
## A few packages provide interpolation functions.
library(akima) # use cubic spline interpolation methods of H. Akima
NOEUD <- interp(noeud[,1], noeud[,2], noeud[,3])
#### results
par.old <- par(no.readonly=T); par(mfrow=c(1,3), mar=c(4,0,1,0))
contour(Xx, Yy, Zz, xlab="including NAs", yaxt="n") # the including NAs data
contour(NOEUD, xlab="Akima interpolation", yaxt="n") # the Akima interpolation data
contour(Xx, Yy, matrix(xyz[,3], nrow=19), xlab="original", yaxt="n") # the original data
# (edit) I noticed some interp()'s arguments make a difference (default: linear=T, extrap=F).
contour(interp(noeud[,1], noeud[,2], noeud[,3], linear=T, extrap=F), xlab="Akima interp() default")
contour(interp(noeud[,1], noeud[,2], noeud[,3], linear=F, extrap=F), xlab="interp(linear=F)")
contour(interp(noeud[,1], noeud[,2], noeud[,3], linear=F, extrap=T), xlab="interp(linear=F, extrap=T)")
par(par.old)
### supplement (using the same data, output is about the same)
noeud2 <- data.frame(x=noeud[,1], y=noeud[,2], z=noeud[,3]) # equal to the including NAs data
NOEUD2 <- cbind(expand.grid(x=NOEUD$x, y=NOEUD$y), z=c(NOEUD$z)) # equal to the Akima interpolation data
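ggplot2::ggplot( noeud2, ggplot2::aes( x, y, z = z )) + ggplot2::geom_contour() # aes() and geom_contour() also need the ggplot2:: prefix unless ggplot2 is attached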
lattice::contourplot( z ~ x * y, NOEUD2 )
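The core of the fix is that contour() wants x and y as ascending vectors of the distinct grid coordinates, and z as a length(x) by length(y) matrix. A compact way to fill that matrix, sketched here on made-up points, is to use match() to turn each coordinate into a row or column index.

```r
# made-up scattered points (x, y, elevation), deliberately unsorted
pts <- data.frame(x = c(2, 1, 2, 3), y = c(10, 20, 20, 10), z = c(5, 6, 7, 8))

xs <- sort(unique(pts$x))  # ascending distinct x values
ys <- sort(unique(pts$y))  # ascending distinct y values

# empty grid: one row per x value, one column per y value
Z <- matrix(NA_real_, nrow = length(xs), ncol = length(ys))

# matrix indexing: each point goes to (position of x in xs, position of y in ys)
Z[cbind(match(pts$x, xs), match(pts$y, ys))] <- pts$z
Z
```

Grid cells without an observation stay NA. With real data, xs, ys and Z can be passed to contour() in that order.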

Functions in R - using eval() and parse() to plot expressions in rgl

I am very new to R. I am trying to create a function where the user is able to input expressions into arguments. These inputs are then used in plot3d through the rgl package. The function I have so far is :
flight_sim <- function(xval, yval, zval)
{
# Evaluate arguments and convert them into expressions
eval(parse(text = zval))
z <- data.frame(zval)
eval(parse(text = xval))
x <- data.frame(xval)
eval(parse(text = yval))
y <- data.frame(yval)
flight_path <- as.data.frame(cbind(x,y,z))
}
I have a readline() and switch() command :
cat('Select the flight path you wish to plot from the list below :
1. Helix
2. Conical
3. Spherical
4. Define your own flight path...')
userplot <- readline('Enter number here : ') # Allow user to enter choice from above
switch(userplot,"1"=flight_sim( sin(z), 1-cos(z), seq(0,20, pi/32) ),
"2"=flight_sim( z*cos(6*z), z*sin(6*z), seq(0,10, pi/64) ),
"3"=flight_sim( sin(z)*cos(20*z), sin(z)*sin(20*z), seq(0,pi,pi/399)),
"4"=custom())
Where custom() just prompts the user via readline() to enter x, y and z values, which is then followed by eval() and parse() and it works fine.
The problem I've been having is that x and y need to be functions of z, and this causes an error :
Error in parse(text = xval) : object 'z' not found
I thought by making the flight_sim function evaluate the zval argument first that it would fix it, however as I'm new to R I'm just getting more and more lost.
I hope what I have explained here makes some sense. I appreciate any help that can be provided.
Nothing is being passed as text in your example, so using parse() doesn't seem necessary. If you want to delay evaluation, the best way would be to use substitute() to grab the arguments as unevaluated expressions and then evaluate them in the context of your flight_sim function. Here's what that would look like:
flight_sim <- function(xval, yval, zval) {
z <- eval(substitute(zval))
x <- eval(substitute(xval))
y <- eval(substitute(yval))
data.frame(x,y,z)
}
userplot="2"
x <- switch(userplot,"1"=flight_sim( sin(z), 1-cos(z), seq(0,20, pi/32) ),
"2"=flight_sim( z*cos(6*z), z*sin(6*z), seq(0,10, pi/64) ),
"3"=flight_sim( sin(z)*cos(20*z), sin(z)*sin(20*z), seq(0,pi,pi/399)),
"4"=custom())
head(x)
# x y z
# 1 0.00000000 0.00000000 0.00000000
# 2 0.04697370 0.01424932 0.04908739
# 3 0.08162934 0.05454298 0.09817477
# 4 0.09342212 0.11383519 0.14726216
# 5 0.07513972 0.18140332 0.19634954
# 6 0.02405703 0.24425508 0.24543693
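A variation on the same idea makes the lookup of z explicit instead of relying on the function's own frame. This is only a sketch: path_from_z is a hypothetical name, and passing z last mirrors the argument order used in the question.

```r
# Hypothetical variant: evaluate the x and y expressions with z supplied
# explicitly, falling back to the caller's environment for everything else
path_from_z <- function(xexpr, yexpr, z) {
  caller <- parent.frame()        # where the function was called from
  vars <- list(z = z)             # forces z and exposes it to the expressions
  x <- eval(substitute(xexpr), vars, caller)
  y <- eval(substitute(yexpr), vars, caller)
  data.frame(x, y, z)
}

helix <- path_from_z(sin(z), 1 - cos(z), seq(0, 20, pi / 32))
head(helix, 3)
```

Supplying the variables as a list keeps the expressions from accidentally picking up other local variables of the function.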
If I'm interpreting your question correctly, it seems like you'd need to redefine your function. To the best of my knowledge, you can't define an argument in the function definition as a function of another argument. You'd need to do that inside the body of the function. So you'd want something like this:
flight_sim <- function(userplot) {
if (userplot == "1") {
z <- seq(0, 20, pi / 32)
x <- sin(z)
y <- 1 - cos(z)
} else if (userplot == "2") {
z <- seq(0, 10, pi / 64)
x <- z * cos(6 * z)
y <- z * sin(6 * z)
} else if (userplot == "3") {
z <- seq(0, pi, pi / 399)
x <- sin(z) * cos(20 * z)
y <- sin(z) * sin(20 * z)
} else if (userplot == "4") {
x <- readline("Please enter a function for the x-value: ")
y <- readline("Please enter a function for the y-value: ")
z <- readline("Please enter a function for the z-value: ")
z <- eval(parse(text = z)) # evaluate z first since x and y are functions of z, and keep the result
x <- eval(parse(text = x))
y <- eval(parse(text = y))
} else {
valid_response <- FALSE
while (!valid_response) {
userplot <- readline("Please enter a valid response (1-4): ")
if (userplot %in% 1:4) {
valid_response <- TRUE
return(flight_sim(userplot)) # return the result for the valid choice; x, y and z are not defined in this branch
}
}
}
dat <- data.frame(x, y, z)
return(dat)
}
cat('Select the flight path you wish to plot from the list below :
1. Helix
2. Conical
3. Spherical
4. Define your own flight path...')
userplot <- readline('Enter number here : ') # Allow user to enter choice from above
dat <- flight_sim(userplot)
head(dat)
x y z
1 0.000000000000000000 0.000000000000000000 0.000000000000000000
2 0.046973698885313400 0.014249315773629733 0.049087385212340517
3 0.081629338302900922 0.054542980081485989 0.098174770424681035
4 0.093422122547587999 0.113835185692147969 0.147262155637021552
5 0.075139716235543288 0.181403322008714424 0.196349540849362070
6 0.024057025623845932 0.244255080177979672 0.245436926061702587
In the code above, I've also included one last else statement to catch inappropriate responses from your users. If they enter a choice that could break your code, it will now catch that and ask them to reenter their response.
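For the custom option, where the expressions really do arrive as text, the strings have to be parsed and the results kept. The sketch below shows one way to do that, under the assumption that the three inputs are already available as character strings; eval_path_text is a hypothetical helper, and the literal strings stand in for readline() input. Evaluating arbitrary user text runs whatever code the user types, so this is only suitable for trusted input.

```r
# Hypothetical helper: evaluate text expressions for z, then x and y,
# in a scratch environment where z is visible to the later expressions
eval_path_text <- function(x_text, y_text, z_text) {
  env <- new.env(parent = baseenv())
  env$z <- eval(parse(text = z_text), env)  # z first: x and y depend on it
  env$x <- eval(parse(text = x_text), env)
  env$y <- eval(parse(text = y_text), env)
  data.frame(x = env$x, y = env$y, z = env$z)
}

# strings as a user might type them at the readline() prompts
dat <- eval_path_text("z * cos(6 * z)", "z * sin(6 * z)", "seq(0, 10, pi / 64)")
head(dat, 3)
```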

R: Draw a polygon with conditional colour

I want to colour the area under a curve. The area with y > 0 should be red, the area with y < 0 should be green.
x <- c(1:4)
y <- c(0,1,-1,2,rep(0,4))
plot(y[1:4],type="l")
abline(h=0)
Using ifelse() does not work:
polygon(c(x,rev(x)),y,col=ifelse(y>0,"red","green"))
What I achieved so far is the following:
polygon(c(x,rev(x)),y,col="green")
polygon(c(x,rev(x)),ifelse(y>0,y,0),col="red")
But then the red area is too large. Do you have any ideas how to get the desired result?
If you want two different colors, you need two different polygons. You can either call polygon multiple times, or you can add NA values in your x and y vectors to indicate a new polygon. R will not automatically calculate the intersection for you. You must do that yourself. Here's how you could draw that with different colors.
x <- c(1,2,2.5,NA,2.5,3,4)
y <- c(0,1,0,NA,0,-1,0)
#calculate color based on most extreme y value
g <- cumsum(is.na(x))
gc <- ifelse(tapply(y, g,
function(x) x[which.max(abs(x))])>0,
"red","green")
plot(c(1, 4),c(-1,1), type = "n")
polygon(x, y, col = gc)
abline(h=0)
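The intersection points used above (such as x = 2.5) can be computed rather than read off by eye. A minimal sketch, assuming the curve is piecewise linear between the given points; zero_crossings is a hypothetical helper name.

```r
# Hypothetical helper: x positions where a piecewise-linear curve crosses y = 0
zero_crossings <- function(x, y) {
  n <- length(y)
  # segments whose end points have strictly opposite signs
  i <- which(y[-n] * y[-1] < 0)
  # linear interpolation along each such segment
  x[i] - y[i] * (x[i + 1] - x[i]) / (y[i + 1] - y[i])
}

# the curve from the question: y = 0, 1, -1, 2 at x = 1..4
zc <- zero_crossings(1:4, c(0, 1, -1, 2))
zc
```

Points that lie exactly on zero are not reported as crossings here; they are already vertices and can be used as polygon corners directly.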
In the more general case, it might not be as easy to split a polygon into different regions. There seems to be some support for this type of operation in GIS packages, where this type of thing is more common. However, I've put together a somewhat general case that may work for simple polygons.
First, I define a closure that will define a cutting line. The function will take a slope and y-intercept for a line and will return the functions we need to cut a polygon.
getSplitLine <- function(m=1, b=0) {
force(m); force(b)
classify <- function(x,y) {
y >= m*x + b
}
intercepts <- function(x,y, class=classify(x,y)) {
w <- which(diff(class)!=0)
m2 <- (y[w+1]-y[w])/(x[w+1]-x[w])
b2 <- y[w] - m2*x[w]
ix <- (b2-b)/(m-m2)
iy <- ix*m + b
data.frame(x=ix,y=iy,idx=w+.5, dir=((rank(ix, ties="first")+1) %/% 2) %% 2 +1)
}
plot <- function(...) {
abline(b,m,...)
}
list(
intercepts=intercepts,
classify=classify,
plot=plot
)
}
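The closure pattern used by getSplitLine can be seen in isolation in the following reduced sketch. make_line_classifier is a hypothetical, stripped-down stand-in that keeps only the classify part; the slope and intercept values are arbitrary.

```r
# Reduced sketch of the closure idea: the returned function remembers m and b
make_line_classifier <- function(m = 1, b = 0) {
  force(m); force(b)   # evaluate the arguments now, not at first use
  function(x, y) {
    y >= m * x + b     # TRUE for points on or above the line
  }
}

above <- make_line_classifier(m = 0.5, b = -1.25)
side <- above(x = c(1, 2), y = c(-2, 2))
side
```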
Now we will define a function to actually split a polygon using the splitter we've just defined.
splitPolygon <- function(x, y, splitter) {
addnullrow <- function(x) if (!all(is.na(x[nrow(x),]))) rbind(x, NA) else x
rollup <- function(x,i=1) rbind(x[(i+1):nrow(x),], x[1:i,])
idx <- cumsum(is.na(x) | is.na(y))
polys <- split(data.frame(x=x,y=y)[!is.na(x),], idx[!is.na(x)])
r <- lapply(polys, function(P) {
x <- P$x; y<-P$y
side <- splitter$classify(x, y)
if(side[1] != side[length(side)]) {
ints <- splitter$intercepts(c(x,x[1]), c(y, y[1]), c(side, side[1]))
} else {
ints <- splitter$intercepts(x, y, side)
}
sideps <- lapply(unique(side), function(ss) {
pts <- data.frame(x=x[side==ss], y=y[side==ss],
idx=seq_along(x)[side==ss], dir=0)
mm <- rbind(pts, ints)
mm <- mm[order(mm$idx), ]
br <- cumsum(mm$dir!=0 & c(0,head(mm$dir,-1))!=0 &
c(0,diff(mm$idx))>1)
if (length(unique(br))>1) {
mm<-rollup(mm, sum(br==br[1]))
}
br <- cumsum(c(FALSE,abs(diff(mm$dir*mm$dir))==3))
do.call(rbind, lapply(split(mm, br), addnullrow))
})
pss<-rep(unique(side), sapply(sideps, nrow))
ps<-do.call(rbind, lapply(sideps, addnullrow))[,c("x","y")]
attr(ps, "side")<-pss
ps
})
pss<-unname(unlist(lapply(r, attr, "side")))
src <- rep(seq_along(r), sapply(r, nrow))
r <- do.call(rbind, r)
attr(r, "source")<-src
attr(r, "side")<-pss
r
}
The input is just the values of x and y as you would pass to polygon along with the cutter. It will return a data.frame with x and y values that can be used with polygon.
For example
x <- c(1,2,2.5,NA,2.5,3,4)
y <- c(1,-2,2,NA,-1,2,-2)
sl<-getSplitLine(0,0)
plot(range(x, na.rm=T),range(y, na.rm=T), type = "n")
p <- splitPolygon(x,y,sl)
g <- cumsum(c(F, is.na(head(p$y,-1))))
gc <- ifelse(attr(p,"side")[is.na(p$y)],
"red","green")
polygon(p, col=gc)
sl$plot(lty=2, col="grey")
This should work for simple concave polygons as well as with sloped lines. Here's another example:
x <- c(1,2,3,4,5,4,3,2)
y <- c(-2,2,1,2,-2,.5,-.5,.5)
sl<-getSplitLine(.5,-1.25)
plot(range(x, na.rm=T),range(y, na.rm=T), type = "n")
p <- splitPolygon(x,y,sl)
g <- cumsum(c(F, is.na(head(p$y,-1))))
gc <- ifelse(attr(p,"side")[is.na(p$y)],
"red","green")
polygon(p, col=gc)
sl$plot(lty=2, col="grey")
Right now things can get a bit messy when a vertex of the polygon falls directly on the splitting line. I may try to correct that in the future.
A faster, but not very accurate, solution is to split the data frame into a list according to a grouping variable (e.g. above = red and below = blue). This is a pretty nice workaround for rather big datasets (I would say > 100 elements). For smaller ones some discontinuity may be visible:
x <- 1:100
y1 <- sin(1:100/10)*0.8
y2 <- sin(1:100/10)*1.2
plot(x, y2, type='l')
lines(x, y1, col='red')
df <- data.frame(x=x, y1=y1, y2=y2)
df$pos_neg <- ifelse(df$y2-df$y1>0,1,-1) # above (1) or below (-1) average
# create the number for chunks to be split into lists:
df$chunk <- c(1,cumsum(abs(diff(df$pos_neg)))/2+1) # first element needs to be added
df$colors <- ifelse(df$pos_neg>0, "red","blue") # colors to be used for filling the polygons
# create lists to be plotted:
l <- split(df, df$chunk) # we should get 4 sub-lists
lapply(l, function(x) polygon(c(x$x,rev(x$x)),c(x$y2,rev(x$y1)),col=x$colors))
As I said, for smaller datasets some discontinuity may be visible if sharp changes occur between the positive and negative areas, but if a horizontal line separates the two, or if more elements are plotted, the effect is negligible.
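The chunk numbering can also be derived with rle(), which some may find easier to read than the cumsum(abs(diff(...))) expression. This is an alternative sketch on the same sine data, not the method used above.

```r
x <- 1:100
y1 <- sin(x / 10) * 0.8
y2 <- sin(x / 10) * 1.2
pos_neg <- ifelse(y2 - y1 > 0, 1, -1)  # 1 where y2 is above y1, -1 otherwise

# run-length encoding: one run per stretch of constant sign
runs <- rle(pos_neg)
# label every element with the number of the run it belongs to
chunk <- rep(seq_along(runs$lengths), runs$lengths)

table(chunk)  # size of each chunk
```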

Resources