Creating 3D scatter plot in R? - r

I have my data in the variable data:
data = read.csv("datafile.csv")
datafile.csv is of the form:
x1,y1,z1
x2,y2,z2
.....
xn,yn,zn
How do I create a 3D scatter plot? (the scale etc. should be automatically taken care of).

Let's simulate a data example.
#create data observations for x, y and z
x = c(10, 9, 3, 4, 5)
y = c(8, 4, 7, 8, 9)
z = c(15, 10, 11, 9, 9)
#join vectors x, y and z directly into a data.frame as suggested by @thelatemail.
data = data.frame(x, y, z)
The object data is supposed to simulate the data you have. See it below
data
   x y  z
1 10 8 15
2  9 4 10
3  3 7 11
4  4 8  9
5  5 9  9
The answer:
library(scatterplot3d)
scatterplot3d(data$x,data$y,data$z)
See ?scatterplot3d to explore other arguments inside this function.
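The file described in the question has no header row, so a plain read.csv() call would consume the first observation as column names. Below is a minimal sketch of reading such a file and plotting it. The temporary file and the column names x, y, z are assumptions made for illustration, and the plot call is skipped if scatterplot3d is not installed.

```r
# Write a small header-less file to a temporary location
# (a stand-in for datafile.csv, which is not available here)
csv_path <- tempfile(fileext = ".csv")
writeLines(c("10,8,15", "9,4,10", "3,7,11", "4,8,9", "5,9,9"), csv_path)

# header = FALSE keeps the first row as data; col.names labels the columns
data <- read.csv(csv_path, header = FALSE, col.names = c("x", "y", "z"))

# Draw the 3D scatter plot only when the package is available
if (requireNamespace("scatterplot3d", quietly = TRUE)) {
  scatterplot3d::scatterplot3d(data$x, data$y, data$z,
                               xlab = "x", ylab = "y", zlab = "z")
}
```

The axis ranges are chosen from the data, so no manual scaling should be needed.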

Related

Grouping Set of Points to a Pre Defined Point

I'm looking to create a model that classifies a set of points that are near a pre-defined point.
For example, let's say I have points:
X   Y
1   1
1   2
1   3
2   1
2   3
3   1
3   2
3   3
6   6
8   7
8   5
9   3
10  7
My goal is to identify which points are closest to predefined point (2,2) and ideally output which points those are.
I tried using KNN, but I could not figure out how to get the KNN model to train results near (2,2). Any guidance to how I may accomplish this would be awesome. :)
Plot of Points
df <- data.frame( x = c(1,1,1,2,2,2,3,3,3,6,8,8,9,10), y = c(1,2,3,1,2,3,1,2,3,6,7,5,3,7))
df
goal_point <- c(x=2,y=2)
goal_point
You might approach this by calculating distance from goal as a feature.
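df$dist = sqrt((df$x - goal_point["x"])^2 +
               (df$y - goal_point["y"])^2)
set.seed(1) # kmeans starts from random centers; fix the seed for a reproducible result
df$clust = kmeans(df, 2)$cluster
library(ggplot2)
ggplot(df, aes(x, y, color = factor(clust))) + # factor() gives discrete colors
  geom_point()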
In this case kmeans is using x, y, and distance from goal. You could also use just distance from goal by using df$clust = kmeans(df[,3], 2)$cluster, which would lead here to the same clustering.
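If the goal is simply to list the points nearest to (2,2), no model is needed: sorting by the distance column is enough. The sketch below shows two variants, the k nearest points and all points within a radius. The values of k and radius are hypothetical and would need to be chosen for the real data.

```r
df <- data.frame(x = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 6, 8, 8, 9, 10),
                 y = c(1, 2, 3, 1, 2, 3, 1, 2, 3, 6, 7, 5, 3, 7))
goal_point <- c(x = 2, y = 2)

# Euclidean distance from every point to the goal point
df$dist <- sqrt((df$x - goal_point[["x"]])^2 +
                (df$y - goal_point[["y"]])^2)

# Variant 1: the k points closest to the goal (k is an assumed value)
k <- 5
nearest <- head(df[order(df$dist), ], k)

# Variant 2: every point within a fixed radius (radius is an assumed value)
radius <- 1.5
within_radius <- df[df$dist <= radius, ]

nearest
within_radius
```

With a radius, the number of returned points adapts to how dense the data are around the goal, whereas a fixed k always returns the same count.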

How to interpolate a single point where line crosses a baseline between two points [duplicate]

This question already has answers here:
get x-value given y-value: general root finding for linear / non-linear interpolation function
I am new to R and am looking for an automated way to find the x-coordinate at which the line between two consecutive points crosses a baseline (here 75; see the dotted line in the image linked below). Once that x value is found, I would like to add it to the vector of x values and add the corresponding y value (always the baseline value) to the vector of y values. In other words, a function should check every pair of consecutive input points for a segment that crosses the baseline and, where one does, add the crossing coordinates to the output x and y vectors. Any help would be most appreciated, especially with automating this across all x,y coordinates.
https://i.stack.imgur.com/UPehz.jpg
baseline = 75
X <- c(1,2,3,4,5,6,7,8,9,10)
y <- c(75,53,37,25,95,35,50,75,75,75)
Edit: added creation of combined data frame with original data + crossing points.
Adapted from another answer related to two intersecting series with uniform X spacing.
baseline = 75
X <- c(1,2,3,4,5,6,7,8,9,10)
Y1 <- rep(baseline, 10)
Y2 <- c(75,53,37,25,95,35,50,75,75,75)
# Find points where Y1 is above Y2.
above <- Y1>Y2
# Points always intersect when above=TRUE, then FALSE or reverse
intersect.points<-which(diff(above)!=0)
# Find the slopes for each line segment.
Y2.slopes <- (Y2[intersect.points+1]-Y2[intersect.points]) /
  (X[intersect.points+1]-X[intersect.points])
Y1.slopes <- rep(0,length(Y2.slopes))
# Find the intersection for each segment, starting from the segment's left X value
X.points <- X[intersect.points] + ((Y2[intersect.points] - Y1[intersect.points]) / (Y1.slopes-Y2.slopes))
Y.points <- Y1[intersect.points] + (Y1.slopes*(X.points-X[intersect.points]))
# Plot both series against X; ylim must cover Y2 because Y1 is constant.
plot(X,Y1,type='l',ylim=range(Y1,Y2))
lines(X,Y2,type='l',col='red')
points(X.points,Y.points,col='blue')
library(dplyr)
combined <- bind_rows( # combine rows from...
tibble(X, Y2), # table of original, plus
tibble(X = X.points,
Y2 = Y.points)) %>% # table of interpolations
distinct() %>% # and drop any repeated rows
arrange(X) # and sort by X
> combined
# A tibble: 12 x 2
       X    Y2
   <dbl> <dbl>
 1  1       75
 2  2       53
 3  3       37
 4  4       25
 5  4.71    75
 6  5       95
 7  5.33    75
 8  6       35
 9  7       50
10  8       75
11  9       75
12 10       75
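The same idea can be wrapped in a small base R function that works for any X spacing. This is a sketch rather than the answer's own code: the name add_crossings is hypothetical, and it only interpolates segments whose end points lie strictly on opposite sides of the baseline, since points already on the baseline are present in the input.

```r
# Hypothetical helper: insert the baseline crossings into an x/y series
add_crossings <- function(x, y, baseline) {
  d <- y - baseline
  n <- length(d)
  # A segment crosses the baseline when its end points have opposite signs
  i <- which(d[-n] * d[-1] < 0)
  # Linear interpolation of x at y == baseline on each crossing segment
  x_cross <- x[i] + (baseline - y[i]) * (x[i + 1] - x[i]) / (y[i + 1] - y[i])
  out <- data.frame(x = c(x, x_cross),
                    y = c(y, rep(baseline, length(x_cross))))
  out <- out[order(out$x), ]
  rownames(out) <- NULL
  out
}

res <- add_crossings(x = 1:10,
                     y = c(75, 53, 37, 25, 95, 35, 50, 75, 75, 75),
                     baseline = 75)
res
```

For the data in the question this adds two rows, at x = 4 + 5/7 and x = 5 + 1/3, both with y equal to the baseline.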

Color the individuals of a R PCoA plot by groups

Should be a simple question, but I haven't found exactly how to do it so far.
I have a matrix as follows:
sample var1 var2 var3 etc.
1 5 7 3 1
2 0 1 6 8
3 7 6 8 9
4 5 3 2 4
I performed a PCoA using Vegan and plotted the results. Now my problem is that I want to color the samples according to a pre-defined group:
group sample
1 1
1 2
2 3
2 4
How can I import the groups and then plot the points colored according to the group they belong to? It looks simple but I have been scratching my head over this.
Thanks!
Seb
You said you used vegan PCoA, which I assume means the wcmdscale function. By default, vegan::wcmdscale returns only a scores matrix, like the standard stats::cmdscale, but if you add some special arguments (such as eig = TRUE) you get a full wcmdscale result object with dedicated plot and points methods, and you can do:
plot(<pcoa-result>, type="n") # no reproducible example: edit like needed
points(<pcoa-result>, col = group) # no reproducible example: group must be visible
If you have a modern vegan (2.5.x) the following also works:
library(magrittr)
plot(<full-pcoa-result>, type = "n") %>% points("sites", col = group)
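Since the answer has no reproducible example, here is a self-contained sketch of the same plot-then-points pattern. It swaps in base R's stats::cmdscale on Euclidean distances instead of vegan, and the matrix and group table are stand-ins built from the values shown in the question. The key step is lining the groups up with the row order of the matrix before using them as colors.

```r
# Stand-in for the matrix in the question (rows = samples)
mat <- matrix(c(5, 7, 3, 1,
                0, 1, 6, 8,
                7, 6, 8, 9,
                5, 3, 2, 4),
              nrow = 4, byrow = TRUE,
              dimnames = list(as.character(1:4),
                              c("var1", "var2", "var3", "var4")))

# Group table, as it might look after being read from a file
groups <- data.frame(group = c(1, 1, 2, 2), sample = c(1, 2, 3, 4))

# Match each matrix row to its group, independent of the table's row order
group <- factor(groups$group[match(rownames(mat), as.character(groups$sample))])

# Classical PCoA; the Euclidean distance is an assumption, use your own measure
pcoa <- cmdscale(dist(mat), k = 2)

# Empty frame first, then points colored by group
plot(pcoa, type = "n", xlab = "PCoA 1", ylab = "PCoA 2")
points(pcoa, col = as.integer(group), pch = 19)
legend("topright", legend = levels(group),
       col = seq_along(levels(group)), pch = 19)
```

Converting the factor with as.integer() maps each group to one of R's default palette colors.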

2 y-axes Dumbbell ggplot2

I am quite new to R and programming in general, so please forgive my ignorance; I am trying to learn.
I have two sets of data that I would like to plot against each other. Both have 27 rows and 3 columns; one set is called "range" and the other is called "rangePx".
Column "Comp" holds the different components, column "Min" is the minimum concentration in %, and column "Max" is the maximum concentration in %.
I want to make a dumbbell plot with two y-axes, with the y-axis showing the different components and the x-axis showing the concentration.
I can create a dumbbell plot with one y-axis, but I have trouble adding the second y-axis.
Here is a snap from the "range" data
head(range)
# A tibble: 6 x 3
  Comp         Min    Max
  <chr>      <dbl>  <dbl>
1 Methane   0.0100 100
2 Ethane    0.0100  65.0
3 Ethene    0.100   20.0
4 Propane   0.0100  40.0
5 Propene   0.100    6.00
6 Propadien 0.0500   2.00
and here is a snap from the "rangePx" data
head(rangePx)
# A tibble: 6 x 3
  Comp           Min    Max
  <chr>        <dbl>  <dbl>
1 Methane   50.0     100
2 Ethane     0.00800  14.0
3 Ethene     0         0
4 Propane    0.00800   8.00
5 Propene    0         0
6 Propadien  0         0
Here is the piece of code that I use:
library(ggplot2)
library(ggalt)
library(readxl)
theme_set(theme_classic())
range <- read_excel("range.xlsx")
rangePx <- read_excel("rangePx.xlsx")
p <- ggplot(range, aes(x=Max, xend=Min, y = Comp, group=Comp))
p <- p + geom_dumbbell(color="blue")
p
px <- ggplot(rangePx, aes(x=Max, xend=Min, y = Comp, group=Comp))
px <- px + geom_dumbbell(color="green")
p <- p + geom_dumbbell(aes(y=px, color="red"))
p
and here is the error I get when I call p:
Error: Aesthetics must be either length 1 or the same as the data (27): y, colour, x, xend, group
The output above shows a 6x3 data frame, but my original data are 27x3.
Can anyone help me?
Thanks in advance.
ggplot2 does not support two independent y-axes; this is an intentional decision by Hadley Wickham, who wrote the package. (A secondary axis created with sec_axis() must be a one-to-one transformation of the primary axis.) You can see his response to a similar question, where he comments on his reasons for not including it:
Plot with 2 y axes, one y axis on the left, and another y axis on the right
As mentioned in the comments and in reply to the question, if you want to use ggplot2 you have to use faceting to compare. Otherwise you need to use a different plotting package.
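A minimal sketch of the faceting approach follows. It stacks the two tables with a label column and draws one panel per table. To keep the example free of ggalt, the dumbbells are emulated with geom_segment() plus geom_point(), and the two small data frames are stand-ins built from a few rows shown in the question. The column name source and the axis labels are assumptions.

```r
library(ggplot2)

# Stand-ins for the two tables (a few rows copied from the question)
range_df <- data.frame(Comp = c("Methane", "Ethane", "Propane"),
                       Min  = c(0.01, 0.01, 0.01),
                       Max  = c(100, 65, 40))
rangePx_df <- data.frame(Comp = c("Methane", "Ethane", "Propane"),
                         Min  = c(50, 0.008, 0.008),
                         Max  = c(100, 14, 8))

# Label each table, then stack them into one long data frame
range_df$source   <- "range"
rangePx_df$source <- "rangePx"
both <- rbind(range_df, rangePx_df)

# One panel per table: a segment from Min to Max with a point at each end
p <- ggplot(both, aes(y = Comp, colour = source)) +
  geom_segment(aes(x = Min, xend = Max, yend = Comp)) +
  geom_point(aes(x = Min)) +
  geom_point(aes(x = Max)) +
  facet_wrap(~ source) +
  labs(x = "Concentration (%)", y = "Component")
p
```

Dropping the facet_wrap() line overlays both ranges on a single shared axis, which may be easier to compare when the ranges overlap.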

Simple scatter plot of paired data of diff lengths, R

I have a set of paired data, x and y, that I want to plot but they are of varying lengths due to some NA values in y. How can I plot x and y only where there is data present in both variables?
x y
10 1
2 3
4 NA # not plotted
10 40
Try plot(na.pass(df)); it might be useful in this case. plot() skips any row where x or y is NA.
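To make the filtering explicit, the rows with a value in both columns can be selected with complete.cases() before plotting. A minimal sketch, assuming the data sit in a data frame df with columns x and y as in the question:

```r
df <- data.frame(x = c(10, 2, 4, 10),
                 y = c(1, 3, NA, 40))

# TRUE for rows where neither x nor y is NA
keep <- complete.cases(df)

# Plot only the complete pairs
plot(df$x[keep], df$y[keep], xlab = "x", ylab = "y")
```

The keep vector can also be reused for later steps, such as fitting a model on the same rows.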
