Connect observations with lines to a common point (i.e. (0,0)) - r

I would like to connect observations from my df with a common point, i.e. the centerpoint (0,0) using ggplot2.
x y
1 5 4
2 -4 -2
3 -1 5
4 2 -8
Using geom_point(), I get the following.
Now, I would like to have lines connecting the four observations with the centerpoint at (0,0), like in the following (not made with R):
Is this possible at all using ggplot2?

I found a solution:
ggplot(df, aes(x, y)) + geom_point() + geom_segment(aes(xend = 0, yend = 0))
Answer based on @Roland's comments on the question.
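For completeness, here is a self-contained sketch of that solution. The data frame is rebuilt from the four rows printed in the question. The layer order (segments first, points on top) is my own choice and not part of the original answer.

```r
library(ggplot2)

# The four observations from the question
df <- data.frame(x = c(5, -4, -1, 2), y = c(4, -2, 5, -8))

# x and y are mapped once in ggplot() so both layers inherit them;
# every segment ends at the common point (0, 0)
p <- ggplot(df, aes(x, y)) +
  geom_segment(aes(xend = 0, yend = 0)) +
  geom_point()
print(p)
```

To connect to a different common point, change the xend and yend values.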

Related

Grouping Set of Points to a Pre Defined Point

I'm looking to create a model that classifies a set of points that are near a pre-defined point.
For example, let's say I have points:
X Y
1 1
1 2
1 3
2 1
2 3
3 1
3 2
3 3
6 6
8 7
8 5
9 3
10 7
My goal is to identify which points are closest to the predefined point (2,2) and, ideally, to output those points.
I tried using KNN, but I could not figure out how to get the KNN model to pick out the points near (2,2). Any guidance on how I might accomplish this would be awesome. :)
Plot of Points
df <- data.frame( x = c(1,1,1,2,2,2,3,3,3,6,8,8,9,10), y = c(1,2,3,1,2,3,1,2,3,6,7,5,3,7))
df
goal_point <- c(x=2,y=2)
goal_point
You might approach this by calculating distance from goal as a feature.
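df$dist = sqrt((df$x - goal_point["x"])^2 +
               (df$y - goal_point["y"])^2)
set.seed(1)  # kmeans starts from random centers; fix the seed for reproducibility
df$clust = kmeans(df, 2)$cluster
library(ggplot2)
# factor() gives a discrete color per cluster instead of a continuous gradient
ggplot(df, aes(x, y, color = factor(clust))) +
  geom_point()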
In this case kmeans is using x, y, and distance from goal. You could also use just distance from goal by using df$clust = kmeans(df[,3], 2)$cluster, which would lead here to the same clustering.
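If the aim is only to list the points nearest to the goal, the distance column can also be used directly, without any clustering. The sketch below is my own addition rather than part of the answer above. The radius of 1.5 and k = 5 are arbitrary illustrative values.

```r
# Example data from the answer above, including the goal point itself
df <- data.frame(x = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 6, 8, 8, 9, 10),
                 y = c(1, 2, 3, 1, 2, 3, 1, 2, 3, 6, 7, 5, 3, 7))
goal_point <- c(x = 2, y = 2)

# Euclidean distance of every row to the goal point
df$dist <- sqrt((df$x - goal_point[["x"]])^2 + (df$y - goal_point[["y"]])^2)

# Option 1: keep every point within a chosen radius (1.5 is an arbitrary cutoff)
radius <- 1.5
near <- df[df$dist <= radius, ]

# Option 2: keep the k nearest points (k = 5 is likewise arbitrary)
k <- 5
nearest_k <- head(df[order(df$dist), ], k)

near
nearest_k
```

A radius cutoff suits cases where "near" has a natural scale. The k-nearest version always returns a fixed number of rows.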

Divide a non-rectangular plot into subplots with the spatstat package in R

I have data containing sub-plots with different numbers and their corresponding species types (more than 3 species within each subplot). Each species has X and Y coordinates.
> df
subplot species X Y
1 1 Apiaceae 268675 4487472
2 1 Ceyperaceae 268672 4487470
3 1 Vitaceae 268669 4487469
4 2 Ceyperaceae 268665 4487466
5 2 Apiaceae 268662 4487453
6 2 Magnoliaceae 268664 4487453
7 3 Magnoliaceae 268664 4487453
8 3 Apiaceae 268664 4487456
9 3 Vitaceae 268664 4487458
With these data, I created a ppp object for the points of each subplot within the window of the overall (big) plot.
grp <- factor(df$subplot)
win <- ripras(df$X, df$Y)
p.patt <- ppp(df$X, df$Y, window = win, marks = grp)
Now I want to divide the plot into 3 x 3 equal sub-plots, because there are 9 subplots. The overall plot is not rectangular; it looks like a rhombus when I plot it.
I could use the quadrats() function as below, but it divides my plot into unequal subplots. Some are quadrilaterals, others are triangles, etc., which I don't want. I want all the subplots to be equal-sized quadrilaterals (divided by lines parallel to each side). Can anyone guide me on this?
divide <-quadrats(p.patt,3,3)
plot(divide)
Thank you!
Could you break up the plot canvas into 3x3, then run each plot?
> par(mfrow=c(3,3))
> # run code for plot 1
> # run code for plot 2
...
> # run code for plot 9
To return to a single plot on the canvas, type
> par(mfrow=c(1,1))
This is a question about the spatstat package.
You can use the function quantess to divide the window into tiles of equal area. If you want the tile boundaries to be vertical lines, and you want 7 tiles, use
B <- quantess(Window(p.patt), "x", 7)
where p.patt is your point pattern.
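As a rough illustration of the quantess call, here is a sketch on a made-up rhombus-like window. The polygon coordinates are hypothetical and only stand in for the real plot outline. Note that this produces vertical strips of roughly equal area, not the 3 x 3 grid parallel to the sides that the question asks for.

```r
library(spatstat)

# Hypothetical rhombus-like window (vertices listed anticlockwise)
W <- owin(poly = list(x = c(0, 2, 3, 1), y = c(0, 0, 2, 2)))

# Split the window into 3 tiles of (approximately) equal area,
# using vertical boundaries, i.e. quantiles of the x coordinate
B <- quantess(W, "x", 3)

plot(B)
tile.areas(B)  # inspect how close the tile areas are to each other
```

With the real data, Window(p.patt) would take the place of W.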

Create heatmap in R using stat_density2d

I have several (x,y) coordinates, and each one is associated with a binary value (either 1 or 0). I want to create a heatmap showing what the probability is at each point that a given point in that location will have a 1 associated with it.
Sample data:
data = read.table(header=TRUE,
text="x y value
7 3 0
4 5 0
3 7 1
3 6 0
4 5 1
5 6 0")
And so on. I can create a plot showing where the points are concentrated using the following:
ggplot(data, aes(x=x,y=y)) + stat_density2d(aes(fill=..level..), geom="polygon")
But when I try to set fill = value, I get the following error:
Error in unit(tic_pos.c, "mm") : 'x' and 'units' must have length > 0
How do I do this?
Edit: I should add that I can easily accomplish this using stat_summary2d or even geom_tile, but it looks much more boxy and less smooth, which I want it to be.
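One possible way to get a smooth surface, sketched here as my own suggestion and not as an answer from the thread: stat_density2d estimates where the points are, not the value attached to them, so an alternative is to compute a kernel-weighted average of value on a fine grid and plot that. The helper local_prob, the bandwidth h = 1 and the grid spacing are all hypothetical choices for illustration.

```r
dat <- read.table(header = TRUE, text = "x y value
7 3 0
4 5 0
3 7 1
3 6 0
4 5 1
5 6 0")

# Gaussian-kernel weighted mean of `value` at location (gx, gy).
# h is an assumed bandwidth; larger values give a smoother surface.
local_prob <- function(gx, gy, h = 1) {
  w <- exp(-((dat$x - gx)^2 + (dat$y - gy)^2) / (2 * h^2))
  sum(w * dat$value) / sum(w)
}

# Evaluate the estimate on a fine grid covering the data
grid <- expand.grid(x = seq(3, 7, by = 0.1), y = seq(3, 7, by = 0.1))
grid$prob <- mapply(local_prob, grid$x, grid$y)

# Plot the grid as a heatmap if ggplot2 is installed
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  print(ggplot(grid, aes(x, y, fill = prob)) + geom_tile())
}
```

Because each grid value is a weighted average of 0s and 1s, it always lies between 0 and 1 and can be read as a local proportion of 1s. With only a handful of points the surface is very uncertain.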

Too many x-axis values on ggplot2 scatterplot

I would be very thankful to anyone with advice on this. I think this is similar to a question previously posted here (Too many factors on x axis).
I have a dataset as follows:
> head(outputDF)
var1 var2 snpR stepD
1 A B 1.55809163171629 6
2 A C 1.57475543745267 6
3 A D 1.36003481988361 4
4 A E 1.60338829251054 4
5 A F 1.54720598772132 5
6 B C 1.10321616677002 2
I have a nice scatterplot from the function:
ggplot(outputDF, aes(x=snpR, y=stepD)) +geom_point(shape=1) +xlab("SNPR Distance") +
ylab("StepD Distance")
But the problem is that since there are so many distinct snpR values on the x-axis, the x-axis numbers are unreadable, and there are too many vertical grids coming off each of these x-axis number labels.
I know it is a trick with scale_x_continuous but I am just lost playing around with it...
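A label at every distinct value usually means the column is being treated as discrete. The sketch below assumes snpR was read in as a factor or character column; the post does not show str(outputDF), so this is a guess. The data frame is rebuilt from the six rows shown, and the break spacing of 0.1 is an arbitrary choice.

```r
# Rebuild the six rows from head(outputDF); snpR is deliberately stored as a
# factor here to mimic the suspected problem (an assumption, not confirmed)
outputDF <- data.frame(
  var1  = c("A", "A", "A", "A", "A", "B"),
  var2  = c("B", "C", "D", "E", "F", "C"),
  snpR  = factor(c("1.55809163171629", "1.57475543745267", "1.36003481988361",
                   "1.60338829251054", "1.54720598772132", "1.10321616677002")),
  stepD = c(6, 6, 4, 4, 5, 2)
)

# Convert via as.character(): as.numeric() on a factor returns the level codes
outputDF$snpR <- as.numeric(as.character(outputDF$snpR))

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(outputDF, aes(x = snpR, y = stepD)) +
    geom_point(shape = 1) +
    scale_x_continuous(breaks = seq(1, 1.7, by = 0.1)) +  # a few evenly spaced ticks
    xlab("SNPR Distance") +
    ylab("StepD Distance")
  print(p)
}
```

Once the column is numeric, ggplot2 picks a small number of breaks on its own, so the explicit scale_x_continuous call is only needed to control their spacing.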

Plot points for every 15 minutes

I have a text file containing floating-point numbers that represent times in seconds. I want to show the number of occurrences in every 15-minute interval. A sample of my file:
0.128766
2.888977
25.087900
102.787657
400.654768
879.090874
903.786754
1367.098789
1456.678567
1786.564569
1909.567567
In the first 900 seconds (15 minutes) there are 6 occurrences, so I want to plot that as the first point on the y-axis. From 900 to 1800 seconds (the next 15 minutes) there are 4 occurrences, so I want to plot 4 next, and so on.
I know the basic plot() function, but I don't know how to plot counts for every 15 minutes. If there is a relevant link, please point me to it.
Use findInterval():
counts <- table(findInterval(x, seq(0, max(x), 900)))
counts
1 2 3
6 4 1
It's easy to plot:
plot(counts)
To build on Andrie's answer: you can use plot(counts, type = 'p') to plot points or plot(counts, type = 'l') to plot a connected line. If you want to plot a curve for the counts, you would need to model it, e.g. with lm or nls (see ?lm and ?nls).
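Putting the pieces together, here is a self-contained sketch using the sample values from the question. In practice x would be read from the file, e.g. with scan() on the file path. The variable names bin_width, breaks and bin_start_min are my own.

```r
# Sample times in seconds, copied from the question
x <- c(0.128766, 2.888977, 25.087900, 102.787657, 400.654768, 879.090874,
       903.786754, 1367.098789, 1456.678567, 1786.564569, 1909.567567)

bin_width <- 900                          # 15 minutes in seconds
breaks <- seq(0, max(x), by = bin_width)  # left edge of each 15-minute bin

# findInterval() returns the bin index of each time; table() counts per bin
counts <- table(findInterval(x, breaks))

# Start of each non-empty bin, in minutes, for the x-axis
bin_start_min <- breaks[as.integer(names(counts))] / 60

plot(bin_start_min, as.vector(counts), type = "b",
     xlab = "Start of 15-minute interval (min)", ylab = "Occurrences")
```

Note that table() drops bins with no occurrences, so an empty 15-minute interval would simply be missing from the plot instead of showing as zero.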
