How to make the speed profile of a moving object? - r

I am a beginner R user and I am facing the following problem. I have the following data frame:
distance speed
1 61.0 36.4
2 51.4 35.3
3 42.2 34.2
4 33.4 32.8
5 24.9 31.3
6 17.5 28.4
7 11.5 24.1
8 7.1 19.4
9 3.3 16.9
10 0.5 15.5
11 4.4 15.1
12 8.5 15.5
13 13.1 17.3
14 18.8 20.5
15 25.7 24.1
16 33.3 26.3
17 41.0 27.0
18 48.7 27.7
19 56.6 28.4
20 64.8 29.2
21 73.6 31.7
22 83.3 34.2
23 93.4 35.3
The column distance is the distance of a moving object from a specific reference point, and the column speed is the object's speed. As you can see, the object first approaches the point and then moves away from it. I am trying to plot its speed profile. I tried the following code, but it didn't give me the plot I want (I want to show how the speed changes as the object approaches and then passes the reference point):
ggplot(speedprofile, aes(x = distance, y = speed)) + #speedprofile is the data frame
geom_line(color = "red") +
geom_smooth() +
geom_vline(xintercept = 0) # the vline is the reference line
The plot is the following:
Then I manually made the first 10 distances (those before the reference point) negative, which gives a plot closer to what I want:
But there is a problem: a distance cannot be negative.
To sum up, the expected plot is the following (and I am sorry for the quality).
Do you have any ideas on how to solve this?
Thank you in advance!

You can do something like this to auto-compute the change point (to know when the distance should be negative) and then set the axis labels to be positive.
Your data (in case anyone needs it to answer):
read.table(text="distance speed
61.0 36.4
51.4 35.3
42.2 34.2
33.4 32.8
24.9 31.3
17.5 28.4
11.5 24.1
7.1 19.4
3.3 16.9
0.5 15.5
4.4 15.1
8.5 15.5
13.1 17.3
18.8 20.5
25.7 24.1
33.3 26.3
41.0 27.0
48.7 27.7
56.6 28.4
64.8 29.2
73.6 31.7
83.3 34.2
93.4 35.3", stringsAsFactors=FALSE, header=TRUE) -> speed_profile
Now, compute the "real" distance (negative for approaching, positive for receding):
speed_profile$real_distance <- c(-1, sign(diff(speed_profile$distance))) * speed_profile$distance
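To see what that line does, here is a minimal sketch on a five-row slice of the distances above (the names `distance`, `step_sign` and `signed` are only for illustration). Like the answer, it assumes the first observation is still approaching the reference point. Note that two equal consecutive distances would give a sign of 0 and therefore a signed distance of 0.

```r
# Five consecutive distances from the question, around the turning point
distance <- c(7.1, 3.3, 0.5, 4.4, 8.5)

# Sign of each step: -1 while the distance shrinks, +1 once it grows again
step_sign <- sign(diff(distance))

# The first row has no previous step, so it is assumed to be approaching (-1)
signed <- c(-1, step_sign) * distance
print(signed)  # -7.1 -3.3 -0.5  4.4  8.5
```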
Now, compute the X axis breaks ahead of time:
library(ggplot2)
breaks <- scales::pretty_breaks(10)(range(speed_profile$real_distance))
ggplot(speed_profile, aes(real_distance, speed)) +
geom_smooth(linetype = "dashed") +
geom_line(color = "#cb181d", size = 1) +
scale_x_continuous(
name = "distance",
breaks = breaks,
labels = abs(breaks) # make all the labels for the axis positive
)
Provided fonts are working well on your system you could even do:
labels <- abs(breaks)
labels[breaks != 0] <- sprintf("%s\n→", labels[breaks != 0]) # add an arrow under every non-zero label
ggplot(speed_profile, aes(real_distance, speed)) +
geom_smooth(linetype = "dashed") +
geom_line(color = "#cb181d", size = 1) +
scale_x_continuous(
name = "distance",
breaks = breaks,
labels = labels
)

Related

How to draw a multiline plot in R

I have a dataframe with 6 features like this:
X1 X2 X3 X4 X5 X6
Modern Dog 9.7 21.0 19.4 7.7 32.0 36.5
Golden Jackal 8.1 16.7 18.3 7.0 30.3 32.9
Chinese Wolf 13.5 27.3 26.8 10.6 41.9 48.1
Indian Wolf 11.5 24.3 24.5 9.3 40.0 44.6
Cuon 10.7 23.5 21.4 8.5 28.8 37.6
Dingo 9.6 22.6 21.1 8.3 34.4 43.1
I want to draw a line plot like this:
I'm trying this:
plot(df$X1, type = "o",col = "red", xlab = "Month", ylab = "Rain fall")
lines(c(df$X2, df$X3, df$X4, df$X5, df$X6), type = "o", col = "blue")
But it's only plotting a single variable. I'm sorry if this question is annoying; I'm totally new to R and I just don't know how to get this done. I would really appreciate any help on this.
Thanks in advance
The easiest way would be to convert your dataset to a long format (e.g. by using the gather function in the tidyr package), and then plot it using the group aesthetic in ggplot.
I recreate your dataset, assuming your group variable is named "Group":
df <- read.table(text = "
Group X1 X2 X3 X4 X5 X6
Modern_Dog 9.7 21.0 19.4 7.7 32.0 36.5
Golden_Jackal 8.1 16.7 18.3 7.0 30.3 32.9
Chinese_Wolf 13.5 27.3 26.8 10.6 41.9 48.1
Indian_Wolf 11.5 24.3 24.5 9.3 40.0 44.6
Cuon 10.7 23.5 21.4 8.5 28.8 37.6
Dingo 9.6 22.6 21.1 8.3 34.4 43.1 ",
header = TRUE, stringsAsFactors = FALSE)
Then convert the dataset to long format and plot:
library(tidyr)
library(ggplot2)
df_long <- df %>% gather(X1:X6, key = "Month", value = "Rainfall")
ggplot(df_long, aes(x = Month, y = Rainfall, group = Group, shape = Group)) +
geom_line() +
geom_point() +
theme(legend.position = "bottom")
See also the answers here: Group data and plot multiple lines.
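For readers without tidyr, the same wide-to-long step can be sketched in base R. This is only an illustration on two rows and three columns of the data above; the names `measure_cols` and `df_long` are assumptions, and the column names Month and Rainfall simply follow the answer.

```r
# Two groups and three measurement columns taken from the question's data
df <- data.frame(
  Group = c("Cuon", "Dingo"),
  X1 = c(10.7, 9.6),
  X2 = c(23.5, 22.6),
  X3 = c(21.4, 21.1)
)

measure_cols <- c("X1", "X2", "X3")

# One row per (Group, Month) pair: columns are stacked one after another
df_long <- data.frame(
  Group    = rep(df$Group, times = length(measure_cols)),
  Month    = rep(measure_cols, each = nrow(df)),
  Rainfall = unlist(df[measure_cols], use.names = FALSE)
)
print(df_long)
```

The resulting `df_long` has the same shape as the output of gather() and can be passed to the ggplot call above unchanged.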

How to make frequency table with specific class in R

I want to make a frequency table from matrix "b", which has 100 observations.
I'm trying to cut all observations into 15 classes, so the frequency table should also include the 'empty classes' that contain no observations.
However, when I use table(), only the classes (levels) that contain observations are included (12 levels).
How can I force the table to have 15 levels?
> b <- matrix(as.matrix(b),ncol=1)
> fivenum(b)
[1] 24.2 24.7 24.9 25.1 25.6
> bcut <- seq(from = 24.2, by =0.1, length.out = 16); bcut
[1] 24.2 24.3 24.4 24.5 24.6 24.7 24.8 24.9 25.0 25.1 25.2 25.3
[13] 25.4 25.5 25.6 25.7
> bgroup <- factor(cut(x = b, breaks = bcut, include.lowest = T))
> levels(bgroup)
[1] "[24.2,24.3]" "(24.3,24.4]" "(24.4,24.5]" "(24.6,24.7]"
[5] "(24.7,24.8]" "(24.8,24.9]" "(24.9,25]" "(25.1,25.2]"
[9] "(25.2,25.3]" "(25.3,25.4]" "(25.4,25.5]" "(25.6,25.7]"
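A likely cause, sketched below on made-up values: cut() already returns a factor with one level per interval, but wrapping it in factor() drops the levels that have no observations. Passing the result of cut() straight to table() keeps the empty classes. The vector `b` here is invented for illustration and is not the asker's data.

```r
b <- c(24.25, 24.25, 24.55, 24.95, 25.05, 25.65)     # made-up observations
bcut <- seq(from = 24.2, by = 0.1, length.out = 16)  # 16 breaks give 15 classes

kept <- cut(b, breaks = bcut, include.lowest = TRUE) # keeps all 15 levels
dropped <- factor(kept)                              # re-factoring drops unused levels

freq <- table(kept)  # empty classes appear with a count of 0
print(freq)
nlevels(kept)     # 15
nlevels(dropped)  # 5
```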

R - Use a value in a variable to conditionally select multiple values from other variables

I am trying to compare colours in 3D CIELuv colourspace, and I want to identify the L, U, and V values for the colour that is closest to my primary colour of interest. I have calculated the Euclidean distance between each source colour (represented by the three coordinates, L, U, and V, for each colour) and the primary colour (for which I also have the LUV coordinates, not shown for space). The distances between each colour and the primary colour are stored in the three DistCol variables. I then found the smallest of these distances using df$Min.Dist <- colnames(df[c(10:12)])[unlist(apply(df[c(10:12)], 1, which.min))]. Example:
Colour1L Colour1U Colour1V Colour2L Colour2U Colour2V Colour3L Colour3U Colour3V DistCol1 DistCol2 DistCol3 Min.Dist
1 25.5 9.0 -54.5 98.8 0.0 -1.6 63.9 55.0 60.2 25.4 82.1 137.8 DistCol1
2 8.7 14.8 5.6 41.7 133.2 27.6 41.7 133.2 27.6 144.2 58.3 133.3 DistCol2
3 83.2 24.7 -42.7 21.6 -0.4 0.8 83.2 24.7 -42.7 12.1 170.6 102.3 DistCol1
4 55.0 -49.8 62.5 99.2 0.1 -1.8 55.0 -49.8 62.5 213.7 103.4 67.7 DistCol3
I want to use the Min.Dist variable (or any other method really, if there's a better way!) to conditionally select all three L, u, and v values for whichever colour is the closest. That is, in the first row, Min.Dist is DistCol1, so the three Source values would all come from the three Colour1 columns. My final output would ideally look like:
Colour1L Colour1U Colour1V Colour2L Colour2U Colour2V Colour3L Colour3U Colour3V DistCol1 DistCol2 DistCol3 Min.Dist SourceL SourceU SourceV
1 25.5 9.0 -54.5 98.8 0.0 -1.6 63.9 55.0 60.2 25.4 82.1 137.8 DistCol1 25.5 9.0 -54.5
2 8.7 14.8 5.6 41.7 133.2 27.6 41.7 133.2 27.6 144.2 58.3 133.3 DistCol2 41.7 133.2 27.6
3 83.2 24.7 -42.7 21.6 -0.4 0.8 83.2 24.7 -42.7 12.1 170.6 102.3 DistCol1 83.2 24.7 -42.7
4 55.0 -49.8 62.5 99.2 0.1 -1.8 55.0 -49.8 62.5 213.7 103.4 67.7 DistCol3 55.0 -49.8 62.5
I have previously obtained a similar result using a long nested ifelse expression for each of the L, U, and V dimensions e.g. df$SourceL <- ifelse(df$Min.Dist =="DistCol1", Colour1L, ifelse(df$Min.Dist == "DistCol2", Colour2L, ifelse(... but I'm dealing with 8-10 colours in my real data and this is extremely tedious and prone to error.
I apologise if this question has already been answered elsewhere, and would very much appreciate any advice or direction to a resource for this. Thank you as well to everyone who answers questions on this forum - your advice has been invaluable for solving many R problems over the past months!
Doing this in a non-vectorized way with base R:
rebuilding your data.frame:
df <- data.frame(c(25.5,8.7,83.2,55),c(9,14.8,24.7,-49.8),c(-54.5,5.6,-42.7,62.5), c(98.8,41.7,21.6,99.2),c(0,133.2,-0.4,0.1),c(-1.6,27.6,0.8,-1.8),c(63.9,41.7,83.2,55),c(55,133.2,24.7,-49.8),c(60.2,27.6,-42.7,62.5),c(25.4,144.2,12.1,213.7),c(82.1,58.3,170.6,103.4),c(137.8,133.3,102.3,67.7),c("DistCol1","DistCol2","DistCol1","DistCol3"))
colnames(df) <- c("Colour1L", "Colour1U", "Colour1V", "Colour2L", "Colour2U", "Colour2V", "Colour3L", "Colour3U", "Colour3V", "DistCol1", "DistCol2", "DistCol3", "Min.Dist")
Looping over rows
num <- sub("DistCol", "", df$Min.Dist) # colour number as text; unlike substr(..., 8, 8) this also works for two-digit numbers
for (i in seq_len(nrow(df))) {
  df$SourceL[i] <- df[[paste0("Colour", num[i], "L")]][i]
  df$SourceU[i] <- df[[paste0("Colour", num[i], "U")]][i]
  df$SourceV[i] <- df[[paste0("Colour", num[i], "V")]][i]
}
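The row loop can also be replaced by matrix indexing, which selects one cell per row in a single step. The following is a sketch under stated assumptions, not the answerer's method: it uses the first two rows and two colours of the example data, relies on the ColourNL / DistColN naming pattern from the question, and the helper names `dist_cols`, `num`, `rows` and `m` are made up for illustration.

```r
df <- data.frame(
  Colour1L = c(25.5, 8.7),  Colour1U = c(9.0, 14.8),  Colour1V = c(-54.5, 5.6),
  Colour2L = c(98.8, 41.7), Colour2U = c(0.0, 133.2), Colour2V = c(-1.6, 27.6),
  DistCol1 = c(25.4, 144.2), DistCol2 = c(82.1, 58.3)
)

# Name of the smallest distance column in each row
dist_cols <- grep("^DistCol", names(df), value = TRUE)
df$Min.Dist <- dist_cols[unname(apply(df[dist_cols], 1, which.min))]

num  <- sub("DistCol", "", df$Min.Dist)  # colour number of the nearest colour
rows <- seq_len(nrow(df))

for (ch in c("L", "U", "V")) {
  cols <- paste0("Colour", num, ch)  # column to read from, one per row
  m <- as.matrix(df[unique(cols)])   # numeric matrix of the needed columns
  # A two-column index matrix picks cell (row, column) for every row at once
  df[[paste0("Source", ch)]] <- m[cbind(rows, match(cols, colnames(m)))]
}
print(df[c("Min.Dist", "SourceL", "SourceU", "SourceV")])
```

The loop now runs over the three channels rather than over the rows, so the amount of code does not grow with the number of colours.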

Density plot in ggplot2 with y value taken into account

My data contains x-axis points and a y value for each point. The x-axis points are not evenly distributed. I need to visualize how the x-axis points are clustered and how the y values behave within those clusters. To see how the x values are clustered I can draw a density plot of the x values, but that does not reflect the y values in each cluster.
For example, if 100 points (let's say) on the x axis are very close to each other and all have positive y values, I want my plot to go up at that point; if those 100 points have negative y values, I want my plot to go below the zero line; if those 100 points have both positive and negative y values, I want my plot to stay around zero. Similarly, even if those 100 points all have positive values, if they are scattered over a long distance I want the plot to stay near the zero line.
In short, both the density of the x points and their y values matter to me, and I want to plot a smooth line. Could anyone help me with this? (stat_smooth did not do the job, as it makes my plot an almost straight line.)
Here are my x and y axis values (I did not know how to insert a table here):
x axis values
x_value
86645
87018
987522
989433
989934
991055
995476
9987548
9987885
9988511
9988522
9991975
9992246
9992428
9993646
9993668
9994285
9994309
9994317
9994425
9994437
9994581
9994856
9994878
9995045
9995072
9995103
9995142
9995153
9995521
9996329
9996568
9997122
9997269
9997277
9997282
9998216
9999596
9999838
10001799
10004506
10007993
10008597
10009002
10009022
10009225
10009530
10009657
10010526
10012288
10012897
10012899
10012901
10014614
10014903
10015001
10015039
10015059
10015340
10015342
10016761
10018152
10020062
10024053
10024058
10024284
10024318
10025853
10026758
10028903
10029674
10029835
10030862
10031185
10031737
10033603
10035054
10035100
10036294
10036678
10036691
10036698
10036783
10037234
10037289
10037388
10039332
10039431
10042426
10042469
10042471
10043156
10043218
10043225
10045396
10045986
10046533
10046604
10047066
10047179
10047865
10048106
10048136
10048873
10049328
10049724
10049961
10049974
10050014
10050020
10050039
10050041
10050450
10050451
10050558
10050561
10051330
10051336
10052228
Y axis values:
y_value
16.7
14.3
10.5
18.2
20.0
16.7
14.3
10.4
27.3
22.2
11.1
-18.2
-10.1
-13.3
-26.4
-13.3
-15.4
14.3
15.4
11.7
26.7
18.2
64.7
21.2
20.0
11.8
-17.9
25.0
14.2
20.0
18.2
12.5
12.5
10.5
11.1
12.5
14.3
-20.0
12.5
-20.0
16.7
13.3
18.2
20.0
30.0
20.0
11.8
-18.8
20.0
20.0
12.5
18.8
13.3
-15.4
18.2
18.9
28.6
20.0
12.5
16.1
15.4
10.5
13.3
29.7
23.1
18.2
14.3
12.5
12.5
16.7
11.1
20.0
18.2
18.2
13.2
13.3
11.8
15.4
14.3
23.8
18.2
33.3
18.2
-12.5
12.5
23.1
21.7
14.3
16.7
11.1
16.7
12.5
11.1
12.5
18.2
12.5
11.0
20.0
18.2
15.8
10.5
10.2
10.5
14.3
11.8
25.0
13.8
16.4
16.7
-18.2
18.2
16.7
18.2
18.2
11.8
12.5
14.3
17.9
10.5
Note: In what follows, I've combined your x and y data into a data frame df with columns x and y.
Looking at a simple scatter plot, it appears that your data is grouped more or less into five clusters:
with(df,plot(x,y))
To see the distribution in both the x and y directions you need a 2-dimensional kernel density estimate, which is available in the MASS package. You can then plot this in 3 dimensions (with the density as z) using the rgl package.
library(MASS) # for kde2d(...)
library(rgl) # for open3d(...) and surface3d(...)
dens <- kde2d(df$x,df$y)
zlim <- range(dens$z)
palette <- rev(heat.colors(10))
col <- palette[9*(dens$z-zlim[1])/diff(zlim) + 1] # assign colors to heights for each point
with(dens,open3d(scale=c(x=1/diff(range(x)),y=1/diff(range(y)),z=1/diff(range(z)))))
with(dens,surface3d(x,y,z, color=col))
title3d(xlab="X",ylab="Y")
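If a 3D surface is more than needed, one possible reading of the question is a kernel-weighted sum of the y values along x: each point contributes a Gaussian bump scaled by its y value, so dense clusters of same-sign values add up, mixed-sign clusters cancel, and scattered points stay comparatively low. The sketch below uses made-up data; the function name `signed_density` and the bandwidth are assumptions for illustration and are not part of the answer above.

```r
# Sum of Gaussian kernels centred on each x, weighted by the matching y
signed_density <- function(x, y, grid, bandwidth) {
  # kernel[i, j]: weight of observation j at grid point i
  kernel <- dnorm(outer(grid, x, "-"), sd = bandwidth)
  as.vector(kernel %*% y)
}

# Made-up data: a positive cluster, a negative cluster, and a mixed pair
x <- c(1.0, 1.1, 1.2, 5.0, 5.1, 5.2, 9.0, 9.1)
y <- c(10, 10, 10, -10, -10, -10, 10, -10)

grid <- c(1.1, 5.1, 9.05)  # one evaluation point per cluster
curve_values <- signed_density(x, y, grid, bandwidth = 0.3)
print(curve_values)  # positive, negative, and close to zero
```

For a full curve, evaluate on something like `seq(min(x), max(x), length.out = 500)` and draw it with `plot(grid, curve_values, type = "l")`. The bandwidth has to be chosen on the scale of the x values, so it would be far larger than 0.3 for the data in the question.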

Changing ggplot2 legend title without altering graphical parameters

I have found many topics about the legend title with ggplot2 but after a couple of hours I have not been able to handle my situation.
Here is the dataset:
> dat
FACTOR1 FACTOR2 lsmean lower.CL upper.CL
1 A aa 26.2 25.6 26.8
2 B aa 24.8 23.9 25.7
3 A bb 26.0 25.2 26.7
4 B bb 24.9 23.9 25.9
5 A cc 24.4 23.9 24.8
6 B cc 23.9 22.9 25.0
7 A dd 24.9 24.3 25.6
8 B dd 23.2 22.3 24.0
And the graphic of interest:
gp0 <- ggplot(dat, aes(x=FACTOR2, y=lsmean, group=FACTOR1, colour=FACTOR1))
( gp1 <- gp0 + geom_line(aes(linetype=FACTOR1), size=.6) +
geom_point(aes(shape=FACTOR1), size=3) +
geom_errorbar(aes(ymax=upper.CL, ymin=lower.CL), width=.1) )
If I use scale_colour_manual() to change the legend title then I get an unexpected additional legend:
gp1 + scale_colour_manual("NEW TITLE",values=c("red","blue"))
I can suppress this additional legend with scale_<aes>_manual(guide = "none", values = ...), but then I don't understand how to control the parameters (the style of the points and lines):
gp1 + scale_colour_manual("NEW TITLE",values=c("red","blue")) +
scale_shape_manual(guide = 'none', values=c(1,2)) +
scale_linetype_manual(guide = 'none', values=c(1,3))
How can I reproduce the first plot with only the legend title changed?
You have to set the same title for all the aesthetics you have used (here colour, linetype and shape) so that ggplot2 merges them into a single legend, for example with the function labs().
gp1 + scale_colour_manual(values=c("red","blue"))+
labs(colour="NEW TITLE",linetype="NEW TITLE",shape="NEW TITLE")

Resources