I prepared a data.frame as follows:
# packages used in the snippets below
library(ggplot2)
library(akima)
library(plotly)
# create data frame
df <-data.frame(x = c(rnorm(300, 3, 2.5), rnorm(150, 7, 2)), # create random data
y = c(rnorm(300, 6, 2.5), rnorm(150, 2, 2)),
z = c(rnorm(300, 6, 2.5), rnorm(150, 2, 2)),
group = c(rep('A', 300), rep('B', 150))) # add two groups
The relationship between y and x, plotted with ggplot2, looks like this:
#for y-x correlation by group with fit curve
gg <- ggplot(df, aes(x=x, y=y)) +
stat_density_2d(geom = "polygon", aes(alpha = ..level..,fill=group))+
geom_smooth(method = 'loess')
print(gg)
Then I created a plot_ly 3D figure as follows:
#plot_ly 3D plot
s = interp(x = df$x, y = df$y, z = df$z,duplicate = "mean") # prepare for plot_ly plot
p <- plot_ly(x = s$x, y = s$y, z = s$z,colorscale = 'Jet')%>% # plot_ly
add_surface()
This produced the graph below:
Here is my question.
I would like to place the first ggplot2 figure at the bottom of the second plot_ly figure, as shown below:
Is there any way (function or package) to accomplish this with R?
I am trying to plot rectangles along the x-axis for different classes on the y-axis. I want to do this with geom_rect, but I don't want to set ymin and ymax by hand, since I want these to be determined by the classes (i.e. factors) I have in my data.
I managed to get the plot I want changing the breaks and the tick labels manually, but I am sure there must be a better way to do this.
Small toy example:
data <- data.frame(x_start = c(0, 2, 4, 6),
x_end = c(1, 3, 5, 7),
y_start = c(0, 0, 2, 2),
y_end = c(1, 1, 3, 3),
info = c("x", "x", "y", "y"))
Original plot:
ggplot(data ,aes(xmin=x_start, xmax=x_end, ymin=y_start, ymax=y_end, fill=info)) + geom_rect()
Plot that I want:
ggplot(data ,aes(xmin=x_start, xmax=x_end, ymin=y_start, ymax=y_end, fill=info)) + geom_rect() +
scale_y_continuous(breaks = c(0.5,2.5), labels = c("x","y"))
library(dplyr)
y_lab <- data %>%
distinct(y_end, y_start, info) %>%
mutate(y_mid = (y_end + y_start)/2)
ggplot(data, aes(xmin=x_start, xmax=x_end, ymin=y_start, ymax=y_end, fill=info)) +
geom_rect() +
scale_y_continuous(breaks = y_lab$y_mid, labels = y_lab$info)
Or using geom_tile:
ggplot(data, aes(x = (x_start + x_end)/2, y = info, fill=info, width = 1)) +
geom_tile()
Below is simple code to produce a stat_density_2d plot of X~Y.
plot_data <-
data.frame(X = c(rnorm(300, 3, 2.5), rnorm(150, 7, 2)),
Y = c(rnorm(300, 6, 2.5), rnorm(150, 2, 2)),
Z = c(rnorm(300, 60, 5), rnorm(150, 40, 5)),
Label = c(rep('A', 300), rep('B', 150)))
ggplot(plot_data, aes(x=X, y=Y)) +
stat_density_2d(geom = "polygon", aes(alpha = stat(level), fill = Label))
As I understand it, the density is based on the count of points in the X~Y plane. My question is: can I use Z as the density, or perhaps add Z as a weight to the density? I'm not sure I'm making sense here. As it is, the density of X~Y is useful to me, but I'm wondering whether I can incorporate the information in Z into it.
Perhaps you have an alternative idea? Both the density of X~Y and Z are information that I want to convey. Currently I'm plotting separate densities for X~Y, X~Z, and Y~Z, and they're all useful to me (using my own data, of course).
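One way to read "add Z as a weight" is a weighted kernel density estimate, in which each point contributes to the X~Y density in proportion to its Z value. The base-R sketch below is only an illustration under assumptions: the function name weighted_kde2d, the Gaussian kernel, and the bw.nrd bandwidths are choices made here, not necessarily what stat_density_2d does internally.

```r
# Hypothetical helper: 2D Gaussian kernel density where point i carries weight w[i].
weighted_kde2d <- function(x, y, w, n = 50,
                           hx = stats::bw.nrd(x), hy = stats::bw.nrd(y)) {
  stopifnot(length(x) == length(y), length(x) == length(w), all(w >= 0))
  w  <- w / sum(w)                            # normalise weights to sum to 1
  gx <- seq(min(x), max(x), length.out = n)   # evaluation grid in x
  gy <- seq(min(y), max(y), length.out = n)   # evaluation grid in y
  kx <- dnorm(outer(gx, x, "-"), sd = hx)     # n x N kernel values along x
  ky <- dnorm(outer(gy, y, "-"), sd = hy)     # n x N kernel values along y
  z  <- kx %*% (w * t(ky))                    # z[i, j] = sum_k w[k] * kx[i, k] * ky[j, k]
  list(x = gx, y = gy, z = z)
}

set.seed(1)
X <- c(rnorm(300, 3, 2.5), rnorm(150, 7, 2))
Y <- c(rnorm(300, 6, 2.5), rnorm(150, 2, 2))
Z <- c(rnorm(300, 60, 5), rnorm(150, 40, 5))

dens_w <- weighted_kde2d(X, Y, w = Z)         # Z used as the (non-negative) weight

# Long format for ggplot2, e.g. geom_contour(aes(X, Y, z = density))
dens_long <- data.frame(expand.grid(X = dens_w$x, Y = dens_w$y),
                        density = as.vector(dens_w$z))
```

With equal weights this reduces to an ordinary Gaussian kernel density estimate, so the weighted and unweighted surfaces can be compared on the same grid.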
Edit 2, manual calculation plan:
I'm still working on this as I go. This is the general idea of what I'm planning to do.
1. Instead of using stat_density_2d, calculate the density myself using the method that stat_density_2d uses, which is MASS::kde2d().
2. Use interpolation, such as akima::interp, to interpolate Z onto the X~Y grid.
3. Multiply Z by the density of X~Y (from step 1) as a form of weighting.
4. Plot the result again using ggplot.
Edit 3: Updated with code that applies Z as a weight to the density of X~Y.
library(ggplot2)
library(data.table)
library(ggnewscale)
library(akima)
library(MASS)
library(metR) # provides geom_contour_fill(), used in the plot below
plot_data <-
data.frame(X = c(rnorm(300, 3, 2.5), rnorm(150, 7, 2)),
Y = c(rnorm(300, 6, 2.5), rnorm(150, 2, 2)),
Z = c(rnorm(300, 60, 5), rnorm(150, 60, 5)),
Label = c(rep('A', 300), rep('B', 150)))
setDT(plot_data)
#Interpolation of Z into X~Y grid for Label A
int_plot_data_A=with(plot_data[Label=="A"],interp(x=X,y=Y,z=Z,nx=100,ny=100))
rownames(int_plot_data_A$z)=int_plot_data_A$x
colnames(int_plot_data_A$z)=int_plot_data_A$y
plot_data_Z_A <- melt(int_plot_data_A$z)
names(plot_data_Z_A) <- c("X", "Y", "Z")
#Calculation of kde2d for Label A
plot_data_A=plot_data[Label=="A"]
kde2d_A=kde2d(plot_data_A$X,plot_data_A$Y,n=100)
rownames(kde2d_A$z)=kde2d_A$x
colnames(kde2d_A$z)=kde2d_A$y
plot_kde_A <- melt(kde2d_A$z, na.rm = TRUE)
names(plot_kde_A) <- c("X", "Y", "Z")
#Interpolation of Z into X~Y grid for Label B
int_plot_data_B=with(plot_data[Label=="B"],interp(x=X,y=Y,z=Z,nx=100,ny=100))
rownames(int_plot_data_B$z)=int_plot_data_B$x
colnames(int_plot_data_B$z)=int_plot_data_B$y
plot_data_Z_B <- melt(int_plot_data_B$z)
names(plot_data_Z_B) <- c("X", "Y", "Z")
#Calculation of kde2d for Label B
plot_data_B=plot_data[Label=="B"]
kde2d_B=kde2d(plot_data_B$X,plot_data_B$Y,n=100)
rownames(kde2d_B$z)=kde2d_B$x
colnames(kde2d_B$z)=kde2d_B$y
plot_kde_B <- melt(kde2d_B$z, na.rm = TRUE)
names(plot_kde_B) <- c("X", "Y", "Z")
#Filtering out values under 0.01. It makes the plot better. This is subjective
setDT(plot_kde_A)
plot_kde_A[Z<0.01]=NA
setDT(plot_kde_B)
plot_kde_B[Z<0.01]=NA
#Calculate for A weighted with Z
plot_kde_A_Weight_Z=plot_kde_A
plot_kde_A_Weight_Z$Z=plot_kde_A_Weight_Z$Z*plot_data_Z_A$Z
#Calculate for B weighted with Z
plot_kde_B_Weight_Z=plot_kde_B
plot_kde_B_Weight_Z$Z=plot_kde_B_Weight_Z$Z*plot_data_Z_B$Z
ggplot() +
geom_contour_fill(data=plot_kde_A,aes(x=X,y=Y,z=Z),alpha=0.8,bins=10) +
scale_fill_continuous(low = "white", high = "blue") +
geom_contour(data=plot_kde_A_Weight_Z,aes(x=X,y=Y,z=Z),bins=10) +
new_scale_fill() +
geom_contour_fill(data=plot_kde_B,aes(x=X,y=Y,z=Z),alpha=0.8,bins=10) +
scale_fill_continuous(low = "white", high = "red") +
geom_contour(data=plot_kde_B_Weight_Z,aes(x=X,y=Y,z=Z),color="red",bins=10)
Edit 1: After searching around without the keyword ggplot, I found something called kernel density estimation, and this post, "Plot contours of distribution on all three axes in 3D plot", feels like what I'm looking for to visualise my data. Unfortunately, I found out that ggplot does not have 3D functionality. There's a four-year-old package called gg3D? plotly seems to be the best candidate for this? The final figure in the post looks like what I'm trying to achieve.
In R, using gganimate, one can make an animated plot where events appear and disappear with time. For example:
library(lubridate)
library(gganimate)
df=data.frame(
x=c(1,2,3,4),
y=c(1,2,3,4),
start=c(1,2,3,4),
end=c(5,6,7,8),
en=as_date(1),
ex=as_date(1))
ggplot(data=df, aes(x=x,y=y))+
geom_point()+
gganimate::transition_events(
start=start,
end=end,
enter_length = as.numeric(en),
exit_length = as.numeric(ex))
This produces a plot in which points appear according to the column "start" and disappear according to the column "end".
I wonder if there is an easy way to achieve the same with plotly (preferably using ggplotly()), with a slider to move along time.
Here is an example using ggplotly. The result, however, isn't exactly the same:
library(plotly)
library(lubridate)
df = data.frame(
x = c(1, 2, 3, 4),
y = c(1, 2, 3, 4),
start = c(1, 2, 3, 4),
end = c(5, 6, 7, 8),
en = as_date(1),
ex = as_date(1)
)
frame_list <- Map(seq, from = df$start, to = df$end)
DF <- data.frame(x = rep(df$x, times = lengths(frame_list)),
y = rep(df$y, times = lengths(frame_list)),
frame = unlist(frame_list))
p <- ggplot(DF, aes(x, y)) +
geom_point(aes(size = y, frame = frame))
fig <- ggplotly(p)
fig <- fig %>% # assign the result so the animation options are kept
animation_opts(
frame = 0,
easing = "linear",
redraw = FALSE,
mode = "immediate"
)
fig
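The same expansion step (one row per event per frame in which it is visible) can also be passed to plot_ly directly, without going through ggplot2. This is a minimal sketch, assuming plotly is installed; the frame duration of 500 ms and the slider prefix "time: " are arbitrary choices for illustration.

```r
# Events with start/end times, as in the example above
df <- data.frame(x = 1:4, y = 1:4, start = 1:4, end = 5:8)

# One row per event per frame in which the event is visible
frame_list <- Map(seq, from = df$start, to = df$end)
DF <- data.frame(
  x     = rep(df$x, times = lengths(frame_list)),
  y     = rep(df$y, times = lengths(frame_list)),
  frame = unlist(frame_list)
)

# Build the animated figure only if plotly is available
if (requireNamespace("plotly", quietly = TRUE)) {
  fig <- plotly::plot_ly(DF, x = ~x, y = ~y, frame = ~frame,
                         type = "scatter", mode = "markers")
  fig <- plotly::animation_opts(fig, frame = 500, redraw = FALSE)
  fig <- plotly::animation_slider(fig, currentvalue = list(prefix = "time: "))
  # print(fig) to display the figure with its slider
}
```

As with the ggplotly version, this is not guaranteed to reproduce gganimate's enter and exit transitions; it only shows or hides each point per frame.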
I'm trying to visualize some data that looks like this:
line1 <- data.frame(x = c(4, 24), y = c(0, -0.42864), group = "group1")
line2 <- data.frame(x = c(4, 12 ,24), y = c(0, 2.04538, 3.4135), group = "group2")
line3 <- data.frame(x = c(4, 12, 24), y = c(0, 3.14633, 3.93718), group = "group3")
line4 <- data.frame(x = c(0, 3, 7, 12, 18), y = c(0, -0.50249, 0.11994, -0.68694, -0.98949), group = "group4")
line5 <- data.frame(x = c(0, 3, 7, 12, 18, 24), y = c(0, -0.55753, -0.66006, 0.43796, 1.38723, 3.17906), group = "group5")
df <- do.call(rbind, list(line1, line2, line3, line4, line5))
What I'm trying to do is plot the least squares line (and points) for each group on the same plot. And I'd like the colour of the lines and points to correspond to the group.
All I've been able to do is plot the points according to their group
ggplot(data = df, aes(x, y, colour = group)) + geom_point(aes(size = 10))
But I have no idea how to add in the lines as well and make their colours correspond to the points that they are fitting.
I'd really appreciate any help with this. It's turning out to be so much harder than I thought it would be.
You can simply add a geom_smooth layer to your plot:
ggplot(data = df, aes(x, y, colour = group)) + geom_point(aes(size = 10)) +
geom_smooth(method="lm",se=FALSE)
method="lm" specifies that you want a linear model, i.e. a least-squares line, fitted separately for each colour group.
se=FALSE suppresses the confidence band around each line.
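To see which lines geom_smooth(method="lm") draws, the per-group least-squares coefficients can be computed directly with lm. This is a small base-R sketch using the data from the question; the object name fits is just for illustration.

```r
line1 <- data.frame(x = c(4, 24), y = c(0, -0.42864), group = "group1")
line2 <- data.frame(x = c(4, 12, 24), y = c(0, 2.04538, 3.4135), group = "group2")
line3 <- data.frame(x = c(4, 12, 24), y = c(0, 3.14633, 3.93718), group = "group3")
line4 <- data.frame(x = c(0, 3, 7, 12, 18),
                    y = c(0, -0.50249, 0.11994, -0.68694, -0.98949), group = "group4")
line5 <- data.frame(x = c(0, 3, 7, 12, 18, 24),
                    y = c(0, -0.55753, -0.66006, 0.43796, 1.38723, 3.17906), group = "group5")
df <- do.call(rbind, list(line1, line2, line3, line4, line5))

# Fit y ~ x separately within each group; result has one column per group
# and rows "(Intercept)" and "x" (the slope)
fits <- sapply(split(df, df$group), function(d) coef(lm(y ~ x, data = d)))
t(fits)  # one row per group: intercept and slope
```

Each row corresponds to the straight line drawn for that group's colour in the plot.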
After experimenting with different point sizes and shapes when plotting with ggplot2, I found that I was no longer able to plot circular points. These simple examples illustrate the problem:
# Plot 1 - square points (symbol #15) appear correctly
#
df = data.frame(x = c(1, 2, 3), y = c(4, 5, 6))
g1 <- ggplot(df, aes(x = x, y = y))
g1 <- g1 + geom_point(size = 3, shape = 15)
g1
Plot 1 output:
# Plot 2 - circular points (symbol #16) appear as diamonds
#
df = data.frame(x = c(1, 2, 3), y = c(4, 5, 6))
g1 <- ggplot(df, aes(x = x, y = y))
g1 <- g1 + geom_point(size = 3, shape = 16)
g1
Plot 2 output:
# Plot 3 - triangular points (symbol #17) appear correctly
#
df = data.frame(x = c(1, 2, 3), y = c(4, 5, 6))
g1 <- ggplot(df, aes(x = x, y = y))
g1 <- g1 + geom_point(size = 3, shape = 17)
g1
Plot 3 output:
# Plot 4 - diamond points (symbol #18) appear correctly
#
df = data.frame(x = c(1, 2, 3), y = c(4, 5, 6))
g1 <- ggplot(df, aes(x = x, y = y))
g1 <- g1 + geom_point(size = 3, shape = 18)
g1
Plot 4 output:
What do I have to do to plot circular points again?
(I'm running R 3.1.3 and RStudio 0.98.1103 in Windows 7.)
It looks like it has to do with the limited resolution of the RStudioGD() graphics device. It becomes a non-issue by avoiding the RStudio interface:
g1 <- ggplot(df, aes(x = x, y = y))
g1 <- g1 + geom_point(size = 3)
g1
(from RStudio interface via save image)
ggsave(g1, filename = "image.png")
ggsave gives you finer control over graphics parameters, including the height and width, the dpi (for raster images, e.g. png), and the file format. See the ?ggsave documentation for details.
Alternatively, bump the geom_point up to size = 4.
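For example, the size and resolution can be set explicitly when saving. This is a sketch only: the output path, the dimensions, and the dpi below are assumed values, not recommendations from the answer.

```r
library(ggplot2)

df <- data.frame(x = c(1, 2, 3), y = c(4, 5, 6))
g1 <- ggplot(df, aes(x = x, y = y)) +
  geom_point(size = 3, shape = 16)   # shape 16 is the filled circle

# Hypothetical output path and dimensions; adjust to taste
out_file <- file.path(tempdir(), "points.png")
ggsave(filename = out_file, plot = g1,
       width = 5, height = 4, units = "in", dpi = 300)
```

A higher dpi gives the circle more pixels to be drawn with, which is why the saved file can look rounder than the on-screen preview.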