Hello,
I have a dataset structured as shown in the link above. I am extremely new to R, and this is probably very easy to do, but I cannot figure out how to plot this dataset using ggplot.
Could anyone guide me and give me some hints?
I basically want to color the lines by socioeconomic level and plot each year's value.
You need to reshape your data into long format before plotting it with ggplot.
library(reshape)
library(dplyr)
library(ggplot2)
df_long <- melt(df) # reshape the dataframe to a long format
df_long %>%
ggplot( aes(x=variable, y=value, group=group, color=group)) +
geom_line()
Note: You will get better answers if you post your code with a reproducible dataset.
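Since the original data was not posted, here is a minimal sketch with made-up numbers. The column names (group, 2010, 2011, 2012) and the values are assumptions for illustration only, and it uses tidyr::pivot_longer() as an alternative to melt():

```r
library(tidyr)
library(ggplot2)

# Made-up wide data: one row per socioeconomic level, one column per year
df <- data.frame(
  group  = c("low", "middle", "high"),
  `2010` = c(1.2, 2.5, 3.9),
  `2011` = c(1.4, 2.7, 4.1),
  `2012` = c(1.3, 3.0, 4.6),
  check.names = FALSE  # keep the year column names as they are
)

# Long format: one row per (group, year) combination
df_long <- pivot_longer(df, cols = -group, names_to = "year", values_to = "value")
df_long$year <- as.integer(df_long$year)

# One line per socioeconomic level, colored by level
p <- ggplot(df_long, aes(x = year, y = value, group = group, color = group)) +
  geom_line() +
  geom_point()
print(p)
```

Converting year to a number keeps the x-axis in chronological order; leaving it as text would also work here because the years sort alphabetically.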
I am trying to add Wilcoxon statistics to my graph, but stat_compare_means() does not work.
I have tried both ggplot and ggplot2.
library(readxl)
library(dplyr)
library(tidyverse)
library(ggpubr)
library(dplyr)
library(tidyr)
library(ggplot2)
library(Rtsne)
require(ggpubr)
#excel sheet resolution, voxel size comparison
data<-read_excel("res_all.xlsx", sheet="resolution")
# transform to long format using tidyr::gather() (tidyr is included in tidyverse)
data_long <- as_tibble(data) %>%
gather(key, value,-parameter) %>%
mutate(cohort=ifelse(grepl("per",key), "per", "val"))
# plot graph
graph <- ggplot(data_long) +
aes(x=parameter, y=value, fill=cohort)+
geom_boxplot()+
stat_compare_means(method= "wilcox.test")
graph + ggtitle("Resolution comparison")+
theme_minimal()
The error is:
Error in stat_compare_means(method = "wilcox.test") :
could not find function "stat_compare_means"
Is there any other way to add W and p-values to my graph?
Thank you in advance.
[1]: https://i.stack.imgur.com/yfp8E.png
I think you forgot a "+" after theme_minimal().
Also, stat_compare_means() comes from the ggpubr package, not ggplot2, so make sure it is loaded: check that you have library(ggpubr) or require(ggpubr) in your R session. It helps if you include your full code and the output of sessionInfo() for further troubleshooting.
stat_compare_means() was introduced in ggpubr version 0.1.3, so check the installed version with packageVersion("ggpubr") and use lsf.str("package:ggpubr") to list all functions in the attached package.
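The checks above can be sketched in a few lines. pkg_exports() is a hypothetical helper written for this example, not part of ggpubr; it only uses base R functions:

```r
# Hypothetical helper: TRUE if `pkg` is installed and exports a function named `fn`
pkg_exports <- function(pkg, fn) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    return(FALSE)  # package is not installed
  }
  fn %in% getNamespaceExports(pkg)
}

if (pkg_exports("ggpubr", "stat_compare_means")) {
  message("ggpubr ", as.character(packageVersion("ggpubr")),
          " provides stat_compare_means(); attach it with library(ggpubr)")
} else {
  message("stat_compare_means() is not available; try install.packages(\"ggpubr\")")
}
```

If the first message appears but the plot code still fails, the package is probably installed but not attached in the session that runs the plot.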
Here is the t-SNE code using the iris data:
library(Rtsne)
iris_unique <- unique(iris) # Remove duplicates
iris_matrix <- as.matrix(iris_unique[,1:4])
set.seed(42) # Set a seed if you want reproducible results
tsne_out <- Rtsne(iris_matrix) # Run TSNE
# Show the objects in the 2D tsne representation
plot(tsne_out$Y,col=iris_unique$Species)
Which produces this plot:
How can I use ggplot to make that figure?
I think the easiest/cleanest ggplot way would be to store all the info you need in a data.frame and then plot it. From your code pasted above, this should work:
library(ggplot2)
tsne_plot <- data.frame(x = tsne_out$Y[,1], y = tsne_out$Y[,2], col = iris_unique$Species)
ggplot(tsne_plot) + geom_point(aes(x=x, y=y, color=col))
My plot using the regular plot function is:
plot(tsne_out$Y,col=iris_unique$Species)
I have successfully created a very nice boxplot (for my purposes) categorized by a factor and binned, according to the answer in my previous post here:
ggplot: arranging boxplots of multiple y-variables for each group of a continuous x
Now, I would like to customize the x-axis labels according to the number of observations in each boxplot.
require (ggplot2)
require (plyr)
library(reshape2)
set.seed(1234)
x<- rnorm(100)
y.1<-rnorm(100)
y.2<-rnorm(100)
y.3<-rnorm(100)
y.4<-rnorm(100)
df<- (as.data.frame(cbind(x,y.1,y.2,y.3,y.4)))
dfmelt<-melt(df, measure.vars = 2:5)
dfmelt$bin <- factor(round_any(dfmelt$x,0.5))
dfmelt.sum<-summary(dfmelt$bin)
ggplot(dfmelt, aes(x=bin, y=value, fill=variable))+
geom_boxplot()+
facet_grid(.~bin, scales="free")+
labs(x="number of observations")+
scale_x_discrete(labels= dfmelt.sum)
dfmelt.sum only gives me the total number of observations for each bin, not for each boxplot.
The boxplot statistics give me the number of observations for each boxplot.
dfmelt.stat<-boxplot(value~variable+bin, data=dfmelt)
dfmelt.n<-dfmelt.stat$n
But how do I add tick marks and labels for each boxplot?
Thanks, Sina
UPDATE
I have continued working on this. The biggest problem is that in the code above, only one tick mark is provided per facet. Since I also wanted to plot the means for each boxplot, I have used interaction to plot each boxplot individually, which also adds tick marks on the x-axis for each boxplot:
require (ggplot2)
require (plyr)
library(reshape2)
set.seed(1234)
x<- rnorm(100)
y.1<-rnorm(100)
y.2<-rnorm(100)
y.3<-rnorm(100)
y.4<-rnorm(100)
df<- (as.data.frame(cbind(x,y.1,y.2,y.3,y.4)))
dfmelt<-melt(df, measure.vars = 2:5)
dfmelt$bin <- factor(round_any(dfmelt$x,0.5))
dfmelt$f2f1<-interaction(dfmelt$variable,dfmelt$bin)
dfmelt_mean<-aggregate(value~variable*bin, data=dfmelt, FUN=mean)
dfmelt_mean$f2f1<-interaction(dfmelt_mean$variable, dfmelt_mean$bin)
dfmelt_length<-aggregate(value~variable*bin, data=dfmelt, FUN=length)
dfmelt_length$f2f1<-interaction(dfmelt_length$variable, dfmelt_length$bin)
On the side: maybe there is a more elegant way to combine all those interactions. I'd be happy to improve.
ggplot(aes(y = value, x = f2f1, fill=variable), data = dfmelt)+
geom_boxplot()+
geom_point(aes(x=f2f1, y=value),data=dfmelt_mean, color="red", shape=3)+
facet_grid(.~bin, scales="free")+
labs(x="number of observations")+
scale_x_discrete(labels=dfmelt_length$value)
This gives me tick marks for each boxplot, which can potentially be labeled. However, using labels in scale_x_discrete only repeats the first four values of dfmelt_length$value in each facet.
How can that be circumvented?
Thanks, Sina
Look at this answer. It does not put the counts on the axis labels, but it works, and I have used it:
Modify x-axis labels in each facet
You can also do the following, which I have used as well:
library(ggplot2)
df <- data.frame(group=factor(sample(c("a","b","c"),100,replace=TRUE)),x=rnorm(100),y=rnorm(100)*rnorm(100)) # factor() so that levels() works in R >= 4.0
xlabs <- paste(levels(df$group),"\n(N=",table(df$group),")",sep="")
ggplot(df,aes(x=group,y=x,color=group))+geom_boxplot()+scale_x_discrete(labels=xlabs)
This also works
library(ggplot2)
library(reshape2)
library(plyr) # provides ddply()
df <- data.frame(group=sample(c("a","b","c"),100,replace=T),x=rnorm(100),y=rnorm(100)*rnorm(100))
df1 <- melt(df)
df2 <- ddply(df1,.(group,variable),transform,N=length(group))
df2$label <- paste0(df2$group,"\n","(n=",df2$N,")")
ggplot(df2,aes(x=label,y=value,color=group))+geom_boxplot()+facet_grid(.~variable)
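The same idea can be applied to the binned data from the question. This is only a sketch: it swaps ddply() for dplyr, replaces round_any(x, 0.5) with the equivalent round(x / 0.5) * 0.5, and n_label is a column name made up for this example:

```r
library(ggplot2)
library(reshape2)
library(dplyr)

set.seed(1234)
df <- data.frame(x = rnorm(100), y.1 = rnorm(100), y.2 = rnorm(100),
                 y.3 = rnorm(100), y.4 = rnorm(100))
dfmelt <- melt(df, measure.vars = 2:5)

# Bin x to the nearest 0.5 (same result as plyr::round_any(x, 0.5))
dfmelt$bin <- factor(round(dfmelt$x / 0.5) * 0.5)

# Count observations per boxplot (bin x variable) and store them in a label
dfmelt <- dfmelt %>%
  group_by(bin, variable) %>%
  mutate(n_label = paste0(variable, "\n(n=", n(), ")")) %>%
  ungroup()

# Use the label as the x variable so every boxplot gets its own tick and count
p <- ggplot(dfmelt, aes(x = n_label, y = value, fill = variable)) +
  geom_boxplot() +
  facet_grid(. ~ bin, scales = "free_x") +
  labs(x = "number of observations")
print(p)
```

Because the counts are part of the data rather than passed to scale_x_discrete(), each facet shows its own labels instead of repeating the first four values.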
My question is related to Andrie's answer to my earlier question. Is it possible to display the variable labels and car labels under the corresponding segments of the dendrogram?
library(ggplot2)
library(ggdendro)
data(mtcars)
x <- as.matrix(scale(mtcars))
dd.row <- as.dendrogram(hclust(dist(t(x))))
ddata_x <- dendro_data(dd.row)
p2 <- ggplot(segment(ddata_x)) +
geom_segment(aes(x=x0, y=y0, xend=x1, yend=y1))
print(p2)
Make sure you have version 0.0-7 of ggdendro and then use the convenience function ggdendrogram:
library(ggplot2)
library(ggdendro)
ggdendrogram(dd.row)
If you want full control over how the labels are displayed, you can extract and manipulate these from ddata_x using either:
ddata_x$labels
label(ddata_x)
To add to your plot:
p2 + geom_text(data=label(ddata_x), aes(label=text, x=x, y=0))
You can find more information in the vignette, vignette("ggdendro")
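The code in this thread matches the column naming of ggdendro 0.0-7 (x0, y0, x1, y1 for segments and text for labels). The sketch below assumes a more recent ggdendro release in which, as far as I know, segment() returns x, y, xend, yend and label() returns a column called label; check names(segment(ddata_x)) and names(label(ddata_x)) on your installation before relying on it:

```r
library(ggplot2)
library(ggdendro)

# Cluster the mtcars variables, as in the question
x <- as.matrix(scale(mtcars))
dd.row <- as.dendrogram(hclust(dist(t(x))))
ddata_x <- dendro_data(dd.row)

seg_df <- segment(ddata_x)  # assumed columns: x, y, xend, yend
lab_df <- label(ddata_x)    # assumed columns: x, y, label

# Draw the tree and write each label just below its leaf
p2 <- ggplot(seg_df) +
  geom_segment(aes(x = x, y = y, xend = xend, yend = yend)) +
  geom_text(data = lab_df, aes(x = x, y = 0, label = label),
            angle = 90, hjust = 1, size = 3)
print(p2)
```

Rotating the text and right-aligning it (angle = 90, hjust = 1) keeps long labels from overlapping the branches.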