I am looking to create a plot in R that shows the relative change of some variables between two factors. I would like to stack them to reduce redundant text and make it easy to visually compare the changes between the two factors. I would like it to look something like this: http://postimg.org/image/clmw5zj37/.
where the lines (or bars) represent the relative change (y) in each variable (x), a solid circle (or any other symbol) represents no change, and an asterisk indicates that the change is statistically significant. Does anyone have an idea of how to accomplish this in R?
This?
set.seed(1)
df <- data.frame(x=toupper(letters[1:10]),
y=rnorm(20,0,50),
sig=sample(0:1,20,replace=T),
factor=rep(c("Factor1","Factor2"),each=10))
library(ggplot2)
ggplot(df) +
geom_point(aes(x=x,y=y),shape=1,size=3)+
geom_linerange(aes(x=x,ymin=0,ymax=y))+
geom_text(data=df[df$sig==1,], aes(x=x,y=y+10*sign(y)),label="*",size=10)+
geom_hline(yintercept=0)+
facet_grid(factor~.)
Note that it is considered polite to provide a representative dataset. See this link for the way to formulate a question well.
Edit In response to OP's comment.
To plot points only when y=0, set data=df[df$y==0,] in the call to geom_point(...). Vertical alignment of the stars can be done using vjust= in the call to geom_text(...). So, this code:
set.seed(1)
df <- data.frame(x=toupper(letters[1:10]),
y=rnorm(20,0,50),
sig=sample(0:1,20,replace=T),
factor=rep(c("Factor1","Factor2"),each=10))
df[sample(1:nrow(df),4),2:3]=0 # add some zeros to example
library(ggplot2)
ggplot(df) +
geom_point(data=df[df$y==0,],aes(x=x,y=y),size=5)+
geom_linerange(aes(x=x,ymin=0,ymax=y))+
geom_text(data=df[df$sig==1,], aes(x=x,y=y+10*sign(y)),
label="*", size=10, vjust=+0.65)+
geom_hline(yintercept=0)+
facet_grid(factor~.)
Generates this ggplot:
I'm trying to plot a line graph using the ggplot2 library in R. The plot itself looks fine, but I need to reduce the vertical spacing (height) between the grid rows, because I get a big separation between the lines.
This is my R script:
library(ggplot2)
library(reshape2)
data <- read.csv('/Users/keepo/Desktop/G.Con/Int18/input-int18.csv')
chart_data <- melt(data, id='NRO')
names(chart_data) <- c('NRO', 'leyenda', 'DTF')
ggplot() +
geom_line(data = chart_data, aes(x = NRO, y = DTF, color = leyenda), size = 1)+
xlab("iteraciones") +
ylab("valores")
and this is my current graph:
The first line is very distant from the second. How can I reduce the height?
regards.
The lines are far apart because the values of the variable plotted on the y-axis are far apart. If you need them closer together, you fundamentally have 3 options:
change the scale (e.g. convert the plot to a log scale), although this can make it harder for people to interpret the numbers. This can also change the behavior of each line, not just change the space between the lines. I'm guessing this isn't what you will want, ultimately.
normalize the data. If the actual value of the variable on the y-axis isn't important, just standardize the data (separately for each value of leyenda).
As stated above, you can graph each line separately. The main drawback here is that you need 3 graphs where 1 might do.
Not recommended:
I know that some graphs use a "squiggle" to change scales or skip space. Generally, this is considered poor practice (and I doubt it's an option in ggplot2) because it masks the true separation between the data points. If you really do want a gap, I would look at this post: "axis.break and ggplot2 or gap.plot? plot may be too complex".
In a nutshell, the answer here depends on what your numbers mean. What is the story you are trying to tell? Is the important feature of your plots the change between them (in which case, normalizing might be your best option), or the actual numbers themselves (in which case, the space is relevant).
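As a rough sketch of the three options above, assuming a long-format data frame shaped like the question's chart_data: the numbers below are simulated stand-ins (the original CSV is not available), and DTF_z, p_norm, p_log and p_facet are names made up for this illustration.

```r
library(ggplot2)

# Simulated stand-in for chart_data: three series on very different levels.
set.seed(1)
chart_data <- data.frame(
  NRO     = rep(1:50, times = 3),
  leyenda = rep(c("s1", "s2", "s3"), each = 50),
  DTF     = c(rnorm(50, 3000, 50), rnorm(50, 900, 30), rnorm(50, 800, 20))
)

# Option 1: keep the raw values but draw them on a log10 axis.
p_log <- ggplot(chart_data, aes(x = NRO, y = DTF, color = leyenda)) +
  geom_line() +
  scale_y_log10()

# Option 2: standardize DTF within each series (mean 0, sd 1 per leyenda).
chart_data$DTF_z <- ave(chart_data$DTF, chart_data$leyenda,
                        FUN = function(v) (v - mean(v)) / sd(v))
p_norm <- ggplot(chart_data, aes(x = NRO, y = DTF_z, color = leyenda)) +
  geom_line() +
  ylab("valores (standardized)")

# Option 3: one panel per series, each with its own y scale.
p_facet <- ggplot(chart_data, aes(x = NRO, y = DTF)) +
  geom_line() +
  facet_grid(leyenda ~ ., scales = "free_y")
```

Printing p_log, p_norm or p_facet draws the corresponding plot. Only the standardized version discards the original units, so it fits best when the shape of each series matters more than its level.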
you could use an axis transformation that maps your data to the screen in a non-linear fashion,
fun_trans <- function(x){
d <- data.frame(x=c(800, 2500, 3100), y=c(800,1950, 3100))
model1 <- lm(y~poly(x,2), data=d)
model2 <- lm(x~poly(y,2), data=d)
scales::trans_new("fun",
function(x) as.vector(predict(model1,data.frame(x=x))),
function(x) as.vector(predict(model2,data.frame(y=x))))
}
last_plot() + scale_y_continuous(trans = "fun")
I have a dataset:
a<-c(1,2,3,4,5,6,7,8,9,10)
b<-c(2,2,2,2,4,5,6,8,4,1)
c<-c("red","red","red","blue","blue","blue","orange","orange","orange","orange")
data<-data.frame(a=a,b=b,c=c)
I now want to plot the data on a graph with each group having a different colour:
plot(a[c=="red"],b[c=="red"],col="red",xlim=c(min(a),max(a)),ylim=c(min(b),max(b)))
points(a[c=="blue"],b[c=="blue"],col="blue")
points(a[c=="orange"],b[c=="orange"],col="orange")
This works fine - however, say if I have 30 groups, the task of writing the code becomes tedious. I am wondering if there is a better way of writing the code such that R will automatically plot the graph and give different colours to different groups?
Also, I wonder if there is a quick way to display a legend in the graph.
Thank you for all your help.
Try this:
with(data,plot(a,b,col=c))
The col argument in plot() stands for color. This can contain a vector of the colors you want.
Additionally, you don't have to make a column just to define the color if the color-group relationship is not that important. For example, you could make column c a more meaningful column like this:
a<-c(1,2,3,4,5,6,7,8,9,10)
b<-c(2,2,2,2,4,5,6,8,4,1)
c<-c(rep('Group1',3),rep('Group2',3),rep('Group3',4))
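# Store c as a factor: since R 4.0 data.frame() keeps strings as character,
# and col= and levels() below both need a factor
data<-data.frame(a=a,b=b,c=factor(c))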
Then to plot, use:
with(data,plot(a,b,col=c))
To add a legend:
legend('topleft',legend = levels(data[,'c']),col=1:nlevels(data[,'c']),pch=1)
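With many groups (the question mentions 30), one caveat is that the default base palette has only 8 colours, so integer colour codes above 8 are recycled. A minimal sketch of one way around this, assuming simulated data and using rainbow() purely as an example palette (dat30, cols and n_groups are names invented here):

```r
set.seed(42)
n_groups <- 30
group_names <- paste0("Group", 1:n_groups)

# Simulated data with 30 groups; levels are set explicitly so that all
# 30 groups exist even if one is never sampled.
dat30 <- data.frame(
  a = runif(300),
  b = runif(300),
  c = factor(sample(group_names, 300, replace = TRUE), levels = group_names)
)

# One distinct colour per factor level, indexed by the integer level code.
cols <- rainbow(nlevels(dat30$c))
plot(dat30$a, dat30$b, col = cols[as.integer(dat30$c)], pch = 16)

# Multi-column legend so 30 entries stay compact.
legend("topright", legend = levels(dat30$c), col = cols, pch = 16,
       ncol = 3, cex = 0.6)
```

Indexing the colour vector by as.integer() of the factor keeps the points and the legend in the same order, since both follow levels(dat30$c).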
Try ggplot2
library(ggplot2)
ggplot(data=data, aes(x=a, y=b, colour=c)) + geom_point()
I'm very new to R and have tried to search around for an answer to my question, but couldn't find quite what I was looking for (or I just couldn't figure out the right keywords to include!). I think this is a fairly common task in R though, I am just very new.
I have an x vs y scatterplot and I want to color those points for which there is at least a 2-fold enrichment, i.e. where x/y>=2. Since my values are expressed as log2 values, the transformed value needs to be x/y>=4.
I currently have the scatterplot plotted with
plot(log2(counts[,40]), log2(counts[,41]))
where counts is an imported .csv file and 40 & 41 are my columns of interest.
I've also created a column for fold change using
counts$fold<-counts[,41]/counts[,40]
I don't know how to incorporate these two pieces of information... Ultimately I want a graph that looks something like the example here: http://s17.postimg.org/s3k1w8r7j/error_messsage_1.png
where those points that are at least two-fold enriched will be colored in blue.
Any help would be greatly appreciated. Thanks!
Is this what you're looking for:
# Fake data
dat = data.frame(x=runif(100,0,50), y = rnorm(100, 10, 2))
plot(dat$x, dat$y, col=ifelse(dat$x/dat$y > 4, "blue", "red"), pch=16)
The ifelse statement creates a vector of "blue" and "red" (or whatever colors you want) based on the values of dat$x/dat$y and plot uses that to color the points.
This might be helpful if you've never worked with colors in R.
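Applied to the question's setup, the same idea might look like the sketch below. It assumes simulated counts (the real file is not available) and hypothetical column names ctrl and trt in place of columns 40 and 41. Note that on a log2 scale a 2-fold ratio corresponds to a difference of 1, because log2(trt/ctrl) = log2(trt) - log2(ctrl).

```r
# Simulated stand-in for the imported counts table.
set.seed(1)
ctrl <- rpois(200, 50) + 1                 # +1 avoids log2(0)
counts <- data.frame(ctrl = ctrl,
                     trt  = ctrl * 2^rnorm(200, 0, 1))

# Fold change as defined in the question (second column over first).
counts$fold <- counts$trt / counts$ctrl

lx <- log2(counts$ctrl)
ly <- log2(counts$trt)

# At least 2-fold enriched; mathematically the same as (ly - lx) >= 1.
enriched <- counts$fold >= 2

plot(lx, ly, pch = 16, col = ifelse(enriched, "blue", "grey50"),
     xlab = "log2(ctrl)", ylab = "log2(trt)")
abline(a = 1, b = 1, lty = 2)              # fold change exactly 2
```

Points above the dashed line are the enriched ones, so the line doubles as a visual check that the colouring threshold is the intended one.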
Another option is to use ggplot2 instead of base graphics. Here's an example:
library(ggplot2)
ggplot(dat, aes(x,y, colour=cut(x/y, breaks=c(-1000,4,1000),
labels=c("<=4",">4")))) +
geom_point(size=5) +
labs(colour="x/y")
I have several data series and I need to plot them compactly in one picture, like this:
I already tried par(), layout() and ggplot(), but the plots are displayed too far from each other.
I need them to be very close, as if they were in the same plot with a different y offset (e.g. plot1 y=0, plot2 y=1, plot3 y=3 and so on).
Can someone help me?
That can also be achieved using layout(), but maybe an easier approach is to set the graphical parameters in a suitable way.
The function par() lets you specify the number of panels in a single figure using the argument mfrow. It takes a vector of two numbers that specify the number of sub-figure rows and columns. For example, c(2,1) would create two rows of figures but only a single column. That's what is in your example figure. You can change the number of figure rows to the number of sub-figures you would like to plot vertically.
In addition, the margins around each sub-figure can be set using the argument mar. The margins are specified in the order of 1. bottom, 2. left, 3. top, and 4. right. Making the bottom and top margins smaller would draw your sub-figures closer together.
In R this could look something like the following:
# Simulate some random data
a<-runif(10000)
b<-runif(10000)
# Open a new plot window
# width: 7 inches, height: 1 inch
x11(width=7, height=1)
# Specify the number of sub-figures
# Specify the margins (top and bottom are 0.1, left and right are 2)
# Needs some experimenting with to get these right
par(mfrow=c(2,1), mar=c(0.1,2,0.1,2))
# Plot the figures
barplot(a)
barplot(b)
The resulting figure should roughly resemble this:
Here is a ggplot version using facet_grid:
library(ggplot2)
df <- data.frame(a=runif(3e3), b=rep(letters[1:3], 1e3), c=rep(1:1e3, 3))
ggplot(df, aes(y=a, x=c)) + geom_bar(stat="identity") + facet_grid(b ~ .)
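If the default gap between facets is still too wide, one possible tweak is to shrink the panel spacing through the theme. This is a sketch, assuming ggplot2 2.2.0 or later (where the panel.spacing theme element and geom_col() are available) and a smaller simulated data set than above.

```r
library(ggplot2)

set.seed(1)
df <- data.frame(a = runif(300),
                 b = rep(letters[1:3], 100),
                 c = rep(1:100, 3))

p <- ggplot(df, aes(x = c, y = a)) +
  geom_col() +                                   # same as geom_bar(stat = "identity")
  facet_grid(b ~ .) +
  theme(panel.spacing = grid::unit(0, "lines"),  # no gap between stacked panels
        strip.text.y  = element_text(angle = 0)) # horizontal panel labels
p
```

A spacing of exactly zero makes the panels touch; a small positive value such as 0.1 lines keeps a thin separator between them.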
Is it possible to extract specific sections of a ggplot figure/map and place them side by side in a secondary figure, but still add points to the three frames as if they were still one plot? That is, for the following map,
create a map split into 3 sections which can then be manipulated as one graph (i.e. adding points to all three sections of the graph simultaneously).
UPDATE: Reproducible example
set.seed(1)
dfx<-c(sample(1:1000,100),sample(2000:3000,100),sample(4000:3000,100))
dfy<-c(sample(1:1000,100),sample(2000:3000,100),sample(4000:3000,100))
p<-ggplot()+
coord_fixed()+
geom_point(aes(x=dfx,y=dfy))
p
I can get partway there but can't retain the effects of coord_equal or coord_fixed while allowing free scales ... hopefully someone else can step in and get the rest of the way. (This has come up before -- scatterplot with equal axes -- but I haven't seen a solution.)
dd <- data.frame(dfx,dfy)
dd2 <- transform(dd,panel=cut(dfx,seq(0,4000,by=1000),labels=1:4))
p <- ggplot(dd2)+geom_point(aes(dfx,dfy)) + coord_equal()
p + facet_wrap(~panel,nrow=1,scales="free")