I have a bunch of data that looks like this:
Track X1 X Y
1 Point 1 147.8333 258.5000
2 Point 2 148.5000 258.8333
3 Point 3 151.1667 260.8333
4 Point 4 154.5000 264.5000
5 Point 5 158.1667 266.5000
6 Point 6 161.5000 269.5000
I want to plot a heatmap of this, so a nice looking graph labelled x and y for the position coordinates, with a gradient color fill indicating the frequency that a particular point showed up, with a scale indicator showing what the colors mean. I'm looking for a simple gradient fill with a single color low and high.
I've been at this for a while but I think the first step should be to construct another data-set with the positions and a new column showing the frequencies? But I'm not 100% sure how to structure this.
So far my attempts look similar to:
ggplot(data=all_data, aes(x=X, y=Y)) + geom_tile(aes(fill=all_data$X)) +
scale_fill_gradient2(low="green", high="blue") + coord_equal()
As Jon Spring suggested, the following code produces a graph like this:
all_data <- read.table(text = "
Track X1 X Y
1 Point 1 147.8333 258.5000
2 Point 2 148.5000 258.8333
3 Point 3 151.1667 260.8333
4 Point 4 154.5000 264.5000
5 Point 5 158.1667 266.5000
6 Point 6 161.5000 269.5000
", header = T, row.names = NULL)
library(ggplot2)
ggplot(data=all_data, aes(x=X, y=Y)) + geom_bin2d()
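On the question of building a separate data set with a frequency column, here is a minimal sketch. It assumes positions are rounded to a grid of width 1 before counting (the bin width, the helper columns Xbin/Ybin/n, and the duplicated seventh sample row are all illustrative assumptions, not from the original data). It also uses scale_fill_gradient(), the two-color scale, rather than scale_fill_gradient2(), which is a diverging scale with a midpoint.

```r
library(ggplot2)

# Sample in the same shape as the question's X/Y columns; the last row is a
# hypothetical repeat so that one cell has a count above 1.
all_data <- data.frame(
  X = c(147.8333, 148.5000, 151.1667, 154.5000, 158.1667, 161.5000, 148.5000),
  Y = c(258.5000, 258.8333, 260.8333, 264.5000, 266.5000, 269.5000, 258.8333)
)

# Snap each position to a grid cell (bin width 1 is an assumption).
all_data$Xbin <- round(all_data$X)
all_data$Ybin <- round(all_data$Y)

# One row per occupied cell, with n = number of points that fell in it.
freq <- aggregate(n ~ Xbin + Ybin, data = transform(all_data, n = 1), FUN = sum)

# Tiles filled by frequency, with a single low/high gradient and a legend.
p <- ggplot(freq, aes(x = Xbin, y = Ybin, fill = n)) +
  geom_tile() +
  scale_fill_gradient(low = "green", high = "blue", name = "Frequency") +
  labs(x = "X", y = "Y") +
  coord_equal()
print(p)
```

geom_bin2d() does the same counting internally, so adding the scale_fill_gradient() call to the geom_bin2d() plot should give a similar result without the intermediate data frame.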
I'm trying to create a stacked bar plot to indicate when requests were made for resources on a website. I would like to use a gradient legend to describe when the requests were made. How can I create a gradient legend, and would that be the right way to visualize this time-domain data?
> head(livePostHit)
path date hits
1 /2017/06/27/goog-fit-cal.html 2018/04/01 1
2 /2015/05/24/sqlite-tutorial.html 2018/04/01 1
3 /2016/11/07/coin-freq.html 2018/04/01 1
4 /2017/03/30/alpine-linux.html 2018/04/01 2
5 /2018/03/09/querySelectorAll.html 2018/04/01 1
6 /2017/11/24/fedora-27-rv.html 2018/04/01 1
> ggplot(livePostHit, aes(x = path, y = hits, fill = date)) +
geom_bar(stat='identity') +
theme(axis.text = element_text(angle=75, hjust = 1),
legend.position = 'none')
I turned off the legend because there were too many groups for it to render correctly, but I would like to create a gradient running from the highest to the lowest.
I just needed to make date an actual date type rather than a factor.
livePostHit$date <- as.Date(livePostHit$date)
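A minimal sketch of that fix in context, assuming the dates are stored as a factor in the "YYYY/MM/DD" form shown above. The paths and hit counts here are hypothetical stand-ins for the real log. Once date is a Date, ggplot2 treats fill as continuous and draws a gradient colorbar instead of one legend key per day.

```r
library(ggplot2)

# Hypothetical stand-in for livePostHit, with date stored as a factor.
livePostHit <- data.frame(
  path = c("/a.html", "/b.html", "/a.html", "/b.html"),
  date = factor(c("2018/04/01", "2018/04/01", "2018/04/02", "2018/04/03")),
  hits = c(1, 2, 3, 1)
)

# Convert via character with an explicit format so the parse is unambiguous.
livePostHit$date <- as.Date(as.character(livePostHit$date), format = "%Y/%m/%d")

# Stacked bars per path; the continuous fill produces a gradient legend.
p <- ggplot(livePostHit, aes(x = path, y = hits, fill = date, group = date)) +
  geom_col() +
  theme(axis.text.x = element_text(angle = 75, hjust = 1))
print(p)
```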
I am trying to plot the expression of multiple genes over time in the same graph to show that they follow a similar profile, and then add a line showing the mean across all genes at each timepoint (like figure 4b in a recent Nature Communications article, https://www.nature.com/articles/s41467-017-02546-5/figures/4). My data has been normalised to be around 0, so the genes are all on the same scale.
df2 sample:
variable value gene
1 5 -0.610384193 1
2 5 -6.25967087 2
3 5 -3.773389731 3
50 6 -0.358879035 1
51 6 -6.066341017 2
52 6 -4.202998579 3
99 7 -0.103885903 1
100 7 -6.648844687 2
101 7 -5.041554127 3
I plot the expression levels with ggplot2:
plotC <- ggplot(df2, aes(x=variable, y=value, group=factor(gene), colour=gene)) + geom_line(size=0.5, aes(color=gene), alpha=0.4)
But adding the mean line in red to this plot is proving difficult. I calculated the means and put them in another dataframe:
means
value variable gene
1 -1.5037354 5 50
2 -0.8783492 6 50
3 -0.7769085 7 50
Then tried adding them as another layer:
plotC + geom_line(data=means, aes(x=variable, y=value, color="red", group=factor(gene)), size=0.75)
But I get this error: Error: Discrete value supplied to continuous scale
Do you have any suggestions as to how I can plot this mean on the same graph in another color?
Thank you,
Anna
Edit: the answer by RG20 is helpful; thanks for pointing out that I had the color in the wrong place. However, it plots the line outside the rest of the graph, and I don't understand what's wrong with my graph.
plotC + geom_line(data=means, aes(x=variable, y=value, group=factor(gene)), color='red',size=0.75)
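One way to sidestep both problems is sketched below, using only the nine sample rows shown above. It keeps colour = gene out of the top-level aes() so the mean layer does not inherit a continuous colour mapping, and it computes the means from df2 itself so that variable has the same type in both layers. Whether a type mismatch (for example a factor in one data frame and a number in the other) is what pushed the line outside the plot is an assumption; the question does not show the column types.

```r
library(ggplot2)

# The sample rows from the question.
df2 <- data.frame(
  variable = rep(c(5, 6, 7), each = 3),
  value = c(-0.610384193, -6.25967087, -3.773389731,
            -0.358879035, -6.066341017, -4.202998579,
            -0.103885903, -6.648844687, -5.041554127),
  gene = rep(1:3, times = 3)
)

# Mean per timepoint, computed from the same data frame as the gene lines.
means <- aggregate(value ~ variable, data = df2, FUN = mean)

# Gene lines carry the colour mapping; the mean line sets a constant colour
# outside aes(), so no discrete value reaches the continuous colour scale.
plotC <- ggplot(df2, aes(x = variable, y = value)) +
  geom_line(aes(group = gene, colour = gene), alpha = 0.4) +
  geom_line(data = means, colour = "red")
print(plotC)
```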
I am looking for a way to connect data points from top to bottom in order to visualize a ranking. The y-axis represents the rank and the x-axis the attributes. With the default settings the line connects the points from left to right, so the points are joined in the wrong order.
With the data below, the line should run from (6,1) to (4,2), then to (5,3), and so on. Ideally the rank scale should also be inverted so that rank one is at the top.
data <- read.table(header=TRUE, text='
attribute rank
1 6
2 5
3 4
4 2
5 3
6 1
7 7
8 11
9 10
10 8
11 9
')
plot(data$attribute,data$rank,type="l")
Is there a way to change the line drawing direction? My second idea would be to rotate the graph or maybe you have better ideas.
The graph I am trying to achieve is somewhat similar to this one:
example vertical line chart
You can do this with ggplot:
library(ggplot2)
ggplot(data, aes(y = attribute, x = rank)) +
geom_line() +
coord_flip() +
scale_x_reverse()
It solves the problem exactly the way you suggested. The first part of the command (ggplot(...) + geom_line()) creates an "ordinary" line plot. Note that I have already switched the x- and y-coordinates. The next command (coord_flip()) flips the x- and y-axes, and the last one (scale_x_reverse()) reverses the ordering of the x-axis (which is plotted as the y-axis) so that 1 is in the top left corner.
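For comparison, a base R sketch of the same idea, since the question started from plot(). Base graphics connect points in row order, so sorting the rows by rank makes the line go (6,1), (4,2), (5,3), and so on; reversed ylim puts rank 1 at the top. The name ordered is only illustrative.

```r
data <- read.table(header = TRUE, text = '
attribute rank
1 6
2 5
3 4
4 2
5 3
6 1
7 7
8 11
9 10
10 8
11 9
')

# Sort rows by rank so the line is drawn in rank order, not attribute order.
ordered <- data[order(data$rank), ]

# type = "o" draws points and connecting lines; the reversed ylim inverts
# the rank axis so that rank 1 appears at the top.
plot(ordered$attribute, ordered$rank, type = "o",
     ylim = rev(range(ordered$rank)),
     xlab = "attribute", ylab = "rank")
```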
Just to show you that something like the example you linked in your question can be done with ggplot2, I add the following example:
library(tidyr)
data$attribute2 <- sample(data$attribute)
data$attribute3 <- sample(data$attribute)
plot_data <- pivot_longer(data, cols = -"rank")
ggplot(plot_data, aes(y = value, x = rank, colour = name)) +
geom_line() +
geom_point() +
coord_flip() +
scale_x_reverse()
If you intend to do your plots with R, learning ggplot2 is really worthwhile. You can find many examples on Cookbook for R.
I'm sorry to ask such a dumb question, but I can't find an answer...
So I have this table called "linearsep":
color x y
1 red 1 1
2 red 1 3
3 red 3 3
4 red 2 4
5 blue 4 1
6 blue 6 3
7 blue 2 -2
8 blue 6 -1
Each line corresponds to a point ((1,1), (1,3), etc.). I just want to plot the "red" points in red and the "blue" points in blue.
I know this is pretty dumb, but I just can't find a way to get a vector with the first four lines.
I thought it was something like that:
plot(linearsep$color~x, linearsep$color~y)
but obviously it doesn't work ...
I've tested some stuff with ggplot:
ggplot(data=a,
+ aes(x=x, y=y, colour=color)) + geom_point()
Which works, but seems like a 'hack'. How can I just get the vector I want?
Could someone please help me? Again, sorry for such a dumb question.
Here's the complete call
p <- ggplot(foo, aes(x,y)) +
geom_point(aes(colour = color)) +
scale_colour_manual(values = c(red = "red", blue = "blue"))
now you can do
p
or
print(p)
By assigning the ggplot to p, you can add more layers later
by just doing p + ggtitle("Plot Title"), for instance. This is easier than
typing out everything again.
To get only the blue data, or rows matching any other condition, you can subset and assign
the result to a new data.frame, or do it within the ggplot call
ggplot(subset(a, color == "blue"), aes(.....
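To answer the "how do I just get the rows I want" part directly, here is a small base R sketch. It rebuilds linearsep from the table in the question; the names red_pts and blue_pts are illustrative. Logical indexing and subset() are two equivalent ways to pick the rows.

```r
# The table from the question.
linearsep <- data.frame(
  color = c("red", "red", "red", "red", "blue", "blue", "blue", "blue"),
  x = c(1, 1, 3, 2, 4, 6, 2, 6),
  y = c(1, 3, 3, 4, 1, 3, -2, -1)
)

# Rows where color is "red", via logical indexing.
red_pts <- linearsep[linearsep$color == "red", ]

# Rows where color is "blue", via subset().
blue_pts <- subset(linearsep, color == "blue")

# Draw the red points first, using axis limits that cover all the data,
# then add the blue points to the same plot.
plot(red_pts$x, red_pts$y, col = "red", pch = 16,
     xlim = range(linearsep$x), ylim = range(linearsep$y),
     xlab = "x", ylab = "y")
points(blue_pts$x, blue_pts$y, col = "blue", pch = 16)
```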
In base R:
with(linearsep,plot(x,y,col=color,pch=16))
will give you two colors. If linearsep$color is a factor (the default for read.table() and data.frame() before R 4.0.0), then the colors will be red and black, because plot(...) uses the factor's underlying integer codes (blue = 1, red = 2) as palette indices, not the labels. You can get around this by converting the factor to character.
linearsep$color <- as.character(linearsep$color)
with(linearsep,plot(x,y,col=color,pch=16))
Now the colors will be red and blue.
I have two functions, a and b, that each take a value of x from 1-3 and produce an estimate and an error.
x variable estimate error
1 a 8 4
1 b 10 2
2 a 9 3
2 b 10 1
3 a 8 5
3 b 11 3
I'd like to use geom_path() in ggplot to plot the estimates and errors for each function as x increases.
So if this is the data:
d = data.frame(x=c(1,1,2,2,3,3),variable=rep(c('a','b'),3),estimate=c(8,10,9,10,8,11),error=c(4,2,3,1,5,3))
Then the output that I'd like is something like the output of:
ggplot(d,aes(x,estimate,color=variable)) + geom_path()
but with the thickness of the line at each point equal to the size of the error. I might need to use something like geom_polygon(), but I haven't been able to find a good way to do this without calculating a series of coordinates manually.
If there's a better way to visualize this data (y value with confidence intervals at discrete x values), that would be great. I don't want to use a bar graph because I actually have more than two functions and it's hard to track the changing estimate/error of any specific function with a large group of bars at each x value.
The short answer is that you need to map size to error, so that the size of the geometric object varies with the value of error. As you suggested, there are several ways to do this.
df = data.frame(x = c(1,1,2,2,3,3),
variable = rep(c('a','b'), 3),
estimate = c(8,10,9,10,8,11),
error = c(4,2,3,1,5,3))
library(ggplot2)
ggplot(df, aes(x, estimate, colour = variable, group = variable, size = error)) +
geom_point() + theme(legend.position = 'none') + geom_line(size = .5)
I found geom_ribbon(). The answer is something like this:
ggplot(d,aes(x,estimate,ymin=estimate-error,ymax=estimate+error,fill=variable)) + geom_ribbon()
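A slightly fuller sketch of the ribbon approach, using the data from the question. The alpha value is an arbitrary choice; making the ribbons semi-transparent keeps overlapping bands readable, and a geom_line() layer on top shows the estimate itself.

```r
library(ggplot2)

d <- data.frame(
  x = c(1, 1, 2, 2, 3, 3),
  variable = rep(c("a", "b"), 3),
  estimate = c(8, 10, 9, 10, 8, 11),
  error = c(4, 2, 3, 1, 5, 3)
)

# Ribbon spans estimate +/- error for each function; colour = NA removes the
# ribbon outline so only the estimate line is drawn in the group colour.
p <- ggplot(d, aes(x, estimate, colour = variable, fill = variable)) +
  geom_ribbon(aes(ymin = estimate - error, ymax = estimate + error),
              alpha = 0.3, colour = NA) +
  geom_line()
print(p)
```

With more than a handful of functions the bands may still crowd each other; replacing the ribbon with geom_errorbar() or geom_pointrange() on the same ymin/ymax mapping is another option at discrete x values.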