I have a data frame with the following structure
df <- data.frame(Build = rep(2000:2003, each = 4),
Year = rep(2000:2003, each = 4) + 1:4, val = sort(rnorm(16)))
I would like to generate a ggplot bar plot for this data frame, using Build as x-coordinate and Year as y-coordinate, adding a gradient fill for val.
I have tried the following
ggplot(df, aes(x = Build, y = Year, fill = val)) + geom_bar(stat = "identity")
But this is what I get
What I want to see on the y-axis is the range of values that the Year variable takes for each value of Build, while preserving the color-gradient representation for val; instead, what I see on the y-axis is a quantity that is not related to what I have in my data frame (the sum of the values for Year?).
Could someone please point me in the right direction?
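A note on what the plot shows: geom_bar(stat = "identity") stacks all rows that share an x value, so each bar's height is the sum of the Year values for that Build. Below is a minimal sketch that checks this and draws one tile per (Build, Year) pair instead. The seed and the choice of geom_tile are assumptions made for illustration, not something from the question.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed so the example is reproducible
df <- data.frame(Build = rep(2000:2003, each = 4),
                 Year  = rep(2000:2003, each = 4) + 1:4,
                 val   = sort(rnorm(16)))

# Stacked bars add up y within each x, which explains the large y-axis values
stacked_height <- tapply(df$Year, df$Build, sum)

# One tile per (Build, Year) pair, filled by val, keeps Year on the y-axis
p <- ggplot(df, aes(x = Build, y = Year, fill = val)) +
  geom_tile(colour = "white") +
  scale_fill_gradient(low = "lightblue", high = "darkblue")
```

Printing p draws the tiles; stacked_height holds the per-Build totals that the original bar plot displayed.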
I have been working on plotting several lines according to different probability levels and am stuck adding labels to each line to represent the probability level.
Since each curve plotted has varying x and y coordinates, I cannot simply have a large data-frame on which to perform usual ggplot2 functions.
The end goal is to have each line with a label next to it according to the p-level.
What I have tried:
To access the data comfortably, I have created a list df with for example 5 elements, each element containing a nx2 data frame with column 1 the x-coordinates and column 2 the y-coordinates. To plot each curve, I create a for loop where at each iteration (i in 1:5) I extract the x and y coordinates from the list and add the p-level line to the plot by:
plot = plot +
geom_line(data=df[[i]],aes(x=x.coor, y=y.coor),color = vector_of_colors[i])
where vector_of_colors contains varying colors.
I have looked at using ggrepel and its geom_label_repel() or geom_text_repel() functions, but being unfamiliar with ggplot2 I could not get it to work. Below is a simplification of my code so that it may be reproducible. I could not include an image of the actual curves I am trying to add labels to since I do not have 10 reputation.
# CREATION OF DATA
plevel0.5 = cbind(c(0,1),c(0,1))
colnames(plevel0.5) = c("x","y")
plevel0.8 = cbind(c(0.5,3),c(0.5,1.5))
colnames(plevel0.8) = c("x","y")
data = list(data1 = as.data.frame(plevel0.5), data2 = as.data.frame(plevel0.8))
# CREATION OF PLOT
plot = ggplot()
for (i in 1:2) {
plot = plot + geom_line(data=data[[i]],mapping=aes(x=x,y=y))
}
Thank you in advance and let me know what needs to be clarified.
EDIT :
I have now attempted the following :
Using bind_rows(), I have created a single dataframe with columns x.coor and y.coor as well as a column called "groups" detailing the p-level of each coordinate.
This is what I have tried:
plot = ggplot(data) +
geom_line(aes(coors.x,coors.y,group=groups,color=groups)) +
geom_text_repel(aes(label=groups))
But it gives me the following error:
geom_text_repel requires the following missing aesthetics: x and y
I do not know how to specify x and y in the correct way since I thought it did this automatically. Any tips?
Your approach is probably a bit too complicated. As far as I understand it, you could work with a single dataset and use the group aesthetic to get the same result you are trying to achieve with your for loop and multiple geom_line calls. To this end I use dplyr::bind_rows to bind your datasets together. Whether ggrepel is needed depends on your real dataset. In my code below I simply use geom_text to add a label at the rightmost point of each line:
plevel0.5 <- data.frame(x = c(0, 1), y = c(0, 1))
plevel0.8 <- data.frame(x = c(0.5, 3), y = c(0.5, 1.5))
library(dplyr)
library(ggplot2)
data <- list(data1 = plevel0.5, data2 = plevel0.8) |>
bind_rows(.id = "id")
ggplot(data, aes(x = x, y = y, group = id)) +
geom_line(aes(color = id)) +
geom_text(data = ~ group_by(.x, id) |> filter(x %in% max(x)), aes(label = id), vjust = -.5, hjust = .5)
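If ggrepel turns out to be needed, the "missing aesthetics: x and y" error from the edit can be avoided by declaring x and y in the top-level ggplot() call, because aesthetics set inside geom_line() are not inherited by other layers. A minimal sketch, assuming the same two toy curves; the names curves and label_pts are made up for illustration.

```r
library(dplyr)
library(ggplot2)
library(ggrepel)

# Bind the curves into one data frame; "groups" holds the p-level
curves <- bind_rows(
  list(`0.5` = data.frame(x = c(0, 1),   y = c(0, 1)),
       `0.8` = data.frame(x = c(0.5, 3), y = c(0.5, 1.5))),
  .id = "groups"
)

# Keep one row per curve (its rightmost point) to carry the label
label_pts <- curves |>
  group_by(groups) |>
  slice_max(x, n = 1, with_ties = FALSE) |>
  ungroup()

# x and y are mapped in ggplot(), so every layer inherits them
p <- ggplot(curves, aes(x = x, y = y, group = groups, color = groups)) +
  geom_line() +
  geom_text_repel(data = label_pts, aes(label = groups), show.legend = FALSE)
```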
I'm making a plot where several data points have the same coordinates. By default, the labels all overlap, but using geom_text_repel with direction = "y", I can vertically space them out.
However, every time I generate the plot, it chooses a new order for the labels. I would like them to be ordered based on a value.
I have tried:
using "arrange" to order the dataframe in the order that I want to see the labels (this seems to have no effect)
Trying to use "nudge_y" to re-arrange the labels in the order I want them. This seems to change the plot - it does "nudge" them - but it does NOT nudge them into the correct order!
Here is sample code to recreate the problem. Basically, I want the final plot to be ordered by the "order" value - so, for the three datapoints on "10", the order should be Ayala, Zoe, JL, and for the two datapoints on "5", the order should be Raph, Oona.
I've color-coded the plot to make it obvious what order they should be in - for each value, the lightest blue should be on top, and the darkest should be on the bottom.
library(tidyverse)
library(ggrepel)
name <- c("Oona","Sam","Raph", "JL", "Zoe","Ayala")
year <- rep(c("2016"),6)
value <- c(5,15,5,10,10,10) #The value I'm plotting
order <- c(5,-10,10,-5,0,5) #The value I want to order the labels by
test_df <- bind_cols(name = name, year = year, value = value, order = order) %>%
arrange(-value, -order) #arranging the df doesn't seem to affect the order on the plot at all, I just do it so I can easily preview the df in the correct order
ggplot(data = test_df, aes(x = year, y = value, group = name)) +
geom_point(aes(color = order)) +
geom_text_repel(data = test_df,
aes(label = name, color = order),
hjust = "left",
nudge_y = order, #This is where I'm trying to "nudge" them into the right order
nudge_x = -.45,
direction = "y")
I think the values in your order column were too big for the y-axis scale provided, so geom_text_repel was doing behind-the-scenes work to make it all actually fit, and changed the order of the labels in the process. When I scaled the order column down to one-fifth the sizes you had originally, it worked perfectly.
test_df$order <- test_df$order*1/5
ggplot(data = test_df, aes(x = year, y = value, group = name)) +
geom_point(aes(color = order)) +
geom_text_repel(data = test_df,
aes(label = name, color = order),
hjust = "left",
nudge_y = test_df$order,
nudge_x = -.45,
direction = "y"
)
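One more thing that may be contributing: nudge_y is set outside aes(), so it is matched to the rows by position. In the question, nudge_y = order picks up the standalone order vector, which is still in its original order, while test_df has been rearranged by arrange(). The sketch below only checks that mismatch on the data; it assumes the vectors from the question and uses data.frame() in place of bind_cols().

```r
library(dplyr)

name  <- c("Oona", "Sam", "Raph", "JL", "Zoe", "Ayala")
value <- c(5, 15, 5, 10, 10, 10)
order <- c(5, -10, 10, -5, 0, 5)

test_df <- data.frame(name = name, value = value, order = order) %>%
  arrange(desc(value), desc(order))

# The standalone vector and the arranged column no longer line up row by row
misaligned <- !identical(order, test_df$order)

# Taking the nudge from the data frame keeps it aligned with the labels
nudge <- test_df$order / 5
```

Passing nudge (or test_df$order scaled as in the answer above) to nudge_y keeps each offset attached to the right label.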
I have data from several cells which I tested in several conditions: a few times before and also a few times after treatment. In ggplot, I use color to indicate different times of testing.
Additionally, I would like to connect with lines all data points which belong to the same cell. Is that possible?...
Here is my example data (https://www.dropbox.com/s/eqvgm4yu6epijgm/df.csv?dl=0) and a simplified code for the plot:
df$condition = as.factor(df$condition)
df$cell = as.factor(df$cell)
df$condition <- factor(df$condition, levels = c("before1", "before2", "after1", "after2", "after3"))
windows(width=8,height=5)
ggplot(df, aes(x=condition, y=test_variable, color=condition)) +
labs(title="", x = "Condition", y = "test_variable", color="Condition") +
geom_point(aes(color=condition),size=2,shape=17, position = position_jitter(w = 0.1, h = 0))
I think your code is heading in the wrong direction: you should instead group (and, if you wish, color) the points by the column Cell. Then, if I'm right, you are looking to see the evolution of the variable for each cell before and after a treatment, so you can order the x variable using scale_x_discrete.
Altogether, you can do something like that:
library(ggplot2)
ggplot(df, aes(x = condition, y = variable, group = Cell)) +
geom_point(aes(color = condition))+
geom_line(aes(color = condition))+
scale_x_discrete(limits = c("before1","before2","after1","after2","after3"))
Does it look like what you are expecting?
Data
df = data.frame(Cell = c(rep("13a",5),rep("1b",5)),
condition = rep(c("before1","before2","after1","after2","after3"),2),
variable = c(58,55,36,29,53,57,53,54,52,52))
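As a variation, the x order can also be fixed through the factor levels rather than scale_x_discrete, with the connecting lines drawn in a neutral color so that color still indicates the condition. This is only a sketch on the example data above; the name cond_levels and the grey line color are assumptions for illustration.

```r
library(ggplot2)

df <- data.frame(Cell = c(rep("13a", 5), rep("1b", 5)),
                 condition = rep(c("before1", "before2", "after1", "after2", "after3"), 2),
                 variable = c(58, 55, 36, 29, 53, 57, 53, 54, 52, 52))

# Fix the x-axis order through the factor levels
cond_levels <- c("before1", "before2", "after1", "after2", "after3")
df$condition <- factor(df$condition, levels = cond_levels)

# group = Cell connects the points of each cell; color still encodes condition
p <- ggplot(df, aes(x = condition, y = variable, group = Cell)) +
  geom_line(color = "grey60") +
  geom_point(aes(color = condition), size = 2, shape = 17)
```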
I am new to R and have been trying for a few days to plot histogram / bar chart to view the trend. I have this categorical variable : countryx and coded it into 1,2,3.
I have tried these 2 scripts below and got error messages as follows :
Output 1: blank chart with x and y axis, no stack/bar trend
qplot(DI$countryx,geom = "histogram",ylab = "count",
xlab = "countryx",binwidth=5,colour=I("blue"),fill=I("wheat"))
Output 2: error message- ggplot2 doesn't know how to deal with data of class integer
ggplot(DI$countryX, aes(x=countryx))
+ geom_bar(aes(y=count), stat = "count",position ="stack",...,
width =5,aes=true)
I would appreciate any advice.
Thank you very much for your help!
Multiple problems with your code. ggplot takes a dataframe, not a vector, but you're supplying a vector. Try this
ggplot(DI, aes(x=countryx, y = count)) + geom_col(width = 5)
As @yeedle mentioned, you need a data.frame (maybe use as.data.frame).
How about:
library(ggplot2)
df <- data.frame(countryx = rep(1:3), count = rbinom(3,10,0.3))
p <- ggplot2::ggplot(df, aes(x = countryx, y = count)) + ylab("count")
p + geom_col(aes(x = countryx, fill = factor(countryx)))
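If the data frame holds one row per observation rather than precomputed counts, geom_bar() can do the counting itself. A minimal sketch, assuming a made-up DI with only the coded countryx column; the values are hypothetical.

```r
library(ggplot2)

# Hypothetical raw data: one row per observation, countryx coded as 1, 2, 3
DI <- data.frame(countryx = c(1, 1, 2, 3, 3, 3))

# The same counts that geom_bar() computes, shown here for inspection
counts <- as.data.frame(table(countryx = DI$countryx))

# Treat the code as a category so each value gets its own bar
p <- ggplot(DI, aes(x = factor(countryx))) +
  geom_bar(fill = "wheat", colour = "blue") +
  labs(x = "countryx", y = "count")
```

Wrapping countryx in factor() avoids the bars being placed on a continuous axis.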
I'm currently trying to produce two maps, one with multiple categoric values and one with continous numeric values, as in:
link
I have a dataset which provides the NPA and two pieces of information for each NPA: the item (category) and the Frequency (on a scale from 1 to 10):
NPA item Frequency
1000 huitante 0
1002 huitante 10
1006 quatre-vingt 3
2000 huitante 9
I also have a specific shapefile for the country I work on (Switzerland). In a previous post, I found some interesting code, which I copy here:
# open the shapefile
require(rgdal)
require(rgeos)
require(ggplot2)
ch <- readOGR(work.dir, layer = "PLZO_PLZ")
# convert to data frame for plotting with ggplot - takes a while
ch.df <- fortify(ch)
# generate fake data and add to data frame
ch.df$count <- round(runif(nrow(ch.df), 0, 100), 0)
# plot with ggplot
ggplot(ch.df, aes(x = long, y = lat, group = group, fill = count)) +
geom_polygon(colour = "black", size = 0.3, aes(group = group)) +
theme()
The author gives, in a comment, some information on how to plot a specific dataset (rather than fake data):
# plot just a subset of NPAs using ggplot
my.sub <- ch.df[ch.df$id %in% c(4,6), ]
ggplot(my.sub, aes(x = long, y = lat, group = group, fill = count)) +
geom_polygon(colour = "black", size = 0.3, aes(group = group)) +
theme()
And says in the comment of the post to :
replace ggplot(my.sub, aes(x = long, y = lat, group = group, fill = count))
with ggplot(my.sub, aes(x = long, y = lat, group = group, fill = frequency))
So I guess I need to extract frequency as a variable
frequency <- table(data$frequency)
And change in the code as indicated in the quote.
Unfortunately, it does not work; I get the following message:
Don't know how to automatically pick scale for object of type table.
Defaulting to continuous
Error: Aesthetics must either be length one, or the same length as the
dataProblems:frequency
My questions are :
how can I change the code to include my own data, and plot the numeric value (frequency)
how can I change the code to include my own data, and plot categoric value (item)
I don't need to represent frequency and item on the same map; I just want to know how to create two separate maps.
My dataset is in this file, with as well the shapefile I need to use.
https://www.dropbox.com/sh/5x6r6s2obfztblm/AAAesIOrxn76HU57AIF0y1Oua?dl=0
Any help will be really appreciated!
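A note on the error: table(data$frequency) returns counts per distinct value, so its length does not match the number of rows in the plotting data frame. The fill variable has to be a column of that data frame, which usually means joining your own data onto the polygon data by a shared key. The sketch below illustrates this with two toy squares instead of the real shapefile; the polygon coordinates, the id key and the assumption that polygon ids can be matched to NPA values are all hypothetical.

```r
library(dplyr)
library(ggplot2)

# Toy stand-in for the fortified shapefile: two unit squares keyed by id
polys <- data.frame(
  id   = rep(c("1002", "1006"), each = 4),
  long = c(0, 1, 1, 0,  1, 2, 2, 1),
  lat  = c(0, 0, 1, 1,  0, 0, 1, 1)
)

# Own data, one row per NPA (values taken from the sample table above)
npa_data <- data.frame(
  id        = c("1002", "1006"),
  item      = c("huitante", "quatre-vingt"),
  Frequency = c(10, 3)
)

# left_join keeps the vertex order of polys, which geom_polygon relies on
map_df <- left_join(polys, npa_data, by = "id")

# Continuous map: fill by the numeric Frequency column
p_freq <- ggplot(map_df, aes(x = long, y = lat, group = id, fill = Frequency)) +
  geom_polygon(colour = "black")

# Categorical map: fill by the item column
p_item <- ggplot(map_df, aes(x = long, y = lat, group = id, fill = item)) +
  geom_polygon(colour = "black")
```

With the real shapefile, the join key would need to be whichever attribute links each polygon to its NPA.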