Strange behavior on ggplot2


I'm trying to make a map that identifies specific areas by coloring them. First, I made this plot to check that the data was OK (Setor is the sector number):
ggplot(aes(x = long, y = lat, fill = Setor), data = mapa2010) + geom_polygon(colour = 'black') # data is ok
Then I tried to make the plot, filling by another variable (AGSN):
ggplot(aes(x = long, y = lat, fill = AGSN), data = mapa2010) + geom_polygon(colour = 'black')
The data is exactly the same, and there are no lines of code between these two commands. I've already tried reordering the data, but the plot is still wrong.
Does anyone know why this happens, and how to solve it?

Adding group = group to aes() for the second plot solves it. I don't know why only the second map needs it.
ggplot(aes(x = long, y = lat, fill = AGSN, group = group), data = mapa2010[order(AGSN, id, piece, order), ]) + geom_polygon(colour = 'black')
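A possible explanation, offered as an assumption because the data is not shown: when no group is supplied, ggplot2 groups rows by the interaction of all discrete variables mapped in the plot. If Setor is discrete and unique to each polygon, that default happens to separate the polygons correctly, whereas AGSN is shared by several polygons and merges them into one path. The toy data below (two squares with made-up columns setor, agsn, and group) is hypothetical and only sketches the effect.

```r
library(ggplot2)

# Hypothetical toy data: two unit squares that share the same agsn value
squares <- data.frame(
  long  = c(0, 1, 1, 0,  2, 3, 3, 2),
  lat   = c(0, 0, 1, 1,  0, 0, 1, 1),
  setor = factor(rep(c("s1", "s2"), each = 4)),  # unique per polygon
  agsn  = factor(rep("a", 8)),                   # shared by both polygons
  group = factor(rep(c("g1", "g2"), each = 4))   # polygon identifier
)

# Without group: both squares fall into one group and are drawn as one path
p_merged <- ggplot(squares, aes(x = long, y = lat, fill = agsn)) +
  geom_polygon(colour = "black")

# With group: each square is drawn as its own polygon
p_split <- ggplot(squares, aes(x = long, y = lat, fill = agsn, group = group)) +
  geom_polygon(colour = "black")

# Number of distinct groups ggplot2 computed for each plot
n_merged <- length(unique(layer_data(p_merged)$group))
n_split  <- length(unique(layer_data(p_split)$group))
```

Printing p_merged and p_split side by side should show the stray connecting edges disappear once group is mapped. Swapping fill = agsn for fill = setor in p_merged is a quick way to check whether the default grouping is what made the first map look right.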

Related

Can't make a ggplot with multiple lines, geom_line()

I'm trying to plot two lines using flight data I gathered. My problem is that after trying different formulas, R is still only showing one line. I've separated my data according to regions (see image below). Can someone help me out with my formula?
If you need any additional information don't hesitate to ask, this is my first time posting on this channel.
ggplot(ica.vs.total, aes(x = Year, y = flights)) +
geom_line(aes(color = region, group = region), size = 1) +
theme_minimal()

When I enter:
library(ggplot2)
ica.vs.total = data.frame(flights=c(215947,197757,185782,201023,279218,261045,213343,205609),
region=c('TotalFlights','TotalFlights','TotalFlights','TotalFlights',
'TotalFlightsICA','TotalFlightsICA','TotalFlightsICA','TotalFlightsICA'),
Year=c(2008,2009,2010,2011,2000,2001,2002,2003))
g = ggplot(ica.vs.total, aes(x = Year, y = flights)) +
geom_line(aes(color = region, group = region), size = 1)+
theme_minimal()
print(g)
I get the expected result:
Double check your code.

Labeling a US States map

I made a map in R and was wondering how to label each state with its State Code (a variable in my dataset). Using plain geom_text or even geom_text_repel, I get a lot of labels for each state (I can actually understand why), as shown here:
Map
How can I solve it so each State gets 1 and only 1 text abbreviation (these State Codes are in my dataset as a variable under the name State Codes)? Thanks in advance.
Code below:
library(tidyverse)
library(maps)
library(wesanderson)
library(hrbrthemes)
ggplot(data = data,
mapping = aes(x = long,
y = lat,
group = group,
fill = black_percentage)) +
geom_polygon(col = "black") +
geom_text(aes(label = black_percentage)) +
theme_void() +
theme(legend.position = "bottom",
legend.title = element_blank(),
plot.title = element_text(hjust = 0.5, family = "Times", face = "bold"),
plot.subtitle = element_text(hjust = 0.5, family = "Times", face = "italic"),
plot.caption = element_text(family = "Times", face = "italic"),
legend.key.height = unit(0.85, "cm"),
legend.key.width = unit(0.85, "cm")) +
scale_fill_gradient(low = "#E6A0C4",
high = "#7294D4") +
labs(title = "Percentage of Black People, US States 2018",
subtitle = "Pink colors represent lower percentages. Light-blue colors represent higher percentages")
# ggsave() is called on its own; adding it to a plot with + fails in recent ggplot2 versions
ggsave("failed_map.png")

Can you provide the/some sample data?
One possible reason for multiple labels is that each state has multiple rows in the data, so ggplot thinks it needs to plot multiple labels. If you only need a single label, a solution is to create a separate summary dataset that has only one row for each state/label. You then provide this summary data to geom_text() rather than the original data. Although it is not the problem in this instance, this is also a solution to the common problem of 'blurry' labels: when tens or hundreds of labels are printed on top of one another they appear blurry, but when a single label is printed it appears fine.
Looking at your code and mapping aesthetics, it looks like geom_text() is inheriting the x and y aesthetics from the first ggplot() line. Therefore geom_text() will make a label for every value of x and y (long and lat) per state. This also explains why the labels all appear to follow the state borders.
I would suggest that you summarise each state to a single (x, y) coordinate (e.g. the middle of the state), and give this to geom_text(). Again, without some sample data it may be hard to explain, but something like:
# make the summary label dataframe
state_labels <- your_data %>%
group_by(state) %>%
summarise(
long = mean(long),
lat = mean(lat),
mean_black = mean(black_percentage)
)
# then we plot it
ggplot(data = data,
mapping = aes(x = long,
y = lat,
group = group,
fill = black_percentage)) +
geom_polygon(col = "black") +
geom_text(data = state_labels, aes(label = mean_black))
As the names of the x and y coordinates are the same in your data and in the new state_labels summary (long and lat), geom_text() will 'inherit' (assume/use) the x and y aesthetics that you supplied in the first ggplot() line. This is convenient, but it can cause grief if the two datasets have different column names, share a column name you did not mean to reuse, or need different aesthetics. Here, geom_text() also inherits group = group and fill = black_percentage. It does not use a fill aesthetic, but ggplot still evaluates inherited mappings against the layer's data, and state_labels has neither column, so the layer above may fail with an "object not found" error. To disable aesthetic inheritance, provide inherit.aes = FALSE to the geom. In this instance it would look like this; note how we now provide geom_text() with its own x and y aesthetics.
ggplot(data = data,
mapping = aes(x = long,
y = lat,
group = group,
fill = black_percentage)) +
geom_polygon(col = "black") +
geom_text(data = state_labels, aes(x = long, y = lat, label = mean_black), inherit.aes = FALSE)
EDIT: If you want a single label, but the label is not a numeric value and you can't calculate a summary statistic using mean or similar, the same principles apply: create a summarised version of the data with a single coordinate pair and a single label for each state, i.e. one row per state. There are many ways to do this, but my go-to would be something like dplyr::first.
# make the summary label dataframe
state_labels <- your_data %>%
group_by(state) %>%
summarise(
long = mean(long),
lat = mean(lat),
my_label = first(`State Codes`)
)
# then we plot it
ggplot(data = data,
mapping = aes(x = long,
y = lat,
group = group,
fill = black_percentage)) +
geom_polygon(col = "black") +
geom_text(data = state_labels, aes(x = long, y = lat, label = my_label), inherit.aes = FALSE)
Finally, ggplot has several built-in functions to plot and map spatial data. It is a good idea to use these where possible, as it will make your life a lot easier. A great 3-part tutorial can be found here, and it even includes an example of exactly what you are trying to do.
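The one-row-per-state idea can be tried without the original data. The sketch below uses a hypothetical toy_map with two square "states" and made-up columns (state, code, pct). It maps group and fill only in the polygon layer, so the label layer inherits nothing but x and y. The mean of the border vertices is only a rough label position, not a true centroid.

```r
library(ggplot2)
library(dplyr)

# Hypothetical toy data: two square "states", four border vertices each
toy_map <- data.frame(
  long  = c(0, 1, 1, 0,  2, 3, 3, 2),
  lat   = c(0, 0, 1, 1,  0, 0, 1, 1),
  group = rep(c(1, 2), each = 4),
  state = rep(c("alpha", "beta"), each = 4),
  code  = rep(c("AL", "BE"), each = 4),
  pct   = rep(c(10, 30), each = 4)
)

# One row per state: mean of the vertices as a rough label position
toy_labels <- toy_map %>%
  group_by(state) %>%
  summarise(long = mean(long), lat = mean(lat), code = first(code))

# group and fill live in the polygon layer only, so geom_text()
# inherits just x and y from ggplot()
p <- ggplot(toy_map, aes(x = long, y = lat)) +
  geom_polygon(aes(group = group, fill = pct), colour = "black") +
  geom_text(data = toy_labels, aes(label = code))
```

Keeping layer-specific mappings inside the layer that needs them is an alternative to inherit.aes = FALSE; which one reads better depends on how many layers share the same mappings.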

Time series data using ggplot: how use different color for each time point and also connect with lines data belonging to each subject?

I have data from several cells which I tested in several conditions: a few times before and also a few times after treatment. In ggplot, I use color to indicate different times of testing.
Additionally, I would like to connect with lines all data points which belong to the same cell. Is that possible?...
Here is my example data (https://www.dropbox.com/s/eqvgm4yu6epijgm/df.csv?dl=0) and a simplified code for the plot:
df$condition = as.factor(df$condition)
df$cell = as.factor(df$cell)
df$condition <- factor(df$condition, levels = c("before1", "before2", "after1", "after2", "after3"))
windows(width=8,height=5)
ggplot(df, aes(x=condition, y=test_variable, color=condition)) +
labs(title="", x = "Condition", y = "test_variable", color="Condition") +
geom_point(aes(color=condition),size=2,shape=17, position = position_jitter(w = 0.1, h = 0))

I think your code is heading in the wrong direction: you should instead group the points by the column Cell. Then, if I'm right, you want to see how the variable evolves for each cell before and after a treatment, so you can order the x variable using scale_x_discrete.
Altogether, you can do something like this:
library(ggplot2)
ggplot(df, aes(x = condition, y = variable, group = Cell)) +
geom_point(aes(color = condition))+
geom_line(aes(color = condition))+
scale_x_discrete(limits = c("before1","before2","after1","after2","after3"))
Does it look like what you are expecting?
Data
df = data.frame(Cell = c(rep("13a",5),rep("1b",5)),
condition = rep(c("before1","before2","after1","after2","after3"),2),
variable = c(58,55,36,29,53,57,53,54,52,52))

How do I set the x axis continuous that each plot in the graph is scattered relatively

The left image is my current graph and I would like to make it look like the right one. I have two problems. First, even though I use geom_step in the plot, it doesn't draw the line connecting the dots. Second, while the points in the right graph are positioned according to their year, mine are spread evenly across the whole x-axis.
Here is my code
ggplot() +
geom_step(data = tbl, mapping = aes(x = tbl$date, y = tbl$size)) +
geom_point(data = tbl, aes(x = tbl$date, y = tbl$size)) +
labs(x = 'Data', y = 'Size (Kilobytes)', title = 'stringr: timeline of version sizes')
I have to somehow convert the current date format (yyyy-mm-dd) to just yyyy, but doing so would put some points in the same year. For example, the first three dates I have are 2009-11, 2009-11, and 2010-02, so if I change the format to year only, two of them will land on the same spot. I don't know how to solve this since I am still learning R.
Thank you in advance!

It takes some finagling with the date, but all you should have to do is add a function from the scales package to set your x-axis scale. It requires your time variable to be of class POSIXct. I used some dummy data since you didn't post any.
library(ggplot2)
library(scales)
library(zoo)
tbl$date <- as.POSIXct(as.yearmon(tbl$date, format = "%Y-%m"))
ggplot() +
geom_step(data = tbl, mapping = aes(x = date, y = size)) +
geom_point(data = tbl, aes(x = date, y = size)) +
labs(x = 'Data', y = 'Size (Kilobytes)', title = 'stringr: timeline of version sizes') +
scale_x_datetime(labels = date_format("%Y"))
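As an alternative sketch that avoids the zoo dependency, the dates can be stored as class Date and the axis labelled with scale_x_date(). The data below is hypothetical (the sizes are made up), and it assumes the dates arrive as "yyyy-mm" strings as in the question's example. Points keep their true positions within each year while the axis shows only the year.

```r
library(ggplot2)

# Hypothetical release data; the sizes are made up for illustration
tbl <- data.frame(
  date = c("2009-11", "2009-11", "2010-02", "2011-07", "2012-12"),
  size = c(10, 12, 15, 21, 30)
)

# Append a day so the year-month strings parse as full dates
tbl$date <- as.Date(paste0(tbl$date, "-01"))

# A continuous date axis: one break per year, labelled with the year only
p <- ggplot(tbl, aes(x = date, y = size)) +
  geom_step() +
  geom_point() +
  scale_x_date(date_breaks = "1 year", date_labels = "%Y") +
  labs(x = "Date", y = "Size (Kilobytes)")
```

If the column already holds full yyyy-mm-dd strings, as.Date(tbl$date) alone should be enough and the paste0() step can be dropped.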

ggplot2, error in filling the area under lines

I have this data set and I want to fill the area under each line. However I get an error saying:
Error: stat_bin() must not be used with a y aesthetic.
Additionally, I need to use alpha value for transparency. Any suggestions?
library(reshape2)
library(ggplot2)
dat <- data.frame(
a = rnorm(12, mean = 2, sd = 1),
b = rnorm(12, mean = 4, sd = 2),
month = c("JAN","FEB","MAR",'APR',"MAY","JUN","JUL","AUG","SEP","OCT","NOV","DEC"))
dat$month <- factor(dat$month,
levels = c("JAN","FEB","MAR",'APR',"MAY","JUN","JUL","AUG","SEP","OCT","NOV","DEC"),
ordered = TRUE)
dat <- melt(dat, id="month")
ggplot(data = dat, aes(x = month, y = value, colour = variable)) +
geom_line() +
geom_area(stat ="bin")

"I want to fill the area under each line"
This means we will need to specify the fill aesthetic.
"I get an error saying 'Error: stat_bin() must not be used with a y aesthetic.'"
This means we will need to delete your stat = "bin" code.
"Additionally, I need to use alpha value for transparency."
This means we need to put alpha = <some value> in the geom_area layer.
Two other things: (1) since you have a factor on the x-axis, we need to specify a grouping so ggplot knows which points to connect. In this case we can use variable as the grouper. (2) The default "position" of geom_area is to stack the areas rather than overlap them. Because you ask about transparency I assume you want them overlapping, so we need to specify position = 'identity'.
ggplot(data = dat, aes(x = month, y = value, colour = variable)) +
geom_line() +
geom_area(aes(fill = variable, group = variable),
alpha = 0.5, position = 'identity')

To get lines across categorical variables, use the group aesthetic:
ggplot(data = dat, aes(x = month, y = value, colour = variable, group = variable)) +
#geom_line(position = 'stack') + # redundant, but this is where lines are drawn
geom_area(alpha = 0.5)
To change the color inside, use the fill aesthetic.
