I'm trying to plot a time series in ggplot for certain export markets, for example Japan. I want to focus on a few different export items (e.g. pork, beef, wheat, etc.) by exporter (e.g. US, EU, Australia, etc.). I'd like to set up the data so that I can use facet_wrap to show one graph per good in a single image (representing the Japanese market), each with all relevant exporters. I've been trying to use geom_line, but I have no idea how to arrange the data so that I can use facet_wrap, ggplot, etc.
You need two columns that specify the exporter and the product, with the data in long format (so each row is a unique combination of product, exporter and date). A reproducible example is shown below.
Then, the key plot element is using facet_grid(exporter~product).
library(ggplot2)
# Long format: one row per (Date, exporter, product) combination
export_data.df <- data.frame(
  value = runif(36),
  Date = rep(c(rep(as.Date("1999/1/1"), 3),
               rep(as.Date("1999/1/2"), 3),
               rep(as.Date("1999/1/3"), 3),
               rep(as.Date("1999/1/4"), 3)), 3),
  exporter = rep(rep(c("Japan", "USA", "NZ"), 3), 4),
  product = rep(c(rep("Pork", 3), rep("Beef", 3), rep("Chicken", 3)), 4)
)
# One panel per exporter (rows) and product (columns)
ggplot(export_data.df) +
  geom_line(mapping = aes(x = Date, y = value)) +
  facet_grid(exporter ~ product)
Output of above code
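Since the question asks for facet_wrap with all exporters shown in each panel, one variation is to map exporter to colour and facet by product only. A minimal sketch, assuming made-up data (the export_long name, the exporters, and the values are illustrative only):

```r
library(ggplot2)

# Made-up long-format data: one row per (Date, exporter, product)
export_long <- expand.grid(
  Date = as.Date("1999-01-01") + 0:3,
  exporter = c("USA", "EU", "Australia"),
  product = c("Pork", "Beef", "Wheat"),
  stringsAsFactors = FALSE
)
set.seed(1)
export_long$value <- runif(nrow(export_long))

# One panel per product; each exporter is a separate coloured line
p <- ggplot(export_long, aes(x = Date, y = value, colour = exporter)) +
  geom_line() +
  facet_wrap(~ product)
print(p)
```

With facet_grid(exporter ~ product) each panel holds a single line; with facet_wrap(~ product) plus a colour mapping, the exporters can be compared directly inside each product panel.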
Related
Building on this earlier question, let's say the data table has columns ID, factor, SimulationID, Data. During plotting, we want to draw a new graph for each (factor, SimulationID) tuple using facet_grid(). For each of these plots, we will use the ID variable to connect the Data points as a line. However, the sets of unique ID values for each (factor, SimulationID) tuple are different from each other.
Now, what I want is to highlight one of the curves in each of these plots separated by facet_grid().
ggplot(d) +
  facet_grid(factor ~ SimulationID) +
  geom_line(aes(idx, value, colour = type)) +
  gghighlight(ID == <choose a valid ID randomly>)
I have several datasets and my end goal is to make a graph out of them, with each line representing the yearly variation for the given information. I finally joined and combined my data (as it was in a per-month structure) into a table that just contains the yearly means for each item I want to graph (a column depicting the year, and the rows depicting the yearly variation for 4 different elements).
I have one factor, the year, and 4 different variables that hold yearly variations, so I would like to graph them in the same space. I had the idea to join the 4 columns into one by factor (collapse into one observation per row, with the year or factor in the adjacent column) but seem unable to do that. My thought is that this would give a structure to my y axis. I would like some advice, and to know if my approach to the problem is effective. I am trying ggplot2 but it does not seem to work without a defined (or predefined range for the) y axis. Thanks
I would suggest the following approach. You have to reshape your data from wide to long, as in the example below. That way it is possible to see all the variables. As no data is provided, this solution is sketched using dummy data. You can also change the lines to any other geom you want, such as points:
library(tidyverse)
set.seed(123)
# Data
df <- data.frame(year = 1990:2000,
                 v1 = rnorm(11, 2, 1),
                 v2 = rnorm(11, 3, 2),
                 v3 = rnorm(11, 4, 1),
                 v4 = rnorm(11, 5, 2))
# Plot
df %>%
  pivot_longer(-year) %>%
  ggplot(aes(x = factor(year), y = value, group = name, color = name)) +
  geom_line() +
  theme_bw()
Output:
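To see what the reshaping step does on its own, here is a small sketch with a made-up three-year table (the small and long names are illustrative). pivot_longer() stacks the value columns into a single value column and records the source column in name, which is the "collapse into one column" structure the question describes:

```r
library(tidyr)

# Made-up wide table: one row per year, one column per variable
small <- data.frame(year = 1990:1992,
                    v1 = c(1, 2, 3),
                    v2 = c(4, 5, 6))

# Long table: one row per (year, variable) pair
long <- pivot_longer(small, -year, names_to = "name", values_to = "value")
print(long)
```

The long table has one row per year and variable, so a single y aesthetic (value) and a single grouping aesthetic (name) are enough to draw all the lines.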
We could use melt from reshape2 without loading multiple other packages
library(reshape2)
library(ggplot2)
ggplot(melt(df, id.vars = 'year'), aes(x = factor(year), y = value,
                                       group = variable, color = variable)) +
  geom_line()
-output plot
Or with matplot from base R
matplot(as.matrix(df[-1]), type = 'l', xaxt = 'n')
axis(1, at = seq_len(nrow(df)), labels = df$year)  # label the x axis with the years
data
set.seed(123)
df <- data.frame(year=1990:2000,
v1=rnorm(11,2,1),
v2=rnorm(11,3,2),
v3=rnorm(11,4,1),
v4=rnorm(11,5,2))
In the time series data created below, individuals (denoted by a unique ID) were sampled from 2 populations (NC and SC). All individuals have the same number of observations. I want to average the data at each respective "time point" over all individuals that belong to the same "State" (the average line), and I want to plot the average lines from each state against each other. I want it to look something like this:
library(tidyverse)
set.seed(123)
ID <- rep(1:10, each = 500)
Time = rep(c(1:500),10)
Location = rep(c("NC","SC"), each = 2500)
Var <- rnorm(5000)
data <- data.frame(
ID = factor(ID),
Time = Time,
State = Location,
Variable = Var
)
I would recommend getting familiar with the various dplyr functions, specifically group_by and summarise. You may want to read through Introduction to dplyr or go through this series of blog posts.
In short, we group the data by the Time and State variables and then summarise it with an average (i.e., mean(Variable)). To plot the data, we put Time on the x-axis, the newly created avg_var on the y-axis, and use State to represent color. These are assigned as the chart's aesthetics (i.e., aes(...)). Finally, we add the line geom with geom_line() to render the lines on the visualization.
data %>%
group_by(Time, State) %>%
summarise(avg_var = mean(Variable)) %>%
ggplot(aes(x = Time, y = avg_var, color = State)) +
geom_line()
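As an alternative sketch, ggplot2 can compute the per-time-point mean itself with stat_summary(), so the separate group_by/summarise step is not needed. This assumes ggplot2 3.3.0 or later (where the argument is named fun) and rebuilds the question's data so that the snippet runs on its own:

```r
library(ggplot2)

# Same structure as the data in the question
set.seed(123)
data <- data.frame(
  ID = factor(rep(1:10, each = 500)),
  Time = rep(1:500, 10),
  State = rep(c("NC", "SC"), each = 2500),
  Variable = rnorm(5000)
)

# For each State, average Variable at every Time value and draw a line
p <- ggplot(data, aes(x = Time, y = Variable, colour = State)) +
  stat_summary(fun = mean, geom = "line")
print(p)
```

The explicit dplyr version is still handy when the averaged table is needed for something other than the plot.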
I am trying to calculate the city-wise spend on each product on a yearly basis. I also want to include a graphical representation, but I am not able to get the graphs in R.
Top_11 <- aggregate(Ca_spend["Amount"],
by = Ca_spend[c("City","Product","Month_Year")],
FUN="sum")
A <- ggplot(Top_11,aes(x=City,Month_Year,y=Amount))
A <-geom_bar(stat="identity",position='dodge',fill="firebrick1",colour="black")
A <- A+facet_grid(.~Type)
This is the code I am using. I am trying to plot City, Product and Year on the same graph.
VARIABLES-(City product Month_Year Amount)
(OBSERVATIONS)- New York Gold 2004 $50,0000 (Sample DATA Type)
I'd try this:
ggplot(Top_11,aes(x=City, fill = Product, y=Amount)) +
geom_col() +
facet_wrap(~Month_Year)
For your 5 rows of sample data, that gives the graph below. You can play around with which variable goes to fill (fill color), x (x-axis), and facet_wrap (for small multiples). I see in your code you tried facet_grid(.~Type), but that won't work unless you have a column named Type.
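One more thing worth noting about the code in the question: the line A <- geom_bar(...) replaces the plot object with a bare layer instead of adding the layer to it, so nothing is drawn. Layers have to be added with +. A minimal sketch of the step-by-step style, using a made-up stand-in for Ca_spend (the rows and amounts below are hypothetical, not the asker's data):

```r
library(ggplot2)

# Hypothetical stand-in for Ca_spend
Ca_spend <- data.frame(
  City = c("New York", "New York", "Boston", "Boston", "New York"),
  Product = c("Gold", "Silver", "Gold", "Silver", "Gold"),
  Month_Year = c(2004, 2004, 2004, 2005, 2004),
  Amount = c(500, 200, 300, 150, 100)
)

# Total spend per City, Product and year
Top_11 <- aggregate(Ca_spend["Amount"],
                    by = Ca_spend[c("City", "Product", "Month_Year")],
                    FUN = sum)

# Build the plot step by step, adding each layer with `+`
A <- ggplot(Top_11, aes(x = City, y = Amount, fill = Product))
A <- A + geom_col(position = "dodge")
A <- A + facet_wrap(~ Month_Year)
print(A)
```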
I would like to plot several time series on the same panel, instead of in separate panels. I took the R code below from another Stack Overflow post.
Please note how the 3 time series are in 3 different panels. How can I layer the 3 time series on 1 panel, with each line in a different color?
Time = Sys.time()+(seq(1,100)*60+c(rep(1,100)*3600*24, rep(2, 100)*3600*24, rep(3, 100)*3600*24))
Value = rnorm(length(Time))
Group = c(0, cumsum(diff(Time) > 1))
library(ggplot2)
g <- ggplot(data.frame(Time, Value, Group)) +
geom_line (aes(x=Time, y=Value, color=Group)) +
facet_grid(~ Group, scales = "free_x")
If you run the above code, you get this:
When the facet_grid() part is eliminated, I get a graph that looks like this:
Basically, I would like ggplot to ignore the differences in the dates, and only consider the times. And then use group to identify the differing dates.
This problem could potentially be solved by creating a new column that only contains the times (eg. 22:01, format="%H:%M"). However, when as.POSIXct() function is used, I get a variable that contains both date and time. I can't seem to escape the date part.
Since the data file has different days for each group's time, one way to get all the groups onto the same plot is to just create a new variable, giving all groups the same "dummy" date but using the actual times collected.
experiment <- data.frame(Time, Value, Group) #creates a data frame
experiment$hms <- as.POSIXct(paste("2015-01-01", format(experiment$Time, "%H:%M:%S"))) # pastes dummy date 2015-01-01 onto the HMS of Time
Now that you have the times with all the same date, you then can plot them easily.
experiment$Grouping <- as.factor(experiment$Group) # ggplot needs Group to be a factor to color the lines according to Group
ggplot(experiment, aes(x=hms, y=Value, color=Grouping)) + geom_line(size=2)
Below is the resulting image (you can change/modify the basic plot as you see fit):