Trying to plot temperature and count data on same plot using xyplot? - r

I am using xyplot from lattice to make a plot that shows temperature change over time alongside count data. I am not sure whether ggplot2 would be better. My data is arranged like this:
Year (1998 1998 1999 2000 2001 2001 2002)
Low (2.777778 8.333330 10.555556 4.444444 26.388889 15.555556 12.500000)
Geese (2 14 10 16 7 10 15)
State (Arkansas California California California California Florida California)
I am stuck at this part of the code:
xyplot(c(geese,low)~year,subset=state=="California", par.settings=bwtheme, auto.key=TRUE)
The plot draws geese and low (temperature) with the same type of point, and if I add a line there is no separation between the two. Any help with this would be awesome.

To plot multiple series on the same plot, use + rather than c() to specify multiple y values. For example
xyplot(geese + low ~year, subset=state=="California", auto.key=TRUE, type="b")
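That will draw geese and low as two separate series, each with its own points and line, and auto.key adds a legend identifying them.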
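As a self-contained sketch of the call above, the question's values can be put into a data frame and passed through the data argument. The data frame name birds and the lower-cased column names are assumptions made here so that the names match the xyplot call; they are not part of the original post.

```r
library(lattice)

# Values copied from the question; `birds` and the lower-case
# column names are assumptions chosen to match the xyplot call.
birds <- data.frame(
  year  = c(1998, 1998, 1999, 2000, 2001, 2001, 2002),
  low   = c(2.777778, 8.333330, 10.555556, 4.444444,
            26.388889, 15.555556, 12.500000),
  geese = c(2, 14, 10, 16, 7, 10, 15),
  state = c("Arkansas", "California", "California", "California",
            "California", "Florida", "California")
)

# `geese + low` on the left-hand side makes lattice treat the two
# columns as separate groups plotted against the same x variable.
p <- xyplot(geese + low ~ year, data = birds,
            subset = state == "California",
            auto.key = TRUE, type = "b")
print(p)
```

Because both series share one y axis, a temperature and a count with very different ranges may still be hard to compare on this plot.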

Related

How do I plot multiple lines (by levels of factor) for year series?

I'm fairly new to R and I've been having trouble with a plot.
I'm trying to create a line plot with:
$YEAR on the X axis
$METRIC on the Y axis
a different-colored line for each country (meaning, a total of 3 lines on the same plot)
$COUNTRY is a factor with 3 levels
COUNTRY YEAR METRIC
USA 2000 14.874
USA 2001 15.492
USA 2002 13.091
USA 2003 14.717
CAN 1999 15.031
CAN 2000 14.343
CAN 2001 12.972
CAN 2002 13.216
SWE 1999 14.771
SWE 2000 17.033
SWE 2001 15.932
SWE 2002 14.516
SWE 2003 15.655
When I create the plot with
plot(df$YEAR, df$METRIC, col=df$COUNTRY, type="p")
I get a plot with points for each (x,y) combination and different color for each level of the factor $COUNTRY
However, when I try to get a line for each country, with
plot(df$YEAR, df$METRIC, col=df$COUNTRY, type="l")
I get one continuous line that starts with the 4 observations of "USA" and then goes back to the first year of the next country ("CAN").
Can anyone explain why this is happening?
Is it possible to create this plot using only the pre-built functions?
Thank you in advance for any assistance.
Other than my comments above, here is a basic base implementation. If initially your $COUNTRY is a factor (is.factor(df$COUNTRY)), then you can skip the creation of ctryfctr and change the lines call to lines(..., col=x$COUNTRY[1]):
df$ctryfctr <- factor(df$COUNTRY)
plot(NA, xlim=range(df$YEAR), ylim=range(df$METRIC))
for (x in split(df, df$COUNTRY)) lines(x$YEAR, x$METRIC, col=x$ctryfctr[1])
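For reference, a self-contained version of this loop using the data from the question might look like the sketch below. The axis labels, the integer colour indices, and the legend are additions for illustration and are not part of the original answer. It also reflects why type="l" alone gives one continuous line: plot() connects the points in row order and does not start a new line when the country changes, so each country is drawn with its own lines() call.

```r
# Data copied from the question
df <- data.frame(
  COUNTRY = factor(c(rep("USA", 4), rep("CAN", 4), rep("SWE", 5))),
  YEAR    = c(2000:2003, 1999:2002, 1999:2003),
  METRIC  = c(14.874, 15.492, 13.091, 14.717,
              15.031, 14.343, 12.972, 13.216,
              14.771, 17.033, 15.932, 14.516, 15.655)
)

groups <- split(df, df$COUNTRY)   # one data frame per country
cols   <- seq_along(groups)       # one colour index per country (illustrative choice)

# Empty plot with axes wide enough for all countries
plot(NA, xlim = range(df$YEAR), ylim = range(df$METRIC),
     xlab = "YEAR", ylab = "METRIC")

# Draw each country as its own line
for (i in seq_along(groups)) {
  lines(groups[[i]]$YEAR, groups[[i]]$METRIC, col = cols[i])
}
legend("topright", legend = names(groups), col = cols, lty = 1)
```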
Since you seem to mix up some concepts, I thought it would be helpful to clarify things a bit.
R's base graphics are great for quick sketches without prior knowledge, but more complicated plots are easier to define with the ggplot2 package. You can install it with install.packages("ggplot2"). With ggplot2 you can group the lines as you already tried, and as r2evans already pointed out.
library(ggplot2)
ggplot(df) + geom_line(aes(YEAR, METRIC, group=COUNTRY, color=COUNTRY))
So, you tell ggplot to use df as your data. You define the x and y axes for geom_line inside aes(). With group= you define the grouping variable, and with color= you specify that each line uses a different color.
Hope that you have a great time with R and ggplot2!

Subset data and plotting in R

I would like to use R to simplify and subset large datasets (over 100 000 values) and then plot them. Below is a simplified version of my dataset (Figure 1) where I broke it down into three years and two crop types. I have a Year (2011-2013), two crop types (Corn and Soybean) and their total Area.
I want to summarize the data as the total Area of Corn and Soybean by year in a new table (example in Figure 2) with the year, type and total area, and then plot the total area by year for each crop (example of the plot in Figure 3).
Figure 1 Small example dataset
Figure 2 New total table
Figure 3 example of graph that I want to produce
I thought I could subset the data by year and crop with
corn2011 <- subset(CropTable, Year==2011 & Lulc=="Corn")
corn2012 <- subset(CropTable, Year==2012 & Lulc=="Corn")
and then I can summarize the data using the sum function
sum(corn2011[,3]),
but I'm not sure how to plot them yearly or against each other to have it look like Figure 3.
For your plot, you could try this:
library(ggplot2)
data.df <- read.table(text="
Year Type Area
1 2011 corn 30
2 2012 corn 15
3 2013 corn 50
4 2011 Soy 45
5 2012 Soy 30
6 2013 Soy 60",
header = TRUE)
ggplot(data=data.df, aes(x=as.factor(Year), y=Area, group=Type, color=Type)) + geom_line() + xlab("Year") + ylab("Area (ha)") + theme_bw() + scale_color_manual(values=c("red", "blue"))
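The totals table itself (Figure 2 in the question) can be built in one step with aggregate() instead of subsetting each year and crop separately. This is only a sketch: the rows of CropTable below are invented for illustration, and the column name Area is an assumption, since the question only refers to the area as column 3.

```r
# Hypothetical raw table with several fields per year and crop.
# The values are invented for illustration; `Area` is an assumed column name.
CropTable <- data.frame(
  Year = c(2011, 2011, 2011, 2012, 2012, 2012, 2013, 2013),
  Lulc = c("Corn", "Corn", "Soybean", "Corn",
           "Soybean", "Soybean", "Corn", "Soybean"),
  Area = c(10, 20, 45, 15, 10, 20, 50, 60)
)

# Sum Area within every Year/Lulc combination
totals <- aggregate(Area ~ Year + Lulc, data = CropTable, FUN = sum)
print(totals)
```

The resulting totals data frame has one row per year and crop, which is the long layout that the ggplot call above expects.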

Create an Index Chart in R - relative starting point

I need to look at relative change in 2 groups of data which have very different scales.
I therefore think the way forward is to set the first value in each group to 100% and express every later value as a proportion of it. I can then create a line chart to show the relative movement.
I would call this an index chart so may have missed existing questions.
However I don't know how to set my data up in R to do this.
My aggregated data is below. I want each group's 1999 value to be 100% and the subsequent years to be a percentage of that.
> Totals
year fips Emissions
1 1999 06037 6109.6900
2 2002 06037 7188.6802
3 2005 06037 7304.1149
4 2008 06037 6421.0170
5 1999 24510 403.7700
6 2002 24510 192.0078
7 2005 24510 185.4144
8 2008 24510 138.2402
I'm probably going to want to add a bar chart behind it to show weighting too as relative change is much more dramatic for smaller data. Tips on that are appreciated too but I've not searched for that yet as the above is the primary issue IMO.
Appreciate your help.
James
For example with dplyr:
library(dplyr)
dat <-
df1 %>%
group_by(fips) %>%
mutate(ind = Emissions / first(Emissions))
And using ggplot2 to plot a line chart:
library(ggplot2)
ggplot(dat, aes(x = year, y = ind, color = as.factor(fips))) +
geom_line()
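The same index can be computed in base R with ave(), here scaled so that 1999 equals 100 as the question asks. This is a sketch rather than part of the original answer: it uses the question's Totals values, stores fips as character to keep the leading zero, and sorts by year first so that the first value in each group really is the 1999 value.

```r
# Values copied from the question's Totals table
Totals <- data.frame(
  year      = rep(c(1999, 2002, 2005, 2008), times = 2),
  fips      = rep(c("06037", "24510"), each = 4),
  Emissions = c(6109.6900, 7188.6802, 7304.1149, 6421.0170,
                403.7700, 192.0078, 185.4144, 138.2402)
)

# Make sure the earliest year comes first within each fips group
Totals <- Totals[order(Totals$fips, Totals$year), ]

# Divide each group's values by its first value and scale to 100
Totals$index <- ave(Totals$Emissions, Totals$fips,
                    FUN = function(x) x / x[1] * 100)
print(Totals)
```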

How do I do a scatter plot with dygraph in R, including mouseover date?

I have 3 sets of data, all in the same time series. I want to plot data set 1 on the x axis and data sets 2 and 3 on the y axis, with data sets 2 and 3 in separate plots. In addition, when I mouse over a data point, I would like to see its date. I would like to use dygraph/tauchart in R to do this.
I would also like to be able to zoom the graph.
This is the example of my data points in xts format.
Series 1 Series 2 Series 3
Jan 2006 28397 7.55 11376
Feb 2006 21255 7.63 8702
Mar 2006 24730 7.62 10011
Apr 2006 18981 7.50 7942
May 2006 25382 7.47 10490
Jun 2006 23874 7.53 10156
I have seen an example of such a scatter plot, but no code was shown.
Edited: I have made a scatter plot, but there are still problems with it.
The package used is Tauchart.
I cannot show Series 2 and Series 3 as two separate plots (top and bottom).
The plot does not scale on the y axis. I tried using auto_scale in tau_guide_x and tau_guide_y; the x scale works but the y scale does not. I have also tried setting min and max, but that does not work either.
Code
Scatterplot1<-tauchart(a) %>%
tau_point("Series.1", "Series.2") %>%
tau_tooltip() %>%
tau_guide_x(label="Amount", auto_scale=FALSE) %>%
tau_guide_y(label="Amount", auto_scale=FALSE)
This is what I have plotted; the problem is that the y axis cannot be scaled.
Not sure about doing this with an xts object and dygraph, but if your data is in a data frame it's easy to do with the new taucharts package.
Create part of your data as a data frame:
months <- c("Jan 2006", "Feb 2006", "Mar 2006")
Series1 <- c(28397, 21255, 24730)
Series2 <- c(7.55, 7.63, 7.62)
mydata <- data.frame(months, Series1, Series2)
Install and load taucharts:
devtools::install_github("hrbrmstr/taucharts")
library("taucharts")
Create a scatter plot with tooltip using taucharts:
tauchart(mydata) %>%
tau_point("Series1", "Series2") %>%
tau_tooltip()
You can set the y-axis range with
{"axes": {"y": {"valueRange": [1.4, 2.5]}}}

Adding total counts as horizontal lines to histograms in facet_grid()

Data:
I have a data frame comprising 4 variables and about 300k rows including a unique account ID, a start date in yyyy-mm-dd, a start year, and the total number of months to-date the customer has held an account active. Snippet of the data below (don't let the row numbers confuse, this is obviously a subset, if more data is necessary, let me know):
> head(ten.by.id)
acct.id start_date strt.yr max_ten
1 155 1998-11-01 1998 175
19 902 2001-09-01 2001 143
39 995 2001-09-01 2001 143
59 1014 2000-10-01 2000 153
78 1017 2000-04-01 2000 160
100 1137 2000-11-01 2000 153
Problem (Why I want to render a faceted plot):
Showing a histogram of the entire dataset across all years renders the following:
Obviously, there are mixed distributions of information here, but the effect is unknown. First I thought I'd check for time domain effects with a visual. By using facets, I can provide a serial histogram of frequency distributions by year, overlaying the KDE plot for each year.
If multiple distributions were a product of something that occurred over time, I could spot check relevant shape changes (i.e. uni to multimodal). I used the code below to generate this plot:
maxten_time <- ggplot(ten.by.id, aes(max_ten)) +
geom_histogram(colour="grey19", fill="orange", binwidth=2, stat="bin") +
scale_y_continuous(breaks=seq(0,12000,by=100)) +
scale_x_continuous(breaks=seq(0,180,by=45)) +
labs(title ="Serial Distribution of Max Length of Tenure for all Customers by Start Date", x="Max Tenure(months)", y="# of Customers", colour="blue") +
facet_grid(. ~ strt.yr) + geom_density(fill=NA, colour="orange", cex=1) + aes(y = ..count..)
Which renders the following:
Questions for recreating the faceted plot:
What I wish to do is add a horizontal line (or some other single marker) to each facet which indicates the total # of customer starts for each year. Can this be done in a faceted plot?
I would like to add an additional axis that spans across the facets to mark the number of months across all years (1 to 175). Am I reaching with ggplot to try to do this (i.e. since each facet is its own plot, would aligning the month markers across all facets even be possible)? I haven't seen any relevant examples on doing something quite like this.
The objective is merely to combine the horiz lines in each facet and the axis across facets into the entire plot. Any direction would be helpful.
Phillip
