time series graphs with nPlot - r

I'm trying to plot a time series graph with nPlot and am having difficulty presenting the X-axis labels in a readable way.
I looked to see whether this problem had come up before. It had, but without a solution yet (as far as I managed to find). Is a solution available now?
In this case I get an X-axis ranging from -1 to 1 and no lines on the graph:
date = c("2013-07-22", "2013-07-29" ,"2013-08-05", "2013-08-12", "2013-08-19","2013-08-26", "2013-09-02" ,"2013-09-09" ,"2013-09-16")
test = as.data.frame(date)
test$V1 = c("10","11","13","12","11","10","15","12","9")
test$V2 = c("50","51","53","52","51","50","55","52","59")
test1 = melt(test,id = c("date"))
n1 = nPlot(value ~ date, group = "variable", data = test1, type="lineWithFocusChart")
If I add the following line and then plot again:
test1$date = as.Date(test1$date)
I get the graph I want, but the X-axis labels are shown in their numeric form (15900..).
Thanks.

Here is one way to make it work. I have made two changes to your code. First, I have made V1 and V2 numeric, since you want to plot numbers on the y axis. Second, I have added a utility function, to_jsdate, that takes the character date and converts it to the number of seconds since 1970-01-01. The tick formatter then multiplies this by 1000, because a JavaScript Date is constructed from milliseconds. Date handling is still a little raw in rCharts, but we are working on making it better.
library(rCharts)  # provides nPlot()
date = c("2013-07-22", "2013-07-29" ,"2013-08-05", "2013-08-12", "2013-08-19",
  "2013-08-26", "2013-09-02" ,"2013-09-09" ,"2013-09-16")
test = as.data.frame(date)
test$V1 = as.numeric(c("10","11","13","12","11","10","15","12","9"))
test$V2 = as.numeric(c("50","51","53","52","51","50","55","52","59"))
test1 = reshape2::melt(test,id = c("date"))
to_jsdate <- function(date_){
val = as.POSIXct(as.Date(date_),origin="1970-01-01")
as.numeric(val)
}
test1 = transform(test1, date2 = to_jsdate(date))
n1 = nPlot(value ~ date2, group = "variable", data = test1, type="lineWithFocusChart")
n1$xAxis(tickFormat = "#! function(d){
return d3.time.format('%Y-%m-%d')(new Date(d*1000))
} !#")
n1
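To see what the conversion does without rCharts, here is a minimal base R sketch. It assumes dates are treated as UTC midnight, and the variable names (secs, days, labels) are only for illustration. as.numeric() on a Date gives a day count, which is where the 15900.. labels in the question come from, while to_jsdate gives seconds.

```r
# Same helper as in the answer: character date -> seconds since 1970-01-01
to_jsdate <- function(date_) {
  as.numeric(as.POSIXct(as.Date(date_), origin = "1970-01-01"))
}

dates <- c("2013-07-22", "2013-07-29")
secs <- to_jsdate(dates)

# A Date is stored as whole days since 1970-01-01, so seconds = days * 86400
days <- as.numeric(as.Date(dates))

# Converting back in R mirrors what the tick formatter does with new Date(d * 1000)
labels <- format(as.POSIXct(secs, origin = "1970-01-01", tz = "UTC"), "%Y-%m-%d")
print(data.frame(dates, days, secs, labels))
```

If the labels column matches the input dates, the values passed to nPlot are on the scale the tick formatter expects.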

Related

Use ggplot in an R function with three inputs: filename of dataframe, and two column variables of numeric data [duplicate]

I would like to create an R function that takes as input:
a dataframe of my choosing
two columns of the dataframe containing numeric data
The output should be a scatterplot of one column variable against another, using both base R plot function and ggplot.
Here is a toy dataframe:
df <- data.frame("choco" = 1:5,
"tea" = c(2,4,5,8,10),
"coffee" = c(0.5,2,3,1.5,2.5),
"sugar" = 16:20)
Here is the function I wrote, which doesn't work (also tried this with base R plot which didn't work - code not shown)
test <- function(Data, ing1, ing2) {
ggplot(Data, aes(x = ing1, y = ing2)) +
geom_point()
}
test(Data = df, ing1 = "choco", ing2 = "tea")
As part of the above function, I would like to incorporate an 'if..else' statement to test whether ing1 and ing2 inputs are valid, e.g.:
try(test("coffee", "mint"))
The above inputs should prompt a message that 'one or both of the inputs is not valid'.
I can see that using %in% could be the right way to do this, but I'm unsure of the syntax.
df <- data.frame(
"choco" = 1:5,
"tea" = c(2, 4, 5, 8, 10),
"coffee" = c(0.5, 2, 3, 1.5, 2.5),
"sugar" = 16:20
)
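library(ggplot2)
test <- function(Data, ing1, ing2) {
  # && is the scalar "and" intended for if() conditions
  if (ing1 %in% names(Data) && ing2 %in% names(Data)) {
    # .data[[name]] looks the column up by its name given as a string
    ggplot(Data, aes(x = .data[[ing1]], y = .data[[ing2]])) +
      geom_point()
  } else {
    print("Both ing1 and ing2 have to be columns of the data frame")
  }
}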
test(Data = df, ing1 = "choco", ing2 = "sugar")
Regards,
Grzegorz
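The question also asked for a base R version. A minimal sketch of the same %in%-style check with plot() might look like the following. The helper names missing_columns and test_base are hypothetical, and setdiff() is used here in place of %in% so that the message can name the offending input.

```r
df <- data.frame(
  choco = 1:5,
  tea = c(2, 4, 5, 8, 10),
  coffee = c(0.5, 2, 3, 1.5, 2.5),
  sugar = 16:20
)

# Hypothetical helper: the requested names that are not columns of Data
missing_columns <- function(Data, ...) {
  setdiff(c(...), names(Data))
}

# Base R scatterplot with the same validation as the ggplot version
test_base <- function(Data, ing1, ing2) {
  bad <- missing_columns(Data, ing1, ing2)
  if (length(bad) > 0) {
    message("One or both of the inputs is not valid: ",
            paste(bad, collapse = ", "))
    return(invisible(NULL))
  }
  # [[ ]] selects a column by a name held in a character variable
  plot(Data[[ing1]], Data[[ing2]], xlab = ing1, ylab = ing2)
  invisible(TRUE)
}

test_base(df, "choco", "tea")    # draws the scatterplot
test_base(df, "coffee", "mint")  # prints the message, draws nothing
```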

Adding main titles from list to graphs in for loop

I have two datasets. One with measured concentrations for several days, the other with, for every relevant date, the wind direction.
library(ggplot2)
concentrations = data.frame(
datehour = c("2017-02-15 09:00:00", "2017-02-15 10:00:00",
"2017-02-15 11:00:00", "2017-02-16 09:00:00", "2017-02-16 10:00:00",
"2017-02-16 11:00:00"),
Number = c(3000, 4000, 2000, 6000, 7000, 5000),
Hour = c(9, 10, 11, 9, 10, 11))
winddir = data.frame(
Date = c("2017-02-15", "2017-02-16"),
Wind = c("S", "SW"))
I use a for loop to create a PDF with a graph of every day. This works fine.
However, I want to add a main title to every graph, with both the date and the wind direction of the relevant day. So I converted the wind direction dataframe into two lists to use in the for loop.
I am able to use the date list fine, but I do not know how to obtain the wind direction for the main title.
I tried starting with c = 1, adding 1 to c in every loop, and using it to obtain items from wind_list with paste(wind_list[c]). This returns a title with the number of c (so "Wind: 1", "Wind: 2" etc.) instead of the wind direction.
#creating the lists
wind_list <- as.list(winddir$Wind)
date_list <- as.list(winddir$Date)
#function for plotting graph
plotdays <- function(){
ggplot() +
geom_line(data = concTemp, aes(x = Hour, y = Number))+
ylab("UFP concentration") + xlab("Hour") +
ggtitle(paste("Average UFP concentration per hour on", paste(i),
" Wind:", paste(wind_list[c])))
}
#the for loop for creating the graphs in 1 PDF
pdf("Test.pdf", onefile = TRUE)
c = 1
for (i in date_list){
concTemp <- subset(concentrations, grepl(i, datehour))
if (c == 1){
wind <- wind_list[c]
plotD <- plotdays()
print(plotD)
c = c+1
} else {
wind <- wind_list[c]
plotD <- plotdays()
print(plotD)
break
}
}
dev.off()
Any suggestions?
Your wind list contains factors, and your function seems to read the underlying integer codes. You could store the wind directions as character instead.
#creating the lists
wind_list <- as.list(as.character(winddir$Wind))
date_list <- as.list(as.character(winddir$Date))
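As a small illustration of the character conversion, the sketch below builds both titles in one vectorised step, so no separate counter is needed. It sets stringsAsFactors = TRUE explicitly to reproduce the factor columns from the question (since R 4.0.0 data.frame() no longer creates factors by default). The names wind_chr, date_chr and titles are only for illustration, and the title wording is slightly simplified.

```r
# stringsAsFactors = TRUE reproduces the factor columns from the question
winddir <- data.frame(
  Date = c("2017-02-15", "2017-02-16"),
  Wind = c("S", "SW"),
  stringsAsFactors = TRUE
)

# Convert the factors to their labels before using them in text
wind_chr <- as.character(winddir$Wind)
date_chr <- as.character(winddir$Date)

# paste0() is vectorised: one title per row of winddir
titles <- paste0("Average UFP concentration per hour on ", date_chr,
                 ", Wind: ", wind_chr)
print(titles)
```

Inside the loop, iterating with for (k in seq_along(date_chr)) and using titles[k] would then pick the title that belongs to each day.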

Creating a boxplot loop with ggplot2 for only certain variables

I have a dataset with 99 observations and I need to create boxplots for the ones with a specific string in their name. However, when I run this code I get 57 copies of the exact same plot from the original function instead of one plot per iteration of the loop. How can I stop the plots from being overwritten and still create all 57? Here is the code and a picture of the plot.
Thanks!
#starting boxplot function
myboxplot <- function(mydata=ivf_dataset, myexposure =
"ART_CURRENT", myoutcome = "MEG3_DMR_mean")
{bp <- ggplot(ivf_dataset, aes(ART_CURRENT, MEG3_DMR_mean))
bp <- bp + geom_boxplot(aes(group =ART_CURRENT))
}
#pulling out variables needed for plots
outcomes = names(ivf_dataset)[grep("_DMR_", names(ivf_dataset),
ignore.case = T)]
#creating loop for 57 boxplots
allplots <- list()
for (i in seq_along(outcomes))
{
allplots[[i]]<- myboxplot (myexposure = "ART_CURRENT", myoutcome =
outcomes[i])
}
allplots
I recommend reading about standard and non-standard evaluation and how this works with the tidyverse. Here are some links:
http://adv-r.had.co.nz/Functions.html#function-arguments
http://adv-r.had.co.nz/Computing-on-the-language.html
I also found this useful:
https://rstudio-pubs-static.s3.amazonaws.com/97970_465837f898094848b293e3988a1328c6.html
Also, you need to produce an example so that it is possible to replicate your problem. Here is the data that I created.
df <- data.frame(label = rep(c("a","b","c"), 5),
x = rnorm(15),
y = rnorm(15),
x2 = rnorm(15, 10),
y2 = rnorm(15, 5))
I kept most of your code the same and only changed what needed to be changed.
myboxplot2 <- function(mydata = df, myexposure, myoutcome){
bp <- ggplot(mydata, aes_(as.name(myexposure), as.name(myoutcome))) +
geom_boxplot()
print(bp)
}
myboxplot2(myexposure = "label", myoutcome = "y")
Because aes() uses non-standard evaluation, you need to use aes_(). Again, read the links above.
Here I am getting all the columns that start with x. I am assuming that your code gets the columns that you want.
outcomes <- names(df)[grep("^x", names(df), ignore.case = TRUE)]
Here I am looping through in the same way that you did. I am only storing the plot object though.
allplots <- list()
for (i in seq_along(outcomes)){
# myboxplot2() already returns the plot (print() returns its argument invisibly), so store it directly
allplots[[i]] <- myboxplot2(myexposure = "label", myoutcome = outcomes[i])
}
allplots
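Note that aes_() has since been deprecated in ggplot2 in favour of tidy evaluation. A minimal sketch of the same loop using the .data pronoun, assuming ggplot2 3.0.0 or later, could look like this. The function name boxplot_by_name is hypothetical, and the function returns the plot rather than printing it, so the list can be printed or saved later.

```r
library(ggplot2)

set.seed(1)
df <- data.frame(label = rep(c("a", "b", "c"), 5),
                 x = rnorm(15),
                 y = rnorm(15),
                 x2 = rnorm(15, 10),
                 y2 = rnorm(15, 5))

# Hypothetical replacement for myboxplot2(): column names arrive as strings
boxplot_by_name <- function(mydata, myexposure, myoutcome) {
  ggplot(mydata, aes(.data[[myexposure]], .data[[myoutcome]])) +
    geom_boxplot() +
    ggtitle(myoutcome)
}

# All columns whose names start with "x"
outcomes <- grep("^x", names(df), value = TRUE)

# One plot per outcome, stored in a named list
allplots <- lapply(outcomes, function(o) boxplot_by_name(df, "label", o))
names(allplots) <- outcomes
```

Calling print(allplots[["x2"]]) then draws a single plot, and looping over allplots inside pdf() and dev.off() would write them all to one file.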

R Data Structure Setup for Reproducible Research

Background
I get hourly interval reports on equipment in buildings, a lot of buildings and a lot of equipment. Each parameter on the equipment is called a point and they already have a name, I don't get to choose the name of the point. Each point name is unique. What I'm trying to do is run a standard report on each building. Eventually, I'd like to move this to Shiny and look at my graphs and maybe print a report from there, but... baby steps.
Question
Am I on the right track? Is there a more efficient way of doing this? Am I going to run into problems when I start to write Markdown reports or transfer this over to Shiny?
Sample Code
library(tidyverse)
set.seed(55)
test_func <- function(pointa, pointb, mult) {
out = (pointb - pointa) * mult
return(out)
}
test_fail <- function(pointa, pointb) {
out = ifelse(pointa > (pointb - 9), 1, 0)
return(out)
}
tbl.data <- data.frame(
date = c(rep("2/1/2018", 24),
rep("2/2/2018", 24),
rep("2/3/2018", 24),
rep("2/4/2018", 24),
rep("2/5/2018", 24),
rep("2/6/2018", 24),
rep("2/7/2018", 24)),
hour = rep(0:23, 7),
equipa.vala = runif(168, min = 50, max = 60),
equipb.vala = runif(168, min = 50, max = 60)
) %>%
mutate(
equipa.valb = 10 + equipa.vala * runif(168, min = 0.75, max = 1.25),
equipb.valb = 10 + equipb.vala * runif(168, min = 0.75, max = 1.25)
)
tbl.equip <- data.frame(
equipment.id = c(1,2),
equipment.name = c("equipa", "equipb"),
equipment.mult = c(5, 7)
)
tbl.point <- data.frame(
point = c("equipa.vala", "equipa.valb", "equipb.vala", "equipb.valb"),
equipment = c("equipa", "equipa", "equipb", "equipb"),
category = c("vala", "valb", "vala", "valb")
)
for (eq in tbl.equip[,2]) {
vala <- as.character(
tbl.point$point[tbl.point$equipment == eq &
tbl.point$category == "vala"]
)
valb <- as.character(
tbl.point$point[tbl.point$equipment == eq &
tbl.point$category == "valb"]
)
equip.mult <- as.numeric(
tbl.equip$equipment.mult[tbl.equip$equipment.name == eq]
)
for.data <- tbl.data %>%
select_(cola = vala,
colb = valb) %>%
mutate(
result = test_func(cola, colb, equip.mult),
fault = test_fail(cola, colb)
)
score <- sum(for.data$fault)/length(for.data$fault)
savings <- sum(for.data$result[for.data$result > 0])
p1 <- ggplot(for.data, aes(x = colb, y = cola, color = as.factor(fault))) +
geom_point() +
annotate("text", label = paste("savings is:", savings), x = 50, y = 60) +
annotate("text", label = paste("score is:", score), y = 51, x = 80) +
ggtitle(paste("Equipment:", eq)) +
theme_minimal()
print(p1)
}
Explanation
So in this sample, the tbl.data data frame would be the data I receive from each building. I'd have to manually create the tbl.equip and tbl.point data frames, which I'd just house in *.csv files on my machine or in a database (and be able to add/edit in Shiny). There's no standard for point names and there's no guarantee that each piece of equipment has each point, so using select() helpers such as contains() or starts_with() is out of the question.
So I just created an Equipment table, which has parameters on the equipment, (in this case a multiple). Also, there's a Point table, which tells which piece of equipment and which category each point belongs to.
For this simple example, I included two sample functions. One calculates a value based on the data, the other tests for a fault. My biggest problem in the past has been that when a piece of equipment doesn't have a point, execution stops, so I have to manually go in and take it out or something else. I guess the workaround is to use exists() or something similar and test before running that piece of code.
Again, for this simple example, I just printed a plot, but the output could be a Markdown document (which I think I've done before, but not like this) or Shiny (for which I've created some simpler apps).
Conclusion
The big question is "Is this the "right" way of doing it?" I'm sure this is pretty common and there has to be a really efficient method I'm not using. What's going to set me up for success when I start writing code to print out reports or taking this into a Shiny App?
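For the missing-point case described in the Explanation, one possible sketch is to check each point name against names(tbl.data) before using it and skip equipment that lacks a point. This is only an illustration in base R with cut-down data. The helper get_point and the results list are hypothetical names, and equipb.valb is deliberately left out of tbl.data to simulate a missing point.

```r
tbl.point <- data.frame(
  point = c("equipa.vala", "equipa.valb", "equipb.vala", "equipb.valb"),
  equipment = c("equipa", "equipa", "equipb", "equipb"),
  category = c("vala", "valb", "vala", "valb"),
  stringsAsFactors = FALSE
)

# Cut-down data: equipb.valb is missing on purpose
tbl.data <- data.frame(
  equipa.vala = c(55, 52),
  equipa.valb = c(70, 58),
  equipb.vala = c(51, 59)
)

# Hypothetical helper: the point name if it is defined and present, else NA
get_point <- function(eq, cat) {
  pt <- tbl.point$point[tbl.point$equipment == eq & tbl.point$category == cat]
  if (length(pt) == 1 && pt %in% names(tbl.data)) pt else NA_character_
}

results <- list()
for (eq in unique(tbl.point$equipment)) {
  vala <- get_point(eq, "vala")
  valb <- get_point(eq, "valb")
  if (is.na(vala) || is.na(valb)) {
    message("Skipping ", eq, ": a required point is missing")
    next
  }
  # [[ ]] selects columns by name, so select_() is not needed
  results[[eq]] <- data.frame(cola = tbl.data[[vala]], colb = tbl.data[[valb]])
}
```

Collecting the per-equipment results in a list instead of printing inside the loop may also make it easier to reuse the same code from a Markdown report or a Shiny app.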

R - visualising data over time

I'm trying to plot a dataset over time (timeframe of ms/s). I need to show the order of events, the type of event and the duration of each event + the time between events. The dataset consists of a start time, end time and category.
I got close with this code someone used to answer a similar question back in '11 but found that I couldn't get it to colour the events according to the category, and I don't understand what the code is doing well enough to fix the issue.
zucchini <- function(st, en, mingap=1)
{
i <- order(st, en-st);
st <- st[i];
en <- en[i];
last <- r <- 1
while( sum( ok <- (st > (en[last] + mingap)) ) > 0 )
{
last <- which(ok)[1];
r <- append(r, last);
}
if( length(r) == length(st) )
return( list(c = list(st[r], en[r]), n = 1 ));
ne <- zucchini( st[-r], en[-r]);
return(list( c = c(list(st[r], en[r]), ne$c), n = ne$n+1));
}
coliflore <- function(st, en)
{
zu <- zucchini(st, en, mingap = 1);
plot.new();
plot.window( xlim=c(min(st), max(en)), ylim = c(0, zu$n+1));
box(); axis(1);
for(i in seq(1, 2*zu$n, 2))
{
x1 <- zu$c[[i]];
x2 <- zu$c[[i+1]];
for(j in 1:length(x1))
rect( x1[j], (i+1)/2, x2[j], (i+1)/2+0.5,col=data$Type, border="black",
);
legend('bottomright', legend = levels(data$Type), col = 1:10, cex = 0.8, pch = 1)}
}
st <- data$Time
en <- data$End
coliflore(st,en)
With the current code, as best as I can tell, all boxes are assigned the same colour, that of the category of the first data point.
Does anyone know either: how to get this code to assign colours to the boxes based on a category, or how to accomplish this kind of plotting another way?
It's a little hard for me to see what's going on without a toy dataset for your example. For maximum control over coloring in plots, I like to add a color column to the dataframe, or create a vector of color values to use in plotting, instead of using the factor levels to generate colors (e.g. data$Type). For instance, if I want factors 1:3 to be red, green, and blue:
# create data frame with X,Y coordinates and 3 factor levels
toy_data<- data.frame (X= 1:9, Y=9:1, Factor = rep(1:3, times=3))
# create a vector of colors to use for plotting
# color function
colFxn<-function(val){
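# stringsAsFactors = FALSE keeps the colors as character strings in every R version
cw_df<-data.frame(value=1:3, color = c("red", "green", "blue"), stringsAsFactors = FALSE)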
return(cw_df[cw_df$value %in% val,]$color)
}
col_vec<-sapply (toy_data$Factor, colFxn)
#plot
plot(toy_data$X, toy_data$Y, col=col_vec)
I prefer this option because of the control it gives me over my colors. It can also be extended to transparent colors by changing the alpha value in the rgb() function, or by using a color palette from one of the many packages that provide them.
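Applied to event data like that in the question, the same idea could be sketched as follows. The events data frame is made up for illustration, a named vector stands in for the lookup function, and adjustcolor() from grDevices is used here for transparency instead of rgb(). Because rect() is vectorised, passing one colour per event gives each box its own fill.

```r
# Hypothetical event data: start, end and category of each event
events <- data.frame(
  start = c(0, 2, 5, 9),
  end = c(1.5, 4, 8, 10),
  type = c("a", "b", "a", "c"),
  stringsAsFactors = FALSE
)

# One colour per category, looked up by name
type_cols <- c(a = "red", b = "green", c = "blue")
col_vec <- unname(type_cols[events$type])

# Semi-transparent versions of the same colours
col_vec_alpha <- adjustcolor(col_vec, alpha.f = 0.5)

plot.new()
plot.window(xlim = range(events$start, events$end), ylim = c(0, 2))
box()
axis(1)
# One rectangle per event, each with its own fill colour
rect(events$start, 0.5, events$end, 1.5, col = col_vec_alpha, border = "black")
legend("bottomright", legend = names(type_cols), fill = type_cols, cex = 0.8)
```

If the events are reordered before plotting, as zucchini() does with st and en, the colour vector would need to be reordered in the same way.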

Resources