Creating a subplot after adding plots using for loop - julia

I have a dataframe with an id column. I want to filter the dataframe for a specific id and then add data from its columns to a plot (this part is done in a for loop). Finally, I want to combine the plots (3 in this case) as subplots of one figure. I have achieved it like this, but the plot I end up with is incorrect: the top two subplots are empty and all the data seems to end up in the third subplot. Does anyone have an idea what I am doing wrong?
function plotAnom(data::DataFrame)
    data = copy(data)
    uniqIds = unique(data.id)
    # Create empty plots
    p1 = plot()
    p2 = plot()
    p3 = plot()
    for id in uniqIds
        # Filter dataframe based on id
        df = data[data.id .== id,:]
        p1 = plot!(
            df.time,
            df[:,names(df)[3]],
            label = id,
            line = 3,
            legend = :bottomright,
        )
        p1 = plot!(
            df.time,
            df.pre_month_avg,
            label = id,
            line = 2
        )
        p2 = plot!(
            df.time,
            df.diff,
            line = 3
        )
        p3 = plot!(
            df.time,
            df.cumulative_diff,
            line = 3
        )
    end
    p = plot(p1,p2,p3,layout = (3,1), legend = false)
    return p
end

In your example, every plot! call updates the most recently created plot, which is p3. plot! modifies the plot you pass as its first argument; if you don't pass one, it modifies the current (latest) plot. So I think you should write plot!(p1, ...) instead of p1 = plot!(...), and likewise for p2 and p3.
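The difference between the two call styles can be illustrated with a small toy model. This sketch is written in Python and does not use Plots.jl at all: `Plot`, `new_plot` and `add_series` are hypothetical names that only imitate the "current plot" behaviour described in the answer.

```python
# Toy model of an implicit "current plot". This is NOT the Plots.jl API;
# Plot, new_plot and add_series are hypothetical names for illustration.

class Plot:
    def __init__(self):
        self.series = []


_current = None  # the most recently created plot


def new_plot():
    """Create an empty plot and make it the current one (like plot())."""
    global _current
    _current = Plot()
    return _current


def add_series(*args):
    """Add a series (like plot!).

    If the first argument is a Plot, draw on it;
    otherwise draw on the current plot.
    """
    if args and isinstance(args[0], Plot):
        target, data = args[0], args[1]
    else:
        target, data = _current, args[0]
    target.series.append(data)
    return target


# Implicit target: every series lands on p3, the last plot created,
# and all three names end up bound to that same object.
p1, p2, p3 = new_plot(), new_plot(), new_plot()
p1 = add_series("observed")
p2 = add_series("diff")
p3 = add_series("cumulative_diff")
implicit_result = (len(p3.series), p1 is p3, p2 is p3)

# Explicit target: each series lands on the plot that was named.
q1, q2, q3 = new_plot(), new_plot(), new_plot()
add_series(q1, "observed")
add_series(q2, "diff")
add_series(q3, "cumulative_diff")
explicit_counts = [len(q.series) for q in (q1, q2, q3)]
```

In the implicit version the reassignment `p1 = add_series(...)` also discards the original empty plot, which mirrors what `p1 = plot!(...)` does in the question.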

Related

Create Loop for Box Plots in R

I have a dataset that contains both numeric and categorical values. I am trying to create box plots to visually identify outliers for each numeric column in my dataset. The code below works, but it is very clunky and I would not want to use it with even more variables. I am looking for a way to create the box plots with a loop in R.
Here is the clunky code that works without a loop:
# Using boxplots, check for outliers in each float or integer column.
b <-boxplot(df$item1, main = 'item1')
b <-boxplot(df$item2, main = 'item2')
b <-boxplot(df$item3, main = 'item3')
b <-boxplot(df$item4, main = 'item4')
b <-boxplot(df$item5, main = 'item5')
b <-boxplot(df$item6, main = 'item6')
b <-boxplot(df$item7, main = 'item7')
b <-boxplot(df$item8, main = 'item8')
b <-boxplot(df$item9, main = 'item9')
b <-boxplot(df$item10, main = 'item10')
b <-boxplot(df$item11, main = 'item11')
b <-boxplot(df$item12, main = 'item12')
b <-boxplot(df$item13, main = 'item13')
b <-boxplot(df$item14, main = 'item14')
b <-boxplot(df$item15, main = 'item15')
b <-boxplot(df$item16, main = 'item16')
In python the code would be:
outliers = ['Item1', 'Item2', 'Item3', 'Item4', 'Item5', 'Item6', 'Item7', 'Item8', 'Item9', 'Item10', 'Item11', 'Item12', 'Item13', 'Item14', 'Item15', 'Item16']
i = 0
while i < len(outliers):
    sns.boxplot(x = outliers[i], data = df)
    plt.show()
    i = i + 1
(I am looking for something similar in R!)
Thank you!
You could use a for loop over the column names. Here is a minimal reprex based on mtcars:
outliers <- c("mpg", "hp")
for (i in outliers) {
    boxplot(mtcars[i], main = i)
}

How to change the color of outliers of certain category in boxplot()?

Put simply, I want to color outliers, but only if they belong to specific category, i.e. I want
boxplot(mydata[,2:3], col=c("chartreuse","gold"), outcol="red")
but with red only for those elements for which mydata[,1] is M.
It appears that outcol only specifies one color per variable (box). However, you can use points to overplot individual points any way that you want. You need to figure out the relevant x and y coordinates to use for plotting. When you make a boxplot with a statement like boxplot(mydata[,2:3]) the first variable (column 2) is plotted at x=1 and the second variable (column 3) is plotted at x=2. By capturing the return value of boxplot you can figure out the y values. Since you do not provide any data, I will illustrate with randomly generated data.
## Data
set.seed(42)
NumPts = 400
a = rnorm(NumPts)
b = rnorm(NumPts)
c = rnorm(NumPts)
CAT = sample(c("M", "N"), NumPts, replace=T)
mydata = data.frame(a,b,c, CAT)
## Find outliers
BP = boxplot(mydata[,2:3], col=c("chartreuse","gold"))
OUT2 = which(mydata[,2] %in% BP$out)
OUT3 = which(mydata[,3] %in% BP$out)
## Find outliers with category == M
M_OUT2 = OUT2[which(mydata$CAT[OUT2] == "M")]
M_OUT3 = OUT3[which(mydata$CAT[OUT3] == "M")]
## Plot desired points
points(rep(1, length(M_OUT2)),mydata[M_OUT2, 2], col="red")
points(rep(2, length(M_OUT3)),mydata[M_OUT3, 3], col="red")
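The selection logic (find the outliers, then keep only those in category M) can also be sketched without any plotting. The Python below is an illustration under assumptions: it uses the common 1.5 × IQR rule with quartiles from `statistics.quantiles`, which can differ slightly from the hinges R's boxplot uses, and `outlier_indices`, `values` and `cats` are made-up names and data. It works with row indices rather than matching on values.

```python
from statistics import quantiles


def outlier_indices(values, coef=1.5):
    """Indices of values outside [Q1 - coef*IQR, Q3 + coef*IQR].

    Assumption: quartiles come from statistics.quantiles (inclusive
    method); R's boxplot uses hinges, which may differ slightly.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    spread = coef * (q3 - q1)
    lower, upper = q1 - spread, q3 + spread
    return [i for i, v in enumerate(values) if v < lower or v > upper]


# Made-up data: one numeric column and one category column.
values = [4, 100, 2, 7, -50, 5, 1, 9, 3, 8, 6]
cats = ["M", "M", "N", "M", "N", "N", "M", "N", "M", "N", "M"]

out = outlier_indices(values)                # all outlier rows
m_out = [i for i in out if cats[i] == "M"]   # outliers in category M only

# (x, y) pairs to overplot in red for a box drawn at x = 1
red_points = [(1, values[i]) for i in m_out]
```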

R - visualising data over time

I'm trying to plot a dataset over time (timeframe of ms/s). I need to show the order of events, the type of event and the duration of each event + the time between events. The dataset consists of a start time, end time and category.
I got close with this code someone used to answer a similar question back in '11 but found that I couldn't get it to colour the events according to the category, and I don't understand what the code is doing well enough to fix the issue.
zucchini <- function(st, en, mingap=1)
{
    i <- order(st, en-st);
    st <- st[i];
    en <- en[i];
    last <- r <- 1
    while( sum( ok <- (st > (en[last] + mingap)) ) > 0 )
    {
        last <- which(ok)[1];
        r <- append(r, last);
    }
    if( length(r) == length(st) )
        return( list(c = list(st[r], en[r]), n = 1 ));
    ne <- zucchini( st[-r], en[-r]);
    return(list( c = c(list(st[r], en[r]), ne$c), n = ne$n+1));
}
coliflore <- function(st, en)
{
    zu <- zucchini(st, en, mingap = 1);
    plot.new();
    plot.window( xlim=c(min(st), max(en)), ylim = c(0, zu$n+1));
    box(); axis(1);
    for(i in seq(1, 2*zu$n, 2))
    {
        x1 <- zu$c[[i]];
        x2 <- zu$c[[i+1]];
        for(j in 1:length(x1))
            rect( x1[j], (i+1)/2, x2[j], (i+1)/2+0.5,col=data$Type, border="black",
            );
        legend('bottomright', legend = levels(data$Type), col = 1:10, cex = 0.8, pch = 1)
    }
}
st <- data$Time
en <- data$End
coliflore(st,en)
The current code produces a plot, but as best I can tell it assigns all boxes the same colour, that of the category of the first data point.
Does anyone know either: how to get this code to assign colours to the boxes based on a category, or how to accomplish this kind of plotting another way?
It's a little hard for me to see what's going on without a toy dataset for your example. For maximum control over colouring in plots, I like to add a color column to the dataframe, or create a vector of color values to use when plotting, instead of using the factor levels (e.g. data$Type) to generate colors. For instance, if I want factors 1:3 to be red, green, and blue:
# create data frame with X,Y coordinates and 3 factor levels
toy_data<- data.frame (X= 1:9, Y=9:1, Factor = rep(1:3, times=3))
# create a vector of colors to use for plotting
# color function
colFxn <- function(val){
    cw_df <- data.frame(value=1:3, color = c("red", "green", "blue"))
    return(cw_df[cw_df$value %in% val,]$color)
}
col_vec<-sapply (toy_data$Factor, colFxn)
#plot
plot(toy_data$X, toy_data$Y, col=col_vec)
I prefer this option because of the control it gives me over my colors. It can also be extended to transparent colors by setting the alpha value in the rgb() function, or by using a color palette from one of the many packages that provide them.
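The lookup idea itself is independent of R. As a rough Python sketch, a dictionary can play the role of the cw_df table; `COLOR_BY_LEVEL` and `colors_for` are hypothetical names, and the `default` fallback for unknown levels is an addition that the R code above does not have.

```python
# Hypothetical lookup table: factor level -> plotting colour.
COLOR_BY_LEVEL = {1: "red", 2: "green", 3: "blue"}


def colors_for(levels, palette=COLOR_BY_LEVEL, default="grey"):
    """Return one colour per element of levels, in the same order."""
    return [palette.get(level, default) for level in levels]


factor = [1, 2, 3] * 3          # same pattern as rep(1:3, times=3)
col_vec = colors_for(factor)    # one colour per data point
```

The result holds one colour per element, so when shapes are drawn one call at a time, each call would take the single entry that belongs to the shape being drawn rather than the whole vector.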

R - If else statement within for loop

I have a data frame with 3 columns of data that I would like to plot separately - 3 plots. The data has NA in it (in different places within the 3 columns). I basically want to interpolate the missing values and plot that segment of the line (multiple sections) in red and the remainder of the line black.
I have managed to use 'zoo' to create the interpolated data but am unsure how to plot those data points in a different colour. I have found the following question: "Elegant way to select the color for a particular segment of a line plot?"
I was thinking I could use a for loop with an if else statement to create the colour column as advised in that link. I would need 3 separate colour columns as I have 3 datasets.
Appreciate any help - cannot really provide an example as I'm unsure where to start! Thanks
This is my solution. It assumes that the NAs are still present in the original data. These will be omitted in the first plot() command. The function then loops over just the NAs.
You will probably get finer control if you take the plot() command out of the function. As written, "..." gets passed to plot() and a type = "b" graph is mimicked - but it's trivial to change it to whatever you want.
# Function to plot interpolated values in specified colours.
PlotIntrps <- function(exxes, wyes, int_wyes, int_pt = "red", int_ln = "grey",
                       goodcol = "darkgreen", ptch = 20, ...) {
    plot(exxes, wyes, type = "b", col = goodcol, pch = ptch, ...)
    nas <- which(is.na(wyes))
    enn <- length(wyes)
    # seq_along() also behaves correctly when there is only one NA
    for (idx in seq_along(nas)) {
        points(exxes[nas[idx]], int_wyes[idx], col = int_pt, pch = ptch)
        lines(
            x = c(exxes[max(nas[idx] - 1, 1)], exxes[nas[idx]],
                  exxes[min(nas[idx] + 1, enn)]),
            y = c(wyes[max(nas[idx] - 1, 1)], int_wyes[idx],
                  wyes[min(nas[idx] + 1, enn)]),
            col = int_ln, type = "c")
        # Only needed if you have 2 (or more) contiguous NAs (interpolations)
        wyes[nas[idx]] <- int_wyes[idx]
    }
}
# Dummy data (jitter() for some noise)
x_data <- 1:12
y_data <- jitter(c(12, 11, NA, 9:7, NA, NA, 4:1), factor = 3)
interpolations <- c(10, 6, 5)
PlotIntrps(exxes = x_data, wyes = y_data, int_wyes = interpolations,
           main = "Interpolations in pretty colours!",
           ylab = "Didn't manage to get all of these")
Cheers.
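For the "colour column" idea from the question, the key step is remembering which points were missing before they were filled in. The Python sketch below is only an illustration: it assumes plain linear interpolation between the nearest known neighbours, and `interpolate_gaps` is a hypothetical helper, not a function from zoo.

```python
def interpolate_gaps(xs, ys):
    """Linearly fill None entries in ys that lie between known values.

    Returns (filled, was_missing). Leading or trailing None values
    are left as None because they have no neighbour on one side.
    """
    filled = list(ys)
    was_missing = [y is None for y in ys]
    known = [i for i, y in enumerate(ys) if y is not None]
    for left, right in zip(known, known[1:]):
        for i in range(left + 1, right):
            t = (xs[i] - xs[left]) / (xs[right] - xs[left])
            filled[i] = ys[left] + t * (ys[right] - ys[left])
    return filled, was_missing


# Same shape as the dummy data above, without the jitter.
xs = list(range(1, 13))
ys = [12, 11, None, 9, 8, 7, None, None, 4, 3, 2, 1]

filled, was_missing = interpolate_gaps(xs, ys)

# One colour per point: red where the value was interpolated.
point_colours = ["red" if missing else "black" for missing in was_missing]
```

With three data columns, the same function would be called once per column, giving three independent colour vectors.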

time series graphs with nPlot

I'm trying to plot a time series graph with nPlot and am having difficulty getting the X-axis labels to display the way I want.
I've looked into whether this problem has come up before. It has, but without a solution yet (as far as I managed to find). Is a solution available by now?
In this case I get an X-axis ranging from -1 to 1, and no lines on the graph:
date = c("2013-07-22", "2013-07-29" ,"2013-08-05", "2013-08-12", "2013-08-19","2013-08-26", "2013-09-02" ,"2013-09-09" ,"2013-09-16")
test = as.data.frame(date)
test$V1 = c("10","11","13","12","11","10","15","12","9")
test$V2 = c("50","51","53","52","51","50","55","52","59")
test1 = melt(test,id = c("date"))
n1 = nPlot(value ~ date, group = "variable", data = test1, type="lineWithFocusChart")
If I add the following line and then plot again:
test1$date = as.Date(test1$date)
I get the wanted graph but the X-axis labels are in their numeric form (15900..)
Thanks.
Here is one way to make it work. I have made some changes to your code. First, I have made V1 and V2 numeric, since you want to plot numbers on the y axis. Second, I have added a utility function to_jsdate that takes the character date and converts it into the number of seconds since 1970-01-01; the tick formatter then multiplies by 1000 to get the milliseconds that a JavaScript Date expects. Date handling is still a little raw in rCharts, but we are working on making it better.
library(rCharts)  # provides nPlot
date = c("2013-07-22", "2013-07-29" ,"2013-08-05", "2013-08-12", "2013-08-19",
         "2013-08-26", "2013-09-02" ,"2013-09-09" ,"2013-09-16")
test = as.data.frame(date)
test$V1 = as.numeric(c("10","11","13","12","11","10","15","12","9"))
test$V2 = as.numeric(c("50","51","53","52","51","50","55","52","59"))
test1 = reshape2::melt(test,id = c("date"))
# Convert a character date to seconds since 1970-01-01
to_jsdate <- function(date_){
    val = as.POSIXct(as.Date(date_),origin="1970-01-01")
    as.numeric(val)
}
test1 = transform(test1, date2 = to_jsdate(date))
n1 = nPlot(value ~ date2, group = "variable", data = test1, type="lineWithFocusChart")
# Format ticks as dates; d is in seconds, JavaScript Date wants milliseconds
n1$xAxis(tickFormat = "#! function(d){
    return d3.time.format('%Y-%m-%d')(new Date(d*1000))
} !#")
n1
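The unit conversion is the part that is easy to get wrong, so here is the arithmetic on its own as a small Python sketch. `to_epoch_seconds` and `to_js_milliseconds` are hypothetical names, and the dates are treated as midnight UTC.

```python
from datetime import date

EPOCH = date(1970, 1, 1)


def to_epoch_seconds(iso_date):
    """Seconds from 1970-01-01 to midnight UTC of a 'YYYY-MM-DD' date."""
    return (date.fromisoformat(iso_date) - EPOCH).days * 86400


def to_js_milliseconds(iso_date):
    """Milliseconds since the epoch, the unit a JavaScript Date expects."""
    return to_epoch_seconds(iso_date) * 1000


days = (date.fromisoformat("2013-07-22") - EPOCH).days   # 15908
seconds = to_epoch_seconds("2013-07-22")                 # 1374451200
millis = to_js_milliseconds("2013-07-22")                # 1374451200000
```

The day count of 15908 is the kind of number the question reports seeing on the axis ("15900.."), since a Date is stored as days since 1970-01-01.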

Resources