geom_bspline across multiple plots combined into a single figure - r

I would like to create a ggplot2 layer that includes multiple geom_bspline() calls, or something similar, to point to regions on different plots after combining them into a single figure. A feature in the data seen in one plot appears in another plot after a transformation. However, it may not be clear to a non-expert that both are due to the same phenomenon. The plots are to be combined into a single figure using ggarrange(), cowplot, patchwork or something similar.
I can get by using ggforce::geom_ellipse() on each plot but it's not as clean. Any suggestions?

Of course, after asking the question and staring at the figure, it came to me that I simply need to add a geom_bspline() to the combined figure. I had tried that earlier but didn't give enough thought to the coordinates on the new layer: the spline's coordinates are given in the range 0 to 1 for both x and y. Simple and obvious.
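A minimal sketch of that approach, assuming cowplot is used to combine the plots and ggforce supplies geom_bspline(). The two plots and the control points are made up for illustration. ggdraw() wraps the combined figure in a drawing layer whose x and y both run from 0 to 1, so the control points are given in those figure-relative units:

```r
library(ggplot2)
library(cowplot)
library(ggforce)

# Two illustrative plots (hypothetical data): the same points before and
# after a log transformation
p1 <- ggplot(mtcars, aes(wt, mpg)) + geom_point()
p2 <- ggplot(mtcars, aes(log(wt), log(mpg))) + geom_point()

# Combine them side by side into a single figure
combined <- plot_grid(p1, p2, ncol = 2)

# Control points in figure coordinates (0 to 1 on both axes), chosen by eye
ctrl <- data.frame(
  x = c(0.30, 0.40, 0.60, 0.75),
  y = c(0.70, 0.95, 0.95, 0.70)
)

# Draw the spline on top of the combined figure
annotated <- ggdraw(combined) +
  geom_bspline(data = ctrl, aes(x = x, y = y), colour = "red")
```

Printing `annotated` shows both panels with one spline arching from the left panel to the right one. The control points usually need a few rounds of adjustment to land on the intended regions.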

Related

Issues with combining different (continuous and ordinal) plot types into one plot

I am preparing a figure for a paper presenting data for 2 different experiments in one plot. For that reason I don't need a legend for every plot, so I try to combine them with ggdraw from cowplot.
My code
should generate a reproducible example
and gives this output:
It seems like the two figures get the same slot (A) and the legend gets slot (B). Typically, I would probably use facet wrap to plot them together (which should also guarantee that the scaling/legend is consistent across the two plots.), but that will probably not work in this case, as I am trying to add an additional figure type to C and D.
The problem is that this figure type is ordinal so I have used a somewhat “hacky” approach to plot it, giving me this figure looking essentially as I want it to:
So far I have not been able to extract it into another element that ggdraw can use.
Ideally the final plot should roughly look like this (of course with different labels):
How would you go about plotting these different types together?
Thank you for taking the time to read my question; I hope that you can help me. I know it is quite a mouthful, but I was not sure how I could meaningfully reduce it to smaller chunks.

Force starting point of lines()

Perhaps because the question is so basic, the keywords that I can think of for this question all direct me to other things. I am trying to draw a graph with spiky curve lines that connect the medians. The real data is very big, but the starting values are duplicates of (0,0):
DATA<-data.frame(time=c(sort(rep(c(0,2,4,8,12),4))),
conc=c(rep(0,4),rnorm(n=4,mean=30),
rnorm(n=4,mean=10),
rnorm(n=4,mean=35),
rnorm(n=4,mean=15)))
# Create blank graph
plot(NULL,NULL,xlab="Time",ylab="Conc",
xlim=c(0,15),ylim=c(0,40),main="Example")
# Add line
require(quantreg)
require(plyr)
require(MatrixModels)
DATA<-plyr::arrange(DATA,time)
fit3<-rqss(DATA$conc~qss(DATA$time,constraint="N"),tau=0.5,data = DATA)
lines(unique(DATA$time)[-1],fit3$coef[1] + fit3$coef[-1],lwd=2)
As you can see, the line does not connect to the starting (0,0) values and instead starts at the next lowest level.
I was tempted to cheat, but it does not connect to the lines and I would really prefer to work it out with the rest of the code instead of trying to pass off two lines as one:
# Cheating getaway but does not work well, segments are not connected
segments(x0=0,y0=0,x1=2,y1=30,lwd=2)
Some relevant answers that I found were not appropriate for my situation.
Line in R plot should start at a different timepoint, for example, suggests modifying the data, which would not help to extend my line; besides, my actual data is so big that I would be wary of this kind of manipulation. I would not want to use plot(x,y,type="l") even though it goes through the (0,0) point, because 1) it looks bad on the huge data, and 2) I would have to overlay another similar line using lines(). I wonder whether it has more to do with rqss and less with lines?
I apologize if this has already been asked before.

R - Adding series to multiple plots

I have the following plot:
plot.ts(returns)
I have another dataframe ma_sd which contains the rolling SD from moving averages of the above returns. The df is structured exactly like returns. Is there a simple way to add each line to the corresponding plots?
lines(1:N, ma_sd) seemed intuitive, but it does not work.
Thanks
The only way I can see you doing this is to plot them separately. This code is a bit clunky but will allow you full flexibility to be able to specify labels and axis ranges. You can build on this.
# Stack three panels; the outer margins (oma) leave room for the axes
par(mfrow=c(3,1),oma=c(5,4,4,2),mar=c(0,0,0,0))
# Common time index, one entry per row of returns
time<-seq_len(nrow(returns))
# In each panel, draw the returns, then overlay the rolling SD in red
plot(time,returns[,1],type='l',xaxt='n')
lines(time,ma_sd[,1],col='red')
plot(time,returns[,2],type='l',xaxt='n')
lines(time,ma_sd[,2],col='red')
plot(time,returns[,3],type='l')
lines(time,ma_sd[,3],col='red')

Intelligent Y Axis Scaling BarPlot R

I want to plot some data with barplot. Rather, I want to make a bar graph and barplot seemed the logical choice. I am plotting just fine but I was wondering if there is a way to intelligently scale the y axis to round up from the highest count.
For example I set the yaxis in this case to be 30, because I knew that Strand.22 had 27 counts in it: barplot(unlist(d), ylim=c(0,30), xlab="Forward Reverse", ylab="Counts")
In the future, I want this script to run on its own, so it would be optimal for the Y-axis to choose its own ylim. Short of pulling the information out of my 'd' variable I can't think of a good way to do this. Is there an easy way to do this with barplot? Would some other plotter work better? I have seen things about ggplot2 but it seemed super complex and I wasn't sure that it would do anything better.
EDIT: If I do not choose a ylim it picks automatically and this is what it decided was best.
I disagree with its choice.
If you don't specify ylim, R will come up with something based on the data. (Sounds like you don't like its choice, which is fair.)
If you specify something based on the data like:
barplot(unlist(d), ylim=c(0,1.1*max(unlist(d))))
R will draw you a plot that reflects the maximum value of data. That example just takes the maximum of your values and multiplies that by 1.1 (this could be any number) to give it a little extra height. R does something similar to this when you make a scatterplot but it handles barplots slightly differently.
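If the goal is an upper limit that lands on a round number rather than a fixed percentage above the maximum, one option is to round the maximum up to the next multiple of a step size. A minimal sketch, assuming `d` is a list of counts like the one in the question; the values and the helper name `round_up` are made up here:

```r
# Hypothetical counts standing in for the question's `d`
d <- list(Strand.11 = 12, Strand.12 = 19, Strand.21 = 8, Strand.22 = 27)
counts <- unlist(d)

# Round x up to the next multiple of `step`
# (illustrative helper, not a base R function)
round_up <- function(x, step = 10) ceiling(x / step) * step

# The largest count, 27, is rounded up to 30
upper <- round_up(max(counts))

barplot(counts, ylim = c(0, upper),
        xlab = "Forward Reverse", ylab = "Counts")
```

The step size is a judgment call: 10 suits counts in the tens, while much larger or smaller data would need a different step.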

How to avoid overplotting (for points) using base-graph?

I am on my way to finishing the graphs for a paper and decided (after a discussion on stats.stackoverflow), in order to transmit as much information as possible, to create the following graph that presents both the means in the foreground and the raw data in the background:
However, one problem remains and that is overplotting. For example, the marked point looks like it reflects one data point, but in fact 5 data points exist with the same value at that place.
Therefore, I would like to know if there is a way to deal with overplotting in base graph using points as the function.
It would be ideal if e.g., the respective points get darker, or thicker or,...
Manually doing it is not an option (too many graphs and points like this). Furthermore, I do not want to learn ggplot2 just to deal with this single problem (one reason is that I tend to like dual axes, which are not supported in ggplot2).
Update: I wrote a function which automatically creates the above graphs and avoids overplotting by adding vertical or horizontal jitter (or both): check it out!
This function is now available as raw.means.plot and raw.means.plot2 in the plotrix package (on CRAN).
Standard approach is to add some noise to the data before plotting. R has a function jitter() which does exactly that. You could use it to add the necessary noise to the coordinates in your plot. eg:
X <- rep(1:10,10)
Z <- as.factor(sample(letters[1:10],100,replace=TRUE))
plot(jitter(as.numeric(Z),factor=0.2),X,xaxt="n")
axis(1,at=1:10,labels=levels(Z))
Besides jittering, another good approach is alpha blending, which you can obtain (on graphics devices supporting it) as the fourth color parameter. I provided an example for 'overplotting' of two histograms in this SO question.
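A minimal base-graphics sketch of alpha blending, using made-up data with many ties. The fourth argument of rgb() sets the opacity, so positions that hold several points come out darker:

```r
set.seed(1)
# Hypothetical data: 200 points falling on only 5 x 5 distinct positions
x <- sample(1:5, 200, replace = TRUE)
y <- sample(1:5, 200, replace = TRUE)

# Fourth argument of rgb() is alpha (0 = transparent, 1 = opaque)
semi_black <- rgb(0, 0, 0, alpha = 0.2)

# Filled symbols (pch = 16) get darker where points overlap
plot(x, y, pch = 16, cex = 2, col = semi_black)
```

With alpha = 0.2, a position looks close to solid black once it holds many points, so a lower alpha may be needed when the ties are very large.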
One additional idea for the general problem of showing the number of points is using a rug plot (rug function), this places small tick marks along the margin that can show how many points contribute (still use jittering or alpha blending for ties). This allows the actual points to show their true rather than jittered values, but the rug can then indicate which parts of the plot have more values.
For the example plot direct jittering or alpha blending is probably best, but in some other cases the rug plot can be useful.
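A short sketch of the rug idea on made-up tied data: the points stay at their true positions, while jittered tick marks along the margins show how many values sit at each position.

```r
set.seed(1)
# Hypothetical data with many tied values
x <- sample(1:10, 100, replace = TRUE)
y <- sample(1:10, 100, replace = TRUE)

# Jitter only the copies used for the rug, not the plotted points
xj <- jitter(x)
yj <- jitter(y)

plot(x, y)          # points drawn at their true (tied) positions
rug(xj, side = 1)   # ticks along the bottom margin
rug(yj, side = 2)   # ticks along the left margin
```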
You may also use sunflowerplot, though it would be hard to implement here. I would use alpha blending, as Dirk suggested.
