Initializing and customizing an autoplot r-project object - r

My head is getting sore from me banging it so much.
I have a time-series that I've converted into an xts object w/ 7 variables. Now I'm trying to plot 4 of them, all price indices, on the same graph. I used autoplot (from the ggfortify package) to initialize the graph, and this is where the trouble begins.
Autoplot doesn't seem to work unless I give it at least one variable to plot. That's fine, but the two customizations I want for the variable -- its color and line type -- seem to have no effect.
But once I create the plot this way, I have little trouble adding the other 3 variables by adding geom_lines. Here's sort of what the code looks like:
p <- autoplot(foo.xts, xlab = "Year",
              ylab = "Price Index",
              columns = "Variable1", linetype = 4)  # the linetype accomplishes nothing
p <- p + geom_line(aes(y = "Variable2", color = "green", linetype = "solid"))
# etc. for the other 2 variables
p  # The 3 added variables do get the selected colors & line types.
But how can I customize the line for the first variable?
Then there's another problem in that I can't get a legend to appear. Here's how I'm trying to do that:
p <- p + scale_color_discrete(
name="Price Indices",
breaks=c("Variable1", "Variable2", "Variable3", "Variable4"),
labels=c("Index 1", "Index 2", "Index 3", "Index 4"))
This seems to accomplish nothing.
One thing I'd add is that in my various experiments trying to get the legend to work, I've sometimes gotten two sets of keys: one for colors and one for line types. This is obviously not what I'm after.
If someone could help me with this, I'd be forever in your debt!

I spent yesterday away from the computer, and when I returned in the evening fixed the problems. Here's how:
1. Stopped using autoplot. It's a classic case of hand-holding that throws you over the cliff: it automatically formats the plot in ways that are difficult (impossible?) to customize. Instead, ggplot makes the initial plot.
2. Since I'm making a series of plots, I moved all the shared features to a separate preamble section. This section creates a base plot, sets the x-axis variable (the date of the observation), labels the x-axis, and formats its tick marks. It also sets up standardized colors, line styles, and shapes to be used by all the "production" plots.
3. To set up the standardized elements, it uses scale_color_manual, etc. Each one has to be identical in all respects except those that are unique to its specific aesthetic attribute. E.g., scale_color_manual uses values like "red" whereas scale_linetype_manual uses values like "solid." Each manual setting includes the following elements: legend.title*, values, labels, and guide = guide_legend()*. (Items marked with * must be identical, otherwise you'll get a different legend for each one.) For each plot, the actual legend title is first stored in a variable, legend.title, and then used in all the manual scale settings. This way the manual settings can be moved to the common section, but each plot has its own unique title for its legend.
3A. Actually, I was wrong about this. I was thinking of LaTeX, where most things are evaluated where they appear at execution time, so a scale_color_manual statement at the start could change later on just by changing the value of legend.title. But in R, statements are evaluated sequentially, and changing legend.title after the scale_color_manual statement has been executed has no effect. I worked around this by defining several variables in the preamble (e.g., one with the colors I'm using) and then using these variables in the various scale_x_manual statements of each plot. This way, the only thing that changes is the legend title.
4. Then each production plot starts by copying the base plot, labels the y-axis, and adds the geometric objects that it needs.
This approach has several advantages. 1) It modularizes the plotting so that problems are easier to isolate and solve, and most problems solved in the preamble section are solved for all plots. 2) It standardizes the plots, ensuring that their common features are formatted identically. 3) It reduces each production plot to a few statements; since this is the unique part of each plot, creating a new style of plot becomes relatively easy. 4) The value added by autoplot becomes minimal: separating the shared elements into a preamble gives the same reuse, and the preamble, once debugged, allows much finer-grained customization.
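The preamble-plus-production-plot layout described above can be sketched as follows. This is only an illustration under stated assumptions, not the original code: the data frame prices, its columns, the chosen colors and line types, and the helper index_scales are all hypothetical. The helper takes the legend title as an argument, which is one way around the evaluation-order issue from point 3A.

```r
library(ggplot2)

# Hypothetical data in long format: one row per date and series.
set.seed(1)
dates  <- seq(as.Date("2000-01-01"), by = "year", length.out = 10)
series <- c("Variable1", "Variable2", "Variable3", "Variable4")
prices <- data.frame(
  date   = rep(dates, times = length(series)),
  series = rep(series, each = length(dates)),
  index  = 100 + rnorm(length(dates) * length(series))
)

# Preamble: shared style definitions, stored once (all values are assumptions).
series_labels    <- c("Index 1", "Index 2", "Index 3", "Index 4")
series_colors    <- c(Variable1 = "black", Variable2 = "green",
                      Variable3 = "blue",  Variable4 = "red")
series_linetypes <- c(Variable1 = "dotdash", Variable2 = "solid",
                      Variable3 = "dashed",  Variable4 = "dotted")

# Base plot: x-axis variable, label, and tick format shared by all plots.
base_plot <- ggplot(prices, aes(x = date)) +
  scale_x_date(date_labels = "%Y") +
  xlab("Year")

# The legend title is passed in, so it is evaluated when each plot is built.
# Both scales share the same name, breaks, and labels, which lets ggplot2
# merge them into a single legend.
index_scales <- function(legend_title) {
  list(
    scale_color_manual(name = legend_title, values = series_colors,
                       breaks = series, labels = series_labels),
    scale_linetype_manual(name = legend_title, values = series_linetypes,
                          breaks = series, labels = series_labels)
  )
}

# A production plot: copy the base, label the y-axis, add the geoms.
p <- base_plot +
  ylab("Price Index") +
  geom_line(aes(y = index, color = series, linetype = series)) +
  index_scales("Price Indices")
```

Printing p draws the plot. Mapping color and linetype to the same column inside aes(), rather than setting them as constants, is what makes a legend appear at all.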
If you have any questions, please feel free to ask.

Related

Pictured link is my coding. How do I make a proper good graph?

Okay so I have an assignment where I need to construct a graph that best represents the before and after effects on two streams. The graph(s) have to contain means and standard errors for each stream in each year. I cannot figure out the proper coding for the graph. I continue to get errors and bad graphs. I will attach a sample of what the data looks like too.
A sample of the data, it changes to after at 51
Try to post a reproducible error or specification of your problem.
As far as I can tell, you probably should not create b4, because it does not seem to be a useful subset. If you want to arrange several plots together, you can use plot_grid from cowplot.
Alternatively, you can add facet_wrap(~ VARIABLE_NAME) to your ggplot call to create one panel for each distinct value of the specified variable.
If you are not happy with how the graph looks, you can choose another theme, e.g. theme_bw(), which can simply be added to your ggplot call. You can add and change labels with labs() and adjust other details with theme().
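As a rough sketch of how these pieces could fit together for means and standard errors: the data below are simulated, and the column names stream, year, period, and value are assumptions, since the original data were not posted.

```r
library(ggplot2)

# Simulated stand-in data; column names are assumptions.
set.seed(42)
streams <- expand.grid(rep = 1:5, stream = c("A", "B"), year = 2015:2018)
streams$period <- ifelse(streams$year < 2017, "before", "after")
streams$value  <- rnorm(nrow(streams), mean = 10, sd = 2)

# Mean and standard error (sd / sqrt(n)) for each stream in each year.
std_error <- function(x) sd(x) / sqrt(length(x))
means <- aggregate(value ~ stream + year + period, data = streams, FUN = mean)
ses   <- aggregate(value ~ stream + year + period, data = streams, FUN = std_error)
names(ses)[names(ses) == "value"] <- "se"
stream_summary <- merge(means, ses)
names(stream_summary)[names(stream_summary) == "value"] <- "mean"

# One panel per stream, points for the means, error bars for +/- 1 SE.
p <- ggplot(stream_summary, aes(x = factor(year), y = mean, color = period)) +
  geom_point() +
  geom_errorbar(aes(ymin = mean - se, ymax = mean + se), width = 0.2) +
  facet_wrap(~ stream) +
  labs(x = "Year", y = "Mean value (+/- SE)", color = "Period") +
  theme_bw()
```

Summarising first and plotting the summary keeps the plotting code short and makes it easy to check the numbers before drawing anything.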

Issues with combining different (continuous and ordinal) plot types into one plot

I am preparing a figure for a paper presenting data for 2 different experiments in one plot. For that reason I don't need a legend for every plot, so I try to combine them with ggdraw from cowplot.
My code
should generate a reproducible example
and gives this output:
It seems like the two figures get the same slot (A) and the legend gets slot (B). Typically, I would probably use facet wrap to plot them together (which should also guarantee that the scaling/legend is consistent across the two plots.), but that will probably not work in this case, as I am trying to add an additional figure type to C and D.
The problem is that this figure type is ordinal so I have used a somewhat “hacky” approach to plot it, giving me this figure looking essentially as I want it to:
So far I have not been able to extract it into an element that ggdraw can use.
Ideally the final plot should roughly look like this (of course with different labels):
How would you go about plotting these different types together?
Thank you for taking the time to read my question, and I hope that you can help me. I know it is quite a mouthful, but I was not sure how I could meaningfully reduce it to smaller chunks.

plot function type=“n” is ignored for plot(y~x)?

I am trying to plot a graph of certain values against time using the plot function.
I am simply trying to change the representation of the dots, using the pch= argument. However, R is simply ignoring me! I have also tried removing the dots so that I can place labels instead, but when I type in type="n" it ignores that too!
I am using the exact same format of code that I have used for other plots but this time it just isn't cooperating.
If I specify other features such as the title or the x/y axis labels, it will add those in but it simply ignores the pch or type commands.
This is my basic code:
plot(Differences ~ Time, data=subsetH)
But if I run
plot(Differences ~ Time, type="n", data=subsetH)
or
plot(Differences ~ Time, pch=2, data=subsetH)
it keeps plotting the same thing.
Is there something obvious I have missed?
I just came across your question because I encountered the same thing - creating an empty plot did not work, as type='n' was always ignored (as well as other type specifications).
With the help of this entry: Plotting time-series with Date labels on x-axis
I realized that my date column needed to be converted to the "Date" class (with as.Date()).
I know your entry dates back a little bit already, but maybe it's still useful.
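A minimal sketch of that fix, assuming the Time column was read in as a factor (a common result of importing text). In that case plot(y ~ x) draws box plots, which would explain why type and pch appear to be ignored. The data frame below is a made-up stand-in for subsetH.

```r
# Hypothetical stand-in for subsetH, with Time stored as a factor.
subsetH <- data.frame(
  Time = c("2020-01-01", "2020-02-01", "2020-03-01", "2020-04-01"),
  Differences = c(0.5, -1.2, 0.8, 0.1),
  stringsAsFactors = TRUE
)
class(subsetH$Time)  # "factor"

# Convert to Date so that plot() treats Time as a continuous axis.
subsetH$Time <- as.Date(as.character(subsetH$Time), format = "%Y-%m-%d")

pdf(NULL)  # off-screen device; drop this line and dev.off() to draw on screen
plot(Differences ~ Time, data = subsetH, type = "n")  # empty plot region
text(subsetH$Time, subsetH$Differences, labels = seq_len(nrow(subsetH)))
dev.off()
```

Checking class() or str() on the x variable first is a quick way to tell whether this is the cause.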

Box plots, plots in octave

I'm new to Octave, so there are many confusing things for me, and I've never done computer programming before so most of the language is also confusing.
I have sets of data c_o, m_o, y_o, k_o as 144 x 1 matrices (column vectors?)
Box plots
Using examples I found online, I wrote this:
axis ([0,5]);
boxplot (c_o, m_o, y_o, k_o);
set(gca (), "xtick", [1,2,3,4], "xticklabel", {"cyan", "magenta", "yellow", "key"});
However, it results in an error
Boxplot.m: grouping vector may only be passed as second arg
I have no idea what this means.
Plots
I'm trying to figure out how to plot multiple data sets with different colors.
For example,
figure (1); plot (c_o , "c");
works perfectly fine.
However, I'd like to remove the horizontal axis, change the horizontal axis range from [0,200] to [0,150], and plot multiple sets of data on the same plot (not multiple plots in the same figure, but the different data on the same set of axes). I haven't been able to find out how, though.
For the record, I do know that there are probably other programming languages more suited for statistical analysis; it just so happens that my first use of this happened to be statistical in nature.

R histogram - too many variables

I am trying to draw a histogram of 33 different variables. Because of the number of variables, I think that besides using different colors I need to label each bar clearly, even with an arrow, if that is doable.
I was wondering:
1) How can I define 33 distinct colors in R?
2) How can I label the bars, say vertically below the x-axis and at a certain distance from each other, to make my figure clearer?
I am using the multhist function from the plotrix package, and for data you can imagine just 33 random vectors of different lengths!
Thanks
As Chris mentioned, trying to distinguish 33 colours doesn't work for humans. You need to find a different plot type that doesn't rely on only colour.
Without a reproducible example, it is not possible to say what this plot should be, but here's some generic colour advice.
Use HCL colours rather than RGB or HSV. Read Escaping RGBland by Achim Zeileis for an explanation. There are some useful functions for generating palettes in the colorspace package.
If your variables are unordered categories (i.e., encoded as factors) then your colours should have different hues. (Use rainbow_hcl.)
If your variables are in some sort of order (ranges or ordered factors) then your colours should have different lightness or chroma. (Use sequential_hcl.) A variation on this is if they differ about some midpoint, in which case you need diverge_hcl.
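The three cases above can be sketched without installing anything, using hcl.colors() from base R (available from R 3.6.0) as a stand-in for the colorspace functions named in the answer. The palette names chosen here are just examples.

```r
# Qualitative: colours differ mainly in hue, for unordered categories.
qualitative <- hcl.colors(33, palette = "Dark 3")

# Sequential: colours differ in lightness and chroma, for ordered categories.
sequential <- hcl.colors(33, palette = "Blues 3")

# Diverging: two hues moving away from a neutral midpoint.
diverging <- hcl.colors(33, palette = "Blue-Red")

# hcl.pals() lists the available palette names by type.
head(hcl.pals("qualitative"))
```

Each result is a character vector of hex colour codes that can be passed to the col argument of a plotting function.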
You can define colors in R in any number of ways; try ?rainbow or ?gray.colors for some suggestions.
You could also look at all the colors here and just create a vector of your desired colors that you call inside your plot function.
Your problem, though, is that the human eye and the printing process have trouble distinguishing and reproducing that many distinct colors. See the documentation at the colorbrewer site for more information (and advice on picking colors).
I'm not sure I understand what you're trying to do with the labels, but you can re-label an axis with a call to axis. See the documentation in ?axis.
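A small sketch of re-labelling with axis(), using barplot() as a simple stand-in for multhist. The counts and variable names are made up, and hcl.colors() requires R 3.6.0 or later.

```r
# Hypothetical data: one bar per variable.
set.seed(7)
counts    <- sample(5:50, 33, replace = TRUE)
var_names <- paste0("var", seq_along(counts))

pdf(NULL)  # off-screen device; drop this line and dev.off() to draw on screen
# barplot() returns the x positions of the bar midpoints.
mids <- barplot(counts, col = hcl.colors(33, "Dark 3"))
# las = 2 turns the labels perpendicular to the axis, so they read vertically.
axis(side = 1, at = mids, labels = var_names, las = 2,
     cex.axis = 0.7, tick = FALSE)
dev.off()
```

Placing each label at the returned midpoint keeps every label directly under its bar, whatever the bar spacing.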
