Closed. This question needs debugging details. It is not currently accepting answers.
Edit the question to include desired behavior, a specific problem or error, and the shortest code necessary to reproduce the problem. This will help others answer the question.
Closed 8 years ago.
I plot two hexbin graphs in R (with the 'hexbin' package) from a data file with two columns, gr and ug.
The first plot: gr as a function of ug
The second plot: ug as a function of gr
Why aren't they perfectly symmetrical?
Thanks in advance
Notice that in both cases the hexagons are oriented to have 2 sides vertical and no sides horizontal. To be perfectly symmetric one of the plots would need to have the rotated hexagons (2 sides horizontal).
So the binning is slightly different between the 2 graphs and points that are near the boundary in the 1st plot may fall into a different cell (symmetrically) in the 2nd plot. So while the 2 plots are similar overall you will see some minor differences due to how the data is binned.
This is true in general for plots and techniques that depend on binning continuous data: a slight change to how the binning is done will usually result in minor changes in the results. It is good to make multiple plots with small changes to the options that determine the binning to see how much things change.
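A minimal sketch of that check, assuming the 'hexbin' package is installed. The data here are simulated stand-ins for the gr and ug columns (the original file is not available), and the xbins values are arbitrary choices for illustration:

```r
library(hexbin)

# Simulated stand-ins for the two columns (assumption: not the asker's data)
set.seed(1)
n  <- 2000
ug <- rnorm(n)
gr <- 0.6 * ug + rnorm(n, sd = 0.8)

# Bin the same points with the axes swapped; the hexagon grid keeps the
# same orientation in both calls, so the cells are not mirror images
h1 <- hexbin(ug, gr, xbins = 15)
h2 <- hexbin(gr, ug, xbins = 15)

# Compare how many cells are occupied and whether the counts match
length(h1@count)
length(h2@count)
identical(sort(h1@count), sort(h2@count))

# Vary the bin count slightly to see how sensitive the picture is
h3 <- hexbin(ug, gr, xbins = 16)
length(h3@count)
```

Every point still lands in exactly one cell in each version; only the assignment of points near cell boundaries changes.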
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 5 years ago.
Each time I encounter a new plotting function in Base R (e.g., dotchart(), smoothScatter(), matplot()), I wish there were a list of plotting functions in Base R that I could refer to for various plotting cases.
Question:
I was wondering if any of our colleagues might be aware of such a list of plotting functions in Base R?
You could use
library(help = "graphics")
that will display the list of plotting functions e.g.:
...
barplot           Bar Plots
box               Draw a Box around a Plot
boxplot           Box Plots
boxplot.matrix    Draw a Boxplot for each Column (Row) of a Matrix
bxp               Draw Box Plots from Summaries
cdplot            Conditional Density Plots
clip              Set Clipping Region
contour           Display Contours
coplot            Conditioning Plots
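If you would rather have the names as a character vector you can search, something like the following should work. It is only a sketch: it lists everything the graphics package exports (not just high-level plotting functions), and the "plot" pattern is just one example filter:

```r
# All objects exported by the graphics package (attached by default)
graphics_funs <- ls("package:graphics")

# Narrow the list down, e.g. to names containing "plot"
plot_like <- grep("plot", graphics_funs, value = TRUE)

head(graphics_funs)
plot_like
```

From there, ?boxplot (or any other name in the list) opens the corresponding help page.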
I found this site R plot gallery that has a wide array of basic types. Click on them to see the numerous variants and the function call for each.
Closed. This question needs debugging details. It is not currently accepting answers.
Closed 6 years ago.
Is there a way to extend the y axis up to the maximum value?
I use this code:
par(mfrow = c(3, 5))
for (i in c("mrts", "p100e10", "p75", "PIA", "pop1076", "pop1616", "pop2911", "pop500", "pop800", "rev84", "SugarCaneFarms", "Swiss", "USbanks", "UScities", "UScolleges")) {
  boxplot(dados[[i]], xlab = i)
}
But then the boxplots appear with a short y axis. I need to change the y axis, but I don't want to change the plots one by one; I want the axis to extend to the largest value.
How can I do that?
If it is not possible, how can I do it one by one?
Thanks
You can specify ylim with the minimum and maximum values of the y axis.
In your example:
boxplot(dados[[i]],xlab=i,ylim=c(min(dados[[i]]),max(dados[[i]])))
ylim = c(min, max)
It's the y-axis counterpart of xlim and should solve your issue.
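If the goal is for every panel to share one y axis that reaches the overall maximum, one possible reading of the question, a sketch could look like this. The dados list below is a made-up stand-in with three series, since the real data are not shown:

```r
# Hypothetical stand-in for the asker's data (assumption)
set.seed(42)
dados <- list(a = rnorm(50), b = rnorm(50, mean = 10, sd = 5), c = rexp(50))
cols  <- names(dados)

# One common range computed over all series
common_ylim <- range(unlist(dados[cols]), na.rm = TRUE)

par(mfrow = c(1, 3))
for (i in cols) {
  # Same ylim in every panel, so the boxes are directly comparable
  boxplot(dados[[i]], xlab = i, ylim = common_ylim)
}
```

To set the limits one by one instead, pass a different ylim to each boxplot() call.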
This question already has answers here:
Using ggplot2, can I insert a break in the axis?
(10 answers)
Closed 3 years ago.
I have used Thinkcell, and one of its cool features is that it breaks a very long y-axis to fit the graph. I am not sure whether we can do this with ggplot2. I am a beginner in ggplot2, so I'd appreciate any thoughts.
For example:
library(ggplot2)
Series <- c(1:6)
Values <- c(899, 543, 787, 35323, 121, 234)
df_val_break <- data.frame(Series, Values)
ggplot(data = df_val_break, aes(x = Series, y = Values)) +
  geom_bar(stat = "identity")
This creates a graph like this:
However, I want a graph that looks something like this:
However, it seems that a broken axis is not supported in ggplot2 because it's misleading (Source: Using ggplot2, can I insert a break in the axis?). That thread suggests a couple of alternatives: faceting and tables.
I like tables, but I don't like faceting because the levels of my categorical variable "Series" are closely related. Moreover, I'd prefer Excel for drawing tables; it's fast.
I have two questions:
Question 1: One of the options I liked is at https://stats.stackexchange.com/questions/1764/what-are-alternatives-to-broken-axes. (The graph from that page is not preserved here.)
I am unable to replicate a similar graph because of the scaling issue.
Question 2: This is a minor question, just in case new packages have been introduced that might help us do this. (The linked SO thread above is more than 5 years old.) Are there any other options on the table?
Update: I don't think my question is a duplicate. I have already gone through the indicated thread and have referenced it here, explaining that I am looking for a solution that looks like the third graph in my post. Specifically, I want to plot both graphs, one with the shorter scale and the other at 1/20 scale, in one graph. I am unable to do this using ggplot2 because of the scale issue: either both sub-graphs get scaled to 1/nth or one of them gets scaled to the normal range. I believe this version is much more relatable for a non-technical audience who don't understand log and inverse transformations.
I took a stab at this one. I'm a beginner, so I am not sure whether this can be improved further in terms of text placement. I struggled with fitting both the high growth rate series and the low growth rate series in one graph because of the different scales, so I used faceting.
Here's the code:
ggplot(data = df_val_break, aes(x = Series, y = Values)) +
  geom_bar(stat = "identity") +
  facet_wrap(~Modified) +
  geom_text(data = df_val_break[df_val_break$Modified == "HIGH_GROWTH", ], aes(label = "x20 growth rate"), hjust = 0.5, vjust = 0)
ggsave("post.png")
Here's the output:
There are quite a few issues that I see:
a) The HIGH_GROWTH panel has Series 2 and Series 6 on the x-axis, although we don't need them. I don't know how to turn them off.
b) geom_text overlaps with the bar. This looks a little annoying.
c) I believe the graph is a little misleading, especially in the HIGH_GROWTH panel, because its y-axis isn't scaled consistently with LOW_GROWTH. I was originally thinking of showing two different y-axes, one scaled by 1/20 and the other unscaled.
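One possible way to address (a) and (b), sketched under some assumptions: the Modified column is not defined in the code above, so here it is derived with an arbitrary threshold of 10000, and Series is turned into a factor so that free x scales can drop the unused categories in each panel. Point (c) remains, since each panel still gets its own y scale:

```r
library(ggplot2)

df_val_break <- data.frame(
  Series = factor(1:6),
  Values = c(899, 543, 787, 35323, 121, 234)
)

# Assumption: split the series with an arbitrary threshold
df_val_break$Modified <- ifelse(df_val_break$Values > 10000,
                                "HIGH_GROWTH", "LOW_GROWTH")

p <- ggplot(df_val_break, aes(x = Series, y = Values)) +
  geom_col() +
  # A negative vjust places the label just above the bar instead of on it
  geom_text(aes(label = Values), vjust = -0.3) +
  # Free scales: each panel shows only its own series and its own y range
  facet_wrap(~Modified, scales = "free") +
  # Extra headroom so the labels are not clipped at the top
  scale_y_continuous(expand = expansion(mult = c(0, 0.1)))

print(p)
```

Labelling each bar with its value is a design choice that partly offsets the differing y scales, because readers can compare the numbers directly.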
Closed. This question needs debugging details. It is not currently accepting answers.
Closed 5 years ago.
I had a question on how to change/customize the upper and lower limit of a notch on a boxplot created by ggplot2. I looked through the function stat_boxplot and found that ggplot calculates the notch limits with the equation median +/- 1.58 * iqr / sqrt(n). However instead of that equation I wanted to change it with my own set of upper and lower notch limits.
My data has 4 factors and for each factor I calculated the median and did a bootstrap to get a 95% confidence interval of that median. Thus in the end I would like to change every boxplot to have its own unique notch upper and lower limit.
I'm not sure if this is even possible in ggplot and was wondering if people have an idea on how to do this?
Thanks again!
I've figured out one way to customize the notches on a plot using ggplot with the function ggplot_build.
After plotting a boxplot with say:
p<-ggplot(combined,aes(x=foo,y=bar)) + geom_boxplot(notch=TRUE)
I'm not really sure what exactly happens inside ggplot_build, but it seems to convert the plot into a data-frame-like structure that you can manipulate if you want.
gg<-ggplot_build(p)
Afterwards:
gg$data[[1]]$notchlower
gg$data[[1]]$notchupper
contain the notch limits for your plot, and you can change them with something like:
gg$data[[1]]$notchlower<-50
gg$data[[1]]$notchupper<-100
And if you have multiple boxplots and want to change each one individually:
gg$data[[1]]$notchlower[1]<-50
gg$data[[1]]$notchlower[2]<-50
....
gg$data[[1]]$notchlower[n]<-50
gg$data[[1]]$notchupper[1]<-100
gg$data[[1]]$notchupper[2]<-100
....
gg$data[[1]]$notchupper[n]<-100
Anyway, hopefully this is a valid method and will be of help to other people.
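Putting the pieces together, here is a rough sketch that bootstraps a 95% interval for each group's median, writes the limits into the built plot in one vectorised assignment, and then draws the modified plot. The combined data frame, the boot_ci helper, and the 1000 replicates are all made up for illustration, and modifying a built plot this way may behave differently across ggplot2 versions:

```r
library(ggplot2)

# Hypothetical data with four groups (assumption: not the asker's data)
set.seed(1)
combined <- data.frame(
  foo = factor(rep(c("a", "b", "c", "d"), each = 40)),
  bar = rnorm(160, mean = rep(c(50, 60, 70, 80), each = 40), sd = 10)
)

# Hypothetical helper: percentile bootstrap interval for the median
boot_ci <- function(x, reps = 1000) {
  meds <- replicate(reps, median(sample(x, replace = TRUE)))
  quantile(meds, c(0.025, 0.975), names = FALSE)
}

# 2 x 4 matrix: row 1 = lower limits, row 2 = upper limits,
# columns in factor-level order (same order as the boxes)
ci <- sapply(split(combined$bar, combined$foo), boot_ci)

p  <- ggplot(combined, aes(x = foo, y = bar)) + geom_boxplot(notch = TRUE)
gg <- ggplot_build(p)

# Replace all notch limits at once instead of element by element
gg$data[[1]]$notchlower <- unname(ci[1, ])
gg$data[[1]]$notchupper <- unname(ci[2, ])

# Turn the modified build back into a drawable object and draw it
grid::grid.newpage()
grid::grid.draw(ggplot_gtable(gg))
```

The last two lines matter: printing p again would rebuild the plot from scratch and discard the changes.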
Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 9 years ago.
This is a question both about best practices for visual representation of data and about how to draw plots in R/ggplot2.
I am trying to find a way to graphically represent the story told here:
"We had 2000 test cases, of which 500 had errors. After investigation, we found that 400 of the tests were Big and 1600 were Small; only 25 of the Big tests had errors, so we set them aside, leaving 1600 Small tests, of which 475 had errors. We then found that 400 of the Small tests were Clockwise and 1200 were Counter-Clockwise; only 20 of the Small Clockwise tests had errors, so we set them aside, leaving 1200 Small Counter-Clockwise tests, of which 455 had errors."
In other words, I am using categories to separate my test cases, and I want to represent how the fraction of errors in each category changes with my progress.
Here's some R with the data:
tests <- data.frame(
  n.all = c(2000, 400, 1600, 400, 1200),
  n.err = c(500, 25, 475, 20, 455),
  sep.1 = as.factor(c("all", "Big", "Small", "Small", "Small")),
  sep.2 = as.factor(c("all", "all", "all", "Clockwise", "Counter-Clockwise"))
)
With this small amount of data, a simple numeric table might be the best choice; let's assume that the story continues, with more and more separating categories being used, so that simply listing the numbers isn't the best choice.
What would be a good way to represent this data? I can think of a few possibilities:
1. Pie charts, showing slices of the pie being taken away, and the breakdown of errors/no errors in what remains
2. Bar charts, similar
3. Bar charts with ribbons showing the "flow" of separating away categories, like Minard's chart of Napoleon's march
4. Similar, but with the bar charts showing the fractions horizontally rather than vertically
All four methods show the absolute amount of test cases decreasing, and the fraction of errors in the separated category as well as what remains. I think I like #4 best, but I've got an open mind.
How should this kind of data be represented, and can R/ggplot2 be used to do so?
Remember the 3 things that should be in line when drawing graphs: the message you are telling, the message the data is telling you, and the message the graph is telling you.
In my opinion your option 4 is the best one to get the message across consistently.
I also arrive at number 4 by sheer elimination ;)
Columns are not suitable, since you would be combining a vertical representation with a horizontal flow. Comparing pie charts is not easy either (even within a single pie chart it is already difficult to compare the different parts), so they are not an option. That leaves you with option 4 indeed :)
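As a starting point for option 4, here is a rough sketch of horizontal stacked bars in ggplot2 using the numbers from the question. It does not draw the ribbons between bars, and the stage labels and the long data frame are my own naming for illustration:

```r
library(ggplot2)

# Numbers from the question; the stage labels are an assumption
tests <- data.frame(
  stage = c("All", "Big", "Small", "Small, Clockwise", "Small, Counter-Clockwise"),
  n.all = c(2000, 400, 1600, 400, 1200),
  n.err = c(500, 25, 475, 20, 455)
)
tests$n.ok <- tests$n.all - tests$n.err

# Long format: one row per stage and outcome
long <- data.frame(
  # Reverse the levels so that "All" ends up at the top after coord_flip()
  stage   = factor(rep(tests$stage, 2), levels = rev(tests$stage)),
  outcome = rep(c("error", "no error"), each = nrow(tests)),
  count   = c(tests$n.err, tests$n.ok)
)

p <- ggplot(long, aes(x = stage, y = count, fill = outcome)) +
  geom_col() +
  coord_flip() +
  labs(x = NULL, y = "Number of test cases", fill = NULL)

print(p)
```

Bar length shows the absolute number of tests at each step, and the fill shows the error fraction within it.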
You can also try a Sankey diagram. The question "Sankey Diagrams in R?" might be helpful.