Jump y-axis values when the highest value is too far away from the other points - graph

Basically, I'm building an area graph with Chart.js. The data I use to build the graph usually contains a peak that is much higher than the rest of the points, so the y-axis range becomes too large to show the difference between the lower points, and they look almost like a line parallel to the x-axis, as we can see in this image:
Graph with problems
The solution I want to try is to skip the y-axis values between the lower points and the peak of the graph, giving a presentation similar to this one:
Solution graph sketch
As we can see in this sketch, the y-axis has a normal scale up to 300, but then, because the next point is too far away from the others, the y-axis values in between are skipped.
So what I want to know is whether this jump in the y-axis values is possible to achieve with this library (Chart.js) and, if so, where I can find documentation about it, because I have already looked everywhere and couldn't find a thing. If not, I would ask for recommendations of any other libraries where I could achieve this.

Related

Is it possible to edit the numbers displayed on an axis without moving the points that were plotted?

I've come up with a graph (a scatterplot) of log(1+inf) (inf = number of people infected with a given disease) on the y-axis against one of the explanatory variables in my model, in this case the population density (pop./km²) on the x-axis. The log transformation was used merely for visualization, because it spreads the distribution of the data and allows for more aesthetically appealing plots. Basically, what I want is for both axes to show the value of the same variable before the log transformation. The dots need to be plotted as in plot(log(1+inf), log(populational_density)), but the numbers on the axes should refer to plot(inf, populational_density). I've provided a picture of my graph with some manual editing on the y-axis to show you the idea of what I want.
The numbers in red would be the 'inf' values equivalent to log(inf);
Please, bear in mind that those values in red do not correspond to reality.
I understand the whole concept of y = f(x), but I've been asked to provide it. Is this possible? I'm using the ggplot2 package for plotting.

Confidence interval square in a plot with one variable in each axis in ggplot

Although it might sound easy at first, I do not have a scatterplot, and I think that is what makes this question challenging. I have this plot, which comes from this question.
Summing up, each axis represents a variable that is not connected to the other. It is not an XY scatterplot, as you see.
I would like to know whether it is possible to trace the 95% confidence interval for the mean of both variables, and draw a square in the middle of the plot representing the area where the two intervals overlap.
The result might look something like this, bearing in mind that the 95% confidence limits shown do not correspond to reality (they are just for the sake of illustrating how it might appear):
Here is another question which deals with this situation, but not using ggplot.

Changing axis endpoints in iTorch plot

I'm trying to graph some data in an iTorch notebook. I can generate plots fine, but I want to change the endpoints of the axes. (My autogenerated y-axis is from 20-100, but I'd rather it be from 0-100 since it's a graph of percentages and I want the lower left corner to be the origin.)
I looked in the documentation, and in the list of methods implemented, and in the source code, but didn't find anything that lets me do this. I can zoom the generated graph, but it preserves aspect ratio, so I can't zoom just one of the axes.
Does anyone know how to do this? I'm half convinced this isn't implemented, but it seems like a very strange feature to leave out.
You can change the axis scale by mouse-scrolling over the axis ticks, and translate by drag-and-drop.
I know it's not ideal, but I had the same problem and that's the best I could find so far.

Irregular scaling of axis in R

I have computed values for several categories for three networks. I'd like to create a bar plot in R to show the differences between these parameters for the networks. So far I plotted this with the barplot R function with the categories on the x-axis, their values on the y-axis and to each category three bars (one for each network).
But now I have one value which is much higher than all the others. Therefore the differences for the rest cannot be seen since they're represented only by a thin line because of that one large bar which almost fills the whole plot.
My idea was to plot the values on the y-axis on an irregular scale, meaning, for example, that one half represents the values from 0 to 300 and the other half the values from 300 to 3000. Is there any way to do this? Or is there a good alternative approach to handle this problem? I also thought of plotting the logarithm, but unfortunately I also have negative values.
I would suggest that an irregular scale isn't a good plan - I think it confuses viewers of the chart. Instead, you could use the layout() function to plot three separate barplots in a horizontal layout. Thus, each category could have its own plot, with its own scale.
If, however, you still have a single bar at 3000, while everything else is at 300, that won't really help. In that case, you could manually set your y-axis limits with ylim=c(min,max). To keep the bar from stretching off the screen, you can just use simple logic to define anything > 300 as 300, or something similar. Then, put a text point there stating the actual value (using text, maybe with arrow).
With those ideas out there, I would suggest that a graph where one value is 10x the other values might not really be worth presenting, or if it is, the main takeaway from it isn't going to be "how do values 2 and 3 compare to each other", it's going to be "holy moley look how much bigger 1 is than 2 and 3". So, it might not be a big deal if one bar is giant and two are small, as long as you aren't doing all 9 on a single plot (which would screw up other, relevant comparisons). So, if you split them using layout(), then it wouldn't be as big of a deal.
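To illustrate the clipping idea from the answer above, here is a minimal sketch in base R. The data, the names values, cap and clipped, and the 300 ceiling are made up for illustration; it only handles a large positive outlier, so negative outliers would need the same treatment with pmax().

```r
# Hypothetical data: one value dwarfs the others
values <- c(a = 120, b = 3000, c = 250, d = 90)
cap <- 300                      # assumed display ceiling
clipped <- pmin(values, cap)    # anything above the cap is drawn at the cap

# barplot() returns the x midpoints of the bars; leave headroom for labels
mids <- barplot(clipped, ylim = c(0, cap * 1.15), ylab = "value")

# Write the real value above each truncated bar
over <- values > cap
text(mids[over], cap, labels = values[over], pos = 3)
```

The label above the truncated bar tells the reader that the bar is not drawn to scale, which is probably the least surprising way to present it.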

How to avoid overplotting (for points) using base-graph?

I am in the process of finishing the graphs for a paper and decided (after a discussion on stats.stackoverflow), in order to convey as much information as possible, to create the following graph, which presents the means in the foreground and the raw data in the background:
However, one problem remains, and that is overplotting. For example, the marked point looks like it reflects one data point, but in fact 5 data points exist with the same value at that place.
Therefore, I would like to know if there is a way to deal with overplotting in base graph using points as the function.
It would be ideal if, e.g., the respective points got darker or thicker, or something similar.
Doing it manually is not an option (too many graphs and points like this). Furthermore, I do not want to learn ggplot2 just to deal with this single problem (one reason is that I tend to like dual axes, which are not supported in ggplot2).
Update: I wrote a function which automatically creates the above graphs and avoids overplotting by adding vertical or horizontal jitter (or both): check it out!
This function is now available as raw.means.plot and raw.means.plot2 in the plotrix package (on CRAN).
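The standard approach is to add some noise to the data before plotting. R has a function jitter() which does exactly that. You could use it to add the necessary noise to the coordinates in your plot, e.g.:
# 100 points spread over 10 categories, so some points will coincide
X <- rep(1:10, 10)
Z <- as.factor(sample(letters[1:10], 100, replace = TRUE))
# Jitter the numeric factor codes horizontally; suppress the default x-axis
plot(jitter(as.numeric(Z), factor = 0.2), X, xaxt = "n")
# Redraw the x-axis with the factor levels as labels
axis(1, at = 1:10, labels = levels(Z))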
Besides jittering, another good approach is alpha blending, which you can obtain (on the graphics devices supporting it) via the fourth color parameter. I provided an example for 'overplotting' of two histograms in this SO question.
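As a rough sketch of the alpha-blending idea with points(), assuming a graphics device that supports transparency (pdf and png do): the data and the names x, y and col are invented for illustration, and the alpha of 0.2 is just a starting value to tune.

```r
set.seed(1)
# Hypothetical data on an integer grid, so many points coincide
x <- sample(1:10, 200, replace = TRUE)
y <- sample(1:10, 200, replace = TRUE)

# The fourth argument of rgb() is the alpha (opacity) channel
col <- rgb(0, 0, 1, alpha = 0.2)

# Each extra point drawn at the same location makes that spot darker
plot(x, y, type = "n")
points(x, y, pch = 16, cex = 2, col = col)
```

With an alpha of 0.2, a spot holding a single point stays pale while a spot holding five points looks close to solid.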
One additional idea for the general problem of showing the number of points is to use a rug plot (the rug function). It places small tick marks along the margin that show how many points contribute (you still need jittering or alpha blending for ties). This allows the actual points to show their true rather than jittered values, while the rug indicates which parts of the plot have more values.
For the example plot direct jittering or alpha blending is probably best, but in some other cases the rug plot can be useful.
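A small sketch of the rug idea, again with invented data (x, y, jx and jy are illustrative names): the points are drawn at their true positions and only the rug ticks are jittered, so that ties show up as denser bundles of ticks.

```r
set.seed(1)
# Hypothetical integer-valued data with many ties
x <- sample(1:10, 100, replace = TRUE)
y <- sample(1:10, 100, replace = TRUE)

# Points keep their true (unjittered) coordinates
plot(x, y, pch = 16)

# Jitter only the rug ticks so tied values do not collapse into one tick
jx <- jitter(x)
jy <- jitter(y)
rug(jx, side = 1)   # along the x-axis
rug(jy, side = 2)   # along the y-axis
```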
You may also use sunflowerplot, although it would be hard to implement here. I would use alpha blending, as Dirk suggested.
