heatmap.2 color legend custom bins - r

Hi there stackoverflow community!
I am a graduate student looking for some advice on an aesthetics problem I am encountering in R.
The data I am working with is in the form of a VERY large matrix (49x51).
My problem is that my data ranges from very small to very large, with the bulk of my data falling within the "very large" end of the spectrum, so unless I convert my data to log10, the heatmap is rather boring and almost entirely the same color.
The spectrum of my data is totally within the range I am expecting, but I am hoping to display it in a more aesthetic way.
Proposed solution: I think I need to bin my data in a non-uniform way. If you look at the attached image, you will see that their heatmap looks nice and the color key shows the heat spectrum in a non-fixed bin format. I would like to do something like that, however, I am not sure how to declare cutoffs for each bin. I would ideally like to declare the cutoffs.
For example, bin 1 (0-1), bin 2 (2-50), bin 3 (51-5000). As you can see, my bins would not be fixed in equal increments.
I have been using heatmap.2 for this. Thanks so much in advance!
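A minimal sketch of how such cutoffs could be declared, assuming the `breaks` and `col` arguments of `gplots::heatmap.2` are used, with one colour per bin (so `col` has one element fewer than `breaks`). The colours and the wrapper `plot_binned_heatmap` are hypothetical; the bin edges follow the example above, with the gaps between bins closed so that no value falls between them.

```r
# Bin edges taken from the example: (0-1), (2-50), (51-5000).
# Neighbouring bins share an edge so that every value lands in some bin.
bin_breaks <- c(0, 1, 50, 5000)

# One colour per bin: length(bin_cols) must equal length(bin_breaks) - 1.
bin_cols <- c("#FFFFCC", "#FD8D3C", "#800026")  # illustrative colours only

# cut() shows which bin each value falls into.
example_values <- c(0.5, 10, 3000)
bin_index <- cut(example_values, breaks = bin_breaks,
                 include.lowest = TRUE, labels = FALSE)

# Hypothetical wrapper: draws matrix `m` using the bins above.
# Needs the gplots package; clustering is switched off to keep the sketch simple.
plot_binned_heatmap <- function(m, breaks = bin_breaks, cols = bin_cols) {
  gplots::heatmap.2(m, breaks = breaks, col = cols,
                    Rowv = FALSE, Colv = FALSE, dendrogram = "none",
                    trace = "none", key = TRUE)
}
```

With a numeric matrix `m`, calling `plot_binned_heatmap(m)` would draw the heatmap with these three bins.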
heatmap with color legend in non-uniform bins:

Hey #Punintended and #S Rivero,
I think I have reached the point where my heatmap will only improve marginally. Both of you contributed deeply to this success, so thanks! First, to condense the matrix values as much as possible, I normalized by column. I was then able to assign gradients. This turned out much better than I had hoped. As you can see, most of my data is clustered at very low values (check out the density in the key); this is okay, though, because I am interested in the higher values. I had to use custom color gradients to account for colorblind attendees who might look at my poster. Anyway, if you guys have comments or recommendations, they will be much appreciated :). Again, thanks a bunch!
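The two steps described above might look roughly like this. This is only a sketch: the post does not say how the columns were normalized or which gradient was used, so dividing by the column maximum and the "viridis" palette are assumptions, and the matrix `m` is made-up data.

```r
# Hypothetical stand-in for the real 49x51 matrix.
set.seed(1)
m <- matrix(rexp(49 * 51, rate = 0.01), nrow = 49, ncol = 51)

# Assumption: "normalized by column" means dividing each column by its maximum,
# so every column ends up on a 0-1 scale.
col_max <- apply(m, 2, max)
m_norm <- sweep(m, 2, col_max, "/")

# A sequential gradient shipped with base R (R >= 3.6) that is commonly
# recommended as readable under colour-vision deficiency.
grad <- hcl.colors(64, palette = "viridis")
```

`m_norm` and `grad` could then be passed to `heatmap.2` as the data and the `col` argument.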

Related

Faceting in ggplot() in R

I am trying to build a plot for a numeric variable rider_count vs a categorical variable weekdays("Mon", "Tue"....), and this plot is required to be a faceting plot with 55 categories,
I tried to use
ggplot(aes(x=wday, y=rider_count_sum)) +
geom_bar(stat = "identity") +
facet_wrap(~counter_edited, scales="free")
However, the output is badly distorted because the scales do not fit.
Are there any ways to make it scale normally?
The issue here is your faceting. It produces a grid of 8 x 7 cells. The plot displays on my monitor at about 18cm x 11cm in size. That means each cell is approximately 2.25cm x 1.5cm. Is a cell of that size large enough to provide meaningful information in the form of a plot? I would say "no".
So, you have two options: increase the size of the graphic or reduce the size of the grid.
Is increasing the size of the plot an option? Well, how big would each cell have to be to be meaningful? I don't know; you'd have to experiment, as it would depend on the viewing distance and the level of information you want to convey. As a thought experiment, let's say you need each cell to be 8cm x 8cm to be interpretable. That means the graphic would need to be at least 64cm x 56cm. That would require an A1/ANSI D sheet of paper. That's heading towards poster size. Unless you're talking posters, that's not reasonable. Even as a poster, a reader would have to stand so close that they wouldn't get the message of the whole graphic. On a digital display, you'd again be talking about a wall-mounted unit. Standing close enough to look at a cell, pixel resolution would be an issue. Scrolling on a smaller unit would destroy the whole purpose of using a faceted display.
Pagination would also destroy the benefit of faceting: you wouldn't be able to see all the data at the same time.
So, whilst increasing the size of your plot might be technically possible, I don't think it would be practically useful.
What about reducing the number of cells? That, to me, would be the way to go. Simplify your presentation to allow your message to come across. For example, summarise weekdays vs weekends in one graphic, and differences between weekdays in another. That reduces one dimension from 7 to either 2 or 5. I don't know how you construct counter_edited, so I don't know what the columns of your facet represent, but could you perhaps reduce the number of categories to 3 or 4? Combined with my weekday/weekend suggestion, that would give you grids of between 4x5 and 2x3. Much more manageable (though even 4x5 may be too complex).
In short: even if making your current graphic look better is technically possible, I doubt it will ever be practically useful. I suggest adopting a different approach. The question I would ask is deeper than the simple technical one of improving your graphic: what is your underlying purpose? Once you know that, adapt your presentation to best address your objective.
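As a rough illustration of the weekday/weekend suggestion, the days could be collapsed before plotting. The data frame `rides` below is invented to match the column names in the question, and averaging within each group is an assumption; a sum or median may suit your purpose better.

```r
# Hypothetical data in the shape described in the question.
rides <- data.frame(
  counter_edited  = rep(c("A", "B"), each = 7),
  wday            = rep(c("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"), times = 2),
  rider_count_sum = c(10, 12, 11, 13, 14, 30, 28, 5, 6, 5, 7, 6, 2, 3)
)

# Collapse the seven days into two groups.
rides$day_type <- ifelse(rides$wday %in% c("Sat", "Sun"), "weekend", "weekday")

# Mean riders per counter and day type: two bars per facet instead of seven.
summary_df <- aggregate(rider_count_sum ~ counter_edited + day_type,
                        data = rides, FUN = mean)
```

`summary_df` can then go into the same kind of call as before, for example `ggplot(summary_df, aes(x = day_type, y = rider_count_sum)) + geom_col() + facet_wrap(~counter_edited)`.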

How can I convert a scatterplot into a hexagonal/honeycomb chart?

I have data in a scatterplot that places colors respective of their lightness (x-axis) and saturation (y-axis):
Here's the data set in a spreadsheet.
I would like to transform this into a hexagonal/honeycomb chart. I did this by hand...finding some "lines" in the data for the edges, and then filling in the middle based on intuition:
I'm not sure if that's the "best" honeycomb representation of the data, but something that looks okay to me.
Does anyone have a suggestion on how I could make this process into an algorithm? I have a feeling there's some math or algorithms that would fit this problem which I am unaware of.
Thanks!
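One established technique that may fit here is hexagonal binning: lay a regular hexagon grid over the plot and assign each point to the hexagon whose centre is nearest. The sketch below is an assumption about what could work, not a claim that it reproduces the hand-drawn chart. The function `hex_bin`, the spacing parameter `w`, and the sample points are all hypothetical, and choosing a colour for each cell is left open.

```r
# Assign each point to the nearest hexagon centre.
# `w` is the distance between neighbouring centres (an assumed parameter).
# The centres form two interleaved rectangular lattices, A and B.
hex_bin <- function(x, y, w) {
  h <- w * sqrt(3)                    # vertical period of each lattice
  # Nearest centre on lattice A: (i * w, j * h)
  a_x <- round(x / w) * w
  a_y <- round(y / h) * h
  # Nearest centre on lattice B: ((i + 0.5) * w, (j + 0.5) * h)
  b_x <- (floor(x / w) + 0.5) * w
  b_y <- (floor(y / h) + 0.5) * h
  # Keep whichever candidate is closer to the point.
  use_a <- (x - a_x)^2 + (y - a_y)^2 <= (x - b_x)^2 + (y - b_y)^2
  data.frame(cx = ifelse(use_a, a_x, b_x), cy = ifelse(use_a, a_y, b_y))
}

# Hypothetical lightness/saturation values.
pts <- data.frame(lightness  = c(0.0, 0.1, 0.45, 0.5),
                  saturation = c(0.0, 0.1, 0.80, 0.9))
cells <- hex_bin(pts$lightness, pts$saturation, w = 1)
```

Each row of `cells` gives the centre of the hexagon that the matching point belongs to. Grouping points by centre then yields one honeycomb cell per group. The hexbin package and ggplot2's `geom_hex` are built on a similar idea.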

Decreasing the range of a dataset

I have a dataset which ranges from 0.00000787 to 1.39151821, quite a large disparity when it comes to plotting the data. I'd like to try and decrease the range of data so the plot (I'm using a colour coded plot, and right now it's pretty monotonous) is more visually understandable. I tried using log(dataset) however this creates some negative numbers which my software doesn't like.
Mathematics is not my strong point, if someone could recommend a method of fitting my data into a smaller range it would be much appreciated.
Thanks.
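Try log(x + 1), like this:
x <- seq(0.00000787, 1.39151821, 0.01)  # named x because `list` would mask the base function list()
plot(log1p(x))  # log1p(x) computes log(1 + x), which is never negative for x >= 0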
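A small check of what this transform does, with one possible alternative. The alternative (dividing by the smallest value before taking the log) is an assumption of mine and not part of the answer above; it only works when every value is strictly positive.

```r
x <- seq(0.00000787, 1.39151821, by = 0.01)

# log(1 + x) never goes negative, but for small x it is almost equal to x,
# so the spread between the small and large values is barely reduced.
shifted <- log1p(x)

# Assumed alternative: rescale by the smallest value before taking the log.
# The minimum maps to 0 and every other value is positive.
relative <- log10(x / min(x))
```

`relative` is the number of orders of magnitude each value sits above the minimum, which may spread the colours more evenly.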

Building a heatmap in R with more information

I made a heatmap on R and most of it is one colour. I have two columns of data which showed up as various colours, but the rest of it is red.
Does anyone know how to increase the "resolution" of this? I don't mean anything about how to make the image more clear (which is why I think I'm having trouble searching for info on it). I mean, how do I make my heatmap more meaningful and not all mostly one colour.
Thanks and sorry if this has been answered somewhere else. I think I don't know the key term I need to search properly.
Edit:
Here is the code I used so far (heatdata is my matrix):
heatmap <- heatmap(heatdata,Rowv=NA,Colv=NA,col=cm.colors(256),scale="row")

Intelligent Y Axis Scaling BarPlot R

I want to plot some data with barplot. Rather, I want to make a bar graph and barplot seemed the logical choice. I am plotting just fine but I was wondering if there is a way to intelligently scale the y axis to round up from the highest count.
For example I set the yaxis in this case to be 30, because I knew that Strand.22 had 27 counts in it: barplot(unlist(d), ylim=c(0,30), xlab="Forward Reverse", ylab="Counts")
In the future, I want this script to run on its own, so it would be optimal for the Y-axis to choose its own ylim. Short of pulling the information out of my 'd' variable, I can't think of a good way to do this. Is there an easy way to do this with barplot? Would some other plotter work better? I have seen things about ggplots but it seemed super complex and I wasn't sure that it would do anything better.
EDIT: If I do not choose a ylim it picks automatically and this is what it decided was best.
I disagree with its choice.
If you don't specify ylim, R will come up with something based on the data. (Sounds like you don't like its choice, which is fair.)
If you specify something based on the data like:
barplot(unlist(d), ylim=c(0,1.1*max(unlist(d))))
R will draw you a plot that reflects the maximum value of data. That example just takes the maximum of your values and multiplies that by 1.1 (this could be any number) to give it a little extra height. R does something similar to this when you make a scatterplot but it handles barplots slightly differently.
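To make the upper limit a round number rather than 1.1 times the maximum, one option is base R's `pretty()`. This is a sketch under assumptions: `counts` is made-up data standing in for `unlist(d)`, and using `pretty()` is my suggestion rather than something barplot does by itself.

```r
# Hypothetical counts standing in for unlist(d); 27 matches the Strand.22 example.
counts <- c(Strand.21 = 12, Strand.22 = 27, Strand.23 = 8)

# Option from the answer: pad the maximum by 10%.
ylim_padded <- c(0, 1.1 * max(counts))

# Assumed alternative: let pretty() choose round tick values covering 0..max,
# then use the largest tick as the upper limit.
ylim_rounded <- c(0, max(pretty(c(0, max(counts)))))
```

Either vector can be passed straight to the plot, for example `barplot(counts, ylim = ylim_rounded, xlab = "Forward Reverse", ylab = "Counts")`.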
