Ratio of top/bottom 10% of values in Looker - looker

Dear members of the community,
I am looking for some help with the following question (this is my first question, so I hope I am asking it correctly; if not, I will add more information).
I have a data set that feeds a Looker dashboard. Everything works fine, but to illustrate the spread of the data I decided to build a ratio: the average of the top 10% of values divided by the average of the bottom 10%. This ratio should give a more meaningful picture of the top/bottom difference than a single top or bottom value would, which is why I use averages.
Example:
(Average(top 10% of values))/(average(bottom 10% of values))
Any support will be much appreciated!
Best
Simon
I have tried many solutions: filtered “subsets”, a default-value subset, and even a function, but none of them works. The ratio is almost always 1. Coming from spreadsheets, this seemed like a very easy task. What am I missing?
Spreadsheet example of this calculation (not the nicest, but works):
=average(SORTN(A2:A;COUNTA(A2:A)*0,1;;0;TRUE))/average(SORTN(A2:A;COUNTA(A2:A)*0,1;;1;TRUE))
The query function might be even better.
By comparison, the ratio works perfectly in Looker with the following expression:
Max (dataset)/ min(dataset)
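For checking the dashboard figure outside Looker, the calculation described above can be sketched in Python. This is a minimal sketch and not a Looker solution: the function name `decile_ratio`, the `fraction` parameter, and the sample data are assumptions made for illustration, and each tail is rounded down to a whole number of values (with a minimum of one).

```python
import math


def decile_ratio(values, fraction=0.10):
    """Average of the top `fraction` of values divided by the
    average of the bottom `fraction` (hypothetical helper)."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    # Number of values in each tail; at least one so small data sets work.
    k = max(1, math.floor(len(ordered) * fraction))
    bottom_mean = sum(ordered[:k]) / k
    top_mean = sum(ordered[-k:]) / k
    if bottom_mean == 0:
        raise ZeroDivisionError("average of the bottom tail is zero")
    return top_mean / bottom_mean


# Illustrative data only: 1..20, so each 10% tail holds two values.
sample = list(range(1, 21))
print(decile_ratio(sample))
```

The key point is that both tails are taken from the same sorted list of all rows before averaging. If a tool evaluates the numerator and denominator over the same subset of rows, the ratio collapses to 1.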

Related

Is there a way to show the significance of category means against a set value, not one another?

I am graphing measured results versus expected results from a model, grouped by categories (the category in the boxplot below is one of a few different ones I'm using). For each data point, I subtracted the expected from the observed to determine the difference. My task is to modify the model to minimize the difference.
I would like to add the significance level to this chart, but all the resources I can find compare the category means to one another. In this case, I would like to know whether each category's mean is significantly different from 0. I can run this test one category at a time, selecting the data points in that category and testing for a difference from 0, but this seems inefficient.
Is there a way to automatically generate this and plot it? stat_compare_means seemed promising but I couldn't figure out how to make it work, while stat_pvalue_manual may hold more promise if I figure out how to code this.
Thanks in advance!
Sample boxplot (too new to add preview)
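The one-by-one procedure described above (a test of each category's mean against 0) can be automated with a loop over the groups. The following is a minimal Python sketch under stated assumptions: the helper name `one_sample_t_by_group`, the `(category, difference)` row layout, and the sample data are hypothetical. It returns only the t statistic and degrees of freedom; turning these into a p-value needs a t distribution, for example from `scipy.stats`.

```python
import math
import statistics
from collections import defaultdict


def one_sample_t_by_group(rows, mu=0.0):
    """For each category, compute the one-sample t statistic of its
    values against the fixed value `mu` (hypothetical helper).

    rows: iterable of (category, observed_minus_expected) pairs.
    Returns {category: (mean, t_statistic, degrees_of_freedom)}.
    """
    groups = defaultdict(list)
    for category, diff in rows:
        groups[category].append(diff)

    results = {}
    for category, diffs in groups.items():
        n = len(diffs)
        if n < 2:
            continue  # a standard deviation needs at least two points
        mean = statistics.mean(diffs)
        std_err = statistics.stdev(diffs) / math.sqrt(n)
        # With zero spread the statistic is undefined; report None.
        t_stat = (mean - mu) / std_err if std_err > 0 else None
        results[category] = (mean, t_stat, n - 1)
    return results


# Illustrative data only.
rows = [("a", 1), ("a", 2), ("a", 3), ("b", -1), ("b", 0), ("b", 1)]
for category, (mean, t_stat, df) in one_sample_t_by_group(rows).items():
    print(category, mean, t_stat, df)
```

In R the equivalent per-group test is `t.test(x, mu = 0)`; the resulting values could then be attached to the plot as labels.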

How to decide between font size, margins and png() parameters to achieve good definition and consistent visualisation?

This is a question that has had me banging my head against a wall for a while now. Much of R coding produces consistent results when used for analysis, in the sense that there is sometimes more than one way to achieve something, but the output is shareable and consistent: a data frame, a data table, and so on.
However, I'm struggling to understand how I can achieve a standardised process when generating plots. Font size, margin size, height, width and resolution all influence each other.
You change your resolution and suddenly your font size changes drastically when saving with png(). You go back and change the dimensions, and there you are with an extremely small font size or a pixelated chart looking at you.
So, because I still trust the ggplot and png() process and believe that I must be the one making mistakes or skipping steps in my workflow, the question is:
What is the sweet spot between all those factors that makes plotting with R easy, consistent and high-quality?
I understand that some of these factors cannot be standardised since it depends on the amount of information and how complex a chart is. But how do others ensure consistent font size against changes in resolution, height, width and plot margins?
I've come across some useful resources such as:
[https://blog.revolutionanalytics.com/2009/01/10-tips-for-making-your-r-graphics-look-their-best.html][1]
[https://support.rstudio.com/hc/en-us/articles/200488548-Problem-with-Plots-or-Graphics-Device][2]
But none really speaks to how you standardise a visualization process in R. Still great tips though.
Any advice or ideas are honestly appreciated. Thank you.
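The interaction between resolution, dimensions and font size comes down to simple arithmetic: a point is 1/72 of an inch, so text of a given point size covers `points * dpi / 72` pixels. The Python sketch below only illustrates that arithmetic; the helper name `device_pixels` and the 6 x 4 inch, 12 pt figures are assumptions chosen for the example.

```python
def device_pixels(width_in, height_in, dpi, font_pt):
    """Pixel dimensions and font height for a figure whose physical
    size is fixed in inches (hypothetical helper)."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    font_px = font_pt * dpi / 72  # 1 pt = 1/72 inch
    return width_px, height_px, font_px


# Illustrative settings: a 6 x 4 inch figure with 12 pt text.
for dpi in (72, 150, 300):
    width_px, height_px, font_px = device_pixels(6, 4, dpi, 12)
    # The font-to-height ratio stays the same at every resolution.
    print(dpi, width_px, height_px, font_px, font_px / height_px)
```

When the size is fixed in inches, raising the resolution adds pixels without changing how large the text looks relative to the plot. When the size is fixed in pixels instead, raising the resolution makes the text take up a larger share of the image. In R, `png()` accepts `units` and `res` arguments, so one way to apply this is to specify `width` and `height` in inches and change only `res`.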

heatmap.2 color legend custom bins

Hi there stackoverflow community!
I am a graduate student looking for advice on an aesthetics problem in R that I am encountering.
The data I am working with is in the form of a VERY large matrix (49x51).
My problem is that my data ranges from very small to very large, with the bulk of my data falling within the "very large" end of the spectrum, so unless I convert my data to log10, the heatmap is rather boring and almost entirely the same color.
The spectrum of my data is totally within the range I am expecting, but I am hoping to display it in a more aesthetic way.
Proposed solution: I think I need to bin my data in a non-uniform way. If you look at the attached image, you will see that their heatmap looks nice and the color key shows the heat spectrum in a non-fixed bin format. I would like to do something like that, however, I am not sure how to declare cutoffs for each bin. I would ideally like to declare the cutoffs.
For example, bin 1 (0-1), bin 2 (2-50), bin 3 (51-5000). As you can see, my bins would not be fixed in equal increments.
I have been using heatmap.2 for this. Thanks so much in advance!
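The idea of declaring non-uniform cutoffs can be sketched independently of the plotting library. The following Python sketch assigns each value to a bin using the upper edges from the example above (1, 50, 5000); the names `UPPER_EDGES` and `bin_index` are hypothetical, and values above the last edge get an extra overflow index.

```python
from bisect import bisect_left

# Upper edge of each bin, taken from the example cutoffs:
# bin 0 covers values up to 1, bin 1 up to 50, bin 2 up to 5000.
UPPER_EDGES = [1, 50, 5000]


def bin_index(value, upper_edges=UPPER_EDGES):
    """Index of the first bin whose upper edge is >= value.
    Returns len(upper_edges) for values above the last edge."""
    return bisect_left(upper_edges, value)


# Illustrative values only; note that 1.5 lands in bin 1 because the
# bins are treated as contiguous ranges.
for value in (0, 1, 1.5, 50, 51, 5000):
    print(value, bin_index(value))
```

heatmap.2 accepts a `breaks` argument, so cut points like these can in principle be passed there together with one colour per bin in `col`.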
heatmap with color legend in non-uniform bins:
Hey @Punintended and @S Rivero,
I think I have reached the point that my heatmap will only improve marginally. Both of you contributed deeply to this success, so thanks! First, to condense the matrix values as much as possible, I normalized by column. I was then able to assign gradients. This turned out much better than I had hoped. As you can see, most of my data is clustered (check out the density in the key) at very low values, this is okay though, for I am interested in the higher values. I had to use custom color gradients to account for possible instances of colorblind attendees that might look at my poster. Anyways, if you guys have comments or recommendations, they will be much appreciated :). Again, thanks a bunch!

Extracting boundary line from Image in R

I have a cross-section picture in JPG format that I want to work with. For better understanding I drew a picture that hopefully shows roughly what the real pictures will look like:
At the top of the picture is material A, at the bottom material B.
Goal: I want to get the Pixels of the boundary line between both materials.
My way so far:
I already know how to read pictures with the EBImage package.
I also know that this results in a matrix with a color value for every pixel.
I thought it would be better to convert the JPEG into a binary picture with only black and white.
I thought that filling in the black part below (material B) and reducing the noise would help, so that I could use column sums (a sum of 1's) to find the row number where material A touches material B, which should be the boundary line I am looking for (right?).
Problems:
I can't find filters that fill in the black parts intelligently, and in the real pictures there will be much more noise, which will complicate things even further.
I am not sure whether all this is even necessary, or whether there is a more efficient way to find the boundary line.
Thank you very much for every tip in advance!
Answers will always be vague when there's no example to work with. I would normally use ImageJ for a task like this but EBImage has the commands that I would use.
With EBImage I would make the image binary and then erode, dilate, and fill holes (fillHull).
Your picture looks like it might be a candidate for a support vector machine. There are a couple of packages for R with svm functions, one is e1071.
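Once the picture has been made binary and cleaned up, the boundary extraction itself is a scan down each column. Here is a minimal Python sketch, assuming the image is a list of rows with 0 for material A and 1 for material B; the function name `boundary_rows` and the tiny sample image are hypothetical.

```python
def boundary_rows(binary):
    """For each column, return the index of the first row (from the
    top) that contains material B (value 1), or None if the column
    holds no material B at all (hypothetical helper)."""
    if not binary:
        return []
    n_cols = len(binary[0])
    boundary = []
    for col in range(n_cols):
        first_b = next(
            (r for r, row in enumerate(binary) if row[col] == 1), None
        )
        boundary.append(first_b)
    return boundary


# Illustrative 4 x 4 image: 0 = material A (top), 1 = material B (bottom).
image = [
    [0, 0, 0, 0],
    [0, 1, 0, 0],
    [1, 1, 0, 1],
    [1, 1, 1, 1],
]
print(boundary_rows(image))
```

If material B is completely filled below the boundary, this gives the same result as the column-sum idea from the question (number of rows minus the column sum). Scanning for the first 1 also copes with columns where holes remain.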

Creating a chart in ASP.net

I am working on a project and need to add an additional image using the asp:chart control. Unfortunately, I've never had to use this control before and it's a bit complex to use, so I need some help.
Basically, I need to create a stacked column chart with two legends and two columns. The first column is "income" and stacks three values (wages, interest and other). The second column is "expenses" and stacks two values (mortgage and other). Each item has its own value.
The legend for income should be on the left, the legend for expenses on the right. Each legend should display the text and value of each of its related items, plus a 'Total' label with its value.
For this task, I only have to deal with 5 values over two columns, but the asp:chart control is huge and I'm drowning in all its options. And they want it ready yesterday, so no pressure. It's already overdue... :-)
No, it's not homework. If it were, I would have practical documentation and additional how-to information. My boss expects me to add this but gave me absolutely no information to work with, except for the code, which already contains several other charts, none of them like this one and all done by previous victims who each used their own coding style. Basically, the project code is a huge mess and so useless as documentation. (And amazingly it works, as long as I only use asp:chart for these graphics.)
The biggest problem I'm having is stacking the values correctly. Since I have two columns and 3 values, it could be solved with three series, each with points for column 1 and 2. Unfortunately, this puts income and expense in the same label, which is not what I want.
If I make it 5 series, for every value one point, then the second column doesn't start at the right height. So that won't work either.
You can download the Samples for Chart Control from MSDN, which show in depth how to use the control:
http://archive.msdn.microsoft.com/mschart/Release/ProjectReleases.aspx?ReleaseId=1591
For learning, see these blog posts as well:
https://web.archive.org/web/20211020203246/https://www.4guysfromrolla.com/articles/072209-1.aspx
http://weblogs.asp.net/scottgu/archive/2008/11/24/new-asp-net-charting-control-lt-asp-chart-runat-quot-server-quot-gt.aspx
