Plotting mean-values of elements with same frequencies - plot

I have two columns in my Google Sheet that correspond to 1) the frequencies of the elements and 2) their respective 'values'.
What I want is a diagram that holds the different frequencies on the x-axis, and for each frequency I want the y-axis to hold that specific frequency's value (and if more than one element has that frequency, I want it to plot their mean value).
Two elements can share the same frequency and/or the same score, which is why I want the mean functionality added as well.
If the following data would be my values:
280 6
280 4
250 2
240 1
230 3
Forgive my ascii-skills, but I'd want the graph to plot the following in that case:
^
.
.
|
| |
| |
| | |
| | | |
| | | | |
___230___240___250___260___270___280___...>
I'm not entirely familiar with Google Sheets yet and I'm not really sure how to accomplish this.

I think a pivot table should serve: put your left-hand column in Rows and your right-hand column in Values, with Summarise by set to AVERAGE. Then chart the results (select what in the image is B14:B17, then Insert > Chart, and accept the first recommendation).
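To make the expected result concrete, here is a minimal sketch of the same "average per frequency" aggregation done outside Google Sheets, in R. The column names `frequency` and `value` are assumptions made for illustration; the numbers are the five sample rows from the question.

```r
# Sample rows from the question (column names are assumed, not from the sheet).
scores <- data.frame(
  frequency = c(280, 280, 250, 240, 230),
  value     = c(6, 4, 2, 1, 3)
)

# One row per distinct frequency, holding the mean of its values.
# This is what the pivot table with "Summarise by AVERAGE" computes.
means <- aggregate(value ~ frequency, data = scores, FUN = mean)

# Bar chart of the mean value for each frequency.
barplot(means$value, names.arg = means$frequency,
        xlab = "frequency", ylab = "mean value")
```

The two rows sharing frequency 280 (values 6 and 4) collapse into a single bar of height 5, while the other frequencies keep their single value.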

Related

R shiny leaflet bubble map by Count

I'm trying to make a Shiny app that uses some Geographical data I have stored in MySQL. The data currently contains a list of Longitudes and Latitudes (of client location), client age, client gender, and other various demographics. For example:
+--------+------+------+--------+
| Member | Long | Lat  | Gender |
+--------+------+------+--------+
| A      | 34   | -118 | M      |
| B      | 34   | -118 | F      |
| C      | 41   | -74  | M      |
| D      | 39   | -77  | M      |
+--------+------+------+--------+
I want to use leaflet to create a bubble map for the locations: a bigger bubble at a location means that more clients are present in that area. It seems like the function addCircles does this pretty well, but the problem is that I don't have a count of the number of clients at each location to assign to the radius parameter. Each row in my table represents information for a particular client, not a location. Is there a way to obtain that info?
My best guess is to create a new table where each row represents a different longitude and latitude, with a count column holding the number of times that location appears among clients. I'm not sure this is the best way, though, since it involves creating a new table, and I would have to create additional columns like "number of males" and "number of females" for every factor I want to account for. And what if I wanted to adjust my map for females who are in the age range of 40-50? The number of columns I would have to create would easily exceed 100.
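One way to avoid a wide table of precomputed counts is to filter the client rows first and count per location afterwards. The following is a minimal sketch in base R under that assumption; the helper `count_by_location` and the sample data frame are hypothetical, and the column names simply mirror the table in the question.

```r
# Hypothetical sample mirroring the table in the question.
clients <- data.frame(
  Member = c("A", "B", "C", "D"),
  Long   = c(34, 34, 41, 39),
  Lat    = c(-118, -118, -74, -77),
  Gender = c("M", "F", "M", "M")
)

# Hypothetical helper: count clients per (Long, Lat) pair, optionally after
# filtering rows with a logical vector. Assumes at least one row is kept.
count_by_location <- function(df, keep = rep(TRUE, nrow(df))) {
  sub <- df[keep, , drop = FALSE]
  out <- aggregate(Member ~ Long + Lat, data = sub, FUN = length)
  names(out)[names(out) == "Member"] <- "n"
  out
}

all_counts  <- count_by_location(clients)
male_counts <- count_by_location(clients, clients$Gender == "M")
```

Because the filter is applied on the fly, a condition such as "female and aged 40-50" needs no extra columns. The resulting `n` column could then be scaled and passed to the radius parameter of addCircles.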

How to get rid of circular/cyclic dependencies in TIBCO Spotfire?

I have two tables that are linked via a relation (edit -> data table properties -> relations). One contains some raw data, and the other contains aggregated data (calculation on the value).
You can see some examples below. Here, data are linked on "category" column.
RAW DATA
category | id | value
---------+----+------
A        | 1  | 10
A        | 2  | 20
A        | 3  | 30
A        | 4  | 30
B        | 1  | 20
B        | 2  | 20
COMPUTED DATA
category | any_calculation //aggregation of raw data based on category
---------+----------------
A        | 10
B        | 20
To do the calculation, I use an R/TERR function that takes the raw data as input and outputs the computed data.
Then I display the raw data in a scatter plot (one per category), and I add a curve that is taken from the column "any_calculation" of the computed data.
My main problem is that my table with computed data isn't filled by the R/TERR script. The cause is, in my opinion, the cyclic dependency between those two tables.
Do you have any idea, workaround, or fix?
I should also add that I can't do the calculation in the scatter plot (huge calculation). I use Spotfire 7.8.0.
It seems that a table can't be modified by different sources; that is to say, multiple scripts (R and Python) can't have the same table as an output.
To fix my problem, I created a new table in one of my scripts. Then I created a relation between this table and the other one from the other script.

Power Bi graph like pivot graph

I'm new to Power BI. I followed most of the tutorial on MS but haven't yet figured out how to create a graph that resembles this one, which I made as an Excel pivot chart using the same data table as the source.
What I need to recreate in Power BI is a column graph with the most requested products (pre-order requests as % of total sum) in different price ranges.
Pivot Graph
Table, e.g.:
| Date  | Product | 3 to 5 Eur | 5 to 8 Eur | 8 to 11 Eur |
-----------------------------------------------------------
| mar17 | Coffee  | 12         | 7          | 2           |
| mar17 | Milk    | 15         | 3          | 1           |
| mar17 | Honey   | 17         | 0          | 5           |
| mar17 | Sugar   | 20         | 9          | 8           |
Thanks in advance for the help.
Bests,
Alberto
Edit - Thanks to Mike Honey for pointing out the original request was for % of grand total. I have added an additional step to accomplish this and cleaned up some existing steps.
When I imported your sample data into Power BI, I got this (looking at the data in the Query Editor window).
From there, select the Date and Product columns and then click on Transform -> Unpivot Columns -> Unpivot Other Columns...
... which results in this.
Just to clean this up, I renamed the Attribute and Value columns and changed the data type of the Value column. In the end, it looks like this.
Then just click on Home -> Close & Apply to get back to the Report Editor window, where you can create a graph and configure it as follows:
Axis:
Price Range
Product
Value:
Quantity
Then click on the forked drill-down arrow in the top left corner of the graph to show Price Range and Product.
Which looks like this.
Next, while not necessary, I find it a nice touch: with the graph selected, click on the paint roller icon and expand the X-Axis category. In there, turn off Concatenate labels.
Finally, to get the bars to be % of grand total, simply right-click on Quantity in the Value section of the graph's fields and then select Show value as -> Percent of grand total.
That gives final results that look like this.
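For readers who want to see what the unpivot and "percent of grand total" steps do to the data, here is a minimal sketch of the same reshaping in R rather than in Power BI. The column names `PriceRange`, `Quantity`, and `PctOfTotal` are assumptions standing in for the renamed Attribute and Value columns; the numbers are the sample table from the question.

```r
# Sample table from the question, in its original wide layout.
orders <- data.frame(
  Date          = "mar17",
  Product       = c("Coffee", "Milk", "Honey", "Sugar"),
  `3 to 5 Eur`  = c(12, 15, 17, 20),
  `5 to 8 Eur`  = c(7, 3, 0, 9),
  `8 to 11 Eur` = c(2, 1, 5, 8),
  check.names   = FALSE
)

# Every column except Date and Product is a price range to unpivot.
price_cols <- setdiff(names(orders), c("Date", "Product"))

# Unpivot: one row per (Date, Product, PriceRange) combination.
long <- do.call(rbind, lapply(price_cols, function(col) {
  data.frame(
    Date       = orders$Date,
    Product    = orders$Product,
    PriceRange = col,
    Quantity   = orders[[col]]
  )
}))

# Share of each row in the grand total, as a percentage.
long$PctOfTotal <- 100 * long$Quantity / sum(long$Quantity)
```

The four wide rows become twelve long rows, which is the shape a chart needs in order to put Price Range and Product on the same axis.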

Is it possible to combine separate boxplot summaries into one and create the combined graph?

I am working with rather large datasets (approx. 4 million rows per month, with 25 numeric attributes and 4 factor attributes). I would like to create a graph that contains, per month (for the last 36 months), a boxplot for each numeric attribute per product (one of the 4 factor attributes).
So as an example for product A:
-
_ | -
_|_ | _|_
| | | | |
| | _|_ | |
| | | | |---|
| | |---| | |
|---| | | | |
|_ _| | | |_ _|
| |_ _| |
| | |
- | -
-
--------------------------------------------------------------
jan '10 feb '10 mar '10 ................... feb '13
But since these are quite large datasets I will be working with I would like some advice to get started on how to approach. My idea (but I am not sure if this is possible) is to
a) extract the data per month per product
b) create a boxplot for that specific month (so let's say jan'10 for product A)
c) store the boxplot summary data somewhere
d) repeat a-c for all months until feb '13
e) combine all the stored boxplot summary data into one
f) plot the combined boxplot
g) repeat a-f for all other products
So my main question is: is it possible to combine separate boxplot summaries into one and create the combined graph as sketched above from this?
Any help would be appreciated,
Thank you
Here's a long-hand example that you can probably cook something up around:
Read in the individual datasets - you might want to overwrite the same data or wrap this step in a function given the large data you are using.
dset1 <- 1:10
dset2 <- 10:20
dset3 <- 20:30
Store some boxplot info, notice the plot=FALSE
result1 <- boxplot(dset1,plot=FALSE,names="month1")
result2 <- boxplot(dset2,plot=FALSE,names="month2")
result3 <- boxplot(dset3,plot=FALSE,names="month3")
Group up the data and plot with bxp
mylist <- list(result1, result2, result3)
groupbxp <- do.call(mapply, c(cbind, mylist))
bxp(groupbxp)
Result:
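The long-hand steps above could also be wrapped in a loop, so that only the summary of each month is kept in memory. This is a sketch under stated assumptions: the `monthly` list and its simulated values are hypothetical stand-ins for data read one month at a time, and the combined list is built field by field so that outliers from different months keep the right group index.

```r
# Hypothetical per-month data; in practice each element would be read
# from disk one month at a time and discarded after summarising.
set.seed(1)
monthly <- list(
  "jan '10" = rnorm(1000, mean = 10),
  "feb '10" = rnorm(1000, mean = 12),
  "mar '10" = rnorm(1000, mean = 11)
)

# Keep only the boxplot summary for each month, not the raw values.
summaries <- lapply(names(monthly), function(m) {
  boxplot(monthly[[m]], plot = FALSE, names = m)
})

# Number of outliers recorded for each month.
n_out <- vapply(summaries, function(s) length(s$out), integer(1))

# Combine the summaries into the list structure that bxp() expects.
combined <- list(
  stats = do.call(cbind, lapply(summaries, `[[`, "stats")),
  n     = vapply(summaries, function(s) s$n, numeric(1)),
  conf  = do.call(cbind, lapply(summaries, `[[`, "conf")),
  out   = unlist(lapply(summaries, `[[`, "out")),
  group = rep(seq_along(summaries), n_out),
  names = vapply(summaries, function(s) s$names, character(1))
)

bxp(combined)
```

Each stored summary is only a handful of numbers plus any outliers, so 36 months per product stays small even when the raw data does not.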
You will not be able to predict with absolute precision what the "fivenum" values will be for the combined set of values. Think about the situation with two groups for which you have the 75th percentile in each group and the count of observations in each group. Suppose the percentiles are unequal. You cannot just take the weighted mean of the percentiles to get the 75th percentile of the aggregated values. See the help page ?boxplot.stats. I would think, however, that you might come very close by using the median values of the fivenum collections. This might be a place to start your examinations.
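# dat is assumed to be a data frame with columns `values` and `month`
# One row per month: the five-number summary followed by the observation count
mo.mtx <- do.call(rbind, tapply(dat$values, dat$month, function(mo.dat) c(fivenum(mo.dat), length(mo.dat))))
matplot(mo.mtx[, 1:5], type = "l")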

R heatmap with different color scales for different rows

I am wondering what would be an easy way to produce heatmaps of composite data that require different scaling for different rows.
In my case the columns represent different events of the same type, and the rows are different observations of these events, which can be binary or various kinds of continuous data.
For example:
Event: ev1 | ev2 | ev3 | ev4 | ev5 | ev6
Obs1: 1 | 0 | 1 | 1 | 0 | 0
Obs2: 5.6 | 0.2 | 4.8 | 7.1 | 0.1 | 0.8
Thanks in advance for hints and help
To have a single heatmap where different rows have been scaled differently would be rather confusing, since you would need a legend for each shading. This would quickly clutter up the plot.
If you don't need a legend, then simply scale your values to range between zero and one and you should get what you are after. So for Obs2 you would have something like:
scaled_obs2 = (Obs2 - min(Obs2))/(max(Obs2) - min(Obs2))
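As a minimal sketch of that idea applied to every row at once, the following rescales each observation to the range zero to one and then draws the heatmap without any further scaling. The helper name `rescale01` is an assumption made for illustration, and the data are the two example rows from the question.

```r
# Example data from the question: one binary row, one continuous row.
obs <- rbind(
  Obs1 = c(1, 0, 1, 1, 0, 0),
  Obs2 = c(5.6, 0.2, 4.8, 7.1, 0.1, 0.8)
)
colnames(obs) <- paste0("ev", 1:6)

# Hypothetical helper: min-max rescale a vector to [0, 1].
# Assumes the vector is not constant (otherwise this divides by zero).
rescale01 <- function(x) (x - min(x)) / (max(x) - min(x))

# apply() returns the rescaled rows as columns, so transpose back.
scaled <- t(apply(obs, 1, rescale01))

# scale = "none" stops heatmap() from standardising the rows again.
heatmap(scaled, Rowv = NA, Colv = NA, scale = "none")
```

With every row on the same zero-to-one range, a single colour scale applies to all rows, at the cost of no longer showing the original units.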
