Limit the number of joined rows in Kusto / KQL / ADX - azure-data-explorer

Given the following Kusto query, is it possible to limit the result set so that only the two cities with the highest population per country are returned?
My real scenario is a lot more complex, but I've spent several hours trying to figure out how to do this. I tried the top-nested operator, but it changes the column layout by aggregating on a single column rather than just reducing the number of returned rows per group.
let population=datatable (name: string, population: int64) [
"New York", 4478934739,
"Washington DC", 412165236,
"Miami", 124437843,
"Berlin", 222347384,
"Munich", 6783434,
"Hamburg", 6000033
];
let country=datatable (name: string, country: string) [
"New York", "US",
"Washington DC", "US",
"Miami", "US",
"Berlin", "DE",
"Munich", "DE",
"Hamburg", "DE"
];
population
| join kind=inner country on name

Would this work?
Note that the partition operator is currently limited to 64 values (this is a temporary limitation)
let Populations=datatable (name: string, population: int64) [
"New York", 4478934739,
"Washington DC", 412165236,
"Miami", 124437843,
"Berlin", 222347384,
"Munich", 6783434,
"Hamburg", 6000033
];
let Countries=datatable (name: string, country: string) [
"New York", "US",
"Washington DC", "US",
"Miami", "US",
"Berlin", "DE",
"Munich", "DE",
"Hamburg", "DE"
];
Countries
| partition by country(
lookup Populations on name
| top 2 by population
)
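If you can't use partition because of the limit on the number of partitions, here is an alternative that numbers the rows within each country (row_number starts at 0 here, so rn <= 1 keeps the top two):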
let Populations=datatable (name: string, population: int64) [
"New York", 4478934739,
"Washington DC", 412165236,
"Miami", 124437843,
"Berlin", 222347384,
"Munich", 6783434,
"Hamburg", 6000033
];
let Countries=datatable (name: string, country: string) [
"New York", "US",
"Washington DC", "US",
"Miami", "US",
"Berlin", "DE",
"Munich", "DE",
"Hamburg", "DE"
];
Countries
| lookup Populations on name
| order by country, population desc
| extend rn = row_number(0, country != prev(country))
| where rn <=1
| project country, name, population
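For readers who want to check the logic outside Kusto, here is a minimal Python sketch of the same "sort, number the rows per group, keep the first two" idea. The function name top_n_per_country and the dict-based tables are made up for illustration, and it assumes every city has a population entry.

```python
from itertools import groupby, islice
from operator import itemgetter

# Same sample data as the datatables in the answer.
populations = {
    "New York": 4478934739,
    "Washington DC": 412165236,
    "Miami": 124437843,
    "Berlin": 222347384,
    "Munich": 6783434,
    "Hamburg": 6000033,
}
countries = {
    "New York": "US",
    "Washington DC": "US",
    "Miami": "US",
    "Berlin": "DE",
    "Munich": "DE",
    "Hamburg": "DE",
}


def top_n_per_country(countries, populations, n=2):
    """Return (country, name, population) rows, at most n per country."""
    # Rough equivalent of the lookup: attach the population to each city row.
    rows = [
        (country, name, populations[name])
        for name, country in countries.items()
        if name in populations
    ]
    # Sort by country, then by population descending within each country.
    rows.sort(key=lambda r: (r[0], -r[2]))
    top = []
    # groupby restarts for every new country, like the row_number restart flag.
    for _, group in groupby(rows, key=itemgetter(0)):
        top.extend(islice(group, n))
    return top


for row in top_n_per_country(countries, populations):
    print(row)
```

Unlike the Kusto query, this sketch sorts the countries in ascending order; only the per-country selection is meant to match.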

Related

Risk of a single Doc with a dozen arrays containing thousands of small objects

Can the write operations on a single doc break when many users are online?
There will be ~12 such arrays, each with tens of thousands of such objects. Write operations would be:
increment(+1) for the count field of existing objects based on currency and activity (stay, sports...),
add an entire new object when it's a new currency and country,
update an existing object, eg: increment(-1) or price change.
Note: all data is displayed at once on a single page.
stayCount [
{price: "15", count: 2, country: "USA", currency: "USD"},
{price: "15", count: 3, country: "UAE", currency: "AED"},
{price: "25", count: 5, country: "USA", currency: "USD"}
]
sportsCount [
{price: "15", count: 1, country: "Germany", currency: "EUR"},
{price: "49", count: 6, country: "UAE", currency: "AED"},
{price: "49", count: 8, country: "France", currency: "EUR"}
]
Asking because of the one write max per Doc per second rule.

Kusto - Render Column chart as per bucket values (extend operator)

If I run the query below against the following datatable, every row ends up in the "10E" bucket for event_count > 10. Is there a reason the other categories are not showing up in the bucket column? I would like to render a column chart by event-count category. Thanks.
| summarize event_count=count() by State
| where event_count > 10
| extend bucket = case (
event_count > 10, "10E",
event_count > 100, "100E",
event_count > 500, "500E",
event_count > 1000, "1000E",
event_count > 5000, ">5000E",
"N/A")
| project bucket
datatable (State: string, event_count: long) [
"VIRGIN ISLANDS",long(12),
"AMERICAN SAMOA",long(16),
"DISTRICT OF COLUMBIA",long(22),
"LAKE ERIE",long(27),
"LAKE ST CLAIR",long(32),
"LAKE SUPERIOR",long(34),
"RHODE ISLAND",long(51),
"LAKE HURON",long(63),
"CONNECTICUT",long(148)
]
When a condition in the case() function evaluates to true, the remaining conditions are not evaluated. Since all of your counts are greater than 10, the first category matches every row. It seems you wanted the conditions to be "less than or equal to"; here is an example:
datatable (State: string, event_count: long) [
"VIRGIN ISLANDS",long(12),
"AMERICAN SAMOA",long(16),
"DISTRICT OF COLUMBIA",long(22),
"LAKE ERIE",long(27),
"LAKE ST CLAIR",long(32),
"LAKE SUPERIOR",long(34),
"RHODE ISLAND",long(51),
"LAKE HURON",long(63),
"CONNECTICUT",long(148)
]
| where event_count > 10
| extend bucket = case (
event_count <= 10, "10E",
event_count <= 100, "100E",
event_count <= 500, "500E",
event_count <= 1000, "1000E",
event_count <= 5000, ">5000E",
"N/A")
| summarize sum(event_count) by bucket
| render columnchart
bucket    sum_event_count
100E      257
500E      148
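To make the first-match behaviour concrete, here is a small Python sketch that mirrors the corrected case() expression and the summarize step. The names BUCKETS, bucket_for and sum_by_bucket are made up for this illustration; the thresholds and labels are copied from the query above.

```python
from collections import defaultdict

# Same rows as the datatable in the question.
events = {
    "VIRGIN ISLANDS": 12,
    "AMERICAN SAMOA": 16,
    "DISTRICT OF COLUMBIA": 22,
    "LAKE ERIE": 27,
    "LAKE ST CLAIR": 32,
    "LAKE SUPERIOR": 34,
    "RHODE ISLAND": 51,
    "LAKE HURON": 63,
    "CONNECTICUT": 148,
}

# (upper bound, label) pairs in the same order as the case() arguments.
BUCKETS = [(10, "10E"), (100, "100E"), (500, "500E"), (1000, "1000E"), (5000, ">5000E")]


def bucket_for(count):
    """Return the label of the first bucket whose condition holds."""
    for upper, label in BUCKETS:
        if count <= upper:
            return label  # first match wins; later pairs are never checked
    return "N/A"


def sum_by_bucket(events):
    """Equivalent of: where event_count > 10 | summarize sum(event_count) by bucket."""
    totals = defaultdict(int)
    for count in events.values():
        if count > 10:
            totals[bucket_for(count)] += count
    return dict(totals)


print(sum_by_bucket(events))
```

With the original "greater than" conditions in ascending order, the first pair would already match every row, which is why only "10E" appeared.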

How to effectively chain groupby queries from flat api data in Kafka Streams?

I have some random data coming from an API into a Kafka topic that looks like this:
{"vin": "1N6AA0CA7CN040747", "make": "Nissan", "model": "Pathfinder", "year": 1993, "color": "Blue", "salePrice": "$58312.28", "city": "New York City", "state": "New York", "zipCode": "10014"}
{"vin": "1FTEX1C88AF678435", "make": "Audi", "model": "200", "year": 1991, "color": "Aquamarine", "salePrice": "$65651.53", "city": "Newport Beach", "state": "California", "zipCode": "92662"}
{"vin": "JN8AS1MU1BM237985", "make": "Subaru", "model": "Legacy", "year": 1990, "color": "Violet", "salePrice": "$21325.27", "city": "Joliet", "state": "Illinois", "zipCode": "60435"}
{"vin": "SCBGR3ZA1CC504502", "make": "Mercedes-Benz", "model": "E-Class", "year": 1986, "color": "Fuscia", "salePrice": "$81822.04", "city": "Pasadena", "state": "California", "zipCode": "91117"}
I am able to create KStream objects and observe them, like this:
KStream<byte[], UsedCars> usedCarsInputStream =
builder.stream("used-car-colors", Consumed.with(Serdes.ByteArray(), new UsedCarsSerdes()));
// key, value => year, count of cars in that year
KTable<String, Long> yearCount = usedCarsInputStream
.filter((k, v) -> v.getYear() > 2010)
.selectKey((k, v) -> v.getVin())
.groupBy((key, value) -> Integer.toString(value.getYear()))
.count();
// print() returns void, so it cannot be chained into the expression assigned to the KTable
yearCount.toStream().print(Printed.<String, Long>toSysOut().withLabel("blah"));
This of course gives us a count of the records grouped by each year greater than 2010. However, what I would like to do in the next step, but have been unable to accomplish, is to simply take each of those years, as in a foreach, and count the number of cars in each color per year. I attempted writing a foreach on yearCount.toStream() to further process the data, but got no results.
I am looking for output that might look like this:
{
"2011": [
{
"blue": "99",
"green": "243",
"red": "33"
}
],
"2012": [
{
"blue": "74",
"green": "432",
"red": "2"
}
]
}
I believe I may have answered my own question. I would welcome any others to comment on my own solution.
What I did not realize is that you can group by a key that is itself a compound object. In this case, I needed the equivalent of the following SQL statement:
SELECT year, color, count(*) FROM use_car_colors AS years
GROUP BY year, color
In Kafka Streams, you can accomplish this by creating an object -- in this situation, I created a POJO class called 'YearColor' with members year and color -- and then select that as a key in Kafka Streams:
usedCarsInputStream
.selectKey((k,v) -> new YearColor(v.getYear(), v.getColor()))
.groupByKey(Grouped.with(new YearColorSerdes(), new UsedCarsSerdes()))
.count()
.toStream()
.peek((yc, ct) -> System.out.println("year: " + yc.getYear() + " color: " + yc.getColor()
+ " count: " + ct));
You of course have to implement the Serializer and Deserializer for this object (and I did with YearColorSerdes()). My output when running the Kafka Streams application gives me updates on the modified counts, a la:
year: 2012 color: Maroon count: 2
year: 2013 color: Khaki count: 1
year: 2012 color: Crimson count: 5
year: 2011 color: Pink count: 4
year: 2011 color: Green count: 2
which is what I was looking for.
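As a language-neutral illustration of the compound-key idea, here is a minimal Python sketch that counts by (year, color) and then nests the counts per year, roughly like the output asked for in the question. The sample records and the function name count_by_year_and_color are hypothetical, and this is plain in-memory counting, not Kafka Streams.

```python
from collections import Counter, defaultdict

# Hypothetical records; only the fields used for grouping are included.
cars = [
    {"year": 2011, "color": "Blue"},
    {"year": 2011, "color": "Blue"},
    {"year": 2011, "color": "Green"},
    {"year": 2012, "color": "Red"},
    {"year": 1993, "color": "Blue"},  # dropped by the year filter
]


def count_by_year_and_color(cars, min_year=2010):
    """Count cars per (year, color) and nest the result by year."""
    # The tuple (year, color) plays the role of the YearColor key object.
    counts = Counter(
        (car["year"], car["color"].lower())
        for car in cars
        if car["year"] > min_year
    )
    nested = defaultdict(dict)
    for (year, color), n in counts.items():
        nested[str(year)][color] = n
    return dict(nested)


print(count_by_year_and_color(cars))
```

In the streaming version each update arrives as a separate (key, count) record, so building a nested view like this would be a further aggregation step.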

JQ If then Else

How would I do an if-then-else on the value of a field? The data I am working with looks like this:
{"_key": "USCA3DC_8f4521822c099c3e",
"partner_attributions": ["This business is a Yelp advertiser."],
"showcase_photos": [
["Mathnasium of Westwood - Westwood, CA, United States. Nice and caring instructors", "http://s3-media1.fl.yelpcdn.com/bphoto/KeKAhvy2HHY4KGpvA24VaA/ls.jpg"],
["Mathnasium of Westwood - Westwood, CA, United States. Prize box and estimation jar!", "http://s3-media3.fl.yelpcdn.com/bphoto/lJWHHCAVaUMfeFD7GDKtHw/ls.jpg"],
["Mathnasium of Westwood - Westwood, CA, United States. New table setup!!!!", "http://s3-media2.fl.yelpcdn.com/bphoto/kVYJrYqDRHPOH4F2uTuFVg/ls.jpg"],
["Mathnasium of Westwood - Westwood, CA, United States. Halloween party", "http://s3-media3.fl.yelpcdn.com/bphoto/wKm5KjF0V8MsPTVSuofPEQ/180s.jpg"],
["Mathnasium of Westwood - Westwood, CA, United States", "http://s3-media4.fl.yelpcdn.com/bphoto/r2981msJm0c1ocU09blb1A/180s.jpg"],
["Mathnasium of Westwood - Westwood, CA, United States", "http://s3-media3.fl.yelpcdn.com/bphoto/r2Vgo18YKeUojDvjQMRF_A/180s.jpg"]
],
"review_count": "24",
"yelp_id": "t7WyXcABE3xj20G-UqXalA",
"rating_value": "5.0",
"coordinates": {
"latitude": "34.042568",
"longitude": "-118.431038"
}
}
This is just a small sample of it. I am using this expression to parse it:
{_key, last_visited, name, phone, price_range, rating_value, review_count, updated, url, website, yelp_id} + (if (.partner_attributions | length) > 0 then .partner_attributions == "yes" else .partner_attributions == "no" end) + ([leaf_paths as $path | {"key": $path | map(tostring) | join("_"),"value": getpath($path)}] | from_entries)
What I want is an if-then-else for the partner_attributions field: if there is something there, make it "yes", and if it is null, make it "no". I have tried a few things with no success; it seems simple enough, but I am having trouble figuring it out.
Can someone help?
Better yet perhaps:
.partner_attributions |= (if length > 0 then "yes" else "no" end)
This just updates the one field without modifying anything else.
You need to create an actual object to add to the first object. Change the if expression to:
{partner_attributions: (if (.partner_attributions | length) > 0 then "yes" else "no" end)}
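If it helps to see the same rule outside jq, here is a rough Python sketch of what both expressions compute for that one field. The function name flag_partner_attributions is made up; it treats a missing, null, or empty value as "no", which matches jq's length being 0 for null and for an empty array.

```python
def flag_partner_attributions(record):
    """Return a copy of record with partner_attributions replaced by "yes"/"no"."""
    updated = dict(record)  # leave the input untouched
    # None, a missing key, and an empty list are all falsy, so they map to "no".
    updated["partner_attributions"] = (
        "yes" if record.get("partner_attributions") else "no"
    )
    return updated


sample = {
    "_key": "USCA3DC_8f4521822c099c3e",
    "partner_attributions": ["This business is a Yelp advertiser."],
    "review_count": "24",
}
print(flag_partner_attributions(sample))
```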

Creating a continuous heat map in R

I have a series of x and y coordinates that each have a distance attached to them. I would like to create a heat map that displays the average distance for every point within the x and y ranges. Since the points are not evenly spaced on a lattice, the method would require some kind of smoothing function that clusters the data, calculates the average of the points in the vicinity of each location, and then represents that average with a color.
So far, using ggplot2, I can only find methods like stat_density2d and geom_tile, which only work for displaying point density and representing evenly spaced points (as far as I can tell).
Ideally it would follow the same principle as this image:
in which colors were assigned based on the given points in the vicinity even though the density and placement of the points was not uniform.
I do not want to create a heat map in matrix form like this image:
in which a table is color-coded. Instead, I would like to create a continuous heat map using non-uniformly distributed x and y coordinates that, in effect, displays the limit in which the data is broken into infinitely many rectangles. This may not be the actual method used by the function, but it provides a general idea as to what I'm looking for.
Here is some sample data:
data=data.frame(x=c(1,1,2,2,3,4,5,6,7,7,8,9),
y=c(2,4,5,1,3,8,4,8,1,1,6,9),
distance=c(66,84,93,76,104,29,70,19,60,50,46,36))
How can I make a heat map with distance as the color scale that covers the entire range of numbers, like the plot in the first link provided?
Any help is greatly appreciated!
To generate a continuous map from irregularly spaced coordinates, you first need to interpolate onto a regular grid (here using the function interp from the package akima):
require(akima)
data <- data.frame(x=c(1,1,2,2,3,4,5,6,7,7,8,9),
y=c(2,4,5,1,3,8,4,8,1,1,6,9),
distance=c(66,84,93,76,104,29,70,19,60,50,46,36))
resolution <- 0.1 # decrease this number for a finer grid (warning: the size of the result grows very quickly)
a <- interp(x=data$x, y=data$y, z=data$distance,
xo=seq(min(data$x),max(data$x),by=resolution),
yo=seq(min(data$y),max(data$y),by=resolution), duplicate="mean")
image(a) #you can of course modify the color palette and the color categories. See ?image for more explanation
Or you can use, for the plotting itself, function filled.contour:
filled.contour(a, color.palette=heat.colors)
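To show what "interpolating onto a regular grid" means in practice, here is a small pure-Python sketch. Note that it swaps in inverse-distance weighting, which is a simpler technique than the one akima::interp uses, so the values will differ from the R output. The function names idw and make_grid and the power parameter are assumptions for this illustration; duplicate coordinates are averaged, similar in spirit to duplicate="mean".

```python
import math

# (x, y, distance) triples from the sample data in the question.
points = [
    (1, 2, 66), (1, 4, 84), (2, 5, 93), (2, 1, 76), (3, 3, 104), (4, 8, 29),
    (5, 4, 70), (6, 8, 19), (7, 1, 60), (7, 1, 50), (8, 6, 46), (9, 9, 36),
]


def idw(x, y, points, power=2.0):
    """Inverse-distance-weighted estimate of the value at (x, y)."""
    # If the location coincides with data points, return their mean.
    exact = [d for px, py, d in points if px == x and py == y]
    if exact:
        return sum(exact) / len(exact)
    num = den = 0.0
    for px, py, d in points:
        weight = 1.0 / math.hypot(px - x, py - y) ** power
        num += weight * d
        den += weight
    return num / den


def make_grid(points, resolution=1.0):
    """Evaluate idw on a regular grid spanning the data; z[row][col] is (y, x)."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    nx = int(round((max(xs) - min(xs)) / resolution)) + 1
    ny = int(round((max(ys) - min(ys)) / resolution)) + 1
    grid_x = [min(xs) + i * resolution for i in range(nx)]
    grid_y = [min(ys) + j * resolution for j in range(ny)]
    z = [[idw(gx, gy, points) for gx in grid_x] for gy in grid_y]
    return grid_x, grid_y, z


grid_x, grid_y, z = make_grid(points, resolution=1.0)
print(len(grid_x), len(grid_y))
```

The resulting z matrix is the kind of regular lattice that image-style plotting functions expect, which is the role the interp output plays in the R code above.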
There is a user-written function here that produces heatmaps using ggplot2:
http://www.r-bloggers.com/ggheat-a-ggplot2-style-heatmap-function/
And their example image:
If what you want is a topo map as in your example, there are plenty of tools for that (just search for "topo map").
And finally, there's the isarithmic map, which just goes to show that you need to make clear exactly what you want done if you want some smoothing incorporated:
http://dsparks.wordpress.com/2011/10/24/isarithmic-maps-of-public-opinion-data/
Using the akima::interp solution suggested by @plannapus, you can convert the result to a ggplot2 heatmap.
The advantage of this ggplot2 solution is that you can easily add the original points with geom_point() or density curves with geom_density2d() (although here the density will be unreliable with the 12 points you have).
library(akima)
library(tidyverse)
data <- data.frame(x=c(1,1,2,2,3,4,5,6,7,7,8,9),
y=c(2,4,5,1,3,8,4,8,1,1,6,9),
distance=c(66,84,93,76,104,29,70,19,60,50,46,36))
resolution <- 0.1 # decrease this number for a finer grid (warning: the size of the result grows very quickly)
a <- interp(x=data$x, y=data$y, z=data$distance,
xo=seq(min(data$x),max(data$x),by=resolution),
yo=seq(min(data$y),max(data$y),by=resolution), duplicate="mean")
res <- a$z %>%
magrittr::set_colnames(a$y) %>%
as_tibble() %>%
mutate(x=a$x) %>%
gather(y, z, -x, convert=TRUE)
res %>%
ggplot(aes(x, y)) +
geom_tile(aes(fill=z)) +
geom_point(data=data) +
scale_fill_viridis_c()
Created on 2020-01-29 by the reprex package (v0.3.0.9001)
ggplot2::ggfluctuation(data, type="colour")
I can't give out all this data but the head is below in the dput structure.
structure(list(X1 = 236:241, HomeTeam = structure(c(8L, 19L,
37L, 4L, 6L, 15L), .Label = c("Arizona Cardinals", "Atlanta Falcons",
"Baltimore Ravens", "Buffalo Bills", "Carolina Panthers", "Chicago Bears",
"Cincinnati Bengals", "Cleveland Browns", "Dallas Cowboys", "Denver Broncos",
"Detroit Lions", "Green Bay Packers", "Houston Oilers", "Houston Texans",
"Indianapolis Colts", "Jacksonville Jaguars", "Kansas City Chiefs",
"Los Angeles Raiders", "Los Angeles Rams", "Miami Dolphins",
"Minnesota Vikings", "New England Patriots", "New Orleans Saints",
"New York Giants", "New York Jets", "Oakland Raiders", "Philadelphia Eagles",
"Phoenix Cardinals", "Pittsburgh Steelers", "San Diego Chargers",
"San Francisco 49ers", "Seattle Seahawks", "St. Louis Rams",
"Tampa Bay Buccaneers", "Tennessee Oilers", "Tennessee Titans",
"Washington Redskins"), class = "factor"), AwayTeam = structure(c(9L,
28L, 11L, 20L, 21L, 22L), .Label = c("Arizona Cardinals", "Atlanta Falcons",
"Baltimore Ravens", "Buffalo Bills", "Carolina Panthers", "Chicago Bears",
"Cincinnati Bengals", "Cleveland Browns", "Dallas Cowboys", "Denver Broncos",
"Detroit Lions", "Green Bay Packers", "Houston Oilers", "Houston Texans",
"Indianapolis Colts", "Jacksonville Jaguars", "Kansas City Chiefs",
"Los Angeles Raiders", "Los Angeles Rams", "Miami Dolphins",
"Minnesota Vikings", "New England Patriots", "New Orleans Saints",
"New York Giants", "New York Jets", "Oakland Raiders", "Philadelphia Eagles",
"Phoenix Cardinals", "Pittsburgh Steelers", "San Diego Chargers",
"San Francisco 49ers", "Seattle Seahawks", "St. Louis Rams",
"Tampa Bay Buccaneers", "Tennessee Oilers", "Tennessee Titans",
"Washington Redskins"), class = "factor"), Date = structure(c(45L,
45L, 45L, 45L, 45L, 45L), .Label = c("1990-09-09", "1990-09-10",
"1990-09-16", "1990-09-17", "1990-09-23", "1990-09-24", "1990-09-30",
"1990-10-01", "1990-10-07", "1990-10-08", "1990-10-14", "1990-10-15",
"1990-10-18", "1990-10-21", "1990-10-22", "1990-10-28", "1990-10-29",
"1990-11-04", "1990-11-05", "1990-11-11", "1990-11-12", "1990-11-18",
"1990-11-19", "1990-11-22", "1990-11-25", "1990-11-26", "1990-12-02",
"1990-12-03", "1990-12-09", "1990-12-10", "1990-12-15", "1990-12-16",
"1990-12-17", "1990-12-22", "1990-12-23", "1990-12-29", "1990-12-30",
"1990-12-31", "1991-01-05", "1991-01-06", "1991-01-12", "1991-01-13",
"1991-01-20", "1991-01-27", "1991-09-01", "1991-09-02", "1991-09-08",
"1991-09-09", "1991-09-15", "1991-09-16", "1991-09-22", "1991-09-23",
"1991-09-29", "1991-09-30", "1991-10-06", "1991-10-07", "1991-10-13",
"1991-10-14", "1991-10-17", "1991-10-20", "1991-10-21", "1991-10-27",
"1991-10-28", "1991-11-03", "1991-11-04", "1991-11-10", "1991-11-11",
"1991-11-17", "1991-11-18", "1991-11-24", "1991-11-25", "1991-11-28",
"1991-12-01", "1991-12-02", "1991-12-08", "1991-12-09", "1991-12-14",
"1991-12-15", "1991-12-16", "1991-12-21", "1991-12-22", "1991-12-23",
"1991-12-28", "1991-12-29", "1992-01-04", "1992-01-05", "1992-01-12",
"1992-01-26", "1992-09-06", "1992-09-07", "1992-09-13", "1992-09-14",
"1992-09-20", "1992-09-21", "1992-09-27", "1992-09-28", "1992-10-04",
"1992-10-05", "1992-10-11", "1992-10-12", "1992-10-15", "1992-10-18",
"1992-10-19", "1992-10-25", "1992-10-26", "1992-11-01", "1992-11-02",
"1992-11-08", "1992-11-09", "1992-11-15", "1992-11-16", "1992-11-22",
"1992-11-23", "1992-11-26", "1992-11-29", "1992-11-30", "1992-12-03",
"1992-12-06", "1992-12-07", "1992-12-12", "1992-12-13", "1992-12-14",
"1992-12-19", "1992-12-20", "1992-12-21", "1992-12-26", "1992-12-27",
"1992-12-28", "1993-01-02", "1993-01-03", "1993-01-09", "1993-01-10",
"1993-01-17", "1993-01-31", "1993-09-05", "1993-09-06", "1993-09-12",
"1993-09-13", "1993-09-19", "1993-09-20", "1993-09-26", "1993-09-27",
"1993-10-03", "1993-10-04", "1993-10-10", "1993-10-11", "1993-10-14",
"1993-10-17", "1993-10-18", "1993-10-24", "1993-10-25", "1993-10-31",
"1993-11-01", "1993-11-07", "1993-11-08", "1993-11-14", "1993-11-15",
"1993-11-21", "1993-11-22", "1993-11-25", "1993-11-28", "1993-11-29",
"1993-12-05", "1993-12-06", "1993-12-11", "1993-12-12", "1993-12-13",
"1993-12-18", "1993-12-19", "1993-12-20", "1993-12-25", "1993-12-26",
"1993-12-27", "1993-12-31", "1994-01-02", "1994-01-03", "1994-01-08",
"1994-01-09", "1994-01-15", "1994-01-16", "1994-01-23", "1994-01-30",
"1994-09-04", "1994-09-05", "1994-09-11", "1994-09-12", "1994-09-18",
"1994-09-19", "1994-09-25", "1994-09-26", "1994-10-02", "1994-10-03",
"1994-10-09", "1994-10-10", "1994-10-13", "1994-10-16", "1994-10-17",
"1994-10-20", "1994-10-23", "1994-10-24", "1994-10-30", "1994-10-31",
"1994-11-06", "1994-11-07", "1994-11-13", "1994-11-14", "1994-11-20",
"1994-11-21", "1994-11-24", "1994-11-27", "1994-11-28", "1994-12-01",
"1994-12-04", "1994-12-05", "1994-12-10", "1994-12-11", "1994-12-12",
"1994-12-17", "1994-12-18", "1994-12-19", "1994-12-24", "1994-12-25",
"1994-12-26", "1994-12-31", "1995-01-01", "1995-01-07", "1995-01-08",
"1995-01-15", "1995-01-29", "1995-09-03", "1995-09-04", "1995-09-10",
"1995-09-11", "1995-09-17", "1995-09-18", "1995-09-24", "1995-09-25",
"1995-10-01", "1995-10-02", "1995-10-08", "1995-10-09", "1995-10-12",
"1995-10-15", "1995-10-16", "1995-10-19", "1995-10-22", "1995-10-23",
"1995-10-29", "1995-10-30", "1995-11-05", "1995-11-06", "1995-11-12",
"1995-11-13", "1995-11-19", "1995-11-20", "1995-11-23", "1995-11-26",
"1995-11-27", "1995-11-30", "1995-12-03", "1995-12-04", "1995-12-09",
"1995-12-10", "1995-12-11", "1995-12-16", "1995-12-17", "1995-12-18",
"1995-12-23", "1995-12-24", "1995-12-25", "1995-12-30", "1995-12-31",
"1996-01-06", "1996-01-07", "1996-01-14", "1996-01-28", "1996-09-01",
"1996-09-02", "1996-09-08", "1996-09-09", "1996-09-15", "1996-09-16",
"1996-09-22", "1996-09-23", "1996-09-29", "1996-09-30", "1996-10-06",
"1996-10-07", "1996-10-13", "1996-10-14", "1996-10-17", "1996-10-20",
"1996-10-21", "1996-10-27", "1996-10-28", "1996-11-03", "1996-11-04",
"1996-11-10", "1996-11-11", "1996-11-17", "1996-11-18", "1996-11-24",
"1996-11-25", "1996-11-28", "1996-12-01", "1996-12-02", "1996-12-05",
"1996-12-08", "1996-12-09", "1996-12-14", "1996-12-15", "1996-12-16",
"1996-12-21", "1996-12-22", "1996-12-23", "1996-12-28", "1996-12-29",
"1997-01-04", "1997-01-05", "1997-01-12", "1997-01-26", "1997-08-31",
"1997-09-01", "1997-09-07", "1997-09-08", "1997-09-14", "1997-09-15",
"1997-09-21", "1997-09-22", "1997-09-28", "1997-09-29", "1997-10-05",
"1997-10-06", "1997-10-12", "1997-10-13", "1997-10-16", "1997-10-19",
"1997-10-20", "1997-10-26", "1997-10-27", "1997-11-02", "1997-11-03",
"1997-11-09", "1997-11-10", "1997-11-16", "1997-11-17", "1997-11-23",
"1997-11-24", "1997-11-27", "1997-11-30", "1997-12-01", "1997-12-04",
"1997-12-07", "1997-12-08", "1997-12-13", "1997-12-14", "1997-12-15",
"1997-12-20", "1997-12-21", "1997-12-22", "1997-12-27", "1997-12-28",
"1998-01-03", "1998-01-04", "1998-01-11", "1998-01-25", "1998-09-06",
"1998-09-07", "1998-09-13", "1998-09-14", "1998-09-20", "1998-09-21",
"1998-09-27", "1998-09-28", "1998-10-04", "1998-10-05", "1998-10-11",
"1998-10-12", "1998-10-15", "1998-10-18", "1998-10-19", "1998-10-25",
"1998-10-26", "1998-11-01", "1998-11-02", "1998-11-08", "1998-11-09",
"1998-11-15", "1998-11-16", "1998-11-22", "1998-11-23", "1998-11-26",
"1998-11-29", "1998-11-30", "1998-12-03", "1998-12-06", "1998-12-07",
"1998-12-13", "1998-12-14", "1998-12-19", "1998-12-20", "1998-12-21",
"1998-12-26", "1998-12-27", "1998-12-28", "1999-01-02", "1999-01-03",
"1999-01-09", "1999-01-10", "1999-01-17", "1999-01-31", "1999-09-12",
"1999-09-13", "1999-09-19", "1999-09-20", "1999-09-26", "1999-09-27",
"1999-10-03", "1999-10-04", "1999-10-10", "1999-10-11", "1999-10-17",
"1999-10-18", "1999-10-21", "1999-10-24", "1999-10-25", "1999-10-31",
"1999-11-01", "1999-11-07", "1999-11-08", "1999-11-14", "1999-11-15",
"1999-11-21", "1999-11-22", "1999-11-25", "1999-11-28", "1999-11-29",
"1999-12-02", "1999-12-05", "1999-12-06", "1999-12-09", "1999-12-12",
"1999-12-13", "1999-12-18", "1999-12-19", "1999-12-20", "1999-12-24",
"1999-12-25", "1999-12-26", "1999-12-27", "2000-01-02", "2000-01-03",
"2000-01-08", "2000-01-09", "2000-01-15", "2000-01-16", "2000-01-23",
"2000-01-30", "2000-09-03", "2000-09-04", "2000-09-10", "2000-09-11",
"2000-09-17", "2000-09-18", "2000-09-24", "2000-09-25", "2000-10-01",
"2000-10-02", "2000-10-08", "2000-10-09", "2000-10-15", "2000-10-16",
"2000-10-19", "2000-10-22", "2000-10-23", "2000-10-29", "2000-10-30",
"2000-11-05", "2000-11-06", "2000-11-12", "2000-11-13", "2000-11-19",
"2000-11-20", "2000-11-23", "2000-11-26", "2000-11-27", "2000-11-30",
"2000-12-03", "2000-12-04", "2000-12-10", "2000-12-11", "2000-12-16",
"2000-12-17", "2000-12-18", "2000-12-23", "2000-12-24", "2000-12-25",
"2000-12-30", "2000-12-31", "2001-01-06", "2001-01-07", "2001-01-14",
"2001-01-28", "2001-09-09", "2001-09-10", "2001-09-23", "2001-09-24",
"2001-09-30", "2001-10-01", "2001-10-07", "2001-10-08", "2001-10-14",
"2001-10-15", "2001-10-18", "2001-10-21", "2001-10-22", "2001-10-25",
"2001-10-28", "2001-10-29", "2001-11-04", "2001-11-05", "2001-11-11",
"2001-11-12", "2001-11-18", "2001-11-19", "2001-11-22", "2001-11-25",
"2001-11-26", "2001-11-29", "2001-12-02", "2001-12-03", "2001-12-09",
"2001-12-10", "2001-12-15", "2001-12-16", "2001-12-17", "2001-12-22",
"2001-12-23", "2001-12-29", "2001-12-30", "2002-01-06", "2002-01-07",
"2002-01-12", "2002-01-13", "2002-01-19", "2002-01-20", "2002-01-27",
"2002-02-03", "2002-09-05", "2002-09-08", "2002-09-09", "2002-09-15",
"2002-09-16", "2002-09-22", "2002-09-23", "2002-09-29", "2002-09-30",
"2002-10-06", "2002-10-07", "2002-10-13", "2002-10-14", "2002-10-20",
"2002-10-21", "2002-10-27", "2002-10-28", "2002-11-03", "2002-11-04",
"2002-11-10", "2002-11-11", "2002-11-17", "2002-11-18", "2002-11-24",
"2002-11-25", "2002-11-28", "2002-12-01", "2002-12-02", "2002-12-08",
"2002-12-09", "2002-12-15", "2002-12-16", "2002-12-21", "2002-12-22",
"2002-12-23", "2002-12-28", "2002-12-29", "2002-12-30", "2003-01-04",
"2003-01-05", "2003-01-11", "2003-01-12", "2003-01-19", "2003-01-26",
"2003-09-04", "2003-09-07", "2003-09-08", "2003-09-14", "2003-09-15",
"2003-09-21", "2003-09-22", "2003-09-28", "2003-09-29", "2003-10-05",
"2003-10-06", "2003-10-12", "2003-10-13", "2003-10-19", "2003-10-20",
"2003-10-26", "2003-10-27", "2003-11-02", "2003-11-03", "2003-11-09",
"2003-11-10", "2003-11-16", "2003-11-17", "2003-11-23", "2003-11-24",
"2003-11-27", "2003-11-30", "2003-12-01", "2003-12-07", "2003-12-08",
"2003-12-14", "2003-12-15", "2003-12-20", "2003-12-21", "2003-12-22",
"2003-12-27", "2003-12-28", "2004-01-03", "2004-01-04", "2004-01-10",
"2004-01-11", "2004-01-18", "2004-02-01", "2004-09-09", "2004-09-11",
"2004-09-12", "2004-09-13", "2004-09-19", "2004-09-20", "2004-09-26",
"2004-09-27", "2004-10-03", "2004-10-04", "2004-10-10", "2004-10-11",
"2004-10-17", "2004-10-18", "2004-10-24", "2004-10-25", "2004-10-31",
"2004-11-01", "2004-11-07", "2004-11-08", "2004-11-14", "2004-11-15",
"2004-11-21", "2004-11-22", "2004-11-25", "2004-11-28", "2004-11-29",
"2004-12-05", "2004-12-06", "2004-12-12", "2004-12-13", "2004-12-18",
"2004-12-19", "2004-12-20", "2004-12-24", "2004-12-25", "2004-12-26",
"2004-12-27", "2005-01-02", "2005-01-08", "2005-01-09", "2005-01-15",
"2005-01-16", "2005-01-23", "2005-02-06", "2005-09-08", "2005-09-11",
"2005-09-12", "2005-09-18", "2005-09-19", "2005-09-25", "2005-09-26",
"2005-10-02", "2005-10-03", "2005-10-09", "2005-10-10", "2005-10-16",
"2005-10-17", "2005-10-21", "2005-10-23", "2005-10-24", "2005-10-30",
"2005-10-31", "2005-11-06", "2005-11-07", "2005-11-13", "2005-11-14",
"2005-11-20", "2005-11-21", "2005-11-24", "2005-11-27", "2005-11-28",
"2005-12-04", "2005-12-05", "2005-12-11", "2005-12-12", "2005-12-17",
"2005-12-18", "2005-12-19", "2005-12-24", "2005-12-25", "2005-12-26",
"2005-12-31", "2006-01-01", "2006-01-07", "2006-01-08", "2006-01-14",
"2006-01-15", "2006-01-22", "2006-02-05", "2006-09-07", "2006-09-10",
"2006-09-11", "2006-09-17", "2006-09-18", "2006-09-24", "2006-09-25",
"2006-10-01", "2006-10-02", "2006-10-08", "2006-10-09", "2006-10-15",
"2006-10-16", "2006-10-22", "2006-10-23", "2006-10-29", "2006-10-30",
"2006-11-05", "2006-11-06", "2006-11-12", "2006-11-13", "2006-11-19",
"2006-11-20", "2006-11-23", "2006-11-26", "2006-11-27", "2006-11-30",
"2006-12-03", "2006-12-04", "2006-12-07", "2006-12-10", "2006-12-11",
"2006-12-14", "2006-12-16", "2006-12-17", "2006-12-18", "2006-12-21",
"2006-12-23", "2006-12-24", "2006-12-25", "2006-12-30", "2006-12-31",
"2007-01-06", "2007-01-07", "2007-01-13", "2007-01-14", "2007-01-21",
"2007-02-04", "2007-09-06", "2007-09-09", "2007-09-10", "2007-09-16",
"2007-09-17", "2007-09-23", "2007-09-24", "2007-09-30", "2007-10-01",
"2007-10-07", "2007-10-08", "2007-10-14", "2007-10-15", "2007-10-21",
"2007-10-22", "2007-10-28", "2007-10-29", "2007-11-04", "2007-11-05",
"2007-11-11", "2007-11-12", "2007-11-18", "2007-11-19", "2007-11-22",
"2007-11-25", "2007-11-26", "2007-11-29", "2007-12-02", "2007-12-03",
"2007-12-06", "2007-12-09", "2007-12-10", "2007-12-13", "2007-12-15",
"2007-12-16", "2007-12-17", "2007-12-20", "2007-12-22", "2007-12-23",
"2007-12-24", "2007-12-29", "2007-12-30", "2008-01-05", "2008-01-06",
"2008-01-12", "2008-01-13", "2008-01-20", "2008-02-03", "2008-09-04",
"2008-09-07", "2008-09-08", "2008-09-14", "2008-09-15", "2008-09-21",
"2008-09-22", "2008-09-28", "2008-09-29", "2008-10-05", "2008-10-06",
"2008-10-12", "2008-10-13", "2008-10-19", "2008-10-20", "2008-10-26",
"2008-10-27", "2008-11-02", "2008-11-03", "2008-11-06", "2008-11-09",
"2008-11-10", "2008-11-13", "2008-11-16", "2008-11-17", "2008-11-20",
"2008-11-23", "2008-11-24", "2008-11-27", "2008-11-30", "2008-12-01",
"2008-12-04", "2008-12-07", "2008-12-08", "2008-12-11", "2008-12-14",
"2008-12-15", "2008-12-18", "2008-12-20", "2008-12-21", "2008-12-22",
"2008-12-28", "2009-01-03", "2009-01-04", "2009-01-10", "2009-01-11",
"2009-01-18", "2009-02-01", "2009-09-10", "2009-09-13", "2009-09-14",
"2009-09-20", "2009-09-21", "2009-09-27", "2009-09-28", "2009-10-04",
"2009-10-05", "2009-10-11", "2009-10-12", "2009-10-18", "2009-10-19",
"2009-10-25", "2009-10-26", "2009-11-01", "2009-11-02", "2009-11-08",
"2009-11-09", "2009-11-12", "2009-11-15", "2009-11-16", "2009-11-19",
"2009-11-22", "2009-11-23", "2009-11-26", "2009-11-29", "2009-11-30",
"2009-12-03", "2009-12-06", "2009-12-07", "2009-12-10", "2009-12-13",
"2009-12-14", "2009-12-17", "2009-12-19", "2009-12-20", "2009-12-21",
"2009-12-25", "2009-12-27", "2009-12-28", "2010-01-03", "2010-01-09",
"2010-01-10", "2010-01-16", "2010-01-17", "2010-01-24", "2010-02-07",
"2010-09-09", "2010-09-12", "2010-09-13", "2010-09-19", "2010-09-20",
"2010-09-26", "2010-09-27", "2010-10-03", "2010-10-04", "2010-10-10",
"2010-10-11", "2010-10-17", "2010-10-18", "2010-10-24", "2010-10-25",
"2010-10-31", "2010-11-01", "2010-11-07", "2010-11-08", "2010-11-11",
"2010-11-14", "2010-11-15", "2010-11-18", "2010-11-21", "2010-11-22",
"2010-11-25", "2010-11-28", "2010-11-29", "2010-12-02", "2010-12-05",
"2010-12-06", "2010-12-09", "2010-12-12", "2010-12-13", "2010-12-16",
"2010-12-19", "2010-12-20", "2010-12-23", "2010-12-25", "2010-12-26",
"2010-12-27", "2010-12-28", "2011-01-02", "2011-01-08", "2011-01-09",
"2011-01-15", "2011-01-16", "2011-01-23", "2011-02-06"), class = "factor"),
Season = c(1991, 1991, 1991, 1991, 1991, 1991), HomeRecord = structure(c(1L,
1L, 17L, 17L, 17L, 1L), .Label = c("(0-1-0)", "(0-10-0)",
"(0-11-0)", "(0-12-0)", "(0-13-0)", "(0-14-0)", "(0-15-0)",
"(0-16-0)", "(0-2-0)", "(0-3-0)", "(0-4-0)", "(0-5-0)", "(0-6-0)",
"(0-7-0)", "(0-8-0)", "(0-9-0)", "(1-0-0)", "(1-1-0)", "(1-10-0)",
"(1-10-1)", "(1-11-0)", "(1-11-1)", "(1-12-0)", "(1-13-0)",
"(1-14-0)", "(1-15-0)", "(1-2-0)", "(1-3-0)", "(1-4-0)",
"(1-5-0)", "(1-6-0)", "(1-7-0)", "(1-8-0)", "(1-8-1)", "(1-9-0)",
"(1-9-1)", "(10-0-0)", "(10-1-0)", "(10-2-0)", "(10-3-0)",
"(10-4-0)", "(10-5-0)", "(10-5-1)", "(10-6-0)", "(10-6-1)",
"(10-7-0)", "(10-7-1)", "(10-8-0)", "(11-0-0)", "(11-1-0)",
"(11-2-0)", "(11-3-0)", "(11-4-0)", "(11-5-0)", "(11-5-1)",
"(11-6-0)", "(11-6-1)", "(11-7-0)", "(11-7-1)", "(11-8-0)",
"(12-0-0)", "(12-1-0)", "(12-2-0)", "(12-3-0)", "(12-4-0)",
"(12-5-0)", "(12-6-0)", "(12-7-0)", "(12-8-0)", "(13-0-0)",
"(13-1-0)", "(13-2-0)", "(13-3-0)", "(13-4-0)", "(13-5-0)",
"(13-6-0)", "(14-0-0)", "(14-1-0)", "(14-2-0)", "(14-3-0)",
"(14-4-0)", "(14-5-0)", "(14-6-0)", "(15-0-0)", "(15-1-0)",
"(15-2-0)", "(15-3-0)", "(15-4-0)", "(15-5-0)", "(16-0-0)",
"(16-1-0)", "(16-2-0)", "(16-3-0)", "(16-4-0)", "(17-0-0)",
"(17-2-0)", "(18-0-0)", "(18-1-0)", "(2-0-0)", "(2-1-0)",
"(2-10-0)", "(2-11-0)", "(2-11-1)", "(2-12-0)", "(2-13-0)",
"(2-14-0)", "(2-2-0)", "(2-3-0)", "(2-4-0)", "(2-5-0)", "(2-6-0)",
"(2-7-0)", "(2-8-0)", "(2-9-0)", "(3-0-0)", "(3-1-0)", "(3-10-0)",
"(3-11-0)", "(3-11-1)", "(3-12-0)", "(3-13-0)", "(3-2-0)",
"(3-3-0)", "(3-4-0)", "(3-5-0)", "(3-6-0)", "(3-7-0)", "(3-8-0)",
"(3-9-0)", "(4-0-0)", "(4-1-0)", "(4-10-0)", "(4-11-0)",
"(4-11-1)", "(4-12-0)", "(4-2-0)", "(4-3-0)", "(4-4-0)",
"(4-5-0)", "(4-6-0)", "(4-6-1)", "(4-7-0)", "(4-7-1)", "(4-8-0)",
"(4-8-1)", "(4-9-0)", "(5-0-0)", "(5-1-0)", "(5-10-0)", "(5-11-0)",
"(5-2-0)", "(5-3-0)", "(5-3-1)", "(5-4-0)", "(5-4-1)", "(5-5-0)",
"(5-5-1)", "(5-6-0)", "(5-6-1)", "(5-7-0)", "(5-8-0)", "(5-8-1)",
"(5-9-0)", "(6-0-0)", "(6-1-0)", "(6-10-0)", "(6-2-0)", "(6-3-0)",
"(6-3-1)", "(6-4-0)", "(6-4-1)", "(6-5-0)", "(6-5-1)", "(6-6-0)",
"(6-6-1)", "(6-7-0)", "(6-7-1)", "(6-8-0)", "(6-8-1)", "(6-9-0)",
"(6-9-1)", "(7-0-0)", "(7-1-0)", "(7-2-0)", "(7-3-0)", "(7-3-1)",
"(7-4-0)", "(7-4-1)", "(7-5-0)", "(7-5-1)", "(7-6-0)", "(7-6-1)",
"(7-7-0)", "(7-7-1)", "(7-8-0)", "(7-9-0)", "(8-0-0)", "(8-1-0)",
"(8-10-0)", "(8-2-0)", "(8-3-0)", "(8-3-1)", "(8-4-0)", "(8-4-1)",
"(8-5-0)", "(8-5-1)", "(8-6-0)", "(8-6-1)", "(8-7-0)", "(8-7-1)",
"(8-8-0)", "(8-9-0)", "(9-0-0)", "(9-1-0)", "(9-2-0)", "(9-3-0)",
"(9-4-0)", "(9-5-0)", "(9-5-1)", "(9-6-0)", "(9-6-1)", "(9-7-0)",
"(9-8-0)", "(9-9-0)"), class = "factor"), AwayRecord = structure(c(17L,
17L, 1L, 1L, 1L, 17L), .Label = c("(0-1-0)", "(0-10-0)",
"(0-11-0)", "(0-12-0)", "(0-13-0)", "(0-14-0)", "(0-15-0)",
"(0-16-0)", "(0-2-0)", "(0-3-0)", "(0-4-0)", "(0-5-0)", "(0-6-0)",
"(0-7-0)", "(0-8-0)", "(0-9-0)", "(1-0-0)", "(1-1-0)", "(1-10-0)",
"(1-10-1)", "(1-11-0)", "(1-11-1)", "(1-12-0)", "(1-13-0)",
"(1-14-0)", "(1-15-0)", "(1-2-0)", "(1-3-0)", "(1-4-0)",
"(1-5-0)", "(1-6-0)", "(1-7-0)", "(1-8-0)", "(1-8-1)", "(1-9-0)",
"(1-9-1)", "(10-0-0)", "(10-1-0)", "(10-2-0)", "(10-3-0)",
"(10-4-0)", "(10-5-0)", "(10-5-1)", "(10-6-0)", "(10-6-1)",
"(10-7-0)", "(10-7-1)", "(10-8-0)", "(11-0-0)", "(11-1-0)",
"(11-2-0)", "(11-3-0)", "(11-4-0)", "(11-5-0)", "(11-5-1)",
"(11-6-0)", "(11-6-1)", "(11-7-0)", "(11-7-1)", "(11-8-0)",
"(12-0-0)", "(12-1-0)", "(12-2-0)", "(12-3-0)", "(12-4-0)",
"(12-5-0)", "(12-6-0)", "(12-7-0)", "(12-8-0)", "(13-0-0)",
"(13-1-0)", "(13-2-0)", "(13-3-0)", "(13-4-0)", "(13-5-0)",
"(13-6-0)", "(14-0-0)", "(14-1-0)", "(14-2-0)", "(14-3-0)",
"(14-4-0)", "(14-5-0)", "(14-6-0)", "(15-0-0)", "(15-1-0)",
"(15-2-0)", "(15-3-0)", "(15-4-0)", "(15-5-0)", "(16-0-0)",
"(16-1-0)", "(16-2-0)", "(16-3-0)", "(16-4-0)", "(17-0-0)",
"(17-2-0)", "(18-0-0)", "(18-1-0)", "(2-0-0)", "(2-1-0)",
"(2-10-0)", "(2-11-0)", "(2-11-1)", "(2-12-0)", "(2-13-0)",
"(2-14-0)", "(2-2-0)", "(2-3-0)", "(2-4-0)", "(2-5-0)", "(2-6-0)",
"(2-7-0)", "(2-8-0)", "(2-9-0)", "(3-0-0)", "(3-1-0)", "(3-10-0)",
"(3-11-0)", "(3-11-1)", "(3-12-0)", "(3-13-0)", "(3-2-0)",
"(3-3-0)", "(3-4-0)", "(3-5-0)", "(3-6-0)", "(3-7-0)", "(3-8-0)",
"(3-9-0)", "(4-0-0)", "(4-1-0)", "(4-10-0)", "(4-11-0)",
"(4-11-1)", "(4-12-0)", "(4-2-0)", "(4-3-0)", "(4-4-0)",
"(4-5-0)", "(4-6-0)", "(4-6-1)", "(4-7-0)", "(4-7-1)", "(4-8-0)",
"(4-8-1)", "(4-9-0)", "(5-0-0)", "(5-1-0)", "(5-10-0)", "(5-11-0)",
"(5-2-0)", "(5-3-0)", "(5-3-1)", "(5-4-0)", "(5-4-1)", "(5-5-0)",
"(5-5-1)", "(5-6-0)", "(5-6-1)", "(5-7-0)", "(5-8-0)", "(5-8-1)",
"(5-9-0)", "(6-0-0)", "(6-1-0)", "(6-10-0)", "(6-2-0)", "(6-3-0)",
"(6-3-1)", "(6-4-0)", "(6-4-1)", "(6-5-0)", "(6-5-1)", "(6-6-0)",
"(6-6-1)", "(6-7-0)", "(6-7-1)", "(6-8-0)", "(6-8-1)", "(6-9-0)",
"(6-9-1)", "(7-0-0)", "(7-1-0)", "(7-2-0)", "(7-3-0)", "(7-3-1)",
"(7-4-0)", "(7-4-1)", "(7-5-0)", "(7-5-1)", "(7-6-0)", "(7-6-1)",
"(7-7-0)", "(7-7-1)", "(7-8-0)", "(7-9-0)", "(8-0-0)", "(8-1-0)",
"(8-10-0)", "(8-2-0)", "(8-3-0)", "(8-3-1)", "(8-4-0)", "(8-4-1)",
"(8-5-0)", "(8-5-1)", "(8-6-0)", "(8-6-1)", "(8-7-0)", "(8-7-1)",
"(8-8-0)", "(8-9-0)", "(9-0-0)", "(9-1-0)", "(9-2-0)", "(9-3-0)",
"(9-4-0)", "(9-5-0)", "(9-5-1)", "(9-6-0)", "(9-6-1)", "(9-7-0)",
"(9-8-0)", "(9-9-0)"), class = "factor"), HomeFinal = c(14L,
14L, 45L, 35L, 10L, 7L), AwayFinal = c(26L, 24L, 0L, 31L,
6L, 16L), HomeLast = c(4, 4, 5, 5, 0, 7), AwayLast = c(6,
4, 0, 1, 6, 6), Winner = c("Away", "Away", "Home", "Home",
"Home", "Away")), .Names = c("X1", "HomeTeam", "AwayTeam",
"Date", "Season", "HomeRecord", "AwayRecord", "HomeFinal", "AwayFinal",
"HomeLast", "AwayLast", "Winner"), row.names = c(NA, 6L), class = "data.frame")
Then you would do.
ggfluctuation(table(gamesWide$HomeLast, gamesWide$AwayLast), type="colour") + labs(x="Away", y="Home") + opts(title="Distribution of Last Digit of Score")
To get
Of course, that image was generated using the full dataset. This should be further extensible to data that isn't so symmetric and rectangular.
