I am trying to render some geographic data onto the map in Tableau. However, some data points are located at the same point, so their shapes overlap. Clicking on a shape only selects the top one.
How can we distinguish the overlapping data points in Tableau? I know that we can manually exclude the top mark to see the next one, but is there any other way, for example a drop-down list in the right-click menu to select among the overlapping data points?
Thank you!
There are a couple of ways to deal with this issue.
Some choices you can try are:
Add some transparency to the marks by editing the color shelf properties. That way you at least get a visual indication when multiple marks are stacked on top of each other. This approach can be considered a poor man's heat map if you have many points in different areas, as the denser sections will look darker. (But that only affects the appearance; it doesn't help you select and view details for marks that are covered by others.)
Add some small pseudo-random jitter to each coordinate using calculated fields. This will be easier when Tableau supports a rand() function, but in the meantime you can get creative with other fields and the math functions to add a little jitter. The goal is to shift locations just enough that they don't stack exactly, but not enough to matter for precision. How much that is depends on the scale of the map.
Make a grid-style heat map where the color indicates the number of data points in each grid cell. To do this, you'll need to create calculated fields that bin together nearby latitudes and longitudes, for example by rounding each latitude and longitude to a certain number of decimal places, or by using the hex bin functions in Tableau. Those calculated fields will need to have a geographic role and be treated as continuous dimensions.
Define your visualization to display one mark for each unique location, and then use color or size to indicate the number of data points at that location, as opposed to one mark for each individual data point.
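To make the jitter and binning ideas concrete, here is a rough sketch of the same logic outside Tableau, in Python. It is only an illustration of the arithmetic you would reproduce in calculated fields; the function names, the `key` argument (any per-record ID) and the default scales are assumptions, not Tableau features.

```python
import hashlib
from collections import Counter

def jitter(lat, lon, key, scale=1e-4):
    """Shift a coordinate by a small, repeatable offset derived from `key`.

    `key` is any per-record identifier (hypothetical); hashing it gives a
    pseudo-random offset without needing a random number function.
    """
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    # Turn the first 8 bytes of the hash into two offsets in [-scale, +scale].
    d_lat = (int.from_bytes(digest[0:4], "big") / 0xFFFFFFFF * 2 - 1) * scale
    d_lon = (int.from_bytes(digest[4:8], "big") / 0xFFFFFFFF * 2 - 1) * scale
    return lat + d_lat, lon + d_lon

def count_per_location(points, decimals=3):
    """Count records per (lat, lon) cell after rounding to `decimals` places."""
    return Counter((round(lat, decimals), round(lon, decimals)) for lat, lon in points)
```

Rounding to 3 decimal places gives cells of roughly 100 m in latitude; pick the number of decimals (or the jitter scale) to suit the zoom level you care about.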
For example, this is a heatmap from a website using GPS data:
I have gotten some degree of success with adding a weight parameter to each vertex and calculating the number of events that have vertices near those, but that takes a long time, especially with a large amount of data. It also appears a bit spotty when the distance between vertices is a bit wonky, which causes random splotches of different colors throughout the heatmap. It looks kind of cool, but it makes the data a bit harder to read.
When you zoom out, it looks a bit more continuous due to the paths overlapping more.
In R, the closest I can get to this involves using an alpha channel, but that only gives me a monochromatic heatmap, which is not always desirable, especially when you want lesser-traveled paths to remain visible. In theory I could draw each path twice to resolve the visibility part (first opaque, then semi-transparent), but I would like to be able to have different hue values.
Ideally I would like this to work with ggplot, but if it cannot, I would accept other methods, provided they are reasonably quick computationally.
Edit: The data format is a data frame with sequential (latitude, longitude) coordinate pairs, along with some associated data that can be used for filter & grouping (such as activity type and event ID).
Here is a sample of the data for the region displayed in the above images (~1.5 MB):
https://www.dropbox.com/s/13p2jtz4760m26d/sample_coordinate_data.csv?dl=0
I would try something like
ggplot(data, aes(longitude, latitude)) + geom_count(aes(alpha = ..prop..))
but you need to show some data to check how it works.
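Independently of the plotting library, the counting step behind a multi-hue heatmap can be sketched as follows. This is a minimal illustration in Python, assuming rows of (event ID, latitude, longitude) as described in the question; the cell size, the log scaling and the blue-to-red hue ramp are arbitrary choices of mine, not something taken from the original site.

```python
import colorsys
import math
from collections import defaultdict

def cell_event_counts(rows, cell_deg=0.001):
    """rows: iterable of (event_id, latitude, longitude).

    Returns {(lat_index, lon_index): number of distinct events in that cell},
    so one slow event with many samples in a cell still counts once.
    """
    events_in_cell = defaultdict(set)
    for event_id, lat, lon in rows:
        cell = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        events_in_cell[cell].add(event_id)
    return {cell: len(ids) for cell, ids in events_in_cell.items()}

def count_to_rgb(count, max_count):
    """Map a count to a colour: blue for rarely used cells, red for the busiest."""
    frac = math.log1p(count) / math.log1p(max_count)  # log scale keeps rare paths visible
    hue = (1.0 - frac) * 2.0 / 3.0                    # 2/3 is blue, 0 is red in HSV
    return colorsys.hsv_to_rgb(hue, 1.0, 1.0)
```

The resulting cell colours can then be drawn as tiles. Cells between widely spaced GPS samples stay empty with this approach, so paths may still look spotty unless the points are interpolated first.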
I've been struggling with this problem for a day now and can't seem to find a nice solution to it. I would really appreciate some help, as I'm a novice in R (I only started last week).
Problem 1:
I have a CSV file representing grid points, which I can parse into a data frame (pointname, latitude, longitude).
Eg:
name,latitude,longitude
x0y0,35.9767,-122.605
x1y0,35.9767,-122.594
x2y0,35.9767,-122.583
x0y1,35.9857,-122.605
x1y1,35.9857,-122.594
x2y1,35.9857,-122.583
x0y2,35.9947,-122.605
x1y2,35.9947,-122.594
x2y2,35.9947,-122.583
The points in this file represent the lower left corner and are arranged in row major format, meaning lowest horizontal grid points first. Each point is a certain great circle distance away from its neighbors (1km). I want to create a grid overlay on a map which I've plotted using ggmap.
What I've tried or considered:
map.grid() - this is really not useful to me as I'm not looking for any kind of projection.
geom_vline() and geom_hline(). These look good but I don't have constant x and y intercepts on a plane. Moreover, once I create a grid, I'd like to use the grid to color against a density.
geom_rect() and geom_tile(). These look really promising and may be what I want. But I'm not able to find a good way of working with these.
I'd like to fill these grid boxes later with another parameter. Any suggestions on how I can create such a grid? This may be a trivial question but I don't know a lot of R yet.
Problem 2:
How can I store or hold such a grid so that, given a point (lat,lon), I can quickly get to its grid cell? In fact my whole back end is in C++ and can directly output the grid name x<n>y<n> for a given search point. I am somehow finding it difficult to count such points against grid cells so that I can fill each cell with a representative color.
I'm not sure if everything of what I'm saying is clear. Please tell me if I've to clarify something.
Also note that I've Googled quite a lot and not found relevant answers although some looked close.
Eg: This, ThisToo
Thanks for the help!
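For reference, the data preparation behind both problems can be sketched like this (in Python rather than R, purely to show the logic). It assumes the names always follow the x<n>y<n> pattern and that each cell extends from its lower-left corner to the corners of its right and upper neighbors; the function names and the xmin/xmax/ymin/ymax keys are my own, chosen to match what a rectangle layer such as geom_rect() expects.

```python
import csv
import io
import re
from collections import Counter

SAMPLE = """name,latitude,longitude
x0y0,35.9767,-122.605
x1y0,35.9767,-122.594
x2y0,35.9767,-122.583
x0y1,35.9857,-122.605
x1y1,35.9857,-122.594
x2y1,35.9857,-122.583
x0y2,35.9947,-122.605
x1y2,35.9947,-122.594
x2y2,35.9947,-122.583
"""

NAME_RE = re.compile(r"^x(\d+)y(\d+)$")

def load_corners(csv_text):
    """Return {(ix, iy): (lat, lon)} for lower-left corners named x<ix>y<iy>."""
    corners = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        match = NAME_RE.match(row["name"])
        if match:
            key = (int(match.group(1)), int(match.group(2)))
            corners[key] = (float(row["latitude"]), float(row["longitude"]))
    return corners

def build_cells(corners):
    """Build one rectangle per corner that has both a right and an upper neighbor."""
    cells = {}
    for (ix, iy), (lat, lon) in corners.items():
        right = corners.get((ix + 1, iy))
        up = corners.get((ix, iy + 1))
        if right is None or up is None:
            continue  # last row/column: the cell's extent is unknown
        cells[f"x{ix}y{iy}"] = {"xmin": lon, "xmax": right[1], "ymin": lat, "ymax": up[0]}
    return cells

def count_hits(cell_names):
    """Problem 2: tally the grid names the back end reports for each search point."""
    return Counter(cell_names)

cells = build_cells(load_corners(SAMPLE))
```

Joining the hit counts to the rectangles by name then gives one row per cell with its bounds and a value to use as the fill.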
I'm getting familiar with Graphviz and wonder if it's possible to generate a diagram/graph like the one below (not sure what you call it). If not, does anyone know a good open source framework that does it? (Preferably C++, Java or Python.)
According to Many Eyes, this is a bubble chart. They say:
It is especially useful for data sets with dozens to hundreds of values, or with values that differ by several orders of magnitude.
...
To see the exact value of a circle on the chart, move your mouse over it. If you are charting more than one dimension, use the menu to choose which dimension to show. If your data set has multiple numeric columns, you can choose which column to base the circle sizes on by using the menu at the bottom of the chart.
Thus, any presentation with a lot of bubbles in it (especially with many small bubbles) would have to be dynamic to respond to the mouse.
My usual practice with bubble charts is to show three or four variables (x, y and another variable through the size of the bubble, and perhaps another variable with the color or shading of the bubble). With animation, you can show development over time too - see GapMinder. FlowingData provides a good example with a tutorial on how to make static bubble charts in R.
In the example shown in the question, though, the bubbles appear to be positioned so that similar companies are close together. Even then, the exact design criteria are unclear to me. For example, I'd have expected Volkswagen to be closer to General Motors than Pfizer is (if some measure of company similarity is used to place the bubbles), but that isn't so in this diagram.
You could use Graphviz to produce a static version of a bubble chart, but there would be quite a lot of work involved to do so. You would have to preprocess the data to calculate a similarity matrix, obtain edge weights from that matrix, assign colours and sizes to each bubble and then have the preprocessing script write the Graphviz file with all edges hidden and run the Graphviz file through neato to draw it.
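A minimal sketch of such a preprocessing script, in Python, might look like the following. The input dictionaries, the default colour and the rule "edge length = 1 / similarity" are assumptions for illustration; `shape`, `fixedsize`, `width`, `fillcolor`, `style=invis` and `len` are standard Graphviz attributes, and `len` is honoured by neato.

```python
from itertools import combinations

def bubble_dot(sizes, similarity, colors=None):
    """Write a Graphviz graph with one circle per item and hidden weighted edges.

    sizes:      {name: circle width in inches}
    similarity: {(name_a, name_b): value > 0}, higher means more similar
    colors:     optional {name: fill colour}
    Names are assumed not to contain double quotes.
    """
    colors = colors or {}
    lines = ["graph bubbles {", "  node [shape=circle, style=filled, fixedsize=true];"]
    for name, width in sizes.items():
        fill = colors.get(name, "lightblue")
        lines.append(f'  "{name}" [width={width:.2f}, fillcolor="{fill}"];')
    for a, b in combinations(sorted(sizes), 2):
        sim = similarity.get((a, b), similarity.get((b, a)))
        if sim is None or sim <= 0:
            continue  # no edge: neato places this pair freely
        # Similar items get a short preferred edge length, so they end up close together.
        lines.append(f'  "{a}" -- "{b}" [style=invis, len={1.0 / sim:.2f}];')
    lines.append("}")
    return "\n".join(lines)
```

Save the returned text to a file such as bubbles.dot and render it with `neato -Tpng bubbles.dot -o bubbles.png`. If the bubble area should reflect the value, pass widths proportional to the square root of the value.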
I have a scanned map from which I would like to extract the data in the form of longitude, latitude and the corresponding value. Can anyone please tell me how I can extract the data from the map? Are there any packages in R that would enable me to extract data from a scanned map? Unfortunately, I cannot find the person who made this map.
Thank you very much for your time and help.
Take a look at OCR. I doubt you'll find anything for R, since R is primarily a statistical programming language.
You're better off with something like OpenCV.
Once you find an appropriate OCR package, you will need to identify the x and y positions of your characters, which you can then use to classify them as belonging to the x or y axis of your map.
This is not trivial, but good luck.
Try this:
Read in the image file using the raster package
Use the locator() function to click on all the lat-long intersection points.
Use the locator data plus the lat-long data to create a table of lat-long to raster x-y coordinates
Fit a radial (x,y)->(r,theta) transformation to the data. You'll be assuming the projected latitude lines are circular which they seem to be very close to but not exact from some overlaying I tried earlier.
To sample from your image at a lat-long point, invert the transformation.
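The fitting and inversion steps could be sketched roughly as below (in Python, just to show the arithmetic). This is a simplified version under strong assumptions: the centre of the latitude circles is taken as already known, radius is assumed linear in latitude and angle linear in longitude, and the map is assumed not to cross the point where the angle wraps around. In practice the centre would have to be estimated from the clicked points as well.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def fit_polar(control, center):
    """control: list of (lat, lon, px, py) from the clicked intersections.

    center: assumed-known pixel position of the centre of the latitude circles.
    """
    cx, cy = center
    lats = [lat for lat, _, _, _ in control]
    lons = [lon for _, lon, _, _ in control]
    radii = [math.hypot(px - cx, py - cy) for _, _, px, py in control]
    angles = [math.atan2(py - cy, px - cx) for _, _, px, py in control]
    return {"center": center, "r": fit_line(lats, radii), "theta": fit_line(lons, angles)}

def latlon_to_pixel(model, lat, lon):
    """Where to sample the image for a given lat-long point."""
    cx, cy = model["center"]
    r0, r_slope = model["r"]
    t0, t_slope = model["theta"]
    r = r0 + r_slope * lat
    theta = t0 + t_slope * lon
    return cx + r * math.cos(theta), cy + r * math.sin(theta)

def pixel_to_latlon(model, px, py):
    """Inverse direction: which lat-long a given pixel corresponds to."""
    cx, cy = model["center"]
    r0, r_slope = model["r"]
    t0, t_slope = model["theta"]
    r = math.hypot(px - cx, py - cy)
    theta = math.atan2(py - cy, px - cx)
    return (r - r0) / r_slope, (theta - t0) / t_slope
```

Checking the residuals at the control points gives a quick indication of how far the real projection is from this circular approximation.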
The next hard problem is trying to get from an image sample to the value of the thing being mapped. Maybe take a 5x5 grid of pixels and average, leaving out any gray pixels. It's even harder than that because some of the colours look like they are made by combining pixels of two different colours to make a new shade. Is this the best image you have?
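The 5x5 averaging idea could look something like this sketch (Python, with the image given as a plain list of rows of (r, g, b) tuples). The definition of "gray" used here, all three channels within a small tolerance of each other, is my assumption, and it also discards white and black pixels.

```python
def is_gray(rgb, tolerance=12):
    """Treat a pixel as gray when its three channels are nearly equal."""
    return max(rgb) - min(rgb) <= tolerance

def sample_color(pixels, x, y, radius=2):
    """Average the non-gray pixels in a (2 * radius + 1) square window around (x, y).

    pixels: list of rows, each row a list of (r, g, b) tuples.
    Returns an (r, g, b) tuple of floats, or None if every pixel was gray.
    """
    height, width = len(pixels), len(pixels[0])
    kept = []
    for yy in range(max(0, y - radius), min(height, y + radius + 1)):
        for xx in range(max(0, x - radius), min(width, x + radius + 1)):
            if not is_gray(pixels[yy][xx]):
                kept.append(pixels[yy][xx])
    if not kept:
        return None
    return tuple(sum(p[i] for p in kept) / len(kept) for i in range(3))
```

The averaged colour would still need to be matched against the map's legend, for example by picking the nearest legend colour.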
I'm wondering what top-secret information has been blanked out from the top left corner. If it did say what the projection was that would help enormously.
Note you may be able to do a lot of the process online with mapwarper:
http://mapwarper.net
but I'm not sure if it can handle your map's projection.
I'm searching a data viewer/plotter for some data I've generated.
Facts
First some facts about the data I've generated:
There are several datasets with about 3 million data points each.
Each dataset currently is stored in ascii format.
Every line represents a point and consists of multiple columns.
The first two columns determine the position of the point (i.e. x and y value) whereas the first column is a timestamp and the second is a normalized float between 0 and 1.
The other columns contain additional data which may be used to colorize the plot or filter the data.
An example data point:
2012-08-08T01:02:03.040 0.0165719281 foobar SUCCESS XX:1
Current Approach
Currently I am generating multiple png files (with gnuplot) with different selection criteria like the following ones for each data set:
Display all points in grey.
Display all points in grey, but SUCCESS in red.
Display all points in grey, but SUCCESS in red, XX:-1 in green; if both SUCCESS and XX:-1 match use blue as coloring.
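To make the coloring rules precise, they can be written as a small function. This is only a sketch in Python of the selection logic, not of the gnuplot scripts; the field names (`label`, `status`, `xx`) are placeholders I made up for the extra columns, and it assumes exactly five whitespace-separated columns as in the example line.

```python
from datetime import datetime

def parse_point(line):
    """Split one data line into its five whitespace-separated columns."""
    timestamp, value, label, status, xx = line.split()
    return {
        "time": datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%S.%f"),
        "value": float(value),  # normalized float between 0 and 1
        "label": label,
        "status": status,
        "xx": xx,
    }

def point_color(point):
    """Third selection rule: grey by default, red/green for one match, blue for both."""
    is_success = point["status"] == "SUCCESS"
    is_xx = point["xx"] == "XX:-1"
    if is_success and is_xx:
        return "blue"
    if is_success:
        return "red"
    if is_xx:
        return "green"
    return "grey"
```

Keeping the rule as a function of the raw record is what would let a viewer switch colorings on and off without regenerating images.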
Drawbacks
With the current approach there are some drawbacks I'd like to have addressed:
I can't easily switch on/off some filters or colorings because I have to generate a new png file every time.
I need to use a limited resolution in my image file because the higher the resolution, the slower the viewer becomes. So I can only zoom in to a limited level of detail.
I don't have the raw data available in the png viewer for each point. Ideally I'd like to have the data visible on selection of a point.
Already tested
I've already tested some other approaches:
Gnuplot itself has a viewer but it can't handle that amount of points efficiently - it is too slow and consumes too much memory.
I've had a quick look at KST, but I couldn't find a way to display 2D data and I don't think it will meet my wishes.
Wishes
I'd like to have a viewer which can operate on the raw data, can display the points quickly when zoomed out, can also zoom in quickly, and resolves the aforementioned drawbacks.
Question
So finally, does anybody know of such a viewer or has another suggestion?
If there isn't a viewer some recommendations for programming it myself are welcome, too.
Thanks in advance
Stefan