cluster::clusplot axis in wrong direction - r

I'm trying to plot the cluster obtained from fuzzy c-means clustering.
The plot should look like this.
code for the plot
plot(data$Longitude, data$Latitude, main="Fuzzy C-Means",col=data$Revised, pch=16, cex=.6,
xlab="Longitude",ylab="Latitude")
library(maps)
map("state", add=TRUE)
However, when I use clusplot, the plot is flipped in both directions (top to bottom and left to right), as shown below.
I'd like to know if there's a way to reverse the plot so that it has the same orientation as the picture above.
Also, in the very dense areas it's hard to find the ellipse labels. Is there a way to show each label inside its ellipse instead of outside?
code for 2nd pic
library(cluster)
clusplot(cbind(Geocode$Longitude, Geocode$Latitude), cluster, color=TRUE,shade=TRUE,
labels=4, lines=0,col.p=cluster,
xlab="Longitude",ylab="Latitude",cex=1)

clusplot is a function that performs a lot of magic for you. In particular it projects the data set - which happens in a way you don't like, unfortunately. (Also note the scales - it centered and scaled the data, too)
clusplot.default: Creates a bivariate plot visualizing a partition (clustering) of the data. All observation are represented by points in the plot, using principal components or multidimensional scaling.
As far as I can tell, clusplot doesn't have map support, but you will want such a map I guess...
While maybe you can use the s.x.2d parameter to specify the exact projection (and this way disable automatic scaling), it probably is still difficult to add the map. Maybe look at the source of clusplot instead, and take only the parts you want?
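One way to sidestep the projection entirely is to plot the raw coordinates yourself and place one label per cluster at its centroid. The following is a minimal sketch in base R, not a replacement for clusplot: the data frame geo and its values are made up for illustration, and no ellipses are drawn.

```r
# Synthetic stand-in for the Geocode data (hypothetical values, not the asker's data)
set.seed(1)
geo <- data.frame(
  Longitude = c(rnorm(30, -120, 1), rnorm(30, -95, 1), rnorm(30, -75, 1)),
  Latitude  = c(rnorm(30, 37, 1),   rnorm(30, 40, 1),  rnorm(30, 41, 1))
)
cluster <- rep(1:3, each = 30)

# Plot the untransformed coordinates so the axes keep their geographic orientation
plot(geo$Longitude, geo$Latitude, col = cluster, pch = 16, cex = 0.6,
     main = "Clusters on raw coordinates", xlab = "Longitude", ylab = "Latitude")

# One label per cluster, placed at the cluster centroid (inside the group of points)
centroids <- aggregate(geo, by = list(cluster = cluster), FUN = mean)
text(centroids$Longitude, centroids$Latitude, labels = centroids$cluster,
     font = 2, cex = 1.2)
```

Because nothing is centered, scaled or rotated here, a call such as map("state", add=TRUE) from the question should line up with the points when real coordinates are used.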

Related

Overlapping data contour on a map

I have gone through few tutorials and answers here in stackoverflow such as:
Overlap image plot on a Google Map background in R or
Plotting contours on an irregular grid or Geographical heat map of a custom property in R with ggmap or How to overlay global map on filled contour in R language or https://blog.dominodatalab.com/geographic-visualization-with-rs-ggmaps/
They either don't serve my purpose or consider the density of the data to create the image.
I am looking for a way to plot contours of my data on a map, and would expect the image to look something like this:
or something like this taken from https://dsparks.wordpress.com/2012/07/18/mapping-public-opinion-a-tutorial/:
I have data here that gives a contour plot like this in plot_ly, but I want this over the map given by latitudes and longitudes.
Please guide me on how this can be done. Any links to potential answers or codes would be helpful.
OK, I did some digging and figured out that to plot this data, which here consists of point values scattered irregularly across latitude and longitude, one has to make it continuous rather than discretely distributed. To do this I interpolated the data onto a grid to fill in the gaps, using the method given in Plotting contours on an irregular grid, and took it from there. The interpolation there is done using a linear regression; one can use other methods such as IDW, kriging, nearest neighbour etc., for which R packages are easily available. These methods are widely used in climatology and topographic analysis. To read more about interpolation methods see this paper.
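To make the interpolate-then-contour idea concrete, here is a minimal sketch of inverse distance weighting (IDW) written in base R. It is not the method from the linked answer: the helper idw and the data frame obs are hypothetical, and distances are taken as plain Euclidean distances in degrees, which is a simplification.

```r
# Hypothetical scattered observations: lon, lat and a measured value z
set.seed(42)
obs <- data.frame(lon = runif(50, -100, -90),
                  lat = runif(50, 30, 40))
obs$z <- sin(obs$lon / 3) + cos(obs$lat / 3)

# Inverse distance weighting: each grid node is a weighted mean of all
# observations, with weights 1 / distance^power
idw <- function(x, y, z, gx, gy, power = 2) {
  grid <- expand.grid(x = gx, y = gy)  # x varies fastest
  est <- vapply(seq_len(nrow(grid)), function(i) {
    d <- sqrt((x - grid$x[i])^2 + (y - grid$y[i])^2)
    if (any(d == 0)) return(z[which.min(d)])  # node coincides with an observation
    w <- 1 / d^power
    sum(w * z) / sum(w)
  }, numeric(1))
  # Rows index gx, columns index gy, as contour() expects
  matrix(est, nrow = length(gx), ncol = length(gy))
}

gx <- seq(-100, -90, length.out = 40)
gy <- seq(30, 40, length.out = 40)
zhat <- idw(obs$lon, obs$lat, obs$z, gx, gy)

# Draw the observations, then overlay contours of the interpolated surface
plot(obs$lon, obs$lat, pch = 16, cex = 0.5, xlab = "Longitude", ylab = "Latitude")
contour(gx, gy, zhat, add = TRUE)
```

The same gridded matrix could be passed to filled.contour or image instead, and a map layer can be drawn on the same axes since the grid is in longitude and latitude.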

Shaded graph/network plot?

I am trying to plot quite large and dense networks (dput here). All I end up with is a bunch of overlapping dots, which does not really give me a sense of the structure or density of the network:
library(sna)
plot(data, mode = "fruchtermanreingold")
However, I have seen plots which use fading to visualize the degree to which points overlap, e.g.:
How can I implement this "fading" in a plot of a graph?
Here's one way:
library(sna)
library(network)
source("modifieddatafromgist.R")
plot.network(data,
vertex.col="#FF000020",
vertex.border="#FF000020",
edge.col="#FFFFFF")
First, I added a data <- to the gist so it could be sourced.
Second, you need to ensure the proper library calls so the object classes are assigned correctly and the proper plot function will be used.
Third, you should use the extra parameters for the fruchtermanreingold layout (which is the default one for plot.network) to expand the area and increase the # of iterations.
Fourth, you should do a set.seed before the plot so folks can reproduce the output example.
Fifth, I deliberately removed cruft so you can see the point overlap, but you can change the alpha for both edges & vertices (and you should change the edge width, too) to get the result you want.
There's a ton of help in ?plot.network to assist you in configuring these options.

R non gridded filled contour

I have a set of data that I'm trying to create a surface plot of. I have x,y points and a value to colour by.
I can create a xy plot with the points coloured but I can't find a way to create a surface plot with my data. The data isn't on a normal grid and I would prefer to not normalize it if possible (or I could just use a very fine grid).
The data won't be outside the radius=1 circle, so that part would need to be blank.
The code and the plot is shown below.
I've tried using contour, filled.contour as well as surface3d (not what I wanted). I'm not real familiar with many packages in R so I'm not even sure where to begin looking for this info.
Any help in creating this plot would be appreciated.
thanks,
Gordon
dip<-data.frame(dip=seq(0,90,10))
ddr<-data.frame(ddr=seq(0,350,10))
a<-merge(dip,ddr)
a$colour<-hsv(h=runif(nrow(a)))
degrees.to.radians<-function(degrees){
radians=degrees*pi/180
radians
}
a$equal_angle_x<-sin(degrees.to.radians(a$ddr))*tan(degrees.to.radians((90-a$dip)/2))
a$equal_angle_y<-cos(degrees.to.radians(a$ddr))*tan(degrees.to.radians((90-a$dip)/2))
plot(a$equal_angle_x,a$equal_angle_y,col=a$colour,lwd=10)
The plot I was trying to create is below. I believe the link in the first comment should get me where I'm trying to go.

R - Scatter plots, how to plot points in different lines to avoid overlapping?

I want to plot several lists of points, each list has distance (decimal) and error_no (1-8). So far I am using the following:
plot(b1$dist1, b1$e1, col="blue",type="p", pch=20, cex=.5)
points(b1$dist2, b1$e2, col="blue", pch=22)
to add them both to the same plot. (I will add legends, etc later on).
The problem I have is that points overlap, and even when changing the character used for plotting, it covers up previous points. Since I am planning on plotting a lot more than just 2 this will be a big problem.
I found some ways in:
http://www.rensenieuwenhuis.nl/r-sessions-13-overlapping-data-points/
But I would rather do something that would space the points along the y axis, one way would be to add .1, then .2, and so on, but I was wondering if there was any package to do that for me.
Cheers
M
ps: if I missed something, please let me know.
As noted in the very first point in the link you posted, jitter will slightly move all your points. If you just want to move the points on the y-axis:
plot(b1$dist1, b1$e1, col="blue",type="p", pch=20, cex=.5)
points(b1$dist2, jitter(b1$e2), col="blue", pch=22)
It depends a lot on what information you wish to convey to the reader of your chart. A common solution is to use the transparency component of R's color specification. Instead of calling a color "blue", for example, set the color to #0000FF44 (apologies if I just set it to red or green). The final two hex digits define the alpha (opacity), from 00 (fully transparent) to FF (fully opaque), so overlapping data points will appear darker than standalone points.
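The semi-transparent colour idea can be sketched as follows. The vectors dist and error are made-up stand-ins for the asker's columns; the colour is built once as a literal hex string and once with rgb() to show that the last two hex digits are the alpha channel.

```r
# Hypothetical data with many tied points (not the asker's data)
set.seed(7)
dist  <- round(runif(300, 0, 10), 1)
error <- sample(1:8, 300, replace = TRUE)

# Blue with alpha 0x44 (68 out of 255), written two equivalent ways
col_hex <- "#0000FF44"
col_rgb <- rgb(0, 0, 255, alpha = 0x44, maxColorValue = 255)

# Solid symbols (pch = 16) make the darkening of overlapping points visible
plot(dist, error, col = col_hex, pch = 16, cex = 1.5,
     xlab = "distance", ylab = "error_no")
```

Transparency only renders on graphics devices that support it (for example pdf or png with cairo), so the effect may not show on every device.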
Look at the spread.labs function in the TeachingDemos package, particularly the example. It may be that you can use that function to create your plot (the examples deal with labels, but could just as easily be applied to the points themselves). The key is that you will need to find the new locations based on the combined data, then plot. If the function as is does not do what you want, you could still look at the code and use the ideas to spread out your points.
Another approach would be to restructure your data and use the ggplot2 package with "dodging". Other approaches rather than using points several times would be the matplot function, using the col argument to plot with a vector, or lattice or ggplot2 plots. You will probably need to restructure the data for any of these.

How to avoid overplotting (for points) using base-graph?

I am finishing the graphs for a paper and decided (after a discussion on stats.stackoverflow), in order to convey as much information as possible, to create the following graph, which presents the means in the foreground and the raw data in the background:
However, one problem remains, and that is overplotting. For example, the marked point looks like it reflects one data point, but in fact 5 data points exist with the same value at that place.
Therefore, I would like to know if there is a way to deal with overplotting in base graph using points as the function.
It would be ideal if e.g., the respective points get darker, or thicker or,...
Doing it manually is not an option (too many graphs and points like this). Furthermore, I don't want to learn ggplot2 just to deal with this single problem (one reason is that I tend to like dual axes, which are not supported in ggplot2).
Update: I wrote a function which automatically creates the above graphs and avoids overplotting by adding vertical or horizontal jitter (or both): check it out!
This function is now available as raw.means.plot and raw.means.plot2 in the plotrix package (on CRAN).
The standard approach is to add some noise to the data before plotting. R has a function jitter() which does exactly that. You could use it to add the necessary noise to the coordinates in your plot, e.g.:
X <- rep(1:10,10)
Z <- as.factor(sample(letters[1:10],100,replace=TRUE))
plot(jitter(as.numeric(Z),factor=0.2),X,xaxt="n")
axis(1,at=1:10,labels=levels(Z))
Besides jittering, another good approach is alpha blending, which you can obtain (on the graphics devices supporting it) as the fourth color parameter. I provided an example for 'overplotting' of two histograms in this SO question.
One additional idea for the general problem of showing the number of points is using a rug plot (rug function), this places small tick marks along the margin that can show how many points contribute (still use jittering or alpha blending for ties). This allows the actual points to show their true rather than jittered values, but the rug can then indicate which parts of the plot have more values.
For the example plot direct jittering or alpha blending is probably best, but in some other cases the rug plot can be useful.
You may also use sunflowerplot, while it would be hard to implement it here. I would use alpha-blending, as Dirk suggested.
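For the "thicker points" idea from the question, one further option (not proposed in the answers above, so treat it as an assumption about what would suit the figure) is to count how many observations share each position and scale the symbol by that count. A minimal base-graphics sketch with made-up data:

```r
# Hypothetical discrete data with many exact ties
set.seed(3)
x <- sample(1:5, 60, replace = TRUE)
y <- sample(1:5, 60, replace = TRUE)

# Count how many observations share each (x, y) position
counts <- aggregate(n ~ x + y, data = data.frame(x, y, n = 1), FUN = sum)

# Draw each distinct position once; sqrt() makes symbol area grow with the count
plot(counts$x, counts$y, pch = 16, cex = sqrt(counts$n),
     xlab = "x", ylab = "y")
```

The built-in sunflowerplot(x, y) does similar counting and draws one petal per tied observation instead of changing the symbol size.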

Resources