How to plot a histogram in R and have exact axis values

Basically I want to plot a histogram of data in R, but I don't want the x axis scale to be (0, 20, 40, 60, 80). I want it to show the exact values as they appear in my data. For example, if my data contained the values (0, 0, 2, 5, 7, 12, 14), I want my x axis to show these numbers and not a general scale. The y axis would keep showing the frequencies as it already does. How do I do this?

If you are just interested in counts of possible values, and not a true histogram, then barplot would be appropriate:
x <- c(0, 0, 2, 5, 7, 12, 14)
uval <- sort(unique(x))
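# compare every observation against each unique value and sum the matches per value
counts <- rowSums(sapply(x, function(v) v == uval))
# names.arg labels each bar with the exact data value
barplot(counts, names.arg = uval)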
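The same counts can also be obtained with base R's table(), which tallies each distinct value and keeps the values as names. This is only a sketch of an alternative to the rowSums/sapply approach above; the axis titles are my own choice, not from the original answer.

```r
# Sample data from the question
x <- c(0, 0, 2, 5, 7, 12, 14)

# table() counts each distinct value; the names of the result are the values
counts <- table(x)

# One bar per observed value, labelled with the exact value
barplot(counts, xlab = "Value", ylab = "Frequency")
```

Note that the bars are evenly spaced, so the gap between 7 and 12 looks the same as the gap between 0 and 2. That is the trade-off of a bar plot compared with a true histogram.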

Related

How do I change hexbin plot scales?

I currently have this:
Instead of the scale jumping from 1 to 718, I would like it to go from 1 to 2, 3, 5, 10, 20, 40, 80, 160, 320, 640, 1280, 2560, 5120, 10240, 15935.
Here is the code I used to plot it:
hex <- hexbin(trial$pickup_longitude, trial$pickup_latitude, xbins=600)
plot(hex, colramp = colorRampPalette(LinOCS(12)))
Here's a ggplot method, where you can specify whatever breaks you want.
library(ggplot2)
library(RColorBrewer)
##
# made up sample
#
set.seed(42)
X <- rgamma(10000, shape=1000, scale=1)
Y <- rgamma(10000, shape=10, scale=100)
dt <- data.frame(X, Y)  # a plain data frame is enough here; data.table is never loaded above
##
# define breaks and labels for the legend
#
brks <- c(0, 1, 2, 5, 10, 20, 50, 100, Inf)
n.br <- length(brks)
labs <- c(paste('<', brks[2:(n.br-1)]), paste('>', brks[n.br-1]))
##
#
ggplot(dt, aes(X, Y))+geom_hex(aes(fill=cut(..count.., breaks=brks)), color='grey80')+
scale_fill_manual(name='Count', values = rev(brewer.pal(8, 'Spectral')), labels=labs)
You cannot control the boundaries of the scale as closely as you want, but you can adjust it somewhat. First we need a reproducible example:
library(hexbin)
set.seed(42)
X <- rnorm(10000, 10, 3)
Y <- rnorm(10000, 10, 3)
XY.hex <- hexbin(X, Y)
To change the scale we need to specify a function to use on the counts and an inverse function to reverse the transformation. Now, three different scalings:
plot(XY.hex) # Linear, default
plot(XY.hex, trans=sqrt, inv=function(x) x^2) # Square root
plot(XY.hex, trans=log, inv=function(x) exp(x)) # Log
The top plot is the original scaling. The bottom left is the square root transform and the bottom right is the log transform. There are probably too many levels to read these plots clearly. Adding the argument colorcut=6 to the plot command would reduce the number of levels to 5.
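As a rough sketch of the colorcut suggestion, the log-scaled plot could be redrawn with fewer levels as below. It assumes the hexbin package is installed; the requireNamespace() guard is only there so the snippet does nothing instead of failing when it is not.

```r
# colorcut = 6 asks for 6 equally spaced cut points, i.e. 5 colour levels
n_cuts <- 6

if (requireNamespace("hexbin", quietly = TRUE)) {
  library(hexbin)
  set.seed(42)
  X <- rnorm(10000, 10, 3)
  Y <- rnorm(10000, 10, 3)
  XY.hex <- hexbin(X, Y)
  # Log-transformed counts, with the legend reduced to 5 levels
  plot(XY.hex, trans = log, inv = exp, colorcut = n_cuts)
}
```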

Plotting the logistic function of x in R for a defined range of x

I am trying to tell R that x should range from -10 to 10, and to plot the logistic function of x for this range of values; more specifically, I am trying to plot y=logistic(x)= 1/(1+exp(-x)).
This is in the context of stats homework that is teaching GLM in R.
This is what I did to define x:
x <- c(-10, -9, -8, -7, -6, -5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
Then I tried this:
glm(formula= y ~ log(x))
and got
object 'y' not found
which confused me because y should just be defined based on the x-values I am specifying, right?
I'm feeling like this should be simple but just can't figure it out.
x <- -10:10
y <- 1/(1+exp(-x))
plot(x, y, type="l")
I guess this can be quite confusing in R. It goes like this: first you define the function, which you have essentially already written out:
logistic = function(x)(1/(1+exp(-x)))
Then you can do:
curve(logistic,from=-10,to=10)
Or using plot, which calls curve anyway:
plot(logistic,from=-10,to=10)
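For what it's worth, base R already ships the logistic function as plogis() in the stats package, so you don't have to define it yourself. Here is a small sketch with a finer grid than -10:10 so the line looks smooth; the step of 0.1 and the axis labels are arbitrary choices of mine.

```r
# Finer grid than -10:10 so the curve is drawn smoothly
x <- seq(-10, 10, by = 0.1)

# plogis() is the logistic distribution function: 1 / (1 + exp(-x))
y <- plogis(x)

plot(x, y, type = "l", xlab = "x", ylab = "logistic(x)")
```

This also shows why the glm() call failed: glm() fits a model to data you already have, so y must exist before it is called. Here y is simply computed from x, and no model fitting is needed.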

Changing legend labels in ggplotly()

I have a plot of polygons that are colored according to a quantitative variable in the dataset being cut off at certain discrete values (0, 5, 10, 15, 20, 25). I currently have a static ggplot() output that "works" the way I intend. Namely, the legend values are the cut off values (0, 5, 10, 15, 20, 25). The static plot is below -
However, when I simply convert this static plot to an interactive plot, the legend values become hexadecimal values (#54278F, #756BB1, etc.) instead of the cut off values (0, 5, 10, 15, 20, 25). A screenshot of this interactive plot is shown below -
I am trying to determine a way to change the legend labels in the interactive plot to be the cut off values (0, 5, 10, 15, 20, 25). Any suggestions or support would be greatly appreciated!
Below is the code I used to create the static and interactive plot:
library(plotly)
library(ggplot2)
library(RColorBrewer)
set.seed(1)
x = abs(rnorm(30))
y = abs(rnorm(30))
value = runif(30, 1, 30)
myData <- data.frame(x=x, y=y, value=value)
cutList = c(5, 10, 15, 20, 25)
purples <- brewer.pal(length(cutList)+1, "Purples")
myData$valueColor <- cut(myData$value, breaks=c(0, cutList, 30), labels=rev(purples))
# Static plot
sp <- ggplot(myData, aes(x=x, y=y, fill=valueColor)) + geom_polygon(stat="identity") + scale_fill_manual(labels = as.character(c(0, cutList)), values = levels(myData$valueColor), name = "Value")
# Interactive plot
ip <- ggplotly(sp)
Label using the cut points and use scale_fill_manual for the colors.
cutList = c(5, 10, 15, 20, 25)
purples <- brewer.pal(length(cutList)+1, "Purples")
myData$valueLab <- cut(myData$value, breaks=c(0, cutList, 30), labels=as.character(c(0, cutList)))
# Static plot
sp <- ggplot(myData, aes(x=x, y=y, fill=valueLab)) + geom_polygon(stat="identity") + scale_fill_manual(values = rev(purples))
# Interactive plot
ip <- ggplotly(sp)
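To see why this works, it may help to look at the factor that cut() produces before any plotting happens. The sketch below uses only base R and its own made-up values (so the numbers will not match myData from the question); it just checks that the levels are the cut points rather than hex colour codes.

```r
set.seed(1)
value <- runif(30, 1, 30)          # made-up values, independent of myData
cutList <- c(5, 10, 15, 20, 25)

# Label each interval by its lower cut point: (0,5] -> "0", (5,10] -> "5", ...
valueLab <- cut(value, breaks = c(0, cutList, 30),
                labels = as.character(c(0, cutList)))

levels(valueLab)   # the legend entries that ggplot and ggplotly will show
table(valueLab)    # how many observations fall into each interval
```

Since the fill aesthetic is now mapped to these labels, the colours are supplied separately through scale_fill_manual(values = ...), and the legend text carries over to the interactive plot.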

Graphing ribbon plot with two years of data superimposed on same plot in R (ggplot2)

I am working with some National Climatic Data Center data of minimum and maximum temperatures. What I want to do is to put two winters' worth of data on top of each other in one ribbon graph, whereby the x-axis is the month and day for each set and year is ultimately something like a factor. Below is one year of data. I would want the second year to be on this same graph.
Here is a set of example code very similar to the real data, though the geom_ribbon function is not yet returning a graph exactly like the one above. Any help and thoughts are appreciated:
dates <- c(20110101, 20110102, 20110103, 20110104, 20110105, 20110106, 20110107, 20120101, 20120102, 20120103, 20120104, 20120105, 20120106, 20120107)
Tmin <- c(-5, -4, -2, -10, -2, -1, 2, -10, -11, -12, -10, -5, -3, -1)
Tmax <- c(3, 5, 6, 4, 2, 7, 9, -1, 0, 1, 2, 1, 4, 8)
winter <- c("ONE","ONE","ONE","ONE","ONE","ONE","ONE","TWO","TWO","TWO","TWO","TWO","TWO","TWO")
library(ggplot2)
# build the data frame directly; cbind() would coerce every column to character
df <- data.frame(dates, Tmin, Tmax, winter)
# parse the yyyymmdd numbers as Date, which ggplot2 handles more reliably than POSIXlt
df$dates <- as.Date(as.character(df$dates), format="%Y%m%d")
str(df)
ggplot(df, aes(x = dates))+
geom_ribbon(aes(ymin=Tmin, ymax=Tmax), fill ="darkgrey", color="darkgrey")
How about two overlapping ribbons with different colors?
library(lubridate)
df$yday <- yday(df$dates)
df$year <- year(df$dates)
ggplot(df, aes(yday)) + geom_ribbon(aes(ymax=Tmax, ymin=Tmin, fill=factor(year)), alpha=.5)
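Here is a self-contained sketch of the same idea that avoids the cbind() coercion and computes day-of-year and year with base R's format() instead of lubridate. The column names yday and year follow the answer above; the requireNamespace() guard is my own addition so the data-preparation part still runs without ggplot2 installed.

```r
# Two winters of example data, built as real Date and numeric columns
dates <- c(seq(as.Date("2011-01-01"), by = "day", length.out = 7),
           seq(as.Date("2012-01-01"), by = "day", length.out = 7))
Tmin <- c(-5, -4, -2, -10, -2, -1, 2, -10, -11, -12, -10, -5, -3, -1)
Tmax <- c(3, 5, 6, 4, 2, 7, 9, -1, 0, 1, 2, 1, 4, 8)
df <- data.frame(dates, Tmin, Tmax)

# Day of year puts both winters on a shared x axis; year separates the ribbons
df$yday <- as.integer(format(df$dates, "%j"))
df$year <- format(df$dates, "%Y")

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(df, aes(x = yday, ymin = Tmin, ymax = Tmax, fill = year)) +
    geom_ribbon(alpha = 0.5)   # semi-transparent so the overlap stays visible
  print(p)
}
```

One caveat: day of year restarts at 1 on January 1, so a winter that spans December to February would need a different x variable, for example days since the start of that winter.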

How to rescale label on x axis in log2(n+1) format?

I want to format my x-axis in log2(n+1) format so the x-axis labels correspond to 1, 2, 4, 16 and so on.
Input:
x <- c(1, 2, 3, 11, 15)
y <- c(1.1, 1.2, .4, 2.1, 1.5)
plot(log2(x + 1), y, axes=FALSE)
axis(1, at=(labels=as.character(formatC(x))), cex.axis=0.9)
But the plot I get still has the original x-axis values.
How can I make my x-axis powers of 2 (1, 2, 4, 16, etc.)?
I guess this is what you want.
x<-c(1,2,3,11,15)
y<-c(1.1,1.2,.4,2.1,1.5)
lab<-c(1,2,4,16)
plot(log2(x+1),y,xaxt="n",xlab="x")
axis(1,at=log2(lab+1),labels=lab)
It might also be useful to calculate equally spaced labels:
lab<-round(2^seq(min(log2(x+1)),max(log2(x+1)),length.out=4)-1)
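Putting the computed labels back into the plot could look like the sketch below. The intermediate variable tx is just a name I introduced for the transformed x values; everything else follows the code above.

```r
x <- c(1, 2, 3, 11, 15)
y <- c(1.1, 1.2, 0.4, 2.1, 1.5)

# Transformed x positions on the log2(n + 1) scale
tx <- log2(x + 1)

# Four labels equally spaced on the transformed scale, mapped back to x units
lab <- round(2^seq(min(tx), max(tx), length.out = 4) - 1)

# Suppress the default x axis, then draw ticks at the transformed positions
plot(tx, y, xaxt = "n", xlab = "x")
axis(1, at = log2(lab + 1), labels = lab)
```

For this data the labels come out as 1, 3, 7, 15. They are equally spaced on the axis, but because of the +1 shift they are not exact powers of 2.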
