R: plot of normalmixEM shows truncated density plots (mixtools) - r

I'm currently trying to plot the components found via the EM algorithm. However, the estimated densities do not extend to the ends of the x-axis. It looks like this:
My code is:
plot(EM_data, which=2, xlim= c(0, 80), xlab2= "", yaxt= "n", main2 ="", lwd2=0.8, border = "azure3")
lines(density(EM_data), lty=2, lwd=0.8)
The plot is truncated whether I specify xlim or not. xlim2 is not defined for this type of plot. Where am I going wrong?

The plot method for mixEM objects only draws the component densities within the range of the data. If you want to extend the densities, you must build the plot yourself.
Use something like this:
Example data:
library(mixtools)
data(faithful)
attach(faithful)
set.seed(100)
EM_data<-normalmixEM(waiting, arbvar = FALSE, epsilon = 1e-03)
mixtools plot:
plot(EM_data, which=2, xlim= c(30, 110), xlab2= "", yaxt= "n", main2 ="",
lwd2=0.8, border = "azure3")
lines(density(EM_data$x), lty=2, lwd=0.8)
Adaptation by extending densities:
a <- hist(EM_data$x, plot = FALSE)
maxy <- max(max(a$density), 0.3989 * EM_data$lambda/EM_data$sigma)
hist(EM_data$x, prob = TRUE, main = "", xlab = "", xlim= c(30, 110),
ylim = c(0, maxy), yaxt= "n", border = "azure3")
for (i in 1:ncol(EM_data$posterior)) {
curve(EM_data$lambda[i] * dnorm(x, mean = EM_data$mu[i], sd = EM_data$sigma[i]),
col = 1 + i, lwd = 0.8, add = TRUE)
}
lines(density(EM_data$x), lty=2, lwd=0.8)
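The constant 0.3989 in maxy above is a rounded 1/sqrt(2*pi): each weighted component peaks at its own mean, where its height is lambda/(sigma*sqrt(2*pi)). A minimal sketch of that check, using made-up parameter values (the numbers below are hypothetical, not the fitted values from EM_data):

```r
# Hypothetical mixture parameters, for illustration only
lambda <- c(0.36, 0.64)   # mixing proportions
mu     <- c(54.6, 80.1)   # component means
sigma  <- c(5.9, 5.9)     # component standard deviations

# Height of each weighted component evaluated at its own mean
peak_exact <- lambda * dnorm(mu, mean = mu, sd = sigma)

# The same quantity written with the rounded constant used for maxy
peak_approx <- 0.3989 * lambda / sigma

# The two agree up to the rounding of the constant
all.equal(peak_exact, peak_approx, tolerance = 1e-3)
```

Taking the maximum of these peaks together with the tallest histogram bar keeps ylim large enough for both the bars and the curves.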

Related

superimpose normal density curve to histogram malfunctioning (base r)

I am using base R and have some code for teaching the normal distribution that I have run successfully many times.
Now, however, when I superimpose the normal density curve, it doesn't seem to work properly.
Here is an example code:
set.seed(100)
data <- rnorm(1000, mean = 0, sd = 1)
hist(data, main = "Normal Distribution", xlab = "X", ylab = "Frequency", col = "444", xlim=c(-4,4))
Now I try to superimpose a density curve over the plot, using the density() command:
lines(density(data), col = "red", lwd = 2)
As you can see, the line is flat, and I am perplexed as to why. So I tried another method:
x <- seq(-4, 4, length.out = 100)
lines(x, dnorm(x, mean = 0, sd = 1), col = "red", lwd = 2)
But I get the same result.
Any thoughts why it's not working properly?
The answer came to me thanks to one of the user comments.
In base R, hist() plots counts by default rather than densities, and a density scale is what is needed here. Thus, if I set freq = F, the code works.
Here is the corrected code:
set.seed(100)
data <- rnorm(1000, mean = 0, sd = 1)
hist(data, main = "Normal Distribution", xlab = "X", ylab = "Frequency", col = "444", xlim=c(-4,4), freq = F)
lines(density(data), col ='777', lwd = 2)
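To see why the curve looked flat, it may help to compare the two scales directly. The sketch below computes the histogram without drawing it and inspects the counts and the densities; it reuses the same simulated data as above, and the variable name h is only illustrative.

```r
set.seed(100)
data <- rnorm(1000, mean = 0, sd = 1)

# Compute the histogram without drawing it
h <- hist(data, plot = FALSE)

# Default scale: bar heights are counts and add up to the sample size
sum(h$counts)

# Density scale (freq = FALSE): bar areas add up to 1
sum(h$density * diff(h$breaks))

# The tallest bar on each scale; a density curve never exceeds about 0.4
# here, so it hugs the x-axis when drawn on the count scale
max(h$counts)
max(h$density)
```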

Align gridlines with ticks in base R plot

I saw this picture on the Internet, and just wondered how to plot it in R. This is my code:
article <- data.frame(x = as.Date(round(runif(1000), 2) * 100, origin = '2017-01-01'), y = sample(letters[1:10], 1000, T))
plot(article$x, article$y, pch = 19, col = article$y, xlab = 'date', ylab = 'account', yaxt = 'n') + grid(nx = 10, ny = 10, lty = 1, col = 'grey')
axis(2, at = 1:10, label = levels(article$y))
And I got a picture like this. There is still a problem: the gridlines on the y axis do not line up with the axis labels. How can I solve this problem, or is there a more direct method for rendering the plot?
I don't know how to set the arguments of grid() so that it gives what you want, but you could use plot() to draw a blank plot, use abline() to draw the grid, and then plot the data on it using points().
So using your data
plot(article$x, article$y, type="n", xlab = 'date', ylab = 'account', yaxt = 'n', xaxt = 'n')
abline(h=1:10, v=pretty(article$x), col="grey")
points(article$x, article$y, pch = 19, col = article$y)
axis(2, at = 1:10, label = levels(article$y))
axis(1, at = pretty(article$x), label = format(pretty(article$x), "%b"))
Or just plot the data as you're doing and draw the grid afterwards using abline(), but in doing so the grid will be drawn on top of your data points.
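A possible explanation for the misalignment, sketched as plain arithmetic: with a numeric ny, grid() splits the plot region into ny equal cells, and the plot region is slightly wider than the data range. The sketch assumes the default axis style (yaxs = "r"), which pads the data range by 4% at each end; the variable names are illustrative.

```r
# Data range on the y axis: factor codes 1 to 10
y_range <- c(1, 10)

# Default axis style pads the range by 4% on each side
usr_y <- y_range + c(-1, 1) * 0.04 * diff(y_range)

# grid(ny = 10) places lines at the interior boundaries of 10 equal cells
grid_at <- seq(usr_y[1], usr_y[2], length.out = 10 + 1)[-c(1, 11)]

# Tick positions used by axis(2, at = 1:10, ...)
tick_at <- 1:10

grid_at   # none of these fall on an integer
tick_at
```

Drawing the lines with abline(h = 1:10) places them exactly at the tick positions instead.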
ggplot produces a graph very similar to the first picture you included in your post:
library(ggplot2)
library(dplyr)
article %>%
ggplot(aes(x, y, colour=y)) + geom_point() + theme_light() + labs(x='date', y='account')

plotFit - data plotted as bars instead of points?

I am using the plotFit function in the investr package in R to display my data as follows:
Figure 1
The code I am using to generate this is simply:
plotFit(nls model, interval = "confidence", level = 0.95, pch = 19, shade = TRUE,
col.conf = "seagreen2", col.fit = "green", lwd.fit = 2,
ylim = c(y1,y2), xlim = c(x1,x2),
xaxp = c(0,200,10), n = 100,
ylab = "", xlab = "",
main = "")
Is there a simple way that I could adapt the code to plot the data as bars, rather than points?
Yes, use type = "h". For example,
fit <- lm(dist ~ speed, data = cars)
library(investr)
plotFit(fit)
plotFit(fit, type = "h", lwd = 3)

How to display legend without masking my spatial plot?

I have a problem setting the position of the legend and wonder if anyone can help.
I am following this example:
http://www.thisisthegreenroom.com/2009/choropleths-in-r/
My code is:
require(maps)
require(ggmap)
library(openxlsx)
rm(list = ls())
map("state", "Arizona")
setwd('M:/SCC/Q-Board')
PM25 <- read.xlsx("PM2.5_Emission_AZ_60 EIS emission sectors.xlsx", sheet = 'Emission_County', colNames = TRUE)
colors = c("#F1EEF6", "#D4B9DA", "#C994C7", "#DF65B0", "#DD1C77",
"#980043")
PM25$colorBuckets <- as.numeric(cut(PM25$PM25, c(0, 5, 10, 20, 30,40, 50)))
map("county",'Arizona', col = colors[PM25$colorBuckets], fill = TRUE,boundary = TRUE, resolution = 0,
lty = 1, projection = "polyconic")
title("PM2.5 Emission by county, 2011")
leg.txt <- c("<5", "5-10", "10-20", "20-30", "30-40", ">40")
legend("bottom", leg.txt, horiz = F, fill = colors,bty="n",title = 'Unit:1000 tons')
The output figure is shown below. I tried changing the position by setting "top", "left", and so on,
but the legend still overlaps the figure.
Thank you for your help!
It seems to me that you have simply run out of plotting region. This is very common for spatial plots, which often occupy a large part of the plotting domain. I would split the domain into two: one part for the spatial plot, the other for the legend. The following code does this:
## a function to set up plotting region
## l: ratio of left region
## r: ratio of right region
split.region <- function(l, r) {
layout(matrix(c(rep(1, l), rep(2, r)), nrow = 1))
mai <- par("mai")
mai[2] <- 0.1
mai[4] <- 0
par(mai = mai)
}
# use 80% region for main image
# use 20% region for legend
split.region(4, 1)
## produce your main plot
image(x = 0:10/10, y = 0:10/10, matrix(rbinom(100, 1, 0.3), 10), bty= "n", xaxt = "n", yaxt = "n", ann = FALSE, main = "sample plot")
## set up 2nd plot, with nothing
plot(1:2, bty="n", ann=FALSE, xaxt = "n", yaxt = "n", col = "white")
## add your legend to your second plot
leg.txt <- c("<5", "5-10", "10-20", "20-30", "30-40", ">40")
## place legend at bottom left
legend("bottomleft", leg.txt, horiz = F, pch = 15, col = 1:6, bty="n", title = 'Unit:1000 tons', cex = 1.5)
Adjust l, r until you are satisfied.
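To make the effect of l and r concrete, here is a small sketch of the matrix that split.region() passes to layout(), and the share of the device width each plot receives. Nothing is drawn; the names m and share are illustrative.

```r
l <- 4
r <- 1

# One row; the first l columns belong to plot 1, the last r to plot 2
m <- matrix(c(rep(1, l), rep(2, r)), nrow = 1)
m

# Columns have equal width by default, so the split follows the column counts
share <- as.numeric(table(m)) / length(m)
share

# A two-column layout with explicit widths should give the same split:
# layout(matrix(1:2, nrow = 1), widths = c(l, r))
```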

Put one line chart and bar chart in one plot in R (Not ggplot)?

How can I combine a bar chart and a line in a single plot in R (from different data sources)?
Say I have two data sources as:
barData<-c(0.1,0.2,0.3,0.4) #In percentage
lineData<-c(100,22,534,52,900)
Note that they may not be on the same scale.
Can I plot both barData and lineData in one plot and make them look good?
I can't use ggplot in this case, so this is not a duplicate question.
Something like the following:
Maybe this helps as a starting point:
par(mar = rep(4, 4))
barData<-c(0.1,0.2,0.3,0.4) * 100
y <- lineData<-c(100,22,534,900);
x <- barplot(barData,
axes = FALSE,
col = "blue",
xlab = "",
ylab = "",
ylim = c(0, 100) )[, 1]
axis(1, at = x, labels = c("Julia", "Pat", "Max", "Norman"))
ats <- c(seq(0, 100, 15), 100); axis(4, at = ats, labels = paste0(ats, "%"), las = 2)
axis(3, at = x, labels = NA)
par(new = TRUE)
plot(x = x, y = y, type = "b", col = "red", axes = FALSE, xlab = "", ylab = "")
axis(2, at = c(pretty(lineData), max(lineData)), las = 2)
mtext(text="Lines of code by Programmer", side = 3, line = 1)
box()
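A different technique from the par(new = TRUE) overlay above is to rescale the line data onto the bar axis and label a second axis with the original values. This is only a sketch of the arithmetic; the names rng, scaled, tick_vals and tick_pos are illustrative.

```r
barData  <- c(0.1, 0.2, 0.3, 0.4) * 100   # percentages, as above
lineData <- c(100, 22, 534, 900)

# Map lineData linearly onto the 0-100 range used by the bars
rng    <- range(lineData)
scaled <- (lineData - rng[1]) / diff(rng) * 100

# Axis labels: original values, placed at their rescaled positions
tick_vals <- pretty(lineData)
tick_pos  <- (tick_vals - rng[1]) / diff(rng) * 100

scaled
```

With x holding the bar midpoints returned by barplot(), the line could then be added with lines(x, scaled, type = "b") and its axis with axis(2, at = tick_pos, labels = tick_vals), which avoids opening a second plot on top of the first.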
