Vector not being set correctly - r

This R code:
avector <- as.vector(top.links.added.overall$Amount)
x <- as.vector(top.links.added.overall[order(avector),])
x$Amount <- factor(x$Amount)
x$color[x$Amount == 100] <- "red"
x$color[x$Amount == 500] <- "blue"
x$color[x$Amount == 1000] <- "darkgreen"
dotchart(x$Amount,
labels = row.names(x),
cex=.7,
groups = x$Amount,
gcolor = "black",
color = x$color,
pch=19,
main = "Gas Mileage for Car Models\ngrouped by cylinder",
xlab = "Miles Per Gallon")
returns this error:
Error in dotchart(x$Amount, labels = row.names(x), cex = 0.7, groups = x$Amount, :
'x' must be a numeric vector or matrix
This is the data file for top.links.added.overall:
Amount,Name
1000,Google
500,Cnn
100,Yahoo
'x' is a vector, so what is causing this error?

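Remove the conversion to factor, x$Amount <- factor(x$Amount). dotchart() requires its first argument to be numeric, and that line turns x$Amount into a factor, which triggers the error.
Then convert to a factor only in the groups argument: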
dotchart(x$Amount,
labels = row.names(x),
cex=.7,
groups = factor(x$Amount),
gcolor = "black",
color = x$color,
pch=19,
main = "Gas Mileage for Car Models\ngrouped by cylinder",
xlab = "Miles Per Gallon")
That should resolve the error.
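For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the three-row data set from the question inline (the data.frame() call is my assumption about how the CSV was loaded) and uses the Name column for the labels instead of row names. The title and axis label are placeholders, not part of the original code.

```r
# Rebuild the data from the question (assumed to match the CSV file).
top.links.added.overall <- data.frame(
  Amount = c(1000, 500, 100),
  Name   = c("Google", "Cnn", "Yahoo")
)

# Sort by Amount; Amount stays numeric because it is never turned into a factor.
x <- top.links.added.overall[order(top.links.added.overall$Amount), ]

# Assign one colour per amount.
x$color <- NA_character_
x$color[x$Amount == 100]  <- "red"
x$color[x$Amount == 500]  <- "blue"
x$color[x$Amount == 1000] <- "darkgreen"

# The first argument is numeric; the factor is used only for grouping.
dotchart(x$Amount,
         labels = x$Name,
         groups = factor(x$Amount),
         gcolor = "black",
         color  = x$color,
         pch    = 19,
         cex    = 0.7,
         main   = "Links added",   # placeholder title
         xlab   = "Amount")        # placeholder axis label
```

Checking is.numeric(x$Amount) before the dotchart() call is a quick way to confirm that the column has not been converted along the way.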

Related

How can I introduce labels that show what points represent?

Hello, how can I neatly show which points correspond to each sample type ("soy yoghurt", "oat yoghurt" and "activia")?
The code I used to generate the plot is here:
color_type = rep("white", length(sample_type))
color_type[soy_yoghurt] = "brown"
color_type[oat_yoghurt] = "blue"
color_type[activia] = "gold"
pch_type = rep(NA, length(sample_type))
pch_type[all_yoghurt] = 21 # Circle symbols
# run the mds algorithm
mds = metaMDS(bray_dist)
#plot the results
par(mar=c(5,5,2,2), xpd = TRUE)
plot(main= "Ordination of milk-products",
mds$points[,1], mds$points[,2], cex = 3, pch = pch_type,
col = "black", bg = color_type, xlab = "NMDS1", ylab = "NMDS2")
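One common base-graphics option is legend(), which can draw the same filled circles next to each sample name. The sketch below is only an illustration: the coordinates are random stand-ins for mds$points, and sample_type and type_cols are hypothetical names, so the real NMDS result would have to be substituted.

```r
# Hypothetical stand-in data; replace with mds$points and the real sample types.
set.seed(42)
sample_type <- rep(c("soy yoghurt", "oat yoghurt", "activia"), each = 4)
coords <- matrix(rnorm(2 * length(sample_type)), ncol = 2)

# One fill colour per sample type, matching the colours in the question.
type_cols <- c("soy yoghurt" = "brown", "oat yoghurt" = "blue", "activia" = "gold")
bg_cols <- unname(type_cols[sample_type])

plot(coords[, 1], coords[, 2],
     main = "Ordination of milk-products",
     cex = 3, pch = 21, col = "black", bg = bg_cols,
     xlab = "NMDS1", ylab = "NMDS2")

# The legend reuses the same symbol and fill colours as the points.
legend("topright",
       legend = names(type_cols),
       pch    = 21,
       col    = "black",
       pt.bg  = type_cols,
       pt.cex = 2,
       bty    = "n")
```

Keeping the colours in one named vector means the plot and the legend cannot drift apart if a colour is changed later.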

Do we need to call dev.off() after creating a pdf file?

When I call dev.off(), my pdf gets created but I get the following message: "null device 1".
I don't get any message when I remove dev.off(), and my pdf still gets created, so why do I need to call dev.off()?
plot(x # independent variable (population_density)
, y # dependent variable (case_fatality_rate)
, main = "ScatterPlot - Case Fatality Rate vs Population Density Per Square Mile" # chart title
, xlab = "Population Density Per Square Mile" # x-axis label
, ylab = "Case Fatality Rate" # y-axis label
, pch = 19 # point shape (filled circle)
, frame = T # surround chart with a frame
, xlim = c(0, 1200), ylim = c(0, 3)
)
model <- lm(y ~ x, data = dataset) # compute the linear model
abline(model, col = "blue") # draw the model as a blue line
hist(y # dependent variable (case_fatality_rate)
, main = "Histogram - Case Fatality Rate Frequency" # chart title
, xlab = "Case Fatality Rate",
ylab = "Frequency",
col = "#f0ffff",
breaks = 15,
freq = FALSE,
prob = TRUE,
xlim = c(0.5,2.5),
ylim = c(0.0,2.0)
)
lines(density(y, adjust=1.2), col="blue", lwd=2)
grid(nx = NA, ny = NULL,
lty = 1, col = "gray", lwd = 1)
dev.off()
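As a small illustration of what the message means: "null device 1" is the printed return value of dev.off(), which reports the device that is current after closing, and it is not a warning. Closing the device is what finalises the file; otherwise that only happens when R closes the device itself, for example at the end of the session. The sketch below uses a temporary file and a throwaway plot, both of which are my own placeholders.

```r
# Write a plot to a temporary PDF (placeholder file name and plot).
out_file <- tempfile(fileext = ".pdf")
pdf(out_file)
plot(1:10, main = "demo")

# dev.off() closes the PDF device and returns the device that is now current.
# Assigning the result keeps it from being printed at the console.
closed <- dev.off()

# The file is complete once the device has been closed.
file.exists(out_file)
```

Wrapping the call as invisible(dev.off()) is another way to suppress the printed value in scripts.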

plot.zoo - Plot one graph with different colors

I want to plot a graph of a time series return on an asset with different colors for more volatile periods. I want the volatility clusters to be marked in red with the rest of the more calm periods marked in blue. I've attached an image of what I want to achieve.
My code:
plot.zoo(djr, xlab = "Time", ylab = "Returns", col = "blue")
If cond is a logical vector with your condition for more volatile periods (for example cond <- abs(Returns) > 0.05), you can use something like:
plot.zoo(djr, xlab = "Time", ylab = "Returns", col = "blue")
points(index(djr)[cond], djr[cond], type = "l", col = "red")
When there are several separate red periods, this draws spurious lines connecting one period to the next. The following example avoids the problem by drawing each run of consecutive values separately:
# Reproducible example:
library(zoo)
djr <- as.zoo(EuStockMarkets[, "DAX"])
djr <- (djr - mean(djr))/sd(djr)
cond <- abs(as.numeric(djr)) > 0.75
rlec <- rle(cond)
plot.zoo(djr, xlab = "Time", ylab = "Returns", col = "white")
ind <- 1
for(i in 1:length(rlec$values)) {
points(index(djr)[ind:(ind + rlec$lengths[i] - 1)],
djr[ind:(ind + rlec$lengths[i] - 1)],
type = "l", col = c("blue", "red")[rlec$values[i] + 1])
ind <- ind + rlec$lengths[i]
}
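To make the indexing in that loop easier to follow, here is a small base-R sketch of how rle() splits a logical condition into runs and how the start and end index of each run follow from the run lengths. The cond vector is a made-up toy example, and starts, ends and segments are names I introduced for illustration.

```r
# Toy condition: TRUE marks a "volatile" observation.
cond <- c(FALSE, FALSE, TRUE, TRUE, TRUE, FALSE, TRUE)

# rle() collapses the vector into runs of identical values.
runs <- rle(cond)

# The end of each run is the cumulative sum of the run lengths;
# the start is the end minus the run length, plus one.
ends   <- cumsum(runs$lengths)
starts <- ends - runs$lengths + 1

# FALSE + 1 = 1 selects "blue", TRUE + 1 = 2 selects "red".
segments <- data.frame(
  start  = starts,
  end    = ends,
  colour = c("blue", "red")[runs$values + 1]
)
print(segments)
```

Each row of segments corresponds to one iteration of the loop above: the points from start to end are drawn as one line in the given colour.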
The answer by Juan works. Here's an alternate method that I found works as well:
line_segs <- ifelse(djr <= -0.02 | djr >= 0.02, cbind(djr), NA)
line_segs <- na.omit(line_segs)
plot.zoo(cbind(djr, line_segs),
plot.type = "single",
xlab = "Date", ylab = "Returns",
col = c("blue", "red"))

How to limit variables of dotplot?

I want to create a dotplot which comprises only the top 10 values of the features in the text file. The following code works, but the output is a dotplot containing all 160 variables.
library(lattice)
table<-"imp_s2.txt"
DT<-read.table(table, header=T)
# output graph to pdf file
pdf("dotplot_s2.pdf")
colnames(DT)
DT$feature <- reorder(DT$feature, DT$IncMSE)
dotplot(feature ~ IncMSE, data = DT,
aspect = 1.5,
xlab = "Variable Importance, Scale 2",
scales = list(cex = .6),
panel = function (x, y) {
panel.abline(h = as.numeric(y), col = "gray", lty = 2)
panel.xyplot(x, as.numeric(y), col = "black", pch = 16)})
dev.off()
It would help if you included a reproducible example. My guess is that this can be done by simply subsetting your data frame so that you are including only the rows with the top 10 values. Something like this might work (although I can't test it):
# get threshold value
cutoff <- sort(DT$IncMSE, decreasing=TRUE)[10]
dotplot(feature ~ IncMSE,
data = DT[which(DT$IncMSE>=cutoff),], # this only includes top values
aspect = 1.5,
xlab = "Variable Importance, Scale 2",
scales = list(cex = .6),
panel = function (x, y) {
panel.abline(h = as.numeric(y), col = "gray", lty = 2)
panel.xyplot(x, as.numeric(y), col = "black", pch = 16)})
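A slightly different way to take the top 10, sketched here with base R only, is to sort the data frame and keep the first ten rows with head(). Unlike the cutoff approach, this returns exactly ten rows even when values are tied at the threshold. The data frame below is random stand-in data with the same column names as in the question, not the real imp_s2.txt.

```r
# Stand-in data with 160 features (the real data would come from read.table()).
set.seed(1)
DT <- data.frame(feature = paste0("f", 1:160),
                 IncMSE  = runif(160))

# Sort by importance, largest first, and keep the first ten rows.
top10 <- head(DT[order(DT$IncMSE, decreasing = TRUE), ], 10)

# Re-create the factor after subsetting so only ten levels remain,
# ordered by importance for plotting.
top10$feature <- reorder(factor(top10$feature), top10$IncMSE)

nrow(top10)
```

The resulting top10 data frame can then be passed as the data argument of the dotplot() call in place of DT.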

Change ordering of values in 'y' axis for dotchart

The following code:
avector <- as.vector(top.links.added.overall$Amount)
x <- as.vector(top.links.added.overall[order(avector),])
row.names(x) <- c("Yahoo" ,"Cnn", "Google")
x$color[x$Amount == 100] <- "red"
x$color[x$Amount == 500] <- "blue"
x$color[x$Amount == 1000] <- "darkgreen"
dotchart(x$Amount,
labels = row.names(x),
cex=.7,
groups = x$Amount,
gcolor = "black",
color = x$color,
pch=19,
main = "Gas Mileage for Car Models\ngrouped by cylinder",
xlab = "Miles Per Gallon")
generates this graph:
Here is the dataset file for top.links.added.overall:
Amount,Name
1000,Google
500,Cnn
100,Yahoo
When I remove this line:
row.names(x) <- c("Yahoo" ,"Cnn", "Google")
I get row names of 1, 2, 3.
I don't think I should need to set the names on the 'y' axis myself. How can the code for the graph be amended so that the company with the lowest numerical value (in this case Yahoo) starts at the beginning of the 'y' axis instead of at the top, which is what currently occurs?
I don't think I can test it with the offered R data objects but perhaps something along these lines:
x <- as.vector(top.links.added.overall[order(-avector),])
row.names(x) <- rev( c("Yahoo" ,"Cnn", "Google") )
This applies mathematical negation to the order() argument and uses the rev() (reverse) function on the row names.
Edit: I now understand your frustration, but after looking at the dotchart code I decided to try this, which seems to do it:
dotchart(x$Amount,
labels = row.names(x),
cex=.7,
groups = -x$Amount, # the code sorts by `as.numeric(groups)`
gcolor = "black",
color = x$color,
pch=19,
main = "Gas Mileage for Car Models\ngrouped by cylinder",
xlab = "Miles Per Gallon")
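The effect of negating groups can be checked without plotting. The sketch below assumes, as the comment in the code above says, that dotchart() orders the points by as.numeric(groups), and further assumes that this ordering is decreasing with the first point drawn at the bottom of the axis. The amount vector is rebuilt by hand from the question's data.

```r
# Amounts from the question, named by company.
amount <- c(Yahoo = 100, Cnn = 500, Google = 1000)

# Assumed ordering step: sort by the numeric value of groups, largest first.
o_pos <- sort.list(as.numeric(amount),  decreasing = TRUE)
o_neg <- sort.list(as.numeric(-amount), decreasing = TRUE)

# With groups = Amount, Google comes first; with groups = -Amount, Yahoo does.
names(amount)[o_pos]
names(amount)[o_neg]
```

Under those assumptions, negating the groups flips the order so that the company with the lowest amount is drawn first.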
