Plotting a color scatterplot matrix - r

I'm using RStudio v0.97. I want to get a color scatterplot matrix; here is my code:
dt<- impact[c(3,4,7,8)]
dt.r <- cor(dt)
dt.color <- dmat.color(dt.r)
dt.order<- order.single(dt.r)
cpairs(dt, dt.order, panel.controls = dt.color, main= "Scatterplots")
But my output is a black-and-white scatterplot matrix, along with the warning: "There were 50 or more warnings (use warnings() to see the first 50)"
How can I fix this?

Read the help for cpairs:
Usage:
cpairs(data, order = NULL,
panel.colors = NULL, border.color = "grey70", show.points = TRUE, ...)
The parameter is panel.colors, not panel.controls.
The warning is a clue - did you read the warning?
Warning messages:
1: In plot.window(...) : "panel.controls" is not a graphical parameter
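A corrected call might look like the sketch below. It assumes the gclus package (which provides cpairs, dmat.color and order.single) is installed, and it substitutes four columns of the built-in mtcars data for the asker's impact data frame, which is not available here.

```r
library(gclus)  # provides dmat.color(), order.single() and cpairs()

# Stand-in data (assumption): four numeric mtcars columns replace impact[c(3,4,7,8)]
dt <- mtcars[c("mpg", "disp", "hp", "wt")]
dt.r <- cor(dt)                  # correlation matrix
dt.color <- dmat.color(dt.r)     # one color per pair of variables
dt.order <- order.single(dt.r)   # ordering that groups correlated variables

# panel.colors (not panel.controls) is the argument that colors the panels
cpairs(dt, dt.order, panel.colors = dt.color, main = "Scatterplots")
```

With the argument name spelled correctly, it is matched by cpairs itself instead of being passed down to the low-level plotting functions, which is where the "is not a graphical parameter" warnings came from.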

Related

getting an error in decision tree model (R)

library(tree)
set.seed(1)
train=1:80
sum(is.na(sobija1))
hist(sobija1$CCSI)
sss=ifelse(sobija1$CCSI<=100, "negative","positive" )
sss=as.factor(sss)
sobija1=data.frame(sobija1,sss)
Tree_Class=tree(sss~sobija1$unemployment_rate+sobija1$house_pirce_index,sobija1,subset=train)
print(summary(Tree_Class))
plot(Tree_Class)
text(Tree_Class, pretty=0, cex=0.75)
cat("\n Confusion table for classification trees \n")
print(table(predict(Tree_Class,newdata = sobija1[-train,],type = "class"), sss[-train]))
> print(table(predict(Tree_Class,newdata = sobija1[-train,],type = "class"), sss[-train]))
Error in table(predict(Tree_Class, newdata = sobija1[-train, ], type = "class"), :
all arguments must have the same length
In addition: Warning message:
'newdata' had 41 rows but variables found have 121 rows
I tried to build a decision tree model and then a confusion matrix to check the error rate, but this error came up and I have no idea how to fix it.
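The warning about 'newdata' having 41 rows while the variables found have 121 is typical of formulas that refer to columns as sobija1$column: those terms are always looked up in the full data frame, so predict() ignores newdata. The sketch below reproduces the effect with base R's lm() as a stand-in for tree() (an assumption made so the example runs without extra packages) and with synthetic data in place of sobija1.

```r
# Synthetic stand-in for sobija1 (assumption): 121 rows, first 80 used for training
set.seed(1)
df <- data.frame(x = rnorm(121))
df$y <- 2 * df$x + rnorm(121)
train <- 1:80

# Problematic: df$x in the formula is always taken from df itself,
# so predict() ignores the rows of newdata and returns 121 values
bad <- lm(y ~ df$x, data = df, subset = train)
n_bad <- length(suppressWarnings(predict(bad, newdata = df[-train, ])))

# Fixed: bare column names are resolved inside newdata
good <- lm(y ~ x, data = df, subset = train)
n_good <- length(predict(good, newdata = df[-train, ]))

c(bad = n_bad, good = n_good)
```

Applied to the question, the formula would presumably become sss ~ unemployment_rate + house_pirce_index with data = sobija1, so that the predictions and sss[-train] have the same length.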

Error in xy.coords(x, y, xlabel, ylabel, log) while knitting the file

Everything works when I run the chunks, but an error occurs when I knit my .rmd file.
########### needed for testing purpose #################
library(tree)
set.seed(77191)
library(ISLR)
library(randomForest)
attach(Carseats)
n=nrow(Carseats)
indices=sample(1:n,n/2,replace=F)
cstrain=Carseats[indices,]
cstest=Carseats[-indices,]
tree.cs <- tree(Sales ~. , data = cstrain)
summary(tree.cs)
plot(tree.cs)
text(tree.cs)
y_hat <-predict(tree.cs, newdata = cstest)
test.mse =mean((y_hat - cstest$Sales)^2) #Test's MSE
test.mse
######################################################
# 2nd chunk
cv.cs <- cv.tree(tree.cs)
cx =cv.cs$size
cy =cv.cs$dev
mymy <- xy.coords(cx,cy)
plot(mymy, xlab = "size", ylab = "dev", type = "b")
mini.tree <-which.min(cv.cs$dev)
points(mini.tree,cv.cs$dev[mini.tree], col="green", cex= 2, pch = 20)
The 2nd chunk yields a plot (image not shown).
#3rd chunk
#pruning
prune.cs <- prune.tree(tree.cs, best = mini.tree)
plot(prune.cs) # the problematic part
y_hat <- predict(prune.cs, newdata = cstest)
mean((y_hat - cstest$Sales)^2)
The 3rd chunk has to yield a plot of the pruned tree (image not shown).
Not a duplicate of:
'x' is a list, but does not have components 'x' and 'y'
Error in xy.coords(x, y, xlabel, ylabel, log) : 'x' is a list, but does not have components 'x' and 'y'
Did not solve the problem:
Fit a Decision Tree classifier to the data; Error in code
I know about the coordinates plot() needs in order to run, but here I am trying to plot a tree. Also, the code worked many times before; it just won't knit.
The 1st chunk is included in case you want to try it yourself.
Thank you.
I suppose your problematic line should be
prune.cs <- prune.tree(tree.cs, best = cv.cs$size[mini.tree])
instead of
prune.cs <- prune.tree(tree.cs, best = mini.tree)
You are not interested in the index, which can change every time you do cross-validation, but the tree size at that index.
The same thing is true in the 2nd chunk where you have
points(mini.tree,cv.cs$dev[mini.tree], col="green", cex= 2, pch = 20)
which should be
points(cv.cs$size[mini.tree], cv.cs$dev[mini.tree], col="green", cex= 2, pch = 20)
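The difference between the index and the tree size can be seen in a small sketch. The list below is a hypothetical stand-in for the output of cv.tree() (the values are made up for illustration); sizes are typically listed from largest to smallest, so the position of the minimum deviance is not itself a tree size.

```r
# Hypothetical cv.tree()-like result (values invented for illustration)
cv <- list(size = c(9, 7, 5, 3, 1),
           dev  = c(420, 395, 380, 410, 520))

idx  <- which.min(cv$dev)   # position of the smallest deviance
best <- cv$size[idx]        # tree size stored at that position

idx    # position in the vectors
best   # value to pass as best = ... to prune.tree()
```

Here the index is 3 while the corresponding size is 5, so passing the index to prune.tree() would request a different tree than intended.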

Error message when running npreg

I'm working the npreg example in the R np package documentation (by T. Hayfield, J. Racine), section 3.1 Univariate Regression.
library("np")
data("cps71")
model.par = lm(logwage~age + I(age^2),data=cps71)
summary(model.par)
#
attach(cps71)
bw = npregbw(logwage~age) # this line is not in example 3.1
model.np = npreg(logwage~age,regtype="ll", bwmethod="cv.aic",gradients="TRUE",
+ data=cps71)
This is copied directly from the example, but the npreg call results in the error message
*Rerun with Debug
Error in npreg.rbandwidth(txdat = txdat, tydat = tydat, bws = bws, ...) :
NAs in foreign function call (arg 15)
In addition: Warning message:
In npreg.rbandwidth(txdat = txdat, tydat = tydat, bws = bws, ...) :
NAs introduced by coercion*
The npreg R documentation indicates the first argument should be the bandwidth specification. I tried setting bws=1
model.np = npreg(bws=1,logwage~age,regtype="ll",
+ bwmethod="cv.aic",gradients="TRUE", data=cps71)
which gives the following error
*Error in toFrame(xdat) :
xdat must be a data frame, matrix, vector, or factor*
This is my first time working with density estimation in R. Please suggest how to resolve these errors.

How to render the wordcloud in R shiny?

While working through topic modelling with LDA, I have to render the wordcloud output in the main panel of a Shiny app.
The following lines define the wordcloud I have to render:
i <- 1
cloud.data <- sort(result$topics[i, ], decreasing = TRUE)[1:50]
wordcloud(names(cloud.data), freq = cloud.data, scale = c(4, 0.1), min.freq = 1,
rot.per = 0, random.order = FALSE)
I tried it in renderPlot, but it gives me the following error:
shiny::runApp('~/RProject/dynamic_UI')
Listening on http://127.0.0.1:3358
Error in if (grepl(tails, words[i])) ht <- ht + ht * 0.2 :
argument is of length zero
Warning in run(timeoutMs) : "min.freq" is not a graphical parameter
Warning in run(timeoutMs) : "min.freq" is not a graphical parameter
Then I corrected some parameters (for example, min.freqs for min.freq), and now get the following error:
shiny::runApp('~/RProject/dynamic_UI')
Listening on http://127.0.0.1:3358
Read 8265 items
Error in if (grepl(tails, words[i])) ht <- ht + ht * 0.2 :
argument is of length zero
Warning in run(timeoutMs) :
is.na() applied to non-(list or vector) of type 'NULL'
How could I render the wordcloud output in the main panel?
The name of my column did not match: should have been testdoc5$word but I had testdoc5$doc. Now it works!

plotting in for loop with R using package eHOF

I want to run a number (~100) of eHOF models using the R package eHOF, produce the graphs that this package can make of the models, and save a jpeg file of each one. I am trying to use a for loop to accomplish this quickly, but I am not able to get it to make the graphs. I don't see anything in the RStudio plots window, and the jpeg files I produce have nothing in them (as a side problem, I am not producing the names for the jpeg files correctly in the loop).
Outside a loop, producing these plots is no problem: if, for example, I call my model modSp<-HOF(Sp, ...), then plot(modSp) produces the desired graph. But within the loop nothing is produced, and I get several warning messages of the sort:
1: In xy.coords(x, y, xlabel, ylabel, log) : NAs introduced by coercion
2: In plot.window(...) : "boxp" is not a graphical parameter
3: In plot.window(...) : "las.h" is not a graphical parameter
4: In plot.window(...) : "onlybest" is not a graphical parameter
5: In plot.window(...) : "para" is not a graphical parameter
Background: I am using R version 3.1.0 (2014-04-10) in RStudio, and package eHOF, on Windows 7.
My code is as follows:
species<- read.csv("F:/Thesis_projects/Chapter4_climateChange/HOF/259species.csv")
species<-as.data.frame(species)
enviro<-read.csv("F:/Thesis_projects/Chapter4_climateChange/HOF/EnvironmentalData.csv")
enviro<-as.data.frame(enviro)
species_enviro<-merge(enviro, species, all.x=FALSE)
HOF_Sp<-species_enviro[,23:25]
GDD<-species_enviro[,19]
library(eHOF)
SpeciesCodes<-c("ACPE","ACRU2","ACSP2")
Modx<-NULL
for (Spp in seq_along(SpeciesCodes)){
Modx[[Spp]]<-HOF(HOF_Sp[[Spp]],GDD, M=1,family=binomial, bootstrap=2, freq.limit = 100)
jpeg(filename = (paste(("GDD_responsecurve_",SpeciesCodes[[Spp]],".jpg"),sep="")),
width =8.3, height = 8.3, units = "cm", pointsize = 8, bg="white", res = 800)
print(plot((paste(c(Modx[[Spp]]))), boxp = TRUE,
las.h = 1, onlybest = TRUE, para = TRUE,
gam.se = FALSE, newdata = NULL, lwd=1, leg = TRUE, add=FALSE,
xlabel="Growing degree days", ylab="Probability"))
dev.off()
}
My Data looks like this:
> head(GDD)
[1] 996.1681 996.1681 962.0662 962.0662 945.7007 945.7007
(there are lots of 1's in the species data too, just not in the first few rows).
> head(HOF_Sp)
ACMI2 ACPA ACPE
1 0 0 0
2 0 0 0
3 0 0 0
4 0 0 0
5 0 0 0
6 0 0 0
Any advice at all would be very appreciated! I think it is an issue with the plot function in the loop being mistaken for the generic plot function of R. If I didn't provide enough information, I will be happy to edit my question.
There are several things going on here:
1. Your SpeciesCodes vector has codes different from the column names in HOF_Sp.
2. seq_along(...) returns the index of each element in its argument, not the element itself.
3. The plot method for HOF is invoked when an object of class HOF is passed to plot(...). But you are passing paste(c(HOF)), which is incomprehensible...
4. Your example is not reproducible because it does not run with the data you provided (did you try to run it??). Specifically, the sample of HOF_Sp is degenerate because there are no non-zero values.
It's impossible for me to test this code because of (4) above, but try this:
SpeciesCodes <- c("ACPE","ACRU2","ACSP2")
for (Spp in SpeciesCodes) {
model <- HOF(HOF_Sp[[Spp]],GDD, M=1,family=binomial, bootstrap=2, freq.limit = 100)
jpeg(filename = paste0("GDD_responsecurve_", Spp, ".jpg"),
width =8.3, height = 8.3, units = "cm", pointsize = 8, bg="white", res = 800)
plot(model, boxp = TRUE,
las.h = 1, onlybest = TRUE, para = TRUE,
gam.se = FALSE, newdata = NULL, lwd=1, leg = TRUE, add=FALSE,
xlabel="Growing degree days", ylab="Probability")
dev.off()
}
Note that this invokes the vector version of HOF(...), which I gather is what you want.
This does not solve the problem in (1) above (species codes do not match columns names), but other than that it should work.
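The points about seq_along(...), file names, and mismatched codes can be checked in isolation with base R. The sketch below uses the species codes from the question; the column names are taken from the head(HOF_Sp) output shown there, and paste0() is one possible way to build the file names.

```r
SpeciesCodes <- c("ACPE", "ACRU2", "ACSP2")
hof_columns  <- c("ACMI2", "ACPA", "ACPE")   # column names seen in head(HOF_Sp)

# seq_along() yields positions, not the codes themselves
positions <- seq_along(SpeciesCodes)

# Iterating over the vector itself yields the codes
file_names <- character(0)
for (Spp in SpeciesCodes) {
  # paste0() concatenates without a separator
  file_names <- c(file_names, paste0("GDD_responsecurve_", Spp, ".jpg"))
}

# Codes that actually exist as columns; the others would give NULL from HOF_Sp[[Spp]]
usable <- intersect(SpeciesCodes, hof_columns)

positions
file_names
usable
```

Only codes that appear in usable can be looked up with HOF_Sp[[Spp]], which is why the mismatch in (1) still has to be resolved before the loop can fit every model.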
