I am using RColorBrewer to manually colour my ggplot bar chart, but I'm having no luck.
I create my colour palette of blues and then pass it to colorRamp() to get a function that interpolates between the colours.
blues <- brewer.pal(9, "Blues")
blue_range <- colorRamp(blues)
I then plot my stacked bar chart, where I know I have 20 groups.
ggplot(Month.Summary, aes(x=Calendar.Month, y = Measure, fill = Groups)) + geom_bar(stat="Identity", position = "fill") +scale_fill_manual(values = blue_range(20))
I unfortunately get the following error:
Error: Insufficient values in manual scale. 20 needed but only 3
provided.
I'm using Groups as my fill, where I know there are 20 instances. I'm passing 20 to the blue_range function, so I'm not sure why it says I'm only passing 3 colours.
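The blue_range() function returned by colorRamp() expects values between 0 and 1, not a count. blue_range(20) is a single out-of-range input, so it returns one row of three RGB components, which is why the scale sees only 3 values. To get a discrete palette, pass a sequence to this function: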
> blue_range(seq(0, 1, length.out = 20))
[,1] [,2] [,3]
[1,] 247.00000 251.00000 255.0000
[2,] 236.47368 244.26316 251.6316
[3,] 225.94737 237.52632 248.2632
[4,] 215.68421 230.78947 244.8947
[5,] 205.57895 224.05263 241.5263
[6,] 193.78947 217.21053 237.5263
[7,] 176.94737 210.05263 231.6316
[8,] 160.10526 202.89474 225.7368
[9,] 139.21053 191.68421 220.9474
[10,] 117.73684 179.89474 216.3158
[11,] 98.36842 168.10526 210.6316
[12,] 81.10526 156.31579 203.8947
[13,] 64.26316 144.26316 197.1053
[14,] 50.36842 130.36842 189.9474
[15,] 36.47368 116.47368 182.7895
[16,] 25.10526 102.89474 173.1053
[17,] 14.57895 89.42105 162.5789
[18,] 8.00000 75.78947 148.2632
[19,] 8.00000 61.89474 127.6316
[20,] 8.00000 48.00000 107.0000
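The rows are RGB values on a 0-255 scale, so convert them to colour strings, e.g. with rgb(blue_range(seq(0, 1, length.out = 20)), maxColorValue = 255), before passing them to scale_fill_manual(). This is not tested, because you didn't provide a reproducible example.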
Note that recent versions of ggplot2 have scale_fill_distiller(), which provides similar functionality with a more convenient interface, although it is designed for continuous fill variables.
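As a minimal sketch of the conversion described above, using only base R: the hex codes below are assumed to be the nine colours that brewer.pal(9, "Blues") returns (hard-coded here so the snippet runs without RColorBrewer), and the names pal_a and pal_b are just illustrative. colorRampPalette() wraps colorRamp() and returns colour strings directly, which is usually the more convenient choice for scale_fill_manual().

```r
# Assumed to match brewer.pal(9, "Blues"); hard-coded to avoid the package dependency
blues <- c("#F7FBFF", "#DEEBF7", "#C6DBEF", "#9ECAE1", "#6BAED6",
           "#4292C6", "#2171B5", "#08519C", "#08306B")

# Option A: colorRamp() gives a function of values in [0, 1] that returns
# a matrix of RGB components (0-255), which must be converted to strings
blue_range <- colorRamp(blues)
rgb_mat <- blue_range(seq(0, 1, length.out = 20))
pal_a <- rgb(rgb_mat, maxColorValue = 255)

# Option B: colorRampPalette() gives a function of n that returns
# n colour strings directly
pal_b <- colorRampPalette(blues)(20)

print(pal_b)
```

Either vector can then be supplied as scale_fill_manual(values = pal_b) in the plot from the question.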
I am given an empirical distribution FXemp of a real-valued random variable X. Now let X1,..., Xn have the same distribution as X, with dependence given by a copula C. I would like to produce random samples of X1,..., Xn, each an element of R.
E.g. I am given a vector of samples and the corresponding cdf
x <- rnorm(1000)
df <- ecdf(x)
Assume that I pick, for example, a Student t or Clayton copula C. How can I produce random samples of, say, 10 copies of x whose dependence is determined by C?
Is there an easy way?
Or are there any packages that can be used here?
You can sample from the copula (with uniform margins) by using the copula package, and then apply the inverse ecdf to each component:
library(copula)
x <- rnorm(100) # sample of X
d <- 5 # desired number of copies
copula <- claytonCopula(param = 2, dim = d)
nsims <- 25 # number of simulations
U <- rCopula(nsims, copula) # sample from the copula (with uniform margins)
# now sample the copies of X ####
Xs <- matrix(NA_real_, nrow = nsims, ncol = d)
for(i in 1:d){
Xs[,i] <- quantile(x, probs = U[,i], type = 1) # type=1 is the inverse ecdf
}
Xs
# [,1] [,2] [,3] [,4] [,5]
# [1,] -0.5692185 -0.9254869 -0.6821624 -1.2148041 -0.682162391
# [2,] -0.4680407 -0.4263257 -0.3456553 -0.6132320 -0.925486872
# [3,] -1.1322063 -1.2148041 -0.8115089 -1.0074435 -1.430405604
# [4,] 0.9760268 1.2600186 1.0731551 1.2369623 0.835024471
# [5,] -1.1280825 -0.8995429 -0.5761037 -0.8115089 -0.543125426
# [6,] -0.1848303 -1.2148041 -0.5692185 0.8974921 -0.613232036
# [7,] -0.5692185 -0.3070884 -0.8995429 -0.8115089 -0.007292346
# [8,] 0.1696306 0.4072428 0.7646646 0.4910863 1.236962330
# [9,] -0.7908557 -1.1280825 -1.2970952 0.3655081 -0.633521404
# [10,] -1.3226053 -1.0074435 -1.6857615 -1.3226053 -1.685761474
# [11,] -2.5410325 -2.3604936 -2.3604936 -2.3604936 -2.360493569
# [12,] -2.3604936 -2.2530003 -1.9311289 -2.2956444 -2.360493569
# [13,] 0.4072428 -0.2150035 -0.3564803 -0.1051930 -0.166434458
# [14,] -0.4680407 -1.0729763 -0.6335214 -0.8995429 -0.899542914
# [15,] -0.9143225 -0.1522242 0.4053462 -1.0729763 -0.158375658
# [16,] -0.4998761 -0.7908557 -0.9813504 -0.1763604 -0.283013334
# [17,] -1.2148041 -0.9143225 -0.5176347 -0.9143225 -1.007443492
# [18,] -0.2150035 0.5675260 0.5214050 0.8310799 0.464151265
# [19,] -1.2148041 -0.6132320 -1.2970952 -1.1685962 -1.132206305
# [20,] 1.4456635 1.0444720 0.7850181 1.0742214 0.785018119
# [21,] 0.3172811 1.2369623 -0.1664345 0.9440006 1.260018624
# [22,] 0.5017980 1.4068250 1.9950305 1.2600186 0.976026807
# [23,] 0.5675260 -1.0729763 -1.2970952 -0.3653535 -0.426325703
# [24,] -2.5410325 -2.2956444 -2.3604936 -2.2956444 -2.253000326
# [25,] 0.4053462 -0.5431254 -0.5431254 0.8350245 0.950891450
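The same two-step idea (sample uniforms from the copula, then apply the inverse ecdf) can be sketched for a Student t copula without any extra package. This is only an illustration under assumed parameters: the equicorrelation rho = 0.6 and degrees of freedom nu = 4 are arbitrary choices, and the multivariate t draws are built by hand from correlated normals and a chi-squared mixing variable.

```r
set.seed(1)
x   <- rnorm(1000)   # observed sample of X
d   <- 10            # number of dependent copies
n   <- 500           # number of simulated rows
rho <- 0.6           # assumed common correlation
nu  <- 4             # assumed degrees of freedom

# Equicorrelation matrix and its Cholesky factor (t(R) %*% R == Sigma)
Sigma <- matrix(rho, d, d)
diag(Sigma) <- 1
R <- chol(Sigma)

# Correlated normals, then scale each row by a shared chi-squared factor
Z  <- matrix(rnorm(n * d), n, d) %*% R
W  <- rchisq(n, df = nu) / nu
Tm <- Z / sqrt(W)            # row i is divided by sqrt(W[i]): multivariate t

# t cdf gives uniform margins; these rows are draws from the t copula
U <- pt(Tm, df = nu)

# Inverse ecdf (type = 1) applied column by column
Xs <- apply(U, 2, function(u) quantile(x, probs = u, type = 1, names = FALSE))
dim(Xs)
```

Because type = 1 quantiles are the inverse of the ecdf, every entry of Xs is one of the original observations in x.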
I have the following predictions, which I obtained using library(vars). Let's call this vecm.pred:
$price
fcst lower upper CI
[1,] 4956.787 4864.032 5049.543 92.75548
[2,] 4948.936 4844.545 5053.327 104.39064
[3,] 5089.440 4979.941 5198.939 109.49891
[4,] 5076.999 4939.429 5214.569 137.56992
[5,] 5000.012 4854.955 5145.068 145.05669
[6,] 5072.107 4910.435 5233.780 161.67272
$people
fcst lower upper CI
[1,] 2529.799 2417.699 2641.899 112.1000
[2,] 2498.627 2269.438 2727.817 229.1893
[3,] 2410.037 2116.672 2703.402 293.3648
[4,] 2418.197 2094.965 2741.429 323.2320
[5,] 2371.373 2028.816 2713.929 342.5561
[6,] 2289.163 1941.386 2636.939 347.7764
I am trying to use fanchart to show my forecasts below:
fanchart(vecm.pred, ylab = c("Price (€)","Volume"), main = c("Price","People"))
But I cannot get past the following issues:
1) How do I change the colours from the default grey scale to a heat-map scale running from red to yellow?
2) How do I set a different ylab for the first and second plot? The ylab argument above just puts both y-axis names on each plot.
I would like to plot the following matrix x, so the column data are plotted according to their column name (i.e. 0.1, 0.2, etc.) on the x-axis.
> x
0.1 0.2 0.3 0.4 0.5
[1,] 5.000000e-01 5.000000e-01 5.000000e-01 5.000000e-01 0.5000000000
[2,] 2.500000e-02 5.000000e-02 7.500000e-02 1.000000e-01 0.1250000000
[3,] 2.437500e-03 9.500000e-03 2.081250e-02 3.600000e-02 0.0546875000
[4,] 2.431559e-04 1.881950e-03 6.113802e-03 1.388160e-02 0.0258483887
[5,] 2.430967e-05 3.756817e-04 1.822927e-03 5.475560e-03 0.0125901247
[6,] 2.430908e-06 7.510810e-05 5.458812e-04 2.178231e-03 0.0062158067
[7,] 2.430902e-07 1.502049e-05 1.636750e-04 8.693947e-04 0.0030885852
[8,] 2.430902e-08 3.004053e-06 4.909445e-05 3.474555e-04 0.0015395229
[9,] 2.430902e-09 6.008089e-07 1.472761e-05 1.389339e-04 0.0007685764
[10,] 2.430902e-10 1.201617e-07 4.418219e-06 5.556585e-05 0.0003839928
But when I use
plot(x, pch=20, ylim=c(0, 1))
I get the following: [image: plot of the R matrix]
I want a plot where x[1, 1] (i.e. 5.000000e-01) is plotted as a point at 0.1 on the x-axis and 0.5 on the y-axis.
set.seed(123)
mat<-matrix(rnorm(25),5,5)
colnames(mat)<-seq(0.1,0.5,length.out=5)
plot(x=matrix(rep(as.numeric(colnames(mat)),5), 5,5,byrow=T),y=mat)
Here the first argument, x, repeats the column names (converted to numbers) row by row into a 5 x 5 matrix, so every value in a column of mat gets that column's position on the x-axis.
matplot() can also be used with the same arguments:
matplot(x=matrix(rep(as.numeric(colnames(mat)),5), 5,5,byrow=T),y=mat)
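For a matrix of any shape, such as the 10 x 5 matrix in the question, the x positions can be built without hard-coding the dimensions. The following is a small sketch using col(), which returns the column index of every cell; the matrix m and the name xpos are made up for illustration.

```r
set.seed(123)
m <- matrix(runif(50), nrow = 10, ncol = 5)
colnames(m) <- seq(0.1, 0.5, by = 0.1)

# col(m) holds each cell's column index, so indexing the numeric
# column names with it yields one x position per cell (column-major order)
xpos <- as.numeric(colnames(m))[col(m)]

plot(xpos, as.vector(m), pch = 20, ylim = c(0, 1),
     xlab = "column name", ylab = "value")
```

as.vector(m) unrolls the matrix in the same column-major order as xpos, so each value is paired with its own column's x position.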
I'm new to R, so I apologize in advance if my question seems a little dumb, but I really need some expert advice.
I have a problem in applying someone else's code to my data. The author's code works perfectly with the examples he provides, and it seems to me I am doing all the correct steps in my case, but apparently I am not.
The main function is:
grid_boot <- function(dat,name,t,ar,grid,bq,c,all,grph)
and I should simply specify the parameters and then run the code given in the script; at least, this works for the author's example.
All the specifications in the examples are close to my case, except for the ar parameter.
The ar parameter is the autoregressive order of a time series, so it is simply a number from 1 to n that you choose. My series requires a simple ar = 1, but if I run the code with this specification, R returns the following error:
" Error in solve.default(t(x) %*% x) : system is computationally
singular: reciprocal condition number = 6.07898e-34 In addition:
Warning messages: 1: In dat[(ar - k + 2):(n - k + 1)] - dat[(ar - k +
1):(n - k)] : longer object length is not a multiple of shorter
object length 2: In dat[(ar - k + 2):(n - k + 1)] - dat[(ar - k +
1):(n - k)] : longer object length is not a multiple of shorter
object length"
(I know there are other posts with this error in the title, but none seems to fit my case.)
In the example, the author specifies the following:
orig <- 2 # set to 1 for original data, set to 2 for extended data #
t <- 2
grid <- 200
bq <- 9999
c <- .9
i <- 7
d <- np[i,]
if (orig==1){
y <- as.matrix(dat[d[1]:(d[2]-18)])
if (i==4) y <- y[21:82]
}else{
y <- as.matrix(dat[d[1]:d[2]])
}
name <- "GNP per Capita: 1869-1988"
ar <- d[3]
What I can't figure out is the assignment ar <- d[3] and, more generally, what exactly he means by specifying i and d. I think this specification is needed because his dataset consists of several variables, all written in the same column and identified by an index.
When I give these inputs to R (I use RStudio), the environment pane shows ar = 1. When I instead give the numerical input (ar <- 1) for my own exercise, the only result is the error mentioned above.
Below I report my data (as you can see, it is only a single series with few observations, so it should be an easy one) and my inputs, while the script, the example and the data for the example are downloadable here.
I hope someone can help me figure out what I am doing wrong, and I will be very grateful to anyone who is willing to help a newbie like me.
install.packages("pracma")
library(pracma)
source(file.choose())
dat <- as.matrix(read.csv(file.choose(), header = TRUE))
print(dat)
USA
[1,] 0.01075000
[2,] 0.01116000
[3,] 0.01214000
[4,] 0.01309000
[5,] 0.01668000
[6,] 0.02991000
[7,] 0.02776000
[8,] 0.04218000
[9,] 0.05415000
[10,] 0.05895000
[11,] 0.04256000
[12,] 0.03306000
[13,] 0.00622000
[14,] 0.11035000
[15,] 0.09132000
[16,] 0.05737000
[17,] 0.06486000
[18,] 0.07647000
[19,] 0.11266000
[20,] 0.13509000
[21,] 0.10316000
[22,] 0.06161000
[23,] 0.03212000
[24,] 0.04317000
[25,] 0.03561000
[26,] 0.01859000
[27,] 0.03741000
[28,] 0.04009000
[29,] 0.04827000
[30,] 0.05398000
[31,] 0.04235000
[32,] 0.03029000
[33,] 0.02952000
[34,] 0.02607000
[35,] 0.02805000
[36,] 0.02931000
[37,] 0.02338000
[38,] 0.01552000
[39,] 0.02188000
[40,] 0.03377000
[41,] 0.02826000
[42,] 0.01586000
[43,] 0.00002270
[44,] 0.02677000
[45,] 0.03393000
[46,] 0.03226000
[47,] 0.02853000
[48,] 0.03839000
[49,] -0.00000356
[50,] 0.00001640
[51,] 0.03157000
[52,] 0.02069000
[53,] 0.01465000
[54,] 0.01622000
[55,] 0.01622000
dat <- dat
name <- "Inflation"
t <- 1
ar <- 1
grid <- 200
bq <- 1999
c <- .9
all <- 0
grph <- 1
out <- grid_boot(dat, name, t, ar, grid, bq, c, all, grph)
I'm trying to reuse a HoltWinters model previously generated in R. I have found a related entry here, but it does not seem to work with HoltWinters. Basically I have tried something like this:
myModel<-HoltWinters(ts(myData),gamma=FALSE)
predict(myModel,n.ahead=10)
#time to change the data
predict(myModel,n.ahead=10,newdata=myNewData)
When I try to predict using the new data I get the same prediction.
I would appreciate any suggestion.
You can use update:
mdl <- HoltWinters(EuStockMarkets[,"FTSE"],gamma=FALSE)
predict(mdl,n.ahead=10)
Time Series:
Start = c(1998, 170)
End = c(1998, 179)
Frequency = 260
fit
[1,] 5451.093
[2,] 5447.186
[3,] 5443.279
[4,] 5439.373
[5,] 5435.466
[6,] 5431.559
[7,] 5427.652
[8,] 5423.745
[9,] 5419.838
[10,] 5415.932
predict(update(mdl,x=EuStockMarkets[,"CAC"]),n.ahead=10)
Time Series:
Start = c(1998, 170)
End = c(1998, 179)
Frequency = 260
fit
[1,] 3995.127
[2,] 3995.253
[3,] 3995.380
[4,] 3995.506
[5,] 3995.633
[6,] 3995.759
[7,] 3995.886
[8,] 3996.013
[9,] 3996.139
[10,] 3996.266
predict.HoltWinters doesn't have a newdata argument, which is why the data doesn't get replaced. The prediction doesn't require any data: it is determined entirely by the coefficients component of the fitted model.
m <- HoltWinters(co2)
m$coefficients #These values describe the model completely;
#adding new data makes no difference
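If the goal is to keep the smoothing parameters estimated on the old series and apply them to a new one, one possible approach (a sketch, not something the answers above tested) is to pass the stored alpha and beta back to HoltWinters(), which then skips the optimisation and only recomputes the level and trend on the new data. The names fit_old and fit_new are illustrative.

```r
old_series <- EuStockMarkets[, "FTSE"]
new_series <- EuStockMarkets[, "CAC"]

# Fit once; alpha and beta are estimated by optimisation
fit_old <- HoltWinters(old_series, gamma = FALSE)

# Reuse the estimated smoothing parameters on the new series;
# supplying alpha and beta means they are not re-estimated
fit_new <- HoltWinters(new_series,
                       alpha = fit_old$alpha,
                       beta  = fit_old$beta,
                       gamma = FALSE)

fc <- predict(fit_new, n.ahead = 10)
fc
```

This differs from update(mdl, x = ...), which re-runs the original call on the new data and therefore re-estimates alpha and beta as well.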