poly() and orthogonal polynomials - r

I searched about poly() in R and I think it should produce orthogonal polynomials so when we use it in regression model like lm(y~poly(x,2)) the predictors are uncorrelated. However:
poly(1:3,2)=
[1,] -7.071068e-01 0.4082483
[2,] -7.850462e-17 -0.8164966
[3,] 7.071068e-01 0.4082483
I think this is probably a stupid question, but what I don't understand is why the column vectors of the result poly(1:3,2) do not have inner product zero. That is, -7.07*0.40-7.85*(-0.82)+7.07*0.41 != 0? So how are these uncorrelated predictors for regression?

Your main problem is that you're misreading "E notation": as commented by @MamounBenghezal above, fffeggg is shorthand for fff * 10^(ggg), so -7.850462e-17 means -7.850462 * 10^(-17), which is effectively zero.
I get slightly different answers than you do (the difference is numerically trivial) because I'm running this on a different platform:
pp <- poly(1:3,2)
## 1 2
## [1,] -7.071068e-01 0.4082483
## [2,] 4.350720e-18 -0.8164966
## [3,] 7.071068e-01 0.4082483
An easier format to see:
print(zapsmall(matrix(c(pp),3,2)),digits=3)
## [,1] [,2]
## [1,] -0.707 0.408
## [2,] 0.000 -0.816
## [3,] 0.707 0.408
sum(pp[,1]*pp[,2]) ## 5.196039e-17, effectively zero
Or to use your example, with the correct placement of decimal points:
-0.707*0.408-(7.85e-17)*(-0.82)+(0.707)*0.408
## [1] 5.551115e-17
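A quick way to confirm the orthogonality numerically is to compare the cross-product matrix with the identity using a tolerance rather than exact equality. This is a minimal sketch in base R; the names X and G are introduced here for illustration only.

```r
# Columns of poly(1:3, 2), stripped of the "poly" attributes
pp <- poly(1:3, 2)
X  <- matrix(c(pp), 3, 2)

# Gram matrix t(X) %*% X: the identity up to floating-point round-off
G <- crossprod(X)
print(zapsmall(G))

# Compare with a tolerance instead of ==
print(isTRUE(all.equal(G, diag(2))))

# Each column also sums to (numerically) zero, so a zero inner product
# here means zero correlation between the two predictors
print(round(colSums(X), 12))
print(abs(cor(X[, 1], X[, 2])) < 1e-12)
```

Exact comparison with == would fail here because the off-diagonal entries are on the order of 1e-17, not exactly 0.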

Related

r ADBUG model nls singular gradient

I've tried to fit the following data to an ADBUG model using the nls function in R, but the singular gradient error keeps coming up and I don't know how to proceed.
nprice nlv2
[1,] 0.6666667 1.91666667
[2,] 0.7500000 1.91666667
[3,] 0.8333333 1.91666667
[4,] 0.9166667 1.44444444
[5,] 1.0000000 1.00000000
[6,] 1.0833333 0.58333333
[7,] 1.1666667 0.22222222
[8,] 1.2500000 0.08333333
[9,] 1.3333333 0.02777778
code:
fit <- nls(f=nprice~a+b*nlv2^c/(nlv2^c+d),start=list(a=0.083,b=1.89,c=-10.95,d=0.94))
Error in nls(f = nprice ~ a + b * nlv2^c/(nlv2^c + d), start = list(a = 0.083, :
singular gradient
The nlsr package provides an updated version of nls through the function nlxb, which in most cases avoids the "singular gradient" error.
library(nlsr)
fit <- nlxb(f = nprice~a+b*nlv2^c/(nlv2^c+d),
data = df,
start = list(a=0.083,b=1.89,c=-10.95,d=0.94))
## vn:[1] "nprice" "a" "b" "nlv2" "c" "d"
## no weights
fit$coefficients
## a b c d
## -2.1207e+04 2.1208e+04 -7.4083e-01 1.6236e-05
The fitted coefficients are far from the starting values and very large in magnitude, which suggests that the model is poorly determined by these data.
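The nlxb call above refers to a data frame df that the answer does not show. A minimal reconstruction, assuming df simply holds the two columns printed in the question:

```r
# Data copied from the matrix printed in the question
# (assumption: the answer's `df` is just these two columns)
df <- data.frame(
  nprice = c(0.6666667, 0.7500000, 0.8333333, 0.9166667, 1.0000000,
             1.0833333, 1.1666667, 1.2500000, 1.3333333),
  nlv2   = c(1.91666667, 1.91666667, 1.91666667, 1.44444444, 1.00000000,
             0.58333333, 0.22222222, 0.08333333, 0.02777778)
)
str(df)

# Count the distinct nlv2 values available to the fit
print(length(unique(df$nlv2)))
```

Note that nlv2 is identical in the first three rows while nprice changes, so fewer distinct nlv2 values than rows are available for estimating four parameters. That may be part of why the fit is hard to pin down.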

Unexpected output containing plus, minus, and letters produced by subtracting one column of numbers from another in R

I have a data.frame containing a vector of numeric values (prcp_log).
waterdate PRCP prcp_log
<date> <dbl> <dbl>
1 2007-10-01 0 0
2 2007-10-02 0.02 0.0198
3 2007-10-03 0.31 0.270
4 2007-10-04 1.8 1.03
5 2007-10-05 0.03 0.0296
6 2007-10-06 0.19 0.174
I then pass this data through a Christiano-Fitzgerald band pass filter using the following command from the mFilter package.
library(mFilter)
US1ORLA0076_cffilter <- cffilter(US1ORLA0076$prcp_log,pl=180,pu=365,root=FALSE,drift=FALSE,
type=c("asymmetric"),
nfix=NULL,theta=1)
This creates an S3 object containing, among other things, a vector of "trend" values and a vector of "cycle" values, like so:
head(US1ORLA0076_cffilter$trend)
[,1]
[1,] 0.05439408
[2,] 0.07275321
[3,] 0.32150292
[4,] 1.07958965
[5,] 0.07799329
[6,] 0.22082246
head(US1ORLA0076_cffilter$cycle)
[,1]
[1,] -0.05439408
[2,] -0.05295058
[3,] -0.05147578
[4,] -0.04997023
[5,] -0.04843449
[6,] -0.04686915
Plotted:
plot(US1ORLA0076_cffilter)
I then apply the following mathematical operation in attempt to remove the trend and seasonal components from the original numeric vector:
US1ORLA0076$decomp <- ((US1ORLA0076$prcp_log - US1ORLA0076_cffilter$trend) - US1ORLA0076_cffilter$cycle)
Which creates an output of values which includes unexpected elements such as dashes and letters.
head(US1ORLA0076$decomp)
[,1]
[1,] 0.000000e+00
[2,] 0.000000e+00
[3,] 1.387779e-17
[4,] -2.775558e-17
[5,] 0.000000e+00
[6,] 6.938894e-18
What has happened here? What do these additional characters signify? How can I perform this mathematical operation and achieve the desired output of simply $prcp_log minus both the $trend and $cycle values?
I am happy to provide any additional info that will help right away, just ask.
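The "dashes and letters" in the output above are ordinary numbers printed in E notation: 1.387779e-17 means 1.387779 * 10^(-17), which is floating-point round-off around zero. A minimal sketch of the same effect and of some ways to display it, using made-up values rather than the filter output:

```r
# A classic round-off example (values chosen for illustration only)
d <- (0.3 - 0.1) - 0.2
print(d)                                # tiny, printed in E notation

# The same number written out in fixed notation
print(format(d, scientific = FALSE))

# Compare with zero using a tolerance instead of ==
print(isTRUE(all.equal(d, 0)))

# zapsmall() rounds relative to the largest value in the vector,
# so it needs to see the tiny value alongside normal-sized ones
print(zapsmall(c(1.03, d)))

# Explicit rounding also works
print(round(d, 10))
```

Applied to the question, something like zapsmall() or round() on the decomp column would show the values as zeros. A result that is zero everywhere would also suggest that trend plus cycle already reproduces prcp_log, so subtracting both leaves nothing.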

error in computing the generalized eigenvalues in R with geigen package

I'm using the R package geigen to solve the generalized eigenvalue problem A*V = lambda*B*V.
This is the code:
geigen(Gamma_chi_0, diag(diag(Gamma_xi_0)),symmetric=TRUE, only.values=FALSE) #GENERALIZED EIGENVALUE PROBLEM
Where:
Gamma_chi_0
[,1] [,2] [,3] [,4] [,5]
[1,] 1.02346 -0.50204 0.41122 -0.73066 0.00072
[2,] -0.50204 0.96712 -0.33526 0.51774 -0.37708
[3,] 0.41122 -0.33526 1.05086 0.09798 0.09274
[4,] -0.73066 0.51774 0.09798 0.99780 -0.51596
[5,] 0.00072 -0.37708 0.09274 -0.51596 1.03354
and
diag(diag(Gamma_xi_0))
[,1] [,2] [,3] [,4] [,5]
[1,] -0.0234 0.0000 0.0000 0.0000 0.0000
[2,] 0.0000 0.0329 0.0000 0.0000 0.0000
[3,] 0.0000 0.0000 -0.0509 0.0000 0.0000
[4,] 0.0000 0.0000 0.0000 0.0022 0.0000
[5,] 0.0000 0.0000 0.0000 0.0000 -0.0335
But I get this error:
> geigen(Gamma_chi_0, diag(diag(Gamma_xi_0)), only.values=FALSE)
Error in .sygv_Lapackerror(z$info, n) :
Leading minor of order 1 of B is not positive definite
In matlab, using the same two matrices, it works:
opt.disp = 0;
[P, D] = eigs(Gamma_chi_0, diag(diag(Gamma_xi_0)),r,'LM',opt);
% compute first r generalized eigenvectors and eigenvalues
For example I get the following eigenvalues matrix
D =
427.8208 0
0 -38.6419
Of course, in Matlab I only computed the first r=2; in R I want all the eigenvalues and eigenvectors (n=5), and then I will subset the first 2.
Can someone help me to solve this?
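geigen treats Gamma_chi_0 as symmetric and therefore uses the symmetric Lapack routine, which requires B to be positive definite; Lapack then encounters an error and cannot continue. Specify symmetric=FALSE in the call of geigen (the manual describes what the argument symmetric does). With B <- diag(diag(Gamma_xi_0)), do this: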
geigen(Gamma_chi_0, B, symmetric=FALSE, only.values=FALSE)
The result is (on my computer)
$values
[1] 4.312749e+02 -3.869203e+01 -2.328465e+01 1.706288e-05 1.840783e+01
$vectors
[,1] [,2] [,3] [,4] [,5]
[1,] -0.067535068 1.0000000 0.2249715 -0.89744514 0.05194799
[2,] -0.035746438 0.1094176 0.3273440 0.03714518 1.00000000
[3,] 0.005083806 0.3782606 0.8588086 0.50306323 0.17858115
[4,] -1.000000000 0.2986963 0.4067701 -1.00000000 -0.48314183
[5,] -0.034226056 -0.6075727 1.0000000 -0.53017872 0.06738515
$alpha
[1] 1.365959e+00 -1.152686e+00 -9.202769e-01 4.352770e-07 5.588102e-01
$beta
[1] 0.003167259 0.029791306 0.039522893 0.025510167 0.030357208
This is quite close to what you show for Matlab. I know nothing about Matlab so I cannot help you with that.
Addendum
Matlab seems to choose its method much as geigen does, depending on whether the matrices are determined to be symmetric. Your matrix Gamma_chi_0 may not be exactly symmetric. See this documentation for argument 'algorithm' of eig.
More addendum
In actual fact your matrix B is not positive definite. Try the function chol of base R on it and you'll get the same kind of error message. In this case you have to force geigen to use the general algorithm.
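The chol check suggested above can be sketched as follows. B is rebuilt here from the diagonal printed in the question, and the error is caught so the script keeps running; the exact wording of the message may differ between R versions.

```r
# B = diag(diag(Gamma_xi_0)), using the diagonal printed in the question
B <- diag(c(-0.0234, 0.0329, -0.0509, 0.0022, -0.0335))

# A diagonal matrix is positive definite only if every diagonal entry is > 0
print(all(diag(B) > 0))

# chol() needs a positive definite matrix, so it fails on B
msg <- tryCatch({
  chol(B)
  "chol succeeded"
}, error = function(e) conditionMessage(e))
print(msg)
```

Because the very first diagonal entry is negative, the factorization already fails at the leading minor of order 1, which matches the Lapack error in the question.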

How to solve linear equations in R with rectangular matrix

I'm trying to solve a 7x2 matrix problem of the form below using R:
A=array(c(5.54,0.96,1.59,2.07,0.73,10.64,8.28,1.41,3.77,3.11,3.74,2.93,8.29,3.33), c(7,2))
A
# [,1] [,2]
#[1,] 5.54 1.41
#[2,] 0.96 3.77
#[3,] 1.59 3.11
#[4,] 2.07 3.74
#[5,] 0.73 2.93
#[6,] 10.64 8.29
#[7,] 8.28 3.33
b=c(80814.25,34334.75,47921.75,59514.25,26981.25,63010.25,46646.25)
b
#[1] 80814.25 34334.75 47921.75 59514.25 26981.25 63010.25 46646.25
solve (A,b)
Error in solve.default(A, b) : 'a' (7 x 2) must be square
A %*% solve (A,b)
Error in solve.default(A, b) : 'a' (7 x 2) must be square
What can I do to solve the problem? I need a solution for two variables, x1 and x2, in the 7x2 system stated above.
You're using solve, which requires a square matrix. ?solve points to qr.solve for non-square matrices; for an overdetermined system like this one it returns the least-squares solution.
qr.solve(A,b)
[,1]
[1,] 3741.208
[2,] 6552.174
You might want to check that this is correct for your purposes. There are other ways to solve these types of problems. This might help you though.
The corpcor package offers a pseudoinverse function for finding the inverse of a rectangular matrix:
library(corpcor)
pseudoinverse(A)
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 0.06271597 -0.05067830 -0.02922597 -0.03265713 -0.03964039 0.0230086
[2,] -0.05845856 0.08551514 0.05661287 0.06532450 0.06674243 0.0391552
[,7]
[1,] 0.07239133
[2,] -0.05420334
pseudoinverse(A) %*% b
[,1]
[1,] 3741.208
[2,] 6552.174
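With seven equations and two unknowns there is generally no exact solution, so both answers above return the least-squares solution. A minimal base R sketch that cross-checks qr.solve against the normal equations and lm.fit (the names x_qr, x_ne, x_lm and res are introduced here for illustration):

```r
A <- array(c(5.54, 0.96, 1.59, 2.07, 0.73, 10.64, 8.28,
             1.41, 3.77, 3.11, 3.74, 2.93, 8.29, 3.33), c(7, 2))
b <- c(80814.25, 34334.75, 47921.75, 59514.25, 26981.25, 63010.25, 46646.25)

# Least-squares solution via QR decomposition
x_qr <- qr.solve(A, b)

# Same solution from the normal equations t(A) A x = t(A) b
x_ne <- solve(crossprod(A), crossprod(A, b))

# Same solution from the linear-model fitter (no intercept column is added)
x_lm <- lm.fit(A, b)$coefficients

print(cbind(qr = as.vector(x_qr), normal_eq = as.vector(x_ne), lm = unname(x_lm)))

# The residual is not zero, but it is orthogonal to the columns of A
res <- b - A %*% x_qr
print(round(crossprod(A, res), 6))
```

The residual res shows how far A %*% x is from b in each equation, which is worth inspecting before using x1 and x2.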

How to deal with log of zero in R in image.plot?

I have a matrix whose entries are all probabilities. Most of the entries are very low probabilities, and some are zero. I need to take the log of the matrix. However, since there are zeros in the matrix, R generates -Inf for those entries. My goal is to feed this log(matrix) into image.plot(). When I do, I keep getting this error:
Error in seq.default(minz + binwidth/2, maxz - binwidth/2, by = binwidth) :
invalid (to - from)/by in seq(.)
Is there any solution in here that can help me get around this?
Here is what the matrix looks like:
0 1 2 3 4 5 6
[1,] -0.0007854138 -8.9132811 -10.011893 -10.705041 -9.606428 -9.318746 -Inf
[2,] -0.3402118357 -1.6137090 -2.742625 -4.215836 -5.721434 -7.121522 -9.606428
[3,] -0.2912175507 -2.0296478 -3.521929 -4.275321 -4.426519 -4.187369 -3.715705
[4,] -1.5244380532 -0.7048802 -2.001368 -3.405243 -3.713864 -3.143919 -3.781412
[5,] -0.7572491288 -0.7487709 -3.981208 -5.110329 -5.228577 -5.095569 -5.293395
[6,] -0.0007629648 -Inf -8.759130 -7.613998 -9.606428 -Inf -Inf
[7,] -0.0020658381 -7.4861648 -7.526987 -7.094123 -9.318746 -Inf -Inf
[8,] -0.0295715883 -6.7160566 -7.208533 -6.610696 -6.485533 -6.813220 -6.387552
[9,] -0.0032128722 -6.7160566 -7.613998 -7.871827 -7.760602 -8.759130 -8.759130
[10,] -0.4869248130 -1.3225132 -2.518576 -3.768698 -5.140520 -6.183252 -7.208533
7 8 9
[1,] -Inf -10.705041 -10.011893
[2,] -Inf -Inf -7.149693
[3,] -4.965248 -5.968842 -6.428374
[4,] -4.696227 -5.091913 -4.669559
[5,] -5.163777 -5.468599 -6.577906
[6,] -Inf -Inf -Inf
[7,] -Inf -Inf -Inf
[8,] -6.627503 -6.456545 -6.400976
[9,] -10.011893 -10.011893 -Inf
[10,] -8.402456 -7.814669 -6.546158
Here is the structure :
structure(c(0.999214894571557, 0.71161956034096, 0.747353073126963,
0.217743382682817, 0.468954688200987, 0.999237326155227, 0.997936294302378,
0.970861372812921, 0.996792283535218, 0.614513234634365, 0.000134589502018843,
0.199147599820547, 0.13138178555406, 0.49416778824585, 0.472947510094213,
0, 0.000560789591745177, 0.00121130551816958, 0.00121130551816958,
0.266464782413638, 4.48631673396142e-05, 0.0644010767160162,
0.0295423956931359, 0.135150291610588, 0.0186630776132795, 0.00015702108568865,
0.00053835800807537, 0.000740242261103634, 0.000493494840735756,
0.0805742485419471, 2.24315836698071e-05, 0.0147599820547331,
0.0139075818752804, 0.0331987438313145, 0.00603409600717811,
0.000493494840735756, 0.000829968595782862, 0.00134589502018843,
0.000381336922386721, 0.0230820995962315, 6.72947510094213e-05,
0.00327501121579183, 0.0119560340960072, 0.0243831314490803,
0.00536114849708389, 6.72947510094213e-05, 8.97263346792284e-05,
0.00152534768954688, 0.000426200089726335, 0.00585464333781965,
8.97263346792284e-05, 0.000807537012113055, 0.0151861821444594,
0.0431135038133692, 0.00612382234185734, 0, 0, 0.00109914759982055,
0.00015702108568865, 0.00206370569762225, 0, 6.72947510094213e-05,
0.0243382682817407, 0.022790489008524, 0.00502467474203679, 0,
0, 0.00168236877523553, 0.00015702108568865, 0.000740242261103634,
0, 0, 0.00697622252131, 0.00912965455361149, 0.00572005383580081,
0, 0, 0.00132346343651862, 4.48631673396142e-05, 0.000224315836698071,
2.24315836698071e-05, 0, 0.00255720053835801, 0.00614625392552714,
0.00421713772992373, 0, 0, 0.0015702108568865, 4.48631673396142e-05,
0.000403768506056528, 4.48631673396142e-05, 0.000785105428443248,
0.00161507402422611, 0.00937640197397936, 0.00139075818752804,
0, 0, 0.00165993719156572, 0, 0.00143562135486765), .Dim = c(10L,
10L), .Dimnames = list(NULL, c("0", "1", "2", "3", "4", "5",
"6", "7", "8", "9")))
If these zeros come from a physical measurement that should yield a strictly positive result but fails to do so for technical reasons, it might be reasonable to substitute 1/2 of the lower limit of detection for the zeros (M below is the matrix of probabilities from the question).
M2 <- M
print( min(M[M!=0]), digits=16)
#[1] 2.24315836698071e-05
M2[M2==0] <- 0.5*min(M[M!=0])
image(M2)
image(log(M2))
True, a log plot may make "the difference between entries more noticeable". However, if you have zeros in your data, a log scale is the wrong tool. The point of a logarithmic scale is to illustrate exponential increases in the data. Having zeros, however, means that either:
the values observed were not produced by a process exhibiting exponential growth, or
you need to handle your missing values differently.
Either way, what would work a lot better in your case is taking the square root of the values, or the n-th root (n > 2) if you want to accentuate the difference in values even more: the higher the n, the bigger the difference.
As per @flodel's suggestion below, the code that would do this is: image.plot(sqrt(x)) or, more generally, image.plot(x^(1/n)) for some n>1.
Hope this helps.
A simple trick is to add 1 before taking the log: since log(1) = 0, a cell with 0 will still be 0 after the log transformation.
k<-matrix(c(1:8,0,0),nrow=2,ncol=5)
> k
[,1] [,2] [,3] [,4] [,5]
[1,] 1 3 5 7 0
[2,] 2 4 6 8 0
log(k)
[,1] [,2] [,3] [,4] [,5]
[1,] 0.0000000 1.098612 1.609438 1.945910 -Inf
[2,] 0.6931472 1.386294 1.791759 2.079442 -Inf
log(k+1)
[,1] [,2] [,3] [,4] [,5]
[1,] 0.6931472 1.386294 1.791759 2.079442 0
[2,] 1.0986123 1.609438 1.945910 2.197225 0
The exception is thrown by seq(), which cannot take an infinite value (Inf or -Inf) as any of its arguments. You can get exactly the same type of error with the following code:
> seq(-log(0), 0, 50)
Error in seq.default(-log(0), 0, 50) : invalid (to - from)/by in seq(.)
To avoid it, follow @Metrics's trick. However, instead of adding 1.0, I would suggest adding a very small value, such as 1e-22, since your matrix is a matrix of probabilities.
Can't paste multiple lines of code in a comment, but this example shows what I meant:
> m=cbind(c(0,0.88,0.99),c(1,2,1),c(3,4,5))
> m=as.matrix(m)
> log(m)
[,1] [,2] [,3]
[1,] -Inf 0.0000000 1.098612
[2,] -0.12783337 0.6931472 1.386294
[3,] -0.01005034 0.0000000 1.609438
> m
[,1] [,2] [,3]
[1,] 0.00 1 3
[2,] 0.88 2 4
[3,] 0.99 1 5
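Continuing the example above, here is a minimal sketch of two ways to keep infinite values out of the plotting call. The offset eps and the names log_eps and log_na are assumptions made for illustration, and the right choice depends on what a zero means in your data.

```r
m <- cbind(c(0, 0.88, 0.99), c(1, 2, 1), c(3, 4, 5))

# Option 1: add a very small offset before taking the log
eps <- 1e-22                 # assumed offset, well below the smallest nonzero entry
log_eps <- log(m + eps)
print(log_eps)

# Option 2: take the log, then mark the infinite cells as missing
log_na <- log(m)
log_na[!is.finite(log_na)] <- NA
print(log_na)

# The colour range is now finite in both cases
print(range(log_eps))
print(range(log_na, na.rm = TRUE))

# image(log_na)   # base graphics; NA cells are left blank
```

With option 1 the former zeros land far below every other value and can dominate the colour scale; with option 2 they are simply not drawn.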
