I have a noninvertible matrix A and a vector b for which I believe there is a solution x to Ax = b. I would like to find an example of such an x. When I try solve(A,b) in R, it produces an error because A is singular. Is there any way to make R give me a random solution?
I eventually tried lm(b ~ 0 + A), which works. It leaves the estimates for some columns as NA, which you can replace with 0 to get an example solution. For example
A = matrix(c(1,1,0,0),nr=2,byrow=F)
b = c(2,2)
lm(b ~ 0 + A)
will produce coefficients 2 and NA for the two columns of A. solve(A,b), lsfit(A,b), and qr.solve(A,b) do not work.
Edit: MASS::ginv(A) %*% b works too.
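For reference, the same idea can be sketched in base R with an SVD-based pseudoinverse. This is only an illustration, not what MASS::ginv does internally line for line; the names tol, r, x0, N and x_rand are made up here. It computes the minimum-norm solution and then adds a random vector from the null space of A, which gives a "random" solution whenever one exists.

```r
A <- matrix(c(1, 1, 0, 0), nrow = 2)
b <- c(2, 2)

# Full SVD so that the columns of v beyond the rank span the null space of A
s   <- svd(A, nu = nrow(A), nv = ncol(A))
tol <- max(dim(A)) * max(s$d) * .Machine$double.eps
r   <- sum(s$d > tol)                      # numerical rank

U1 <- s$u[, seq_len(r), drop = FALSE]
V1 <- s$v[, seq_len(r), drop = FALSE]

# Minimum-norm least-squares solution (an exact solution if Ax = b is consistent)
x0 <- V1 %*% (crossprod(U1, b) / s$d[seq_len(r)])

# Any null-space vector can be added without changing A %*% x
N      <- s$v[, seq_len(ncol(A)) > r, drop = FALSE]
x_rand <- x0 + N %*% rnorm(ncol(N))

A %*% x_rand   # should reproduce b up to rounding
```

If A %*% x0 does not reproduce b, the system has no exact solution and x0 is only a least-squares fit.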
So I have been trying to decipher some code but have hit a roadblock.
Can someone please explain to me what is being done to P$L and P$HY to get Y?
I need to understand Functionally (visually how the data frame changes) and from a Mathematical point of view.
Thanks in advance
# create sample data frame
L <- c(15,12,9,6,3,0)
HY <- c(0,0.106,0.277,0.641,0.907,1)
P <- data.frame(L,HY)
# constants
d <- 5
# THIS IS THE PART THAT I DO NOT UNDERSTAND!!
Y <- lm(P$HY ~ poly(P$L, d))
So, to reiterate the question: I'm trying to figure out, mathematically and functionally, what Y <- lm(HY ~ poly(L, d)) is doing.
You are building a linear model with HY as the dependent variable and L as the independent variable using a polynomial of degree 5 for L, so 5 terms for this variable. You are saving this model in the variable Y.
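To see what this means in practice, here is a small sketch using the sample data from the question (the names X and fit are chosen here for illustration). By default poly() does not return the raw powers L, L^2, ..., L^5 but an orthogonal basis spanning the same polynomials, and lm() then regresses HY on those five columns plus an intercept.

```r
L  <- c(15, 12, 9, 6, 3, 0)
HY <- c(0, 0.106, 0.277, 0.641, 0.907, 1)
d  <- 5

# The design columns lm() sees: a 6 x 5 matrix of orthogonal polynomials in L
X <- poly(L, d)
round(crossprod(X), 10)   # identity matrix: the columns are orthonormal
round(colSums(X), 10)     # zeros: each column is orthogonal to the intercept

# The model itself: intercept plus 5 polynomial terms = 6 coefficients
fit <- lm(HY ~ poly(L, d))
coef(fit)
fitted(fit)               # with 6 points and 6 coefficients the fit passes through every HY
```

Because there are as many coefficients as data points, this particular model interpolates the data exactly and leaves no residual degrees of freedom.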
I'd like to solve an equation for a variable for each line of a given csv file.
You may know the equation as the Euler-Lotka equation.
That is what I have so far:
# seed is needed for reproducible results (otherwise random numbers will never be the same!)
set.seed(42)
# using the Euler-Lotka equation
# l = survival rate until age x
# m = amount of offspring at age x
# x = age of reproduction
# r = population growth rate
y <- function(r, l1, l2, l3, m1, m2, m3, x1, x2, x3, z){((l1*m1*exp(-r*x1)) + (l2*m2*exp(-r*x2)) + (l3*m3*exp(-r*x3))) - z}
# iterate through each line calculating r and writing it into the respective field
for (i in 1:length(neos_data$jar_no)){
# declare the variables from table (this does not work!!)
l1 <- neos_data$surv_rate_clutch1[i]
l2 <- neos_data$surv_rate_clutch2[i]
l3 <- neos_data$surv_rate_clutch3[i]
m1 <- neos_data$indiv_sum_1_clutch[i]
m2 <- neos_data$indiv_sum_2_clutch[i]
m3 <- neos_data$indiv_sum_3_clutch[i]
x1 <- neos_data$age_clutch_1[i]
x2 <- neos_data$age_clutch_2[i]
x3 <- neos_data$age_clutch_3[i]
# this works, while these numbers are the same as in the data frame
l1 <- 0.9333333
l2 <- 0.9333333
l3 <- 0.9333333
m1 <- 3.4
m2 <- 0
m3 <- 0
x1 <- 9
x2 <- 13
x3 <- 16
## uniroot finds a zero of the function, so the function is offset by z (that's why there is a -z in the formula above)
r <- uniroot(y, l1=l1, l2=l2, l3=l3, m1=m1, m2=m2, m3=m3, x1=x1, x2=x2, x3=x3, z = 1, interval = c(-1, 1))$root # keep only the root itself
# write r into table
neos_data$pop_gr[i] <- r
}
As I already commented, uniroot works fine with manual input of values. But when I try to load a value from my data frame, it gives the error "values of f() have the same sign".
I do understand the meaning of the error itself, but why does it work with the values I insert manually and not with the same values from the data frame? (And yes, I have checked the data types.)
Would be glad for any help, as what I've seen so far was not helpful in my case :)
EDIT:
To clarify: I'd like to get a value of r for which the equation becomes 0. This works fine with the given code as long as I insert the values of the variables as numbers. But when I try to pass the values from my data frame, it fails even if the same values are passed.
Ok, I think I've found the problem.
There are some lines where each part of the sum becomes 0. At each step where the loop hits the 0s the error occurs and the whole stuff doesn't work.
This seems natural as the equation is:
1 = SUM( l(x) * m(x) * exp(-r*x) )
if all l(x) and m(x) are 0 the equation cannot become 1, of course.
I didn't realize this issue as the script didn't work at all. Now, after trying and rewriting and deleting code, somehow it writes the resulting r into the data frame until the line with 0s. That brought me to this conclusion.
Why does this always happen after hours of trying? :D
However, to solve this issue, I inserted 0.0001 into the respective fields just to get the loop running. In my case I just want to copy the r values to my master data sheet. As there are only 3 lines with all 0s (they were there because NAs couldn't be handled by uniroot), I will delete those values by hand (NAs won't disturb any further calculation).
Thanks for your help anyway. It pointed me in the right direction :)
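As an alternative to inserting 0.0001 by hand, the all-zero and NA rows could be skipped explicitly. The following is only a sketch: euler_lotka_r is a hypothetical helper name, not part of the original script, and it takes the three l, m and x values as vectors instead of nine separate arguments.

```r
# Hypothetical helper: solves sum(l * m * exp(-r * x)) = 1 for r,
# returning NA when no root can exist in the search interval.
euler_lotka_r <- function(l, m, x, interval = c(-1, 1)) {
  # With missing values or zero total offspring the sum can never reach 1
  if (anyNA(c(l, m, x)) || sum(l * m) <= 0) return(NA_real_)
  f <- function(r) sum(l * m * exp(-r * x)) - 1
  # uniroot needs opposite signs at the interval ends; give up otherwise
  if (f(interval[1]) * f(interval[2]) > 0) return(NA_real_)
  uniroot(f, interval = interval, tol = 1e-10)$root
}

# Values from the question: only the first clutch contributes
r_ok   <- euler_lotka_r(c(0.9333333, 0.9333333, 0.9333333), c(3.4, 0, 0), c(9, 13, 16))
# A row where every term is 0 yields NA instead of an error
r_zero <- euler_lotka_r(c(0, 0, 0), c(0, 0, 0), c(9, 13, 16))
```

Inside the loop this would be called as neos_data$pop_gr[i] <- euler_lotka_r(c(l1, l2, l3), c(m1, m2, m3), c(x1, x2, x3)), so problematic rows end up as NA without stopping the loop.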
The way I am currently using neural net is that it predicts one output point from many input points. More specifically, I run the below.
nn <- neuralnet(
as.formula(a ~ c + d),
data=Z, hidden=c(3,2), err.fct="sse", act.fct=custom,
linear.output=TRUE, rep=5)
Here, if Z is a matrix with columns named a, c, and d, it will predict the value in column a for each row from the corresponding values in columns c and d. (The vertical dimension is used as samples for training.)
Suppose there's also a column b. I am wondering if there's a way to predict both, a and b, from c and d? I've tried
as.formula(a+b ~ c+d)
but that does not appear to work.
Any ideas?
My bad, it works nicely using a + b ~ c + d. I thought the function did not accept this input (as it crashed many times), but there must have been another problem, which is gone now that I have cleaned it all up.
nn <- neuralnet(as.formula(a + b ~ c + d),
data=Z, hidden=c(3,2), err.fct="sse", act.fct=custom,
linear.output=TRUE, rep=5)
Works beautifully and returns two point (or two column) output! Neat.
Examples from neuralnet, the format works :)
AND <- c(rep(0,7),1)
OR <- c(0,rep(1,7))
binary.data <- data.frame(expand.grid(c(0,1), c(0,1), c(0,1)), AND, OR)
print(net <- neuralnet(AND+OR~Var1+Var2+Var3, binary.data, hidden=0,
rep=10, err.fct="ce", linear.output=FALSE))
I want to calculate the differential response of y to x (continuous) depending on the categorical variable z.
In the standard lm setup:
lm(y~ x:z)
However, I want to do this while allowing for Impulse Indicator Saturation (IIS) in the 'gets' package. However, the following syntax produces an error:
isat(y, mxreg=x:z, iis=TRUE)
The error message is of the form:
"Error in solve.qr(out, tol = tol, LAPACK = LAPACK) :
singular matrix 'a' in 'solve'
1: In x:z :
numerical expression has 96 elements: only the first used
2: In x:z :
numerical expression has 96 elements: only the first used"
How should I modify the syntax?
Thank you!
At the moment, alas, isat doesn't provide the same functionality as lm on categorical/character variables, nor on using * and :. We hope to address that in a future release.
In the meantime you'll have to create distinct variables in your dataset representing the interaction. I guess something like the following...
library(gets)
N <- 100
x <- rnorm(N)
z <- c(rep("A",N/4),rep("B",N/4),rep("C",N/4),rep("D",N/4))
e <- rnorm(N)
y <- 0.5*x*as.numeric(z=="A") + 1.5*x*as.numeric(z=="B") - 0.75*x*as.numeric(z=="C") + 5*x*as.numeric(z=="D") + e
lm.reg <- lm(y ~ x:z)
arx.reg.0 <- arx(y,mxreg=x:z)
data <- data.frame(y,x,z,stringsAsFactors=FALSE)
# one interaction column per category: x where z equals that category, 0 elsewhere
for(i in unique(z)) {
data[[paste("Zx",i,sep=".")]] <- data$x * as.numeric(data$z==i)
}
arx.reg.1 <- arx(data$y,mxreg=data[,c("x","Zx.A","Zx.B","Zx.C")])
isat.1 <- isat(data$y,mc=TRUE,mxreg=data[,c("x","Zx.A","Zx.B","Zx.C")],max.block.size=20)
Note that as you'll be creating dummies for each category, there's a chance those dummies will cause singularity of your matrix of explanatory variables (if, as in my example, isat automatically uses 4 blocks). Using the argument max.block.size enables you to avoid this problem.
Let me know if I haven't addressed your particular point.
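The interaction columns can also be built without a loop. The sketch below uses only base R's model.matrix; the simulated x and z mirror the example above, and the name M and the "Zx." column labels are chosen here to match the loop's naming, not anything required by gets.

```r
set.seed(1)
N <- 100
x <- rnorm(N)
z <- factor(rep(c("A", "B", "C", "D"), each = N / 4))

# One column per level of z, holding x where z equals that level and 0 elsewhere
M <- model.matrix(~ 0 + x:z)
colnames(M) <- paste0("Zx.", levels(z))

head(M)
# The four columns add up to x, which is why x and all four
# interaction columns cannot be used as regressors together
all.equal(unname(rowSums(M)), x)
```

The resulting columns can then be combined in the same way as in the answer above, for example x together with Zx.A, Zx.B and Zx.C.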
Good day,
I have tried to figure this out, but I really can't!! I'll supply an example of my data in R:
x <- c(36,71,106,142,175,210,246,288,357)
y <- c(19.6,20.9,19.8,21.2,17.6,23.6,20.4,18.9,17.2)
table <- data.frame(x,y)
library(nlmrt)
curve <- "y~ a + b*exp(-0.01*x) + (c*x)"
ones <- list(a=1, b=1, c=1)
Then I use wrapnls to fit the curve and to find a solution:
solve <- wrapnls(curve, data=table, start=ones, trace=FALSE)
This is all fine and works for me. Then, using the following, I obtain a prediction of y for each of the x values:
predict(solve)
But how do I find the prediction of y for new x values? For instance:
new_x <- c(10, 30, 50, 70)
I have tried:
predict(solve, new_x)
predict(solve, 10)
It just gives the same output as:
predict(solve)
I really hope someone can help! I know that if I take the values of parameters a, b, and c from 'solve' and substitute them into the curve formula with the desired x value, I would be able to do this, but I'm wondering if there is a simpler option. Also, without plotting the data first.
Predict requires the new data to be a data.frame with column names that match the variable names used in your model (whether your model has one or many variables). All you need to do is use
predict(solve, data.frame(x=new_x))
# [1] 18.30066 19.21600 19.88409 20.34973
And that will give you a prediction for just those 4 values. It's somewhat unfortunate that any mistake in specifying the new data results in the fitted values for the original model being returned. An error message probably would have been more useful, but oh well.
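A quick way to convince yourself that the data.frame version does the right thing is to compare it with the manual substitution mentioned in the question. The sketch below uses base R's nls instead of wrapnls so that it runs without nlmrt, on the assumption that the object wrapnls returns behaves like an nls fit for predict; the names dat, fit, pred and manual are chosen here to avoid masking base functions such as table and solve.

```r
x <- c(36, 71, 106, 142, 175, 210, 246, 288, 357)
y <- c(19.6, 20.9, 19.8, 21.2, 17.6, 23.6, 20.4, 18.9, 17.2)
dat <- data.frame(x, y)

# Same model as in the question, fitted with base nls
fit <- nls(y ~ a + b * exp(-0.01 * x) + c * x, data = dat,
           start = list(a = 1, b = 1, c = 1))

new_x <- c(10, 30, 50, 70)

# The column name in newdata must match the variable name used in the formula
pred <- predict(fit, newdata = data.frame(x = new_x))

# Manual substitution of the estimated parameters for comparison
co     <- coef(fit)
manual <- co[["a"]] + co[["b"]] * exp(-0.01 * new_x) + co[["c"]] * new_x

all.equal(as.numeric(pred), manual)
```

Checking length(pred) against length(new_x) is a cheap guard against the silent fallback to the original fitted values.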