I want to perform a regression in a fixed-effects model. To construct such a model, I have multiple FLAGs, like the following:
y ~ x + z + FlagYear1 + FlagYear2 + FlagYear3 + FlagCountry1 + FlagCountry2
I want to perform another regression in which I have fixed effects for Year * Country, so that the model will be equal to this
y ~ x + z + FlagYear1Country1 + FlagYear1Country2 + FlagYear2Country1 + FlagYear2Country2 + FlagYear3Country1 + FlagYear3Country2
Since I have 26 countries and 8 years in my model, it would be very time-consuming to construct all the FLAGs manually. I know there is a command to do this automatically in Stata; how can I do the same in R?
If by 'FLAG' you are referring to 0/1-coded indicator variables (dummy variables), then R has an easy way to enter all of these interactions into a formula.
If you have factor variables country with 26 levels and year with 8 levels then you can use
y ~ x + z + country*year
and this will expand the factors into every combination of country and year.
Look at the documentation for formula to understand how this works.
If you already have the indicator variables then you could use
y ~ x + z + (FlagYear1 + FlagYear2 + FlagYear3) * (FlagCountry1 + FlagCountry2)
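A minimal sketch of the factor-based version, assuming a hypothetical data frame `df` with numeric y, x, z and factor columns country and year (names and simulated values are my own, not from the question):

```r
# Simulated stand-in for the questioner's panel data
set.seed(1)
df <- data.frame(
  y = rnorm(100), x = rnorm(100), z = rnorm(100),
  country = factor(sample(paste0("C", 1:26), 100, replace = TRUE)),
  year    = factor(sample(2001:2008, 100, replace = TRUE))
)

# Main effects plus every country x year interaction dummy:
fit <- lm(y ~ x + z + country * year, data = df)

# If you want only the combined Year*Country fixed effects,
# without separate country and year main effects, use `:`
fit2 <- lm(y ~ x + z + country:year, data = df)
```

With 26 countries and 8 years, `country:year` expands to up to 208 cells, so cells with no observations will show up as NA coefficients.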
I have this simple linear model:
Y = β0 + β1*X + ε
The layout of the data is given below (n is the number of rows; the model has only one slope coefficient, β1):
X Y
X1 Y1
X2 Y2
. .
Xn Yn
So my desired matrix would be:
X Y
X1 Y1
My question is: I need to translate the model Y = β0 + β1*X + ε into matrix form in R. I don't have any actual data to insert; I just want to express the model itself as a matrix. How would I do this in R? I've built matrices from a dataset before, but the lack of data here is throwing me off.
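One way to see the matrix form without real data is to build the design matrix from the formula with placeholder values. A minimal sketch (the X values below are arbitrary stand-ins, not data from the question):

```r
# The model Y = b0 + b1*X + e corresponds to the design matrix [1, X]:
# a column of ones for the intercept next to the X column.
X  <- c(1.2, 3.4, 5.6)       # placeholder X values (assumption)
mm <- model.matrix(~ X)      # a 3 x 2 matrix: "(Intercept)" and "X"
mm
```

`model.matrix()` is also what `lm()` builds internally from a formula, so this shows exactly the matrix the regression would use.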
I have the following fitted model w/o restriction:
reg <- lm(y ~ indi_x + x + inter)
where indi_x = indicator variable for x > 14 and inter = interaction variable for indi_x and x.
I want to impose the restriction that indi_x + (inter * 14) = 0 to fit the two segments at x = 14. I've been using the I() function within lm but am not getting the output I want.
Thanks!
If I understand correctly, you have two slopes that are joined at x = 14, and you want to infer the individual slopes (and possibly the common intercept?).
This would do it:
reg <- lm(y ~ 1 + x + I(pmax(x - 14, 0)))
The pmax() term is zero for x <= 14, which keeps the two segments joined at x = 14. Note that its coefficient is the change in slope, so the absolute slope of the second segment is the slope on x plus that coefficient.
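A quick check of the joined-segments idea with simulated data (slopes and noise level are my own choices; the pmax() form keeps the fit continuous at the breakpoint):

```r
# Two segments joined at x = 14: slope 2 on the left,
# slope 2 + 3 = 5 on the right.
set.seed(42)
x <- runif(200, 0, 30)
y <- 1 + 2 * x + 3 * pmax(x - 14, 0) + rnorm(200, sd = 0.1)

reg <- lm(y ~ x + I(pmax(x - 14, 0)))
coef(reg)            # roughly: intercept 1, slope 2, slope change 3
sum(coef(reg)[2:3])  # absolute slope of the second segment, about 5
```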
I have a problem fitting a curve to a 3D point set (point cloud). The curve-fitting tools I have looked at mostly fit a surface when given a point set [x,y,z], but that is not what I want: I would like to fit a curve to the point set, not a surface.
So please help me: what is the best solution for curve fitting in 3D space?
In particular, my data looks like a polynomial curve in 3D.
Equation is
z ~ ax^2 + bxy + cy^2 + d
and there is not any pre-estimated coefficients [a,b,c,d].
Thanks.
xyz <- read.table( text="x y z
518315,750 4328698,260 101,139
518315,429 4328699,830 101,120
518315,570 4328700,659 101,139
518315,350 4328702,050 101,180
518315,389 4328702,849 101,190
518315,239 4328704,020 101,430", header=TRUE, dec=",")
With a bit of data we can now demonstrate a rather hackish effort in the direction you suggest, although this really is estimating a surface, despite your best efforts to convince us otherwise:
xyz <- read.table(text="x y z
518315,750 4328698,260 101,139
518315,429 4328699,830 101,120
518315,570 4328700,659 101,139
518315,350 4328702,050 101,180
518315,389 4328702,849 101,190
518315,239 4328704,020 101,430", header=TRUE, dec=",")
lm( z ~ I(x^2)+I(x*y) + I(y^2), data=xyz)
#---------------
Call:
lm(formula = z ~ I(x^2) + I(x * y) + I(y^2), data = xyz)
Coefficients:
(Intercept) I(x^2) I(x * y) I(y^2)
-1.182e+05 -3.187e-07 9.089e-08 NA
The collinearity of x^2 and x*y with y^2 is preventing an estimate of the y^2 coefficient: with these large, nearly constant coordinates, the three quadratic terms are numerically collinear. You can also use nls to estimate parameters for non-linear surfaces.
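One sketch of a workaround for that collinearity (my own suggestion, not from the answer): center x and y before forming the quadratic terms, so the huge constant offsets (~5e5 and ~4.3e6) no longer dominate and all three coefficients can be estimated.

```r
xyz <- read.table(text="x y z
518315,750 4328698,260 101,139
518315,429 4328699,830 101,120
518315,570 4328700,659 101,139
518315,350 4328702,050 101,180
518315,389 4328702,849 101,190
518315,239 4328704,020 101,430", header=TRUE, dec=",")

# Center the coordinates so the quadratic terms are not
# numerically collinear with the intercept and each other:
xyz$xc <- xyz$x - mean(xyz$x)
xyz$yc <- xyz$y - mean(xyz$y)

fit <- lm(z ~ I(xc^2) + I(xc * yc) + I(yc^2), data = xyz)
coef(fit)  # all four coefficients now estimated (no NA)
</imports>
```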
I suppose that you want to fit a parametrized curve of this type:
r(t) = a + bt + ct^2
Therefore, you will have to do three independent fits:
x = ax + bx*t + cx*t^2
y = ay + by*t + cy*t^2
z = az + bz*t + cz*t^2
and obtain nine fitting parameters ax, ay, az, bx, by, bz, cx, cy, cz. Your data contains the positions x, y, z, and you also need to supply the time variable t = 1, 2, 3, ..., n, assuming that the points are sampled at equal time intervals.
If the 'time' parameter of your data points is unknown/random, then I suppose that you will have to estimate it yourself as another fitting parameter, one per data point. So what I suggest is the following:
1. Assume some reasonable parameters a, b, c.
2. Write a function which calculates the time t_i of each data point by minimizing the squared distance between that point and the tentative curve r(t).
3. Calculate the sum of all (r(t_i) - R_i)^2 between the curve and your dataset R. This will be your fitting score, or figure of merit.
4. Use a genetic algorithm routine, such as Matlab's ga() (R's GA package offers an equivalent ga() function), to obtain the a, b, c that minimize the figure of merit defined above.
Good luck!
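Since the question is about R, here is a minimal sketch of the equal-time-interval case: three independent quadratic fits of x, y, and z against t. The assumption t = 1..n (points sampled at equal intervals, in order) is mine.

```r
xyz <- read.table(text="x y z
518315,750 4328698,260 101,139
518315,429 4328699,830 101,120
518315,570 4328700,659 101,139
518315,350 4328702,050 101,180
518315,389 4328702,849 101,190
518315,239 4328704,020 101,430", header=TRUE, dec=",")

t <- seq_len(nrow(xyz))  # assumed equal time intervals

# One quadratic-in-t fit per coordinate:
fx <- lm(x ~ poly(t, 2, raw = TRUE), data = xyz)
fy <- lm(y ~ poly(t, 2, raw = TRUE), data = xyz)
fz <- lm(z ~ poly(t, 2, raw = TRUE), data = xyz)

# The nine parameters (ax, bx, cx, ay, by, cy, az, bz, cz):
rbind(x = coef(fx), y = coef(fy), z = coef(fz))
```

Evaluating the three fitted polynomials at a common t then traces the fitted curve r(t) through space.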
I am quite new to R and I am having trouble figuring out how to select variables in a multivariate linear regression in R.
Pretend I have the following formulas:
P = aX + bY
Q = cZ + bY
I have a data frame with columns P, Q, X, Y, Z, and I need to find a, b and c.
If I do a simple multivariate regression:
result <- lm( cbind( P, Q ) ~ X + Y + Z - 1 )
It estimates a coefficient for Z in P's equation and for X in Q's equation, which my model does not include.
If I run the two regressions individually, then "b" comes out different in each regression.
How can I select the variables to consider in a multivariate regression?
Thank you,
Edson
P = aX + bY;
Q = cZ + bY
In lavaan you could do it by adding an equality constraint, i.e. giving the two parameters the same custom name:
P ~ X + b*Y
Q ~ Z + b*Y
See also http://lavaan.ugent.be/tutorial/syntax2.html
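If you prefer to stay in base R, the same equality constraint can be imposed by stacking the two equations into one lm() fit. A sketch with simulated data (the true values a = 2, b = 3, c = 4 and the helper columns Xs, Zs, Ys are my own; note this approach assumes equal error variance in the two equations):

```r
set.seed(1)
n <- 200
X <- rnorm(n); Y <- rnorm(n); Z <- rnorm(n)
P <- 2 * X + 3 * Y + rnorm(n, sd = 0.1)  # a = 2, b = 3
Q <- 4 * Z + 3 * Y + rnorm(n, sd = 0.1)  # c = 4, same b = 3

stacked <- data.frame(
  resp = c(P, Q),
  Xs   = c(X, rep(0, n)),  # X enters only the P equation
  Zs   = c(rep(0, n), Z),  # Z enters only the Q equation
  Ys   = c(Y, Y)           # Y shared across both: one coefficient b
)

fit <- lm(resp ~ Xs + Zs + Ys - 1, data = stacked)
coef(fit)  # approximately a = 2, c = 4, b = 3
```

Zeroing out X in the Q rows and Z in the P rows restricts each to its own equation, while the shared Ys column forces a single b for both.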