Obtain spline surface on R - r

How do I generate a b-spline surface, let's say:
x=attitude$rating
y=attitude$complaints
z=attitude$privileges
would be x and y for the spline basis. z is the set of control points.

If I understand you, you have x,y, and z data and you want to use bivariate spline interpolation on x and y, using z for the control points. You can do this with interp(...) in the akima package.
library(akima)
spline <- interp(x,y,z,linear=FALSE)
# rotatable 3D plot of points and spline surface
library(rgl)
open3d(scale=c(1/diff(range(x)),1/diff(range(y)),1/diff(range(z))))
with(spline,surface3d(x,y,z,alpha=.2))
points3d(x,y,z)
title3d(xlab="rating",ylab="complaints",zlab="privileges")
axes3d()
The plot itself is fairly uninteresting with your dataset because x, y, and x are highly correlated.
EDIT response to OP's comment.
If you want a b-spline surface, try out mba.surf(...) in the unfortunately named MBA package.
library(MBA)
spline <- mba.surf(data.frame(x,y,z),100,100)
library(rgl)
open3d(scale=c(1/diff(range(x)),1/diff(range(y)),1/diff(range(z))))
with(spline$xyz,surface3d(x,y,z,alpha=.2))
points3d(x,y,z)
title3d(xlab="rating",ylab="complaints",zlab="privileges")
axes3d()

require(rms) # Harrell's gift to the R world.
# Better to keep the original names and do so within a dataframe.
att <- attitude[c('rating','complaints','privileges')]
add <- datadist(att) # records ranges and descriptive info on data
options(datadist="add") # need these for the rms functions
# rms-`ols` function (ordinary least squares) is a version of `lm`
mdl <- ols( privileges ~ rcs(rating,4)*rcs(complaints,4) ,data=att)
# Predict is an rms function that works with rms's particular classes
pred <- Predict(mdl, 'rating','complaints')
# bplot calls lattice functions; levelplot by default; this gives a "3d" plot
bplot(pred, yhat~rating+complaints, lfun=wireframe)
It's a crossed restricted-cubic spline model. If you have a favorite spline function you want to use instead, then by all means try it out. I've had good luck with the rcs- function.
This gives a more open mesh with fewer calculated points:
pred <- Predict(mdl, 'rating','complaints', np=25)
bplot(pred, yhat~rating+complaints, lfun=wireframe)
png()
bplot(pred, yhat~rating+complaints, lfun=wireframe)
dev.off()
You could use the rgl methods being illustrated by jhoward. The top of str(pred) looks like:
str(pred)
Classes ‘Predict’ and 'data.frame': 625 obs. of 5 variables:
$ rating : num 43 44.6 46.2 47.8 49.4 ...
$ complaints: num 45 45 45 45 45 ...
$ yhat : num 39.9 39.5 39.1 38.7 38.3 ...
$ lower : num 28 28.3 27.3 25 22 ...
$ upper : num 51.7 50.6 50.9 52.4 54.6 ...
snipped
library(rgl)
open3d()
with(pred, surface3d(unique(rating),unique(complaints),yhat,alpha=.2))
with(att, points3d(rating,complaints,privileges, col="red"))
title3d(xlab="rating",ylab="complaints",zlab="privileges")
axes3d()
aspect3d(x=1,z=.05)
Good illustration of the dangers of extrapolation once you realize there are no data out on the extremes of inappropriate extrapolations from that model. The rms-package has a perimeter function and the plotting functions have a perim argument to which perimeter-objects are passed.

Related

Plotting mixture of univariate normal distributions using the R package EMCluster

I would like to visualize a mixture of univariate normal distributions fitted by the EMCluster R package. I would like to get a set of Gaussian curves plotted for each group separately onto a histogram of my data like this:
However, there does not seem to be any plotting function in EMCluster doing that, and the EMCluster help files do not cover this topic. I am aware of other R packages having this functionality (e.g., mixtools or mclust), but these do not do the fitting job right unlike EMCluster, which as the only one fits the distribution correctly.
This is is the code used to fit the model:
# input data
ratio <- c(2.3082202,2.2682008,2.3430013,2.3131582,2.3442648,2.2866254,2.4438874,2.4360583,2.4907970,2.4785809,2.4368449,2.4589041,2.2404580,2.2568378,2.2305135,2.2316156,2.1975250,2.2003426,2.3995671,2.2367229,2.2380535,2.2695250,2.2795190,2.2804133,2.2873157,2.3025447,2.2981834,2.2774566,2.1899404,2.2675393,2.2863328,2.2702749,2.2173223,2.1615549,2.3052489,1.4739972,1.4703164,1.8989637,1.6309663,1.4742799,1.6040551,1.5125876,2.3090968,2.4032732,2.2939723,2.4932422,2.4572377,2.2887470,2.2541456,2.3912709,2.3709839,2.2961881,2.3252021,2.4311603,2.4526981,2.2712559,2.4556190,2.4480402,2.2345277,2.2832188,2.3259353,2.3486292,2.3477749,2.3160682,2.3025502,2.3645101,2.2972784,2.3914385,2.4182051,2.3144094,2.3182206,2.3032635,2.3304741,2.4386540,2.4750668,1.5558920,1.5520053,1.5679544,1.5674089,1.5318896,1.5378909,1.5780276,1.5884973,1.5535807,2.4912484,1.5725149,1.6011670,1.5669198,1.5581934,1.5688439,1.5578162,1.5593121,1.5563160,1.5468341,1.5433628,1.5141012,1.5842708,1.5632946,1.5233117,1.5867471,1.5010637,1.5865281,1.5836973,1.6140125,1.6285195,1.5592994,1.5584742,1.6113194,1.6074361,1.5850861,1.5612799,1.5905453,2.3874244,2.4045643,2.3996815,2.3605345,2.3346210,2.3865179,2.4406780,2.3688209,2.3649233,2.3668617,2.4675781,2.4767129,2.4664701,2.3560843,2.5487013,2.4604951,2.5133258,2.4387729,1.4955564,1.5082731,1.5440476,1.5911176,2.4114691,2.4188795,2.4320730,2.4896641,2.4439351,2.4726592,2.4819837,2.4053318,2.4694447,2.3929463,2.3948703,2.3993741,2.4535933,2.4556870,2.3877090,2.4466891,2.4426443,2.3879938,2.3402072,2.3448416,2.4782167,2.5233350,1.5917363,1.6355997,1.5816622,1.6164543,1.5466306,1.5081628,1.4992875,1.6306420,1.6762845,1.6866838,2.4210023,2.4406259,2.3720587,2.4113856,2.3190864,2.4250728,2.4457677,2.4429783,2.4122941,2.4424428,2.4527037,2.4966437,2.4035152,2.4137089,2.3954934,2.4212645,2.4087689,2.4105695,2.5525013,1.5290941,1.5728092,1.5543364,1.5641066,1.6006943,1.5889370,1.5321614,1.5586370,1.5415335,1.5491241,1.5541842,1.5852345,1.5902462,1.5801461,1.5930383,1.5523744,1.5575027,1.5546879,1.5726514,1.5938686,1.6222843,1.5881920,1.5889394,1.6246987,1.5791531,1.6059873,1.5860296,1.5701796,1.5763713,1.6315066,1.6206847,1.5685662,1.5892450,1.6186198,1.6242586,1.5481500,2.5555498,2.5258320,1.6016010,1.6078024,1.7627993,2.5282046,2.5283876,1.6267890,1.6268478,1.6069182,1.5574029,1.5672845,1.6389830,1.5810995,1.5809019,1.5621111,1.6275267,1.5630592,1.5361265,1.5750796,1.5447555,1.5755329,1.5092036,1.5607702,1.5812295,1.6181003,1.6067561,1.6846355,1.6251237,1.7229672,1.6319477,2.0122061,1.6008821,1.4929003,1.5977780,1.6336924,1.7014357,1.7084902,1.5106016,1.6115451,1.5218875,1.5394796,1.5841880,1.6021178,1.6286143,1.6322306,1.6479547,1.6051730,1.6903781,1.5605062,1.6159015,1.5648148,1.6410461,1.5526851,1.5951099,1.6702494,1.6863064,1.5488679,1.5722265,1.5789145,2.3835058)
ratio <- c(ratio,2.4312322,2.4034768,2.4384596,2.4436314,2.4303576,2.4266296,2.3727400,2.3769620,1.4916100,2.0513010,2.0537072,1.6218816,1.6315519,1.6242272,1.5328026,2.4791800,2.4943584,1.5233619,1.5520066,1.5211411,1.5946195,1.5954327,1.5099963,1.6025225,1.4824897,1.5543129,1.6041963,1.9676651,1.8662227,1.9669943,1.5584472,1.5664382,1.5551137,1.5563330,1.5637281,1.6058321,2.3605461,1.4467948,2.4481076,2.2974519,2.4029441,2.4499736,2.4637541,2.4147364,2.4483680,2.4730025,2.4344831,2.4246536,2.3520329,2.3551388,2.4119589,2.4621961,2.3978363,2.4289587,2.4244460,2.3955454,2.4106017,2.4766059,2.5505131,2.5904964,2.4405065,2.5502677,2.5418027,2.5946520,2.3462233,2.3910095,2.3808582,2.4900000,2.4929675,2.5957481,2.6473263,2.6821399,2.5863157,2.5706163,2.5873251,2.3470114,2.3549401,2.4447632,2.3895783,2.4598023,1.9194875,1.9098258,1.5563858,1.9655190,1.9843544,1.5075750,1.9197088,1.8606831,1.5003864,1.5420225,1.5832182,1.4812740,1.5309472,1.5425323,1.6757806,1.6147225,1.5449271,1.6736108,1.6062912,1.5309035,1.5375653,1.5230470,1.4537051,1.5208274,1.6269986,1.6722362,1.5627355,1.5736269,1.5597532,1.6075006,1.5912113,1.4857339,1.4865164,1.5915908,1.5626763,1.6001816,1.5838017,1.6547792,1.6521739,1.5541260,1.5589220,1.6041114,1.6052877,1.6188839,1.5851005,1.5210468,1.5914289,1.5612923,1.5543678,1.6405511,1.5491183,1.5760101,1.5988843,1.5453074,1.5629486,2.3506363,2.3900338,2.3692994,2.3774440,2.3778159,1.4801504,1.5272571,1.5429716,1.5863007,1.5595683,1.5373692,1.5547804,1.6523496,1.6553747,2.4441022,2.3405286,2.4331678,2.4992773,2.4313793,2.3579883,2.4041657,2.3764722,2.3899039,2.4933925,2.4881635,2.3814493,2.4341179,2.5575816,2.4556496,2.5024020,2.4669698,2.4933790,2.5047439,2.4975195,2.4015712,2.4469022,2.4567287,2.5280128,2.4781728,2.5028472,2.5189290,2.5043568,2.4434804,2.4769850,2.4082988,2.4613414,2.3871942,2.4679938,2.4711610,2.3716282,2.4665808,2.4719631,2.4671665,2.3450332,2.3711461,2.3567956,2.4531954,2.4627118,2.4688380,2.4264698,2.4634535,2.4754286,2.4152280,2.4200143,2.4366610,2.4748473,2.3998817,2.4435630,2.4505969,2.4825692,2.4722832,2.3908582,2.5265103,2.4205017,2.4975109,1.8273926,1.8191600,1.7649152,1.8441525,1.8362518,1.7798777,1.7749614,1.8514624,1.8231416,1.7980675,1.8139444,1.7813007,1.8112133,1.8506497,1.7888465,1.7981612,1.7877846,1.8448715,1.8428863,1.7778798,1.8004113,1.8293173,1.8581288,0.7568285,0.7681167,0.7569872,0.7478473,0.7586729,0.7574590,0.7707119,0.7654296,0.7560802,0.7223322,0.7634792,0.7577954,0.7630233,0.7439842,0.7501500,0.7666750,0.7503895,0.7622286,0.7635520,0.7534987,0.7599475,0.7748956,0.7590139,0.7568013,0.7586510,0.7564873,0.7576587,0.7756744,0.7637029,0.7579558,0.7735586,0.7692345,0.7611547,0.7551704,0.7715118,0.7736083,0.7635049,0.7559560,0.7667778,0.7618454,0.7648308,0.7693512,0.7653699,0.7589299,2.3590999,2.3619440,2.3519471,2.3453709,2.3486040,2.4295357,2.4219361,2.4488210,2.4213269,2.3923218,1.5402144,0.7652187,0.7650076,0.7486516,0.7643569,0.7677992,0.7586368,0.7584026,0.7662775,0.7740569,0.7713377,0.7740677,0.7686109,0.7490695,0.7681402,0.7639513,0.7896797,0.7660664,0.7720809,2.4936417,2.3834268,0.7626155,0.7566950,0.7824697,0.7735270,2.4219853,2.5196019,0.7636966,0.7570312,2.4539218,2.4338622,2.4396830,0.7733773,0.7632390,0.7697633,0.7677199,0.7563686,0.7569366,0.7785563,2.3883388,2.4508850,2.5454545,2.4078138,2.4295884,2.5461211,2.3826057,2.4070722,2.4136287,2.4216196,2.3854115,2.4173648,2.4553045,2.4356992,2.3863270,2.4346182,2.3953523,2.4102911,2.4179386,2.4251441,2.4217263,2.4586260,2.4059818,2.4069762,2.4099267,2.4387039,2.4458400,2.4573490,2.4759171,2.4772782,2.4684424,2.4434910,2.4804417,2.4192554,2.4454277,2.3955812)
ratio <- c(ratio,2.4749573,2.4724740,2.4209932,2.4249042,0.7610100,2.3671184,2.3697960,2.4023920,2.4053188,2.4351452,2.4386339,2.4094596,2.3468609,2.3155974,2.3104149,2.3627904,2.2658038,2.3885135,2.3038569,2.3350929,2.3612206,2.2781386,2.2077955,2.2325030,2.2679324,2.2823724,2.4542063,2.1998876,2.3031394,2.2881462,2.3198411,2.1884115,2.2710756,2.1827032,2.3180018,2.2993929,2.2874717,2.2829849,2.3455578,2.2977960,2.3410225,2.4081727,2.3638018,2.2795117,2.3976837,2.3932511,2.4075917,2.4015616,2.3418264,2.3349013,2.4240838,2.4173924,2.3960085,2.4031925,2.3945173,2.3856273,2.4690719,2.4686576,2.6263546,2.4718199,2.3463103,2.3652738,2.4379702,2.4796148,2.4287476,2.2620077,2.3805959,2.3458551,2.3461876,2.3171412,2.3897433,2.3912464,2.3614724,2.5442086,1.7773526,1.7178727,2.3689272,2.3640495,2.2916104,2.3356112,2.3964407,2.4015147,1.7590361,2.3414859,2.2694388,2.3567392,2.2708003,2.2956377,2.3101355,2.3125120,2.3004763,2.2705803,2.2875544,2.2712565,2.3089819,2.2945688,2.3205069,2.2732516,1.7467849,1.7259182,1.7686619,1.7095978,2.2485174,2.4754472,2.4495703,2.4074395,2.4539152,2.3435356,2.3811995,2.4362843,2.4403479,2.5180754,0.7635700,0.7702974,0.7722231,0.7511177,0.7551106,0.7601832,0.7361963,0.7794874,0.7934396,0.7709755,0.7844490,0.7596845,0.7930613,0.7504008,0.7627609,0.7532478,0.7581377,0.7705301,0.7721196,0.7567074,0.7828257,0.7780374,0.7669904,0.7812995,0.7834886,0.7889141,0.7859603,0.7682853,0.7478378,0.7673373,0.7629394,0.7715725,0.7791851,0.7666474,0.7714516,0.7496962,0.7533639,0.7579321,0.7737278,0.7853168,0.7645964,0.7725103,0.7677113,1.6156064,1.5390375,1.5211357,1.5385035,1.4920242,1.6010156,1.5269279,1.5407954,1.4766077,1.5165483,2.4101888,2.4461818,2.4832620,2.4234102,2.3891139,2.4782895,2.4514298,2.4655178,1.5269570,1.5543206,1.5397302,1.5121552,1.5089835,1.5259734,1.5593889,1.5634023,1.5243902,1.5291230,1.5649990,2.3870746,2.4165985,2.4108952,2.5178706,2.4626444,2.4056250,2.4241015,2.2767663,2.1800944,2.2449665,2.2155462,2.2102232,2.1888782,2.1780130,2.1797660,2.2164432,2.1815838,2.2745605,2.2695021,2.1916002,2.1620086,2.1604276,2.1606576,2.1582508,2.1380764,2.1869745,2.1967953,2.1466257,2.1625341,2.2056076,2.1491951,2.2865496,2.2079291,2.1408957,2.1638916,2.5895003,2.1258483,2.1444175,2.2055520,2.1909330,2.2072956,2.2051386,2.4054279,2.4282109,2.4143596,2.4389768,2.4716971,2.4614070,2.4076957,2.4289880,2.4710598,2.4395144,2.4251923,2.4528039,2.4597420,2.4047160,2.4214961,2.4052775,2.4002135,2.3944374,2.4051049,2.4450801,1.5737999,1.5879028,1.4740943,1.5251975,1.5084994,1.5176669,1.5827066,1.8177106,2.4441547,2.4143742,2.4699203,2.3762657,2.3688625,2.4603598,2.4191691,2.4546725,2.4275861,2.4220888,2.4021738,2.4979091,2.4092485,2.4321795,2.4349427,2.4462136,2.4662561,2.3542403,2.4312644,2.4911342,2.4228929,2.4431390,2.4214492,2.4497284,2.4559866,2.4404771,2.4329174,2.4118930,2.4634206,2.4609880)
ratio <- c(ratio,2.4827362,2.3897923,2.4143036,2.4375662,2.4641043,2.4222062,2.4024445,2.4035411,2.4564304,2.4355113,2.3968869,2.4147049,2.4618251,2.4281002,2.3793209,2.4370193,2.3959920,2.3882394,2.4338975,2.4221606,2.4012736,2.4163148,2.4014352,2.3909576,2.4497283,2.3379209,2.2898380,2.3337816,2.2496425,2.3140063,2.2480368,2.3382981,2.4370101,2.3976064,1.5515221,1.6095634,1.5416116,1.5449938,1.5357950,1.6159637,2.5099319,2.4895688,2.5469330,2.4921635,2.4849558,2.5936576,2.4149410,2.3313097,2.3891210,2.3958461,2.4481496,2.3889537,2.3672844,2.3638062,2.3213211,2.2783695,2.3841916,2.3915916,2.4455549,2.3920387,2.3327502,2.3814912,2.3998684,2.5413619,2.4137805,2.4694112,2.2786062,2.3302104,2.3343673,2.3937296,2.3613661,2.4505636,2.4254633,2.5044136,2.4344560,2.3947676,2.3527630,2.3999480,2.3888254,2.4736389,2.4171134,2.4438647,2.2072640,2.4066542,2.4867753,2.4228284,2.4383648,2.4733934,2.3856568,2.3750662,2.2850762,2.3953039,2.4272293,2.4479172,2.4583161,2.4372287,2.4391820,2.4068650,2.3923539,2.4295108,2.4074787,2.4039005,2.4916317,2.4349603,2.4786877,2.4942217,2.2279876,2.4491045,2.2402399,2.3116072,2.3713333,2.3000641,2.2848940,2.4239510,2.4702209,2.4698362,2.4101398,2.3967287,2.4586776,2.4489123,2.4533943,2.4532398,2.4235131,2.4177043,2.4007880,2.4442298,2.4343659,2.4388599,2.3495992,2.4192377,2.3928401)
library(EMCluster)
ret <- init.EM(as.data.frame(ratio),nclass=6,method="Rnd.EM") # fitting model
# now how to plot the distribution curves?
Thank you in advance for your advice.
EDIT:
The structure of the resulting object is this:
> str(ret,vec.len = 10)
List of 13
$ pi : num [1:6] 0.0537 0.234 0.299 0.1124 0.2225 0.0784
$ Mu : num [1:6, 1] 1.794 1.568 2.429 0.765 2.316 2.474
$ LTSigma : num [1:6, 1] 0.011002 0.001946 0.001355 0.000124 0.008215 0.006486
$ llhdval : num 435
$ nc : int [1:6] 52 252 406 119 195 35
$ class : num [1:1059] 5 5 5 5 5 5 3 3 3 3 3 3 5 5 5 5 5 5 3 5 5 5 5 5 5 ...
$ conv.iter: int 93
$ conv.eps : num 9.81e-07
$ flag : num 0
$ n : int 1059
$ p : int 1
$ nclass : num 6
$ method : chr "Rnd.EM"
- attr(*, "class")= chr "emret"
This is pretty straightforward. There's a little bit of calculation involved in getting the heights of the curves to line up reasonably, but otherwise it's pretty basic. First plot a histogram. If you want a box around it, like your example, do that. Then you'll need to call lines() 6 times to plot the 6 normals. In R, lines are just a sequence of interpolated points—(x,y) coordinates—so make a sufficiently fine-grained set of x coordinates, then compute the normal density for each component using dnorm() and the fitted parameters. You'll need to multiply those y-values by the appropriate proportion and a height adjustment factor to get the heights of the curves right. It turns out that the highest bin in your histogram is 82, which is approximately the peak of your third component, but since that represents only 30%, you need to rescale the adjustment factor by that. You may want to choose your own colors. Consider:
xs <- seq(min(ratio), max(ratio), length.out=1000)
windows()
h <- hist(ratio, breaks=seq(0, 3, by=0.02)); box()
# max(h$counts) # [1] 82
height <- 82/dnorm(ret$Mu[3,1], mean=ret$Mu[3,1], sd=sqrt(ret$LTSigma[3,1]))
height <- height/ret$pi[3]
for(i in 1:6){
lines(xs, dnorm(xs, mean=ret$Mu[i,1], sd=sqrt(ret$LTSigma[i,1]))*height*ret$pi[i],
lwd=2, col=i)
}

getting predicted values from an nls() model for Fabens von Bertalanffy growth curve

I have a growth dataset based off of recaptures. There are columns with the capture length, the recapture length, and the time (in yrs) based off the recapture minus the capture.
> str(data)
'data.frame': 60 obs. of 3 variables:
$ sizecapture : num 40.3 43 38.3 41.5 37.6 ...
$ sizerecapture: num 43 48.7 39.5 42 46.7 43.5 43.5 47.2 45.7 59.9 ...
$ timeinterval : num 0.945 1.036 0.997 0.997 2.471 ...
I am following Ogle 2013 vignette (http://derekogle.com/fishR/examples/oldFishRVignettes/VonBertalanffyExtra.pdf) in R, for the Fabens method of trying to derive size at age. For this method I don't need an initial age (as I don't know age at all). I am not interested in extrapolating, but for only estimating the age of individuals that I have sizes for.
I can easily follow the instructions for calculating the two parameters needed to inform the nls model: the k and the Linf. My aim is to create a age at length curve with the growth data, but when I get errors when I try to fitPlot. I get the error "Error in mdl$model[[gpos[2]]] : subscript out of bounds". I have also tried curve() and get the error "Error in FVB1(x) : could not find function "FVB1".
I also can't figure out how to extract the confidence intervals that fit with the predicted data.
I have searched and have found some similar cases but nothing that has worked. I'll continue to research, but am I missing something very basic? Below is a subsample of the data. I'd appreciate any help.
Thank you
install.packages("FSA")
install.packages("FSAdata")
install.packages("nlstools")
install.packages(car)
library(FSA)
library(FSAdata)
library(nlstools)
library(car)
sizecapture <- c(40.30,43.00,38.30,41.50,37.60,41.63,41.80,38.40,40.00,41.20,37.70,41.70,43.70,41.80,42.70,44.60,45.50,44.50,45.60,44.80,47.00,49.20,44.50,45.20,46.40,46.90,49.40,61.00,36.50,42.10,43.90,43.90,46.40,45.50,47.20,64.30,43.00,59.90,39.60,36.80)
sizerecapture < c(43.0,48.7,39.5,42.0,46.7,43.5,43.5,47.2,45.7,59.9,48.1,46.5,45.7,49.1,48.7,47.1,46.9,48.3,47.2,53.7,52.0,51.2,56.2,56.3,57.5,57.7,55.4,74.5,45.6,44.9,46.7,51.0,49.4,58.0,56.8,71.6,43.8,44.6,43.7,41.9)
timeinterval <-c(0.9452055,1.0356164,0.9972603,0.9972603,2.4712329,0.9534247,1.1945205,2.0027397,1.3178082,4.5342466,2.1863014,0.9178082,1.1315068,2.3698630,2.0575342,1.3835616,1.1726027,1.1972603,3.1698630,1.9589041,1.0712329,0.9150685,2.5671233,2.7780822,3.2000000,2.2246575,1.9150685,4.1753425,0.9287671,1.0328767,1.3945205,2.6739726,0.9205479,3.1479452,1.9506849,1.7178082,1.0520548,3.0767123,1.3726027,1.2520548)
data <- data.frame(sizecapture, sizerecapture, timeinterval)
### using Ogle 2013 to calculate Linf and k
# k and Linf
with(data,mean((log(sizerecapture)-log(sizecapture))/timeinterval))
#0.0676`
max(data$sizerecapture) # largest size is 74.5
Fabens.sv <- list(Linf=74.5, K=0.0676)
# declare the model
fvb <- vbFuns("Fabens")
# fit and summarize
FVB1<- nls(sizerecapture~
fvb(sizecapture,timeinterval,Linf,K),start=Fabens.sv,data=data)
summary(FVB1,correlation=TRUE)
# confidence intervals through bootstrapping
boot <- nlsBoot(FVB1, niter=500)
confint(boot,plot=TRUE)
# plotting a fitted line plot
ages2plot <- 0:40
LCI <- UCI <- numeric(length(ages2plot))
fitPlot(FVB1, xlim=range(ages2plot))
estes <- boot$coefboot
for (i in 1:length(ages2plot)) {
pv <-estes[,"Linf"]*(1-exp(-ests[,"K]*(ages2plot)))
LCI[i] <- quantile(pv,0.025)
UCI[i] <- quantile(pv,0.975)
}
lines(UCI~ages2plot,type="1")
lines(LCI~ages2plot,type="1")
# tried to just get a visual and errors arise
curve(FVB1)

Unable to produce values of a kernel density estimator in R

I'm simulating random numbers from the exponential distribution with rate=1. I have plotted a kernel density of the data using the density() function in R. What I want is a function f that gives me the value of the density at any point. I have tried the following code:
n=10^5
x=rexp(n,rate=1)
d=density(x,kernel="gaussian")
f=function(x){d$y[x]}
f(1)
plot(d)
However,
f(1) clearly does not match the value of the density function at the point x=1. Where am I going wrong?
density produces a list which contains, in field x, the coordinates of the points at which the density is evaluated and, in field y, the estimated values of the density at these coordinates:
> str(d)
List of 7
$ x : num [1:512] -0.348 -0.328 -0.307 -0.286 -0.266 ...
$ y : num [1:512] 0.00146 0.00256 0.00435 0.00717 0.01147 ...
......
To get a function from x and y, you can use the approxfun function:
> f <- approxfun(d$x, d$y)
> f(1)
[1] 0.3665273
> dexp(1, rate=1)
[1] 0.3678794
In your code you get d$y[1], which is the first value of d$y.

Any suggestions for how I can plot mixEM type data using ggplot2

I have a sample of 1m records obtained from my original data. (For your reference, you may use this dummy data that may generate approximately similar distribution
b <- data.frame(matrix(rnorm(2000000, mean=c(8,17), sd=2)))
c <- b[sample(nrow(b), 1000000), ]
)
I believed the histogram to be a mixture of two log-normal distributions and I tried to fit the summed distributions using EM algorithm using the following code:
install.packages("mixtools")
lib(mixtools)
#line below returns EM output of type mixEM[] for mixture of normal distributions
c1 <- normalmixEM(c, lambda=NULL, mu=NULL, sigma=NULL)
plot(c1, density=TRUE)
The first plot is a log-likelihood plot and the second (if you hit return again), gives similar to the following density curves:
As I mentioned c1 is of type mixEM[] and plot() function can accommodate that. I want to fill the density curves with colors. This is easy to do using ggplot2() but ggplot2() does not support data of type mixEM[] and throws this message:
ggplot doesn't know how to deal with data of class mixEM
Is there any other approach I can take for this problem?
Look at the structure of the returned object (this should be documented in the help):
> # simple mixture of normals:
> x=c(rnorm(10000,8,2),rnorm(10000,17,4))
> xMix = normalmixEM(x, lambda=NULL, mu=NULL, sigma=NULL)
Now what:
> str(xMix)
List of 9
$ x : num [1:20000] 6.18 9.92 9.07 8.84 9.93 ...
$ lambda : num [1:2] 0.502 0.498
$ mu : num [1:2] 7.99 17.05
$ sigma : num [1:2] 2.03 4.02
$ loglik : num -59877
The lambda, mu, and sigma components define the returned normal densities. You can plot these in ggplot using qplot and stat_function. But first make a function that returns scaled normal densities:
sdnorm =
function(x, mean=0, sd=1, lambda=1){lambda*dnorm(x, mean=mean, sd=sd)}
Then:
qplot(x,geom="density") + stat_function(fun=sdnorm,args=list(mean=xMix$mu[1],sd=xMix$sigma[1], lambda=xMix$lambda[1]),fill="blue",geom="polygon") + stat_function(fun=sdnorm,args=list(mean=xMix$mu[2],sd=xMix$sigma[2], lambda=xMix$lambda[2]),fill="#FF0000",geom="polygon")
Or whatever ggplot skills you have. Transparent colours on the densities might be nice.
ggplot(data.frame(x=x)) +
geom_histogram(aes(x=x,y=..density..),fill="white",color="black") +
stat_function(fun=sdnorm,
args=list(mean=xMix$mu[2],
sd=xMix$sigma[2],
lambda=xMix$lambda[2]),
fill="#FF000080",geom="polygon") +
stat_function(fun=sdnorm,
args=list(mean=xMix$mu[1],
sd=xMix$sigma[1],
lambda=xMix$lambda[1]),
fill="#00FF0080",geom="polygon")
producing:
Here's a slightly different approach which uses geom_ploygon(...) instead of multiple calls to stat_function(...). One problem with stat_function(...) is that the secondary arguments (mu, sigma, and lambda in this example), which are passed using the args=list(...) parameter, cannot be included in an aesthetic mapping, so you have to have multiple calls to stat_function(...) as is #Spacedman`s solution.
This approach builds the PDFs outside of ggplot and uses a single call to geom_polygon(...). As a result, it works without modification for an arbitrary number of distributions in the mixture.
# ggplot mixture plot
gg.mixEM <- function(EM) {
require(ggplot2)
x <- with(EM,seq(min(x),max(x),len=1000))
pars <- with(EM,data.frame(comp=colnames(posterior), mu, sigma,lambda))
em.df <- data.frame(x=rep(x,each=nrow(pars)),pars)
em.df$y <- with(em.df,lambda*dnorm(x,mean=mu,sd=sigma))
ggplot(data.frame(x=EM$x),aes(x,y=..density..)) +
geom_histogram(fill=NA,color="black")+
geom_polygon(data=em.df,aes(x,y,fill=comp),color="grey50", alpha=0.5)+
scale_fill_discrete("Component\nMeans",labels=format(em.df$mu,digits=3))+
theme_bw()
}
library(mixtools)
# two components
set.seed(1) # for reproducible example
b <- rnorm(2000000, mean=c(8,17), sd=2)
c <- b[sample(length(b), 1000000) ]
c2 <- normalmixEM(c, lambda=NULL, mu=NULL, sigma=NULL)
gg.mixEM(c2)
# three components
set.seed(1)
b <- rnorm(2000000, mean=c(8,17,30), sd=c(2,3,5))
c <- b[sample(length(b), 1000000) ]
library(mixtools)
c3 <- normalmixEM(c, k=3, lambda=NULL, mu=NULL, sigma=NULL)
gg.mixEM(c3)

Lowess function and plyr

I am working with a data frame I created and want to expand it to include a lowess fit. I have been able to add a lowess curve to the plot of my data along with an lm fit but I cannot figure out how to add the lowess values to my data frame.
Please forgive the ugliness of my code as I do everything by brute force (i.e. suggestions for simplification/efficiency are appreciated.) I apologize as I am not allowed to post images. I have a scatterplot with an lm fit yielding an R^2 of .7897 and a lowess curve which well replicates the qqplot of the lm fit. When I get a reputation of "10" I will post it for edification/ease of visualization:
##Read in Data
OPM.df <- read.csv("On Peak Mod TMAX.csv", header = TRUE)
## Data frame 2008-2012
OPM5.df <- OPM.df[4606:6140,]
##Verify headings
OPM5.df[1,]
SummerOPM5.df <- OPM5.df[month(OPM5.df$Date) >= 6 & month(OPM5.df$Date) <= 9, ]
###Fit Linear Regression to Data
fitsummerX <- lm(SummerOPM5.df$MaxLoad~SummerOPM5.df$TMAX)
summary(fitsummerX)
##Plot data
windows()
plot(SummerOPM5.df$TMAX, SummerOPM5.df$MaxLoad, main="Linear Regression Adjusted R- squared: 0.7897",)
## Add fit lines
abline(fitsummerX, col="red") # regression line (y~x)
lines(lowess(SummerOPM5.df$TMAX,SummerOPM5.df$MaxLoad), col="blue") # lowess line (x,y)
## plyr augmentation of df
SummerOPM5.df <- ddply(SummerOPM5.df, .(Date, MaxLoad, TMAX, OnPeakTotal), transform,
Lowess = (lowess(TMAX,MaxLoad)$y))
##Verify headings and values
SummerOPM5.df [1:5,]
This gives the following:
Date MaxLoad TMAX OnPeakTotal Lowess
1 2008-06-02 2880 214.0 43307 2880
2 2008-06-03 2860 197.0 43166 2860
3 2008-06-04 2787 172.5 42088 2787
4 2008-06-05 2902 216.5 43333 2902
5 2008-06-06 3078 275.0 45325 3078
The Lowess values are the same as the MaxLoad
If I run the lowess seperately:
Lowess = (lowess(SummerOPM5.df$TMAX,SummerOPM5.df$MaxLoad))
Lowess$y[1:5]
I get this:
[1] 2522.221 2569.523 2603.625 2622.795 2622.795
Where the vector is in rank order.
.
I am unsure if it preserves the sequence otherwise so am unsure how to fit this to the corresponding rows of my data frame.
Suggestions?

Resources