I state that I never used the root mean squared deviation before.
I'm just tryng to reproduce what I found in an article.
Pratically, I have to quantify the noise of a "methodology" (that is the result of different noises, because of the coupling of two instruments), measuring the noise of the methodology at three different points out of the normal operation, where we know that it measures only the noise.
At the end, in the process that I'm following, you have to calculate the standard deviation among this three points and multiply it for the factor of 1.96 for the 95th percentile confidence interval (and in this way get a Limit of Detection of the method).
The time resolution is 30 minutes, then a standard deviation among three points, then the following three points, and so on. I already have a dataset organized in this way and I have calculated the standard deviation.
Just because I'm following the method of an article, they compare the limit of detection calculated using the standard deviation and the one using the root mean squared deviation.
Because at the end I have to use this limit of detection to filter my data by noise, I want to compare which is the best method for my situation, as they did.
How can I calculate the root mean squared deviation, as well as I calculated the standard deviation? Each three points (three differente columns), then the three following points (same three columns but the following row) and so on?
I already tried with the rmse function of the Metrics package, but the problem is the it require only two values: an actual and a predicted values.
Of course, as well as I did for the standard deviation calculation, I should use aggregate to iterate the function each three columns per each row.
EDIT
I put a piece of the article that I'm following for this method, to try to make you understand better what I would like
"For time lags much different from the true time lag, the standard deviation of the flux provides a measure of the random error affecting the flux"......."Multiplying this measure of the random error by α gives an estimate of the measurement precision at a given confidence interval (α =1.96 for the 95th percentile; α =3 for the 99th percentile) which can be used as the flux limit of detection (LoD) (i.e. LoDσ = α ×REσ )......"A modification of the LoDσ approach is to calculate the random error based on the root mean squared deviation (RMSE) of flux from zero ,which reflects the variability in the crosscovariance function in these regions, but also its offset from zero
I share with you part of the dataset and the code the I tried to use.
Dataset
structure(list(`101_LOD` = c(-0.00647656063436054, 0.00645714072316343,
0.00174533523902105, -0.000354643362187957, -0.000599093190801188,
0.00086188829059792), `101_LOD.1` = c(0.00380625456526623, -0.00398115037246045,
0.00158673927930099, -0.00537583996746438, -0.00280048350643599,
0.00348232298529063), `101_LOD.2` = c(-0.00281100080425964, -0.00335537844222041,
0.00611652518452308, -0.000738139825060029, 0.00485039477849737,
0.00412428118507656), `107_LOD` = c(0.00264717678436649, 0.00339296025595841,
0.00392733001719888, 0.0106686039973083, 0.00886643251752075,
0.0426091484273961), `107_LOD.1` = c(0.000242380702002215, -0.00116108069669281,
0.0119784744970561, 0.00380805756323248, 0.00190407945251567,
0.00199684331869391), `107_LOD.2` = c(-0.0102716279438754, -0.00706528150567528,
-0.0108745954674186, -0.0122962259781756, -0.00590383880635847,
-0.00166664119985051), `111_LOD` = c(-0.00174374098054644, 0.00383270191075735,
-0.00118363208946644, 0.00107908760333878, -9.30127551375776e-05,
-0.00141500588842743), `111_LOD.1` = c(0.000769378300959002,
0.00253820252869653, 0.00110643824418424, -0.000338050323261079,
-0.00313666295753596, 0.0043919374295125), `111_LOD.2` = c(0.000177265973907964,
0.00199829884609846, -0.000490950219515303, -0.00100263695578483,
0.00122606902671889, 0.00934018452187161), `113_LOD` = c(0.000997977666838309,
0.0062400770296875, -0.00153620247996209, 0.00136849054508488,
-0.00145700847633675, -0.000591288575933268), `113_LOD.1` = c(-0.00114161441697546,
0.00152607521404826, 0.000811193628975422, -0.000799514037634276,
-0.000319008435039752, -0.0010086036089075), `113_LOD.2` = c(-0.000722312098377764,
0.00364767954707251, 0.000547744649351312, 0.000352509651080838,
-0.000852173274761947, 0.00360487150682726), `135_LOD` = c(-0.00634051802134062,
0.00426062889500736, 0.00484049067127332, 0.00216220020394825,
0.00165634168942681, -0.00537970105199375), `135_LOD.1` = c(-0.00209301968088832,
0.00535855274344209, -0.00119679744329422, 0.0041216882161451,
0.00512978202611836, 0.0014048506490567), `135_LOD.2` = c(0.00022377545723911,
0.00400550696583795, 0.00198972253447825, 0.00301341644871015,
0.00256802839330668, 0.00946109288597202), `137_LOD` = c(-0.0108508893475138,
-0.0231919072487789, -0.00346546003410657, -0.00154066625155414,
0.0247266017774909, -0.0254464953061609), `137_LOD.1` = c(-0.00363025194918789,
-0.00291104074373261, 0.0024998477144967, 0.000877707284759669,
0.0095477003599792, 0.0501795740749602), `137_LOD.2` = c(0.00930498343499501,
-0.011839104725282, 0.000274929503053888, 0.000715665078729413,
0.0145503185102915, 0.0890428314632625), `149_LOD` = c(-0.000194406250680231,
0.000355157226357547, -0.000353931679163222, 0.000101471293242973,
-0.000429409422518444, 0.000344585379249552), `149_LOD.1` = c(-0.000494386150759807,
0.000384907974061922, 0.000582537329068263, -0.000173285705433721,
-6.92758935962043e-05, 0.00237942557324254), `149_LOD.2` = c(0.000368606958615297,
0.000432568466833549, 3.33092313366271e-05, 0.000715304544370804,
-0.000656902381786168, 0.000855422043674721), `155_LOD` = c(-0.000696168382693618,
-0.000917607266525328, 4.77049670728094e-06, 0.000140297660927979,
-5.99898679530658e-06, 6.71169142984434e-06), `155_LOD.1` = c(-0.000213644203677328,
-3.44396001911029e-07, -0.000524232671878577, -0.000830180665933627,
1.47799998238307e-06, -5.97640014667251e-05), `155_LOD.2` = c(-0.000749882784933487,
0.000345737159390042, -0.00076916001239521, -0.000135205762575321,
-2.55352420251723e-06, -3.07199008030628e-05), `31_LOD` = c(-0.00212014938530172,
0.0247411322547065, -0.00107990654365844, -0.000409195814154659,
-0.00768439381433953, 0.001860128524035), `31_LOD.1` = c(-0.00248488588195854,
-0.011146734518705, -0.000167943850441196, -0.0021998906531997,
0.0166775965182051, -0.0156939303287719), `31_LOD.2` = c(0.00210626277375321,
-0.00327815351414411, -0.00271043947479133, 0.00118991079627845,
-0.00838520090692615, 0.0255825346347586), `33_LOD` = c(0.0335175783154054,
0.0130192144768818, 0.0890608024914352, -0.0142431454793663,
0.00961009674973182, -0.0429774973256228), `33_LOD.1` = c(0.018600175159935,
0.04588362587764, 0.0517479021554752, 0.0453766081395813, -0.0483559729403664,
0.123771869764484), `33_LOD.2` = c(0.01906507758481, -0.00984821669825455,
0.134177176083007, -0.00544320457445977, 0.0516083894733814,
-0.0941500564321804), `39_LOD` = c(-0.148517395684098, -0.21311281527214,
0.112875846920874, -0.134256453140454, 0.0429030528286934, -0.0115143877745049
), `39_LOD.1` = c(-0.0431568202849291, -0.159003698955288, 0.0429009071238143,
-0.126060096927082, -0.078848020069061, -0.0788748111534866),
`39_LOD.2` = c(-0.16276833960171, 0.0236589399437796, 0.0828435027244962,
-0.50219849047847, -0.105196237549017, -0.161206838628339
), `42_LOD` = c(-0.00643926654994104, -0.0069253267922805,
7.63419856289838e-05, -0.0185223126108671, 0.00120855708103566,
-0.00275288147011515), `42_LOD.1` = c(-0.000866169150506504,
-0.00147791175852563, -0.000670310173141084, -0.00757733007180311,
0.0151353172950393, -0.00114193461500327), `42_LOD.2` = c(0.00719928454572906,
0.00311615354837406, 0.00270759483782046, -0.0108062423259522,
0.00158765505419478, -0.0034831499672973), `45_LOD` = c(0.00557787518897268,
0.022337270533665, 0.00657118689440082, -0.00247269227623608,
0.0191646343214611, 0.0233090596023039), `45_LOD.1` = c(-0.0305395220788143,
0.077105031761457, -0.00101713990356452, 0.0147500116150713,
-5.43009569586179e-05, -0.0235006181977403), `45_LOD.2` = c(-0.0216498682456909,
-0.0413426968184435, -0.0210779895848601, -0.0147549519865421,
0.00305229143870313, -0.0483293292336662), `47_LOD` = c(-0.00467568767221499,
-0.0199796182799552, 0.00985966068611855, -0.031010117051163,
0.0319279109813341, 0.0350743318265918), `47_LOD.1` = c(0.00820166533285921,
-0.00748186905620154, -0.010483251821707, -0.00921919551377505,
0.0129546148757833, 0.000223462281435923), `47_LOD.2` = c(0.00172469728530889,
0.0181683409295075, 0.00264937907258855, -0.0569837400476351,
0.00514558635349483, 0.0963339573489031), `59_LOD` = c(-0.00664210061621158,
-0.062069664217766, 0.0104345353700492, 0.0115323589989968,
-0.000701276829098035, -0.0397759501000331), `59_LOD.1` = c(-0.00844888486350536,
0.0207426674766074, -0.0227755432761471, -0.00370561240222376,
0.0152046240483297, -0.0127327412801225), `59_LOD.2` = c(-0.000546590647534814,
0.0178115310450356, 0.00776130696191998, 0.00162470375408126,
-0.036140754156005, 0.0197791914089296), `61_LOD` = c(0.00797528044191513,
-0.00358928087671818, 0.000662870138322471, -0.0412142836466128,
-0.00571822580078707, -0.0333870884803465), `61_LOD.1` = c(0.000105849888219735,
-0.00694734283847093, -0.00656216592134899, 0.00161225110022219,
0.0125744958934939, -0.0178560868664668), `61_LOD.2` = c(0.0049288443167774,
0.0059411543659837, -0.00165857112209555, -0.0093669075333705,
0.00655185371925189, 0.00516436591134869), `69_LOD` = c(0.0140014747729604,
0.0119645827116724, 0.0059880663080946, -0.00339119330845176,
0.00406436116298777, 0.00374425148741196), `69_LOD.1` = c(0.00465076983995792,
0.00664902297016735, -0.00183936649215524, 0.00496509351837152,
-0.0224812403463345, -0.0193087796456654), `69_LOD.2` = c(-0.00934638876711703,
-0.00802183076602164, 0.00406752039394799, -0.000421337136630527,
-0.00406768983408334, -0.0046016148041856), `71_LOD` = c(-0.00206064862123214,
0.0058604630066848, -0.00353440181333921, -0.000305197461077327,
0.00266085011303462, -0.00105635261106644), `71_LOD.1` = c(3.66652318354654e-06,
0.00542612739642576, 0.000860385212430484, 0.00157520645492044,
-0.00280256517377998, -0.00474358065422048), `71_LOD.2` = c(-0.00167098030843413,
0.0059622082597603, -0.00121597491543965, -0.000791592953383716,
-0.0022790991468459, 0.00508978650148816), `75_LOD` = c(NA,
-0.00562613898652477, -0.000103076958936504, -3.76628574664693e-05,
-0.000325767611573817, 0.000117404893823389), `75_LOD.1` = c(NA,
NA, -0.000496324358203359, -0.000517476831074487, -0.00213096062838051,
-0.00111202867609916), `75_LOD.2` = c(NA, NA, -0.000169651845347418,
-4.72864955070539e-05, -0.00144880109085214, 0.00421635976535877
), `79_LOD` = c(-0.0011901810540199, 0.00731686066269579,
0.00538551997145174, -0.00578723012473479, -0.0030246805255648,
0.00146141135533218), `79_LOD.1` = c(-0.00424278455960268,
-0.010593752642875, 0.0065136497427927, -0.00427355522802769,
0.000539975609490915, -0.0206849687839064), `79_LOD.2` = c(-0.00366739576561779,
-0.00374066839898667, -0.00132764684703939, -0.00534145222725701,
0.00920940542227595, -0.0101871763957068), `85_LOD` = c(-0.0120254177480422,
0.00369546541331518, -0.00420718877886963, 0.00414911885475517,
-0.00130381692844529, -0.00812757789798261), `85_LOD.1` = c(-0.00302024868281014,
0.00537704163310547, 0.00184264538884543, -0.00159032685888543,
-0.0062127769817834, 0.00349476605688194), `85_LOD.2` = c(0.0122689407380797,
-0.00509605601025503, -0.00641413996554198, 0.000592176121486696,
0.00131237912317341, -0.00535018996837309), `87_LOD` = c(0.00613621268007298,
0.000410268892659307, -0.00239014321624482, -0.00171179729894864,
-0.00107159765522861, -0.00708388174601732), `87_LOD.1` = c(0.00144787264098156,
-0.0025946273860992, -0.00194897899110034, 0.00157863310440493,
-0.0048913305554607, -0.000585669821053749), `87_LOD.2` = c(-0.00224691693198253,
-0.00277315666829267, 0.00166487067514155, -0.00173757960229744,
-0.00362252480121682, -0.0101992979591839), `93_LOD` = c(-0.0234225447373586,
0.0390095666365413, 0.00606244490932179, 0.0264258422783391,
0.0161211132913951, -0.0617678157059), `93_LOD.1` = c(-0.0124876313221369,
-0.0309636779639578, 0.00610883313140442, -0.0192442672220773,
0.0129557286224975, -0.00869066964782635), `93_LOD.2` = c(-0.0219837540560547,
-0.00521242297372905, 0.0179965615561871, 0.0081370991723329,
1.45427765512579e-06, -0.0111199632179688), `99_LOD` = c(0.00412086456443205,
-0.00259940538393106, 0.00742537463584133, -0.00302091572866969,
-0.00320466045653491, -0.00168702410433936), `99_LOD.1` = c(0.00280546156134205,
-0.00472591065687533, 0.00518402193979284, -0.00130887074314965,
0.00148769905391341, 0.00366250488078969), `99_LOD.2` = c(-0.00240469207099292,
-9.57307699040024e-05, -0.000145493235845501, 0.000667454164326723,
-0.0057445759245933, 0.00433464631989088), H_LOD = c(-6248.9128518109,
-10081.9540490064, -6696.91582671427, -5414.20614601348,
-3933.64339240365, -13153.7509294302), H_LOD.1 = c(-6.2489128518109,
-10.0819540490064, -6.69691582671427, -5.41420614601348,
-3.93364339240365, -13.1537509294302), H_LOD.2 = c(-6248.9128518109,
-10081.9540490064, -6696.91582671427, -5414.20614601348,
-3933.64339240365, -13153.7509294302)), row.names = c(NA,
6L), class = "data.frame")
Code
LOD_rdu=sapply(split.default(LOD_ut, rep(seq((ncol(LOD_ut) / 3)), each = 3)), function(i)
apply(i, 1, rmse))
And I get this error Error in mse(actual, predicted) :
argument "predicted" is missing, with no default
It is difficult to understand precisely what you need, I will try to answer you,
From wikipedia, RMSD serves to compare a dataset generated by a model (the model in your article i guess) to an observed distribution.
From CRAN, RMSE function in modelr package has two arguments, model and data:
modelr::rmse(model = ,data = )
This function will give you the fit of your model to the data. The first argument is a model, meaning you will probably use a function like lm() to generate it. Because you don't detail the model I can't help you more.
The second argument is the dataset, the one you provide is quite disturbing to me. R will expect a tidy set with two columns x the time of the observation and y the value.
you can first what groups your columns:
GRP = sub("[.][0-9]*","",colnames(LOD_ut))
head(GRP)
[1] "101_LOD" "101_LOD" "101_LOD" "107_LOD" "107_LOD" "107_LOD"
you can also use 1:(ncol(LOD_ut)/3), just that the above gives you the groups back.
We can use the above like this:
LOD_ut[,GRP=="101_LOD"]
101_LOD 101_LOD.1 101_LOD.2
1 -0.0064765606 0.003806255 -0.0028110008
2 0.0064571407 -0.003981150 -0.0033553784
3 0.0017453352 0.001586739 0.0061165252
4 -0.0003546434 -0.005375840 -0.0007381398
5 -0.0005990932 -0.002800484 0.0048503948
6 0.0008618883 0.003482323 0.0041242812
This calls out your first three grouped columns. Now if you do apply(..,1,sd) you get the standard deviation, now we just do it over all groups
SD = sapply(unique(GRP),function(i)apply(LOD_ut[,GRP==i],1,sd,na.rm=T))
If you must do RMSE, use predicted to be the mean:
rmse_func = function(x){
Metrics:::rmse(mean(x,na.rm=T),x[!is.na(x)])
}
RMSE = sapply(unique(GRP),function(i){
apply(LOD_ut[,GRP==i],1,rmse_func)
})