I have discharge data that I want to display: observed vs. simulated. The data is as follows:
Time observed simulated
Jan-86 0.105 0.1597
Feb-86 0.0933 0.1259
Mar-86 3.5336 0.41
Apr-86 8.8999 2.494
May-86 5.2431 1.767
Jun-86 0.9747 1.96
Jul-86 0.079 1.98
Aug-86 0.0154 1.729
Sep-86 0.0053 1.419
Oct-86 0.0135 1.121
Nov-86 0.0235 0.8664
Dec-86 0.017 0.658
Jan-87 0.017 0.4925
Feb-87 0.017 0.3855
Mar-87 3.3483 1.089
Apr-87 3.3156 1.704
May-87 0.5563 1.327
Jun-87 0.2565 1.166
Jul-87 0.0446 1.012
Aug-87 0.0096 0.8278
Sep-87 0.0007 0.6567
Oct-87 0.0018 0.5083
Nov-87 0.0139 0.3892
Dec-87 0.0087 0.2953
Jan-88 0.0025 0.2196
Feb-88 0.0017 0.1641
Mar-88 0.0099 0.3858
Apr-88 1.6217 3.929
May-88 0.3398 0.5156
Jun-88 0.762 0.5537
Jul-88 0.0242 0.4985
Aug-88 0.0002 0.4125
Sep-88 0.0003 0.4027
Oct-88 0 0.2918
Nov-88 0 0.2388
Dec-88 0.0005 0.2024
Jan-89 0.0003 0.147
Feb-89 0.0004 0.1157
Mar-89 0.0006 0.3886
Apr-89 6.5433 10.92
May-89 0.8047 1.685
Jun-89 0.7968 1.486
Jul-89 0.0836 1.407
Aug-89 0.0024 1.22
Sep-89 0.0001 0.9965
Oct-89 0 0.7846
Nov-89 0.0005 0.6097
Dec-89 0 0.4636
Jan-90 0 0.3469
Feb-90 0 0.271
Mar-90 0.2724 0.9063
Apr-90 0.3768 2.902
May-90 0.0776 0.5038
Jun-90 0.1327 0.5622
Jul-90 0.0636 0.5068
Aug-90 0.0005 0.4169
Sep-90 0 0.3328
Oct-90 0 0.2611
Nov-90 0 0.2016
Dec-90 0 0.1549
Jan-91 0 0.116
Feb-91 0.0004 0.0904
Mar-91 0.0024 0.0709
Apr-91 0.0056 0.3813
May-91 0.1312 0.6567
Jun-91 0.1033 0.6053
Jul-91 1.1491 0.6226
Aug-91 0.0957 0.5423
Sep-91 0.01 0.4529
Oct-91 0.009 0.374
Nov-91 0.0436 0.3132
Dec-91 0.0629 0.2344
Jan-92 0.0238 0.1775
Feb-92 0.0125 0.1378
Mar-92 2.4242 3.399
Apr-92 2.9119 4.284
May-92 1.0843 1.854
Jun-92 0.1473 1.7
Jul-92 0.3467 1.451
Aug-92 0.0143 1.182
Sep-92 0.0193 2.272
Oct-92 0.035 1.332
Nov-92 0.0132 1.181
Dec-92 0.0353 0.9716
Jan-93 0.0213 0.7097
Feb-93 0.0196 0.5596
Mar-93 0.2553 5.669
Apr-93 3.4093 4.912
May-93 0.4553 1.575
Jun-93 1.4621 1.56
Jul-93 2.7732 2.622
Aug-93 7.4911 1.587
Sep-93 7.7134 1.381
Oct-93 0.4065 1.133
Nov-93 0.3042 0.9257
Dec-93 0.1669 0.7514
Jan-94 0.0756 0.5657
Feb-94 0.0317 0.4464
Mar-94 1.3576 3.802
Apr-94 1.5093 4.446
May-94 0.8696 1.246
Jun-94 0.3097 1.426
Jul-94 4.1223 1.66
Aug-94 0.6915 0.7939
Sep-94 3.9228 0.6434
Oct-94 1.5528 0.5081
Nov-94 3.0506 0.3907
Dec-94 0.6294 0.3053
Jan-95 0.2484 0.2327
Feb-95 0.1053 0.1842
Mar-95 9.4852 7.073
Apr-95 3.8737 3.122
May-95 3.0692 1.754
Jun-95 0.3433 1.386
Jul-95 2.6554 1.297
Aug-95 0.3252 0.9797
Sep-95 0.2854 0.7803
Oct-95 0.2667 0.6097
Nov-95 0.1444 0.4692
Dec-95 0.1098 0.355
Jan-96 0.0696 0.265
Feb-96 0.0399 0.4352
Mar-96 0.0419 0.2793
Apr-96 16.2771 17.33
May-96 25.3653 21.04
Jun-96 0.4064 4.901
Jul-96 0.3028 3.886
Aug-96 0.097 3.1
Sep-96 0.0325 2.51
Oct-96 0.0949 2.009
Nov-96 0.2763 1.614
Dec-96 0.1307 1.252
Jan-97 0.0778 0.9253
Feb-97 0.0661 0.7211
Mar-97 0.0703 0.7519
Apr-97 27.3434 21.65
May-97 4.2895 7.989
Jun-97 0.4939 3.661
Jul-97 6.7193 3.92
Aug-97 0.1174 2.802
Sep-97 0.0858 2.229
Oct-97 2.0501 1.789
Nov-97 0.891 1.644
Dec-97 0.3561 1.288
Jan-98 0.133 0.94
Feb-98 0.8482 2.56
Mar-98 7.2317 6.613
Apr-98 3.7604 4.181
May-98 3.039 2.323
Jun-98 5.3291 2.492
Jul-98 5.6387 2.607
Aug-98 0.1308 1.943
Sep-98 0.0937 1.647
Oct-98 1.4565 1.641
Nov-98 0.7778 1.563
Dec-98 0.5755 1.692
Jan-99 0.0573 1.65
Feb-99 0.0783 1.489
Mar-99 2.3554 7.688
Apr-99 25.3018 18.41
May-99 8.7571 5.154
Jun-99 14.8313 3.564
Jul-99 4.7535 2.423
Aug-99 3.6622 1.898
Sep-99 5.0639 1.524
Oct-99 0.9153 1.186
Nov-99 0.4436 0.905
Dec-99 0.181 0.6864
Jan-00 0.1015 0.5129
Feb-00 1.9763 0.3953
Mar-00 2.5832 0.3083
Apr-00 3.6585 0.2388
May-00 0.9701 0.182
Jun-00 7.1744 0.1605
Jul-00 1.7145 0.1494
Aug-00 0.6677 0.1364
Sep-00 0.1858 0.1195
Oct-00 1.1442 0.0997
Nov-00 15.1503 0.6839
Dec-00 0.5526 0.4275
01-Jan 0.182 0.6061
01-Feb 0.1582 0.5254
01-Mar 0.7527 0.437
01-Apr 18.8305 21
01-May 4.0794 2.765
01-Jun 1.7906 5.399
01-Jul 0.2344 2.615
01-Aug 2.8721 1.896
01-Sep 0.108 1.555
01-Oct 0.0896 1.237
01-Nov 0.6865 0.9588
01-Dec 0.1609 0.7329
02-Jan 0.0987 0.5496
02-Feb 0.081 0.4299
02-Mar 0.0671 0.4125
02-Apr 1.9161 5.189
02-May 2.8088 2.423
02-Jun 18.2132 2.137
02-Jul 2.881 2.783
02-Aug 0.676 1.102
02-Sep 1.309 0.892
02-Oct 0.1844 0.7183
02-Nov 0.1415 0.56
02-Dec 0.0781 0.4277
03-Jan 0.0897 0.3211
03-Feb 0.0191 0.2515
03-Mar 1.1978 2.32
03-Apr 1.4536 2.175
03-May 1.2194 0.9472
03-Jun 2.2049 0.7456
03-Jul 0.1934 0.6395
03-Aug 0.0362 0.5237
03-Sep 0.0047 0.4738
03-Oct 0.0338 0.3477
03-Nov 0.1166 0.2821
03-Dec 0.0301 0.2319
04-Jan 0.0151 0.1851
04-Feb 0.0218 0.1462
04-Mar 2.9284 3.967
04-Apr 5.113 8.21
04-May 14.4488 6.077
04-Jun 8.7876 4.92
04-Jul 0.7572 2.781
04-Aug 0.3186 2.023
04-Sep 1.7134 1.648
04-Oct 0.834 1.385
04-Nov 1.5215 1.571
04-Dec 0.1535 1.175
05-Jan 0.0515 0.8762
05-Feb 0.0535 0.7016
05-Mar 0.5916 2.954
05-Apr 10.2761 12.22
05-May 4.3927 3.95
05-Jun 12.6566 8.826
05-Jul 13.6267 4.855
05-Aug 11.4682 3.241
05-Sep 1.2082 2.454
05-Oct 1.1875 1.986
05-Nov 1.5555 1.566
05-Dec 0.3229 1.294
06-Jan 0.1832 1.055
06-Feb 0.112 0.885
06-Mar 0.3341 3.006
06-Apr 24.8525 19.75
06-May 6.2187 4.442
06-Jun 0.3634 2.697
06-Jul 0.0534 1.889
06-Aug 0.0439 1.571
06-Sep 0.02 1.261
06-Oct 0.0418 0.9836
06-Nov 0.0612 0.7535
06-Dec 0.0747 0.5717
07-Jan 0.0644 0.43
07-Feb 0.0339 0.3319
07-Mar 2.8046 2.675
07-Apr 2.7156 3.412
07-May 0.5788 2.576
07-Jun 8.5705 9.888
07-Jul 1.3929 2.897
07-Aug 0.1146 1.758
07-Sep 0.0374 1.486
07-Oct 0.1637 1.338
07-Nov 0.1599 1.2
07-Dec 0.1165 0.9649
08-Jan 0.054 0.7372
08-Feb 0.024 0.5469
08-Mar 0.04 0.6989
08-Apr 2.3773 9.219
08-May 1.3455 3.223
08-Jun 1.4375 4.011
08-Jul 0.531 2.341
08-Aug 0.0512 1.618
08-Sep 0.0902 1.377
08-Oct 2.8219 1.115
08-Nov 4.7166 0.9028
08-Dec 0.3393 0.8564
09-Jan 0.1303 0.6376
09-Feb 0.1594 0.7089
09-Mar 10.3111 5.402
09-Apr 14.466 14.64
09-May 6.0214 13.73
09-Jun 5.4491 6.086
09-Jul 7.4774 4.059
09-Aug 0.4845 2.885
09-Sep 0.1321 2.208
09-Oct 0.0935 1.755
09-Nov 0.1702 1.367
09-Dec 0.0786 1.183
10-Jan 0.049 1.461
10-Feb 0.0502 0.8349
10-Mar 9.9809 7.328
10-Apr 2.1785 5.341
10-May 5.54 9.544
10-Jun 6.5798 10.35
10-Jul 1.4304 5.972
10-Aug 0.3424 3.768
10-Sep 8.7223 3.844
10-Oct 5.7656 4.88
10-Nov 3.7897 4.978
10-Dec 0.5271 3.289
I tried the following code to display the data:
require(xts)
data <- read.csv('./flowout13.csv')
dd1 <- data.frame(data[2:3])
dd1 <- ts(dd1, frequency = 12, start = 1986)
plot(as.xts(dd1), major.format = "%y-%m")
title(main = "Calibrated observed and simulated discharge",
      xlab = "Time", ylab = "discharge in mm")
legend("topleft", inset = 0.10, title = "Discharge",
       c("observed", "simulated", "r2=0.8", "NSE=0.60"),
       fill = terrain.colors(2), horiz = FALSE)
The graph does not show the colors I expect: I want the observed series in black and the simulated series in red, but it plots different colors. Also, I don't want the r2 and NSE legend entries to have any color; they are just values I added from separate calculations. Finally, I want to move the x-axis label below the dates. Please help. I am working in RStudio.
Is this what you're looking for?
plot(as.xts(dd1), major.format="%y-%m", col = terrain.colors(2))
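If you specifically want observed in black and simulated in red, with the r2 and NSE entries shown as plain text, something along these lines may be closer (a sketch, untested against your file; it assumes `dd1` is the two-column ts object from your code, and depending on your xts version you may need `xts::addLegend` instead of `legend`):

```r
library(xts)

# Explicit colors: column 1 (observed) black, column 2 (simulated) red
plot(as.xts(dd1), major.format = "%y-%m", col = c("black", "red"),
     main = "Calibrated observed and simulated discharge")

# Give the statistics entries an NA fill and suppress borders, so only the
# two series get colored boxes; the r2/NSE lines appear as plain text
legend("topleft", inset = 0.10, title = "Discharge",
       legend = c("observed", "simulated", "r2=0.8", "NSE=0.60"),
       fill = c("black", "red", NA, NA), border = NA, horiz = FALSE)
```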
I'm trying to integrate a list created by predict(smooth.spline) but I'm getting the following error: Error in stats::integrate(...) :
evaluation of function gave a result of wrong length.
predict(smooth.spline(x, y)) gives:
$x
[1] 0.000 0.033 0.067 0.100 0.133 0.167 0.200 0.233 0.267 0.300 0.333 0.367 0.400 0.433 0.467 0.500
[17] 0.533 0.567 0.600 0.633 0.667 0.700 0.733 0.767 0.800 0.833 0.867 0.900 0.933 0.967 1.000 1.033
[33] 1.067 1.100 1.133 1.167 1.200 1.233 1.267 1.300 1.333 1.367 1.400 1.433 1.467 1.500 1.533 1.567
[49] 1.600 1.633 1.667 1.700 1.733 1.767 1.800 1.833 1.867 1.900 1.933 1.967 2.000 2.033 2.067 2.100
[65] 2.133 2.167 2.200 2.233 2.267 2.300 2.333 2.367 2.400 2.433 2.467 2.500 2.533 2.567 2.600 2.633
[81] 2.667 2.700 2.733 2.767 2.800 2.833 2.867 2.900 2.933 2.967 3.000 3.033 3.067 3.100 3.133 3.167
[97] 3.200 3.233 3.267 3.300 3.333 3.367 3.400 3.433 3.467 3.500 3.533 3.567 3.600 3.633 3.667 3.700
[113] 3.733 3.767 3.800 3.833 3.867 3.900 3.933 3.967 4.000 4.033 4.067 4.100 4.133 4.167 4.200 4.233
[129] 4.267 4.300 4.333 4.367 4.400 4.433 4.467 4.500 4.533 4.567 4.600 4.633 4.667 4.700 4.733 4.767
[145] 4.800 4.833 4.867 4.900 4.933 4.967 5.000 5.033 5.067 5.100 5.133 5.167 5.200 5.233 5.267 5.300
[161] 5.333 5.367 5.400 5.433 5.467 5.500 5.533 5.567 5.600 5.633 5.667 5.700 5.733 5.767 5.800 5.833
[177] 5.867 5.900 5.933 5.967 6.000 6.033 6.067 6.100 6.133 6.167 6.200 6.233 6.267 6.300 6.333 6.367
[193] 6.400 6.433 6.467 6.500 6.533 6.567 6.600 6.633 6.667 6.700 6.733 6.767 6.800 6.833 6.867 6.900
[209] 6.933 6.967 7.000 7.033 7.067 7.100 7.133 7.167 7.200 7.233 7.267 7.300 7.333 7.367 7.400 7.433
[225] 7.467 7.500 7.533 7.567 7.600 7.633 7.667 7.700 7.733 7.767 7.800 7.833 7.867 7.900 7.933 7.967
[241] 8.000 8.033 8.067 8.100 8.133 8.167 8.200 8.233 8.267 8.300 8.333 8.367 8.400 8.433 8.467 8.500
[257] 8.533 8.567 8.600 8.633 8.667 8.700 8.733 8.767 8.800 8.833 8.867 8.900 8.933 8.967 9.000 9.033
[273] 9.067 9.100 9.133 9.167 9.200 9.233 9.267 9.300 9.333 9.367 9.400 9.433 9.467 9.500 9.533 9.567
[289] 9.600 9.633 9.667 9.700 9.733 9.767 9.800 9.833 9.867 9.900 9.933 9.967 10.000 10.033 10.067 10.100
$y
[1] 59.96571 182.14589 308.06545 430.28967 552.13181 676.76001 796.27007 913.45605 1030.73901 1140.24735
[11] 1244.62019 1345.89199 1437.37738 1521.99577 1601.97896 1672.60118 1736.28174 1794.58753 1844.06630 1886.59891
[21] 1923.24013 1952.04715 1974.93273 1993.22884 2006.84446 2017.75964 2027.59482 2036.61631 2045.82650 2056.14890
[31] 2067.21217 2079.44489 2093.29127 2107.48046 2121.84443 2136.20938 2149.03007 2160.03152 2168.83055 2174.72156
[41] 2177.92034 2178.50434 2177.25261 2175.18231 2173.05271 2171.23280 2169.75413 2168.60865 2167.58021 2166.28136
[51] 2164.31765 2161.56924 2157.84126 2153.06845 2147.68110 2141.80856 2135.99289 2131.40947 2128.57716 2127.73980
[61] 2129.07173 2132.52768 2137.84677 2144.15311 2151.04004 2158.20845 2164.72665 2170.38182 2175.16221 2178.72060
[71] 2181.26140 2183.34329 2185.47108 2188.20964 2191.71999 2195.72978 2200.17822 2204.67512 2208.37304 2210.99201
[81] 2212.16148 2211.52661 2209.27941 2205.52709 2200.82773 2195.80333 2191.14046 2187.86227 2186.22909 2186.61490
[91] 2189.21504 2193.74033 2200.00587 2207.23478 2215.15186 2223.55507 2231.56558 2239.35648 2247.15616 2254.58452
[101] 2262.25845 2270.90839 2280.40791 2291.00929 2302.93232 2315.07098 2327.30700 2339.53707 2350.58890 2360.39110
[111] 2368.83106 2375.48715 2380.80457 2385.21836 2389.36786 2394.40853 2401.47143 2410.55245 2422.11132 2436.78865
[121] 2453.43711 2472.06315 2492.92121 2514.41941 2536.79884 2560.48574 2584.09299 2608.55242 2635.61496 2664.80169
[131] 2697.76567 2735.79016 2776.54744 2820.81417 2868.96931 2916.89215 2964.73344 3012.72300 3056.87880 3097.62601
[141] 3135.48071 3167.79172 3195.56342 3220.27772 3241.55129 3261.03300 3279.41808 3295.63106 3310.16876 3323.00826
[151] 3332.94381 3340.03845 3344.39672 3345.94806 3345.34005 3343.03700 3339.80326 3336.46397 3333.90149 3333.10272
[161] 3334.29421 3337.81087 3343.53943 3351.20699 3360.65966 3370.86645 3381.56693 3392.54603 3402.66565 3411.98625
[171] 3420.52889 3427.65472 3433.82738 3439.48350 3444.52521 3449.15602 3453.47469 3457.18103 3460.26646 3462.61691
[181] 3463.90801 3464.03740 3462.81764 3460.39884 3456.89191 3452.34917 3447.51817 3442.81170 3438.49642 3434.61442
[191] 3430.68032 3426.12851 3420.51956 3412.97424 3402.44270 3389.08015 3372.22571 3350.92543 3326.65679 3299.18832
[201] 3267.98034 3235.60437 3201.97284 3166.74241 3132.31425 3097.84231 3062.28419 3027.69000 2992.94842 2956.82062
[211] 2921.23160 2884.94573 2846.71167 2808.67879 2769.66061 2728.44573 2687.49711 2645.56586 2600.90609 2555.63728
[221] 2507.95605 2455.68553 2401.27869 2342.78231 2278.34602 2212.01091 2142.26985 2067.55831 1993.06085 1917.46648
[231] 1839.35164 1764.18963 1690.48889 1616.92292 1548.58020 1483.78349 1421.22958 1365.02723 1313.47540 1265.38224
[241] 1223.67578 1186.75059 1153.52704 1125.77912 1102.26304 1082.24588 1066.67248 1054.56916 1045.35940 1039.20608
[251] 1035.34023 1033.24970 1032.58511 1032.85175 1033.69725 1034.73437 1035.66522 1036.21146 1036.16962 1035.42480
[261] 1033.76896 1031.12350 1027.27529 1021.86005 1014.99372 1006.33762 995.34857 982.53272 967.47341 949.51507
[271] 929.75179 907.75896 882.86053 856.68919 828.72692 798.22411 767.10143 734.59731 699.82246 665.13042
[281] 629.85926 593.30425 558.00149 523.24723 488.37898 455.79640 424.78607 394.77350 367.79586 343.17422
[291] 320.38235 300.83710 283.85695 268.87085 256.54269 246.16897 237.22002 229.91066 223.66652 218.11256
[301] 213.36419 209.04868 204.88159 200.94805
smooth <- predict(smooth.spline(x,y))
Then I wrap this object in a function:
func <- function(x) smooth
#Attempt to integrate
integrate(func,0,10)$value
Error in stats::integrate(...) :
evaluation of function gave a result of wrong length
I get the same error when I attempt to Vectorize the function
> integrate(Vectorize(func),0,10)$value
Error in stats::integrate(...) :
evaluation of function gave a result of wrong length
Ultimately, I'm trying to find an upper limit of integration that gives a specified area under the curve, but I can't even get the integration itself to work.
You didn't include any reproducible data, so I can't test this advice for you, but here are two suggestions.
First: if you are starting with the smooth object that has evenly spaced x values and corresponding y values from predictions from the spline, then don't bother with integrate(), just use the trapezoidal rule to approximate the integral:
with(smooth, (x[2]-x[1])*(sum(y) - mean(y[c(1, length(y))])))
The Simpson's rule formula would be a bit more accurate but also more complicated.
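For reference, a composite Simpson's rule version might look like this (a sketch; it assumes evenly spaced points and requires an odd number of points, i.e. an even number of intervals):

```r
# Composite Simpson's rule over evenly spaced points
simpson <- function(x, y) {
  n <- length(x)
  stopifnot(n %% 2 == 1)          # needs an even number of intervals
  h <- x[2] - x[1]                # common spacing
  (h / 3) * (y[1] + y[n] +
             4 * sum(y[seq(2, n - 1, by = 2)]) +
             2 * sum(y[seq(3, n - 2, by = 2)]))
}
# with(smooth, simpson(x, y))
```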
Second: if you are starting with data vectors x and y, then you should construct a function which takes a vector of new x values and returns the corresponding predictions of y, and pass that function to integrate(). Here I do it that way:
fit <- smooth.spline(x, y)
smooth <- function(x) predict(fit, x)$y
integrate(smooth, 0, 10)
I am running PCR on a data set, but the results give the same values for both CV and adjCV. Is this correct, or is something wrong with the data?
Here is my code:
library(pls)

pcr <- pcr(F1 ~ ., data = data, scale = TRUE, validation = "CV")
summary(pcr)
validationplot(pcr)
validationplot(pcr, val.type = "MSEP")
validationplot(pcr, val.type = "R2")
predplot(pcr)
coefplot(pcr)

set.seed(123)
ind <- sample(2, nrow(data), replace = TRUE, prob = c(0.8, 0.2))
train <- data[ind == 1, ]
test <- data[ind == 2, ]

pcr_train <- pcr(F1 ~ ., data = train, scale = TRUE, validation = "CV")
y_test <- test[, 1]
pcr_pred <- predict(pcr, test, ncomp = 4)
mean((pcr_pred - y_test)^2)
And I get this warning when I run the mean command:
Warning in mean.default((pcr_pred - y_test)^2) :
argument is not numeric or logical: returning NA
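One likely cause (an assumption, since the full data object isn't shown): if `data` is a tibble rather than a base data.frame, `test[, 1]` stays a one-column tibble, so `pcr_pred - y_test` becomes a data.frame and `mean()` warns and returns NA. Extracting the response as a plain vector avoids this:

```r
# Extract the response as a numeric vector, not a one-column data frame
y_test <- test$F1               # or: test[, 1, drop = TRUE]

# Also note: the predictions should probably come from the model fit on the
# training data, not the full-data fit
pcr_pred <- predict(pcr_train, test, ncomp = 4)
mean((pcr_pred - y_test)^2)
```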
Sample data:
F1 F2 F3 F4 F5
4.378 2.028 -5.822 -3.534 -0.546
4.436 2.064 -5.872 -3.538 -0.623
4.323 1.668 -5.954 -3.304 -0.782
5.215 3.319 -5.863 -4.139 -0.632
4.074 1.497 -6.018 -3.176 -0.697
4.403 1.761 -6 -3.339 -0.847
4.99 3.105 -5.985 -3.97 -0.638
4.783 2.968 -5.94 -3.903 -0.481
4.361 1.786 -5.866 -3.397 -0.685
4.594 1.958 -5.985 -3.457 -0.91
0.858 -4.734 -6.104 -0.692 -0.87
0.878 -3.846 -6.289 -1.064 -0.618
0.876 -4.479 -6.148 -0.803 -0.801
0.937 -5.498 -5.958 -0.376 -1.184
0.953 -4.71 -6.123 -0.705 -0.96
0.738 -5.386 -5.877 -0.444 -0.884
0.833 -5.562 -5.937 -0.343 -1.104
1.184 -3.52 -6.221 -1.234 -0.38
1.3 -4.129 -6.168 -0.963 -0.73
3.359 -3.618 -5.302 0.481 -0.649
3.483 -2.938 -5.361 0.157 -0.482
3.673 -3.779 -5.326 0.516 -1.053
2.521 -6.577 -4.499 1.861 -1.374
2.52 -4.757 -4.866 1.182 -0.736
2.482 -4.732 -4.857 1.142 -0.708
2.543 -6.699 -4.496 1.947 -1.426
2.458 -3.182 -5.219 0.514 -0.255
2.558 -5.66 -4.757 1.558 -1.142
2.627 -1.806 -5.313 -1.808 1.054
3.773 -0.526 -5.236 -0.6 -0.23
3.65 -0.954 -4.97 -0.361 -0.413
3.816 -1.18 -5.228 -0.284 -0.575
3.752 -0.522 -5.346 -0.562 -0.293
3.961 -0.24 -5.423 -0.69 -0.408
3.734 -0.711 -5.307 -0.479 -0.347
4.094 -0.415 -5.103 -0.729 -0.35
3.894 -0.957 -5.133 -0.435 -0.457
3.741 -0.484 -5.363 -0.574 -0.279
3.6 -0.698 -5.422 -0.435 -0.306
3.845 -0.351 -5.306 -0.666 -0.269
3.886 -0.481 -5.332 -0.596 -0.39
3.552 -2.106 -5.043 0.128 -0.634
4.336 -10.323 -2.95 3.346 -3.494
3.918 -0.809 -5.315 -0.442 -0.567
3.757 -0.502 -5.347 -0.572 -0.288
3.712 -0.627 -5.353 -0.505 -0.314
3.954 -0.72 -5.492 -0.428 -0.691
4.088 -0.588 -5.412 -0.53 -0.688
3.728 -0.641 -5.338 -0.505 -0.321
I have the below data set:
Profit     MRO 15x5   D30
$150.00    -9.189     -0.24
$12.50     -6.076     -0.248
-$125.00   -7.699     -0.282
-$162.50   -8.008     -0.281
-$175.00   -0.183     -0.056
-$175.00   -0.235     -0.061
$275.00    0.141      -0.027
-$175.00   -4.062     -0.103
-$162.50   -5.654     -0.258
-$162.50   -1.578     -0.051
-$175.00   -3.336     -0.205
-$162.50   -1.523     -0.022
$412.50    -1.524     -0.194
$337.50    -1.049     -0.055
$100.00    -1.043     -0.059
I want to first arrange column D30 in ascending order and then look at the Profit column. If the top n rows and/or the bottom n rows of the sorted data have Profit values less than -50, those entire rows should be deleted from the data set.
The result would be like this:
Profit     MRO 15x5   D30
$275.00    0.141      -0.027
-$162.50   -1.578     -0.051
$337.50    -1.049     -0.055
-$175.00   -0.183     -0.056
$100.00    -1.043     -0.059
-$175.00   -0.235     -0.061
-$175.00   -4.062     -0.103
$412.50    -1.524     -0.194
-$175.00   -3.336     -0.205
$150.00    -9.189     -0.24
$12.50     -6.076     -0.248
This output results from deleting the top 1 row and the bottom 3 rows of the sorted data set, because those rows had Profit values less than -50.
Can anyone please help me do this in R using dplyr or another filtering package?
I would be thankful for your kind support.
Regards,
Farhan
Use cumany. Combined with filter, it removes leading rows until a criterion is first met (here, it drops rows while Profit <= -50).
The first mutate step parses your Profit column into a numeric column.
library(dplyr)
library(readr)    # parse_number
library(stringr)  # str_replace

data %>%
  mutate(Profit = parse_number(str_replace(Profit, "^-\\$(.*)$", "$-\\1"))) %>%
  arrange(D30) %>%
  filter(cumany(Profit > -50)) %>%
  arrange(desc(D30)) %>%
  filter(cumany(Profit > -50))
Profit MRO_15x5 D30
1 275.0 0.141 -0.027
2 -162.5 -1.578 -0.051
3 337.5 -1.049 -0.055
4 -175.0 -0.183 -0.056
5 100.0 -1.043 -0.059
6 -175.0 -0.235 -0.061
7 -175.0 -4.062 -0.103
8 412.5 -1.524 -0.194
9 -175.0 -3.336 -0.205
10 150.0 -9.189 -0.240
11 12.5 -6.076 -0.248
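To see why applying filter(cumany(...)) once per sort direction trims both ends, note that cumany(x) is FALSE up to the first TRUE and TRUE from then on, so filtering on it drops only the leading rows that fail the condition:

```r
library(dplyr)

cumany(c(FALSE, FALSE, TRUE, FALSE, TRUE))
# [1] FALSE FALSE  TRUE  TRUE  TRUE

# So after arrange(D30), filter(cumany(Profit > -50)) drops the leading
# low-Profit rows; re-sorting with arrange(desc(D30)) and filtering again
# drops the trailing ones.
```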
I'm trying to declare the colorAxis and let a series of computed scores ("OutlierScore") define the gradient for coloring the bubbles. The visualization keeps giving me seemingly random colors, each with its OutlierScore value next to it in an ugly legend to the right of the plot. I don't understand what I'm doing wrong, since my options list matches all of the demo code I can find. I'm using the final gvisBubbleChart statement as the output of my renderGvis code in server.R.
Here's some sample data:
Attribute CloseRate Quotes OutlierScore Size
AdvancedShopper:N 0.261 3411 292.47 1.016
AdvancedShopper:Y 0.119 10421 259.68 2.283
PriorCarrier:HP 0.277 1876 186.46 0.739
Vehicles:1 0.183 8784 179.98 1.988
Vehicles:2 0.106 3471 121.81 1.027
LeadType:Cold 0.104 3177 117.09 0.974
SPINOFF:Y 0.414 510 115.65 0.492
LeadType:Warm 0.223 2184 115.47 0.795
MULTI_CAR_DSCNT_FLG:HMC 0.303 879 107.88 0.559
MULTI_CAR_DSCNT_FLG:MC 0.111 3451 105.75 1.024
PRI_CARR_NME:HP 0.253 1287 100.58 0.633
PriorCarrier:GEICO 0.099 2476 99.74 0.847
PriorCarrier:No Prior Insurance 0.304 802 99.61 0.545
PRI_CARR_NME:No Prior Insurance 0.304 802 99.61 0.545
FR_BAND:P-R 0.112 3227 98.15 0.983
PIP_DED:2,500 0.197 3053 95.11 0.952
AgencyName:South Agency 0.213 2120 94.81 0.783
RSrc:SPIN-OFF Additional Policy 0.434 373 91.99 0.467
CompanionType:None 0.141 11332 87.60 2.448
D2V:D1V1 0.175 5830 85.67 1.454
Here's my gvisBubbleChart declaration.
YLim <- c(0, max(GData$Quotes) * 1.05)
XLim <- c(0, max(GData$CloseRate) * 1.01)

gvisBubbleChart(GData, idvar = "Attribute", xvar = "CloseRate", yvar = "Quotes",
                colorvar = "OutlierScore", sizevar = "Size",
                options = list(title = "One-Way Bubble Chart",
                               hAxis = paste("{title: 'Close Rate', minValue:0, maxValue:", XLim[2], "}", sep = ""),
                               vAxis = paste("{title: 'Quotes', minValue:0, maxValue:", YLim[2], "}", sep = ""),
                               width = 1400, height = 600,
                               colorAxis = "{minValue: 0, colors: ['red', 'green']}",
                               sizeAxis = '{minValue: 0, maxSize: 10}',
                               bubble = "{textStyle:{color: 'none'}}"))
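One thing worth checking (an assumption, since the full GData object isn't shown): colorAxis only produces a gradient when the colorvar column is numeric. If OutlierScore arrives as a factor or character column, the chart treats it as categorical and gives every value its own legend entry and color, which matches the symptom described:

```r
# Verify the color variable is numeric; coerce it if it came in as
# factor/character (e.g. from read.csv)
str(GData$OutlierScore)
GData$OutlierScore <- as.numeric(as.character(GData$OutlierScore))
```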
I have a data set for different time intervals. The data has three comment lines before the data for each time interval, and for each time interval there are 500 data points. I want to reshape the dataset into the following format:
t1        t2        t3        ...
0.00208   0.00417   0.00625   ...
a1        a2        a3        ...
b1        b2        b3        ...
c1        c2        c3        ...
...
The link to the file is: https://www.dropbox.com/s/hc8n3qcai1mlxca/WAT_DEP.DAT
As you will see in the file, the time for each interval is the second value on the third line before the data starts; for the first block, t = 0.00208. I need to collapse the data from several rows into one column per time, and finally create a data frame in the format shown above. In the sample above, a1, b1, c1 are the data for time t1, and so on.
I am sorry for posting a relatively large data set.
Thank you for the help.
Sample data added
The sample data is as follows:
** N:SNAPSHOT TIME DELT[S]
** WATER DEPTH [M]: (HP(L),L=2,LA)
1800 0.00208 0.10000
3.224 3.221 3.220 3.217 3.216 3.214 3.212 3.210 3.209 3.207
3.205 3.203 3.202 3.200 3.199 3.197 3.196 3.193 3.192 3.190
3.189 3.187 3.186 3.184 3.184 3.182 3.181 3.179 3.178 3.176
3.175 3.174 3.173 3.171 3.170 3.169 3.168 3.167 3.166 3.164
3.164 3.162 3.162 3.160 3.160 3.158 3.158 3.156 3.156 3.155
3.154 3.153 3.152 3.151 3.150 3.150 3.149 3.149 3.147 3.147
3.146 3.146 3.145 3.145 3.144 3.144 3.143 3.143 3.142 3.142
3.141 3.142 3.141 3.141 3.140 3.141 3.140 3.140 3.139 3.140
3.139 3.140 3.139 3.140 3.139 3.140 3.139 3.140 3.139 3.140
3.139 3.140 3.140 3.140 3.140 3.141 3.141 3.142 3.141 3.142
3.142 3.142 3.143 3.143 3.144 3.144 3.145 3.145 3.146 3.146
3.147 3.148 3.149 3.149 3.150 3.150 3.152 3.152 3.153 3.154
3.155 3.156 3.157 3.158 3.159 3.160 3.161 3.162 3.163 3.164
3.165 3.166 3.168 3.169 3.170 3.171 3.173 3.174 3.176 3.176
3.178 3.179 3.181 3.182 3.184 3.185 3.187 3.188 3.190 3.191
3.194 3.195 3.196 3.198 3.199 3.202 3.203 3.205 3.207 3.209
3.210 3.213 3.214 3.217 3.218 3.221 3.222 3.225 3.226 3.229
3.231 3.233 3.235 3.238 3.239 3.242 3.244 3.247 3.248 3.251
3.253 3.256 3.258 3.261 3.263 3.266 3.268 3.271 3.273 3.276
3.278 3.281 3.283 3.286 3.289 3.292 3.294 3.297 3.299 3.303
3.305 3.307 3.311 3.313 3.317 3.319 3.322 3.325 3.328 3.331
3.334 3.337 3.340 3.343 3.347 3.349 3.353 3.356 3.359 3.362
3.366 3.369 3.372 3.375 3.379 3.382 3.386 3.388 3.392 3.395
3.399 3.402 3.406 3.409 3.413 3.416 3.420 3.423 3.427 3.430
3.435 3.438 3.442 3.445 3.449 3.453 3.457 3.460 3.464 3.468
3.472 3.475 3.479 3.483 3.486 3.491 3.494 3.498 3.502 3.506
3.510 3.514 3.518 3.522 3.526 3.531 3.534 3.539 3.542 3.547
3.551 3.555 3.559 3.564 3.567 3.572 3.576 3.581 3.584 3.589
3.593 3.598 3.602 3.606 3.610 3.615 3.619 3.624 3.628 3.633
3.637 3.642 3.646 3.651 3.655 3.660 3.664 3.669 3.673 3.678
3.682 3.686 3.691 3.695 3.700 3.704 3.710 3.714 3.719 3.723
3.728 3.733 3.738 3.742 3.747 3.752 3.757 3.761 3.766 3.771
3.776 3.780 3.786 3.790 3.795 3.800 3.805 3.810 3.815 3.819
3.825 3.829 3.835 3.839 3.845 3.849 3.855 3.859 3.865 3.869
3.875 3.879 3.885 3.889 3.895 3.900 3.905 3.910 3.915 3.920
3.926 3.930 3.935 3.941 3.945 3.951 3.956 3.961 3.966 3.972
3.976 3.982 3.987 3.993 3.997 4.003 4.008 4.014 4.018 4.024
4.029 4.035 4.039 4.045 4.050 4.056 4.061 4.066 4.071 4.077
4.082 4.088 4.093 4.099 4.103 4.109 4.114 4.120 4.125 4.131
4.136 4.142 4.147 4.153 4.157 4.163 4.168 4.174 4.179 4.185
4.190 4.195 4.201 4.206 4.212 4.217 4.223 4.228 4.234 4.239
4.245 4.250 4.256 4.261 4.267 4.272 4.278 4.283 4.289 4.294
4.300 4.305 4.311 4.316 4.322 4.327 4.333 4.339 4.345 4.350
4.356 4.361 4.367 4.372 4.378 4.383 4.389 4.394 4.400 4.405
4.411 4.417 4.423 4.428 4.434 4.439 4.445 4.450 4.456 4.461
4.467 4.473 4.478 4.484 4.489 4.495 4.500 4.506 4.511 4.517
4.523 4.529 4.534 4.540 4.545 4.551 4.556 4.562 4.568 4.574
4.579 4.585 4.590 4.596 4.601 4.607 4.613 4.619 4.624 4.630
4.635 4.641 4.646 4.652 4.658 4.664 4.669 4.675 4.680 4.686
4.691 4.697 4.703 4.709 4.714 4.720 4.725 4.731 4.736 4.741
** N:SNAPSHOT TIME DELT[S]
** WATER DEPTH [M]: (HP(L),L=2,LA)
3600 0.00417 0.10000
4.124 4.123 4.123 4.122 4.122 4.121 4.121 4.120 4.120 4.119
4.118 4.117 4.117 4.116 4.116 4.115 4.115 4.114 4.114 4.114
4.114 4.113 4.113 4.112 4.112 4.111 4.111 4.110 4.110 4.109
4.109 4.109 4.109 4.108 4.108 4.107 4.107 4.106 4.107 4.106
4.106 4.105 4.105 4.105 4.105 4.104 4.104 4.104 4.104 4.103
4.103 4.103 4.102 4.102 4.102 4.102 4.101 4.102 4.101 4.101
4.101 4.101 4.100 4.101 4.100 4.101 4.100 4.100 4.100 4.100
4.100 4.100 4.100 4.100 4.100 4.100 4.100 4.100 4.100 4.100
4.100 4.100 4.100 4.100 4.100 4.100 4.100 4.100 4.100 4.101
4.100 4.101 4.100 4.101 4.101 4.101 4.101 4.102 4.101 4.102
4.102 4.101 4.102 4.102 4.103 4.102 4.103 4.103 4.104 4.103
4.104 4.104 4.105 4.104 4.105 4.105 4.106 4.106 4.107 4.106
4.107 4.107 4.108 4.108 4.109 4.109 4.110 4.110 4.110 4.110
4.111 4.111 4.112 4.112 4.113 4.113 4.114 4.114 4.115 4.115
4.116 4.116 4.117 4.117 4.118 4.118 4.120 4.120 4.121 4.121
4.122 4.122 4.122 4.123 4.123 4.125 4.125 4.126 4.126 4.127
4.128 4.129 4.129 4.130 4.130 4.132 4.132 4.133 4.133 4.135
4.135 4.136 4.137 4.138 4.138 4.139 4.140 4.141 4.141 4.143
4.143 4.145 4.145 4.146 4.147 4.148 4.149 4.150 4.150 4.152
4.152 4.154 4.154 4.156 4.156 4.158 4.158 4.160 4.160 4.162
4.162 4.163 4.164 4.165 4.166 4.167 4.168 4.169 4.171 4.171
4.173 4.173 4.175 4.176 4.177 4.178 4.180 4.180 4.182 4.183
4.184 4.185 4.187 4.187 4.189 4.190 4.192 4.192 4.194 4.195
4.197 4.197 4.199 4.200 4.202 4.203 4.204 4.205 4.207 4.208
4.210 4.210 4.212 4.213 4.215 4.216 4.218 4.219 4.221 4.221
4.223 4.224 4.225 4.227 4.228 4.230 4.231 4.233 4.234 4.236
4.237 4.239 4.240 4.242 4.243 4.245 4.246 4.248 4.249 4.251
4.252 4.254 4.255 4.257 4.258 4.260 4.262 4.264 4.265 4.267
4.268 4.270 4.271 4.273 4.275 4.277 4.278 4.280 4.281 4.283
4.285 4.287 4.288 4.290 4.291 4.294 4.295 4.297 4.298 4.301
4.302 4.303 4.305 4.307 4.309 4.310 4.312 4.314 4.316 4.317
4.320 4.321 4.323 4.325 4.327 4.328 4.331 4.332 4.334 4.336
4.338 4.339 4.342 4.343 4.346 4.347 4.349 4.351 4.353 4.355
4.357 4.359 4.361 4.362 4.365 4.366 4.369 4.370 4.373 4.374
4.377 4.378 4.381 4.382 4.385 4.386 4.389 4.390 4.393 4.394
4.397 4.398 4.400 4.402 4.404 4.406 4.408 4.411 4.412 4.415
4.416 4.419 4.421 4.423 4.425 4.427 4.429 4.432 4.433 4.436
4.437 4.440 4.442 4.444 4.446 4.449 4.450 4.453 4.455 4.457
4.459 4.462 4.463 4.466 4.468 4.470 4.472 4.475 4.476 4.479
4.481 4.484 4.485 4.488 4.490 4.492 4.494 4.497 4.499 4.501
4.503 4.505 4.508 4.509 4.512 4.514 4.517 4.519 4.521 4.523
4.526 4.528 4.530 4.532 4.535 4.537 4.540 4.541 4.544 4.546
4.549 4.551 4.554 4.555 4.558 4.560 4.563 4.565 4.568 4.569
4.572 4.574 4.577 4.579 4.582 4.584 4.586 4.588 4.591 4.593
4.596 4.598 4.601 4.603 4.605 4.607 4.610 4.612 4.615 4.617
4.620 4.622 4.624 4.627 4.628 4.631 4.633 4.636 4.638 4.641
4.643 4.646 4.648 4.651 4.653 4.656 4.657 4.660 4.662 4.665
4.667 4.670 4.672 4.675 4.677 4.680 4.682 4.685 4.687 4.690
4.692 4.695 4.697 4.700 4.702 4.705 4.706 4.709 4.711 4.714
4.716 4.719 4.721 4.724 4.726 4.729 4.731 4.734 4.736 4.741
Currently, the data for each time is laid out in rows of 10 columns. I want to turn it into a single column of 500 data points, reading row by row: first the values on row 1, then row 2, and so on. That way, there is one column per time.
This produces a matrix, result, containing the times in the first row and the data in columns underneath the corresponding time.
L <- readLines(infile)                      # e.g. infile <- "WAT_DEP.DAT"
nt <- length(grep("TIME", L))               # no. of TIME (header) blocks
nd <- round((length(L) / nt) - 3)           # no. of data lines per time block

# times: pick the 3rd line of each block, then the 2nd number on it
ix.times <- rep(c(FALSE, TRUE, FALSE), c(2, 1, nd))
times <- scan(text = L[ix.times])[c(FALSE, TRUE, FALSE)]

# data: skip the 3 header lines of each block
ix.dat <- rep(c(FALSE, TRUE), c(3, nd))
dat <- matrix(scan(text = L[ix.dat]), ncol = nt)
result <- rbind(times, dat)
The first few rows are:
> head(result)
[,1] [,2]
times 0.00208 0.00417
3.22400 4.12400
3.22100 4.12300
3.22000 4.12300
3.21700 4.12200
3.21600 4.12200
For the first part of your question: one idea for removing the comment lines is to use recycling. First, read all the data using fill = TRUE:
dat <- read.table(file = file.Name, fill = TRUE)
Then, since each block has a fixed number of rows (here 50 data lines of 10 values = 500 points per time), you can do this:
dat <- dat[c(rep(FALSE, 3), rep(TRUE, 50)), ]
You will get a clean data.frame.
I don't understand the second part of your question.
Second part solution:
First, call the sample data `sample`. I assume two columns in the solution below; you can use `lapply` to apply the same step to other columns.
col.1 <- as.data.frame(sample[, 1])
col.2 <- as.data.frame(sample[, 2])
Now col.1 and col.2 are data frames. Make sure they have the same colnames for `rbind` to work:
sample.1 <- rbind(col.1, col.2)
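A more direct route to the "one column per time" shape (a sketch under the assumptions above: comment lines already stripped from `dat`, and 50 rows of 10 values per block):

```r
# The comment lines may have forced character columns; coerce to numeric
dat[] <- lapply(dat, function(col) as.numeric(as.character(col)))

rows_per_block <- 50                      # 50 rows x 10 columns = 500 points
nblocks <- nrow(dat) / rows_per_block
cols <- lapply(seq_len(nblocks), function(i) {
  block <- as.matrix(dat[((i - 1) * rows_per_block + 1):(i * rows_per_block), ])
  as.vector(t(block))                     # row-major: row 1 first, then row 2, ...
})
result <- do.call(cbind, cols)            # one 500-value column per time
```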