I am a beginner in R and have the following problem: I want to load a CSV file into R and then convert it into an xts object, but the conversion produces an error. First, a small snippet of the data:
a=read.csv('/Users/..../Desktop/SYNEKTIK.csv',h=T)
head(a)
Name Date Open High Low Close Volume
1 SYNEKTIK 20110809 5.76 8.23 5.76 8.23 28062
2 SYNEKTIK 20110810 9.78 9.78 8.10 8.13 9882
3 SYNEKTIK 20110811 9.00 9.00 9.00 9.00 2978
4 SYNEKTIK 20110812 9.70 9.70 8.90 9.60 5748
5 SYNEKTIK 20110816 9.70 11.00 9.70 11.00 23100
6 SYNEKTIK 20110818 10.90 11.00 10.90 10.90 319
The following does not work:
w=xts(a[,-1],order.by=as.POSIXct(a[,1]))
As it produces the following error:
Error in as.POSIXlt.character(as.character(x), ...) :
  character string is not in a standard unambiguous format
Another try that did not work:
a=a[,-1]
head(a)
Date Open High Low Close Volume
1 20110809 5.76 8.23 5.76 8.23 28062
2 20110810 9.78 9.78 8.10 8.13 9882
3 20110811 9.00 9.00 9.00 9.00 2978
4 20110812 9.70 9.70 8.90 9.60 5748
5 20110816 9.70 11.00 9.70 11.00 23100
6 20110818 10.90 11.00 10.90 10.90 319
w=xts(a[,-1],order.by=as.POSIXct(a[,1]))
Error in as.POSIXct.numeric(a[, 1]) : 'origin' must be supplied
Finally, when I saved the dates in the format yyyy-mm-dd, everything worked and I could convert to an xts object. Why?
Maybe something like this will help. Your first call failed because a[, 1] is the Name column, a character string that as.POSIXct cannot parse as a date; after dropping that column, the Date values are plain numbers like 20110809, so as.POSIXct treats them as seconds from an 'origin' and asks you to supply one. Dates written as yyyy-mm-dd are in a standard unambiguous format, which is why that version worked. Convert the numeric dates to character and tell R the format explicitly:
w <- xts(a[,c(-1,-2)],order.by=as.Date(as.character(a[,2]),"%Y%m%d"))
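If you need a POSIXct index instead of a Date index, the same idea works; here is a sketch using the data frame a read in above, converting the integer dates to character and parsing them with an explicit format:
library(xts)
# column 2 holds dates stored as integers like 20110809, so convert to
# character before parsing; tz is set explicitly to keep the index stable
w <- xts(a[, c(-1, -2)],
         order.by = as.POSIXct(as.character(a[, 2]), format = "%Y%m%d", tz = "UTC"))
head(w)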
Hi everyone. I am using the dlnm package in R to analyze the lag effect of climatic conditions on the prevalence of a disease.
I followed somebody else's program closely, and it worked for avg.temp and max.speed, but it shows the error "coef/vcov not consistent with basis matrix" for avg.ap and avg.hum. I only changed which variable is used in the code and left everything else the same.
I have a hypothesis that maybe DLNM doesn't like wet weather. T T
I don't know what to do; can you help me?
Part 1 is the code that ran successfully, Part 2 is the code that produced the error, and Part 3 is the data I used.
Thank you very much.
Part 1. Successfully run code
attach(cpdlnm)
cb.temp = crossbasis(avg.temp, lag=1 ,
argvar=list(fun="ns",
knots= c(10)),
arglag=list(fun="lin"))
modeltemp = glm(pre1 ~ cb.temp +
ns(no,1*1),
family=quasipoisson(), cpdlnm)
pred1.temp = crosspred(cb.temp,
                       modeltemp,
                       cen=round(median(avg.temp)),
                       bylag=1)
Part 2. Error code
attach(cpdlnm)
cb.hum = crossbasis(avg.hum, lag=1 ,
argvar=list(fun="ns",
knots= c(10)),
arglag=list(fun="lin"))
modelhum = glm(pre1 ~ cb.hum +
ns(no,1*1),
family=quasipoisson(), cpdlnm)
pred1.hum = crosspred(cb.hum, # This step shows "coef/vcov not consistent with basis matrix"
modelhum,
cen=round(median(avg.hum)),
bylag=0.1)
Part 3. The data are as follows:
no pre1 date year month avg.ap avg.temp avg.hum max.speed
1 3.23 12-Jan 2012 1 996.60 9.00 81.60 5.30
2 6.04 12-Feb 2012 2 993.20 10.90 80.80 6.20
3 5.18 12-Mar 2012 3 991.00 16.40 78.70 7.60
4 4.07 12-Apr 2012 4 985.40 23.50 73.50 7.40
5 4.88 12-May 2012 5 982.60 26.30 77.20 7.00
6 5.11 12-Jun 2012 6 978.10 27.00 81.30 6.20
7 6.18 12-Jul 2012 7 979.50 28.10 77.70 6.40
8 6.17 12-Aug 2012 8 980.40 28.00 75.60 7.90
9 5.18 12-Sep 2012 9 987.60 25.30 73.60 6.30
10 5.16 12-Oct 2012 10 990.70 23.60 72.20 6.20
11 4.61 12-Nov 2012 11 991.70 18.00 79.70 6.90
12 5.26 12-Dec 2012 12 995.00 13.20 74.90 6.50
13 3.79 13-Jan 2013 1 997.10 11.20 78.40 5.70
14 3.87 13-Feb 2013 2 993.50 15.30 82.20 6.50
15 3.37 13-Mar 2013 3 989.90 20.20 74.20 8.00
16 2.85 13-Apr 2013 4 987.00 21.50 78.50 7.70
17 4.38 13-May 2013 5 983.30 25.60 79.20 6.80
18 5.67 13-Jun 2013 6 980.60 27.40 76.90 6.60
19 6.45 13-Jul 2013 7 981.30 28.00 77.50 7.10
20 6.95 13-Aug 2013 8 980.50 27.90 78.20 7.90
21 6.51 13-Sep 2013 9 985.90 25.40 77.60 6.00
22 8.16 13-Oct 2013 10 992.20 22.10 68.80 5.30
23 5.34 13-Nov 2013 11 994.50 18.70 72.30 6.20
24 6.18 13-Dec 2013 12 997.30 11.70 67.20 5.30
25 5.69 14-Jan 2014 1 996.70 12.70 70.30 6.00
26 6.44 14-Feb 2014 2 993.00 12.10 76.90 6.40
27 4.16 14-Mar 2014 3 991.60 16.50 83.90 7.30
28 4.13 14-Apr 2014 4 987.60 22.60 82.40 6.70
29 3.96 14-May 2014 5 983.60 25.70 78.80 7.70
30 4.72 14-Jun 2014 6 979.20 27.70 81.40 7.90
31 5.21 14-Jul 2014 7 980.70 28.30 80.20 9.40
32 5.29 14-Aug 2014 8 982.40 27.50 81.30 7.50
33 6.74 14-Sep 2014 9 984.70 27.10 77.70 8.50
34 4.80 14-Oct 2014 10 991.20 23.90 73.10 5.90
35 4.31 14-Nov 2014 11 993.30 18.60 79.60 6.20
36 4.35 14-Dec 2014 12 998.70 12.30 67.30 5.90
37 2.95 15-Jan 2015 1 996.70 13.30 76.30 6.20
38 4.63 15-Feb 2015 2 993.50 15.50 78.30 6.50
39 4.00 15-Mar 2015 3 991.70 17.70 83.40 6.30
40 4.16 15-Apr 2015 4 988.40 22.80 70.20 7.30
41 4.67 15-May 2015 5 982.40 26.70 80.50 8.00
42 5.62 15-Jun 2015 6 980.90 28.20 81.00 7.40
43 5.04 15-Jul 2015 7 980.20 27.30 79.40 6.70
44 5.79 15-Aug 2015 8 982.40 27.60 80.10 6.50
45 5.28 15-Sep 2015 9 986.30 26.00 84.60 6.50
46 4.39 15-Oct 2015 10 991.20 23.00 78.30 6.90
47 4.13 15-Nov 2015 11 993.50 19.40 85.30 6.90
48 3.30 15-Dec 2015 12 997.80 13.00 80.90 5.70
49 5.30 16-Jan 2016 1 996.00 11.80 82.30 6.40
50 4.57 16-Feb 2016 2 997.80 12.20 68.90 7.00
51 4.66 16-Mar 2016 3 991.70 17.00 78.90 7.00
52 4.01 16-Apr 2016 4 984.60 23.40 80.90 9.80
53 4.90 16-May 2016 5 983.80 25.50 78.70 8.30
54 3.75 16-Jun 2016 6 981.70 28.20 78.80 7.70
55 3.13 16-Jul 2016 7 981.10 28.90 77.60 7.60
56 3.25 16-Aug 2016 8 979.00 28.00 79.80 8.70
57 2.93 16-Sep 2016 9 984.30 26.60 75.20 6.40
58 2.93 16-Oct 2016 10 987.90 24.40 72.90 7.00
59 3.08 16-Nov 2016 11 993.40 18.10 79.60 6.70
60 2.99 16-Dec 2016 12 995.70 15.40 71.70 6.80
61 3.10 17-Jan 2017 1 994.70 14.50 79.20 6.50
62 3.75 17-Feb 2017 2 994.80 14.70 71.50 8.30
63 3.49 17-Mar 2017 3 990.20 16.50 83.60 8.50
64 3.36 17-Apr 2017 4 986.80 21.90 76.70 7.80
65 3.69 17-May 2017 5 985.00 24.80 77.50 10.00
66 3.76 17-Jun 2017 6 980.20 26.90 84.80 8.50
67 2.69 17-Jul 2017 7 981.00 27.50 83.60 9.80
68 3.05 17-Aug 2017 8 980.50 27.70 83.40 9.00
69 3.05 17-Sep 2017 9 984.20 27.60 81.50 7.10
70 2.46 17-Oct 2017 10 990.00 22.80 75.90 7.90
71 2.08 17-Nov 2017 11 993.00 17.80 79.50 7.00
72 2.32 17-Dec 2017 12 996.90 13.30 69.30 6.90
73 2.53 18-Jan 2018 1 992.10 12.00 78.40 8.10
74 3.29 18-Feb 2018 2 992.90 13.40 68.70 7.20
75 3.03 18-Mar 2018 3 988.30 19.20 78.20 9.10
76 2.30 18-Apr 2018 4 986.50 21.80 77.30 8.70
77 1.75 18-May 2018 5 982.60 26.70 79.40 8.90
78 2.03 18-Jun 2018 6 978.30 26.90 81.60 9.00
79 2.79 18-Jul 2018 7 976.80 27.90 82.10 9.20
80 2.32 18-Aug 2018 8 976.40 27.50 83.40 9.60
81 1.88 18-Sep 2018 9 983.50 26.10 80.10 8.90
82 2.76 18-Oct 2018 10 990.50 21.10 78.70 7.10
83 2.14 18-Nov 2018 11 991.50 18.20 80.30 7.10
84 1.78 18-Dec 2018 12 994.50 13.00 84.00 7.80
85 2.77 19-Jan 2019 1 995.20 11.70 84.50 7.30
86 4.60 19-Feb 2019 2 990.50 13.70 84.80 8.10
87 2.32 19-Mar 2019 3 987.70 17.30 85.90 9.90
88 2.07 19-Apr 2019 4 983.60 23.10 84.80 9.80
89 2.97 19-May 2019 5 981.80 24.30 83.20 7.70
90 2.48 19-Jun 2019 6 977.80 27.50 84.80 9.00
91 2.32 19-Jul 2019 7 977.20 27.80 85.00 8.90
92 2.06 19-Aug 2019 8 977.20 28.30 81.20 10.30
93 2.10 19-Sep 2019 9 984.60 26.40 72.70 8.20
94 2.89 19-Oct 2019 10 989.10 22.70 78.00 7.00
My guess is that when you specify knots = c(10), 10 is within the range of temperature but not within the range of humidity (avg.hum never drops below about 67). A knot outside the observed range most likely makes the natural-spline basis rank-deficient, so glm drops one of the cross-basis coefficients and crosspred then finds fewer coefficients than basis columns, which is what "coef/vcov not consistent with basis matrix" is complaining about. Place the knot somewhere inside the observed range of the variable you are modelling.
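A minimal sketch of that fix (assuming the same cpdlnm data frame shown in Part 3): let the knot come from the data itself, for example the median of avg.hum, instead of hard-coding 10.
library(dlnm)
library(splines)
# knot placed inside the observed range of avg.hum
cb.hum = crossbasis(cpdlnm$avg.hum, lag = 1,
                    argvar = list(fun = "ns", knots = median(cpdlnm$avg.hum)),
                    arglag = list(fun = "lin"))
modelhum = glm(pre1 ~ cb.hum + ns(no, 1),
               family = quasipoisson(), data = cpdlnm)
pred1.hum = crosspred(cb.hum, modelhum,
                      cen = round(median(cpdlnm$avg.hum)),
                      bylag = 0.1)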
Since my data has summarized counts, I am using the freq function from summarytools with weights.
With weights, the freq function works fine for summarizing a column when:
the column is numeric or integer, or
the column is a factor with NA or NaN values.
But when the column is a factor without NA or NaN values, the summary takes one level away from the column and reports it under NA!
I came across this issue in a live case and have reproduced it with the small example below.
library(data.table)
library(summarytools)
dt <- data.table(A = as.integer(c(5, 3, 4, 5, 6, 1, 2, NA, 3, NaN)),
                 B = c(5, 3, 4, 5, 6, 1, 2, NA, 3, NaN),
                 C = as.factor(c(5, 3, 4, 5, 6, 1, 2, NA, 3, NA)),
                 D = as.factor(c(5, 3, 4, 5, 6, 1, 2, NaN, 3, NaN)),
                 E = as.factor(c(5, 3, 4, 5, 6, 1, 2, 5, 3, 3)),
                 Frequency = c(10, 20, 30, 40, 5, 60, 7, 80, 99, 10))
str(dt)
# Frequency being integer or numeric does not matter;
# a factor without NaN or NA values is what makes the difference
writeLines("\n\n\n Without weights: No errors")
#summarytools::freq(dt[,1:5]) #Commented to minimize clutter
writeLines("\n\n\n With weights, Column E shows incorrect values but not C and D")
summarytools::freq(dt[,1:5],weights=dt$Frequency)
Without weights: No errors
With weights, Column E shows incorrect values but not C and D
1 NaN value(s) converted to NA
0 NaN value(s) converted to NA
Weighted Frequencies
dt$A
Weights: weights
Freq % Valid % Valid Cum. % Total % Total Cum.
1 60.00 22.14 22.14 16.62 16.62
2 7.00 2.58 24.72 1.94 18.56
3 119.00 43.91 68.63 32.96 51.52
4 30.00 11.07 79.70 8.31 59.83
5 50.00 18.45 98.15 13.85 73.68
6 5.00 1.85 100.00 1.39 75.07
<NA> 90.00 24.93 100.00
Total 361.00 100.00 100.00 100.00 100.00
dt$B
Type: Numeric
Freq % Valid % Valid Cum. % Total % Total Cum.
1 60.00 22.14 22.14 16.62 16.62
2 7.00 2.58 24.72 1.94 18.56
3 119.00 43.91 68.63 32.96 51.52
4 30.00 11.07 79.70 8.31 59.83
5 50.00 18.45 98.15 13.85 73.68
6 5.00 1.85 100.00 1.39 75.07
<NA> 90.00 24.93 100.00
Total 361.00 100.00 100.00 100.00 100.00
dt$C
Type: Factor
Freq % Valid % Valid Cum. % Total % Total Cum.
1 60.00 22.14 22.14 16.62 16.62
2 7.00 2.58 24.72 1.94 18.56
3 119.00 43.91 68.63 32.96 51.52
4 30.00 11.07 79.70 8.31 59.83
5 50.00 18.45 98.15 13.85 73.68
6 5.00 1.85 100.00 1.39 75.07
<NA> 90.00 24.93 100.00
Total 361.00 100.00 100.00 100.00 100.00
dt$D
Type: Factor
Freq % Valid % Valid Cum. % Total % Total Cum.
1 60.00 22.14 22.14 16.62 16.62
2 7.00 2.58 24.72 1.94 18.56
3 119.00 43.91 68.63 32.96 51.52
4 30.00 11.07 79.70 8.31 59.83
5 50.00 18.45 98.15 13.85 73.68
6 5.00 1.85 100.00 1.39 75.07
<NA> 90.00 24.93 100.00
Total 361.00 100.00 100.00 100.00 100.00
dt$E
Type: Factor
Freq % Valid % Valid Cum. % Total % Total Cum.
1 60.00 16.85 16.85 16.62 16.62
2 7.00 1.97 18.82 1.94 18.56
3 129.00 36.24 55.06 35.73 54.29
4 30.00 8.43 63.48 8.31 62.60
5 130.00 36.52 100.00 36.01 98.61
<NA> 5.00 1.39 100.00
Total 361.00 100.00 100.00 100.00 100.00
A fix was issued for this. You can install the latest version from GitHub with:
devtools::install_github("dcomtois/summarytools")
or, to get the latest development version:
devtools::install_github("dcomtois/summarytools", ref = "dev-current)
I have a dataframe of market trades and need to multiply only the put returns by -1. I have code that computes the flipped values, but I can't figure out how to assign them back without affecting the calls.
Input df:
Date Type Stock_Open Stock_Close Stock_ROI
0 2016-04-27 Call 5.33 4.80 -0.099437
1 2016-06-03 Put 4.80 4.52 -0.058333
2 2016-06-30 Call 4.52 5.29 0.170354
3 2016-07-21 Put 5.29 4.84 -0.085066
4 2016-08-08 Call 4.84 5.35 0.105372
5 2016-08-25 Put 5.35 4.65 -0.130841
6 2016-09-21 Call 4.65 5.07 0.090323
7 2016-10-13 Put 5.07 4.12 -0.187377
8 2016-11-04 Call 4.12 4.79 0.162621
Code:
flipped_puts = trades_df[trades_df['Type']=='Put']['Stock_ROI']*-1
trades_df['Stock_ROI'] = flipped_puts
Output of flipped puts:
1 0.058333
3 0.085066
5 0.130841
7 0.187377
Output of original DF:
Date Type Stock_Open Stock_Close Stock_ROI
0 2016-04-27 Call 5.33 4.80 NaN
1 2016-06-03 Put 4.80 4.52 0.058333
2 2016-06-30 Call 4.52 5.29 NaN
3 2016-07-21 Put 5.29 4.84 0.085066
4 2016-08-08 Call 4.84 5.35 NaN
5 2016-08-25 Put 5.35 4.65 0.130841
6 2016-09-21 Call 4.65 5.07 NaN
7 2016-10-13 Put 5.07 4.12 0.187377
8 2016-11-04 Call 4.12 4.79 NaN
Try boolean indexing with .loc, which multiplies in place only the rows where Type equals 'Put':
trades_df.loc[trades_df.Type.eq('Put'), 'Stock_ROI'] *= -1
Or, using query and update:
trades_df.update(trades_df.query('Type == "Put"').Stock_ROI.mul(-1))
Both update trades_df in place, flipping the sign of Stock_ROI on the Put rows while leaving the Call rows untouched.
We can use data.table in R. Convert the data.frame to a data.table with setDT(trades_df), specify the logical condition in i, multiply Stock_ROI by -1 and assign it (:=) to a new column. The other rows of the new column are filled with NA.
library(data.table)
setDT(trades_df)[Type == 'Put', Stock_ROIN := Stock_ROI * -1][]
If we want to update the same column instead:
setDT(trades_df)[Type == 'Put', Stock_ROI := Stock_ROI * -1]
trades_df
# Date Type Stock_Open Stock_Close Stock_ROI
#1: 2016-04-27 Call 5.33 4.80 -0.099437
#2: 2016-06-03 Put 4.80 4.52 0.058333
#3: 2016-06-30 Call 4.52 5.29 0.170354
#4: 2016-07-21 Put 5.29 4.84 0.085066
#5: 2016-08-08 Call 4.84 5.35 0.105372
#6: 2016-08-25 Put 5.35 4.65 0.130841
#7: 2016-09-21 Call 4.65 5.07 0.090323
#8: 2016-10-13 Put 5.07 4.12 0.187377
#9: 2016-11-04 Call 4.12 4.79 0.162621
If we also want to change the other values to NA:
setDT(trades_df)[Type == 'Put', Stock_ROI := Stock_ROI * -1
][Type!= 'Put', Stock_ROI := NA]
trades_df
# Date Type Stock_Open Stock_Close Stock_ROI
#1: 2016-04-27 Call 5.33 4.80 NA
#2: 2016-06-03 Put 4.80 4.52 0.058333
#3: 2016-06-30 Call 4.52 5.29 NA
#4: 2016-07-21 Put 5.29 4.84 0.085066
#5: 2016-08-08 Call 4.84 5.35 NA
#6: 2016-08-25 Put 5.35 4.65 0.130841
#7: 2016-09-21 Call 4.65 5.07 NA
#8: 2016-10-13 Put 5.07 4.12 0.187377
#9: 2016-11-04 Call 4.12 4.79 NA
I have the following 2 dataframes:
> bvg1
Parameters X18.Oct.14 X19.Oct.14 X20.Oct.14 X21.Oct.14 X22.Oct.14 X23.Oct.14 X24.Oct.14
1 24K Equivalent Plan 29.00 29.60 33.80 36.60 35.30 31.90 29.00
2 24K Equivalent Act 28.80 31.00 35.40 35.90 34.70 33.40 31.90
3 Plan Rep WS 2463.00 2513.00 2869.00 3115.00 2999.00 2714.00 2468.00
4 Act Rep WS 2447.00 2633.00 3013.00 3054.00 2953.00 2842.00 2714.00
5 Rep WS Var -16.00 120.00 144.00 -61.00 -46.00 128.00 246.00
6 Plan Rep Intakes 568.00 461.00 1159.00 1146.00 1126.00 1124.00 1106.00
7 Act Rep Intakes 707.00 494.00 1106.00 1096.00 1274.00 1087.00 1101.00
8 Rep Intakes Var 139.00 33.00 -53.00 -50.00 148.00 -37.00 -5.00
9 Plan Rep Comps_DL 468.00 54.00 836.00 1190.00 1327.00 1286.00 1108.00
10 Act Rep Comps_DL 471.00 70.00 995.00 1137.00 1323.00 1150.00 1073.00
11 Rep Comps Var_DL 3.00 16.00 159.00 -53.00 -4.00 -136.00 -35.00
12 Plan Rep Mandays_DL 148.00 19.00 260.00 368.00 412.00 398.00 345.00
13 Act Rep Mandays_DL 147.00 19.00 303.00 359.00 423.00 374.00 348.00
14 Rep Mandays Var_DL -1.00 1.00 43.00 -9.00 12.00 -24.00 3.00
15 Plan FVR Mandays_DL 0.00 0.00 4.00 18.00 18.00 18.00 18.00
16 Act FVR Mandays_DL 0.00 0.00 4.00 7.00 8.00 8.00 7.00
17 FVR Mandays Var_DL 0.00 0.00 0.00 -11.00 -10.00 -10.00 -11.00
18 Plan Rep Prod_DL 3.16 2.88 3.21 3.23 3.22 3.23 3.21
19 Act Rep Prod_DL 3.21 3.62 3.28 3.16 3.12 3.07 3.08
20 Rep Prod Var_DL 0.05 0.74 0.07 -0.07 -0.10 -0.16 -0.13
> bvg2
Parameters X18.Oct X19.Oct X20.Oct X21.Oct X22.Oct X23.Oct X24.Oct
1 24K Equivalent Plan 30.50 31.30 35.10 36.10 33.60 28.80 25.50
2 24K Equivalent Act 31.40 33.40 36.60 38.10 36.80 34.40 32.10
3 Plan Rep WS 3419.00 3509.00 3933.00 4041.00 3764.00 3220.00 2859.00
4 Act Rep WS 3514.00 3734.00 4098.00 4271.00 4122.00 3852.00 3591.00
5 Rep WS Var 95.00 225.00 165.00 230.00 358.00 632.00 732.00
6 Plan Rep Intakes 813.00 613.00 1559.00 1560.00 1506.00 1454.00 1410.00
7 Act Rep Intakes 964.00 602.00 1629.00 1532.00 1657.00 1507.00 1439.00
8 Rep Intakes Var 151.00 -11.00 70.00 -28.00 151.00 53.00 29.00
9 Plan Rep Comps_DL 675.00 175.00 1331.00 1732.00 1938.00 1706.00 1493.00
10 Act Rep Comps_DL 718.00 224.00 1389.00 1609.00 1848.00 1698.00 1537.00
11 Rep Comps Var_DL 43.00 49.00 58.00 -123.00 -90.00 -8.00 44.00
12 Plan Rep Mandays_DL 203.00 58.00 428.00 541.00 605.00 536.00 475.00
13 Act Rep Mandays_DL 215.00 63.00 472.00 542.00 608.00 556.00 523.00
14 Rep Mandays Var_DL 12.00 5.00 44.00 2.00 3.00 20.00 48.00
15 Plan FVR Mandays_DL 0.00 0.00 1.00 12.00 2.00 32.00 57.00
16 Act FVR Mandays_DL 0.00 0.00 2.00 2.00 5.00 5.00 5.00
17 FVR Mandays Var_DL 0.00 0.00 1.00 -10.00 3.00 -27.00 -52.00
18 Plan Rep Prod_DL 3.33 3.03 3.11 3.20 3.20 3.18 3.14
19 Act Rep Prod_DL 3.34 3.56 2.94 2.97 3.04 3.05 2.94
20 Rep Prod Var_DL 0.01 0.53 -0.17 -0.23 -0.16 -0.13 -0.20
It is time series data, i.e. 24K Equivalent Plan was 29.00 on 18th Oct, 29.60 on 19th Oct and 33.80 on 20th Oct. The first dataframe holds the data for one business unit and the second holds the data for a different business unit.
I want to merge the dataframes into one and analyse the variance, i.e. where they differ in values, and draw ggplots such as histograms showing the difference, time series plots, etc.
I have tried the following:
I can combine the two dataframes with:
joined = rbind(bvg1, bvg2)
However, I then can't tell whether a record belongs to the bvg1 or bvg2 dataframe.
If I add an identifying column, i.e.
bvg1$id = "bvg1"
bvg2$id = "bvg2"
then rbind no longer works and gives the following error:
Error in match.names(clabs, names(xi)) :
names do not match previous names
Any sample code would be highly appreciated.
You can match the column names of the two datasets by stripping the trailing . followed by the year digits (e.g. .14) from the bvg1 names so they line up with bvg2. This can be done with a regex. The code below uses a lookbehind, (?<=[A-Za-z]), which requires the match to be preceded by a letter; the pattern then matches the . and everything after it up to the end of the string (\..*$), and gsub replaces that match with "".
colnames(bvg1) <-gsub("(?<=[A-Za-z])\\..*$", "", colnames(bvg1), perl=TRUE)
res <- rbind(bvg1, bvg2)
dim(res)
#[1] 40 9
head(res,3)
# Parameters X18.Oct X19.Oct X20.Oct X21.Oct X22.Oct X23.Oct X24.Oct
#1 24K Equivalent Plan 29.0 29.6 33.8 36.6 35.3 31.9 29.0
#2 24K Equivalent Act 28.8 31.0 35.4 35.9 34.7 33.4 31.9
#3 Plan Rep WS 2463.0 2513.0 2869.0 3115.0 2999.0 2714.0 2468.0
# id
#1 bvg1
#2 bvg1
#3 bvg1
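From there, one way to compare the two business units (a sketch, assuming res was built with the id column added before the rbind) is to reshape the data to long format and plot by id:
library(tidyr)
library(ggplot2)
# one row per Parameters/day/id combination
long <- pivot_longer(res, cols = starts_with("X"),
                     names_to = "day", values_to = "value")
# time series comparison of a single parameter across the two units
ggplot(subset(long, Parameters == "24K Equivalent Plan"),
       aes(x = day, y = value, colour = id, group = id)) +
  geom_line() +
  labs(title = "24K Equivalent Plan: bvg1 vs bvg2", x = NULL, y = "Value")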
We were shown the following R code in class:
attach(LifeCycleSavings)
boxplot(sr, main = "Box Plot of Savings Ratio")
detach()
However, why would we need to use "detach()" here? I typed "LifeCycleSavings" and still got an output as follows:
> LifeCycleSavings
sr pop15 pop75 dpi ddpi
Australia 11.43 29.35 2.87 2329.68 2.87
Austria 12.07 23.32 4.41 1507.99 3.93
Belgium 13.17 23.80 4.43 2108.47 3.82
The file "LifeCycleSavings" did not get detached.
To answer your specific question, detach in this context is removing that data frame from the search path. This means that you can no longer refer to variable names alone from that data frame:
attach(LifeCycleSavings)
> sr
[1] 11.43 12.07 13.17 5.75 12.88 8.79 0.60 11.90 4.98 10.78 16.85 3.59 11.24 12.64 12.55 10.67 3.01
[18] 7.70 1.27 9.00 11.34 14.28 21.10 3.98 10.35 15.48 10.25 14.65 10.67 7.30 4.44 2.02 12.70 12.78
[35] 12.49 11.14 13.30 11.77 6.86 14.13 5.13 2.81 7.81 7.56 9.22 18.56 7.72 9.24 8.89 4.71
> detach(LifeCycleSavings)
> sr
Error: object 'sr' not found
So at this point if we wanted to use sr we'd need to type LifeCycleSavings$sr in order to tell R where to look.
As Andrie mentioned, many people frown on this sort of use of attach and detach (although detach is sometimes also used for removing packages from the search path) because it can really clutter up your search path.
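If you only need the variables for a single call, a common alternative is with(), which evaluates an expression inside the data frame without ever touching the search path:
# same plot as the class example, with nothing attached or detached
with(LifeCycleSavings, boxplot(sr, main = "Box Plot of Savings Ratio"))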