I have an output generated by the cuts function, shown below. Let's call this output 'data'.
cuts: [20,25)
Time Kilometres
21 20 7.3
22 21 8.4
23 22 9.5
24 23 10.6
25 24 11.7
------------------------------------------------------------
cuts: [25,30)
Time Kilometres
26 25 12.8
27 26 13.9
28 27 15.0
29 28 16.1
30 29 17.2
------------------------------------------------------------
cuts: [30,35)
Time Kilometres
31 30 18.3
32 31 19.4
33 32 20.5
34 33 21.6
35 34 22.7
How can I access the data in each cut, e.g. get the Kilometres data from cuts: [20,25)? I tried data$Kilometres, but this does not work. Basically, I want a new data frame where I can use the Kilometres data separately for each cut.
The output of by here is a list, so you can use basic list indexing, either by number or name. Using the data from your question from a few hours ago, and the answer from Matthew Lundberg, we can index as follows:
> x[[1]]
Time Velocity
1 0.0 0.00
2 1.5 1.21
3 3.0 1.26
4 4.5 1.31
> x[["[6,12)"]]
Time Velocity
5 6.0 1.36
6 7.5 1.41
7 9.0 1.46
8 10.5 1.51
You can review the structure of objects in R by using str. This is usually useful to help you decide how you can extract certain information. Here's str(x):
> str(x)
List of 7
$ [0,6) :Classes ‘AsIs’ and 'data.frame': 4 obs. of 2 variables:
..$ Time : num [1:4] 0 1.5 3 4.5
..$ Velocity: num [1:4] 0 1.21 1.26 1.31
$ [6,12) :Classes ‘AsIs’ and 'data.frame': 4 obs. of 2 variables:
..$ Time : num [1:4] 6 7.5 9 10.5
..$ Velocity: num [1:4] 1.36 1.41 1.46 1.51
$ [12,18):Classes ‘AsIs’ and 'data.frame': 6 obs. of 2 variables:
..$ Time : num [1:6] 12 13 14 15 16 17
..$ Velocity: num [1:6] 1.56 1.61 1.66 1.71 1.76 1.81
$ [18,24):Classes ‘AsIs’ and 'data.frame': 5 obs. of 2 variables:
..$ Time : num [1:5] 18 19 20 21 22.5
..$ Velocity: num [1:5] 1.86 1.91 1.96 2.01 2.06
$ [24,30):Classes ‘AsIs’ and 'data.frame': 4 obs. of 2 variables:
..$ Time : num [1:4] 24 25.5 27 28.5
..$ Velocity: num [1:4] 2.11 2.16 2.21 2.26
$ [30,36):Classes ‘AsIs’ and 'data.frame': 4 obs. of 2 variables:
..$ Time : num [1:4] 30 31.5 33 34.5
..$ Velocity: num [1:4] 2.31 2.36 2.41 2.42
$ [36,42):Classes ‘AsIs’ and 'data.frame': 1 obs. of 2 variables:
..$ Time : num 36
..$ Velocity: num 2.43
- attr(*, "dim")= int 7
- attr(*, "dimnames")=List of 1
..$ cuts: chr [1:7] "[0,6)" "[6,12)" "[12,18)" "[18,24)" ...
- attr(*, "call")= language by.data.frame(data = mydf, INDICES = cuts, FUN = I)
- attr(*, "class")= chr "by"
From this we can see that we have a named list of seven items, and each item is a data.frame. Thus, if we wanted a vector of just the "Velocity" variable (second column) for the third interval, we would use something like:
> x[[3]][[2]]
[1] 1.56 1.61 1.66 1.71 1.76 1.81
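Applied to the Kilometres data in the question, the same indexing could look roughly like the sketch below. The data frame mydf and the way cuts is built are assumptions reconstructed from the printed output, not the asker's actual code; the by call mirrors the "call" attribute shown in str(x).

```r
# Hypothetical data matching the printed output (Time 20..34, Kilometres 7.3..22.7)
mydf <- data.frame(Time = 20:34,
                   Kilometres = seq(7.3, by = 1.1, length.out = 15))

# Assumed grouping: half-open 5-unit intervals, as in the "cuts: [20,25)" labels
cuts <- cut(mydf$Time, breaks = seq(20, 35, by = 5), right = FALSE)
data <- by(mydf, cuts, FUN = I)

# Kilometres for a single cut, selected by name and then by column name
km_20_25 <- data[["[20,25)"]]$Kilometres

# Kilometres for every cut at once, as a list with one vector per cut
km_by_cut <- lapply(data, function(d) d$Kilometres)
```

Indexing by column name (d$Kilometres) rather than by position keeps the code working if the column order changes.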
I have a dataset and would like to take a lot of subsets based on various columns, values, and conditional operators. I think the most desirable output is a list containing all of these subsetted data frames as separate elements in the list. I attempted to do this by building a data frame that contains the subset conditions I would like to use, building a function, then using apply to feed that data frame to the function, but that didn't work. I'm sure there's probably a better method that uses an anonymous function or something like that, but I'm not sure how I would implement that. Below is an example code that should produce 8 subsets of data.
Original dataset, where x1 and x2 are scores on items that won't be used for subsetting, and RT and LS are the variables that will be subset on:
df <- data.frame(x1 = rnorm(100),
x2 = rnorm(100),
RT = abs(rnorm(100)),
LS = sample(1:10, 100, replace = TRUE))
Data frame containing the conditions for subsetting. E.g., the first subset of data should be any observations with values greater than or equal to 0.5 in the RT column, the second subset should be any observations with values greater than or equal to 1 in the RT column, etc. There should be 8 subsets, 4 done on the RT variable and 4 done on the LS variable.
subsetConditions <- data.frame(column = rep(c("RT", "LS"), each = 4),
operator = rep(c(">=", "<="), each = 4),
value = c(0.5, 1, 1.5, 2,
9, 8, 7, 6))
And this is the ugly function I wrote to attempt to do this:
subsetFun <- function(x){
subset(df, eval(parse(text = paste(x))))
}
subsets <- apply(subsetConditions, 1, subsetFun)
Thanks for any help!
Consider Map (wrapper to mapply) without any eval + parse. Since ==, <=, >=, and other operators can be used as functions with two arguments where 4 <= 5 can be written as `<=`(4,5) or "<="(4, 5), simply pass arguments elementwise and use get to reference the function by string:
sub_data <- function(col, op, val) {
df[get(op)(df[[col]], val),]
}
sub_dfs <- with(subsetConditions, Map(sub_data, column, operator, value))
Output
str(sub_dfs)
List of 8
$ RT:'data.frame': 62 obs. of 4 variables:
..$ x1: num [1:62] -1.12 -0.745 -1.377 0.848 1.63 ...
..$ x2: num [1:62] -0.257 -2.385 0.805 -0.313 0.662 ...
..$ RT: num [1:62] 0.693 1.662 0.731 2.145 0.543 ...
..$ LS: int [1:62] 5 5 1 2 9 1 5 9 3 10 ...
$ RT:'data.frame': 36 obs. of 4 variables:
..$ x1: num [1:36] -0.745 0.848 0.908 -0.761 0.74 ...
..$ x2: num [1:36] -2.3849 -0.3131 -2.4645 -0.0784 0.8512 ...
..$ RT: num [1:36] 1.66 2.15 1.74 1.65 1.13 ...
..$ LS: int [1:36] 5 2 1 5 9 10 2 7 1 3 ...
$ RT:'data.frame': 14 obs. of 4 variables:
..$ x1: num [1:14] -0.745 0.848 0.908 -0.761 -1.063 ...
..$ x2: num [1:14] -2.3849 -0.3131 -2.4645 -0.0784 -2.9886 ...
..$ RT: num [1:14] 1.66 2.15 1.74 1.65 2.63 ...
..$ LS: int [1:14] 5 2 1 5 5 6 9 4 8 4 ...
$ RT:'data.frame': 3 obs. of 4 variables:
..$ x1: num [1:3] 0.848 -1.063 0.197
..$ x2: num [1:3] -0.313 -2.989 0.709
..$ RT: num [1:3] 2.15 2.63 2.05
..$ LS: int [1:3] 2 5 6
$ LS:'data.frame': 92 obs. of 4 variables:
..$ x1: num [1:92] -1.12 -0.745 -1.377 0.848 0.612 ...
..$ x2: num [1:92] -0.257 -2.385 0.805 -0.313 0.958 ...
..$ RT: num [1:92] 0.693 1.662 0.731 2.145 0.489 ...
..$ LS: int [1:92] 5 5 1 2 1 9 1 5 9 3 ...
$ LS:'data.frame': 78 obs. of 4 variables:
..$ x1: num [1:78] -1.12 -0.745 -1.377 0.848 0.612 ...
..$ x2: num [1:78] -0.257 -2.385 0.805 -0.313 0.958 ...
..$ RT: num [1:78] 0.693 1.662 0.731 2.145 0.489 ...
..$ LS: int [1:78] 5 5 1 2 1 1 5 3 5 2 ...
$ LS:'data.frame': 75 obs. of 4 variables:
..$ x1: num [1:75] -1.12 -0.745 -1.377 0.848 0.612 ...
..$ x2: num [1:75] -0.257 -2.385 0.805 -0.313 0.958 ...
..$ RT: num [1:75] 0.693 1.662 0.731 2.145 0.489 ...
..$ LS: int [1:75] 5 5 1 2 1 1 5 3 5 2 ...
$ LS:'data.frame': 62 obs. of 4 variables:
..$ x1: num [1:62] -1.12 -0.745 -1.377 0.848 0.612 ...
..$ x2: num [1:62] -0.257 -2.385 0.805 -0.313 0.958 ...
..$ RT: num [1:62] 0.693 1.662 0.731 2.145 0.489 ...
..$ LS: int [1:62] 5 5 1 2 1 1 5 3 5 2 ...
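In the output above the list names are just the column names, so the four RT subsets (and the four LS subsets) cannot be told apart by name. A minimal self-contained sketch of one way to label them, using match.fun in place of get (match.fun is a base R function that looks up a function by name; swapping it in is my choice, not part of the answer above). The seed and the "RT>=0.5"-style labels are arbitrary assumptions for illustration.

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible
df <- data.frame(x1 = rnorm(100),
                 x2 = rnorm(100),
                 RT = abs(rnorm(100)),
                 LS = sample(1:10, 100, replace = TRUE))

subsetConditions <- data.frame(column = rep(c("RT", "LS"), each = 4),
                               operator = rep(c(">=", "<="), each = 4),
                               value = c(0.5, 1, 1.5, 2, 9, 8, 7, 6),
                               stringsAsFactors = FALSE)

sub_data <- function(col, op, val) {
  keep <- match.fun(op)(df[[col]], val)  # look up the operator as a function
  df[keep, , drop = FALSE]
}

sub_dfs <- with(subsetConditions, Map(sub_data, column, operator, value))

# Label each subset with its condition, e.g. "RT>=0.5" or "LS<=9"
names(sub_dfs) <- with(subsetConditions, paste0(column, operator, value))
```

A single subset can then be pulled out with sub_dfs[["RT>=0.5"]].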
You were actually pretty close with your function; it just needs one adjustment. apply passes each row as a vector of 3 strings, so paste has to collapse them into a single string rather than 3. Then the expression can be parsed and evaluated properly.
subsetFun <- function(x){
subset(df, eval(parse(text = paste(x, collapse = ""))))
}
subsets <- apply(subsetConditions, 1, subsetFun)
Output
Then, it will return the 8 subsets.
str(subsets)
List of 8
$ :'data.frame': 67 obs. of 4 variables:
..$ x1: num [1:67] -1.208 0.606 -0.17 0.728 -0.424 ...
..$ x2: num [1:67] 0.4058 -0.3041 -0.3357 0.7904 -0.0264 ...
..$ RT: num [1:67] 1.972 0.883 0.598 0.633 1.517 ...
..$ LS: int [1:67] 8 9 2 10 8 5 3 4 7 2 ...
$ :'data.frame': 35 obs. of 4 variables:
..$ x1: num [1:35] -1.2083 -0.4241 -0.0906 0.9851 -0.8236 ...
..$ x2: num [1:35] 0.4058 -0.0264 1.0054 0.0653 1.4647 ...
..$ RT: num [1:35] 1.97 1.52 1.05 1.63 1.47 ...
..$ LS: int [1:35] 8 8 5 4 7 3 1 6 8 6 ...
$ :'data.frame': 16 obs. of 4 variables:
..$ x1: num [1:16] -1.208 -0.424 0.985 0.99 0.939 ...
..$ x2: num [1:16] 0.4058 -0.0264 0.0653 0.3486 -0.7562 ...
..$ RT: num [1:16] 1.97 1.52 1.63 1.85 1.8 ...
..$ LS: int [1:16] 8 8 4 6 10 2 6 6 3 9 ...
$ :'data.frame': 7 obs. of 4 variables:
..$ x1: num [1:7] 0.963 0.423 -0.444 0.279 0.417 ...
..$ x2: num [1:7] 0.6612 0.0354 0.0555 0.1253 -0.3056 ...
..$ RT: num [1:7] 2.71 2.15 2.05 2.01 2.07 ...
..$ LS: int [1:7] 2 6 9 9 7 7 4
$ :'data.frame': 91 obs. of 4 variables:
..$ x1: num [1:91] -0.952 -1.208 0.606 -0.17 -0.048 ...
..$ x2: num [1:91] -0.645 0.406 -0.304 -0.336 -0.897 ...
..$ RT: num [1:91] 0.471 1.972 0.883 0.598 0.224 ...
..$ LS: int [1:91] 6 8 9 2 1 8 4 5 3 4 ...
$ :'data.frame': 75 obs. of 4 variables:
..$ x1: num [1:75] -0.952 -1.208 -0.17 -0.048 -0.424 ...
..$ x2: num [1:75] -0.6448 0.4058 -0.3357 -0.8968 -0.0264 ...
..$ RT: num [1:75] 0.471 1.972 0.598 0.224 1.517 ...
..$ LS: int [1:75] 6 8 2 1 8 4 5 3 4 1 ...
$ :'data.frame': 65 obs. of 4 variables:
..$ x1: num [1:65] -0.9517 -0.1698 -0.048 0.2834 -0.0906 ...
..$ x2: num [1:65] -0.645 -0.336 -0.897 -2.072 1.005 ...
..$ RT: num [1:65] 0.471 0.598 0.224 0.486 1.053 ...
..$ LS: int [1:65] 6 2 1 4 5 3 4 1 7 4 ...
$ :'data.frame': 58 obs. of 4 variables:
..$ x1: num [1:58] -0.9517 -0.1698 -0.048 0.2834 -0.0906 ...
..$ x2: num [1:58] -0.645 -0.336 -0.897 -2.072 1.005 ...
..$ RT: num [1:58] 0.471 0.598 0.224 0.486 1.053 ...
..$ LS: int [1:58] 6 2 1 4 5 3 4 1 4 2 ...
I am trying to run a Shapiro-Wilk normality test in R (Rcmdr, to be more accurate) by going to "Statistics => Summary => Descriptive statistics", then selecting one of my dependent variables and choosing "summary by group".
Rcmdr automatically triggers the following code:
normalityTest(Algometre.J0 ~ Modalite, test="shapiro.test",
data=Dataset)
And I am getting the following error message:
'groups' must be a factor.
I have already categorized my independent variable as a factor (I swear, I did!)
Any idea what's wrong?
Thanks in advance
Here is what str(Dataset) shows:
'data.frame': 76 obs. of 11 variables:
$ Modalite : chr "C" "C" "C" "C" ...
$ Angle.J0 : num 20.1 20.5 21 22.5 19.1 ...
$ Angle.J1 : num 21.7 22.6 22.8 23.3 20.5 ...
$ Angle.J2 : num 22.3 23 23.9 24.2 21 ...
$ Epaisseur.J0: num 1.97 1.54 1.76 1.89 1.53 1.87 1.54 2 1.79 1.41 ...
$ Epaisseur.J1: num 2.07 1.49 1.87 1.91 1.54 1.9 1.51 2.03 1.71 1.48 ...
$ Epaisseur.J2: num 2.08 1.69 1.77 2 1.61 1.99 1.38 2.06 1.86 1.53 ...
$ Algometre.J0: num 45 40 105 165 66.3 ...
$ Algometre.J1: num 32.7 39.7 91.7 124 63.7 ...
$ Algometre.J2: num 51.3 58.7 101 138 60.3 ...
$ ObsNumber : int 1 2 3 4 5 6 7 8 9 10 ...
What does that mean?
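In the str() output above, Modalite is listed as chr, i.e. a character vector rather than a factor, which would be consistent with the "'groups' must be a factor" error. A minimal sketch of checking and converting the column, assuming that is indeed the cause. The toy Dataset below is made up for illustration, and the tapply call is a base R stand-in for a per-group Shapiro-Wilk test, not what Rcmdr runs internally.

```r
# Toy stand-in for the question's Dataset (values are invented)
set.seed(42)
Dataset <- data.frame(Modalite = rep(c("C", "T"), each = 10),
                      Algometre.J0 = rnorm(20, mean = 60, sd = 15),
                      stringsAsFactors = FALSE)

was_factor <- is.factor(Dataset$Modalite)     # FALSE: the column is character

# Convert the grouping column to a factor
Dataset$Modalite <- factor(Dataset$Modalite)

# Shapiro-Wilk test within each level of the grouping factor
sw <- tapply(Dataset$Algometre.J0, Dataset$Modalite, shapiro.test)
```

After the conversion, str(Dataset) should report Modalite as a Factor with its levels instead of chr.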
I have several lists (ListA, ListB, ListC...) with the same internal structure as the example below. I would like to combine all of them, keeping their structure, and have one list with all lists (ListAll). How can I do this?
Example:
I have:
ListA
$ data :'data.frame': 1 obs. of 2 variables:
..$ mean: num -0.128
..$ sd : num 1.11
$ simulations :'data.frame': 1000 obs. of 2 variables:
..$ mean: num [1:1000] -0.0116 -0.0156 0.0336 -0.0502 -0.0427 ...
..$ sd : num [1:1000] 1.003 1.014 0.963 1.036 1.051 ...
$ values:'data.frame': 35 obs. of 2 variables:
..$ C: num [1:35] 3.45 2.91 2.62 2.06 1.87 ...
..$ D: num [1:35] 5.42 2.89 3.34 1.68 1.43 ...
and several lists with the same structure.
I would like to get:
ListAll
$ ListA
$ data :'data.frame': 1 obs. of 2 variables:
..$ mean: num -0.128
..$ sd : num 1.11
$ simulations :'data.frame': 1000 obs. of 2 variables:
..$ mean: num [1:1000] -0.0116 -0.0156 0.0336 -0.0502 -0.0427 ...
..$ sd : num [1:1000] 1.003 1.014 0.963 1.036 1.051 ...
$ values:'data.frame': 35 obs. of 2 variables:
..$ C: num [1:35] 3.45 2.91 2.62 2.06 1.87 ...
..$ D: num [1:35] 5.42 2.89 3.34 1.68 1.43 ...
$ ListB
$ data :'data.frame': 1 obs. of 2 variables:
..$ mean: num -0.132
..$ sd : num 1.01
$ simulations :'data.frame': 1000 obs. of 2 variables:
..$ mean: num [1:1000] -0.0114 -0.0123 0.0378 -0.0102 -0.0340 ...
..$ sd : num [1:1000] 1.013 1.011 0.876 1.012 1.023 ...
$ values:'data.frame': 35 obs. of 2 variables:
..$ C: num [1:35] 4.41 1.61 1.42 1.96 2.07 ...
..$ D: num [1:35] 2.41 2.19 2.54 2.08 2.53 ...
and names(ListAll) would be:
ListA, ListB, ListC...
You can create a list of lists in base R.
ListAll <- list(ListA, ListB, ListC)
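Note that list(ListA, ListB, ListC) leaves the elements unnamed, so names(ListAll) would be NULL. A short sketch of two ways to get the names asked for in the question. The tiny ListA and ListB below are cut-down stand-ins for the real lists, kept only to make the example runnable.

```r
# Cut-down stand-ins for the lists in the question
ListA <- list(data = data.frame(mean = -0.128, sd = 1.11))
ListB <- list(data = data.frame(mean = -0.132, sd = 1.01))

# Option 1: name each element explicitly
ListAll <- list(ListA = ListA, ListB = ListB)

# Option 2: look the objects up by name; mget() names the result for you
ListAll2 <- mget(c("ListA", "ListB"))

names(ListAll)
```

With many lists, the mget form avoids typing every name twice; the vector of names could also come from ls(pattern = "^List") if the objects follow a common naming pattern.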
I am trying to extract the value 10.02798 (Mode 1 Forecasts) from the following results, but none of the variations of p$ that I tried worked. I am using the predict function from the arfima package in R.
> library(arfima)
> set.seed(82365)
> sim <- arfima.sim(1000, model = list(dfrac = 0.4, theta = 0.9, dint = 1))
> fit <- arfima(sim, order = c(0, 1, 1))
> p <- predict(fit, n.ahead = 5)
> p
$`Mode 1`
$`Mode 1`$`Forecasts and SDs`
1 2 3 4 5
Forecasts 10.02798 10.05937 10.08246 10.10474 10.12719
Exact SD 1.03035 1.14641 1.22365 1.28863 1.34813
Limiting SD 1.03028 1.14627 1.22343 1.28834 1.34774
When list items are not named, you can refer to them by their index in the list using double bracket syntax. For example, the first item has index 1, so you would extract this element like list[[1]].
In your case, the object p is a list, and its first item is an unnamed list which contains the table as shown in your question. You can extract the forecasts like so:
p[[1]]$Forecast
The first element of this vector is 10.02798, which is what you're after. So you can do
p[[1]]$Forecast[1]
The first two list items in p are not named for whatever reason, thus the double bracket syntax is required to get the first item in p. But the list items within that item are named, so you can use the $ syntax for those.
If you want to remain consistent with your extraction syntax, you can instead do
p[[1]][["Forecast"]][1]
Or even
p[[1]][[1]][1]
I think p[[1]]$Forecast[1] is just fine though.
In general, to view the structure of an object, you can use the str() function. In this case:
str(p)
#List of 9
# $ :List of 9
# ..$ Forecast : num [1:5] 10 10.1 10.1 10.1 10.1
# ..$ exactVar : num [1:5] 1.06 1.31 1.5 1.66 1.82
# ..$ exactSD : num [1:5] 1.03 1.15 1.22 1.29 1.35
# ..$ uppernp : num [1:5] 12 12.2 12.4 12.7 12.9
# ..$ lowernp : num [1:5] 7.59 7.33 7 7.03 6.73
# ..$ meanvalnp: num [1:5] 9.86 9.83 9.88 9.82 9.83
# ..$ limitVar : num [1:5] 1.06 1.31 1.5 1.66 1.82
# ..$ limitSD : num [1:5] 1.03 1.15 1.22 1.29 1.35
# ..$ sigma2 : num 1.06
# $ :List of 9
# ..$ Forecast : num [1:5] 10 10.1 10.1 10.1 10.1
# ..$ exactVar : num [1:5] 1.07 1.43 1.7 1.93 2.14
# ..$ exactSD : num [1:5] 1.04 1.2 1.3 1.39 1.46
# ..$ uppernp : num [1:5] 12 14.2 15.7 17.7 20.2
# ..$ lowernp : num [1:5] 5.96 4.02 1.89 -0.29 -2.69
# ..$ meanvalnp: num [1:5] 9.03 9.04 9.01 8.91 8.97
# ..$ limitVar : num [1:5] 1.07 1.43 1.7 1.93 2.14
# ..$ limitSD : num [1:5] 1.04 1.2 1.3 1.39 1.46
# ..$ sigma2 : num 1.07
# $ z : Time-Series [1:1000] from 1 to 1000: -0.829 0.1149 1.261 -0.0427 0.1901 ...
# $ seed : logi NA
# $ limiting: logi TRUE
# $ bootpred: logi TRUE
# $ B : num 1000
# $ predint : num 0.95
# $ name : chr "fit"
# - attr(*, "class")= chr "predarfima"
This shows that p is a list with 9 items, the first of which is itself a list with 9 items and doesn't have a name. This has an item called Forecast which is the vector you want.
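The mixed indexing can be tried out without fitting a model. The sketch below builds a toy list that only imitates the shape shown by str(p) (an unnamed first element holding a named Forecast vector); it is not a real predarfima object, and the numbers are copied from the printed forecasts in the question.

```r
# Toy list imitating the structure of p: unnamed first element, named inner items
p_toy <- list(
  list(Forecast = c(10.02798, 10.05937, 10.08246, 10.10474, 10.12719),
       sigma2 = 1.06),
  name = "fit"
)

# Unnamed outer element: position with [[ ]]; named inner element: $ or [["name"]]
first_fc  <- p_toy[[1]]$Forecast[1]
same_fc   <- p_toy[[1]][["Forecast"]][1]
by_number <- p_toy[[1]][[1]][1]
```

All three expressions return the same value; the name-based forms are less fragile if the order of the inner items ever changes.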
I have the results from describe.by {psych} applied to a data frame. The result is a list.
List of 1000
$ 1 :Classes ‘psych’, ‘describe’ and 'data.frame': 20 obs. of 13 variables:
..$ var : int [1:20] 1 2 3 4 5 6 7 8 9 10 ...
..$ n : num [1:20] 24 24 24 24 24 24 24 24 24 24 ...
..$ mean : num [1:20] 24 30.8 24 31.6 240 ...
..$ sd : num [1:20] 0.937 3.667 0.937 3.537 9.367 ...
..$ median : num [1:20] 23.9 31 23.9 31.9 238.6 ...
..$ trimmed : num [1:20] 24 30.9 24 31.7 239.7 ...
..$ mad : num [1:20] 1.11 4.12 1.11 3.29 11.09 ...
..$ min : num [1:20] 22.6 24 22.6 25.3 225.9 ...
..$ max : num [1:20] 25.6 36.9 25.6 36.9 256 ...
..$ range : num [1:20] 3 12.9 3 11.6 30 ...
..$ skew : num [1:20] 0.309 -0.258 0.309 -0.411 0.309 ...
..$ kurtosis: num [1:20] -1.163 -0.898 -1.163 -0.819 -1.163 ...
..$ se : num [1:20] 0.191 0.749 0.191 0.722 1.912 ...
$ 2 :Classes ‘psych’, ‘describe’ and 'data.frame': 20 obs. of 13 variables:
..$ var : int [1:20] 1 2 3 4 5 6 7 8 9 10 ...
..$ n : num [1:20] 7 7 7 7 7 7 7 7 7 7 ...
..$ mean : num [1:20] 16.3 39.3 16.3 40.7 162.9 ...
..$ sd : num [1:20] 0.609 8.045 0.609 8.394 6.086 ...
..$ median : num [1:20] 16.4 39.1 16.4 39.6 164.2 ...
..$ trimmed : num [1:20] 16.3 39.3 16.3 40.7 162.9 ...
I would like to plot a graph (probably candlestick) or boxplots with this sample for each of the 13 metrics. Is there a package in which I can directly leverage the computed summary stats?
Your question is vague.
describeBy (describe.by is deprecated) reports basic summary statistics by a grouping variable.
So I guess that a boxplot is the nearest plot.
For example:
describeBy(sat.act,sat.act$gender)
group: 1
var n mean sd median trimmed mad min max range skew kurtosis se
gender 1 247 1.00 0.00 1 1.00 0.00 1 1 0 NaN NaN 0.00
education 2 247 3.00 1.54 3 3.12 1.48 0 5 5 -0.54 -0.60 0.10
age 3 247 25.86 9.74 22 24.23 5.93 14 58 44 1.43 1.43 0.62
ACT 4 247 28.79 5.06 30 29.23 4.45 3 36 33 -1.06 1.89 0.32
SATV 5 247 615.11 114.16 630 622.07 118.61 200 800 600 -0.63 0.13 7.26
SATQ 6 245 635.87 116.02 660 645.53 94.89 300 800 500 -0.72 -0.12 7.41
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
group: 2
var n mean sd median trimmed mad min max range skew kurtosis se
gender 1 453 2.00 0.00 2 2.00 0.00 2 2 0 NaN NaN 0.00
education 2 453 3.26 1.35 3 3.40 1.48 0 5 5 -0.74 0.27 0.06
age 3 453 25.45 9.37 22 23.70 5.93 13 65 52 1.77 3.03 0.44
ACT 4 453 28.42 4.69 29 28.63 4.45 15 36 21 -0.39 -0.42 0.22
SATV 5 453 610.66 112.31 620 617.91 103.78 200 800 600 -0.65 0.42 5.28
SATQ 6 442 596.00 113.07 600 602.21 133.43 200 800 600 -0.58 0.13 5.38
You can plot this like:
boxplot(sat.act,sat.act$gender, col ='pink')
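As far as I can tell, the call above draws one box per column of sat.act rather than one box per gender group. If the goal is to mirror the by-group summary, the formula interface of boxplot is one option. A minimal sketch with a made-up data frame (toy, with invented ACT scores), so it runs without the psych package; with the real data the equivalent call would presumably be boxplot(ACT ~ gender, data = sat.act).

```r
# Made-up stand-in for sat.act: two gender groups with invented ACT scores
set.seed(7)
toy <- data.frame(gender = rep(1:2, each = 20),
                  ACT = c(rnorm(20, mean = 28, sd = 5),
                          rnorm(20, mean = 29, sd = 5)))

# One box per gender group for a single variable
b <- boxplot(ACT ~ gender, data = toy, col = "pink")

# The returned object holds the five-number summary used for each box
b$stats
```

Repeating the call for each variable of interest gives one by-group boxplot per variable.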