How to move selected matrix rows to top of matrix based on a selection vector of row names - r

I have a matrix that has been ordered by rowSums(). I now want to take a selected few of these rows, by passing a character vector of row names, and move them back to the top of the matrix while keeping the moved rows in the same order as in the selection vector.
I've tried to do this with various combinations of subset() or just straight index selection, but I can never get the resulting matrix in the order I want, if it works at all. I feel like there has to be a more straightforward way to do this.
Let's say I have a matrix mat ordered by rowSums():
sam1 sam2 sam3 sam4 sam5
sig1 1 2 3 4 5
sig2 6 7 8 9 10
sig3 11 12 13 14 15
sig4a 16 17 18 19 20
sig4b 21 22 23 24 25
sig4c 26 27 28 29 30
sig5 31 32 33 34 35
sig6 36 37 38 39 40
sig7a 41 42 43 44 45
aig7b 46 47 48 49 50
And I want to take a select number of rows I'm interested in:
select = c('sig6','sig4a','sig2')
And move them back to the top of the matrix, keeping them in the order of the select vector and leaving the remaining unselected rows below them, to get a new matrix:
sam1 sam2 sam3 sam4 sam5
sig6 36 37 38 39 40 *
sig4a 16 17 18 19 20 *
sig2 6 7 8 9 10 *
sig1 1 2 3 4 5
sig3 11 12 13 14 15
sig4b 21 22 23 24 25
sig4c 26 27 28 29 30
sig5 31 32 33 34 35
sig7a 41 42 43 44 45
aig7b 46 47 48 49 50
Is there a straightforward way to do this that doesn't involve making intermediate matrices or complicated workarounds? It seems like there should be, but I haven't been able to find a solution. Maybe I am overlooking something.

An option is to index with the vector of selected row names first, followed by the remaining names obtained with setdiff
mat[c(select, setdiff(row.names(mat), select)),]
#. sam1 sam2 sam3 sam4 sam5
#sig6 36 37 38 39 40
#sig4a 16 17 18 19 20
#sig2 6 7 8 9 10
#sig1 1 2 3 4 5
#sig3 11 12 13 14 15
#sig4b 21 22 23 24 25
#sig4c 26 27 28 29 30
#sig5 31 32 33 34 35
#sig7a 41 42 43 44 45
#aig7b 46 47 48 49 50
data
mat <- structure(c(1L, 6L, 11L, 16L, 21L, 26L, 31L, 36L, 41L, 46L, 2L,
7L, 12L, 17L, 22L, 27L, 32L, 37L, 42L, 47L, 3L, 8L, 13L, 18L,
23L, 28L, 33L, 38L, 43L, 48L, 4L, 9L, 14L, 19L, 24L, 29L, 34L,
39L, 44L, 49L, 5L, 10L, 15L, 20L, 25L, 30L, 35L, 40L, 45L, 50L
), .Dim = c(10L, 5L), .Dimnames = list(c("sig1", "sig2", "sig3",
"sig4a", "sig4b", "sig4c", "sig5", "sig6", "sig7a", "aig7b"),
c("sam1", "sam2", "sam3", "sam4", "sam5")))
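If this reordering comes up repeatedly, the indexing from the answer can be wrapped in a small helper (the name move_to_top and the small demo matrix below are my own, not from the question):

```r
# Hypothetical helper wrapping the setdiff() indexing shown above
move_to_top <- function(m, rows) {
  m[c(rows, setdiff(rownames(m), rows)), , drop = FALSE]
}

# Small demo matrix: 3 named rows, 5 columns
mat <- matrix(1:15, nrow = 3,
              dimnames = list(c("sig1", "sig2", "sig3"), paste0("sam", 1:5)))
rownames(move_to_top(mat, c("sig3", "sig1")))
# [1] "sig3" "sig1" "sig2"
```

The drop = FALSE keeps the result a matrix even when only one row remains after the selection.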

Related

Merge 2 data frames using common date, plus 2 rows before and n-1 rows after

So I need to merge 2 data frames:
The first data frame contains dates in YYYY-mm-dd format and event lengths:
datetime length
2003-06-03 1
2003-06-07 1
2003-06-13 1
2003-06-17 3
2003-06-28 5
2003-07-10 1
2003-07-23 1
...
The second data frame contains dates in the same format and discharge data:
datetime q
2003-05-29 36.2
2003-05-30 34.6
2003-05-31 33.1
2003-06-01 30.7
2003-06-02 30.0
2003-06-03 153.0
2003-06-04 69.0
...
The second data frame is much larger.
I want to merge/join only the following rows of the second data frame to the first:
all rows that have the same date as the first frame (I know this can be done with left_join(df1, df2, by = c("datetime")))
two rows before that row
n-1 rows after that row, where n = "length" value of row in first data frame.
I would like to identify the rows belonging to the same event as well.
Ideally I would have the following output (notice the event from 2003-06-17):
EventDatesNancy length q event#
2003-06-03 1 153.0 1
2003-06-07 1 120.0 2
2003-06-13 1 45.3 3
2003-06-15 na 110.0 4
2003-06-16 na 53.1 4
2003-06-17 3 78.0 4
2003-06-18 na 167.0 4
2003-06-19 na 145.0 4
...
I hope this makes clear what I am trying to do.
This might be one approach using tidyverse and fuzzyjoin.
First, number the events in your first data.frame, then add two columns for the start and end dates (the start date is 2 days before the date, and the end date is length - 1 days after it).
Then you can use fuzzy_inner_join to get the selected rows from the second data.frame: keep rows where the datetime in the second data.frame falls on or after the start date and on or before the end date from the first data.frame.
library(tidyverse)
library(fuzzyjoin)
df1$event <- seq_len(nrow(df1))
df1$start_date <- df1$datetime - 2
df1$end_date <- df1$datetime + df1$length - 1
fuzzy_inner_join(
  df1,
  df2,
  by = c("start_date" = "datetime", "end_date" = "datetime"),
  match_fun = c(`<=`, `>=`)
) %>%
  select(datetime.y, length, q, event)
I tried this out with some made up data:
R> df1
datetime length
1 2003-06-03 1
2 2003-06-12 1
3 2003-06-21 1
4 2003-06-30 3
5 2003-07-09 5
6 2003-07-18 1
7 2003-07-27 1
8 2003-08-05 2
9 2003-08-14 1
10 2003-08-23 1
11 2003-09-01 3
R> df2
datetime q
1 2003-06-03 44
2 2003-06-04 52
3 2003-06-05 34
4 2003-06-06 20
5 2003-06-07 57
6 2003-06-08 67
7 2003-06-09 63
8 2003-06-10 51
9 2003-06-11 56
10 2003-06-12 37
11 2003-06-13 16
12 2003-06-14 54
13 2003-06-15 46
14 2003-06-16 6
15 2003-06-17 32
16 2003-06-18 91
17 2003-06-19 61
18 2003-06-20 42
19 2003-06-21 28
20 2003-06-22 98
21 2003-06-23 77
22 2003-06-24 81
23 2003-06-25 13
24 2003-06-26 15
25 2003-06-27 73
26 2003-06-28 38
27 2003-06-29 27
28 2003-06-30 49
29 2003-07-01 10
30 2003-07-02 89
31 2003-07-03 9
32 2003-07-04 80
33 2003-07-05 68
34 2003-07-06 26
35 2003-07-07 31
36 2003-07-08 29
37 2003-07-09 84
38 2003-07-10 60
39 2003-07-11 19
40 2003-07-12 97
41 2003-07-13 35
42 2003-07-14 47
43 2003-07-15 70
This will give the following output:
datetime.y length q event
1 2003-06-03 1 44 1
2 2003-06-10 1 51 2
3 2003-06-11 1 56 2
4 2003-06-12 1 37 2
5 2003-06-19 1 61 3
6 2003-06-20 1 42 3
7 2003-06-21 1 28 3
8 2003-06-28 3 38 4
9 2003-06-29 3 27 4
10 2003-06-30 3 49 4
11 2003-07-01 3 10 4
12 2003-07-02 3 89 4
13 2003-07-07 5 31 5
14 2003-07-08 5 29 5
15 2003-07-09 5 84 5
16 2003-07-10 5 60 5
17 2003-07-11 5 19 5
18 2003-07-12 5 97 5
19 2003-07-13 5 35 5
If the output desired is different than above, please let me know what should be different so that I can correct it.
Data
df1 <- structure(list(datetime = structure(c(12206, 12215, 12224, 12233,
12242, 12251, 12260, 12269, 12278, 12287, 12296), class = "Date"),
length = c(1, 1, 1, 3, 5, 1, 1, 2, 1, 1, 3), event = 1:11,
start_date = structure(c(12204, 12213, 12222, 12231, 12240,
12249, 12258, 12267, 12276, 12285, 12294), class = "Date"),
end_date = structure(c(12206, 12215, 12224, 12235, 12246,
12251, 12260, 12270, 12278, 12287, 12298), class = "Date")), row.names = c(NA,
-11L), class = "data.frame")
df2 <- structure(list(datetime = structure(c(12206, 12207, 12208, 12209,
12210, 12211, 12212, 12213, 12214, 12215, 12216, 12217, 12218,
12219, 12220, 12221, 12222, 12223, 12224, 12225, 12226, 12227,
12228, 12229, 12230, 12231, 12232, 12233, 12234, 12235, 12236,
12237, 12238, 12239, 12240, 12241, 12242, 12243, 12244, 12245,
12246, 12247, 12248), class = "Date"), q = c(44L, 52L, 34L, 20L,
57L, 67L, 63L, 51L, 56L, 37L, 16L, 54L, 46L, 6L, 32L, 91L, 61L,
42L, 28L, 98L, 77L, 81L, 13L, 15L, 73L, 38L, 27L, 49L, 10L, 89L,
9L, 80L, 68L, 26L, 31L, 29L, 84L, 60L, 19L, 97L, 35L, 47L, 70L
)), class = "data.frame", row.names = c(NA, -43L))
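For comparison, the same windows can be built without fuzzyjoin in base R: expand each event into its run of dates, then let an ordinary merge() pick out the matching rows. A sketch, assuming datetime is a Date column; the small df1/df2 below are stand-ins for the real data, and windows is my own intermediate name.

```r
df1 <- data.frame(datetime = as.Date(c("2003-06-03", "2003-06-17")),
                  length = c(1, 3))
df2 <- data.frame(datetime = seq(as.Date("2003-06-01"),
                                 as.Date("2003-06-20"), by = "day"),
                  q = 1:20)

# one row per date in each event's window: 2 days before through length-1 after
windows <- do.call(rbind, lapply(seq_len(nrow(df1)), function(i) {
  data.frame(
    datetime = seq(df1$datetime[i] - 2,
                   df1$datetime[i] + df1$length[i] - 1, by = "day"),
    length = df1$length[i],
    event = i
  )
}))

merged <- merge(windows, df2, by = "datetime")
```

Note that overlapping event windows would duplicate rows of df2 here, just as they would with the fuzzy join.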

How to apply function to specific columns based upon column name?

I am working with a wide data set resembling the following:
I am looking to write a function that I can iterate over sets of columns that share similar names but differ in their numeric suffixes. For the sake of simplicity in terms of the function itself, I'll just create a function that takes the mean of two columns.
avg <- function(data, scorecol, distcol) {
  ScoreDistanceAvg <- (scorecol + distcol)/2
  data$ScoreDistanceAvg <- ScoreDistanceAvg
  return(data)
}
avg(data = dat, scorecol = dat$ScoreGame0, distcol = dat$DistanceGame0)
How can I apply the new function to sets of columns with repeated names but different numbers? That is, how could I create a column that takes the mean of ScoreGame0 and DistanceGame0, then create a column that takes the mean of ScoreGame5 and DistanceGame5, and so on? This would be the final output:
Of course, I could just run the function multiple times, but since my full data set is much larger, how could I automate this process? I imagine it involves apply, but I'm not sure how to use apply with a repeated pattern like that. Additionally, I imagine it may involve rewriting the function to better automate the naming of columns.
Data:
structure(list(Player = c("Lebron James", "Lebron James", "Lebron James",
"Lebron James", "Lebron James", "Lebron James", "Lebron James",
"Lebron James", "Lebron James", "Lebron James", "Lebron James",
"Lebron James", "Steph Curry", "Steph Curry", "Steph Curry",
"Steph Curry", "Steph Curry", "Steph Curry", "Steph Curry", "Steph Curry",
"Steph Curry", "Steph Curry", "Steph Curry", "Steph Curry"),
Game = c(0L, 1L, 2L, 3L, 4L, 5L, 0L, 1L, 2L, 3L, 4L, 5L,
0L, 1L, 2L, 3L, 4L, 5L, 0L, 1L, 2L, 3L, 4L, 5L), ScoreGame0 = c(32L,
32L, 32L, 32L, 32L, 32L, 44L, 44L, 44L, 44L, 44L, 44L, 45L,
45L, 45L, 45L, 45L, 45L, 76L, 76L, 76L, 76L, 76L, 76L), ScoreGame5 = c(27L,
27L, 27L, 27L, 27L, 27L, 12L, 12L, 12L, 12L, 12L, 12L, 76L,
76L, 76L, 76L, 76L, 76L, 32L, 32L, 32L, 32L, 32L, 32L), DistanceGame0 = c(12L,
12L, 12L, 12L, 12L, 12L, 79L, 79L, 79L, 79L, 79L, 79L, 18L,
18L, 18L, 18L, 18L, 18L, 88L, 88L, 88L, 88L, 88L, 88L), DistanceGame5 = c(13L,
13L, 13L, 13L, 13L, 13L, 34L, 34L, 34L, 34L, 34L, 34L, 42L,
42L, 42L, 42L, 42L, 42L, 54L, 54L, 54L, 54L, 54L, 54L)), class = "data.frame", row.names = c(NA,
-24L))
Rewrite your function slightly and use it in mapply, grep-ing over the columns. sort makes the pairing even safer.
avg <- function(scorecol, distcol) {
  (scorecol + distcol)/2
}
mapply(avg, dat[sort(grep('ScoreGame', names(dat)))],
       dat[sort(grep('DistanceGame', names(dat)))])
# ScoreGame0 ScoreGame5
# [1,] 22.0 20
# [2,] 22.0 20
# [3,] 22.0 20
# [4,] 22.0 20
# [5,] 22.0 20
# [6,] 22.0 20
# [7,] 61.5 23
# [8,] 61.5 23
# [9,] 61.5 23
# [10,] 61.5 23
# [11,] 61.5 23
# [12,] 61.5 23
# [13,] 31.5 59
# [14,] 31.5 59
# [15,] 31.5 59
# [16,] 31.5 59
# [17,] 31.5 59
# [18,] 31.5 59
# [19,] 82.0 43
# [20,] 82.0 43
# [21,] 82.0 43
# [22,] 82.0 43
# [23,] 82.0 43
# [24,] 82.0 43
To see what grep does try
grep('DistanceGame', names(dat), value=TRUE)
# [1] "DistanceGame0" "DistanceGame5"
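The mapply() call returns a plain matrix; to attach the averages back to dat under fresh names, you can rename its columns before cbind(). The ScoreDistanceAvg naming below is my own choice, and the two-player dat is a small stand-in for the full data:

```r
avg <- function(scorecol, distcol) {
  (scorecol + distcol)/2
}

# stand-in for the question's data: two players, games 0 and 5
dat <- data.frame(ScoreGame0 = c(32, 44), ScoreGame5 = c(27, 12),
                  DistanceGame0 = c(12, 79), DistanceGame5 = c(13, 34))

res <- mapply(avg, dat[sort(grep("ScoreGame", names(dat)))],
              dat[sort(grep("DistanceGame", names(dat)))])
# result columns inherit the ScoreGame names; swap in a clearer prefix
colnames(res) <- sub("ScoreGame", "ScoreDistanceAvg", colnames(res))
dat <- cbind(dat, res)
```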
In base R:
cols_used <- names(df[, -(1:2)])
f <- sub("[^0-9]+", 'ScoreDistance', cols_used)
data.frame(lapply(split.default(df[cols_used], f), rowMeans))
ScoreDistance0 ScoreDistance5
1 22.0 20
2 22.0 20
3 22.0 20
4 22.0 20
5 22.0 20
6 22.0 20
7 61.5 23
8 61.5 23
9 61.5 23
10 61.5 23
11 61.5 23
12 61.5 23
13 31.5 59
14 31.5 59
15 31.5 59
16 31.5 59
17 31.5 59
18 31.5 59
19 82.0 43
20 82.0 43
21 82.0 43
22 82.0 43
23 82.0 43
24 82.0 43
Using tidyverse: here's a solution with a for loop and readr:
library(readr)
game_num <- names(dat) |>
  readr::parse_number() |>
  na.omit()

for (i in unique(game_num)) {
  avg <- paste0("ScoreDistanceAvg", i)
  score <- paste0("ScoreGame", i)
  distance <- paste0("DistanceGame", i)
  dat[[avg]] <- (dat[[score]] + dat[[distance]])/2
}
Which gives:
Player Game ScoreGame0 ScoreGame5 DistanceGame0 DistanceGame5 ScoreDistanceAvg0 ScoreDistanceAvg5
1 Lebron James 0 32 27 12 13 22.0 20
2 Lebron James 1 32 27 12 13 22.0 20
3 Lebron James 2 32 27 12 13 22.0 20
4 Lebron James 3 32 27 12 13 22.0 20
5 Lebron James 4 32 27 12 13 22.0 20
6 Lebron James 5 32 27 12 13 22.0 20
7 Lebron James 0 44 12 79 34 61.5 23
8 Lebron James 1 44 12 79 34 61.5 23
9 Lebron James 2 44 12 79 34 61.5 23
10 Lebron James 3 44 12 79 34 61.5 23
11 Lebron James 4 44 12 79 34 61.5 23
12 Lebron James 5 44 12 79 34 61.5 23
13 Steph Curry 0 45 76 18 42 31.5 59
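A vectorised alternative to the loop can be sketched with purrr's map2(). This pairs the sorted Score and Distance column sets by position, which assumes the numeric suffixes line up; the small dat is again a stand-in for the full data:

```r
library(purrr)

dat <- data.frame(ScoreGame0 = c(32, 44), ScoreGame5 = c(27, 12),
                  DistanceGame0 = c(12, 79), DistanceGame5 = c(13, 34))

score_cols <- sort(grep("ScoreGame", names(dat), value = TRUE))
dist_cols <- sort(grep("DistanceGame", names(dat), value = TRUE))

# average each Score/Distance pair, then rename and bind the results back
avg_df <- map2(dat[score_cols], dat[dist_cols], ~ (.x + .y)/2) |>
  setNames(sub("ScoreGame", "ScoreDistanceAvg", score_cols)) |>
  as.data.frame()
dat <- cbind(dat, avg_df)
```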

Conditional summing in R

I have an R data frame with many columns, and I want to sum, per row, only those columns (header: score) whose value in the row named "Matt" is greater than 25. The sum value can be placed after the last column.
input (df1)
Name  score score score score score
Alex     31    15    18    22    23
Pat      37    18    29    15    28
Matt     33    27    18    88     9
James    12    36    32    13    21
output (df2)
Name  score score score score score Matt
Alex     31    15    18    22    23   68
Pat      37    18    59    55    28  110
Matt     33    27    18    88     9  148
James    12    36    32    13    21   61
Any thoughts are more than welcome.
Regards,
One option is to extract the row where 'Name' is 'Matt' (dropping the first column), create a logical vector ('i1') from it, and use that to subset the columns before taking the rowSums
i1 <- df1[df1$Name == "Matt",-1] > 25
df1$Matt <- rowSums(df1[-1][,i1], na.rm = TRUE)
Or using tidyverse
library(dplyr)
df1 %>%
  mutate(Matt = rowSums(select(cur_data(),
                               where(~ is.numeric(.) && .[Name == 'Matt'] > 25))))
-output
# Name score score.1 score.2 score.3 score.4 Matt
#1 Alex 31 15 18 22 23 68
#2 Pat 37 18 29 15 28 70
#3 Matt 33 27 18 88 9 148
#4 James 12 36 32 13 21 61
data
df1 <- structure(list(Name = c("Alex", "Pat", "Matt", "James"), score = c(31L,
37L, 33L, 12L), score.1 = c(15L, 18L, 27L, 36L), score.2 = c(18L,
29L, 18L, 32L), score.3 = c(22L, 15L, 88L, 13L), score.4 = c(23L,
28L, 9L, 21L)), class = "data.frame", row.names = c(NA, -4L))
You can try the code below
df$Matt <- rowSums(df[-1] * (df[df$Name == "Matt", -1] > 25)[rep(1, nrow(df)), ])
which gives
> df
Name score score score score score Matt
1 Alex 31 15 18 22 23 68
2 Pat 37 18 29 15 28 70
3 Matt 33 27 18 88 9 148
4 James 12 36 32 13 21 61
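Both answers rely on the same logical mask over Matt's row; unpacked into named steps (with df1 rebuilt from the first answer's data block) the logic reads:

```r
df1 <- data.frame(Name = c("Alex", "Pat", "Matt", "James"),
                  score = c(31, 37, 33, 12), score.1 = c(15, 18, 27, 36),
                  score.2 = c(18, 29, 18, 32), score.3 = c(22, 15, 88, 13),
                  score.4 = c(23, 28, 9, 21))

matt_row <- df1[df1$Name == "Matt", -1]   # Matt's values, Name column dropped
keep <- as.logical(matt_row > 25)         # TRUE for the columns where Matt > 25
df1$Matt <- rowSums(df1[-1][, keep, drop = FALSE])
df1$Matt
# [1]  68  70 148  61
```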

Is there a way to replace NAs in R using horizontal order?

I have the following data frame:
df <-structure(list(time = c("12:00:00", "12:05:00", "12:10:00", "12:15:00",
"12:20:00", "12:25:00", "12:30:00", "12:35:00", "12:40:00", "12:45:00",
"12:50:00", "12:55:00", "13:00:00", "13:05:00", "13:10:00", "13:15:00",
"13:20:00", "13:25:00"), speedA = c(60L, 75L, 65L, 45L, 12L,
15L, 20L, 45L, 65L, 60L, 60L, 30L, 35L, 45L, 25L, 15L, 10L, 5L
), speedB = c(50L, 30L, NA, 40L, NA, NA, 18L, NA, NA, NA, 15L,
10L, 25L, NA, NA, 12L, NA, NA), speedC = c(60L, 25L, NA, NA,
30L, 15L, 50L, 60L, NA, 35L, 34L, NA, 15L, 64L, 10L, 7L, 60L,
60L), speedD = c(NA, 10L, 60L, NA, 50L, 55L, 45L, 35L, NA, NA,
45L, 60L, 35L, 34L, 36L, 39L, 48L, 47L)), class = "data.frame", row.names = c(NA,
-18L))
I want to replace the NAs with values using interpolation between the horizontal values at the same row of each NA.
The expected result:
df2<- structure(list(time = c("12:00:00", "12:05:00", "12:10:00", "12:15:00",
"12:20:00", "12:25:00", "12:30:00", "12:35:00", "12:40:00", "12:45:00",
"12:50:00", "12:55:00", "13:00:00", "13:05:00", "13:10:00", "13:15:00",
"13:20:00", "13:25:00"), speedA = c(60L, 75L, 65L, 45L, 12L,
15L, 20L, 45L, 65L, 60L, 60L, 30L, 35L, 45L, 25L, 15L, 10L, 5L
), speedB = c(50, 30, 63.33333, 40, 21, 15, 18, 52.5, 65, 47.5,
15, 10, 25, 54.5, 17.5, 12, 35, 32.5), speedC = c(60, 25, 61.66667,
40, 30, 15, 50, 60, 65, 35, 34, 35, 15, 64, 10, 7, 60, 60), speedD = c(60L,
10L, 60L, 40L, 50L, 55L, 45L, 35L, 65L, 35L, 45L, 60L, 35L, 34L,
36L, 39L, 48L, 47L)), class = "data.frame", row.names = c(NA,
-18L))
We can use zoo::na.approx to interpolate values. For values which cannot be interpolated (trailing NAs), we use tidyr::fill to carry the last value forward.
library(dplyr)
library(tidyr)
df %>%
  pivot_longer(cols = -time) %>%
  group_by(time) %>%
  mutate(value = zoo::na.approx(value, na.rm = FALSE)) %>%
  fill(value) %>%
  pivot_wider()
# time speedA speedB speedC speedD
# <chr> <dbl> <dbl> <dbl> <dbl>
# 1 12:00:00 60 50 60 60
# 2 12:05:00 75 30 25 10
# 3 12:10:00 65 63.333 61.667 60
# 4 12:15:00 45 40 40 40
# 5 12:20:00 12 21 30 50
# 6 12:25:00 15 15 15 55
# 7 12:30:00 20 18 50 45
# 8 12:35:00 45 52.5 60 35
# 9 12:40:00 65 65 65 65
#10 12:45:00 60 47.5 35 35
#11 12:50:00 60 15 34 45
#12 12:55:00 30 10 35 60
#13 13:00:00 35 25 15 35
#14 13:05:00 45 54.5 64 34
#15 13:10:00 25 17.5 10 36
#16 13:15:00 15 12 7 39
#17 13:20:00 10 35 60 48
#18 13:25:00 5 32.5 60 47
You can use zoo::na.approx() row-wise with c_across().
library(dplyr)
library(tidyr)
library(zoo)
df %>%
  rowwise() %>%
  mutate(speed = list(na.locf(na.approx(c_across(-time), na.rm = FALSE))),
         .keep = "unused") %>%
  unnest_wider(speed, names_sep = "")
# # A tibble: 18 x 5
# time speed1 speed2 speed3 speed4
# <chr> <dbl> <dbl> <dbl> <dbl>
# 1 12:00:00 60 50 60 60
# 2 12:05:00 75 30 25 10
# 3 12:10:00 65 63.3 61.7 60
# 4 12:15:00 45 40 40 40
# 5 12:20:00 12 21 30 50
# 6 12:25:00 15 15 15 55
# 7 12:30:00 20 18 50 45
# 8 12:35:00 45 52.5 60 35
# 9 12:40:00 65 65 65 65
# 10 12:45:00 60 47.5 35 35
# 11 12:50:00 60 15 34 45
# 12 12:55:00 30 10 35 60
# 13 13:00:00 35 25 15 35
# 14 13:05:00 45 54.5 64 34
# 15 13:10:00 25 17.5 10 36
# 16 13:15:00 15 12 7 39
# 17 13:20:00 10 35 60 48
# 18 13:25:00 5 32.5 60 47
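The same fill can also be done without pivoting, by applying zoo across each row in base R. A sketch: the two-row df below is a stand-in for the full data, and it assumes (as in this data) that no row starts with an NA, since na.locf() only carries values forward.

```r
library(zoo)

df <- data.frame(time = c("12:00:00", "12:05:00"),
                 speedA = c(60, 75), speedB = c(50, NA),
                 speedC = c(60, 25), speedD = c(NA, 10))

# na.approx() interpolates interior NAs within each row; na.locf() then
# carries the last observed value forward over any trailing NAs
filled <- t(apply(df[-1], 1, function(x) na.locf(na.approx(x, na.rm = FALSE))))
colnames(filled) <- names(df)[-1]
df2 <- cbind(df[1], as.data.frame(filled))
```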

Grouping the dataframe based on one variable

I have a dataframe with 10 variables, all of them numeric, and one of the variables is age. I want to group the observations based on age: for example, ages 17-18 in one group, 19-22 in another, with each row attached to its group. The result should be a dataframe for further manipulation.
Model of the dataframe:
A B AGE
25 50 17
30 42 22
50 60 19
65 105 17
355 400 21
68 47 20
115 98 18
25 75 19
And I want a result like:
17-18
A B AGE
25 50 17
65 105 17
115 98 18
19-22
A B AGE
30 42 22
50 60 19
355 400 21
68 47 20
115 98 18
25 75 19
I did group the dataset by the AGE variable using the split function; now my concern is how to manipulate the grouped data. E.g. the answer looked like:
$1
A B AGE
25 50 17
65 105 17
115 98 18
$2
A B AGE
30 42 22
50 60 19
355 400 21
68 47 20
115 98 18
25 75 19
My question is how can I access each group for further manipulation?
for eg: if I want to do t-test for each group separately?
The split function works with dataframes. As the grouping criterion (the second argument to split), use either cut with 'breaks' or findInterval with an appropriate set of cutpoints (named 'vec' if you are using named parameters). Note that cut defaults to intervals closed on the right, while findInterval defaults to intervals closed on the left.
> split(dat, findInterval(dat$AGE, c(17, 19.5, 22.5)))
$`1`
A B AGE
1 25 50 17
3 50 60 19
4 65 105 17
7 115 98 18
8 25 75 19
$`2`
A B AGE
2 30 42 22
5 355 400 21
6 68 47 20
Here is the approach with cut
lst <- split(df1, cut(df1$AGE, breaks=c(16, 18, 22), labels=FALSE))
lst
# $`1`
# A B AGE
#1 25 50 17
#4 65 105 17
#7 115 98 18
#$`2`
# A B AGE
#2 30 42 22
#3 50 60 19
#5 355 400 21
#6 68 47 20
#8 25 75 19
Update
If you need to find the sum, mean of columns for each "list" element
lapply(lst, function(x) rbind(colSums(x[-3]),colMeans(x[-3])))
But, if the objective is to find the summary statistics based on the group, it can be done using any of the aggregating functions
library(dplyr)
df1 %>%
  group_by(grp = cut(AGE, breaks = c(16, 18, 22), labels = FALSE)) %>%
  summarise_each(funs(sum = sum(., na.rm = TRUE),
                      mean = mean(., na.rm = TRUE)), A:B)
# grp A_sum B_sum A_mean B_mean
#1 1 205 253 68.33333 84.33333
#2 2 528 624 105.60000 124.80000
Or using aggregate from base R
do.call(data.frame,
        aggregate(cbind(A, B) ~ cbind(grp = cut(AGE, breaks = c(16, 18, 22),
                                                labels = FALSE)),
                  df1, function(x) c(sum = sum(x), mean = mean(x))))
data
df1 <- structure(list(A = c(25L, 30L, 50L, 65L, 355L, 68L, 115L, 25L
), B = c(50L, 42L, 60L, 105L, 400L, 47L, 98L, 75L), AGE = c(17L,
22L, 19L, 17L, 21L, 20L, 18L, 19L)), .Names = c("A", "B", "AGE"
), class = "data.frame", row.names = c(NA, -8L))
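To answer the follow-up directly: once split() has produced the list, any per-group analysis is one lapply() away. A sketch using the df1 above; the two-sample t.test() of A against B is only an illustration, not something the question specifies:

```r
df1 <- data.frame(A = c(25, 30, 50, 65, 355, 68, 115, 25),
                  B = c(50, 42, 60, 105, 400, 47, 98, 75),
                  AGE = c(17, 22, 19, 17, 21, 20, 18, 19))

lst <- split(df1, cut(df1$AGE, breaks = c(16, 18, 22), labels = FALSE))

# one t-test per age group; sapply() then collects the p-values
tests <- lapply(lst, function(g) t.test(g$A, g$B))
sapply(tests, function(tt) tt$p.value)
```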
