If a data frame has M rows, how can it be interpolated or splined to create a new data frame with N rows? Here is an example:
# Start with some vectors of constant length (M=7) with data at each time point t
df <- tibble(t = c(1, 2, 3, 4, 5, 6, 7),
y1 = c(0.0, 0.5, 1.0, 3.0, 5.0, 2.0, 0.0),
y2 = c(0.0, 0.75, 1.5, 3.5, 6.0, 4.0, 0.0),
y3 = c(0.0, 1.0, 2.0, 4.0, 3.0, 2.0, 0.0))
# How to interpolate or spline these to other numbers of points (rows)?
# By individual column, to spline results to a new vector with length N=15:
spline(x=df$t, y=df$y1, n=15)
spline(x=df$t, y=df$y2, n=15)
spline(x=df$t, y=df$y3, n=15)
So vector by vector this is trivial. The question is: how can this spline be applied across all columns of the M-row dataset to create a new dataset with N rows, preferably with a tidyverse approach, e.g.:
df15 <- df %>% mutate(...replace(?)...(spline(x=?, y=?, n=15)... ???))
Again, I would like this spline applied across ALL columns without having to write syntax that includes column names. The intent is to apply this to data frames with on the order of 100 columns, where the names and number of columns may vary. It is of course not necessary to keep the t (or x) column in the data frame if that simplifies the approach at all. Thanks for any insight.
spline() returns a list, so we can loop across the columns with summarise and then unpack them (summarise is flexible and can return any number of rows, whereas mutate is fixed, i.e. it must return the same number of rows as the input).
library(dplyr)
library(tidyr)
library(stringr)
df %>%
  summarise(across(y1:y3, ~ spline(t, .x, n = 15) %>%
                     as_tibble %>%
                     rename_with(~ str_c(cur_column(), .)))) %>%
  unpack(everything())
Output:
# A tibble: 15 × 6
y1x y1y y2x y2y y3x y3y
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1 0 1 0 1 0
2 1.43 0.319 1.43 0.404 1.43 0.542
3 1.86 0.468 1.86 0.673 1.86 0.905
4 2.29 0.566 2.29 0.907 2.29 1.18
5 2.71 0.752 2.71 1.21 2.71 1.56
6 3.14 1.18 3.14 1.68 3.14 2.30
7 3.57 1.93 3.57 2.43 3.57 3.33
8 4 3 4 3.5 4 4
9 4.43 4.24 4.43 4.84 4.43 3.83
10 4.86 4.99 4.86 5.85 4.86 3.21
11 5.29 4.56 5.29 5.90 5.29 2.67
12 5.71 3.12 5.71 4.96 5.71 2.29
13 6.14 1.47 6.14 3.46 6.14 1.82
14 6.57 0.269 6.57 1.74 6.57 1.09
15 7 0 7 0 7 0
NOTE: Here we renamed the columns because the output from spline is a list with names x and y, and a data.frame/tibble requires unique column names.
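An alternative tidyverse shape for the same computation, sketched here with pivot_longer/pivot_wider (this assumes dplyr >= 1.1 for reframe), is to go long, spline each series, and go wide again; this yields a single shared x column instead of one per series:

```r
library(dplyr)
library(tidyr)

df <- tibble(t = c(1, 2, 3, 4, 5, 6, 7),
             y1 = c(0.0, 0.5, 1.0, 3.0, 5.0, 2.0, 0.0),
             y2 = c(0.0, 0.75, 1.5, 3.5, 6.0, 4.0, 0.0),
             y3 = c(0.0, 1.0, 2.0, 4.0, 3.0, 2.0, 0.0))

df15 <- df %>%
  # one row per (t, series) pair
  pivot_longer(-t, names_to = "series") %>%
  # spline each series; as_tibble turns spline's list(x, y) into columns
  reframe(as_tibble(spline(t, value, n = 15)), .by = series) %>%
  # the x grid is identical across series, so it collapses to one column
  pivot_wider(id_cols = x, names_from = series, values_from = y)
```

Since every series is splined over the same x grid, pivot_wider leaves one x column and one y column per series, which is often the more convenient layout.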
Here is an option with data.table
library(data.table)
setDT(df)[,
lapply(.SD, function(v) list2DF(spline(t, v, n = 15))),
.SDcols = patterns("^y\\d+")
]
which gives
y1.x y1.y y2.x y2.y y3.x y3.y
1: 1.000000 0.0000000 1.000000 0.0000000 1.000000 0.0000000
2: 1.428571 0.3194303 1.428571 0.4039226 1.428571 0.5423159
3: 1.857143 0.4680242 1.857143 0.6731712 1.857143 0.9052687
4: 2.285714 0.5655593 2.285714 0.9065841 2.285714 1.1770242
5: 2.714286 0.7515972 2.714286 1.2081346 2.714286 1.5555866
6: 3.142857 1.1773997 3.142857 1.6848330 3.142857 2.3039184
7: 3.571429 1.9306220 3.571429 2.4271800 3.571429 3.3318454
8: 4.000000 3.0000000 4.000000 3.5000000 4.000000 4.0000000
9: 4.428571 4.2387392 4.428571 4.8368010 4.428571 3.8340703
10: 4.857143 4.9919616 4.857143 5.8546581 4.857143 3.2089361
11: 5.285714 4.5551878 5.285714 5.8976389 5.285714 2.6706702
12: 5.714286 3.1239451 5.714286 4.9619776 5.714286 2.2875045
13: 6.142857 1.4724741 6.142857 3.4632587 6.142857 1.8204137
14: 6.571429 0.2685633 6.571429 1.7399284 6.571429 1.0868916
15: 7.000000 0.0000000 7.000000 0.0000000 7.000000 0.0000000
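For comparison, a plain base-R sketch (no packages) that applies the same spline over every non-time column via lapply, using spline's xout argument so all columns share one grid:

```r
df <- data.frame(t = 1:7,
                 y1 = c(0.0, 0.5, 1.0, 3.0, 5.0, 2.0, 0.0),
                 y2 = c(0.0, 0.75, 1.5, 3.5, 6.0, 4.0, 0.0),
                 y3 = c(0.0, 1.0, 2.0, 4.0, 3.0, 2.0, 0.0))

# one shared 15-point grid over the original time range
new_t <- seq(min(df$t), max(df$t), length.out = 15)

# spline every column except t onto that grid; keep only the y values
df15 <- data.frame(t = new_t,
                   lapply(df[-1], function(v) spline(df$t, v, xout = new_t)$y))
```

data.frame() accepts the named list produced by lapply, so the original column names carry over without any renaming.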
I want to add an extra column to a data frame which displays the difference between certain rows, where the distance between the rows depends on values in the table.
I found out that:
mutate(Col_new = Col_1 - lead(Col_1, n = x))
can find the difference for a fixed n, but only an integer can be used as input. How would you find the difference between rows when the distance between them varies?
I am trying to get the output in Col_new, which is the difference between row i and row i+n, where n takes the value in column Count. (The data is rounded, so there may be 0.01 discrepancies in Col_new.)
Col_1 Count Col_new
1 0.90 1 -0.68
2 1.58 1 -0.31
3 1.89 1 0.05
4 1.84 1 0.27
5 1.57 1 0.27
6 1.30 2 -0.26
7 1.25 2 -0.99
8 1.56 2 -1.58
9 2.24 2 -1.80
10 3.14 2 -1.58
11 4.04 3 -0.95
12 4.72 3 0.01
13 5.04 3 0.60
14 4.99 3 0.60
15 4.71 3 0.01
16 4.44 4 -1.84
17 4.39 4 NA
18 4.70 4 NA
19 5.38 4 NA
20 6.28 4 NA
Data:
df <- data.frame(Col_1 = c(0.90, 1.58, 1.89, 1.84, 1.57, 1.30, 1.35,
1.56, 2.24, 3.14, 4.04, 4.72, 5.04, 4.99,
4.71, 4.44, 4.39, 4.70, 5.38, 6.28),
Count = sort(rep(1:4, 5)))
Some code that generates the intended output, though it can undoubtedly be made more efficient:
library(dplyr)
df %>%
  mutate(col_2 = sapply(1:4, function(s){lead(Col_1, n = s)})) %>%
  rowwise() %>%
  mutate(Col_new = Col_1 - col_2[Count]) %>%
  select(-col_2)
Output:
# A tibble: 20 × 3
# Rowwise:
Col_1 Count Col_new
<dbl> <int> <dbl>
1 0.9 1 -0.68
2 1.58 1 -0.310
3 1.89 1 0.0500
4 1.84 1 0.27
5 1.57 1 0.27
6 1.3 2 -0.26
7 1.35 2 -0.89
8 1.56 2 -1.58
9 2.24 2 -1.8
10 3.14 2 -1.58
11 4.04 3 -0.95
12 4.72 3 0.0100
13 5.04 3 0.600
14 4.99 3 0.600
15 4.71 3 0.0100
16 4.44 4 -1.84
17 4.39 4 NA
18 4.7 4 NA
19 5.38 4 NA
20 6.28 4 NA
df %>% mutate(Col_new = case_when(
  Count == 1 ~ Col_1 - lead(Col_1, n = 1),
  Count == 2 ~ Col_1 - lead(Col_1, n = 2),
  Count == 3 ~ Col_1 - lead(Col_1, n = 3),
  Count == 4 ~ Col_1 - lead(Col_1, n = 4),
  Count == 5 ~ Col_1 - lead(Col_1, n = 5)
))
Col_1 Count Col_new
1 0.90 1 -0.68
2 1.58 1 -0.31
3 1.89 1 0.05
4 1.84 1 0.27
5 1.57 1 0.27
6 1.30 2 -0.26
7 1.25 2 -0.99
8 1.56 2 -1.58
9 2.24 2 -1.80
10 3.14 2 -1.58
11 4.04 3 -0.95
12 4.72 3 0.01
13 5.04 3 0.60
14 4.99 3 0.60
15 4.71 3 0.01
16 4.44 4 -1.84
17 4.39 4 NA
18 4.70 4 NA
19 5.38 4 NA
20 6.28 4 NA
This gives the desired result, but it is not a good solution for more general cases: with 10 or more distinct counts, another approach is required.
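A vectorized base-R alternative (a sketch, not one of the answers above) exploits the fact that out-of-range indices return NA, so no case_when, sapply, or rowwise is needed at all:

```r
df <- data.frame(Col_1 = c(0.90, 1.58, 1.89, 1.84, 1.57, 1.30, 1.35,
                           1.56, 2.24, 3.14, 4.04, 4.72, 5.04, 4.99,
                           4.71, 4.44, 4.39, 4.70, 5.38, 6.28),
                 Count = sort(rep(1:4, 5)))

# row i looks ahead Count[i] rows; indices past the end of the vector give NA
idx <- seq_len(nrow(df)) + df$Count
df$Col_new <- df$Col_1 - df$Col_1[idx]
```

This scales to any number of distinct counts and stays O(n), since it is a single subscripting operation.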
date value grp MR MR.avg avg lcl ucl
<date> <dbl> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
1 2009-07-07 11.6 A NA 1.15 10.2 13.3 7.18
2 2009-07-08 10.1 A 1.51 1.15 10.2 13.3 7.18
3 2009-07-09 11.4 A 1.39 1.15 10.2 13.3 7.18
4 2009-07-10 10.5 A 0.932 1.15 10.2 13.3 7.18
5 2009-07-11 9.27 A 1.24 1.15 10.2 13.3 7.18
6 2009-07-12 9.95 A 0.688 1.15 10.2 13.3 7.18
7 2009-07-13 10.6 A 0.598 1.15 10.2 13.3 7.18
8 2009-07-14 10.1 A 0.415 1.15 10.2 13.3 7.18
9 2009-07-15 8.48 A 1.66 1.15 10.2 13.3 7.18
10 2009-07-16 10.4 A 1.90 1.15 10.2 13.3 7.18
11 2009-07-17 2.72 B NA 0.832 2.83 5.05 0.623
12 2009-07-18 1.44 B 1.27 0.832 2.83 5.05 0.623
13 2009-07-19 2.23 B 0.782 0.832 2.83 5.05 0.623
14 2009-07-20 3.03 B 0.809 0.832 2.83 5.05 0.623
15 2009-07-21 3.21 B 0.176 0.832 2.83 5.05 0.623
Line 8 of my code below assigns each of three distributions to one of three groups: 'A', 'B', or 'C'. I then graph the three distributions with control limits, as shown far below.
library(tidyverse)
set.seed(55)
df <- tibble(
date = as.Date(40001:40030, origin = "1899-12-30"),
value = c(rnorm(10, 10), rnorm(10, 3), rnorm(10, 7))
) %>%
mutate(grp = c(rep("A", 10), rep("B", 10), rep("C", 10))) %>% # line 8
group_by(grp) %>%
mutate(
MR = abs(lag(value, 1) - value),
MR.avg = sum(MR, na.rm = TRUE) / (n() - 1),
avg = mean(value),
lcl = avg + (2.66 * MR.avg),
ucl = avg - (2.66 * MR.avg),
) %>%
print(n = 15)
p <- ggplot(df, aes(date, value, group = grp)) +
geom_line() +
geom_line(aes(date, lcl)) +
geom_line(aes(date, ucl))
p
I call out line 8 specifically above with a comment. As you can see, line 8 is a manual process that I want to simplify with some type of split() argument. I'd like to simply specify the split points, in this case the 10th/11th and 20th/21st rows, with a syntax I imagine resembling split(c(10, 20)) or maybe split(c(11, 21)). How do I properly apply split(), or whatever function is appropriate, to replace line 8 with something better?
A little background... I am writing this as a function and need to give the user a simple interface, in essence something along the lines of split(c(point1, point2, etc)), to let the user split distributions on a control chart.
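One way to replace the manual line 8 with user-supplied split points is findInterval. The sketch below assumes an interface where split_points holds the last row index of each segment (so c(10, 20) for breaks at rows 10/11 and 20/21); split_points and grp_id are names invented for this illustration:

```r
library(dplyr)
library(tibble)

set.seed(55)
split_points <- c(10, 20)   # last row of each segment except the final one

df <- tibble(
  date = as.Date(40001:40030, origin = "1899-12-30"),
  value = c(rnorm(10, 10), rnorm(10, 3), rnorm(10, 7))
) %>%
  # findInterval counts how many split points fall at or before each row,
  # which numbers the segments 1, 2, 3, ...; LETTERS maps that to A, B, C
  mutate(grp = LETTERS[findInterval(row_number(), split_points + 1) + 1])
```

Wrapped in a function taking split_points as an argument, this scales to any number of segments without editing the pipeline.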
I would like to know if it is possible to provide column names in the as_tibble function. I know that I could use the rename function to change column names, but I would like to save the number of lines I write. Let's say I want my column names to be a1, a2, a3.
> library(purrr)
> library(tidyverse)
> 1:3 %>%
+ map(~ rnorm(104, .x)) %>%
+ map_dfc(~as_tibble(.x))
# A tibble: 104 x 3
value value1 value2
<dbl> <dbl> <dbl>
1 2.91139409 1.44646163 1.298360
2 0.87725704 4.05341889 3.892296
3 0.73230088 2.72506579 3.520865
4 1.02862344 2.09576397 4.009980
5 0.49159059 -1.23746772 3.172201
6 0.24665840 1.80876495 2.927716
7 0.75112051 2.22486452 2.896452
8 -0.06036349 3.63503054 3.218324
9 1.84431314 1.88562406 2.398761
10 0.70866474 0.08947359 3.954770
# ... with 94 more rows
We can move as_tibble inside map_dfc and then use setNames(paste0("a", seq_len(ncol(.)))) to set the column names based on the number of columns:
library(tidyverse)
set.seed(123)
1:3 %>%
  map_dfc(~as_tibble(rnorm(104, .x))) %>%
  setNames(paste0("a", seq_len(ncol(.))))
# A tibble: 104 x 3
a1 a2 a3
<dbl> <dbl> <dbl>
1 0.440 1.05 4.65
2 0.770 1.95 2.95
3 2.56 1.22 3.12
4 1.07 0.332 3.24
5 1.13 1.62 4.23
6 2.72 2.92 2.48
7 1.46 1.42 2.01
8 -0.265 2.61 4.68
9 0.313 0.382 2.56
10 0.554 1.94 2.28
# ... with 94 more rows
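Another option, sketched here, is to name the list before calling as_tibble, since as_tibble() on a named list of equal-length vectors uses those names directly; this answers the question without any post-hoc renaming:

```r
library(purrr)
library(tibble)

set.seed(123)
res <- 1:3 %>%
  map(~ rnorm(104, .x)) %>%          # list of three unnamed vectors
  set_names(paste0("a", 1:3)) %>%    # name the list elements a1, a2, a3
  as_tibble()                        # names become the column names
```

Because the names are attached to the list itself, they survive any further list operations before the final as_tibble() call.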
I have a data frame like:
DATE x y ID
06/10/2003 7.21 0.651 1
12/10/2003 5.99 0.428 1
18/10/2003 4.68 1.04 1
24/10/2003 3.47 0.363 1
30/10/2003 2.42 0.507 1
02/05/2010 2.72 0.47 2
05/05/2010 2.6 0.646 2
08/05/2010 2.67 0.205 2
11/05/2010 3.57 0.524 2
12/05/2010 0.428 4.68 3
13/05/2010 1.04 3.47 3
14/05/2010 0.363 2.42 3
18/10/2003 0.507 2.52 3
24/10/2003 0.418 4.68 3
30/10/2003 0.47 3.47 3
29/04/2010 0.646 2.42 4
18/10/2003 3.47 2.52 4
I have the count of rows per group for column ID as an integer vector: 5 4 6 2.
Is there a way to replace the group values in column ID with this integer vector?
The output I am expecting is:
DATE x y ID
06/10/2003 7.21 0.651 5
12/10/2003 5.99 0.428 5
18/10/2003 4.68 1.04 5
24/10/2003 3.47 0.363 5
30/10/2003 2.42 0.507 5
02/05/2010 2.72 0.47 4
05/05/2010 2.6 0.646 4
08/05/2010 2.67 0.205 4
11/05/2010 3.57 0.524 4
12/05/2010 0.428 4.68 6
13/05/2010 1.04 3.47 6
14/05/2010 0.363 2.42 6
18/10/2003 0.507 2.52 6
24/10/2003 0.418 4.68 6
30/10/2003 0.47 3.47 6
29/04/2010 0.646 2.42 2
18/10/2003 3.47 2.52 2
I am quite new to R and tried to find a replace function for this, but I'm having a hard time. Any help is much appreciated.
The above data is just an example to illustrate my requirement.
A compact solution with the data.table-package:
library(data.table)
setDT(mydf)[, ID := .N, by = ID][]
which gives:
> mydf
DATE x y ID
1: 06/10/2003 7.210 0.651 5
2: 12/10/2003 5.990 0.428 5
3: 18/10/2003 4.680 1.040 5
4: 24/10/2003 3.470 0.363 5
5: 30/10/2003 2.420 0.507 5
6: 02/05/2010 2.720 0.470 4
7: 05/05/2010 2.600 0.646 4
8: 08/05/2010 2.670 0.205 4
9: 11/05/2010 3.570 0.524 4
10: 12/05/2010 0.428 4.680 6
11: 13/05/2010 1.040 3.470 6
12: 14/05/2010 0.363 2.420 6
13: 18/10/2003 0.507 2.520 6
14: 24/10/2003 0.418 4.680 6
15: 30/10/2003 0.470 3.470 6
16: 29/04/2010 0.646 2.420 2
17: 18/10/2003 3.470 2.520 2
What this does:
setDT(mydf) converts the dataframe to a data.table
by = ID groups by ID
ID := .N replaces the original value of ID with the count by group
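On a minimal toy table (hypothetical data, just for illustration) the effect of ID := .N, by = ID is easy to see:

```r
library(data.table)

toy <- data.table(ID = c(1L, 1L, 2L))

# within each ID group, .N is the group size; := writes it back into ID
toy[, ID := .N, by = ID][]
```

The two rows with ID 1 become 2 (a group of two), and the lone ID 2 row becomes 1.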
You can use the ave() function to calculate how many rows each ID takes up. In the example below I created a new variable ID2, but you could replace the original ID if you want.
(I included code to create your data in R below, but when you ask questions in the future please include your data in the question by using the dput() function on the data object. That's what I did to make the code below.)
mydata <- structure(list(DATE = c("06/10/2003", "12/10/2003", "18/10/2003",
"24/10/2003", "30/10/2003", "02/05/2010", "05/05/2010", "08/05/2010",
"11/05/2010", "12/05/2010", "13/05/2010", "14/05/2010", "18/10/2003",
"24/10/2003", "30/10/2003", "29/04/2010", "18/10/2003"),
x = c(7.21, 5.99, 4.68, 3.47, 2.42, 2.72, 2.6, 2.67, 3.57, 0.428, 1.04, 0.363,
0.507, 0.418, 0.47, 0.646, 3.47),
y = c(0.651, 0.428, 1.04, 0.363, 0.507, 0.47, 0.646, 0.205, 0.524, 4.68, 3.47,
2.42, 2.52, 4.68, 3.47, 2.42, 2.52),
ID = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 4, 4)),
.Names = c("DATE", "x", "y", "ID"),
class = c("data.frame"),
row.names = c(NA, -17L))
# ave() takes an input object, an object of group IDs of the same length
# as the input object, and a function to apply to the input object split across groups
mydata$ID2 <- ave(mydata$ID, mydata$ID, FUN = length)
mydata
DATE x y ID ID2
1 06/10/2003 7.210 0.651 1 5
2 12/10/2003 5.990 0.428 1 5
3 18/10/2003 4.680 1.040 1 5
4 24/10/2003 3.470 0.363 1 5
5 30/10/2003 2.420 0.507 1 5
6 02/05/2010 2.720 0.470 2 4
7 05/05/2010 2.600 0.646 2 4
8 08/05/2010 2.670 0.205 2 4
9 11/05/2010 3.570 0.524 2 4
10 12/05/2010 0.428 4.680 3 6
11 13/05/2010 1.040 3.470 3 6
12 14/05/2010 0.363 2.420 3 6
13 18/10/2003 0.507 2.520 3 6
14 24/10/2003 0.418 4.680 3 6
15 30/10/2003 0.470 3.470 3 6
16 29/04/2010 0.646 2.420 4 2
17 18/10/2003 3.470 2.520 4 2
# if you want to replace the original ID variable, you can assign to it
# instead of adding a new variable
mydata$ID <- ave(mydata$ID, mydata$ID, FUN = length)
A solution with dplyr:
library(dplyr)
df %>%
  group_by(ID) %>%
  mutate(ID2 = n()) %>%
  ungroup() %>%
  mutate(ID = ID2) %>%
  select(-ID2)
Edit:
I've just found a solution that's a bit cleaner than the above:
df %>%
  group_by(ID2 = ID) %>%
  mutate(ID = n()) %>%
  select(-ID2)
Result:
# A tibble: 17 x 4
DATE x y ID
<fctr> <dbl> <dbl> <int>
1 06/10/2003 7.210 0.651 5
2 12/10/2003 5.990 0.428 5
3 18/10/2003 4.680 1.040 5
4 24/10/2003 3.470 0.363 5
5 30/10/2003 2.420 0.507 5
6 02/05/2010 2.720 0.470 4
7 05/05/2010 2.600 0.646 4
8 08/05/2010 2.670 0.205 4
9 11/05/2010 3.570 0.524 4
10 12/05/2010 0.428 4.680 6
11 13/05/2010 1.040 3.470 6
12 14/05/2010 0.363 2.420 6
13 18/10/2003 0.507 2.520 6
14 24/10/2003 0.418 4.680 6
15 30/10/2003 0.470 3.470 6
16 29/04/2010 0.646 2.420 2
17 18/10/2003 3.470 2.520 2
Notes:
The reason behind ungroup() %>% mutate(ID = ID2) %>% select(-ID2) is that dplyr doesn't allow mutating grouping variables. So this would not work:
df %>%
  group_by(ID) %>%
  mutate(ID = n())
Error in mutate_impl(.data, dots) : Column ID can't be modified
because it's a grouping variable
If you don't care about replacing the original ID column, you can just do:
df %>%
  group_by(ID) %>%
  mutate(ID2 = n())
Alternative Result:
# A tibble: 17 x 5
# Groups: ID [4]
DATE x y ID ID2
<fctr> <dbl> <dbl> <int> <int>
1 06/10/2003 7.210 0.651 1 5
2 12/10/2003 5.990 0.428 1 5
3 18/10/2003 4.680 1.040 1 5
4 24/10/2003 3.470 0.363 1 5
5 30/10/2003 2.420 0.507 1 5
6 02/05/2010 2.720 0.470 2 4
7 05/05/2010 2.600 0.646 2 4
8 08/05/2010 2.670 0.205 2 4
9 11/05/2010 3.570 0.524 2 4
10 12/05/2010 0.428 4.680 3 6
11 13/05/2010 1.040 3.470 3 6
12 14/05/2010 0.363 2.420 3 6
13 18/10/2003 0.507 2.520 3 6
14 24/10/2003 0.418 4.680 3 6
15 30/10/2003 0.470 3.470 3 6
16 29/04/2010 0.646 2.420 4 2
17 18/10/2003 3.470 2.520 4 2
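In more recent dplyr versions, add_count can sidestep the grouping-variable restriction entirely, since it never leaves the data grouped. A sketch on a minimal toy data frame (hypothetical data, just to show the pattern):

```r
library(dplyr)

toy <- data.frame(ID = c(1, 1, 1, 2, 2, 3))

toy <- toy %>%
  add_count(ID) %>%   # adds a column n = number of rows per ID
  mutate(ID = n) %>%  # overwrite ID with that count
  select(-n)          # drop the helper column
```

No group_by/ungroup pair is needed, and there is no error about modifying a grouping variable.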
My goal is to calculate a final data frame containing the means from several different data frames. Given data like this:
A <- c(1,2,3,4,5,6,7,8,9)
B <- c(2,2,2,3,4,5,6,7,8)
C <- c(1,1,1,1,1,1,2,2,1)
D <- c(5,5,5,5,6,6,6,7,7)
E <- c(4,4,3,5,6,7,8,9,7)
DF1 <- data.frame(A,B,C)
DF2 <- data.frame(E,D,C)
DF3 <- data.frame(A,C,E)
DF4 <- data.frame(A,D,E)
I'd like to calculate means for all three columns (per row) in each data frame. To do this I put together a for loop:
All <- data.frame(matrix(ncol = 3, nrow = 9))
for(i in seq(1:ncol(DF1))){
All[,i] <- mean(c(DF1[,i], DF2[,i], DF3[,i], DF4[,i]))
}
X1 X2 X3
1 5.222222 4.277778 3.555556
2 5.222222 4.277778 3.555556
3 5.222222 4.277778 3.555556
4 5.222222 4.277778 3.555556
5 5.222222 4.277778 3.555556
6 5.222222 4.277778 3.555556
7 5.222222 4.277778 3.555556
8 5.222222 4.277778 3.555556
9 5.222222 4.277778 3.555556
But the end result was that I calculated entire column means (as opposed to a mean for each individual row).
For example, the first row and first column of each of the 4 data frames is 1, 4, 1, 1, so I would expect the first row and column of the final data frame to be 1.75 (mean(c(1,4,1,1))).
We place the datasets in a list, get the sum (+) of the corresponding elements using Reduce, and divide by the number of datasets:
Reduce(`+`, mget(paste0("DF", 1:4)))/4
# A B C
#1 1.75 3.25 2.5
#2 2.50 3.25 2.5
#3 3.00 3.25 2.0
#4 4.25 3.50 3.0
#5 5.25 4.25 3.5
#6 6.25 4.50 4.0
#7 7.25 5.00 5.0
#8 8.25 5.75 5.5
#9 8.50 5.75 4.0
NOTE: This should be faster than any apply-based solution, and the output is a data.frame like the original datasets.
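As an aside, the reason this works is that arithmetic operators apply elementwise to data frames of identical dimensions (matching positionally, not by column name), so the Reduce call is just a compact way of writing the sum out by hand. A sketch with the original data:

```r
A <- c(1,2,3,4,5,6,7,8,9)
B <- c(2,2,2,3,4,5,6,7,8)
C <- c(1,1,1,1,1,1,2,2,1)
D <- c(5,5,5,5,6,6,6,7,7)
E <- c(4,4,3,5,6,7,8,9,7)
DF1 <- data.frame(A,B,C)
DF2 <- data.frame(E,D,C)
DF3 <- data.frame(A,C,E)
DF4 <- data.frame(A,D,E)

# elementwise sum of the four frames, then the per-cell mean
res <- (DF1 + DF2 + DF3 + DF4) / 4
```

Reduce just generalizes this to a list of any length, which is why the one-liner above takes `mget(paste0("DF", 1:4))`.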
If we want the tidyverse, then another option is
library(dplyr)
library(tidyr)
library(purrr)
library(tibble)
mget(paste0("DF", 1:4)) %>%
  map(rownames_to_column, "rn") %>%
  map(setNames, c("rn", LETTERS[1:3])) %>%
  bind_rows() %>%
  group_by(rn) %>%
  summarise_each(funs(mean))
# A tibble: 9 × 4
# rn A B C
# <chr> <dbl> <dbl> <dbl>
#1 1 1.75 3.25 2.5
#2 2 2.50 3.25 2.5
#3 3 3.00 3.25 2.0
#4 4 4.25 3.50 3.0
#5 5 5.25 4.25 3.5
#6 6 6.25 4.50 4.0
#7 7 7.25 5.00 5.0
#8 8 8.25 5.75 5.5
#9 9 8.50 5.75 4.0
Since what you're describing is effectively an array, you can actually make it one with abind::abind, which makes the operation pretty simple:
apply(abind::abind(DF1, DF2, DF3, DF4, along = 3), 1:2, mean)
## A D E
## [1,] 1.75 3.25 2.5
## [2,] 2.50 3.25 2.5
## [3,] 3.00 3.25 2.0
## [4,] 4.25 3.50 3.0
## [5,] 5.25 4.25 3.5
## [6,] 6.25 4.50 4.0
## [7,] 7.25 5.00 5.0
## [8,] 8.25 5.75 5.5
## [9,] 8.50 5.75 4.0
The column names are meaningless, and the result is a matrix, not a data.frame, but even if you wrap it in data.frame, it's still very fast.
A combination of tidyverse and base:
#install.packages('tidyverse')
library(tidyverse)
transpose(list(DF1, DF2, DF3, DF4)) %>%
  map(function(x)
    rowMeans(do.call(rbind.data.frame,
                     transpose(x)))) %>%
  bind_cols()
Should yield:
# A B C
# <dbl> <dbl> <dbl>
# 1 1.75 3.25 2.5
# 2 2.50 3.25 2.5
# 3 3.00 3.25 2.0
# 4 4.25 3.50 3.0
# 5 5.25 4.25 3.5
# 6 6.25 4.50 4.0
# 7 7.25 5.00 5.0
# 8 8.25 5.75 5.5
# 9 8.50 5.75 4.0