Fixing a Tukey multiple-comparison test - r

Hi, I was running a Tukey multiple-comparison test for the data below (data and code attached), after confirming normality, equal variances, and a significant interaction via a two-way ANOVA (time and growth condition). The results in R and the final bar chart are included as well. As you can see, the visualization could be improved and needs tidying up because of the redundant letters. I was advised to redo the same Tukey test but add some code to assign the samples at time point 0 hr as the reference/control (something like a Dunnett test with a single control). I couldn't really find any useful information about this online, so I'd appreciate any help/suggestions!
data.frame(Exp1)
id growth_condition time fv fq npq in_situ rd
1 1 Control 0 0.81 0.56 0.72 0.797 1.000
2 2 Control 0 0.81 0.58 0.78 0.788 1.000
3 3 Control 0 0.80 0.59 0.76 0.793 1.000
4 4 High light+Chilled 0 0.82 0.57 0.85 0.799 1.000
5 5 High light+Chilled 0 0.81 0.59 0.75 0.796 1.000
6 6 High light+Chilled 0 0.81 0.56 0.69 0.782 1.000
7 7 Control 0.5 0.81 0.53 1.08 0.759 1.279
8 8 Control 0.5 0.81 0.56 0.72 0.759 0.668
9 9 Control 0.5 0.79 0.50 1.04 0.771 0.877
10 10 High light+Chilled 0.5 0.70 0.46 1.04 0.540 0.487
11 11 High light+Chilled 0.5 0.60 0.43 0.69 0.652 1.341
12 12 High light+Chilled 0.5 0.73 0.46 1.19 0.606 0.904
13 13 Control 8 0.82 0.52 1.20 0.753 0.958
14 14 Control 8 0.81 0.55 1.09 0.759 0.642
15 15 Control 8 0.80 0.55 1.07 0.747 0.612
16 16 High light+Chilled 8 0.44 0.28 0.58 0.230 0.471
17 17 High light+Chilled 8 0.35 0.21 0.45 0.237 0.777
18 18 High light+Chilled 8 0.54 0.35 0.68 0.186 0.342
19 19 Control 24 0.81 0.49 1.17 0.762 0.915
20 20 Control 24 0.82 0.67 1.25 0.749 0.876
21 21 Control 24 0.82 0.48 1.18 0.756 0.836
22 22 High light+Chilled 24 0.40 0.25 0.45 0.089 0.392
23 23 High light+Chilled 24 0.43 0.27 0.51 0.106 0.627
24 24 High light+Chilled 24 0.34 0.21 0.37 0.140 0.258
25 25 Control 48 0.81 0.48 1.05 0.773 0.662
26 26 Control 48 0.80 0.45 1.14 0.785 0.914
27 27 Control 48 0.82 0.47 1.09 0.792 0.912
28 28 High light+Chilled 48 0.73 0.45 0.90 0.750 0.800
29 29 High light+Chilled 48 0.70 0.51 0.79 0.626 1.305
30 30 High light+Chilled 48 0.66 0.43 0.74 0.655 0.579
Code:
library(multcompView)  # provides multcompLetters4()

res.Exp8 <- aov(npq ~ growth_condition * time, data = Exp1)
summary(res.Exp8)
t8 <- TukeyHSD(res.Exp8)
plot(t8)
multcompLetters4(res.Exp8, t8)
Results:
$`growth_condition:time`
Control:24 Control:8 Control:48 High light+Chilled:0.5 Control:0.5 High light+Chilled:48
"a" "ab" "abc" "abc" "abc" "bcd"
High light+Chilled:0 Control:0 High light+Chilled:8 High light+Chilled:24
"cde" "cde" "de" "e"


Error running a t-test via the stat_compare_means function, and TukeyHSD

For the "rd" parameter, I got an error message while running t.test using the ggpubr::stat_compare_means() function. Moreover, TukeyHSD analysis of my data categorized all the individual group as "a", implying that there was no significance differences. This seems a bit weird as I'm expecting the opposite by looking at the plot (attached my plot). Moreover, there was no issue for identical t.test and TukeyHSD analysis of other parameters (fv,fq,npq, and in_situ etc. in data frame ). Please find my scripts and datas below, thanks.
This was just an example of similar plot from another parameter ("fv" in data frame) where the results of t.test from ggpubr::stat_compare_means()were shown above the error bars,identical script was being used here. expected plot
library(readr)  # read_csv()
library(Rmisc)  # summarySE()

Exp1 <- read_csv("Raw data/Exp1.csv")
Exp1$time <- factor(Exp1$time)
Exp1$growth_condition <- factor(Exp1$growth_condition)
summary_anti_PsbS_both <- summarySE(data = Exp1, measurevar = "rd",
                                    groupvars = c("time", "growth_condition"))
(The Exp1 data frame is identical to the one shown in the first question above.)
Script for plot
ggplot(data = summary_anti_PsbS_both,
       mapping = aes(x = factor(time), y = rd, fill = growth_condition)) +
  geom_bar(stat = "identity", position = "dodge") +
  labs(x = "Time (hr)", y = "Relative density", fill = "Growth conditions") +
  ylim(0, 1.5) +
  geom_errorbar(aes(ymin = rd - se, ymax = rd + se), width = .2,
                position = position_dodge(width = 0.9)) +
  annotate(geom = "text", x = 1, y = 1.45, label = "n=3") +
  stat_compare_means(data = Exp1, label = "p.signif", label.y = 1.35,
                     method = "t.test") +
  theme_bw() +
  theme(text = element_text(size = 15))
Error message
Warning message:
Computation failed in `stat_compare_means()`:
Problem while computing `p = purrr::map(...)`.
Script for TukeyHSD
library(multcompView)  # provides multcompLetters4()

res.both88 <- aov(rd ~ growth_condition * time, data = Exp1)
summary(res.both88)
t8 <- TukeyHSD(res.both88)
multcompLetters4(res.both88, t8)
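One detail worth checking, visible in the data itself: at time 0 every rd value is exactly 1.000 (it looks like the normalization point), so the within-group standard deviation there is zero. t.test() errors out on constant data with "data are essentially constant", which is a plausible cause of the stat_compare_means() failure for rd but not for fv, fq, npq, or in_situ. A minimal reproduction of just the time-0 rows (reconstructed from the table above):

```r
# Time-0 rows of rd, transcribed from the Exp1 table: all exactly 1.000
Exp1_t0 <- data.frame(
  growth_condition = rep(c("Control", "High light+Chilled"), each = 3),
  rd = rep(1.000, 6)
)

# Within-group sd is zero at time 0
aggregate(rd ~ growth_condition, data = Exp1_t0, FUN = sd)

# t.test() refuses constant data; try() captures the error
try(t.test(rd ~ growth_condition, data = Exp1_t0))
```

If that is the cause, dropping the time-0 comparison (or comparing against it with a method that tolerates a zero-variance reference) should make stat_compare_means() run for rd as well.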

Problem with imputing data using the mice package

Here's a small sample of my data:
> sample_n(k,20)
A B C D E
1 1.05 2.02 8.27 0.76 1.02
2 1.2 2.28 19.56 0.62 <NA>
3 1.2 2.31 3.45 0.65 1.22
4 <NA> 2.44 6.76 0.68 1.82
5 <NA> 2.24 6.99 0.59 1.37
6 0.87 1.71 3.32 0.64 1.87
7 <NA> 1.77 3.4 0.6 2.13
8 <NA> 2.17 4.13 0.81 1.19
9 <NA> 1.96 4.39 <NA> 1.66
10 1.15 2.28 14.73 0.73 1.57
11 <NA> 1.76 <NA> 0.79 2.66
12 <NA> 1.97 9 0.81 1.38
13 <NA> 2.18 9.32 0.78 0.9
14 <NA> 1.93 2.3 0.78 1.62
15 1.02 2.05 2.81 0.78 1.24
16 0.94 1.77 1.69 0.73 1.83
17 1.17 2.21 14.79 0.66 1.34
18 1.11 2.18 9.41 <NA> 1.32
19 1.35 2.51 20.44 0.76 0.73
20 <NA> 2.37 <NA> 0.74 1.41
I'm trying to impute the missing data using the mice package:
new_df <- mice(df, method = "cart")
I get the following error :
Error in edit.setup(data, setup, ...) :
`mice` detected constant and/or collinear variables. No predictors were left after their removal.
How can I fix this?
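I can't be certain without seeing str() of your data, but the <NA> display in the sample above hints that the columns were read in as character (or factor) rather than numeric; mice then sees no usable numeric predictors, which can produce exactly this "constant and/or collinear variables" error. A sketch of the check and fix, using a toy frame I made up to reproduce the symptom (the values are transcribed from your sample):

```r
library(mice)

# Toy reproduction: numbers stored as character, with real NAs
k <- data.frame(
  A = c("1.05", NA, "1.2", "0.87", NA, "1.15", "0.94", "1.17"),
  B = c("2.02", "2.28", "2.31", "1.71", "2.24", "2.28", "1.77", "2.21"),
  C = c("8.27", "19.56", "3.45", "3.32", "6.99", "14.73", "1.69", NA),
  stringsAsFactors = FALSE
)
str(k)  # chr columns: mice cannot treat these as numeric predictors

# Convert everything to numeric, then impute
k[] <- lapply(k, function(x) as.numeric(as.character(x)))
imp <- mice(k, method = "cart", m = 1, printFlag = FALSE)
head(complete(imp))
```

If str() shows the columns are already numeric, then the message means what it says: look for columns that are constant or near-duplicates of each other and drop them (or adjust the predictorMatrix) before calling mice().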

Applying a custom function repeatedly to same dataframe using purrr

Suppose I have a dataframe as follows:
df <- data.frame(
  alpha = 0:20,
  beta = 30:50,
  gamma = 100:120
)
I have a custom function that makes new columns. (Note, my actual function is a lot more complex and can't be vectorized without a custom function, so please ignore the substance of the transformation here.) For example:
newfun <- function(var = NULL) {
  newname <- paste0(var, "NEW")
  df[[newname]] <- df[[var]] / 100
  return(df)
}
I want to apply this over many columns of the dataset repeatedly and have the dataset "build up." This happens just fine when I do the following:
df <- newfun("alpha")
df <- newfun("beta")
df <- newfun("gamma")
Obviously this is redundant and a case for map. But when I do the following I get back a list of dataframes, which is not what I want:
df <- data.frame(
alpha = 0:20,
beta = 30:50,
gamma = 100:120
)
out <- c("alpha", "beta", "gamma") %>%
map(function(x) newfun(x))
How can I iterate over a vector of column names AND see the changes repeatedly applied to the same dataframe?
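Since each call needs to see the columns added by the previous one, this is an accumulate pattern, and purrr::reduce() is built for it: it threads the growing data frame through each call. A minimal sketch, with newfun rewritten to take the accumulating frame as an argument instead of reaching into the global environment:

```r
library(purrr)

df <- data.frame(alpha = 0:20, beta = 30:50, gamma = 100:120)

# reduce() passes the result of each step (acc) into the next call,
# so every iteration sees the columns added so far
out <- reduce(c("alpha", "beta", "gamma"),
              function(acc, var) {
                acc[[paste0(var, "NEW")]] <- acc[[var]] / 100
                acc
              },
              .init = df)
names(out)
```

The .init argument seeds the accumulator with the original data frame; the return value is a single data frame rather than the list of data frames that map() gives you.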
Writing the function to reach outside of its scope to find some df is risky and will bite you, especially when you see something like:
df[['a']] <- 2
# Error in df[["a"]] <- 2 : object of type 'closure' is not subsettable
You will get this error when R doesn't find a variable named df and instead finds the base function named df. Two morals from this discovery:
While I admit to using df myself, it's generally bad practice to name variables after R functions (especially from base); and
Scope-breach is sloppy and renders a workflow unreproducible and difficult to troubleshoot.
To remedy this, and since your function relies on knowing what the old/new variable names are or should be, I think pmap or base R Map may work better. Further, I suggest naming the new variables outside of the function, making it "data-only".
dat <- data.frame(alpha = 0:20, beta = 30:50, gamma = 100:120)  # same data, renamed to avoid base::df
cols <- c("alpha", "beta", "gamma")
myfunc <- function(x) x / 100
setNames(lapply(dat[, cols], myfunc), paste0("new", cols))
# $newalpha
# [1] 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 0.10 0.11 0.12 0.13 0.14 0.15 0.16 0.17
# [19] 0.18 0.19 0.20
# $newbeta
# [1] 0.30 0.31 0.32 0.33 0.34 0.35 0.36 0.37 0.38 0.39 0.40 0.41 0.42 0.43 0.44 0.45 0.46 0.47
# [19] 0.48 0.49 0.50
# $newgamma
# [1] 1.00 1.01 1.02 1.03 1.04 1.05 1.06 1.07 1.08 1.09 1.10 1.11 1.12 1.13 1.14 1.15 1.16 1.17
# [19] 1.18 1.19 1.20
From here, we just need to column-bind (cbind) it:
cbind(dat, setNames(lapply(dat[,cols], myfunc), paste0("new", cols)))
# alpha beta gamma newalpha newbeta newgamma
# 1 0 30 100 0.00 0.30 1.00
# 2 1 31 101 0.01 0.31 1.01
# 3 2 32 102 0.02 0.32 1.02
# 4 3 33 103 0.03 0.33 1.03
# 5 4 34 104 0.04 0.34 1.04
# ...
Special note: if you plan on doing this iteratively (repeatedly), growing a frame by adding rows one at a time is known to be bad, and I suspect (without proof at the moment) that growing by columns is just as bad. For that reason, if you do this a lot, consider using do.call(cbind, c(list(dat), ...)) where ... is the list of things to add. This results in a single call to cbind and therefore only a single memory copy of the original dat. (Contrast that with iteratively calling the *bind functions, which make a complete copy of the frame with each pass and scale poorly.)
additions <- lapply(1:3, function(i) setNames(lapply(dat[,cols], myfunc), paste0("new", i, cols)))
str(additions)
# List of 3
# $ :List of 3
# ..$ new1alpha: num [1:21] 0 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 ...
# ..$ new1beta : num [1:21] 0.3 0.31 0.32 0.33 0.34 0.35 0.36 0.37 0.38 0.39 ...
# ..$ new1gamma: num [1:21] 1 1.01 1.02 1.03 1.04 1.05 1.06 1.07 1.08 1.09 ...
# $ :List of 3
# ..$ new2alpha: num [1:21] 0 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 ...
# ..$ new2beta : num [1:21] 0.3 0.31 0.32 0.33 0.34 0.35 0.36 0.37 0.38 0.39 ...
# ..$ new2gamma: num [1:21] 1 1.01 1.02 1.03 1.04 1.05 1.06 1.07 1.08 1.09 ...
# $ :List of 3
# ..$ new3alpha: num [1:21] 0 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 ...
# ..$ new3beta : num [1:21] 0.3 0.31 0.32 0.33 0.34 0.35 0.36 0.37 0.38 0.39 ...
# ..$ new3gamma: num [1:21] 1 1.01 1.02 1.03 1.04 1.05 1.06 1.07 1.08 1.09 ...
do.call(cbind, c(list(dat), additions))
# alpha beta gamma new1alpha new1beta new1gamma new2alpha new2beta new2gamma new3alpha new3beta new3gamma
# 1 0 30 100 0.00 0.30 1.00 0.00 0.30 1.00 0.00 0.30 1.00
# 2 1 31 101 0.01 0.31 1.01 0.01 0.31 1.01 0.01 0.31 1.01
# 3 2 32 102 0.02 0.32 1.02 0.02 0.32 1.02 0.02 0.32 1.02
# 4 3 33 103 0.03 0.33 1.03 0.03 0.33 1.03 0.03 0.33 1.03
# 5 4 34 104 0.04 0.34 1.04 0.04 0.34 1.04 0.04 0.34 1.04
# 6 5 35 105 0.05 0.35 1.05 0.05 0.35 1.05 0.05 0.35 1.05
# ...
An alternative approach is to change your function to only return a vector:
newfun2 <- function(var = NULL) {
  df[[var]] / 100
}
newfun2('alpha')
# [1] 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 0.10 0.11 0.12 0.13
#[15] 0.14 0.15 0.16 0.17 0.18 0.19 0.20
Then, using base R, you can use lapply() to loop through the columns:
cols <- c("alpha", "beta", "gamma")
df[, paste0(cols, 'NEW')] <- lapply(cols, newfun2)
#or
#df[, paste0(cols, 'NEW')] <- purrr::map(cols, newfun2)
df
alpha beta gamma alphaNEW betaNEW gammaNEW
1 0 30 100 0.00 0.30 1.00
2 1 31 101 0.01 0.31 1.01
3 2 32 102 0.02 0.32 1.02
4 3 33 103 0.03 0.33 1.03
5 4 34 104 0.04 0.34 1.04
6 5 35 105 0.05 0.35 1.05
7 6 36 106 0.06 0.36 1.06
8 7 37 107 0.07 0.37 1.07
9 8 38 108 0.08 0.38 1.08
10 9 39 109 0.09 0.39 1.09
11 10 40 110 0.10 0.40 1.10
12 11 41 111 0.11 0.41 1.11
13 12 42 112 0.12 0.42 1.12
14 13 43 113 0.13 0.43 1.13
15 14 44 114 0.14 0.44 1.14
16 15 45 115 0.15 0.45 1.15
17 16 46 116 0.16 0.46 1.16
18 17 47 117 0.17 0.47 1.17
19 18 48 118 0.18 0.48 1.18
20 19 49 119 0.19 0.49 1.19
21 20 50 120 0.20 0.50 1.20
Based on the way you wrote your function, a for loop that assigns the result of newfun back to df on each pass works pretty well.
vars <- names(df)
for (i in vars){
df <- newfun(i)
}
df
# alpha beta gamma alphaNEW betaNEW gammaNEW
# 1 0 30 100 0.00 0.30 1.00
# 2 1 31 101 0.01 0.31 1.01
# 3 2 32 102 0.02 0.32 1.02
# 4 3 33 103 0.03 0.33 1.03
# 5 4 34 104 0.04 0.34 1.04
# 6 5 35 105 0.05 0.35 1.05
# 7 6 36 106 0.06 0.36 1.06
# 8 7 37 107 0.07 0.37 1.07
# 9 8 38 108 0.08 0.38 1.08
# 10 9 39 109 0.09 0.39 1.09
# 11 10 40 110 0.10 0.40 1.10
# 12 11 41 111 0.11 0.41 1.11
# 13 12 42 112 0.12 0.42 1.12
# 14 13 43 113 0.13 0.43 1.13
# 15 14 44 114 0.14 0.44 1.14
# 16 15 45 115 0.15 0.45 1.15
# 17 16 46 116 0.16 0.46 1.16
# 18 17 47 117 0.17 0.47 1.17
# 19 18 48 118 0.18 0.48 1.18
# 20 19 49 119 0.19 0.49 1.19
# 21 20 50 120 0.20 0.50 1.20

Is there an R function I could use to create a clustered column chart of an imported CSV dataset using ggplot2?

I want to plot a stacked column chart using ggplot2, with R1, R2, R3 as the y variables while the variety names stay on the x axis.
I tried it in Excel and it worked, but I decided to import the dataset in CSV format into R for a more captivating look, as this is part of my final-year project.
varieties R1 R2 R3 Relative.yield SD
1 bd 0.40 2.65 1.45 1.50 1.13
2 bdj1 4.60 NA 2.80 3.70 1.27
3 bdj2 2.40 1.90 0.50 1.60 0.98
4 bdj3 2.40 1.65 5.20 3.08 1.87
5 challenge 2.10 5.15 1.35 2.87 2.01
6 doris 4.20 2.50 2.55 3.08 0.97
7 fel 0.80 2.40 0.75 1.32 0.94
8 fel2 NA 0.70 1.90 1.30 0.85
9 felbv 0.10 2.95 2.05 1.70 1.46
10 felnn 1.50 4.05 1.25 2.27 1.55
11 lad1 0.55 2.20 0.20 0.98 1.07
12 lad2 0.50 NA 0.50 0.50 0.00
13 lad3 1.10 3.90 1.00 2.00 1.65
14 lad4 1.50 1.65 0.50 1.22 0.63
15 molete1 2.60 1.80 2.75 2.38 0.51
16 molete2 1.70 4.70 4.20 3.53 1.61
17 mother's delight 0.10 4.00 1.90 2.00 1.95
18 ojaoba1a 1.90 3.45 2.75 2.70 0.78
19 ojaoba1b 4.20 2.75 4.30 3.75 0.87
20 ojoo 2.80 NA 3.60 3.20 0.57
21 omini 0.20 0.30 0.25 0.25 0.05
22 papa1 2.20 6.40 3.55 4.05 2.14
23 pk5 1.00 2.75 1.10 1.62 0.98
24 pk6 2.30 1.30 3.10 2.23 0.90
25 sango1a 0.40 0.90 1.55 0.95 0.58
26 sango1b 2.60 5.10 3.15 3.62 1.31
27 sango2a 0.50 0.55 0.75 0.60 0.13
28 sango2b 2.95 NA 2.60 2.78 0.25
29 usman 0.60 3.50 1.20 1.77 1.53
30 yau 0.05 0.85 0.20 0.37 0.43
> barplot(yield$R1)
> barplot(yield$Relative.yield)
> barplot(yield$Relative.yield, names.arg = varieties)
Error in barplot.default(yield$Relative.yield, names.arg = varieties) :
object 'varieties' not found
> ggplot(data = yield, mapping = aes(x = varieties, y = yield[,2:4])) + geom_()
Error in geom_() : could not find function "geom_"
> ggplot(data = yield, mapping = aes(x = varieties, y = yield[,2:4])) + geom()
Error in geom() : could not find function "geom"
You should put it in long format first; tidyr::gather provides this functionality:
library(tidyverse)
gather(df[1:4], R, value, R1:R3) %>%
  ggplot(aes(varieties, value, fill = R)) + geom_col()
#> Warning: Removed 5 rows containing missing values (position_stack).
data
df <- read.table(header = TRUE, stringsAsFactors = FALSE, text =
" varieties R1 R2 R3 Relative.yield SD
1 bd 0.40 2.65 1.45 1.50 1.13
2 bdj1 4.60 NA 2.80 3.70 1.27
3 bdj2 2.40 1.90 0.50 1.60 0.98
4 bdj3 2.40 1.65 5.20 3.08 1.87
5 challenge 2.10 5.15 1.35 2.87 2.01
6 doris 4.20 2.50 2.55 3.08 0.97
7 fel 0.80 2.40 0.75 1.32 0.94
8 fel2 NA 0.70 1.90 1.30 0.85
9 felbv 0.10 2.95 2.05 1.70 1.46
10 felnn 1.50 4.05 1.25 2.27 1.55
11 lad1 0.55 2.20 0.20 0.98 1.07
12 lad2 0.50 NA 0.50 0.50 0.00
13 lad3 1.10 3.90 1.00 2.00 1.65
14 lad4 1.50 1.65 0.50 1.22 0.63
15 molete1 2.60 1.80 2.75 2.38 0.51
16 molete2 1.70 4.70 4.20 3.53 1.61
17 'mother\\'s delight' 0.10 4.00 1.90 2.00 1.95
18 ojaoba1a 1.90 3.45 2.75 2.70 0.78
19 ojaoba1b 4.20 2.75 4.30 3.75 0.87
20 ojoo 2.80 NA 3.60 3.20 0.57
21 omini 0.20 0.30 0.25 0.25 0.05
22 papa1 2.20 6.40 3.55 4.05 2.14
23 pk5 1.00 2.75 1.10 1.62 0.98
24 pk6 2.30 1.30 3.10 2.23 0.90
25 sango1a 0.40 0.90 1.55 0.95 0.58
26 sango1b 2.60 5.10 3.15 3.62 1.31
27 sango2a 0.50 0.55 0.75 0.60 0.13
28 sango2b 2.95 NA 2.60 2.78 0.25
29 usman 0.60 3.50 1.20 1.77 1.53
30 yau 0.05 0.85 0.20 0.37 0.43"
)
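Note that gather() still works but has been superseded; tidyr::pivot_longer() is the current idiom for the same reshape. A self-contained sketch using the first few varieties from the table above:

```r
library(tidyr)
library(ggplot2)

# First three rows of the yield data, transcribed from the table above
df <- data.frame(
  varieties = c("bd", "bdj1", "bdj2"),
  R1 = c(0.40, 4.60, 2.40),
  R2 = c(2.65, NA, 1.90),
  R3 = c(1.45, 2.80, 0.50)
)

# One row per (variety, replicate) pair
long <- pivot_longer(df, R1:R3, names_to = "R", values_to = "value")

p <- ggplot(long, aes(varieties, value, fill = R)) + geom_col()
```

geom_col() stacks by default; add position = "dodge" if you want the clustered (side-by-side) layout mentioned in the title instead.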

Pre-defining number format of a data frame in R

Is there any way to pre-define a number format (e.g. rounding to a specified number of decimal places) for a data frame, so that whenever I add a new column it follows the same format?
I tried format {base}, but it only changes the format of the existing columns, not the ones I add afterwards.
A workable example is given below
mydf <- as.data.frame(matrix(rnorm(50), ncol=5))
mydf
V1 V2 V3 V4 V5
1 -1.3088022 -0.22088032 -1.8739405 1.65276442 1.21762297
2 1.1123253 -0.76042101 -0.1608188 0.39945804 -0.58674209
3 -0.9366654 0.92893610 -0.6905299 -0.37374892 -1.70539909
4 0.4619175 -0.28929198 1.0280021 -0.87998207 -0.34493824
5 -0.3741670 -0.61782368 -1.0435906 0.52166082 -0.29308408
6 -1.2283031 -0.37065379 0.8652538 0.05088202 -1.80997313
7 -1.1137726 -0.97878307 0.5045051 0.85442196 0.02932812
8 0.3373866 -0.46614754 -0.4642278 -0.38438002 -1.47251777
9 0.3245720 -0.06047061 -0.3273080 0.49145133 -0.86507348
10 1.6459180 -1.31076464 1.5627246 0.49841764 0.73895626
The following changes the format of the data frame:
mydf <- format(mydf, digits=2)
mydf
V1 V2 V3 V4 V5
1 -1.31 -0.22 -1.87 1.653 1.218
2 1.11 -0.76 -0.16 0.399 -0.587
3 -0.94 0.93 -0.69 -0.374 -1.705
4 0.46 -0.29 1.03 -0.880 -0.345
5 -0.37 -0.62 -1.04 0.522 -0.293
6 -1.23 -0.37 0.87 0.051 -1.810
7 -1.11 -0.98 0.50 0.854 0.029
8 0.34 -0.47 -0.46 -0.384 -1.473
9 0.32 -0.06 -0.33 0.491 -0.865
10 1.65 -1.31 1.56 0.498 0.739
But this formatting is not applied when I add a new column to the data frame; see below:
mydf$new <- rnorm(10)
mydf
V1 V2 V3 V4 V5 new
1 -1.31 -0.22 -1.87 1.653 1.218 0.30525117
2 1.11 -0.76 -0.16 0.399 -0.587 -1.83038790
3 -0.94 0.93 -0.69 -0.374 -1.705 0.34830499
4 0.46 -0.29 1.03 -0.880 -0.345 -0.66017888
5 -0.37 -0.62 -1.04 0.522 -0.293 0.03103741
6 -1.23 -0.37 0.87 0.051 -1.810 1.32809006
7 -1.11 -0.98 0.50 0.854 0.029 0.85428977
8 0.34 -0.47 -0.46 -0.384 -1.473 -0.51917266
9 0.32 -0.06 -0.33 0.491 -0.865 -0.37057104
10 1.65 -1.31 1.56 0.498 0.739 -1.32447706
I know I can adjust the digits using print {base}, but that also does not change the underlying format of the data frame. Any suggestions? Thanks in advance.
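As far as I know, a data frame has no persistent number-format attribute that future columns would inherit; format() actually converts the columns to character, which is why the effect doesn't carry over. A sketch of the two usual workarounds, depending on whether you want to change the stored values or only the display:

```r
mydf <- as.data.frame(matrix(rnorm(50), ncol = 5))

# 1) Round the stored values themselves (re-apply after adding columns)
mydf2 <- as.data.frame(lapply(mydf, round, digits = 2))

# 2) Leave the data untouched and only change how it prints
print(mydf, digits = 2)   # one-off
options(digits = 3)       # session-wide default for all printing
mydf
```

Option 1 loses precision permanently; option 2 keeps full precision in the data and affects only what you see, so it is usually the safer choice.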
