Interpolate NA values when column ends on NA - r

I have a column of numeric data that contains NA values and also ends on NA:
df <- data.frame(
  Diam_av = c(12.3, 13, 15.5, NA, NA, NA, NA, 13.7, NA, NA, NA, 9.98, 4, 0, 8.76, NA, NA, NA)
)
I want to interpolate the missing values. This works fine with zoo's na.approx as long as there are non-NA boundary values to interpolate from, but it fails if, as in my case, one of the boundary values is NA (at the end of the column Diam_av):
library(zoo)
library(dplyr)
df %>%
  mutate(Diam_intpl = na.approx(Diam_av))
Error: Problem with `mutate()` input `Diam_intpl`.
x Input `Diam_intpl` can't be recycled to size 18.
ℹ Input `Diam_intpl` is `na.approx(Diam_av)`.
ℹ Input `Diam_intpl` must be size 18 or 1, not 15.
Any idea how to exclude/neutralize column-final NA values?

Add na.rm = FALSE to get rid of the error (the trailing NAs are then simply kept as NA). Add rule = 2 as well to extend the last non-NA value to the end of the series.
df %>%
  mutate(Diam_intpl = na.approx(Diam_av, na.rm = FALSE),
         Diam_intpl2 = na.approx(Diam_av, na.rm = FALSE, rule = 2))
Diam_av Diam_intpl Diam_intpl2
1 12.30 12.30 12.30
2 13.00 13.00 13.00
3 15.50 15.50 15.50
4 NA 15.14 15.14
5 NA 14.78 14.78
6 NA 14.42 14.42
7 NA 14.06 14.06
8 13.70 13.70 13.70
9 NA 12.77 12.77
10 NA 11.84 11.84
11 NA 10.91 10.91
12 9.98 9.98 9.98
13 4.00 4.00 4.00
14 0.00 0.00 0.00
15 8.76 8.76 8.76
16 NA NA 8.76
17 NA NA 8.76
18 NA NA 8.76
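If carrying the last value forward (rule = 2) is too crude, zoo also offers na.spline, which fits a spline through the observed points and extrapolates the leading/trailing NAs; a minimal sketch (extrapolated tail values can overshoot, so inspect them before trusting them):

```r
library(zoo)

x <- c(12.3, 13, 15.5, NA, NA, NA, NA, 13.7, NA, NA, NA,
       9.98, 4, 0, 8.76, NA, NA, NA)

# na.spline interpolates the interior NAs and extrapolates the trailing
# ones, so the result has the same length as the input (no recycling error)
filled <- na.spline(x)
length(filled)  # 18
```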

If I understand correctly, you can replace the NAs with imputeTS::na_interpolation(), which has many options:
library(imputeTS)
df$interpolated <- na_interpolation(df, option = "linear")$Diam_av
Diam_av interpolated
1 12.30 12.30
2 13.00 13.00
3 15.50 15.50
4 NA 15.14
5 NA 14.78
6 NA 14.42
7 NA 14.06
8 13.70 13.70
9 NA 12.77
10 NA 11.84
11 NA 10.91
12 9.98 9.98
13 4.00 4.00
14 0.00 0.00
15 8.76 8.76
16 NA 8.76
17 NA 8.76
18 NA 8.76
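Besides "linear", na_interpolation() also accepts "spline" and "stine" for its option argument; a small sketch comparing them on the same vector (the non-linear methods will give different values in the gaps and at the tail):

```r
library(imputeTS)

x <- c(12.3, 13, 15.5, NA, NA, NA, NA, 13.7, NA, NA, NA,
       9.98, 4, 0, 8.76, NA, NA, NA)

cbind(
  linear = na_interpolation(x, option = "linear"),
  spline = na_interpolation(x, option = "spline"),  # cubic spline
  stine  = na_interpolation(x, option = "stine")    # Stineman interpolation
)
```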

Quarterly year-to-year changes

I have a quarterly time series. I am trying to apply a function which is supposed to calculate the year-to-year growth, the year-to-year difference, and multiply one variable by (-1).
I already used a similar function for calculating quarter-to-quarter changes and it worked.
I modified this function for YoY changes, but it has no effect on my data frame, and no error pops up either.
Do you have any suggestion how to modify the function, or how to apply the YoY change function to a time series?
Here is the code:
Date <- c("2004-01-01","2004-04-01", "2004-07-01","2004-10-01","2005-01-01","2005-04-01","2005-07-01","2005-10-01","2006-01-01","2006-04-01","2006-07-01","2006-10-01","2007-01-01","2007-04-01","2007-07-01","2007-10-01")
B1 <- c(3189.30,3482.05,3792.03,4128.66,4443.62,4876.54,5393.01,5885.01,6360.00,6930.00,7430.00,7901.00,8279.00,8867.00,9439.00,10101.00)
B2 <- c(7939.97,7950.58,7834.06,7746.23,7760.59,8209.00,8583.05,8930.74,9424.00,9992.00,10041.00,10900.00,11149.00,12022.00,12662.00,13470.00)
B3 <- as.numeric(c("","","","",140.20,140.30,147.30,151.20,159.60,165.60,173.20,177.30,185.30,199.30,217.10,234.90))
B4 <- as.numeric(c("","","","",-3.50,-14.60,-11.60,-10.20,-3.10,-16.00,-4.90,-17.60,-5.30,-10.90,-12.80,-8.40))
df <- data.frame(Date,B1,B2,B3,B4)
The code will produce following data frame:
Date B1 B2 B3 B4
1 2004-01-01 3189.30 7939.97 NA NA
2 2004-04-01 3482.05 7950.58 NA NA
3 2004-07-01 3792.03 7834.06 NA NA
4 2004-10-01 4128.66 7746.23 NA NA
5 2005-01-01 4443.62 7760.59 140.2 -3.5
6 2005-04-01 4876.54 8209.00 140.3 -14.6
7 2005-07-01 5393.01 8583.05 147.3 -11.6
8 2005-10-01 5885.01 8930.74 151.2 -10.2
9 2006-01-01 6360.00 9424.00 159.6 -3.1
10 2006-04-01 6930.00 9992.00 165.6 -16.0
11 2006-07-01 7430.00 10041.00 173.2 -4.9
12 2006-10-01 7901.00 10900.00 177.3 -17.6
13 2007-01-01 8279.00 11149.00 185.3 -5.3
14 2007-04-01 8867.00 12022.00 199.3 -10.9
15 2007-07-01 9439.00 12662.00 217.1 -12.8
16 2007-10-01 10101.00 13470.00 234.9 -8.4
And I want to apply following changes on the variables:
# yoy absolute difference change
abs.diff = c("B1","B2")
# yoy percentage change
percent.change = c("B3")
# make the variable negative
negative = c("B4")
This is the function that I am trying to use on my data frame.
transformation <- function(D, abs.diff, percent.change, negative) {
  TT <- dim(D)[1]
  DData <- D[-1, ]
  nms <- c()
  for (i in 2:dim(D)[2]) {
    # yoy absolute difference change
    if (names(D)[i] %in% abs.diff) {
      DData[, i] <- D[5:TT, i] - D[1:(TT - 4), i]
      names(DData)[i] <- paste('a', names(D)[i], sep = '')
    }
    # yoy percent. change
    if (names(D)[i] %in% percent.change) {
      DData[, i] <- 100 * (D[5:TT, i] - D[1:(TT - 4), i]) / D[1:(TT - 4), i]
      names(DData)[i] <- paste('p', names(D)[i], sep = '')
    }
    # CA.deficit
    if (names(D)[i] %in% negative) {
      DData[, i] <- (-1) * D[1:TT, i]
    }
  }
  return(DData)
}
This is what I would like to get:
Date pB1 pB2 aB3 B4
1 2004-01-01 NA NA NA NA
2 2004-04-01 NA NA NA NA
3 2004-07-01 NA NA NA NA
4 2004-10-01 NA NA NA NA
5 2005-01-01 39.33 -2.26 NA 3.5
6 2005-04-01 40.05 3.25 NA 14.6
7 2005-07-01 42.22 9.56 NA 11.6
8 2005-10-01 42.54 15.29 11.0 10.2
9 2006-01-01 43.13 21.43 19.3 3.1
10 2006-04-01 42.11 21.72 18.3 16.0
11 2006-07-01 37.77 16.99 22.0 4.9
12 2006-10-01 34.26 22.05 17.7 17.6
13 2007-01-01 30.17 18.3 19.7 5.3
14 2007-04-01 27.95 20.32 26.1 10.9
15 2007-07-01 27.04 26.1 39.8 12.8
16 2007-10-01 27.84 23.58 49.6 8.4
Group by month, i.e. the 6th and 7th characters of the date string, using ave, and do the necessary calculations within each group. With sapply we can loop over the columns.
f <- function(x) {
  g <- substr(Date, 6, 7)
  l <- length(unique(g))
  o <- ave(x, g, FUN = function(x) 100 / x * c(x[-1], NA) - 100)
  c(rep(NA, l), head(o, -4))
}
cbind(df[1], sapply(df[-1], f))
# Date B1 B2 B3 B4
# 1 2004-01-01 NA NA NA NA
# 2 2004-04-01 NA NA NA NA
# 3 2004-07-01 NA NA NA NA
# 4 2004-10-01 NA NA NA NA
# 5 2005-01-01 39.32901 -2.259202 NA NA
# 6 2005-04-01 40.04796 3.250329 NA NA
# 7 2005-07-01 42.21960 9.560688 NA NA
# 8 2005-10-01 42.54044 15.291439 NA NA
# 9 2006-01-01 43.12655 21.434066 13.83738 -11.428571
# 10 2006-04-01 42.10895 21.720063 18.03279 9.589041
# 11 2006-07-01 37.77093 16.986386 17.58316 -57.758621
# 12 2006-10-01 34.25636 22.050356 17.26190 72.549020
# 13 2007-01-01 30.17296 18.304329 16.10276 70.967742
# 14 2007-04-01 27.95094 20.316253 20.35024 -31.875000
# 15 2007-07-01 27.03903 26.102978 25.34642 161.224490
# 16 2007-10-01 27.84458 23.577982 32.48731 -52.272727
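For a regular quarterly series sorted by date, the same YoY transformations can also be written with dplyr::lag using a lag of 4 rows; a sketch, assuming the df built above (nB4 is a made-up name for the sign-flipped column):

```r
library(dplyr)

df_yoy <- df %>%
  mutate(
    pB1 = 100 * (B1 - lag(B1, 4)) / lag(B1, 4),  # yoy percentage change
    pB2 = 100 * (B2 - lag(B2, 4)) / lag(B2, 4),
    aB3 = B3 - lag(B3, 4),                       # yoy absolute difference
    nB4 = -B4                                    # sign flip
  ) %>%
  select(Date, pB1, pB2, aB3, nB4)
```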

Replace all duplicated values with NA

My question is similar to replace duplicate values with NA in time series data using dplyr, but I am applying it to other time series like the ones below:
box_num date x y
6-WQ 2018-11-18 20.2 8
6-WQ 2018-11-25 500.75 7.2
6-WQ 2018-12-2 500.75 23
25-LR 2018-11-18 374.95 4.3
25-LR 2018-11-25 0.134 9.3
25-LR 2018-12-2 0.134 4
73-IU 2018-12-2 225.54 0.7562
73-IU 2018-12-9 28 0.7562
73-IU 2018-12-16 225.54 52.8
library(dplyr)
df %>%
  group_by(box_num) %>%
  mutate_at(vars(x:y), funs(replace(., duplicated(.), NA)))
The above code can identify duplicates and replace them with NA, but the underlying problem is that in a later step I am trying to replace all NAs with a linear trend, since it's a time series. For box_num 6-WQ, right after 20.2 there is a large shift, so we can tell the repeated value is an imputed one, and I would like to replace both of those imputed values with NA. The other case is box_num 73-IU, where the imputed value was entered again one week later; I would likewise like to replace those imputed values with NA.
Expected output :
box_num date x y
6-WQ 2018-11-18 20.2 8
6-WQ 2018-11-25 NA 7.2
6-WQ 2018-12-2 NA 23
25-LR 2018-11-18 374.95 4.3
25-LR 2018-11-25 NA 9.3
25-LR 2018-12-2 NA 4
73-IU 2018-12-2 NA NA
73-IU 2018-12-9 28 NA
73-IU 2018-12-16 NA 52.8
foo <- function(x) {
  replace(x, ave(x, x, FUN = length) > 1, NA)
}
myCols <- c("x", "y")
df1[myCols] <- lapply(df1[myCols], foo)
df1
# box_num date x y
#1 6-WQ 2018-11-18 20.20 8.0
#2 6-WQ 2018-11-25 NA 7.2
#3 6-WQ 2018-12-2 NA 23.0
#4 25-LR 2018-11-18 374.95 4.3
#5 25-LR 2018-11-25 NA 9.3
#6 25-LR 2018-12-2 NA 4.0
#7 73-IU 2018-12-2 NA NA
#8 73-IU 2018-12-9 28.00 NA
#9 73-IU 2018-12-16 NA 52.8
#DATA
df1 = structure(list(box_num = c("6-WQ", "6-WQ", "6-WQ", "25-LR", "25-LR",
"25-LR", "73-IU", "73-IU", "73-IU"), date = c("2018-11-18", "2018-11-25",
"2018-12-2", "2018-11-18", "2018-11-25", "2018-12-2", "2018-12-2",
"2018-12-9", "2018-12-16"), x = c(20.2, 500.75, 500.75, 374.95,
0.134, 0.134, 225.54, 28, 225.54), y = c(8, 7.2, 23, 4.3, 9.3,
4, 0.7562, 0.7562, 52.8)), class = "data.frame", row.names = c(NA,
-9L))
With tidyverse you can do:
df %>%
group_by(box_num) %>%
mutate_at(vars(x:y), funs(ifelse(. %in% subset(rle(sort(.))$values, rle(sort(.))$length > 1), NA, .)))
box_num date x y
<fct> <fct> <dbl> <dbl>
1 6-WQ 2018-11-18 20.2 8.00
2 6-WQ 2018-11-25 NA 7.20
3 6-WQ 2018-12-2 NA 23.0
4 25-LR 2018-11-18 375. 4.30
5 25-LR 2018-11-25 NA 9.30
6 25-LR 2018-12-2 NA 4.00
7 73-IU 2018-12-2 NA NA
8 73-IU 2018-12-9 28.0 NA
9 73-IU 2018-12-16 NA 52.8
First, it sorts the values in "x" and "y" and computes the run length of equal values. Second, it creates a subset for those values that have a run length > 1. Finally, it compares whether the values in "x" and "y" are in the subset, and if so, they get NA.
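The same "flag every copy of a value that occurs more than once" idea can be written in base R with duplicated() scanned from both ends, avoiding the sort/rle machinery; a sketch using the df1 data from the base-R answer (like that answer, it runs over whole columns rather than per box_num, which works here because no duplicated value spans two boxes):

```r
mark_dups <- function(x) {
  # duplicated(x) flags the 2nd..nth copies; adding fromLast = TRUE also
  # flags the first copy, so every repeated value becomes NA
  replace(x, duplicated(x) | duplicated(x, fromLast = TRUE), NA)
}

df1[c("x", "y")] <- lapply(df1[c("x", "y")], mark_dups)
```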

Merging zoo objects using do.call in R

I have several csv files like this:
,timestamp,AirTemperature_House
1,2013-09-01 00:00:00,8.22
2,2013-09-01 01:00:00,6.53
3,2013-09-01 02:00:00,6.67
4,2013-09-01 03:00:00,5.58
5,2013-09-01 04:00:00,4.16
6,2013-09-01 05:00:00,4.76
7,2013-09-01 06:00:00,5.06
8,2013-09-01 07:00:00,5.16
9,2013-09-01 08:00:00,6.83
10,2013-09-01 09:00:00,8.59
11,2013-09-01 10:00:00,10.99
12,2013-09-01 11:00:00,11.08
I grouped them into a list of zoo objects using the following code:
raw_data <- list.files(path = "./AWS_Data_STU/Air_temp/", pattern = "Air", full.names = TRUE)
data_stu <- lapply(raw_data, function(x) {
  ss <- read.csv(x)
  zoo(ss, order.by = ss$timestamp)
})
This gives a list of zoo objects which all look like this one:
str(data_stu[[1]])
‘zoo’ series from 2013-09-01 00:00:00 to 2014-04-30 23:00:00
Data: num [1:5808] 8.22 6.53 6.67 5.58 4.16 4.76 5.06 5.16 6.83 8.59 ...
Index: Factor w/ 5808 levels "2013-09-01 00:00:00",..: 1 2 3 4 5 6 7 8 9 10
...
I want to merge all my list to a data frame as :
X1 x2 x3 X4 x5 x6 x7
1 12.95 NA NA NA
2 14.81 14.37 NA NA 12.78 NA
3 15.02 15.11 NA NA 12.61 NA
4 13.91 14.25 NA NA 11.89 NA
5 12.34 13.96 NA NA 10.86 NA
6 14.40 14.47 NA NA 10.40 NA
I used do.call:
do.call(merge.zoo,data_stu )
structure(c(7.66, 7.29, 7.34, 7.15, 6.76, 6.41, 6.25, 6.36, 6.78,
1 NA
2 NA
3 NA
4 NA
5 NA
6 NA
7 NA
8
but it gave me only NAs.
Any ideas?
The problem is that the indexes of all your zoo objects are factors. You need to convert them to POSIXct. Also, you should not call methods directly, i.e. you should call merge instead of merge.zoo and let R handle method dispatch.
You can also use read.zoo to help with the conversion.
data_stu <- do.call(merge, lapply(raw_data, read.zoo, sep = ",", header = TRUE,
                                  FUN = as.POSIXct, colClasses = c("NULL", "character", "numeric")))
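If the list of zoo objects has already been built as in the question, the factor index can instead be repaired in place before merging; a sketch, assuming data_stu from above:

```r
library(zoo)

# replace each object's factor index with POSIXct so merge() can align them
data_stu <- lapply(data_stu, function(z) {
  index(z) <- as.POSIXct(as.character(index(z)))
  z
})
merged <- do.call(merge, data_stu)
```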

Create a Custom Function that Extracts Certain Rows

head(MYK)
X Analyte Subject Cohort DayNominal HourNominal Concentration uniqueID FS EF VTI deltaFS deltaEF deltaVTI HR
2 MYK-461 005-010 1 1 0.25 31.00 005-0100.25 31.82 64.86 0.00 3 -1 -100 58
3 MYK-461 005-010 1 1 0.50 31.80 005-0100.5 NA NA NA NA NA NA NA
4 MYK-461 005-010 1 1 1.00 9.69 005-0101 26.13 69.11 0.00 -15 6 -100 55
5 MYK-461 005-010 1 1 1.50 8.01 005-0101.5 NA NA NA NA NA NA NA
6 MYK-461 005-010 1 1 2.00 5.25 005-0102 NA NA NA NA NA NA NA
7 MYK-461 005-010 1 1 3.00 3.26 005-0103 29.89 60.99 23.49 -3 -7 9 55
105 MYK-461 005-033 2 1 0.25 3.4 005-0330.25 30.18 68.59 23.22 1 0 16 47
106 MYK-461 005-033 2 1 0.50 12.4 005-0330.5 NA NA NA NA NA NA NA
107 MYK-461 005-033 2 1 0.75 27.1 005-0330.75 NA NA NA NA NA NA NA
108 MYK-461 005-033 2 1 1.00 23.5 005-0331 32.12 69.60 21.06 7 2 5 43
109 MYK-461 005-033 2 1 1.50 16.8 005-0331.5 NA NA NA NA NA NA NA
110 MYK-461 005-033 2 1 2.00 15.8 005-0332 NA NA NA NA NA NA NA
organize <- function(x, y) {
  g1 <- subset(x, Cohort == y)
  g1 <- aggregate(x[, 'Concentration'], by = list(x[, 'HourNominal']), FUN = mean)
  g1 <- setNames(g1, c('HourNominal', 'Concentration'))
  g2 <- aggregate(x[, 'Concentration'], by = list(x[, 'HourNominal']), FUN = sd)
  g2 <- setNames(g2, c('HourNominal', 'SD'))
  g1[, 'SD'] <- g2$SD
  g1$top <- g1$Concentration + g1$SD
  g1$bottom <- g1$Concentration - g1$SD
  return(g1)
}
I have a data frame here, along with some code to subset the data frame based on a certain Cohort and to aggregate the Concentration based on Hour. However, all of the resulting data frames look the same.
CA1 = organize(MYK, 1)
CA2 = organize(MYK, 2)
Yet whenever I use these two commands, the two datasets are identical.
I want a dataset that looks like
HourNominal Concentration SD top bottom
1 0.25 27.287500 25.112204 52.399704 2.1752958
2 0.50 41.989722 32.856013 74.845735 9.1337094
3 0.75 49.866667 22.485254 72.351921 27.3814122
4 1.00 107.168889 104.612098 211.780987 2.5567908
5 1.50 191.766389 264.375466 456.141855 -72.6090774
6 1.75 319.233333 290.685423 609.918757 28.5479100
7 2.00 226.785278 272.983234 499.768512 -46.1979560
8 2.25 341.145833 301.555769 642.701602 39.5900645
9 2.50 341.145833 319.099679 660.245512 22.0461542
10 3.00 195.303333 276.530533 471.833866 -81.2271993
11 4.00 107.913889 140.251991 248.165880 -32.3381024
12 6.00 50.174167 64.700785 114.874952 -14.5266184
13 8.00 38.132639 47.099796 85.232435 -8.9671572
14 12.00 31.404444 39.667850 71.072294 -8.2634051
15 24.00 33.488583 41.267392 74.755975 -7.7788087
16 48.00 29.304833 38.233776 67.538609 -8.9289422
17 72.00 7.322792 6.548898 13.871690 0.7738932
18 96.00 7.002833 6.350251 13.353085 0.6525821
19 144.00 6.463875 5.612630 12.076505 0.8512452
20 216.00 5.007792 4.808156 9.815948 0.1996353
21 312.00 3.964727 4.351626 8.316353 -0.3868988
22 480.00 2.452857 3.220947 5.673804 -0.7680897
23 648.00 1.826625 2.569129 4.395754 -0.7425044
The problem is that even when I try to separate the values by Cohort, the two data frames have the same content. They should not be identical.
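No answer is recorded here, but the likely cause is visible in organize() itself: after subsetting into g1, both aggregate() calls operate on the full data frame x, so the Cohort filter is thrown away and every call returns the same result. A sketch of the corrected function (same structure, but aggregating the subset):

```r
organize <- function(x, y) {
  g1 <- subset(x, Cohort == y)
  # aggregate the subset g1, not the original data frame x
  res <- aggregate(g1[, 'Concentration'], by = list(g1[, 'HourNominal']), FUN = mean)
  res <- setNames(res, c('HourNominal', 'Concentration'))
  g2  <- aggregate(g1[, 'Concentration'], by = list(g1[, 'HourNominal']), FUN = sd)
  g2  <- setNames(g2, c('HourNominal', 'SD'))
  res$SD     <- g2$SD
  res$top    <- res$Concentration + res$SD
  res$bottom <- res$Concentration - res$SD
  res
}
```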

R script to format datatable to exactly 2 decimal places

I have made a datatable "Event_Table" with 46 rows and 6 columns. At some point I export this to a text file and would like the output of some fields to be truncated to exactly 2 decimal places.
Event_Table[1:34,3:6]=round(Event_Table[1:34,3:6])
Event_Table[36:39,3:6]=format(round(Event_Table[36:39,3:6],2), nsmall=2)
Event_Table[41:46,3:6]=format(round(Event_Table[41:46,3:6],2), nsmall=2)
Lines 1 and 2 produce the desired result, but subsequently running line 3 throws an error:
Error in Math.data.frame(list(CO = c("0", "0", "0.786407766990291", "0", :
non-numeric variable in data frame: CONCONATotal
Why? If I remove line 2, then line 3 runs fine. So something about setting the formatting in one part of the table affects the entire table and prevents a second format command from being possible (even though the formatting is only applied to discrete parts of the table). Any ideas how to avoid this, or how to achieve what is required in a different way?
EDIT:
I should perhaps add that the following code is not quite sufficient:
Event_Table[36:46,3:6]=round(Event_Table[36:46,3:6], digits=2)
Trailing zeros are truncated, i.e. a value of 1 is displayed as "1", not as "1.00", the latter being what is required.
EDIT2:
Here is the table:
ChrSize Chr CO NCO NA Total
1 230218 1 4.00 1.00 0 5.00
2 813184 2 6.00 6.00 0 12.00
3 316620 3 2.00 3.00 0 5.00
4 1531933 4 13.00 20.00 0 33.00
5 576874 5 3.00 8.00 0 11.00
6 270161 6 4.00 2.00 0 6.00
7 1090940 7 11.00 5.00 0 16.00
8 562643 8 5.00 9.00 0 14.00
9 439888 9 6.00 3.00 0 9.00
10 745751 10 10.00 6.00 0 16.00
11 666816 11 3.00 7.00 0 10.00
12 1078177 12 11.00 13.00 1 25.00
13 924431 13 7.00 12.00 0 19.00
14 784333 14 5.00 6.00 1 12.00
15 1091291 15 6.00 17.00 0 23.00
16 948066 16 7.00 6.00 0 13.00
17 12071326 TOTAL 103.00 124.00 2 229.00
18 NA Event Lengths: NA NA NA NA
19 NA Min Len 0.00 22.00 0 0.00
20 NA Max Len 14745.00 12524.00 0 14745.00
21 NA Mean Len 2588.00 1826.00 0 2153.00
22 NA Median Len 1820.00 1029.00 0 1322.00
23 NA Chromatids: NA NA NA NA
24 NA 1_chrom 0.00 98.00 2 100.00
25 NA 2_chrom 81.00 22.00 0 103.00
26 NA 3_chrom 14.00 4.00 0 18.00
27 NA 4_chrom 8.00 0.00 0 8.00
28 NA Classe: NA NA NA NA
29 NA 1_1brin 0.00 55.00 0 55.00
30 NA 1_2brins 0.00 43.00 2 45.00
31 NA 2_nonsis 81.00 15.00 0 96.00
32 NA 2_sis 0.00 7.00 0 7.00
33 NA classe_3 14.00 4.00 0 18.00
34 NA classe_4 8.00 0.00 0 8.00
35 NA Fraction of Chromatids: NA NA NA NA
36 NA 1_chrom 0.00 0.79 1 0.44
37 NA 2_chrom 0.79 0.18 0 0.45
38 NA 3_chrom 0.14 0.03 0 0.08
39 NA 4_chrom 0.08 0.00 0 0.03
40 NA Fraction of each Classe: NA NA NA NA
41 NA 1_1brin 0.00 0.44 0 0.24
42 NA 1_2brins 0.00 0.35 1 0.20
43 NA 2_nonsis 0.79 0.12 0 0.42
44 NA 2_sis 0.00 0.06 0 0.03
45 NA classe_3 0.14 0.03 0 0.08
46 NA classe_4 0.08 0.00 0 0.03
I require rows 1-34 formatted without decimals.
And rows 36-46 formatted with precisely 2 decimal places for all values.
EDIT3: The initial data is read sequentially into tables called "data", then a derivative output table "Event_Table" is generated, into which I insert summaries of various aspects of each "data" table (i.e. totals, means, medians etc). I then sequentially export the "Event_Tables", since these contain the required summary information for each "data" table.
Here is the start of the code:
# FIRST SET WORKING DIRECTORY WHERE INPUT FILES ARE!
files = list.files(pattern="Events_") # import files names with "Event_" string into variable "files"
files1 = length(files) # Count number of files
files2 = read.table(text = files, sep = "_", as.is = TRUE) #Split file names by "_" separator and create table "files2"
for (j in 1:files1) {
  data <- read.table(files[j], header = TRUE)  # Import data table number j
  # Making derivative dataframes:
  Event_Table <- data.frame(matrix(NA, nrow = 46, ncol = 6))  # Creates dataframe of the required size, full of NAs
  names(Event_Table) <- c("ChrSize", "Chr", "CO", "NCO", "NA", "Total")  # Adds column names to dataframe
  Event_Table["Chr"] <- c(1:16, "TOTAL", "Event Lengths:", "Min Len", "Max Len", "Mean Len", "Median Len", "Chromatids:", "1_chrom", "2_chrom", "3_chrom", "4_chrom", "Classe:", "1_1brin", "1_2brins", "2_nonsis", "2_sis", "classe_3", "classe_4", "Fraction of Chromatids:", "1_chrom", "2_chrom", "3_chrom", "4_chrom", "Fraction of each Classe:", "1_1brin", "1_2brins", "2_nonsis", "2_sis", "classe_3", "classe_4")  # Inserts the row labels into the "Chr" column
  Event_Table[1:16, "ChrSize"] <- c(230218, 813184, 316620, 1531933, 576874, 270161, 1090940, 562643, 439888, 745751, 666816, 1078177, 924431, 784333, 1091291, 948066)
  Event_Table[17, "ChrSize"] <- sum(Event_Table[1:16, "ChrSize"])
  nE <- nrow(data)  # Total number of events
  Event_Table[17, "Total"] <- nE
  Event_Table[19, "Total"] <- min(data[, "len"])
  Event_Table[20, "Total"] <- max(data[, "len"])
  Event_Table[21, "Total"] <- mean(data[, "len"])
  Event_Table[22, "Total"] <- median(data[, "len"])
  # More stuff here, etc, then close the j loop
}
So the Event_Table is set up as a data.frame of type matrix filled with NAs.
I then fill it manually with relevant info in relevant grid positions.
I then simply want to format the visual appearance of these fields.
If I am going about this all wrong, then please suggest a better way to do it! Thanks
Here is a proof of concept using 2 rather different data frames:
DF1 <- data.frame(x = rnorm(10), person = rep(LETTERS[1:2], 5))
DF2 <- data.frame(y = 1:10L, result = rep(LETTERS[3:4], 5), alt = rep(letters[3:4], 5))
write.table(DF1, file = "example.csv", sep = ",")
write.table(DF2, file = "example.csv", sep = ",", append = TRUE)
This issues a warning (about column names - no problem) and gives:
x person
1 0.796933543 A
2 1.495800567 B
3 0.359153458 A
4 2.105378598 B
5 0.175455314 A
6 -1.850171347 B
7 -0.87197177 A
8 2.682650638 B
9 1.040676847 A
10 -0.086197042 B
y result alt
1 1 C c
2 2 D d
3 3 C c
4 4 D d
5 5 C c
6 6 D d
7 7 C c
8 8 D d
9 9 C c
10 10 D d
From here you can control the formatting as desired. You may wish to suppress the column names or give more informative ones, and you probably don't want the row numbering either. See ?write.table for all the options.
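Tying this back to the original error: format() coerces the affected columns to character, so after line 2 runs, round() in line 3 hits character data and fails. One way around it, sketched here, is to keep Event_Table numeric and build a character copy only at export time (sprintf gives fixed decimal places; the output file name is made up):

```r
out <- Event_Table
# format whole columns as character first (2 decimals everywhere) ...
out[3:6] <- lapply(Event_Table[3:6], function(x) sprintf("%.2f", x))
# ... then overwrite rows 1-34 with the no-decimal version
out[1:34, 3:6] <- lapply(Event_Table[1:34, 3:6], function(x) sprintf("%.0f", x))
write.table(out, "Event_Table.txt", sep = "\t", quote = FALSE, row.names = FALSE)
```

This way the numeric table stays usable for further computation, and only the exported copy carries the display formatting.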
It could be a similar problem as Error in Math.data.frame.....non-numeric variable in data frame:. Maybe you have commas in your data. If that is not the case, could you show what is in your table?