R monthly average from monthly time series data

I have monthly rainfall data in a time series running from January 1983 to December 2012.
One.Month.RainfallSJ.inch <- window(TS.RainfallSJ_inch, start=c(1983, 1), end=c(2012, 12))
One.Month.RainfallSJ.inch
Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
1983 7.41 4.87 5.92 3.90 0.15 0.00 0.00 0.02 1.08 0.19 5.26 3.82
1984 0.17 1.44 0.90 0.54 0.00 0.01 0.00 0.00 0.02 1.75 3.94 1.73
1985 0.74 0.76 2.98 0.48 0.23 0.00 0.13 0.00 0.35 0.98 2.47 1.40
1986 2.41 6.05 3.99 0.66 0.16 0.00 0.00 0.00 1.02 0.08 0.17 0.85
1987 1.60 2.10 1.87 0.14 0.00 0.00 0.00 0.00 0.00 0.93 1.65 3.31
1988 2.08 0.62 0.06 1.82 0.66 0.00 0.00 0.00 0.00 0.06 1.42 2.14
1989 1.06 1.07 1.91 0.57 0.09 0.00 0.00 0.00 0.83 1.33 0.80 0.04
1990 1.93 1.61 0.89 0.22 2.38 0.00 0.15 0.00 0.24 0.25 0.24 2.03
1991 0.18 2.22 6.17 0.18 0.15 0.06 0.00 0.04 0.12 0.85 0.43 2.43
1992 1.73 6.59 3.37 0.42 0.00 0.25 0.00 0.00 0.00 0.66 0.05 4.51
1993 6.98 4.71 2.81 0.54 0.47 0.54 0.00 0.00 0.00 0.67 2.17 1.99
1994 1.33 3.03 0.44 1.47 1.21 0.01 0.00 0.00 0.07 0.27 2.37 1.76
1995 8.66 0.53 6.85 1.06 1.27 0.84 0.01 0.00 0.00 0.00 0.05 4.71
1996 3.03 4.85 2.62 0.75 1.42 0.00 0.00 0.00 0.01 1.08 1.65 4.78
1997 6.80 0.14 0.17 0.11 0.55 0.21 0.00 0.51 0.00 0.69 5.01 1.85
1998 4.81 10.23 2.40 1.46 1.93 0.00 0.00 0.00 0.05 0.60 1.77 0.72
1999 3.25 2.88 2.69 1.56 0.02 0.14 0.14 0.00 0.00 0.00 0.50 0.55
2000 3.57 4.56 1.69 0.74 0.40 0.30 0.00 0.01 0.12 2.16 0.44 0.31
2001 2.87 4.44 1.71 1.48 0.00 0.13 0.00 0.00 0.13 0.12 2.12 4.47
2002 0.75 0.81 1.80 0.35 0.68 0.00 0.00 0.00 0.00 0.00 1.99 6.60
2003 0.65 1.65 0.77 2.95 0.72 0.00 0.00 0.03 0.03 0.00 1.91 4.91
2004 1.61 4.28 0.49 0.40 0.08 0.00 0.00 0.00 0.15 3.04 0.73 4.32
2005 3.47 5.31 3.55 2.52 0.00 0.00 0.01 0.00 0.00 0.10 0.45 5.47
2006 2.94 2.39 6.55 4.55 0.45 0.00 0.00 0.00 0.00 0.39 1.38 1.77
2007 0.93 3.49 0.46 0.96 0.08 0.00 0.01 0.00 0.26 1.13 0.55 1.18
2008 5.81 1.81 0.15 0.03 0.00 0.00 0.00 0.00 0.00 0.19 1.33 1.53
2009 1.30 5.16 1.89 0.30 0.09 0.01 0.00 0.02 0.19 2.41 0.41 2.16
2010 4.58 2.12 2.05 3.03 0.35 0.00 0.00 0.00 0.00 0.25 1.76 2.53
2011 0.96 3.15 4.32 0.20 0.40 1.51 0.00 0.00 0.00 0.77 0.08 0.08
2012 0.90 0.63 1.98 1.88 0.00 0.15 0.00 0.00 0.01 0.35 2.59 4.24
How can I compute the January average over 1983 to 2012, and likewise for each of the other months?
Thanks,
Nahm

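Try colMeans, which averages each column (month) across the years. It needs a two-dimensional object, so this assumes the data are held as a year-by-month matrix: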
colMeans(One.Month.RainfallSJ.inch)
# Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov
# 2.8170000 3.1166667 2.4483333 1.1756667 0.4646667 0.1386667 0.0150000 0.0210000 0.1560000 0.7100000 1.5230000
# Dec
# 2.6063333
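If the rainfall object is a plain univariate ts with frequency 12 rather than a matrix, colMeans would not apply directly. A minimal sketch of one alternative groups the observations by their position in the yearly cycle with cycle() and tapply(). The short series `rain` below is made-up illustration data, not the rainfall values above:

```r
# Made-up monthly series covering two years (illustration only).
rain <- ts(1:24, start = c(1983, 1), frequency = 12)

# cycle() gives each observation's position within the year
# (1 = Jan, ..., 12 = Dec), so tapply() averages all Januaries
# together, all Februaries together, and so on.
monthly_mean <- setNames(as.numeric(tapply(rain, cycle(rain), mean)), month.abb)
monthly_mean
```

Applied to the real series, the same call would be tapply(One.Month.RainfallSJ.inch, cycle(One.Month.RainfallSJ.inch), mean).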

Related

R profiling spending a lot of time using .External2

I am learning how to use R profiling, and have run the Rprof command on my code.
The summaryRprof function shows that a lot of time is spent in .External2. What is this? Additionally, a large proportion of the total time is attributed to <Anonymous>. Is there a way to find out what this is?
> summaryRprof("test")
$by.self
self.time self.pct total.time total.pct
".External2" 4.30 27.74 4.30 27.74
"format.POSIXlt" 2.70 17.42 2.90 18.71
"which.min" 2.38 15.35 4.12 26.58
"-" 1.30 8.39 1.30 8.39
"order" 1.16 7.48 1.16 7.48
"match" 0.58 3.74 0.58 3.74
"file" 0.44 2.84 0.44 2.84
"abs" 0.40 2.58 0.40 2.58
"scan" 0.30 1.94 0.30 1.94
"anyDuplicated.default" 0.20 1.29 0.20 1.29
"unique.default" 0.20 1.29 0.20 1.29
"unlist" 0.18 1.16 0.20 1.29
"c" 0.16 1.03 0.16 1.03
"data.frame" 0.14 0.90 0.22 1.42
"structure" 0.12 0.77 1.74 11.23
"as.POSIXct.POSIXlt" 0.12 0.77 0.12 0.77
"strptime" 0.12 0.77 0.12 0.77
"as.character" 0.08 0.52 0.90 5.81
"make.unique" 0.08 0.52 0.16 1.03
"[.data.frame" 0.06 0.39 1.54 9.94
"<Anonymous>" 0.04 0.26 4.34 28.00
"lapply" 0.04 0.26 1.70 10.97
"rbind" 0.04 0.26 0.94 6.06
"as.POSIXlt.POSIXct" 0.04 0.26 0.04 0.26
"ifelse" 0.04 0.26 0.04 0.26
"paste" 0.02 0.13 0.92 5.94
"merge.data.frame" 0.02 0.13 0.56 3.61
"[<-.factor" 0.02 0.13 0.52 3.35
"stopifnot" 0.02 0.13 0.04 0.26
".deparseOpts" 0.02 0.13 0.02 0.13
".External" 0.02 0.13 0.02 0.13
"close.connection" 0.02 0.13 0.02 0.13
"doTryCatch" 0.02 0.13 0.02 0.13
"is.na" 0.02 0.13 0.02 0.13
"is.na<-.default" 0.02 0.13 0.02 0.13
"mean" 0.02 0.13 0.02 0.13
"seq.int" 0.02 0.13 0.02 0.13
"sum" 0.02 0.13 0.02 0.13
"sys.function" 0.02 0.13 0.02 0.13
$by.total
total.time total.pct self.time self.pct
"write.table" 5.10 32.90 0.00 0.00
"<Anonymous>" 4.34 28.00 0.04 0.26
".External2" 4.30 27.74 4.30 27.74
"mapply" 4.22 27.23 0.00 0.00
"head" 4.16 26.84 0.00 0.00
"which.min" 4.12 26.58 2.38 15.35
"eval" 3.16 20.39 0.00 0.00
"eval.parent" 3.14 20.26 0.00 0.00
"write.csv" 3.14 20.26 0.00 0.00
"format" 2.92 18.84 0.00 0.00
"format.POSIXlt" 2.90 18.71 2.70 17.42
"do.call" 1.78 11.48 0.00 0.00
"structure" 1.74 11.23 0.12 0.77
"lapply" 1.70 10.97 0.04 0.26
"FUN" 1.66 10.71 0.00 0.00
"format.POSIXct" 1.62 10.45 0.00 0.00
"[.data.frame" 1.54 9.94 0.06 0.39
"[" 1.54 9.94 0.00 0.00
"-" 1.30 8.39 1.30 8.39
"order" 1.16 7.48 1.16 7.48
"rbind" 0.94 6.06 0.04 0.26
"paste" 0.92 5.94 0.02 0.13
"as.character" 0.90 5.81 0.08 0.52
"read.csv" 0.84 5.42 0.00 0.00
"read.table" 0.84 5.42 0.00 0.00
"as.character.POSIXt" 0.82 5.29 0.00 0.00
"match" 0.58 3.74 0.58 3.74
"merge.data.frame" 0.56 3.61 0.02 0.13
"merge" 0.56 3.61 0.00 0.00
"[<-.factor" 0.52 3.35 0.02 0.13
"[<-" 0.52 3.35 0.00 0.00
"strftime" 0.48 3.10 0.00 0.00
"file" 0.44 2.84 0.44 2.84
"weekdays" 0.42 2.71 0.00 0.00
"weekdays.POSIXt" 0.42 2.71 0.00 0.00
"abs" 0.40 2.58 0.40 2.58
"unique" 0.38 2.45 0.00 0.00
"scan" 0.30 1.94 0.30 1.94
"data.frame" 0.22 1.42 0.14 0.90
"cbind" 0.22 1.42 0.00 0.00
"anyDuplicated.default" 0.20 1.29 0.20 1.29
"unique.default" 0.20 1.29 0.20 1.29
"unlist" 0.20 1.29 0.18 1.16
"anyDuplicated" 0.20 1.29 0.00 0.00
"as.POSIXct" 0.18 1.16 0.00 0.00
"as.POSIXlt" 0.18 1.16 0.00 0.00
"c" 0.16 1.03 0.16 1.03
"make.unique" 0.16 1.03 0.08 0.52
"as.POSIXct.POSIXlt" 0.12 0.77 0.12 0.77
"strptime" 0.12 0.77 0.12 0.77
"as.POSIXlt.character" 0.12 0.77 0.00 0.00
"object.size" 0.12 0.77 0.00 0.00
"as.POSIXct.default" 0.10 0.65 0.00 0.00
"Ops.POSIXt" 0.08 0.52 0.00 0.00
"type.convert" 0.08 0.52 0.00 0.00
"!=" 0.06 0.39 0.00 0.00
"as.POSIXlt.factor" 0.06 0.39 0.00 0.00
"as.POSIXlt.POSIXct" 0.04 0.26 0.04 0.26
"ifelse" 0.04 0.26 0.04 0.26
"stopifnot" 0.04 0.26 0.02 0.13
"$" 0.04 0.26 0.00 0.00
"$.data.frame" 0.04 0.26 0.00 0.00
"[[" 0.04 0.26 0.00 0.00
"[[.data.frame" 0.04 0.26 0.00 0.00
"head.default" 0.04 0.26 0.00 0.00
".deparseOpts" 0.02 0.13 0.02 0.13
".External" 0.02 0.13 0.02 0.13
"close.connection" 0.02 0.13 0.02 0.13
"doTryCatch" 0.02 0.13 0.02 0.13
"is.na" 0.02 0.13 0.02 0.13
"is.na<-.default" 0.02 0.13 0.02 0.13
"mean" 0.02 0.13 0.02 0.13
"seq.int" 0.02 0.13 0.02 0.13
"sum" 0.02 0.13 0.02 0.13
"sys.function" 0.02 0.13 0.02 0.13
"%in%" 0.02 0.13 0.00 0.00
".rs.getSingleClass" 0.02 0.13 0.00 0.00
"[.POSIXlt" 0.02 0.13 0.00 0.00
"==" 0.02 0.13 0.00 0.00
"close" 0.02 0.13 0.00 0.00
"data.row.names" 0.02 0.13 0.00 0.00
"deparse" 0.02 0.13 0.00 0.00
"factor" 0.02 0.13 0.00 0.00
"is.na<-" 0.02 0.13 0.00 0.00
"match.arg" 0.02 0.13 0.00 0.00
"match.call" 0.02 0.13 0.00 0.00
"pushBack" 0.02 0.13 0.00 0.00
"seq" 0.02 0.13 0.00 0.00
"seq.POSIXt" 0.02 0.13 0.00 0.00
"simplify2array" 0.02 0.13 0.00 0.00
"tryCatch" 0.02 0.13 0.00 0.00
"tryCatchList" 0.02 0.13 0.00 0.00
"tryCatchOne" 0.02 0.13 0.00 0.00
"which" 0.02 0.13 0.00 0.00
$sample.interval
[1] 0.02
$sampling.time
[1] 15.5

corr.test arguments imply differing number of rows

I have seen this error multiple times in different projects and I was wondering if there is a way to tell which line caused the error in general?
My specific case:
http://archive.ics.uci.edu/ml/machine-learning-databases/00275/
#using the bike.csv
data<-read.csv("PATH_HERE\\Bike-Sharing-Dataset\\day.csv",header=TRUE)
require(psych)
corr.test(data)
data<-data[,c("atemp","casual","cnt","holiday","hum","mnth","registered",
"season","temp","weathersit","weekday","windspeed","workingday","yr")]
data[data=='']<-NA
#View(data)
require(psych)
cors<-corr.test(data)
This returns the error:
Error in data.frame(lower = lower, r = r[lower.tri(r)], upper = upper, :
arguments imply differing number of rows: 0, 91
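In general, calling traceback() immediately after an error prints the chain of calls that led to it; the message above already names the failing call, a data.frame() call rather than a line of your own script. A non-interactive sketch of the same idea follows. `calls_at_error` and `f` are hypothetical names made up for this illustration, and the data.frame() call inside `f` merely reproduces the same kind of length mismatch:

```r
# Hypothetical helper: evaluate an expression and, if it fails, return the
# call stack that was active at the moment the error was signalled.
calls_at_error <- function(expr) {
  stack <- NULL
  tryCatch(
    withCallingHandlers(
      expr,
      # Runs before the stack is unwound, so sys.calls() still sees it.
      error = function(e) stack <<- sys.calls()
    ),
    error = function(e) NULL  # swallow the error after recording the stack
  )
  vapply(as.list(stack),
         function(cl) paste(deparse(cl), collapse = " "),
         character(1))
}

# Stand-in for a library function that fails deep inside data.frame().
f <- function() data.frame(lower = numeric(0), r = 1:91)

call_stack <- calls_at_error(f())
call_stack
```

Reading the returned calls from top to bottom shows which of your own calls eventually reached the failing data.frame() call.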
It works for me:
> #using the bike.csv
> data <- read.csv("day.csv",header=TRUE)
> require(psych)
> corr.test(data)
Error in cor(x, use = use, method = method) : 'x' must be numeric
> data <- data[,c("atemp","casual","cnt","holiday","hum","mnth","registered",
+ "season","temp","weathersit","weekday","windspeed","workingday","yr")]
> data[data==''] <- NA
> #View(data)
>
> require(psych)
> cors <- corr.test(data)
> cors
Call:corr.test(x = data)
Correlation matrix
atemp casual cnt holiday hum mnth registered season temp
atemp 1.00 0.54 0.63 -0.03 0.14 0.23 0.54 0.34 0.99
casual 0.54 1.00 0.67 0.05 -0.08 0.12 0.40 0.21 0.54
cnt 0.63 0.67 1.00 -0.07 -0.10 0.28 0.95 0.41 0.63
holiday -0.03 0.05 -0.07 1.00 -0.02 0.02 -0.11 -0.01 -0.03
hum 0.14 -0.08 -0.10 -0.02 1.00 0.22 -0.09 0.21 0.13
mnth 0.23 0.12 0.28 0.02 0.22 1.00 0.29 0.83 0.22
registered 0.54 0.40 0.95 -0.11 -0.09 0.29 1.00 0.41 0.54
season 0.34 0.21 0.41 -0.01 0.21 0.83 0.41 1.00 0.33
temp 0.99 0.54 0.63 -0.03 0.13 0.22 0.54 0.33 1.00
weathersit -0.12 -0.25 -0.30 -0.03 0.59 0.04 -0.26 0.02 -0.12
weekday -0.01 0.06 0.07 -0.10 -0.05 0.01 0.06 0.00 0.00
windspeed -0.18 -0.17 -0.23 0.01 -0.25 -0.21 -0.22 -0.23 -0.16
workingday 0.05 -0.52 0.06 -0.25 0.02 -0.01 0.30 0.01 0.05
yr 0.05 0.25 0.57 0.01 -0.11 0.00 0.59 0.00 0.05
weathersit weekday windspeed workingday yr
atemp -0.12 -0.01 -0.18 0.05 0.05
casual -0.25 0.06 -0.17 -0.52 0.25
cnt -0.30 0.07 -0.23 0.06 0.57
holiday -0.03 -0.10 0.01 -0.25 0.01
hum 0.59 -0.05 -0.25 0.02 -0.11
mnth 0.04 0.01 -0.21 -0.01 0.00
registered -0.26 0.06 -0.22 0.30 0.59
season 0.02 0.00 -0.23 0.01 0.00
temp -0.12 0.00 -0.16 0.05 0.05
weathersit 1.00 0.03 0.04 0.06 -0.05
weekday 0.03 1.00 0.01 0.04 -0.01
windspeed 0.04 0.01 1.00 -0.02 -0.01
workingday 0.06 0.04 -0.02 1.00 0.00
yr -0.05 -0.01 -0.01 0.00 1.00
Sample Size
[1] 731
Probability values (Entries above the diagonal are adjusted for multiple tests.)
atemp casual cnt holiday hum mnth registered season temp
atemp 0.00 0.00 0.00 1.00 0.01 0.00 0.00 0.00 0.00
casual 0.00 0.00 0.00 1.00 1.00 0.04 0.00 0.00 0.00
cnt 0.00 0.00 0.00 1.00 0.28 0.00 0.00 0.00 0.00
holiday 0.38 0.14 0.06 0.00 1.00 1.00 0.15 1.00 1.00
hum 0.00 0.04 0.01 0.67 0.00 0.00 0.58 0.00 0.03
mnth 0.00 0.00 0.00 0.60 0.00 0.00 0.00 0.00 0.00
registered 0.00 0.00 0.00 0.00 0.01 0.00 0.00 0.00 0.00
season 0.00 0.00 0.00 0.78 0.00 0.00 0.00 0.00 0.00
temp 0.00 0.00 0.00 0.44 0.00 0.00 0.00 0.00 0.00
weathersit 0.00 0.00 0.00 0.35 0.00 0.24 0.00 0.60 0.00
weekday 0.84 0.11 0.07 0.01 0.16 0.80 0.12 0.93 1.00
windspeed 0.00 0.00 0.00 0.87 0.00 0.00 0.00 0.00 0.00
workingday 0.16 0.00 0.10 0.00 0.51 0.87 0.00 0.74 0.15
yr 0.21 0.00 0.00 0.83 0.00 0.96 0.00 0.96 0.20
weathersit weekday windspeed workingday yr
atemp 0.05 1.00 0.00 1.00 1.00
casual 0.00 1.00 0.00 0.00 0.00
cnt 0.00 1.00 0.00 1.00 0.00
holiday 1.00 0.25 1.00 0.00 1.00
hum 0.00 1.00 0.00 1.00 0.13
mnth 1.00 1.00 0.00 1.00 1.00
registered 0.00 1.00 0.00 0.00 0.00
season 1.00 1.00 0.00 1.00 1.00
temp 0.05 1.00 0.00 1.00 1.00
weathersit 0.00 1.00 1.00 1.00 1.00
weekday 0.40 0.00 1.00 1.00 1.00
windspeed 0.29 0.70 0.00 1.00 1.00
workingday 0.10 0.33 0.61 0.00 1.00
yr 0.19 0.88 0.75 0.96 0.00
To see confidence intervals of the correlations, print with the short=FALSE option
>
It works for me:
rm(list=ls())
# http://archive.ics.uci.edu/ml/machine-learning-databases/00275/
#using the bike.csv
day <- read.csv("Bike-Sharing-Dataset//day.csv")
require(psych)
day<-day[,c("atemp","casual","cnt","holiday","hum","mnth","registered",
"season","temp","weathersit","weekday","windspeed","workingday","yr")]
day[day=='']<-NA
require(psych)
corr.test(day)
# corr.test(day)
# Call:corr.test(x = day)
# Correlation matrix
# atemp casual cnt holiday hum mnth registered season temp weathersit weekday windspeed workingday yr
# atemp 1.00 0.54 0.63 -0.03 0.14 0.23 0.54 0.34 0.99 -0.12 -0.01 -0.18 0.05 0.05
# casual 0.54 1.00 0.67 0.05 -0.08 0.12 0.40 0.21 0.54 -0.25 0.06 -0.17 -0.52 0.25
# cnt 0.63 0.67 1.00 -0.07 -0.10 0.28 0.95 0.41 0.63 -0.30 0.07 -0.23 0.06 0.57
# holiday -0.03 0.05 -0.07 1.00 -0.02 0.02 -0.11 -0.01 -0.03 -0.03 -0.10 0.01 -0.25 0.01
# hum 0.14 -0.08 -0.10 -0.02 1.00 0.22 -0.09 0.21 0.13 0.59 -0.05 -0.25 0.02 -0.11
# mnth 0.23 0.12 0.28 0.02 0.22 1.00 0.29 0.83 0.22 0.04 0.01 -0.21 -0.01 0.00
# registered 0.54 0.40 0.95 -0.11 -0.09 0.29 1.00 0.41 0.54 -0.26 0.06 -0.22 0.30 0.59
# season 0.34 0.21 0.41 -0.01 0.21 0.83 0.41 1.00 0.33 0.02 0.00 -0.23 0.01 0.00
# temp 0.99 0.54 0.63 -0.03 0.13 0.22 0.54 0.33 1.00 -0.12 0.00 -0.16 0.05 0.05
# weathersit -0.12 -0.25 -0.30 -0.03 0.59 0.04 -0.26 0.02 -0.12 1.00 0.03 0.04 0.06 -0.05
# weekday -0.01 0.06 0.07 -0.10 -0.05 0.01 0.06 0.00 0.00 0.03 1.00 0.01 0.04 -0.01
# windspeed -0.18 -0.17 -0.23 0.01 -0.25 -0.21 -0.22 -0.23 -0.16 0.04 0.01 1.00 -0.02 -0.01
# workingday 0.05 -0.52 0.06 -0.25 0.02 -0.01 0.30 0.01 0.05 0.06 0.04 -0.02 1.00 0.00
# yr 0.05 0.25 0.57 0.01 -0.11 0.00 0.59 0.00 0.05 -0.05 -0.01 -0.01 0.00 1.00
# Sample Size
# [1] 731
# Probability values (Entries above the diagonal are adjusted for multiple tests.)
# atemp casual cnt holiday hum mnth registered season temp weathersit weekday windspeed workingday yr
# atemp 0.00 0.00 0.00 1.00 0.01 0.00 0.00 0.00 0.00 0.05 1.00 0.00 1.00 1.00
# casual 0.00 0.00 0.00 1.00 1.00 0.04 0.00 0.00 0.00 0.00 1.00 0.00 0.00 0.00
# cnt 0.00 0.00 0.00 1.00 0.28 0.00 0.00 0.00 0.00 0.00 1.00 0.00 1.00 0.00
# holiday 0.38 0.14 0.06 0.00 1.00 1.00 0.15 1.00 1.00 1.00 0.25 1.00 0.00 1.00
# hum 0.00 0.04 0.01 0.67 0.00 0.00 0.58 0.00 0.03 0.00 1.00 0.00 1.00 0.13
# mnth 0.00 0.00 0.00 0.60 0.00 0.00 0.00 0.00 0.00 1.00 1.00 0.00 1.00 1.00
# registered 0.00 0.00 0.00 0.00 0.01 0.00 0.00 0.00 0.00 0.00 1.00 0.00 0.00 0.00
# season 0.00 0.00 0.00 0.78 0.00 0.00 0.00 0.00 0.00 1.00 1.00 0.00 1.00 1.00
# temp 0.00 0.00 0.00 0.44 0.00 0.00 0.00 0.00 0.00 0.05 1.00 0.00 1.00 1.00
# weathersit 0.00 0.00 0.00 0.35 0.00 0.24 0.00 0.60 0.00 0.00 1.00 1.00 1.00 1.00
# weekday 0.84 0.11 0.07 0.01 0.16 0.80 0.12 0.93 1.00 0.40 0.00 1.00 1.00 1.00
# windspeed 0.00 0.00 0.00 0.87 0.00 0.00 0.00 0.00 0.00 0.29 0.70 0.00 1.00 1.00
# workingday 0.16 0.00 0.10 0.00 0.51 0.87 0.00 0.74 0.15 0.10 0.33 0.61 0.00 1.00
# yr 0.21 0.00 0.00 0.83 0.00 0.96 0.00 0.96 0.20 0.19 0.88 0.75 0.96 0.00
#
# To see confidence intervals of the correlations, print with the short=FALSE option
cheers

How to skip NA when applying geometric-mean function

I have the following data frame:
1 8.03 0.37 0.55 1.03 1.58 2.03 15.08 2.69 1.63 3.84 1.26 1.9692516
2 4.76 0.70 NA 0.12 1.62 3.30 3.24 2.92 0.35 0.49 0.42 NA
3 6.18 3.47 3.00 0.02 0.19 16.70 2.32 69.78 3.72 5.51 1.62 2.4812459
4 1.06 45.22 0.81 1.07 8.30 196.23 0.62 118.51 13.79 22.80 9.77 8.4296220
5 0.15 0.10 0.07 1.52 1.02 0.50 0.91 1.75 0.02 0.20 0.48 0.3094169
7 0.27 0.68 0.09 0.15 0.26 1.54 0.01 0.21 0.04 0.28 0.31 0.1819510
I want to calculate the geometric mean for each row. My code is:
dat <- read.csv("MXreport.csv")
if(any(dat$X18S > 25)){ print("Fail!") } else { print("Pass!")}
datpass <- subset(dat, dat$X18S <= 25)
gene <- datpass[, 42:52]
gm_mean <- function(x){ prod(x)^(1/length(x))}
gene$score <- apply(gene, 1, gm_mean)
head(gene)
I got this output after typing this code:
1 8.03 0.37 0.55 1.03 1.58 2.03 15.08 2.69 1.63 3.84 1.26 1.9692516
2 4.76 0.70 NA 0.12 1.62 3.30 3.24 2.92 0.35 0.49 0.42 NA
3 6.18 3.47 3.00 0.02 0.19 16.70 2.32 69.78 3.72 5.51 1.62 2.4812459
4 1.06 45.22 0.81 1.07 8.30 196.23 0.62 118.51 13.79 22.80 9.77 8.4296220
5 0.15 0.10 0.07 1.52 1.02 0.50 0.91 1.75 0.02 0.20 0.48 0.3094169
7 0.27 0.68 0.09 0.15 0.26 1.54 0.01 0.21 0.04 0.28 0.31 0.1819510
The problem is that I get NA when the geometric mean function is applied to a row containing NA. How do I skip the NA values and still calculate the geometric mean for that row?
When I used gene <- na.exclude(datpass[, 42:52]), it dropped the row that has NA and did not calculate its geometric mean at all. That is not what I want: I also want the geometric mean for rows that contain NA. How do I do this?
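One possible approach, sketched under the assumption that all values are positive, is to drop the NA entries inside the function so that each row uses only its observed values. The `na.rm` argument and the log-based formula are choices made for this sketch, not part of the original code, and the two-row `gene` data frame copies only the first three columns of rows 1 and 2 above:

```r
# Geometric mean that can ignore missing values.
# exp(mean(log(x))) equals prod(x)^(1/length(x)) for positive x,
# with length(x) counted after the NA entries are removed.
gm_mean <- function(x, na.rm = TRUE) {
  if (na.rm) x <- x[!is.na(x)]
  exp(mean(log(x)))
}

gene <- data.frame(a = c(8.03, 4.76),
                   b = c(0.37, 0.70),
                   c = c(0.55, NA))
gene$score <- apply(gene, 1, gm_mean)
gene
```

With this version the second row gets a score based on its two non-missing values instead of NA.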

Compana function for compositional analysis freezes in R

I'm trying to run a compositional analysis of the use of different habitat types by ground-nesting chicks on a set of data using RStudio. It starts processing but never stops. I have to manually stop the processing or kill RStudio. (Same result in R.)
I'm using the compana function from the adehabitatHS package. With the adehabitat package I'm able to run the sample pheasant and squirrel data without any problems. (I've tried calling compana from both packages with the same result.)
For each chick, the habitat available varies as it's taken as a buffer zone around their nest site.
My data
This is the available habitats for each chick:
grass fallow.plot oil.seed.rape spring.barley winter.wheat maize other.crops other woodland hedgerow
1 23.35 7.53 45.75 0.00 0.00 0.00 0.00 0.00 23.37 0.00
2 86.52 10.35 0.00 0.00 1.24 0.00 0.00 1.89 0.00 0.00
3 5.18 10.33 28.36 38.82 0.00 0.00 17.17 0.14 0.00 0.00
4 4.26 18.32 27.31 32.66 3.82 0.00 0.00 5.02 5.52 3.09
5 4.26 18.32 27.31 32.66 3.82 0.00 0.00 5.02 5.52 3.09
6 12.52 10.35 0.00 0.00 0.00 18.02 43.59 13.15 2.37 0.00
7 21.41 11.56 59.25 0.00 0.00 0.00 0.00 5.82 0.00 1.96
8 21.41 11.56 59.25 0.00 0.00 0.00 0.00 5.82 0.00 1.96
9 36.17 16.93 0.00 30.14 0.00 0.00 0.00 7.08 9.68 0.00
10 0.00 12.17 26.49 0.00 3.99 55.77 0.00 1.58 0.00 0.00
11 0.00 10.27 67.41 1.93 18.30 0.00 0.00 1.18 0.00 0.91
12 2.66 5.38 0.00 14.39 54.06 0.00 8.40 3.83 7.84 3.44
13 2.66 5.38 0.00 14.39 54.06 0.00 8.40 3.83 7.84 3.44
14 84.22 8.00 0.00 0.00 0.00 2.90 0.00 0.22 3.84 0.82
15 84.22 8.00 0.00 0.00 0.00 2.90 0.00 0.22 3.84 0.82
16 86.85 13.04 0.00 0.00 0.00 0.00 0.00 0.11 0.00 0.00
17 86.85 13.04 0.00 0.00 0.00 0.00 0.00 0.11 0.00 0.00
18 86.85 13.04 0.00 0.00 0.00 0.00 0.00 0.11 0.00 0.00
19 86.85 13.04 0.00 0.00 0.00 0.00 0.00 0.11 0.00 0.00
20 21.41 8.11 0.47 8.08 0.00 0.00 56.78 2.26 0.00 2.89
This is the used habitats (mcp):
grass fallow.plot oil.seed.rape spring.barley winter.wheat maize other.crops other woodland hedgerow
1 41.14 58.67 0.19 0.00 0.00 0.00 0.00 0.00 0 0.0
2 35.45 64.55 0.00 0.00 0.00 0.00 0.00 0.00 0 0.0
3 10.10 60.04 7.72 21.37 0.00 0.00 0.00 0.77 0 0.0
4 0.00 44.55 0.00 50.27 0.00 0.00 0.00 5.18 0 0.0
5 2.82 48.48 44.80 0.00 0.00 0.00 0.00 0.00 0 3.9
6 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00 0 0.0
7 0.00 87.41 12.59 0.00 0.00 0.00 0.00 0.00 0 0.0
8 0.00 83.59 16.41 0.00 0.00 0.00 0.00 0.00 0 0.0
9 0.00 100.00 0.00 0.00 0.00 0.00 0.00 0.00 0 0.0
10 0.00 18.93 0.00 0.00 0.00 81.07 0.00 0.00 0 0.0
11 0.00 100.00 0.00 0.00 0.00 0.00 0.00 0.00 0 0.0
12 0.00 22.79 0.00 0.00 77.13 0.00 0.00 0.08 0 0.0
13 0.00 0.00 0.00 0.00 100.00 0.00 0.00 0.00 0 0.0
14 54.60 44.97 0.00 0.00 0.00 0.00 0.00 0.43 0 0.0
15 62.86 36.57 0.00 0.00 0.00 0.00 0.00 0.57 0 0.0
16 11.15 88.10 0.00 0.00 0.00 0.00 0.00 0.75 0 0.0
17 20.06 79.62 0.00 0.00 0.00 0.00 0.00 0.32 0 0.0
18 38.64 60.95 0.00 0.00 0.00 0.00 0.00 0.41 0 0.0
19 3.81 95.81 0.00 0.00 0.00 0.00 0.00 0.38 0 0.0
20 0.00 3.56 0.00 0.00 0.00 0.00 96.44 0.00 0 0.0
I've tried both parametric and randomisation tests with the same results. The code I'm running:
habuse <- compana(used, avail, test = "randomisation",rnv = 0.001, nrep = 500, alpha = 0.1)
habuse <- compana(used, avail, test = "parametric")
Any ideas where I'm going wrong?
I've discovered the answer to my own question. For the used data, the function replaces 0 values with the value you specify (0.001 in my case). But it doesn't replace 0 values in the available data, and it doesn't like them either.
I replaced all the 0s with 0.001 in the available table, adjusted the other values and the function worked.
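The fix described above might look like the following sketch. It assumes each row of the availability table holds percentages summing to 100, and it interprets "adjusted the other values" as rescaling each row back to 100 after the replacement. The small `avail` table here is a made-up stand-in for the real one:

```r
# Made-up availability table (percentages per animal) containing zeros.
avail <- data.frame(grass       = c(23.35, 86.52),
                    fallow.plot = c(76.65, 13.48),
                    maize       = c(0, 0))

# Replace exact zeros with a small positive value ...
avail[avail == 0] <- 0.001

# ... then rescale every row so that it sums to 100 again.
avail <- avail / rowSums(avail) * 100
avail
```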

Boxplot of table using ggplot2

I'm trying to plot a boxplot graph with my data, using 'ggplot' in R, but I just can't do it. Can anyone help me out?
The data is like the table below:
Paratio ShapeIdx FracD NNDis Core
-3.00 1.22 0.14 2.71 7.49
-1.80 0.96 0.16 0.00 7.04
-3.00 1.10 0.13 2.71 6.85
-1.80 0.83 0.16 0.00 6.74
-0.18 0.41 0.27 0.00 6.24
-1.66 0.12 0.11 2.37 6.19
-1.07 0.06 0.14 0.00 6.11
-0.32 0.18 0.23 0.00 5.93
-1.16 0.32 0.15 0.00 5.59
-0.94 0.14 0.15 1.96 5.44
-1.13 0.31 0.16 0.00 5.42
-1.35 0.40 0.15 0.00 5.38
-0.53 0.25 0.20 2.08 5.32
-1.96 0.36 0.12 0.00 5.27
-1.09 0.07 0.13 0.00 5.22
-1.35 0.27 0.14 0.00 5.21
-1.25 0.21 0.14 0.00 5.19
-1.02 0.25 0.16 0.00 5.19
-1.28 0.22 0.14 0.00 5.11
-1.44 0.32 0.14 0.00 5.00
What I want is a separate boxplot for each column, with no relation between the columns.
ggplot2 requires data in a specific format. Here, you need x= and y=, where y holds the values and x holds the corresponding column ids. Use melt from the reshape2 package to get the data into this format, and then plot.
library(reshape2)
library(ggplot2)
# dd is the data frame holding the table above
ggplot(data = melt(dd), aes(x = variable, y = value)) + geom_boxplot(aes(fill = variable))
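To see what the reshaped data looks like, here is a small sketch. It uses base R's stack() as a stand-in for melt() so that it runs without extra packages; the columns it produces are named `values` and `ind` rather than melt's `value` and `variable`. The data frame `dd` holds only the first three rows of three columns from the table above, and the plot is built only if ggplot2 is installed:

```r
dd <- data.frame(Paratio  = c(-3.00, -1.80, -3.00),
                 ShapeIdx = c(1.22, 0.96, 1.10),
                 Core     = c(7.49, 7.04, 6.85))

# Long format: one row per (column id, value) pair.
long <- stack(dd)   # columns: values, ind
long

if (requireNamespace("ggplot2", quietly = TRUE)) {
  # One box per original column; print(p) draws the plot.
  p <- ggplot2::ggplot(long, ggplot2::aes(x = ind, y = values)) +
    ggplot2::geom_boxplot(ggplot2::aes(fill = ind))
}
```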
