Error in importing signals with createAffyIntensityFile (GWASTools) - r

I am using the GWASTools package and I get an error when importing my signal file. I tried to mimic my real data set in the following example:
library(GWASTools)
snp.anno <- 'snpID chromosome position snpName
AX-100676796 1 501997 AX-100676796
AX-100120875 1 503822 AX-100120875
AX-100067350 1 504790 AX-100067350'
snp.anno <- read.table(text=snp.anno, header=T)
signals <- 'probeset_id sample1.CEL sample1.CEL sample1.CEL
AX-100676796-A 2126.7557 1184.8638 1134.2687
AX-100676796-B 427.1864 2013.8512 1495.0654
AX-100120875-A 1775.5816 2013.8512 651.1691
AX-100120875-B 335.9226 2013.8512 1094.7429
AX-100067350-A 2365.7755 2695.0053 2758.1739
AX-100067350-B 2515.4818 2518.2818 28181.289 '
p1summ <- read.table(text=signals, header=T)
write.table(p1summ, "del.txt", sep="\t", col.names=T, row.names=F, quote=F)
p1summ <- createAffyIntensityFile("del.txt", snp.annotation=snp.anno)
Error: all(snp.annotation$snpID == sort(snp.annotation$snpID)) is not TRUE
In addition: Warning messages:
1: In .checkSnpAnnotation(snp.annotation) : coerced snpID to type integer
2: In .checkSnpAnnotation(snp.annotation) :
coerced chromosome to type integer
I also tried probe names with the 'A' and 'B' suffixes; the error was the same:
snp.annoab <- 'snpID chromosome position snpName
AX-100676796-A 1 501997 AX-100676796-A
AX-100676796-B 1 501997 AX-100676796-B
AX-100120875-A 1 503822 AX-100120875-A
AX-100120875-B 1 503822 AX-100120875-B
AX-100067350-A 1 504790 AX-100067350-A
AX-100067350-B 1 504790 AX-100067350-B'
snp.annoab <- read.table(text=snp.annoab, header=T)
p1summ <- createAffyIntensityFile("del.txt", snp.annotation=snp.annoab)
Error: all(snp.annotation$snpID == sort(snp.annotation$snpID)) is not TRUE
In addition: Warning messages:
1: In .checkSnpAnnotation(snp.annotation) : coerced snpID to type integer
2: In .checkSnpAnnotation(snp.annotation) :
coerced chromosome to type integer
In my real dataset the error is slightly different, but it does not work either:
Error: length(snp.annotation$snpID) == length(unique(snp.annotation$snpID)) is not TRUE
In addition: Warning messages:
1: In .checkSnpAnnotation(snp.annotation) : NAs introduced by coercion
2: In .checkSnpAnnotation(snp.annotation) : coerced snpID to type integer
3: In .checkSnpAnnotation(snp.annotation) : NAs introduced by coercion
4: In .checkSnpAnnotation(snp.annotation) :
coerced chromosome to type integer
And the strange thing is that:
> length(snp.annotation$snpID) == length(unique(snp.annotation$snpID))
[1] TRUE
So the error seems to disagree with the check itself (whether the two lengths are the same). Am I missing some important detail in the format of my inputs? I would be grateful for any help. Thank you!
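One possible explanation, offered as a sketch rather than a confirmed diagnosis: the warnings say that snpID is coerced to integer, which suggests (an assumption based only on those messages) that the function wants snpID to be a sorted, unique integer column. Coercing strings such as AX-100676796 gives NA for every row, so the check made after coercion sees duplicates even though the original strings are unique. The integer IDs assigned at the end are hypothetical placeholders; the probe names stay in snpName.

```r
# Annotation from the question, read as plain character strings.
snp.anno <- read.table(text = "snpID chromosome position snpName
AX-100676796 1 501997 AX-100676796
AX-100120875 1 503822 AX-100120875
AX-100067350 1 504790 AX-100067350", header = TRUE, stringsAsFactors = FALSE)

# The original strings are unique ...
length(snp.anno$snpID) == length(unique(snp.anno$snpID))

# ... but coercion to integer turns every ID into NA, so uniqueness is lost.
coerced <- suppressWarnings(as.integer(snp.anno$snpID))
length(coerced) == length(unique(coerced))

# Assumed workaround: order the rows and give them a sorted, unique integer snpID.
snp.fixed <- snp.anno[order(snp.anno$chromosome, snp.anno$position), ]
snp.fixed$snpID <- seq_len(nrow(snp.fixed))
```

Whether the intensity file must then be keyed by snpName instead of snpID is something to confirm in the GWASTools documentation.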

Related

Warning message:invalid factor level, NA generated

I'm getting this warning when I try to assign a new character value to some of the values in one of my columns.
This works fine:
merge_output$extra_dod[merge_output$extra_dod == 'Refugees camps in forestreserve.'] <-'Refugees'
but this doesn't:
merge_output$extra_dod[merge_output$extra_dod=='Air Strip'] <-'strip'
And it returns this error message:
Warning message:
In `[<-.factor`(`*tmp*`, merge_output$extra_dod == "Lime", value = c(5L, :
invalid factor level, NA generated
I'm not sure why I can replace some of the values but not others.
Here's a much-simplified example that fails in the same way:
f <- factor(c("a","b","c","d"))
f[f=="d"] <- "e"
Warning message:
In [<-.factor(*tmp*, f == "d", value = "e") :
invalid factor level, NA generated
The assignment fails because "e" is not one of the factor's existing levels. If you replace with a level that already exists, it works:
f[f=="c"] <- "b"
A few more general options:
Convert the variable back into a character vector before trying to replace values (or use something like stringsAsFactors=FALSE in read.csv/read.table).
Use car::recode.
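The first option can be sketched in base R as follows. This is a minimal illustration on the toy factor from above; the names f_chr and f_new are made up for the example.

```r
f <- factor(c("a", "b", "c", "d"))

# Work on the character values, where any replacement string is allowed.
f_chr <- as.character(f)
f_chr[f_chr == "d"] <- "e"

# Convert back to a factor only if a factor is really needed.
f_new <- factor(f_chr)
levels(f_new)

# Alternative that keeps the factor: rename the level itself.
levels(f)[levels(f) == "d"] <- "e"
```

Renaming the level changes every element that had that level, which is usually what a replace-all assignment intends.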

Error in betadisper function - vegan package

I am trying to run the betadisper function (vegan package) and it returns an error. This is what I do:
hom.cov <- betadisper(morf.dist, sexo)
and it returns this error:
Error in sort.list(y) : 'x' must be atomic for 'sort.list'
Have you called 'sort' on a list?
Then I ran traceback():
traceback()
5: stop("'x' must be atomic for 'sort.list'\nHave you called 'sort' on
a list?")
4: sort.list(y)
3: factor(x)
2: as.factor(group)
1: betadisper(morf.dist, sexo)
When I saw this I tried converting the vector "sexo" to a factor with "as.factor" and ran it again, but it returned the same error. So I tried running "betadisper()" with the example used in "Numerical Ecology with R", and it gave me another error:
env <- read.csv("DoubsEnv.csv", row.names=1)
env.pars2 <- as.matrix(env[, c(1, 9, 10)])
env.pars2.d1 <- dist(env.pars2)
(env.MHV <- betadisper(env.pars2.d1, gr))
Error in x - c : non-conformable arrays
traceback()
2: Resids(vectors[, pos, drop = FALSE], centroids[group, pos, drop =
FALSE])
1: betadisper(env.pars2.d1, gr)
I don't know what could have happened. Can anyone help me?
Thanks!
R claims that sexo is not atomic. This is not the most obvious message, but it means that sexo is not a simple vector of values; it may be, say, a data frame or a list. Run
str(sexo)
and see what you get. If you see text like data.frame or list in the output and then a dollar sign ($) then you don't have a simple structure. For instance, the following output is not an atomic item:
> str(a)
List of 1
$ a: Factor w/ 4 levels "BF","HF","NM",..: 4 1 NA 4 2 2 2 2 2 1 ...
In this case you should use a$a instead of only a.
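A small sketch of that check, using a made-up one-column data frame in place of the real sexo (the values are hypothetical):

```r
# Hypothetical stand-in: sexo read from a file is often a data frame, not a vector.
sexo <- data.frame(sexo = c("F", "M", "F", "M"))

is.atomic(sexo)          # a data frame is a list, so this is FALSE

# Extract the column and turn it into a factor.
sexo_vec <- factor(sexo$sexo)
is.atomic(sexo_vec)      # a factor is an atomic vector

# The grouping argument would then be the extracted vector, e.g.
# hom.cov <- betadisper(morf.dist, sexo_vec)
```

The betadisper call is left commented out because it needs the vegan package and the original distance object.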

Only receive unique warning messages

Warning messages are useful information that I want to see, but I only want to see each one once!
The function below throws 2 different warnings and repeats each of them 20 times.
How can I tell R to print only unique warnings? I'm looking for a general solution.
Warning messages:
1: NAs introduced by coercion
2: In sqrt(-1) : NaNs produced
Here is my example:
foobar <- function(n=20) {
for (i in 1:n) {
as.numeric("b")
sqrt(-1)
}
}
foobar()
To return only unique warning strings, use
unique(warnings())
Now, a problem you may have is that your function has more than 50 warnings, in which case warnings() will not catch them all. To work around this, you can increase nwarnings in options to, e.g., 10000, as suggested in the help page of warnings.
options(nwarnings = 10000)
Example:
foobar <- function(n=20) {
warning("First warning")
for (i in 1:n) {
as.numeric("b")
sqrt(-1)
}
warning("Last warning")
}
foobar(60)
unique(warnings())
## Warning messages:
## 1: In foobar(60) : First warning
## 2: NAs introduced by coercion
## 3: In sqrt(-1) : NaNs produced
op <- options(nwarnings = 10000)
foobar(60)
unique(warnings())
## Warning messages:
## 1: In foobar(60) : First warning
## 2: NAs introduced by coercion
## 3: In sqrt(-1) : NaNs produced
## 4: In foobar(60) : Last warning
options(op)
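If depending on the warnings() buffer is inconvenient, another route is to collect the messages yourself while the code runs. The following is only a sketch using base R condition handling; collect_unique_warnings is a hypothetical helper name, not an existing function.

```r
foobar <- function(n = 20) {
  for (i in 1:n) {
    as.numeric("b")
    sqrt(-1)
  }
}

# Hypothetical helper: evaluate an expression, record every warning message,
# suppress the normal printing, and return the distinct messages.
collect_unique_warnings <- function(expr) {
  msgs <- character(0)
  value <- withCallingHandlers(
    expr,
    warning = function(w) {
      msgs <<- c(msgs, conditionMessage(w))
      invokeRestart("muffleWarning")
    }
  )
  list(value = value, warnings = unique(msgs))
}

res <- collect_unique_warnings(foobar())
res$warnings
```

This keeps only the message text, so the call that raised each warning is not recorded.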

R within group sum of squares kmeans

I have the following code, which is giving me an error:
# Read input dataset from CSV file
input_dataset <-
read.csv("C:\\Users\\sw029693\\Desktop\\Overtime_work_hrs_analytics\\input_dataset.csv", header = TRUE)
wss <- (nrow(input_dataset)-1)*sum(apply(input_dataset,2,var))
which gives the following error:
Warning messages:
1: In FUN(newX[, i], ...) : NAs introduced by coercion
2: In FUN(newX[, i], ...) : NAs introduced by coercion
3: In FUN(newX[, i], ...) : NAs introduced by coercion
4: In FUN(newX[, i], ...) : NAs introduced by coercion
5: In FUN(newX[, i], ...) : NAs introduced by coercion
> wss
[1] NA
> colnames(input_dataset)
[1] "client" "domain" "user_name" "cdf_display" "position" "shift_start"
[7] "shift_end" "shift_length_avg" "patients_seen_cnt"
It looks like wss is NA, and I am not sure why. Any ideas?
K-means only supports numerical data.
Your columns user_name etc. are probably not numerical, so var() returns NA for them and the sum becomes NA.
Bring your data into the appropriate format first.
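As a rough sketch of "appropriate format", one could keep only the numeric columns before computing the sum of squares. The data frame below is invented for illustration; only the column names are taken from the question.

```r
# Made-up rows; the real data would come from read.csv().
input_dataset <- data.frame(
  user_name         = c("u1", "u2", "u3", "u4"),
  shift_length_avg  = c(8, 9.5, 7, 12),
  patients_seen_cnt = c(20, 25, 18, 30),
  stringsAsFactors  = FALSE
)

# Keep only the numeric columns; drop = FALSE preserves the data frame shape.
is_num   <- sapply(input_dataset, is.numeric)
num_data <- input_dataset[, is_num, drop = FALSE]

# Total sum of squares, as in the original formula, now free of NAs.
wss <- (nrow(num_data) - 1) * sum(apply(num_data, 2, var))
```

Categorical columns that should influence the clustering would need an explicit numeric encoding, which is a modelling decision rather than a formatting one.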

Error while creating a Timeseries plot in R: Error in plot.window(xlim, ylim, log, ...) : need finite 'ylim' values

Here's a sample of my single column data set:
Lines
141,523
146,785
143,667
65,560
88,524
148,422
I read this file as a .csv file, convert it into a ts object and then plot it:
##Read the actual number of lines CSV file
Aclines <- read.csv(file.choose(), header=T, stringsAsFactors = F)
Aclinests <- ts(Aclines[,1], start = c(2013), end = c(2015), frequency = 52)
plot(Aclinests, ylab = "Actual_Lines", xlab = "Time", col = "red")
I get the following error message:
Error in plot.window(xlim, ylim, log, ...) : need finite 'ylim' values
In addition: Warning messages:
1: In xy.coords(x, NULL, log = log) : NAs introduced by coercion
2: In min(x) : no non-missing arguments to min; returning Inf
3: In max(x) : no non-missing arguments to max; returning -Inf
I thought this might be because of the "," in the columns and tried to use sapply to take care of that as advised here:
need finite 'ylim' values-error
plot(sapply(Aclinests, function(x)gsub(",",".",x)))
But I got the following error:
Error in plot(sapply(Aclinests, function(x) gsub(",", ".", x))) :
error in evaluating the argument 'x' in selecting a method for function 'plot': Error in sapply(Aclinests, function(x) gsub(",", ".", x)) :
'names' attribute [105] must be the same length as the vector [1]
Here is the head of my original and ts data set if it might help:
> head(Aclines)
Lines
1 141,523
2 146,785
3 143,667
4 65,560
5 88,524
6 148,422
> head(Aclinests)
[1] "141,523" "146,785" "143,667" "65,560" "88,524" "148,422"
Also, if I read the .csv file as:
Aclines <- read.csv(file.choose(), header=T, stringsAsFactors = T)
Then I am able to plot the ts object, but head(Aclinests) gives the output below, which is not consistent with my original data:
> head(Aclinests)
[1] 14 27 17 84 88 36
Please advise on how I can plot this ts object.
The simplest way to avoid this, in my case, was to remove the commas in the Excel file containing the data. This can be done with simple Excel commands, and it worked for me.
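The same cleanup can also be done in R after reading the file. This sketch assumes the commas are thousands separators (so "141,523" means 141523) and uses the six sample values from the question instead of the real file.

```r
# Sample values from the question, read as character strings.
Aclines <- data.frame(
  Lines = c("141,523", "146,785", "143,667", "65,560", "88,524", "148,422"),
  stringsAsFactors = FALSE
)

# Remove the thousands separators, then convert to numeric.
Aclines$Lines <- as.numeric(gsub(",", "", Aclines$Lines, fixed = TRUE))

# Build the weekly series and plot it; `end` is omitted so the length of the
# series follows the data.
Aclinests <- ts(Aclines$Lines, start = 2013, frequency = 52)
plot(Aclinests, ylab = "Actual_Lines", xlab = "Time", col = "red")
```

If the commas were decimal marks instead, read.csv2 or the dec argument of read.table would be the place to handle them.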
