mean.default argument is not numeric in R

I'm using R for the first time; I have never taken a course and have some questions. The first is this:
when I try to take the mean of some Temperature values (they are all between 18.15 and 18.40),
I get this answer:
"Warning message:
In mean.default(d_Temp_Experiment$value) :
argument is not numeric or logical: returning NA"
I don't have the same problem with the PAR values, which are all integers, or with the pH values, which are all decimal numbers like 8.831...
Can you tell me what I am doing wrong?

As Arun hints, it could be that the column is character rather than numeric.
If you are sure that all the values are correct, you could coerce them with
d_Temp_Experiment$value <- as.numeric(d_Temp_Experiment$value)
You might have something like the following going on.
myvector <- c(0,1,2,3,4,5,"6","7")  # one quoted value makes the whole vector character
mv <- as.numeric(myvector)
mean(myvector)  # NA, with the warning from the question
mean(mv)        # 3.5
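To see why one column fails while the others work, it can help to compare the classes of all columns before coercing anything. The sketch below uses a small made-up data frame `d` (the column names and values are assumptions for illustration, not the poster's data) in which only the temperature column was read in as character.

```r
# Hypothetical data: Temp was read in as text, PAR and pH as numbers
d <- data.frame(
  PAR  = c(5L, 7L, 6L),
  pH   = c(8.831, 8.790, 8.812),
  Temp = c("18.15", "18.40", "18.22"),
  stringsAsFactors = FALSE
)

# Class of every column: the odd one out is the culprit
col_classes <- sapply(d, class)
print(col_classes)

# Names of the columns that mean() would refuse
not_numeric <- names(d)[!sapply(d, is.numeric)]
print(not_numeric)

# Coerce only the offending column, then the mean works
d$Temp <- as.numeric(d$Temp)
temp_mean <- mean(d$Temp)
print(temp_mean)
```

If `as.numeric()` itself warns about NAs being introduced, some entries are not plain numbers (a decimal comma or a stray symbol, for instance) and are worth inspecting before trusting the mean.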

Related

PCA in R: Error in svd(x, nu=0, nv=k) : Infinite or missing values in 'x'

My dataframe contains about 26k rows with 129 variables. I've made sure all of the variables are numeric and do not have any NA values (used na.omit). Using the function prcomp() on my dataframe tells me "Infinite or missing values in x". What might I be overlooking then?
Did you also make sure none of them are infinite? That's the other part of that message.
You can easily check all this with (is.finite() has no method for data frames, hence the as.matrix()):
all( is.finite( as.matrix( your.data.frame ) ) )
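A single TRUE/FALSE does not say where the problem is. The following sketch, on a made-up three-column data frame (all names and values here are assumptions), counts non-finite entries per column and lists the affected rows. Note that na.omit() drops rows with NA or NaN but leaves Inf and -Inf in place, which is one way such values can survive the cleaning described in the question.

```r
# Hypothetical numeric data with one Inf and one NaN
df <- data.frame(
  x = c(1, 2, Inf),
  y = c(0.5, NaN, 1.5),
  z = c(3, 4, 5)
)

m <- as.matrix(df)                      # is.finite() needs a matrix or vector

all_finite     <- all(is.finite(m))     # overall check
bad_per_column <- colSums(!is.finite(m))            # how many bad values per column
bad_rows       <- which(rowSums(!is.finite(m)) > 0) # which rows contain any

print(all_finite)
print(bad_per_column)
print(bad_rows)

# na.omit() removes the NaN row but keeps the Inf row
after_omit <- na.omit(df)
print(nrow(after_omit))
```

Rows flagged in `bad_rows` can then be dropped or repaired before calling prcomp().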

Is there a way in R to ignore a "." in my data when calculating mean/sd/etc

I have a large data set that I need to calculate mean/std dev/min/ and max on for several columns. The data set uses a "." to denote when a value is missing for a subject. When running the mean or sd function this causes R to return NA . Is there a simple way around this?
my code is just this
xCAL<-mean(longdata$CAL)
sdCAL<-sd(longdata$CAL)
minCAL<-min(longdata$CAL)
maxCAL<-max(longdata$CAL)
but R returns NA for all of these variables. I get the following warning:
Warning message:
In mean.default(longdata$CAL) :
argument is not numeric or logical: returning NA
You need to convert your data to numeric to be able to do any calculations on it. When you run as.numeric, your . will be converted to NA, which is what R uses for missing values. Then, all of the functions you mention take an argument na.rm that can be set to TRUE to remove (rm) missing values (na).
If your data is a factor, you need to convert it to character first to avoid loss of information as explained in this FAQ.
Overall, to be safe, try this:
longdata$CAL <- as.numeric(as.character(longdata$CAL))
xCAL <- mean(longdata$CAL, na.rm = TRUE)
sdCAL <- sd(longdata$CAL, na.rm = TRUE)
# etc
Do note that na.rm is an argument of each of these functions; it's not magic that works everywhere. If you look at the help pages ?mean, ?sd, ?min, etc., you'll see the na.rm argument documented. If you want to remove missing values in general, the na.omit() function works well.
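An alternative worth considering is to declare the missing-value marker when the file is read, so the column arrives as numeric in the first place. The sketch below assumes the data comes from a CSV file; the inline `csv_text` and its values are made up for illustration. It also shows why the as.character() step matters when the column is already a factor.

```r
# Hypothetical CSV content where "." marks a missing value
csv_text <- "id,CAL\n1,1.5\n2,.\n3,2.5\n4,3.0"

# na.strings tells read.csv which strings mean "missing"
longdata <- read.csv(text = csv_text, na.strings = ".")
print(class(longdata$CAL))

xCAL   <- mean(longdata$CAL, na.rm = TRUE)
minCAL <- min(longdata$CAL, na.rm = TRUE)
print(xCAL)

# If the column is already a factor, as.numeric() alone returns level codes
cal_factor <- factor(c("1.5", ".", "2.5", "3.0"))
codes  <- as.numeric(cal_factor)                                   # not the values
values <- suppressWarnings(as.numeric(as.character(cal_factor)))   # the values, "." -> NA
print(values)
```

With a real file, the same argument applies to the usual call, for example read.csv("yourfile.csv", na.strings = "."), where the file name is a placeholder.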

new to R, and getting this error message, how do I omit NA in my cohort to analyze my data?

I'm new to R and getting this error message. How do I omit NA in my cohort to analyze my data?
mean(cohort5$"age.at.diagnosis")
[1] NA
Warning message:
In mean.default(cohort5$age.at.diagnosis) :
argument is not numeric or logical: returning NA
All you need to do to handle NAs is add na.rm = TRUE:
mean(cohort5$age.at.diagnosis, na.rm = TRUE)
However, the error message you received suggests that the problem is actually in the data format. You should make sure that the variable in your dataframe is, actually, numeric and doesn't contain non-numeric values (for example some unusual character used to indicate missing values). class(cohort5$age.at.diagnosis) will tell you the data type.
cohort5$age.at.diagnosis <- as.numeric(cohort5$age.at.diagnosis) # if currently character
cohort5$age.at.diagnosis <- as.numeric(as.character(cohort5$age.at.diagnosis)) # if currently factor
Both of these lines will coerce non-numeric values into NAs, so be careful because you may be throwing away information by doing that.
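Before overwriting the column, it may be worth listing exactly which entries would be turned into NA. This is a minimal sketch on a made-up character vector `age_raw` (the values are assumptions, not the poster's data).

```r
# Hypothetical raw column with two non-numeric markers and one true NA
age_raw <- c("54", "61", "unknown", "47", "n/a", NA)

# Attempt the conversion without touching the original
age_num <- suppressWarnings(as.numeric(age_raw))

# Entries that were present but failed to convert
lost <- unique(age_raw[is.na(age_num) & !is.na(age_raw)])
print(lost)

# Once the lost values are understood, the mean can skip them
age_mean <- mean(age_num, na.rm = TRUE)
print(age_mean)
```

If `lost` is empty, the coercion throws nothing away; otherwise it shows which codes need a decision.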
There are also ways to omit missing data before running any sort of analysis, using the na.omit function:
na.omit(cohort5)

What does "argument to 'which' is not logical" mean in FactoMineR MCA?

I'm trying to run an MCA on a data.table using FactoMineR. It contains only 0/1 numerical columns, and its size is 200,000 x 20.
require(FactoMineR)
result <- MCA(data[, colnames, with = FALSE], ncp = 3)
I get the following error :
Error in which(unlist(lapply(listModa, is.numeric))) :
argument to 'which' is not logical
I didn't really know what to do with this error. Then I tried to turn every column to character, and everything worked. I thought it could be useful to someone else, and that maybe someone would be able to explain the error to me ;)
Cheers
Are the classes of your variables character or factor? I was having this problem. My solution was to change all variables to factor.
# my data.frame was "aux.da"
for (i in seq_len(ncol(aux.da))) {
  aux.da[, i] <- as.factor(aux.da[, i])  # convert each column in place
}
It's difficult to tell without further input, but what you can do is:
Find the function where the error occurred (via traceback()).
Set a breakpoint and debug it:
trace(tab.disjonctif, browser)
I did the following (offline) to find the name of tab.disjonctif:
Found the package on the CRAN mirror on GitHub.
Searched for the particular expression that gives the error.
I just started to learn R yesterday, but the error comes from the fact that MCA is for categorical data, which is why your data cannot be numeric. To be more precise, before the MCA a "tableau disjonctif" (sorry, I don't know the word in English: complete disjunctive matrix) is created.
FactoMineR uses this function for that:
https://github.com/cran/FactoMineR/blob/master/R/tab.disjonctif.R
I think it looks for categorical values that can be matched to a numerical value (like Y = 1, N = 0).
A word of caution for others: in R, categorical data means the factor type, so even if you have character columns you could get this error.
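To make the idea of a complete disjunctive table concrete, here is a base R sketch that builds one by hand: each level of each categorical variable becomes a 0/1 indicator column. This is only an illustration of the concept, not FactoMineR's implementation; the helper `indicator()` and the `survey` data are hypothetical.

```r
# Hypothetical helper: one 0/1 column per level of a categorical variable
indicator <- function(f) {
  f <- as.factor(f)
  m <- outer(as.character(f), levels(f), FUN = "==") * 1L
  colnames(m) <- levels(f)
  m
}

# Made-up categorical data
survey <- data.frame(
  smoker = c("Y", "N", "Y"),
  pet    = c("cat", "dog", "cat"),
  stringsAsFactors = TRUE
)

# Build one indicator block per variable and bind them side by side
blocks <- lapply(names(survey), function(v) {
  m <- indicator(survey[[v]])
  colnames(m) <- paste(v, colnames(m), sep = ".")
  m
})
disj <- do.call(cbind, blocks)
print(disj)
```

Every row contains exactly one 1 per original variable, which is why the construction only makes sense for categorical (factor) inputs.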
To build off @marques, @Khaled, and @Pierre Gourseaud:
Yes, changing the format of your variables to factor should address the error message, but you shouldn't change the format of numerical data to factor if it's supposed to be continuous numerical data. Rather, if you have both continuous and categorical variables, try running a Factor Analysis for Mixed Data (FAMD) in the same FactoMineR package.
If you go the FAMD route, you can change the format of just your categorical variable columns to factor with this:
data[,c(3:5,10)] <- lapply(data[,c(3:5,10)] , factor)
(assuming column numbers 3,4,5 and 10 need to be changed).
This will not work for only numeric variables. If you only have numeric use PCA. Otherwise, add a factor variable to your data frame. It seems like for your case you need to change your variables to binary factors.
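For a table made up entirely of 0/1 columns, as in the question, all columns can be converted to factors in one step. The sketch below uses a small made-up data.frame `binary_df` (an assumption; the question uses a data.table, where the assignment syntax differs). The MCA call is left commented out because it needs the FactoMineR package installed.

```r
# Hypothetical 0/1 data stored as numeric
binary_df <- data.frame(
  a = c(0, 1, 1, 0),
  b = c(1, 1, 0, 0),
  c = c(0, 0, 1, 1)
)

# binary_df[] keeps the data.frame structure while replacing every column
binary_df[] <- lapply(binary_df, factor)

classes <- sapply(binary_df, class)
print(classes)
print(levels(binary_df$a))

# With FactoMineR installed, the converted table can then be passed on:
# result <- FactoMineR::MCA(binary_df, ncp = 3, graph = FALSE)
```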
I had the same problem, and changing to factor did not solve it either, because I had declared every variable as supplementary.
What I did first was transform all my numeric data to factor:
Xfac = factor(X[,1], ordered = TRUE)
for (i in 2:29){
tfac = factor(X[,i], ordered = TRUE)
Xfac = data.frame(Xfac, tfac)
}
colnames(Xfac)=labels(X[1,])
Still, it would not work. My second problem was that I had included EVERY factor as a supplementary variable!
So these:
MCA(Xfac, quanti.sup = c(1:29), graph=TRUE)
MCA(Xfac, quali.sup = c(1:29), graph=TRUE)
Would generate the same error, but this one works :
MCA(Xfac, graph=TRUE)
Not transforming the data to factors also generated the problem.
I posted the same answer to a related topic : https://stackoverflow.com/a/40737335/7193352

R - Pie, X values must be positive

I am new to R and pulled some test data about countries from the web as a CSV. I am currently fooling around with plotting and encountered said error while creating a pie chart of the world's unemployment.
I issued the following:
> values <- read.csv("D:\\test\\countrydata.csv")
> names(values)
[1] "name" "size" "pop" "unemployed" ...
> typeof(values$unemployed)
"integer"
> pie(values$unemployed)
Error in pie(values$unemployed) :
'x' values must be positive
> pie(values$unemployed, na.rm=TRUE)
Error in pie(values$unemployed, na.rm=TRUE) :
'x' values must be positive
The dataset I want to plot is a set of integers, all of them positive, 0 (thanks kim) or NA.
Zeros are not a problem when plotting integers; I tried
> pie(as.integer(c(0,1,2,3)))
and it worked fine.
What am I missing here?
Thanks and Regards,
BillDoor
I don't have access to your data but in my experience the following might help and is definitely worth a try:
pie(table(values$unemployed))
Would love to learn whether this solved your problem!
According to the R documentation, the first argument must be a "non-negative vector". Now when we write
pie(values$pop)
R considers it a numeric data frame, which you can check with
str(values$pop)
When you typecast via as.numeric or as.integer, you convert it to a vector, which is why it works after typecasting:
pie(as.numeric(values$pop))
I encountered a similar problem. Here's what worked for me:
vectorVal <- as.numeric(table)
pie(vectorVal)
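Since the question states that the column contains NA values, another explanation worth checking is that pie() rejects any input containing missing values with this same message, and it has no na.rm argument to skip them. The sketch below uses a made-up named vector (the country names and counts are fictional) to reproduce the error and then plots after dropping the NAs.

```r
# Hypothetical data: one zero and one missing value
unemployed <- c(Atlantis = 120L, Borduria = 0L, Syldavia = NA, Zubrowka = 45L)

# Capture the error instead of stopping the script
msg <- tryCatch(
  { pie(unemployed); "no error" },
  error = function(e) conditionMessage(e)
)
print(msg)

# Drop the missing values before plotting
clean <- unemployed[!is.na(unemployed)]

pdf(NULL)                          # discard graphics output so the sketch runs anywhere
pie(clean, labels = names(clean))
invisible(dev.off())
```

In an interactive session the pdf(NULL) and dev.off() lines can be omitted so the chart appears on screen.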
