I am not able to run R code via the dplyr library - r

I am trying to run this code:
data("ToothGrowth")
View(ToothGrowth)
filtered_tg <- filter(ToothGrowth, dose == 0,5)
but the following error is causing me problems:
Error in filter():
ℹ In argument: 5.
Caused by error:
! ..2 must be a logical vector, not the number 5.
Run rlang::last_error() to see where the error occurred.
I have already run the following in the RStudio console:
> install.packages("dplyr")
> library(dplyr)

There are a few possibilities here.
This is a simple typo: you meant to type 0.5 instead of 0,5.
You are confused about decimal separator conventions. This has the same solution, but is a different conceptual problem.
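R uses the North American convention where ., not , is used as a decimal separator. Because the comma separates function arguments, filter() reads dose == 0,5 as two conditions, dose == 0 and 5, and the second one is not a logical vector, hence the error. Specify the dose value as 0.5, not 0,5.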
filtered_tg <- filter(ToothGrowth, dose == 0.5)
As R uses a comma for lots of other things, this is a setting you can't change. (You can change it for the purpose of reading and writing data; see, for example, the read.csv2() function.)
You are trying to specify two different possible values for dose (in which case you should use dose == 0 | dose == 5 or dose %in% c(0,5) as your criterion). (This seems implausible but was mentioned by commenters.)
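To make the decimal-separator point concrete, here is a minimal sketch in base R. The CSV text and the names csv_text, df, half_dose and two_doses are made up for illustration; only ToothGrowth and read.csv2() come from the discussion above.

```r
# Made-up CSV text that uses ';' between fields and ',' as the decimal mark.
csv_text <- "dose;len\n0,5;4,2\n1,0;11,5\n2,0;18,5"

# read.csv2() defaults to sep = ";" and dec = ",", so "0,5" is read as 0.5.
df <- read.csv2(text = csv_text)
str(df)

# Inside R code the decimal mark is always '.'.
half_dose <- df[df$dose == 0.5, ]

# Matching several values at once: in c(0.5, 2) the comma separates two
# numbers, it is not a decimal mark.
data("ToothGrowth")
two_doses <- ToothGrowth[ToothGrowth$dose %in% c(0.5, 2), ]
table(two_doses$dose)
```

With dplyr loaded, the last subset should correspond to filter(ToothGrowth, dose %in% c(0.5, 2)).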

Related

Numeric data gives error non-numeric argument to binary operator in R

New to R.
I have a dataframe called SortedDF with 125143 observations and 5 variables
head(SortedDF)
      number Retention time (min) Charge      m/z              Group
28637  98481             16.87978      2 350.1859 Progenesis_peptide
82465 DVQLPK             14.35000      2 350.2022      PEAKS_peptide
81468 DVQLPK             14.32000      2 350.2027      PEAKS_peptide
76662 DVQLPK             14.33000      2 350.2028      PEAKS_peptide
77423 DVQLPK             14.36000      2 350.2029      PEAKS_peptide
73768 DVQLPK             14.27000      2 350.2039      PEAKS_peptide
I want to match peptides based on their m/z similarity using the lead() function.
MatchedDF <- SortedDF %>% mutate(matches_with_next_row = (abs(("m/z") - lead("m/z")) < 0.01))
This code worked perfectly on another database. Now, however, I get the following error message:
Error in mutate():
! Problem while computing matches_with_next_row = ... & Group != lead(Group).
Caused by error in ("m/z") - lead("m/z"):
! non-numeric argument to binary operator
Run rlang::last_error() to see where the error occurred.
When checking
class(SortedDF$"m/z")
[1] "numeric"
I checked the following threads without success:
x non-numeric argument to binary operator while using mutate
--> tried '' around m/z without success
Other threads mainly deal with problems different from mine. If anyone has an idea of what I am doing wrong, please tell me.
"m/z" in double quotes is a character string, not numeric data, whereas `m/z` in backticks refers to the column of that name in the data frame:
MatchedDF <- SortedDF %>% mutate(
  matches_with_next_row = (abs((`m/z`) - lead(`m/z`)) < 0.01)
)
It is recommended to not use special characters for column names to make these expressions much easier to write.
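The difference between quotes and backticks can be checked without dplyr. The following is only a sketch in base R: the three-row data frame is made up from the values in the question, and shifting the column by one position is a stand-in for lead(), not the dplyr implementation.

```r
# Made-up data frame; check.names = FALSE keeps the non-syntactic name "m/z".
df <- data.frame(`m/z` = c(350.1859, 350.2022, 350.2027), check.names = FALSE)

# A quoted name is just a character string, so arithmetic on it fails.
msg <- tryCatch("m/z" - 1, error = function(e) conditionMessage(e))
msg

# Backticks refer to the column itself. Dropping the first element and
# appending NA mimics what lead() does.
diffs <- with(df, abs(`m/z` - c(`m/z`[-1], NA)))
matches_with_next_row <- diffs < 0.01
matches_with_next_row

# Optional: syntactic names avoid the need for backticks altogether.
names(df) <- make.names(names(df))
names(df)
```

After make.names() the column is called m.z and can be used in expressions without any quoting.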

How to solve error factor has bad level in R

I have some difficulties applying the inhomogeneous G-function to my point pattern in R.
In order to use GmultiInhom, I first tried to convert my point pattern bci.tree8pppa to a multitype pattern:
bci.tree8multi = ppp(bci.tree8pppa$x, bci.tree8pppa$y, window=owin(c(0,1000), c(0,500)), marks = factor(bci.tree8pppa$marks[,3]))
Then applied the G-function as follows:
G = GmultiInhom(bci.tree8multi, marks(bci.tree8multi) == species1, marks(bci.tree8multi) == species2, lambdaI = lambda1points, lambdaJ = lambda2points, lambdamin = min(lambda2points), r = c(0,r1,r2,r3))
But this yields the error: "Error in split.default(X, group) : factor has bad level"
How can I solve this?
Thank you in advance!
For the benefit of any R programmers out there: I traced the error message "factor has bad level" to the C source code for .Internal(split.default(x,f)) in the base R system. This error message can occur only when f is a list rather than a factor. The code converts f to a factor using the function interaction which performs character string manipulations. This conversion can go wrong, in the sense that the resulting factor has "bad levels": the integer representation of the factor includes values less than 1 or greater than the number of levels of the factor. Then the error occurs.
The original post has not provided a working example, so it's difficult to figure out exactly how the code in spatstat::GmultiInhom caused the wrong type of data to be fed to split.default. However, it must be related to the misuse of the argument r. The code in spatstat will be tightened to enforce stricter requirements on the format of r.

What does "argument to 'which' is not logical" mean in FactoMineR MCA?

I'm trying to run an MCA on a datatable using FactoMineR. It contains only 0/1 numerical columns, and its size is 200.000 * 20.
require(FactoMineR)
result <- MCA(data[, colnames, with=F], ncp = 3)
I get the following error :
Error in which(unlist(lapply(listModa, is.numeric))) :
argument to 'which' is not logical
I didn't really know what to do with this error. Then I tried to turn every column to character, and everything worked. I thought it could be useful to someone else, and that maybe someone would be able to explain the error to me ;)
Cheers
Are the classes of your variables character or factor? I was having this problem. My solution was to change all variables to factor.
# my data.frame was "aux.da"
i <- 0
while (i < ncol(aux.da)) {
  i <- i + 1
  aux.da[, i] <- as.factor(aux.da[, i])
}
It's difficult to tell without further input, but what you can do is:
Find the function where the error occurred (via traceback()),
Set a breakpoint and debug it:
trace(tab.disjonctif, browser)
I did the following (offline) to find the name of tab.disjonctif:
Found the package on the CRAN mirror on GitHub
Searched for the particular expression that gives the error
I just started to learn R yesterday, but the error comes from the fact that MCA is for categorical data, which is why your data cannot be numeric. To be more precise, before the MCA a "tableau disjonctif" (sorry, I don't know the English term: complete disjunctive matrix) is created.
So FactoMineR is using this function:
https://github.com/cran/FactoMineR/blob/master/R/tab.disjonctif.R
where I think it's looking for categorical values that can be matched to a numerical value (like Y = 1, N = 0).
For others, be careful: in R, categorical data corresponds to the factor type, so even if you have character columns you could get this error.
To build off @marques, @Khaled, and @Pierre Gourseaud:
Yes, changing the format of your variables to factor should address the error message, but you shouldn't change the format of numerical data to factor if it's supposed to be continuous numerical data. Rather, if you have both continuous and categorical variables, try running a Factor Analysis for Mixed Data (FAMD) in the same FactoMineR package.
If you go the FAMD route, you can change the format of just your categorical variable columns to factor with this:
data[,c(3:5,10)] <- lapply(data[,c(3:5,10)] , factor)
(assuming column numbers 3,4,5 and 10 need to be changed).
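As a small self-contained illustration of that conversion, here is a sketch on a made-up plain data.frame (not a data.table); the column names a, b and score and the vector binary_cols are hypothetical.

```r
# Made-up data: two 0/1 indicator columns and one continuous column.
data <- data.frame(
  a = c(0, 1, 1, 0),
  b = c(1, 1, 0, 0),
  score = c(2.5, 3.1, 4.7, 1.2)
)

# Convert only the indicator columns to factors; leave 'score' numeric.
binary_cols <- c("a", "b")
data[binary_cols] <- lapply(data[binary_cols], factor)

# Inspect the resulting column classes.
sapply(data, class)
```

Selecting columns by name rather than by position makes the conversion less fragile if the column order changes.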
MCA will not work when all variables are numeric. If you only have numeric variables, use PCA. Otherwise, add a factor variable to your data frame. In your case it seems you need to change your variables to binary factors.
I had the same problem as well, and changing to factor did not solve it either, because I had declared every variable as supplementary.
What I did first was transform all my numeric data to factor:
Xfac = factor(X[,1], ordered = TRUE)
for (i in 2:29){
tfac = factor(X[,i], ordered = TRUE)
Xfac = data.frame(Xfac, tfac)
}
colnames(Xfac)=labels(X[1,])
Still, it would not work. My second problem was that I had included EVERY factor as a supplementary variable!
So these:
MCA(Xfac, quanti.sup = c(1:29), graph=TRUE)
MCA(Xfac, quali.sup = c(1:29), graph=TRUE)
would generate the same error, but this one works:
MCA(Xfac, graph=TRUE)
Not transforming the data to factors also generated the problem.
I posted the same answer to a related topic : https://stackoverflow.com/a/40737335/7193352

Error in huge R package when criterion "stars"

I am trying to do an association network using some expression data I have, the data is really huge: 300 samples and ~30,000 genes. I would like to apply a Gaussian graphical model to my data using the huge R package.
Here is the code I am using
dim(data)
#[1] 317 32291
huge.out <- huge.npn(data)
huge.stars <- huge.select(huge.out, criterion="stars")
However in this last step I got an error:
Error in cor(x) : ling....in progress:10%
Missing values present in input variable 'x'. Consider using use = 'pairwise.complete.obs'
Any help would be very appreciated.
You posted this exact question on Rhelp today. Both SO and Rhelp deprecate cross-posting but if you do choose to switch venues it is at the very least courteous to inform the readership.
You responded to the suggestion here on SO that there were missing data in your data-object named 'data' by claiming there were no missing data. So what does this code return:
lapply(data , function(x) sum(is.na(x)))
That would be a first-level check, but the error could also be caused by a later step that encountered a missing value in the matrix of correlation coefficients in the object 'huge.out'. That could happen if a) there were infinities in the calculations or b) one of the columns were constant:
> cor(c(1:10,Inf), 1:11)
[1] NaN
> cor(rep(2,7), rep(2,7))
[1] NA
Warning message:
In cor(rep(2, 7), rep(2, 7)) : the standard deviation is zero
So the next check is:
sum( is.na(huge.out) )
That will at least give you some basis for defending your claim of no missings and will also give you a plausible theory as to the source of the error. To locate a column that is entirely constant you might do something like this (assuming it were a dataframe):
which(sapply(lapply(data, unique), length) == 1)
If it's a matrix, you need to use apply.
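For the matrix case, the same checks might look like the sketch below. The small matrix m is made up for illustration; on the real data it would be replaced by the expression matrix.

```r
# Made-up matrix: column 'b' is constant, column 'c' contains a missing value.
m <- cbind(a = c(1, 2, 3, 4), b = c(5, 5, 5, 5), c = c(2, NA, 1, 7))

# Number of missing values per column.
na_per_col <- colSums(is.na(m))

# Columns with at most one distinct non-missing value (zero standard deviation).
constant_cols <- which(apply(m, 2, function(col) {
  length(unique(col[!is.na(col)])) <= 1
}))

# Columns containing NA, NaN, Inf or -Inf.
nonfinite_cols <- which(colSums(!is.finite(m)) > 0)

na_per_col
constant_cols
nonfinite_cols
```

Any column flagged by constant_cols or nonfinite_cols is a candidate for producing NA or NaN entries in a correlation matrix.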

"Error in 1:ncol(x) : argument of length 0" when using Amelia in R

I am working with panel data. I have well over 6,000 country-year observations, and have specified my Amelia imputation as follows:
(CountDependentVariable, m=5, ts="year", cs="cowcode",
sqrts=c("OtherCountVariable2", "OtherCount3", "OtherCount4"),
ords=c("OrdinalVar1", "Ordinal Variable 2"),
lgstc=c("ProportionVariale"),
noms=c("NominalVar1"),p2s = 0, idvars = c("country"))
When I run those lines of code, I continue to receive the following error:
Error in 1:ncol(x) : argument of length 0
I've seen people get a similar error, but in different contexts. Importantly, there are several continuous independent variables I left out of the Amelia code, because I am under the impression that they get imputed WITHOUT having to do so. Does anyone know:
1) What this error means?
2) How to correct this error?
Update #1: Provided more context, in terms of the types of variables in my count panel data, in the above sample code.
Update #2: I did some research, and ran into an R file containing a function that diagnoses possible errors for Amelia code. After running the code, I got the following error message first (and many more thereafter):
AMn<-nrow(x)
Error in nrow(x) : object 'x' not found
AMp<-ncol(x)
Error in ncol(x) : object 'x' not found
subbedout<-c(idvars,cs,ts)
Error: object 'idvars' not found
Error Code: 4
if (any(colSums(!is.na(x)) <= 1)) {
all.miss <- colnames(x)[colSums(!is.na(x)) <= 1]
if (is.null(all.miss)) {
all.miss <- which(colSums(!is.na(x)) <= 1)
}
all.miss <- paste(all.miss, collapse = ", ")
error.code<-4
error.mess<-paste("The data has a column that is completely missing or only has one,observation. Remove these columns:", all.miss)
return(list(code=error.code,mess=error.mess))
}
Error in is.data.frame(x) : object 'x' not found
Error codes: 5-6
Errors in one of the list variables
idout<-listcheck(idvars,"One of the 'idvars'")
Error in identical(vars, NULL) : object 'idvars' not found
Currently, there are no missing values for the country variable I place in the idvars argument. However, the very first "chunk" of errors wants me to believe that this is so.
Am I not properly specifying the Amelia code I have above?
I had forgotten to specify the dataframe in the original Amelia code (slaps hand on forehead). So now, after resolving the whacky issue above, I am getting the following error from Amelia:
Amelia Error Code: 44
One of the variable names in the options list does not match a variable name in the data.
I've checked the variable names, and they match, verbatim, to what I named them in the dataframe.
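One thing that may be worth checking, offered as a possibility rather than a confirmed cause: data.frame() and read.csv() rewrite non-syntactic column names by default (check.names = TRUE), so a name with spaces such as "Ordinal Variable 2" may not be stored the way it was typed. The sketch below uses a made-up data frame and a hypothetical vector option_vars holding the names passed to the Amelia options; it does not call Amelia itself.

```r
# Made-up data frame; the backticked name contains spaces on purpose.
df <- data.frame(
  year = 2000:2002,
  cowcode = 2,
  country = "USA",
  `Ordinal Variable 2` = 1:3
)

# With the default check.names = TRUE the spaces have been replaced by dots.
names(df)

# Hypothetical list of names used in the ts, cs, idvars and ords options.
option_vars <- c("year", "cowcode", "country", "Ordinal Variable 2")

# Names requested in the options that do not exist in the data.
missing_vars <- setdiff(option_vars, names(df))
missing_vars
```

If missing_vars is empty on the real data, the mismatch lies elsewhere, for example in trailing whitespace or letter case.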
