I am trying to execute this code, but the result is NA.
> node1<- paste0("train$", rule, collapse=" & ")
> node1
[1] "train$feat_11< 5.477 & train$feat_60< 4.687"
>x<-ifelse(node1,1,0)
[1] NA
How can I use a character vector as the condition in ifelse()?
Logical vectors and character vectors are two very different things in R.
class(node1)
#>[1] "character"
You must first parse and evaluate the string.
lNode1 = eval(parse(text=node1))
class(lNode1)
#>[1] "logical"
x<-ifelse(lNode1,1,0)
#> a numeric vector of 1s and 0s
That being said, your ifelse call is redundant. A logical vector is coerced to an integer vector (TRUE becomes 1, FALSE becomes 0) whenever it is used in a context that requires numbers. For example, you can call sum(lNode1) to get the number of rows that pass both rules.
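As a minimal sketch of the points above: the `train` data frame and `rule` vector below are hypothetical stand-ins for the asker's objects. The condition is built without the `train$` prefix and evaluated with the data frame as the environment, which is one possible variant of the eval(parse()) approach, not the only one.

```r
# Hypothetical data standing in for the asker's `train` data frame.
train <- data.frame(feat_11 = c(1, 6, 3), feat_60 = c(2, 1, 9))
rule  <- c("feat_11< 5.477", "feat_60< 4.687")

# Build the condition string, then parse and evaluate it with the
# data frame as the environment so column names resolve to columns.
node1  <- paste(rule, collapse = " & ")
lNode1 <- eval(parse(text = node1), envir = train)

lNode1              # TRUE FALSE FALSE
as.integer(lNode1)  # 1 0 0, the same values ifelse(lNode1, 1, 0) gives
sum(lNode1)         # 1 row passes both rules
```

Evaluating parsed strings is only reasonable when the rule text comes from a trusted source, since any R code in the string will be executed.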
"Alice" is a character vector of length 1. "Bob" is also a character vector of length 1, but it's clearly shorter. At face value, it appears that R's character are made out of something smaller than characters, but if you try to subset them, say "Alice"[1], you'll just get the original vector back. How does R internally make sense of this? What are character vectors actually made of?
You're mistaking vector length for string length.
In R, even a single value is a vector: "Alice" and "Bob" are each character vectors of length 1 that hold one string, whether or not you assign them to a name. The length of the vector counts strings, not the characters inside them.
If you want to check the size of each string, use the nchar function:
nchar("Alice")
[1] 5
nchar("Bob")
[1] 3
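A short sketch contrasting vector length with string length, and showing how to reach individual characters with base R's substr() and strsplit(). The variable names `x` and `chars` are illustrative only.

```r
x <- c("Alice", "Bob")

length(x)        # 2: number of strings in the vector
nchar(x)         # 5 3: number of characters in each string
length("Alice")  # 1: a single string is still a length-1 vector

# "[" subsets vector elements; use substr() or strsplit() to get
# at the characters inside a string.
substr("Alice", 1, 1)                # "A"
chars <- strsplit("Alice", "")[[1]]
chars                                # "A" "l" "i" "c" "e"
```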
What is the R equivalent of SQL Server's ISNUMERIC function? I am migrating some SQL logic to R (in RStudio), and I have a column that holds both character and integer values. I want to find only the integer values and update them to -1 in an R data.table. Please help me solve this problem.
I have attached the results as an image: column "A" holds the current values, and I expect to get the values shown in column B.
R also has type-checking functions (as SQL and other languages do), such as is.numeric() and is.integer(). These return logical values, but you could use sub() or gsub() to turn TRUE into -1:
example <- list(123, 321, "not numeric", as.Date("2018/01/01"))
gsub(T, -1, sapply(example, is.numeric))
[1] "-1" "-1" "FALSE" "FALSE"
Also, note that in R numeric is different from integer.
example <- list(as.integer(123), 321, "not numeric", as.Date("2018/01/01"))
example[sapply(example, is.integer)] <- -1
example
[[1]]
[1] -1
[[2]]
[1] 321
[[3]]
[1] "not numeric"
[[4]]
[1] "2018-01-01"
You can convert them back and forth with as.numeric() and as.integer(). Also, note that in R data types in this sense are referred to as the class or classes of the data, whereas the type in R refers to the storage or R internal data type.
I think if you're specifically interested in integers, then the question above is a duplicate of the following:
Check if the number is integer
Your if condition would be something like x == round(x, 0). This is TRUE for whole-number values, even when they are stored as double, and FALSE for values with a fractional part; it only makes sense for numeric classes.
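A small sketch of the difference between a value being a whole number and being stored as an integer. The vector `x` and the name `is_whole` are made up for illustration.

```r
x <- c(1, 2.5, 3, -4)

# Whole-number test on the values themselves.
is_whole <- x == round(x)
is_whole          # TRUE FALSE TRUE TRUE

# Storage type is a separate question: these are all doubles.
is.integer(x)     # FALSE
is.numeric(x)     # TRUE
is.integer(3L)    # TRUE: the L suffix creates an integer
typeof(x)         # "double"
```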
I finally fixed this issue with the following steps.
I captured all numeric values in a separate data table using the script below (note that the backslash in the regular expression must be doubled inside an R string):
CustomDerivedL2AMID <- subset(DimCombinedEnduser$DRVDEUL2AMID, grepl('^\\d+$', DimCombinedEnduser$DRVDEUL2AMID))
library(data.table)
HandleDerivedL2AMID <-data.table(CustomDerivedL2AMID)
Then I matched the HandleDerivedL2AMID results against the original data table and replaced all matching values with -1:
DCE$DRVDEUL2AMID <- replace(DCE$DRVDEUL2AMID,DCE$DRVDEUL2AMID %in% HandleDerivedL2AMID$CustomDerivedL2AMID,'-1')
Now I see only character values; there are no more numeric values in the data set under DRVDEUL2AMID.
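The same idea can be sketched on a plain character vector without the intermediate table. The `ids` vector is hypothetical sample data, and the pattern only treats unsigned digit strings as numeric, which is narrower than SQL Server's ISNUMERIC.

```r
# Hypothetical column mixing digit-only and text values.
ids <- c("123", "ABC", "42", "X9", "007")

# TRUE where the whole string consists of digits only.
is_digits <- grepl("^[0-9]+$", ids)
is_digits          # TRUE FALSE TRUE FALSE TRUE

# Overwrite the digit-only entries in place.
ids[is_digits] <- "-1"
ids                # "-1" "ABC" "-1" "X9" "-1"
```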
I'm using iris dataset.
Ran the following code:
functionq3 <- function(x) {
  if (x[['Sepal.Length']] > 5) {
    return("greater than 5")
  } else {
    return("less than 5")
  }
}
outputq3 <- apply(iris,1,functionq3)
print(outputq3)
It returns "greater than 5" even if the value is 5. I'm expecting "less than 5". What's going wrong?
apply() first converts the iris data frame to a matrix, and because the Species column is not numeric, every element is coerced to character. Then, in your function, the comparison operator > coerces the numeric 5 on the RHS of x[['Sepal.Length']] > 5 to the character string "5".
So the comparison actually performed for a Sepal.Length of 5 is "5.0" > "5". Its result depends on how the character strings "5.0" and "5" are ordered in the collating sequence of the current locale.
See ?Comparison
Comparison of strings in character vectors is lexicographic within the
strings using the collating sequence of the locale in use ...
... If the two arguments are atomic vectors of different types, one is
coerced to the type of the other, the (decreasing) order of precedence
being character, complex, numeric, integer, logical and raw.
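The coercion can be observed directly, and the comparison can be kept numeric by working on the column instead of on rows. This is a sketch of one possible fix, reusing the asker's labels; `m` is an illustrative name.

```r
# apply() works on as.matrix(iris); with a factor column present,
# every cell becomes a character string.
m <- as.matrix(iris)
class(m[1, "Sepal.Length"])   # "character"

# Vectorised comparison on the numeric column avoids the coercion.
outputq3 <- ifelse(iris$Sepal.Length > 5, "greater than 5", "less than 5")

# Rows where Sepal.Length is exactly 5 now get the expected label.
unique(outputq3[iris$Sepal.Length == 5])   # "less than 5"
```

If row-wise logic is really needed, passing only the numeric columns (for example iris[, 1:4]) to apply keeps the matrix numeric.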
In the console, go ahead and try
> sum(sapply(1:99999, function(x) { x != as.character(x) }))
0
For all of values 1 through 99999, "1" == 1, "2" == 2, ..., 99999 == "99999" are TRUE. However,
> 100000 == "100000"
FALSE
Why does R have this quirky behavior, and is this a bug? What would be a workaround to, e.g., check if every element in an atomic character vector is in fact numeric? Right now I was trying to check whether x == as.numeric(x) for each x, but that fails on certain datasets due to the above problem!
Have a look at as.character(100000). Its value is not equal to "100000" (have a look for yourself), and R is essentially just telling you so.
as.character(100000)
# [1] "1e+05"
Here, from ?Comparison, are R's rules for applying relational operators to values of different types:
If the two arguments are atomic vectors of different types, one is
coerced to the type of the other, the (decreasing) order of
precedence being character, complex, numeric, integer, logical and
raw.
Those rules mean that when you test whether 1=="1", say, R first converts the numeric value on the LHS to a character string, and then tests for equality of the character strings on the LHS and RHS. In some cases those will be equal, but in other cases they will not. Which cases produce inequality depends on the current settings of options("scipen") and options("digits").
So, when you type 100000=="100000", it is as if you were actually performing the following test. (Note that internally, R may well use something other than as.character() to perform the conversion):
as.character(100000)=="100000"
# [1] FALSE
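One possible workaround for the original question (checking whether every element of a character vector is numeric) is to convert in the other direction, from character to numeric, and look for NA. This is a sketch under the assumption that genuinely missing values should count as non-numeric; `x` and `is_num` are illustrative names.

```r
x <- c("1", "100000", "3.14", "abc", "1e5")

# as.numeric() returns NA for strings it cannot parse;
# suppressWarnings() hides the coercion warning.
is_num <- !is.na(suppressWarnings(as.numeric(x)))
is_num        # TRUE TRUE TRUE FALSE TRUE
all(is_num)   # FALSE, because of "abc"

# Comparing on the numeric side avoids the string formatting issue.
as.numeric("100000") == 100000       # TRUE

# format() with scientific = FALSE gives the fixed-notation string.
format(100000, scientific = FALSE)   # "100000"
```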
I have been using the R which function to remove rows from a data frame. I recently discovered that if the search term is NOT in the data.frame, the result is an empty character vector.
# 1: returns A-Q, S-Z (as expected)
LETTERS[-which(LETTERS == "R")]
# 2: returns "character(0)" (not what I would expect)
LETTERS[-which(LETTERS == "1")]
# 3: returns A-Z (expected)
LETTERS[which(LETTERS != "1")]
# 4: returns A-Q, S-Z (expected)
LETTERS[which(LETTERS != "R")]
Is the second example the expected behavior for -which() when the search term is not found? I have already switched my code to use the syntax in example 4, which seems safer, but I am just curious.
That is a well-known pitfall. When nothing matches the logical test, which() returns integer(0), and negating it still gives integer(0), so "[" returns nothing instead of returning everything as you would expect. You can use:
LETTERS[ ! LETTERS == "1" ]
LETTERS[ ! LETTERS %in% "1" ]
There is another gotcha to be aware of, and it is the one that makes me choose to use which(). With logical indexing, an NA value used inside "[" will return a row (of NAs). I generally do not want that, so I use DFRM[ which(logical) ], although this seems to bother some people who say it is not needed. I just think they are working with small datasets and infrequently encounter the annoyance of seeing tens of thousands of NA-induced useless lines of output on their console. I never use the negated which version, though.
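A minimal sketch of the NA behaviour described above, using a made-up numeric vector `x`:

```r
x <- c(1, NA, 3)

# Logical indexing: the NA in the index produces an NA element.
x[x > 1]          # NA 3

# which() keeps only the positions that are TRUE, dropping NA.
x[which(x > 1)]   # 3
```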
Because of this:
which(LETTERS == '-1')
## integer(0)
and this:
(1:2)[integer(0)]
## integer(0)
Instead of #4, use this:
LETTERS[LETTERS != "R"]
In example 2, which returns integer(0) (a zero-length integer vector) because no values are TRUE. A negated zero-length vector (-integer(0)) is still a zero-length vector. So you are indexing LETTERS with an empty index, which selects no elements and returns character(0).
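If the negated which() form has to stay, one option is to guard against the empty case explicitly. The sketch below shows the failure and two alternatives; `idx` and `kept` are illustrative names.

```r
idx <- which(LETTERS == "1")
length(idx)             # 0: nothing matched
length(LETTERS[-idx])   # 0: -integer(0) selects nothing

# Guard: only drop elements when there is something to drop.
kept <- if (length(idx) > 0) LETTERS[-idx] else LETTERS
length(kept)            # 26

# Logical negation needs no guard.
length(LETTERS[LETTERS != "1"])   # 26

# setdiff() also works for vectors, but note that it removes duplicates.
length(setdiff(LETTERS, "1"))     # 26
```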