How to mark all integer values to "-1" in R Studio? - r

What is the R equivalent of the SQL Server ISNUMERIC function? I am trying to migrate some SQL logic to RStudio, and I have a column that holds both character values and integer values. I want to take only the integer values and update them to -1 in an R data.table. Please help me solve this problem.
I have attached the results as an image: column "A" holds the current values, and I expect to get values like those in column "B".

R has type-checking functions, as SQL and other languages do, such as is.numeric() and is.integer(). These return logical values, but you could use sub() or gsub() to turn each TRUE into -1:
example <- list(123, 321, "not numeric", as.Date("2018/01/01"))
gsub(T, -1, sapply(example, is.numeric))
[1] "-1" "-1" "FALSE" "FALSE"
Also, note that in R numeric is different from integer.
example <- list(as.integer(123), 321, "not numeric", as.Date("2018/01/01"))
example[sapply(example, is.integer)] <- -1
example
[[1]]
[1] -1
[[2]]
[1] 321
[[3]]
[1] "not numeric"
[[4]]
[1] "2018-01-01"
You can convert them back and forth with as.numeric() and as.integer(). Also, note that in R data types in this sense are referred to as the class or classes of the data, whereas the type in R refers to the storage or R internal data type.

I think if you're specifically interested in integers, then the question above is a duplicate of the following:
Check if the number is integer
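Your if condition would be something like x == round(x, 0). This is TRUE when a numeric value is a whole number, whether it is stored as integer or double, and FALSE when it has a fractional part. Note that round() throws an error on non-numeric input such as character strings, so check is.numeric(x) first.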
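A minimal sketch of that whole-number check, to show how it differs from is.integer(). The vector x is made up for illustration:

```r
x <- c(1, 2.5, 3)           # stored as double

stored_as_int <- is.integer(x)  # FALSE: tests the storage type of the vector
whole <- x == round(x)          # element-wise test for whole-number values

x[whole] <- -1                  # mark the whole-number entries
```

Here is.integer() says nothing about individual values, while the comparison with round() flags the first and third elements.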

I finally fixed this issue with the following steps.
First, I captured all the numeric values in a separate data table using the script below:
CustomDerivedL2AMID <- subset(DimCombinedEnduser$DRVDEUL2AMID, grepl('^\\d+$', DimCombinedEnduser$DRVDEUL2AMID))
library(data.table)
HandleDerivedL2AMID <- data.table(CustomDerivedL2AMID)
Then I matched the HandleDerivedL2AMID results against the original data table and replaced all the matching values with -1:
DCE$DRVDEUL2AMID <- replace(DCE$DRVDEUL2AMID, DCE$DRVDEUL2AMID %in% HandleDerivedL2AMID$CustomDerivedL2AMID, '-1')
Now I see only character values; there are no more numeric values under DRVDEUL2AMID in the data set.
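The same result can probably be reached in one step, without the intermediate table. A minimal base R sketch, assuming the column is stored as character and that "numeric" means a string made only of digits; the data frame df and its columns A and B are hypothetical stand-ins for the real data:

```r
# Hypothetical column mixing digit-only strings and text
df <- data.frame(A = c("123", "abc", "45", "x9", "007"),
                 stringsAsFactors = FALSE)

# TRUE where the whole string consists of digits only
is_digits <- grepl("^[0-9]+$", df$A)

# Copy A and overwrite the digit-only entries with "-1"
df$B <- df$A
df$B[is_digits] <- "-1"
```

This pattern does not accept signs or decimal points; widen it if those should count as numeric. With a data.table, the same logical test can be used in the row position together with a := assignment.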

Related

How do I convert data from integer and dbl to numeric in R

As stated above, I'm trying to convert data in my dataframe from integer/dbl to numeric but I end up with dbl for both columns.
Original dataset
Code I'm using to convert to numeric;
data$price <- as.numeric(data$price)
data$lot_size <- as.numeric(data$lot_size)
The dataframe I end up with:
Dataset I have been working with: https://dasl.datadescription.com/datafile/housing-prices-ge19
"numeric is identical to double"
https://stat.ethz.ch/R-manual/R-devel/library/base/html/numeric.html
> typeof(as.numeric(3L))
[1] "double"
> typeof(as.integer(3L))
[1] "integer"
The stuff with types in R is a bit confusing. I would say that numeric is not really a data type at all in R. You will never get the answer numeric from the typeof function.
Both integers and doubles are considered numeric, and the function is.numeric() returns TRUE for either.
On the other hand, numeric is more often a synonym for double.
The functions numeric and as.numeric are the same as double and as.double.
Edit:
With a bit more research under my belt let me rephrase it like this:
'numeric' is the virtual superclass of both integer and double.
See for example getClass("numeric") and help(UseMethod) (first paragraph in the Details section).
Hadley says it better: Advanced R
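A short sketch that puts the three views side by side (storage type, class, and the is.numeric() test). The variable names are only for illustration:

```r
i <- 3L   # integer literal
d <- 3    # double literal

types   <- c(typeof(i), typeof(d))          # "integer" "double"
classes <- c(class(i), class(d))            # "integer" "numeric"
numeric_test <- c(is.numeric(i), is.numeric(d))  # TRUE TRUE

# as.numeric() converts the integer to a double with the same value
same_value <- identical(as.numeric(i), d)   # TRUE
```

So seeing dbl for a column after as.numeric() is expected: the column is numeric, and double is how it is stored.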

Changing Yes/No CSV Column to Numeric Data in R

I have a .csv file with yes/no answers in one column. I opened it in my R compiler and tried to run pairs() on it; however, I get an error message of "non-numeric argument to pairs." I have attempted to change the yes/no responses to 0/1 values, but as.numeric() and as.factor() don't seem to do anything. I have also tried changing the data type from character to numerical in the data editor window that appears when I use the fix() function. That results in a column full of "NA".
How can I change the yes/no responses into something that will work with pairs() and with plot()?
I am fairly new to R and would much appreciate your help.
Logical vectors can be converted directly to numbers with the unary plus shortcut +(.). For instance,
x <- c("yes","no","yes")
(x == "yes")
# [1] TRUE FALSE TRUE
+(x == "yes")
# [1] 1 0 1
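Applied to a data frame column, the idea might look like the sketch below. The data frame df and its column names are invented to stand in for the CSV, and the recoding assumes the answers are spelled exactly "yes" and "no":

```r
# Hypothetical data standing in for the CSV
df <- data.frame(score  = c(10, 20, 15, 30),
                 hours  = c(1, 3, 2, 5),
                 answer = c("yes", "no", "yes", "no"),
                 stringsAsFactors = FALSE)

# Recode: "yes" -> 1, "no" -> 0, anything else -> NA
df$answer <- ifelse(df$answer == "yes", 1L,
                    ifelse(df$answer == "no", 0L, NA_integer_))

all_numeric <- all(sapply(df, is.numeric))  # every column is now numeric
# pairs(df)  # should no longer complain about a non-numeric argument
```

The nested ifelse() is used instead of the bare comparison so that unexpected spellings show up as NA rather than silently becoming 0.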

UseMethod("type") error; no applicable method for 'type' applied to an object of class "c('double', 'numeric')"

In a dataframe, I have a column that has numeric values and some mixed in character data for some rows. I want to remove all rows with the character data and keep those rows with a number value. The df I have is 6 million rows, so I simply made a small object to try to solve my issue and then implement at a larger scale.
Here is what I did:
a <- c("fruit", "love", 53)
b <- str_replace_all("^[:alpha:]", 0)
Reading answers to other UseMethod errors on here (about factors), I tried to change "a" to as.character(a) and attempt "b" again. But, I get the same error. I'm trying to simply make any alphabetic value into the number zero and I'm fairly new at all this.
There are several issues here, even in these two lines of code. First, a is a character vector, because c() coerces all of its elements to a common type and "fruit" and "love" are character strings. This means that your numeric 53 is coerced to a character string.
> print(a)
[1] "fruit" "love" "53"
You've got the wrong syntax for str_replace_all. See the documentation for how to use it correctly. But that's not what you want here, because you want numerics.
The first thing you need to do is convert a to a numeric. A crude way of doing this is simply
> b <- as.numeric(a)
Warning message:
NAs introduced by coercion
> b
[1] NA NA 53
And then subset to include only the numeric values in b:
> b <- b[!is.na(b)]
> b
[1] 53
But whether that's what you want to do with a 6 million row dataframe is another matter. Please think about exactly what you would like to do, supply us with better test data, and ask your question again.
There's probably a more efficient way of doing this on a large data frame (e.g. something column-wise, instead of row-wise), but to answer your specific question about each row a:
as.numeric(stringr::str_replace_all(a, "[a-z]+", "0"))
Note that the replacing value must be a character (the last argument in the function call, "0"). (You can look up the documentation from your R-console by: ?stringr::str_replace_all)
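For a large data frame, a column-wise version might look like the following sketch. It assumes the mixed column is stored as character; the data frame df and its columns id and value are hypothetical:

```r
# Hypothetical data frame with a mixed character column
df <- data.frame(id = 1:4,
                 value = c("fruit", "love", "53", "7.5"),
                 stringsAsFactors = FALSE)

# Parse the whole column once; anything that is not a number becomes NA
num <- suppressWarnings(as.numeric(df$value))

# Option 1: keep only the rows that parsed as numbers
kept <- df[!is.na(num), ]
kept$value <- num[!is.na(num)]

# Option 2: keep all rows and replace the non-numbers with 0
zeroed <- df
zeroed$value <- ifelse(is.na(num), 0, num)
```

One caveat: values that were already missing also come out as NA, so they are dropped by option 1 and zeroed by option 2.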

Variable cell specification for a csv in R

I wish to use a variable to specify a particular cell in a csv file. I can use:
emp1 <- read.csv("C:/Database/data/emp1.csv",as.is=TRUE)
numberofemployee <- 1
> emp1["1", "X.name"]
[1] "ALEX"
but if I use:
> emp1["numberofemployee", "X.name"]
[1] NA
I assume R is looking for numberofemployee as a column header.
How do I get it to see it as an integer so I can specify my cells?
csv file
#name,mon,tue,wed,thu,fri
ALEX,98,95,73,88,18
BRAD,66,25,72,8,32
JOHN,22,41,78,43,36
The problem is that you pass strings to []. A character index is matched against the row and column names. "1" works because the default row names of a data frame are the strings "1", "2", "3" and so on, so it matches the first row by name. "numberofemployee" is likewise treated as a row name; no row has that name, so you get NA. If you want to use the content of numberofemployee, omit the quotes. R will then interpret it as an R object whose value you want to use:
emp1[numberofemployee, "X.name"]
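A self-contained sketch of the difference, reading the CSV from the question out of a string instead of the file path (the variable csv_text is introduced here only for that purpose):

```r
# Inline copy of the CSV from the question
csv_text <- "#name,mon,tue,wed,thu,fri
ALEX,98,95,73,88,18
BRAD,66,25,72,8,32
JOHN,22,41,78,43,36"

emp1 <- read.csv(text = csv_text, as.is = TRUE)
numberofemployee <- 2

row_names   <- rownames(emp1)                      # default row names are strings
by_name     <- emp1["2", "X.name"]                 # character index: matched against row names
by_position <- emp1[numberofemployee, "X.name"]    # numeric index: row position
no_match    <- emp1["numberofemployee", "X.name"]  # NA: no row has that name
```

Both by_name and by_position return the second employee, while the quoted variable name matches nothing.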

Why does 1..99,999 == "1".."99,999" in R, but 100,000 != "100,000"?

In the console, go ahead and try
> sum(sapply(1:99999, function(x) { x != as.character(x) }))
0
For all values 1 through 99999, the comparisons "1" == 1, "2" == 2, ..., 99999 == "99999" are TRUE. However,
> 100000 == "100000"
FALSE
Why does R have this quirky behavior, and is this a bug? What would be a workaround to, e.g., check if every element in an atomic character vector is in fact numeric? Right now I was trying to check whether x == as.numeric(x) for each x, but that fails on certain datasets due to the above problem!
Have a look at as.character(100000). Its value is not equal to "100000" (have a look for yourself), and R is essentially just telling you so.
as.character(100000)
# [1] "1e+05"
Here, from ?Comparison, are R's rules for applying relational operators to values of different types:
If the two arguments are atomic vectors of different types, one is
coerced to the type of the other, the (decreasing) order of
precedence being character, complex, numeric, integer, logical and
raw.
Those rules mean that when you test whether 1=="1", say, R first converts the numeric value on the LHS to a character string, and then tests for equality of the character strings on the LHS and RHS. In some cases those will be equal, but in other cases they will not. Which cases produce inequality will be dependent on the current settings of options("scipen") and options("digits")
So, when you type 100000=="100000", it is as if you were actually performing the following test. (Note that internally, R may well/probably does use something different than as.character() to perform the conversion):
as.character(100000)=="100000"
# [1] FALSE
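As for the workaround the question asks about, one option is to avoid the string comparison and test whether each element parses as a number. A minimal sketch; the vector x is made up, and "numeric" here means whatever as.numeric() accepts, which includes forms such as "1e5":

```r
x <- c("1", "99999", "100000", "3.14", "abc", "1e5")

# TRUE where the string parses as a number; non-numbers yield NA
is_num <- !is.na(suppressWarnings(as.numeric(x)))

# If a string comparison is really needed, format without scientific notation
same <- format(100000, scientific = FALSE) == "100000"
```

This does not depend on options("scipen") or options("digits"), because no number is converted to a string in the first test.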
