I wrote a function in R to attach zeros such that any number between 1 and 100 comes out as 001 (1), 010 (10), and 100 (100) but I can't figure out why the if statements aren't qualifying like I would like them to.
id <- 1:11
Attach_zero <- function(id){
i<-1
for(i in id){
if(id[i] < 10){
id[i] <- paste("00",id[i], sep = "")
}
if((id[i] < 100)&&(id[i]>=10)){
id[i] <- paste("0",id[i], sep = "")
}
print(id[i])
}
}
The output is "001", "2", "3",... "010", "11"
I have no idea why the for loop is skipping middle integers.
The problem here is that you're assigning a character string (e.g. "001") to a numeric vector. When you do this, the entire id vector is converted to character (elements of a vector must be of one type).
So, after comparing 1 to 10 and assigning "001" to id[1], the next element of id is "2" (i.e. character 2). When an inequality includes a character element (e.g. "2" < 10), the numeric part is coerced to character, and alphabetic sorting rules apply. These rules mean that both "100" and "10" come before "2", and so neither of your if conditions is met. The same holds for every remaining number except 10: according to alphabetic sorting, "10" is less than "100" (and not less than "10"), so your second if condition is met. When you get to 11, neither condition is met once again, since the "word" "11" comes after the word "100".
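The coercion and the string comparisons described above can be checked directly at the console. This is only a small sketch; the vector name v is illustrative and not part of the original code:

```r
v <- 1:3
v[1] <- "001"   # assigning a string coerces the whole vector
class(v)        # "character"
v               # "001" "2" "3"

# Comparing a character value with a number converts the number to a
# string first, so these are all string comparisons:
"2" < 10        # FALSE: "2" sorts after "10"
"10" < 100      # TRUE:  "10" sorts before "100"
"11" < 100      # FALSE: "11" sorts after "100"
```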
While there are a couple of ways to fix your function, this functionality exists in R (as mentioned in the comments), both with sprintf and formatC.
sprintf('%03d', 1:11)
formatC(1:11, flag="0", width=3)
# [1] "001" "002" "003" "004" "005" "006" "007" "008" "009" "010" "011"
For another vectorised approach, you could use nested ifelse statements:
ifelse(id < 10, paste0('00', id), ifelse(id < 100, paste0('0', id), id))
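As a quick sanity check, the three approaches can be compared on values that cover one, two and three digits. The wrapper name pad3 and the variable names below are only illustrative assumptions, not something from the answers above:

```r
id <- c(1L, 10L, 100L)

pad3 <- function(id) sprintf("%03d", id)   # hypothetical wrapper name

a  <- pad3(id)
b  <- formatC(id, width = 3, flag = "0")
c3 <- ifelse(id < 10, paste0("00", id),
             ifelse(id < 100, paste0("0", id), id))

a                  # "001" "010" "100"
identical(a, b)    # TRUE
identical(a, c3)   # TRUE
```

All three return character vectors, so the padded ids should not be used in further numeric comparisons.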
Try this:
id <- 1:11
Attach_zero <- function(id){
  # Write results to a copy so that id stays numeric for the comparisons
  id1 <- id
  for (i in seq_along(id)) {
    if (id[i] < 10) {
      id1[i] <- paste("00", id[i], sep = "")
    }
    if (id[i] < 100 && id[i] >= 10) {
      id1[i] <- paste("0", id[i], sep = "")
    }
  }
  print(id1)
}
If you try your function with id = c(1:3, 6:11):
Attach_zero(id)
##[1] "001"
##[1] "2"
##[1] "3"
##[1] "8"
##[1] "9"
##[1] "010"
##[1] "11"
##Error in if (id[i] < 10) { : missing value where TRUE/FALSE needed
What happens here is that elements are skipped because of the values i takes. The i<-1 does nothing, because it is immediately overwritten by for (i in id), which on each iteration gives i the next value of id rather than an index. So if your id is id <- c(1:3, 6:11), positions 4 and 5 are never visited and positions 10 and 11 do not exist, which gives the unexpected results shown.
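The difference between looping over values and looping over positions can be seen without running the function at all. A minimal sketch using the same id:

```r
id <- c(1:3, 6:11)   # 9 elements, values up to 11

length(id)           # 9
id[10]               # NA: position 10 does not exist
seq_along(id)        # 1 2 3 4 5 6 7 8 9: the valid positions

# for (i in id) uses the values 1, 2, 3, 6, ..., 11 as positions:
skipped <- setdiff(seq_along(id), id)   # positions never visited
missing <- setdiff(id, seq_along(id))   # positions that do not exist
skipped              # 4 5
missing              # 10 11
```

The NA returned by id[10] is what triggers the "missing value where TRUE/FALSE needed" error inside if().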
Just correcting your function to include all the elements of the id:
Attach_zero <- function(id){
for(i in 1:length(id)){
if(id[i] < 10){
id[i] <- paste("00",id[i], sep = "")
}
if((id[i] < 100)&&(id[i]>=10)){
id[i] <- paste("0",id[i], sep = "")
}
print(id[i])
}
}
Attach_zero(id)
##[1] "001"
##[1] "2"
##[1] "3"
##[1] "6"
##[1] "7"
##[1] "8"
##[1] "9"
##[1] "010"
##[1] "11"
Note the number 7 in this output.
And using sprintf as jbaums says, including it in a function:
Attach_zero <- function(id){
return(sprintf('%03d', id)) #You can change return for print if you want
}
Attach_zero(id)
## [1] "001" "002" "003" "006" "007" "008" "009" "010" "011"
Related
I wonder how a for loop can be used here without running into the non-numeric error. I would like to put several character values into a vector Nums using a for loop.
But after the third line, the vector becomes chr, so the rest cannot run. The same happens when I use an if statement or a while loop... Can someone give a hint about this?
for(n in 1:30){
Nums<-1:n
Nums[Nums%%2==0 & Nums%%3==0]<-"OK1"
Nums[Nums%%2==0 & Nums%%3!=0]<-"OK2"
Nums[Nums%%2!=0 & Nums%%3==0]<-"OK3"
Nums[Nums%%2!=0 & Nums%%3!=0]<-n
}
Error in Nums%%2 : non-numeric argument to binary operator
I don't think the loop is actually doing what you want it to do. You are replacing Nums at every iteration, so nothing is actually being saved. Maybe you don't actually want a loop.
Nums <- 1:30
x <- 1:30
dplyr::case_when(
Nums%%2==0 & x%%3==0 ~ "OK1",
Nums%%2==0 & x%%3!=0 ~ "OK2",
Nums%%2!=0 & x%%3==0 ~ "OK3",
Nums%%2!=0 & x%%3!=0 ~ as.character(x)
)
#> [1] "1" "OK2" "OK3" "OK2" "5" "OK1" "7" "OK2" "OK3" "OK2" "11" "OK1"
#> [13] "13" "OK2" "OK3" "OK2" "17" "OK1" "19" "OK2" "OK3" "OK2" "23" "OK1"
#> [25] "25" "OK2" "OK3" "OK2" "29" "OK1"
Character and numeric values can't coexist in a vector*. As #Ands. points out, you don't really need a loop for this. If you want to avoid case_when (which is from the dplyr package, part of the "tidyverse"), you can do:
n <- 30
Nums <- 1:n
x <- as.character(Nums)
x[Nums%%2==0 & Nums%%3==0]<-"OK1"
x[Nums%%2==0 & Nums%%3!=0]<-"OK2"
x[Nums%%2!=0 & Nums%%3==0]<-"OK3"
You don't need the final statement because the remaining elements already hold the character form of the corresponding numbers.
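To see why the original code stops with the non-numeric error while the version above does not, it may help to reproduce the failure on a short vector. This is only a sketch; try() is used here so that the error can be inspected instead of stopping the script:

```r
Nums <- 1:6
Nums[Nums %% 2 == 0 & Nums %% 3 == 0] <- "OK1"   # first replacement works
class(Nums)                                      # "character"
Nums                                             # "1" "2" "3" "4" "5" "OK1"

# The next test needs numbers, but Nums now holds strings
res <- try(Nums %% 2, silent = TRUE)
inherits(res, "try-error")                       # TRUE
```

Keeping the tests on the untouched numeric vector and writing the labels into a separate character vector, as above, avoids this.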
If you want to use a for loop and replace as you go, you could convert the vector to a list:
Nums <- 1:n
Nums <- as.list(Nums)
for (i in 1:n) {
if (i%%2==0 & i%%3==0) Nums[[i]] <- "OK1"
if (i%%2==0 & i%%3!=0) Nums[[i]] <- "OK2"
if (i%%2!=0 & i%%3==0) Nums[[i]] <- "OK3"
}
unlist(Nums)
* Technically they can't coexist in an atomic vector — lists are vectors too ...
I have a set of vectors inside a list wherein I want to append certain values to each vector. When I used append() outside the loop, it worked perfectly fine but inside a loop it doesn't seem to work.
factors <- list(c("K3BG","9"),c("RTCKO","4"))
len <- length(factors)
for (i in 1:length)
{
rejig_score <- factors[[i]][2]
rejig_score <- as.numeric(rejig_score)
if(rejig_score > 5)
{
factors[[i]] <- append(factors[[i]],"Approved")
}
else
{
factors[[i]] <- append(factors[[i]],"Disapproved")
}
}
I changed 1:length to 1:len inside the for loop (length on its own is the function, not a number):
factors <- list(c("K3BG","9"),c("RTCKO","4"))
len <- length(factors)
for (i in 1:len)
{
rejig_score <- factors[[i]][2]
rejig_score <- as.numeric(rejig_score)
if(rejig_score > 5)
{
factors[[i]] <- append(factors[[i]],"Approved")
}
else
{
factors[[i]] <- append(factors[[i]],"Disapproved")
}
}
factors
[[1]]
[1] "K3BG" "9" "Approved"
[[2]]
[1] "RTCKO" "4" "Disapproved"
Using lapply
lapply(factors, function(x) c(x, if(as.numeric(x[2]) > 5)
"Approved" else "Disapproved"))
-output
[[1]]
[1] "K3BG" "9" "Approved"
[[2]]
[1] "RTCKO" "4" "Disapproved"
Or another option is to extract the second element from the list and do the comparison outside, create the vector values and append
new <- c("Disapproved", "Approved")[1 +
(as.numeric(sapply(factors, `[[`, 2)) > 5)]
Map(c, factors, new)
[[1]]
[1] "K3BG" "9" "Approved"
[[2]]
[1] "RTCKO" "4" "Disapproved"
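The indexing trick in that last option relies on TRUE and FALSE being treated as 1 and 0 in arithmetic. A small sketch with illustrative values (scores and idx are hypothetical names):

```r
scores <- c(9, 4)
idx <- 1 + (scores > 5)    # TRUE -> 2, FALSE -> 1
idx                        # 2 1
labels <- c("Disapproved", "Approved")[idx]
labels                     # "Approved" "Disapproved"
```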
I am just trying to loop over my columns and print out the count of unique values for further processing, but I am not getting any output. This should be simple. Here is a simplified version of my code. Is there something glaringly obvious missing, as I suspect?
for (i in 1:length(mydata)) {
(table(mydata[,i]))
}
Do you mean using apply?
> x <- data.frame("SN" = 1:4, "Age" = c(21,15,56,15), "Name" =
c("John","Dora","John","Dora"))
> apply(x,2,function(x) unique(x))
$SN
[1] "1" "2" "3" "4"
$Age
[1] "21" "15" "56"
$Name
[1] "John" "Dora"
You can also count the uniques like this:
> apply(x,2,function(x) length(unique(x)))
SN Age Name
4 3 2
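As for why the original loop shows nothing: results are not auto-printed inside a for loop, so the table has to be passed to print() explicitly. The sketch below uses the x data frame from this answer as a stand-in, since mydata itself is not shown in the question:

```r
x <- data.frame(SN = 1:4, Age = c(21, 15, 56, 15),
                Name = c("John", "Dora", "John", "Dora"))

# Inside a loop, wrap the expression in print() to see it
for (i in seq_along(x)) {
  print(table(x[[i]]))
}

# Number of distinct values per column; sapply works column by column,
# so the columns are not converted to a character matrix first
counts <- sapply(x, function(col) length(unique(col)))
counts   # SN 4, Age 3, Name 2
```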
I am trying to extract characters only and numbers only from a string. Because the positions of these vary, I can't use syntax which relies on the position of the values.
For example, say I have the following column x where values are repeated, but with different numbers:
x <- c("dummy.DR57", "dummy.hour41", "dummy.MAV43", "dummy.SB1")
I want to create two columns:
1: A column with just the characters after the "." but before the numbers:
name <- c("DR", "hour", "MAV", "SB")
2: A column with just the numbers:
number <- c("57", "41", "43", "1")
I've mostly been trying substr and str_sub - but I'm not getting the results I need.
Any help is much appreciated!
x <- c("dummy.DR57", "dummy.hour41", "dummy.MAV43", "dummy.SB1")
(number <- gsub('[^[:digit:]]', '', x))
# [1] "57" "41" "43" "1"
(name <- gsub("[^.]*[.]|[[:digit:]]", "", x))
# [1] "DR" "hour" "MAV" "SB"
> gsub(x, pattern = '[0-9]|dummy\\.', replacement = '')
[1] "DR" "hour" "MAV" "SB"
> gsub(x, pattern = '[a-zA-Z]|\\.', replacement = '')
[1] "57" "41" "43" "1"
You may try this:
gsub(pattern = "(^.*\\.)([[:alpha:]]+)([[:digit:]]+)",
replacement = "\\2",
x = x)
# [1] "DR" "hour" "MAV" "SB"
gsub(pattern = "(^.*\\.)([[:alpha:]]+)([[:digit:]]+)",
replacement = "\\3",
x = x)
# [1] "57" "41" "43" "1"
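Since both pieces come from the same pattern, the two columns can be built in one go. This is a sketch under the assumption that every value has the form prefix, dot, letters, digits; the names pattern and result are illustrative, and the conversion to integer is optional:

```r
x <- c("dummy.DR57", "dummy.hour41", "dummy.MAV43", "dummy.SB1")

# Group 1 captures the letters after the dot, group 2 the trailing digits
pattern <- "^.*\\.([[:alpha:]]+)([[:digit:]]+)$"

result <- data.frame(
  x      = x,
  name   = sub(pattern, "\\1", x),
  number = as.integer(sub(pattern, "\\2", x)),
  stringsAsFactors = FALSE
)
result$name     # "DR" "hour" "MAV" "SB"
result$number   # 57 41 43 1
```

Values that do not match the pattern are returned unchanged by sub(), so it is worth checking for those before relying on the result.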
I have a problem using a for-loop in R. The following code
a <- seq(-2, 5)
for(i in 1:length(a)){
a[i] <- if(a[i] <= 0) "aa" else a[i]
}
should result in the following vector
> a
[1] "aa" "aa" "aa" "1" "2" "3" "4" "5"
Instead we have the following result:
> a
[1] "aa" "-1" "aa" "1" "2" "3" "4" "5"
Why isn't R able to replace "-1" with "aa"?
We tried another solution which works fine:
a <- seq(-2, 5)
b <- NULL
for(i in 1:length(a)){
b[i] <- if(a[i] <= 0) "aa" else a[i]
}
it produces the expected result:
> b
[1] "aa" "aa" "aa" "1" "2" "3" "4" "5"
Why does the latter example work fine and the first one not?
Thank you very much for your help!
Best regards!!
The collation sequence may not be what you (or Matthew) expect. The character "-" may not sort before the digits in your locale; string comparisons depend on the locale and the OS. (See ?Comparison) After the first replacement the entire vector was coerced to character, so if "-1" <= 0 returns FALSE on your machine then you have the answer. I will bet that this code will act as you expected:
a <- seq(-2, 5)
for(i in 1:length(a)){
a[i] <- if( as.numeric(a[i]) <= 0) "aa" else a[i]
}
I suspect that Henrik's suggestion should also behave to your expectations because it would create a logical vector from the numeric comparison first, and then select from the choice of "aa" and a.
(In the second instance there was no coercion of the vector to character.)
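A vectorised version sidesteps the issue because the comparison is done once, on the numeric vector, before any strings are involved. The sketch below uses ifelse; whether this is exactly what Henrik suggested is an assumption, since that comment is not reproduced here:

```r
a <- seq(-2, 5)

# The test a <= 0 is evaluated on numbers; only the result is character
b <- ifelse(a <= 0, "aa", a)
b               # "aa" "aa" "aa" "1" "2" "3" "4" "5"
is.numeric(a)   # TRUE: a itself is left untouched
```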