I have an R file which imports a file, does some data manipulation, fits a logistic regression model, and then saves the results to a txt file. However, when I run the file from the command line, I get the following error message and don't know what's going on.
anonymous#anonymous-Latitude-E6520:~/Downloads$ R --no-save < Auto_Model.r > out.txt
Warning message:
NAs introduced by coercion
Error in if (x == "\\N") NA else if (x > 1 & x < 6999) "1:6999" else if (x > :
missing value where TRUE/FALSE needed
Calls: bin.value -> do.call -> mapply -> .Call -> <Anonymous>
Execution halted
anonymous#anonymous-Latitude-E6520:~/Downloads$ R --no-save < Auto_Model.r
The R script which results in the error is below:
> ## IMPORT DATA:
> #setwd("~/Desktop")
> library(foreign)
> dat = read.csv("dat.csv", stringsAsFactors=FALSE)
>
> ## zipcode =
> dat$zipcode = as.character(dat$zipcode)
>
> bin.value = Vectorize(function(x) {
+ if (x == "\\N") NA
+ else if (x > 1 & x < 6999) "1:6999"
+ else if (x > 7000 & x < 9999) "7000:9999"
+ else if (x > 10000 & x < 14849) "10000:14849"
+ else if (x > 14850 & x < 19699) "14850:19699"
+ else if (x > 19700 & x < 29999) "19700:29999"
+ else if (x > 30000 & x < 31999) "30000:31999"
+ else if (x > 32000 & x < 34999) "32000:34999"
+ else if (x > 35000 & x < 42999) "35000:42999"
+ else if (x > 43000 & x < 49999) "43000:49999"
+ else if (x > 50000 & x < 59999) "50000:59999"
+ else if (x > 60000 & x < 69999) "60000:69999"
+ else if (x > 70000 & x < 79999) "70000:79999"
+ else if (x > 80000 & x < 89999) "80000:89999"
+ else if (x > 90000 & x < 96999) "90000:96999"
+ else if (x > 97000 & x < 99820) "97000:99820"
+ else NA
+ })
>
> dat$zipcode2 = as.character(bin.value(as.integer(dat$zipcode)))
Error in if (x == "\\N") NA else if (x > 1 & x < 6999) "1:6999" else if (x > :
missing value where TRUE/FALSE needed
Calls: bin.value -> do.call -> mapply -> .Call -> <Anonymous>
Execution halted
I assume something is wrong in how I am trying to manipulate the mode of the zipcode variable, but nothing I've tried seems to fix the issue.
> str(dat$zipcode)
int [1:12635] 76148 33825 61832 11368 98290 92078 44104 62052 55106 20861 ...
>
It seems to me that what you're trying to do is already done by function cut:
bin.value <- function(x){
cut(as.integer(x),
breaks= c(1,6999,9999,14849,19699,29999,31999,34999,42999,49999,59999,69999,79999,89999,96999,99820),
labels= c("1:6999", "7000:9999", "10000:14849", "14850:19699", "19700:29999", "30000:31999", "32000:34999", "35000:42999", "43000:49999", "50000:59999", "60000:69999", "70000:79999", "80000:89999", "90000:96999", "97000:99820"))
}
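As a quick check of how cut assigns values, here is a minimal sketch on a toy input. The vector zips and the shortened breaks and labels are made up for illustration. Note that cut uses right-closed intervals by default, so 6999 and 9999 land in a bin, whereas the strict < comparisons in the question would turn them into NA.

```r
# Toy input (hypothetical): one "\\N" marker and a few boundary values
zips <- c("\\N", "1", "6999", "7000", "9999", "12345")

# as.integer() turns "\\N" into NA (with a warning, suppressed here)
bins <- cut(suppressWarnings(as.integer(zips)),
            breaks = c(1, 6999, 9999, 14849),
            labels = c("1:6999", "7000:9999", "10000:14849"))

# Intervals are (1, 6999], (6999, 9999], (9999, 14849]
as.character(bins)
# NA NA "1:6999" "7000:9999" "7000:9999" "10000:14849"
```

The value 1 itself falls outside the first interval, just as it fails x > 1 in the original function; passing include.lowest = TRUE to cut would include it.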
Otherwise your specific problem is caused by as.integer:
a <- c("\\N",sample(seq(0,100000,by=1),10))
a
[1] "\\N" "38987" "50403" "75683" "66706" "27924" "17216" "77539" "80658" "2335" "53010"
as.integer(a)
[1] NA 38987 50403 75683 66706 27924 17216 77539 80658 2335 53010
\\N is therefore converted to NA straight away, which your function only handles at the very end. Meanwhile every if condition, starting with x == "\\N", compares a missing value with something and gets NA instead of TRUE or FALSE:
as.integer(a)[1]=="\\N"
[1] NA # Instead of TRUE or FALSE
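If you would rather keep the if/else chain, one way around this is to test for missing values first with is.na. The following is only a sketch with two of the bins; the name bin_one and the inclusive bounds are my own choices, not part of the original code.

```r
# Hypothetical scalar helper: handle NA before any comparison is attempted
bin_one <- function(x) {
  if (is.na(x)) NA_character_
  else if (x >= 1 && x <= 6999) "1:6999"
  else if (x >= 7000 && x <= 9999) "7000:9999"
  else NA_character_
}

# "\\N" becomes NA after as.integer(); the warning is suppressed on purpose
vals <- suppressWarnings(as.integer(c("\\N", "2335", "7500")))
binned <- vapply(vals, bin_one, character(1))
binned
# NA "1:6999" "7000:9999"
```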
I am trying to write an if statement with & in R.
I want to do something like this:
if (x > 1) & (y = "Yes) {"replace")
I've also tried
if (x > 1) && (y = "Yes") {"replace")
Which I read on StackOverflow.
How do I convert this Excel formula to R?
=IF(AND(cell > 1, cell = "Yes"),100,0)
Try this. Does it work?
if (x > 1 & y == "Yes") {"replace"}
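To mirror the Excel formula more directly, here is a small sketch with made-up values. For a single pair of values, if with && is enough; for whole columns, ifelse with the elementwise & plays the role of IF(AND(...)). The variable names xs, ys, result and results are placeholders.

```r
# Single values: && combines two length-one conditions
x <- 3
y <- "Yes"
result <- if (x > 1 && y == "Yes") 100 else 0

# Whole vectors (like a spreadsheet column): & works element by element
xs <- c(0.5, 2, 3)
ys <- c("Yes", "Yes", "No")
results <- ifelse(xs > 1 & ys == "Yes", 100, 0)

result   # 100
results  # 0 100 0
```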
It works with
x[x >= 0.2] = 1
x[x < 0.2] = 0
x is a tensor here.
But when I try to use
x[x > 0 and x < 1] = 1
it reports: RuntimeError: bool value of Tensor with more than one value is ambiguous
Does anyone know why?
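Just a syntax thing. Python's and needs a single True/False value, which a tensor with more than one element cannot provide. Use the elementwise & operator instead and wrap each comparison in parentheses:
import torch
x = torch.randn((1,3,20,20))
x[(x > 0) & (x < 1)] = 1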
Hi, I was wondering if someone knows how to generate this sequence in R.
Consider a sequence with the following requirements:
a_1 = 1
a_n = a_{n-1} + 3 (if n is an even number)
a_n = 2 × a_{n-1} - 5 (if n is an odd number)
e.g. 1, 4, 3, 6, 7, 10, 15, ...
a_30 = ?
Try the following.
It will return the entire sequence, not just the last element.
seq_chih_peng <- function(n){
a <- integer(n)
a[1] <- 1
for(i in seq_along(a)[-1]){
if(i %% 2 == 0){
a[i] <- a[i - 1] + 3
}else{
a[i] <- 2*a[i - 1] - 5
}
}
a
}
seq_chih_peng(30)
Note that I do not include code to check for input errors such as passing n = 0 or a negative number.
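If you do want that input checking, a minimal sketch could look like the following. The name seq_checked and the exact error message are my own choices; the loop is the same recurrence as above.

```r
# Hypothetical variant of the function above with basic input validation
seq_checked <- function(n) {
  if (!is.numeric(n) || length(n) != 1 || is.na(n) || n < 1 || n != floor(n)) {
    stop("n must be a single positive whole number")
  }
  a <- numeric(n)
  a[1] <- 1
  # seq_len(n)[-1] is empty when n == 1, so the loop is skipped in that case
  for (i in seq_len(n)[-1]) {
    a[i] <- if (i %% 2 == 0) a[i - 1] + 3 else 2 * a[i - 1] - 5
  }
  a
}

seq_checked(7)
# 1 4 3 6 7 10 15
```

Calling seq_checked(0) or seq_checked("a") now stops with the error message instead of failing somewhere inside the loop.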
If you want to do it recursively, you just have to write the equations in your function as follows:
sequence <- function(n) {
if (n == 1) return(1)
else if (n > 1) {
if (n %% 2 == 1) {
return(2 * sequence(n - 1) - 5)
}else{
return(sequence(n - 1) + 3)
}
}else{
stop("n must be strictly positive")
}
}
sequence(30)
# returns 32770
I am having trouble again with my code. Here is the question and what I have right now:
# 2. Draw a random sample of size n=20 from a uniform distribution in 0 and 1 using
# runif(20). Sequentially, print values using the following rules:
# i. Print a value if it is less than 0.3 or more than 0.8, but skip it
# (don’t print the value) if it is in (0.1, 0.2],
# ii. Skip the entire process if you find a value in [0.4,0.5].
# Write three separate R codes using (a) for loop, (b) while loop
# and (c) repeat loop.
# (a) for loop
n = runif(20)
for (val in n){
if (val > 0.1 & val <= 0.2){
next
} else if (val < 0.3 | val > 0.8){
print(val)
} else if (val >= 0.4 & val <= 0.5){
print(val)
break
}
}
# (b) while loop
n = 1
m = runif(20)
while(n < 20){
if (m > 0.1 & m <= 0.2){
next
} else if (m < 0.3 | m > 0.8){
print(m)
} else if (m >= 0.4 & m <= 0.5){
print(m)
break
}
n = n + 1
}
# (c) repeat loop
n = 1
m = runif(20)
repeat{
if (m > 0.1 & m <= 0.2){
next
} else if (m < 0.3 | m > 0.8){
print(val)
} else if (m >= 0.4 & m <= 0.5){
print(m)
break
}
}
Part (a) for loop is working perfectly.
My only issue is with (b) the while loop and (c) the repeat loop. My instructor didn't do a good job of covering while and repeat loops in class or in the notes. Please help.
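The object m that you created has a length of 20, so a test like if (m > 0.1 & m <= 0.2) hands if a condition with 20 elements. Older versions of R test only the first element and give a warning; R 4.2.0 and later stop with an error. To solve this, index m with n, your loop counter. In other words, don't use m in your tests, but use m[n] instead. Two more changes are needed so that the loop ends and covers all 20 values: the condition has to be n <= 20, and n has to be incremented before next. In all it should look like this:
n <- 1
m <- runif(20)
while (n <= 20) {        # <= so that the 20th value is checked too
  if (m[n] > 0.1 & m[n] <= 0.2) {
    n <- n + 1           # advance before next, otherwise the loop never ends
    next
  } else if (m[n] < 0.3 | m[n] > 0.8) {
    print(m[n])
  } else if (m[n] >= 0.4 & m[n] <= 0.5) {
    print(m[n])
    break
  }
  n <- n + 1
}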
You should be able to use a similar approach for part c. (Also note that in part c you have print(val) at one point.)
Hope that helps!
Apparently the exercise is for you to sort it out, but OK, I'll post a solution.
# (b) while loop
n = 1
m = runif(20)
while(n <= 20){
if (m[n] > 0.1 & m[n] <= 0.2){
n = n + 1
next
} else if (m[n] < 0.3 | m[n] > 0.8){
print(m[n])
} else if (m[n] >= 0.4 & m[n] <= 0.5){
print(m[n])
break
}
n = n + 1
}
# (c) repeat loop
n = 0
m = runif(20)
repeat{
if(n < 20)
n <- n + 1
else
break
if (m[n] > 0.1 & m[n] <= 0.2){
next
} else if (m[n] < 0.3 | m[n] > 0.8){
print(m[n])
} else if (m[n] >= 0.4 & m[n] <= 0.5){
print(m[n])
break
}
}
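Because runif gives different numbers on every run, it can help to check the rules against a fixed input. The sketch below wraps the same while-loop logic in a function that returns the values instead of printing them. The name kept_values and the test vector are made up for illustration, and like the loops above it records the value found in [0.4, 0.5] before stopping.

```r
# Hypothetical helper: same rules as the loops above, but returns the values
kept_values <- function(m) {
  kept <- numeric(0)
  n <- 1
  while (n <= length(m)) {
    v <- m[n]
    n <- n + 1                        # advance first so next cannot stall the loop
    if (v > 0.1 & v <= 0.2) next      # rule i: skip values in (0.1, 0.2]
    if (v >= 0.4 & v <= 0.5) {        # rule ii: record the value, then stop
      kept <- c(kept, v)
      break
    }
    if (v < 0.3 | v > 0.8) kept <- c(kept, v)
  }
  kept
}

kept_values(c(0.05, 0.15, 0.9, 0.6, 0.45, 0.01))
# 0.05 0.90 0.45
```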
As a final note, whenever pseudo-random number generators are used you should set the initial value in order for the results to be reproducible. This is done like this:
set.seed(6019) # or any other value, 6019 is the seed
This is put before the first call to runif.
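A short sketch of what that buys you: resetting the same seed before each draw gives exactly the same 20 numbers. The variable names first and second are placeholders.

```r
set.seed(6019)
first <- runif(20)

set.seed(6019)            # same seed again
second <- runif(20)

identical(first, second)  # TRUE: both draws are the same
```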
I have a dataframe named flow with over 17,000 entries, which contains daily water quality data for about 50 years. I have a column that has the jday (day of the year) of each entry, but now I want to assign each entry a season from 1 to 4 (winter, spring, summer, fall). This is what I have so far:
> for(i in flow){
+ if (flow$jdays[i] <= 80 | flow$jdays[i]>355){
+ flow$season [i] <- 1
+ } else if (flow$jdays [i] > 80 & flow$jdays [i]<= 172){
flow$season [i] <- 2
+ }
+ else if(flow$jdays [i] > 172 & flow$jdays [i]<= 264){
+ flow$season [i] <- 3
+ }
+ else{
+ flow$season [i] <- 4
+ }
+ }
I keep getting the following message:
Error in if (flow$jdays[i] <= 80 | flow$jdays[i] > 355) { :
argument is of length zero
This may be a better approach:
flow$season<-ifelse(flow$jdays<=80 | flow$jdays>355 ,1,
ifelse(flow$jdays<=172,2,
ifelse(flow$jdays<=264,3,4)))
This line is in error, because iterating over a data frame visits its columns rather than its row numbers:
for(i in flow){
Change it to:
for(i in seq(nrow(flow))){
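To see the difference, here is a small sketch on a made-up data frame (flow_toy and its columns are placeholders, not your data). Looping over the data frame itself runs once per column, while looping over seq_len(nrow(...)) runs once per row.

```r
# Toy data frame with 3 rows and 2 columns (hypothetical values)
flow_toy <- data.frame(jdays = c(10, 100, 200), temp = c(1.5, 9.2, 21.0))

# Iterating over the data frame: one pass per column
n_iter <- 0
for (col in flow_toy) n_iter <- n_iter + 1

# Iterating over row indices: one pass per row
rows <- integer(0)
for (i in seq_len(nrow(flow_toy))) rows <- c(rows, i)

n_iter  # 2
rows    # 1 2 3
```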
A vectorized solution using ifelse:
transform(flow, season=
ifelse (jdays <= 80 | jdays>355, 1,
ifelse(jdays <= 172,2,
ifelse(jdays <= 264, 3, 4))))
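A quick way to convince yourself that the cut-offs behave as intended is to run the nested ifelse on the boundary days. This is only a sketch on a made-up data frame; flow_toy is a placeholder name.

```r
# Boundary days around each season change (hypothetical test data)
flow_toy <- data.frame(jdays = c(1, 80, 81, 172, 173, 264, 265, 355, 356))

flow_toy$season <- ifelse(flow_toy$jdays <= 80 | flow_toy$jdays > 355, 1,
                   ifelse(flow_toy$jdays <= 172, 2,
                   ifelse(flow_toy$jdays <= 264, 3, 4)))

flow_toy$season
# 1 1 2 2 3 3 4 4 1
```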