Creating a function called clean_data in R

I'm a new guy in the R world and had to create a vector.
data <- rnorm(10, 0, 1)
Next question asked for a loop so I did:
for(i in 1:length(data)){
if(data[i] > 0)
print("positive")
else
print("negative")
}
But now it's asking for:
"Write a function called “clean_data” that takes in a vector of numbers and returns a vector called “ret” of the same length such that ret[i] = 1 if the input vector's ith element was positive, and ret[i] = 0 otherwise. To get started, make a separate R block for your function and use the following shell:"
clean_data <- function(input){
# your code here [...]
# ...
# your code here [...]
return(ret)
}
Professor also recommends reusing the loop from earlier.

I believe this is what you are looking for. Note how ifelse is used inside the function instead of separate if and else statements. The function works with any numeric vector as input; I've added a call that runs it on the data you created earlier. I would add that the purpose of these exercises is really to make us scratch our heads and try to work the problem out for ourselves, so I recommend you persist next time :)
data <- rnorm(10, 0, 1)
for(i in 1:length(data)){
if(data[i] > 0) print("positive") else print("negative")
}
clean_data <- function(input){
ret <- numeric(length(input)) # preallocate the result vector
for(i in seq_along(input)){
ret[i] <- ifelse(input[i] > 0, 1, 0) # note ifelse structure
}
return(ret)
}
clean_data(data)
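For comparison, here is a minimal sketch of the same result without an explicit loop. It assumes the input is a plain numeric vector with no NA values. The name clean_data_vec is hypothetical and not part of the assignment, which asks for the loop version.

```r
# Hypothetical vectorized variant of clean_data (not the assignment's required form)
clean_data_vec <- function(input) {
  # input > 0 gives a logical vector; as.numeric maps TRUE to 1 and FALSE to 0
  ret <- as.numeric(input > 0)
  return(ret)
}

clean_data_vec(c(-1.2, 0, 3.4))
# [1] 0 0 1
```

Zero is treated as "not positive" here, matching the strict comparison used in the loop.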

Related

How to write a function containing a 'for' loop that uses a different function within? Applying this function to vectors?

I think I am misunderstanding some fundamental part of how 'for' loops and functions work. This function:
even.odd <- function(x) {
if (x != round(x)) {
y <- NA
} else if (x %% 2 == 0) {
y <- "even"
} else {
y <- "odd"
}
return(y)
}
works perfectly fine, returning "even", "odd", or NA given a number. However, I am given two vectors:
test1 <- c(-1, 0, 2, 5)
test2 <- c(0, 2.7, 9.1)
and need to create a 'for' loop containing the even.odd function and test it using these vectors. I have read all the recommended reading/lecture notes, and tried for hours unsuccessfully to produce the desired result, using empty vectors, indexing new objects, just putting even.odd(num.vec). I'm not sure where I've gone so wrong.
We were given a starting point for the new function:
even.odd.vec <- function(num.vec) {
#write your code here
}
So far this is what I have come up with:
#creating intermediate function
intermediate <- function(num.vec) {
if (even.odd(num.vec) == "odd") {
return("odd")
} else if(even.odd(num.vec) == "even") {
return("even")
} else ifelse(is.na(num.vec), NA, "False")
}
#creating new desired function
even.odd.vec <- function(num.vec) {
for (i in seq_along(num.vec)) {
intermediate(num.vec)
print(intermediate(num.vec))
}
}
The intermediate function was the result of me running into various errors when trying to create a simpler body for even.odd.vec. But now when I try to use even.odd.vec with one of the test vectors I get this error:
the condition has length > 1 and only the first element will be used
the condition has length > 1 and only the first element will be used
the condition has length > 1 and only the first element will be used
the condition has length > 1 and only the first element will be used
I am very stuck at this point and dying to know how to make something like this work. I've had a lot of fun working on it but I think I'm digging myself into a hole and/or making things much more complicated than necessary. Any help is unbelievably appreciated as my professor is out of town and the TA seems overwhelmed.
There are definitely ways to do this without a for loop, but since the exercise explicitly asks for one, we can keep the even.odd function, which works for a single number, and write a new function that loops over the vector and calls even.odd on each element individually.
even.odd.vec <- function(x) {
result <- character(length = length(x))
for (i in seq_along(x)) {
result[i] <- even.odd(x[i])
}
return(result)
}
You can then pass vectors test1, test2 to this function.
even.odd.vec(test1)
#[1] "odd" "even" "even" "odd"
even.odd.vec(test2)
#[1] "even" NA NA
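As a sketch of one loop-free alternative mentioned above, nested ifelse calls can classify the whole vector at once. Unlike if, ifelse accepts a condition of length greater than one, which is what the warning in the question was about. The function name even.odd.vectorized is hypothetical.

```r
# Hypothetical loop-free variant; ifelse() works element-wise on the whole vector
even.odd.vectorized <- function(num.vec) {
  ifelse(
    num.vec != round(num.vec),
    NA_character_,                           # non-integers become NA
    ifelse(num.vec %% 2 == 0, "even", "odd") # integers are classified by remainder
  )
}

test1 <- c(-1, 0, 2, 5)
test2 <- c(0, 2.7, 9.1)
even.odd.vectorized(test1)  # "odd" "even" "even" "odd"
even.odd.vectorized(test2)  # "even" NA NA
```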

R for loop returns NULL instead of expected values

i<-c(1:44)
diff_arbeitnehmer <- for(x in i){if(x == 44) {diff_arbeitnehmer[x] <- 0} else{diff_arbeitnehmer[x] <- 100/erwerbstaetige[x,2]*erwerbstaetige[x,4]-100/erwerbstaetige[x+1,2]*erwerbstaetige[x+1,4]}}
My data frame has 44 entries.
I am running this as an R script. Could someone tell me what the reason could be?
I am lost with this.
I can't run your code because I don't have your data frame, but the likely reason is that you are assigning the for loop itself to the variable diff_arbeitnehmer, and a for loop always returns NULL. I made this change and hope that it works now:
i<-c(1:44)
diff_arbeitnehmer <- c()
for(x in i){
if(x == 44){
diff_arbeitnehmer[x] <- 0
} else{
diff_arbeitnehmer[x] <- 100/erwerbstaetige[x,2]*erwerbstaetige[x,4]-100/erwerbstaetige[x+1,2]*erwerbstaetige[x+1,4]
}
}
One piece of advice: check whether the formula in the else branch computes what you intend; you may need to add some parentheses.
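A small self-contained sketch of the same idea follows. The data frame below is a made-up stand-in for erwerbstaetige (only columns 2 and 4 matter, and all column names and values are invented), so treat it as an illustration under those assumptions rather than the asker's actual data.

```r
# Toy stand-in for the asker's data frame; all names and values are invented
erwerbstaetige <- data.frame(
  jahr            = 2001:2005,
  gesamt          = c(100, 110, 120, 125, 130),
  selbststaendige = c(10, 11, 12, 12, 13),
  arbeitnehmer    = c(90, 99, 108, 113, 117)
)

# A for loop evaluates to NULL, so assigning it captures nothing
captured <- for (x in 1:3) x * 2
is.null(captured)  # TRUE

# Preallocate the result; the last element keeps its initial value of 0
n <- nrow(erwerbstaetige)
diff_arbeitnehmer <- numeric(n)
for (x in seq_len(n - 1)) {
  diff_arbeitnehmer[x] <- 100 / erwerbstaetige[x, 2] * erwerbstaetige[x, 4] -
    100 / erwerbstaetige[x + 1, 2] * erwerbstaetige[x + 1, 4]
}

# The same computation without a loop, for comparison
anteil <- 100 / erwerbstaetige[, 2] * erwerbstaetige[, 4]
diff_vec <- c(anteil[-n] - anteil[-1], 0)
all.equal(diff_arbeitnehmer, diff_vec)
```

Looping over seq_len(n - 1) and preallocating with numeric(n) removes the need for the special case on the last row.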

R - Saving the values from a For loop in a vector or list

I'm trying to save each iteration of this for loop in a vector.
for (i in 1:177) {
a <- geomean(er1$CW[1:i])
}
Basically, I have a list of 177 values and I'd like the script to find the cumulative geometric mean of the list going one by one. Right now it will only give me the final value, it won't save each loop iteration as a separate value in a list or vector.
The reason your code does not work is that the object a is overwritten in each iteration. The following code, for instance, does precisely what you want:
a <- c()
for(i in 1:177){
a[i] <- geomean(er1$CW[1:i])
}
Alternatively, this would work as well:
for(i in 1:177){
if(i != 1){
a <- rbind(a, geomean(er1$CW[1:i]))
}
if(i == 1){
a <- geomean(er1$CW[1:i])
}
}
I started down a similar path with rbind as @nate_edwinton did, but couldn't figure it out. I did, however, come up with something effective. Hmmmm, geo_mean. Cool. Coerce back to a list.
MyNums <- data.frame(x=(1:177))
a <- data.frame(x=integer())
for(i in 1:177){
a[i,1] <- geomean(MyNums$x[1:i])
}
a<-as.list(a)
You can try defining a variable to hold the results first:
b <- c()
for (i in 1:177) {
a <- geomean(er1$CW[1:i])
b <- c(b,a)
}
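The approaches above can be compared in a self-contained sketch. The question does not show where geomean comes from, so the definition below is an assumption (the usual exp of the mean of logs, valid for positive values), and cw is a made-up stand-in for er1$CW.

```r
# Assumed definition; the original geomean may come from a package
geomean <- function(x) exp(mean(log(x)))

# Made-up stand-in for er1$CW
cw <- c(1.02, 0.98, 1.05, 1.01, 0.99)

# Preallocate the result, then fill one element per iteration
a <- numeric(length(cw))
for (i in seq_along(cw)) {
  a[i] <- geomean(cw[1:i])
}

# Loop-free equivalent: running sum of logs divided by the running count
a_vec <- exp(cumsum(log(cw)) / seq_along(cw))
all.equal(a, a_vec)
```

Preallocating with numeric() avoids growing the vector on every iteration, which matters more as the input gets longer.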

Trying to vectorize a for loop in R

UPDATE
Thanks to the help and suggestions of @CarlWitthoft, my code was simplified to this:
model <- unlist(sapply(1:length(model.list),
function(i) ifelse(length(model.list[[i]][model.lookup[[i]]] == "") == 0,
NA, model.list[[i]][model.lookup[[i]]])))
ORIGINAL POST
Recently I read an article saying that vectorizing operations in R, instead of using for loops, is good practice. I have a piece of code that uses a big for loop, and I'm trying to turn it into a vectorized operation, but I cannot find the answer. Could someone help me? Is it possible, or do I need to change my approach? My code works fine with the for loop, but I want to try the other way.
model <- c(0)
price <- c(0)
size <- c(0)
reviews <- c(0)
for(i in 1:length(model.list)) {
if(length(model.list[[i]][model.lookup[[i]]] == "") == 0) {
model[i] <- NA
} else {
model[i] <- model.list[[i]][model.lookup[[i]]]
}
if(length(model.list[[i]][price.lookup[[i]]] == "") == 0) {
price[i] <- NA
} else {
price[i] <- model.list[[i]][price.lookup[[i]]]
}
if(length(model.list[[i]][reviews.lookup[[i]]] == "") == 0) {
reviews[i] <- NA
} else {
reviews[i] <- model.list[[i]][reviews.lookup[[i]]]
}
size[i] <- product.link[[i]][size.lookup[[i]]]
}
Basically the model.list variable is a list from which I want to extract a particular vector, the location from that vector is given by the variables model.lookup, price.lookup and reviews.lookup which contain logical vectors with just one TRUE value which is used to return the desired vector from model.list. Then every cycle of the for loop the extracted vectors are stored on variables model, price, size and reviews.
Could this be changed to a vector operation?
In general, try to avoid if when not needed. I think your desired output can be built as follows.
model <- unlist(sapply(1:length(model.list), function(i) model.list[[i]][model.lookup[[i]]]))
model[model=='']<-NA
And the same for your other variables. This assumes that all model.lookup[[i]] are of length one. If they aren't, you won't be able to write the output to a single element of model in the first place.
I would also note that you are grossly overcoding, e.g. x<-0 is better than x<-c(0), and don't bother with length evaluation on a single item.
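One way to handle the "no match" case explicitly is sketched below with mapply. The helper extract_one and the toy lists are hypothetical, and the sketch assumes each lookup is a logical vector with at most one TRUE. Returning NA for an empty match keeps the result the same length as model.list, whereas unlist silently drops zero-length elements.

```r
# Hypothetical helper: return the matched element, or NA if nothing matched
extract_one <- function(values, lookup) {
  hit <- values[lookup]
  if (length(hit) == 0) NA_character_ else hit[1]
}

# Toy data standing in for the scraped product lists
model.list   <- list(c("A1", "$10", "5 reviews"), c("B2", "$20"), c("$30"))
model.lookup <- list(c(TRUE, FALSE, FALSE), c(TRUE, FALSE), c(FALSE))
price.lookup <- list(c(FALSE, TRUE, FALSE), c(FALSE, TRUE), c(TRUE))

# mapply walks both lists in parallel and simplifies to a character vector
model <- mapply(extract_one, model.list, model.lookup, USE.NAMES = FALSE)
price <- mapply(extract_one, model.list, price.lookup, USE.NAMES = FALSE)

model  # "A1" "B2" NA
price  # "$10" "$20" "$30"
```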

Putting Results of for loop into a data frame in R

I have a function that will return a numeric class object, here is a simplified version:
count.matches <- function(x) {
ifelse(!is.na(a[,x] & a[,2]),1,0)
}
it just produces an object of 0s and 1s.
For example
count.matches(4)
[1] 0 0 0 0 1 1 0
I just want to do a simple for loop of this function and store the results in a data frame, i.e. each time the function loops through create a column, however I am having trouble.
p <- data.frame()
my.matches <- for(i in 2:100) {
p[i,] <- count.matches(i)
}
This produces nothing. Sorry if this is a really stupid question, but I have tried a bunch of things and nothing seems to work. If you need any more information, please let me know and I will provide it.
A for loop does not return the last value or combine the results of its iterations; it returns NULL.
With for, you have to create the data.frame first and then fill it with the results:
my.matches <- as.data.frame(matrix(0, nrow(a), ncol(a) - 1))
for (i in 2:100) {
my.matches[,i] <- count.matches(i)
}
Alternatively, you could use the foreach package, which provides the functionality you're expecting from for.
library(foreach)
my.matches <- foreach(i = 2:100, .combine=data.frame) %do% {
count.matches(i)
}
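A base R alternative is sketched below with sapply, which collects one column per call into a matrix that can then be converted to a data frame. The data frame a and its column names are made up for illustration, since the original data is not shown.

```r
# Toy stand-in for the asker's data; column 2 is the reference column
a <- data.frame(
  id  = 1:4,
  ref = c(1, NA, 1, 1),
  s1  = c(1, 1, NA, 1),
  s2  = c(NA, NA, 1, 1)
)

# Function from the question: 1 where both columns are non-missing, else 0
count.matches <- function(x) {
  ifelse(!is.na(a[, x] & a[, 2]), 1, 0)
}

# One result column per input column; sapply binds them into a matrix
cols <- 3:ncol(a)
my.matches <- as.data.frame(sapply(cols, count.matches))
names(my.matches) <- names(a)[cols]
my.matches
```

Each column of my.matches corresponds to one call of count.matches, so no empty data frame has to be grown inside a loop.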

Resources