I've written the following R code, which prints the row number and column name of each missing value. Is there a way to turn it into a function? This is probably ridiculously easy, but I am very new to this. If I were to create a function from the code below, where would I put the "function_name <-"?
for (i in 1:nrow(airbnb)){
rownum <- i
#print(rownum)
for (j in 1:ncol(airbnb)){
colname <- names(airbnb[,j])
#airbnb[i,j]
if(is.na(airbnb[i,j])){
print(paste("Row Number:",i))
print(paste("Column Name:",colname))
}
}
}
I think this is what you're looking for:
You can name your function whatever you want; here it is called missing_func.
The argument name x is also arbitrary. If you prefer something more descriptive, such as df or dataframe, rename every occurrence of x inside the function:
missing_func <- function(x){
  for (i in seq_len(nrow(x))){
    for (j in seq_len(ncol(x))){
      # names(x)[j] works for any data frame; names(x[,j]) is NULL when x[,j] is a plain vector
      colname <- names(x)[j]
      if(is.na(x[i,j])){
        print(paste("Row Number:",i))
        print(paste("Column Name:",colname))
      }
    }
  }
}
Now to call the function above, you just need to supply a value for x (or whatever you choose)
missing_func(airbnb)
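The nested loops can also be replaced by a vectorised lookup. The following is a minimal sketch, not the asker's code: find_missing and the small example data frame are made-up names for illustration. It relies on which(..., arr.ind = TRUE), which returns the row and column index of every TRUE entry in a matrix.

```r
# Hypothetical helper: returns one row per missing value instead of printing.
find_missing <- function(x) {
  # is.na() on a data frame gives a logical matrix; arr.ind = TRUE
  # turns the TRUE positions into (row, col) index pairs.
  pos <- which(is.na(x), arr.ind = TRUE)
  data.frame(
    row = unname(pos[, "row"]),
    column = names(x)[pos[, "col"]],
    stringsAsFactors = FALSE
  )
}

# Made-up example data with two missing values.
example_df <- data.frame(a = c(1, NA, 3), b = c("x", "y", NA))
missing_cells <- find_missing(example_df)
print(missing_cells)
```

Returning a data frame rather than printing makes the result easy to filter or count afterwards.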
I'm writing an if condition that matches values from two different columns and, if they match, assigns a value to another column. When I write the statement
for (l in 1:k) {
for(i in 1:n) {
if(y_related[i,2]==con_f[l]) {
y_out[l]=y_related[i,1]
}
}
}
then it doesn't work! But if I replace con_f[l] with its numerical value, say 0.004, then it works. I want this to run automatically, as I don't want to write the numerical value every time.
detailed example:
y_related=matrix(NA,1000,2)
y_related[,1]=rnorm(1000,5,10)
y_related[,2]=rank(y_related[,1])/1000
con_f=matrix(NA,250,1)
for(x in 1:250) {
con_f[x]=(1-((x-1)/250))
}
y_out=matrix(NA,250,1)
for (l in 1:250) {
for(i in 1:1000) {
if(y_related[i,2]==con_f[l]) {
y_out[l]=y_related[i,1]
}
}
}
This sounds like a job for the function merge
# Turn them in to data frames, and rename to sensible names
y_related_df <- as.data.frame(y_related)
names(y_related_df) <- c("value", "variable")
con_f_df <- as.data.frame(con_f)
names(con_f_df) <- c("variable")
# merge allows you to join on a vector of variables, and then move the data from
# y in to x as we've done a left join (all.x=TRUE)
output <- merge(x=con_f_df, y=y_related_df, by="variable", all.x=TRUE)
output
If you're not familiar with the different types of joins, have a look at this Stack Overflow post.
merge is very useful; I use it all the time.
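A likely reason the original == comparison often fails is floating-point rounding: rank(...)/1000 and 1-((x-1)/250) can differ in their last bits even when they print the same, and the same caveat may apply when joining on a double-valued key. One possible workaround, sketched below under the assumption that the ranks have no ties, is to compare whole-number ranks instead of fractions. The names values, ranks and target_ranks are made up for illustration.

```r
set.seed(42)
values <- rnorm(1000, 5, 10)

# Doubles that look equal on paper are not always equal in binary.
0.1 + 0.2 == 0.3                    # FALSE
isTRUE(all.equal(0.1 + 0.2, 0.3))   # TRUE (comparison with a tolerance)

# Work with whole-number ranks rather than rank / 1000.
ranks <- rank(values)

# con_f[l] = 1 - (l - 1) / 250 corresponds to rank 1000 * con_f[l] = 4 * (251 - l).
target_ranks <- 4 * (251 - 1:250)

# For each target rank, pick the value that holds that rank.
y_out <- values[match(target_ranks, ranks)]
```

Because every key is a whole number stored exactly, match finds all 250 targets without any tolerance handling.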
I'm trying to save each iteration of this for loop in a vector.
for (i in 1:177) {
a <- geomean(er1$CW[1:i])
}
Basically, I have a list of 177 values and I'd like the script to find the cumulative geometric mean of the list going one by one. Right now it will only give me the final value, it won't save each loop iteration as a separate value in a list or vector.
The reason your code does not work is that the object a is overwritten in each iteration. The following code, for instance, does precisely what you want:
a <- c()
for(i in 1:177){
a[i] <- geomean(er1$CW[1:i])
}
Alternatively, this would work as well:
for(i in 1:177){
if(i != 1){
a <- rbind(a, geomean(er1$CW[1:i]))
}
if(i == 1){
a <- geomean(er1$CW[1:i])
}
}
I started down a similar path with rbind as @nate_edwinton did, but couldn't figure it out. I did, however, come up with something that works: fill a data frame row by row, then coerce it to a list.
MyNums <- data.frame(x=(1:177))
a <- data.frame(x=integer())
for(i in 1:177){
a[i,1] <- geomean(MyNums$x[1:i])
}
a<-as.list(a)
You can first define a variable to hold the results, then append to it in each iteration:
b <- c()
for (i in 1:177) {
a <- geomean(er1$CW[1:i])
b <- c(b,a)
}
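The loop can also be avoided altogether. The thread does not say which package geomean comes from, so the sketch below defines a stand-in that assumes the usual definition exp(mean(log(x))) for positive numbers; cw is made-up data standing in for er1$CW.

```r
# Stand-in for the geomean() used above (assumption: positive inputs).
geomean <- function(x) exp(mean(log(x)))

# Hypothetical data in place of er1$CW.
cw <- c(2, 8, 4, 1, 16)

# One value per prefix of cw, collected by sapply.
cum_gm <- sapply(seq_along(cw), function(i) geomean(cw[1:i]))

# Equivalent vectorised form: running mean of the logs, then exponentiate.
cum_gm_fast <- exp(cumsum(log(cw)) / seq_along(cw))

print(cum_gm)
```

The cumsum form does a single pass over the data, whereas the sapply form recomputes each prefix from scratch; for 177 values either is fast enough.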
I have the following code:
df<- iris
library(svDialogs)
columnFunction <- function (x) {
column.D <- dlgList(names(x), multiple = T, title = "Spalten auswaehlen")$res
if (!length((column.D))) {
cat("No column selected\n")
} else {
cat("The following columns are choosen:\n")
print(column.D)
for (z in column.D) {
x[[z]] <- NULL #with this part I wanted to delete the above selected columns
}
}
}
columnFunction(df)
So how is it possible to address data.frame columns "dynamically" so: x[[z]] <- NULL should translate to:
df$Species <- NULL
df[["Species"]] <- NULL
df[,"Species"] <- NULL
and that for every selected column in every data.frame chosen for the function.
Does anyone know how to achieve something like that? I tried several things, such as paste, sprintf and deparse, but I didn't get it working. I also tried to address the data.frame as a global variable by using <<-, but that didn't help either (it's the first time I've even heard of it). It looks like I'm missing the right method for passing x and z to the assignment.
If you want to create a function columnFunction that removes columns from a passed data frame df, all you need to do is pass the data frame to the function, return the modified version of df, and replace df with the result:
library(svDialogs)
columnFunction <- function (x) {
  column.D <- dlgList(names(x), multiple = TRUE, title = "Spalten auswaehlen")$res
  if (!length(column.D)) {
    cat("No column selected\n")
  } else {
    cat("The following columns are chosen:\n")
    print(column.D)
    # drop = FALSE keeps the result a data frame even if only one column remains
    x <- x[, !names(x) %in% column.D, drop = FALSE]
  }
  return(x)
}
df <- columnFunction(df)
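The underlying point is that a function works on its own copy of the data frame, so changes made inside it are lost unless the result is returned and assigned. The following is a minimal, non-interactive sketch of that behaviour; drop_columns is a hypothetical helper that takes the column names as an argument instead of asking through a dialog.

```r
# Hypothetical helper: remove the named columns and return the result.
drop_columns <- function(x, cols) {
  # drop = FALSE keeps a data frame even when a single column is left.
  x[, !names(x) %in% cols, drop = FALSE]
}

df <- iris
trimmed <- drop_columns(df, c("Species", "Sepal.Width"))

names(trimmed)  # the remaining columns
names(df)       # df itself is unchanged until it is reassigned
```

Writing df <- drop_columns(df, ...) instead would overwrite the original, which is what the last line of the answer does.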
I'm a novice R user and have created a small script that does some trigonometry with movement data. I need to add a final column that removes repeated values from the column before it.
I've tried adding an if else statement that seems to work in isolation, but I keep getting errors when it is put into the for loop. I'd appreciate any advice.
# trig loop
list.df <- vector("list", max(Sp_test$ID))
names1 <- c(1:max(Sp_test$ID))
for(i in 1:max(Sp_test$ID)) {
if(i %in% unique(Sp_test$ID)) {
idata <- subset(Sp_test, ID == i)
idata$originx <- idata[1,3]
idata$originy <- idata[1,4]
idata$deltax <- idata[,"UTME"]-idata[,"originx"]
idata$deltay <- idata[,"UTMN"]-idata[,"originy"]
idata$length <- sqrt((idata[,"deltax"])^2+(idata[,"deltay"]^2))
idata$arad <- atan2(idata[,"deltay"],idata[,"deltax"])
idata$xnorm <- idata[,"deltax"]/idata[,"length"]
idata$ynorm <- idata[,"deltay"]/idata[,"length"]
sumy <- sum(idata$ynorm, na.rm=TRUE)
sumx <- sum(idata$xnorm, na.rm=TRUE)
idata$vecsum <- atan2(sumy,sumx)
idata$width <- idata$length*sin(idata$arad-idata$vecsum)
# need if else statement excluding a repeat from the position just before it
list.df[[i]] <- idata
names1[i] <- i
} }
# this works alone, I think the problem is when it gets to the first of the dataset and there is not one before it
if (idata$width[j]==idata$width[j-1]) {
print("NA")
} else {
print(idata$width[j])
}
I think you want to use the function diff for this. diff(idata$width) will give the differences between successive values of idata$width. Then
idata$width[c(FALSE, diff(idata$width) == 0)] <- NA
I think this does what you want. The leading FALSE is needed because diff returns one fewer element than its input: as you rightly noted, the first element doesn't have an element before it to compare against.
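To see the effect on a small case, here is a sketch with a made-up width vector in place of idata$width:

```r
# Made-up values standing in for idata$width.
width <- c(2.5, 2.5, 3.1, 3.1, 3.1, 4.0, 2.5)

# TRUE wherever a value equals the one immediately before it.
is_repeat <- c(FALSE, diff(width) == 0)

# Blank out the consecutive repeats.
width[is_repeat] <- NA
print(width)
# 2.5 NA 3.1 NA NA 4.0 2.5
```

Only consecutive repeats are replaced: the final 2.5 is kept because the value before it is 4.0.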
I have a function in which I want to return a different object per column of a matrix. However, I don't know how to make the return such that it identifies how many variables to create within a list given that it will be conditioned to the number of columns of the input matrix. In other words, how do I change the last command in the following function:
f <- function(Treat) {
for (i in c(1:ncol(Treat))) {
assign(paste0("Treat",i), as.matrix(Treat[,i]))
}
return(dat = list(Treat1=Treat1 , Treat2=Treat2, .....Treatn=Treatn))
}
lapply is what you are looking for
f <- function(Treat){
lapply(1:ncol(Treat), function(i) as.matrix(Treat[,i]))
}
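If the list elements should carry the names Treat1, Treat2, and so on, as in the question's return statement, they can be attached with setNames. The following is a sketch only; split_columns and the example matrix are made up for illustration.

```r
# Hypothetical variant of f that also names the list elements.
split_columns <- function(Treat) {
  idx <- seq_len(ncol(Treat))
  # drop = FALSE keeps each column as a one-column matrix.
  cols <- lapply(idx, function(i) Treat[, i, drop = FALSE])
  setNames(cols, paste0("Treat", idx))
}

# Made-up 3 x 2 input matrix.
Treat <- matrix(1:6, nrow = 3)
result <- split_columns(Treat)
names(result)
dim(result$Treat2)
```

The number of list elements follows ncol(Treat) automatically, so nothing needs to be spelled out in the return value.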
lapply works nicely here. But you could also do it with an explicit loop (as per your original example) that assigns to a (pre-initialised) list as it goes:
empty_list <- list() # initialise an empty list
f <- function(Treat, empty_list) { # set up the function
  for (i in seq_len(ncol(Treat))) { # loop over the columns
    empty_list[[i]] <- as.matrix(Treat[,i]) # write each column to a new element
  }
  return(empty_list) # return the list
}
You could then use this with:
full_list <- f(Treat, empty_list)