How to write if/else statements if a dataframe is empty in R

I am trying to do the following:
If there is nothing in the dataframe, print "no_match".
If there is something, bind it to the ID of dataframe df2:
if(df == []){
print("nomatch")
}else{
cbind(df, df2$id2)
}

You can get the dimensions of your data frame with dim. For example, running the code:
data(mtcars)
dim(mtcars)
will show you the dimensions:
[1] 32 11
For a NULL object you would get:
mtcars <- NULL
dim(mtcars)
NULL
dim also handles a data.frame that has columns but no rows. Reload the data first, since mtcars was set to NULL above:
data(mtcars)
mtcars <- mtcars[-c(1:dim(mtcars)[1]),]
you will get
> dim(mtcars)
[1] 0 11
IF statements
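Constructing if statements is very simple. Depending on what you want to check, you can do the following.
Object is NULL
The object is NULL: no rows and no columns. Comparing with == NULL returns logical(0), which if() cannot evaluate, so use is.null() instead.
if (is.null(df)) {
}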
No rows
This data frame has columns but no observations.
if (dim(df)[1] == 0) {
}
No columns
The object is still of class data.frame but has no columns.
if (dim(df)[2] == 0) {
}
You would construct the object like that (if of interest):
data(mtcars)
mtcars <- mtcars[,-c(1:dim(mtcars)[2])]
Naturally, you can combine these conditions to check whether the data frame is empty in either or both senses.
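As a minimal sketch of combining the three checks, the helper below (is_empty_df is a hypothetical name, not part of base R) treats NULL, a zero-row data frame, and a zero-column data frame as empty. It assumes the input is either NULL or a data frame; df2 and its id2 column are made up here to mirror the question.

```r
# Hypothetical helper: TRUE for NULL, for zero rows, or for zero columns.
# Assumes x is either NULL or a data frame.
is_empty_df <- function(x) {
  is.null(x) || nrow(x) == 0 || ncol(x) == 0
}

df  <- mtcars[0, ]                    # columns but no rows
df2 <- data.frame(id2 = integer())    # stand-in for the question's df2

# Same branching as in the question, stored in a variable instead of printed.
res <- if (is_empty_df(df)) "nomatch" else cbind(df, id2 = df2$id2)
print(res)
```

The || operator short-circuits, so nrow() is never called on NULL.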

It depends: is your data.frame actually empty, or are all its elements something you consider empty?
If the data.frame is empty you can use nrow as a simple check.
tmp <- data.frame(A = numeric())
nrow(tmp)
[1] 0
if(nrow(tmp) == 0){
print("data.frame is empty")
}else{
print("data.frame contains data")
}
EDIT - OP asks about object existence
You can check if an object has been defined with exists
exists("tmp2")
[1] FALSE
exists("tmp")
[1] TRUE
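If the object might not exist at all, the existence check and the row check can be chained. The following is only a sketch: has_rows is a hypothetical helper name, and the object names used in the calls are made up for illustration.

```r
# Hypothetical helper: TRUE only if `name` refers to an existing
# data frame with at least one row. && short-circuits, so get() is
# never called for an object that does not exist.
has_rows <- function(name, envir = parent.frame()) {
  exists(name, envir = envir) &&
    is.data.frame(get(name, envir = envir)) &&
    nrow(get(name, envir = envir)) > 0
}

tmp_empty  <- data.frame(A = numeric())   # exists, but has no rows
tmp_filled <- data.frame(A = c(1, 2))     # exists and has rows

has_rows("tmp_empty")
has_rows("tmp_filled")
has_rows("object_that_was_never_defined")
```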

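Does max(dim(df)) == 0 do the trick? Note that it is TRUE only when the data frame has neither rows nor columns; to catch a data frame with columns but no rows, use min(dim(df)) == 0 or nrow(df) == 0 instead.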
if (max(dim(df)) == 0) {
print("nomatch")
} else {
cbind(df, df2$id2)
}

Related

Check if a data frame contains at least one zero value inside an if statement in R

I have a dataframe in R as follows
df <-
as.data.frame(cbind(c(1,2,3,4,5), c(0,1,2,3,4,5),c(1,2,4,5,6)))
and I have a function in which I want the procedure to stop and display a message if the input df contains at least one 0 value. I tried the following but can't make it work properly. What is the correct if() statement I should use?
my_function <- function(df){
if (all(df == 0) == 'TRUE')
stop(paste("invalid input df"))
}
We could use %in%
my_function <- function(df) {
if(0 %in% unlist(df)) {
stop("invalid input df")
}
}
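The attempt in the question uses all(), which is TRUE only when every value is 0, whereas the requirement is "at least one", which is what any() expresses. A minimal sketch of that alternative follows; has_zero is a hypothetical helper name, and the two small data frames are made up for the example.

```r
# Hypothetical helper: TRUE if at least one cell equals 0.
# na.rm = TRUE ignores NA cells instead of returning NA.
has_zero <- function(df) {
  any(df == 0, na.rm = TRUE)
}

my_function <- function(df) {
  if (has_zero(df)) {
    stop("invalid input df: contains at least one 0")
  }
  invisible(df)
}

ok  <- data.frame(a = c(1, 2, 3), b = c(4, 5, 6))
bad <- data.frame(a = c(1, 0, 3), b = c(4, 5, 6))

has_zero(ok)
has_zero(bad)
```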

R Assign values from a data.frame to a list

I have a list containing some basic characteristics of factories (like capacity, turnover). All values set initially to NULL:
#My List:
list.var <- list(Capacity = NULL, Production = NULL)
list <- list(Factory1 = list.var, Factory2 = list.var)
> list
$Factory1
$Factory1$Capacity
NULL
$Factory1$Production
NULL
$Factory2
$Factory2$Capacity
NULL
$Factory2$Production
NULL
I also have data frames that contain the "missing" values for each characteristic separately, for all factories, like this:
> #My Data Frame:
> df.capacity <- data.frame(Factory = c("Factory1", "Factory2"), Capacity = c(100,200))
> df.capacity
Factory Capacity
1 Factory1 100
2 Factory2 200
I want to assign the capacity values in df.capacity to the corresponding factory in my list. The result should look like this:
$Factory1
$Factory1$Capacity
[1] 100
$Factory1$Production
NULL
$Factory2
$Factory2$Capacity
[1] 200
$Factory2$Production
NULL
How can I do this? (note that I have multiple factories and even more characteristics, thus I should do it automatically each time like left join in case of data frames). I tried to convert the data frame to a list and then combine with the original one, but it didn't work for me.
From base R, you could also do:
modifyList(list, split(df.capacity[-1], df.capacity[1]))
$Factory1
$Factory1$Capacity
[1] 100
$Factory1$Production
NULL
$Factory2
$Factory2$Capacity
[1] 200
$Factory2$Production
NULL
We could match to get the corresponding values and then do the assignment
library(purrr)
imap(list, ~ {
.x$Capacity <- df.capacity$Capacity[match(.y, df.capacity$Factory)]
.x})
Or with Map from base R
Map(function(x, y) {
x$Capacity <- df.capacity$Capacity[match(y, df.capacity$Factory)]
x
},
list, names(list))
Output:
$Factory1
$Factory1$Capacity
[1] 100
$Factory1$Production
NULL
$Factory2
$Factory2$Capacity
[1] 200
$Factory2$Production
NULL
Or using a for loop
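for (i in seq_along(df.capacity$Factory)) {
  # as.character() guards against Factory being a factor (the default before R 4.0)
  list[[as.character(df.capacity$Factory[i])]]$Capacity <- df.capacity$Capacity[i]
}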
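Since the question mentions several characteristics, here is a sketch of how the modifyList approach might extend to a data frame with more than one value column. The names factories and df.values, and the Production figures, are invented for illustration.

```r
# Nested list with every characteristic initially NULL.
list.var  <- list(Capacity = NULL, Production = NULL)
factories <- list(Factory1 = list.var, Factory2 = list.var)

# Hypothetical data frame carrying two characteristics at once.
df.values <- data.frame(
  Factory    = c("Factory1", "Factory2"),
  Capacity   = c(100, 200),
  Production = c(10, 20)
)

# split() yields one single-row data frame per factory. A data frame is
# a list, so modifyList() recurses into it and copies each column value
# into the matching element of the nested list.
updated <- modifyList(factories, split(df.values[-1], df.values$Factory))
str(updated)
```

Factories that appear in the list but not in the data frame are left unchanged, which is roughly the left-join behaviour asked for.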

For Loop to rename objects in a data frame (+ ignore NA) in R

I have a data frame that contains a column with binary variables (pointed or broad). To do my calculations I need to replace them with 0 or 1. I want to write a for loop which is doing this for me.
My code:
binary_To_Number<-function(df)
{
for(i in df)
{
if(i=="pointed")
{
i<-1
}
else if(i=="broad")
{
i<-0
}
else if(is.na(i))
{
print("NA")
}
else
{
}
}
}
binary_To_Number(town$shape)
I tried to use this piece of code. My first problem with it is that I don't know how to save the results. So my code is changing the i temporarily but won't save it in the df. I know that you can create an empty storage vector to store results in it, but can I replace the variable in my df immediately?
The second problem is that my code stops and gives me an error message if it comes to an i which contains NA.
Error in if (i == "pointed") { : missing value where TRUE/FALSE needed
Is there something I can do about it or do I need to replace the NA with a placeholder first?
You can also use dplyr. Comparing against 'pointed' maps every other non-missing value to 0, and NA stays NA:
library(dplyr)
df <- df %>%
mutate(
isPointed = as.integer(tolower(shape) == 'pointed')
)
Output:
shape isPointed
1 Pointed 1
2 broad 0
3 pointed 1
The dataframe I used:
df <- data.frame(
shape = c('Pointed', 'broad', 'pointed'),
stringsAsFactors = FALSE
)
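For comparison, a base R sketch without a loop: the error in the question arises because NA == "pointed" is NA, which if() cannot evaluate, and assigning to i inside the loop only changes a copy. A named lookup vector sidesteps both problems. The vector name codes and the sample data are assumptions made for this example.

```r
# Sample column including a missing value (made up for illustration).
shape <- c("pointed", "broad", NA, "Pointed")

# Hypothetical lookup table: label -> number.
codes <- c(pointed = 1L, broad = 0L)

# Indexing by name is vectorised; NA (and any unknown label) yields NA.
shape_num <- unname(codes[tolower(shape)])
shape_num
```

To store the result in the data frame directly, the same expression can be assigned back to the column, for example town$shape <- unname(codes[tolower(town$shape)]).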

Delete data frame column within function

I have the following code:
df<- iris
library(svDialogs)
columnFunction <- function (x) {
column.D <- dlgList(names(x), multiple = T, title = "Spalten auswaehlen")$res
if (!length((column.D))) {
cat("No column selected\n")
} else {
cat("The following columns are choosen:\n")
print(column.D)
for (z in column.D) {
x[[z]] <- NULL #with this part I wanted to delete the above selected columns
}
}
}
columnFunction(df)
So how is it possible to address data.frame columns "dynamically" so: x[[z]] <- NULL should translate to:
df$Species <- NULL
df[["Species"]] <- NULL
df[,"Species"] <- NULL
and that for every selected column in every data.frame chosen for the function.
Does anyone know how to achieve something like that? I tried several things, such as paste, sprintf, and deparse, but I didn't get it working. I also tried to address the data.frame as a global variable by using <<-, but that didn't help either. (It's the first time I have even heard of that.) It looks like I am missing the right method for passing x and z to the variable assignment.
If you want to create a function columnFunction that removes columns from a passed data frame df, all you need to do is pass the data frame to the function, return the modified version of df, and replace df with the result:
library(svDialogs)
columnFunction <- function (x) {
column.D <- dlgList(names(x), multiple = T, title = "Spalten auswaehlen")$res
if (!length((column.D))) {
cat("No column selected\n")
} else {
cat("The following columns are choosen:\n")
print(column.D)
# drop = FALSE keeps a data.frame even if only one column remains
x <- x[, !names(x) %in% column.D, drop = FALSE]
}
return(x)
}
df <- columnFunction(df)
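The same pattern can be tried without the dialog. The sketch below uses a hypothetical drop_columns helper that takes the column names as an argument, which makes it clear that the function works on a copy and that the caller has to assign the result back.

```r
# Hypothetical non-interactive variant: remove the named columns and
# return the modified copy.
drop_columns <- function(x, cols) {
  x[, !names(x) %in% cols, drop = FALSE]
}

df <- iris
drop_columns(df, "Species")   # result is discarded, df is unchanged
ncol(df)

df <- drop_columns(df, c("Species", "Sepal.Width"))   # assign back to keep it
names(df)
```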

How to create a sorted vector in r

I have a list of elements in a random order. I want to read each element of this data one at a time and insert into other list in a sorted order. I wonder how to do this in R. I tried the below code.
lst=list()
x=c(2,3,1,4,5)
for(i in 1:length(x)) ## for reading the elements from x
{
if(lst==NULL)
{
lst=x[i]
}
else
{
lst=x[i]
print(lst)
for(k in 2: length(lst)) ## For sorting the elements in a list
{
value = lst[k]
j=k-1
while(j>=1 && lst[j]>value)
{
lst[j+1] = lst[j]
j= j-1
}
lst[j+1] = value
}
}
print(lst)
}
But I get the error:
error in if (lst == NULL) { : argument is of length zero.
For big datasets with lots of columns, you can use do.call
df1 <- df[do.call(order, df),]
Checking the order by specifying the column names,
df2 <- df[with(df, order(V1, V2, V3, V4)),]
identical(df1,df2)
#[1] TRUE
If you need to order in the reverse direction
df[do.call(order, c(df,decreasing=TRUE)),]
data
set.seed(24)
df <- as.data.frame(matrix(sample(letters,10*4,replace=TRUE),ncol=4))
First off, as commenters have pointed out, you could use sort or order. But I believe you are trying to solve an assignment.
Your problem is a typo. Try executing in a console:
lst <- list()
lst == NULL
The last line evaluates to a zero-length logical vector (logical(0)), which if() cannot interpret as TRUE or FALSE. Instead, you are interested in
is.null(lst)
which will return TRUE or FALSE.
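Putting is.null to work, here is one possible sketch of the read-one-element-and-insert loop from the question. It is not the asker's algorithm: insert_sorted is a hypothetical helper that finds the insertion position by counting smaller-or-equal elements, instead of shifting elements in a while loop.

```r
# Hypothetical helper: insert `value` into an already sorted vector.
insert_sorted <- function(sorted, value) {
  pos <- sum(sorted <= value)          # how many elements stay in front
  append(sorted, value, after = pos)   # after = 0 puts value first
}

x   <- c(2, 3, 1, 4, 5)
out <- NULL                            # nothing inserted yet

for (v in x) {
  if (is.null(out)) {
    out <- v                           # first element starts the vector
  } else {
    out <- insert_sorted(out, v)
  }
}
out
```

Starting from NULL instead of list() keeps the result an ordinary numeric vector, so comparisons such as sorted <= value work element by element.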

Resources