Closed. This question needs debugging details. It is not currently accepting answers.
Edit the question to include desired behavior, a specific problem or error, and the shortest code necessary to reproduce the problem. This will help others answer the question.
Closed 7 years ago.
I have a database that contains "null" values for some elements. I want to replace these "null" values, or delete the rows that contain them, but I could not find a way to do so. I used is.na to find these rows, but it seems that NA and NULL are two different concepts in R. Does anybody know how I can do this?
Thanks
The "null" values are probably stored as character strings. The string "null" is not the same thing as NULL in R, nor is it NA, so neither is.null() nor is.na() will detect it.
You should be able to find these values pretty easily in your dataset using a which statement:
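# Row indices of every cell equal to "null" (the whole "row" column, not just element [1])
todelete <- unique(which(dataset == "null", arr.ind = TRUE)[, "row"])
# Guard against an empty index: dataset[-integer(0), ] would drop every row
newdataset <- if (length(todelete) > 0) dataset[-todelete, ] else dataset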
Alternatively, post the output of str(dataset) so we can see what your dataset looks like and diagnose the problem more precisely.
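As a minimal sketch of both options from the question (replacing or dropping), assuming the "null" entries really are character strings. The data frame and its column names here are hypothetical, made up for illustration:

```r
# Hypothetical data: the string "null" stands in for missing values
dataset <- data.frame(id = 1:4,
                      score = c("10", "null", "7", "null"),
                      stringsAsFactors = FALSE)

# Option 1: turn every "null" string into a real NA
cleaned <- dataset
cleaned[cleaned == "null"] <- NA

# Option 2: drop every row that contains a "null" string
rows_with_null <- rowSums(dataset == "null", na.rm = TRUE) > 0
newdataset <- dataset[!rows_with_null, ]
```

Once the strings are real NA values, is.na() and na.omit() behave as expected on cleaned.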
Closed 2 years ago.
I am trying to use the air dataset from the dummies package.
I tried:
library(dummies)
dumair<-air[c(5:18)]
but that throws the following error:
Error in `[.data.frame`(air, c(5:18)) : undefined columns selected
How can I overcome this?
With a single index and no comma, air[c(5:18)] selects columns 5 to 18, and the error means air does not have that many columns. To select rows and columns from a data frame, specify both, separated by a comma:
df[rows, columns]
To keep all columns, leave the "columns" field empty; likewise, leave the "rows" field empty to keep all rows but only some columns.
I believe you want all columns, but only rows 5 to 18, right?
So doing:
dumair <- air[c(5:18), ]
Should work!
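A small sketch of the difference, using the built-in mtcars data frame as a stand-in for air (an assumption, so that the example runs without the dummies package):

```r
# mtcars is a built-in data frame with 32 rows and 11 columns
rows_5_18  <- mtcars[5:18, ]   # rows 5 to 18, all columns
first_cols <- mtcars[, 1:3]    # all rows, columns 1 to 3

# A single index without a comma selects columns, so asking for
# columns 5 to 18 of an 11-column data frame raises an error
msg <- tryCatch(mtcars[5:18], error = function(e) conditionMessage(e))

dim(rows_5_18)
```

Checking ncol(air) first shows how many columns are actually available to select.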
Closed 4 years ago.
I'm taking logs of some distance variables, for example:
loghospital=log(hospital_2015_distance, base=exp(1))
This works: I get values that I can use in a regression.
However, for my LASSO regression it's better if I specify a dataset.
So I want a data frame of these log values.
Or, better, I want these log values added to my existing data frame, called data.
Any idea how this can be achieved? And if not, what else should I do to achieve the same?
To add it to your data.frame you can use $:
data$loghospital = log(hospital_2015_distance, base=exp(1))
You could also use [[ or [, and you should probably use <- instead of = for assignment:
# Examples:
data[["loghospital"]] <- log(hospital_2015_distance, base=exp(1))
data["loghospital"] <- log(hospital_2015_distance, base=exp(1))
data[, "loghospital"] <- log(hospital_2015_distance, base=exp(1))
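A self-contained sketch, assuming the distances are already a column of data (the values below are made up). Note that log() uses base e by default, so base=exp(1) can be omitted:

```r
# Hypothetical data frame with a distance column
data <- data.frame(hospital_2015_distance = c(1, 10, 100))

# Add the natural log as a new column; log() defaults to base e
data$loghospital <- log(data$hospital_2015_distance)

# The same value with the base spelled out explicitly
explicit <- log(data$hospital_2015_distance, base = exp(1))

names(data)
```

If hospital_2015_distance is a standalone vector rather than a column, it needs to have the same length as nrow(data) for the assignment to line up row by row.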
Closed 5 years ago.
I am trying to convert factors from a data-frame to numeric using the commands
data[] <- lapply (data, function(x) as.numeric(as.character(x))
But R keeps prompting me for more input. What am I doing wrong?
The data frame is named data and it consists of 50 rows and 2 columns. Will this command convert every variable to numeric? Or should I do something else?
screenshot after using 'dput' at http://imgur.com/Sde9QSk.png
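You are missing a closing ) at the end of your code. The lapply( call is never closed, so R shows the + continuation prompt and waits for the rest of the expression.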
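With the parentheses balanced, the conversion can be sketched on a small made-up data frame (the columns a and b are hypothetical):

```r
# Hypothetical data frame with two factor columns holding numbers
data <- data.frame(a = factor(c("1.5", "2", "10")),
                   b = factor(c("3", "4", "5")))

# Go through as.character() first: as.numeric() on a factor alone
# returns the internal level codes, not the printed values
data[] <- lapply(data, function(x) as.numeric(as.character(x)))

str(data)
```

Assigning into data[] keeps the object a data frame, and yes, every column is converted; values that are not valid numbers become NA with a warning.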
Closed 6 years ago.
This is the code I used to look at a subset of data:
active<-clinic[
(clinic$Days.since.injury.physio > 20 & clinic$Days.since.injury.physio < 35)
&(clinic$Days.since.injury.F.U.1 > 27 & clinic$Days.since.injury.F.U.1 < 63)
, ]
I'd like to select a group of subjects based on two criteria and then analyze their results. When I looked at the descriptive statistics, I noticed more NAs than exist in the entire data set.
Subsetting seems to produce the NAs. I've looked at several posts, including the two below that seem relevant, but I don't understand how to apply the answers.
1. Why does subsetting produce NAs that don't exist in the full data set? (I think the answer from other posts is that there is an NA in another variable?)
2. How do I work around this?
I'd like to be able to get values from the variables that are present rather than ignoring the whole row if there is a missing value.
Thank you.
Subsetting R data frame results in mysterious NA rows
NA when trying to summarize a subset of data (R)
This is a workaround, a response to your #2
Looking at your code, there is a much easier way of subsetting data. Try this.
Check if this solves your issue.
library(dplyr)
active <- clinic %>%
  filter(Days.since.injury.physio > 20,
         Days.since.injury.physio < 35,
         Days.since.injury.F.U.1 > 27,
         Days.since.injury.F.U.1 < 63)
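dplyr does wonders when it comes to subsetting and manipulating data.
The %>% operator passes clinic to filter() as its first argument, and filter() looks column names up inside the data frame, so you don't need the $ symbol. Unlike [, filter() also drops rows where a condition evaluates to NA, which is where the extra NA rows came from.
If, for some reason, you don't want to use dplyr, look at the subset function in base R, which drops those rows as well.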
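To illustrate where the extra NA rows come from, here is a base R sketch on made-up data (the values and the score column are hypothetical). A condition that evaluates to NA makes [ return a row filled with NA, while which() and subset() skip it:

```r
# Hypothetical data with a missing value in each criterion column
clinic <- data.frame(
  Days.since.injury.physio = c(25, NA, 30, 10),
  Days.since.injury.F.U.1  = c(40, 50, NA, 30),
  score                    = c(1, 2, 3, 4)
)

cond <- clinic$Days.since.injury.physio > 20 &
  clinic$Days.since.injury.physio < 35 &
  clinic$Days.since.injury.F.U.1 > 27 &
  clinic$Days.since.injury.F.U.1 < 63

with_na_rows <- clinic[cond, ]         # NA conditions become all-NA rows
active       <- clinic[which(cond), ]  # which() keeps only TRUE positions
active_sub   <- subset(clinic, cond)   # subset() also treats NA as FALSE

# Summaries can skip missing values instead of discarding whole rows
mean(clinic$Days.since.injury.physio, na.rm = TRUE)
```

Here with_na_rows has three rows, two of them entirely NA, whereas active and active_sub contain only the one row that actually meets both criteria.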
Closed 8 years ago.
I have a data frame that has some empty entries. I set the
options(stringsAsFactors = FALSE)
so that I can change the empty cells. I then wrote the following code:
apply(my_data[,6:65],2, function(x) x[which(x=='')]<-0)
I hoped it would replace all the empty cells with zeros, but it isn't working.
Note that my_data has 65 columns, and columns 1:5 contain strings.
Thanks in advance
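There is no need for apply. Inside the function, the assignment only changes a local copy of x, and the result is never assigned back to my_data. Use [<- with logical indexing instead: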
my_data[my_data==""] <- 0
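A sketch restricted to a subset of columns, on a small hypothetical data frame standing in for columns 6:65. It assumes the cells hold empty strings rather than NA, and that the affected columns should end up numeric, since the replaced columns otherwise stay character (the 0 is stored as "0"):

```r
# Hypothetical data: one string column to leave alone, two to clean
my_data <- data.frame(name = c("a", "b", "c"),
                      x = c("1", "", "3"),
                      y = c("", "5", ""),
                      stringsAsFactors = FALSE)

cols <- c("x", "y")  # stand-in for 6:65 in the question

# Replace empty strings only in the chosen columns
my_data[cols][my_data[cols] == ""] <- 0

# The columns are still character at this point, so convert them
my_data[cols] <- lapply(my_data[cols], as.numeric)

str(my_data)
```

Limiting the replacement to cols leaves any empty strings in the first five string columns untouched.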