It's difficult to tell what is being asked here. This question is ambiguous, vague, incomplete, overly broad, or rhetorical and cannot be reasonably answered in its current form. For help clarifying this question so that it can be reopened, visit the help center.
Closed 9 years ago.
I have a dataframe in R, and I want to delete all rows in that dataframe where column X has values >100%. What's the best way to do this?
Appreciate the help.
If your column X contains numbers (which I'm pretty sure it does, although your use of % symbols gives a slightly different impression), then you can keep the rows i where X[i] <= 100, which drops everything above 100, like this:
datasetnew <- dataset[dataset$X<=100,]
But if you really have percentages in the column, i.e. values in X are something like "10%","23%","103%", then you need to remove the % first, for example using the gsub function:
datasetnew <- dataset[as.numeric(gsub("%","",dataset$X))<=100,]
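As a small worked version of the percent-string case, here is a sketch on a made-up four-row data frame. The id column and the sample values are assumptions for illustration only; dataset and X follow the names used above.

```r
# Hypothetical toy data; only the names `dataset` and `X` come from the answer.
dataset <- data.frame(id = 1:4, X = c("10%", "23%", "103%", "100%"),
                      stringsAsFactors = FALSE)

# Strip the literal "%" and convert the remainder to numeric before comparing.
x_num <- as.numeric(gsub("%", "", dataset$X, fixed = TRUE))

# Keep rows at or below 100; rows above 100 are dropped.
datasetnew <- dataset[x_num <= 100, ]
print(datasetnew)
```

If X can contain missing values, the logical index will contain NA and the result will include rows of NA; wrapping the condition in which() is one way to avoid that.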
Closed 9 years ago.
I have been using R for a number of years now and I keep expecting to stumble upon the answer to this basic question somewhere but somehow I can't seem to find it anywhere.
How do you use a variable to target the elements of a dataframe or list?
The best I have come up with is this awkward formation.
For instance:
a=list(A=1:100, B=letters)
c=eval(parse(text="a$A"))
I could then swap the "a" in the string for a "b" if I wanted to check what the A element of b is. Likewise, I would like an easy method to apply changes to an element of a.
If your indexer is a string, then you can do this:
index <- "A"
a_list <- list(A=1:100, B=letters)
a_list[[index]]
The case for data frames is similar; you have a choice of two syntaxes.
a_data_frame <- data.frame(A = 1:26, B = letters)
a_data_frame[, index]
a_data_frame[[index]]
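The question also asks about applying changes to an element and about choosing the object by name. A minimal sketch of both follows; obj_name is a hypothetical variable introduced here, and get() is just one way to look an object up by its name without eval(parse(...)).

```r
index <- "A"
a_list <- list(A = 1:100, B = letters)

# Read the element whose name is stored in `index`.
first_three <- a_list[[index]][1:3]

# Assign through the same string index: double every value in element "A".
a_list[[index]] <- a_list[[index]] * 2

# To pick the object itself by name (the "a" versus "b" case), get() is one option.
obj_name <- "a_list"   # hypothetical variable holding the object's name
same_element <- get(obj_name)[[index]]
```

The same [[index]] <- value form works on a data frame column.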
Closed 10 years ago.
I have boxscore data from the NFL and some of the data is obviously incorrect. For example for some games the number of sacks is negative, which is impossible. This column is named SackNumOff. How do I change any negative values in this column to zero?
Something like this:
dat$columnname[dat$columnname < 0] = 0
This replaces all negative numbers with 0. The idea is that you can use subsetting with [] both to extract a subset and to assign values to a subset.
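Applied to the column named in the question, this might look like the sketch below. The data values and the Game column are invented for illustration; only SackNumOff comes from the question.

```r
# Hypothetical boxscore data; SackNumOff is the column named in the question.
dat <- data.frame(Game = 1:5, SackNumOff = c(2, -1, 0, 3, -4))

# The logical vector selects the negative entries; the assignment overwrites only those.
dat$SackNumOff[dat$SackNumOff < 0] <- 0

# An alternative with the same effect: take the larger of each value and 0.
# dat$SackNumOff <- pmax(dat$SackNumOff, 0)
print(dat$SackNumOff)
```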
Closed 10 years ago.
I have a list of tables in R. The number of rows differs from table to table but the number of columns is uniform (22). I need to calculate the min, max, and median of each column in each table and put that into a vector. There needs to be a separate vector per table.
lapply(your_list,function(x){apply(x,2,mean)})
Replace mean with min, max or median according to which one you want to calculate.
apply(x,2,...) applies the function to each column (i.e. the 2nd dimension) of a matrix, array or data frame.
lapply(...) applies the function to each element of a list.
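To get all three statistics in one pass rather than running the call three times, one possible sketch is shown below. It assumes the tables are data frames with numeric columns; the toy tables (3 columns instead of 22) and the helper name col_stats are made up for illustration.

```r
# Hypothetical list of two small numeric tables (3 columns instead of 22).
your_list <- list(
  t1 = data.frame(a = c(1, 2, 3), b = c(4, 5, 6), c = c(7, 8, 9)),
  t2 = data.frame(a = c(10, 20), b = c(30, 40), c = c(50, 60))
)

# For one table, compute min, max and median of every column.
# sapply() returns a matrix: one row per statistic, one column per table column.
col_stats <- function(x) {
  sapply(x, function(col) c(min = min(col), max = max(col), median = median(col)))
}
stats_per_table <- lapply(your_list, col_stats)

# Flatten each matrix (column by column) if a plain vector per table is needed.
vec_per_table <- lapply(stats_per_table, as.vector)
print(stats_per_table$t1)
```

Keeping the matrix form preserves the row and column labels, which the flattened vector loses.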
Closed 10 years ago.
I have a huge csv file of sports (EPL) data which covers player performance in every game for their respective teams. I would like to run a loop to compare the number of times each team has scored first in a match (the column is called First.Goal).
I know how to calculate them individually, e.g. for Liverpool from a csv called Prem1112:
Prem<-read.csv("Prem1112.csv")
sum(subset(Prem,Team=='Liverpool',First.Goal))
Ideally I'd like to run the loop so I wouldn't have to calculate all 20 teams individually. Any ideas?
What about this:
aggregate(First.Goal ~ Team, Prem, sum)
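Since Prem1112.csv is not available here, the sketch below uses a made-up stand-in with only the two columns involved, to show what the call returns. The team rows and values are assumptions for illustration.

```r
# Hypothetical stand-in for Prem1112.csv with only the two columns used here.
Prem <- data.frame(
  Team = c("Liverpool", "Arsenal", "Liverpool", "Arsenal", "Chelsea"),
  First.Goal = c(1, 0, 1, 1, 0)
)

# One row per team, with First.Goal summed within each team.
first_goals <- aggregate(First.Goal ~ Team, data = Prem, FUN = sum)

# tapply() gives the same totals as a named vector instead of a data frame.
first_goals_vec <- tapply(Prem$First.Goal, Prem$Team, sum)
print(first_goals)
```

Either form replaces the explicit loop over the 20 teams.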
Closed 10 years ago.
I need to replace the contents of multiple cells in a dataframe. Using mtcars as an example, how would I replace any cells in the vs column which contain 1 with "one"?
mtcars$vs[mtcars$vs == 1] <- "one"
or
mtcars[mtcars$vs == 1, "vs"] <- "one"
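One side effect worth knowing about: writing a string into a numeric column coerces the whole column to character, so the remaining zeros become "0". A short sketch on a copy of mtcars (the name mt is introduced here only for illustration):

```r
# Work on a copy so the built-in mtcars is left untouched.
mt <- mtcars

# vs is numeric (0/1) before the replacement.
class_before <- class(mt$vs)

# Assigning a string into a numeric column coerces the whole column to character.
mt$vs[mt$vs == 1] <- "one"
class_after <- class(mt$vs)

print(table(mt$vs))
```

If the column needs to stay categorical, converting it to a factor with explicit labels is another option.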