How can I subset based on multiple criteria? [closed] - r

Closed. This question is not reproducible or was caused by typos. It is not currently accepting answers.
This question was caused by a typo or a problem that can no longer be reproduced. While similar questions may be on-topic here, this one was resolved in a way less likely to help future readers.
Closed 7 years ago.
I have a dataframe of this nature:
id year levels
A 1967 cat
B 1965 dog
C 1980 cat
A 1989 dog
B 1990 mouse
C 2010 pig
And I want to subset once using these criteria at the same time:
1. id = A
2. year > 1980
3. levels = dog
I know how to do subset(df, year>1980) but don't know how to combine these criteria.
When I do this,
sub<-subset(all,year>1980 & id == 'A' & levels == 'dog')
I get an empty data frame.

You can try:
df[df$id == "A" & df$year > 1980 & df$levels == "dog",]
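As a minimal sketch, the sample data from the question can be rebuilt and filtered both ways. The object names by_subset and by_index are made up for illustration. On this sample both forms return the single 1989 row, so an empty result may point to a difference in the real data (for example, the question filters an object called all rather than df, or the stored values may differ in case or carry stray whitespace); that is only a guess.

```r
# Sample data copied from the question
df <- data.frame(
  id = c("A", "B", "C", "A", "B", "C"),
  year = c(1967, 1965, 1980, 1989, 1990, 2010),
  levels = c("cat", "dog", "cat", "dog", "mouse", "pig"),
  stringsAsFactors = FALSE
)

# Combine the three conditions with & (element-wise AND)
by_subset <- subset(df, id == "A" & year > 1980 & levels == "dog")

# The same filter with bracket indexing; note the trailing comma (rows, all columns)
by_index <- df[df$id == "A" & df$year > 1980 & df$levels == "dog", ]

print(by_subset)
print(by_index)
```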

Related

R: how to sample rows with custom frequencies [closed]

Closed 5 years ago.
I have a data frame in R that has two columns, one with last names, the other with the frequency of each last name. I would like to randomly select last names based on the frequency values (0 -> 1).
So far I have tried using the sample function, but it doesn't allow for specific frequencies for each value. Not sure if this is possible :/
df1 <- data.frame(names = c("John","Mary"),freq=c(0.2,0.8))
df1
# names freq
# 1 John 0.2
# 2 Mary 0.8
set.seed(1)
sample100 <- sample(
x = df1$names,
size = 100,
replace=TRUE,
prob=df1$freq)
table(sample100)
# sample100
# John Mary
# 17 83

First row not detected with R [closed]

Closed 5 years ago.
I have my data in an xls file. I try to read it like this:
> df = read.xls ("natgas.xls")
Output
df
Dec.2007 X2399154
1 Jan-2008 2733970
2 Feb-2008 2503421
3 Mar-2008 2278151
4 Apr-2008 1823867
5 May-2008 1576387
6 Jun-2008 1604249
7 Jul-2008 1708641
8 Aug-2008 1682924
9 Sep-2008 1460924
10 Oct-2008 1635827
Everything is OK, except the first line.
When I index second column
> df[,2]
[1] 2733970 2503421 2278151 1823867 1576387 1604249 1708641 1682924 1460924
The first value is missing.
How can I solve this?
Looks like you need to add header = FALSE to your read.xls call (which seems to come from the gdata package):
df1 <- read.xls("natgas.xls", header = FALSE)
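The effect of the header argument can be reproduced without the spreadsheet. The sketch below swaps in base R's read.csv on an inline string (not read.xls, and not the real natgas.xls file), using the first three values shown in the question; the names raw, with_header and no_header are made up for illustration.

```r
# First three records from the question, written as inline CSV text
raw <- "Dec-2007,2399154\nJan-2008,2733970\nFeb-2008,2503421"

# Default header = TRUE: the first record is consumed as column names
with_header <- read.csv(text = raw)

# header = FALSE: the first record is kept as data; columns are named V1, V2
no_header <- read.csv(text = raw, header = FALSE)

print(nrow(with_header))
print(nrow(no_header))
print(no_header$V2)
```

With header = FALSE the columns get default names, which can be replaced afterwards with names(no_header) <- c(...) if descriptive names are wanted.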

How to add a column with constant observation and another variable with consecutive numbers with a character in R [closed]

Closed 6 years ago.
I want to add a first column containing consecutive numbers with a character prefix to an existing data frame.
I use the following code, but it does not work.
df$VARNAME_ <- paste0('COL', 1:5)(df)
I want it to look like this.
VARNAME_ old_var1 old_var2
COL1 1 2
COL2 1 2
COL3 1 2
COL4 1 2
COL5 1 2
Thanks in advance.
I am sorry for asking a simple question. I have now figured it out.
The solution is as follows.
actual_df <- data.frame(df) # convert the matrix to a data frame
actual_df <- cbind(VARNAME_ = paste0('COL', 1:5), actual_df) # add COL1 to COL5 as the first column
actual_df <- cbind(ROWTYPE_ = 'PROX', actual_df) # add a constant column in first position; VARNAME_ becomes the second column
df$VARNAME_ = paste0('COL', 1:5)
will work
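For reference, a self-contained sketch of both steps, assuming a five-row data frame shaped like the desired output in the question. The name labelled is made up for illustration, and seq_len(nrow(df)) is used in place of the hard-coded 1:5 so the labels follow the row count.

```r
# Example data shaped like the desired output in the question
df <- data.frame(old_var1 = rep(1, 5), old_var2 = rep(2, 5))

# Prepend a label column: COL1, COL2, ... one label per row
labelled <- cbind(VARNAME_ = paste0("COL", seq_len(nrow(df))), df)

# Prepend a constant column; the single value is recycled to every row
labelled <- cbind(ROWTYPE_ = "PROX", labelled)

print(labelled)
```

Assigning with df$VARNAME_ <- ... appends the new column at the end instead, so cbind is the simpler choice when the column must come first.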

Remove a row from a data table in R [closed]

Closed 6 years ago.
I have a data table with 5778 rows and 28 columns. How do I delete all of the first row? E.g. let's say the data table had 3 rows and 4 columns and looked like this:
Row number tracking_id 3D71 3D72 3D73
1 xxx 1 1 1
2 yyy 2 2 2
3 zzz 3 3 3
I want to create a data table that looks like this:
Row number tracking_id 3D71 3D72 3D73
1 yyy 2 2 2
2 zzz 3 3 3
i.e. I want to delete all of row number 1 and then shift the other rows up.
I have tried datatablename[-c(1)] but this deletes the first column not the first row!
Many thanks for any help!
You can do this via
dataframename = dataframename[-1,]
It can easily be done by indexing the data.table/data frame, as mentioned by @joni. You can also do it with
datatablename <- datatablename[2:nrow(datatablename), ]
You can find more interesting stuff about data.table here.
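A small sketch of the difference, assuming a plain base R data.frame (not a data.table object). The column names X3D71 to X3D73 and the object names are made up for illustration. For a data.frame, a single index without a comma selects columns, which would explain why [-c(1)] dropped a column in the question.

```r
# Three-row example modelled on the question
df <- data.frame(
  tracking_id = c("xxx", "yyy", "zzz"),
  X3D71 = 1:3,
  X3D72 = 1:3,
  X3D73 = 1:3
)

# No comma: the index applies to columns, so the first column is dropped
without_first_col <- df[-1]

# With a comma: the index applies to rows, so the first row is dropped
trimmed <- df[-1, ]

# Renumber the rows so they start at 1 again
rownames(trimmed) <- NULL

print(trimmed)
```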

How to subset a dataframe based on one column level? [closed]

Closed 5 years ago.
I would like to subset a df based on one level in a column, i.e. keep all rows that only contain this unique level within a column.
For this example I want a df containing every row where column "D" is "blue", keeping all columns so that no information is lost. Whether that is done with subset, filter, etc. does not matter.
A B C D E
1 2 3 "blue" 8
7 4 6 "red" 5
5 9 1 "green" 2
I have tried the variations of the following script:
newdf = subset(df, D == "blue")
newdf = subset(df, levels(D) == "blue")
This should work:
newdf = df[df$D == "blue", ]
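A runnable sketch on the example data, assuming column D is stored as a factor. The comparison has to be made on the column values themselves: levels(D) returns the set of distinct levels rather than one value per row, so it is not a reliable row filter. The droplevels step is optional and only removes the now-unused levels.

```r
# Example data from the question, with D stored as a factor
df <- data.frame(
  A = c(1, 7, 5),
  B = c(2, 4, 9),
  C = c(3, 6, 1),
  D = factor(c("blue", "red", "green")),
  E = c(8, 5, 2)
)

# Compare each row's value of D with "blue"; keep all columns
newdf <- df[df$D == "blue", ]

# Optional: drop the factor levels that no longer occur
newdf$D <- droplevels(newdf$D)

print(newdf)
print(levels(newdf$D))
```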
