Closed. This question is not reproducible or was caused by typos. It is not currently accepting answers.
This question was caused by a typo or a problem that can no longer be reproduced. While similar questions may be on-topic here, this one was resolved in a way less likely to help future readers.
Closed 7 years ago.
Improve this question
I have a dataframe of this nature:
id year levels
A 1967 cat
B 1965 dog
C 1980 cat
A 1989 dog
B 1990 mouse
C 2010 pig
And I want to subset once using these criteria at the same time:
1. id = A
2. year > 1980
3. levels = dog
I know how to do subset(df, year>1980) but don't know how to combine these criteria.
When I do this,
sub<-subset(all,year>1980 & id == 'A' & levels == 'dog')
I get an empty dataframe
you can try:
df[df$id == "A" & df$year > 1980 & df$levels == "dog",]
Related
Closed. This question is not reproducible or was caused by typos. It is not currently accepting answers.
This question was caused by a typo or a problem that can no longer be reproduced. While similar questions may be on-topic here, this one was resolved in a way less likely to help future readers.
Closed 5 years ago.
Improve this question
I have a data frame in R that has two columns, one with last names, the other with the frequency of each last name. I would like to randomly select last names based on the frequency values (0 -> 1).
So far I have tried using the sample function, but it doesn't allow for specific frequencies for each value. Not sure if this is possible :/
df1 <- data.frame(names = c("John","Mary"),freq=c(0.2,0.8))
df1
# names freq
# 1 John 0.2
# 2 Mary 0.8
set.seed(1)
sample100 <- sample(
x = df1$names,
size = 100,
replace=TRUE,
prob=df1$freq)
table(sample100)
# sample100
# John Mary
# 17 83
Closed. This question is not reproducible or was caused by typos. It is not currently accepting answers.
This question was caused by a typo or a problem that can no longer be reproduced. While similar questions may be on-topic here, this one was resolved in a way less likely to help future readers.
Closed 5 years ago.
Improve this question
I have my data in xls file.I try to read like this
> df = read.xls ("natgas.xls")
Output
df
Dec.2007 X2399154
1 Jan-2008 2733970
2 Feb-2008 2503421
3 Mar-2008 2278151
4 Apr-2008 1823867
5 May-2008 1576387
6 Jun-2008 1604249
7 Jul-2008 1708641
8 Aug-2008 1682924
9 Sep-2008 1460924
10 Oct-2008 1635827
Everything is OK,except the first line.
When I index second column
> df[,2]
[1] 2733970 2503421 2278151 1823867 1576387 1604249 1708641 1682924 1460924
the first value is missing.
How to solve this?
Looks like you need to add header = FALSE to your read.xls call (which seems to come from the gdata package):
df1 <- read.xls("natgas.xls", header = FALSE)
Closed. This question is not reproducible or was caused by typos. It is not currently accepting answers.
This question was caused by a typo or a problem that can no longer be reproduced. While similar questions may be on-topic here, this one was resolved in a way less likely to help future readers.
Closed 6 years ago.
Improve this question
I want to add a first column with consecutive numbers with characters in a existing data frame.
I use the following code. It does not work.
df$VARNAME_ <- paste0('COL', 1:5)(df)
I want to it look like this.
VARNAME_ old_var1 old_var2
COL1 1 2
COL2 1 2
COL3 1 2
COL4 1 2
COL5 1 2
Thanks in advance.
I am Sorry that I asked a stupid question. And now I figure out.
The solution is as following.
actual_df<-data.frame(df)#transfer matrix a to data frame
actual_df<-cbind(VARNAME_=paste0('COL', 1:5),actual_df) #add COL1~COL5 in the first column
actual_df<-cbind(ROWTYPE_ = 'PROX', actual_df) #Add a variable with constant observations in first column. Now the previous column become second one.
df$VARNAME_ = paste0('COL', 1:5)
will work
Closed. This question is not reproducible or was caused by typos. It is not currently accepting answers.
This question was caused by a typo or a problem that can no longer be reproduced. While similar questions may be on-topic here, this one was resolved in a way less likely to help future readers.
Closed 6 years ago.
Improve this question
I have a data table with 5778 rows and 28 columns. How do I delete ALL of the 1st row. E.g. let's say the data table had 3 rows and 4 columns and looked like this:
Row number tracking_id 3D71 3D72 3D73
1 xxx 1 1 1
2 yyy 2 2 2
3 zzz 3 3 3
I want to create a data table that looks like this:
Row number tracking_id 3D71 3D72 3D73
1 yyy 2 2 2
2 zzz 3 3 3
i.e. I want to delete all of row number 1 and then shift the other rows up.
I have tried datatablename[-c(1)] but this deletes the first column not the first row!
Many thanks for any help!
You can do this via
dataframename = dataframename[-1,]
It can be easily done with indexing the data.table/data frame as mentioned by #joni. You can also do with
datatablename <- datatablename[2:nrow(datatablename), ]
You can find more interesting stuff about data.table here.
Closed. This question is not reproducible or was caused by typos. It is not currently accepting answers.
This question was caused by a typo or a problem that can no longer be reproduced. While similar questions may be on-topic here, this one was resolved in a way less likely to help future readers.
Closed 5 years ago.
Improve this question
I would like to subset a df based on one level in a column, i.e. keep all rows that only contain this unique level within a column.
For this example I want a df with all columns that meet the criteria "blue" in column "D" without losing information. Whether that is subset, filter, etc.
A B C D E
1 2 3 "blue" 8
7 4 6 "red" 5
5 9 1 "green" 2
I have tried the variations of the following script:
newdf = subset(df, D == "blue")
newdf = subset(df, levels(D) == "blue")
This should work:
newdf = df[df$D == "blue", ]