I came across this question: Can you use multiple conditions in match() function - R, and I was wondering, what exactly is the tribble function used for? (It's part of the answer provided.)
According to RDocumentation (https://www.rdocumentation.org/packages/tibble/versions/3.0.4/topics/tribble) it is used to construct data frames, but what is the difference between it and e.g. data.frame()?
Tibble vs. data.frame
The table below summarizes how a tibble differs from a data.frame.

| Feature | Tibble | data.frame |
|---|---|---|
| Row names | Doesn't add row names to the data frame | Can add row names |
| Partial matching of variable names | Not allowed with $ | Permitted with $ |
| Subsetting | Subsetting with [ always returns a tibble | Subsetting with [ can return a vector (e.g. when a single column is selected) |
| Non-syntactic variable names | Keeps non-syntactic names as they are; refer to them with backticks | By default (check.names = TRUE) converts them to syntactic names |
| Column lengths | Columns must all have the same length, or length 1 | Columns can differ in length, provided the longest length is a whole multiple of each shorter one |
| Recycling | Recycles only length-1 vectors when creating the data frame | Recycles shorter vectors a whole number of times when creating the data frame |
| Printing | Has a more compact print method that also shows column types | No special printing capabilities |
| Printing rows | Prints the first 10 rows by default; the number of rows can be specified, e.g. print(x, n = 20) | Prints all rows by default (up to the max.print limit) |
| Printing columns | Only prints the columns that fit horizontally in the console | Prints all columns |
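To connect the table back to the original question: tribble() builds a tibble from a row-by-row layout, while data.frame() takes its input column by column. The sketch below is only an illustration and assumes the tibble package is installed; the names tb, df, name and score are made up for the example.

```r
library(tibble)

# tribble(): data is written row by row; column names are given as ~name
tb <- tribble(
  ~name,   ~score,
  "alice", 90,
  "bob",   85
)

# data.frame(): the same data written column by column
df <- data.frame(name = c("alice", "bob"), score = c(90, 85))

# Selecting one column with [ keeps a tibble a tibble,
# but drops a data.frame down to a plain vector
one_col_tb <- tb[, "score"]
one_col_df <- df[, "score"]

# Partial matching with $: works on the data.frame, not on the tibble
partial_df <- df$sc                    # matches the "score" column
partial_tb <- suppressWarnings(tb$sc)  # NULL (tibble warns about the unknown column)
```

The row-wise layout is mostly a readability aid for small, hand-typed tables; the resulting object is an ordinary tibble.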
Related
I have a data table that I am trying to subset by creating a list of variable names by pasting together some string vectors in the j argument of the data table, but I'm running into difficulty.
I have a character vector called foos (for this example foos <- c('FOO0','FOO1','FOO2')) and a vector I created with c() . I wanted to subset my data table by doing dt[,paste0(foos, c('VAR0','VAR1','VAR2'))] but that didn’t work as expected. I output what paste0(foos, c('VAR0','VAR1','VAR2')) returns and it becomes
[1] "FOO0VAR0" "FOO1VAR1" "FOO2VAR2"
so it seems this approach does a vector index by vector index concatenation instead of a concatenation of the vectors themselves (and that’s a bit surprising to me, I’d expect to have to lapply to get a paste happening on elements of a vector). Changing the permutation of the c() and paste0 didn’t work. I also tried to do
dt[,c(foos,c('VAR0','VAR1','VAR2'))] but that also doesn't work.
Is there a way to subset by a created concatenation of two string vectors in the jth column of a data table in R?
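The question does not say which set of names was wanted, so the sketch below shows three readings: the element-by-element pairing that paste0 produces, every combination of the two vectors, and the plain union of the two vectors. It assumes the data.table package is installed; the table dt and the vector suffixes are made up for the example.

```r
library(data.table)

foos     <- c("FOO0", "FOO1", "FOO2")
suffixes <- c("VAR0", "VAR1", "VAR2")

# paste0() is vectorised: it pairs element i of one vector with element i of the other
paired <- paste0(foos, suffixes)

# outer() applies paste0 to every (foo, suffix) combination instead
all_combos <- as.vector(outer(foos, suffixes, paste0))

# c() simply appends one vector of names to the other
both <- c(foos, suffixes)

# Hypothetical table whose columns are named after both vectors
dt <- data.table(FOO0 = 1:2, FOO1 = 3:4, FOO2 = 5:6,
                 VAR0 = 7:8, VAR1 = 9:10, VAR2 = 11:12)

# with = FALSE tells data.table to treat j as a vector of column names
# rather than as an expression to evaluate
sub <- dt[, both, with = FALSE]
```

Whichever vector of names is needed, building it first and then passing it with with = FALSE keeps the name construction separate from the subsetting step.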
I have a data frame with 100+ variables listed in columns, and each subject in rows. I'd like to loop through each column to perform an ANOVA, and while the loop function works fine the step I am stuck on is listing which columns to loop through. Currently I can set these by manually typing/pasting each variable name but this is obviously not practical.
Currently the loop runs through my list of vars, which I build by typing the column names manually...
variables <- vars(height, width, strength)
Which only loops for those selected 3 out of 100+ variables that I have had to manually type in.
I had thought I could list the range of column names for dataframe df between columns 3 to 100 within the vars expression as below...
variables <- vars(colnames(df[3:100]))
This just provides one variable of the name colnames(df[3:100]).
Any ideas to avoid typing or manually inserting commas/removing quotation marks from 100+ different variable names? Thanks in advance.
Consider do.call, which calls a function with its arguments supplied as a list. Since do.call requires a list, the character vector of column names is wrapped in as.list. Specifically, below:
variables <- do.call(vars, as.list(colnames(df)[3:100]))
is equivalent to expanded version:
variables <- vars(colnames(df)[3], colnames(df)[4], ..., colnames(df)[100])
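The asker's loop is not shown, so as an alternative sketch in base R, the column names can be used directly as strings and turned into formulas with reformulate(). The built-in iris data and its Species column stand in for the real data frame and grouping factor, which are assumptions made for the example.

```r
# Stand-in data: iris, with Species as the (assumed) grouping factor
df <- iris
response_cols <- colnames(df)[1:4]  # with the real data: colnames(df)[3:100]

# Fit one ANOVA per column; reformulate() builds `column ~ Species` from strings
fits <- lapply(response_cols, function(col) {
  aov(reformulate("Species", response = col), data = df)
})
names(fits) <- response_cols

# Pull the p-value of the grouping term out of each fit
p_values <- sapply(fits, function(fit) summary(fit)[[1]][["Pr(>F)"]][1])
```

Because the column names stay as ordinary strings, no quoting or comma editing is needed when the range of columns changes.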
I have a 2-D list with one column "names:chr" and the second column "n:int". This was created with the count() function in R. The "n:int" column represents the occurrences of the names. Is there a way in R to sort or order the list based on the second column "n"? The list is a tibble.
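One way to do this, sketched in base R with made-up data, is to index the rows with order(). The same row indexing works on a tibble; a plain data frame is used here only so that the example needs no extra packages.

```r
# Made-up counts standing in for the output of count()
counts <- data.frame(
  names = c("ann", "bob", "cat"),
  n     = c(2L, 5L, 1L),
  stringsAsFactors = FALSE
)

# order() returns the row positions that sort n from largest to smallest
sorted <- counts[order(counts$n, decreasing = TRUE), ]

# With dplyr loaded, arrange(counts, desc(n)) gives the same ordering,
# and count(..., sort = TRUE) sorts at the counting step
```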
I'm trying to iterate through columns in an R data.frame.
To do so, I'm hoping to write a for loop which loops over the column names and then filters the data frame according to the values in each column.
My issue is that given the syntax:
df[which(df$XX == y), ]
XX needs to actually be a column name versus a variable that is a string equivalent to the column name.
Is there a way to loop over the columns via inputting a variable?
Many thanks!
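A minimal sketch of one approach: the [[ operator accepts a column name held in a variable, whereas $ takes the name literally. The data frame, the value y and the list matches below are made up for the example.

```r
# Made-up data and target value
df <- data.frame(a = c(1, 2, 1), b = c(2, 2, 3))
y  <- 2

# df[[col]] looks up the column whose name is stored in `col`
matches <- list()
for (col in names(df)) {
  matches[[col]] <- df[which(df[[col]] == y), ]
}
```

Each element of matches holds the rows where the corresponding column equals y.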
I'm relatively new to R, and I can't figure out how to split the list that I'm working with. I have
B<-tapply(newdata$lf.d1, newdata$year, mean)
But I want to concatenate the mean values onto another matrix without the year values. How would I go about doing this?
The result of tapply with a single grouping factor is a one-dimensional array whose names are the levels of the factor (here, the years). It has no columns, because it has only a single dimension unless you coerce it with as.matrix. If you want to remove the names, then use the unname function.
unname(B)
unname(as.matrix(B))
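As a small illustration of the above, the sketch below uses the built-in mtcars data in place of newdata (an assumption for the example) and binds the unnamed means onto a made-up matrix called other.

```r
# Group means; mtcars$mpg and mtcars$cyl stand in for newdata$lf.d1 and newdata$year
B <- tapply(mtcars$mpg, mtcars$cyl, mean)

# as.vector() drops the names and the array dimension, leaving a plain numeric vector
means <- as.vector(B)

# Made-up matrix with one row per group, to which the means are appended as a column
other    <- matrix(1:3, ncol = 1)
combined <- cbind(other, means)
```

The matrix being extended needs one row per group, in the same order as the groups in B.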