Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 2 years ago.
I am a beginner in R. I am trying to gather some columns for plotting, but I get this error: "Error: Must supply a symbol or a string as argument". I would be grateful for help with fixing it.
library(readxl)
library(dplyr)  # provides %>%, select() and filter()
library(tidyr)  # provides gather()
df <- read_excel("a1-cereals.xls")
# Select columns
df1 <- df %>% select(c(-1, -2, -11, -14, 15))
# Filter rows
df_Type_C <- filter(df1, Type == "C")
head(df_Type_C)
df_Type_H <- filter(df1, Type == "H")
head(df_Type_H)
# Gather columns to make a long table (this call raises the error)
df_long <- gather(df1, Mes_Type, 2:11)
This is a sample of the dataset I am working on:
This happens because gather() requires both a key and a value argument, each supplied as a symbol or a string, hence the error message. In your call, Mes_Type is taken as the key and 2:11 as the value, so supplying a value name before the column selection should solve your problem. However, I strongly encourage you to switch from gather() to the pivot functions (pivot_longer() here), as gather() has been retired. I think your code would work fine with pivot_longer(). Please see the links below, and good luck!
Link1
Link2
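As a rough sketch of the fix described above, the example below runs both the corrected gather() call and a pivot_longer() equivalent. The data frame is a made-up stand-in for the cereal data (the column names Calories, Protein and Fat and the name Value are assumptions for illustration, not taken from a1-cereals.xls).

```r
library(tidyr)

# Hypothetical stand-in for df1: one id column followed by measurement columns.
df1 <- data.frame(
  Type = c("C", "H", "C"),
  Calories = c(70, 120, 110),
  Protein = c(4, 3, 2),
  Fat = c(1, 5, 0)
)

# gather() needs a key name AND a value name before the column selection.
df_long_gather <- gather(df1, key = "Mes_Type", value = "Value", 2:4)

# The pivot_longer() version of the same reshaping.
df_long_pivot <- pivot_longer(
  df1,
  cols = 2:4,
  names_to = "Mes_Type",
  values_to = "Value"
)
```

Both results have one row per original row and measurement column; only the row order differs between the two functions.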
Closed. This question needs debugging details. It is not currently accepting answers.
Closed 2 years ago.
This is about the MplsStops dataset in R.
I have selected the two columns of the dataset that I need: race and citationIssued.
After that, I omitted all NAs.
Now I want to keep only the citationIssued values for rows where race is Black or White.
How can I do that?
Thanks for your help.
Welcome to Stack Overflow!
The dplyr package is your friend if you are trying to learn data processing with R.
How about this:
library(carData)  # get the sample data
library(dplyr)    # load dplyr for data processing functions
MplsStops %>%                          # start with the data and pipe it to the next line
  select(race, citationIssued) %>%     # keep two variables and pipe to next line
  filter(race %in% c("Black", "White") & !is.na(citationIssued))
If that answers it, click the green checkmark; if not, add a comment for more help.
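To see what the filter keeps and drops, here is a small sketch on a made-up data frame (the `stops` data and its values are assumptions for illustration, not rows from MplsStops). Note that `%in%` returns FALSE for a missing race, so those rows are dropped as well.

```r
library(dplyr)

# Hypothetical miniature of the two columns of interest.
stops <- data.frame(
  race = c("Black", "White", "Latino", NA, "White"),
  citationIssued = c("NO", NA, "YES", "YES", "YES")
)

kept <- stops %>%
  select(race, citationIssued) %>%
  filter(race %in% c("Black", "White"), !is.na(citationIssued))

# Tabulate citations by race for the rows that survived the filter.
counts <- count(kept, race, citationIssued)
```

Passing two conditions to filter() separated by a comma is equivalent to joining them with `&`.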
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 3 years ago.
I'm working with a very large dataset with lots of columns (400+), and every time I create or add a new variable I have to reorder the columns. I want all the related variables to stay together, so I've been using dplyr::select() to reorder things. Yet there are times when I have to go back to a very early part of my script and add a new variable. When I rerun the whole script after that, there tend to be one or two variables I forgot to put into the select() calls, so they go missing.
I use select() because selecting all the columns between two variables and referencing them by name is super easy (e.g., Vfour:Vthreefifty). Do you have any tips for reordering datasets with lots of columns?
Given no reproducible example, but going by your two column names:
df %>%
  select(starts_with('V'))
You can then chain further starts_with() calls as needed.
Other selection helpers include:
ends_with(), contains(), matches()
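One way to keep columns from going missing while reordering is sketched below. It combines a selection helper with everything(), and also shows dplyr's relocate(), which the answer above does not mention; the data frame and its column names are hypothetical.

```r
library(dplyr)

# Hypothetical small data frame standing in for the 400+ column dataset.
df <- data.frame(
  id = 1:2,
  Vone = 1:2,
  other = 3:4,
  Vtwo = 5:6,
  score_end = 7:8
)

# Put the V* columns first; everything() appends all remaining columns,
# so a newly added variable is never silently dropped.
v_first <- df %>% select(starts_with("V"), everything())

# Move a single column next to a related one without listing the rest.
moved <- df %>% relocate(Vtwo, .after = Vone)
```

Because neither call has to enumerate every column, adding a variable early in the script does not require updating later reordering steps.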
Closed. This question is not reproducible or was caused by typos. It is not currently accepting answers.
Closed 1 year ago.
I'm using RStudio and I can't seem to solve this problem:
I have a data frame that I want to subset by keeping only some columns, so I do the following:
dfo <- read.csv("cwurData.csv")
df<- subset(dfo, c=("world_rank", "country", "quality_of_education",
"alumni_employment", "publications", "patents", "year"))
To which I get the following error: (and I can't see why!)
Error: unexpected ',' in "df<- subset(dfo, c=("world_rank","
Thanks for your help:)
I am assuming that all the quoted names are the names of columns you want to select. If so, the problem is that you are not using the select argument of the subset function (see ?subset for details). An example of how to use this function on the diamonds data set from ggplot2 is shown below:
install.packages('ggplot2')  # only needed if ggplot2 is not installed yet
library(ggplot2)
diamonds  # print the data set
subset_d <- subset(diamonds, select = c('cut', 'color'))
A few other things to note: it looks like you are attempting to assign a vector of character values to c by writing c=('x','y','z',...). Plain parentheses cannot hold a comma-separated list, which is why R reports the unexpected ','. You need to write c=c('x','y','z',...) instead, where the c before the parentheses is a call to the combine function. It is also good practice to give vectors names other than 'c', as that causes confusion with the function name. Let me know if you have any other questions.
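Applied to the question's case, the fix might look like the sketch below. Since cwurData.csv is not available here, `dfo` is a small made-up data frame with a few of the same column names, and `wanted` is a hypothetical variable name.

```r
# Hypothetical stand-in for read.csv("cwurData.csv").
dfo <- data.frame(
  world_rank = 1:3,
  institution = c("A", "B", "C"),
  country = c("USA", "UK", "USA"),
  publications = c(1, 2, 3),
  year = c(2012, 2012, 2013)
)

# Build the column names with c(...) and store them under a name other than c.
wanted <- c("world_rank", "country", "publications", "year")

# Pass them through the select argument of subset().
df <- subset(dfo, select = wanted)

# Plain bracket indexing selects the same columns.
df_alt <- dfo[, wanted]
```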
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 6 years ago.
I would like to understand how this line really works:
y <- y[keep, , keep.lib.sizes=FALSE]
in :
keep <- rowSums(cpm(y)>1) >= 3
y <- y[keep, , keep.lib.sizes=FALSE]
I know d.f[a,b], but I cannot find any R documentation for d.f[a, ,b].
I tried searching for "brackets", "hooks", "commas"... :-(
(Sometimes I wish people would not simplify their R scripts so much!)
Thanks in advance.
Subscripting a data.frame takes two indices: df[rows, columns]. Leaving the column slot empty, as in df[rows, ], selects all columns. Anything after those two slots is an optional named argument that modifies how the subscripting is done.
The most common of those is drop=FALSE, as in df[1:18, 3, drop = FALSE]. It is used because when you subset just one column of a data.frame, the result is simplified to a vector by default and loses the data.frame class. In your specific case, it seems you are using another kind of object that looks like a data.frame but has added functionality from a Bioconductor package (the cpm() call and the keep.lib.sizes argument suggest edgeR's DGEList). A look at the subsetting methods documented for that class will tell you how these arguments work.
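The same pattern can be tried on an ordinary data.frame. The sketch below uses made-up data, with drop = FALSE playing the role that keep.lib.sizes = FALSE plays in the question: a named argument placed after an empty column slot.

```r
# Made-up data; `keep` is a logical row filter like the one in the question.
df <- data.frame(a = 1:5, b = letters[1:5], c = 6:10)
keep <- df$a > 2

# Empty column slot: keep the selected rows and all columns.
all_cols <- df[keep, ]

# Selecting a single column simplifies the result to a plain vector...
one_col_vec <- df[keep, "a"]

# ...unless drop = FALSE is passed as a third, named argument.
one_col_df <- df[keep, "a", drop = FALSE]

# Same shape as the question's call: rows, empty column slot, named option.
same_shape <- df[keep, , drop = FALSE]
```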
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
I have over 200 data.frames in my global environment. I would like to remove the first row from each data.frame, but I am not sure how.
Any help will be appreciated; please let me know if further information is needed.
This will go through the objects in your environment (assuming they are all data frames), remove the first row from each, and organize them into a list of data frames. It is generally better practice to keep them in a list so you can more easily apply functions across them and access them.
df <- lapply(ls(), function(x) get(x)[-1,])
Update: it is a good idea to check that the objects are in fact data frames and only work with those. First we create a logical vector flagging which objects are data frames, then collect those into a list and remove the first row of each.
# mget() fetches the objects themselves; ls() alone only returns their names
dfs <- sapply(mget(ls()), is.data.frame)
df_list <- lapply(mget(names(dfs)[dfs]), "[", -1, , drop = FALSE)
Thanks to the commenters for finding my error and providing more efficient solutions.
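For a version that can be tried without touching the global environment, here is a minimal sketch that works in a separate environment. The object names (sales, costs, note) are hypothetical, and writing the results back with list2env() is an optional extra step that the answer above does not include.

```r
# A scratch environment standing in for the global environment.
env <- new.env()
assign("sales", data.frame(x = 1:3, y = c("h", "a", "b")), envir = env)
assign("costs", data.frame(x = 4:6), envir = env)
assign("note", "not a data frame", envir = env)

# Fetch every object, then keep only the data frames.
objs <- mget(ls(envir = env), envir = env)
dfs <- Filter(is.data.frame, objs)

# Drop the first row of each; drop = FALSE keeps one-column data frames intact.
trimmed <- lapply(dfs, function(d) d[-1, , drop = FALSE])

# Optional: overwrite the originals in the environment with the trimmed copies.
invisible(list2env(trimmed, envir = env))
```

To apply this to the global environment instead, globalenv() could be used in place of `env`.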