Unexpected token error R subset [closed] - r

Closed. This question is not reproducible or was caused by typos. It is not currently accepting answers.
This question was caused by a typo or a problem that can no longer be reproduced. While similar questions may be on-topic here, this one was resolved in a way less likely to help future readers.
Closed 1 year ago.
I'm using RStudio and I can't seem to solve this problem:
I have a df which I want to subset by taking only some columns and so I do the following:
dfo <- read.csv("cwurData.csv")
df<- subset(dfo, c=("world_rank", "country", "quality_of_education",
"alumni_employment", "publications", "patents", "year"))
To which I get the following error: (and I can't see why!)
Error: unexpected ',' in "df<- subset(dfo, c=("world_rank","
Thanks for your help:)

I am assuming that all the quoted names are columns you want to keep. If so, the problem is that you are not using the select argument of the subset function (see ?subset for details). Here is an example of how to use it on the diamonds data set from ggplot2:
install.packages('ggplot2')  # only needed once
library(ggplot2)
diamonds
subset_d <- subset(diamonds, select = c('cut', 'color'))
A few other things to note: it looks like you are trying to build a character vector by writing c=('x','y','z',...). You need c('x','y','z',...) instead, where the c before the parentheses is a call to the combine function. The missing c is what triggers the "unexpected ','" error, because plain parentheses cannot contain a comma-separated list. It is also good practice to give vectors names other than c, since that name is easily confused with the function. Let me know if you have any other questions.
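Applied to the columns in the question, the fix might look like the sketch below. The data frame here is a small made-up stand-in for cwurData.csv (only the column names come from the question, and only some of them are used), so treat it as an illustration rather than the asker's actual data.

```r
# Hypothetical stand-in for read.csv("cwurData.csv"); values are invented.
dfo <- data.frame(
  world_rank = 1:3,
  institution = c("A", "B", "C"),
  country = c("USA", "UK", "Japan"),
  year = c(2012, 2012, 2013)
)

# Pass the column names through the select argument, wrapped in c().
df <- subset(dfo, select = c("world_rank", "country", "year"))

# Plain base-R indexing keeps the same columns.
df2 <- dfo[, c("world_rank", "country", "year")]

names(df)
```

Both forms drop the institution column and keep the rest in the order given.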

Related

left_join in R returns and error "Error: Join columns must be present in data" colnames() says that the col names are present [closed]

Closed 1 year ago.
I am trying to join two data frames using a left_join function. Here is my code:
combined <- left_join(APRN_mailing, DOPL_List, by = "ID")
I keep receiving the error:
"Error: Join columns must be present in data."
When I run colnames() on both data frames I get:
colnames(DOPL_List)
[1]"ID.LAST_NAME.FIRST_NAME.gender.ADDR_LINE_1.ADDR_LINE_2.CITY.STATE.zipcode.EMAIL.LicenseID.ProfessionGroup.Birth_Year"
colnames(APRN_mailing)
[1]"ID.LAST_NAME.FIRST_NAME.gender.ADDR_LINE_1.ADDR_LINE_2.CITY.STATE.zipcode.EMAIL.ProfessionGroup.Birth_Year"
It looks to me like I have a column named "ID" in both data frames. I have tried rewriting the code:
combined <- left_join(APRN_mailing, DOPL_List, by = c("ID" = "ID"))
but I get the same result.
Any ideas what the problem might be?
It seems that the data were read in incorrectly: your columns are not separated. The output of colnames() shows a single, very long column name for each data frame rather than a vector of separate names, so neither one actually has a column called "ID" to join on.
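A likely cause, though the question does not show how the files were read, is a delimiter mismatch at import time. The sketch below assumes a semicolon-separated file purely for illustration; the text, names, and separator are hypothetical.

```r
# Hypothetical file contents; the real delimiter in the question is unknown.
raw <- "ID;LAST_NAME;FIRST_NAME\n1;Smith;Ann\n2;Jones;Bob"

# read.csv() assumes commas, so the whole header becomes one column name
# and the semicolons are turned into dots.
wrong <- read.csv(text = raw)
colnames(wrong)

# Stating the actual separator yields one name per column.
right <- read.csv(text = raw, sep = ";")
colnames(right)
```

Once each data frame has a real ID column, the original left_join call should find it.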

Argument "nm" is missing, with no default [closed]

Closed 1 year ago.
I have this function to create a dataset which includes all my prediction I have made before:
tst=setNames(
data.frame(
expand.grid(unique(df_sum[,"id"]),unique(df_sum[,"training"]),seq(25,100,25))
)
)
Unfortunately, this message comes:
Error in setNames(data.frame(expand.grid(unique(df_sum[, "id"]),
unique(df_sum[, : argument "nm" is missing, with no default
It is a big dataset, so, it is hard to share. I hope you have enough details to help me.
Thanks
If you run help("setNames") you will find under the heading "Usage" that setNames takes two arguments, called object and nm. The latter is missing in your call.
If your dataset is too large to share, please use a tiny part of it, or some simulated data, to publish a small reproducible example of what you are trying to do. We all have the iris dataset, and a reproducible example for your problem might be as simple as
> setNames(iris)
Error in setNames(iris) : argument "nm" is missing, with no default
The answer is then
head(setNames(iris, nm = c("name1", "name2", "name3", "name4", "name5")))
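For the call in the question, that would mean supplying one name per column of the expand.grid result. Here is a minimal sketch with made-up data; df_sum's contents and the third column name ("value") are assumptions, since the question does not say what the names should be.

```r
# Hypothetical stand-in for the asker's df_sum.
df_sum <- data.frame(id = c(1, 1, 2), training = c("a", "b", "a"))

# expand.grid() already returns a data frame, so data.frame() is optional.
tst <- setNames(
  expand.grid(unique(df_sum[, "id"]), unique(df_sum[, "training"]), seq(25, 100, 25)),
  nm = c("id", "training", "value")  # "value" is an invented name
)

head(tst)
```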

How to avoid rbind changing column from numeric to character? [closed]

Closed 1 year ago.
I'm a beginner using R to analyze a bunch of data. My program currently opens a list of csv's from a folder and binds them together into one data frame using rbind. Here's what it looks like:
LDT.list <- list.files(path="./Desktop/Data_analysis/LDT/", pattern=".csv", full.names = TRUE)
LDTfiles <- lapply(LDT.list, read.csv)
LDT_table <- do.call("rbind.data.frame" , LDTfiles)
The problem is that after using rbind, one of the columns in my data frame is no longer numeric and I can't calculate its mean. When I import a single csv, the column I've described is numeric, so the problem seems to occur after using rbind. I've already tried to convert the column to numeric but got a warning about missing data. So my question is how I can use rbind while keeping the column numeric. Thanks
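No answer is recorded here. One common cause, which is only an assumption for this dataset, is that at least one file holds a non-numeric entry in that column, so the column is read as character there and rbind coerces the combined column to character. The sketch below uses invented in-memory files and a hypothetical column name (rt) to show how to spot the offending file and one way to read the entry as missing.

```r
# Hypothetical stand-ins for two of the CSV files.
txt_a <- "rt\n1.5\n2.5"
txt_b <- "rt\n3.0\nn/a"   # a stray "n/a" makes this column character

files <- list(
  a = read.csv(text = txt_a, stringsAsFactors = FALSE),
  b = read.csv(text = txt_b, stringsAsFactors = FALSE)
)

# Check the column class per file to find the odd one out.
sapply(files, function(d) class(d$rt))

combined <- do.call(rbind, files)
class(combined$rt)

# Declaring the marker as missing at read time keeps the column numeric.
fixed <- do.call(rbind, list(
  read.csv(text = txt_a, na.strings = c("NA", "n/a")),
  read.csv(text = txt_b, na.strings = c("NA", "n/a"))
))
mean(fixed$rt, na.rm = TRUE)
```

With real files, the same na.strings argument can be passed through lapply, for example lapply(LDT.list, read.csv, na.strings = c("NA", "n/a")), once the actual marker is known.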

Undefined Columns Selected (when using order function) [closed]

Closed 5 years ago.
I have searched for answers to my question and can't seem to find any answer. I am trying to sort my data so that I can first sort by year of birth and then by last name. Here is my code:
ResidentsBD_99_2015_clean < ResidentsBD_99_2015_clean[order(ResidentsBD_99_2015_clean[, birthdate_year],
ResidentsBD_99_2015_clean[, "surname"],
decreasing = FALSE), ]
When I run this code, this is the error message that I receive:
Error in `[.data.frame`(ResidentsBD_99_2015_clean, , birthdate_year) :
undefined columns selected
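You might just be stuck with typos in your code. birthdate_year should be quoted; unquoted, R looks for a variable of that name rather than a column. It also looks like you have a typo in the assignment operator: you wrote < where <- was intended.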
In a more general sense, I prefer ordering with dplyr.
library(dplyr)
ResidentsBD_99_2015_clean <- arrange(ResidentsBD_99_2015_clean, birthdate_year, surname)
From what I can see from your code, it might just be the missing - in the assignment and some small syntax problem. Try this:
ResidentsBD_99_2015_clean <- ResidentsBD_99_2015_clean[order(ResidentsBD_99_2015_clean$birthdate_year, ResidentsBD_99_2015_clean$surname), ]
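To see both fixes on something small, here is a sketch with an invented three-row data frame; only the column names are taken from the question.

```r
# Hypothetical miniature of the residents data.
residents <- data.frame(
  surname = c("Smith", "Adams", "Brown"),
  birthdate_year = c(1990, 1990, 1985),
  stringsAsFactors = FALSE
)

# Column names are quoted and the result is assigned with <-.
residents <- residents[order(residents[, "birthdate_year"],
                             residents[, "surname"]), ]
residents
```

Rows are ordered by year first, with surname breaking ties within the same year.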

Code a new variable for a dataset using multiple logical operators in R [closed]

Closed 6 years ago.
I am trying to convert SAS script to R to learn R. Below is the script in SAS.
if continent=1 and (country=5 or country=10) then rate = 8
Here are my attempts for the dataframe called data:
data$rate[(continent==1) & (country==5 | country==10)] <- 8
Or:
data$rate[(continent==1) & country %in% c(5,10)] <-8
Unfortunately, the attempts do not generate the result correctly. The result shows the rate 8 when either continent=1 or country=5 or country=10. I guess I am wrong on combining logical operators in R.
Could anyone help me fix the issue? Many thanks!
Note: I used attach(data) above since I am lazy to rewrite data again.
It looks like the issue is that continent and country are not being taken from the data frame data. With attach(), those names can resolve to other objects of the same name in your workspace, which may be the cause of the wrong result. To fix this, refer to the columns explicitly through the data frame:
data$rate[(data$continent==1 & (data$country==5 | data$country==10))] <- 8
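A quick check on made-up values shows the combined condition behaving like the SAS statement. The data frame below is hypothetical, and %in% is used as shorthand for the two country comparisons.

```r
# Hypothetical example data; rate starts out missing.
data <- data.frame(
  continent = c(1, 1, 2, 1),
  country = c(5, 7, 5, 10),
  rate = NA_real_
)

# TRUE only where continent is 1 AND country is 5 or 10.
hit <- data$continent == 1 & data$country %in% c(5, 10)
data$rate[hit] <- 8
data
```

Only the first and last rows satisfy both parts of the condition, so only those rows get a rate of 8.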
