When running this simple code :
freq_table <- ques %>%
select(across(starts_with("Q36_"))) %>%
table()
I get the following error message :
`across()` must only be used inside dplyr verbs
I have restarted R and reloaded dplyr but the problem remains. Any clues why?
Thanks!
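A likely fix, offered as a sketch rather than a confirmed diagnosis: across() is intended for verbs such as mutate() and summarise(), whereas select() accepts selection helpers like starts_with() directly. The data frame `ques` below is a made-up stand-in for the real survey data.

```r
library(dplyr)

# Hypothetical stand-in for the survey data in `ques`
ques <- data.frame(
  id    = 1:4,
  Q36_1 = c("yes", "no", "yes", "yes"),
  Q36_2 = c("no", "no", "yes", "no")
)

# select() takes tidyselect helpers directly; no across() is needed
freq_table <- ques %>%
  select(starts_with("Q36_")) %>%
  table()

freq_table
```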
I created my first R package, which has three functions that query a database and return data frames based on user input. Since the data frames are large, instead of printing them to the console I added View() within my function to show the user the data extracted based on their input.
Code goes like this:
queryData <- function(p, q, r, s, t){
d <- DBI::dbGetQuery(conn = con, statement = "SELECT * FROM dataset" )
d <- d%>%
dplyr::filter(org == p) %>%
dplyr::filter(exp == q) %>%
dplyr::filter(dis %like% r) %>%
dplyr::filter(tis %like% s) %>%
dplyr::filter(Rx %like% t)
print(paste("No. of datasets that matched your criteria:",nrow(d)))
View(d)
}
The R check was fine, and I was able to install the package and run the functions. But I got an error when I built the vignette for the package.
Here is the error message:
Error: processing vignette 'package_vignette.Rmd' failed with diagnostics:
View() should not be used in examples etc
--- failed re-building 'package_vignette.Rmd'
SUMMARY: processing the following file failed:
'package_vignette.Rmd'
Error: Vignette re-building failed.
Any advice on how to fix this issue?
As the error message says, View() is not meant for R Markdown, which is what package vignettes are written in. The R Markdown Cookbook suggests displaying the data with knitr::kable() instead. If the table is too long, you can show just the first part by subsetting it. E.g.
knitr::kable(my_table[1:5, 1:5])
will print only the first 5 rows and columns of the table. There are other packages you can use too (a brief list here), which work differently depending on the desired output format.
Alternatively, you can use paged tables to avoid scrolling:
rmarkdown::paged_table(my_table)
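If the View() call needs to stay inside the package function, one option is to guard it so that it only runs in an interactive session. This is a minimal sketch, assuming the vignette is built in a non-interactive session (as it is during R CMD build); `show_data` is a hypothetical helper name, not part of the original package.

```r
# Sketch of a display helper; `show_data` is a hypothetical name
show_data <- function(d) {
  message("No. of datasets that matched your criteria: ", nrow(d))
  # Open the viewer only in an interactive session, not while building
  if (interactive()) {
    utils::View(d)
  }
  # Return the data invisibly so callers (and vignettes) can still use it
  invisible(d)
}

result <- show_data(head(mtcars))
result[1:3, 1:4]
```

Inside the vignette, the returned data frame can then be passed to knitr::kable() or rmarkdown::paged_table() as described above.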
I tried to run my code like I usually do, and I got an "unused argument" error message. I have previously run the code multiple times and everything worked perfectly fine; this is the first time I have gotten an error message (I haven't changed the code). The only thing I've done differently is that I cleared the workspace at the end of my previous session (though I have no idea if this would actually affect it).
Below is the code:
pacman::p_load(
rio, # importing data
here, # relative file pathways
janitor, # data cleaning and tables
lubridate, # working with dates
epikit, # age_categories() function
tidyverse, # data management and visualization
skimr,
psych,
reshape2, #for reshaping dataset
dplyr,
miscFuncs,
foreign, #read data formats
rcompanion, # group means
eeptools,
plyr)
mesh_dat <- import(here("R", "BTmeshdata.xlsx"))
The error message:
Error in here("R", "BTmeshdata.xlsx") :
unused argument ("BTmeshdata.xlsx")
The issue seems to be in how the dataset is imported because I have the same issue with importing a dataset from a different project.
.here and my folder "R" are located in my Documents folder.
Thanks!
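One plausible cause, stated as an assumption since the thread has no confirmed answer: plyr also exports a function called here(), which takes a single argument, and because plyr is attached last in the p_load() call it masks here::here(). Qualifying the call with the package name sidesteps the clash.

```r
# Qualify the call so the here package's function is used,
# regardless of which package was attached last
data_path <- here::here("R", "BTmeshdata.xlsx")
data_path

# The import step is otherwise unchanged:
# mesh_dat <- rio::import(data_path)
```

Running `environment(here)` after loading the packages shows which namespace the bare name `here` currently resolves to.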
I keep getting multiple "unknown column" warnings for all types of commands (e.g., from str(x) to installing package updates), and I am not sure how to debug or fix this.
The warning "unknown column" is clearly related to a variable in a tbl_df that I renamed, but the warning comes up in all kinds of commands seemingly unrelated to the tbl_df (e.g., installing updates on a package, str(x) where x is simply a character vector).
This is an issue with the Diagnostics tool in RStudio (the tool that shows warnings and possible mistakes in your code). It was partially fixed at this commit in RStudio v1.1.103 or later by #kevin-ushey. That fix was partial, because the warnings still appeared (albeit less frequently). The issue was reported with a reproducible example at https://github.com/rstudio/rstudio/issues/7372 and was fixed in a pull request for RStudio v1.4.
Update to the latest RStudio release to fix this issue. Alternatively, there are several workarounds available; choose the one you prefer:
Disable the code diagnostics for all files in Preferences/Code/Diagnostics
Disable all diagnostics for a specific file:
Add at the beginning of the opened file(s):
# !diagnostics off
Then save the files and the warnings should stop appearing.
Disable the diagnostics for the variables that cause the warning
Add at the beginning of the opened file(s):
# !diagnostics suppress=<comma-separated list of variables>
Then save the files and the warnings should stop appearing.
The warnings appear because the diagnostics tool in RStudio parses the source code to detect errors, and when it performs the diagnostic checks it accesses columns in your tibble that are not initialized, which produces the warning we see. The warnings are not caused by the unrelated commands you run; they appear whenever the RStudio diagnostics are executed (when a file is saved, then modified, when you run something...).
I have been encountering the same problem, and although I don't know why it occurs, I have been able to pin down when it occurs, and thus prevent it from happening.
The issue seems to be with adding in a new column, derived from indexing, in a base R data frame vs. in a tibble data frame. Take this example, where you add a new column (age) to a base R data frame:
base_df <- data.frame(id = c(1:3), name = c("mary", "jill","steve"))
base_df$age[base_df$name == "mary"] <- 47
That works without returning a warning. But when the same is done with a tibble, it throws a warning (which, I think, is what causes the weird, seemingly unprovoked, multiple-warning issue):
library(tibble)
tibble_df <- tibble(id = c(1:3), name = c("mary", "jill","steve"))
tibble_df$age[tibble_df$name == "mary"] <- 47
Warning message:
Unknown column 'age'
There are surely better ways of avoiding this, but I have found that first creating a vector of NAs does the job:
tibble_df$age <- NA
tibble_df$age[tibble_df$name == "mary"] <- 47
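As one alternative sketch, the column can be created in a single step with dplyr::mutate(), so that no partially initialised column is ever indexed. This assumes dplyr is available; NA_real_ is used so that the column is numeric from the start.

```r
library(tibble)
library(dplyr)

tibble_df <- tibble(id = 1:3, name = c("mary", "jill", "steve"))

# Create the whole column at once: 47 for mary, NA for everyone else
tibble_df <- tibble_df %>%
  mutate(age = if_else(name == "mary", 47, NA_real_))

tibble_df
```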
I have faced this issue when using the "dplyr" package.
For those facing this problem after using the "group_by" function in the "dplyr" library:
I have found that ungrouping the variables solves the unknown column warning problem. Sometimes I have had to iterate through the ungrouping several times until the problem is resolved.
Converting the class into data.frame solved the problem for me:
library(dplyr)
df <- data.frame(id = c(1,1:3), name = c("mary", "jo", "jill","steve"))
dfTbl <- df %>%
group_by(id) %>%
summarize (n = n())
class(dfTbl) # [1] "tbl_df" "tbl" "data.frame"
dfTbl = as.data.frame(dfTbl)
class(dfTbl) # [1] "data.frame"
Borrowed the partial script from #adts
I had this problem when dealing with tibble and lapply functions together. The tibble seemed to save things as a list inside the dataframe.
I solved it by using unlist before adding the results of an lapply function to the tibble.
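A minimal sketch of that approach, with made-up data: lapply() returns a list, so assigning its result directly creates a list-column, while unlist() (or vapply(), which returns an atomic vector) gives an ordinary column.

```r
library(tibble)

tbl <- tibble(id = 1:3, name = c("mary", "jill", "steve"))

# lapply() returns a list; flatten it before storing it as a column
name_lengths <- lapply(tbl$name, nchar)
tbl$name_len <- unlist(name_lengths)

# vapply() returns an atomic vector directly, so no unlist() is needed
tbl$name_len2 <- vapply(tbl$name, nchar, integer(1), USE.NAMES = FALSE)

str(tbl)
```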
I ran into this problem too, except through a tibble created using a dplyr block. Here's a slight modification of sabre's code to show how I came to the same error.
library(dplyr)
df <- data.frame(id = c(1,1:3), name = c("mary", "jo", "jill","steve"))
t <- df %>%
group_by(id) %>%
summarize (n = n())
t
str(t)
t$newvar[t$id==1] <- 0
I know this is an old thread, but I just encountered the same problem when loading a spatial vector in GeoPackage format with the sf package. Using as_tibble=FALSE worked for me. The file was loaded as an sp object but everything still worked fine. As mentioned by #sabre, forcing an object into a tibble seems to cause the problem when indexing a column that is no longer there.
Let's say I wanted to select the following column(s)
best.columns = 'id'
For me the following gave the warning:
df%>% select_(one_of(best.columns))
While this worked as expected, even though, as far as I understand dplyr, the two should be identical:
df%>% select_(.dots = best.columns)
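Note that select_() has since been deprecated in dplyr. In current versions, a character vector of column names is usually passed through all_of(); the sketch below assumes a recent dplyr that re-exports all_of() and uses a made-up data frame.

```r
library(dplyr)

# Hypothetical example data
df <- data.frame(id = 1:3, name = c("mary", "jill", "steve"))
best.columns <- "id"

# all_of() selects the columns named in a character vector
selected <- df %>% select(all_of(best.columns))
selected
```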
I get these warnings when I rename a column using dplyr::rename after reading it using the readr package.
The old name of the column is not renamed in the spec attribute, so removing the spec attribute makes the warnings go away. Removing the "spec_tbl_df" class also seems like a good idea.
attr(dat, "spec") <- NULL
class(dat) <- setdiff(class(dat), "spec_tbl_df")
This builds on the answer by #stok ( https://stackoverflow.com/a/47848259/7733418 ), who found this problem when using group_by (which also converts your data.frame to a tibble) and solved it in the same way.
For me the problem was ultimately due to the use of "slice()".
slice() converted my data.frame to a tibble, causing this error.
Checking the class of your data.frame and re-converting it to a data.frame whenever a function converts it to a tibble could solve this issue.
My question is some packages share the same function name. How can I tell R which package that I want to use this function from?
I tried to load the package that I wanted to use again in the code but it still did not work. My case is the select in MASS and dplyr. I want to use dplyr but the error is always unused argument...
You can use the :: operator:
iris %>%
head(n = 3) %>%
dplyr::select(Sepal.Length)
See here for details.
Or detach MASS, à la this post.
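Both options can be sketched as follows, assuming MASS and dplyr are installed. The alias at the end is an additional convenience, not something taken from the linked posts.

```r
library(MASS)
library(dplyr)

# Option 1: always name the package explicitly
res <- iris %>%
  head(n = 3) %>%
  dplyr::select(Sepal.Length)

# Option 2: detach MASS once it is no longer needed
detach("package:MASS")

# Optional: pin the bare name to dplyr's version for the rest of the script
select <- dplyr::select
```

Functions from MASS remain reachable after detaching via the same operator, e.g. MASS::select().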