Good afternoon. I received an R script from my professor to run on files on my computer. As I understand it, the script is supposed to read the data in the files, perform two very specific calculations, and add the two resulting values to the spreadsheet. The path issue seems to be resolved, but I get a new error when I attempt to run the code:
source("C:/Users/JNunl/OneDrive/Desktop/DASCRIPT.R")
Error in file(file, "rt") : invalid 'description' argument
I have launched RStudio
Created a new project with the working directory set to the same folder the programs are in
Made sure I had installed the libraries needed to run the script
Double-checked that the path was correct
Launched the script
Imported the dataset (I don't know if that is necessary, but I have done it anyway)
Ran the line of code in question, which returns this error
I have tried removing the last section with the file name, and that does run, but I get errors for the rest of the program
Below is the code in question:
library(seacarb)
library(readxl)
library(lubridate)
setwd("~/")
myf=list.files(pattern="csv")
para=c("YEAR", "MONTH", "STATION_ID", "STN_DEPTH_M", "LATITUDE", "LONGITUDE","SAMPLING_DATE","SAMPLE_DEPTH_M","DEPTH_CODE", "MEDIUM", "SAMPLE_ID", "VALUE_1", "VALUE_2", "VALUE_3")
result=NULL
for(i in 1: length(myf)){
out=read.csv(myf[i], header=T)[para]
result=rbind(result, out)
}
colnames(result)=c("Yr","Mn", "Stn","Std","Lat","Lon","Sdt","Spd","Dc", "Med", "SID", "Alk", "pH", "T")
a=subset(result, Med =="surface water")
a$Alk=as.numeric(a$Alk)
a$pH=as.numeric(a$pH)
a$T=as.numeric(a$T)
a$Mn=month(as.Date(a$Sdt))
a=subset(a, Alk>50 & Alk<125)
a=subset(a, !is.na(Alk) & !is.na(pH) & !is.na(T))
b=carb(flag=8, var1=a$pH, var2=a$Alk/50000, T=a$T, S=0.2, P=1, Pt=0.3e-9, Sit =10e-6, k1k2="r", pHscale ="F")
a$DIC=b$DIC*12000
a$pCO2=b$pCO2
stns=read_excel("stn.xlsx")
bsn<-function(x) if(nrow(subset(stns, STATION==x))>0) subset(stns, STATION==x)$BASIN else NA
lon<-function(x) if(nrow(subset(stns, STATION==x))>0) mean(subset(stns, STATION==x)$LONGITUDE) else NA
lat<-function(x) if(nrow(subset(stns, STATION==x))>0) mean(subset(stns, STATION==x)$LATITUDE) else NA
a$Basin=unlist(mapply(bsn, a$Stn), use.names=F)
a$Lat=unlist(mapply(lat, a$Stn), use.names=F)
a$Lon=unlist(mapply(lon, a$Stn), use.names=F)
Additionally, here is a screenshot of the file that I would like R to act on:
sample data
I will be back around to answer any questions.
Thank you for your time
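For what it's worth, one possible cause of `invalid 'description' argument` (an assumption, since the folder contents aren't shown): if `list.files(pattern="csv")` finds nothing in the working directory, `myf` is empty, `1:length(myf)` counts `1, 0`, and `read.csv(myf[1])` is handed `NA` instead of a file name. A minimal sketch of a guard against that, where `read_all_csv` is a hypothetical helper and not part of the professor's script:

```r
# Hypothetical helper: read and stack every CSV in `dir`, failing early
# with a clear message when no files match.
read_all_csv <- function(dir = ".", pattern = "\\.csv$") {
  files <- list.files(dir, pattern = pattern, full.names = TRUE)
  if (length(files) == 0) {
    stop("No CSV files found in: ", normalizePath(dir))
  }
  # Iterate over the file names themselves, so an empty vector can never
  # produce an NA file name the way 1:length(files) does.
  do.call(rbind, lapply(files, read.csv, header = TRUE))
}
```

Running `getwd()` and `list.files()` just before the loop shows which folder R is actually searching. Note that `setwd("~/")` points at the home directory, which is not necessarily the project folder.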
This is an example code snippet:
Problematic Code:
```{r}
ratings <- df %>%
dplyr::group_by(rating) %>%
dplyr::summarise(
compound = mean(compound),
neg = mean(neg),
neu = mean(neu),
pos = mean(pos),
)
View(ratings)
```
This is the error I get:
Error message
r$> ratings <- df %>%
dplyr::group_by(rating) %>%
dplyr::summarise(
compound = mean(compound),
neg = mean(neg),
neu = mean(neu),
Error: unexpected '}' in:
" neu = mean(neu),
}"
r$> pos = mean(pos)
)
Error: unexpected ')' in:
" pos = mean(pos)
)"
Error: unexpected '}' in "}"
I'm not exactly sure what's causing it. I'm on a Windows 11 system, using VSCode as the editor and radian to run the code. For VSCode extensions I am using R (REditorSupport) and Markdown All in One (Yu Zhang).
When I run the code in RStudio I do not get an error and the code runs as expected. I'd rather use VSCode and figure out the issue. When searching for an answer online, I've seen cases where Unicode characters or actual extra punctuation caused errors like this. However, as far as I'm aware, neither is the issue here, and I don't know how to fix it. Any help would be greatly appreciated. Thank you.
Edit: There is one more detail I forgot to mention above: the summarise call works if it is written on one line:
ratings <- df %>%
dplyr::group_by(rating) %>%
dplyr::summarise(compound = mean(compound), neg = mean(neg), neu = mean(neu), pos = mean(pos))
If you are using radian as your R terminal in VS Code, you need to add the following to your user settings.json:
"r.bracketedPaste": true
There are some other useful options in this Medium post.
Instructions about how to access your settings.json are set out in the VS Code docs, as well as further information about user and workspace settings.
This is a weird one and I am hoping someone can figure it out. I have written a function that uses googlesheets4 and googledrive. One thing I'm trying to do is move a Google Drive document (a spreadsheet) from the base folder to a specified folder. I had this working perfectly yesterday, so I don't know what happened; it just didn't work when I came in this morning.
The weird thing is that if I step through the function, it works fine. It's just when I run the function all at once that I get the error.
I am using a folder ID instead of a name and using drive_find to get the correct folder ID. I am also using a sheet ID instead of a name. The folder already exists and like I said, it was working yesterday.
outFolder <- 'exact_outFolder_name_without_slashes'
createGoogleSheets <- function(
outFolder
){
folder_id <- googledrive::drive_find(n_max = 10, pattern = outFolder)$id
data <- data.frame(Name = c("Sally", "Sue"), Data = c("data1", "data2"))
sheet_id <- NA
nameDate <- NA
tempData <- data.frame()
for (i in 1:nrow(data)){
nameDate <- data[i, "Name"]
tempData <- data[i, ]
googlesheets4::gs4_create(name = nameDate, sheets = list(sheet1 = tempData)
sheet_id <- googledrive::drive_find(type = "spreadsheet", n_max = 10, pattern = nameDate)$id
googledrive::drive_mv(file = as_id(sheet_id), path = as_id(folder_id))
} # end 'for'
} # end 'function'
I don't think this will be a reproducible example. The offending code is within the for loop that is within the function and it works fine when I run through it step by step. folder_id is defined within the function but outside of the for loop. sheet_id is within the for loop. When I move folder_id into the for loop, it still doesn't work although I don't know why it would change anything. These are just the things I have tried. I do have the proper authorization for google drive and googlesheets4 by using:
googledrive::drive_auth()
googlesheets4::gs4_auth(token = drive_token())
<error/rlang_error>
Error in as_parent():
! Parent specified via path is invalid:
x Does not exist.
Backtrace:
global createGoogleSheets(inputFile, outPath, addNames)
googledrive::drive_mv(file = as_id(sheet_id), path = as_id(folder_id))
googledrive:::as_parent(path)
Run rlang::last_trace() to see the full context.
Backtrace:
x
-global createGoogleSheets(inputFile, outPath, addNames)
-googledrive::drive_mv(file = as_id(sheet_id), path = as_id(folder_id))
\-googledrive:::as_parent(path)
\-googledrive:::drive_abort(c(invalid_parent, x = "Does not exist."))
\-cli::cli_abort(message = message, ..., .envir = .envir)
\-rlang::abort(message, ..., call = call, use_cli_format = TRUE)
I have tried changing folder_id to the exact path of my Google Drive, W:/My Drive..., and got the same error. I should mention that I have also tried deleting the folder and re-creating it fresh.
Anybody have any ideas?
Thank you in advance for your help!
I can't comment because I don't have the reputation yet, but I believe you're missing a parenthesis in your for-loop.
You need that SECOND parenthesis below:
for (i in 1:nrow(tempData) ) {
...
}
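Separately from the parenthesis, the `Does not exist` message would also fit `folder_id` or `sheet_id` being empty or holding several matches, since `drive_find(pattern = ...)` can return zero or many rows. That is a guess, not something the traceback confirms. A minimal sketch of a check, with `require_single_id` as a hypothetical helper (it is not part of googledrive):

```r
# Hypothetical helper: stop with a readable message unless a lookup
# returned exactly one non-missing id.
require_single_id <- function(ids, what) {
  if (length(ids) != 1 || is.na(ids)) {
    stop(sprintf("Expected exactly one %s id, got %d", what, length(ids)))
  }
  ids
}

# Intended use inside the function (not run here, needs Drive access):
# folder_id <- require_single_id(
#   googledrive::drive_find(n_max = 10, pattern = outFolder)$id, "folder")
```

If the check fails only when the whole function runs at once, that would point at the lookup itself rather than at `drive_mv`.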
(I'm not sure if this is an R or a shell issue, so forgive me for adding both tags. If you think I should remove one, please comment and I'll do so.)
I have an Amazon-hosted version of R at rstudio.example.com. I have written two scripts, and they both run fine when I source them from within the RStudio interface.
When I ssh into my scripts directory and run them from there, the scripts generate some errors.
The purpose of the first script is to run qdap::check_spelling on a column of text in a data frame, then get the frequency of each spelling error along with an example of the misspelt word:
library(tidyverse)
library(qdap)
# example data
exampledata <- data.frame(
id = 1:5,
text = c("cats dogs dgs cts oranges",
"orngs orngs cats dgs",
"bannanas, dogs",
"cats cts dgs bnnanas",
"ornges fruit")
)
# check for unique misspelt words using qdap
all.misspelts <- check_spelling(exampledata$text) %>% data.frame %>% select(row:not.found)
unique.misspelts <- unique(all.misspelts$not.found)
# for each misspelt word, get the first instance of it appearing for context/example of word in a sentence
contexts.misspellts.index <- lapply(unique.misspelts, function(x) {
filter(all.misspelts, grepl(paste0("\\b",x,"\\b"), not.found))[1, "row"]
}) %>% unlist
# join it all together in a data frame to write to a csv
contexts.misspelts.vector <- exampledata[contexts.misspellts.index, "text"]
freq.misspelts <- table(all.misspelts$not.found) %>% data.frame() %>% mutate(Var1 = as.character(Var1))
misspelts.done <- data.frame(unique.misspelts, contexts.misspelts.vector, stringsAsFactors = F) %>%
left_join(freq.misspelts, by = c("unique.misspelts" = "Var1")) %>% arrange(desc(Freq))
write.csv(x = misspelts.done, file="~/csvs/misspelts.example_data_done.csv", row.names=F, quote=F)
The final data frame looks like:
> print(misspelts.done)
unique.misspelts contexts.misspelts.vector Freq
1 dgs cats dogs dgs cts oranges 3
2 cts cats dogs dgs cts oranges 2
3 orngs orngs orngs cats dgs 2
4 bannanas bannanas, dogs 1
5 bnnanas cats cts dgs bnnanas 1
6 ornges ornges fruit 1
When I run this on my cloud instance of RStudio, it runs with no issues and a CSV file is generated in the directory specified on the last line of code.
When I run this from the Linux shell I get:
myname#ip-10-0-0-38:~$ r myscript.R
ident, sql
During startup - Warning message:
Setting LC_CTYPE failed, using "C"
During startup - Warning message:
Setting LC_CTYPE failed, using "C"
During startup - Warning message:
Setting LC_CTYPE failed, using "C"
During startup - Warning message:
Setting LC_CTYPE failed, using "C"
During startup - Warning message:
Setting LC_CTYPE failed, using "C"
During startup - Warning message:
Setting LC_CTYPE failed, using "C"
During startup - Warning message:
Setting LC_CTYPE failed, using "C"
During startup - Warning message:
Setting LC_CTYPE failed, using "C"
Error in grepl(paste0("\\b", x, "\\b"), not.found) :
object 'not.found' not found
In addition: Warning message:
In data.matrix(data) : NAs introduced by coercion
myname#ip-11-0-0-28:~/rscripts$
It looks like a problem with my grepl() call, but it works fine when running within RStudio, just not when calling the script from the shell.
I'm also getting other errors in a separate script, based on a dplyr verb (filter).
If anyone recognizes this issue please help! If any more information is required please let me know and I'll add.
P.S. I tried running the script in my shell locally and it worked. Could this be an issue with my Amazon server?
Try running the file from the shell with redirection:
shell$ r < input.R > output.CSV
I am not sure whether this works with R, but you can try!
Through trial and error I found that prefixing each function with its package name, e.g. dplyr::select(), solved this problem. I don't know why, and I wish I understood. This was only necessary when calling the script over ssh with r myscript.R. In every other environment I tested (local terminal, local RStudio instance, hosted RStudio instance), the prefix was not needed.
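A possible explanation, stated as an assumption: the `ident, sql` line in the shell output looks like the tail of a "masked from" message, which would mean the packages attach differently in that session and a same-named function from another package is found first. A small base-R sketch to check which package a bare name resolves to (`provider_of` is a hypothetical helper):

```r
# Hypothetical helper: report the namespace that a bare function name
# currently resolves to on the search path.
provider_of <- function(fn_name) {
  fn <- get(fn_name, mode = "function")
  environmentName(environment(fn))
}

# sd() lives in the stats package unless something masks it
provider_of("sd")

# In the failing session, provider_of("filter") or provider_of("select")
# reporting something other than "dplyr" would point at masking.
```

Writing `dplyr::select()` skips the search path entirely, which would be consistent with the prefix fixing the problem.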
I am going to download the 2005 10-Ks for several corporations in R using the EDGAR package. I have a mini loop to test which is working:
for (CIK in c(789019, 777676, 849399)){
getFilings(2005,CIK,'10-K')
}
However each time this runs I get a yes/no prompt and I have to type 'yes':
Total number of filings to be downloaded=1. Do you want to download (yes/no)? yes
Total number of filings to be downloaded=1. Do you want to download (yes/no)? yes
Total number of filings to be downloaded=1. Do you want to download (yes/no)? yes
How can I make R answer 'yes' automatically on each run? Thank you.
Please remember to include a minimal reproducible example in your question, including library(...) and all other necessary commands:
library(edgar)
report <- getMasterIndex(2005)
We can bypass the prompt by doing some code surgery. Here, we retrieve the code for getFilings, and replace the line that asks for the prompt with just a message. We then write the new function (my_getFilings) to a temporary file, and source that file:
x <- capture.output(dput(edgar::getFilings))
x <- gsub("choice <- .*", "cat(paste(msg3, '\n')); choice <- 'yes'", x)
x <- gsub("^function", "my_getFilings <- function", x)
writeLines(x, con = tmp <- tempfile())
source(tmp)
Everything downloads fine:
for (CIK in c(789019, 777676, 849399)){
my_getFilings(2005, CIK, '10-K')
}
list.files(file.path(getwd(), "Edgar filings"))
# [1] "777676_10-K_2005" "789019_10-K_2005" "849399_10-K_2005"
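The same technique on a toy function, so it can be tried without downloading anything. This is only a sketch: `ask_then_double` and `auto_double` are hypothetical names, and the real `getFilings` body may differ between versions of edgar, in which case the `gsub` pattern would need adjusting.

```r
# Hypothetical stand-in for a function that stops to ask a question
ask_then_double <- function(n) {
  choice <- readline("Proceed (yes/no)? ")
  if (choice == "yes") n * 2 else NA
}

# Deparse the function to text, one element per line
x <- capture.output(dput(ask_then_double))
# Replace the prompting line with a hard-coded answer
x <- gsub("choice <- .*", "choice <- 'yes'", x)
# Give the patched copy a new name
x <- gsub("^function", "auto_double <- function", x)
# Write the text to a temporary file and evaluate it to define auto_double()
tmp <- tempfile(fileext = ".R")
writeLines(x, con = tmp)
source(tmp, local = TRUE)

auto_double(21)
```

The patched copy never calls `readline`, so it can run unattended inside a loop. The original function is left untouched.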
I often see R posts where there is a paste of the output of someone's data, not using a dput(). Sometimes I see people use
data_in <- read.table("clipboard")
which on my OS X machine results in
data_in <- read.table("clipboard")
Error in file(file, "rt") : cannot open the connection
In addition: Warning message:
In file(file, "rt") : clipboard cannot be opened or contains no text
I found some previous answers here and here, but with one the copy doesn't work on the way in, and with the other readLines leads to runaway R sessions for me; both are documented below. I have worked through how to get text to and from the clipboard in OS X, which might be useful to others, but I am hoping there are better methods or that more finesse is possible:
# some test data
data <- rbind(c(1,1,2,3), c(1,1, 3, 4), c(1,4,6,7))
str <- "Here is a special string \n\r with \t many üñåé tokens"
# a test input set of numbers to copy to your clipboard if you have nothing to hand
# [10:17:55, 10:37:40, 10:40:26, 10:48:18, 11:00:17, 11:01:12, 11:06:58, 11:09:20, 11:43:41, 11:48:24, 11:49:14, 12:07:31, 12:10:52, 12:10:52, 12:19:00, 12:19:00, 12:19:43, 12:20:55, 12:38:27, 12:38:27, 12:55:09, 12:55:10, 12:57:31, 12:57:31, 13:04:16, 13:04:16, 13:06:51 13:06:51, 14:55:06, 14:56:10, 15:01:30, 15:28:42, 3:29:17, 15:35:33, 15:58:32, 16:05:07, 16:09:16, 16:10:36, 16:32:57, 16:32:57, 16:34:32, 16:38:16, 17:43:27, 17:53:01, 17:56:14, 18:08:21, 18:17:23, 18:37:23, 18:37:23, 18:43:13, 18:43:13 18:51:43, 18:51:43, 19:05:39, 19:05:39]
# Input works reasonably well for tables and text
cb_handle <- pipe("pbcopy", "w")
write.table(data, file=cb_handle)
close(cb_handle)
cb_handle <- pipe("pbcopy", "w")
write(str, file = cb_handle)
close(cb_handle)
# DO NOT USE THIS ONE as it leads to a runaway R process
cb_handle <- pipe("pbcopy", "r")
read.table(cb_handle)
# This reads in the contents but leaves cleanup to do if not really a table
cb_handle <- pipe("pbpaste")
data_in <- read.table(cb_handle)
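One way to get a little more finesse, offered as a sketch rather than a definitive answer: read the clipboard as plain lines first and only then decide how to parse them, picking the source by platform. `clipboard_source` and `read_clipboard_lines` are hypothetical helpers; `pbpaste` is the OS X command used above, and `"clipboard"` is the connection name that `read.table("clipboard")` relies on under Windows.

```r
# Hypothetical helper: choose how to reach the clipboard on this platform
clipboard_source <- function(sysname = Sys.info()[["sysname"]]) {
  switch(sysname,
    Darwin  = "pbpaste",
    Windows = "clipboard",
    stop("No clipboard source known for: ", sysname)
  )
}

# Hypothetical helper: return the clipboard contents as a character vector
read_clipboard_lines <- function(sysname = Sys.info()[["sysname"]]) {
  src <- clipboard_source(sysname)
  con <- if (src == "pbpaste") pipe("pbpaste") else file(src)
  on.exit(close(con))
  readLines(con, warn = FALSE)
}

# Tabular text can then be parsed from the lines, for example:
# data_in <- read.table(text = read_clipboard_lines())
```

Keeping the raw lines separate from the parsing step means non-tabular text, such as the list of times above, can be cleaned up before it is handed to `read.table`.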