Getting value rather than formula in RGoogleDocs

Is there an easy way to read the values of cells rather than their formulas?
Incidentally, I only get this problem in a spreadsheet that I have published, not in spreadsheets that are private.
For instance, for a cell whose value was created by simply copying the value of the cell immediately to its left, I would prefer to get that value rather than =RC[-1].
When I export the Google spreadsheet as a CSV, this does not happen.
I am using the following line of code in R:
y2009 <- sheetAsMatrix(ts2$y2009, header = TRUE, as.data.frame = TRUE, trim = TRUE)

If you haven't stumbled upon this webpage, it looks useful (I haven't tried it myself...)
http://blog.revolutionanalytics.com/2009/09/how-to-use-a-google-spreadsheet-as-data-in-r.html
It also looks like the package that used to be used for this (RGoogleData) is still being maintained.
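Since your problem only shows up with the published sheet, the CSV trick from that post might sidestep it entirely: read the sheet's CSV export straight into R, so you get values rather than formulas. A sketch, where the key is a placeholder for your own sheet's key:
url <- "http://spreadsheets.google.com/pub?key=YOUR_KEY&output=csv"  # placeholder key
y2009 <- read.csv(url, header = TRUE)  # the CSV export contains values, not formulas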
Good Luck!

Not a clever solution, but evaluating the formulas does work, for example with the following function:
getValues <- function(x) {
  m <- apply(x, 2, as.character)
  for (i in 1:nrow(m)) {
    # find cells that still hold a relative-reference formula like "=RC[-1]"
    formulas <- which(substr(m[i, ], 1, 4) == "=RC[")
    # extract the column offset from the formula text
    offset <- sub("=RC[", "", m[i, formulas], fixed = TRUE)
    offset <- as.numeric(sub("]", "", offset, fixed = TRUE))
    # replace each formula with the value of the cell it points to
    m[i, formulas] <- m[i, formulas + offset]
  }
  return(m)
}
getValues(y2009) should return a character matrix with all required values in place of the formulas. I know this is quite a dumb "solution" with a lot of compromises, but I hope you can code a much cleaner function for the task! :)

The original question was written in September 2009. At the time I was using RGoogleDocs to read Google spreadsheets. In the last few months I discovered an actively developed and maintained package called "googlesheets: Manage Google Spreadsheets from R" by Jennifer Bryan and Joanna Zhao in British Columbia, Canada. At first it was only on GitHub; now it is in the CRAN repository. It works well, it is good for both reading and writing, and it does not expose one's Google password. All problems that I previously had with RGoogleDocs have been made irrelevant by googlesheets.
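For anyone landing here now, reading a worksheet with googlesheets looks roughly like the sketch below; the sheet title "my sheet" and worksheet name "y2009" are placeholders for your own.
library(googlesheets)
gs_auth()                            # browser-based OAuth, so no password in the script
ss <- gs_title("my sheet")           # register a sheet by its title
y2009 <- gs_read(ss, ws = "y2009")   # read one worksheet as a data frame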


R packages for MS-SSIM?

I wanted to do some simple image comparisons with R (the reason I am not using Python is that the workflow is in R). I searched for MS-SSIM implementations in R packages but did not find any except for spatialcompare::msssim. However, as I mentioned in my post yesterday, I figured out that the result of this function might not be correct for my input (could it be related to the matrix-to-raster transformation?). Are there any other suggestions for appropriate MS-SSIM code? I actually implemented one myself based on SpatialPack::SSIM, because it seems easy to just downsample the image again and again, but I am not sure if I wrote it correctly. I'll put it as an answer.
My demo code for a very simple MS-SSIM:
calcMSSSIM <- function(Mat1, Mat2) {
  weight <- c(0.0448, 0.2856, 0.3001, 0.2363, 0.1333)
  level <- 5
  cs <- numeric(level)
  tmp1 <- Mat1
  tmp2 <- Mat2
  ssim <- SpatialPack::SSIM(tmp1, tmp2, alpha = 1, beta = 1, gamma = 1,
                            eps = c(0.01, 0.03), L = max(tmp1, tmp2))
  # contrast * structure at the finest scale
  cs[1] <- ssim$comps["contrast"] * ssim$comps["structure"]
  result <- abs(cs[1])^weight[1]
  for (i in 2:level) {
    # halve the resolution at each scale
    tmp1 <- EBImage::resize(tmp1, w = dim(tmp1)[1] / 2, filter = "none")
    tmp2 <- EBImage::resize(tmp2, w = dim(tmp2)[1] / 2, filter = "none")
    ssim <- SpatialPack::SSIM(tmp1, tmp2, alpha = 1, beta = 1, gamma = 1,
                              eps = c(0.01, 0.03), L = max(tmp1, tmp2))
    cs[i] <- ssim$comps["contrast"] * ssim$comps["structure"]
    result <- result * abs(cs[i])^weight[i]
  }
  # luminance enters only at the coarsest scale
  lum <- ssim$comps["luminance"]
  result <- result * lum^weight[level]
  return(list(cs, result))
}
Is it correct?

For loop data frame error when working with character vectors

I am currently doing data science with R, and I generally write loops to access multiple files or objects at once. Normally this goes without any problems, but recently a problem occurred when I tried to run the following code:
setwd(PROJECT_FOLDER)
climate_forcing <- c("cf-1", "cf-2", "cf-3", "cf-4")
# load all mean stacks from IM and create raster stacks
for (i in 1:NROW(climate_forcing)) {
  setwd(PROJECT_FOLDER)
  setwd(paste0("time frames mcor/X variable/IM/", climate_forcing[i], "/ncstack/"))
  file.names <- list.files(pattern = ".nc", recursive = TRUE, full.names = FALSE)  # list all files with ".nc"
  stopwords <- c(".nc", "stack", "/dLAI")  # substrings to strip from the file names
  names.short <- gsub(paste(stopwords, collapse = "|"), "", file.names)
  names.short <- paste0(names.short, climate_forcing[i])
  for (j in 1:NROW(file.names)) {
    assign(paste0(names.short[j], "_stack"), stack(file.names[j]))
  }
}
Error message returned:
Error in data.frame(values = unlist(unname(x)), ind, stringsAsFactors = FALSE) :
arguments imply differing number of rows: 1, 0
I wrote this a while ago and I think it used to work, since the files created by a similar script are there.
Anyway, I did some testing and it seems that the error occurs in the inner for loop (the one over j). I am unsure what causes this bug, but it has something to do with file.names and names.short, right? When I compare them, their properties appear to be identical, which I expected, since I create the latter from the former. The reason I create them like this is that I want to create objects reading out the corresponding files of file.names.
The error refers to data.frame, which confuses me because I'm working with character vectors here.
Maybe somebody with more experience can figure this issue out.
Thanks for any help; if there are any questions I will try to answer them.
Alright, it turns out something was wrong with the R packages. I reinstalled and reloaded them (raster) and now it works. Thanks to everyone for your contributions!
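As an aside, a common way to avoid assign() in loops like this is to collect the results in a named list instead. A sketch of the inner loop, assuming the raster package is loaded:
stacks <- list()
for (j in 1:NROW(file.names)) {
  # store each raster stack under its short name instead of
  # creating a separate global object with assign()
  stacks[[paste0(names.short[j], "_stack")]] <- stack(file.names[j])
}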

For R: How to exclude some data files based on file language

I'm rather new to R (and in law school, so this is all very new to me), so apologies if this is poorly worded.
I have a series of about 1500 documents that I am importing into R to categorize and analyze later. The first thing I need to do is exclude all documents written in French, which are labelled with an "FR" in the title/doc info. I am curious what kind of code I could use to exclude them before importing the files, so that I have a clean data set before analyzing anything (since the French text will obviously make a mess of processes like sentiment analysis).
Any help is appreciated (even if that help is explaining how to better talk about coding).
Kind regards!
edit 1
The code that I am using is readtext(folder), which you can see below:
library(readtext)  # readtext() comes from the readtext package

folder <- "C:/[pathway]"
submissions <- readtext(folder)
submissions_text <- submissions$text
submission_number <- numeric()
submission_person <- factor()
submission_code <- factor()
submission_language <- factor()
submission_location <- factor()
for (submission_name in submissions$doc_id) {
  submission_name <- gsub(".txt", "", submission_name)
  number <- as.numeric(strsplit(submission_name, "_|-")[[1]][1])
  submission_number <- c(submission_number, number)
  person <- strsplit(submission_name, "_")[[1]][2]
  submission_person <- c(submission_person, person)
  code <- strsplit(submission_name, "_")[[1]][3]
  submission_code <- c(submission_code, code)
  lang <- strsplit(submission_name, "_")[[1]][4]
  submission_language <- c(submission_language, lang)
  location <- strsplit(submission_name, "_")[[1]][5]
  submission_location <- c(submission_location, location)
}
submissions <- cbind(submissions, submission_number)
submissions <- cbind(submissions, submission_person)
submissions <- cbind(submissions, submission_code)
submissions <- cbind(submissions, submission_language)
submissions <- cbind(submissions, submission_location)
submissions <- submissions[order(submissions$submission_number, decreasing = FALSE), ]
This is just the organizational aspect of my code. I am looking to hopefully exclude all of the French data before this point (but if it comes afterward, I would also be more than happy with that).
The functionality you are after can be found in the list.files() function; see ?list.files for the documentation.
In short, your code will likely end up looking something like this:
setwd("c:/path/to/your/data/here")
files <- list.files()
non_french_files <- files[!grepl("FR", files)]
lapply(non_french_files, function(x) {
f <- read.csv(x)
#do stuff with f
}]
Note - you could directly leverage the pattern parameter of list.files(), but I chose to do it in two steps in case you wanted to do something else with the French files. This also makes it clear what each line of code is doing...
...good luck and welcome to R!
Here's an alternative similar to @Chase's:
# set wd
files <- list.files()[!grepl("FR", list.files())]
lapply(files, function(x) read.csv(x))  # reads all at once; you might want to save each
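And if filtering after import works for you (you said it would), you can simply drop the French rows once the submission_language column from your edit has been added:
# keep only the submissions whose language code is not "FR"
submissions <- submissions[submissions$submission_language != "FR", ]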

Can R transform emoji characters to their text equivalents?

In my question yesterday, "Can R read html-encoded emoji characters?", user rensa noted that:
As far as I'm aware, there's no solution to printing emoji in the R console: they always come out as "\U0001f600" (or what have you). However, the packages I described above can help you plot emoji in some circumstances (I'm hoping to expand ggflags to display arbitrary full-colour emoji at some point). They can also help you search for emoji to get their codes, but they can't get names given the codes AFAIK. But maybe you could try importing the emoji list from emojilib into R and doing a join with your data frame, if you've extracted the emoji codes into a column, to get the English names.
How would this look in R?
(Note: I'm posting this question with the intention of answering it immediately, rather than posting this in the question linked above, since it's tangential to that question, but still possibly of use to others.)
The approach below works for transforming an emoji character or unicode representation into a name.
I am happy to release the code snippet below under a CC0 dedication (i.e., putting this implementation into the public domain for free reuse).
# Get the (MIT-licensed) emojilib data:
emoji_json_file <- "https://raw.githubusercontent.com/muan/emojilib/master/emojis.json"
json_data <- rjson::fromJSON(paste(readLines(emoji_json_file), collapse = ""))

get_name_from_emoji <- function(emoji_unicode, emoji_data = json_data) {
  # turn an escaped representation like "\\U0001f917" into the actual character
  emoji_evaluated <- stringi::stri_unescape_unicode(emoji_unicode)
  # named vector mapping each emoji name to its character
  vector_of_emoji_names_and_characters <- unlist(
    lapply(emoji_data, function(x) {
      x$char
    })
  )
  # look up the character and return the name it is stored under
  name_of_emoji <- attr(
    which(vector_of_emoji_names_and_characters == emoji_evaluated)[1],
    "names"
  )
  name_of_emoji
}
get_name_from_emoji("\\U0001f917")
# [1] "hugs"
get_name_from_emoji("🤗") # An attempt actually pasting the hugs emoji in also works.
# [1] "hugs"

Is there a built-in function for sampling a large delimited data set?

I have a few large data files I'd like to sample when loading into R. I can load the entire data set, but it's really too large to work with; sample does roughly the right thing, but I'd like to take random samples of the input while reading it.
I can imagine how to build that with a loop and readLines and what-not, but surely this has been done hundreds of times.
Is there something in CRAN or even base that can do this?
You can do that in one line of code using sqldf. See part 6e of example 6 on the sqldf home page.
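The idea boils down to something like the sketch below, where "big.csv" is a placeholder file name and SQLite's random() does the sampling while the file is read:
library(sqldf)
df <- read.csv.sql("big.csv",
                   sql = "select * from file order by random() limit 1000")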
No pre-built facilities. Best approach would be to use a database management program. (Seems as though this was addressed in either SO or Rhelp in the last week.)
Take a look at: Read csv from specific row, and especially note Grothendieck's comments. I consider him a "class A wizaRd". He's got first-hand experience with sqldf. (The author, IIRC.)
And another "huge files" problem with a Grothendieck solution that succeeded:
R: how to rbind two huge data-frames without running out of memory
I wrote the following function that does close to what I want:
readBigBz2 <- function(fn, sample_size = 1000) {
  f <- bzfile(fn, "r")
  rv <- c()
  repeat {
    # read the file in chunks of sample_size lines
    lines <- readLines(f, sample_size)
    if (length(lines) == 0) break
    # keep one randomly chosen line from each chunk
    rv <- append(rv, sample(lines, 1))
  }
  close(f)
  rv
}
I may want to go with sqldf in the long term, but this is a pretty efficient way of sampling the file itself. I just don't quite know how to wrap it around a connection for read.csv or similar.
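One way to close that gap is to parse the sampled lines through a text connection. A sketch, assuming the first line of the file is a header (and ignoring that the header line could itself land in the sample); "data.csv.bz2" is a placeholder:
con <- bzfile("data.csv.bz2", "r")
header <- readLines(con, 1)   # grab the header line separately
close(con)
sampled <- readBigBz2("data.csv.bz2")
df <- read.csv(textConnection(c(header, sampled)), stringsAsFactors = FALSE)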
