Connect R to POP Email Server (Gmail)

Is it possible to have R connect to Gmail's POP server and read/download the messages in a specific folder of mine? I have been storing emails and would like to go back and start to analyze subject lines, etc.
Basically, I need a way to export a folder in my Gmail account, and I would like to do this programmatically if at all possible.
Thanks in advance!

I am not sure that this can be done via a single command. Maybe there is a package out there that I am not aware of which can accomplish that, but until you find one, the following process might be a solution ...
Consider got-your-back (http://code.google.com/p/got-your-back/wiki/GettingStarted#Step_4%3a_Performing_A_Backup) which "is a command line tool that backs up and restores your Gmail account".
You can invoke it like this (given that python is available on your machine):
python gyb.py --email foo@bar.com --search "from:pip@pop.com" --folder "mail_from_pip"
After completion you'll find all the emails matching the --search in the specified --folder, along with a sqlite database. (posted by dukedave, Dec 4 '11)
So depending on your OS you should be able to invoke the above command from within R and then access the downloaded mails in the respective folder.
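For instance, a minimal sketch of that process from R, assuming python and gyb.py are reachable from the working directory (the addresses are the placeholders from the command above, and the sqlite file name is an assumption; inspect the folder to confirm what gyb actually wrote):
system('python gyb.py --email foo@bar.com --search "from:pip@pop.com" --folder "mail_from_pip"')

# The backup folder also contains a sqlite index that RSQLite can open:
library(DBI)
library(RSQLite)
con <- dbConnect(SQLite(), "mail_from_pip/msg-db.sqlite")  # file name assumed
dbListTables(con)   # check which tables gyb populated
dbDisconnect(con)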

GotYourBack is a good backup utility, but for downloading metadata for analysis, you might want something that doesn't first require you to fetch the entire content of all your email.
I've recently used the gmailr package to do a similar analysis.
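For a subject-line analysis that stays entirely within R, a hedged sketch along those lines with gmailr (>= 1.0); the "credentials.json" OAuth client file and the label are placeholders you would substitute:
library(gmailr)
gm_auth_configure(path = "credentials.json")  # OAuth client from the Google API console
gm_auth()

# Gmail folders are labels; search for the one you have been storing mail in.
msgs <- gm_messages(search = "label:stored-mail", num_results = 100)
ids  <- gm_id(msgs)

# Fetch each message and keep only its subject line.
subjects <- vapply(ids, function(id) gm_subject(gm_message(id)), character(1))
head(subjects)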

Related

How do I make sure my attached R session is using the compute node instead of the admin (i.e. login) node?

I am using VS Code for R in a remote Unix environment. My goal is to run regular interactive jobs while editing the script on the remote server, the way people usually work in RStudio locally.
The HPC server I use has an admin node (i.e. login node) and a compute node (mostly for interactive jobs).
Usually I log in via the admin node first (via ssh), request resources (e.g. memory, CPUs) from the compute node, and then run
ssh $SLURM_JOB_NODELIST
which moves me from the 'admin' node to the 'compute' node in the terminal.
Lastly, I run "R: Create R terminal". However, I cannot tell whether this R terminal is running on the compute node or the admin node.
There is a workaround: use the 'radian' console and set "r.alwaysUseActiveTerminal" to "true". However, that way the data viewer is not attached and I cannot view my data in the 'workspace'.
The trickiest part is that I need to use 'ssh' to switch between the 'admin' and 'compute' nodes, while the whole left panel of VS Code, including the File Viewer, is still based on the 'admin' node.
Any suggestions and advice are welcome! Thanks a lot!
In R, use Sys.info() to find the hostname of the computer R is running on:
> Sys.info()["nodename"]
nodename
"node002.cluster"

Send file via SCP in R

I would like to copy a file from my computer to a remote server via SCP using R.
I have found 2 functions that appear to satisfy this partially.
1.
Using ssh.utils
ssh.utils::cp.remote(path.src = "~/myfile.txt",
                     remote.dest = "username@remote",
                     remote.src = "", path.dest = "~/temp", verbose = TRUE)
I've noticed that with this method, if I need to enter a password (when the remote doesn't have my public key), the function produces an error.
2.
Using RCurl:
RCurl appears to have more robust functionality in the scp() function, but, from what I can tell, it is only for copying a file from a remote to my local machine. I would like to do the opposite.
Is there another way to use these functions or is there another function that would be able to copy a file from my local machine to a remote machine via SCP?
One approach to address the need to enter a password interactively is to use sshpass (see https://stackoverflow.com/a/13955428/6455166) in a call to system, e.g.
system('sshpass -p "password" scp ~/myfile.txt username@remote.com:/some/remote/path')
See the linked answer above for more details, including options to avoid embedding the password in the command.
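Wrapped up as a small R helper, hedged on sshpass being installed and the password living in an environment variable rather than in the script (the function name and variable are illustrative):
upload_via_scp <- function(local, remote_dest,
                           password = Sys.getenv("SCP_PASSWORD")) {
  # shQuote() guards against spaces and shell metacharacters in the inputs.
  cmd <- sprintf("sshpass -p %s scp %s %s",
                 shQuote(password), shQuote(local), shQuote(remote_dest))
  status <- system(cmd)
  if (status != 0) stop("scp failed with exit status ", status)
  invisible(status)
}

# Usage:
# upload_via_scp("~/myfile.txt", "username@remote.com:/some/remote/path")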

HTTPS credentials: obfuscate console or pop-up window input

Inspired by this awesome post on a Git branching model and this one on what a version bumping script actually does, I went about creating my own Git version bumping routine which resulted in a little package called bumpr.
However, I don't like the current way of handling (GitHub) HTTPS credentials. I'm using the solution stated in this post and it works great, but I don't like the fact that I need to store my credentials in plain text in this _netrc file.
So I wondered:
1.
whether one could obfuscate console input when prompting via readline(), scan() or the like, much as the Git shell does. See the code of /R/bump.r at line 454:
input <- readline(paste0("Password for 'https://",
                         git_user_email, "@github.com': "))
idx <- ifelse(grepl("\\D", input), input, NA)
if (is.na(idx)) {
  message("Empty password")
  message("Exiting")
  return(character())
}
git_https_password <- input
2.
how RStudio makes an "Insert credentials" box pop up when pushing to a remote Git repository, and how it obfuscates the password entry.
3.
whether the _netrc file is something specific to the GitHub API or whether it works for HTTPS requests in general.
Git has a mechanism to store, cache or prompt for credentials. Please read http://git-scm.com/docs/gitcredentials.
Within a script, you can use the git credential command to access it: http://git-scm.com/docs/git-credential
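From R, that could look roughly like the sketch below (the host and username are placeholders). `git credential fill` reads key=value pairs on stdin and either returns a stored password or prompts for one, so nothing has to sit in a plain-text _netrc:
get_git_password <- function(host = "github.com", user = "octocat") {
  query <- c("protocol=https", paste0("host=", host),
             paste0("username=", user), "")      # blank line ends the query
  out <- system2("git", c("credential", "fill"), input = query, stdout = TRUE)
  # Output is key=value lines; extract the password field.
  kv   <- strsplit(out, "=", fixed = TRUE)
  vals <- setNames(vapply(kv, `[`, "", 2L), vapply(kv, `[`, "", 1L))
  unname(vals["password"])
}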

Authenticating Service Accounts on Google Compute Engine with BigQuery, via RStudio Server

I'm looking to call BigQuery from RStudio, installed on a Google Compute Engine instance.
I have the bq python tool installed on the instance, and I was hoping to use its service accounts and system() to get R to call the bq command line tool and so fetch the data.
However, I run into authentication problems where it asks for a browser key. I'm pretty sure there is no need for the key because of the service account, but I don't know how to construct the authentication from within R (it runs on RStudio Server, so it will have multiple users).
I can get an authentication token like this:
library(RCurl)
library(RJSONIO)
metadata <- getURL('http://metadata/computeMetadata/v1beta1/instance/service-accounts/default/token')
tokendata <- fromJSON(metadata)
tokendata$access_token
But how do I then use this to generate a .bigqueryrc token? It's the lack of this that triggers the authentication attempt.
This works ok:
system('/usr/local/bin/bq')
showing me bq is installed ok.
But when I try something like:
system('/usr/local/bin/bq ls')
I get this:
Welcome to BigQuery! This script will walk you through the process of initializing your .bigqueryrc configuration file.
First, we need to set up your credentials if they do not already exist.
******************************************************************
** No OAuth2 credentials found, beginning authorization process **
******************************************************************
Go to the following link in your browser:
https://accounts.google.com/o/oauth2/auth?scope=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fbigquery&redirect_uri=urn%3Aietf%3Awg%3Aoauth%3A2.0%3Aoob&response_type=code&client_id=XXXXXXXX.apps.googleusercontent.com&access_type=offline
Enter verification code: You have encountered a bug in the BigQuery CLI. Google engineers monitor and answer questions on Stack Overflow, with the tag google-bigquery: http://stackoverflow.com/questions/ask?tags=google-bigquery
etc.
Edit:
I have managed to get bq functioning from RStudio system() commands by skipping the authentication: I logged in to the terminal as the RStudio user, authenticated there by signing in via the browser, then logged back into RStudio and called system("bq ls") etc. So this is enough to get me going :)
However, I would still prefer it if BigQuery could be authenticated within RStudio itself, as many users may log in and I'd need to authenticate via the terminal for all of them. The service account documentation, and the fact that I can get an authentication token, hint that this should be easier.
For the time being, you need to run 'bq init' from the command line to set up your credentials prior to invoking bq from a script in GCE. However, the next release of bq will include support for GCE service accounts via a new --use_gce_service_account flag.
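In the meantime, one hedged alternative is to skip bq and .bigqueryrc altogether and reuse the metadata-server token from the question directly against the BigQuery REST API ('my-project' is a placeholder project id):
library(RCurl)
library(RJSONIO)

metadata <- getURL('http://metadata/computeMetadata/v1beta1/instance/service-accounts/default/token')
token    <- fromJSON(metadata)$access_token

# List datasets in a project; the same Bearer header works for other calls too.
datasets <- getURL('https://www.googleapis.com/bigquery/v2/projects/my-project/datasets',
                   httpheader = c(Authorization = paste('Bearer', token)))
fromJSON(datasets)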

Programmatically open an email from a POP3 account and extract an attachment

We have a vendor that sends CSV files as email attachments. These CSV files contain statuses that are imported into our application. I'm trying to automate the process end-to-end, but it currently depends on someone opening an email and saving the attachment to a server share so that the application can use the file.
Since I cannot convince the vendor to change their process, such as offering an FTP location or a Web Service, I'm stuck with trying to automate the existing process.
Does anyone know of a way to programmatically open an email from a POP3 account and extract an attachment? The preferred solution would reside on a Windows 2003 server, be written in VB.NET, and be secure. The application can reside on the same server as the POP3 server; for example, we could set up the free POP3 server that comes with Windows Server and pull against the mail file stored on the file system.
BTW, we are willing to pay for an off-the-shelf solution, if one exists.
Note: I did look at this question but the answer points to a CodeProject solution that doesn't deal with attachments.
Try the Mail.dll email component; it's very affordable, supports attachments and national characters, is easy to use, and also supports SSL:
Using pop3 As New Pop3()
    pop3.Connect("mail.server.com")
    pop3.Login("user", "password")
    Dim builder As New MailBuilder()
    For Each uid As String In pop3.GetAll()
        ' Receive email message
        Dim mail As IMail = builder.CreateFromEml(pop3.GetMessageByUID(uid))
        ' Write out received message subject
        Console.WriteLine(mail.Subject)
        ' Here you can use the mail.Attachments collection
        For Each attachment As MimeData In mail.Attachments
            Console.WriteLine(attachment.FileName)
            attachment.Save("c:\" + attachment.FileName)
            ' You can also use attachment.Data here
        Next attachment
    Next
    pop3.Close(True)
End Using
You can download it here: http://www.lesnikowski.com/mail.
Possible duplicate of Reading Email using Pop3 in C#.
At least there's a shedload of suggestions there that you may find useful.
I'll throw in a late suggestion for a more generalized "download POP3 messages and extract attachments" solution using existing software and minimal programming. I needed to do this for a client who switched to receiving faxes via email and was not pleased with manually saving the attachments to a location where they could be imported into an application.
For downloading messages on *nix systems fetchmail seems to be the standard and is very capable, but I chose mpop for both simplicity and Windows compatibility (but it is cross-platform). If mpop hadn't done the trick for me, I probably would have ended up doing something with the Python-based getmail, which was created when fetchmail's development stalled for a time (it's since resumed).
Mpop is controlled either via command line or configuration file, so I simply created multiple configuration files and specify via command line which file to load. I'm using it in "Exchange pickup directory" mode, which means it simply downloads the messages and drops them as text (.eml) files in a specified directory.
For extraction of the message attachments, UUDeview appears to be the standard (I'm using the Windows port of UUDeview) across just about any system you could want with just about any features you could want. My main alternative to this was a much-less-capable Python script that I'd developed for a different client back in 2007, but I'm happy to go with a precompiled executable over either installing Python or packaging with any of the Python-to-exe options.
Finally there's the configuration - along with the two mpop configuration files mentioned above (which I could do away with by using command-line options), I also have two 2-line .cmd files launched every 10 minutes by scheduled task: the first line launches mpop to download into a working directory, and the second launches UUDeview to extract attachments of the specified types (.pdf or .tif) and then delete each file from which it extracted attachments. Output is sent to another directory from which staff can directly attach files as needed.
This is overall not the most elegant way to reach these ends, but it was quick, simple, functional and reasonably robust - at each stage if something goes wrong it fails such that no data is lost. The only places where data could be lost are any non-attachment messages being sent to the dedicated fax email addresses, and even those will sit in the processing directory and be caught eventually.
