How to use write.table in R with up/down-arrows? - r

I've got a dataframe f in R with one column called utterance which contains lines with character strings like:
~↑I don't think I can↑~ and
↓carrying↓
Whenever I'm using
write.table(f, "C:/Users/...txt", sep="\t", quote=F, row.names=F, fileEncoding = "UTF-8")
to create a table in a .txt, Up and Down arrows are given like so in the created .txt file:
<U+2191> instead of the actual ↑
<U+2193> instead of the actual ↓
~<U+2191>I don't think I can<U+2191>~
<U+2193>carrying<U+2193>
How can I fix this problem to get the actual ↑ and ↓ in the txt files by using the correct settings for write.table in R? I'm using the standard text editor of Windows10 and Notepad++.

There are some advices in the Escaping from character encoding hell in R on Windows (and all other known articles on this topic) however those do not seem to be useful for this particular case as the ↑ and ↓ characters do not come under any natural language.
Good news
Write file as UTF-8 encoding in R for Windows
… when the R writes a UTF-8 text into a file on Windows, characters of
unsupported language are modified. In contrast, all characters are
written correctly in Mac OS.
Using binary
There is a solution for this problem. Writing a binary file instead of
a text file solves this. All applications handling a UTF-8 file in
Windows are using the same trick.
BOM
The BOM should not be used in UTF-8 files. This is what the Linux and
the Mac OS are doing. But the Windows Notepad and some applications
use the BOM. So, handling the BOM is needed, in spite of grammatically
wrong.
…
Solution
arrows.html (a sample UTF-8 file, used later in 70166451.r)
<!DOCTYPE html>
<html>
<head> <meta charset="utf-8"> </head>
<body>up=↑ ↑↓ down=↓</body>
</html>
70166451.r (partially commented script):
### my circumstances
setwd("D:\\BAT\\R")
filepath = '70166451.txt'
### ↓↓↓ adapted from https://tomizonor.wordpress.com/2013/04/17/file-utf8-windows/
BOM <- charToRaw('\xEF\xBB\xBF')
writeUtf8 <- function(xstr, filepath, forappend=F, bom=F) {
openmode <- ifelse(forappend, 'ab', 'wb')
con <- file( filepath, open=openmode)
if( !forappend && bom ) writeBin(BOM, con, endian="little")
# If the connection is open it is written from its current position:
writeBin(charToRaw(xstr), con, endian="little")
close(con)
}
### ↑↑↑ adapted from https://tomizonor.wordpress.com/2013/04/17/file-utf8-windows/
### hard-coded characters ↑ and ↓
aa <- "up ↑ (↑↓) ↓ down" # unworkable? (not solved)
aa <- "up \u2191 (↑↓) \u2193 down" # unworkable! (unsolvable?)
aa <- "up \u2191 (\u2191\u2193) \u2193 down" # workable! (solved here)
# print( c( 'aa ', Encoding(aa), aa ))
# "aa " "UTF-8" "up <U+2191> (<U+2191><U+2193>) <U+2193> down"
xx <- data.frame( myword = c(aa,toupper(aa)), word = c(toupper(aa),aa))
yy <- readr::format_tsv( xx, append = F, quote_escape = "none", eol = "\r\n")
writeUtf8( yy, filepath)
### characters read from a file
library(xml2)
rawHTML <- paste(readLines("arrows.html", encoding='utf-8'), collapse=" ")
aaa <- xml_text(read_html(charToRaw(rawHTML)))
# print( c( 'aaa', Encoding(aaa), aaa ))
# "aaa" "UTF-8" "up=<U+2191> <U+2191><U+2193> down=<U+2193>"
xxx <- data.frame( myword = c(aaa,toupper(aaa)), word = c(toupper(aaa),aaa))
yyy <- readr::format_tsv( xxx, append = T, quote_escape = "none", eol = "\r\n")
writeUtf8( yyy, filepath, forappend=T)
Result (one can Copy&Paste above code snippet to an open R Console window, or save and run using Rscript.exe as shown below):
pushd D:\bat\R & del 70166451*.txt & rscript 70166451.r & type 70166451*.txt & popd
70166451.txt
myword word
up ↑ (↑↓) ↓ down UP ↑ (↑↓) ↓ DOWN
UP ↑ (↑↓) ↓ DOWN up ↑ (↑↓) ↓ down
up=↑ ↑↓ down=↓ UP=↑ ↑↓ DOWN=↓
UP=↑ ↑↓ DOWN=↓ up=↑ ↑↓ down=↓

Related

Convert Firefox bookmarks JSON file to markdown

Background
I want to show part of my bookmarks on my Hugo website. The bookmarks from Firefox can be saved in JSON format, this is the source. The result should represent the nested structure somehow, in a format of a nested list, treeview or accordion. The source files of contents on the website are written in markdown. I want to generate a markdown file from the JSON input.
As I searched for possible solutions:
treeview or accordion: HTML, CSS and Javascript needed. I could not nest accordions with the <details> tag. Also, seems like overkill at the moment.
unordered list: can be done with bare markdown.
I chose to generate an unordered nested list from JSON. I would like to do this with R.
Input/output
Input sample: https://gist.github.com/hermanp/c01365b8f4931ea7ff9d1aee1cbbc391
Preferred output (indentation with two spaces):
- Info
- Python
- [The Ultimate Python Beginner's Handbook](https://www.freecodecamp.org/news/the-python-guide-for-beginners/)
- [Python Like You Mean It](https://www.pythonlikeyoumeanit.com/index.html)
- [Automate the Boring Stuff with Python](https://automatetheboringstuff.com/)
- [Data science Python notebooks](https://github.com/donnemartin/data-science-ipython-notebooks)
- Frontend
- [CodePen](https://codepen.io/)
- [JavaScript](https://www.javascript.com/)
- [CSS-Tricks](https://css-tricks.com/)
- [Butterick’s Practical Typography](https://practicaltypography.com/)
- [Front-end Developer Handbook 2019](https://frontendmasters.com/books/front-end-handbook/2019/)
- [Using Ethics In Web Design](https://www.smashingmagazine.com/2018/03/using-ethics-in-web-design/)
- [Client-Side Web Development](https://info340.github.io/)
- [Stack Overflow](https://stackoverflow.com/)
- [HUP](https://hup.hu/)
- [Hope in Source](https://hopeinsource.com/)
Bonus preferred output: show favicons before links, like below (other suggestion welcomed, like loading them from the website's server instead of linking):
- ![https://cdn.sstatic.net/Sites/stackoverflow/Img/apple-touch-icon.png?v=c78bd457575a][Stack Overflow](https://stackoverflow.com/)
Attempt
generate_md <- function (file) {
# Encoding problem with tidyjson::read_json
bmarks_json_lite <- jsonlite::fromJSON(
txt = paste0("https://gist.githubusercontent.com/hermanp/",
"c01365b8f4931ea7ff9d1aee1cbbc391/raw/",
"33c21c88dad35145e2792b6258ede9c882c580ec/",
"bookmarks-example.json"))
# This is the start point, a data frame
level1 <- bmarks_json_lite$children$children[[2]]
# Get the name of the variable to modify it.
# Just felt that some abstraction needed.
varname <- deparse(substitute(level1))
varlevel <- as.integer(substr(varname, nchar(varname), nchar(varname)))
# Get through the data frame by its rows.
for (i in seq_len(nrow(get(varname)))) {
# If the type of the element in the row is "text/x-moz-place",
# then get its title and create a markdown list element from it.
if (get(varname)["type"][i] == "text/x-moz-place"){
# The two space indentation shall be multiplied as many times
# as deeply nested in the lists (minus one).
md_title <- paste0(strrep(" ", varlevel - 1),
"- ",
get(varname)["title"][i],
"\n")
# Otherwise do this and also get inside the next level.
} else if (get(varname)["type"][i] == "text/x-moz-place-container") {
md_title <- paste0(strrep(" ", varlevel - 1),
"- ",
get(varname)["title"][i],
"\n")
# I know this is not good, just want to express my thought.
# Create the next, deeper level's variable, whoose name shall
# represent the depth in the nest.
# Otherwise how can I multiply the indentation for the markdown
# list elements? It depends on the name of this variable.
varname <- paste0(regmatches(varname, regexpr("[[:alpha:]]+", varname)),
varlevel + 1L)
varlevel <- varlevel + 1L
assign(varname, get(varname)["children"][[i]])
# The same goes on as seen at the higher level.
for (j in seq_len(nrow(get(varname)))){
if (get(varname)["type"][i] == "text/x-moz-place"){
md_title <- paste0(strrep(" ", varlevel - 1),
"- ",
get(varname)["title"][i],
"\n")
} else if (get(varname)["type"][i] == "text/x-moz-place-container") {
md_title <- paste0(strrep(" ", varlevel - 1),
"- ",
get(varname)["title"][i],
"\n")
varname <- paste0(regmatches(varname, regexpr("[[:alpha:]]+", varname)),
varlevel + 1L)
varlevel <- varlevel + 1L
assign(varname, get(varname)["children"][[i]])
for (k in seq_len(nrow(get(varname)))){
# I don't know where this goes...
# Also I need to paste somewhere the md_title strings to get the
# final markdown output...
}
}
}
}
}
}
Question
How can I recursively grab and paste strings from this JSON file? I tried to search for tips in recursion, but it's quite a hard topic. Any suggestion, package, function, link will be welcomed!
I know you asked for a solution in R.
Just as a suggestion, here is a solution using jq, as it is very suitable for json transformations.
#!/bin/bash
BOOKMARKS='FirefoxBookmarks.json'
jq -r '
def bookmark($iconuri; $title; $uri):
if $iconuri != null then "![\($iconuri)]" else "" end +
"[\($title)](\($uri))";
def bookmarks:
(objects | to_entries[]
| if .value | type == "array" then (.value | bookmarks)
else .value end ) //
(arrays[] | [bookmarks] | " - \(.[0])", " \(.[1:][])" );
(.. | .children? | arrays)
|= map(if .uri != null then {bookmark: bookmark(.iconuri; .title; .uri)}
else {title} end +
{children})
| del(..| select(length == 0)) # remove empty children and empty titles
| del(..| select(length == 0)) # remove objects that got empty because of previous deletion
| del(..| objects | select(has("title") and (.children | length == 0))) # remove objects with title but no children
| .children # remove root level
| bookmarks
' < "$BOOKMARKS"
output:
- Könyvjelzők eszköztár
- Info
- Python
- ![fake-favicon-uri:https://www.freecodecamp.org/news/the-python-guide-for-beginners/][The Ultimate Python Beginner's Handbook](https://www.freecodecamp.org/news/the-python-guide-for-beginners/)
- [Python Like You Mean It](https://www.pythonlikeyoumeanit.com/index.html)
- [Automate the Boring Stuff with Python](https://automatetheboringstuff.com/)
- ![https://github.githubassets.com/favicons/favicon.svg][Data science Python notebooks](https://github.com/donnemartin/data-science-ipython-notebooks)
- Frontend
- ![https://static.codepen.io/assets/favicon/favicon-touch-de50acbf5d634ec6791894eba4ba9cf490f709b3d742597c6fc4b734e6492a5a.png][CodePen](https://codepen.io/)
- ![https://www.javascript.com/etc/clientlibs/pluralsight/main/images/favicons/android-chrome-192x192.png][JavaScript](https://www.javascript.com/)
- ![https://css-tricks.com/apple-touch-icon.png][CSS-Tricks](https://css-tricks.com/)
- [Butterick’s Practical Typography](https://practicaltypography.com/)
- [Front-end Developer Handbook 2019](https://frontendmasters.com/books/front-end-handbook/2019/)
- ![https://www.smashingmagazine.com/images/favicon/app-icon-512x512.png][Using Ethics In Web Design](https://www.smashingmagazine.com/2018/03/using-ethics-in-web-design/)
- ![https://info340.github.io/img/busy-spider-icon.png][Client-Side Web Development](https://info340.github.io/)
- ![https://cdn.sstatic.net/Sites/stackoverflow/Img/apple-touch-icon.png?v=c78bd457575a][Stack Overflow](https://stackoverflow.com/)
- ![https://hup.hu/profiles/hupper/themes/hup_theme/favicon.ico][HUP](https://hup.hu/)
- [Hope in Source](https://hopeinsource.com/)
After I watched a few videos on recursion and saw a few code examples, I tried, manually stepped through the code and somehow managed to do it with recursion. This solution is independent on the nestedness of the bookmarks, therefore a generalized solution for everyone.
Note: all the bookmarks were in the Bookmarks Toolbar in Firefox. This is highlighted in the generate_md function. You can tackle with it there. If I improve the answer later, I will make it more general.
library(jsonlite)
# This function recursively converts the bookmark titles to unordered
# list items.
recursive_func <- function (level) {
md_result <- character()
# Iterate through the current data frame, which may have a children
# column nested with other data frames.
for (i in seq_len(nrow(level))) {
# If this element is a bookmark and not a folder, then grab
# the title and construct a list item from it.
if (level[i, "type"] == "text/x-moz-place"){
md_title <- level[i, "title"]
md_uri <- level[i, "uri"]
md_iconuri <- level[i, "iconuri"]
# Condition: the URLs all have schema (http or https) part.
# If not, filname will be a zero length character vector.
host_url <- regmatches(x = md_uri,
m = regexpr(pattern = "(?<=://)[[:alnum:].-]+",
text = md_uri,
perl = T))
md_link <- paste0("[", md_title, "]", "(", md_uri, ")")
md_listitem <- paste0("- ", md_link, "\n")
# If this element is a folder, then get into it, call this
# function over it. Insert two space (for indentation) in
# the generated sting before every list item. Paste this
# list of items to the folder list item.
} else if (level[i, "type"] == "text/x-moz-place-container") {
md_title <- level[i, "title"]
md_listitem <- paste0("- ", md_title, "\n")
md_recurs <- recursive_func(level = level[i, "children"][[1]])
md_recurs <- gsub("(?<!(\\w ))-(?= )", " -", md_recurs, perl = T)
md_listitem <- paste0(md_listitem, md_recurs)
}
# Collect and paste the list items of the current data frame.
md_result <- paste0(md_result, md_listitem)
}
# Return the (sub)list of the data frame.
return(md_result)
}
generate_md <- function (jsonfile) {
# Encoding problem with tidyjson::read_json
bmarks_json_lite <- fromJSON(txt = jsonfile)
# This is the start point, a data frame. It represents the
# elements inside the Bookmarks Toolbar in Firefox.
level1 <- bmarks_json_lite$children$children[[2]]
# Do not know how to make it prettier, but it works.
markdown_result <- recursive_func(level = level1)
return(markdown_result)
}
You can run the generate_md function with the example.
generate_md(paste0("https://gist.githubusercontent.com/hermanp/",
"c01365b8f4931ea7ff9d1aee1cbbc391/raw/",
"33c21c88dad35145e2792b6258ede9c882c580ec/",
"bookmarks-example.json"))
# Output
[1] "- Info\n - Python\n - [The Ultimate Python Beginner's Handbook](https://www.freecodecamp.org/news/the-python-guide-for-beginners/)\n - [Python Like You Mean It](https://www.pythonlikeyoumeanit.com/index.html)\n - [Automate the Boring Stuff with Python](https://automatetheboringstuff.com/)\n - [Data science Python notebooks](https://github.com/donnemartin/data-science-ipython-notebooks)\n - Frontend\n - [CodePen](https://codepen.io/)\n - [JavaScript](https://www.javascript.com/)\n - [CSS-Tricks](https://css-tricks.com/)\n - [Butterick’s Practical Typography](https://practicaltypography.com/)\n - [Front-end Developer Handbook 2019](https://frontendmasters.com/books/front-end-handbook/2019/)\n - [Using Ethics In Web Design](https://www.smashingmagazine.com/2018/03/using-ethics-in-web-design/)\n - [Client-Side Web Development](https://info340.github.io/)\n - [Stack Overflow](https://stackoverflow.com/)\n - [HUP](https://hup.hu/)\n - [Hope in Source](https://hopeinsource.com/)\n"
You can cat it and write it to a file also with writeLines. But bevare! In Windows environments, you probably need to turn useBytes = TRUE to get the correct characters in the file. Reference: UTF-8 file output in R
cat(generate_md(paste0("https://gist.githubusercontent.com/hermanp/",
"c01365b8f4931ea7ff9d1aee1cbbc391/raw/",
"33c21c88dad35145e2792b6258ede9c882c580ec/",
"bookmarks-example.json")))
# Output
- Info
- Python
- [The Ultimate Python Beginner's Handbook](https://www.freecodecamp.org/news/the-python-guide-for-beginners/)
- [Python Like You Mean It](https://www.pythonlikeyoumeanit.com/index.html)
- [Automate the Boring Stuff with Python](https://automatetheboringstuff.com/)
- [Data science Python notebooks](https://github.com/donnemartin/data-science-ipython-notebooks)
- Frontend
- [CodePen](https://codepen.io/)
- [JavaScript](https://www.javascript.com/)
- [CSS-Tricks](https://css-tricks.com/)
- [Butterick’s Practical Typography](https://practicaltypography.com/)
- [Front-end Developer Handbook 2019](https://frontendmasters.com/books/front-end-handbook/2019/)
- [Using Ethics In Web Design](https://www.smashingmagazine.com/2018/03/using-ethics-in-web-design/)
- [Client-Side Web Development](https://info340.github.io/)
- [Stack Overflow](https://stackoverflow.com/)
- [HUP](https://hup.hu/)
- [Hope in Source](https://hopeinsource.com/)
There was a problem with the regex part. If there are bookmarks with some - title (space, hyphen, space) characters in their titles, these hyphens will also be "indented" as the list items.
# Input JSON
https://gist.github.com/hermanp/381eaf9f2bf5f2b9cdf22f5295e73eb5
cat(generate_md(paste0("https://gist.githubusercontent.com/hermanp/",
"381eaf9f2bf5f2b9cdf22f5295e73eb5/raw/",
"76b74b2c3b5e34c2410e99a3f1b6ef06977b2ec7/",
"bookmarks-example-hyphen.json")))
# Output (two space indentation) markdown:
- Info
- Python
- [The Ultimate Python Beginner's Handbook](https://www.freecodecamp.org/news/the-python-guide-for-beginners/)
- [Python Like You Mean It](https://www.pythonlikeyoumeanit.com/index.html)
- [Automate the Boring Stuff with Python](https://automatetheboringstuff.com/)
- [Data science Python notebooks](https://github.com/donnemartin/data-science-ipython-notebooks)
- Frontend
- [CodePen](https://codepen.io/)
- [JavaScript - Wikipedia](https://en.wikipedia.org/wiki/JavaScript) # correct
- [CSS-Tricks](https://css-tricks.com/)
- [Butterick’s Practical Typography](https://practicaltypography.com/)
- [Front-end Developer Handbook 2019](https://frontendmasters.com/books/front-end-handbook/2019/)
- [Using Ethics In Web Design](https://www.smashingmagazine.com/2018/03/using-ethics-in-web-design/)
- [Client-Side Web Development](https://info340.github.io/)
- [Stack Overflow](https://stackoverflow.com/)
- [HUP](https://hup.hu/)
- [Hope in Source](https://hopeinsource.com/)
I posted another question about this problem. After some hint and try I answered my own question.

How to improve formatting of slack messages using slackr?

I'm using slackr to send alert messages to a Slack channel. It works great except the message format is not great and I want to improve it.
install_github("hrbrmstr/slackr")
library(slackr)
slackr_setup(channel="#alerts", username="Mark Davis",
incoming_webhook_url = "https://hooks.slack.com/services/T31P8UDAB/BCH4HKQSC/*********",
api_token = "*********", echo = F)
alert="On Monday, 2018-09-03 # 2pm Pacific..."
slackr(alert)
Here is an example of how a message from slackr looks in Slack:
Here is an example of how I'd like it to look:
slackr doesn't seem to have many options in the way of formatting. I was thinking of building an image and inserting that, but I'm having trouble building an image out of a text file using R.
Perhaps there is another api I could call that could take my text and format it for slack?
I'm open to any suggestions.
Addendum:
Slackr has an option to upload files, so my latest attempt is to create an image from the text message and upload that object.
I am able to create a png file from the text message using the magick library. I created an image with a colored background, and I simply add the message text to the image:
library(magick)
alert_picture <- image_read('alert_480x150_dark_red.png')
alert_picture=image_annotate(alert_picture, DreamCloud_Alert, size = 20, gravity = "southwest",
color = "white", location = "+10+10")
image_write(alert_picture, path = "alert_picture.png", format = "png")
The image looks pretty good (although there doesn't seem to be an easy way to bold or underline specific words in the message), but the obstacle now is that I can't get the upload command to work.
slackr_upload(filename = "alert_picture.png")
I don't get any error messages but nothing is uploaded to slack.
I got around this issue by using the httr package to execute the post image function to slack.
Thanks to Adil B. for providing the solution:
Post Image to Slack Using HTTR package in R
I am not sure this is what you meant, but I solved allowing formatting like in a regular slack message by altering the slackr_bot() function and just removing the 2 sets of 3 back-ticks at the end of the code where it says text. Then just call it slackr_bot1() or something, and then you can post formatted messages. This is the function after the back-ticks removal:
slackr_bot1 <- function(...,
channel=Sys.getenv("SLACK_CHANNEL"),
username=Sys.getenv("SLACK_USERNAME"),
icon_emoji=Sys.getenv("SLACK_ICON_EMOJI"),
incoming_webhook_url=Sys.getenv("SLACK_INCOMING_URL_PREFIX")) {
if (incoming_webhook_url == "") {
stop("No incoming webhook URL specified. Did you forget to call slackr_setup()?", call. = FALSE)
}
if (icon_emoji != "") { icon_emoji <- sprintf(', "icon_emoji": "%s"', icon_emoji) }
resp_ret <- ""
if (!missing(...)) {
# mimics capture.output
# get the arglist
args <- substitute(list(...))[-1L]
# setup in-memory sink
rval <- NULL
fil <- textConnection("rval", "w", local = TRUE)
sink(fil)
on.exit({
sink()
close(fil)
})
# where we'll need to eval expressions
pf <- parent.frame()
# how we'll eval expressions
evalVis <- function(expr) withVisible(eval(expr, pf))
# for each expression
for (i in seq_along(args)) {
expr <- args[[i]]
# do something, note all the newlines...Slack ``` needs them
tmp <- switch(mode(expr),
# if it's actually an expresison, iterate over it
expression = {
cat(sprintf("> %s\n", deparse(expr)))
lapply(expr, evalVis)
},
# if it's a call or a name, eval, printing run output as if in console
call = ,
name = {
cat(sprintf("> %s\n", deparse(expr)))
list(evalVis(expr))
},
# if pretty much anything else (i.e. a bare value) just output it
integer = ,
double = ,
complex = ,
raw = ,
logical = ,
numeric = cat(sprintf("%s\n\n", as.character(expr))),
character = cat(sprintf("%s\n\n", expr)),
stop("mode of argument not handled at present by slackr"))
for (item in tmp) if (item$visible) { print(item$value, quote = FALSE); cat("\n") }
}
on.exit()
sink()
close(fil)
# combined all of them (rval is a character vector)
output <- paste0(rval, collapse="\n")
loc <- Sys.getlocale('LC_CTYPE')
Sys.setlocale('LC_CTYPE','C')
on.exit(Sys.setlocale("LC_CTYPE", loc))
resp <- POST(url = incoming_webhook_url, encode = "form",
add_headers(`Content-Type` = "application/x-www-form-urlencoded",
Accept = "*/*"), body = URLencode(sprintf("payload={\"channel\": \"%s\", \"username\": \"%s\", \"text\": \"%s\"%s}",
channel, username, output, icon_emoji)))
warn_for_status(resp)
}
return(invisible())
}
slackr_bot1("*test* on time")

Build string with OS-specific newline characters (CRLF, LF, CR) to write it into a database table column

I want to write write a string that contains the usual new line character of R (\n) into a column of a database table.
How can I convert the new line into operating system specific representation (Windows = CR/LF, Linux = LF, Mac = CR...)?
I have learned that R does not provide the operating system specific representation so I have to find a work-around:
R equivalent for .NET's Environment.NewLine
Printing newlines with print() in R
Any trial to print/cat the string did fail:
msg <- "I want to have \n a new line"
cat(msg)
# I want to have
# a new line
out <- capture.output(cat(msg))
out
# a vector with two elements (one for each row but no new line characters anymore)
# [1] "I want to have " " a new line"
paste(out, collapse = "\n") # how could I inject the correct new line characters here?
# [1] "I want to have \n a new line"
# welcome endless-loop :-)
Is there any way to let R create the correct new line characters from \n in a string?
PS: I am playing around with the built-in tcltk package and puts but I always end with R "reconverting" the newline into \n...
Another "cheat" could be to enclose the \n with quotation marks to read it as if it were one line. I have no idea so far how this could work...
One way to correctly set the new line code in R is to query the operating system. Since both OS X and Linux behave the same way, it's a question of determining whether the operating system is Windows. One way to do this is to interrogate the OS environment variable as follows.
if(substr(Sys.getenv("OS"),1,7) == "Windows") {
# set Windows newline
newLine <- "\r\n"
}
else {
# set non-Windows newline
newLine <- "\n"
}
Next use paste() with the newLine object to generate the right characters for new line by operating system.
paste("my text string on a line",newline,sep="")
regards,
Len
Here you find my final implementation as a possible alternative to the accepted answer:
# Returns the operating system specific new line character(s):
# CR LF on Windows, else only LF...
# Simlar to Microsofts .Net "Environment.NewLine"
platform.NewLine <- function() {
is.windows <- grepl(tolower(.Platform$OS.type), "windows", fixed = TRUE)
if (is.windows) {
newline <- "\r\n"
} else {
newline <- "\n"
}
sys.name <- Sys.info()["sysname"]
is.windows.2nd.opinion <- grepl(tolower(sys.name), "windows", fixed = TRUE)
if (is.windows != is.windows.2nd.opinion)
warning("R seems to run on Windows OS but this could not be recognized for sure")
return(newline)
}
# Usage (examples) ------------------------------------------------------------------------------------------------
newline <- platform.NewLine()
# "print" shows the "symbolic" names (escape codes)
print(paste("Line1", "Line2", sep = newline))
# [1] "Line1\r\nLine2"
# uses "\n" or "\r\n" depending on your OS
# "cat" applies the newline escape codes to the output
cat(paste("Line1", "Line2", sep = newline))
# Line1
# Line2

RCurl: Display progress meter in Rgui

Using R.exe or Rterm.exe, this gives an excellent progress meter.
page=getURL(url="ftp.wcc.nrcs.usda.gov", noprogress=FALSE)
In Rgui I am limited to:
page=getURL(url="ftp.wcc.nrcs.usda.gov",
noprogress=FALSE, progressfunction=function(down,up) print(down))
which gives a very limited set of download information.
Is there a way to improve this?
I start doubting that with standard R commands it is possible to reprint overwriting the current line, which is what RCurl does in non-GUI mode.
I am glad to tell that I was wrong. At least for a single line, \r can do the trick. In fact:
conc=function(){
cat(" abcd")
cat(" ABCD", '\n')
}
conc()
# abcd ABCD
But:
over=function(){
cat(" abcd")
cat("\r ABCD", "\n")
}
over()
# ABCD
That given, I wrote this progressDown function, which can monitor download status rewriting always on the same same line:
library(RCurl) # Don't forget
### Callback function for curlPerform
progressDown=function(down, up, pcur, width){
total=as.numeric(down[1]) # Total size as passed from curlPerform
cur=as.numeric(down[2]) # Current size as passed from curlPerform
x=cur/total
px= round(100 * x)
## if(!is.nan(x) && px>60) return(pcur) # Just to debug at 60%
if(!is.nan(x) && px!=pcur){
x= round(width * x)
sc=rev(which(total> c(1024^0, 1024^1, 1024^2, 1024^3)))[1]-1
lb=c('B', 'KB', 'MB', 'GB')[sc+1]
cat(paste(c(
"\r |", rep.int(".", x), rep.int(" ", width - x),
sprintf("| %g%s of %g%s %3d%%",round(cur/1024^sc, 2), lb, round(total/1024^sc, 2), lb, px)),
collapse = ""))
flush.console() # if the outptut is buffered, it will go immediately to console
return(px)
}
return(pcur)
}
Now we can use the callback with curlPerform
curlProgress=function(url, fname){
f = CFILE(fname, mode="wb")
width= getOption("width") - 25 # you can make here your line shorter/longer
pcur=0
ret=curlPerform(url=url, writedata=f#ref, noprogress=FALSE,
progressfunction=function(down,up) pcur<<-progressDown(down, up, pcur, width),
followlocation=T)
close(f)
cat('\n Download', names(ret), '- Ret', ret, '\n') # is success?
}
Running it with a small sample binary:
curlProgress("http://www.nirsoft.net/utils/websitesniffer-x64.zip", "test.zip")
the intermediate output at 60% is (no # protection):
|................................. | 133.74KB of 222.75KB 60%
where KB, will be adjusted to B, KB, MB, GB, based on total size.
Final output with success status, is:
|.......................................................| 222.61KB of 222.75KB 100%
Download OK - Ret 0
Note, the output line width is relative to R width option (which controls the maximum number of columns on a line) and can be customised changing the curlProgress line:
width= getOption("width") - 25
This is enough for my needs and solves my own question.
Here's a simple example using txtProgressBar. Basically, just do a HEAD request first to get the file size of the file you want to retrieve, then setup a txtProgressBar with that as its max size. Then you use the progressfunction argument to curlPerform to call setTxtProgressBar. It all works very nicely (unless there is no "content-length" header, in which case this code works by just not printing a progress bar).
url <- 'http://stackoverflow.com/questions/21731548/rcurl-display-progress-meter-in-rgui'
h <- basicTextGatherer()
curlPerform(url=url, customrequest='HEAD',
header=1L, nobody=1L, headerfunction=h$update)
if(grepl('Transfer-Encoding: chunked', h$value())) {
size <- 1
} else {
size <- as.numeric(strsplit(strsplit(h$value(),'\r\nContent-Type')[[1]][1],
'Content-Length: ')[[1]][2])
}
bar <- txtProgressBar(0, size)
h2 <- basicTextGatherer()
get <- curlPerform(url=url, noprogress=0L,
writefunction=h2$update,
progressfunction=function(down,up)
setTxtProgressBar(bar, down[2]))
h2$value() # return contents of page
The output is just a bunch of ====== across the console.
What about:
curlProgress=function(url, fname){
f = CFILE(fname, mode="wb")
prev=0
ret=curlPerform(url=url, writedata=f#ref, noprogress=FALSE,
progressfunction=function(a,b){
x=round(100*as.numeric(a[2])/as.numeric(a[1]))
if(!is.nan(x) && x!=prev &&round(x/10)==x/10) prev<<-x else x='.'
cat(x)
}, followlocation=T)
close(f)
cat(' Download', names(ret), '- Ret', ret, '\n')
}
?
It prints dots or percent download divisible by 10 and breaks line on 50%.
And with a small 223 KB file:
curlProgress("http://www.nirsoft.net/utils/websitesniffer-x64.zip", "test.zip")
it sounds like this:
................10...............20................30...............40...............50
..............................70...............80...............90...............100... Download OK - Ret 0
I start doubting that with standard R commands it is possible to reprint overwriting the current line, which is what RCurl does in non-GUI mode.

why causes invalid format '%d in R?

The code given below is to convert binary files from float32 to 16b with scale factor of 10. I am getting error of invalidation of %d.
setwd("C:\\2001")
for (b in paste("data", 1:365, ".flt", sep="")) {
conne <- file(b, "rb")
file1<- readBin(conne, double(), size=4, n=360*720, signed=TRUE)
file1[file1 != -9999] <- file1[file1 != -9999]*10
close(conne)
fileName <- sprintf("C:\\New folder (11)\\NewFile%d.bin", b)
writeBin(as.integer(file1), fileName, size = 2)
}
Result:
Error in sprintf("C:\\New folder (11)\\NewFile%d.bin", :
invalid format '%d'; use format %s for character objects
I used %s as suggested by R.But the files from 1:365 were totally empty
The %d is a placeholder for a integer variable inside a string. Therefore, when you use sprintf(%d, var), var must be an integer.
In your case, the variable b is a string (or a character object). So, you use the placeholder for string variables, which is %s.
Now, if your files are empty, there must be something wrong elsewhere in your code. You should ask another question more specific to it.

Resources