Namespace without prefix in XML in R

In the XML package in R, it is possible to create a new xmlTree object with a namespace, e.g. using:
library(XML)
d = xmlTree("foo", namespaces = list(prefix = "url"))
d$doc()
# <?xml version="1.0"?>
# <foo xmlns:prefix="url"/>
How do I create a default namespace, without a prefix, such that it looks like the following?
# <?xml version="1.0"?>
# <foo xmlns="url"/>
The following does not produce what I expected.
library(XML)
d = xmlTree("foo", namespaces = list("url"))
d$doc()
# <?xml version="1.0"?>
# <url:foo xmlns:url="<dummy>"/>

There seems to be a difference between nameless lists and lists with an empty name in R.
1 - A nameless list:
list("url")
# [[1]]
# [1] "url"
names(list("url"))
# NULL
2 - A named list:
list(prefix = "url")
# $prefix
# [1] "url"
names(list(prefix = "url"))
# [1] "prefix"
3 - An incorrectly initialised empty-name list:
list("" = "url")
# Error: attempt to use zero-length variable name
4 - A hacky way to initialise an empty-name list:
setNames(list(prefix = "url"), "")
# [[1]]
# [1] "url"
names(setNames(list(prefix = "url"), ""))
# [1] ""
Options 1 and 4 print identically, but names() returns NULL for the first and "" for the second, and the XML package treats them differently. Option 1 gives the incorrect XML shown in the question, whereas option 4 produces:
library(XML)
d = xmlTree("foo", namespaces = setNames(list(prefix = "url"), ""))
d$doc()
# <?xml version="1.0"?>
# <foo xmlns="url"/>
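To make the difference between options 1 and 4 concrete, here is a minimal base-R sketch that needs no XML package. The variable names unnamed and empty_named are illustrative and not part of the original answer.

```r
# Option 1: a list with no names attribute at all
unnamed <- list("url")

# Option 4: a list whose single element has the empty string as its name
empty_named <- setNames(list("url"), "")

is.null(names(unnamed))             # TRUE: there is no names attribute
identical(names(empty_named), "")   # TRUE: the name exists but is empty

# Both print as [[1]], yet they are not the same object
identical(unnamed, empty_named)     # FALSE
```

A package that inspects names() can therefore tell the two apart even though their printed form is the same.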

Related

How to save an object whose name is in a variable?

This is calling for some "tricky R", but this time it's beyond my fantasy :-) I need to save() an object whose name is in the variable var. I tried:
save(get(var), file = ofn)
# Error in save(get(var), file = ofn) : object ‘get(var)’ not found
save(eval(parse(text = var)), file = ofn)
# Error in save(eval(parse(text = var)), file = ofn) :
# object ‘eval(parse(text = var))’ not found
both of which fail, unfortunately. How would you solve this?
Use the list argument. This saves x in the file x.RData. (The list argument can specify a vector of names if you need to save more than one at a time.)
x <- 3
name.of.x <- "x"
save(list = name.of.x, file = "x.RData")
# loading x.RData to check that it worked
rm(x)
load("x.RData")
x
## [1] 3
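As a sketch of the "vector of names" case mentioned above, the following saves two objects at once and loads them into a separate environment to check the result. The object names a and b, the vector obj_names, and the use of tempfile() are assumptions made for illustration.

```r
# Two objects to save (illustrative values)
a <- 1:3
b <- c("x", "y")

# The names of the objects, held in a character vector
obj_names <- c("a", "b")

# Write to a temporary file so that nothing in the working directory is touched
rdata_file <- tempfile(fileext = ".RData")
save(list = obj_names, file = rdata_file)

# Load into a fresh environment; load() returns the names it restored
restored <- new.env()
loaded <- load(rdata_file, envir = restored)
sort(loaded)
restored$a
```

Loading into a separate environment avoids overwriting objects of the same name in the current session.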
Note
Regarding the first attempt in the question, which uses get: save needs the name of the object rather than its value. That attempt could instead use do.call, converting the character string to a name class object:
do.call("save", list(as.name(name.of.x), file = "x.RData"))
Regarding the second attempt in the question, which uses eval: write out the save call, substitute the name in as a name class object, and then evaluate it:
eval(substitute(save(Name, file = "x.RData"), list(Name = as.name(name.of.x))))
If it's just one object, you can use saveRDS:
a<-1:4
var<-"a"
saveRDS(get(var), file = "test.rds")
readRDS(file = "test.rds")
# [1] 1 2 3 4

Change default argument(s) of S3 Methods in R

Is it possible to change default argument(s) of S3 Methods in R?
It's easy enough to change arguments using formals ...
# return default arguments of table
> args(table)
function (..., exclude = if (useNA == "no") c(NA, NaN), useNA = c("no",
"ifany", "always"), dnn = list.names(...), deparse.level = 1)
# Update an argument
> formals(table)$useNA <- "always"
# Check change
> args(table)
function (..., exclude = if (useNA == "no") c(NA, NaN), useNA = "always",
dnn = list.names(...), deparse.level = 1)
But not S3 methods ...
# View default argument of S3 method
> formals(utils:::str.default)$list.len
[1] 99
# Attempt to change
> formals(utils:::str.default)$list.len <- 99
Error in formals(utils:::str.default)$list.len <- 99 :
object 'utils' not found
At @nicola's generous prompting, here is an answer-version of the comments:
You can edit S3 methods and other non-exported functions using assignInNamespace(). This lets you replace a function in a given namespace with a new user-defined function (fixInNamespace() will open the target function in an editor to let you make a change).
# Take a look at what we are going to change
formals(utils:::str.default)$list.len
#> [1] 99
# extract the whole function from utils namespace
f_to_edit <- utils:::str.default
# make the necessary alterations
formals(f_to_edit)$list.len<-900
# Now we substitute our new improved version of str.default inside
# the utils namespace
assignInNamespace("str.default", f_to_edit, ns = "utils")
# and check the result
formals(utils:::str.default)$list.len
#> [1] 900
If you restart your R session you'll recover the defaults (or you can put them back manually in the current session).
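The "put them back manually" step amounts to keeping a copy of the original function before editing it. The sketch below shows that save, modify, restore pattern on a hypothetical user-defined function greet, so it does not touch any package namespace; for a namespaced function, the saved copy would be passed back to assignInNamespace() in the same way as the edited one.

```r
# A hypothetical function with a default argument
greet <- function(name, greeting = "Hello") paste(greeting, name)

# Keep the original so that the default can be restored later
original_greet <- greet

# Change the default through formals<-
formals(greet)$greeting <- "Goodbye"
changed <- greet("R")     # "Goodbye R"

# Put the original definition back
greet <- original_greet
restored <- greet("R")    # "Hello R"
```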

Error while inserting data into mongodb via R when trying to set _id

I can insert data into mongodb hosted at mongolabs from R but the moment I try to set the _id field I get this error:
> data<-list("_id"="1fgthhy2334",text="abc",nums=c(1,2,3))
> db$insert(data)
Error: can't use an array for _id
> data<-list("_id"=c("12334"),text="abc",nums=c(1,2,3))
> db$insert(data)
Error: can't use an array for _id
Any idea why it thinks I'm trying to set the id to an array? None of my variations seem to work.
How can I set a particular _id field to my selected (unique) identifier?
If you run
jsonlite::toJSON(data)
# {"_id":["1fgthhy2334"],"text":["abc"],"nums":[1,2,3]}
you'll see that every element, including _id, is converted to an array (mongolite uses jsonlite internally to do the conversion).
To insert _id as a single value rather than an array, you need the input data as a data.frame, something like
data <- data.frame("_id" = "1fgthhy2334", text = "abc", nums = c(1,2,3))
data <- aggregate(nums ~ X_id + text, data, list)
names(data)[1] <- "_id"
Now it gets converted to an object
jsonlite::toJSON(data)
# [{"_id":"1fgthhy2334","text":"abc","nums":[1,2,3]}]
so the insert should work
m <- mongo(collection = "test", db = "test")
m$insert(data)
# Complete! Processed total of 1 rows.
# $nInserted
# [1] 1
#
# $nMatched
# [1] 0
#
# $nRemoved
# [1] 0
#
# $nUpserted
# [1] 0
#
# $writeErrors
# list()
And as a sanity check, try and insert it again and it will fail because that _id already exists
m$insert(data)
Error: insertDocument :: caused by :: 11000 E11000 duplicate key error index: test.test_id.$_id_ dup key: { : "1fgthhy2334" }
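The renaming step in the data.frame code above is needed because data.frame() by default rewrites column names that are not syntactically valid, turning "_id" into "X_id". A small base-R sketch of that behaviour follows; the check.names = FALSE variant is offered only as an illustration of the mechanism, not as the answer's method, and the variable names are assumptions.

```r
# By default, data.frame() runs column names through make.names()
default_names <- names(data.frame("_id" = "1fgthhy2334", text = "abc"))
default_names   # "X_id" "text"

# With check.names = FALSE the name is kept exactly as given
kept_names <- names(data.frame("_id" = "1fgthhy2334", text = "abc",
                               check.names = FALSE))
kept_names      # "_id" "text"
```

Whether keeping the raw name is convenient depends on the later steps, since a column called _id has to be quoted with backticks in formulas.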

Which selector to write in rvest package in R?

I am trying to extract information from the source code of a specific website.
In the source code there are lines:
# [[4]]
# <script type="text/javascript">
# <![CDATA[
# <!-- // <![CDATA[
# var wp_dot_addparams = {
# "cid": "148938",
# "ctype": "article",
# "ctags": "dziejesiewkulturze,piraci z karaibów,Charlie Hebdo,Scorpions",
# "cauthor": "",
# "csource": "film.wp.pl",
# "cpageno": 1,
# "cpagemax": 1,
# "cdate": "2015-02-18"
# };
# // ]]]]><![CDATA[> -->
# ]]>
# </script>
From which I'd like to extract:
"ctags": "dziejesiewkulturze,piraci z karaibów,Charlie Hebdo,Scorpions",
Does anyone know how I should specify the selector in html_nodes function in rvest package in R?
html("http://film.wp.pl/id,148938,title,dziejesiewkulturze-Codzienna-dawka-informacji-kulturalnych-180215-WIDEO,wiadomosc.html") %>%
html_nodes("script")
Extract the JSON object from the element's text (tidy the selector up while you're at it)
Parse it as a list using jsonlite's fromJSON() function.
You can access it directly using "$ctags"
library(rvest)
library(jsonlite)
json <- html("http://film.wp.pl/id,148938,title,dziejesiewkulturze-Codzienna-dawka-informacji-kulturalnych-180215-WIDEO,wiadomosc.html") %>%
html_nodes("script:contains('var wp_dot_addparams')") %>%
gsub(x=., pattern=".*var wp_dot_addparams = (\\{.*\\});.*",replacement="\\1") %>%
fromJSON()
json$ctags
[1] "dziejesiewkulturze,piraci z karaibów,Charlie Hebdo,Scorpions"
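The gsub() step in the pipeline above can be tried without fetching the page. The sketch below applies the same pattern to a made-up single-line string (script_text is an assumption, shortened from the snippet in the question) and parses the result only if jsonlite happens to be installed.

```r
# A stand-in for the text of the <script> element (illustrative, single line)
script_text <- 'junk before var wp_dot_addparams = {"cid": "148938", "ctags": "a,b"}; junk after'

# Keep only the captured {...} part; everything else is matched and discarded
json_text <- gsub(pattern = ".*var wp_dot_addparams = (\\{.*\\});.*",
                  replacement = "\\1",
                  x = script_text)
json_text   # {"cid": "148938", "ctags": "a,b"}

# Optional: parse the JSON if jsonlite is available
if (requireNamespace("jsonlite", quietly = TRUE)) {
  print(jsonlite::fromJSON(json_text)$ctags)
}
```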

How to access data saved in an assign construct?

I made a list, loop over it in a for loop, do some calculations on each element, and export each modified data frame to a variable such as [1] "IAEA_C2_NoStdConditionResiduals1" [2] "IAEA_C2_EAstdResiduals2" etc. When I run View(IAEA_C2_NoStdConditionResiduals1) after the for loop, I get the following error message in the console: Error in print(IAEA_C2_NoStdConditionResiduals1) : object 'IAEA_C2_NoStdConditionResiduals1' not found, but I know it is there because RStudio shows it in its Environment view. So the question is: how can I access the saved data (in this assign construct) for further use?
ResidualList = list(IAEA_C2_NoStdCondition = IAEA_C2_NoStdCondition,
IAEA_C2_EAstd = IAEA_C2_EAstd,
IAEA_C2_STstd = IAEA_C2_STstd,
IAEA_C2_Bothstd = IAEA_C2_Bothstd,
TIRI_I_NoStdCondition = TIRI_I_NoStdCondition,
TIRI_I_EAstd = TIRI_I_EAstd,
TIRI_I_STstd = TIRI_I_STstd,
TIRI_I_Bothstd = TIRI_I_Bothstd
)
C = 8
for(j in 1:C) {
#convert list Variable to string for later usage as Variable Name as unique identifier!!
SubNameString = names(ResidualList)[j]
SubNameString = paste0(SubNameString, "Residuals")
#print(SubNameString)
LoopVar = ResidualList[[j]]
LoopVar[ ,"F_corrected_normed"] = round(LoopVar[ ,"F_corrected_normed"] / mean(LoopVar[ ,"F_corrected_normed"]),
digits = 5
)
LoopVar[ ,"F_corrected_normed_error"] = round(LoopVar[ ,"F_corrected_normed_error"] / mean(LoopVar[ ,"F_corrected_normed_error"]),
digits = 5
)
assign(paste(SubNameString, j), LoopVar)
}
View(IAEA_C2_NoStdConditionResiduals1)
This is not really a problem with assign but with the behavior of the paste function, whose default separator is a space. This will build a variable name with a space in it:
assign(paste(SubNameString, j), LoopVar)
#simple example
> assign(paste("v", 1), "test")
> `v 1`
[1] "test"
... so you need to get its value by putting backticks around its name, so that the parser does not treat the space as a delimiter. See what happens when you type:
`IAEA_C2_NoStdConditionResiduals 1`
... and from here forward, use paste0 to avoid this problem.
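A short base-R sketch of the difference, using the illustrative prefix "Residuals" rather than the data from the question:

```r
# paste() inserts a space by default; paste0() does not
spaced_name  <- paste("Residuals", 1)    # "Residuals 1"
compact_name <- paste0("Residuals", 1)   # "Residuals1"

# Both names can be assigned to
assign(spaced_name, "test")
assign(compact_name, "test")

# The name with a space needs backticks or get(); the compact one does not
via_get <- get(spaced_name)
via_backticks <- `Residuals 1`
via_plain <- Residuals1
```

The variable with the space still exists, which is why it shows up in the Environment view, but typing its name without backticks fails.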
