Why are some strings in quotes but others aren't when creating a .YAML file from R?

I'm trying to create the following .YAML file:
summary:
  title: "Table tabs"
  link: ~
  blocks: []
  nested: nav-pills
  nested_names: yes
(note there are no quotes around the tilde, square brackets or yes).
I write the code to create it in R:
tabs <- list(
  summary =
    list(
      title = "Table tabs",
      link = "~",
      blocks = "[]",
      nested = "nav-pills",
      nested_names = "yes"
    )
)
write(yaml::as.yaml(tabs), file = "myfile.yaml")
But when I write it out to .YAML, it looks like this:
summary:
  title: Table tabs
  link: '~'
  blocks: '[]'
  nested: nav-pills
  nested_names: 'yes'
That is, the tilde, the square brackets and yes are now wrapped in quotes.
Why does this happen, and what can I do to prevent it?

This is already covered by existing answers on Stack Overflow, so I will point you to them:
More general considerations on using quotes in YAML are discussed in the question "YAML: Do I need quotes for strings in YAML?"
The difference between ' and " in YAML is discussed here:
"What is the difference between a single quote and double quote in Yaml header for r Markdown?"
Specifically the tilde sign is discussed here:
"What is the purpose of tilde character ~ in YAML?"
To summarise,
The tilde is one of the ways the null value can be written. Most
parsers also accept an empty value for null, and of course null, Null
and NULL

Based on the answer from TarJae, the solution is as follows:
tabs <- list(
  summary =
    list(
      title = "Table tabs",
      link = NULL,
      blocks = list(),
      nested = "nav-pills",
      nested_names = TRUE
    )
)
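As a rough illustration of why the quotes appear, here is a deliberately simplified sketch in Python (standard library only). The sets and the function name are assumptions made for this example, not the actual logic of the R yaml package: a string gets quoted when its unquoted form would be read back as something other than a string, such as a null, a boolean or a flow collection.

```python
# Simplified YAML 1.1 style rules; real emitters check many more cases
# (numbers, dates, other indicator characters, ...).
YAML11_NULLS = {"", "~", "null", "Null", "NULL"}
YAML11_BOOLS = {
    variant
    for base in ("y", "n", "yes", "no", "true", "false", "on", "off")
    for variant in (base, base.capitalize(), base.upper())
}
FLOW_STARTS = ("[", "{")  # an unquoted "[]" would be read as an empty list


def reads_back_as_string(text):
    """Return True if the unquoted scalar would still be parsed as a string."""
    if text in YAML11_NULLS or text in YAML11_BOOLS:
        return False
    if text.startswith(FLOW_STARTS):
        return False
    return True


values = ["Table tabs", "~", "[]", "nav-pills", "yes"]
needs_quotes = [v for v in values if not reads_back_as_string(v)]
```

Here needs_quotes holds exactly the three strings that were quoted in the output above, which is why passing native values (NULL, list(), TRUE) instead of strings produces the unquoted forms.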

Related

Judge whitespace or number using pyparsing

I am parsing structured text files with pyparsing and have trouble distinguishing a whitespace character from a digit. My file looks like this:
RECORD 0001
TITLE (Main Reference Title)
AUTHOR (M.Brown)
Some files have more than one author:
RECORD 0002
TITLE (Main Reference Title 1)
AUTHOR 1(S.Red)
2(B.White)
I would like to parse files and convert them into dictionary format.
{"RECORD": "001",
"TITLE": "Main Reference Title 1",
"AUTHOR": {"1": "M.Brown"}
}
{"RECORD": "002",
"TITLE": "Main Reference Title 2",
"AUTHOR": {"1": "S.Red", "2": "B.White"}
}
I tried to parse the AUTHOR field by pyparsing (tried both 2.4.7 and 3.0.0b3). Following is the simplified version of my code.
from pyparsing import *
flag = White(" ", exact=1).set_parse_action(replace_with("1")) | Word(nums, exact=1)
flaged_field = Group(flag + restOfLine)
next_line = White(" ", exact=8).suppress() + flaged_field
authors_columns = (
    Keyword("AUTHOR").suppress()
    + White(" ", exact=2).suppress()
    + flaged_field            # parse first row
    + ZeroOrMore(next_line)   # parse next rows
)
authors = authors_columns.search_string(f)
Here 'f' contains all lines read from the file. With this code, I could only parse the author names that have numbering flags.
[]
[[['1', '(S.Red)'],['2','(B.White)']]]
However, if I only parse with whitespace
flag = White(" ",exact=1).set_parse_action(replace_with("1"))
it worked correctly for the files without numbering flags.
['1', '(M.Brown)']
[]
The character at [9:10] (a digit or whitespace) has a meaning in my format, and I want to determine whether it is whitespace or a digit (up to 9). I also tried replacing "|" with "^", swapping the order of the alternatives, and
flag = Word(nums+" ")
but none of these worked. Why does choosing between White(" ") and Word(nums) not work in my code? Could someone help me or suggest a way to solve this?
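This was solved by adding leave_whitespace(). pyparsing expressions skip leading whitespace by default, so the combined flag expression presumably consumed the space before White(" ") could match it; leave_whitespace() turns that skipping off.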
flag = (White(" ",exact=1).set_parse_action(replace_with("0")) | Word(nums,exact=1)).leave_whitespace()
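For comparison, the same whitespace-or-digit decision can be sketched without pyparsing by reading the flag from its fixed column. This is a swapped-in technique, not the poster's method. It assumes the layout implied by the grammar above (a six-character keyword, two spaces, then the flag at 0-based index 8), and parse_authors is a hypothetical helper name.

```python
def parse_authors(lines):
    """Collect AUTHOR entries keyed by their flag; a blank flag counts as "1"."""
    authors = {}
    in_block = False
    for line in lines:
        head, flag, rest = line[:8], line[8:9], line[9:]
        starts = head.rstrip() == "AUTHOR"           # first AUTHOR row
        continues = in_block and head == " " * 8     # continuation row
        valid_flag = flag == " " or flag.isdigit()   # whitespace or one digit
        if not (starts or continues) or not valid_flag:
            in_block = False
            continue
        in_block = True
        key = "1" if flag == " " else flag
        authors[key] = rest.strip().strip("()")
    return authors


single = parse_authors(["RECORD 0001", "AUTHOR   (M.Brown)"])
several = parse_authors(["RECORD 0002", "AUTHOR  1(S.Red)", "        2(B.White)"])
```

Because the flag is taken by position, a space is never skipped silently, which is the same effect leave_whitespace() achieves inside the pyparsing grammar.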

R/exams d2l multiple choice question doesn't select correct answer

I use the following to create a D2L exam from the "capitals.Rmd" example (I converted the question to schoice):
exams2blackboard("capitals.Rmd", n = 3, name = "testquiz")
After I upload the testquiz.zip file, I notice that the correct answer must be manually chosen on the D2L platform.
I was wondering if there is a workaround.
Many Thanks,
Umut
If you want the correct solution to be selected, do not use the Import option from the Question Library or from the Quiz itself. Use the Import/Export/Copy Components under the Course Admin tab.
If you import the questions through the following steps, BrightSpace correctly picks the right solution. It’s a bit longer but seems to correctly choose the solution.
Under the Course Admin tab of your course, go to
'Import/Export/Copy Components' -> ‘Import Components’ -> Start -> (drag and drop the ZIP file)
Click ‘Advanced Options…’
This step will take a few minutes for large files; if you do not click
Advanced Options, then the import will automatically import the
questions into the 'Question Library' and will generate a Quiz with the
imported questions; you do not want this.
-> Continue -> Continue -> at this point choose 'Question Library' from the section 'Select Components to Import'
I would not choose ‘Quizzes’ because it automatically creates a quiz
and makes it available to students. It has the unfortunate side-effect
of making ALL the questions available, which means all the versions of
various dynamic questions; this is not something we want.
-> Continue -> Continue. This stage takes a few minutes for large
imports.
Now the questions are available in the Question Library and can be used to generate new quizzes. Each question already has the correct answer selected. This works for ‘schoice’ and ‘mchoice’ versions of questions. Currently, plots are not imported, though; I am still trying to figure out why.
This problem is new to me. In earlier versions of Brightspace/D2L the import of single-choice and multiple-choice exercises via exams2blackboard() worked well. Possibly, D2L changed in the meantime given that neither the current release version from CRAN nor the development version from R-Forge work for you.
D2L also supports other import formats and we did play around with some of these. See the following discussions in the R/exams forum on R-Forge:
https://R-Forge.R-project.org/forum/forum.php?thread_id=33404&forum_id=4377&group_id=1337
https://R-Forge.R-project.org/forum/forum.php?thread_id=33657&forum_id=4377&group_id=1337
Notably we tried to use the XML-based QTI 2.1 format that seems to be employed by D2L internally. However, D2L apparently uses a particular custom flavor of QTI 2.1. It should be possible to reverse engineer that and improve exams2qti21() correspondingly but so far (to the best of my knowledge) no one put the time and effort into this that would be needed.
For simple single/multiple choice questions a CSV-based exchange format can also be used. I have put together a very basic exams2d2l() function that was posted in the threads above and that I'm also including below. It can set up the CSV file for a single exercise like the capitals.Rmd exercise that you use above. For plain text exercises like that it seems to work well but not for more complex elements (graphics, code, math, etc.).
exams2d2l <- function(file, dir = ".", ## n = 1L, nsamp = NULL disabled for now
  name = NULL, quiet = TRUE, edir = NULL, tdir = NULL, sdir = NULL, verbose = FALSE,
  resolution = 100, width = 4, height = 4, svg = FALSE,
  encoding = "", converter = NULL, ...)
{
  ## for Rnw exercises use "ttm" converter otherwise "pandoc" converter
  if(any(tolower(tools::file_ext(unlist(file))) == "rmd")) {
    if(is.null(converter)) converter <- "pandoc"
  } else {
    if(is.null(converter)) converter <- "ttm"
  }

  ## output directory or display on the fly
  ## output name processing
  if(is.null(name)) name <- tools::file_path_sans_ext(basename(file))

  ## set up .html transformer and writer function
  htmltransform <- make_exercise_transform_html(converter = converter, ...)

  ## create exam with HTML text
  rval <- xexams(file,
    driver = list(sweave = list(quiet = quiet, pdf = FALSE, png = !svg, svg = svg,
      resolution = resolution, width = width, height = height, encoding = encoding),
      read = NULL, transform = htmltransform, write = NULL),
    dir = dir, edir = edir, tdir = tdir, sdir = sdir, verbose = verbose)

  ## currently: only a single exercise
  rval <- rval[[1L]][[1L]]

  ## put together CSV
  cleanup <- function(x) gsub('"', '""', paste(x, collapse = "\n"), fixed = TRUE)
  rval <- c(
    'NewQuestion,MC,,,',
    sprintf('ID,"%s",,,', cleanup(rval$metainfo$file)),
    sprintf('Title,"%s",,,', cleanup(rval$metainfo$name)),
    sprintf('QuestionText,"%s",,,', cleanup(rval$question)),
    sprintf('Points,%s,,,', if(is.null(rval$metainfo$points)) 1 else rval$metainfo$points),
    'Difficulty,1,,,',
    'Image,,,,',
    paste0('Option,', ifelse(rval$metainfo$solution, 100, 0), ',"', cleanup(rval$questionlist), '",,"', cleanup(rval$solutionlist), '"'),
    'Hint,,,,',
    sprintf('Feedback,"%s",,,', cleanup(rval$solution))
  )
  writeLines(rval, file.path(dir, paste0(name, ".csv")))
  invisible(rval)
}
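The cleanup() helper above joins the lines of a field and doubles every double quote, which is the usual CSV convention for quotes inside a quoted field. A small Python sketch (standard library only; the sample question text is made up for illustration) shows that a row built this way parses back to the original text:

```python
import csv
import io


def cleanup(lines):
    """Mirror of the R helper: join lines with newlines, double the quotes."""
    return "\n".join(lines).replace('"', '""')


question = ['What is the "seat of government"?', "Pick one."]  # made-up text
row = 'QuestionText,"%s",,,' % cleanup(question)

# csv.reader undoes the quote doubling and keeps the embedded newline.
parsed = next(csv.reader(io.StringIO(row)))
```

The second field of parsed is the question text with its original quotes and line break, followed by three empty fields, matching the five-column shape of the rows written by exams2d2l().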

Avoiding JSON error displaying Japanese strings within Plotly (R) / Running a function on one variable at a time

I'm very new to R and beginner level at programming in general, and trying to figure out how to get hovertext in plotly to display a Japanese string from my dataframe. After venturing through character encoding hell, I've got things mostly worked out but am getting stuck on a single point: Getting the Japanese string to display in the final plot.
library(plotly)
plot_ly(df, x = ~cost, y = ~grossSales, type = "scatter", mode = "markers",
        hoverinfo = "text",
        text = ~paste0("Product name: ", productName,
                       "<br>Gross: ", grossSales, "<br> Cost: ", cost)
)
The problem I encounter is that using 'productName' returns the Japanese string from the dataframe, which causes the plot to fail to render. DOM Inspector's console shows JSON encountering issues with the string (even though it's just encoded in UTF-8).
Using toJSON(productName), I am able to render the plot; however, the hover textbox then shows the entire productName column (e.g., ["","Product1","Product2","Product3"...]). I only want the name of that specific product, just as 'grossSales' and 'cost' return only the data specific to that product at each point on the plot.
Is there a way I can execute toJSON() only on each specific instance of 'productName'? (i.e., output should be "Product1" with JSON friendly string format) Alternatively, is there a way I can have plotly read the list output and select only the correct productName?
Stepping away from the problem to continue studying other things, I found a partial solution in using a for-loop:
productNames <- NULL
for (i in 1:nrow(df))
{
productNames <- c(productNames, toJSON(df[i, "productName"]))
}
df$jsonProductNames <- productNames
Using the jsonProductNames variable within plotly, the graph renders and displays only the name for each product! The sole issue remaining is that it is displayed with the JSON [""] formatting around each product's name.
Update:
I've finally got this working fully how I want it. I imagine there are more elegant solutions, and I'd still be interested to learn how to achieve what I originally was looking at if possible (run a function on a variable within R for each time it is encountered in a loop), but here is how I have it working:
colToJSON <- function(df, colStr)
{
  JSONCol <- NULL
  for (i in 1:nrow(df))
  {
    JSONCol <- c(JSONCol, toJSON(df[i, colStr]))
  }
  JSONCol <- gsub("\\[\"", "", JSONCol)
  JSONCol <- gsub("\"\\]", "", JSONCol)
  return(JSONCol)
}
df$jsonProductNames <- colToJSON(df, "productName")
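The underlying idea, encoding each value separately instead of the whole column at once, can be sketched in Python with the standard json module. The product names are placeholders, and Python's json.dumps returns a bare JSON string for a single value rather than the one-element array that toJSON() produced here, so only the quotes need stripping.

```python
import json

names = ["製品1", "製品2", "製品3"]  # placeholder product names

# Encoding the whole column gives one JSON array: the full list seen in the hover box.
whole_column = json.dumps(names, ensure_ascii=False)

# Encoding element by element gives one JSON string per product.
per_item = [json.dumps(name, ensure_ascii=False) for name in names]

# Remove the surrounding JSON quotes for display (fine for names without escapes).
display = [encoded[1:-1] for encoded in per_item]
```

In R the same per-element pattern is usually written with sapply() or vapply() over the column instead of growing a vector inside a for-loop.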

R: XPath expression returns links outside of selected element

I am using R to scrape the links from the main table on that page, using XPath syntax. The main table is the third on the page, and I want only the links that point to magazine articles.
My code follows:
require(XML)
(x = htmlParse("http://www.numerama.com/magazine/recherche/125/hadopi/date"))
(y = xpathApply(x, "//table")[[3]])
(z = xpathApply(y, "//table//a[contains(@href,'/magazine/') and not(contains(@href, '/recherche/'))]/@href"))
(links = unique(z))
If you look at the output, the final links do not come from the main table but from the sidebar, even though I selected the main table in my third line by asking object y to include only the third table.
What am I doing wrong? What is the correct/more efficient way to code this with XPath?
Note: XPath novice writing.
Answered (really quickly), thanks very much! My solution is below.
extract <- function(x) {
message(x)
html = htmlParse(paste0("http://www.numerama.com/magazine/recherche/", x, "/hadopi/date"))
html = xpathApply(html, "//table")[[3]]
html = xpathApply(html, ".//a[contains(@href,'/magazine/') and not(contains(@href, '/recherche/'))]/@href")
html = gsub("#ac_newscomment", "", html)
html = unique(html)
}
d = lapply(1:125, extract)
d = unlist(d)
write.table(d, "numerama.hadopi.news.txt", row.names = FALSE)
This saves all links to news items with keyword 'Hadopi' on this website.
You need to start the pattern with . if you want to restrict the search to the current node.
A pattern that starts with / (or //) goes back to the start of the document (even if the root node is not in y).
xpathSApply(y, ".//a/@href")
Alternatively, you can extract the third table directly with XPath:
xpathApply(x, "//table[3]//a[contains(@href,'/magazine/') and not(contains(@href, '/recherche/'))]/@href")
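The effect of the leading . can be illustrated with Python's xml.etree.ElementTree on a made-up two-table page. ElementTree supports only a subset of XPath (no contains(), and no absolute paths on an element), so this sketch shows only the relative search and does the href filtering in plain Python; the markup and link values are invented.

```python
import xml.etree.ElementTree as ET

# Invented page: the first table plays the sidebar, the second the main table.
page = ET.fromstring(
    "<html><body>"
    "<table><tr><td><a href='/magazine/sidebar-item'>side</a></td></tr></table>"
    "<table><tr><td>"
    "<a href='/magazine/article-1'>one</a>"
    "<a href='/magazine/recherche/2'>next page</a>"
    "</td></tr></table>"
    "</body></html>"
)

main_table = page.findall(".//table")[1]

# ".//a" searches only below main_table, so the sidebar link is not returned.
hrefs = [a.get("href") for a in main_table.findall(".//a[@href]")]
links = [h for h in hrefs if "/magazine/" in h and "/recherche/" not in h]
```

Only the article link from the second table ends up in links; the sidebar link is excluded because the search is anchored at main_table.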

Find word (not containing substrings) in comma separated string

I'm using a LINQ query where I do something like this:
viewModel.REGISTRATIONGRPS = (From a In db.TABLEA
Select New SubViewModel With {
.SOMEVALUE1 = a.SOMEVALUE1,
...
...
.SOMEVALUE2 = If(commaseparatedstring.Contains(a.SOMEVALUE1), True, False)
}).ToList()
Now my problem is that this doesn't search for whole words but for substrings, so for example:
commaseparatedstring = "EWM,KI,KP"
SOMEVALUE1 = "EW"
It returns True because "EW" is contained in "EWM".
What I need is to find whole words (not substrings) in the comma-separated string!
Option 1: Regular Expressions
Regex.IsMatch(commaseparatedstring, @"\b" + Regex.Escape(a.SOMEVALUE1) + @"\b")
The \b parts are called "word boundaries" and tell the regex engine that you are looking for a "full word". The Regex.Escape(...) ensures that the regex engine will not try to interpret "special characters" in the text you are trying to match. For example, if you are trying to match "one+two", the Regex.Escape method will return "one\+two".
Also, be sure to import the System.Text.RegularExpressions namespace at the top of your code file.
See Regex.IsMatch Method (String, String) on MSDN for more information.
Option 2: Split the String
You could also try splitting the string which would be a bit simpler, though probably less efficient.
commaseparatedstring.Split(new Char[] { ',' }).Contains( a.SOMEVALUE1 )
What about:
- splitting the commaseparatedstring at the commas
- calling Equals() on each item instead of Contains() on the whole string?
.SOMEVALUE2 = If(commaseparatedstring.Split(',').Contains(a.SOMEVALUE1), True, False)
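Both options can be compared side by side in a short Python sketch (standard library only; the function names are made up for illustration). It also shows one case where they disagree: \b treats only letters, digits and underscores as word characters, so the regex version matches part of an item that contains punctuation.

```python
import re


def contains_item_split(csv_text, word):
    """Exact match against the comma-separated items."""
    return word in [item.strip() for item in csv_text.split(",")]


def contains_item_regex(csv_text, word):
    """Word-boundary match, analogous to the Regex.IsMatch option."""
    return re.search(r"\b" + re.escape(word) + r"\b", csv_text) is not None


codes = "EWM,KI,KP"
split_results = [contains_item_split(codes, w) for w in ("EW", "EWM", "KI")]
regex_results = [contains_item_regex(codes, w) for w in ("EW", "EWM", "KI")]

# The two approaches differ when an item contains a non-word character.
hyphen_split = contains_item_split("A-B,C", "A")   # "A" is not a full item
hyphen_regex = contains_item_regex("A-B,C", "A")   # \b matches before the "-"
```

For plain alphanumeric codes such as "EWM,KI,KP" the two give the same answers; if items may contain hyphens or other punctuation, the split-based check is the safer choice.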
