Modify xml_document in officer in R


I have an rdocx object and I want to manipulate something in its XML. Here is my document:
library(officer)
library(magrittr) # for %>%
library(stringr)  # for str_c()
doc <- read_docx() %>%
body_add_par("centered text", style = "centered") %>%
slip_in_seqfield("STYLEREF 1 \\s") %>%
slip_in_text("\u2011") %>%
slip_in_seqfield(sprintf("SEQ %s \\* ARABIC \\s 1", "Table")) %>%
slip_in_text(str_c(": ", "My Caption")) %>%
body_bookmark("my_bookmark")
With doc$doc_obj$get() I can get the XML, an object of classes xml_document and xml_node. Now I want to replace part of it: specifically, I want the w:bookmarkEnd element to appear later so that the bookmark covers more content. How can I achieve this? Ideally I would do it with str_replace.

You can use run_bookmark() as in the following example (the manual does not yet state that lists are supported; I'll add that soon):
library(officer)
bkm <- run_bookmark(
bkm = "test",
list(
run_word_field(field = "SEQ tab \\* ARABIC \\s"),
ftext(" Table", prop = fp_text_lite(color = "red"))
)
)
doc <- read_docx()
doc <- body_add_fpar(
x = doc,
value = fpar(bkm)
)
# how to make a reference to the bkm
doc <- body_add_fpar(
x = doc,
value = fpar(run_reference("test"))
)
print(doc, "zz.docx")
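If you would rather edit the XML directly, as the question asks, the xml2 package can move the w:bookmarkEnd node. The sketch below is not part of the answer above: it works on a hand-written toy paragraph (an assumption standing in for what doc$doc_obj$get() returns), and the bookmark id "1" is made up for the illustration.

```r
library(xml2)

# Toy stand-in for the body XML of an rdocx (assumed shape, not real officer output)
xml_txt <- paste0(
  '<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">',
  '<w:body><w:p>',
  '<w:bookmarkStart w:id="1" w:name="my_bookmark"/>',
  '<w:r><w:t>Table 1</w:t></w:r>',
  '<w:bookmarkEnd w:id="1"/>',
  '<w:r><w:t>: My Caption</w:t></w:r>',
  '</w:p></w:body></w:document>'
)
body <- read_xml(xml_txt)

# Locate the bookmark end marker and the last run of the paragraph
bkm_end  <- xml_find_first(body, "//w:bookmarkEnd[@w:id='1']")
last_run <- xml_find_first(body, "//w:p/w:r[last()]")

# Insert a copy of the end marker after the last run, then remove the original
xml_add_sibling(last_run, bkm_end, .where = "after")
xml_remove(bkm_end)

# Inspect the order of the paragraph's children
xml_name(xml_children(xml_find_first(body, "//w:p")))
```

Working on nodes avoids the fragility of str_replace on serialized XML, where attribute order and whitespace can differ between documents.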

Related

Alternate portrait and landscape sections using officer package in R

I would like to make a .pdf report that alternates between portrait (for text) and landscape (for large figures) sections. I use the officer package in R which generates a .docx, that I can convert into .pdf using Word or LibreOffice.
I made some attempts, but I have the following problems: there is a blank portrait page at the end, which I would like to remove, and converting to PDF adds blank pages between the portrait and landscape pages. You can also detect these blank pages in Word by numbering the pages (the numbers jump from 1 to 3, skipping 2) or by looking at the print preview. http://wordfaqs.ssbarnhill.com/BlankPage.htm explains how to deal with them in Word, but I would like a solution that removes those blank pages using officer, because I will have hundreds of sections alternating between portrait and landscape.
Here is my attempt:
library(officer)
doc_1 <- read_docx()
doc_1 <- body_add_par(doc_1, value = "Portrait")
doc_1 <- body_end_block_section(doc_1, block_section(prop_section()))
doc_1 <- body_add_par(doc_1, value = "Landscape")
doc_1 <- body_end_section_landscape(doc_1)
temp <- tempfile(fileext = ".docx")
temp
print(doc_1, target = temp)
# system(paste0('open "', temp, '"'))
David's answer (below) improves things, but some of the portrait orientations are lost when I iterate it using body_add_docx (which I use for efficiency reasons, see https://github.com/davidgohel/officer/issues/184):
library(officer)
portrait_section_prop <- prop_section(page_size = page_size(orient = "portrait"))
landscape_section_prop <- prop_section(page_size = page_size(orient = "landscape"))
core <- function(i){
doc_1 <- read_docx() |>
body_add_par(value = paste("Portrait", i)) |>
body_end_block_section(value = block_section(portrait_section_prop)) |>
body_add_par(value = paste("Landscape", i)) |>
body_end_block_section(value = block_section(landscape_section_prop)) |>
body_set_default_section(landscape_section_prop)
return(doc_1)
}
accu <- core(1)
for(i in 2:10){
doc_1 <- core(i)
temp <- tempfile(fileext = ".docx")
print(doc_1, temp)
accu <- body_add_docx(accu, temp)
}
print(accu, target = tempfile(fileext = ".docx")) |> browseURL()

Here is the code you need. Define the default section to be the same as the section that ends the document, so that Word does not add a page:
library(officer)
portrait_section_prop <- prop_section(page_size = page_size(orient = "portrait"))
landscape_section_prop <- prop_section(page_size = page_size(orient = "landscape"))
doc_1 <- read_docx() |>
body_add_par(value = "Portrait") |>
body_end_block_section(value = block_section(portrait_section_prop)) |>
body_add_par(value = "Landscape") |>
body_end_block_section(value = block_section(landscape_section_prop)) |>
body_set_default_section(landscape_section_prop)
temp <- tempfile(fileext = ".docx")
print(doc_1, target = temp)
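For many alternating sections, the same pattern can be repeated in a loop on a single document. This is only a sketch, assuming that building one document is acceptable (the question uses body_add_docx() instead, for speed); n_pairs is a made-up name, and every officer call is one already used in the answer above.

```r
library(officer)

portrait_section_prop <- prop_section(page_size = page_size(orient = "portrait"))
landscape_section_prop <- prop_section(page_size = page_size(orient = "landscape"))

n_pairs <- 3 # hypothetical number of portrait/landscape pairs
doc <- read_docx()

for (i in seq_len(n_pairs)) {
  # Portrait text, then close the portrait section
  doc <- body_add_par(doc, value = paste("Portrait", i))
  doc <- body_end_block_section(doc, value = block_section(portrait_section_prop))
  # Landscape content, then close the landscape section
  doc <- body_add_par(doc, value = paste("Landscape", i))
  doc <- body_end_block_section(doc, value = block_section(landscape_section_prop))
}

# The default section matches the last section, as in the answer above
doc <- body_set_default_section(doc, landscape_section_prop)

temp <- tempfile(fileext = ".docx")
print(doc, target = temp)
```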

How to put a subscript in a caption of a Word document using officer in R

I am trying to add a subscript to a caption generated in officer.
I am creating the caption like this:
library(officer)
doc <- read_docx('empty_file.docx')
autonum <- run_autonum(seq_id = 'fig', pre_label = 'Figure ')
caption <- block_caption(label='My caption includes SO2.', style = "caption", autonum = autonum)
doc <- body_add_caption(doc, caption)
print(doc, target = 'output.docx')
However, now I'd like to put the '2' in 'SO2' in subscript. I know how to generate subscripts:
fp_text_prop <- fp_text(color='orange')
prop_mod <- update(fp_text_prop, vertical.align = 'subscript')
paragraph <- fpar(ftext('SO', prop = fp_text_prop), ftext('2', prop = prop_mod))
But I can't use the resulting fpar inside a caption, as body_add_caption expects the output from block_caption and block_caption expects a normal string as argument for the label=.
How can I put an fpar, or alternatively, a subscript inside a caption?

I have found a solution that is a bit involved but seems to work.
library(officer)
doc <- read_docx('empty_file.docx')
autonum <- run_autonum(seq_id = 'fig', pre_label = 'Figure ')
fp_text_prop <- fp_text(color='orange')
prop_mod <- update(fp_text_prop, vertical.align = 'subscript')
caption <- fpar(autonum, ftext('SO', prop = fp_text_prop), ftext('2', prop = prop_mod))
doc <- body_add_fpar(x=doc, value=caption, style = 'caption')
print(doc, target = 'output.docx')
There are a couple of caveats: fp_text_prop should match the normal style of the captions and style = 'caption' should be changed to pick the correct style for captions in your document.

Using officer in R to hyperlink to another slide within a flextable cell

Using officer in R, I've used ph_slidelink() to hyperlink a text box to another slide in the presentation, and I've used compose() and hyperlink_text() to hyperlink a cell within a flextable. My question is: is there a way to combine these, and to hyperlink to another slide in the presentation within the cell of a flextable?
Here's a very simple example of code I'd like to transform:
library(officer)
library(flextable)
library(magrittr)
ft <- data.frame(slide_number = seq(3)) %>%
flextable() %>%
width(width = 3)
doc <- read_pptx() %>%
add_slide() %>%
ph_with("Table of Contents", location = ph_location_label("Title 1")) %>%
ph_with(ft, location = ph_location_label("Content Placeholder 2"))
for (i in seq(3)) {
doc <- doc %>%
add_slide() %>%
ph_with(paste("Slide", i), location = ph_location_label("Title 1"))
}
print(doc, target = "~/Desktop/officer_example.pptx" )
...and in this case I'd like the 1/2/3 in the table of contents to link to slides 1/2/3.
Is this possible?

Table and Figure cross-reference officer R

I would like to be able to cross-reference a table or figure in a word document using the officer R package.
I have come across these materials so far but they do not seem to have a solution:
https://davidgohel.github.io/officer/articles/word.html#table-and-image-captions
and a similar question
add caption to flextable in docx
In both of these I can only insert a caption as a level 2 header and not a true table caption.
What I want to be able to do in Word is Insert -> Cross-reference and go to Reference type: Table and see my caption there. Right now I can only see the caption under Numbered item.
Does this functionality exist in officer or anywhere else?

In Word, table numbers use the { SEQ Table \# arabic } field pattern, while references to them use { REF bookmark \h }. We can use this to write code that references a SEQ field.
Code:
library(officer)
library(flextable)
library(magrittr)
ft <- regulartable(head(iris)) # create flextable
str <- paste0(' REF ft \\h ') # create string to be used as reference to future bookmark
doc <- read_docx() %>%
body_add_par('This is my caption' , style = 'Normal') %>% # add caption
slip_in_seqfield(str = "SEQ Table \\# arabic",
style = 'Default Paragraph Font',
pos = "before") %>% # add number for table
body_bookmark('ft') %>% # add bookmark on the number
slip_in_text("Table ",
style = 'Default Paragraph Font',
pos = "before") %>% # add the word 'table'
body_add_flextable(value = ft, align = 'left') %>% # add flextable
body_add_break() %>% # insert a break (optional)
slip_in_text('As you can see in Table',
style = 'Default Paragraph Font',
pos = 'after') %>% # add the text you want before the table reference
slip_in_seqfield(str = str,
style = 'Default Paragraph Font',
pos = 'after') %>% # add the reference to the table you just added
slip_in_text(', there are a lot of iris flowers.',
style = 'Default Paragraph Font',
pos = 'after') %>% # add the rest of the text
print('Iris_test.docx') # print
Hope this helps :)

Just for the record, you can do this a bit easier now by using some helper functions from the {crosstable} package.
Disclaimer: I am the developer of that package and these functions were highly inspired by @morgan121's answer. Thanks Morgan!
Here is an example:
library(officer)
library(flextable) # for regulartable() and body_add_flextable()
library(crosstable)
library(ggplot2)
options(crosstable_units="cm")
ft = regulartable(head(iris))
my_plot = ggplot(data = iris ) +
geom_point(mapping = aes(Sepal.Length, Petal.Length))
doc = read_docx() %>%
body_add_title("Dataset iris", 1) %>%
body_add_normal("Table \\@ref(table_iris) displays the 6 first rows of the iris dataset.") %>%
body_add_flextable(ft) %>%
body_add_table_legend("Iris head", bookmark="table_iris") %>%
body_add_normal("Let's add a figure as well. You can see in Figure \\@ref(fig_iris) that sepal length is somehow correlated with petal length.") %>%
body_add_figure_legend("Relation between Petal length and Sepal length", bookmark="fig_iris") %>%
body_add_gg2(my_plot, w=14, h=10, scale=1.5)
print(doc , 'Iris_test.docx')
More info on https://danchaltiel.github.io/crosstable/articles/crosstable-report.html.
As with morgan121's code, you have to select all the text in MS Word and press F9 twice for the numbers to update properly.

Insert an external docx using the "officer" package

Why doesn't the body_add_docx method in the "officer" package work? Where did I make a mistake?
library(officer)
library(magrittr)
read_docx(path = "/home/user/page1.docx") %>% # load page1.docx as base document
body_add_break() %>% # add page break
body_add_docx(src="/home/user/page2.docx") %>% # FIXME: this method doesn't work
print(target = "/home/user/out.docx") # out.docx contains only page1.docx!?

The code below works only on Windows with MS Word, and only without the page break.
On Linux, with LibreOffice or Google Docs, it doesn't work.
library(officer)
library(magrittr)
read_docx(path = "/home/user/page1.docx") %>%
# body_add_break() %>% # with page break it doesn't work
body_add_docx(src="/home/user/page2.docx") %>% # only for Windows and MS Word
print(target = "/home/user/out.docx")

The function body_add_docx relies on an MS Word feature. The content of the external file is copied into the main document only when the document is opened for editing in Word. LibreOffice and Google Docs probably don't implement this feature (at least I am not aware that they do).
The script below produces the expected document only when it is opened with Word:
library(officer)
library(magrittr)
read_docx() %>%
body_add_par("hello world 1", style = "Normal") %>%
print(target = "doc1.docx")
read_docx() %>%
body_add_par("hello world 2", style = "Normal") %>%
print(target = "doc2.docx")
read_docx(path = "doc1.docx") %>%
body_add_break() %>%
body_add_docx(src="doc2.docx") %>%
print(target = "out.docx")
