So here's my string:
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nullam elit lacus, dignissim quis laoreet non, cursus id eros. Etiam lacinia tortor vel purus eleifend accumsan. Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Quisque bibendum vestibulum nisl vitae volutpat.
I need to split it every 100 characters (full words only) until all the characters are used up.
So we'd end up with:
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nullam elit lacus, dignissim quis laoreet non,
and
cursus id eros. Etiam lacinia tortor vel purus eleifend accumsan. Pellentesque habitant morbi tristique
and
senectus et netus et malesuada fames ac turpis egestas. Quisque bibendum vestibulum nisl vitae volutpat.
Any ideas on the best way to do that?
Since Daniel replied with actual code similar to my description, I'm going to offer a different suggestion. I might be off by one character in the count. This code prints the start/end offsets and the substrings; what you need to do is modify it to save the strings in an array instead:
<%
Dim LoremIpsum
LoremIpsum = "Lorem ipsum dolor sit amet....."
Response.Write LoremIpsum & "<br>"
SplitWords LoremIpsum, 100
Function SplitWords(text, maxlen)
    Dim c, i, j, l, s
    l = Len(text)
    i = 1
    j = maxlen
    Do While (j < l And Response.IsClientConnected)
        ' walk backwards from position j until we hit a space (a word boundary)
        c = Mid(text, j, 1)
        Do While (c <> " " And j > i)
            j = j - 1
            c = Mid(text, j, 1)
        Loop
        ' print the start/end offsets and the chunk itself
        Response.Write(i & "<br>")
        Response.Write(j & "<br>")
        s = Mid(text, i, j - i)
        Response.Write(s & "<br>")
        i = j + 1          ' skip the space so the next chunk starts on a word
        j = j + maxlen
    Loop
    ' write out whatever is left after the last full chunk
    If i <= l Then Response.Write(Mid(text, i) & "<br>")
End Function
%>
First you may want to split your string using the space character as a delimiter. Then start with an empty string, iterate over the words in the array, and concatenate each word to the new string until the word count reaches 100:
str = "Lorem ipsum ...."
words = Split(str)
stringSection = ""
wordCounter = 0
FOR EACH word IN words
stringSection = stringSection & word
wordCounter = wordCounter + 1
IF wordCounter >= 100 THEN
Response.Write(stringSection & "<BR /><BR />")
wordCounter = 0
stringSection = ""
ELSE
stringSection = stringSection & " "
END IF
NEXT
Response.Write(stringSection & "<BR /><BR />")
Note that the final Response.Write is necessary to handle the last stringSection, even though it may not have reached 100 words.
I needed to count characters (spaces included) rather than words, and to have it as a function. Here is what I came up with:
' Returns the first section of txtString that reaches maxLen characters,
' ending on a whole word (the section may slightly exceed maxLen).
Function wordSubstring(txtString, maxLen)
    words = Split(txtString, " ")
    charCounter = 0
    stringSection = ""
    For Each word In words
        stringSection = stringSection & word
        charCounter = Len(stringSection)
        If charCounter >= maxLen Then
            wordSubstring = stringSection
            Exit For
        Else
            stringSection = stringSection & " "
        End If
    Next
    wordSubstring = stringSection
End Function
Related
I have a data frame in R that is basically this: a body of text with a line break string (\r\n) sprinkled throughout:
df <- data.frame (text = c("non consectetur a erat nam at lectus urna duis convallis convallis tellus id interdum velit laoreet id donec ultrices tincidunt arcu non sodales neque sodales ut etiam sit amet nisl purus in mollis nunc sed id semper risus in hendrerit gravida rutrum quisque non tellus orci ac auctor augue mauris augue neque gravida in fermentum et sollicitudin ac orci phasellus egestas \r\n tellus rutrum tellus pellentesque eu tincidunt tortor aliquam nulla facilisi cras fermentum odio eu feugiat pretium nibh ipsum consequat nisl vel pretium lectus quam id leo in vitae turpis massa sed elementum tempus egestas sed sed risus pretium quam vulputate dignissim suspendisse in est ante in nibh mauris cursus mattis molestie a iaculis \r\n at erat pellentesque adipiscing commodo elit at imperdiet dui accumsan sit amet nulla facilisi morbi tempus iaculis urna id volutpat lacus laoreet non curabitur gravida arcu ac \r\n tortor dignissim convallis aenean et tortor at risus viverra adipiscing at in tellus integer feugiat scelerisque varius morbi enim nunc faucibus a pellentesque sit amet porttitor eget dolor morbi non arcu "))
How can I split this string into several data frame entries each time the line break string (\r\n) appears?
The base R strsplit() function can be used to split the text into separate strings on the delimiter you mentioned ("\r\n"). The result is a list, which you can then unlist and bind into a data frame:
df_list <- strsplit(df$text, "\r\n")
df_split <- data.frame(text = unlist(df_list))
We could use separate_longer_delim
library(tidyr)
separate_longer_delim(df, text, delim = "\r\n")
From the string
s <- "|tree| Lorem ipsum dolor sit amet, |house| consectetur adipiscing elit,
|street| sed do eiusmod tempor incididunt ut labore et |car| dolore magna aliqua."
I want to extract the text that follows each marker enclosed in |-symbols.
My approach:
words <- list("tree", "house", "street", "car")
for (word in words) {
  expression <- paste0("^.*\\|", word, "\\|\\s*(.+?)\\s*\\|.*$")
  print(sub(expression, "\\1", s))
}
This works fine for all but the last word, car, for which it instead returns the entire string s.
How can I modify the regex so that for the last element of the words list it prints out "dolore magna aliqua."?
Edit: previously the list of markers was a, b, c, d. Solutions to that specific case cannot be generalized very well.
Try this:
library(stringi)
s <- '|a| Lorem ipsum dolor sit amet, |b| consectetur adipiscing elit,
|c| sed do eiusmod tempor incididunt ut labore et |d| dolore magna aliqua.'
stri_split_regex(s, '\\|[:alpha:]\\|')
[[1]]
[1] "" " Lorem ipsum dolor sit amet, "
[3] " consectetur adipiscing elit, \n" " sed do eiusmod tempor incididunt ut labore et "
[5] " dolore magna aliqua."
You can try this pattern
library(stringr)
s <- "|tree| Lorem ipsum dolor sit amet, |house| consectetur adipiscing elit,
|street| sed do eiusmod tempor incididunt ut labore et |car| dolore magna aliqua."
str_extract_all(s, regex("(?<=\\|)\\w+(?=\\|)"))
#[1] "tree" "house" "street" "car"
(?<=\\|): lookbehind, a position preceded by | (\\| is an escaped |)
\\w+: one or more word characters
(?=\\|): lookahead, a position followed by |
I suggest extracting all the words with corresponding values using stringr::str_match_all:
s <- "|tree| Lorem ipsum dolor sit amet, |house| consectetur adipiscing elit,
|street| sed do eiusmod tempor incididunt ut labore et |car| dolore magna aliqua."
words1 <- list("tree","house","street","car")
library(stringr)
expression <- paste0("\\|(", paste(words1, collapse="|"),")\\|\\s*([^|]*)")
result <- str_match_all(s, expression)
lapply(result, function(x) x[,-1])
Output:
[[1]]
[,1] [,2]
[1,] "tree" "Lorem ipsum dolor sit amet, "
[2,] "house" "consectetur adipiscing elit, \n"
[3,] "street" "sed do eiusmod tempor incididunt ut labore et "
[4,] "car" "dolore magna aliqua."
The regex is
\|(tree|house|street|car)\|\s*([^|]*)
Details:
\| - a | char
(tree|house|street|car) - Group 1: one of the words
\| - a | char
\s* - 0 or more whitespace chars
([^|]*) - Group 2: any 0 or more chars other than |.
I'm currently trying to divide a dataset of text documents (encoded in UTF-8) into paragraphs in R, but I'm having trouble getting them into the format I want for tidytext, which is a single column of the different paragraphs.
My data so far looks something like this:
list <- c("Lorem ipsum dolor sit amet, movet omittantur ut vel, vim an offendit prodesset. Sumo summo intellegam vel ei, dicunt persecuti vim ne. Lorem noluisse at est. Per ex postulant philosophia, ut vel amet affert tantas, pro ne consetetur scriptorem. Id mel aeque deleniti.
Nam ut erat eligendi, pro eu minim molestie persequeris. Civibus interesset te nec, cu aeque fabellas luptatum has. Ad usu nominati tractatos. Eu voluptatum disputationi vis, alienum delicatissimi pri eu. Et molestie copiosae nam, ex vix ignota dignissim. Dico suas illum at mea, no case modus antiopam sea.
Ius te copiosae lobortis contentiones. Est ceteros dissentiet ne, qui malis iuvaret tacimates an. Vivendo erroribus nec no. No quo corpora indoctum iracundia, mel ad mollis accusam praesent. Sit at admodum sensibus mediocrem, no pri decore nemore.",
"Lorem ipsum dolor sit amet, movet omittantur ut vel, vim an offendit prodesset. Sumo summo intellegam vel ei, dicunt persecuti vim ne. Lorem noluisse at est. Per ex postulant philosophia, ut vel amet affert tantas, pro ne consetetur scriptorem. Id mel aeque deleniti.
Nam ut erat eligendi, pro eu minim molestie persequeris. Civibus interesset te nec, cu aeque fabellas luptatum has. Ad usu nominati tractatos. Eu voluptatum disputationi vis, alienum delicatissimi pri eu. Et molestie copiosae nam, ex vix ignota dignissim. Dico suas illum at mea, no case modus antiopam sea.
Ius te copiosae lobortis contentiones. Est ceteros dissentiet ne, qui malis iuvaret tacimates an. Vivendo erroribus nec no. No quo corpora indoctum iracundia, mel ad mollis accusam praesent. Sit at admodum sensibus mediocrem, no pri decore nemore.",
"Lorem ipsum dolor sit amet, movet omittantur ut vel, vim an offendit prodesset. Sumo summo intellegam vel ei, dicunt persecuti vim ne. Lorem noluisse at est. Per ex postulant philosophia, ut vel amet affert tantas, pro ne consetetur scriptorem. Id mel aeque deleniti.
Nam ut erat eligendi, pro eu minim molestie persequeris. Civibus interesset te nec, cu aeque fabellas luptatum has. Ad usu nominati tractatos. Eu voluptatum disputationi vis, alienum delicatissimi pri eu. Et molestie copiosae nam, ex vix ignota dignissim. Dico suas illum at mea, no case modus antiopam sea.
Ius te copiosae lobortis contentiones. Est ceteros dissentiet ne, qui malis iuvaret tacimates an. Vivendo erroribus nec no. No quo corpora indoctum iracundia, mel ad mollis accusam praesent. Sit at admodum sensibus mediocrem, no pri decore nemore.")
df <- as.data.frame(list)
library(stringr)
df_spl <- str_split(df$list, "\n", n = Inf)
df_spl
Basically, this gives a large list of character vectors, one per original row, each holding that row's paragraphs.
What I ultimately want is a single column vector with all the list items, like this:
vector <- c("Lorem ipsum dolor sit amet, movet omittantur ut vel, vim an offendit prodesset. Sumo summo intellegam vel ei, dicunt persecuti vim ne. Lorem noluisse at est. Per ex postulant philosophia, ut vel amet affert tantas, pro ne consetetur scriptorem. Id mel aeque deleniti.", "Nam ut erat eligendi, pro eu minim molestie persequeris. Civibus interesset te nec, cu aeque fabellas luptatum has. Ad usu nominati tractatos. Eu voluptatum disputationi vis, alienum delicatissimi pri eu. Et molestie copiosae nam, ex vix ignota dignissim. Dico suas illum at mea, no case modus antiopam sea.", "Ius te copiosae lobortis contentiones. Est ceteros dissentiet ne, qui malis iuvaret tacimates an. Vivendo erroribus nec no. No quo corpora indoctum iracundia, mel ad mollis accusam praesent. Sit at admodum sensibus mediocrem, no pri decore nemore.", "Lorem ipsum dolor sit amet, movet omittantur ut vel, vim an offendit prodesset. Sumo summo intellegam vel ei, dicunt persecuti vim ne. Lorem noluisse at est. Per ex postulant philosophia, ut vel amet affert tantas, pro ne consetetur scriptorem. Id mel aeque deleniti." "Nam ut erat eligendi, pro eu minim molestie persequeris. Civibus interesset te nec, cu aeque fabellas luptatum has. Ad usu nominati tractatos. Eu voluptatum disputationi vis, alienum delicatissimi pri eu. Et molestie copiosae nam, ex vix ignota dignissim. Dico suas illum at mea, no case modus antiopam sea.", "Ius te copiosae lobortis contentiones. Est ceteros dissentiet ne, qui malis iuvaret tacimates an. Vivendo erroribus nec no. No quo corpora indoctum iracundia, mel ad mollis accusam praesent. Sit at admodum sensibus mediocrem, no pri decore nemore.", "Lorem ipsum dolor sit amet, movet omittantur ut vel, vim an offendit prodesset. Sumo summo intellegam vel ei, dicunt persecuti vim ne. Lorem noluisse at est. Per ex postulant philosophia, ut vel amet affert tantas, pro ne consetetur scriptorem. Id mel aeque deleniti.", "Nam ut erat eligendi, pro eu minim molestie persequeris. Civibus interesset te nec, cu aeque fabellas luptatum has. Ad usu nominati tractatos. Eu voluptatum disputationi vis, alienum delicatissimi pri eu. Et molestie copiosae nam, ex vix ignota dignissim. Dico suas illum at mea, no case modus antiopam sea.", "Ius te copiosae lobortis contentiones. Est ceteros dissentiet ne, qui malis iuvaret tacimates an. Vivendo erroribus nec no. No quo corpora indoctum iracundia, mel ad mollis accusam praesent. Sit at admodum sensibus mediocrem, no pri decore nemore.")
I've already tried commands like cbind(), stack(), and unnest(), but none of them have gotten me that single column :(
Any help would be greatly, greatly appreciated! Thanks!!
We can unlist the list elements into a vector and paste if we need a single string:
out <- paste(unlist(df_spl), collapse=" ")
To turn a list into a vector you can use:
unlist(df_spl)
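If what you need is the single column the question asks for (one paragraph per row, ready for tidytext) rather than one collapsed string, a minimal sketch along these lines should work; paragraphs and para_df are just illustrative names:
paragraphs <- trimws(unlist(df_spl))        # flatten the list and trim stray whitespace
paragraphs <- paragraphs[paragraphs != ""]  # drop any empty entries
para_df <- data.frame(text = paragraphs)    # one paragraph per row, e.g. for unnest_tokens()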
I am trying to search for movies whose style is "live action". I have a huge movies file with many nodes; I have included only one relevant node to explain my problem.
I have movies.xml as follows:
<movies><movie id="movie89" lang="hebrew">
<label>Metro-Goldwyn-Myer</label>
<title>Purus massa pede gravida erat ad etiam eu auctor blandit laoreet.</title>
<directors>
<director id="person166">Suzanne Gipson</director>
<director id="person912" award="win">Kurt Meadows</director>
<director id="person26" award="win">Harry Jacobs</director>
</directors>
<actors>
<actor role="deuteragonist" id="person337">Luis Munoz</actor>
<actor role="supporting" id="person207" award="win">Hazel Boucher</actor>
<actor role="deuteragonist" id="person595">Lori Neace</actor>
<actor role="supporting" id="person293">Lauren Bradburn</actor>
<actor role="deuteragonist" id="person591">Leona Rosenzweig</actor>
<actor role="supporting" id="person285">Susan Peterson</actor>
<actor role="supporting" id="person242" award="nom">Larry Gaudin</actor>
<actor role="protagonist" id="person76">Thelma Getter</actor>
<actor role="deuteragonist" id="person2">Zachery Cooley</actor>
<actor role="protagonist" id="person834">Harold James</actor>
<actor role="deuteragonist" id="person956">William Hayes</actor>
<actor role="supporting" id="person931">Christi Abbott</actor>
<actor role="protagonist" id="person666">Larry James</actor>
<actor role="protagonist" id="person253">Gary Scanlon</actor>
<actor role="deuteragonist" id="person744" award="win">Rachel Deloach</actor>
<actor role="supporting" id="person183">Patricia Wilmoth</actor>
<actor role="supporting" id="person831">Linda Rayner</actor>
<actor role="protagonist" id="person115">Sadie King</actor>
<actor role="protagonist" id="person949">Fred Breden</actor>
<actor role="deuteragonist" id="person45">Fern Seibold</actor>
<actor role="protagonist" id="person337">Luis Munoz</actor>
</actors>
<extras>
<sponsor id="person130">David Hellman</sponsor>
<stunt_person id="person335">Robert Harris</stunt_person>
<screenwriter id="person120">Toni Carlin</screenwriter>
<artist id="person433" award="nom">James Munroe</artist>
<screenwriter id="person113">Latoya Martin</screenwriter>
<executive_producer id="person840">Mirian Ritchie</executive_producer>
<sponsor id="person43">William Wilder</sponsor>
<choreographer id="person143">Lola Myklebust</choreographer>
<stunt_person id="person781" award="nom">Melvin Garza</stunt_person>
<dialogue_writer id="person50">Maude Ward</dialogue_writer>
<casting id="person422">William Bullock</casting>
<costume_designer id="person169">Luke Robinson</costume_designer>
<cinematographer id="person549">Murray Mosser</cinematographer>
<cinematographer id="person518" award="nom">James Davis</cinematographer>
<casting id="person90" award="nom">Barbara Sheppard</casting>
<business_partner id="person269">Kevin Rhodes</business_partner>
<stunt_person id="person986" award="nom">Jeremy Earp</stunt_person>
<screenwriter id="person848">Mary Hall</screenwriter>
<sponsor id="person925">Jennifer Hager</sponsor>
<sponsor id="person758" award="win">William Austin</sponsor>
<sponsor id="person440">Pauline Carter</sponsor>
<costume_designer id="person259">Wallace Gravatt</costume_designer>
<business_partner id="person716">David Crowder</business_partner>
<casting id="person916">Derek Thompson</casting>
<artist id="person89" award="nom">Patricia Sloan</artist>
<stunt_person id="person610">James Garrett</stunt_person>
<sponsor id="person861">Richard Moeller</sponsor>
<locations id="person982">Helen Fountain</locations>
<choreographer id="person618">Roy Penick</choreographer>
<dialogue_writer id="person630">Mary Hernandez</dialogue_writer>
<executive_producer id="person885">Linda Welborn</executive_producer>
<cinematographer id="person326">Wava Huntsinger</cinematographer>
<executive_producer id="person60" award="win">Jennifer Gibson</executive_producer>
<executive_producer id="person45" award="nom">Fern Seibold</executive_producer>
<casting id="person233">Clifford Hall</casting>
<cinematographer id="person120">Toni Carlin</cinematographer>
<business_partner id="person764">Emily Hicks</business_partner>
<business_partner id="person229">Robert Campbell</business_partner>
<choreographer id="person940">Jeffrey Vaughan</choreographer>
<dialogue_writer id="person328">Benjamin Riekena</dialogue_writer>
<dialogue_writer id="person634">Michael Smartt</dialogue_writer>
<casting id="person927">Tommie Young</casting>
<choreographer id="person908">Dorothy Varner</choreographer>
<costume_designer id="person489">Naomi Kempinski</costume_designer>
<cinematographer id="person714">Sarah Wilgus</cinematographer>
<producer id="person170">Janet Zaiser</producer>
<casting id="person572">Brianna Price</casting>
<executive_producer id="person35">Suzanne Wright</executive_producer>
<sponsor id="person645">Paula Montoya</sponsor>
<cinematographer id="person282">Stacy Espinoza</cinematographer>
<choreographer id="person965" award="nom">Charles Salamone</choreographer>
<business_partner id="person547">Annie Rafferty</business_partner>
</extras>
<genres>
<genre>musical</genre>
</genres>
<styles>
<style>live action</style>
</styles>
<released>
<release region="NA">1930-2-27</release>
<release region="CAUC">1982-10-19</release>
</released>
<formats>
<format type="CD">
<description region="AF">LD</description>
</format>
<format type="DVD">
<description region="ME">3D</description>
</format>
<format type="BD">
<description region="CAUC">3D</description>
</format>
<format type="CD">
<description region="EU">3D</description>
</format>
<format type="DVD">
<description region="SEA">LD</description>
</format>
<format type="Stream">
<description region="SEA">3D</description>
</format>
<format type="CD">
<description region="EE">3D</description>
</format>
<format type="DVD">
<description region="ME">4K</description>
</format>
</formats>
<budget>19831177</budget>
<earnings>
<ticket_sales>38287157</ticket_sales>
<digital_sales>
<ppv_sales>18822458</ppv_sales>
<stream_sales>34280453</stream_sales>
<disc_sales>3977022</disc_sales>
</digital_sales>
</earnings>
<length unit="mins">310</length>
<country>Falkland Islands (Malvinas)</country>
<reviews>
<review proid="critic1">
<text/>
</review>
<review proid="critic2">
<text>Vitae massa.<emph>Curae fames elit gravida libero nibh suspendisse cubilia dis.</emph>
<br>Massa porta gravida.</br>
<bold>Proin neque pede commodo et leo ve pellentesque iaculis quam nunc diam id fames massa suspendisse eleifend.</bold>
<strong>Lacus proin netus inceptos class justo nunc pharetra sollicitudin.</strong>
</text>
</review>
<review proid="critic3">
<text>Neque class morbi rhoncus iaculis duis habitant.Nulla class primis aenean proin blandit vulputate.Curae morbi.<keyword>Class porta tellus litora nascetur odio sed litora.</keyword>
<strong>Proin curae.</strong>
<bold>Magna nulla lobortis urna sagittis fames nisl fermentum.</bold>Justo risus fermentum pharetra diam posuere ac enim congue mus egestas.<keyword>Class risus.</keyword>
</text>
</review>
<review proid="critic4">
<text>
<bold>Justo morbi libero aptent cum pretium ve fermentum velit duis sem penatibus velit fermentum vel blandit donec.</bold>
<strong>Lorem fames platea vel.</strong>
<emph>Dolor morbi lacus ullamcorper cum at congue.</emph>Porta donec sociosqu est tempor a adipiscing.Magna felis.<keyword>Nulla risus etiam metus nostra.</keyword>
<strong>Donec dolor volutpat massa felis mattis sollicitudin penatibus class ve pellentesque nullam.</strong>
</text>
</review>
<review proid="critic5">
<text>
<bold>Morbi magna pretium orci a ante justo tristique est lacus.</bold>
<bold>Morbi purus ullamcorper.</bold>Felis augue.Donec etiam nibh hendrerit luctus a nascetur amet dapibus venenatis amet natoque.</text>
</review>
<review proid="critic6">
<text>
<bold>Lorem risus platea mi.</bold>Curae purus scelerisque facilisi ante.Porta nulla elit cras sed montes.<keyword>Ipsum metus sit ipsum enim cubilia etiam feugiat eu pede massa tortor nisl urna.</keyword>Ipsum justo.<emph>Neque ipsum neque dis integer placerat at.</emph>Risus augue aliquam nisi etiam.</text>
</review>
<review proid="critic7">
<text>Fames justo.Augue nulla.<bold>Velit morbi neque lobortis nec donec.</bold>
<keyword>Fusce morbi bibendum dis justo dui aliquam placerat.</keyword>Class fusce.Netus proin aptent et elit lobortis iaculis eget ad.</text>
</review>
</reviews>
<synopsis>
<text>
<emph>Augue metus vel lacus tristique viverra tellus mi mus hymenaeos sodales pellentesque malesuada etiam tortor felis at montes commodo rhoncus.</emph>Proin donec litora eu ligula nonummy ut cras suspendisse id cursus egestas venenatis eu elementum.Class donec ornare nulla tellus eget natoque.<strong>Porta fames massa purus donec augue magna.</strong>
</text>
</synopsis>
<ratings max="5">
<rating proid="critic1">4</rating>
<rating proid="critic2">2</rating>
<rating proid="critic3">1</rating>
<rating proid="critic4">3</rating>
<rating proid="critic5">4</rating>
<rating proid="critic6">2</rating>
</ratings>
<tracklist>
<track position="1" duration="1265">
<title>Metus felis interdum nibh ad.</title>
</track>
<track position="2" duration="387">
<title>Morbi porta hendrerit commodo consectetuer.</title>
</track>
<track position="3" duration="403">
<title>Velit metus porta cursus ornare interdum sed.</title>
</track>
<track position="4" duration="1519">
<title>Etiam fames risus a orci augue consequat odio condimentum.</title>
</track>
<track position="5" duration="468">
<title>Donec dolor libero platea integer natoque.</title>
</track>
<track position="6" duration="781">
<title>Proin lorem rutrum magna neque fames purus.</title>
</track>
</tracklist>
<awards>
<award type="nomination">BAFTA Best Film</award>
<award type="nomination">Cannes Best Film</award>
<award type="win">Cannes Best Film</award>
</awards>
</movie></movies>
I am trying to use the following query:
for $doc in db:open("movies","movies.xml")/movies/movie/styles
where style="live action"
return $doc/..
However, nothing gets selected, even though the movie node above contains a style of "live action".
What am I doing wrong?
In the where clause you access style via the context item, which FLWOR expressions do not bind. Use the loop variable instead:
for $doc in db:open("movies","movies.xml")/movies/movie/styles
where $doc/style="live action"
return $doc/..
Additionally, if you want the matching movies as the result, I'd loop over the movies themselves, which gives more readable code and keeps the result tied to what you're looping over. I've also changed the variable name to a more descriptive one:
for $movie in db:open("movies","movies.xml")/movies/movie
where $movie/styles/style="live action"
return $movie
Finally, a simple predicate also works, without an explicit loop (inside a predicate the context item is bound, so you don't need a variable at all):
db:open("movies","movies.xml")/movies/movie[styles/style="live action"]
Suppose I have data like the following:
lab <- "A really really long string!"
dat <- data.frame(grp = paste(1:6,lab),x=1:6,y=runif(6))
When plotting a legend with strings this long, sometimes it can be a challenge to get the legend to fit nicely. If I have to I can always abbreviate the strings to shorten them, but I was wondering if it's possible (most likely using some grid magic) to 'wrap' a legend across multiple rows or columns. For instance, say I position the legend on the bottom, horizontally:
ggplot(dat, aes(x = x, y = y, colour = grp)) + geom_point() +
  theme(legend.position = "bottom", legend.direction = "horizontal")
Is it possible to get this legend to display as two rows of three, rather than one row of six?
To wrap long strings, use strwrap.
lipsum <- "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Curabitur ullamcorper tellus vitae libero placerat aliquet egestas tortor semper. Maecenas pellentesque euismod tristique. Donec semper interdum magna, commodo vehicula ante hendrerit vitae. Maecenas at diam sollicitudin magna mollis lobortis. In nibh elit, tincidunt eu lobortis ac, molestie a felis. Proin turpis leo, iaculis non commodo quis, venenatis at justo. Duis in magna vel erat fringilla gravida quis non nisl. Nunc lacus magna, varius eu luctus vel, luctus tristique sapien. Suspendisse mi dolor, vestibulum at facilisis elementum, lacinia vitae metus. Etiam ut nisl urna, vel tempus mi. In hac habitasse platea dictumst. Quisque pretium volutpat felis, nec tempor diam faucibus at. Praesent volutpat posuere sapien, eu vulputate risus molestie vitae. Proin iaculis quam non leo porttitor hendrerit."
strwrap(lipsum)
cat(strwrap(lipsum), sep = "\n")
# Lorem ipsum dolor sit amet, consectetur adipiscing elit. Curabitur ullamcorper tellus
# vitae libero placerat aliquet egestas tortor semper. Maecenas pellentesque euismod
# tristique. Donec semper interdum magna, commodo vehicula ante hendrerit vitae. Maecenas
# at diam sollicitudin magna mollis lobortis. In nibh elit, tincidunt eu lobortis ac,
# molestie a felis. Proin turpis leo, iaculis non commodo quis, venenatis at justo. Duis
# in magna vel erat fringilla gravida quis non nisl. Nunc lacus magna, varius eu luctus
# vel, luctus tristique sapien. Suspendisse mi dolor, vestibulum at facilisis elementum,
# lacinia vitae metus. Etiam ut nisl urna, vel tempus mi. In hac habitasse platea
# dictumst. Quisque pretium volutpat felis, nec tempor diam faucibus at. Praesent
# volutpat posuere sapien, eu vulputate risus molestie vitae. Proin iaculis quam non leo
# porttitor hendrerit.
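If the goal is specifically the legend from the question, here is a minimal sketch of applying that wrapping to the legend labels; it assumes a current ggplot2 (where theme() replaces opts()), and wrap15 is just an illustrative helper:
library(ggplot2)

lab <- "A really really long string!"
dat <- data.frame(grp = paste(1:6, lab), x = 1:6, y = runif(6))

# wrap each legend label at roughly 15 characters per line
wrap15 <- function(x) vapply(strwrap(x, width = 15, simplify = FALSE),
                             paste, character(1), collapse = "\n")

ggplot(dat, aes(x = x, y = y, colour = grp)) +
  geom_point() +
  scale_colour_discrete(labels = wrap15) +
  theme(legend.position = "bottom", legend.direction = "horizontal")
For the literal two-rows-of-three layout asked about, current ggplot2 also offers guides(colour = guide_legend(nrow = 2)), independent of any string wrapping.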
Try this. I wrote it for very long titles, but it works for any long string.
You still have to figure out the right linelength for your case.
# splits the title of a plot if it is too long
splittitle <- function(title, linelength = 40) {
  spltitle <- strsplit(title, ' ')
  splt <- as.data.frame(spltitle)
  title2 <- NULL
  title3 <- NULL
  titlelength <- round(nchar(title) / round(nchar(title) / linelength))
  dimsplt <- dim(splt)
  n <- 1
  doonce2 <- 0
  for (m in 1:round(nchar(title) / linelength)) {
    doonce <- 0
    doonce2 <- 0
    for (l in n:dimsplt[1]) {
      if (doonce == 0) { title2 <- title3 }
      title2 <- paste(title2, splt[l, ], sep = ' ')
      if (doonce2 == 0) {
        if (nchar(title2) >= (titlelength * m)) {
          title3 <- paste(title2, '\n', sep = '')
          n <- (l + 1)
          doonce2 <- 1
        }
      }
      doonce <- 1
    }
  }
  title2
}
lab <- "A really really long string!A really really long string!A really really long string!A really really long string!A really really long string!A really really long string!A really really long string!A really really long string!"
lab2<-splittitle(lab)
cat(lab)
cat(lab2)
library('ggplot2')
1. original:
dat <- data.frame(grp = paste(1:6, lab), x = 1:6, y = runif(6))
ggplot(dat, aes(x = x, y = y, colour = grp)) + geom_point() +
  theme(legend.position = "bottom", legend.direction = "horizontal")
2. using splittitle:
dat <- data.frame(grp = paste(1:6, lab2), x = 1:6, y = runif(6))
ggplot(dat, aes(x = x, y = y, colour = grp)) + geom_point() +
  theme(legend.position = "bottom", legend.direction = "horizontal")
The splittitle function mentioned earlier almost works, but, for example,
> splittitle("abc defg hi jkl m", 6)
[1] " abc defg\n hi\n jkl m"
does not really give you what you want...
One trick is to use RGraphics::splitString which
"Splits a single string into multiple lines (by inserting line breaks)
so that the output will fit within the current viewport."
Then you just change the viewport temporarily. The function below did the trick for me, but it is still only a quick-and-dirty solution; I used it to wrap a legend title.
library(RGraphics)  # provides splitString()
library(grid)       # viewport(), pushViewport(), popViewport()

multiLines <- function(text, maxWidth = 11) {
  textLen <- nchar(text)
  maxHeight <- ceiling(textLen / maxWidth) * 1.5
  vp <- viewport(width = maxWidth, height = maxHeight, default.units = "char")
  pushViewport(vp)            # activate the viewport
  text2 <- splitString(text)  # given vp, split the text to fit it
  popViewport()               # get rid of the temporary viewport
  return(text2)
}
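A usage sketch, assuming RGraphics is installed (a graphics device may be opened as a side effect of the measurement); long_title, the sample data, and the width of 20 are just illustrative:
library(ggplot2)

long_title <- "A really really long legend title that would otherwise overflow"
dat <- data.frame(grp = paste(1:6, "group"), x = 1:6, y = runif(6))

ggplot(dat, aes(x = x, y = y, colour = grp)) +
  geom_point() +
  labs(colour = multiLines(long_title, maxWidth = 20)) +  # wrapped legend title
  theme(legend.position = "bottom")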