I have refactored a man page's paragraph so that each sentence is on its own line. When rendering with man ./somefile.3, the output is slightly different.
Let me show an example:
This is line 1. This is line 2.
vs.
This is line 1.
This is line 2.
These render like so:
First:
This is line 1. This is line 2.
Second:
This is line 1.  This is line 2.
In the second case there is an extra space between the sentences, even though I have made sure the source has no extra white space. I don't think it should be there, and I'd like to avoid it if possible. I have more experience with LaTeX, AsciiDoc, and Markdown, where I can control this. Is it possible with troff/groff?
The troff input convention is to put a newline at the end of each sentence and let the typesetter do its job with filling. (Although I doubt it was the intent, this also plays nicer with source control.) Therefore, troff treats a line that ends with a period (or ? or !, optionally followed by ', ", *, ], ), or †) as the end of a sentence. It also believes that sentences should have two spaces between them. This almost certainly derives from the typography standards at Bell Labs at the time; it's rather curious that this behavior is not settable through any fill modes.
groff does provide a way to set the "inter-sentence" spacing, with the extended .ss request:
.ss word_space_size [sentence_space_size]
Change the size of a space between words. It takes its units as one
twelfth of the space width parameter for the current font. Initially
both the word_space_size and sentence_space_size are 12. In fill mode,
the values specify the minimum distance.
If two arguments are given to the ss request, the second argument sets
the sentence space size. If the second argument is not given, sentence
space size is set to word_space_size. The sentence space size is used
in two circumstances: If the end of a sentence occurs at the end of a
line in fill mode, then both an inter-word space and a sentence space
are added; if two spaces follow the end of a sentence in the middle of
a line, then the second space is a sentence space. If a second
argument is never given to the ss request, the behaviour of UNIX troff
is the same as that exhibited by GNU troff. In GNU troff, as in UNIX
troff, a sentence should always be followed by either a newline or two
spaces.
So you can specify that the "sentence space" should be zero-width by making the request
.ss 12 0
As far as I know, this is a groff extension; heirloom troff supports it, but older dwb derived versions may not.
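To make the rule quoted above concrete, here is a rough Python model of how fill mode joins input lines. It is only a sketch under stated assumptions, not how groff is implemented: the names fill and ends_sentence are made up for illustration, spaces are counted in whole character cells (so ".ss 12 12" is modelled as 1 and 1, and ".ss 12 0" as 1 and 0), blank lines are simply skipped, and the "two spaces in the middle of a line" case is not modelled.

```python
# Toy model of troff fill-mode joining; not groff's actual algorithm.
SENTENCE_END = (".", "?", "!")
TRAILERS = "'\")]*†"  # characters that may follow the end-of-sentence mark


def ends_sentence(line):
    """True if the line ends with . ? or !, ignoring trailing quotes/brackets."""
    return line.rstrip(TRAILERS).endswith(SENTENCE_END)


def fill(lines, word_space=1, sentence_space=1):
    """Join input lines the way the quoted rule describes.

    At every line end an inter-word space is added; if the previous line
    ended a sentence, a sentence space is added on top of it.
    """
    out = ""
    prev = ""
    for raw in lines:
        text = raw.strip()
        if not text:
            continue  # simplification: real troff breaks on blank lines
        if out:
            out += " " * word_space
            if ends_sentence(prev):
                out += " " * sentence_space
        out += text
        prev = text
    return out


source = ["This is line 1.", "This is line 2."]
print(repr(fill(source)))                    # default spacing
print(repr(fill(source, sentence_space=0)))  # models ".ss 12 0"
```

With the default values the two sentences end up separated by two spaces; with sentence_space=0 they are separated by one, which is the effect the .ss request is after.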
Example:
This is line 1. This is line 2.
This is line 1.  This is line 2.
This is line 1.
This is line 2.
SET SENTENCE SPACING
.ss 12 0
This is line 1. This is line 2.
This is line 1.  This is line 2.
This is line 1.
This is line 2.
Results:
$ groff -T ascii spaces.tr |sed -n -e/./p
This is line 1. This is line 2.
This is line 1.  This is line 2.
This is line 1.  This is line 2.
SET SENTENCE SPACING
This is line 1. This is line 2.
This is line 1. This is line 2.
This is line 1. This is line 2.
So the following will work, but I hope there is a better option.
This is line 1. \
This is line 2.
renders as
This is line 1. This is line 2.
I am trying to achieve the below using XQuery
Input
<DemoXML>
This is a sample line one
this is line number two
this line contains multiple spaces
paragraph ends
</DemoXML>
Required Output(Two Records)
<Record1>
This is a sample line one
this line contains multiple spaces
paragraph ends
</Record1>
<Record2>
This is a sample line one
this line contains multiple spaces
paragraph ends
</Record2>
I tried using tokenize, but the problem is that the tokenize function removes all the spaces in the second line:
this is line number two
fn:tokenize($input,'\n')
Tokenize Output
This is a sample line one
this is line number two
this line contains multiple spaces
paragraph ends
Can someone suggest a workaround, please?
Your query works fine for me; the generated output is included below for reference. The issue may be in the processor you are using. I tested this query in the MarkLogic console and in Oxygen Editor with XQuery 9.6.0.7.
let $val1:=
This is a sample line one
this is line number two
this line contains multiple spaces
paragraph ends
return tokenize($val1,'\n')
Generated output:
This is a sample line one this is line number two this line contains multiple spaces paragraph ends
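As a cross-check outside XQuery, the same point can be shown in Python: splitting on a newline by itself leaves runs of spaces inside each piece untouched, and the spaces only disappear if something normalizes whitespace afterwards (roughly what fn:normalize-space does in XPath). This is an analogy, not the asker's environment, and the sample string with its spacing is made up for illustration.

```python
# Hypothetical sample: the spacing in the second line is invented for the demo.
sample = "line one\nline    with   gaps\nlast line"

# Splitting on the newline only: inner runs of spaces survive.
pieces = sample.split("\n")

# A separate normalization step is what collapses them.
normalized = [" ".join(piece.split()) for piece in pieces]

print(pieces)
print(normalized)
```

If the tokenized result has lost its inner spaces, it may be worth looking for a normalization step in the processor or in how the output is serialized, rather than in tokenize itself.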
I have got an expression – ]006IRBTS1[ g600 niT erauqS ehcoirB g004 g001 /p 57.01$ hcnuB /p 51.2$
I want to extract the portion in bold. The logic is:
Start with “]“.
Take everything until you get to “[“ including “[“.
Include the next 10 characters/digits whatever it is.
After those 10 characters/digits, include all letters and white spaces
until you hit a digit. Capture the digit and everything that follows until you hit a whitespace.
I am using the following regular expression in R. It doesn’t work of course. Any thoughts?
"^].+\\[.{10}[A-Za-z\\s]+[0-9\\.]+\\s"
1) Start with “]“.
\]
2) Take everything until you get to “[“ including “[“.
[^\[]+\[
3) Include the next 10 characters/digits whatever it is.
.{,10}
4) After those 10 characters/digits, include all letters and white spaces until you hit a digit.
[a-zA-Z\s]+\d
5) Capture the digit and everything that follows until you hit a whitespace.
[^\s]+
Combined:
\][^\[]+\[.{,10}[a-zA-Z\s]+\d[^\s]+
Regex101: https://regex101.com/r/TpoV52/1
UPDATE
I changed the very last quantifier from + to * so it can match zero or more characters.
That is because, given "Capture the digit and everything that follows until you hit a whitespace", a whitespace may follow immediately after that digit. This is the case in the second subject string you gave in your comment:
]006IRBTS1[ g600 niT erauqS ehcoirB g4 g001 /p 57.01$ hcnuB /p 51.2$
The updated pattern, below, will stop right after capturing that digit (g4), because "everything that follows until you hit a whitespace" is actually nothing. (Whitespace is the next character after the digit.)
\][^\[]+\[.{,10}[a-zA-Z\s]+\d[^\s]*
Regex101: https://regex101.com/r/TpoV52/2
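For a quick check outside regex101, here is a small Python sketch that runs the updated pattern against both subject strings from this thread. It is Python rather than R, so treat it as an illustration only; the helper name extract is made up. Note that Python reads {,10} as "between 0 and 10", and other regex flavors may treat that form differently, so writing {10} is the safer spelling if exactly ten characters are required.

```python
import re

# The updated pattern from the answer, copied verbatim.
PATTERN = re.compile(r"\][^\[]+\[.{,10}[a-zA-Z\s]+\d[^\s]*")

first = "]006IRBTS1[ g600 niT erauqS ehcoirB g004 g001 /p 57.01$ hcnuB /p 51.2$"
second = "]006IRBTS1[ g600 niT erauqS ehcoirB g4 g001 /p 57.01$ hcnuB /p 51.2$"


def extract(text):
    """Return the first match of PATTERN in text, or None if there is none."""
    match = PATTERN.search(text)
    return match.group(0) if match else None


print(extract(first))   # ]006IRBTS1[ g600 niT erauqS ehcoirB g004
print(extract(second))  # ]006IRBTS1[ g600 niT erauqS ehcoirB g4
```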
I have a table with a cell which contains a textframe which contains a table. In some of the cells inside the inside table, I have added a paragraph. Inside the paragraph I have placed text, via the addtext method, like "WordA WordB". The cell's size will cause a line break between "WordA" and "WordB".
The problem is that I am expecting:
WordA
WordB
What I am getting is:
WordA
WordB
Is there a setting somewhere to get what I expect or is this a bug in the renderer?
I think it's a bug - a bug that typically shows when words are longer than the column width allows.
For typical scenarios (short words in wide columns) this problem will not show up. With long words in narrow columns you sometimes get this bug. Hyphens or soft hyphens in long words will allow MigraDoc to break the words correctly.
It was a bug in the paragraph renderer (ParagraphRenderer.cs). I actually found 2 bugs. The first occurs if the current line doesn't fit and the next "Text" is a blank (" "). The second occurs if the current line is a blank (" ") and the next line doesn't fit.
The first bug was easy to fix: I changed the HandleNonFittingLine subroutine to keep advancing until this.currentLeaf is not a blank (" ").
The second bug was harder to figure out and fix. I had to get the Format function to find the next leaf and pass the Current property, of the Next Leaf, to the FormatElement (if the next leaf exists). I then had to modify the FormatElement function to optionally take a second parameter. Then I modified the FormatElement function by returning FormatResult.Ignore if the current leaf is a blank (" ") and the next leaf doesn't fit on the current line with the blank (" ").
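The intended behaviour after both fixes can be sketched with a toy wrapper in Python. This is not MigraDoc code and does not mirror ParagraphRenderer.cs; the function name wrap_leaves, the list-of-strings "leaves" and the character-count width are all assumptions made for illustration. The idea is only that a blank at a line break should be swallowed, whether it falls at the start of the new line or at the end of the old one.

```python
def wrap_leaves(leaves, width):
    """Toy line breaker: leaves are words and single-blank strings (" ")."""
    lines = []
    current = ""
    i = 0
    while i < len(leaves):
        leaf = leaves[i]
        # An empty line always accepts the leaf, so an over-long word
        # cannot stall the loop.
        if not current or len(current) + len(leaf) <= width:
            current += leaf
            i += 1
            continue
        # The leaf does not fit: close the line without a trailing blank
        # (the "blank fits but the next word does not" case).
        lines.append(current.rstrip(" "))
        current = ""
        # Keep advancing past blanks so the new line does not start with one
        # (the "line does not fit and the next leaf is a blank" case).
        while i < len(leaves) and leaves[i] == " ":
            i += 1
    if current:
        lines.append(current.rstrip(" "))
    return lines


print(wrap_leaves(["WordA", " ", "WordB"], 5))
print(wrap_leaves(["WordA", " ", "WordB"], 6))
```

Both calls give two lines, "WordA" and "WordB", with no stray blank on either side of the break.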
Is there some trick or workaround that allows me to write a code block that starts with a space in reStructuredText? Something like:
This line is indented
This line is not
The naïve attempt:
::
This line is indented
This line is not
obviously doesn't work (the second line is not interpreted as part of the same block). Visually I could obtain the same thing by using a non-breaking space, but then it would affect copy-paste.
I am using ed, a Unix line editor, and the book I'm reading says to type 1,$p
(also works in vim)
After trial and error I figured out that the first value is a line number, but what is the purpose of $p? From what I can tell, the 1 goes to the beginning of the line, $p goes to EOF, and everything picked up along the way is displayed to me. Is this true, or am I way off?
The 1,$ part is a range. The comma separates the beginning and end of the range. In this case, 1 (line 1) is the beginning, and $ (the last line) is the end. The p means print, which is the command the range is being given to. And yes, it displays to you what is in that range.
In vim you can look at :help :range and :help :print to find out more about how this works. These types of ranges are also used by sed and other editors.
They probably used the 1,$ terminology in the tutorial to be explicit, but note that you can also use % as its equivalent. Thus, %p will also print all the lines in the file.
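If it helps to see the pieces separately, here is a small Python sketch of how such a command breaks down into a range plus the p command. It is a simplified model written for this answer, not how ed or vim parse commands: the function name ed_print is made up, and only numeric addresses, $ and % are handled.

```python
def ed_print(lines, command):
    """Return the lines selected by a command such as '1,$p', '2,3p' or '%p'."""
    if not command.endswith("p"):
        raise ValueError("only the p (print) command is modelled")
    address = command[:-1]
    if address == "%":
        address = "1,$"  # % is shorthand for the whole file
    start_text, comma, end_text = address.partition(",")
    last = len(lines)

    def resolve(token):
        # $ means the last line; anything else must be a line number.
        return last if token == "$" else int(token)

    start = resolve(start_text)
    end = resolve(end_text) if comma else start
    return lines[start - 1:end]  # line numbers are 1-based and inclusive


buffer = ["first", "second", "third"]
print(ed_print(buffer, "1,$p"))
print(ed_print(buffer, "%p"))
print(ed_print(buffer, "2p"))
```

The first two calls return all three lines, and the last one returns only the second line.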