\underline appearing too low in MathJax - vector

I prefer to write vectors with the underline symbol:
$\underline{x} \; \underline{x}_k \; \underline{y} \; \underline{z}$
But when I do this in MathJax, the appearance differs noticeably from the "normal" LaTeX-to-PDF workflow: the gap between a symbol and its underline is much larger.
Normal LaTeX:
MathJax:
The underlines in MathJax are placed lower than they should be.
How can I adjust the distance between a symbol and its underline in MathJax?
Here is another example:
$\underline{x} \, \underline{\smash{x}} \, \underline{y} \, \underline{\smash{y}}
\quad \underline{x}_k \, \underline{\smash{x}}_k \, \underline{y}_k \, \underline{\smash{y}}_k
\quad \underline{x}{}_k \, \underline{\smash{x}}{}_k \, \underline{y}{}_k \, \underline{\smash{y}}{}_k $
LaTeX:
MathJax:


Remove latin1 chars from a file

My file is UTF-8 but contains several characters from other, non-English languages. My aim is to get rid of these characters using a Unix command. When I tried earlier to achieve this by removing all non-ASCII characters, the command below removed all the accented characters as well. I want to retain the accented characters and remove only the non-English terms (Mandarin, Japanese, Korean, Thai, Arabic) from the file.
grep --color='auto' -P -n "[\x80-\xFF]" file.txt
This command helped me with the non-ASCII characters, but it also catches the accented characters (í, æ, Ö, etc.). Is it possible to get the output shown below from this input?
Input:
888|Jobin|Matt|NORMALSQ|YUOZ|IOP|OPO|洁|ID12|doorbell|geo#xyx.comd
1011|ICE|LAND|邵|DUY|DUY|123|EOP|dataset1|geo#xyx.com
53101|炜|GUTTI|RR|Hi|London|UK|WLU|GB|dataset1|陈
สัอ |JOH|LIU|ABC|DUY|DUY|57T2P|EOP|unknown|geo#xyx.com
เมื่รกเริ่ม|JOH|LIU|ABC|DUYសា|DUY|57T2P|EOP|unknown|geo#xyx.com
👼|👼🏻| RAVI|OLE|Hi|London|UK|NA|GB|unknown| WELSH#WELSH.COM
Rogério|Davies|Hi|USA|US|WLU|US|unknown| USA#WELSH.COM
Balázs| Roque| Hi|USA|US|WLU|US|unknown| USA#WELSH.COM
Johny|Peniç| Hi|USA|US|WLU|US|unknown| USA#WELSH.COM
Mike|Mane| Hi | USA |US|WLU|US|unknown| USA#WELSH.COM
Output:
888|Jobin|Matt|NORMALSQ|YUOZ|IOP|OPO||ID12|doorbell|geo#xyx.comd
1011|ICE|LAND||DUY|DUY|57T2P|EOP|dataset1|geo#xyx.com
53101||GUTTI|RR|Hi|London|UK|WLU|GB|dataset1|
|JOH|LIU|ABC|DUY|DUY|57T2P|EOP|unknown|geo#xyx.com
|JOH|LIU|ABC|DUY|DUY|57T2P|EOP|unknown|geo#xyx.com
|| RAVI|OLE|Hi|London|UK|NA|GB|unknown| WELSH#WELSH.COM
Rogério|Davies|Hi|USA|US|WLU|US|unknown| USA#WELSH.COM
Balázs| Roque| Hi|USA|US|WLU|US|unknown| USA#WELSH.COM
Johny|Peniç| Hi|USA|US|WLU|US|unknown| USA#WELSH.COM
Mike|Mane| Hi | USA |US|WLU|US|unknown| USA#WELSH.COM
You can use Unicode properties to detect characters that belong to the Basic Latin block or the Latin script, which are the ones you seem to want preserved. Perl supports them in regular expressions:
perl -CSD -pe 's/[^\p{Basic Latin}\p{Latin}]//g' file.txt
(but it doesn't change 123 to 57T2P)
-CSD turns on UTF-8 decoding/encoding of input and output
-p reads the input line by line and prints each line after processing
s/PATTERN/REPLACEMENT/g is a global replacement, it replaces all occurrences of PATTERN by the replacement, in this case the replacement is empty
[...] introduces a character class, ^ at the beginning negates it, i.e. we want to match anything that's not Latin or Basic Latin.
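As a cross-check outside Perl, the same filter can be sketched in Python. Python's built-in re module has no \p{...} properties, so this sketch approximates the Latin script by looking at Unicode character names. keep_latin is a hypothetical helper name, and the name-prefix test is an assumption rather than an exact equivalent of \p{Latin} (for example, a few symbols also have names starting with "LATIN").

```python
import unicodedata


def keep_latin(text: str) -> str:
    """Drop every character that is neither ASCII nor a Latin-script letter."""

    def is_wanted(ch: str) -> bool:
        if ord(ch) < 128:
            # Basic Latin block: ASCII letters, digits, '|', '#', newline, ...
            return True
        # Assumption: Latin-script letters have Unicode names starting with
        # "LATIN" (e.g. "LATIN SMALL LETTER E WITH ACUTE").
        return unicodedata.name(ch, "").startswith("LATIN")

    return "".join(ch for ch in text if is_wanted(ch))


if __name__ == "__main__":
    print(keep_latin("888|Jobin|Matt|NORMALSQ|YUOZ|IOP|OPO|洁|ID12|doorbell|geo#xyx.comd"))
    print(keep_latin("Johny|Peniç| Hi|USA|US|WLU|US|unknown| USA#WELSH.COM"))
```

Like the Perl one-liner, this only deletes characters; it leaves the field separators and the accented Latin letters alone.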
If you really have UTF-8 and want to keep only the extended ASCII characters (which usually means latin1), iconv may work for you.
iconv -c -f UTF8 -t LATIN1 input_file > output_file
-c Silently discard characters that cannot be converted instead
of terminating when encountering such characters.
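The effect of iconv -c can be imitated in Python for experimentation. This is a minimal sketch, assuming the text has already been decoded from UTF-8; drop_non_latin1 is a hypothetical name. Unlike iconv, it returns a Python string rather than latin1-encoded bytes.

```python
def drop_non_latin1(text: str) -> str:
    """Discard characters that have no latin1 (ISO-8859-1) encoding."""
    # errors="ignore" silently skips characters that cannot be encoded,
    # which is the role -c plays for iconv.
    return text.encode("latin-1", errors="ignore").decode("latin-1")


if __name__ == "__main__":
    print(drop_non_latin1("53101|炜|GUTTI|RR|Hi|London|UK|WLU|GB|dataset1|陈"))
    print(drop_non_latin1("Rogério|Davies|Hi|USA|US|WLU|US|unknown| USA#WELSH.COM"))
```

Accented letters outside latin1 would be dropped as well, so this is narrower than the Unicode-property approach.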
Here is the least elegant solution to your problem:
$ sed -e 's/[^,./#|[:space:]0-9[=a=][=b=][=c=][=d=][=e=][=f=][=g=][=h=][=i=][=j=][=k=][=l=][=m=][=n=][=o=][=p=][=q=][=r=][=s=][=t=][=u=][=v=][=w=][=x=][=y=][=z=][=A=][=B=][=C=][=D=][=E=][=F=][=G=][=H=][=I=][=J=][=K=][=L=][=M=][=N=][=O=][=P=][=Q=][=R=][=S=][=T=][=U=][=V=][=W=][=X=][=Y=][=Z=]]//g' file.txt
To my big surprise I could not use [:punct:] because some of the symbols are actually defined as punctuation.

r-markdown: German quotation marks break bold text in HTML document

When German quotation marks („ and “ or HTML code &bdquo; and &ldquo;, see https://unicode-table.com/de/201E/ and https://unicode-table.com/de/201C/) are in between bold text markers **...**, then pandoc does not render the text bold when I knit in RStudio. Even worse, the **s are printed verbatim in the HTML document.
Example:
---
output: html_document
lang: de
---
This is a **„Test“**.
Another **„Test“**.
This **"just works"**.
Result:
Are there any pandoc options or workarounds for solving this problem?
Note that a similar question was answered for PDF output in r-markdown: German quotation marks. But I need HTML output.
The issue tracking input of localized quotes is https://github.com/jgm/pandoc/issues/661.
Meanwhile, I recommend using non-typographic quotes (") and, for HTML output, the --html-q-tags option together with some CSS, like:
q {
  quotes: '„' '“';
}
My workaround: I used the command-line tool sed with regular expressions.
First, modify the .Rmd (or .md) file and replace all the German typographic quotation marks with standard quotation marks (WARNING: these commands change the file in place!):
sed -i 's/„/"/g' mydocument.Rmd
sed -i 's/“/"/g' mydocument.Rmd
Knit the document (or convert it to HTML with pandoc).
Then, replace all the English typographic quotation marks with German ones:
sed -i "s/“/„/g" mydocument.html
sed -i "s/”/“/g" mydocument.html
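The two-step workaround above can also be sketched in Python, for anyone who prefers not to edit files in place with sed. The function names are hypothetical, and the sketch assumes, as the workaround does, that the knitted HTML contains English typographic quotes (“ and ”).

```python
# Step 1 mapping: German typographic quotes -> plain ASCII quote (before knitting).
TO_STRAIGHT = str.maketrans({"„": '"', "“": '"'})
# Step 2 mapping: English typographic quotes -> German ones (after knitting).
TO_GERMAN = str.maketrans({"“": "„", "”": "“"})


def straighten(markdown_source: str) -> str:
    """Replace German typographic quotes with straight quotes."""
    return markdown_source.translate(TO_STRAIGHT)


def germanize(html: str) -> str:
    """Replace English typographic quotes with German ones."""
    return html.translate(TO_GERMAN)


if __name__ == "__main__":
    print(straighten("This is a **„Test“**."))
    print(germanize("<strong>“Test”</strong>"))
```

str.translate applies both replacements in a single pass, so the ordering issue of the two sequential sed commands (“ must be handled before ”) does not arise here.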

Count number of slashes in string with zsh

I am trying to get the number of slashes in a string with zsh. I thought it should work like this (remove all non-/ characters and then count what is left):
foo="sdfds/sf/sdf/sdf/sd/f//sdf/"
print ${#foo//[^/]/}
But I get preexec:26: bad pattern: [^. It seems to work for all characters except /. I tried to escape it with backslashes, but it did not work until I added three backslashes:
print ${#foo//[^\\\/]/}
Why does the slash need to be escaped like this?
Edit: Yes, it seems to work with one backslash using zsh -f.
My setopt:
autocd
autopushd
nobeep
completeinword
correct
extendedglob
extendedhistory
nohistbeep
histfindnodups
histignorealldups
histignoredups
histignorespace
histnofunctions
histnostore
histreduceblanks
histsavenodups
histverify
nohup
incappendhistory
interactive
interactivecomments
longlistjobs
monitor
nonomatch
promptsubst
pushdignoredups
sharehistory
shinstdin
transientrprompt
zle
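The intended result can be cross-checked outside zsh. Here is a minimal Python sketch of the same idea (delete everything that is not a slash, then take the length), using the string from the question. It says nothing about why zsh needs the extra escaping.

```python
import re

foo = "sdfds/sf/sdf/sdf/sd/f//sdf/"

# Same approach as ${#foo//[^/]/}: remove every non-slash, then measure.
only_slashes = re.sub(r"[^/]", "", foo)
print(len(only_slashes))

# Direct alternative: count the character without any pattern.
print(foo.count("/"))
```

Both approaches give 8 for this string, which is the value the zsh expansion should print once the pattern is accepted.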

Using sed to extract morse code from a text file

I have an assignment to use 'sed' to extract morse code (dashes and periods) from a text file containing the following
A test to see if the morse code can be removed from a file. .--- -. ..
This is a test --. -.- .-- .. -.. --- .- .. of sorts and so on. Let's see if the code snippets can be found.
Also can they be .- . -.- removed and yet leave the periods at the end
of sentences alone. ---- -. There are also hyphenated words like the
following: Edgar-Jones. -.
Now I could use sed to remove all of the characters [a-z] and [A-Z] but the problem would be the periods at the end of sentences would get picked up as well as the hyphen in Edgar-Jones. I just can't find a way to take those out as well...
Any help would be appreciated, thanks
Thanks for all the answers; every one was helpful. This is the one I went with:
sed "s/[a-zA-Z][-.]//g;s/[a-zA-Z: ']*//g" file
It first removes every letter that is followed by a dash or a period, together with that dash or period, which is the case I was having trouble with. It then cleans up the remaining letters, spaces, colons, and apostrophes.
Thanks again!
sed 's/\(^\|[[:blank:]]\)[^[:blank:]]*[^-.[:blank:]][^[:blank:]]*/ /g' file
.--- -. ..
--. -.- .-- .. -.. --- .- ..
.- . -.-
---- -.
-.
That regular expression is:
the beginning of the line, or a space
some non-whitespace chars
followed by a character that is not whitespace or a morse character
followed by some non-whitespace characters
This identifies words that have at least one non-morse character in them, and then replaces them with a single space.
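The same pattern can be tried out in Python to see which words it removes. This is a sketch, not a replacement for the sed command: strip_non_morse is a hypothetical name, and [ \t] stands in for [[:blank:]].

```python
import re

# Same structure as the sed pattern: start of line or a blank, then a run of
# non-blank characters containing at least one character that is not '-' or '.'.
NON_MORSE_WORD = re.compile(r"(^|[ \t])[^ \t]*[^-. \t][^ \t]*")


def strip_non_morse(line: str) -> str:
    """Replace every word that has a non-morse character with a single space."""
    return NON_MORSE_WORD.sub(" ", line)


if __name__ == "__main__":
    sample = "following: Edgar-Jones. -."
    # The result keeps leading blanks, so split() is used to show the tokens.
    print(strip_non_morse(sample).split())
```

Words such as "Edgar-Jones." are removed as a whole because they contain letters, so their hyphen and trailing period never reach the output.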
Simpler with GNU grep, too bad you can't use it:
grep -oP '(?<=^|\s)[.-]+(?=\s|$)' file
Here is an awk command that can do this:
awk '{for (i=1;i<=NF;i++) if ($i!~/[a-zA-Z0-9]/) printf "%s ",$i;print ""}' file
.--- -. ..
--. -.- .-- .. -.. --- .- ..
.- . -.-
---- -.
-.
This tests every field and skips any field that contains a letter or a digit.
Or as Glenn commented:
awk '{for (i=1;i<=NF;i++) if ($i~/^[.-]+$/) printf "%s ",$i;print ""}' file
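The field-by-field test in the second awk command translates almost directly to Python. A minimal sketch, with extract_morse as a hypothetical name:

```python
import re

MORSE_TOKEN = re.compile(r"[.-]+")


def extract_morse(line: str) -> str:
    """Keep only whitespace-separated fields made entirely of '.' and '-'."""
    fields = line.split()  # comparable to awk's default field splitting
    return " ".join(f for f in fields if MORSE_TOKEN.fullmatch(f))


if __name__ == "__main__":
    text = [
        "Also can they be .- . -.- removed and yet leave the periods at the end",
        "of sentences alone. ---- -. There are also hyphenated words like the",
    ]
    for line in text:
        print(extract_morse(line))
```

Unlike the awk version, this joins the kept fields with single spaces and does not leave a trailing space on each line.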
This sed one-liner should do the job of extracting the morse code (dashes and periods) from your example file:
sed "s/[a-zA-Z][-.]//g;s/[a-zA-Z: ']*//g" file
Test with your file:
kent$ cat f1
A test to see if the morse code can be removed from a file. .--- -. ..
This is a test --. -.- .-- .. -.. --- .- .. of sorts and so on. Let's see if the code snippets can be found.
Also can they be .- . -.- removed and yet leave the periods at the end
of sentences alone. ---- -. There are also hyphenated words like the
following: Edgar-Jones. -.
kent$ sed "s/[a-zA-Z][-.]//g;s/[a-zA-Z: ']*//g" f1
.----...
--.-.-.--..-..---.-..
.-.-.-
-----.
-.
sed 's/\.$//
s/\([^-[:space:].]\{1,\}[-.]\{0,1\}\)*//g
s/\([[:space:]]\)\{2,\}/\1/g
' YourFile
This is a POSIX version. The last command replaces each run of whitespace with a single whitespace character.

SED character after the substitute command ("s")

I know about the s// type of command in sed, but I have never seen s# used. Could someone explain what exactly this is doing?
% sed -e "s#SRC_DIR=.*#SRC_DIR=$PROJECT_SRC_DIR#g" -i proj.cfg
I understand that -e defines a script to execute, and that the script is within "", but what exactly does s# do?
I checked http://www.grymoire.com/Unix/Sed.html and the GNU website, but had no luck.
# is a sed delimiter like /. We can use ~, #, /, ;, etc. as sed delimiters. They use a different delimiter, #, because they don't want to escape the / characters. If you use # as the delimiter, you don't need to escape forward slashes. But if you use / as the delimiter, you must escape each / as \/, otherwise sed would treat it as a delimiter.
From sed's manual:
The syntax of the s (as in substitute) command is ‘s/regexp/replacement/flags’. The / characters may be uniformly replaced by any other single character within any given s command. The / character (or whatever other character is used in its stead) can appear in the regexp or replacement only if it is preceded by a \ character.
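To see what the substitution in the question does independently of the delimiter choice, here is a rough Python equivalent. set_src_dir and the sample paths are hypothetical. Python's re.sub takes the pattern and the replacement as separate arguments, so the delimiter problem that # solves in sed (slashes inside $PROJECT_SRC_DIR ending the s command early) does not come up.

```python
import re


def set_src_dir(config_text: str, project_src_dir: str) -> str:
    """Replace everything from SRC_DIR= to the end of each matching line."""
    # A function replacement avoids backslash processing in the new value.
    return re.sub(
        r"SRC_DIR=.*",
        lambda _match: "SRC_DIR=" + project_src_dir,
        config_text,
    )


if __name__ == "__main__":
    cfg = "NAME=demo\nSRC_DIR=/old/path\n"
    print(set_src_dir(cfg, "/home/user/proj/src"), end="")
```

As in sed, .* stops at the end of the line, so only the SRC_DIR line is rewritten.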
