changing specific letters in a Turkish text with R

I am analyzing a Turkish text and need to replace some letters in it. Turkish has the letters ş ç ı ğ ü ö, and I want to replace them with s c i g u o. How can I do that?
I tried the following for one letter, but it did not work; nothing changed in the text.
gsub("s","ş" , text)
Any help would be appreciated.

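We can use chartr, which translates each character of its first argument into the corresponding character of the second. Example: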
> string <- "ş ç ı ğ ü ö f s x q"
> chartr("ş ç ı ğ ü ö", "s c i g u o", string)
[1] "s c i g u o f s x q"
Another alternative is stri_trans_general from the stringi package:
> library(stringi)
> stri_trans_general(string, "latin-ascii")
[1] "s c i g u o f s x q"

Related

How to decrypt the monoalphabetic substitution cipher message through substitution cipher using linux commands

I have been trying to decrypt a message, which is a SEED Labs task. I have to use Linux commands. They have provided guidelines, but as I am new to this I couldn't find proper help.
What commands do I need to run in order to decrypt this message? The instructions are attached below. The ciphertext.txt file, which I need to decrypt into plain text, is attached as well.
ciphertext.txt
ytn xqavhq yzhu xu qzupvd ltmat qnncq vgxzy hmrty vbynh ytmq ixur qyhvurn
vlvhpq yhme ytn gvrrnh bnniq imsn v uxuvrnuvhmvu yxx
ytn vlvhpq hvan lvq gxxsnupnp gd ytn pncmqn xb tvhfnd lnmuqynmu vy myq xzyqny
vup ytn veevhnuy mceixqmxu xb tmq bmic axcevud vy ytn nup vup my lvq qtvenp gd
ytn ncnhrnuan xb cnyxx ymcnq ze givasrxlu eximymaq vhcavupd vaymfmqc vup
v uvymxuvi axufnhqvymxu vq ghmnb vup cvp vq v bnfnh phnvc vgxzy ltnytnh ytnhn
xzrty yx gn v ehnqmpnuy lmubhnd ytn qnvqxu pmpuy ozqy qnnc nkyhv ixur my lvq
nkyhv ixur gnavzqn ytn xqavhq lnhn cxfnp yx ytn bmhqy lnnsnup mu cvhat yx
vfxmp axubimaymur lmyt ytn aixqmur anhncxud xb ytn lmuynh xidcemaq ytvusq
ednxuratvur

First of all, you need to perform a frequency analysis on your ciphertext. There are many online tools available for that; the most powerful one I found was this:
http://www.brianveitch.com/maze-runner/frequency-analysis/index.html
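If the count has to be done with Linux commands rather than an online tool, a minimal sketch could look like this. The function name letter_freq is made up for illustration, and the sample input is the first line of the ciphertext from the question.

```shell
# Count how often each lowercase letter occurs in the text on stdin,
# most frequent first.
letter_freq() {
  tr -cd 'a-z' |   # keep only the letters a-z
    fold -w1 |     # one letter per line
    sort |
    uniq -c |      # count each distinct letter
    sort -rn       # highest count first
}

# Sample: first line of ciphertext.txt from the question
printf 'ytn xqavhq yzhu xu qzupvd ltmat qnncq vgxzy hmrty vbynh ytmq ixur qyhvurn\n' |
  letter_freq | head -n 5
```

On the real file this would be run as letter_freq < ciphertext.txt. The most frequent ciphertext letters are the first candidates for common English letters such as E, T and A.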
Based on the frequencies, make an assumption about one letter at a time, replace it, and check whether the partially decoded text still makes sense. The more correct guesses you make, the closer you get, and eventually you will be able to crack the whole monoalphabetic cipher.
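One way to test guesses step by step is to translate only the letters guessed so far into uppercase, so that decoded and undecoded letters stay distinguishable. The sketch below assumes the first three guesses are n = E, y = T and t = H; the helper name apply_guesses is hypothetical.

```shell
# Apply only the guesses made so far. Guessed letters come out in
# uppercase; anything still lowercase is undecoded ciphertext.
apply_guesses() {
  # $1: cipher letters guessed so far, $2: their plaintext letters
  tr "$1" "$2"
}

printf 'ytn xqavhq yzhu xu qzupvd\n' | apply_guesses 'nyt' 'ETH'
# prints: THE xqavhq Tzhu xu qzupvd
```

A recognizable word such as THE suggests the guesses are plausible, and the remaining lowercase letters show where to guess next.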
For the ciphertext in your ciphertext.txt file, the mapping is as follows (replace the lowercase letters with the uppercase letters):
n - E
y - T
v - A
t - H
x - O
u - N
h - R
b - F
q - S
i - L
m - I
r - G
p - D
c - M
s - K
z - U
a - C
d - Y
k - X
l - W
e - P
g - B
f - V
j - Q
o - J
A quick way to do this is by using tr:
tr 'nyvtxuhbqimrpcszadklegfjo' 'ETAHONRFSLIGDMKUCYXWPBVQJ' < ciphertext.txt > out.txt
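Before running a long key through tr, it can help to check that both sets have the same length and that no cipher letter is listed twice. The following is a sketch with a hypothetical helper, check_key. Note one assumption: it maps o to J, because the ciphertext word ozqy reads as JUST in its context (pmpuy ozqy qnnc, "DIDNT JUST SEEM").

```shell
cipher='nyvtxuhbqimrpcszadklegfjo'
plain='ETAHONRFSLIGDMKUCYXWPBVQJ'   # assumption: o maps to J (see above)

# Succeeds only if both sets are equally long and no cipher letter repeats.
check_key() {
  [ "${#1}" -eq "${#2}" ] || return 1
  dup=$(printf '%s' "$1" | fold -w1 | sort | uniq -d)
  [ -z "$dup" ]
}

# Decode a short sample from the ciphertext once the key passes the check
check_key "$cipher" "$plain" && printf 'ozqy qnnc\n' | tr "$cipher" "$plain"
# prints: JUST SEEM
```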

Comparing strings to numeric

When I write the following code
"hello" > 9
It evaluates to TRUE. Why is that the case? What coercion happens in the background for R to evaluate this as TRUE?
I was thinking 9 would be coerced to '9' but didn't know how R establishes the order of strings.
"Hello" > 9
[1] TRUE

You're right that > coerces the number to a string before comparing.
?">" says:
Comparison of strings in character vectors is lexicographic within
the strings using the collating sequence of the locale in use: see
‘locales’. The collating sequence of locales such as ‘en_US’ is
normally different from ‘C’ (which should use ASCII) and can be
surprising.
Lexicographic order means letter-by-letter comparison as in a dictionary; one often-surprising result of this is that "10"<"2".
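The same character-by-character ordering can be observed outside R. This is a shell sketch, not R code, and it forces the C locale so that plain ASCII order applies.

```shell
# Sort strings lexicographically in the C (ASCII) locale:
# "10" comes before "2" because '1' sorts before '2',
# and "hello" comes after "9" because digits sort before letters.
printf '%s\n' 10 2 9 hello | LC_ALL=C sort
```

This prints 10, 2, 9 and hello, one per line. In other locales the relative order of digits, letters and punctuation may differ, which is the point the help page makes.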
Interpreting this, it means that whether "9" is greater or less than "H" in your example will depend on where "9" and "H" fall in the collating sequence (the internal order of symbols/letters/numbers etc.)
The end of example(">") generates a table of the collating sequence: on my machine, you can see that the numbers come before all of the letters ...
writeLines(strwrap(paste(sort(x), collapse=" "), width = 60))
   _ - , ; : ! ¡ ? ¿ . · ' " « » ( ) [ ] { } § ¶ # * / \ &
# % ` ´ ^ ¯ ¨ ¸ ° © ® + ± ÷ × < = > ¬ | ¦ ~ ¤ ¢ $ £ ¥ 0 1 ¹
½ ¼ 2 ² 3 ³ ¾ 4 5 6 7 8 9 a A ª á Á à À â Â å Å ä Ä ã Ã æ Æ
b B c C ç Ç d D ð Ð e E é É è È ê Ê ë Ë f F g G h H i I í Í
ì Ì î Î ï Ï j J k K l L m M n N ñ Ñ o O º ó Ó ò Ò ô Ô ö Ö õ
Õ ø Ø p P q Q r R s S ß t T u U ú Ú ù Ù û Û ü Ü v V w W x X
y Y ý Ý ÿ z Z þ Þ µ

User specified function with operators in R

I want to use a user-specified function and apply the function to a list of values. I envision that the user will give a 'formula' as a character string containing the names of variables and operators, e.g. "a * b %% c - d / e ^ f + g %/% h".
The following toy example works
prmlist <- list(a=1:10, b=21:30, c=31:40, d=4, e=5, f=6, g=7, h=8)
with(prmlist, a * b %% c - d / e ^ f + g %/% h)
The problem starts when I want to use this approach within a function. To do that I must get the 'formula' specified by the user inside the function. A character string seems the obvious route. The question is how to evaluate it inside the function. do.call() doesn't seem to be suited because the operators are each really a function. I hoped something simple like
my.formula <- "a * b %% c - d / e ^ f + g %/% h"
with(prmlist, eval(my.formula))
would work but it doesn't.

You can store the expression unevaluated with substitute() instead and evaluate it later:
my.formula <- substitute(a * b %% c - d / e ^ f + g %/% h)
with(prmlist, eval(my.formula))
[1] 20.99974 43.99974 68.99974 95.99974 124.99974 155.99974 188.99974
[8] 223.99974 260.99974 299.99974
Update: If the command is a string, eval() on its own just returns the string, so turn it into an expression with parse first:
myCmd <- "a * b %% c - d / e ^ f + g %/% h"
with(prmlist, eval( parse(text=myCmd) ))
[1] 20.99974 43.99974 68.99974 95.99974 124.99974 155.99974 188.99974
[8] 223.99974 260.99974 299.99974

How to remove/add spaces in all textfiles?

I have several files that look like these, e.g. test.in:
apple foo bar
hello world
I need to achieve this desired output, a space after every character:
a p p l e f o o b a r
h e l l o w o r l d
I thought I could first remove all spaces and then add a space after each character, like this:
sed 's/\s//g' test.in | sed -e 's/\(.\)/\1 /g'
but are there other ways?

This awk may do:
awk -v FS="" '{gsub(/ /,"");$1=$1}1' file
a p p l e f o o b a r
h e l l o w o r l d
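This first removes all spaces. Then, since FS (the field separator) is set to the empty string, every character becomes its own field (in awks that support an empty FS, such as gawk), and $1=$1 rebuilds the record with a single space between fields.
Unlike most of the other sed and perl commands here, this does not add a space at the end of the line.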
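The rebuilding step can be seen in isolation with the default field splitting. This is a small sketch on made-up input: assigning to any field makes awk rejoin all fields with OFS, which is a single space by default.

```shell
# Assigning $1 to itself forces awk to rebuild the record from its fields,
# joined by OFS (a single space unless changed).
printf 'a   b  c\n' | awk '{ $1 = $1 } 1'
# prints: a b c

# With a different OFS the fields are joined by that string instead.
printf 'a   b  c\n' | awk -v OFS='-' '{ $1 = $1 } 1'
# prints: a-b-c
```

Because OFS is only placed between fields, nothing is appended after the last one, which is why this approach leaves no trailing space.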
Or, based on the sed command posted here:
awk '{gsub(/ /,"");gsub(/./,"& ")}1' file
a p p l e f o o b a r
h e l l o w o r l d

You can combine your two sed commands into a single command instead:
$ sed 's/\s//g;s/./& /g' test.in
a p p l e f o o b a r
h e l l o w o r l d
Note the use of . and & instead of \(.\) and \1.
On systems that do not support \s to designate matching whitespace, you can use [[:blank:]] instead:
$ sed 's/[[:blank:]]//g;s/./& /g' test.in
a p p l e f o o b a r
h e l l o w o r l d

With Perl:
$ perl -ple 's/([^ ]|^)(?! )/$1 /g' file
a p p l e f o o b a r
h e l l o w o r l d
Add the in-place edit option -i to save the changes to the file:
perl -i -ple 's/([^ ]|^)(?! )/$1 /g' file

sed 's/ //g;s/./& /g' filename
& refers to the portion of the pattern space that matched.
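A tiny illustration of & on made-up input, wrapping every matched character in brackets:

```shell
# & in the replacement stands for whatever the pattern matched
printf 'hello\n' | sed 's/l/[&]/g'
# prints: he[l][l]o
```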

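Or maybe something like this with sed. The first expression adds a space after every character, including the existing spaces, so the second one has to remove pairs of spaces rather than single spaces:
$ sed 's/./& /g;s/ \{2\}//g' file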
a p p l e f o o b a r
h e l l o w o r l d

This might work for you (GNU sed):
sed 's/\B/ /g' file

How to justify this symbol in MathType

I have a formula in MathType, shown below, but I could not adjust the position of the $+\infty$ symbol. I want it to appear just after the "$\{$" and align with the left edge of the second term.
Thank you for your help.
The LaTeX code:
${{R}_{1}}\left( {{x}_{pi}},{{G}_{q}},{{x}_{qj}} \right)=\,\left\{ \begin{matrix}
+\infty & p=q \\
\underset{l=1}{\overset{d}{\mathop \sum }}\,({{x}_{pi}}\left[ l \right]-{{x}_{qj}}\left[ l \right])\left( 2\left( {{x}_{qj}}\left[ l \right]-{{{\bar{x}}}_{q}}\left[ l \right] \right)+({{x}_{pi}}\left[ l \right]-{{x}_{qj}}\left[ l \right])(\left| {{G}_{q}} \right|-1)/|{{G}_{q}}| \right) & p\ne q \\
\end{matrix} \right.$

I had to use the array environment instead of matrix, with a left-aligned first column.
LaTeX code:
\[
{{R}_{1}}\left( {{x}_{pi}},{{G}_{q}},{{x}_{qj}} \right)=\,\left\{ \begin{array}{@{}lc}
+\infty & p=q \\
\underset{l=1}{\overset{d}{\mathop \sum }}\,({{x}_{pi}}\left[ l \right]-{{x}_{qj}}\left[ l \right])\left( 2\left( {{x}_{qj}}\left[ l \right]-{{{\bar{x}}}_{q}}\left[ l \right] \right)+({{x}_{pi}}\left[ l \right]-{{x}_{qj}}\left[ l \right])(\left| {{G}_{q}} \right|-1)/|{{G}_{q}}| \right) & p\ne q \\
\end{array} \right.
\]
