Comparing strings to numeric - r

When I run the following code:
"hello" > 9
it evaluates to TRUE. Why is that the case? What coercion happens in the background for R to evaluate this as TRUE?
I was thinking 9 would be coerced to '9' but didn't know how R establishes the order of strings.
"Hello" > 9
[1] TRUE

You're right that > coerces the number to a string before comparing.
?">" says:
Comparison of strings in character vectors is lexicographic within
the strings using the collating sequence of the locale in use: see
‘locales’. The collating sequence of locales such as ‘en_US’ is
normally different from ‘C’ (which should use ASCII) and can be
surprising.
Lexicographic order means letter-by-letter comparison as in a dictionary; one often-surprising result of this is that "10"<"2".
In other words, whether "9" sorts before or after "H" (or "h") in your example depends on where those characters fall in the collating sequence (the locale's internal ordering of symbols, letters, and digits).
The end of example(">") generates a table of the collating sequence (x is the character vector that the example defines). On my machine, you can see that the digits come before all of the letters:
writeLines(strwrap(paste(sort(x), collapse=" "), width = 60))
­   _ - , ; : ! ¡ ? ¿ . · ' " « » ( ) [ ] { } § ¶ # * / \ &
# % ` ´ ^ ¯ ¨ ¸ ° © ® + ± ÷ × < = > ¬ | ¦ ~ ¤ ¢ $ £ ¥ 0 1 ¹
½ ¼ 2 ² 3 ³ ¾ 4 5 6 7 8 9 a A ª á Á à À â Â å Å ä Ä ã Ã æ Æ
b B c C ç Ç d D ð Ð e E é É è È ê Ê ë Ë f F g G h H i I í Í
ì Ì î Î ï Ï j J k K l L m M n N ñ Ñ o O º ó Ó ò Ò ô Ô ö Ö õ
Õ ø Ø p P q Q r R s S ß t T u U ú Ú ù Ù û Û ü Ü v V w W x X
y Y ý Ý ÿ z Z þ Þ µ
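To see the coercion and the lexicographic ordering directly, here is a small sketch in base R. The results in the comments assume a C or typical en_US locale; other locales may order characters differently. The last call uses sort(..., method = "radix") because, per ?sort, that method ignores the locale and always uses C-locale ordering for character vectors. The variable names are illustrative.

```r
# The number is converted to a string before the comparison.
coerced <- as.character(9)
coerced                          # "9"
"hello" > "9"                    # the comparison that "hello" > 9 performs;
                                 # TRUE in C and en_US locales

# Lexicographic order compares character by character, so "1" vs "2" decides.
lex <- "10" < "2"
lex                              # TRUE
num <- 10 < 2
num                              # FALSE: numeric comparison

# Convert explicitly when a numeric comparison is intended.
as.numeric("10") < 2             # FALSE

# Locale-independent ordering: digits, then upper case, then lower case.
c_order <- sort(c("b", "A", "9", "a", "B"), method = "radix")
c_order                          # "9" "A" "B" "a" "b"
```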


C++ reverse XOR operator?

I have a hexadecimal number which I XOR with another hexadecimal number.
I only know one of those hexadecimal numbers, but I know the result of the XOR operation.
Example
0x35 ^ x = 0x39
Is there a way to get x?
Yes. XOR is its own inverse, so you can get x with
x = 0x35 ^ 0x39
which gives x = 0x0C. In general, for XOR:
a = b ^ c <=> b = a ^ c <=> c = a ^ b
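As a quick check of that identity, here is a minimal sketch in R (used only for consistency with the other examples here; the ^ operator in C++ behaves the same way on integers). bitwXor is base R's bitwise XOR, and the variable names are illustrative.

```r
a      <- 0x35L                 # the known operand
result <- 0x39L                 # the known result of a ^ x

# XOR-ing the known operand with the result yields the unknown operand.
x <- bitwXor(a, result)
sprintf("0x%02X", x)            # "0x0C"

# Round trip: a ^ x reproduces the original result.
bitwXor(a, x) == result         # TRUE
```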

How to decrypt the monoalphabetic substitution cipher message through substitution cipher using linux commands

I have been trying to decrypt a message for a SEED Labs task. I have to use Linux commands. Guidelines were provided, but as I am new to this I could not find proper help.
What commands do I need to run in order to decrypt this message? The instructions are attached below. The ciphertext.txt file, which I need to decrypt into plain text, is attached as well.
ciphertext.txt
ytn xqavhq yzhu xu qzupvd ltmat qnncq vgxzy hmrty vbynh ytmq ixur qyhvurn
vlvhpq yhme ytn gvrrnh bnniq imsn v uxuvrnuvhmvu yxx
ytn vlvhpq hvan lvq gxxsnupnp gd ytn pncmqn xb tvhfnd lnmuqynmu vy myq xzyqny
vup ytn veevhnuy mceixqmxu xb tmq bmic axcevud vy ytn nup vup my lvq qtvenp gd
ytn ncnhrnuan xb cnyxx ymcnq ze givasrxlu eximymaq vhcavupd vaymfmqc vup
v uvymxuvi axufnhqvymxu vq ghmnb vup cvp vq v bnfnh phnvc vgxzy ltnytnh ytnhn
xzrty yx gn v ehnqmpnuy lmubhnd ytn qnvqxu pmpuy ozqy qnnc nkyhv ixur my lvq
nkyhv ixur gnavzqn ytn xqavhq lnhn cxfnp yx ytn bmhqy lnnsnup mu cvhat yx
vfxmp axubimaymur lmyt ytn aixqmur anhncxud xb ytn lmuynh xidcemaq ytvusq
ednxuratvur
First of all, you need to perform a frequency analysis on your ciphertext. There are many online tools available for that; the most powerful one I found was this:
http://www.brianveitch.com/maze-runner/frequency-analysis/index.html
Based on the letter frequencies in your ciphertext, you make assumptions, replace the letters one by one, and then check whether the result makes sense. The more correct guesses you make, the closer you get, and eventually you will be able to crack the whole monoalphabetic cipher.
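A frequency count can also be done locally instead of with an online tool. The following is a minimal sketch in R (any language would do); it counts the letters in the first line of the ciphertext from the question, and the variable names are illustrative. In practice the whole file would be read first, for example with readLines("ciphertext.txt").

```r
# First line of the ciphertext from the question.
cipher <- "ytn xqavhq yzhu xu qzupvd ltmat qnncq vgxzy hmrty vbynh ytmq ixur qyhvurn"

# Keep only lowercase letters, split into single characters, and count them.
letters_only <- gsub("[^a-z]", "", cipher)
freq <- sort(table(strsplit(letters_only, "")[[1]]), decreasing = TRUE)
freq

# The most frequent ciphertext letters are candidates for common plaintext
# letters such as E, T, A and O.
head(names(freq), 5)
```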
For the ciphertext in your ciphertext.txt file, the following mapping holds (replace each lowercase ciphertext letter with the uppercase plaintext letter):
n - E
y - T
v - A
t - H
x - O
u - N
h - R
b - F
q - S
i - L
m - I
r - G
p - D
c - M
s - K
z - U
a - C
d - Y
k - X
l - W
e - P
g - B
f - V
j - Q
o - J
(ozqy decrypts to JUST, which fixes o as J.) A quick way to apply the whole mapping is with tr:
tr 'nyvtxuhbqimrpcszadklegfjo' 'ETAHONRFSLIGDMKUCYXWPBVQJ' < ciphertext.txt > out.txt
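The same substitution can be spot-checked in R. This is a minimal sketch using base R's chartr, which translates characters one for one in the same way as tr. The key below maps o to J, since ozqy then reads JUST; the sample strings are taken from the ciphertext above, and the variable names are illustrative.

```r
old <- "nyvtxuhbqimrpcszadklegfjo"   # ciphertext letters
new <- "ETAHONRFSLIGDMKUCYXWPBVQJ"   # corresponding plaintext letters

# chartr needs `new` to be at least as long as `old`; here they match exactly.
stopifnot(nchar(old) == nchar(new))

plain1 <- chartr(old, new, "ytn xqavhq yzhu xu qzupvd")
plain1   # "THE OSCARS TURN ON SUNDAY"

plain2 <- chartr(old, new, "pmpuy ozqy qnnc nkyhv ixur")
plain2   # "DIDNT JUST SEEM EXTRA LONG"
```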

changing specific letters in a Turkish text with R

I am analyzing a Turkish text and need to replace some letters in it. Turkish has the letters ş ç ı ğ ü ö, and I want to replace them with s c i g u o. How can I do this?
I tried the following for one letter, but it did not work; nothing changed in the text.
gsub("s","ş" , text)
Any help would be appreciated.
We can use chartr. Example:
> string <- "ş ç ı ğ ü ö f s x q"
> chartr("ş ç ı ğ ü ö", "s c i g u o", string)
[1] "s c i g u o f s x q"
Another alternative is stri_trans_general from the stringi package:
> library(stringi)
> stri_trans_general(string, "latin-ascii")
[1] "s c i g u o f s x q"
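For completeness, the gsub approach from the question can also be made to work. gsub takes the pattern first and the replacement second, and it returns a new string rather than modifying its input, so the result has to be assigned back. A minimal sketch, assuming only lowercase letters need replacing: the \u escapes spell ş ç ı ğ ü ö so that the code does not depend on the file encoding, and the from/to vectors and the sample text are illustrative.

```r
from <- c("\u015f", "\u00e7", "\u0131", "\u011f", "\u00fc", "\u00f6")
to   <- c("s", "c", "i", "g", "u", "o")

# Sample text: six Turkish words containing the letters to replace.
text <- "\u015feker \u00e7ay \u0131\u015f\u0131k da\u011f g\u00fcl g\u00f6z"

# Replace one letter at a time; fixed = TRUE treats the pattern literally.
for (k in seq_along(from)) {
  text <- gsub(from[k], to[k], text, fixed = TRUE)
}
text   # "seker cay isik dag gul goz"
```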

Double precision (64-bit) representation of numeric value in R (sign, exponent, significand)

The R FAQ states that:
The only numbers that can be represented exactly in R’s numeric type are integers and fractions whose denominator is a power of 2. All other numbers are internally rounded to (typically) 53 binary digits accuracy.
R uses IEEE 754 double-precision floating-point numbers, which have
1 bit for sign
11 bits for exponent
52 bits for mantissa (or significand)
which sums to 64 bits.
For the number 0.1, R prints
sprintf("%.60f", 0.1)
[1] "0.100000000000000005551115123125782702118158340454101562500000"
Double (IEEE754 Double precision 64-bit) gives us this binary representation for 0.1 :
00111111 10111001 10011001 10011001 10011001 10011001 10011001 10011010
How can we get this representation in R, and how does it relate to the output given by sprintf in our example?
The answer to the question raised by @chux in the comments is "yes"; R supports the %a format:
sprintf("%a", 0.1)
#> [1] "0x1.999999999999ap-4"
If you want to access the underlying bit pattern, you will have to reinterpret the double as a 64-bit integer. For this task one can use C++ via Rcpp:
Rcpp::cppFunction('void print_hex(double x) {
  uint64_t y;
  static_assert(sizeof x == sizeof y, "Size does not match!");
  // copy the bytes of the double into an integer of the same size
  std::memcpy(&y, &x, sizeof y);
  Rcpp::Rcout << std::hex << y << std::endl;
}', plugins = "cpp11", includes = c("#include <cstdint>", "#include <cstring>"))
print_hex(0.1)
#> 3fb999999999999a
This hexadecimal representation is identical to your binary representation. How does one get to the decimal representation?
The first bit is zero, hence the sign is positive
The exponent is 0x3fb, i.e. 1019 in decimal. Given the exponent bias this corresponds to an actual exponent of -4.
The mantissa is 0x1999999999999a × 2^-52 including the implicit 1, i.e. 2^−52 × 7,205,759,403,792,794.
In total this gives 2^−56 × 7,205,759,403,792,794:
sprintf("%.60f", 2^-56 * 7205759403792794)
#> [1] "0.100000000000000005551115123125782702118158340454101562500000"
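If compiling C++ is not an option, the bit pattern can also be read in base R. The following is a minimal sketch: writeBin serialises the double to 8 raw bytes (big-endian here, so the sign bit comes first) and rawToBits expands them to bits. double_bits is a hypothetical helper name, not part of any package, and it assumes a single finite or infinite double as input.

```r
double_bits <- function(x) {
  # 8 bytes of the double, most significant byte first.
  bytes <- writeBin(x, raw(), endian = "big")
  # rawToBits returns the bits of each byte least significant first;
  # put one byte per column and flip the rows to get MSB-first order.
  m <- matrix(as.integer(rawToBits(bytes)), nrow = 8)
  paste(m[8:1, ], collapse = "")
}

b <- double_bits(0.1)
b

sign_bit <- substr(b, 1, 1)     # 1 bit
exponent <- substr(b, 2, 12)    # 11 bits, biased by 1023
fraction <- substr(b, 13, 64)   # 52 stored significand bits

strtoi(exponent, base = 2) - 1023   # -4, matching the p-4 in the "%a" output
```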
Take 0.3 as an example. In the R console, run:
> sprintf("%a", 0.3)
[1] "0x1.3333333333333p-2"
Mantissa or Significand
Converting the hex digits 3333333333333 to binary gives the mantissa (or significand) part:
0011001100110011001100110011001100110011001100110011
Exponent
The exponent field (11 bits) stores the actual exponent plus the bias 2^(11-1) - 1 = 1023. Since the exponent reported by sprintf is p-2, we have
-2 + 1023 = 1021
and its binary representation fixed in 11 bits is
01111111101
Sign
As for the sign bit, it is 0 for positive numbers and 1 otherwise.
Double Precision Representation
So the complete representation is
0 | 01111111101 | 0011001100110011001100110011001100110011001100110011
Another example:
> sprintf("%a", -2.94)
[1] "-0x1.7851eb851eb85p+1"
# Mantissa or Significand
(7851eb851eb85) # base 16
(0111100001010001111010111000010100011110101110000101) # base 2
# Exponent
1 + 1023 = 1024 # base 10
10000000000 # base 2
# So the complete representation is
1 | 10000000000 | 0111100001010001111010111000010100011110101110000101
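To check such a decomposition, the three fields can be turned back into a number with value = (-1)^sign * (1 + fraction) * 2^(exponent - 1023), which holds for normalized doubles. A minimal sketch using the 0.3 fields from above; fields_to_double is a hypothetical helper name, not part of any package.

```r
fields_to_double <- function(sign, exponent, fraction) {
  # Fraction bits have weights 2^-1, 2^-2, ..., 2^-52.
  frac_bits <- as.integer(strsplit(fraction, "")[[1]])
  frac <- sum(frac_bits * 2^(-seq_along(frac_bits)))
  e <- strtoi(exponent, base = 2) - 1023   # remove the exponent bias
  (-1)^sign * (1 + frac) * 2^e             # add back the implicit leading 1
}

# 0 | 01111111101 | 0011 repeated 13 times (hex 3333333333333)
v <- fields_to_double(0, "01111111101", strrep("0011", 13))
v == 0.3   # TRUE: the fields reproduce the stored double exactly
```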
From decimal to normalized double precision:
library(BMS)
from10toNdp <- function(my10baseNumber) {
  out <- list()
  # Handle special cases (0, Inf, -Inf)
  if (my10baseNumber %in% c(0,Inf,-Inf)) {
    if (my10baseNumber==0) { out <- "0000000000000000000000000000000000000000000000000000000000000000" }
    if (my10baseNumber==Inf) { out <- "0111111111110000000000000000000000000000000000000000000000000000" }
    if (my10baseNumber==-Inf) { out <- "1111111111110000000000000000000000000000000000000000000000000000" }
  } else {
    signBit <- 0 # assign initial value
    from10to2 <- function(deciNumber) {
      binaryVector <- rep(0, 1 + floor(log(deciNumber, 2)))
      while (deciNumber >= 2) {
        theExpo <- floor(log(deciNumber, 2))
        binaryVector[1 + theExpo] <- 1
        deciNumber <- deciNumber - 2^theExpo }
      binaryVector[1] <- deciNumber %% 2
      paste(rev(binaryVector), collapse = "")}
    #Sign bit
    if (my10baseNumber<0) { signBit <- 1
    } else { signBit <- 0 }
    # Biased Exponent
    BiasedExponent <- strsplit(from10to2(as.numeric(substr(sprintf("%a", my10baseNumber), which(strsplit( sprintf("%a", my10baseNumber), "")[[1]]=="p")+1, length( strsplit( sprintf("%a", my10baseNumber), "")[[1]]))) + 1023), "")[[1]]
    BiasedExponent <- paste(BiasedExponent, collapse='')
    if (nchar(BiasedExponent)<11) {BiasedExponent <- paste(c( rep(0,11-nchar(BiasedExponent)), BiasedExponent),collapse='') }
    # Significand
    significand <- BMS::hex2bin(substr( sprintf("%a", my10baseNumber) , which(strsplit( sprintf("%a", my10baseNumber), "")[[1]]=="x")+3, which(strsplit( sprintf("%a", my10baseNumber), "")[[1]]=="p")-1))
    significand <- paste(significand, collapse='')
    if (nchar(significand)<52) {significand <- paste(c( significand,rep(0,52-nchar(significand))),collapse='') }
    out <- paste(c(signBit, BiasedExponent, significand), collapse='')
  }
  out
}
Hence,
from10toNdp(0.1)
# "0011111110111001100110011001100110011001100110011001100110011010"

How to justify this symbol in MathType

I have a formula in MathType (shown below), but I could not adjust the position of the $+\infty$ symbol. I want it to appear just after the "$\{$" and to align with the left edge of the second term.
Thank you for your help.
The LaTeX code:
${{R}_{1}}\left( {{x}_{pi}},{{G}_{q}},{{x}_{qj}} \right)=\,\left\{ \begin{matrix}
+\infty & p=q \\
\underset{l=1}{\overset{d}{\mathop \sum }}\,({{x}_{pi}}\left[ l \right]-{{x}_{qj}}\left[ l \right])\left( 2\left( {{x}_{qj}}\left[ l \right]-{{{\bar{x}}}_{q}}\left[ l \right] \right)+({{x}_{pi}}\left[ l \right]-{{x}_{qj}}\left[ l \right])(\left| {{G}_{q}} \right|-1)/|{{G}_{q}}| \right) & p\ne q \\
\end{matrix} \right.$
I had to use the array environment instead of matrix, with a left-aligned first column.
LaTeX code:
\[
{{R}_{1}}\left( {{x}_{pi}},{{G}_{q}},{{x}_{qj}} \right)=\,\left\{ \begin{array}{@{}lc}
+\infty & p=q \\
\underset{l=1}{\overset{d}{\mathop \sum }}\,({{x}_{pi}}\left[ l \right]-{{x}_{qj}}\left[ l \right])\left( 2\left( {{x}_{qj}}\left[ l \right]-{{{\bar{x}}}_{q}}\left[ l \right] \right)+({{x}_{pi}}\left[ l \right]-{{x}_{qj}}\left[ l \right])(\left| {{G}_{q}} \right|-1)/|{{G}_{q}}| \right) & p\ne q \\
\end{array} \right.
\]
