Character to Integer Ada - ada

I'm trying to turn characters into integers in Ada, but nothing seems to work. So far I have only been able to get back the decimal ASCII code, but I would like it to return 0 (Integer).
Character'Pos('0');
returns 48 --I want it to return 0?

You can't convert a character directly to the integer value it represents, but you can do it with strings:
Some_Integer_Variable := Some_Integer_Type'Value ("0");
Or if you have a character variable:
Some_Integer_Variable := Some_Integer_Type'Value ((1 => Character_Variable));

Related

How to extract 4 bit unsigned integer from 32 bit R integer?

I am reading binary files in R and need to read 10 bytes, which must be interpreted as 4 bit unsigned integers (2 per byte, so 20 values in range 0..15 I guess).
From my understanding of the docs, this cannot be done with readBin directly, because the minimal length to read, 1, means 1 byte.
So I think I need to read the data as 1 byte integers and use bit-wise operations to get the 4 bit integers. I found out that the values are stored as 32 bit integers internally by R, and I found this explanation on SO that seems to describe what I want to do. So here is my attempt at an R function that follows the advice:
#' @title Interpret bits start_index to stop_index of input int8 as an unsigned integer.
uint8bits <- function(int8, start_index, stop_index) {
  num_bits = stop_index - start_index + 1L;
  bitmask = bitwShiftL((bitwShiftL(1L, num_bits) - 1L), stop_index);
  return(bitwShiftR(bitwAnd(int8, bitmask), start_index));
}
However, it does not work as intended. E.g., to get the two numbers out of the read value (255 in this example), I would call the function once to extract bits 1 to 4, and once more for bits 5 to 8:
value1 = uint8bits(255L, 1, 4); # I would expect 15, but the output is 120.
value2 = uint8bits(255L, 5, 8); # I would expect 15, but the output is 0.
What am I doing wrong?
We can use the packBits function to achieve your expected behaviour:
uint8.to.uint4 <- function(int8, start_index, stop_index)
{
  bits <- intToBits(int8)
  out <- packBits(c(bits[start_index:stop_index],
                    rep(as.raw(0), 32 - (stop_index - start_index + 1))), type = "integer")
  return(out)
}
uint8.to.uint4(255L,1,4)
[1] 15
We first convert the integer to a bit vector, then extract the bits you want and pad with zeros up to the 32-bit internal storage length R uses for integers. packBits then converts the result back to an integer.
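For the second nibble you call the function with bits 5 to 8 in the same way. A quick sanity check (a sketch, using the 255 from your example plus an arbitrary test value of 79, i.e. high nibble 4 and low nibble 15, which is not from your data):
uint8.to.uint4(255L, 1, 4)  # low nibble of 255
# [1] 15
uint8.to.uint4(255L, 5, 8)  # high nibble of 255
# [1] 15
uint8.to.uint4(79L, 1, 4)   # low nibble of 79 (0x4F)
# [1] 15
uint8.to.uint4(79L, 5, 8)   # high nibble of 79 (0x4F)
# [1] 4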

ord() Function or ASCII Character Code of String with Z3 Solver

How can I convert a z3.String to a sequence of ASCII values?
For example, here is some code that I thought would check whether the ASCII values of all the characters in the string add up to 100:
import z3

def add_ascii_values(password):
    return sum(ord(character) for character in password)

password = z3.String("password")
solver = z3.Solver()

ascii_sum = add_ascii_values(password)
solver.add(ascii_sum == 100)

print(solver.check())
print(solver.model())
Unfortunately, I get this error:
TypeError: ord() expected string of length 1, but SeqRef found
It's apparent that ord doesn't work with z3.String. Is there something in Z3 that does?
The accepted answer dates back to 2018, and things have changed in the meantime, so the proposed solution no longer works with current z3. In particular:
Strings are now formalized by SMTLib. (See https://smtlib.cs.uiowa.edu/theories-UnicodeStrings.shtml)
Unlike the previous version (where strings were simply sequences of bit vectors), strings are now sequences of Unicode characters. So, the encoding used in the previous answer no longer applies.
Based on this, here is how the problem can be coded, assuming a password of length 3:
from z3 import *

s = Solver()

# Ord of character at position i
def OrdAt(inp, i):
    return StrToCode(SubString(inp, i, 1))

# Adding ascii values for a string of a given length
def add_ascii_values(password, len):
    return Sum([OrdAt(password, i) for i in range(len)])

# We'll have to force a constant length
length = 3
password = String("password")
s.add(Length(password) == length)

ascii_sum = add_ascii_values(password, length)
s.add(ascii_sum == 100)

# Also require characters to be printable so we can view them:
for i in range(length):
    v = OrdAt(password, i)
    s.add(v >= 0x20)
    s.add(v <= 0x7E)

print(s.check())
print(s.model()[password])
Note: Due to https://github.com/Z3Prover/z3/issues/5773, to be able to run the above you need a version of z3 downloaded on Jan 12, 2022 or later! As of this date, none of the released versions of z3 contain the functions used in this answer.
When run, the above prints:
sat
" #!"
You can check that it satisfies the given constraint, i.e., the ord of characters add up to 100:
>>> sum(ord(c) for c in " #!")
100
Note that we no longer have to worry about modular arithmetic, since OrdAt returns an actual integer, not a bit-vector.
2022 Update
The answer below, written back in 2018, no longer applies, as strings in SMTLib received a major update and the code given is outdated. Keeping it here for archival purposes, and in case you happen to have a really old z3 that you cannot upgrade for some reason. See the other answer for a variant that works with the new Unicode strings in SMTLib: https://stackoverflow.com/a/70689580/936310
Old Answer from 2018
You're conflating Python strings and Z3 Strings, and unfortunately the two are quite different types.
In Z3py, a String is simply a sequence of 8-bit values. And what you can do with a Z3 String is actually quite limited; for instance, you cannot iterate over the characters like you did in your add_ascii_values function. See this page for the allowed functions: https://rise4fun.com/z3/tutorialcontent/sequences (This page lists the functions in SMTLib parlance, but the equivalent ones are available from the z3py interface.)
There are a few important restrictions/things that you need to keep in mind when working with Z3 sequences and strings:
You have to be very explicit about the lengths; In particular, you cannot sum over strings of arbitrary symbolic length. There are a few things you can do without specifying the length explicitly, but these are limited. (Like regex matches, substring extraction etc.)
You cannot extract a character out of a string. This is an oversight in my opinion, but SMTLib just has no way of doing so for the time being. Instead, you get a list of length 1. This causes a lot of headaches in programming, but there are workarounds. See below.
Anytime you loop over a string/sequence, you have to go up to a fixed bound. There are ways to program so you can cover "all strings upto length N" for some constant "N", but they do get hairy.
Keeping all this in mind, I'd go about coding your example like the following; restricting password to be precisely 10 characters long:
from z3 import *

s = Solver()

# Work around the fact that z3 has no way of giving us an element at an index. Sigh.
ordHelperCounter = 0
def OrdAt(inp, i):
    global ordHelperCounter
    v = BitVec("OrdAtHelper_%d_%d" % (i, ordHelperCounter), 8)
    ordHelperCounter += 1
    s.add(Unit(v) == SubString(inp, i, 1))
    return v

# Your original function, but note the addition of len parameter and use of Sum
def add_ascii_values(password, len):
    return Sum([OrdAt(password, i) for i in range(len)])

# We'll have to force a constant length
length = 10
password = String("password")
s.add(Length(password) == 10)

ascii_sum = add_ascii_values(password, length)
s.add(ascii_sum == 100)

# Also require characters to be printable so we can view them:
for i in range(length):
    v = OrdAt(password, i)
    s.add(v >= 0x20)
    s.add(v <= 0x7E)

print(s.check())
print(s.model()[password])
The OrdAt function works around the problem of not being able to extract characters. Also note how we use Sum instead of sum, and how all "loops" are of fixed iteration count. I also added constraints to make all the ascii codes printable for convenience.
When you run this, you get:
sat
":X|#`y}###"
Let's check it's indeed good:
>>> len(":X|#`y}###")
10
>>> sum(ord(character) for character in ":X|#`y}###")
868
So, we did get a length 10 string; but how come the ord's don't sum up to 100? Now, you have to remember sequences are composed of 8-bit values, and thus the arithmetic is done modulo 256. So, the sum actually is:
>>> sum(ord(character) for character in ":X|#`y}###") % 256
100
To avoid the overflows, you can either use larger bit-vectors, or more simply use Z3's unbounded Integer type Int. To do so, use the BV2Int function, by simply changing add_ascii_values to:
def add_ascii_values(password, len):
    return Sum([BV2Int(OrdAt(password, i)) for i in range(len)])
Now we'd get:
unsat
That's because each of our characters has at least value 0x20 and we wanted 10 characters; so there's no way to make them all sum up to 100. And z3 is precisely telling us that. If you increase your sum goal to something more reasonable, you'd start getting proper values.
Programming with z3py is different than regular programming with Python, and z3 String objects are quite different than those of Python itself. Note that the sequence/string logic isn't even standardized yet by the SMTLib folks, so things can change. (In particular, I'm hoping they'll add functionality for extracting elements at an index!).
Having said all this, going over the https://rise4fun.com/z3/tutorialcontent/sequences would be a good start to get familiar with them, and feel free to ask further questions.

Why would R use the "L" suffix to denote an integer?

In R we all know it is convenient for those times we want to ensure we are dealing with an integer to specify it using the "L" suffix like this:
1L
# [1] 1
If we don't explicitly tell R we want an integer it will assume we meant to use a numeric data type...
str( 1 * 1 )
# num 1
str( 1L * 1L )
# int 1
Why is "L" the preferred suffix, why not "I" for instance? Is there a historical reason?
In addition, why does R allow me to do (with warnings):
str(1.0L)
# int 1
# Warning message:
# integer literal 1.0L contains unnecessary decimal point
But not..
str(1.1L)
# num 1.1
#Warning message:
#integer literal 1.1L contains decimal; using numeric value
I'd expect both to either return an error.
Why is "L" used as a suffix?
I've never seen it written down, but I theorise it is in short for two reasons:
Because R handles complex numbers, which may be specified using the suffix "i", and this would be too similar to "I".
Because R's integers are 32-bit long integers, and "L" therefore appears to be sensible shorthand for referring to this data type.
The values a long integer can take depend on the word size. R does not natively support integers with a word length of 64 bits. Integers in R have a word length of 32 bits and are signed, and therefore have a range of −2,147,483,648 to 2,147,483,647. Larger values are stored as doubles.
This wiki page has more information on common data types, their conventional names and ranges.
And also from ?integer
Note that current implementations of R use 32-bit integers for integer vectors, so the range of representable integers is restricted to about +/-2*10^9: doubles can hold much larger integers exactly.
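To see those limits in an interactive R session (a minimal sketch; the exact overflow warning text may vary between R versions):
.Machine$integer.max       # largest representable integer
# [1] 2147483647
.Machine$integer.max + 1L  # integer arithmetic overflows to NA (with a warning)
# [1] NA
.Machine$integer.max + 1   # adding a double promotes the result to double
# [1] 2147483648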
Why do 1.0L and 1.1L return different types?
The reason 1.0L and 1.1L return different data types is that returning an integer for 1.1 would lose information, whilst for 1.0 it would not (but you might want to know you no longer have a floating-point numeric). Buried deep within the lexical analyser (/src/main/gram.c:4463-4485) is this code (part of the function NumericValue()), which actually creates an int data type from a double input suffixed by an ASCII "L":
/* Make certain that things are okay. */
if(c == 'L') {
    double a = R_atof(yytext);
    int b = (int) a;
    /* We are asked to create an integer via the L, so we check that the
       double and int values are the same. If not, this is a problem and we
       will not lose information and so use the numeric value.
    */
    if(a != (double) b) {
        if(GenerateCode) {
            if(seendot == 1 && seenexp == 0)
                warning(_("integer literal %s contains decimal; using numeric value"), yytext);
            else {
                /* hide the L for the warning message */
                *(yyp-2) = '\0';
                warning(_("non-integer value %s qualified with L; using numeric value"), yytext);
                *(yyp-2) = (char)c;
            }
        }
        asNumeric = 1;
        seenexp = 1;
    }
}
Probably because R is written in C, and L is used for a (long) integer in C

Easier way to convert a character to an integer?

Still getting a feel for what's in the Julia standard library. I can convert strings to their integer representation via the Int() constructor, but when I call Int() with a Char I don't get the integer value of the digit:
julia> Int('3')
51
Currently I'm calling string() first:
intval = Int(string(c)) # doesn't work anymore, see note below
Is this the accepted way of doing this? Or is there a more standard method? It's coming up quite a bit in my Project Euler exercise.
Note: This question was originally asked before Julia 1.0. Since it was asked the int function was renamed to Int and became a method of the Int type object. The method Int(::String) for parsing a string to an integer was removed because of the potentially confusing difference in behavior between that and Int(::Char) discussed in the accepted answer.
The short answer is you can do parse(Int, c) to do this correctly and efficiently. Read on for more discussion and details.
The code in the question as originally asked doesn't work anymore because Int(::String) was removed from the language because of the confusing difference in behavior between it and Int(::Char). Prior to Julia 1.0, the former parsed a string as an integer whereas the latter gave the Unicode code point of the character, which meant that Int("3") would return 3 whereas Int('3') would return 51. The modern working equivalent of what the questioner was using would be parse(Int, string(c)). However, you can skip converting the character to a string (which is quite inefficient) and just write parse(Int, c).
What does Int(::Char) do and why does Int('3') return 51? That is the code point value assigned to the character 3 by the Unicode Consortium, which was also the ASCII code point for it before that. Obviously, this is not the same as the digit value of the character. It would be nice if these matched, but they don't. The code points 0-9 are a bunch of non-printing "control characters" starting with the NUL byte that terminates C strings. The code points for decimal digits are at least contiguous, however:
julia> [Int(c) for c in "0123456789"]
10-element Vector{Int64}:
48
49
50
51
52
53
54
55
56
57
Because of this you can compute the value of a digit by subtracting the code point of 0 from it:
julia> [Int(c) - Int('0') for c in "0123456789"]
10-element Vector{Int64}:
0
1
2
3
4
5
6
7
8
9
Since subtraction of Char values works and subtracts their code points, this can be simplified to [c-'0' for c in "0123456789"]. Why not do it this way? You can! That is exactly what you'd do in C code. If you know your code will only ever encounter c values that are decimal digits, then this works well. It doesn't, however, do any error checking whereas parse does:
julia> c = 'f'
'f': ASCII/Unicode U+0066 (category Ll: Letter, lowercase)
julia> parse(Int, c)
ERROR: ArgumentError: invalid base 10 digit 'f'
Stacktrace:
 [1] parse(::Type{Int64}, c::Char; base::Int64)
   @ Base ./parse.jl:46
 [2] parse(::Type{Int64}, c::Char)
   @ Base ./parse.jl:41
 [3] top-level scope
   @ REPL[38]:1
julia> c - '0'
54
Moreover, parse is a bit more flexible. Suppose you want to accept f as a hex "digit" encoding the value 15. To do that with parse you just need to use the base keyword argument:
julia> parse(Int, 'f', base=16)
15
julia> parse(Int, 'F', base=16)
15
As you can see it parses upper or lower case hex digits correctly. In order to do that with the subtraction method, your code would need to do something like this:
'0' <= c <= '9' ? c - '0' :
'A' <= c <= 'F' ? c - 'A' + 10 :
'a' <= c <= 'f' ? c - 'a' + 10 : error()
Which is actually quite close to the implementation of the parse(Int, c) method. Of course at that point it's much clearer and easier to just call parse(Int, c) which does this for you and is well optimized.

Resources