How do you find the unicode value of a character in Julia? - julia

I'm looking for something like Python's ord(char) for Julia that returns an integer.

I think you're looking for codepoint. From the documentation:
codepoint(c::AbstractChar) -> Integer
Return the Unicode codepoint (an unsigned integer) corresponding to the character c (or throw an exception if c does not represent a valid character). For Char, this is a UInt32 value, but AbstractChar types that represent only a subset of Unicode may return a different-sized integer (e.g. UInt8).
For example:
julia> codepoint('a')
0x00000061
To get the exact equivalent of Python's ord function, you might want to convert the result to a signed integer:
julia> Int(codepoint('a'))
97
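If you want a drop-in analogue of Python's ord, you could wrap this in a one-line helper (the name ord below is just illustrative; it is not defined in Base):
julia> ord(c::AbstractChar) = Int(codepoint(c));
julia> ord('a')
97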

You can also just do:
julia> Int('a')
97
If you have a String:
julia> s="hello";
julia> Int(s[1])
104
julia> Int(s[2])
101
julia> Int(s[5])
111
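If you need the values for every character at once, a comprehension is a simple way to do it (a small sketch):
julia> [Int(c) for c in s]
5-element Vector{Int64}:
104
101
108
108
111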
More details here.

Related

Difference between character and string when constructing a 1-d array of the specified type

I am confused about constructing a 1-d array of a specified type by using getindex(type[, elements...]).
Of course, I can convert to Int8 when the elements are Ints:
getindex(Int8, 1, 2)
2-element Vector{Int8}:
1
2
Even when the elements are characters, I can convert them to Int8:
getindex(Int8, '1', '2')
2-element Vector{Int8}:
49
50
However, I cannot convert when the elements are strings.
getindex(Int8, "1", "2")
This raises the following error:
MethodError: Cannot `convert` an object of type String to an object of type Int8
Closest candidates are:
convert(::Type{T}, ::Ptr) where T<:Integer at pointer.jl:23
convert(::Type{IT}, ::GeometryBasics.OffsetInteger) where IT<:Integer at C:\Users\Admin\.julia\packages\GeometryBasics\WMp6v\src\offsetintegers.jl:40
convert(::Type{T}, ::SentinelArrays.ChainedVectorIndex) where T<:Union{Signed, Unsigned} at C:\Users\CARVI\.julia\packages\SentinelArrays\tV9lH\src\chainedvector.jl:209
...
Stacktrace:
[1] setindex!(A::Vector{Int8}, x::String, i1::Int64)
# Base .\array.jl:839
[2] getindex(#unused#::Type{Int8}, x::String, y::String)
# Base .\array.jl:393
[3] top-level scope
# In[35]:1
[4] eval
# .\boot.jl:360 [inlined]
[5] include_string(mapexpr::typeof(REPL.softscope), mod::Module, code::String, filename::String)
# Base .\loading.jl:1116
Why does getindex() allow a character element to be converted to a different type (like Char -> Int8), but not a string?
First of all, that's a rather weird way of writing array literals: getindex(T, xs...) is usually written as T[xs...].
However, the error already quite clearly tells you what went wrong:
Cannot convert an object of type String to an object of type Int8
What would a general conversion from String to Int8 even look like? What 8-bit integer should correspond to the string "slkdfjls", for example? A string is, after all, a pretty much arbitrary sequence of bytes. And contrary to your expectation, Julia does not attempt to parse the contained value (for that, use parse(Int8, "1")).
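A quick sketch of that (the exact error text may vary by Julia version):
julia> parse(Int8, "1")
1
julia> parse(Int8, "slkdfjls")
ERROR: ArgumentError: invalid base 10 digit 's' in "slkdfjls"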
Characters, on the other hand, represent (if valid) single Unicode code points, and it is meaningful to reinterpret their fixed number of bytes as a number:
julia> convert(Int16, '†')
8224
julia> convert(Int8, '1') # certainly not Int8(1)!
49
The conversion fails when the value exceeds the range of the target type:
julia> convert(Int8, '†')
ERROR: InexactError: trunc(Int8, 8224)
...
UTF-8 characters that happen to be representable by only one byte can be losslessly converted to Int8; this covers all of ASCII. Above that, an error is raised. This is no different from convert(Int8, Int32(something)).
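A quick illustration of that analogy (a sketch):
julia> convert(Int8, Int32(100))
100
julia> convert(Int8, Int32(200))
ERROR: InexactError: trunc(Int8, 200)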

Why Array{Float64,1} is not a subtype of Array{Real,1} in Julia?

I'm trying to write a Julia function, which can accept both 1-dimensional Int64 and Float64 array as input argument. How can I do this without defining two versions, one for Int64 and another for Float64?
I have tried using Array{Real,1} as input argument type. However, since Array{Int64,1} is not a subtype of Array{Real,1}, this cannot work.
A generic, though not type-safe, way to do it is to leave the argument untyped. For example:
function square(x)
    # element-wise multiplication works for both scalars and arrays
    out = x .* x
end
Output:
julia> square(2)
4
julia> square([2 2 2])
1×3 Array{Int64,2}:
4 4 4
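If you do want a type-annotated signature: Julia's parametric types are invariant, so Vector{Float64} is not a subtype of Vector{Real}, but you can constrain the element type instead. A small sketch (square2 is just an illustrative name):
julia> Vector{Float64} <: Vector{Real}
false
julia> Vector{Float64} <: Vector{<:Real}
true
julia> square2(x::AbstractVector{<:Real}) = x .* x;
julia> square2([2, 2, 2])
3-element Vector{Int64}:
4
4
4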

In Julia 1.0+ convert a floating-point number to a string retaining the original format

I would like to convert a floating point number to a string, retaining the original format of the digits. Here are some ideas I have tried:
julia> VERSION
v"1.0.3"
Notice the format, with 9 digits before the decimal point:
julia> x = 123456789.123;
These two attempts changed the format:
julia> string(x)
"1.23456789123e8"
julia> repr(x)
"1.23456789123e8"
Here is what I would like:
julia> to_string(x)
## 123456789.123
This seems simple, but I have not found a way to do it.
Any suggestions?
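One possible approach (not from the original thread) is the Printf standard library, assuming you know how many decimal places you want, here three:
julia> using Printf
julia> @sprintf("%.3f", x)
"123456789.123"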

How to perform division in Go

I am trying to perform a simple division in Go.
fmt.Println(3/10)
This prints 0 instead of 0.3, which seems odd. Could someone explain the reason behind this? I want to perform different arithmetic operations in Go.
Thanks
The operands of the binary operation 3 / 10 are untyped constants. The specification says this about binary operations with untyped constants
if the operands of a binary operation are different kinds of untyped constants, the operation and, for non-boolean operations, the result use the kind that appears later in this list: integer, rune, floating-point, complex.
Because 3 and 10 are untyped integer constants, the value of the expression is an untyped integer (0 in this case).
To get a floating-point constant result, one of the operands must be a floating-point constant. The following expressions evaluate to the untyped floating-point constant 0.3:
3.0 / 10.0
3.0 / 10
3 / 10.0
When the division operation has an untyped constant operand and a typed operand, the typed operand determines the type of the expression. Ensure that the typed operand is a float64 to get a float64 result.
The expressions below convert int variables to a float64 to get the float64 result 0.3:
var i3 = 3
var i10 = 10
fmt.Println(float64(i3) / 10)
fmt.Println(3 / float64(i10))
Run a demonstration on the playground.

Easier way to convert a character to an integer?

Still getting a feel for what's in the Julia standard library. I can convert strings to their integer representation via the Int() constructor, but when I call Int() with a Char I don't get the integer value of the digit:
julia> Int('3')
51
Currently I'm calling string() first:
intval = Int(string(c)) # doesn't work anymore, see note below
Is this the accepted way of doing this? Or is there a more standard method? It's coming up quite a bit in my Project Euler exercise.
Note: This question was originally asked before Julia 1.0. Since it was asked, the int function was renamed to Int and became a method of the Int type object. The method Int(::String) for parsing a string to an integer was removed because of the potentially confusing difference in behavior between it and Int(::Char), discussed in the accepted answer.
The short answer is you can do parse(Int, c) to do this correctly and efficiently. Read on for more discussion and details.
The code in the question as originally asked doesn't work anymore because Int(::String) was removed from the language due to the confusing difference in behavior between it and Int(::Char). Prior to Julia 1.0, the former parsed a string as an integer whereas the latter gave the Unicode code point of the character, which meant that Int("3") would return 3 whereas Int('3') would return 51. The modern working equivalent of what the questioner was using would be parse(Int, string(c)). However, you can skip converting the character to a string (which is quite inefficient) and just write parse(Int, c).
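For example:
julia> parse(Int, '3')
3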
What does Int(::Char) do and why does Int('3') return 51? That is the code point value assigned to the character '3' by the Unicode Consortium, which was also its ASCII code point before that. Obviously, this is not the same as the digit value of the character. It would be nice if these matched, but they don't. The code points 0-9 are a bunch of non-printing "control characters" starting with the NUL byte that terminates C strings. The code points for decimal digits are at least contiguous, however:
julia> [Int(c) for c in "0123456789"]
10-element Vector{Int64}:
48
49
50
51
52
53
54
55
56
57
Because of this you can compute the value of a digit by subtracting the code point of 0 from it:
julia> [Int(c) - Int('0') for c in "0123456789"]
10-element Vector{Int64}:
0
1
2
3
4
5
6
7
8
9
Since subtraction of Char values works and subtracts their code points, this can be simplified to [c-'0' for c in "0123456789"]. Why not do it this way? You can! That is exactly what you'd do in C code. If you know your code will only ever encounter c values that are decimal digits, then this works well. It doesn't, however, do any error checking whereas parse does:
julia> c = 'f'
'f': ASCII/Unicode U+0066 (category Ll: Letter, lowercase)
julia> parse(Int, c)
ERROR: ArgumentError: invalid base 10 digit 'f'
Stacktrace:
[1] parse(::Type{Int64}, c::Char; base::Int64)
# Base ./parse.jl:46
[2] parse(::Type{Int64}, c::Char)
# Base ./parse.jl:41
[3] top-level scope
# REPL[38]:1
julia> c - '0'
54
Moreover, parse is a bit more flexible. Suppose you want to accept f as a hex "digit" encoding the value 15. To do that with parse you just need to use the base keyword argument:
julia> parse(Int, 'f', base=16)
15
julia> parse(Int, 'F', base=16)
15
As you can see it parses upper or lower case hex digits correctly. In order to do that with the subtraction method, your code would need to do something like this:
'0' <= c <= '9' ? c - '0' :
'A' <= c <= 'F' ? c - 'A' + 10 :
'a' <= c <= 'f' ? c - 'a' + 10 : error()
That is actually quite close to the implementation of the parse(Int, c) method. Of course, at that point it's much clearer and easier to just call parse(Int, c), which does this for you and is well optimized.
