How would I Emit a discriminated union? - reflection

If I have something like this:
type public ServiceActionType =
| ServiceRequest
| ServiceResponse
| ServiceEventInvocation
| ServiceEventSubscriptionRequest
| ServiceMetadataRequest
how do I then add one of those items to the stack, as in:
IlGen.Emit(OpCodes.???, ???)

The existing answers are good, but note that the Microsoft.FSharp.Reflection namespace has some helpers that make it easy to get the members that you care about:
open Microsoft.FSharp.Reflection
let cases = FSharpType.GetUnionCases(typeof<ServiceActionType>)
let serviceRequestMethod = FSharpValue.PreComputeUnionConstructorInfo(cases.[0])
The result is the same as typeof<ServiceActionType>.GetMethod("get_ServiceRequest") but you don't have to worry about knowing the compiled form, which can vary depending on whether the union cases have fields.

In general, discriminated unions are compiled as class hierarchies (meaning that there is a new sub-class for every case of the discriminated union). However, in your case, the situation is simpler, because none of the cases carries any values.
If you look at the compiled code, you'll see that the generated .NET representation gives you something like:
class ServiceActionType {
public static ServiceActionType ServiceRequest { get; }
public static ServiceActionType ServiceResponse { get; }
public static ServiceActionType ServiceEventInvocation { get; }
// etc. for all the other cases
}
This means that, if you want to construct one of the values of the discriminated union, you just need to emit a call to the getter method of the (static) property representing the case you want.
If you had a case with arguments, say ServiceRequest of RequestInfo, then the generated class would include a NewServiceRequest method instead, which would take the necessary parameters:
public static ServiceActionType NewServiceRequest(RequestInfo info);
That said, I'm not entirely sure why you want to do this, so emitting code might not be the best approach. You can also consider using F# quotations - which can be compiled to dynamic methods - and creating a quotation that represents construction of a DU case is quite easy using Expr.NewUnionCase.

Let's take ServiceActionType.ServiceRequest for example.
I put the following in LINQPad (F# Program mode) to get the IL (but you could use ildasm.exe for the same purpose):
type public ServiceActionType =
| ServiceRequest
| ServiceResponse
| ServiceEventInvocation
| ServiceEventSubscriptionRequest
| ServiceMetadataRequest
ServiceRequest.Dump()
and it gave me:
IL_0001: call Query_qdoovg+ServiceActionType.get_ServiceRequest
IL_0006: call LINQPad.FSharpExtensions.Extensions.Dump
Dump:
IL_0000: nop
IL_0001: ldarg.0
IL_0002: box 03 00 00 1B
IL_0007: call LINQPad.FSharpExtensions.Extensions.Dump
IL_000C: ret
ServiceActionType.get_ServiceMetadataRequest:
IL_0000: ldsfld Query_qdoovg+ServiceActionType._unique_ServiceMetadataRequest
IL_0005: ret
ServiceActionType.get_IsServiceMetadataRequest:
IL_0000: ldarg.0
IL_0001: call Query_qdoovg+ServiceActionType.get_Tag
IL_0006: ldc.i4.4
IL_0007: ceq
IL_0009: ret
ServiceActionType.get_ServiceEventSubscriptionRequest:
IL_0000: ldsfld Query_qdoovg+ServiceActionType._unique_ServiceEventSubscriptionRequest
IL_0005: ret
ServiceActionType.get_IsServiceEventSubscriptionRequest:
IL_0000: ldarg.0
IL_0001: call Query_qdoovg+ServiceActionType.get_Tag
IL_0006: ldc.i4.3
IL_0007: ceq
IL_0009: ret
ServiceActionType.get_ServiceEventInvocation:
IL_0000: ldsfld Query_qdoovg+ServiceActionType._unique_ServiceEventInvocation
IL_0005: ret
ServiceActionType.get_IsServiceEventInvocation:
IL_0000: ldarg.0
IL_0001: call Query_qdoovg+ServiceActionType.get_Tag
IL_0006: ldc.i4.2
IL_0007: ceq
IL_0009: ret
ServiceActionType.get_ServiceResponse:
IL_0000: ldsfld Query_qdoovg+ServiceActionType._unique_ServiceResponse
IL_0005: ret
ServiceActionType.get_IsServiceResponse:
IL_0000: ldarg.0
IL_0001: call Query_qdoovg+ServiceActionType.get_Tag
IL_0006: ldc.i4.1
IL_0007: ceq
IL_0009: ret
ServiceActionType.get_ServiceRequest:
IL_0000: ldsfld Query_qdoovg+ServiceActionType._unique_ServiceRequest
IL_0005: ret
ServiceActionType.get_IsServiceRequest:
IL_0000: ldarg.0
IL_0001: call Query_qdoovg+ServiceActionType.get_Tag
IL_0006: ldc.i4.0
IL_0007: ceq
IL_0009: ret
ServiceActionType.get_Tag:
IL_0000: ldarg.0
IL_0001: ldfld Query_qdoovg+ServiceActionType._tag
IL_0006: ret
ServiceActionType.__DebugDisplay:
IL_0000: ldstr "%+0.8A"
IL_0005: newobj Microsoft.FSharp.Core.PrintfFormat<Microsoft.FSharp.Core.FSharpFunc<Query_qdoovg+ServiceActionType,System.String>,Microsoft.FSharp.Core.Unit,System.String,System.String,System.String>..ctor
IL_000A: call Microsoft.FSharp.Core.ExtraTopLevelOperators.PrintFormatToString
IL_000F: ldarg.0
IL_0010: callvirt Microsoft.FSharp.Core.FSharpFunc<Query_qdoovg+ServiceActionType,System.String>.Invoke
IL_0015: ret
ServiceActionType.CompareTo:
IL_0000: nop
IL_0001: ldarg.0
IL_0002: ldnull
IL_0003: cgt.un
IL_0005: brfalse.s IL_0027
IL_0007: ldarg.1
IL_0008: ldnull
IL_0009: cgt.un
IL_000B: brfalse.s IL_0025
IL_000D: ldarg.0
IL_000E: ldfld Query_qdoovg+ServiceActionType._tag
IL_0013: stloc.0
IL_0014: ldarg.1
IL_0015: ldfld Query_qdoovg+ServiceActionType._tag
IL_001A: stloc.1
IL_001B: ldloc.0
IL_001C: ldloc.1
IL_001D: bne.un.s IL_0021
IL_001F: ldc.i4.0
IL_0020: ret
IL_0021: ldloc.0
IL_0022: ldloc.1
IL_0023: sub
IL_0024: ret
IL_0025: ldc.i4.1
IL_0026: ret
IL_0027: ldarg.1
IL_0028: ldnull
IL_0029: cgt.un
IL_002B: brfalse.s IL_002F
IL_002D: ldc.i4.m1
IL_002E: ret
IL_002F: ldc.i4.0
IL_0030: ret
ServiceActionType.CompareTo:
IL_0000: nop
IL_0001: ldarg.0
IL_0002: ldarg.1
IL_0003: unbox.any Query_qdoovg.ServiceActionType
IL_0008: call Query_qdoovg+ServiceActionType.CompareTo
IL_000D: ret
ServiceActionType.CompareTo:
IL_0000: nop
IL_0001: ldarg.1
IL_0002: unbox.any Query_qdoovg.ServiceActionType
IL_0007: stloc.0
IL_0008: ldarg.0
IL_0009: ldnull
IL_000A: cgt.un
IL_000C: brfalse.s IL_0033
IL_000E: ldarg.1
IL_000F: unbox.any Query_qdoovg.ServiceActionType
IL_0014: ldnull
IL_0015: cgt.un
IL_0017: brfalse.s IL_0031
IL_0019: ldarg.0
IL_001A: ldfld Query_qdoovg+ServiceActionType._tag
IL_001F: stloc.1
IL_0020: ldloc.0
IL_0021: ldfld Query_qdoovg+ServiceActionType._tag
IL_0026: stloc.2
IL_0027: ldloc.1
IL_0028: ldloc.2
IL_0029: bne.un.s IL_002D
IL_002B: ldc.i4.0
IL_002C: ret
IL_002D: ldloc.1
IL_002E: ldloc.2
IL_002F: sub
IL_0030: ret
IL_0031: ldc.i4.1
IL_0032: ret
IL_0033: ldarg.1
IL_0034: unbox.any Query_qdoovg.ServiceActionType
IL_0039: ldnull
IL_003A: cgt.un
IL_003C: brfalse.s IL_0040
IL_003E: ldc.i4.m1
IL_003F: ret
IL_0040: ldc.i4.0
IL_0041: ret
ServiceActionType.GetHashCode:
IL_0000: nop
IL_0001: ldarg.0
IL_0002: ldnull
IL_0003: cgt.un
IL_0005: brfalse.s IL_003C
IL_0007: ldc.i4.0
IL_0008: stloc.0
IL_0009: ldarg.0
IL_000A: call Query_qdoovg+ServiceActionType.get_Tag
IL_000F: switch (IL_0028, IL_002C, IL_0030, IL_0034, IL_0038)
IL_0028: ldc.i4.0
IL_0029: stloc.0
IL_002A: ldloc.0
IL_002B: ret
IL_002C: ldc.i4.1
IL_002D: stloc.0
IL_002E: ldloc.0
IL_002F: ret
IL_0030: ldc.i4.2
IL_0031: stloc.0
IL_0032: ldloc.0
IL_0033: ret
IL_0034: ldc.i4.3
IL_0035: stloc.0
IL_0036: ldloc.0
IL_0037: ret
IL_0038: ldc.i4.4
IL_0039: stloc.0
IL_003A: ldloc.0
IL_003B: ret
IL_003C: ldc.i4.0
IL_003D: ret
ServiceActionType.GetHashCode:
IL_0000: nop
IL_0001: ldarg.0
IL_0002: call Microsoft.FSharp.Core.LanguagePrimitives.get_GenericEqualityComparer
IL_0007: call Query_qdoovg+ServiceActionType.GetHashCode
IL_000C: ret
ServiceActionType.Equals:
IL_0000: nop
IL_0001: ldarg.0
IL_0002: ldnull
IL_0003: cgt.un
IL_0005: brfalse.s IL_0026
IL_0007: ldarg.1
IL_0008: isinst Query_qdoovg.ServiceActionType
IL_000D: stloc.0
IL_000E: ldloc.0
IL_000F: brfalse.s IL_0024
IL_0011: ldarg.0
IL_0012: ldfld Query_qdoovg+ServiceActionType._tag
IL_0017: stloc.1
IL_0018: ldloc.0
IL_0019: ldfld Query_qdoovg+ServiceActionType._tag
IL_001E: stloc.2
IL_001F: ldloc.1
IL_0020: ldloc.2
IL_0021: ceq
IL_0023: ret
IL_0024: ldc.i4.0
IL_0025: ret
IL_0026: ldarg.1
IL_0027: ldnull
IL_0028: cgt.un
IL_002A: ldc.i4.0
IL_002B: ceq
IL_002D: ret
ServiceActionType.Equals:
IL_0000: nop
IL_0001: ldarg.0
IL_0002: ldnull
IL_0003: cgt.un
IL_0005: brfalse.s IL_0022
IL_0007: ldarg.1
IL_0008: ldnull
IL_0009: cgt.un
IL_000B: brfalse.s IL_0020
IL_000D: ldarg.0
IL_000E: ldfld Query_qdoovg+ServiceActionType._tag
IL_0013: stloc.0
IL_0014: ldarg.1
IL_0015: ldfld Query_qdoovg+ServiceActionType._tag
IL_001A: stloc.1
IL_001B: ldloc.0
IL_001C: ldloc.1
IL_001D: ceq
IL_001F: ret
IL_0020: ldc.i4.0
IL_0021: ret
IL_0022: ldarg.1
IL_0023: ldnull
IL_0024: cgt.un
IL_0026: ldc.i4.0
IL_0027: ceq
IL_0029: ret
ServiceActionType.Equals:
IL_0000: nop
IL_0001: ldarg.1
IL_0002: isinst Query_qdoovg.ServiceActionType
IL_0007: stloc.0
IL_0008: ldloc.0
IL_0009: brfalse.s IL_0013
IL_000B: ldarg.0
IL_000C: ldloc.0
IL_000D: call Query_qdoovg+ServiceActionType.Equals
IL_0012: ret
IL_0013: ldc.i4.0
IL_0014: ret
ServiceActionType..ctor:
IL_0000: ldarg.0
IL_0001: call System.Object..ctor
IL_0006: ldarg.0
IL_0007: ldarg.1
IL_0008: stfld Query_qdoovg+ServiceActionType._tag
IL_000D: ret
The relevant line is IL_0001: call Query_qdoovg+ServiceActionType.get_ServiceRequest. This call instruction puts the ServiceActionType.ServiceRequest value on the stack.
It could be emitted like this: il.Emit(OpCodes.Call, typeof<ServiceActionType>.GetMethod("get_ServiceRequest"))

Related

Array of pointers, VirtualAlloc and RtlMoveMemory. MASM, some kind of problem

Does anybody know how to fix the addElement function so that the record ends up as another element in the array? The idea is a dynamic array, where arrayPtr is a pointer to the first element, and new elements can be added dynamically and kept track of by increasing the arrayPtr value. So in fact I think it would end up being an array of pointers to DbRecord structs in memory, allocated by VirtualAlloc and copied by RtlMoveMemory. I am kind of hung up on the RtlMoveMemory line, although I feel like my line of thinking is correct.
.386
.model flat, stdcall
option casemap :none
include windows.inc
include user32.inc
include kernel32.inc
addElement PROTO: ptr DbRecord
.data?
DbRecord struct
Id dd ?
WordOne db 32 dup(?) ; db is define byte, set value of byte
WordTwo db 32 dup(?)
WordThree db 32 dup(?)
Year dd ?
DbRecord ends
arrayPtr dd ? ; pointer in memory to start of array
newElementPointer DbRecord <>
hStdOut dd ?
bytesWritten dd ?
.data
arrayCount dd 0
hello db 'Hello World!', 0
.code
main proc
LOCAL DbRecord01:DbRecord
mov [DbRecord01.Id], 1;
; any other way than one character at a time?
mov byte ptr [DbRecord01.WordOne], 'D'
mov byte ptr [DbRecord01.WordOne + 1], 'o'
mov byte ptr [DbRecord01.WordOne + 2], 'g'
mov byte ptr [DbRecord01.WordOne + 3], 0
mov byte ptr [DbRecord01.WordTwo], 'C'
mov byte ptr [DbRecord01.WordTwo + 1], 'a'
mov byte ptr [DbRecord01.WordTwo + 2], 't'
mov byte ptr [DbRecord01.WordTwo + 3], 0
mov byte ptr [DbRecord01.WordThree], 'E'
mov byte ptr [DbRecord01.WordThree + 1], 'y'
mov byte ptr [DbRecord01.WordThree + 2], 'e'
mov byte ptr [DbRecord01.WordThree + 3], 0
mov [DbRecord01.Year], 2022;
invoke GetStdHandle, STD_OUTPUT_HANDLE
mov [hStdOut], eax
invoke WriteConsole, hStdOut, offset hello, sizeof hello, offset bytesWritten, NULL
invoke addElement, addr DbRecord01
ret
main endp
addElement proc DbRecordPointer: ptr DbRecord
invoke VirtualAlloc, NULL, sizeof DbRecord, MEM_COMMIT, PAGE_READWRITE ; I believe this stores a memory address in eax
invoke RtlMoveMemory, DbRecord ptr [eax], DbRecordPointer, sizeof DbRecord ; but how to use that memory address here?
ret
addElement endp
end main
EDIT/Update:
So yes, part of the answer is just passing in eax.
Here is where I am now:
How do I get the value of eax (the memory location returned by VirtualAlloc, where the data was copied) into arrayPtr (arrayPtr + count * sizeof DbRecord)?
.386
.model flat, stdcall
option casemap :none
include windows.inc
include user32.inc
include kernel32.inc
addElement PROTO: ptr DbRecord
.data?
DbRecord struct
Id dd ?
WordOne db 32 dup(?) ; db is define byte, set value of byte
WordTwo db 32 dup(?)
WordThree db 32 dup(?)
Year dd ?
DbRecord ends
arrayPtr dword ? ; pointer in memory to start of array
; newElementPointer DbRecord <>
hStdOut dd ?
bytesWritten dd ?
.data
arrayCount dd 0
hello db 'Hello World!', 0
.code
main proc
LOCAL DbRecord01:DbRecord
mov [DbRecord01.Id], 1;
; any other way than one character at a time?
mov byte ptr [DbRecord01.WordOne], 'D'
mov byte ptr [DbRecord01.WordOne + 1], 'o'
mov byte ptr [DbRecord01.WordOne + 2], 'g'
mov byte ptr [DbRecord01.WordOne + 3], 0
mov byte ptr [DbRecord01.WordTwo], 'C'
mov byte ptr [DbRecord01.WordTwo + 1], 'a'
mov byte ptr [DbRecord01.WordTwo + 2], 't'
mov byte ptr [DbRecord01.WordTwo + 3], 0
mov byte ptr [DbRecord01.WordThree], 'E'
mov byte ptr [DbRecord01.WordThree + 1], 'y'
mov byte ptr [DbRecord01.WordThree + 2], 'e'
mov byte ptr [DbRecord01.WordThree + 3], 0
mov [DbRecord01.Year], 2022;
invoke GetStdHandle, STD_OUTPUT_HANDLE
mov [hStdOut], eax
invoke WriteConsole, hStdOut, offset hello, sizeof hello, offset bytesWritten, NULL
invoke addElement, addr DbRecord01
ret
main endp
addElement proc uses edx DbRecordPointer: ptr DbRecord
Local newElementPointer: Dword
invoke VirtualAlloc, NULL, sizeof DbRecord, MEM_COMMIT, PAGE_READWRITE ; I believe this stores a memory address in eax
mov newElementPointer, eax
;invoke RtlMoveMemory, newElementPointer , DbRecordPointer, sizeof DbRecord ; but how to use that memory address here?
invoke RtlMoveMemory, eax , DbRecordPointer, sizeof DbRecord
mov edx, arrayCount
inc edx
mov arrayCount, edx
;mov dword ptr [arrayPtr+arrayCount], eax
ret
addElement endp
end main
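The bookkeeping asked about in the update can be sketched in C. This is only an illustration under stated assumptions, not the MASM answer: malloc and memmove stand in for VirtualAlloc and RtlMoveMemory, and MAX_RECORDS, array_ptr, array_count and add_element are hypothetical names. It also assumes the array really holds pointers, in which case each slot is one pointer wide (4 bytes on 32-bit x86) rather than sizeof DbRecord.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Mirrors the DbRecord struct from the MASM listing. */
typedef struct {
    unsigned int id;
    char word_one[32];
    char word_two[32];
    char word_three[32];
    unsigned int year;
} DbRecord;

/* Hypothetical fixed capacity; the question does not say how the
   pointer table itself is sized. */
#define MAX_RECORDS 16

static DbRecord *array_ptr[MAX_RECORDS]; /* table of pointers to records */
static size_t array_count = 0;           /* number of used slots */

/* Copies *src into newly allocated memory and appends the new pointer
   to the table. Returns the copy, or NULL if the table is full or the
   allocation fails. */
DbRecord *add_element(const DbRecord *src)
{
    if (array_count >= MAX_RECORDS)
        return NULL;

    DbRecord *copy = malloc(sizeof *copy);  /* stand-in for VirtualAlloc */
    if (copy == NULL)
        return NULL;

    memmove(copy, src, sizeof *copy);       /* stand-in for RtlMoveMemory */

    /* Slot address = array_ptr + array_count * sizeof(DbRecord *). */
    array_ptr[array_count] = copy;
    array_count++;
    return copy;
}
```

Note that the allocated address is kept in a local variable before the copy, which matches the newElementPointer local in the updated listing: eax is a caller-saved register under stdcall, so it should not be relied on after the RtlMoveMemory call.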

ThreadSanitizer (TSan) instrumentation using LLVM opt and TSan passes

My goal is to instrument my initial IR with proper calls to TSan runtime library functions using LLVM opt tool and TSan passes. In other words, I want to end up with similar TSan instrumentation as when using clang -fsanitize=thread -S but by directly using opt and TSan passes instead.
As far as I know, LLVM has two passes for TSan instrumentation: tsan-module (a module pass) and tsan (a function pass). Both passes are available by default in opt, i.e. are included in opt -print-passes report.
I chose tiny_race.c as my sample program, where the main thread and the thread it spawns (Thread1) form a data race while accessing a global variable Global.
Here are the two steps I take to instrument the code my way:
Generating the initial LLVM IR for tiny_race.c:
clang -S -emit-llvm tiny_race.c -o tiny_race.ll
Using LLVM opt to instrument tiny_race.ll with the two TSan passes:
opt -passes='tsan-module,tsan' tiny_race.ll -S -o myInstrumented.ll
The above pass pipeline executes fine but the resulting myInstrumented.ll lacks some TSan instrumentations. More specifically:
Thread1 (child thread) is left completely un-instrumented.
The main thread only has @__tsan_func_entry and @__tsan_func_exit instrumentation; its accesses to Global are not instrumented.
Could anyone please explain why my approach produces a partially-instrumented output? Any suggestion is greatly appreciated.
To better show the difference between the IR resulting from my approach and the expected one, below are the definitions of main and Thread1 in each of them.
Here is myInstrumented.ll:
; Function Attrs: noinline nounwind optnone uwtable
define dso_local ptr @Thread1(ptr noundef %x) #0 {
entry:
  %x.addr = alloca ptr, align 8
  store ptr %x, ptr %x.addr, align 8
  store i32 42, ptr @Global, align 4
  %0 = load ptr, ptr %x.addr, align 8
  ret ptr %0
}
; Function Attrs: noinline nounwind optnone uwtable
define dso_local i32 @main() #0 {
entry:
  %0 = call ptr @llvm.returnaddress(i32 0)
  call void @__tsan_func_entry(ptr %0) ; *****TSAN INSTRUMENTATION*****
  %retval = alloca i32, align 4
  %t = alloca i64, align 8
  store i32 0, ptr %retval, align 4
  %call = call i32 @pthread_create(ptr noundef %t, ptr noundef null, ptr noundef @Thread1, ptr noundef null) #4
  store i32 43, ptr @Global, align 4
  %1 = load i64, ptr %t, align 8
  %call1 = call i32 @pthread_join(i64 noundef %1, ptr noundef null)
  %2 = load i32, ptr @Global, align 4
  call void @__tsan_func_exit() ; *****TSAN INSTRUMENTATION*****
  ret i32 %2
}
And here is the resulting IR when using clang -fsanitize=thread -S -emit-llvm tiny_race.c which is my expected result:
; Function Attrs: noinline nounwind optnone sanitize_thread uwtable
define dso_local ptr @Thread1(ptr noundef %x) #0 {
entry:
  %0 = call ptr @llvm.returnaddress(i32 0)
  call void @__tsan_func_entry(ptr %0) ; *****TSAN INSTRUMENTATION*****
  %x.addr = alloca ptr, align 8
  store ptr %x, ptr %x.addr, align 8
  call void @__tsan_write4(ptr @Global) ; *****TSAN INSTRUMENTATION*****
  store i32 42, ptr @Global, align 4
  %1 = load ptr, ptr %x.addr, align 8
  call void @__tsan_func_exit() ; *****TSAN INSTRUMENTATION*****
  ret ptr %1
}
; Function Attrs: noinline nounwind optnone sanitize_thread uwtable
define dso_local i32 @main() #0 {
entry:
  %0 = call ptr @llvm.returnaddress(i32 0)
  call void @__tsan_func_entry(ptr %0) ; *****TSAN INSTRUMENTATION*****
  %retval = alloca i32, align 4
  %t = alloca i64, align 8
  store i32 0, ptr %retval, align 4
  %call = call i32 @pthread_create(ptr noundef %t, ptr noundef null, ptr noundef @Thread1, ptr noundef null) #4
  call void @__tsan_write4(ptr @Global) ; *****TSAN INSTRUMENTATION*****
  store i32 43, ptr @Global, align 4
  call void @__tsan_read8(ptr %t) ; *****TSAN INSTRUMENTATION*****
  %1 = load i64, ptr %t, align 8
  %call1 = call i32 @pthread_join(i64 noundef %1, ptr noundef null)
  call void @__tsan_read4(ptr @Global) ; *****TSAN INSTRUMENTATION*****
  %2 = load i32, ptr @Global, align 4
  call void @__tsan_func_exit() ; *****TSAN INSTRUMENTATION*****
  ret i32 %2
}

F# Pointless copying after recursive function compilation

I wrote the following function in F# and tried looking at the decompilation
let startsWith (s: string) (seg: string) =
    if s.Length <= seg.Length then
        let max = (s.Length - 1)
        let rec perChar i =
            if s.[i] = seg.[i] then
                if i = max then
                    true
                else
                    perChar (i + 1)
            else
                false
        perChar 0
    else
        false
As expected, the inner function is optimised into a while loop. However, I notice that the constant values are copied to dummy variables and then assigned back to themselves in each iteration:
internal static bool perChar@8(string s, string seg, int max, int i)
{
    while (s[i] == seg[i])
    {
        if (i == max)
        {
            return true;
        }
        string text = s;
        string text2 = seg;
        int num = max;
        i++;
        max = num;
        seg = text2;
        s = text;
    }
    return false;
}
I suppose I have a couple questions about this:
Why do these dummy names get created?
Is there any performance impact or do these pointless copies get optimised out by the JIT before being executed?
EDIT: IL Added
.method assembly static
valuetype [System.Private.CoreLib]System.Boolean perChar@8 (
class [System.Private.CoreLib]System.String s,
class [System.Private.CoreLib]System.String seg,
valuetype [System.Private.CoreLib]System.Int32 max,
valuetype [System.Private.CoreLib]System.Int32 i
) cil managed
{
// Method begins at RVA 0x2050
// Code size 40 (0x28)
.maxstack 8
// loop start
IL_0000: ldarg.0
IL_0001: ldarg.3
IL_0002: callvirt instance valuetype [netstandard]System.Char [netstandard]System.String::get_Chars(valuetype [netstandard]System.Int32)
IL_0007: ldarg.1
IL_0008: ldarg.3
IL_0009: callvirt instance valuetype [netstandard]System.Char [netstandard]System.String::get_Chars(valuetype [netstandard]System.Int32)
IL_000e: bne.un.s IL_0026
IL_0010: ldarg.3
IL_0011: ldarg.2
IL_0012: bne.un.s IL_0016
IL_0014: ldc.i4.1
IL_0015: ret
IL_0016: ldarg.0
IL_0017: ldarg.1
IL_0018: ldarg.2
IL_0019: ldarg.3
IL_001a: ldc.i4.1
IL_001b: add
IL_001c: starg.s i
IL_001e: starg.s max
IL_0020: starg.s seg
IL_0022: starg.s s
IL_0024: br.s IL_0000
// end loop
IL_0026: ldc.i4.0
IL_0027: ret
} // end of method C::perChar@8

LLVM global constructor is not called for Atmel processors

I have compiled some C++ code and uploaded it to an Arduino Uno to blink an LED. The code works fine.
However, when I convert it to .ll, then from .ll to an object file and then to hex, and upload it, the code stops working: the LED no longer blinks.
If I address the ports directly:
typedef unsigned char uint8_t;
typedef uint8_t * volatile port_type;
const port_type portB = (port_type) 0x25;
const port_type ddrB = (port_type) 0x24;
it works fine, but if I initialize the port addresses via a global constructor, it does not work:
int getPortB() {return 0x25;}
int getDdrB() {return 0x24;}
const port_type portB = (port_type) getPortB();
const port_type ddrB = (port_type) getDdrB();
This is because the global constructor is not called at all. If I call it from the main function via
call addrspace(1) void @global_var_init()
it will work.
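For comparison, here is a minimal C sketch of the same initialization pattern on a hosted (non-AVR) toolchain. It relies on the GCC/Clang-specific constructor attribute, which Clang typically lowers to an @llvm.global_ctors entry like the one in blink1.ll. The const qualifier is dropped from port_type's use here so that the globals can be assigned at startup, and the sketch only illustrates the expected behaviour: the startup code runs global_var_init before main. It says nothing about why that does not happen in the AVR build.

```c
#include <stdint.h>

typedef uint8_t *volatile port_type;

/* Zero-initialised at load time, like the null globals in blink1.ll. */
port_type portB;
port_type ddrB;

static int getPortB(void) { return 0x25; }
static int getDdrB(void)  { return 0x24; }

/* GCC/Clang extension: registers this function to run before main(). */
__attribute__((constructor))
static void global_var_init(void)
{
    portB = (port_type)(uintptr_t)getPortB();
    ddrB  = (port_type)(uintptr_t)getDdrB();
}
```

If the startup code honours the constructor list, portB and ddrB already hold 0x25 and 0x24 when main begins. If it does not, they are still null, which matches the symptom described above.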
I use the following commands to compile the .ll file and upload it to the Arduino Uno:
llvm-as-9 blink1.ll -o blink1.bc
llc-9 -filetype=obj blink1.bc
avr-g++ -mmcu=atmega328p blink1.o -o blink1
avr-objcopy -O ihex -R .eeprom blink1 blink1.hex
avrdude -F -V -c arduino -p ATMEGA328P -P /dev/ttyUSB0 -b 115200 -U flash:w:blink1.hex
blink1.ll
; ModuleID = 'blink1.cpp'
source_filename = "blink1.cpp"
target datalayout = "e-P1-p:16:8-i8:8-i16:8-i32:8-i64:8-f32:8-f64:8-n8-a:8"
target triple = "avr"
@portB = dso_local global i8* null, align 1
@ddrB = dso_local global i8* null, align 1
@llvm.global_ctors = appending global [1 x { i32, void () addrspace(1)*, i8* }] [{ i32, void () addrspace(1)*, i8* } { i32 65535, void () addrspace(1)* @global_var_init, i8* null }]
; Function Attrs: noinline
define internal void @global_var_init() addrspace(1) {
  %1 = inttoptr i16 37 to i8*
  store volatile i8* %1, i8** @portB, align 1
  %2 = inttoptr i16 36 to i8*
  store volatile i8* %2, i8** @ddrB, align 1
  ret void
}
; Function Attrs: noinline nounwind optnone
define dso_local void @delay_500ms() addrspace(1) {
  call addrspace(0) void asm sideeffect "ldi r19, 150 \0A\09ldi r20, 128 \0A\09ldi r23, 41 \0A\09L1: \0A\09dec r20 \0A\09brne L1 \0A\09dec r19 \0A\09brne L1 \0A\09dec r23 \0A\09brne L1 \0A\09", ""() #3, !srcloc !2
  ret void
}
; Function Attrs: noinline norecurse nounwind optnone
define dso_local i16 @main() addrspace(1) {
  ; call addrspace(1) void @global_var_init()
  %1 = alloca i16, align 1
  store i16 0, i16* %1, align 1
  %2 = load volatile i8*, i8** @ddrB, align 1
  store i8 32, i8* %2, align 1
  br label %3
3: ; preds = %0, %3
  %4 = load volatile i8*, i8** @portB, align 1
  store i8 32, i8* %4, align 1
  call addrspace(1) void @delay_500ms()
  %5 = load volatile i8*, i8** @portB, align 1
  store i8 0, i8* %5, align 1
  call addrspace(1) void @delay_500ms()
  br label %3
}
!0 = !{i32 1, !"wchar_size", i32 2}
!1 = !{!"clang version 9.0.1-+20210314105943+c1a0a213378a-1~exp1~20210314220516.107 "}
!2 = !{i32 1296, i32 1313, i32 1338, i32 1362, i32 1377, i32 1397, i32 1416, i32 1436, i32 1455, i32 1475, i32 1494}
Is this an LLVM bug, or am I making a mistake?

Returning a value pointed to by a pointer in x86 NASM

I'm trying to write a function in x86 NASM assembly that takes a pointer to a structure (structure contains pointer to a buffer) and 2 ints (x,y) which then computes the address of the byte containing (x,y) and returns the value in this address. (The buffer contains a bmp file) I have this function written in C and it works fine.
C function
int loadByte(imgInfo* pImg, int x, int y)
{
unsigned char *pPix = pImg->pImg + (((pImg->width + 31) >> 5) << 2) * y + (x >> 3);
return *pPix;
}
x86 function
load_byte:
push ebp ; prologue
mov ebp, esp
lea ecx, [ebp + 8]
mov ecx, [ecx] ; ecx = &imgInfo
mov eax, [ecx+0] ; eax = width
add eax, 31 ; eax = width + 31
sar eax, 5 ; eax = (width + 31) >> 5
sal eax, 2 ; eax = ((width + 31) >> 5) << 2
mul DWORD [ebp+16] ; eax * y
mov edx, [ebp+12] ; edx = x
sar edx, 3 ; edx = x>>3
add eax, edx ; eax = ((width + 31) >> 5) << 2 * y + (x >> 3)
mov edx, [ecx+8] ; edx = &pImg
add eax, edx
mov eax, [eax]
pop ebp ; epilogue
ret
To check whether both functions compute the same address, I changed the C function to return pPix and commented out the line mov eax, [eax] in the x86 version. To my surprise, both functions returned the same number. In the unchanged form (as in the code above), however, the x86 function always returns -1 for some reason. Is return *pPix not equivalent to mov eax, [eax]? What is wrong with my reasoning?
imgInfo struct
typedef struct
{
int width, height;
unsigned char* pImg; //buffer
int cX, cY;
int col;
} imgInfo;
load_byte C declaration
extern int load_byte(imgInfo* pInfo, int x, int y);
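One difference worth checking between the two versions is the width of the final load. The following is a minimal C sketch, not code from the question: read_as_byte and read_as_dword are hypothetical helper names. It assumes that return *pPix reads a single unsigned char and widens it to int, whereas mov eax, [eax] loads four bytes starting at the same address.

```c
#include <stdint.h>
#include <string.h>

/* Reads one byte and zero-extends it to int, which is what
   `return *pPix;` does for an unsigned char pointer. */
int read_as_byte(const unsigned char *p)
{
    return *p;
}

/* Reads four bytes starting at p, which is what `mov eax, [eax]` does.
   memcpy is used so the read is valid for any alignment. */
int32_t read_as_dword(const unsigned char *p)
{
    int32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

With four consecutive 0xFF bytes (for example a run of set bits in a 1-bit-per-pixel bitmap, which is an assumption about the image), read_as_byte yields 255 while read_as_dword yields -1. In NASM, a zero-extending single-byte load can be written as movzx eax, byte [eax].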

Resources