How to reverse an array recursively using RISC-V? - recursion

I really have no idea how to do this, and the only solution I found online seems to be behind a paywall. This is all I have:
.data
array: .word 1 2 3 4 5 6 7
.text
main:
la x12, array
addi x13, x0, 7
addi x13, x13, -1
slli x13, x13, 2
add x13, x13, x12
jal x1, reverse
beq x0, x0, END
##### the function would start here #####
reverse: jalr x0, 0(x1)
##### and end here #####
END: add x1, x0, x10
The procedure receives two memory locations, the start position and the end position of the array.
My guess was something like using srli, but I'm 99.9% sure it's wrong.
(this is not for a college assignment btw, just an unnecessarily long revision exercise that's been haunting me for a week and now my brain freezes every time I look at a single line of assembly code)
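
For reference, the procedure described above (a start position and an end position) can be sketched in C before translating it to assembly. This is only an illustration under assumptions: the name `reverse` matches the label in the question, but the parameter names are made up, and `end` is assumed to point at the last element, which is what x13 holds after the setup code in main.

```c
/* Sketch: swap the elements at both ends, then recurse on the inner range.
   `start` plays the role of x12 and `end` the role of x13 (assumed mapping). */
void reverse(int *start, int *end)
{
    if (start >= end)          /* zero or one element left: nothing to swap */
        return;

    int tmp = *start;          /* lw / lw / sw / sw in assembly */
    *start = *end;
    *end = tmp;

    reverse(start + 1, end - 1);   /* pointers move by one word (4 bytes) */
}
```

In assembly, `start + 1` and `end - 1` become `addi x12, x12, 4` and `addi x13, x13, -4`. Since a real `jal x1, reverse` overwrites x1, the function would still need to save and restore x1 on the stack around the recursive call.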

Related

The program returns error "attempt to execute non-instruction at 0x00000000"

I have been given homework on RISC-V.
The task is to solve the recurrence T(n) = 2T(n/2) + n when the input n is >= 2; otherwise it returns 1. I have written a solution, but it keeps failing with "(error) attempt to execute non-instruction at 0x00000000". Can someone please tell me where my mistake is and how to fix it?
Thank you for your time!
Note: I can only write code starting from the "write your recursive code here" comment.
.globl __start
.rodata
msg_input: .string "Enter a number: "
msg_result: .string "The result is: "
newline: .string "\n"
.text
__start:
# prints msg_input
li a0, 4
la a1, msg_input
ecall
# read from standard input
li a0, 5
ecall
################################################################################
# write your recursive code here, input is in a0, store the result(integer type) to t0
jal findsum
findsum:
li t0, 2 #t0==2
blt a0, t0, L1 #if n<2 return 1
addi sp, sp, -8 #reserve stack area
sw ra, 0(sp) #save return address
sw a0, 4(sp) #save input
li t0, 2 #t0==2
div a0, a0, t0 #n=n/2
jal findsum #call findsum(n/2)
#a1=FindSum(n/2)
li t0, 2 #t0=2
mul a1, t0, a1 #a1=2*FindSum(n/2)
addi a1, a1, 2 #a1=2*FindSum(n/2)+2
j done
L1:
li a1, 1
done:
lw ra, 0(sp)
addi sp, sp, 8
jr ra
################################################################################
result:
# prints msg_result
li a0, 4
la a1, msg_result
ecall
# prints the result in t0
li a0, 1
mv a1, t0
ecall
# ends the program with status code 0
li a0, 10
ecall
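
As a reference point for the assembly above, here is a minimal C sketch of the recurrence from the question. The function name `T` is only illustrative, and integer division is assumed for n/2.

```c
/* T(n) = 2*T(n/2) + n for n >= 2, otherwise 1 (integer division assumed). */
int T(int n)
{
    if (n < 2)
        return 1;              /* base case: no frame needs to be saved */
    return 2 * T(n / 2) + n;   /* n is still needed after the call returns */
}
```

Two things stand out when comparing this with the posted code. The `+ n` term needs the original n after the recursive call, which is why the saved input has to be reloaded from the stack. The base case also returns without touching any saved state, whereas in the assembly the L1 path falls through into `done` and pops a frame that was never pushed. That is one likely way for `ra` to end up as 0.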

"(error) attempting to write to an invalid memory address" When trying to store stack pointer

I'm trying to learn RISC-V in the Jupiter environment (risc32), and I came across a problem asking me to write a recursive program in RISC-V. I can't get the sw instruction to work; it always fails with an invalid address error.
I've tried different offsets, different registers, etc., but nothing seems to work.
.globl __start
.rodata
msg_input: .string "Enter a number: "
msg_result: .string "The result is: "
newline: .string "\n"
.text
__start:
# prints msg_input
li a0, 4
la a1, msg_input
ecall
#read from standard input
li a0, 5
ecall
#initialize stack
addi x31, x0, 2
addi sp, x0, 800
mv x5, a0
jal x1, recfunc
mv t0, x5
recfunc:
addi sp, sp, -8
sw x1, 0(sp)
bge x5, x31, true
lw x1, 0(sp)
addi x10, x0, 1
addi sp,sp, 8
jalr x0, 0(x1)
true:
div x5, x5, x31
jal x1, recfunc
lw x1, 0(sp)
addi sp,sp,8
mul x10, x10, x31
addi x10, x10, 1
jalr x0, 0(x1)
result:
#prints msg_result
li a0, 4
la a1 msg_result
ecall
#prints the result in t0
li a0, 1
mv a1, t0
ecall
#ends the program with status code 0
li x5, 10
ecall
Error occurs at:
sw x1, 0 (x2)
(error) attempting to write to an invalid memory address 0x00000318

Recursive program in RISC-V assembly

I am trying to create a recursive program in RISC-V but I can't get it to get me the right result. It looks like it is calling itself only two times max, but I tried running it on paper and everything seems correct:
addi x31, x0, 4
addi x30, x0, 2
addi x2, x0, 1600 //initialize the stack to 1600, x2= stackpointer
ecall x5, x0, 5 //read the input to x5
jal x1, rec_func
ecall x0, x10, 2 //print the result
beq x0, x0, end
rec_func:
addi x2, x2, -16 //make room in stack
sd x1, 0(x2) //store pointer and result in stack
sd x10, 8(x2)
bge x5, x31, true // if i > 3, then go to true branch
addi x10, x0, 1 // if i <= 3, then return 1
addi x2, x2, 16 // reset stack point
jalr x0, 0(x1)
true:
addi x5, x5, -2 // compute i-2
jal x1, rec_func // call recursive func for i-2
ld x1, 0(x2) // load the return address
ld x10, 8(x2) // load result from last function call
addi x2, x2, 16 // reset stack point
mul x10, x10, x30 // multiply by 2
addi x10, x10, 1 // add 1
jalr x0, 0(x1) // return
end:
This is the original program logic:
if i<= 3 return 1
else return 2 * rec_func(i-2) +1
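
Written out as C, the logic above would look roughly like this. It is a sketch for checking expected results, not part of the original post:

```c
/* Reference version of the logic the assembly is meant to implement. */
int rec_func(int i)
{
    if (i <= 3)
        return 1;
    return 2 * rec_func(i - 2) + 1;
}
```

For example, `rec_func(4)` is 3, `rec_func(6)` is 7, and `rec_func(8)` is 15, which gives concrete values to compare against what the simulator prints.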
I don't have enough reputation to add a comment but have you tried running this with a debugger (GDB ?) instead of on paper? That should show what's actually in the registers and why it's not branching as you might expect. I'm not familiar enough with these instructions (learning x86 assembly) to figure the source out at the moment.
I got it working now. The changes I made were the following:
In the part where I was returning 1, I wasn't reloading the return address from the stack before returning.
I removed the store of x10 to memory, since, as @Michael pointed out, x10 holds the return value and doesn't need to be saved.
The final code looks like this:
addi x31, x0, 4
addi x30, x0, 2
addi x2, x0, 1600 // initialize the stack to 1600, x2= stackpointer
ecall x5, x0, 5 // read the input to x5
jal x1, rec_func
ecall x0, x10, 2 // print the result
beq x0, x0, end
rec_func:
addi x2, x2, -8 // make room in stack
sd x1, 0(x2) // save the return address on the stack
bge x5, x31, true // if i > 3, then go to true branch
ld x1, 0(x2)
addi x10, x0, 1 // if i <= 3, then return 1
addi x2, x2, 8 // reset stack point
jalr x0, 0(x1)
true:
addi x5, x5, -2 // compute i-2
jal x1, rec_func // call recursive func for i-2
ld x1, 0(x2) // load the return address
addi x2, x2, 8 // reset stack point
mul x10, x10, x30 // multiply by 2
addi x10, x10, 1 // add 1
jalr x0, 0(x1) // return
end:

Recursive Fibonacci in ARM

I'm trying to convert this recursive Fibonacci code to ARM assembly language. I'm new to this and not really sure how to do it. I have some code snippets of things that I've played with below.
int Fib(int n) {
if (n == 0 || n == 1) return 1;
else return Fib(n-2) + Fib(n-1);
}
Here is my attempt so far:
RO = 1
CMP RO #1
BGT P2
MOV R7 #1
B END
P2:
END LDR LR [SO,#0]
ADD SP SP, #8
MOV PC, LR
Help would be much appreciated
To avoid spoon-feeding, I wrote a LEGv8 program that finds the Fibonacci sequence using recursion. LEGv8 differs slightly from ARMv8, but the algorithm is the same.
Please review the code, and change the commands / registers to their corresponding values in ARMv8.
I assumed that n (the range of the Fibonacci sequence) is stored in register X19.
I also assumed that we ought to store the Fibonacci sequence in an array, which has its base address stored in X20.
MOV X17, XZR // keep (previous) 0 in X17 for further use
ADDI X18, XZR, #1 // keep (Current) 1 in X18 for further use
ADDI X9, XZR, #0 // Assuming i = 0 is in register X9
fibo:
SUBI SP, SP, #24 // Adjust stack pointer for 3 items
STUR LR, [SP, #16] // save the return address
STUR X17, [SP, #8] //save content of X17 on the stack
STUR X18, [SP, #0] //save content of X18 on the stack
SUBS X10, X9, X19 // test for i==n
CBNZ X10, L1 // If i not equal to n, go to L1
MOV X6, XZR // keep 0 on X6
ADDI X5, XZR, #1 // keep 1 on X5
ADDI X2, X9, #1 // X2 = X9 + 1, kept for further use
STUR X6, [X20,#0] //store 0 in the array
STUR X5, [X20, #8] //store 1 in the array
ADDI SP, SP, #24 // pop 3 items from the stack
BR LR // return to the caller
L1:
ADD X16, X17, X18 // Next_Num = previous + Current
MOV X17, X18 // Previous = Current
MOV X18, X16 // Current= Next_Num
ADDI X9, X9, #1 // i++
BL fibo // call fibo
LDUR X18, [SP, #0] // return from BL; restore current
LDUR X17, [SP, #8] // restore previous
LDUR LR, [SP, #16] // restore the return address
ADDI SP, SP, #24 // adjust stack pointer to pop 3 items
ADD X7, X18, X17 // keep (previous + current) value on register X7
LSL X2, X2, #3 // Multiplying by 8 for offset
ADD X12, X20, X2 // address of the array increase by 8
STUR X7, [X12, #0] // store (previous + current) value on the array
SUBI X2, X2, #1 // X2 decreased by 1
BR LR // return
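
The idea described in this answer (carry the previous and current terms along and store the sequence into an array) can be sketched in C as follows. This is not a translation of the LEGv8 listing. The function name, parameter names and the `long` element type (to match 8-byte array slots) are assumptions made for illustration.

```c
/* Sketch: recursively fill out[0..n] with 0, 1, 1, 2, 3, 5, ...
   prev/cur correspond to the "previous" and "current" registers. */
void fibo(long *out, int i, int n, long prev, long cur)
{
    if (i > n)                 /* past the requested range: stop */
        return;
    out[i] = prev;             /* store the i-th term in the array */
    fibo(out, i + 1, n, cur, prev + cur);   /* shift the pair forward */
}
```

A call such as `fibo(out, 0, n, 0, 1)` starts from the same initial pair as the answer (0 and 1). This numbering differs from the question's `Fib`, which returns 1 for both n == 0 and n == 1.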

Recursive Procedure Whose Calls Resemble a Binary Tree in MIPS

I'm working on an assignment and am having difficulty understanding how to properly code the following problem, given here in C, in MIPS assembly.
int choose(int n, int k){
if (k == 0) {
return 1;
} else if (n == k) {
return 1;
} else if (n < k) {
return 0;
} else {
return choose(n-1, k-1) + choose(n-1, k);
}
}
My thoughts were to use three registers for storing the values onto the stack with each call $s0, $s1, $s2, where $s0 will contain the value of updated n; $s1 would maintain the value of k; and $s2 would hold the value of k in the second choose(n-1, k) since that value will only decrease when the parent call changes it. The reason I chose this is because the value of k isn't subtracted from each call in this one, it should be the same until the parent decrements it in a previous call.
Here is the Choose procedure that I'm trying to do. Problem is that I'm not getting the correct answer, of course.
Choose:
#store current values onto stack
addi $sp, $sp, -16
sw $ra, 0($sp)
sw $s0, 4($sp)
sw $s1, 8($sp)
sw $s2, 12($sp)
#check values meet criteria to add to $v0
beq $s1, $0, one
beq $s0, $s1, one
blt $s0, $s1, zero
beq $s2, $0, one
#no branches passed so decrement values of n and k
subi $s0, $s0, 1
subi $s1, $s1, 1
#move values of registers to $a's for argument passing
move $a0, $s0
move $a1, $s1
jal Choose #call procedure again
#this is where I'm having my problems
#not sure how to loop the procedure to get
#the second half of the equation Choose(n-1,k)
#which is the reason for the next 2 lines of code
move $a2, $s2
jal Choose
add $v0, $v0, $v1
j Choose_Exit
#add one to $v1 from branch passed
one:
addi $v1, $v1, 1
j Choose_Exit
#branch returns 0
zero:
addi $v1, $v1, 0
j Choose_Exit
#return values to caller from stack
Choose_Exit:
lw $s2, 12($sp)
lw $s1, 8($sp)
lw $s0, 4($sp)
lw $ra, 0($sp)
addi $sp, $sp, 16
jr $ra
So I'm having a problem understanding how to properly implement this recursive procedure twice to add them together. I can understand how to create a recursive procedure in MIPS to perform a factorial, since that is always the definition of recursion for any language. But using recursion with differing arguments and then add them all together is confusing me to no end.
When written out on paper, I understand that this procedure can be represented by a binary tree of parents and children. The parent being the single function Choose(n,k) and the children being Choose(n-1, k-1) + Choose(n-1, k) and once one of the leaf children branches from the if statement, it passes a digit to the parent who will wait for the other callee portion of the addition to return its value, etc etc etc.
Any help to point me in the correct direction as to what I'm doing wrong with my approach would be great. I understand the beginning, I understand the end, just need some assistance to help understand the most important part of the middle.
You were pretty close.
You established your stack frame with four words for: return address, arg1, arg2, and save for return value.
Your main snag was that after the first call to your function, you have to save the $v0 onto the stack [as Margaret mentioned above].
Here's some code that I believe will work. It is very similar to yours, but I wrote it from scratch. It has the correct "push"/"pop" of the first call's return value.
I did add one small optimization for the early escape [non-recursive] cases: they omit creating the stack frame.
Anyway, here it is:
##+
# int
# choose(int n, int k)
# {
#
# if (k == 0)
# return 1;
#
# if (n == k)
# return 1;
#
# if (n < k)
# return 0;
#
# return choose(n - 1,k - 1) + choose(n - 1,k);
# }
##-
.text
# choose -- choose
#
# RETURNS:
# v0 -- return value
#
# arguments:
# a0 -- n
# a1 -- k
#
# registers:
# t0 -- temp for 1st return value
choose:
beqz $a1,choose_one # k == 0? if yes, fly
beq $a0,$a1,choose_one # n == k? if yes, fly
blt $a0,$a1,choose_zero # n < k? if yes, fly
# establish stack frame (preserve ra/a0/a1 + space for v0)
sub $sp,$sp,16
sw $ra,12($sp)
sw $a0,8($sp)
sw $a1,4($sp)
addi $a0,$a0,-1 # get n - 1 (common to both calls)
# choose(n - 1,k - 1)
addi $a1,$a1,-1 # get k - 1
jal choose
sw $v0,0($sp) # save 1st return value (on _stack_)
# choose(n - 1,k)
addi $a1,$a1,1 # get k (from k - 1)
jal choose
lw $t0,0($sp) # "pop" first return value from stack
add $v0,$t0,$v0 # sum 1st and 2nd values
# restore from stack frame
lw $ra,12($sp)
lw $a0,8($sp)
lw $a1,4($sp)
add $sp,$sp,16
jr $ra # return
choose_one:
li $v0,1
jr $ra
choose_zero:
li $v0,0
jr $ra
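
To see why the first return value has to be saved, it can help to model $v0 as a single shared variable in C. The sketch below is only an illustration of the listing above: `v0`, `saved` and `choose_asm_like` are made-up names, and the local variable `saved` stands in for the stack slot at 0($sp).

```c
static int v0;   /* stands in for the $v0 register, shared by every call */

static void choose_asm_like(int n, int k)
{
    if (k == 0 || n == k) { v0 = 1; return; }   /* choose_one */
    if (n < k)            { v0 = 0; return; }   /* choose_zero */

    choose_asm_like(n - 1, k - 1);
    int saved = v0;          /* sw $v0,0($sp): the next call overwrites v0 */
    choose_asm_like(n - 1, k);
    v0 = saved + v0;         /* lw $t0,0($sp); add $v0,$t0,$v0 */
}

int choose(int n, int k)
{
    choose_asm_like(n, k);
    return v0;
}
```

Without the `saved` copy, the second call would overwrite the only place the first result lives, which matches the snag in the original attempt.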
UPDATE:
First off, I like how you noted the procedure as you did before you called it. I'm going to steal that!
Be my guest! It's from many years of writing asm. For a primer on my thoughts about how to write asm well, see my answer: MIPS linked list
I've tried this and it works. I need to experiment with your code to understand why the stack is manipulated when it is (always thought it had to be at the very beginning and end of a proc).
Normally, the stack frame is established at the proc start and restored from at the proc end. Your code for handling the "quick escape" [non-recursive] cases was correct, based on having already established the frame.
This was just a small optimization. It works because MIPS has so many registers that small functions often don't need a stack frame at all, particularly if the function is a "leaf" or "tail" (i.e. it doesn't call any other function).
For smaller [non-recursive] functions, sometimes we can get away with a one word stack frame that just preserves $ra (e.g.): fncA calls fncB, but fncB is a leaf. fncA needs a frame but fncB does not. In fact, if we control both functions and we know that fncB does not modify a given temp register (e.g. $t9), we can save the return address there instead of creating a stack frame in fncA:
fncA:
move $t9,$ra # preserve return address
jal fncB # call fncB
jr $t9 # return
fncB:
# do stuff ...
jr $ra # return
Ordinarily, we couldn't rely upon fncB preserving $t9 because, according to the mips ABI, fncB is at liberty to modify/trash any register that is not $sp or $s0-$s7. But, if we craft the functions such that we consider fncB to be "private" to fncA (e.g. like a C static function that only fncA has access to), we can do whatever we want.
Believe it or not, fncA above is ABI conforming.
A given callee (e.g. fncA) does not need to preserve $ra for [the sake of] its caller, just for itself. And, what is important is the value inside $ra, not the specific register. A callee only needs to preserve $s0-$s7, ensure that $sp has the same value at exit as entry, and that it returns to the correct address in caller [which the jr $t9 does--because it has the value that was in $ra when fncA was called].
I like your use of the temp register.
An extra register is required because, in mips, we can not do arithmetic operations from memory operands. mips can only do lw/sw. (i.e.) There is no such thing as:
add $v0,$v0,0($sp)
I used $t0 for simplicity/clarity because, when you need a temp reg, $t0-$t9 are the usual ones to use. The code "reads better" when using $t0. But, this is just a convention.
In the mips ABI, $a0-$a3 can be modified, as can $v1 as only $s0-$s7 need to be preserved. And, "modification" means that they can be used to hold any value or used for any purpose.
In the above link, note that strlen increments $a0 directly to find the end of the string. It is using $a0 for a useful purpose, but, as far as the caller to it is concerned, $a0 is being "trashed" [by strlen]. This usage is ABI conforming.
In choose, I could have used just about any register: $v1, $a2-$a3 instead of $t0. In fact, at that particular point in choose, $a0 is no longer needed, so it could have been used in place of $t0. Although for choose, we are non-ABI conforming (because we save/restore $a0-$a1), this would work in choose, because we restore the original value of $a0 from the function epilog [stack frame pop], preserving the recursive nature of the function.
As I said, $t0-$t9 are the usual registers to use for scratch space. But, I've written functions that use all 10 of them, and still needed more (e.g. drawing into a frame buffer using the Bresenham circle algorithm). $v0-$v1 and $a0-$a3 can be used as temp regs to get an additional 6. If necessary, $s0-$s7 can be preserved in the stack frame, solely to free them up to use as more temp regs.
Disclaimer: I rewrote the assembly code WITHOUT checking it.
There are delay slots to consider, so it is better to use them deliberately and make them explicit than to end up with implicit nop instructions after every branch.
Reverse the order of the calls to choose(n - 1,k - 1) and choose(n - 1,k) for smarter usage of $a0 and $a1 and of the stack.
Restrict stack usage to the call to choose(n - 1,k) and use a tail call for choose(n - 1,k - 1).
It makes more sense to stack the value of a0-1 and not a0.
We accumulate everything into $v0 instead of stacking it. We keep $t0 as the local result to add to $v0 because it is cheap, although it could be dropped by updating $v0 directly.
As a result, the overall changes should make the icache happier with fewer instructions and the dcache happier with less stack space.
The assembly code:
##+
# int
# choose(int n, int k)
# {
#
# if (k == 0)
# return 1;
#
# if (n == k)
# return 1;
#
# if (n < k)
# return 0;
#
# return choose(n - 1,k - 1) + choose(n - 1,k);
# }
##-
.text
# choose -- choose
#
# RETURNS:
# v0 -- return value
#
# arguments:
# a0 -- n
# a1 -- k
#
# registers:
# t0 -- temp for local result to accumulate into v0
choose:
j choose_rec
addiu $v0,$zero,0 # branch delay-slot: initialize $v0 to 0 before calling
choose_rec:
beqz $a1,choose_one # k == 0? if yes, fly
addiu $t0,$zero,1 # branch delay-slot with $t0 = 1
beq $a0,$a1,choose_one # n == k? if yes, fly
nop # branch delay-slot with $t0 = 1 already done
blt $a0,$a1,choose_zero # n < k? if yes, fly
addiu $t0,$zero,0 # branch delay-slot with $t0 = 0
# establish stack frame (preserve ra/a0/a1)
sub $sp,$sp,12
addiu $a0,$a0,-1 # get n - 1 (common to both calls)
sw $ra,8($sp)
sw $a0,4($sp)
jal choose_rec # choose(n - 1,k)
sw $a1,0($sp)
# restore from stack frame
lw $ra,8($sp)
lw $a0,4($sp)
lw $a1,0($sp)
add $sp,$sp,12
# choose(n - 1,k - 1)
j choose_rec # tail call: jump instead of call
addi $a1,$a1,-1 # get k - 1
choose_one:
choose_zero:
jr $ra # return
addu $v0,$v0,$t0 # branch delay-slot with $v0 += $t0
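
The accumulate-and-tail-call structure of this second answer can be sketched in C as follows. It is an illustration under assumptions, not a checked translation: `choose_rec` mirrors the label in the listing, while `choose_acc` and the `acc` pointer (standing in for $v0) are made-up names.

```c
/* Each leaf adds its 0 or 1 to the accumulator; only choose(n - 1, k)
   is a real call, and choose(n - 1, k - 1) is handled by looping. */
static void choose_rec(int n, int k, int *acc)
{
    for (;;) {
        if (k == 0 || n == k) { *acc += 1; return; }  /* choose_one */
        if (n < k)            { return; }             /* choose_zero: adds 0 */

        n -= 1;                    /* n - 1 is common to both calls */
        choose_rec(n, k, acc);     /* real call: choose(n - 1, k) */
        k -= 1;                    /* tail call: continue as choose(n - 1, k - 1) */
    }
}

int choose_acc(int n, int k)
{
    int acc = 0;                   /* $v0 initialized to 0 before calling */
    choose_rec(n, k, &acc);
    return acc;
}
```

Because the second recursive step becomes a loop iteration, no frame is needed for it, which is the stack saving this answer is after.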

Resources