Recursive Fibonacci ARM Assembly - recursion

Edit: I have removed my code as I do not want to get caught for cheating on my assignment. I will repost the code once my assignment has been submitted. I apologize for posting it on Stack Overflow; I just had nowhere else to go for help. Please respect my edit to remove the code. I have tried deleting the question, but it will not let me, as I need to request it.
[MIPS code I was trying to follow][1]
[C Code I was trying to follow][2]
I am trying to convert recursive Fibonacci code into ARM assembly but I am running into issues. When running my ARM assembly, the final value of the sum is 5 when it should be 2. It seems as though my code loops, but maybe one too many times. Any help would be much appreciated as I am new to this.

This is what your code is doing, and below is a test run. This simply isn't the usual recursive Fibonacci.
#include <stdio.h>
void f ( int );
int R2 = 0; /* models register R2 from the assembly */
int main () {
    for ( int i = 0; i < 10; i++ ) {
        R2 = 0;
        f ( i );
        printf ( "f ( %d ) = %d\n", i, R2 );
    }
}
void f ( int n ) {
    if ( n == 0 ) { R2 += 0; return; }
    if ( n == 1 ) { R2 += 1; return; }
    f ( n-1 );
    f ( n-2 );
    R2 += n-1;
}
f ( 0 ) = 0
f ( 1 ) = 1
f ( 2 ) = 2
f ( 3 ) = 5
f ( 4 ) = 10
f ( 5 ) = 19
f ( 6 ) = 34
f ( 7 ) = 59
f ( 8 ) = 100
f ( 9 ) = 167
Either you started with a broken Fibonacci algorithm, or substantially changed it going to assembly.  I don't know how this can be fixed, except by following a working algorithm.

Note that in the C code the only addition is in fib(n-1) + fib(n-2). In particular, the special cases just do return 0; and return 1; respectively. Thus the lines in your else branches that add 0 or 1 to the sum are wrong. You should replace those additions with moves.
Also, you do MOV R1, R0 //copy fib(n-1) which is incorrect because the fib(n-1) has been returned in R2 not R0. That should be MOV R1, R2.
With these changes the code works, even if it is slightly non-standard.
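For reference, the shape of the C algorithm being translated can be sketched as follows. This is a minimal sketch, not the asker's removed code; the function name fib and the assert guard are assumptions made for illustration.

```cpp
#include <cassert>

// Plain recursive Fibonacci. The base cases *return* 0 or 1 (a move in
// assembly terms); the only addition is fib(n-1) + fib(n-2).
int fib(int n) {
    assert(n >= 0);      // negative inputs are outside the scope of this sketch
    if (n == 0) return 0;
    if (n == 1) return 1;
    int a = fib(n - 1);  // keep this result: the second call reuses the return register
    int b = fib(n - 2);
    return a + b;
}
```

The local variable a plays the role of the register copy discussed above: the result of fib(n-1) has to be read from wherever the function returns its value and saved before the second call overwrites it.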

Climbing Stairs Problem ( Access of element in vector )

Below is a code for the problem of CLIMBING STAIRS https://leetcode.com/problems/climbing-stairs/
class Solution {
public:
int climbStairs(int n) {
vector<int> dp(n,0);
dp[0] = 1;
dp[1] = 2;
for(int i=2;i<n;i++){
dp[i] = dp[i-2]+dp[i-1];
}
return dp[n-1];
}
};
The code gives a runtime error: heap buffer overflow.
Looking at the code, if n==1 the code should return dp[n-1], i.e. dp[0],
but that does not seem to be the case.
I'm guessing the issue may be related to how elements of the vector are accessed.
Can anyone please explain what the issue could be here?
"if n==1 the code should return dp[n-1] i.e. dp[0] , but that does not seem to be the case."
Yes.
But when n == 1, you execute
dp[1] = 2;
so you access the second element when the vector has only one element.
And what about the case n <= 0?
So, maybe
int climbStairs(int n) {
    if ( 0 >= n ) {
        return ???;
    } else if ( 1 == n ) {
        return 1;
    } else {
        vector<int> dp(n,0);
        dp[0] = 1;
        dp[1] = 2;
        for(int i=2;i<n;i++){
            dp[i] = dp[i-2]+dp[i-1];
        }
        return dp[n-1];
    }
}
The problem states the constraints are
1 <= n <= 45
You're going out of range when n is 1 (i.e. you only have dp[0] in that scenario).
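A minimal sketch of the guarded version, assuming the stated constraint 1 <= n <= 45 holds (so the n <= 0 case is deliberately not handled) and written as a free function rather than inside the Solution class:

```cpp
#include <vector>

// Assumes 1 <= n <= 45, as the problem statement guarantees.
int climbStairs(int n) {
    if (n == 1) {
        return 1;               // return before dp[1] would be written
    }
    std::vector<int> dp(n, 0);  // n >= 2 here, so dp[0] and dp[1] both exist
    dp[0] = 1;
    dp[1] = 2;
    for (int i = 2; i < n; ++i) {
        dp[i] = dp[i - 2] + dp[i - 1];
    }
    return dp[n - 1];
}
```

With the early return in place, the vector is only created when it has at least two elements, so the write to dp[1] stays in bounds.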

Is there an Intel SIMD comparison function that returns 0 or 1 instead of 0 or 0xFFFFFFFF?

I'm currently using the intel SIMD function: _mm_cmplt_ps( V1, V2 ).
The function returns a vector containing the result of each component test, based on whether each component of V1 is less than the corresponding component of V2. For example:
XMVECTOR Result;
Result.x = (V1.x < V2.x) ? 0xFFFFFFFF : 0;
Result.y = (V1.y < V2.y) ? 0xFFFFFFFF : 0;
Result.z = (V1.z < V2.z) ? 0xFFFFFFFF : 0;
Result.w = (V1.w < V2.w) ? 0xFFFFFFFF : 0;
return Result;
However, is there a function like this that returns 1 or 0 instead? I am looking for a function that uses SIMD and no workarounds, because it is supposed to be optimized and vectorized.
You can write that function yourself. It’s only 2 instructions:
// 1.0 for lanes where a < b, zero otherwise
inline __m128 compareLessThan_01( __m128 a, __m128 b )
{
    const __m128 cmp = _mm_cmplt_ps( a, b );
    return _mm_and_ps( cmp, _mm_set1_ps( 1.0f ) );
}
Here’s a more generic version which returns either of 2 values. It requires SSE 4.1, which is almost universally available by now (97.94% of users). If you have to support SSE2-only hardware, emulate the blend with _mm_and_ps, _mm_andnot_ps, and _mm_or_ps.
// y for lanes where a < b, x otherwise
inline __m128 compareLessThan_xy( __m128 a, __m128 b, float x, float y )
{
    const __m128 cmp = _mm_cmplt_ps( a, b );
    return _mm_blendv_ps( _mm_set1_ps( x ), _mm_set1_ps( y ), cmp );
}
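The SSE2-only emulation mentioned above could look roughly like this. It is a sketch under the assumption that the mask lanes are all-ones or all-zeros, which is what _mm_cmplt_ps produces; the names compareLessThan_xy_sse and lane are hypothetical and not part of any library.

```cpp
#include <xmmintrin.h>  // _mm_cmplt_ps, _mm_and_ps, _mm_andnot_ps, _mm_or_ps

// y for lanes where a < b, x otherwise, without _mm_blendv_ps
inline __m128 compareLessThan_xy_sse( __m128 a, __m128 b, float x, float y )
{
    const __m128 cmp = _mm_cmplt_ps( a, b );
    const __m128 takeY = _mm_and_ps( cmp, _mm_set1_ps( y ) );     // y where the mask is set
    const __m128 takeX = _mm_andnot_ps( cmp, _mm_set1_ps( x ) );  // x where the mask is clear
    return _mm_or_ps( takeY, takeX );
}

// Hypothetical helper for inspecting a result: copies lane i out of the vector
inline float lane( __m128 v, int i )
{
    float out[4];
    _mm_storeu_ps( out, v );
    return out[i];
}
```

Note that _mm_andnot_ps( m, v ) computes (~m) & v, so the mask goes in the first argument.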
The DirectXMath no-intrinsics version of _mm_cmplt_ps is actually:
XMVECTORU32 Control = { { {
    (V1.vector4_f32[0] < V2.vector4_f32[0]) ? 0xFFFFFFFF : 0,
    (V1.vector4_f32[1] < V2.vector4_f32[1]) ? 0xFFFFFFFF : 0,
    (V1.vector4_f32[2] < V2.vector4_f32[2]) ? 0xFFFFFFFF : 0,
    (V1.vector4_f32[3] < V2.vector4_f32[3]) ? 0xFFFFFFFF : 0
} } };
return Control.v;
XMVECTOR is the same as __m128 which is 4 floats so it needs the alias to make sure it's writing integers.
I use _mm_movemask_ps for the "Control Register" version of DirectXMath functions. It just collects the top-most bit of each SIMD value.
int result = _mm_movemask_ps(_mm_cmplt_ps( V1, V2 ));
The lower nibble of result will contain bit patterns. A 1 bit for each value that passes the test, and a 0 bit for each value that fails the test. This could be used to reconstruct 1 vs. 0.
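One way the reconstruction could be done is to peel the four bits back out of the mask. This is only a sketch; the helper name lessThanBits is made up for illustration, and it returns scalar ints rather than a SIMD vector.

```cpp
#include <array>
#include <xmmintrin.h>  // _mm_cmplt_ps, _mm_movemask_ps

// Hypothetical helper: 1 for lanes where v1 < v2, 0 otherwise, as plain ints
inline std::array<int, 4> lessThanBits( __m128 v1, __m128 v2 )
{
    // Bit i of the mask is the sign bit of lane i of the comparison result.
    const int mask = _mm_movemask_ps( _mm_cmplt_ps( v1, v2 ) );
    return { mask & 1, (mask >> 1) & 1, (mask >> 2) & 1, (mask >> 3) & 1 };
}
```

This leaves the SIMD domain, so it suits branching on the result better than feeding the 0/1 values into further vector arithmetic.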

What is the reason for an FREESXP node error in Rcpp?

I am using the two R packages 'tidyverse' and 'Rcpp' to execute a C++ function within 'mutate' on a tibble object.
I get the following error:
Error in mutate_impl(.data, dots) :
Evaluation error: GC encountered a node (0x112b8d800) with an unknown SEXP type: FREESXP at memory.c:1013.
I tried to use valgrind on it, but valgrind gives me an error without even executing, and I somehow can't get this fixed on my computer. So I would like to ask whether other people get the same error and might have a solution to it.
Here is an example code to be executed:
# load necessary packages
library( tidyverse )
library( Rcpp )
# define C++ function inline
cppFunction( '
IntegerVector lee_ready_vector( NumericVector & price, NumericVector &bidprice,
                                NumericVector &askprice ) {
    const int nrows = price.length();
    IntegerVector indicator( nrows );
    if ( nrows < 3 ) {
        return indicator;
    }
    if ( nrows != bidprice.length() || nrows != askprice.length() ) {
        throw std::invalid_argument( "Arguments differ in lengths" );
    }
    NumericVector midprice = ( askprice + bidprice ) / 2.0;
    try {
        for( int i = 2; i <= nrows; ++i ) {
            if ( price[i] == askprice[i] ) {
                indicator[i] = 1;
            } else if ( price[i] == bidprice[i] ) {
                indicator[i] = -1;
            } else {
                if ( price[i] > midprice[i] ) {
                    indicator[i] = 1;
                } else if ( price[i] < midprice[i] ) {
                    indicator[i] = -1;
                } else {
                    /* price == midprice */
                    if ( price[i] > price[i-1] ) {
                        indicator[i] = 1;
                    } else if ( price[i] < price[i-1] ) {
                        indicator[i] = -1;
                    } else {
                        if ( price[i] > price[i-2] ) {
                            indicator[i] = 1;
                        } else {
                            indicator[i] = -1;
                        }
                    }
                }
            }
        }
    } catch ( std::exception &ex ) {
        forward_exception_to_r( ex );
    } catch (...) {
        ::Rf_error( "c++ exception (unknown reason)" );
    }
    return indicator;
}')
# define function for random dates inline
latemail <- function( N, st="2012/01/01", et="2012/03/31" ) {
    st <- as.POSIXct( as.Date( st ) )
    et <- as.POSIXct( as.Date( et ) )
    dt <- as.numeric( difftime( et,st,unit="sec" ) )
    ev <- sort(runif( N, 0, dt ) )
    rt <- st + ev
    sort( as.Date( rt ) )
}
# set random seed
set.seed( 12345 )
# start test loop
# try 100 times to crash the session
# repeat this whole loop several times, if necessary
for ( i in 1:100 ) {
    # 500,000 observations altogether
    N <- 500000
    dates <- latemail( N )
    mid <- sample(seq(from=8.7, to=9.1, by = 0.01), N, TRUE)
    # bid and ask series lie around the mid series
    bid <- mid - .1
    ask <- mid + .1
    # p is either equal to bid or ask or lies in the middle
    p <- rep( 0, N )
    for(i in 1:2000) {
        p[i] <- sample( c(mid[i], bid[i], ask[i]), 1 )
    }
    # create the dataset
    df <- tibble( dates, p, bid, ask )
    # execute the C++ function on grouped data
    df %>% group_by( dates ) %>%
        mutate( ind = lee_ready_vector( p, bid, ask ) ) %>%
        ungroup()
}
Is anybody able to reproduce the error? Is anyone able to give a solution?
There is a lot going on in your code, and the example is not reproducible which is always a drag. But let's start somewhere:
1. Your loop index in C++ is for( int i = 2; i <= nrows; ++i ) which is very likely wrong. Indices in C and C++ run from 0 to n-1, so you probably want for( int i = 1; i < nrows; ++i ) which allows you to lag once.
2. Your use of inline and cppFunction is outdated. Use Rcpp Attributes instead. Read a recent intro such as the intro vignette from our recent TAS paper. That also frees you from doing the try/catch at the end.
3. Your time conversion is too complicated. Just use anytime::anytime() on the input to get POSIXct.
4. Your lack of indentation does not help. I would write the core part in a proper editor for C++ and maybe include the R snippet after /*** R or have a separate R file.
5. Lee and Ready is nice but not all that predictive.
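To illustrate the indexing point, here is a simplified stand-in for the question's function that uses std::vector instead of the Rcpp types, so it can be compiled on its own. The name leeReady is made up for this sketch, and the start index of 2 is an assumption based on the fact that the loop body reads price[i-2].

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Simplified stand-in for lee_ready_vector (hypothetical, no Rcpp types).
std::vector<int> leeReady(const std::vector<double>& price,
                          const std::vector<double>& bid,
                          const std::vector<double>& ask) {
    const std::size_t n = price.size();
    if (n != bid.size() || n != ask.size()) {
        throw std::invalid_argument("Arguments differ in lengths");
    }
    std::vector<int> indicator(n, 0);
    // Valid indices are 0..n-1. Start at 2 because price[i-2] is read,
    // and stop before n so that price[n] is never touched.
    for (std::size_t i = 2; i < n; ++i) {
        const double mid = (ask[i] + bid[i]) / 2.0;
        if (price[i] == ask[i]) {
            indicator[i] = 1;
        } else if (price[i] == bid[i]) {
            indicator[i] = -1;
        } else if (price[i] > mid) {
            indicator[i] = 1;
        } else if (price[i] < mid) {
            indicator[i] = -1;
        } else if (price[i] > price[i - 1]) {   // price == mid: tick test
            indicator[i] = 1;
        } else if (price[i] < price[i - 1]) {
            indicator[i] = -1;
        } else {
            indicator[i] = (price[i] > price[i - 2]) ? 1 : -1;
        }
    }
    return indicator;
}
```

The original bound i <= nrows reads and writes one element past the end on the last iteration, which is the kind of out-of-bounds write that can corrupt R's heap and surface later as a garbage-collector error.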
Since my last post here, I have tried out the tips Dirk gave above. Isolating the error to specific rows of data turned out to be quite difficult: due to the double grouping of this large dataset and the dependence between rows in the algorithm, I spent a lot of time testing without any success and still had a lot of work to do. At some point I turned to Dirk’s first tip, namely
"Your loop index in C++ is for( int i = 2; i <= nrows; ++i ) which is very likely wrong. Indices in C and C++ run from 0 to n-1, so you probably want for( int i = 1; i < nrows; ++i ) which allows to lag once."
So I recoded my loop so that it is for( int i = 0; i < nrows - 2; ++i ), adjusted the indices of the variables inside the loop accordingly, and the error is gone. So it seems that for some rows, when the last cases in the loop were reached, an indexing error occurred. From now on I will always start my loop at 0. Even though a concrete solution could not be given, this tip has helped me a lot. Thanks again.
To point 2: In my package I actually use Attributes; I wanted to give users here the possibility to just run the script in the console. For the future: what should I do with cpp files here? Just post the code and the file names?
Point 3: This is an interesting package. I used it here and there while searching for the error with sample data, but I hadn't heard of it before. Thanks for mentioning it.
Point 4: I edited this above. My apologies.
Regarding 5, Lee & Ready: in academic research this is still the most widely accepted algorithm for identifying trade direction, and since older papers used it, comparisons with the older literature use the same algorithm. As far as I know, you have been working in quantitative finance for a very long time; what alternative would you suggest, Dirk?

Multidimensional array assignment in Verilog, without a loop?

How can one assign a multi dimensional array if it is a wire, in a single line?
assign TO[W1:0][W2:0] = cntrl ? FROM1[W1:0][W2:0] : FROM2[W1:0][W2:0];
I get a syntax error if I use this.
Is there any other way other than using generate or for loops?
You need to be using SystemVerilog to make aggregate assignments to arrays.
To assign to an unpacked array, use an assignment pattern, written with a tick and braces ('{ and }), provided all the values of the array are assigned.
Usage example:
module top ( input i);
    wire d [0:1][0:3];
    wire a [0:1][0:3]='{ '{1,1,1,1}, '{1,1,1,1} };
    wire b [0:1][0:3]='{ '{0,0,0,0}, '{0,0,0,0} };
    assign d = i? ('{ '{a[0][0],a[0][1],a[0][2],a[0][3]},'{b[1][0],b[1][1],b[1][2],b[1][3]}}):
                  ('{ '{b[1][0],b[1][1],b[1][2],b[1][3]},'{a[0][0],a[0][1],a[0][2],a[0][3]}});
endmodule
Here wire a [0:1][0:3]='{ '{1,1,1,1}, '{1,1,1,1} }; and wire b [0:1][0:3]='{ '{0,0,0,0}, '{0,0,0,0} }; represent
// a[0][0] = 1 b[0][0] = 0
// a[0][1] = 1 b[0][1] = 0
// a[0][2] = 1 b[0][2] = 0
// a[0][3] = 1 b[0][3] = 0
// a[1][0] = 1 b[1][0] = 0
// a[1][1] = 1 b[1][1] = 0
// a[1][2] = 1 b[1][2] = 0
// a[1][3] = 1 b[1][3] = 0
Working example can be found in the eda-playground link
Two pure Verilog solutions (assuming W1 and W2 are parameters, not variables):
// 'TO' must be a reg
integer i,j;
always @* begin
    for(i=0; i<=W1; i=i+1) begin
        for(j=0; j<=W2; j=j+1) begin
            TO[i][j] = cntrl ? FROM1[i][j] : FROM2[i][j];
        end
    end
end
// 'TO' must be a wire
genvar i,j;
generate
    for(i=0; i<=W1; i=i+1) begin
        for(j=0; j<=W2; j=j+1) begin
            assign TO[i][j] = cntrl ? FROM1[i][j] : FROM2[i][j];
        end
    end
endgenerate

Maximum no. of nodes reachable from a given source in a Graph

I have a directed graph in which each node has exactly one edge, to one other node. I have to find the node from which the maximum number of nodes can be reached.
I tried to do it using a dfs, and store the information in the array sum[] but I get segmentation faults.
The graph is represented as an adjacency List of pair< int, int >. First is the destination, and second is the weight. In this problem weight = 0.
My dfs implementation:
int sum[V]; // declared globally, initially set to 0
bool visited[V]; // declared globally, initially set to false
int dfs( int s ){
    visited[s]= true;
    int t= 0;
    for( int i= 0; i< AdjList.size(); ++i ){
        pii v= AdjList[s][i];
        if( visited[v.first] )
            return sum[v.first];
        t+= 1 + dfs( v.first );
    }
    return sum[s]= t;
}
Inside main():
int maxi= -1; // maximum no. of nodes that can be reached
for( int i= 1; i<= V; ++i ){ // V is total no. of Vertices
    int cc;
    if( !visited[i] )
        cc= g.dfs( i ) ;
    if( cc > maxi ){
        maxi= cc;
        v= i;
    }
}
And the graph is :
1 2 /* 1---->2 */
2 1 /* 2---->1 */
5 3 /* 5---->3 */
3 4 /* 3---->4 */
4 5 /* 4---->5 */
What is the problem in my dfs implementation?
You exit your dfs as soon as you find any node that was already reached, but I have the impression that you should run through all adjacent nodes. In your dfs function, change the if statement inside the for loop. Instead of:
if(visited[v.first] )
    return sum[v.first];
t+= 1 + dfs( v.first );
use:
if(!visited[v.first] ) {
    t+= dfs( v.first );
}
and initialize t with 1 (not 0). This way you will find the size of the connected component. Because you are not interested in the node you started from, you have to decrease the final result by one.
There is one more assumption that I made: your graph is undirected. If it is directed and you are only interested in solving the problem (not in complexity), then just clear the visited and sum arrays after each single run of dfs in the main function.
EDIT
One more error in your code. Change:
for( int i= 0; i< AdjList.size(); ++i ){
into:
for( int i= 0; i< AdjList[s].size(); ++i ){
You should be able to trace the segfault yourself. Use gdb; it's a really useful tool.
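The "clear visited after every run" approach for the directed case can be sketched as follows. This is a simple quadratic-time illustration, not the asker's code: because every node has exactly one outgoing edge, the DFS reduces to a plain walk. The names bestStart and succ are hypothetical, and the count includes the start node itself.

```cpp
#include <utility>
#include <vector>

// succ[v] is the single node that v points to; index 0 is unused so that
// node labels 1..V match the question's input.
// Returns {best start node, number of distinct nodes visited from it}.
std::pair<int, int> bestStart(const std::vector<int>& succ) {
    const int V = static_cast<int>(succ.size()) - 1;
    int bestNode = -1;
    int bestCount = -1;
    for (int s = 1; s <= V; ++s) {
        std::vector<bool> visited(V + 1, false);  // fresh for every start node
        int count = 0;
        // Follow the single outgoing edge until a node repeats.
        for (int v = s; !visited[v]; v = succ[v]) {
            visited[v] = true;
            ++count;
        }
        if (count > bestCount) {
            bestCount = count;
            bestNode = s;
        }
    }
    return {bestNode, bestCount};
}
```

For the graph in the question, succ would be {0, 2, 1, 4, 5, 3}. If the start node should not be counted, subtract one from the returned count.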
