Comparison of pointers that contain the same address?

My function adds all elements of an array together, taking a "start" pointer and an "end" pointer (I know there are easier ways to get the sum). My problem is that my for-loop is skipped, yet if I test the condition separately it works. Does that have anything to do with the order of execution of the for-loop?
My example:
int arr[] = {3, 2, 1, 1};
int *start = &arr[0];
int *end = &arr[3];
printf("%d\n", (&start[0] == end)); // The result is 0 (false)
printf("%d\n", (&start[3] == end)); // The result is 1 (true)
for (int i = 0; (&start[i] == end); i++) // The for-loop doesn't get executed.

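The loop condition is tested before every iteration, including the first one. When i == 0, &start[0] == end is false, so the loop body never runs.
If you want to go through the array, write &start[i] != end for the condition. Since end points at the last element here, that loop stops just before it; use &start[i] <= end if the last element should be included.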
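As a minimal sketch of the point above, assuming that end points at the last element (as in the question) rather than one past it: the condition is checked before the first pass, so it has to hold for the first element and fail only after the last one. The function name sum_inclusive is made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Sum every element from *start through *end, inclusive.
   Assumes both pointers refer to the same array and start <= end. */
int sum_inclusive(const int *start, const int *end)
{
    assert(start != NULL && end != NULL && start <= end);
    int sum = 0;
    /* The condition is evaluated before each pass; `p <= end` is true for
       the first element and becomes false only after the last one. */
    for (const int *p = start; p <= end; p++)
        sum += *p;
    return sum;
}
```

With the array from the question, sum_inclusive(&arr[0], &arr[3]) adds all four elements. If end were a one-past-the-end pointer instead, the condition would be p != end or p < end.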

Related

Heap Corruption error when calling C from R, can't find the source issue

UPDATE3: Problem is solved but I'm leaving the code here as-is for future reference--I've posted an answer below with the final state of the code in case people wanted to see the final product.
UPDATE2: Refactored to use R_alloc instead of calloc for automated cleanup. Unfortunately the problem persists.
UPDATE: If I add this line right before UNPROTECT(1):
Rprintf("%p %p %p", (void *)rans, (void *)fm, (void *)corrs);
then the function executes with no corrupted heap error. Maybe there's a background garbage collection call that corrupts one of the pointers prior to execution finishing, resulting in a write to a garbage pointer? Important to note here that if I don't print out all three of the pointer addresses, the error comes back.
Also I'm running this on an M1 Mac and compiling with clang via R CMD SHLIB, in case Apple silicon is to blame.
I'm at my wits' end trying to debug this issue, and I figured I'd turn to SO for help. I'm writing a function in C to optimize some parts of my R code, and I'm getting a Heap Corruption Error when running the function many times. The function trimCovar() is called from R using the .Call("trimCovar", ...) interface.
I'm having a lot of difficulty debugging this for a few reasons:
I'm on OSX, so I can't use Valgrind
C function depends on inputs from R, so I can't debug the C code on its own
Heap corruption only occurs when calling the function many times within an R function
(just running .Call directly a bunch of times has no errors)
Error point is inconsistent
I start with two sets of vectors, and I condense them into a frequency matrix, where each column is a position in the vector set, and each row is a particular character that appears. I concatenate them into one matrix prior to passing in because it makes pre-processing easier. A toy example of the frequency matrix would be:
INPUT:
v1_1 = 101
v1_2 = 011
v2_1 = 111
v2_2 = 110
Frequency Matrix:
position: | 1_1 | 1_2 | 1_3 | 2_1 | 2_2 | 2_3 |
0:        | 0.5 | 0.5 | 0.0 | 0.0 | 0.0 | 0.5 |
1:        | 0.5 | 0.5 | 1.0 | 1.0 | 1.0 | 0.5 |
The goal is to find the NV highest correlated positions across the vector sets, which I do by calculating pairwise KL divergence of positions. These are stored in a linked list sorted in ascending order, and at the end I take the positions corresponding to the first NV entries. The R code I have can deparse everything else, so I really just need a vector of positions at the end (duplicates are allowed).
The function takes in 5 arguments:
fMAT: a frequency matrix (RObject, so gets read in as a flat vector)
fSP : columns in matrix corresponding to positions from the first vector set
sSP : same as fSP but for second vector set
NV : Number of values to return
NR : Number of rows in fMAT (the code uses it as the stride between columns of the flattened matrix)
The error returned is:
R(95564,0x104858580) malloc: Heap corruption detected, free list is damaged at 0x600000f10040
*** Incorrect guard value: 4626885667169763328
R(95564,0x104858580) malloc: *** set a breakpoint in malloc_error_break to debug
This only happens when I run an R function that calls this 10+ times, so I'm assuming that I'm just missing one or two small hanging pointers corrupting a memory reference. I've tried running this with gc() called in R immediately after each call, but it doesn't fix the problem. I'm not really sure what else to do at this point, I've tried using lldb but I'm not really sure how to use that program. From running lots of print statements I've determined that it usually crashes in the main loop (identified in code below), but it's inconsistent on when it crashes. I've also tried saving off erroneous inputs--I can rerun them individually with no issues, so it must be something relatively small that only appears over many runs.
Happy to provide more details if it would help. Code is listed at the bottom.
The only thing being allocated here are linked list nodes, and I thought I had free()'d them all prior to returning. I've also double checked the input values, so I'm 99.99% sure that I'm never referencing out of bounds on firstSeqPos, secondSeqPos, ans, or fm. I've also triple checked the R code surrounding this and can confidently say it is not the source of this error.
I haven't coded in C in a long time so I feel like I'm missing something obvious. If I really have to I can try to get ahold of a Linux box to run valgrind, but if there's another option I'd prefer it. Thanks in advance!
Code:
#include <R.h>
#include <Rdefines.h>
#include <Rinternals.h>
#include <math.h>
#include <stdlib.h>
#include <stdbool.h>
typedef struct node {
    double data;
    int i1;
    int i2;
    struct node *next;
} node;

// Linked list
// data is the correlation value,
// i1 the position from first vector set,
// i2 the position from second vector set
node *makeNewNode(double data, int i1, int i2){
    node *newNode;
    newNode = (node *)R_alloc(1, sizeof(node));
    newNode->data = data;
    newNode->i1 = i1;
    newNode->i2 = i2;
    newNode->next = NULL;
    return(newNode);
}

//insert link in sorted order (ascending)
void insertSorted(node **head, node *toInsert, int maxSize) {
    int ctr = 0;
    if ((*head) == NULL || (*head)->data >= toInsert->data){
        toInsert->next = *head;
        *head = toInsert;
    } else {
        node *temp = *head;
        while (temp->next != NULL && temp->next->data < toInsert->data){
            temp = temp->next;
            if (ctr == maxSize){
                // Performance optimization, if we aren't inserting in the first NV
                // positions then we can just skip since we only care about the NV
                // lowest scores overall
                return;
            }
            ctr += 1;
        }
        toInsert->next = temp->next;
        temp->next = toInsert;
    }
}

// MAIN FUNCTION CALLED FROM R
// (This is the one that crashes)
SEXP trimCovar(SEXP fMAT, SEXP fSP, SEXP sSP, SEXP NV, SEXP NR){
    // Converting input SEXPs into C-compatible values
    int nv = asInteger(NV);
    int nr = asInteger(NR);
    int sp1l = length(fSP);
    int sp2l = length(sSP);
    int *firstSeqPos = INTEGER(coerceVector(fSP, INTSXP));
    int *secondSeqPos = INTEGER(coerceVector(sSP, INTSXP));
    double *fm = REAL(fMAT);
    int colv1, colv2;
    // Using a linked list for efficient insert
    node *corrs = NULL;
    int cv1, cv2;
    double p1, p2, score=0;
    // USUALLY FAILS IN THIS LOOP
    for ( int i=0; i<sp1l; i++ ){
        cv1 = firstSeqPos[i];
        colv1 = (cv1 - 1) * nr;
        for ( int j=0; j<sp2l; j++ ){
            cv2 = secondSeqPos[j];
            colv2 = (cv2 - 1) * nr;
            // KL Divergence
            score = 0;
            for ( int k=0; k<nr; k++){
                p1 = fm[colv1 + k];
                p2 = fm[colv2 + k];
                if (p1 != 0 && p2 != 0){
                    score += p1 * log(p1 / p2);
                }
            }
            // Add result into LL
            node *newNode = makeNewNode(score, cv1, cv2);
            insertSorted(&corrs, newNode, nv);
        }
        R_CheckUserInterrupt();
    }
    SEXP ans;
    PROTECT(ans = allocVector(INTSXP, 2*nv));
    int *rans = INTEGER(ans);
    int ctr=0;
    int pos1, pos2;
    node *ptr = corrs;
    for ( int i=0; i<nv; i++){
        rans[2*i] = ptr->i1;
        rans[2*i+1] = ptr->i2;
        ptr = ptr->next;
    }
    UNPROTECT(1);
    return(ans);
}

int *firstSeqPos = INTEGER(coerceVector(fSP, INTSXP));
int *secondSeqPos = INTEGER(coerceVector(sSP, INTSXP));
This is not good. The SEXPs returned by the 2 calls to coerceVector() need to be protected. However it's usually considered better practice to do this coercion at the R level right before entering the .Call entry point. Note that if fSP and sSP are integer matrices, there's no need to coerce them to integer as they are already seen as integer vectors at the C level. This also avoids a possibly expensive copy (as.integer() in R and coerceVector() in C both trigger a full copy of the matrix data).
The question was answered above, but I received a couple of messages from people asking for the final code, so I'm including it as an answer to preserve the original question. There are a couple of optimizations here (thanks to @hpages for help and troubleshooting regarding these):
Original code fails because the output of coerceVector() wasn't protected with PROTECT(). I've refactored the R code to check for integer inputs prior to calling this C function to avoid this function call and be more efficient with memory (see the accepted answer for more details).
Original code uses R_alloc(), which gives responsibility to R to clean up memory at the end of the function call. However, this introduces substantial memory overhead during the runtime of the function, since memory allocated to nodes that are never inserted into the linked list isn't released until the end of the function call.
Allocation with calloc() isn't as simple as switching over and calling free() at the end of the function, since we have to guard the case where the user interrupts execution of the program. If an interrupt signal is thrown prior to the end of the function, we'll never free the memory.
Final C Code:
#include <R.h>
#include <Rdefines.h>
#include <Rinternals.h>
#include <math.h>
#include <stdlib.h>
#include <stdbool.h>
typedef struct node {
    double data;
    int i1;
    int i2;
    struct node *next;
} node;

// Defining the head as a static so that we can access it globally
// Important for ensuring clean up in case of interrupt
static node *corrs = NULL;

// Function to clean up memory allocations in case of interrupt
void cleanupFxn(){
    node *ptr = corrs;
    // Free allocated memory in linked list
    while (corrs != NULL){
        ptr = corrs;
        corrs = corrs->next;
        free(ptr);
    }
}

node *makeNewNode(double data, int i1, int i2){
    node *newNode;
    // very important to use calloc here so we have control of when we free it
    // R_alloc() memory won't be freed until after function finishes execution
    newNode = (node *)calloc(1, sizeof(node));
    newNode->data = data;
    newNode->i1 = i1;
    newNode->i2 = i2;
    newNode->next = NULL;
    return(newNode);
}

// insert link in sorted order
// returns a bool corresponding to if we inserted
bool insertSorted(node **head, node *toInsert, int maxSize) {
    int ctr = 0;
    if ((*head) == NULL || (*head)->data >= toInsert->data){
        toInsert->next = *head;
        *head = toInsert;
        return(true);
    } else {
        node *temp = *head;
        while (temp->next != NULL && temp->next->data < toInsert->data){
            temp = temp->next;
            if (ctr == maxSize){
                // Performance optimization, if we aren't inserting in the first NV
                // positions then we can just skip since we only care about the NV
                // lowest scores overall. Saves a huge amount of time and memory.
                return(false);
            }
            ctr += 1;
        }
        toInsert->next = temp->next;
        temp->next = toInsert;
        return(true);
    }
}

SEXP trimCovar(SEXP fMAT, SEXP fSP, SEXP sSP, SEXP NV, SEXP NR){
    // Converting inputs into C-compatible forms
    int nv = asInteger(NV);
    int nr = asInteger(NR);
    int sp1l = length(fSP);
    int sp2l = length(sSP);
    // Note here we're not using coerceVector() anymore
    // typechecking done on R side
    int *firstSeqPos = INTEGER(fSP);
    int *secondSeqPos = INTEGER(sSP);
    double *fm = REAL(fMAT);
    int colv1, colv2;
    // Using a linked list for efficient insert
    corrs = NULL;
    int cv1, cv2;
    double p1, p2, score=0;
    bool success;
    for ( int i=0; i<sp1l; i++ ){
        cv1 = firstSeqPos[i];
        colv1 = (cv1 - 1) * nr;
        for ( int j=0; j<sp2l; j++ ){
            cv2 = secondSeqPos[j];
            colv2 = (cv2 - 1) * nr;
            score = 0;
            for ( int k=0; k<nr; k++){
                p1 = fm[colv1 + k];
                p2 = fm[colv2 + k];
                if (p1 != 0 && p2 != 0){
                    score += p1 * log(p1 / p2);
                }
            }
            node *newNode = makeNewNode(score, cv1, cv2);
            success = insertSorted(&corrs, newNode, nv);
            // If we don't insert, free the associated memory
            // I'm checking for NULL here just out of an abundance of caution
            if (!success && newNode != NULL){
                free(newNode);
                newNode = NULL;
            }
        }
        R_CheckUserInterrupt();
    }
    SEXP ans;
    PROTECT(ans = allocVector(INTSXP, 2*nv));
    int *rans = INTEGER(ans);
    node *ptr=corrs;
    for ( int i=0; i<nv; i++){
        rans[2*i] = ptr->i1;
        rans[2*i+1] = ptr->i2;
        ptr = ptr->next;
    }
    // Free allocated memory in linked list
    cleanupFxn();
    UNPROTECT(1);
    return(ans);
}
Assuming the C file is named trimCovar.c, we'd compile with R CMD SHLIB trimCovar.c.
R Code to run this function:
dyn.load("trimCovar.so")
# Wrapped into a function with on.exit(...) to ensure cleanup
# in the event the user or system interrupts execution early
CorrComp_C <- function(fm, fsp, ssp, nv, nr){
  # type checking to ensure input to C is integer vector
  # (could probably do more type checking here, mainly for illustration)
  stopifnot(is(fsp, 'integer'))
  stopifnot(is(ssp, 'integer'))
  on.exit(.C("cleanupFxn"))
  a <- .Call('trimCovar', fm, fsp, ssp, nv, nr)
  return(a)
}

dereference pointer always prints 0

I'm trying to figure out why dereferencing my pointer always prints 0. I put in other print statements to make sure random() is working correctly, and it does.
int * first = (int *) malloc(sizeof(int) * N);
while( i < N)
first[i++] = random();
printf("%d", first[i]);
}
I even assigned the values of first to another array and those values matched the ones returned by random(). Why does my print statement in this while loop always print 0?
first[i++] = random();
printf("%d", first[i]);
Assuming i is 0, with these two lines you are assigning a value to the first element of the array:
first[0] = random();
incrementing the index:
i++
then printing the value in the second element of the array:
printf("%d", first[1]);
If you make the incrementing of the index explicit it should be clearer:
while (i < N)
{
    first[i] = random();
    printf("%d", first[i]);
    i++;
}
(You also appear to have missed the opening brace ({) of the while loop, but that could be a typo in the question.)

Recursive Merge Sort Without Arrays

This is a problem I'm working on right now without any idea how to solve. I'm supposed to write the pseudocode to the merge function, but I'm not sure what I'm supposed to do. What I've been given is as follows:
Begin MergeSort(L[], start, stop)
    if (stop<=start) return;
    int m = (start+stop)/2;
    MergeSort(L, start, m);
    MergeSort(L, m+1, stop);
    merge(L, start, m, stop);
End MergeSort
The only other thing I've been told is that I'm supposed to find the "merge(L, start, m, stop);" line. I've been researching all day, and everything I've found says that you should have 2 arrays, called left and right, to assign the recursive lines, making:
Begin MergeSort(L[], start, stop)
    if (stop<=start) return;
    array left[];
    array right[];
    int m = (start+stop)/2;
    left=MergeSort(L, start, m);
    right=MergeSort(L, m+1, stop);
    merge(L, start, m, stop);
End MergeSort
If I were given this problem, I would be able to solve it, but I'm stuck because once I've broken each sublist into single elements, I'm not even sure what I'm supposed to call them, so I'm not sure how to work with them.
I'm still a beginner when it comes to code (taking the very basics of C and Python), so please keep the answer simple, if possible.
Thank you very much for reading this, and I hope that I get an answer so I understand what I'm doing.
MergeSort consists of 2 functions: mergeSort and merge. You have already written the first one correctly, so the only problem left is the merge function.
Because merge sort is not an in-place sorting algorithm, it requires extra space to store data. Below is a simplified version of the merge function that creates an extra array of size stop - start + 1 (both bounds are inclusive):
Begin merge(L[] array, int start, int m, int stop)
    if (start == stop) {
        return;
    }
    int leftPos = start;
    int rightPos = m + 1;
    // temp is indexed from 0, so curPos starts at 0 rather than at start
    int curPos = 0;
    L[] temp = new L[stop - start + 1];
    // take the smaller front element of the two sorted halves
    while(leftPos <= m && rightPos <= stop) {
        if (array[leftPos] <= array[rightPos]) {
            temp[curPos++] = array[leftPos++];
        } else {
            temp[curPos++] = array[rightPos++];
        }
    }
    // copy whatever remains of the left half
    while(leftPos <= m) {
        temp[curPos++] = array[leftPos++];
    }
    // copy whatever remains of the right half
    while(rightPos <= stop) {
        temp[curPos++] = array[rightPos++];
    }
    // copy the merged run back into the original array
    for (int i = start; i <= stop; i++) {
        array[i] = temp[i - start];
    }
End merge
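For reference, here is how the pseudocode might look as compilable C. This is a sketch under a few assumptions: the list holds int values, start, m and stop are inclusive indices as in the question, and the names merge_sort and merge are illustrative. The temporary buffer holds stop - start + 1 elements and is indexed from 0.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Merge the sorted runs a[start..m] and a[m+1..stop] (inclusive bounds). */
static void merge(int a[], int start, int m, int stop)
{
    int n = stop - start + 1;
    int *temp = malloc((size_t)n * sizeof *temp);
    assert(temp != NULL);  /* sketch only: real code should handle failure */

    int left = start, right = m + 1, cur = 0;
    /* Take the smaller front element of the two runs. */
    while (left <= m && right <= stop)
        temp[cur++] = (a[left] <= a[right]) ? a[left++] : a[right++];
    /* Copy whatever remains of either run. */
    while (left <= m)
        temp[cur++] = a[left++];
    while (right <= stop)
        temp[cur++] = a[right++];

    /* Copy the merged run back into place. */
    memcpy(a + start, temp, (size_t)n * sizeof *temp);
    free(temp);
}

/* Sort a[start..stop] in ascending order. */
void merge_sort(int a[], int start, int stop)
{
    if (stop <= start)
        return;
    int m = start + (stop - start) / 2;
    merge_sort(a, start, m);
    merge_sort(a, m + 1, stop);
    merge(a, start, m, stop);
}
```

Sorting a whole array of n elements is then merge_sort(a, 0, n - 1). Note that no left and right arrays are returned from the recursive calls: each call sorts its slice of the same array, and merge only needs the three indices to know where the two sorted halves are.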

Finding the even and odd numbers in a linked list by recursion?

Happy New Year, everyone! :)
I could be doing better things on the first day of the year, but I'm trying to combine linked lists and recursion.
I was wondering how I could write a function that finds the even numbers in a linked list using recursion.
void List:: findingEvens(Node* n, Node*& newHead){
    if(n == NULL)
        return;
    else
        if(n-> data % 2 != 0)
            findingEvens(n-> next);
        else{
            if(!newHead){
                newHead = n;
            }
            else{
                Node* temp =head;
                for(;temp->next;temp=temp->next){
                    temp = temp -> next;
                }
                temp-> next = n;
                findingEvents(n-> next);
            }
        }
}
The problem is that in my header file I add the following declaration, as it should be:
void findingEvens(Node* n);
However, this gives me an error: ‘Node’ has not been declared
Actually, I have a Node struct after the declaration of this function in the header file.
Is the implementation of the recursive function wrong?
Any help will be appreciated, happy new year again :)
void List:: findingEvens(Node* n, Node*& newHead){
    if(n == NULL)
        return;
    else
        if(n-> data % 2 != 0)
            findingEvens(n-> next, newHead);
        else{
            // Push a copy of the even node onto the newHead list
            Node* newNode = new Node;
            newNode->data = n->data;
            newNode->next = newHead;
            newHead = newNode;
            findingEvens(n-> next, newHead);
        }
}
You need to pass newHead in the recursive calls.
You can't just assign n directly to newHead, because then the newHead list will have all the links from the original list. You need to make new nodes and copy the data.
The above code builds the result list in reverse order of the original list, e.g. if you start with 1, 2, 3, 5, 6, 8, 9, the result will be 8, 6, 2. You can reverse the list when it's done.
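If the reversed order is a problem, one alternative (a different technique from the push-front approach above) is to recurse first and attach each copied node on the way back, which keeps the original order. The sketch below is plain C rather than C++, and the Node layout and the names copy_evens and free_list are assumptions made for the illustration.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Node {
    int data;
    struct Node *next;
} Node;

/* Return a newly allocated list holding the even values of `n`,
   in their original order. The caller releases it with free_list(). */
Node *copy_evens(const Node *n)
{
    if (n == NULL)
        return NULL;
    Node *rest = copy_evens(n->next);   /* evens from the tail first */
    if (n->data % 2 != 0)
        return rest;                    /* odd: skip this node */
    Node *copy = malloc(sizeof *copy);
    assert(copy != NULL);               /* sketch only: handle failure in real code */
    copy->data = n->data;
    copy->next = rest;                  /* this node goes in front of later evens */
    return copy;
}

/* Free every node of a list built by copy_evens(). */
void free_list(Node *n)
{
    while (n != NULL) {
        Node *next = n->next;
        free(n);
        n = next;
    }
}
```

For the input 1, 2, 3, 5, 6, 8, 9 this yields 2, 6, 8. The original list is left untouched because only copies are linked into the result.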

A recursion algorithm

Ok, this may seem trivial to some, but I'm stuck.
Here's the algorithm I'm supposed to use:
Here’s a recursive algorithm. Suppose we have n integers in a non-increasing sequence, of which the first is the number k. Subtract one from each of the first k numbers after the first. (If there are fewer than k such numbers, the sequence is not graphical.) If necessary, sort the resulting sequence of n-1 numbers (ignoring the first one) into a non-increasing sequence. The original sequence is graphical if and only if the second one is. For the stopping conditions, note that a sequence of all zeroes is graphical, and a sequence containing a negative number is not. (The proof of this is not difficult, but we won’t deal with it here.)
Example:
Original sequence: 5, 4, 3, 3, 2, 1, 1
Subtract 1 five times: 3, 2, 2, 1, 0, 1
Sort: 3, 2, 2, 1, 1, 0
Subtract 1 three times: 1, 1, 0, 1, 0
Sort: 1, 1, 1, 0, 0
Subtract 1 once: 0, 1, 0, 0
Sort: 1, 0, 0, 0
Subtract 1 once: -1, 0, 0
We have a negative number, so the original sequence is not graphical.
This seems simple enough to me, but when I try to execute the algorithm I get stuck.
Here's the function I've written so far:
//main
int main ()
{
    //local variables
    const int MAX = 30;
    ifstream in;
    ofstream out;
    int graph[MAX], size;
    bool isGraph;
    //open and test file
    in.open("input3.txt");
    if (!in) {
        cout << "Error reading file. Exiting program." << endl;
        exit(1);
    }
    out.open("output3.txt");
    while (in >> size) {
        for (int i = 0; i < size; i++) {
            in >> graph[i];
        }
        isGraph = isGraphical(graph, 0, size);
        if (isGraph) {
            out << "Yes\n";
        }else
            out << "No\n";
    }
    //close all files
    in.close();
    out.close();
    cin.get();
    return 0;
}//end main

bool isGraphical(int degrees[], int start, int end){
    bool isIt = false;
    int ender;
    inSort(degrees, end);
    ender = degrees[start] + start + 1;
    for(int i = 0; i < end; i++)
        cout << degrees[i];
    cout << endl;
    if (degrees[start] == 0){
        if(degrees[end-1] < 0)
            return false;
        else
            return true;
    }
    else{
        for(int i = start + 1; i < ender; i++) {
            degrees[i]--;
        }
        isIt = isGraphical(degrees, start+1, end);
    }
    return isIt;
}

void inSort(int x[],int length)
{
    for(int i = 0; i < length; ++i)
    {
        int current = x[i];
        int j;
        for(j = i-1; j >= 0 && current > x[j]; --j)
        {
            x[j+1] = x[j];
        }
        x[j+1] = current;
    }
}
I seem to get what that sort function is doing, but when I debug, the values keep jumping around. Which I assume is coming from my recursive function.
Any help?
EDIT:
Code is functional. Please see the history if needed.
With help from @RMartinhoFernandes I updated my code. Includes working insertion sort.
I updated the inSort function boundaries
I added an additional ending condition from the comments. But the algorithm still isn't working, which makes me think my base statements are off. Would anyone be able to help further? What am I missing here?
Ok, I helped you out in chat, and I'll post a summary of the issues you had here.
The insertion sort inner loop should go backwards, not forwards. Make it for(i = (j - 1); (i >= 0) && (key > x[i]); i--);
There's an out-of-bounds access in the recursion base case: degrees[end] should be degrees[end-1];
while (!in.eof()) will not read until the end-of-file. while(in >> size) is a superior alternative.
Are you sure ender does not go beyond end? The value of ender is degrees[start], which could be larger than end.
You are then using ender in the for loop:
for(int i = start+1; i < ender; i++){ // I guess it should be end here
I think your insertion sort algorithm isn't right. Try this one (note that this sorts it in the opposite order from what you want though). Also, you want
for(int i = start + 1; i < ender + start + 1; i++) {
instead of
for(int i = start+1; i < ender; i++)
Also, as mentioned in the comments, you want to check if degrees[end - 1] < 0 instead of degrees[end].
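Putting the points from the answers together, the recursive test quoted in the question (usually known as the Havel-Hakimi algorithm) could be sketched in C roughly as follows. The names is_graphical and by_descending are made up for the illustration, the standard qsort stands in for the hand-written insertion sort, and the "first number" is dropped by advancing the pointer instead of passing a start index.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* qsort comparator: larger values first (non-increasing order). */
static int by_descending(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x < y) - (x > y);
}

/* Recursive check on degrees[0..n-1]; the array is modified in place. */
bool is_graphical(int degrees[], int n)
{
    assert(degrees != NULL && n >= 0);
    if (n == 0)
        return true;
    qsort(degrees, (size_t)n, sizeof degrees[0], by_descending);
    if (degrees[n - 1] < 0)
        return false;          /* smallest entry is negative: not graphical */
    if (degrees[0] == 0)
        return true;           /* sorted and non-negative, so all zeroes */
    int k = degrees[0];
    if (k > n - 1)
        return false;          /* fewer than k numbers after the first */
    for (int i = 1; i <= k; i++)
        degrees[i]--;          /* subtract one from the next k numbers */
    return is_graphical(degrees + 1, n - 1);  /* ignore the first number */
}
```

On the example from the question, 5, 4, 3, 3, 2, 1, 1, the recursion eventually produces a negative entry and returns false. The bound check k > n - 1 is what keeps the decrement loop inside the array, which is the out-of-bounds concern raised about ender above.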
