Behaviour of non-const int pointer on a const int - pointers

#include <stdio.h>
int main()
{
    const int sum = 100;
    int *p = (int *)&sum;
    *p = 101;
    printf("%d, %d", *p, sum);
    return 0;
}
/*
output
101, 101
*/
If p points to a constant integer variable, why/how does *p manage to change the value of sum?

It's undefined behavior - it's a bug in the code. The fact that the code 'appears to work' is meaningless. The compiler is allowed to make it so your program crashes, or it's allowed to let the program do something nonsensical (such as change the value of something that's supposed to be const). Or do something else altogether. It's meaningless to 'reason' about the behavior, since there is no requirement on the behavior.
Note that if the code is compiled as C++ you'll get an error since C++ won't implicitly cast away const. Hopefully, even when compiled as C you'll get a warning.

p contains the memory address of the variable sum. The syntax *p means the actual value of sum.
When you say
*p=101
you're saying: go to the address p (which is the address where the variable sum is stored) and change the value there. So you're actually changing sum.
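The same mechanics are perfectly legal when the target isn't const - a minimal sketch:

#include <stdio.h>

int main(void)
{
    int x = 100;       /* not const: writing through p is well defined */
    int *p = &x;
    *p = 101;          /* go to x's address and store 101 there */
    printf("%d\n", x); /* prints 101 */
    return 0;
}

The only thing const adds is a promise to the compiler; the dereference itself works the same way either way.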

You can see const as a compile-time flag that tells the compiler "I shouldn't modify this variable; tell me if I do." It does not enforce anything about whether you can actually modify the variable.
And since you are modifying that variable through a non-const pointer, the compiler is indeed going to tell you:
main.c: In function 'main':
main.c:6:16: warning: initialization discards qualifiers from pointer target type
You broke your own promise; the compiler warns you, but it lets you proceed happily.
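The fix is to keep the promise by using a pointer-to-const, at which point the warning becomes a hard error on the assignment:

const int *p = &sum; /* no qualifier is discarded here */
*p = 101;            /* error: assignment of read-only location */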

The behavior is undefined, which means that it may produce different outcomes depending on the compiler implementation, the architecture, and the compiler/optimizer/linker options.
For the sake of analysis, here it is:
(Disclaimer: I don't know compilers. This is just a logical guess at how the compiler may choose to handle this situation, from a naive assembly-language debugger perspective.)
When a constant integer is declared, the compiler has the choice of making it addressable or non-addressable.
Addressable means that the integer value will actually occupy a memory location, such that:
The lifetime will be static.
The value might be hard-coded into the binary, or initialized during program startup.
It can be accessed with a pointer.
It can be accessed from any binary code that knows of its address.
It can be placed in either a read-only or a writable memory section.
For everyday CPUs, non-writability is enforced by the memory management unit (MMU). Messing with the MMU is messy, and close to impossible from user space; it is not worth the trouble for a mere const integer value.
Therefore, it will be placed in a writable memory section, for simplicity's sake.
If the compiler chooses to place it in non-writable memory, your program will crash (access violation) when it tries to write to the non-writable memory.
Setting aside microcontrollers - you would not have asked this question if you were working on microcontrollers.
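A related, easier-to-reproduce illustration of the crash case (assuming a typical hosted platform; none of this is guaranteed by the standard): string literals are usually placed in a read-only section, so writing through a non-const pointer to one tends to die with an access violation.

char *s = (char *)"hello"; /* the literal usually lives in read-only memory */
s[0] = 'H';                /* undefined behavior; typically a segfault */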
Non-addressable means that it does not occupy a memory address. Instead, all code that references the variable (i.e. uses the value of that integer) will receive an r-value, as if you did a find-and-replace to change every instance of sum into the literal 100.
In some cases, the compiler cannot make the integer non-addressable: if the compiler knows that you're taking the address of it, then surely the compiler knows that it has to put that value in memory. Your code belongs to this case.
Yet, with an aggressively optimizing compiler, it is entirely possible to make it non-addressable: the variable could have been eliminated, and the printf will be turned into int main() { printf("%s, %s", (b1? "100" : "101"), (b2? "100" : "101")); return 0; } where b1 and b2 will depend on the mood of the compiler.
The compiler will sometimes take a split decision - it might do one of those, or even something entirely different:
Allocate a memory location, but replace every reference with a constant literal. When this happens, a debugger will tell you the value is zero but any code that uses that location will appear to contain a hard-coded value.
Some compilers may be able to detect that the cast causes undefined behavior and refuse to compile the code.
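To see such a split decision in action, one can compare the value read through the pointer with the value of sum used directly - a sketch, with the caveat that the behavior is undefined and the output depends entirely on the compiler and optimization level:

#include <stdio.h>

int main(void)
{
    const int sum = 100;
    int *p = (int *)&sum;
    *p = 101;
    /* Unoptimized builds often print "101, 101, 1"; optimized builds may
       print "101, 100, 1": same address, yet the direct read of sum was
       constant-folded to the literal 100. */
    printf("%d, %d, %d\n", *p, sum, p == &sum);
    return 0;
}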

Related

STM32F4 UART half word addressing

Trying to roll my own code for STM32F4 UART.
A peculiarity of this chip is that if you use byte addressing, as the GNAT compiler does when setting a single bit, the corresponding bit in the other byte of the half word is also set. The data sheet says to use half-word addressing. Is there a way to tell the compiler to do this? I tried
for CR1_register'Size use 16;
but this had no effect. Writing the whole 16 bit word works, but you lose the ability to set named bits.
The GNAT way to do this, as used in the AdaCore Ada Drivers Library, is to use the GNAT-only aspect Volatile_Full_Access, about which the GNAT Reference Manual says
This is similar in effect to pragma Volatile, except that any reference to the object is guaranteed to be done only with instructions that read or write all the bits of the object. Furthermore, if the object is of a composite type, then any reference to a subcomponent of the object is guaranteed to read and/or write all the bits of the object.
The intention is that this be suitable for use with memory-mapped I/O devices on some machines. Note that there are two important respects in which this is different from pragma Atomic. First a reference to a Volatile_Full_Access object is not a sequential action in the RM 9.10 sense and, therefore, does not create a synchronization point. Second, in the case of pragma Atomic, there is no guarantee that all the bits will be accessed if the reference is not to the whole object; the compiler is allowed (and generally will) access only part of the object in this case.
Their code is
-- Control register 1
type CR1_Register is record
-- Send break
SBK : Boolean := False;
...
end record
with Volatile_Full_Access, Size => 32,
Bit_Order => System.Low_Order_First;
for CR1_Register use record
SBK at 0 range 0 .. 0;
...
end record;
The portable way is to do this explicitly: read the whole record, modify it, then write it back. As long as it is declared Volatile, a compiler will not optimize the reads and writes out.
-- excerpt from my working code --
declare
   R : Control_Register_1 := Module.CR1;
begin
   R.UE := True;
   Module.CR1 := R;
end;
This is very verbose, but it does the job.

Specifying Referential transparency in ACSL

I want to find some ACSL annotation that can be applied to a function or function pointer to indicate that it has the property of referential transparency. Some way to say "this function will always return the same value when given the same arguments". So far I haven't found any such way. Can anyone point me to a way to express that?
Maybe some way to refer to an arbitrary logic function? If I could name an unknown logic function boolean unknown_function(void* a, void* b) = /* this is unknown */; then I could document a function as having a postcondition that its \result is equal to this arbitrary/unknown logic function.
The larger context is trying to do type-erased comparisons. I want to generally express the concept of "the user has given me void*s to work with and a bool (*)(void const*, void const*) to compare them with, and the user is guaranteeing to me that the function provided really is a strict partial order over whatever those pointers point to." If I had that, then I could start to describe properties of these type-erased objects being sorted, for example.
There is indeed no direct possibility to do that in ACSL: a function contract only specifies what happens during a single call of the function. You could, however, rely on a declared but undefined logic function, with a reads clause that specifies the part of the C memory state that the function needs to compute its result, e.g.
/*@ logic boolean unknown_function{L}(int* a, int* b) reads a[0 .. 1], b[2 .. 3]; */
but if you work with void *, without knowing the size of the underlying objects, this might be tricky to specify - unless the result of unknown_function relies solely on the values of the pointers, and not the contents of the pointed-to objects, in which case you don't need the reads trick at all.
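For illustration, here is how such a declaration might be put to use in a function contract. This is only a sketch: user_compare is a hypothetical name, and the exact clauses would depend on your verification setup.

/* Hypothetical comparison function whose contract ties its result to the
   declared-but-undefined logic function: any two calls on the same memory
   state must then agree on their result. */
/*@ assigns \nothing;
    ensures \result != 0 <==> unknown_function{Here}(a, b) == \true;
*/
int user_compare(int* a, int* b);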
Note in addition that contracts over function pointers are not supported yet, which will probably be an issue for what you intend to do, if I understand your last paragraph correctly.
Finally, you might be interested in an upcoming plug-in, RPP, that proposes a way to specify, prove, and use properties relating several calls of one or more C function(s). It is described here and here, and a public release should happen in a not-too-distant future.

Do you make safe and unsafe version of your functions or just stick to the safe version? (Embedded System)

Let's say you have a function that sets an index and then updates a few variables based on the value stored in the array element the index points to. Do you check the index to make sure it is in range? (In an embedded-system environment, to be specific Arduino.)
So far I have made safe and unsafe versions of all functions; is that a good idea? In some of my other code I noticed that having only safe functions results in checking conditions multiple times as the libraries get larger, so I started to develop both. The safe function checks the condition and calls the unsafe function, as shown in the example below for the case explained above.
Safe version:
bool RcChannelModule::setFactorIndexAndUpdateBoundaries(factorIndex_T factorIndex)
{
    if (factorIndex < N_FACTORS)
    {
        setFactorIndexAndUpdateBoundariesUnsafe(factorIndex);
        return true;
    }
    return false;
}
Unsafe version:
void RcChannelModule::setFactorIndexAndUpdateBoundariesUnsafe(factorIndex_T factorIndex)
{
    setCurrentFactorIndexUnsafe(factorIndex);
    updateOutputBoundaries();
}
If I am doing this wrong fundamentally, please let me know why and how I could avoid it. Also I would like to know: generally, when you program, do you consider the future user to be a fool, or do you expect them to follow the minimal documentation provided? (The reason I say minimal is that I do not have the time to write proper documentation.)
void RcChannelModule::setCurrentFactorIndexUnsafe(const factorIndex_T factorIndex)
{
    currentFactorIndex_ = factorIndex;
}
Safety checks, such as array index range checks, null checks, and so on, are intended to catch programming errors. When these checks fail, there is no graceful recovery: the best the program can do is to log what happened, and restart.
Therefore, the only time these checks are useful is during debugging and testing of your code. C++ provides built-in functionality for dealing with this through asserts, which are kept in the debug version of the code but compiled out of the release version (when NDEBUG is defined):
#include <cassert>

void RcChannelModule::setFactorIndexAndUpdateBoundariesUnsafe(factorIndex_T factorIndex) {
    assert(factorIndex < N_FACTORS);
    setCurrentFactorIndexUnsafe(factorIndex);
    updateOutputBoundaries();
}
Note: [When you make a library for external use] an argument-checking version of each external function perhaps makes sense, with non-argument-checking implementations of those and all internal-only functions. If you perform argument checking then do it (only) at the boundary between your library and the client code. But it's pointless to offer a choice to your users, for if you want to protect them from usage errors then you cannot rely on them to choose the "safe" versions of your functions. (John Bollinger)
Do you make safe and unsafe version of your functions or just stick to the safe version?
For higher level code, I recommend one version, a safe one.
In high-level code, with a large set of related functions and data, the combinations of interactions between data and code cannot be fully checked at development time. When an error is detected, the data should be set to indicate an error state, and subsequent use of the data within these functions would be aware of that error state.
For the lowest-level, time-critical routines, I'd go with @dasblinkenlight's answer: create one source version that compiles two ways, per the debug and release builds.
Yet keep @Pete Becker's point in mind: is a check here really likely to be a performance bottleneck?
With floating-point related routines, use NaN to help keep track of an unrecoverable error.
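A sketch of that NaN approach (a hypothetical example, not OP's code): once any sample is invalid, the NaN propagates through the arithmetic, so a single isnan() check at the end of the pipeline detects it.

#include <math.h>
#include <stddef.h>

double average(const double *samples, size_t n) {
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        sum += samples[i];        /* a NaN sample poisons the whole sum */
    }
    return n > 0 ? sum / n : NAN; /* empty input reported as NaN, not a crash */
}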
Lastly, as able, create functions that do not fail and so avoid the issue. With many functions, though not all, this only requires small code additions. It often adds only a constant-time performance penalty, not an O(n) penalty.
Example: Consider a function to lop off the first character of a string - in place.
#include <string.h>

// This works fine as long as s[0] != 0
char *slop_1(char *s) {
    size_t len = strlen(s);        // most work is here
    return memmove(s, s + 1, len); // and here
}
Instead, define the function, and code it, to do nothing when s[0] == 0:
char *slop_2(char *s) {
    size_t len = strlen(s);
    if (len > 0) { // negligible additional work
        memmove(s, s + 1, len);
    }
    return s;
}
Similar code can be applied to OP's example. Note that it is "safe", at least within the function. The assert() scheme can still be used to discover development issues, yet the released code, without the assert(), still checks the range.
void RcChannelModule::setFactorIndexAndUpdateBoundaries(factorIndex_T factorIndex)
{
    if (factorIndex < N_FACTORS) {
        setFactorIndexAndUpdateBoundariesUnsafe(factorIndex);
    } else {
        assert(0); // out-of-range index: a programming error
    }
}
Since you tagged this Arduino and embedded, you have a very resource-constrained system, one of the crappiest processors still manufactured.
On such a system you cannot afford extra error handling. It is better to properly document what values the parameters passed to the function must have, then leave the checking of this to the caller.
The caller can then either check this at run time, if needed, or else at compile time with a static assert. Your function, however, would not be able to implement the check as a static assert, as it can't know whether factorIndex is a run-time variable or a compile-time constant.
As for "I have no time to write proper documentation", that's nonsense. It takes far less time to document this function than to post this SO question. You don't necessarily have to write an essay in some Word file. You don't necessarily have to use Doxygen or similar.
But you do need to write the bare minimum of documentation: In the header file, document the purpose and expected values of all function parameters in the form of comments. Preferably you should have a coding standard for how to document such functions. A minimal documentation of public API functions in the form of comments is part of your job as programmer. The code is not complete until this is written.
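As a sketch of what that bare minimum might look like for OP's function (the wording is illustrative, not prescriptive):

/** Sets the current factor index and updates the output boundaries.
 *  @param factorIndex  Must be less than N_FACTORS. The value is not
 *                      checked here; passing an out-of-range index
 *                      results in undefined behavior. */
void setFactorIndexAndUpdateBoundariesUnsafe(factorIndex_T factorIndex);

Whether you use Doxygen markup or plain comments matters far less than the fact that the precondition is written down where the caller will see it.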

When Qt-5 will fail the connect

Reading the Qt signals & slots documentation, it seems that the only reason for a new-style connection to fail is:
"If there is already a duplicate (exact same signal to the exact same slot on the same objects), the connection will fail and connect will return false"
That means the connection had already been made successfully the first time, and multiple identical connections are not allowed when using Qt::UniqueConnection.
Does this mean that a Qt 5 style connection will always succeed? Are there any other reasons for failure?
The new-style connect can still fail at runtime for a variety of reasons:
Either sender or receiver is a null pointer. Obviously this requires a check that can only happen at runtime.
The PMF (pointer to member function) you specified for a signal is not actually a signal. Lacking proper C++ reflection capabilities, all you can do at compile time is check that the signal is a non-static member function of the sender's class.
However, that's not enough to make it a signal: it also needs to be in a signals: section in your class definition (a sketch illustrating this follows after the last point below). When moc sees your class definition, it will generate some metadata recording that that function is indeed a signal. So, at runtime, the pointer passed to connect is looked up in a table, and connect itself will fail if the pointer is not found (because you did not pass a signal).
The check in the previous point actually requires a comparison between pointers to member functions. It's a particularly tricky one, because it will typically involve different TUs (translation units):
1. the TU containing the moc-generated data (typically a moc_class.cpp file). In this TU there's the aforementioned table containing, amongst other things, pointers to the signals (which are just ordinary member functions);
2. the TU where you actually invoke connect(sender, &Sender::signal, ...), which generates the pointer that gets looked up in the table.
Now, the two TUs may be in the same application, or perhaps one is in a library and the other in your application, or maybe in two libraries, etc; your platform's ABI starts to get into play.
In theory, the pointers stored when doing 1. are identical to the pointers generated when doing 2.; in practice, we've found cases where this does not happen (cf. this bug report that I reported some time ago, where older versions of GNU ld on ARM generated code that failed the comparison).
For Qt this meant disabling certain optimizations and/or passing some extra flags in the places where we know this happens and breaks user software. For instance, as of Qt 5.9, there is no support for -Bsymbolic* flags on GCC on anything but x86 and x86-64.
Of course, this does not mean we've found and fixed all the possible places. New compilers and more aggressive optimizations might trigger this bug again in the future, making connect return false, even when everything is supposed to work.
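As promised above, a sketch of the signals: requirement (a hypothetical class; the names are made up). Only the functions moc sees in a signals: section end up in the metadata table that connect consults.

#include <QObject>

class Sender : public QObject
{
    Q_OBJECT
signals:
    void valueChanged(int value); // recorded by moc as a signal
public:
    void notASignal();            // a valid PMF for connect at compile time,
                                  // but absent from moc's table, so
                                  // connect(&s, &Sender::notASignal, ...)
                                  // fails at runtime
};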
Yes, it can fail if either sender or receiver is not a valid object (nullptr, for example).
Example
QObject* obj1 = new QObject();
QObject* obj2 = new QObject();
// Will succeed
connect(obj1, &QObject::destroyed, obj2, &QObject::deleteLater);
delete obj1;
obj1 = nullptr;
// Will fail even if it compiles
connect(obj1, &QObject::destroyed, obj2, &QObject::deleteLater);
Do not try to register a pointer type. I used the macro
#define QT_REG_TYPE(T) qRegisterMetaType<T>(#T)
with the pointer type CMyWidget*, and that was the problem. Using the type directly worked.
No, it's not always successful. The docs give an example here where connect returns false because the signal and slot signatures must not contain parameter names:
// WRONG
QObject::connect(scrollBar, SIGNAL(valueChanged(int value)),
label, SLOT(setNum(int value)));
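The fixed version simply drops the parameter names from the old-style signatures:

// CORRECT
QObject::connect(scrollBar, SIGNAL(valueChanged(int)),
                 label, SLOT(setNum(int)));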

glVertexAttribPointer last attribute value or pointer

The last argument of glVertexAttribPointer is of type const GLvoid*. But is it really a pointer? It is actually an offset. If I pass 0, it means an offset of 0, not a null pointer to an offset. In my engine, I use this function:
void AbstractVertexData::vertexAttribPtr(int layout) const
{
    glVertexAttribPointer(layout,
                          getShaderAttribs()[layout]->nbComponents,
                          static_cast<GLenum>(getShaderAttribs()[layout]->attribDataType),
                          getShaderAttribs()[layout]->shouldNormalize,
                          getVertexStride(layout),
                          reinterpret_cast<const void*>(getVertexAttribStart(layout)));
}
getVertexAttribStart returns an intptr_t. When I run Dr. Memory, it reports an "uninitialized read", and I want to remove that warning. The warning comes from the reinterpret_cast. I can't static_cast to const void* since my value isn't a pointer. What should I do to fix this warning?
Originally, back in OpenGL-1.1 when vertex arrays were introduced, functions like glVertexPointer, glTexCoordPointer and so on accepted pointers into the client address space. When shaders were introduced (OpenGL-2.0), they came with arbitrary vertex attributes, and the function glVertexAttribPointer followed the same semantics.
When buffer objects arrived, the API reused those existing functions: you now pass an integer byte offset where the signature expects a pointer.
OpenGL-3.3 core eventually made the use of buffer objects mandatory, and ever since, the glVertexAttribPointer functions, being defined with a void* in their signature, have been a sore spot; I've written at length about it in https://stackoverflow.com/a/8284829/524368 (but make sure to read the rest of the answers as well).
Eventually new functions were introduced that allow finer-grained control over how vertex attributes are accessed (glVertexAttribFormat and friends), replacing glVertexAttribPointer, and those operate purely on offsets.
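As for the cast in the code above, a common idiom is to funnel the offset through the pointer-sized unsigned integer type before the reinterpret_cast. A sketch (bufferOffset is a made-up helper name; whether this silences Dr. Memory depends on where the uninitialized value actually originates):

#include <cstdint>

// Carries a byte offset through a parameter that the legacy API declares
// as a pointer. Only the integer value is transported; the GL never
// dereferences it as a real address when a buffer object is bound.
inline const void *bufferOffset(std::uintptr_t offset)
{
    return reinterpret_cast<const void *>(offset);
}

// usage:
//   glVertexAttribPointer(layout, ..., stride, bufferOffset(byteOffset));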
