I don't understand the result of this little program - pointers

i've made this little program to test a little part of a bigger program.
int main()
{
char c[]="ddddddddddddd";
char *g= malloc(4*sizeof(char));
*g=NULL;
strcpy (g,c);
printf("Hello world %s!\n",g);
return 0;
}
I expected that the function would return "Hello World dddd" ,since the length of g is 4*sizeof(char), but it returns " Hello World ddddddddddddd ".Can you explain me Where I'm wrong ?

Don't do that, it's undefined behaviour.
The strcpy function will happily copy all those characters in c regardless of the size of g.
That's because it copies characters up to the first \0 in c. In this particular case it may corrupt your heap, or it may not, depending on the minimum size of things that get allocated in the heap (many have a "resolution" of sixteen bytes for example).
There are other functions you can use (though they're optional) if you want your code to be more robust, such as strncpy (provided you understand the limitations), or strcpy_s(), as detailed in Appendix K of the ISO C11 standard (and earlier iterations as well).
Or, if you can't use those for some reason, it's up to the developer to ensure they don't break the rules.

Related

Frama-c : Trouble understanding WP memory models

I'm looking for WP options/model that could allow me to prove basic C memory manipulations like :
memcpy : I've tried to prove this simple code :
struct header_src{
char t1;
char t2;
char t3;
char t4;
};
struct header_dest{
short t1;
short t2;
};
/*# requires 0<=n<=UINT_MAX;
# requires \valid(dest);
# requires \valid_read(src);
# assigns (dest)[0..n-1] \from (src)[0..n-1];
# assigns \result \from dest;
# ensures dest[0..n] == src[0..n];
# ensures \result == dest;
*/
void* Frama_C_memcpy(char *dest, const char *src, uint32_t n);
int main(void)
{
struct header_src p_header_src;
struct header_dest p_header_dest;
p_header_src.t1 = 'e';
p_header_src.t2 = 'b';
p_header_src.t3 = 'c';
p_header_src.t4 = 'd';
p_header_dest.t1 = 0x0000;
p_header_dest.t2 = 0x0000;
//# assert \valid(&p_header_dest);
Frama_C_memcpy((char*)&p_header_dest, (char*)&p_header_src, sizeof(struct header_src));
//# assert p_header_dest.t1 == 0x6265;
//# assert p_header_dest.t2 == 0x6463;
}
but the two last assert weren't verified by WP (with default prover Alt-Ergo). It can be proved thanks to Value analysis, but I mostly want to be able to prove the code not using abstract interpretation.
Cast pointer to int : Since I'm programming embedded code, I want to be able to specify something like:
#define MEMORY_ADDR 0x08000000
#define SOME_SIZE 10
struct some_struct {
uint8_t field1[SOME_SIZE];
uint32_t field2[SOME_SIZE];
}
// [...]
// some function body {
struct some_struct *p = (some_struct*)MEMORY_ADDR;
if(p == NULL) {
// Handle error
} else {
// Do something
}
// } body end
I've looked a little bit at WP's documentation and it seems that the version of frama-c that I use (Magnesium-20151002) has several memory model (Hoare, Typed , +cast, +ref, ...) but none of the given example were proved with any of the model above. It is explicitly said in the documentation that Typed model does not handle pointer-to-int casts. I've a lot of trouble to understand what's really going on under the hood with each wp-model. It would really help me if I was able to verify at least post-conditions of the memcpy function. Plus, I have seen this issue about void pointer that apparently are not very well handled by WP at least in the Magnesium version. I didn't tried another version of frama-c yet, but I think that newer version handle void pointer in a better way.
Thank you very much in advance for your suggestions !
memcpy
Reasoning about the result of memcpy (or Frama_C_memcpy) is out of range of the current WP plugin. The only memory model that would work in your case is Bytes (page 13 of the manual for Chlorine), but it is not implemented.
Independently, please note that your postcondition from Frama_C_memcpy is not what you want here. You are asserting the equality of the sets dest[0..n] and src[0..n]. First, you should stop at n-1. Second, and more importantly, this is far too weak, and is in fact not sufficient to prove the two assertions in the caller. What you want is a quantification on all bytes. See e.g. the predicate memcmp in Frama-C's stdlib, or the variant \forall int i; 0 <= i < n -> dest[i] == src[i];
By the way, this postcondition holds only if dest and src are properly separated, which your function does not require. Otherwise, you should write dest[i] == \at (src[i], Pre). And your requires are also too weak for another reason, as you only require the first character to be valid, not the n first ones.
Cast pointer to int
Basically, all current models implemented in WP are unable to reason on codes in which the memory is accessed with multiple incompatible types (through the use of unions or pointer casts). In some cases, such as Typed, the cast is detected syntactically, and a warning is issued to warn that the code cannot be analyzed. The Typed+Cast model is a variant of Typed in which some casts are accepted. The analysis is correct only if the pointer is re-cast into its original type before being used. The idea is to allow using some libc functions that require a cast to void*, but not much more.
Your example is again a bit different, because it is likely that MEMORY_ADDR is always addressed with type some_stuct. Would it be acceptable to change the code slightly, and change your function as taking a pointer to this type? This way, you would "hide" the cast to MEMORY_ADDR inside a function that would remain unproven.
I tried this example in the latest version of Frama-C (of course the format is modified a little bit).
for the memcpy case
Assertion 2 fails but assertion 3 is successfully proved (basically because the failure of assertion 2 leads to a False assumption, which proves everything).
So in fact both assertion cannot be proved, same as your problem.
This conclusion is sound because the memory models used in the wp plugin (as far as I know) has no assumption on the relation between fields in a struct, i.e. in header_src the first two fields are 8 bit chars, but they may not be nestedly organized in the physical memory like char[2]. Instead, there may be paddings between them (refer to wiki for detailed description). So when you try to copy bits in such a struct to another struct, Frama-C becomes completely confused and has no idea what you are doing.
As far as I am concerned, Frama-C does not support any approach to precisely control the memory layout, e.g. gcc's PACKED which forces the compiler to remove paddings.
I am facing the same problem, and the (not elegant at all) solution is, use arrays instead. Arrays are always nested, so if you try to copy a char[4] to a short[2], I think the assertion can be proved.
for the Cast pointer to int case
With memory model Typed+cast, the current version I am using (Chlorine-20180501) supports casting between pointers and uint64_t. You may want to try this version.
Moreover, it is strongly suggested to call Z3 and CVC4 through why3, whose performance is certainly better than Alt-Ergo.

how can I get the data pointer of a string variable in go?

I want to get the data pointer of a string variable(like string::c_str() in c++) to pass to a c function and I found this doesn't work:
package main
/*
#include <stdio.h>
void Println(const char* str) {printf("%s\n", str);}
*/
import "C"
import (
"unsafe"
)
func main() {
s := "hello"
C.Println((*C.char)(unsafe.Pointer(&(s[0]))))
}
Compile error info is: 'cannot take the address of s[0]'.
This will be OK I but I doubt it will cause unneccesary memory apllying. Is there a better way to get the data pointer?
C.Println((*C.char)(unsafe.Pointer(&([]byte(s)[0]))))
There are ways to get the underlying data from a Go string to C without copying it. It will not work as a C string because it is not a C string. Your printf will not work even if you manage to extract the pointer even if it happens to work sometimes. Go strings are not C strings. They used to be for compatibility when Go used more libc, they aren't anymore.
Just follow the cgo manual and use C.CString. If you're fighting for efficiency you'll win much more by just not using cgo because the overhead of calling into C is much bigger than allocating some memory and copying a string.
(*reflect.StringHeader)(unsafe.Pointer(&sourceTail)).Data
Strings in go are not null terminated, therefore you should always pass the Data and the Len parameter to the corresponding C functions. There is a family of functions in the C standard library to deal with this type of strings, for example if you want to format them with printf, the format specifier is %.*s instead of %s and you have to pass both, the length and the pointer in the arguments list.

Is there a minimum string length for F() to be useful?

Is there a limit for short strings where using the F() macro brings more RAM overhead then saving?
For (a contrived) example:
Serial.print(F("\n"));
Serial.print(F("Hi"));
Serial.print(F("there!"));
Serial.print(F("How do you doyou how?"));
Would any one of those be more efficient without the F()?
I imagine it uses some RAM to iterate over the string and copy it from PROGMEM to RAM. I guess the question is: how much? Also, is heap fragmentation a concern here?
I'm looking at this purely from SRAM-conserving perspective.
From a purely SRAM-conserving perspective all of your examples are identical in that no SRAM is used. At run-time some RAM is used, but only momentarily on the stack. Keep in mind that calling println() (w/o any parameters) uses some stack/RAM.
For a single character it will take up less space in flash if a char is passed into print or println. For example:
Serial.print('\n');
The char will be in flash (not static RAM).
Using
Serial.print(F("\n"));
will create a string in flash memory that is two bytes long (newline char + null terminator) and will additionally pass a pointer to that string to print which is probably two bytes long.
Additionally at runtime, using the F macro will result in two fetches ('\n' and the null terminator) from flash. While fetches from flash are fast, passing in a char results in zero fetches from flash, which is a tiny bit faster.
I don't think there is any minimum size of the string to be useful. If you look at how the outputting is implemented in Print.cpp:
size_t Print::print(const __FlashStringHelper *ifsh)
{
PGM_P p = reinterpret_cast<PGM_P>(ifsh);
size_t n = 0;
while (1) {
unsigned char c = pgm_read_byte(p++);
if (c == 0) break;
n += write(c);
}
return n;
}
You can see from there that only one byte of RAM is used at a time (plus a couple of variables), as it pulls the string from PROGMEM a byte at a time. These are all on the stack so there is no ongoing overhead.
I imagine it uses some RAM to iterate over the string and copy it from PROGMEM to RAM. I guess the question is: how much?
No, it doesn't as I showed above. It outputs a byte at a time. There is no copying (in bulk) of the string into RAM first.
Also, is heap fragmentation a concern here?
No, the code does not use the heap.

Global variable touched by a passed-in parameter becomes unusable

folks!
I pass a struct full of data to my kernel, and I run into the following difficulty using it (very stripped down):
[edit: mac osx / xcode 3.2 on mac book pro; this compile is obviously for cpu]
typedef struct
{
float xoom;
int sizex;
} varholder;
float zX, xd;
__kernel void Harlan( __global varholder * vh )
{
int X = get_global_id(0), Y = get_global_id(1);
zX = ( ( X - vh->sizex/2 ) / vh->xoom + vh->sizex/2 ); // (a)
xd = zX; // (b) BOOM!!
}
after executing line (a), the line marked (b), a simple assignment, gives "LLVM compiler failed to compile a function".
if, however, we do not execute line (a), then line (b) is fine.
So, through my fiddling around a LOT with this, it seems as if it is the assignment statement (a), which uses a passed-in parameter, that messes up the future access of the variable zX. However, of course I need to be able to use the results of calculations further down the line.
I have zX and xd declared at the file level because my helper functions need them.
Any thoughts?
Thanks!
David
p.s. I'm now registered so will be able to upvote and accept answers, which I am sadly unable to do for the last person who helped me (used same username to register, but can't seem to vote on the old post; sorry!).
No, say it ain't so!
I am sincerely hoping that this is not a "correct" answer to my own question. I found on another forum (though not the same question asked!) the following, and I am afraid that it refers to what I'm trying to do:
(quote)
You're doing something the standard prohibits. Section 6.5 says:
'All program scope variables must be declared in the __constant address space.'
In other words, program scope variables cannot be mutable.
(end quote)
... well, tcha!!!! What an astoundingly inconvenient restriction! I'm sure there's reasoning behind it.
[edit: Not At All inconvenient! it was in fact astonishingly easy to work around, given a fresh start the next morning. (And no alcohol.)]
You guys & dolls all knew this, right, and didn't have the heart to tell me?...

address representation in ada

I have pasted a code below which is in Ada language.I need some clarification on some implementations.
C : character;
Char : character;
type Myarr_Type is array (character range 'A'..'K') of character;
Myarr : Myarr_Type := ('A','B','C','D','E','F','G','H','I','J','K');
Next_Address := Myarr'address --'
Last_Address := Next_Address + Storage_Offset'(40); --'
return P2 + Storage_Offset'(4); --'
Last_Address := Next_Address + Storage_Offset'(4); --'
Now my doubt is 1) what does P2 + Storage_Offset'(4) actually mean.Does that mean that its returning the address of the next element in the array which is 'B'.Storage_Offset'(4) in Ada --does this mean 4 bits or 4 bytes of memory. 2) If i assume that Last_Address points to last element of the array which is 'K', how does the arithmentic Storage_Offset'(40) satisfies the actual implementation?
Please get back to me if u need any more clarifications.
Please assume that the function does not exist.
As a matter of fact,i have some ada file and my job is to convert them to C files.Since i am a beginner in ada,i faced a lot of issues with that.Please pardon in case of any confusion
Thanks
Maddy
Storage_Offset is a special integeral type in package System.Storage_Elements that can be added to objects of type System.Address. What exactly the units of Address and Storage_Offset are is implementation defined, but probably just about every implementation in existence uses bytes. So Next_Address + Storage_Offset'(4) means "the address four bytes past whatever Next_Address refers to."
You talked a bit about Ada porting. In 99% of cases, that is a very stupid idea (the %1 being when you need to port to a platform that has no Ada compiler). I'd say the same thing no matter what language you are porting. It's a fool's game. The best outcome you can hope for when porting code is that after a ton of effort it works as well as it did before. With coding the best case never happens.
Ada can interface to C just fine, so it would be far smarter to keep unchanged code in Ada and only "port" stuff you need to change.
If you come across any tasking code, protected types, or custom streams you will be in a world of hurt. Those things don't really have easy C analogs.
If your bosses really have a boner for C or something, I'd suggest looking into Sofcheck's AdaMagic, which provides a service to transform Ada code to ANSI C. Back in the day (two owners prior) they used to claim that it produced maintainable C code. Either way, it will probably be far cheaper than having an inexperienced (in Ada) developer try to do it all by hand.
my_func(int P1,int P2)
{
return P2 + Storage_Offset'(4);
}
Well, this is a C function whose body is written in Ada. There is no "+" operator taking an Integer and a Storage_Offset; perhaps you're looking for
function "+"(Left : Address; Right : Storage_Offset)
return Address;
and perhaps you meant to call my_func(something, Next_Address)?
In that case, the expression will return the address of whatever is 4 Storage_Elements, ie bytes, after Myarr('A').
Myarr_Type is an array of Character, which with any normal compiler on any common architecture is going to be a standard 8-bit byte. So Myarr_Type objects will be 11 bytes long, not 44, and Myarr('A')'Address + 4 will be the address of Myarr('E').
If you want the address of the last element of Myarr, try
Myarr (Myarr'Last)'Address

Resources