My goal is to call Windows' GetModuleInformation function to get a MODULEINFO struct back. This is all working fine. The problem comes as a result of me wanting to do pointer arithmetic and dereferences on the LPVOID lpBaseOfDll which is part of the MODULEINFO.
Here is my code to call the function in Lua:
require "luarocks.require"
require "alien"
sizeofMODULEINFO = 12 --Gotten from sizeof(MODULEINFO) from Visual Studio
MODULEINFO = alien.defstruct{
{"lpBaseOfDll", "pointer"}; --Does this need to be a buffer? If so, how?
{"SizeOfImage", "ulong"};
{"EntryPoint", "pointer"};
}
local GetModuleInformation = alien.Kernel32.K32GetModuleInformation
GetModuleInformation:types{ret = "int", abi = "stdcall", "long", "pointer", "pointer", "ulong"}
local GetModuleHandle = alien.Kernel32.GetModuleHandleA
GetModuleHandle:types{ret = "pointer", abi = "stdcall", "pointer"}
local GetCurrentProcess = alien.Kernel32.GetCurrentProcess
GetCurrentProcess:types{ret = "long", abi = "stdcall"}
local mod = MODULEINFO:new() --Create struct (needs buffer?)
local currentProcess = GetCurrentProcess()
local moduleHandle = GetModuleHandle("myModule.dll")
local success = GetModuleInformation(currentProcess, moduleHandle, mod(), sizeofMODULEINFO)
if success == 0 then --If there is an error, exit
return 0
end
local dataPtr = mod.lpBaseOfDll
--Now how do I do pointer arithmetic and/or dereference "dataPtr"?
At this point, mod.SizeOfImage seems to be giving me the correct values that I am expecting, so I know the functions are being called and the struct is being populated. However, I cannot seem to do pointer arithmetic on mod.lpBaseOfDll because it is a UserData.
The only information in the Alien Documentation that may address what I'm trying to do are these:
Pointer Unpacking
Alien also provides three convenience functions that let you
dereference a pointer and convert the value to a Lua type:
alien.tostring takes a userdata (usually returned from a function that has a pointer return value), casts it to char*, and returns a Lua
string. You can supply an optional size argument (if you don’t Alien
calls strlen on the buffer first).
alien.toint takes a userdata, casts it to int*, dereferences it and returns it as a number. If you pass it a number it assumes the
userdata is an array with this number of elements.
alien.toshort, alien.tolong, alien.tofloat, and alien.todouble are like alien.toint, but works with with the respective typecasts.
Unsigned versions are also available.
My issue with those, is I would need to go byte-by-byte, and there is no alien.tochar function. Also, and more importantly, this still doesn't solve the problem of me being able to get elements outside of the base address.
Buffers
After making a buffer you can pass it in place of any argument of
string or pointer type.
...
You can also pass a buffer or other userdata to the new method of your
struct type, and in this case this will be the backing store of the
struct instance you are creating. This is useful for unpacking a
foreign struct that a C function returned.
These seem to suggest I can use an alien.buffer as the argument of MODULEINFO's LPVOID lpBaseOfDll. And buffers are described as byte arrays, which can be indexed using this notation: buf[1], buf[2], etc. Additionally, buffers go by bytes, so this would ideally solve all problems. (If I am understanding this correctly).
Unfortunately, I can not find any examples of this anywhere (not in the docs, stackoverflow, Google, etc), so I am have no idea how to do this. I've tried a few variations of syntax, but nearly every one gives a runtime error (others simply does not work as expected).
Any insight on how I might be able to go byte-by-byte (C char-by-char) across the mod.lpBaseOfDll through dereferences and pointer arithmetic?
I need to go byte-by-byte, and there is no alien.tochar function.
Sounds like alien.tostring has you covered:
alien.tostring takes a userdata (usually returned from a function that has a pointer return value), casts it to char*, and returns a Lua string. You can supply an optional size argument (if you don’t Alien calls strlen on the buffer first).
Lua strings can contain arbitrary byte values, including 0 (i.e. they aren't null-terminated like C strings), so as long as you pass a size argument to alien.tostring you can get back data as a byte buffer, aka Lua string, and do whatever you please with the bytes.
It sounds like you can't tell it to start at an arbitrary offset from the given pointer address. The easiest way to tell for sure, if the documentation doesn't tell you, is to look at the source. It would probably be trivial to add an offset parameter.
Related
I'm looking for WP options/model that could allow me to prove basic C memory manipulations like :
memcpy : I've tried to prove this simple code :
struct header_src{
char t1;
char t2;
char t3;
char t4;
};
struct header_dest{
short t1;
short t2;
};
/*# requires 0<=n<=UINT_MAX;
# requires \valid(dest);
# requires \valid_read(src);
# assigns (dest)[0..n-1] \from (src)[0..n-1];
# assigns \result \from dest;
# ensures dest[0..n] == src[0..n];
# ensures \result == dest;
*/
void* Frama_C_memcpy(char *dest, const char *src, uint32_t n);
int main(void)
{
struct header_src p_header_src;
struct header_dest p_header_dest;
p_header_src.t1 = 'e';
p_header_src.t2 = 'b';
p_header_src.t3 = 'c';
p_header_src.t4 = 'd';
p_header_dest.t1 = 0x0000;
p_header_dest.t2 = 0x0000;
//# assert \valid(&p_header_dest);
Frama_C_memcpy((char*)&p_header_dest, (char*)&p_header_src, sizeof(struct header_src));
//# assert p_header_dest.t1 == 0x6265;
//# assert p_header_dest.t2 == 0x6463;
}
but the two last assert weren't verified by WP (with default prover Alt-Ergo). It can be proved thanks to Value analysis, but I mostly want to be able to prove the code not using abstract interpretation.
Cast pointer to int : Since I'm programming embedded code, I want to be able to specify something like:
#define MEMORY_ADDR 0x08000000
#define SOME_SIZE 10
struct some_struct {
uint8_t field1[SOME_SIZE];
uint32_t field2[SOME_SIZE];
}
// [...]
// some function body {
struct some_struct *p = (some_struct*)MEMORY_ADDR;
if(p == NULL) {
// Handle error
} else {
// Do something
}
// } body end
I've looked a little bit at WP's documentation and it seems that the version of frama-c that I use (Magnesium-20151002) has several memory model (Hoare, Typed , +cast, +ref, ...) but none of the given example were proved with any of the model above. It is explicitly said in the documentation that Typed model does not handle pointer-to-int casts. I've a lot of trouble to understand what's really going on under the hood with each wp-model. It would really help me if I was able to verify at least post-conditions of the memcpy function. Plus, I have seen this issue about void pointer that apparently are not very well handled by WP at least in the Magnesium version. I didn't tried another version of frama-c yet, but I think that newer version handle void pointer in a better way.
Thank you very much in advance for your suggestions !
memcpy
Reasoning about the result of memcpy (or Frama_C_memcpy) is out of range of the current WP plugin. The only memory model that would work in your case is Bytes (page 13 of the manual for Chlorine), but it is not implemented.
Independently, please note that your postcondition from Frama_C_memcpy is not what you want here. You are asserting the equality of the sets dest[0..n] and src[0..n]. First, you should stop at n-1. Second, and more importantly, this is far too weak, and is in fact not sufficient to prove the two assertions in the caller. What you want is a quantification on all bytes. See e.g. the predicate memcmp in Frama-C's stdlib, or the variant \forall int i; 0 <= i < n -> dest[i] == src[i];
By the way, this postcondition holds only if dest and src are properly separated, which your function does not require. Otherwise, you should write dest[i] == \at (src[i], Pre). And your requires are also too weak for another reason, as you only require the first character to be valid, not the n first ones.
Cast pointer to int
Basically, all current models implemented in WP are unable to reason on codes in which the memory is accessed with multiple incompatible types (through the use of unions or pointer casts). In some cases, such as Typed, the cast is detected syntactically, and a warning is issued to warn that the code cannot be analyzed. The Typed+Cast model is a variant of Typed in which some casts are accepted. The analysis is correct only if the pointer is re-cast into its original type before being used. The idea is to allow using some libc functions that require a cast to void*, but not much more.
Your example is again a bit different, because it is likely that MEMORY_ADDR is always addressed with type some_stuct. Would it be acceptable to change the code slightly, and change your function as taking a pointer to this type? This way, you would "hide" the cast to MEMORY_ADDR inside a function that would remain unproven.
I tried this example in the latest version of Frama-C (of course the format is modified a little bit).
for the memcpy case
Assertion 2 fails but assertion 3 is successfully proved (basically because the failure of assertion 2 leads to a False assumption, which proves everything).
So in fact both assertion cannot be proved, same as your problem.
This conclusion is sound because the memory models used in the wp plugin (as far as I know) has no assumption on the relation between fields in a struct, i.e. in header_src the first two fields are 8 bit chars, but they may not be nestedly organized in the physical memory like char[2]. Instead, there may be paddings between them (refer to wiki for detailed description). So when you try to copy bits in such a struct to another struct, Frama-C becomes completely confused and has no idea what you are doing.
As far as I am concerned, Frama-C does not support any approach to precisely control the memory layout, e.g. gcc's PACKED which forces the compiler to remove paddings.
I am facing the same problem, and the (not elegant at all) solution is, use arrays instead. Arrays are always nested, so if you try to copy a char[4] to a short[2], I think the assertion can be proved.
for the Cast pointer to int case
With memory model Typed+cast, the current version I am using (Chlorine-20180501) supports casting between pointers and uint64_t. You may want to try this version.
Moreover, it is strongly suggested to call Z3 and CVC4 through why3, whose performance is certainly better than Alt-Ergo.
I want to get the data pointer of a string variable(like string::c_str() in c++) to pass to a c function and I found this doesn't work:
package main
/*
#include <stdio.h>
void Println(const char* str) {printf("%s\n", str);}
*/
import "C"
import (
"unsafe"
)
func main() {
s := "hello"
C.Println((*C.char)(unsafe.Pointer(&(s[0]))))
}
Compile error info is: 'cannot take the address of s[0]'.
This will be OK I but I doubt it will cause unneccesary memory apllying. Is there a better way to get the data pointer?
C.Println((*C.char)(unsafe.Pointer(&([]byte(s)[0]))))
There are ways to get the underlying data from a Go string to C without copying it. It will not work as a C string because it is not a C string. Your printf will not work even if you manage to extract the pointer even if it happens to work sometimes. Go strings are not C strings. They used to be for compatibility when Go used more libc, they aren't anymore.
Just follow the cgo manual and use C.CString. If you're fighting for efficiency you'll win much more by just not using cgo because the overhead of calling into C is much bigger than allocating some memory and copying a string.
(*reflect.StringHeader)(unsafe.Pointer(&sourceTail)).Data
Strings in go are not null terminated, therefore you should always pass the Data and the Len parameter to the corresponding C functions. There is a family of functions in the C standard library to deal with this type of strings, for example if you want to format them with printf, the format specifier is %.*s instead of %s and you have to pass both, the length and the pointer in the arguments list.
I have a code block that queries AD and retrive the results and write to a channel.
func GetFromAD(connect *ldap.Conn, ADBaseDN, ADFilter string, ADAttribute []string, ADPage uint32) *[]ADElement {
searchRequest := ldap.NewSearchRequest(ADBaseDN, ldap.ScopeWholeSubtree, ldap.NeverDerefAliases, 0, 0, false, ADFilter, ADAttribute, nil)
sr, err := connect.SearchWithPaging(searchRequest, ADPage)
CheckForError(err)
fmt.Println(len(sr.Entries))
ADElements := []ADElement{}
for _, entry := range sr.Entries{
NewADEntity := new(ADElement) //struct
NewADEntity.DN = entry.DN
for _, attrib := range entry.Attributes {
NewADEntity.attributes = append(NewADEntity.attributes, keyvalue{attrib.Name: attrib.Values})
}
ADElements = append(ADElements, *NewADEntity)
}
return &ADElements
}
The above function returns a pointer to []ADElements.
And in my initialrun function, I call this function like
ADElements := GetFromAD(connectAD, ADBaseDN, ADFilter, ADAttribute, uint32(ADPage))
fmt.Println(reflect.TypeOf(ADElements))
ADElementsChan <- ADElements
And the output says
*[]somemodules.ADElement
as the output of reflect.TypeOf.
My doubt here is,
since ADElements := []ADElement{} defined in GetFromAD() is a local variable, it must be allocated in the stack, and when GetFromAD() exits, contents of the stack must be destroyed, and further references to GetFromAD() must be pointing to invalid memory references, whereas I still am getting the exact number of elements returned by GetFromAD() without any segfault. How is this working? Is it safe to do it this way?
Yes, it is safe because Go compiler performs escape analysis and allocates such variables on heap.
Check out FAQ - How do I know whether a variable is allocated on the heap or the stack?
The storage location does have an effect on writing efficient programs. When possible, the Go compilers will allocate variables that are local to a function in that function's stack frame. However, if the compiler cannot prove that the variable is not referenced after the function returns, then the compiler must allocate the variable on the garbage-collected heap to avoid dangling pointer errors. Also, if a local variable is very large, it might make more sense to store it on the heap rather than the stack.
Define "safe"...
You will not end up freeing the memory of ADElements, since there's at least one live reference to it.
In this case, you should be completely safe, since you're only passing the slice once and then you seem to not modify it, but in the general case it might be better to pass it element-by-element across a chan ADElement, to avoid multiple unsynchronized accesses to the slice (or, more specifically, the array backing the slice).
This also holds for maps, where you can get curious problems if you pass a map over a channel, then continue to access it.
I'm getting this unhandled exception when I exit my program:
Unhandled exception at 0x102fe274 (msvcr100d.dll) in Parameters.exe: 0xC0000005: Access violation reading location 0x00000005.
The debugger stops in a module called crtdll.c on this line:
onexitbegin_new = (_PVFV *) DecodePointer(__onexitbegin);
The top line on the call stack reads:
msvcr100d.dll!__clean_type_info_names_internal(__type_info_node * p_type_info_root_node=0x04a6506c) Line 359 + 0x3 bytes C++
The program then remains in memory until I close down the IDE.
I'm more used to developing with managed languages so I expect I'm doing something wrong with my code maintenance. The code itself reads a memory mapped file and assoiciates it with pointers:
SUBROUTINE READ_MMF ()
USE IFWIN
USE, INTRINSIC :: iso_c_binding
USE, INTRINSIC :: iso_fortran_env
INTEGER(HANDLE) file_mapping_handle
INTEGER(LPVOID) memory_location
TYPE(C_PTR) memory_location_cptr
INTEGER memory_size
INTEGER (HANDLE) file_map
CHARACTER(5) :: map_name
TYPE(C_PTR) :: cdata
integer :: n = 3
integer(4), POINTER :: A, C
real(8), POINTER :: B
TYPE STRUCT
integer(4) :: A
real(8) :: B
integer(4) :: C
END TYPE STRUCT
TYPE(STRUCT), pointer :: STRUCT_PTR
memory_size = 100000
map_name = 'myMMF'
file_map = CreateFileMapping(INVALID_HANDLE_VALUE,
+ NULL,
+ PAGE_READWRITE,
+ 0,
+ memory_size,
+ map_name // C_NULL_CHAR )
memory_location = MapViewOfFile(file_map,
+IOR(FILE_MAP_WRITE, FILE_MAP_READ),
+ 0, 0, 0 )
cdata = TRANSFER(memory_location, memory_location_cptr)
call c_f_pointer(cdata, STRUCT_PTR, [n])
A => STRUCT_PTR%A
B => STRUCT_PTR%B
C => STRUCT_PTR%C
RETURN
END
Am I supposed to deallocate the c-pointers when I'm finished with them? I looked into that but can't see how I do it in Fortran...
Thanks for any help!
The nature of the access violation (during runtime library cleanup) suggests that your program is corrupting memory in some way. There are a number of programming errors that can lead to that - and the error or errors responsible could be anywhere in your program. The usual "compile and run with all diagnostic and debugging options enabled" approach may help identify these.
That said, there is a programming error in the code example shown. The C_F_POINTER procedure from the ISO_C_BINDING intrinsic module can operate on either scalar or array Fortran pointers (the second argument). If the Fortran pointer is a scalar then the third "shape" argument must not be present (it must be present if the Fortran pointer is an array).
Your code breaks this requirement - the Fortran pointer STRUCT_PTR in your code is a scalar, but yet you provide the third shape argument (as [n]). It is quite plausible that this error will result in memory corruption - typically the implementation of C_F_POINTER would try and populate a descriptor in memory for the Fortran pointer, and the descriptor for a pointer to an array may be very different from a pointer to a scalar.
Subsequent references to STRUCT_PTR may further the corruption.
While it is not required by the standard to diagnose this situation, I am a little surprised that the compiler does not issue a diagnostic (assuming you example code is what you actually are compiling). If you reported this to your compiler's vendor (Intel, presumably given IFWIN etc) I suspect they would regard it as a deficiency in their compiler.
To release the memory associated with the file mapping you use the UnmapViewOfFile and CloseHandle API's. To use these you should "store" (your program needs to remember in some way) the base address (memory_location, which can also be obtained by calling C_LOC on STRUCT_PTR once the problem above is fixed) returned by MapViewOfFile, and the handle to the mapping (file_map) returned by CreateFileMapping; respectively.
I've only ever done this with Cray Pointers: not with the ISO bindings and I know it does work with Cray Pointers.
What you don't say is whether this is happening the first time or second time the routine is being called. If it is called more than once, then there is a problem in the coding in that Create/OpenFileMapping should only be called once to get a handle.
You don't need to deallocate memory because the memory is not yours to deallocate: you need to call UnmapViewOfFile(memory_location). After you have called this, memory_location, memory_location_cptr and possibly cdata are no longer valid.
The way this works is with two or more programs:
One program calls CreateFileMapping, the others calls OpenFileMapping to obtain a handle to the data. This only needs to be called once at the start of the program: not every time you need to access the file. Multiple calls to Create/OpenFileMapping without a corresponding close can cause crashes.
They then call MapViewOfFile to map the file into memory. Note that only one program can do this at a time. When the program is finished with the memory file, it calls UnmapViewOfFile. The other program can now get to the file. There is a blocking mechanism. If you do not call UnmapViewOfFile, other programs using MapViewOfFile will be blocked.
When all is done, call close on the handle created by Create/OpenFileMapping.
Given below is some code in ada
with TYPE_VECT_B; use TYPE_VECT_B;
Package TEST01 is
procedure TEST01
( In_State : IN VECT_B ;
Out_State : IN OUT VECT_B );
function TEST02
( In_State : IN VECT_B ) return Boolean ;
end TEST01;
The TYPE_VECT_B package specification and body is also defined below
Package TYPE_VECT_B is
type VECT_B is array (INTEGER range <>) OF BOOLEAN ;
rounded_data : float ;
count : integer ;
trace : integer ;
end TYPE_VECT_B;
Package BODY TYPE_VECT_B is
begin
null;
end TYPE_VECT_B;
What does the variable In_State and Out_State actually mean? I think In_State means input variable. I just get confused to what actually Out_State means?
An in parameter can be read but not written by the subprogram. in is the default. Prior to Ada 2012, functions were only allowed to have in parameters. The actual parameter is an expression.
An out parameter implies that the previous value is of no interest. The subprogram is expected to write to the parameter. After writing to the parameter, the subprogram can read back what it has written. On exit the actual parameter receives the value written to it (there are complications in this area!). The actual parameter must be a variable.
An in out parameter is like an out parameter except that the previous value is of interest and can be read by the subprogram before assignment. For example,
procedure Add (V : Integer; To : in out Integer; Limited_To : Integer)
is
begin
-- Check that the result wont be too large. This involves reading
-- the initial value of the 'in out' parameter To, which would be
-- wrong if To was a mere 'out' parameter (it would be
-- uninitialized).
if To + V > Limited_To then
To := Limited_To;
else
To := To + V;
end if;
end Add;
Basically, every parameter to a function or procedure has a direction to it. The options are in, out, in out (both), or access. If you don't see one of those, then it defaults to in.
in means data can go into the subroutine from the caller (via the parameter). You are allowed to read from in parameters inside the routine. out means data can come out of the routine that way, and thus you are allowed to assign values to the parameter inside the routine. In general, how the compiler accomplishes the data passing is up to the compiler, which is in accord with Ada's general philosophy of allowing you to specify what you want done, not how you want it done.
access is a special case, and is roughly like putting a "*" in your parameter definition in Cish languages.
The next question folks usually have is "if I pass something large as an in parameter, is it going to push all that data on the stack or something?" The answer is "no", unless your compiler writers are unconsionably stupid. Every Ada compiler I know of under the hood passes objects larger than fit in a machine register by reference. It is the compiler, not the details of your parameter passing mechanisim, that enforces not writing data back out of the routine. Again, you tell Ada what you want done, it figures out the most efficient way to do it.