Writing vectors to shared memory - vector

I'm new to STL containers, and from what I saw, writing vectors to shared memory and reading off from it seems to be a bit of a bother.
For certain reasons, libraries like Boost.Interprocess are not an option. So I need to write the shared memory and allocator myself, for a vector of a complex type, but I am not sure how to go about it.
If someone could give me an example of how it's done, or a link where I can see a code example or an explanation, I'd be grateful. Thanks.
EDIT
I've a vector defined as
vector <SERVER_INFO_TYPE *> m_serverVecotr
where SERVER_INFO_TYPE is itself a structure containing CStrings etc.

Related

Indexing Dynamic Arrays in CUDA [duplicate]

There are a lot of questions online about allocating, copying, indexing, etc 2d and 3d arrays on CUDA. I'm getting a lot of conflicting answers so I'm attempting to compile past questions to see if I can ask the right ones.
First link: https://devtalk.nvidia.com/default/topic/392370/how-to-cudamalloc-two-dimensional-array-/
Problem: Allocating a 2d array of pointers
User solution: use mallocPitch
"Correct" inefficient solution: Use malloc and memcpy in a for loop for each row (Absurd overhead)
"More correct" solution: Squash it into a 1d array (offered as a "professional opinion"), with one comment saying that no one with an eye on performance uses 2d pointer structures on the GPU
Second link: https://devtalk.nvidia.com/default/topic/413905/passing-a-multidimensional-array-to-kernel-how-to-allocate-space-in-host-and-pass-to-device-/
Problem: Allocating space on host and passing it to device
Sub link: https://devtalk.nvidia.com/default/topic/398305/cuda-programming-and-performance/dynamically-allocate-array-of-structs/
Sub link solution: Coding pointer based structures on the GPU is a bad experience and highly inefficient, squash it into a 1d array.
Third link: Allocate 2D Array on Device Memory in CUDA
Problem: Allocating and transferring 2d arrays
User solution: use mallocPitch
Other solution: flatten it
Fourth link: How to use 2D Arrays in CUDA?
Problem: Allocate and traverse 2d arrays
Submitted solution: Does not show allocation
Other solution: squash it
There are a lot of other sources mostly saying the same thing but in multiple instances I see warnings about pointer structures on the GPU.
Many people claim the proper way to allocate an array of pointers is with a call to malloc and memcpy for each row yet the functions mallocPitch and memcpy2D exist. Are these functions somehow less efficient? Why wouldn't this be the default answer?
The other 'correct' answer for 2d arrays is to squash them into one array. Should I just get used to this as a fact of life? I'm very persnickety about my code and it feels inelegant to me.
Another solution I was considering was to make a matrix class that uses a 1d pointer array, but I can't find a way to implement the double bracket operator.
Also according to this link: Copy an object to device?
and the sub link answer: cudaMemcpy segmentation fault
This gets a little iffy.
The classes I want to use CUDA with all have 2/3d arrays and wouldn't there be a lot of overhead in converting those to 1d arrays for CUDA?
I know I've asked a lot but in summary should I get used to squashed arrays as a fact of life or can I use the 2d allocate and copy functions without getting bad overhead like in the solution where alloc and cpy are called in a for loop?
Since your question compiles a list of other questions, I'll answer by compiling a list of other answers.
cudaMallocPitch/cudaMemcpy2D:
First, the CUDA runtime API functions like cudaMallocPitch and cudaMemcpy2D do not actually involve either double-pointer allocations or 2D (doubly-subscripted) arrays. This is easy to confirm simply by looking at the documentation and noting the types of parameters in the function prototypes. The src and dst parameters are single-pointer parameters. They could not be doubly-subscripted or doubly dereferenced. Instead, the correct way to think about these functions is that they work with pitched allocations. For additional example usage, here is one of many questions on this. Here is a fully worked example usage. Another example covering various concepts associated with cudaMallocPitch/cudaMemcpy2D usage is here. Also, you cannot use cudaMemcpy2D to transfer data when the underlying allocation has been created using a set of malloc (or new, or similar) operations in a loop. That sort of host data allocation construction is particularly ill-suited to working with the data on the device.
general, dynamically allocated 2D case:
If you wish to learn how to use a dynamically allocated 2D array in a CUDA kernel (meaning you can use doubly-subscripted access, e.g. data[x][y]), then the cuda tag info page contains the "canonical" question for this, it is here. The answer given by talonmies there includes the proper mechanics, as well as appropriate caveats:
there is additional, non-trivial complexity
the access will generally be less efficient than 1D access, because data access requires dereferencing 2 pointers, instead of 1.
(note that allocating an array of objects, where the object(s) has an embedded pointer to a dynamic allocation, is essentially the same as the 2D array concept, and the example you linked in your question is a reasonable demonstration for that)
Also, here is a thrust method for building a general dynamically allocated 2D array.
flattening:
If you think you must use the general 2D method, then go ahead; it's not impossible (although sometimes people struggle with the process!). However, due to the added complexity and reduced efficiency, the canonical "advice" here is to "flatten" your storage method and use "simulated" 2D access. Here is one of many examples of questions/answers discussing "flattening".
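As a rough illustration of "flattening" with "simulated" 2D access, here is a minimal host-side C++ sketch. The FlatMatrix class and its member names are hypothetical and do not come from any of the linked answers; it assumes row-major storage and float elements. It also shows one way to get the double-bracket syntax the question asks about: operator[] returns a pointer to the start of a row.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper class (not from the linked answers): a 2D matrix
// stored as one contiguous, row-major block.
class FlatMatrix {
public:
    FlatMatrix(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), data_(rows * cols, 0.0f) {}

    // "Simulated" 2D access: element (r, c) lives at index r * cols + c.
    float& at(std::size_t r, std::size_t c) {
        assert(r < rows_ && c < cols_);
        return data_[r * cols_ + c];
    }

    // Double-bracket access: m[r] yields a pointer to the start of row r,
    // and the built-in [] on that pointer then selects the column.
    float* operator[](std::size_t r) {
        assert(r < rows_);
        return data_.data() + r * cols_;
    }

    // Single pointer to the whole block and its element count.
    float* data() { return data_.data(); }
    std::size_t size() const { return data_.size(); }

private:
    std::size_t rows_, cols_;
    std::vector<float> data_;
};
```

With this, m[1][2] and m.at(1, 2) refer to the same element, and because the storage is one block, m.data() together with m.size() * sizeof(float) is the kind of argument pair a single cudaMemcpy call expects.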
general, dynamically allocated 3D case:
As we extend this to 3 (or higher!) dimensions, the general case becomes overly complex to handle, IMO. The additional complexity should strongly motivate us to seek alternatives. The triply-subscripted general case involves 3 pointer accesses before the data is actually retrieved, so even less efficient. Here is a fully worked example (2nd code example).
special case: array width known at compile time:
Note that it should be considered a special case when the array dimension(s) (the width, in the case of a 2D array, or 2 of the 3 dimensions for a 3D array) is known at compile-time. In this case, with an appropriate auxiliary type definition, we can "instruct" the compiler how the indexing should be computed, and in this case we can use doubly-subscripted access with considerably less complexity than the general case, and there is no loss of efficiency due to pointer-chasing. Only one pointer need be dereferenced to retrieve the data (regardless of array dimensionality, if n-1 dimensions are known at compile time for a n-dimensional array). The first code example in the already-mentioned answer here (first code example) gives a fully worked example of that in the 3D case, and the answer here gives a 2D example of this special case.
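The "auxiliary type definition" idea can be sketched in plain host-side C++ as follows. This is only an illustration of the language mechanics, under the assumption that the width is a compile-time constant; WIDTH, Row, and fill are made-up names, and the linked answers should be consulted for the device-side versions.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t WIDTH = 4;   // assumed to be known at compile time
using Row = float[WIDTH];          // auxiliary type: one row of WIDTH floats

// Fills a height x WIDTH array using doubly-subscripted access.
// 'a' is a single pointer to contiguous rows; the compiler computes
// a[i][j] as an offset of i * WIDTH + j from that one pointer, so no
// second pointer is loaded from memory.
void fill(Row* a, std::size_t height) {
    assert(a != nullptr);
    for (std::size_t i = 0; i < height; ++i)
        for (std::size_t j = 0; j < WIDTH; ++j)
            a[i][j] = static_cast<float>(i * WIDTH + j);
}
```

A caller could allocate with Row* a = new Row[height]; and release with delete[] a;. The rows sit back to back in memory, which is what makes the single-pointer indexing possible.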
doubly-subscripted host code, singly-subscripted device code:
Finally another methodology option allows us to easily mix 2D (doubly-subscripted) access in host code while using only 1D (singly-subscripted, perhaps with "simulated 2D" access) in device code. A worked example of that is here. By organizing the underlying allocation as a contiguous allocation, then building the pointer "tree", we can enable doubly-subscripted access on the host, and still easily pass the flat allocation to the device. Although the example does not show it, it would be possible to extend this method to create a doubly-subscripted access system on the device based off a flat allocation and a manually-created pointer "tree", however this would have approximately the same issues as the 2D general dynamically allocated method given above: it would involve double-pointer (double-dereference) access, so less efficient, and there is some complexity associated with building the pointer "tree", for use in device code (e.g. it would necessitate an additional cudaMemcpy operation, probably).
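A minimal host-side sketch of the contiguous allocation plus pointer "tree" could look like this. The Grid type and its members are hypothetical names chosen for illustration, and float elements with row-major layout are assumed; it is not the code from the worked example mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: one contiguous block plus a table of row pointers.
struct Grid {
    std::vector<float> flat;    // the contiguous allocation
    std::vector<float*> rows;   // the pointer "tree": rows[i] points into flat

    Grid(std::size_t height, std::size_t width)
        : flat(height * width, 0.0f), rows(height) {
        for (std::size_t i = 0; i < height; ++i)
            rows[i] = flat.data() + i * width;
    }

    // Copying would leave 'rows' pointing into the other object's storage,
    // so this sketch simply forbids it.
    Grid(const Grid&) = delete;
    Grid& operator=(const Grid&) = delete;

    // Doubly-subscripted host access: g[i][j].
    float* operator[](std::size_t i) {
        assert(i < rows.size());
        return rows[i];
    }
};
```

Host code can write g[i][j], while flat.data() remains a single pointer to one block that can be handed to a copy routine as a 1D array.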
From the above methods, you'll need to choose one that fits your appetite and needs. There is not one single recommendation that fits every possible case.

How to organize a large number of variables in ocaml

I first want to say that I'm a beginner in OCaml. I made a simple app that takes data from a JSON file, does some calculations or replaces some of the values with arguments from the command line, then writes another JSON file with the new data, and also substitutes those values into an HTML template and writes that out too. You can see my project here https://github.com/ralcr/invoice-cmd/blob/master/invoice.ml
The question is how to deal with that number of variables. In the languages I know I would probably repeat myself twice, but here it is more like 6 times. Thanks for any advice.
First of all, I would like to note that StackExchange Code Review is probably a better place to post such questions, as the question is more about design than about the language.
I have two suggestions on how to improve your code. The first is to use string maps (or hashtables) to store your variables. The other, much more radical, is to rewrite the code in a more functional way.
Use maps
In your code, you're doing a lot of pouring the same water from one bucket into another, without doing actual work. The first thing that comes to mind is whether it is necessary at all. When you parse JSON definitions into a set of variables, you do not actually reduce complexity or enforce any particular invariants. Basically, you're confusing data with code. These variables are actually data that you're processing, not a part of the logic of your application. So the first step would be to use a string map and store them in it. Then you can easily process a big set of variables with fold and map.
Use functions
Another approach is not to store the variables at all and express everything as stateless transformations on JSON data. Your application looks like a JSON processor, so I don't really see any reason why you should first read everything and store it in the memory, and then later produce the result. It is more natural to process data on the fly and express your logic as a set of small transformations. Try to split everything into small functions, so that each individual transformation can be easily understood. Then compose your transformation from smaller parts. This would be a functional style, where the flow of data is explicit.

In a paragraph or less, what is the purpose and benefits of pointers?

See title. That's all I have to ask. The net doesn't have many succinct answers to this question. Please keep in mind stack vs heap. Explain as you would to a complete beginner. Just looking for the "why" not the "how".
edit
Are pointers a way to get large objects out of the stack?
When you pass a huge object from one piece of your program to another to be worked on (an entire class, for example, or something with a large amount of data like an image or video), passing every single bit of data would be very inefficient. Instead you can just pass a tiny little memory address (a pointer) that the receiving part of your program can then use to get to the object to be worked on.
Aside from that huge aspect, they offer a lot of flexibility but I need more than a paragraph for that.
When you get into managed code like C# or Java EVERYTHING is done with pointers/references but it's all behind the scenes and you don't have to deal with them like you would in C++ or another similar language. But it's still crucial to understand how they work.
Edit in response to:
"why would I pass a large object around if I don't need to work on it?"
You wouldn't. However, correct me if I'm straying from what you're asking, but what you'll learn if you continue into Computer Science is that a piece of your program should be as simple as possible; it should only do one thing. This is commonly known as the Single Responsibility Principle, and it means that you will have many seemingly tiny parts of your program that all work together to accomplish the overarching goal. That means that a lot of those tiny pieces are going to need to work on the same objects and the same data, and use the same tools to get the job done. Let's look at a hypothetical.
You're coding a simple image editing application. You're going to need a cropping tool, a paint brush tool, a selection tool, and a resize tool. Each of these tools is going to need its own place in your program (a class or, more likely, many classes that work together), and that class will have many smaller pieces (methods/functions and other things) that work together to accomplish the goal of that class. Every single one of these classes and methods is most likely going to need to look at or modify the image data. With a pointer you can provide them with a memory address instead of making an entire copy of the image. That way, when one of the classes or methods makes a change to it, you don't need to worry about managing all these copies and making sure they all get the same change.
It allows you to do pass-by-reference/shared data structures, which has two big features: it saves memory and CPU overhead by not making copies, and it provides for complex communication patterns by making changes to shared data.
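To make the copy-versus-address point concrete, here is a small C++ sketch. The Image struct and both function names are invented for illustration and are not taken from the answers above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical image type, for illustration only.
struct Image {
    std::vector<std::uint8_t> pixels;
};

// Receives a copy: every pixel is duplicated for the call, and the
// caller's image is left untouched.
void brightenCopy(Image img) {
    for (auto& p : img.pixels) p = 255;
}

// Receives only an address: nothing is copied, and the caller sees
// the change afterwards.
void brightenInPlace(Image* img) {
    assert(img != nullptr);
    for (auto& p : img->pixels) p = 255;
}
```

Calling brightenCopy(img) leaves img as it was, whereas brightenInPlace(&img) changes the one shared image that every tool is looking at.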

What can pointers do that are otherwise impossible to implement?

So I'm having a hard time grasping the idea behind pointers and all that memory allocation.
I'm thinking that nowadays, with computers as powerful as they are, why do we have to use pointers at all?
Isn't there always a workaround to do things without the help of pointers?
Pointers are an indirection: instead of working with the data itself, you are working with (something) that points to the data. Depending on the semantics of the language, this allows many things: cheaply switch to another instance of data (by setting the pointer to point to another instance), passing pointers allows access to the original data without having to make (a possibly expensive) copy, etc.
Memory allocation is related to pointers, but separate: you can have pointers without allocating memory. The reason you need pointers for memory allocation is that the address at which the allocated block of memory resides is not known at compile time, so you can only access it via a level of indirection (i.e. pointers); the compiler statically allocates space for the pointer that will point to the dynamically allocated memory.
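A short C++ sketch of that distinction follows. The function name makeCounts is made up, and the use of std::unique_ptr (rather than a raw pointer) is just one possible choice made here so that the block is freed automatically.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Allocates n ints at run time. n is not known when the program is
// compiled, so the block's address cannot be known then either; the
// returned pointer is the only way for the caller to reach the block.
std::unique_ptr<int[]> makeCounts(std::size_t n) {
    assert(n > 0);
    std::unique_ptr<int[]> block(new int[n]());  // value-initialized to zero
    return block;
}
```

By contrast, int local = 1; int* q = &local; is a pointer that involves no dynamic allocation at all: q simply refers to a variable that already exists.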
Pointers are incredibly powerful. Just because computers are faster nowadays doesn't mean there's any reason to abandon something as essential as pointers. Passing around giant chunks of memory on the stack is inefficient at best, catastrophic at worst. With pointers, you only need to maintain a reference to where the data resides, rather than duplicating huge chunks of memory each time you call a function.
Also, if you're copying all the data every time, how do you modify the original data? Aside from returning the copy of the structure in every call that touches it.
I remember reading somewhere that Dijkstra was assessing a student for a programming course; this student was quite intelligent but s/he wasn't able to solve the problem because there was sort of a mental block.
All the code was sort of ok, but what was needed was simply to use the expression
a[a[i+1]] = j;
and even if being so close to the solution still the goal seemed to be miles away.
Languages "without pointers" already exist... e.g. BASIC. Without explicit pointers, that is. But the indirection idea... the idea that you can have data to mean just where to find other data is central to programming.
The very idea of an array is about being able to use computed values to find other values.
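Spelled out in C++, the expression from the anecdote is an index read out of the array being used to pick a location in the same array. The function name storeIndirect and the bounds checks are additions made for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Performs a[a[i + 1]] = j in two explicit steps.
void storeIndirect(std::vector<std::size_t>& a, std::size_t i, std::size_t j) {
    assert(i + 1 < a.size());
    std::size_t target = a[i + 1];   // a value read from the array...
    assert(target < a.size());
    a[target] = j;                   // ...is used as a location in the same array
}
```

No pointer syntax appears anywhere, yet the data at a[i + 1] is acting exactly like a pointer: it says where other data lives.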
Trying to hide this idea is a horrible plan. According to Dijkstra, anyone who has been exposed to the BASIC language has already received such a mental mutilation that it is impossible for them to recover as a good programmer (and probably the absence of explicit indirection was one of the problems).
I think he was exaggerating.
Just a bit.

Does Haskell have pointers?

Do you know if there are pointers in Haskell?
If yes: how do you use them? Are there any problems with them? And why aren't they popular?
If no: is there any reason for it?
Yes there are. Take a look at Foreign.Ptr or Data.IORef
I suspect this wasn't what you are asking for though. As Haskell is for the most part without state, it means pointers don't fit into the language design. Having a pointer to memory outside the function would mean that a function is no longer pure and only allowing pointers to values within the current function is useless.
Haskell does provide pointers, via the foreign function interface extension. Look at, for example, Foreign.Storable.
Pointers are used for interoperating with C code. Not for every day Haskell programming.
If you're looking for references -- pointers to objects you wish to mutate -- there are STRef and IORef, which serve many of the same uses as pointers. However, you should rarely -- if ever -- need Refs.
If you simply wish to avoid copying large values, as sepp2k supposes, then you need do nothing: in most implementations, all non-trivial values are allocated separately on a heap and refer to one another by machine-level addresses (i.e. pointers). But again, you need do nothing about any of this; it is taken care of for you.
To answer your question about how values are passed, they are passed in whatever way the implementation sees fit: since you can't mutate the values anyway, it doesn't impact the meaning of the code (as long as the strictness is respected); usually this works out to by-need unless you're passing in e.g. Int values that the compiler can see have already been evaluated...
Pass-by-need is like pass-by-reference, except that any given reference could refer either to an actual evaluated value (which cannot be changed), or to a "thunk" for a not-yet-evaluated value. Wikipedia has more.

Resources