Memory management in Xen

I understand that Xen allocates all the physical memory required by the guest when the guest gets started. It also maintains a shadow page table (I'm assuming it uses struct page_info to maintain this. Am I correct? If not, can anyone explain?), which I wish to access because I need to traverse that list to check whether the guest to which this page is assigned has accessed it at least once.
struct page_info {
    union {
        /* ... fields elided ... */
    };
    struct page_info *next, *prev;  /* page list threading */
    union {
        /* ... fields elided ... */
    };
};
Can anyone explain how I can achieve this?

I understand that Xen allocates all the physical memory required by the guest when the guest gets started.
Possibly. I think the current Xen toolstack does do this, but there are proposals to instead just reserve the physical memory for the guest without actually performing the allocation until the guest accesses it. The motivation is to allow memory overcommit, increasing the host's capacity for guests that support memory ballooning, and to avoid the time-consuming scrub of all of the guest memory, which is required for security and isolation purposes, prior to running the guest.
It also maintains a shadow page table
For some guests, in some configurations, yes. It will do so for fully-virtualized (aka HVM) guests where Xen is not using hardware extended page table support (e.g. Intel EPT). It will not do so for paravirtualized guests.
(I'm assuming it uses struct page_info to maintain this. Am I correct? If not can anyone explain?)
Hmm. The shadow page tables are a very sophisticated and intricate piece of software. They use multiple data structures to maintain the virtualized address space of the guest. I think you'll need to study the code in detail to get a handle on it; it's beyond the scope of a short answer here.
which I wish to access because I need to traverse that list to check whether the guest to which this page is assigned has accessed it at least once.
You'll need to mark the page as not-present in the guest's shadow page tables, and modify the page fault handler in Xen's shadow page table code to fix it up as present on the trap, while also updating your own data structure to record the access as it occurs. There is similar code for tracking guest page writes (not reads), called dirty bitmap tracking, which is used when performing live migration of virtual machines.
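As a rough illustration only, here is a sketch of that fix-up path. The bitmap field and helper are hypothetical, not real Xen code; the real logic lives in Xen's shadow fault handlers (xen/arch/x86/mm/shadow/) and is considerably more involved:

    /* Hypothetical sketch: record the first access to a guest frame when
     * the shadow fault handler traps on a page we marked not-present.
     * `accessed_bitmap` is an invented per-domain bitmap indexed by gfn;
     * it is not a real field of struct domain. */
    static void note_first_access(struct domain *d, unsigned long gfn)
    {
        if ( !test_and_set_bit(gfn, d->accessed_bitmap) )
        {
            /* First touch of this frame: from here the handler would
             * rebuild the shadow entry as present, so that subsequent
             * accesses proceed without faulting. */
        }
    }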
Can anyone explain how I can achieve this?
It would help if you could explain your motivation for wanting to track guest read accesses to physical memory locations.

Related

How do I configure OpenSplice DDS for 100,000 nodes?

What is the right approach to configuring OpenSplice DDS to support 100,000 or more nodes?
Can I use a hierarchical naming scheme for partition names, so "headquarters.city.location_guid_xxx" would prevent packets from leaving a location, and "company.city*" would allow samples to align across a city, and so on? Or would all the nodes know about all these partitions just in case they wanted to publish to them?
The durability services will choose a master when they come up. If one durability service is running on a Raspberry Pi in a remote location over a 3G link, what is to prevent it from trying to become the master for "headquarters" and crashing?
I am experimenting with durability settings such that a remote node uses location_guid_xxx, but for the "headquarters" cloud server I use a Headquarters scope.
On the remote client I might do this:
<Merge scope="Headquarters" type="Ignore"/>
<Merge scope="location_guid_xxx" type="Merge"/>
so a location won't be master for the universe, but can a durability service within a location still be master for that location?
If I have 100,000 locations does this mean I have to have all of them listed in the "Merge scope" in the ospl.xml file located at headquarters? I would think this alone might limit the size of the network I can handle.
I am assuming that this product will handle this sort of Internet of Things scenario. Has anyone else tried it?
Considering the scale of your system, I think you should seriously consider the use of Vortex Cloud (see these slides: http://slidesha.re/1qMVPrq). Vortex Cloud will allow you to scale your system better as well as deal with NATs/firewalls. Besides that, you'll be able to use TCP/IP to communicate from your Raspberry Pi to the cloud instance, thus avoiding any problems related to NATs/firewalls.
Before getting to your durability question, there is something else I'd like to point out. If you try to build a flat system with 100K nodes, you'll generate quite a bit of discovery information. Besides generating some traffic, this will consume memory in your end applications. If you use Vortex Cloud instead, we play tricks to limit the discovery information. To give you an example, if you have a data-writer matching 100K data-readers, with Vortex Cloud the data-writer would match only one end-point, thus reducing the discovery information by a factor of 100K!
Finally, concerning your durability question, you could configure some durability services as alignee-only. In that case they will never become master.
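For example, something along these lines in each remote node's ospl.xml; the attribute names here are recalled from the OpenSplice deployment guide and worth double-checking against it:

    <!-- Sketch only: aligner="false" makes this node's durability
         service alignee-only, so it never becomes master for the
         namespace. -->
    <NameSpaces>
      <Policy nameSpace="defaultNamespace" durability="Durable"
              aligner="false" alignee="Initial"/>
    </NameSpaces>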
HTH.
A+

Getting current transferred MPI network communication volume

I have a question related to MPI.
In order to keep track of the communication volume used by my implementation, I would like to get the amount of data transferred since the MPI process's start, up to the current measurement point.
I checked the specification as well as the mpi.h header of MPICH and did not find a matching function to call, or a variable that keeps track of the network transfer costs. It would, of course, be possible to implement a small traffic registry or define a macro for tracking communication sizes, but maybe it can simply be read out from somewhere.
Do you know a method to obtain the current transfer size? Is it perhaps also possible to get this number using a system call that returns the network traffic of the process?
Is it maybe possible to access the proc information of the current process, i.e. is /proc/net maintained per process as well, e.g. as /proc/self/net?
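To illustrate what I mean by a traffic registry, here is a minimal sketch using the standard PMPI profiling interface; the counter and accessor names are made up for illustration:

    #include <mpi.h>

    /* Illustrative traffic registry: the MPI standard exports every call
     * twice (MPI_* and PMPI_*) precisely so that a redefinition like this
     * can intercept all sends linked into the program. */
    static long long g_bytes_sent = 0;

    int MPI_Send(const void *buf, int count, MPI_Datatype datatype,
                 int dest, int tag, MPI_Comm comm)
    {
        int size;
        PMPI_Type_size(datatype, &size);          /* bytes per element */
        g_bytes_sent += (long long)count * size;
        return PMPI_Send(buf, count, datatype, dest, tag, comm);
    }

    /* Read the registry at any measurement point. */
    long long bytes_sent_so_far(void) { return g_bytes_sent; }

The same wrapper pattern extends to MPI_Isend, MPI_Recv, collectives, and so on; profiling tools such as mpiP do essentially this.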
Thank you in advance,
Martin

Peer-to-peer replication of a sqlite database

I am looking for a way to replicate a small and simple relational database (like SQLite) across peers. This should work in an environment with unstable network connections, hence the need for each peer to have a full copy of the database. This should allow a peer to continue working off-line in the event of network failure.
To keep things simple, replication should only have to support the replication of addition of data, i.e. only INSERTs, not DELETEs or UPDATEs.
Does anyone know of a good, and ideally cross-platform, technology or method of creating such a system? I am currently looking at JXTA and JXSE, but I am put off by their complexity and the apparent lack of life in the community after the takeover of Sun by Oracle.
Thanks!
Frans
rqlite uses the Raft consensus algorithm, so it should be fairly resilient to unstable network connections.
Also, it seems to be possible to configure rqlite to accept reads even in the case of a network failure.
A similar project, dqlite, exists as a library, available in various languages, but it seems less explicit about its behaviour in the event of a network failure.
You may want to explore JGroups for the communication layer if you don't like JXTA. For the replication, I think you will have to implement your own code.
I am working on something similar (though the code is far from ready). I'll describe a little about my intended approach, but whether that is suitable for you depends on some key design points you'd need to consider. I am not aware of any ready-built projects that will do this, unfortunately.
In particular we'd need to know what language you wish to use, or which languages you'd rather avoid.
Also, consider how you intend to do peer discovery: can you set up trust between node pairs manually, or do you want them to auto-discover?
Presumably all peers may insert data?
If you are able to use PHP, and are happy manually peering node pairs, then my approach may be of interest. Set up an ORM such as Doctrine, Propel or NotORM, and have each node regularly sync with an internet time source. For each new row in the database, grab the data (either as an array or an ORM object), serialise it, and push it out to all nodes you have a trust relationship with. Where a push fails, keep a note of this and retry at periodic intervals (potentially giving up after a remote node fails to answer a large number of retries).
Pushes can either be kicked off by your application that creates the row, or can be called by whatever scheduler is available on each machine. A push message can be XML, or for simplicity can be just a POST message containing the new row and whatever metadata (e.g. timestamp of save, so as to resolve INSERT order from several nodes).
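To make the flow concrete, here is a rough sketch of that push loop. The answer suggests PHP, but it is shown below in C with sqlite3 and libcurl just for illustration, and the table and column names (events, pushed) are invented:

    #include <sqlite3.h>
    #include <curl/curl.h>
    #include <stdio.h>

    /* Push every not-yet-replicated row to one peer. A `pushed` flag
     * column stands in for the retry bookkeeping described above.
     * (Call curl_global_init() once at program startup.) */
    static void push_new_rows(sqlite3 *db, const char *peer_url)
    {
        sqlite3_stmt *st;
        sqlite3_prepare_v2(db,
            "SELECT id, payload FROM events WHERE pushed = 0",
            -1, &st, NULL);

        while (sqlite3_step(st) == SQLITE_ROW) {
            char body[512];
            snprintf(body, sizeof body, "id=%lld&payload=%s",
                     (long long)sqlite3_column_int64(st, 0),
                     (const char *)sqlite3_column_text(st, 1));

            CURL *c = curl_easy_init();
            curl_easy_setopt(c, CURLOPT_URL, peer_url);
            curl_easy_setopt(c, CURLOPT_POSTFIELDS, body);
            if (curl_easy_perform(c) == CURLE_OK) {
                /* Only mark the row replicated on success; failed rows
                 * are retried on the next periodic run. */
                sqlite3_stmt *up;
                sqlite3_prepare_v2(db,
                    "UPDATE events SET pushed = 1 WHERE id = ?",
                    -1, &up, NULL);
                sqlite3_bind_int64(up, 1, sqlite3_column_int64(st, 0));
                sqlite3_step(up);
                sqlite3_finalize(up);
            }
            curl_easy_cleanup(c);
        }
        sqlite3_finalize(st);
    }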
If your nodes do not have static IP addresses, they could be registered with a dynamic DNS addressing service so as to allow each node to stay in touch with peers even if their IP changes. You might also consider adding a message signing system, to ensure that messages between nodes are genuine.

Address Inquiry

Okay, so let's say I have an integer.
When I execute the program, that integer gets an address.
Makes sense.
But there are many programs out there. Let's see: when creating a game hack, say for Minesweeper, I find the address where that data is stored and change it.
But... that hack, that simple hack which just changes some address... works on every computer, every time.
The point is, that data gets the same address every time.
And on my computer, there are about 30 executables running right now.
Don't other programs want that address? What if they want that address? Why does that hack work every time? Why don't other programs want that very same address? How is it working every time?
Every application gets its own virtual address space (4 GB on 32-bit machines) to overcome that problem in a multitasking operating system.
Here is a pretty good article covering the subject.
Your "hack" is probably locating a process using something like OpenProcess and editing the memory using WriteProcessMemory. That's why it works on "all" machines.
Basically, you need to read about virtual memory. The purpose of virtual memory is to abstract away the physical address space, and give each process (i.e. each application) its own "virtual" address space, which avoids the problem that you're describing.
If your Minesweeper hack consists of manipulating data stored at a specified static address, there's no way it would work on every computer. Program memory allocation is OS-dependent.

Virtual address

Suppose I'm starting two instances of the same program. Will the text region of both programs have same virtual addresses?
Depends. On most systems, if you run the same program twice in the same environment (same parameters, etc.), you'll find the same address mapping. This is simply because most of what the process does is deterministic, dependent only on the environment, command-line parameters, and the contents of files read, but not on changing data such as the date or the process ID. This is very useful when debugging: if you restart your program, sometimes even after a small code change and recompilation, there's a chance that the memory layout has remained the same. Of course, different instances of the program running concurrently may have the same virtual addresses, but they won't have the same physical addresses.
Some systems, such as OpenBSD, or Linux with various hardening settings, implement address space layout randomization (ASLR). ASLR means that each time a process starts, the virtual addresses of its code, data, stack(s) and heap(s) are determined at random. This is a security feature, designed to make exploiting security vulnerabilities harder: the exploit code can't just access known code at known addresses. However, as ASLR becomes more popular, exploits also become more sophisticated to work around it. ASLR remains useful because it increases the workload for the exploit writer without adding a lot of complexity.
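An easy way to observe this yourself on Linux: compile the snippet below and run it twice. With ASLR the printed addresses change between runs; without it they usually repeat.

    #include <stdio.h>

    int main(void)
    {
        int local = 0;
        /* Compare two runs: identical output suggests a deterministic
           layout; differing output means ASLR is randomizing it. */
        printf("code:  %p\n", (void *)main);
        printf("stack: %p\n", (void *)&local);
        return 0;
    }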
Probably not, but it's possible that they could. Each process has its own independent memory space.
