understand 'struct proto' and 'struct proto_ops' in the kernel - networking

I'm studying the tcp/ip implementation, specifically sockets layer, and there's something I don't quite understand about a few structure.
I know that 'struct proto_ops' is used to define the operations, e.g. bind/connect/accept, and every socket has a specified proto_ops.
On the other hand, 'struct proto' defines new protocol and the structure also defines function pointers for accept/bind/setsockopt/getsockopt/etc. methods.
I read lots of code in $linux/net/ and I don't see where does it use the operations defined via 'struct proto', so I'm not sure how these methods are being used in the code?
Could someone clarify this for me?
Thanks.

I think the question is the most headache problem when a newbie try to create a new protocol
Explain:
Both structures have member elements with similar names although they represent different functions
struct prot_ops: used for communication between socket layer and transport layer
struct prot: used for communicate with system calls
Example:
when you call a system call in userspace, ex connect(), prot_ops_connect() will be call first.
In fucntion prot_ops_connect(), we need to call sk->sk_prot->connect()
And sk->sk_prot->connect() will call proto_connect() automatically
Hope this help

You can image like this.There are three layer:
BSD sock->inet sock->tcp/udp sock
corresponding ops:
BSD api->proto_ops->proto
If you read sys_socket() and sys_read(), you will get the same answers.
Hopefully, this can help you:-)

Related

Why should I use a pointer ( performance)?

I'm wondering if there is any perf benchmark on raw objects vs pointers to objects.
I'm aware that it doesn't make sense to use pointers on reference types (e.g. maps) so please don't mention it.
I'm aware that you "must" use pointers if the data needs to be updated so please don't mention it.
Most of the answers/ docs that I've found basically rephrase the guidelines from the official documentation:
... If the receiver is large, a big struct for instance, it will be much cheaper to use a pointer receiver.
My question is simply what means "large" / "big"? Is a pointer on a string overkill ? what about a struct with two strings, what about a struct 3 string fields??
I think we deal with this use case quite often so it's a fair question to ask. Some advise to don't mind the performance issue but maybe some people want to use the right notation whenever they have to chance even if the performance gain is not signifiant. After all a pointer is not that expensive (i.e. one additional keystroke).
An example where it doesn't make sense to use a pointer is for reference types (slices, maps, and channels)
As mentioned in this thread:
The concept of a reference just means something that serves the purpose of referring you to something. It's not magical.
A pointer is a simple reference that tells you where to look.
A slice tells you where to start looking and how far.
Maps and channels also just tell you where to look, but the data they reference and the operations they support on it are more complex.
The point is that all the actually data is stored indirectly and all you're holding is information on how to access it.
As a result, in many cases you don't need to add another layer of indirection, unless you want a double indirection for some reason.
As twotwotwo details in "Pointers vs. values in parameters and return values", strings, interface values, and function values are also implemented with pointers.
As a consequence, you would rarely need a to use a pointer on those objects.
To quote the official golang documentation
...the consideration of efficiency. If the receiver is large, a big struct for instance, it will be much cheaper to use a pointer receiver.
It's very hard to give you exact conditions since there can be different performance goals. As a rule of thumb, by default, all objects larger than 128 bits should be passed by pointer. Possible exceptions of the rule:
you are writing latency sensitive server, so you want to minimise garbage collection pressure. In order to achieve that your Request struct has byte[8] field instead of pointer to Data struct which holds byte[8]. One allocation instead of two.
algorithm you are writing is more readable when you pass the struct and make a copy
etc.

Ada: pragma Pure / Remote_Types and system types

I'm writing an Ada application that needs to be distributed, and I'm trying to use the DSA to do it, but I'm finding big limitations in what is "allowed" to be "withed" and what isn't.
I won't post sourcecode, since it's quite complex and this is a generic question anyway, I just wanted some pointers on what I'm not understanding correctly, so please bear with me and correct me if I'm wrong.
So my problem is this: I want to mark a procedure with the pragma Remote_Call_Interface so it can be called remotely. However as soon as I add the pragma compilation breaks due to the fact that the procedure is including other packages in my project that are not categorized as either Pure or Remote_Types.
So I try to mark the packages I need as either Pure or Remote_Types (dpeending wether they have state or not) but this in turn breaks compilation even further, since it turns out that you can't use even basic system types in a Pure/Remote_Types package, for example: you can't use Vectors, you can't use Unbounded_Strings, you can't use Maps, etc... the whole program falls to pieces since I can't use the data structures I used to build it anymore!
Is there a way around this? Or if I want to distribute my application I must strictly limit myself to the most basic types like Integers and booleans and little else?? I don't understand if I'm hitting against a limitation of the language or if I'm just doing it incorrectly (unfortunately the tutorials I found on DSA are all very vague, incidentally if anyone has some good ones feel free to link them!)
EDIT: after ajb's answer let me specify what is annoying me in particular: in the package I want to mark with pragma Remote_Call_Interface I'm trying to "with" some packages that are not pure/remote_types, however it only uses the types in those packages locally, it does not contain any procedures that accept such types as parameters, nor functions that return such types. This is what bothers me: since those types would not have to "travel" over the network, why can't I with them? I'm only using them locally... I don't understand this, and that is why I was trying to make those types Pure/Remote_Types, but now that I've read ajb's explanation (ie: Remote_Types is used so that objects of those types can travel over the network) I'm even more confused about why I can't use them if I only use them locally.
I'm not an expert on Ada distributed programming, but here's what I do know (or think I know):
The Annotated Ada Reference Manual, Section E.2.3 says, "The restrictions governing a remote call interface library unit are intended to ensure that the values of the actual parameters in a remote call can be meaningfully sent between two active partitions." For example, if a record type has a field that's an access type, you can't send it from one partition to another blindly, because the called partition won't be able to access the memory that the pointer points to. (Unbounded_String, Map, and Vector are implemented using access types as part of the internals.) All types used as parameters or return types must support "external streaming", meaning there has to be a way for the type to be converted to and from a stream of bytes so that the parameter value can be transmitted over a socket. If you have a record with an access type, but you provide 'Read and 'Write attributes so that the type can be written to and read from a byte stream without any actual pointers being transmitted, then you can put your record type in a Remote_Types package.
I'm not sure exactly what your problem is: are there certain types you want to pass as a parameter to a remote call but can't; or are there types that you want to use only in the rest of your application, but are getting in the way?
If it's the second one, then I think the solution is to restructure your packages so that all the "remote types" are separate from the non-remote types.
However, if you're really looking to pass an Unbounded_String, Map, or Vector from one partition to another in a remote call, it's trickier. Unbounded_String really should support external streaming, and there was a proposal to make Unbounded_String a Remote_Types package (see AI05-0204), but it wasn't acted on--I don't know why. Map and Vector would be bigger problems, though, since they are generic packages that have to work on any type, including those that don't support external streaming. In any case, those types aren't set up to be automatically converted to or from bytes to be passed over a socket.
But I think you could make it work like this:
private with Ada.Strings.Unbounded;
package Remote_Types_Package is
pragma Remote_Types;
type My_Unbounded_String is private;
private
type My_Unbounded_String is record
S : Ada.Strings.Unbounded.Unbounded_String;
end record;
end Remote_Types_Package;
The Unbounded_String package must be withed with private with; see E.2.2(6). You'll need to provide a function to create the My_Unbounded_String, and you'll need to provide stream read and write routines for My_Unbounded_String, and define 'Read and 'Write for the type. You should be able to write the Read and Write attributes by using the Read and Write attributes for the Unbounded_String. Something similar should be doable if you want to use a Vector as a remote call parameter, although you may have to do more work to marshal/unmarshal the type yourself.
Once again, I have not tried this, and it's possible there are some hitches in this solution.
EDIT: Since it now looks like the question is the simpler one--i.e. you have some types that are not going to be passed between partitions getting in the way--the solution should be simpler. Any types that you define that are going to be communicated between partitions need to be in a Remote_Types package, say P1. Other types should be in a different package, say P2 (or multiple packages). If types in P1 depend on types in P2, you can still get this to work by having P1 say private with P2;, and making sure you have the marshalling and unmarshalling procedures you need. If you run into difficulties, I'd encourage you to ask a new question here.
I don't know why the language required all such types to be quarantined in a Remote_Types package, instead of just saying that any type used in a Remote_Call_Interface package has to have only parts that can be streamed. There may have been some implementation issues. Any code that exists for a Remote_Types package has to be in programs in both partitions, perhaps, and this may have been an attempt to limit the type of code that would have to be linked into multiple partitions. But I'm just guessing.

MPI: Local and non-local calls

I want to know what exactly are the two, and how they are different. What are the advantage or disadvantage of the two types of calls? Really appreciate using some small example code.
These are detailed, e.g., in here: http://www.netlib.org/utk/papers/mpi-book/node14.html
It's not really proper to talk about advantages of either. They are for different purposes. An example of a local call MPI_Comm_rank, since it doesn't need input from other processes and an example of a non-local call is MPI_Send, since it will have to communicate with some other process that will recieve the message.

what are exactly MPI, MPICH, and OPENMPI? what does "implementation" mean in this context?

My question might seem silly to those who have been in the field for long time, but I appreciate your patience in elaborating it for me.
When they say MPICH is an "implementation" of MPI, what does it mean?
Is the following analogy true(?):
if we think of MPI as a set of standards for a FORTRAN compiler, then MPICH, and OPENMPI are different versions of FORTRAN compilers, like Intel.Fortran, Compaq.Fortran, GNU.Fortran, and so on.
MPI is a standard: it outlines a particular model for message passing in a distributed system. However, it only gives a series of requirements: it does not actually include any code, nor does it specify how exactly these requirements need to be fulfilled. For example, take a look at this excerpt from the official MPI 2.2 spec (as of today):
A valid MPI implementation guarantees certain general properties of
point-to-point communication, which are described in this section.
Order Messages are non-overtaking: If a sender sends two messages in succession to the same destination, and both match the same
receive, then this operation cannot receive the second message if the
first one is still pending.
It then goes on to explain the rationale behind this requirement and provide an example, but says nothing more about the requirement itself.
An MPI implementation is a library that fulfills every requirement - like the one above - in the MPI specification. However, the standard contains absolutely no requirements as to what language constructs, OS calls, 3rd party libraries, etc can/can't/should be used. Occasionally, it will give advice to implementors, like this:
Advice to implementors. The implementation may keep a reference count
of active communications that use the datatype, in order to decide
when to free it. Also, one may implement constructors of derived
datatypes so that they keep pointers to their datatype arguments,
rather then copying them. In this case, one needs to keep track of
active datatype definition references in order to know when a datatype
object can be freed. (End of advice to implementors.)
however, these are still vague, very language-agnostic, and only recommendations: an implementation can ignore every single one of these advices, and still conform to the standard.
So yes, in essence it's similar to various implementations of a compiler. If a program takes valid source code for a language, and produces binary code that does everything that the language specification says it should do given the original source code, it's a conforming compiler for that language. Similarly, if you can use a library to pass messages in a way that doesn't break any rules of the MPI spec, then that's a valid MPI implementation.

Use-cases for reflection

Recently I was talking to a co-worker about C++ and lamented that there was no way to take a string with the name of a class field and extract the field with that name; in other words, it lacks reflection. He gave me a baffled look and asked when anyone would ever need to do such a thing.
Off the top of my head I didn't have a good answer for him, other than "hey, I need to do it right now". So I sat down and came up with a list of some of the things I've actually done with reflection in various languages. Unfortunately, most of my examples come from my web programming in Python, and I was hoping that the people here would have more examples. Here's the list I came up with:
Given a config file with lines like
x = "Hello World!"
y = 5.0
dynamically set the fields of some config object equal to the values in that file. (This was what I wished I could do in C++, but actually couldn't do.)
When sorting a list of objects, sort based on an arbitrary attribute given that attribute's name from a config file or web request.
When writing software that uses a network protocol, reflection lets you call methods based on string values from that protocol. For example, I wrote an IRC bot that would translate
!some_command arg1 arg2
into a method call actions.some_command(arg1, arg2) and print whatever that function returned back to the IRC channel.
When using Python's __getattr__ function (which is sort of like method_missing in Ruby/Smalltalk) I was working with a class with a whole lot of statistics, such as late_total. For every statistic, I wanted to be able to add _percent to get that statistic as a percentage of the total things I was counting (for example, stats.late_total_percent). Reflection made this very easy.
So can anyone here give any examples from their own programming experiences of times when reflection has been helpful? The next time a co-worker asks me why I'd "ever want to do something like that" I'd like to be more prepared.
I can list following usage for reflection:
Late binding
Security (introspect code for security reasons)
Code analysis
Dynamic typing (duck typing is not possible without reflection)
Metaprogramming
Some real-world usages of reflection from my personal experience:
Developed plugin system based on reflection
Used aspect-oriented programming model
Performed static code analysis
Used various Dependency Injection frameworks
...
Reflection is good thing :)
I've used reflection to get current method information for exceptions, logging, etc.
string src = MethodInfo.GetCurrentMethod().ToString();
string msg = "Big Mistake";
Exception newEx = new Exception(msg, ex);
newEx.Source = src;
instead of
string src = "MyMethod";
string msg = "Big MistakeA";
Exception newEx = new Exception(msg, ex);
newEx.Source = src;
It's just easier for copy/paste inheritance and code generation.
I'm in a situation now where I have a stream of XML coming in over the wire and I need to instantiate an Entity object that will populate itself from elements in the stream. It's easier to use reflection to figure out which Entity object can handle which XML element than to write a gigantic, maintenance-nightmare conditional statement. There's clearly a dependency between the XML schema and how I structure and name my objects, but I control both so it's not a big problem.
There are lot's of times you want to dynamically instantiate and work with objects where the type isn't known until runtime. For example with OR-mappers or in a plugin architecture. Mocking frameworks use it, if you want to write a logging-library and dynamically want to examine type and properties of exceptions.
If I think a bit longer I can probably come up with more examples.
I find reflection very useful if the input data (like xml) has a complex structure which is easily mapped to object-instances or i need some kind of "is a" relationship between the instances.
As reflection is relatively easy in java, I sometimes use it for simple data (key-value maps) where I have a small fixed set of keys. One one hand it's simple to determine if a key is valid (if the class has a setter setKey(String data)), on the other hand i can change the type of the (textual) input data and hide the transformation (e.g simple cast to int in getKey()), so the rest of the application can rely on correctly typed data.
If the type of some key-value-pair changes for one object (e.g. form int to float), i only have to change it in the data-object and its users but don't have to keep in mind to check the parser too. This might not be a sensible approach, if performance is an issue...
Writing dispatchers. Twisted uses python's reflective capabilities to dispatch XML-RPC and SOAP calls. RMI uses Java's reflection api for dispatch.
Command line parsing. Building up a config object based on the command line parameters that are passed in.
When writing unit tests, it can be helpful to use reflection, though mostly I've used this to bypass access modifiers (Java).
I've used reflection in C# when there was some internal or private method in the framework or a third party library that I wanted to access.
(Disclaimer: It's not necessarily a best-practice because private and internal methods may be changed in later versions. But it worked for what I needed.)
Well, in statically-typed languages, you'd want to use reflection any time you need to do something "dynamic". It comes in handy for tooling purposes (scanning the members of an object). In Java it's used in JMX and dynamic proxies quite a bit. And there are tons of one-off cases where it's really the only way to go (pretty much anytime you need to do something the compiler won't let you do).
I generally use reflection for debugging. Reflection can more easily and more accurately display the objects within the system than an assortment of print statements. In many languages that have first-class functions, you can even invoke the functions of the object without writing special code.
There is, however, a way to do what you want(ed). Use a hashtable. Store the fields keyed against the field name.
If you really wanted to, you could then create standard Get/Set functions, or create macros that do it on the fly. #define GetX() Get("X") sort of thing.
You could even implement your own imperfect reflection that way.
For the advanced user, if you can compile the code, it may be possible to enable debug output generation and use that to perform reflection.

Resources