Is it possible to create API bindings for GJS in other languages? - gjs

What I mean specifically is if it's possible to have some code written, say in C (or some other compiled language), and then expose it and use from within a GJS runtime.

In fact, this is how all of GJS works. Just as node.js is ECMAScript on top of node's own platform, GJS was created so that ECMAScript could be used with the GNOME platform libraries.
This is effectively limited to C libraries written with GObject, but of course anything you can use from C can be wrapped into a GObject-based library. There are Boxed Types for integrating foreign structures into the GLib type system, or you can wrap things into the structure of a GObject subclass.
The principle is pretty straightforward and relies on GObject-Introspection annotations to express function signatures, memory ownership and so on. Below is a simple example:
/**
 * namespace_copy_string:
 * @input: (transfer none): an input string
 *
 * This function copies a string and returns the result.
 *
 * Returns: (transfer full): a new string
 */
char *
namespace_copy_string (const char *input)
{
  return g_strdup (input);
}
The headers and source are then scanned for public symbols with these annotations, which are used to generate an XML-format GIR file and a compiled typelib. meson is the recommended build system for GObject-based libraries and includes helpers for generating the introspection data. You can also use gi-docgen to easily generate documentation from this output.
Once installed, the result can be imported into any language binding that supports GObject-Introspection (GJS, Python, etc):
const Namespace = imports.gi.Namespace;
let copy = Namespace.copy_string("content");


Is there a way to specify the base address of a shared library using dlopen()?

It seems that when we dlopen() some libraries, they will be loaded into some preferred (but not fixed) addresses. I've checked the source code of dlopen(), and a core function says
static __always_inline const char *
_dl_map_segments (struct link_map *l, int fd,
                  const ElfW(Ehdr) *header, int type,
                  const struct loadcmd loadcmds[], size_t nloadcmds,
                  const size_t maplength, bool has_holes,
                  struct link_map *loader)
{
  const struct loadcmd *c = loadcmds;
  if (__glibc_likely (type == ET_DYN))
    {
      /* This is a position-independent shared object.  We can let the
         kernel map it anywhere it likes, but we must have space for all
         the segments in their specified positions relative to the first.
         So we map the first segment without MAP_FIXED, but with its
         extent increased to cover all the segments.  Then we remove
         access from excess portion, and there is known sufficient space
         there to remap from the later segments.
         As a refinement, sometimes we have an address that we would
         prefer to map such objects at; but this is only a preference,
         the OS can do whatever it likes. */
      ElfW(Addr) mappref
        = (ELF_PREFERRED_ADDRESS (loader, maplength,
                                  c->mapstart & GLRO(dl_use_load_bias))
           - MAP_BASE_ADDR (l));
      /* Remember which part of the address space this object uses.  */
      l->l_map_start = (ElfW(Addr)) __mmap ((void *) mappref, maplength,
                                            c->prot,
                                            MAP_COPY|MAP_FILE,
                                            fd, c->mapoff);
      if (__glibc_unlikely ((void *) l->l_map_start == MAP_FAILED))
        return DL_MAP_SEGMENTS_ERROR_MAP_SEGMENT;
      ...
}
The comment says you can specify a preferred address, but the OS decides whether to honor it.
Question
Is there any way we can specify the base address for each dlopened module?
ELF_PREFERRED_ADDRESS is set to 0 by default, but this macro seems to imply that the preferred address can be changed, say by an environment variable? Even if there is one, I doubt that it can be changed for each dlopened library.
If I want to implement this feature myself, it seems that I need to wrap a new dlopen function and pass the preferred address to the above core function (and use MAP_FIXED maybe). Is it correct?
Thanks!
Is there any way we can specify the base address for each dlopened module?
No.
ELF_PREFERRED_ADDRESS is set to 0 by default, but this macro seems to imply that the preferred address can be changed, say by an environment variable? Even if there is one, I doubt that it can be changed for each dlopened library.
This code is compiled into the dynamic loader ld-linux.so and cannot be changed after compilation.
If I want to implement this feature myself, it seems that I need to wrap a new dlopen function and pass the preferred address to the above core function (and use MAP_FIXED maybe). Is it correct?
The function is private to ld-linux. You will not be able to wrap it, or call it from outside of ld-linux.
P.S. What you are likely looking for is the prelink command.
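Although the loader picks the base address, you can at least inspect where each object ended up. The following is a minimal sketch, assuming Linux with glibc (dl_iterate_phdr is a GNU extension); the names print_object and list_loaded_objects are made up for this illustration and are not part of any library:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <link.h>
#include <stdio.h>

/* Callback for dl_iterate_phdr: prints the load bias the loader chose
   for one object and counts it. */
static int print_object(struct dl_phdr_info *info, size_t size, void *data)
{
    (void) size;
    assert(data != NULL);
    int *count = data;
    const char *name = (info->dlpi_name != NULL && info->dlpi_name[0] != '\0')
                           ? info->dlpi_name
                           : "(unnamed: main program or vDSO)";
    printf("0x%lx  %s\n", (unsigned long) info->dlpi_addr, name);
    ++*count;
    return 0; /* zero means: keep iterating */
}

/* Hypothetical helper: lists every loaded object and returns how many
   were visited. */
int list_loaded_objects(void)
{
    int count = 0;
    dl_iterate_phdr(print_object, &count);
    return count;
}
```

Calling list_loaded_objects() before and after a dlopen() shows the address the new module received. It only reports the loader's choice; it does not influence it.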

When, where and why use namespace when registering custom types for Qt

Similar questions have been raised multiple times, but I'm focussing on the namespace and pointer issues.
MyClass.h
namespace foo {
class MyClass {
public:
    MyClass();
};
QDataStream &operator<<(QDataStream &out, const MyClass &myObj);
QDataStream &operator>>(QDataStream &in, MyClass &myObj);
} // namespace foo
Q_DECLARE_METATYPE(foo::MyClass)  // #1
Q_DECLARE_METATYPE(foo::MyClass*) // #2
fooMyClass.cpp (so many permutations):
MyClass::MyClass()
{
    qRegisterMetaType<MyClass>("MyClass");             // #3
    qRegisterMetaType<MyClass*>("MyClass*");           // #4
    qRegisterMetaType<MyClass>("foo::MyClass");        // #5
    qRegisterMetaType<MyClass*>("foo::MyClass*");      // #6
    qRegisterMetaType<foo::MyClass>("foo::MyClass");   // #7
    qRegisterMetaType<foo::MyClass*>("foo::MyClass*"); // #8
    qRegisterMetaType<MyClass>();                      // #9
    qRegisterMetaType<MyClass*>();                     // #10
    qRegisterMetaType<foo::MyClass>();                 // #11
    qRegisterMetaType<foo::MyClass*>();                // #12
    // same for qRegisterMetaTypeStreamOperators<T>();
}
So my question is, when and why is it required to provide the namespace and/or the pointer variant if I intend to use the custom objects for signals and slots (potentially as reference and pointer) inside as well as outside the namespace. Do I always have to fully qualify the namespace?
I'm referring to Qt5 in this answer. Qt4 doesn't go well with this use case.
Data stream operators
Data stream operators are not required for your type if you only intend to use it in signals and slots. They are required if you want to do some serialization.
Pointers, references and values
Qt considers MyClass and MyClass* two different, unrelated types. You should declare, register and use them separately. A const MyClass & argument type is compatible with MyClass in the Qt meta-object system. Note that using the MyClass and MyClass* meta types simultaneously in one program is unusual and can cause mistakes and confusion. You should choose one of the options and use it throughout the program. Passing pointers to slots is also not recommended because it causes an unsolvable ownership problem. So I recommend passing by const reference (which the Qt signal-slot system will sometimes convert internally to passing by value). If MyClass objects contain massive data, you should implement implicit data sharing using QSharedDataPointer.
Declaring a meta type
First of all, you always need to declare your meta type:
Q_DECLARE_METATYPE(foo::MyClass)
It works at compile time, so there are no limitations on how you refer to your class. The following code will work as well:
using namespace foo;
Q_DECLARE_METATYPE(MyClass)
Registering a meta type
Now you need to register your classes. Theoretically, you need to specify all strings that you want to use to refer to your type, i.e.:
qRegisterMetaType<foo::MyClass>("MyClass");
qRegisterMetaType<foo::MyClass>("foo::MyClass");
It doesn't matter how you refer to MyClass in the template argument. The following code will work similarly:
using namespace foo;
qRegisterMetaType<MyClass>("MyClass");
qRegisterMetaType<MyClass>("foo::MyClass");
For example, the "MyClass" and "foo::MyClass" strings are used to identify argument types when you refer to your signals and slots like SIGNAL(signal1(MyClass)).
New signal and slot syntax
If you are using the new signal-slot syntax with pointers to member functions, you need only one registration, with an arbitrary string argument. It seems that this is intended to work even without any registration: one part of the docs says to add only Q_DECLARE_METATYPE, whereas another requires qRegisterMetaType(). Unfortunately, in my Qt installation it currently works only with direct connections. Queued connections still require at least one registration call.
Implicit registration of class without namespace
I was experimenting with some variants of registration in Qt 5.1 and found out that Qt automatically registers aliases without namespace. So if you write
qRegisterMetaType<foo::MyClass>("foo::MyClass");
then Qt will additionally register the "MyClass" alias automatically. So, after executing this statement you will be able to refer to your type as both MyClass and foo::MyClass. The documentation says nothing about how Qt handles namespaces. We could assume that this behavior is intended and will not be removed in later versions, but I wouldn't rely on that. The following code makes the implicit registration obvious:
qRegisterMetaType<foo::MyClass>("foo::MyClass");
qRegisterMetaType<bar::MyClass>("MyClass");
Qt 5.1 says:
QMetaType::registerTypedef: Binary compatibility break -- Type name 'MyClass' previously registered as typedef of 'MyClass' [1030], now registering as typedef of 'bar::MyClass' [1032].
Qt 4.8 works without error (it seems that this behavior had not yet been introduced in that version).

Go Programming - bypassing access privileges using pointers

Let's say I have the following hierarchy for my project:
fragment/fragment.go
main.go
And in the fragment.go I have the following code, with one getter and no setter:
package fragment

type Fragment struct {
	number int64 // private variable - lower case
}

func (f *Fragment) GetNumber() *int64 {
	return &f.number
}
And in the main.go I create a Fragment and try to change Fragment.number without a setter:
package main

import (
	"fmt"
	"myproject/fragment"
)

func main() {
	f := new(fragment.Fragment)
	fmt.Println(*f.GetNumber()) // prints 0
	//f.number = 8 // error - number is private
	p := f.GetNumber()
	*p = 4 // works. Now f.number is 4
	fmt.Println(*f.GetNumber()) // prints 4
}
So by using the pointer, I changed the private variable from outside the fragment package. I understand that in C, for example, pointers help to avoid copying large structs/arrays and are supposed to let you change whatever they point to. But I don't quite understand how they are supposed to work with private variables.
So my questions are:
Shouldn't the private variables stay private, no matter how they are accessed?
How is this compared to other languages such as C++/Java? Is it the case there too, that private variables can be changed using pointers outside of the class?
My background: I know a bit of C/C++, am rather fluent in Python and am new to Go. I learn programming as a hobby, so I don't know much about the technical things happening behind the scenes.
You're not bypassing any access privileges. If you acquire a *T from any imported package, then you can always mutate *T, i.e. the pointee as a whole, as in an assignment. The designer of the imported package controls what you can get from the package, so the access control is not yours.
The restriction on the above applies to structured types (structs): the previous still holds, but finer-grained access to a particular field is controlled by the case of the field's name, even when the field is reached through a pointer to the whole structure. The field name must start with an uppercase letter to be visible outside its package.
Wrt C++: the same is possible there. A public member function that returns a pointer or a non-const reference to a private member lets callers modify that member.
Wrt Java: No, Java has no pointers. Not really comparable to pointers in Go (C, C++, ...).
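For comparison with C, which the question mentions, here is a minimal sketch of the same situation using an opaque struct. All names (fragment, fragment_new, fragment_number, fragment_free) are hypothetical, and the header and implementation parts are shown in one file only to keep the sketch self-contained:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* What a public header would expose: an opaque type plus accessors.
   Callers cannot name the field because they never see the layout. */
typedef struct fragment fragment;
fragment *fragment_new(void);
int64_t *fragment_number(fragment *f);
void fragment_free(fragment *f);

/* What the implementation file would keep to itself. */
struct fragment {
    int64_t number;
};

fragment *fragment_new(void)
{
    return calloc(1, sizeof(fragment)); /* number starts at 0 */
}

/* Handing out the address is the library author's decision; once a caller
   holds it, nothing stops a write through it. */
int64_t *fragment_number(fragment *f)
{
    assert(f != NULL);
    return &f->number;
}

void fragment_free(fragment *f)
{
    free(f);
}
```

A caller can write *fragment_number(f) = 4 even though it cannot write f->number directly. As in the Go case, hiding applies to the name, not to the memory behind a pointer the package chose to return.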

Changing function reference in Mach-o binary

I need to change the reference to a function in a Mach-O binary so that it points to a custom function defined in my own dylib. The process I am now following is:
Replacing references to older functions to the new one. e.g _fopen to _mopen using sed.
I open the mach-o binary in MachOView to find the address of the entities I want to change. I then manually change the information in the binary using a hex editor.
Is there a way I can automate this process, i.e. write a program that reads the symbols and dynamic loading info and then changes them in the executable? I was looking at the Mach-O header files in /usr/include/mach-o but am not entirely sure how to use them to get this information. Are there any libraries, in C or Python, that help with this?
interesting question, I am trying to do something similar to static lib; see if this helps
varrunr - you can easily achieve most if not all of the functionality using DYLD's interposition. You create your own library, and declare your interposing functions, like so
// This is the expected interpose structure
typedef struct interpose_s {
    void *new_func;
    void *orig_func;
} interpose_t;

static const interpose_t interposing_functions[] \
    __attribute__ ((section("__DATA, __interpose"))) = {
    { (void *)my_open, (void *) open }
};
... and then you just implement my_open. Inside the interposing function, calls to the original open still reach the real implementation, which makes this ideal for wrappers. You can also force your dylib into a process with DYLD_INSERT_LIBRARIES (same principle as LD_PRELOAD on Linux).

ld linker question: the --whole-archive option

The only real use of the --whole-archive linker option that I have seen is in creating shared libraries from static ones. Recently I came across Makefiles which always use this option when linking with in-house static libraries. This of course causes the executables to pull in unreferenced object code unnecessarily. My reaction was that this is plain wrong. Am I missing something here?
The second question has to do with something I read regarding the --whole-archive option but couldn't quite parse. It was something to the effect that --whole-archive should be used when linking with a static library if the executable also links with a shared library which in turn contains (in part) the same object code as the static library. That is, the shared library and the static library overlap in terms of object code. Using this option would force all symbols (regardless of use) to be resolved in the executable. This is supposed to avoid object code duplication. This is confusing: if a symbol is referenced in the program it must be resolved uniquely at link time, so what is this business about duplication? (Forgive me if this paragraph is not quite the epitome of clarity.)
Thanks
There are legitimate uses of --whole-archive when linking executable with static libraries. One example is building C++ code, where global instances "register" themselves in their constructors (warning: untested code):
handlers.h
typedef void (*handler)(const char *data);
void register_handler(const char *protocol, handler h);
handler get_handler(const char *protocol);
handlers.cc (part of libhandlers.a)
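#include "handlers.h"
#include <map>
#include <string>
typedef std::map<std::string, handler> HandlerMap;
// Function-local static: built on first use, so it already exists when a
// global object in another file registers itself during static initialization.
static HandlerMap &handlers() {
    static HandlerMap m;
    return m;
}
void register_handler(const char *protocol, handler h) {
    handlers()[protocol] = h; // keyed by string contents, not by pointer value
}
handler get_handler(const char *protocol) {
    HandlerMap::iterator it = handlers().find(protocol);
    if (it == handlers().end()) return nullptr;
    return it->second;
}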
http.cc (part of libhttp.a)
#include "handlers.h"
class HttpHandler {
public:
    HttpHandler() { register_handler("http", &handle_http); }
    static void handle_http(const char *) { /* whatever */ }
};
HttpHandler h; // registers itself before main() runs
main.cc
#include "handlers.h"
int main(int argc, char *argv[])
{
    // Arguments come in pairs: protocol name, then the data to handle.
    for (int i = 1; i < argc-1; i += 2) {
        handler h = get_handler(argv[i]);
        if (h != nullptr) h(argv[i+1]);
    }
}
Note that there are no symbols in http.cc that main.cc needs. If you link this as
g++ main.cc -lhttp -lhandlers
you will not get an http handler linked into the main executable, and will not be able to call handle_http(). Contrast this with what happens when you link as:
g++ main.cc -Wl,--whole-archive -lhttp -Wl,--no-whole-archive -lhandlers
The same "self registration" style is also possible in plain-C, e.g. with the __attribute__((constructor)) GNU extension.
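A minimal sketch of that plain-C variant follows, assuming GCC or Clang (the constructor attribute is a GNU extension). The fixed-size table, MAX_HANDLERS and register_http are choices made for this illustration only, and everything is shown in one file although the last part would normally live in libhttp.a:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef void (*handler)(const char *data);

#define MAX_HANDLERS 8

/* Static storage is zero-initialized before any constructor runs, so the
   table is usable no matter which constructor fires first. */
static struct {
    const char *protocol;
    handler fn;
} table[MAX_HANDLERS];
static size_t table_len;

void register_handler(const char *protocol, handler fn)
{
    assert(table_len < MAX_HANDLERS);
    table[table_len].protocol = protocol;
    table[table_len].fn = fn;
    table_len++;
}

handler get_handler(const char *protocol)
{
    for (size_t i = 0; i < table_len; i++)
        if (strcmp(table[i].protocol, protocol) == 0)
            return table[i].fn;
    return NULL;
}

/* In a real project, everything below would sit in its own object file
   inside libhttp.a. */
static void handle_http(const char *data)
{
    (void) data; /* whatever */
}

/* Runs before main() when this object file is part of the link. */
__attribute__((constructor))
static void register_http(void)
{
    register_handler("http", handle_http);
}
```

The linking caveat is the same as in the C++ version: nothing in main references the object file that contains register_http, so when it comes from a static library it is only pulled in with --whole-archive or an explicit reference.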
Another legitimate use for --whole-archive is for toolkit developers to distribute libraries containing multiple features in a single static library. In this case, the provider has no idea what parts of the library will be used by the consumer and therefore must include everything.
An additional good scenario in which --whole-archive is well-used is when dealing with static libraries and incremental linking.
Let us suppose that:
libA implements the a() and b() functions.
Some portion of the program has to be linked against libA only, e.g. due to some function wrapping using --wrap (a classical example is malloc)
libC implements the c() functions and uses a()
the final program uses a() and c()
Incremental linking steps could be:
ld -r -o step1.o module1.o --wrap malloc --whole-archive -lA
ld -r -o step2.o step1.o module2.o --whole-archive -lC
cc step2.o module3.o -o program
Failing to insert --whole-archive would drop function c(), which the program nevertheless uses, so the final link would fail.
Of course, this is a particular corner case in which incremental linking must be done to avoid wrapping all calls to malloc in all modules, but is a case which is successfully supported by --whole-archive.
I agree that using --whole-archive to build executables is probably not what you want (due to linking in unneeded code and creating bloated software). If they had a good reason to do so, they should have documented it in the build system, as now you are left guessing.
As to the second part of your question: if an executable links both a static library and a dynamic library that has (in part) the same object code as the static library, then --whole-archive will ensure that at link time the code from the static library is preferred. This is usually what you want when you do static linking.
Old query, but on your first question ("Why"), I've seen --whole-archive used for in-house libraries as well, primarily to sidestep circular references between those libraries. It tends to hide poor architecture of the libraries, so I'd not recommend it. However it's a fast way of getting a quick trial working.
For your second query, if the same symbol was present in a shared object and a static library, the linker will satisfy the reference with whichever library it meets first.
If the shared library and static library have an exact sharing of code, this may all just work. But where the shared library and the static library have different implementations of the same symbols, your program will still compile but will behave differently based on the order of libraries.
Forcing all symbols to be loaded from the static library is one way of removing confusion as to what is loaded from where. But in general this sounds like solving the wrong problem; you mostly won't want the same symbols in different libraries.
