Scheduler without OS - microcontroller

Do you know if there is any open source task scheduler without the OS support?
Basically, we are looking for a lean scheduler that can schedule and preempt tasks on our TI AM335x-based boards, which don't have any RTOS running on them.

There are no "portable" schedulers at that level, because the context switch functions are hardware-dependent. Therefore, you need a scheduler for your specific hardware (e.g., provided by TI in their development environments) or a very minimal RTOS.
There are RTOSs that are very minimal (a few KB of footprint) and can contain only the scheduler. Have a look for example at http://erika.tuxfamily.org. However, I'm afraid that your specific microcontroller is not supported.
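To make that split concrete, here is a rough sketch (not taken from any particular RTOS) of where the hardware dependency sits: the round-robin task selection below is plain, portable C, while context_switch() is the part that has to be written in assembly for your CPU, on the AM335x that is a Cortex-A8 (save the outgoing task's callee-saved registers and stack pointer, restore the incoming one's).

    /* Minimal round-robin scheduler sketch, for illustration only.
     * The task-selection logic is portable C; context_switch() is the
     * hardware-dependent piece and must be written in assembly for the
     * target core (Cortex-A8 on the AM335x). */

    #include <stdint.h>
    #include <stddef.h>

    #define MAX_TASKS 8

    typedef struct {
        void *sp;       /* saved stack pointer, maintained by context_switch() */
        int   ready;    /* 1 if the task may run */
    } tcb_t;

    static tcb_t  tasks[MAX_TASKS];
    static size_t current;

    /* Hardware-dependent: save old task's registers/SP, restore new task's. */
    extern void context_switch(void **save_sp, void *load_sp);

    void schedule(void)            /* called from a timer tick or a yield() */
    {
        size_t next = current;
        do {
            next = (next + 1) % MAX_TASKS;
        } while (!tasks[next].ready && next != current);

        if (next != current) {
            size_t prev = current;
            current = next;
            context_switch(&tasks[prev].sp, tasks[next].sp);
        }
    }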

Most simple RTOS kernels provide at least scheduling, synchronisation, and IPC. However, since they are provided as static libraries, what you don't use will not be included in your product. That said, I find it difficult to think of a system where synchronisation, IPC, and timer services would not be required or at least beneficial.
There are numerous such RTOS libraries including Segger embOS for example.

The only portable, hardware-independent solution I know of is Protothreads by Adam Dunkels. It is not true multitasking, just nice syntactic sugar over concurrent state machines, but it may help you in your task.
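To give a flavour of it, a protothread is an ordinary C function driven from a plain main loop. A minimal sketch using Dunkels' pt.h macros; uart_rx_ready() and handle_byte() are placeholders for your own code:

    #include "pt.h"

    extern int  uart_rx_ready(void);   /* placeholder: data available? */
    extern void handle_byte(void);     /* placeholder: consume one byte */

    static struct pt uart_pt;

    static PT_THREAD(uart_task(struct pt *pt))
    {
        PT_BEGIN(pt);
        while (1) {
            /* "Blocks" by returning; the local continuation stored in *pt
             * resumes execution at this point on the next call. */
            PT_WAIT_UNTIL(pt, uart_rx_ready());
            handle_byte();
        }
        PT_END(pt);
    }

    int main(void)
    {
        PT_INIT(&uart_pt);
        for (;;) {
            uart_task(&uart_pt);   /* cooperative "scheduler": call each task in turn */
        }
    }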

Related

How to customize BlueZ?

I will be asking a fairly subjective question, but it is important, as I am looking to recover from my failure to use BlueZ programmatically.
Basically I envision an IoT edge device that runs on a miniature computer (e.g. a Raspberry Pi or Intel Compute Stick). The device would run Alpine Linux and interact with the cloud.
Since this is an IoT environment, Bluetooth LE over the ISM band is obviously important, hence the central importance of being able to customize and work with BlueZ.
I am looking to do several things with BlueZ BLE, including but not limited to:
Advertising
Pairing
Characteristic
Broadcast
Secure transport of data, etc.
Since I will need full control over the data, for data processing and for interacting with the cloud (edge AI or data science in the cloud), I am looking at three ways of using BlueZ:
1. Make D-Bus API calls to BlueZ methods.
2. Modify the BlueZ codebase and install a custom binary (so that callback handlers can be registered and a wealth of other BlueZ methods can be invoked).
3. Invoke BlueZ command-line utilities like hcitool/bluetoothctl from inside a program using system() calls.
Option 1 is where I have failed. It takes an exorbitant amount of effort to construct and export D-Bus objects and then invoke BlueZ methods, and there is no guarantee that you will be able to take care of all the BLE issues.
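For concreteness, a single method call into BlueZ over D-Bus is fairly short with GDBus; the real effort goes into exporting my own objects (agents, advertisements, GATT applications). A minimal sketch of option 1, assuming BlueZ 5.x and that the adapter object is /org/bluez/hci0:

    /* Start device discovery by calling StartDiscovery on org.bluez.Adapter1.
     * Build: gcc scan.c -o scan $(pkg-config --cflags --libs gio-2.0) */

    #include <gio/gio.h>
    #include <stdio.h>

    int main(void)
    {
        GError *err = NULL;
        GDBusConnection *bus = g_bus_get_sync(G_BUS_TYPE_SYSTEM, NULL, &err);
        if (!bus) {
            fprintf(stderr, "bus: %s\n", err->message);
            return 1;
        }

        GVariant *ret = g_dbus_connection_call_sync(
            bus, "org.bluez", "/org/bluez/hci0", "org.bluez.Adapter1",
            "StartDiscovery", NULL, NULL,
            G_DBUS_CALL_FLAGS_NONE, -1, NULL, &err);
        if (!ret) {
            fprintf(stderr, "StartDiscovery: %s\n", err->message);
            return 1;
        }

        g_variant_unref(ret);
        g_object_unref(bus);
        return 0;
    }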
Option 2 looks very promising, and I want to fully explore how feasible it is to modify the BlueZ code to my needs.
Option 3 is the least desirable, but I want to keep it as a fallback nevertheless.
Given my problem statement, what is the most viable strategy forward? I am asking this out loud so that I do not make more missteps and cost myself time and effort.
Your best strategy is to start with the second option (which you already found promising), as it is a viable solution and many developers take this route to create their BlueZ programs. Here is what I would do:
Write all the functionality of the system in some sort of flowchart or state machine. This helps you visualise your whole system and what needs to be done to reach your end goal.
Try to perform all the above functionality manually using bluetoothctl and btmgmt. This includes advertising, pairing, etc. I recommend steering away from legacy commands such as hcitool and hciconfig as these have been deprecated and have a very different code structure.
When you stumble upon something that is not the default in bluetoothctl/btmgmt, or when you want to tweak the functionality, update the source to do so.
Finally, once you manually get the system to perform the functionality that you need (it doesn't have to be all, it can just be a subset of the functions), you can move to automating the whole process. This involves modifying the source for bluetoothctl/btmgmt commands so that instead of manual intervention, everything would be event-driven.
This is a bonus, but if you can create automated tests using Python or some other scripting language, this will ensure that your system is robust and that existing functionality doesn't break when you add new features.
By the end of this process, you'll have a much better understanding of the internals of bluetoothctl/btmgmt and the D-Bus APIs, to the point that you might be able to completely detach your code from the original bluetoothctl/btmgmt or create the program from scratch.
You probably already know this, but when modifying the tools, these are the starting points in the source code:
bluetoothctl - client/main.c
btmgmt - tools/btmgmt.c
For more references on using bluetoothctl commands and btmgmt, please see the links below:
BlueZ D-Bus C or C++ Sample
Bluetoothctl set passkey
https://stackoverflow.com/a/51876272/2215147
Bluez Programming
Linux command line howto accept pairing for bluetooth device without pin
https://stackoverflow.com/a/52982329/2215147
Bluetooth Low Energy in C - using Bluez to create a GATT server
I hope this helps.

How to fork interactive programs

I have an interactive program with a high start-up cost. After start-up, I'd like to fork the process into separate concurrent sessions. Ideally each separate session would become a GNU screen window but being able to individually telnet/ssh to each session would be fine too.
It shouldn't be too hard to write this from scratch but it seems like something that should have been done/considered before and maybe there are reasons why this is a bad idea...
I know that an alternative approach is to use shared memory for the data that's expensive to initialize. The reason I'm reluctant to go down that path is that the shared data uses C++ data structures with pointers, which makes it hard to mmap it into an unrelated process.
This is what any database does: the startup is phenomenally expensive, but the DB provides several different means of connecting afterwards, for example Oracle's BEQ protocol.
Telnet has issues; consider ssh. Either way, consider a daemon that answers connection requests on a port (you would probably use AF_UNIX sockets), then creates a separate session for each client.
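A rough sketch of that daemon shape, assuming an AF_UNIX listening socket at a made-up path and a session_main() placeholder for your interactive loop; the expensive initialisation runs once before the accept loop, and every accepted connection gets a forked copy of that state in its own session:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>
    #include <signal.h>
    #include <sys/socket.h>
    #include <sys/un.h>

    extern void expensive_init(void);   /* placeholder: costly start-up work */
    extern void session_main(void);     /* placeholder: the interactive loop */

    int main(void)
    {
        expensive_init();                        /* pay the start-up cost once */
        signal(SIGCHLD, SIG_IGN);                /* don't leave zombies behind */

        int lfd = socket(AF_UNIX, SOCK_STREAM, 0);
        struct sockaddr_un addr = { .sun_family = AF_UNIX };
        strncpy(addr.sun_path, "/tmp/app.sock", sizeof(addr.sun_path) - 1);
        unlink(addr.sun_path);
        if (lfd < 0 || bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
            listen(lfd, 8) < 0) {
            perror("socket/bind/listen");
            return 1;
        }

        for (;;) {
            int cfd = accept(lfd, NULL, NULL);
            if (cfd < 0)
                continue;
            if (fork() == 0) {                   /* child: becomes one session */
                close(lfd);
                setsid();                        /* detach into its own session */
                dup2(cfd, STDIN_FILENO);         /* talk to the client over the socket */
                dup2(cfd, STDOUT_FILENO);
                dup2(cfd, STDERR_FILENO);
                close(cfd);
                session_main();
                _exit(0);
            }
            close(cfd);                          /* parent keeps listening */
        }
    }

Each client could then attach with something like socat - UNIX-CONNECT:/tmp/app.sock, and each such connection can be wrapped in its own GNU screen window if you want that workflow.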
Stevens' Advanced Programming in the UNIX Environment and Rochkind's Advanced UNIX Programming have discussions and complete examples. Since my Stevens book seems to have gone on extensive holiday, see Rochkind 4.3 and 4.10.
And no, there is no pending doom for using this approach.

Why have user context and kernel context in Unix?

This is an operating-systems question; I don't know if I can ask it here, but I thought I would get a proper explanation in this forum.
When a process executes in user context, won't the higher-priority processes in kernel context be blocking the user-context process all the time?
The concepts are hazy for me.
There are two main kinds of schedulers in operating systems: preemptive schedulers and non-preemptive schedulers.
A non-preemptive scheduler would behave like you think: a process with higher rights and higher priority will keep using the CPU until it finishes OR until it blocks (on a mutex, for example, or with a call to yield, which explicitly releases the CPU so that another process can be scheduled).
But non-preemptive schedulers are rare, and the Linux scheduler isn't that kind. It uses time slices to let a process work for a short period of time before de-scheduling it; it also takes priority into account, but it keeps scheduling processes with lower priority too. You should take a look at this Linux scheduler article.
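If you want to convince yourself, here is a small experiment (just an illustration, not from the Linux sources): a child re-niced to the lowest priority still makes progress alongside its parent, because the kernel preempts the parent when its time slice expires.

    /* Both loops make progress under Linux's preemptive scheduler, even though
     * the child runs at nice 19. Pin both to one CPU (e.g. taskset -c 0 ./a.out)
     * to make the effect obvious; stop with Ctrl-C. */

    #include <stdio.h>
    #include <unistd.h>
    #include <sys/resource.h>

    int main(void)
    {
        if (fork() == 0) {
            setpriority(PRIO_PROCESS, 0, 19);        /* lowest priority */
            for (unsigned long i = 0; ; i++)
                if (i % 500000000UL == 0)
                    printf("low-priority child still running: %lu\n", i);
        }
        for (unsigned long i = 0; ; i++)
            if (i % 500000000UL == 0)
                printf("parent (default priority): %lu\n", i);
    }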
This Stackoverflow posting has a discussion that includes a run-down of how kernel mode works with an explanation of some of the jargon. In particular look at the section titled 'A brief primer on kernel vs. user mode'. It might help to shed some light on your question.
http://en.wikipedia.org/wiki/Ring_0
A process in kernel mode can be preempted as well, when it exhausts its quantum.
Wikipedia: Preemption

How is MPI I/O Implemented?

Long-Winded Background
I'm working on parallelising some code for cardiac electrophysiology simulations. Since users can specify their own simulations using an in-built scripting language, I have no way of knowing how to manage the trade-off of communication vs. computation. To combat this, I'm making a sort-of runtime profiler, which will decide how to handle the domain decomposition once it's seen the simulation to be run and the hardware environment that it has to work with.
My question is this:
How is MPI I/O implemented behind the scenes? Is each process actually writing to a single file on some other node, or is each process writing to some sparse file, which will get spliced back together when the file is closed?
Knowing this will help me decide whether to consider I/O operations as communication or computation, and adjust the balance accordingly…
Thanks in advance for any insight you can offer.
Ross
The mechanism for I/O is implementation-dependent. In addition, there is not a single style of I/O. Some I/O is cached by the remote ranks and collected by the mpirun process at the end of the run. Some I/O is written to local scratch space as required. Some I/O is written to a NAS/SAN-style high-performance shared file system.
Some MPI implementations use third-party libraries to support I/O to parallel file systems, and those details may be proprietary. Some file systems are local disks, others are SANs over fibre or InfiniBand.
How are you planning to actually measure the time spent in I/O? Are you planning to use the PMPI interface to intercept all the calls into the library?
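For the measurement part: if your MPI exposes PMPI_ entry points for the I/O routines (recent MPICH and Open MPI builds generally do, but check yours), you can interpose on the calls without touching the application. A minimal sketch for one call, MPI_File_write; note that older MPI-2 headers declare buf as void * rather than const void *:

    /* PMPI interposition sketch: time MPI_File_write by overriding it and
     * forwarding to the PMPI_ entry point. Link this into (or before) the
     * application; no application changes are needed. */

    #include <mpi.h>
    #include <stdio.h>

    static double io_seconds = 0.0;

    int MPI_File_write(MPI_File fh, const void *buf, int count,
                       MPI_Datatype datatype, MPI_Status *status)
    {
        double t0 = MPI_Wtime();
        int rc = PMPI_File_write(fh, buf, count, datatype, status);
        io_seconds += MPI_Wtime() - t0;
        return rc;
    }

    int MPI_Finalize(void)
    {
        int rank;
        PMPI_Comm_rank(MPI_COMM_WORLD, &rank);
        printf("rank %d spent %.3f s in MPI_File_write\n", rank, io_seconds);
        return PMPI_Finalize();
    }

You would add similar wrappers for the other I/O entry points you care about (collective writes, reads, and so on).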

Microcontroller + Verilog/VHDL simulator?

Over the years I've worked on a number of microcontroller-based projects; mostly with Microchip's PICs. I've used various microcontroller simulators, and while they can be very helpful at times, I often find myself frustrated. In real life microcontrollers never exist alone and the firmware's behavior is dependent on the environment. However, none of the sims I've used provide decent support for anything outside the microcontroller.
My first thought was to model the entire board in Verilog. But, I'd rather not create an entire CPU model, and I haven't had much luck finding existing models for the chips I use. Regardless, I really don't need, or want, to simulate the proc at that level of detail, and I'd like to retain the debugging facilities provided by a regular processor sim.
It seems to me that the ideal solution would be a hybrid simulator that interfaces a traditional processor simulator with a Verilog model.
Does such a thing exist?
I've used the Altera Nios II processor embedded in an FPGA. Altera provides a toolchain for simulating the CPU (with its software) together with your custom logic in a simulator. I suppose that a similar setup can be achieved by downloading a VHDL/Verilog core of your CPU (did you try OpenCores? They have lots of stuff there).
But keep in mind that it is going to be mind-bogglingly slow, so don't expect to simulate whole complex processes this way. The best you can hope for is simulating fine software-hardware interaction points to debug problems. If you need a deeper simulation, consider running it on an FPGA with built-in monitoring code.
For the "simulate the whole board" approach:
The Free Model Foundry has a large number of models, some in VHDL and others in Verilog, that are available now, but you'll need to pay to have new models created. These are very helpful for making sure the board is built correctly.
But I think the more common approach when debugging your PIC is to just build a board, then work on the firmware. In the chip world (where the firmware is running on a microprocessor in a chip that hasn't gone to fab yet), people often resort to very expensive systems (or rent time on them) that allow compiling part of the design into an emulator while the rest of the design runs in the normal simulator environment. Without the barrier of an expensive mask set for the chip, the cost is just not justifiable for a circuit board. I've heard of some creative applications of Simulink (MathWorks) with FPGAs, but my recollection is that one either ran the system on the computer, or programmed the device and ran the same thing in real time.
I believe both Cadence (ask about Palladium) and Mentor Graphics have that integrated solution if you have the money to spend on it.
What I have done recently is create an interface between the simulation environment and the host system. Different HDL simulators have different interfaces, and getting the simulator NOT to think in batch mode (the traditional simulation model) but instead to run forever like a real design is half of the problem.
Then, from the host, using C (or whatever), you can create abstractions that may or may not allow you to write your application software for whatever the final target is (depending on what language and compiler capabilities you have). For example, you can make generic poke and peek functions: on the final target they actually poke and peek memory or I/O, but for simulation the abstraction talks to a testbench in the simulation that performs the same memory cycle.
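A sketch of that kind of abstraction layer (all names here are made up for illustration): the application only ever calls poke()/peek(), and a build-time switch decides whether they touch real memory-mapped registers or forward the same bus cycle to the simulator's testbench through a socket shim.

    #include <stdint.h>

    #ifdef TARGET_HW

    void poke(uint32_t addr, uint32_t data)
    {
        *(volatile uint32_t *)(uintptr_t)addr = data;   /* real register write */
    }

    uint32_t peek(uint32_t addr)
    {
        return *(volatile uint32_t *)(uintptr_t)addr;   /* real register read */
    }

    #else  /* simulation build */

    /* Stand-ins for the socket shim that forwards bus cycles to the testbench. */
    extern void     sim_bus_write(uint32_t addr, uint32_t data);
    extern uint32_t sim_bus_read(uint32_t addr);

    void     poke(uint32_t addr, uint32_t data) { sim_bus_write(addr, data); }
    uint32_t peek(uint32_t addr)                { return sim_bus_read(addr); }

    #endif

    /* Application code is identical either way, e.g. (register names hypothetical):
     *     poke(UART_TX_REG, 'A');
     *     while ((peek(UART_STATUS_REG) & TX_BUSY) != 0)
     *         ;
     */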
I went one step further and used (Berkeley) sockets between the host and the testbench, so that the simulation can keep running while the host applications stop and start. Not unlike having a real processor with an OS, where you start applications, run them to completion, and start another. At least for test applications; for delivery you probably have only one app.
By creating these abstraction layers I can write real applications that will be used on the target when it is built. Along the way you can use software simulation of the logic initially, then if you like build an FPGA with an abstraction interface (throw-away logic), say a UART for example. Replace the shim between the application's abstraction layer and the simulator with a UART interface, or whatever. Then, when you marry the processor and logic in the same chip or on the same board, replace the abstraction layer again with direct calls to whatever interfaces they always thought they were talking to. If something breaks and you have retained the abstraction layer, you can take the application back to the simulation model and have access to all of your logic internals.
Specifically, this time around I am using an HDL language, Cyclicity CDL, which is on SourceForge; the documentation needs some help, but the examples may get you going, and it produces synthesizable Verilog, so you get an extra win there. I threw out all the scripting/batch stuff other than the bare minimum needed to connect and start a C simulation model. So my testbench is in C (well, C++ technically), and the sockets layer was done there. The output can be .vcd files, which gtkwave reads. Basically you can do the bulk of your HDL design using open-source software with no licenses, etc. By adding one or two lines of code to the CDL simulation part I was able to have it run as an infinite loop, which I can say works quite well; there don't appear to be any memory leaks, etc.
Both ModelSim and Cadence have standardized ways of connecting host C programs to the simulation world, and from there you can use IPC to get host applications talking to an abstraction-layer API.
This is probably way overkill for a PIC; I gave up on PICs a while ago for the faster and more C-friendly ARM-based micros anyway. There is (or was) an open-core PIC that you could simply incorporate into your simulation, even though that is not what you are trying to do here.
Not that I've seen. Your best bet is to properly define the interfaces and behavior between the uC and the FPGA, and then define a series of test waveforms that can be applied using an automated tester. You would have to build the automated tester out of an FPGA or uC (apply the waveform, watch interrupts, breakpoints, etc.), although a logic analyzer may have some such functionality. If you really want to, I know that Opencores.org has PIC and AVR-like 8-bit uC cores defined in VHDL, so you could implement your entire project on the FPGA and then just debug that.
Generally there is no need to model the CPU at the RTL level, since you don't really care what it does bit by bit; you generally care about what it does in terms of register values, memories, and bus accesses.
The simplest is called a Bus Functional Model (BFM). This just generates the reads and writes that the CPU does, often based on a text file. These are available for some CPUs and many popular buses (e.g. PCI, PCIe). These simulate super fast.
The next step up is a functional cycle-accurate model. Those simulate fast. They are often encrypted.
Last is a full RTL model. Those usually are only available if you are working closely with the CPU vendor, e.g. using their core in your ASIC. Typically these are encrypted, unless you are a huge company.
Memory models are typically cycle-accurate (e.g. Micron).
My workmates from the hardware department use FPGA simulation software quite often to find timing bugs and track down strange behaviours.
Simulating one or two milliseconds can take several hours, so using the simulator for anything but very small things is not feasible.
You may want to have a look at SystemC though. http://en.wikipedia.org/wiki/SystemC
