Is it possible to boot a unified kernel image as Xen Dom0 with UEFI Secure Boot? - xen

I am trying to set up a Xen Host in a way that every step until booting the Dom0 Linux kernel is Secure Boot verified.
Without Xen, this could be achieved by signing a unified kernel image containing the kernel, initrd and kernel command-line parameters in a single EFI binary.
Signing only the Xen EFI binary is useless because the kernel, initrd and Xen configuration file could then be modified without breaking Secure Boot verification.
When booted via Shim, Xen verifies the Dom0 kernel and initrd using the Shim protocol, but the Xen configuration file containing the kernel command-line parameters is not verified, so an attacker could still modify these parameters.
tklengyel/xen-uefi patches the Xen source code to measure the Xen configuration file into a PCR register. This would not be necessary if the signed kernel binary booted by Xen included the initrd and kernel command-line parameters and all other parameters specified in the Xen configuration file were ignored.
Is there any way to achieve this?

There is no support for it in the mainline Xen tree as of 4.15, although there are preliminary patches to support building a "unified Xen": https://github.com/osresearch/xen/tree/secureboot
This borrows the technique from systemd-boot to create a single unified Xen executable with xen.cfg, bzImage, initrd.img and an optional XSM file, each in its own named PE section. This executable can then be signed with sbsign and validated by UEFI Secure Boot using the Platform Key or key database. A UEFI boot manager entry can be created for this unified Xen so that GRUB is not required.
It has been tested in qemu with OVMF Secure Boot enabled, as well as on ThinkPad hardware. Further cleanup is necessary before it is ready for submission to xen-devel.
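For reference, the general shape of such a build is: attach each component as a named PE section of the Xen EFI binary, sign the result, and point a firmware boot entry at it. This is only a sketch, not the branch's actual build script; the section names, load addresses and key/file paths are illustrative and should be checked against the branch:

# Pack the pieces into named PE sections of xen.efi (names/VMAs illustrative)
objcopy \
    --add-section .config=xen.cfg      --change-section-vma .config=0x1000000 \
    --add-section .kernel=bzImage      --change-section-vma .kernel=0x2000000 \
    --add-section .ramdisk=initrd.img  --change-section-vma .ramdisk=0x3000000 \
    xen.efi xen-unified.efi

# Sign the unified image with a key enrolled in the UEFI key database (db) or PK
sbsign --key db.key --cert db.crt --output xen-unified.signed.efi xen-unified.efi

# Add a UEFI boot manager entry so no GRUB is needed
efibootmgr -c -d /dev/sda -p 1 -L "Xen unified" -l '\EFI\xen\xen-unified.signed.efi'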

Related

DPDK Crypto Device Scheduler "Capability Update Failed"

I am working on a DPDK project and am running into an issue that I need your help with.
The project needs to implement encryption/decryption through DPDK (the multi-buffer library). To support all the cipher and hash algorithms, I need to create 4 types of virtual devices: crypto_null, crypto_aesni_mb, crypto_snow3g and crypto_zuc. I tried to create a crypto scheduler to manage all 4 devices, but when the devices are attached to the scheduler, it fails. I can reproduce the exact same failure with the DPDK example program l2fwd_crypto.
Here is the command I use to run l2fwd_crypto.
./l2fwd-crypto -l 0-1 -n 4 --vdev "crypto_aesni_mb0" --vdev "crypto_null" --vdev "crypto_zuc" --vdev "crypto_snow3g0" --vdev "crypto_scheduler,slave=crypto_null,slave=crypto_aesni_mb0,slave=crypto_snow3g0,slave=crypto_zuc" -- -p 0x3 --chain CIPHER_HASH --cipher_op ENCRYPT --cipher_algo aes-cbc --cipher_key 00:01:02:03:04:05:06:07:08:09:0a:0b:0c:0d:0e:0f --auth_op GENERATE --auth_algo aes-xcbc-mac --auth_key 10:11:12:13:14:15:16:17:18:19:1a:1b:1c:1d:1e:1f
The error message is:
rte_cryptodev_scheduler_slave_attach() line 214: capabilities update failed
I am using DPDK 20.05 on CentOS 7.4.
My questions are:
Is this the correct way to handle all the different crypto algorithms, i.e. creating 4 virtual devices?
Why does the crypto scheduler fail?
Any suggestions/comments are really appreciated.
[EDIT-1: Based on comment conversation]
The DPDK poll-mode crypto scheduler is intended to run with slave devices that are all of the same type (either all HW or all SW). This is covered in the DPDK documentation. Hence, for the crypto scheduler to work, it has to be initialized with slaves of a single type.
Re-running the test with slaves that are all crypto_null, all crypto_zuc, all crypto_snow3g, or all crypto_aesni_mb will therefore work, as in the sketch below.
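For example, an all-AESNI-MB run of l2fwd-crypto (keeping the cipher/auth options from the failing command; the extra instance names crypto_aesni_mb1..3 are purely illustrative) would look something like this:

# All slaves are the same SW PMD type, so the scheduler can merge their
# capabilities and the attach succeeds.
./l2fwd-crypto -l 0-1 -n 4 \
    --vdev "crypto_aesni_mb0" --vdev "crypto_aesni_mb1" \
    --vdev "crypto_aesni_mb2" --vdev "crypto_aesni_mb3" \
    --vdev "crypto_scheduler,slave=crypto_aesni_mb0,slave=crypto_aesni_mb1,slave=crypto_aesni_mb2,slave=crypto_aesni_mb3" \
    -- -p 0x3 --chain CIPHER_HASH \
    --cipher_op ENCRYPT --cipher_algo aes-cbc \
    --cipher_key 00:01:02:03:04:05:06:07:08:09:0a:0b:0c:0d:0e:0f \
    --auth_op GENERATE --auth_algo aes-xcbc-mac \
    --auth_key 10:11:12:13:14:15:16:17:18:19:1a:1b:1c:1d:1e:1f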
Note: with respect to the internal logic, my personal opinion is that the current crypto-scheduler behaviour is correct. In a real application one would look up the destination IP (or source+destination IP) in an ACL, LPM, or exact-match table to identify the SA or crypto keys; that lookup can then be offloaded to SW or HW depending upon the workload or flow type (mice or elephant flows).

How to programmatically determine which is the boot disk on Solaris/illumos?

On a test server there are two Samsung 960 Pro SSDs, exactly the same maker, model and size. On both I've done a fresh install of exactly the same OS, OmniOS r15026.
By pressing F8 at POST time, I can access the motherboard BOOT manager, and choose one of the two boot drives. Thus, I know which one the system booted from.
But how can one know programmatically, after boot, which is the boot disk?
It seems that this is:
Not possible on Linux,
Not possible on FreeBSD,
Possible on macOS.
Does Solaris/illumos offer some introspective hooks to determine which is the boot disk?
Is it possible to programmatically determine which is the boot disk on Solaris/illumos?
A command line tool would be fine too.
Edit 1: Thanks to @andrew-henle, I have come to know about the eeprom command.
As expected it is available on illumos, but on the test server with OmniOS it unfortunately doesn't return much:
root@omnios:~# eeprom
keyboard-layout=US-English
ata-dma-enabled=1
atapi-cd-dma-enabled=1
ttyd-rts-dtr-off=false
ttyd-ignore-cd=true
ttyc-rts-dtr-off=false
ttyc-ignore-cd=true
ttyb-rts-dtr-off=false
ttyb-ignore-cd=true
ttya-rts-dtr-off=false
ttya-ignore-cd=true
ttyd-mode=9600,8,n,1,-
ttyc-mode=9600,8,n,1,-
ttyb-mode=9600,8,n,1,-
ttya-mode=9600,8,n,1,-
lba-access-ok=1
root@omnios:~# eeprom boot-device
boot-device: data not available.
Solution on OmniOS r15026
Thanks to @abarczyk I was able to determine the correct boot disk.
I had to use a slightly different syntax:
root@omnios:~# /usr/sbin/prtconf -v | ggrep -1 bootpath
value='unix'
name='bootpath' type=string items=1
value='/pci@38,0/pci1022,1453@1,1/pci144d,a801@0/blkdev@w0025385971B16535,0:b'
With /usr/sbin/format, I was able to see that this path corresponds to the entry
16. c1t0025385971B16535d0 <Samsung-SSD 960 PRO 512GB-2B6QCXP7-476.94GB>
/pci@38,0/pci1022,1453@1,1/pci144d,a801@0/blkdev@w0025385971B16535,0
which is correct, as that is the disk I manually selected in the BIOS.
Thank you very much to @abarczyk and @andrew-henle for considering this and offering such instructive help.
The best way to find the device from which the system is booted is to check the prtconf -vp output:
# /usr/sbin/prtconf -vp | grep bootpath
bootpath: '/pci@0,600000/pci@0/scsi@1/disk@0,0:a'
On my Solaris 11.4 Beta system, there is a very useful command called devprop which helps answer your question:
$ devprop -s bootpath
/pci@0,0/pci1849,8c02@1f,2/disk@1,0:b
then you just have to look through the output of format to see what that translates to. On my system, that is
9. c2t1d0 <ATA-ST1000DM003-1CH1-CC47-931.51GB>
/pci@0,0/pci1849,8c02@1f,2/disk@1,0
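To avoid matching the format output by eye, the lookup can be scripted: take the bootpath, strip the slice suffix, and search format's disk listing for the matching physical path. A rough sketch, assuming devprop is available as on Solaris 11.4 (on OmniOS, substitute the prtconf -v | ggrep -1 bootpath approach above):

#!/bin/sh
# Rough sketch: map the firmware bootpath to a cXtYdZ disk name.
bootpath=$(devprop -s bootpath)      # e.g. /pci@0,0/.../disk@1,0:b
physpath=${bootpath%:*}              # strip the trailing ":b" slice suffix

# format (fed EOF) lists every disk with its physical path on the line
# after the cXtYdZ name; one line of leading context shows that name.
format </dev/null 2>/dev/null | ggrep -B1 -- "$physpath"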
Use the eeprom command.
Per the eeprom man page:
Description
eeprom displays or changes the values of parameters in the EEPROM.
It processes parameters in the order given. When processing a
parameter accompanied by a value, eeprom makes the indicated
alteration to the EEPROM; otherwise, it displays the parameter's
value. When given no parameter specifiers, eeprom displays the values
of all EEPROM parameters. A '−' (hyphen) flag specifies that
parameters and values are to be read from the standard input (one
parameter or parameter=value per line).
Only the super-user may alter the EEPROM contents.
eeprom verifies the EEPROM checksums and complains if they are
incorrect.
platform-name is the name of the platform implementation and can be
found using the -i option of uname(1).
SPARC
SPARC based systems implement firmware password protection with
eeprom, using the security-mode, security-password and
security-#badlogins properties.
x86
EEPROM storage is simulated using a file residing in the
platform-specific boot area. The /boot/solaris/bootenv.rc file
simulates EEPROM storage.
Because x86 based systems typically implement password protection in
the system BIOS, there is no support for password protection in the
eeprom program. While it is possible to set the security-mode,
security-password and security-#badlogins properties on x86 based
systems, these properties have no special meaning or behavior on x86
based systems.
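Since the man page says x86 EEPROM storage is simulated in /boot/solaris/bootenv.rc, that property file can also be read directly. Whether a bootpath property is actually recorded there depends on the distribution and boot loader (as the OmniOS output above suggests, it may simply be absent), so treat this as a best-effort check:

# The simulated EEPROM store on x86; bootpath, if recorded, names the boot device
grep bootpath /boot/solaris/bootenv.rc
eeprom bootpath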

On what parameters boot sequence varies?

Does every Unix flavor have the same boot sequence code? I mean, there are different kernel version releases going on for different flavors, so is there a possibility of different code for the boot sequence once the kernel is loaded? Or do they always keep their boot sequence (or code) common?
Edit: I want to know in detail how the boot process is done.
Where does the MBR find GRUB? How is this information stored? Is it hard-coded by default?
Is there any block-level partition architecture available for the boot sequence?
How does GRUB locate the kernel image? Is there a common location where the kernel image is stored?
I searched a lot on the web, but it only shows the common architecture: BIOS -> MBR -> GRUB -> Kernel -> Init.
I want to know the details of everything. What should I do to learn all this? Is there any way I could debug the boot process?
Thanks in advance!
First of all, the boot process is extremely platform and kernel dependent.
The point is normally to get the kernel image loaded somewhere in memory and run it, but the details may differ:
where do I get the kernel image? (file on a partition? fixed offset on the device? should I just map a device in memory?)
what should be loaded? (only a "core" image? also a ramdisk with additional data?)
where should it be loaded? Is additional initialization (CPU/MMU status, device initialization, ...) required?
are there kernel parameters to pass? Where should they be put for the kernel to see?
where is the configuration for the bootloader itself stored (hard-coded, files on a partition, ...)? How to load the additional modules? (bootloaders like GRUB are actually small OSes by themselves)
Different bootloaders and OSes may do this stuff differently. The "UNIX-like" bit is not really relevant: an OS starts being ostensibly UNIXy (POSIX syscalls, init process, POSIX userland, ...) mostly after the kernel starts running.
Even on common x86 PCs the start differs deeply between "traditional BIOS" and UEFI mode (in the latter case, the UEFI firmware itself can load and start the kernel, without any additional bootloader being involved).
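As a concrete illustration of the UEFI case: a Linux kernel built with the EFI stub can be registered directly as a firmware boot entry, with no separate bootloader. A hedged sketch, where the disk, partition number, ESP paths and root= argument are all placeholders for your own layout:

# Copy the EFI-stub kernel and initrd onto the EFI System Partition, then
# add a boot-manager entry that makes the firmware load the kernel directly.
cp /boot/vmlinuz    /boot/efi/EFI/direct/vmlinuz.efi
cp /boot/initrd.img /boot/efi/EFI/direct/initrd.img
efibootmgr -c -d /dev/sda -p 1 -L "Linux (direct UEFI boot)" \
    -l '\EFI\direct\vmlinuz.efi' \
    -u 'root=/dev/sda2 rw initrd=\EFI\direct\initrd.img'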
Coming down to the start of a modern Linux distribution on x86 in BIOS mode with GRUB2, the basic idea is to quickly get up and running a system which can deal with "normal" PC abstractions (disk partitions, files on filesystems, ...), keeping to a minimum the code that has to deal with hard-coded disk offsets:
1. GRUB is not a monolithic program, but is composed of stages. When booting, the BIOS loads and executes the code stored in the MBR, which is the first stage of GRUB. Since the amount of code that can be stored there is extremely limited (a few hundred bytes), all this code does is act as a trampoline for the next GRUB stage (somehow, it "boots GRUB").
2. The MBR code contains, hard-coded, the address of the first sector of the "core image"; this, in turn, contains the code to load the rest of the "core image" from disk (again, hard-coded as a list of disk sectors).
3. Once the core image is loaded, the ugly work is done, since the GRUB core image normally contains basic file system drivers, so it can load additional configuration and modules from regular files on the boot partition.
4. What happens now depends on the configuration of the specific boot entry; for booting Linux, usually two files are involved: the kernel image and the initrd:
   - the initrd contains the "initial ram disk", the barebones userland mounted as / in the early boot process (before the kernel has mounted the real filesystems); it mostly contains device detection helpers, device drivers, filesystem drivers, ... to allow the kernel to load on demand the code needed to mount the "real" root partition;
   - the kernel image is a (usually compressed) executable image in some format, which contains the actual kernel code; the bootloader extracts it into memory (following some rules), puts the kernel parameters and the initrd memory position in a known memory location and then jumps to the kernel entry point, at which point the kernel takes over the boot process.
5. From there, the "real" Linux boot process starts, which normally involves loading device drivers, starting init, mounting disks and so on.
Again, this is all (x86, BIOS, Linux, GRUB2)-specific; points 1-2 are different on architectures without an MBR, and are skipped completely if GRUB is loaded straight from UEFI; points 1-3 are different/avoided if UEFI (or some other loader) is used to load the kernel image directly. The initrd may not be involved if the kernel image already bundles everything needed to start (typical of embedded images); the details of points 4-5 differ between OSes (although the basic idea is usually similar). And on embedded machines the kernel may be placed directly at a "magic" location that is automatically mapped into memory and run at start.
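As for the "is there any way I could debug the boot process?" part of the question: the first stage is easy to inspect from a running system, and a virtual machine lets you single-step everything after it. A rough sketch (device and image names are placeholders; -s/-S are QEMU's standard gdb-server flags):

# Dump the 512-byte MBR: ~446 bytes of GRUB stage-1 code, the partition
# table, and the 0x55AA boot signature at the end.
dd if=/dev/sda bs=512 count=1 2>/dev/null | xxd | less

# Boot a disk image under QEMU, halted at the first instruction, and attach
# gdb to step through the BIOS -> MBR -> GRUB -> kernel hand-offs.
qemu-system-x86_64 -drive file=disk.img,format=raw -s -S &
gdb -ex 'target remote localhost:1234'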

UEFI runtime service next to OS

I had the idea of running a small service next to the OS but I'm not sure if it is possible. I tried to figure it out by reading some docs but didn't get far, so here comes my question.
I read about the UEFI runtime services.
Would it be possible to have a small module in the firmware which runs next to whatever operating system is used and that sends information concerning the location of the device to an address on the internet?
As far as my knowledge goes, I would say that it should not be possible to run something in the background once UEFI has handed control over to the OS kernel.
To clarify my intentions: I would like to have something like that on my laptop. There is the Prey project, but it is installed inside the OS. I'm using a Linux distribution without autologin, so if somebody stole the laptop they would probably just install Windows.
What you want to do is prohibited because it would be a gateway for viruses, loggers and other malware.
That said, if you want to get some code running aside of the OS, you should look at System Management Mode (SMM).
SMM is an execution mode of x86 processors orthogonal to the standard protected mode. SMM allows the BIOS to completely suspend the OS on all the CPUs at once and enter SMM mode to execute some BIOS services. Switching to SMM mode happens right now on your x86 machine, as you're reading this Stack Overflow answer. It is triggered either by:
hardware: a dedicated System Management Interrupt line (SMI#), very similar to how IRQs work,
software: via an I/O access to a location considered special by the motherboard logic (port 0xb2 is common).
SMM services are called SMM handlers; for instance, sensor values are very often retrieved by means of an SMM call to an SMM handler.
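Just to illustrate the software trigger mentioned above: on Linux, with root privileges, a byte written to the SMI command port (commonly 0xb2) through /dev/port raises an SMI on many chipsets. Which value means what is entirely firmware-specific, so this is only a demonstration of the mechanism and can hang a machine whose firmware does not expect it:

# Write one byte to I/O port 0xb2 (the common APM/SMI command port).
# The firmware's SMI handler decides what, if anything, the value means.
printf '\x00' | dd of=/dev/port bs=1 seek=$((0xb2)) count=1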
SMM handlers are set up during the DXE phase of UEFI firmware initialization into SMRAM, an area of memory dedicated to SMM handlers:
SMM Drivers are dispatched by the SMM Core during the DXE Phase. So
additional SMI handlers can be registered in the DXE Phase. Late in
the DXE Phase, when no more SMM drivers can be dispatched, SMRAM will
be locked down (as recommended practice). Once SMRAM is locked down,
no additional SMM Drivers may be dispatched, so no additional SMI
handlers can be registered. For example, an SMM Driver that registers
an SMI handler cannot be loaded from the EFI Shell or be added as a
DriverOption in the UEFI Boot Manager.
source: tianocore
This means that the code of your SMM handler must be present in the BIOS image, which implies rebuilding the BIOS with your handler added. It's tricky but tools exist out there to both provide a DXE environment and build your SMM handler code into a PE executable, as well as other tools to add a DXE driver to an existing BIOS image. Not all BIOS manufacturers are supported though. It's risky unless your Flash chip is in a socket and you can reprogram it externally.
But the first thing is to check whether the SMRAM is locked on your system. If you're lucky, you can add your very own SMM handler directly in SMRAM. It's fiddly but doable.
Note: SMM handlers inside the BIOS are independent from the OS, so the code would run even if a robber installs a new operating system, which is what you want. However, being outside of an OS has huge disadvantages: you'd need to embed in your SMM handler a driver for the network interface (a polling-only, interrupt-less driver!) plus 802.11 WLAN, DHCP and IP support to connect to the Wi-Fi and get your data routed to an external host on the Internet. How would you determine the Wi-Fi SSID and password? Well, you could wait for the OS to initialize the network adapter for you, but you'd need to save/restore the full state of the network host controller between calls. Not a small or easy project.
As far as my knowledge goes, I would say that it should not be possible to run something in the background once UEFI has handed control over to the OS kernel.
I agree. Certainly, the boot environment (prior to ExitBootServices()) only uses a single-threaded model.
There is no concept of threads in the UEFI spec as far as I can see. Furthermore, each runtime service is something the OS deliberately invokes, much like the OS provides system calls for applications. Once you enter a runtime service function, note the following restriction from section 7.1:
Callers are prohibited from using certain other services from another processor or on the same
processor following an interrupt as specified in Table 30.
Which firmware functions would be non-reentrant while your call was busy would depend on which parts of the UEFI firmware your runtime service needed access to.
Which is to say that, even if you were prepared to sacrifice a thread to sit eternally inside an EFI runtime service, you could well block the entire rest of the kernel from using other runtime services.
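To make the "deliberately invokes" point concrete: on Linux, a runtime service such as GetVariable() only executes for the duration of a call the kernel makes on behalf of some request, for example a read through efivarfs (the variable name and GUID below are the standard UEFI global-variable ones):

# Each read ends up as a single GetVariable() runtime-service call made by
# the kernel; the firmware code runs only while that call is in progress.
cat /sys/firmware/efi/efivars/BootCurrent-8be4df61-93ca-11d2-aa0d-00e098032b8c | xxd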
I do not think it is going to be possible unfortunately, but interesting question all the same!

How many sockets can be created from a port?

How many sockets can be created from a port?
It's operating system dependent.
For Windows, look here for the MaxConnections entry.
For Linux, look here, as the comment on the question says.
This is an operating system limit.
Basically each socket will require a file descriptor (in Linux/Unix terms; it's probably equivalent in Windows). The OS will have a per-process file descriptor limit (say 250-1000) and that'll be the upper limit.
That will also be governed by the number of client-side ports available to your process (that is, when you open a connection to a remote host/port combination, you will also require a port at your end).
The total of client side (or ephemeral) ports will be made available to all the processes on your machine. So it depends on what else is currently running.
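On Linux, both of these limits are easy to inspect and adjust; a quick sketch (the values shown in comments are common defaults, not guarantees):

# Per-process file descriptor limits (soft and hard), often 1024 / 4096
ulimit -n
ulimit -Hn

# System-wide file descriptor ceiling
cat /proc/sys/fs/file-max

# Ephemeral (client-side) port range shared by all processes, e.g. "32768 60999"
cat /proc/sys/net/ipv4/ip_local_port_range

# Raise the soft limit for the current shell, up to the hard limit
ulimit -n 4096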
The number of ports and the configuration is OS dependent. Just Google for 'max number of ports' plus your OS.
