How to see variables stored on the stack with GDB - pointers

I'm trying to figure out what is stored at a certain place on the stack with GDB. I have a statement:
cmpl $0x176,-0x10(%ebp)
In this function I'm comparing 0x176 to the -0x10(%ebp) and I am wondering if there is a way to see what is stored at -0x10(%ebp).

I am wondering if there is a way to see what is stored at -0x10(%ebp).
Assuming you have compiled with debug info, info locals will tell you about all the local variables in current frame. After that, print (char*)&a_local - (char*)$ebp will tell you the offset from start of a_local to %ebp, and you can usually find out what local is close to 0x176.
Also, if your locals have initializers, you can do info line NN to figure out which assembly instruction range corresponds to initialization of a given local, then disas ADDR0,ADDR1 to see the disassembly, and again understand which local is located at what offset.
Another alternative is to readelf -w a.out, and look for entries like this:
int foo(int x) { int a = x; int b = x + 1; return b - a; }
<1><25>: Abbrev Number: 2 (DW_TAG_subprogram)
<26> DW_AT_external : 1
<27> DW_AT_name : foo
<2b> DW_AT_decl_file : 1
<2c> DW_AT_decl_line : 1
<2d> DW_AT_prototyped : 1
<2e> DW_AT_type : <0x67>
<32> DW_AT_low_pc : 0x0
<36> DW_AT_high_pc : 0x23
<3a> DW_AT_frame_base : 0x0 (location list)
<3e> DW_AT_sibling : <0x67>
<2><42>: Abbrev Number: 3 (DW_TAG_formal_parameter)
<43> DW_AT_name : x
<45> DW_AT_decl_file : 1
<46> DW_AT_decl_line : 1
<47> DW_AT_type : <0x67>
<4b> DW_AT_location : 2 byte block: 91 0 (DW_OP_fbreg: 0)
<2><4e>: Abbrev Number: 4 (DW_TAG_variable)
<4f> DW_AT_name : a
<51> DW_AT_decl_file : 1
<52> DW_AT_decl_line : 1
<53> DW_AT_type : <0x67>
<57> DW_AT_location : 2 byte block: 91 74 (DW_OP_fbreg: -12)
<2><5a>: Abbrev Number: 4 (DW_TAG_variable)
<5b> DW_AT_name : b
<5d> DW_AT_decl_file : 1
<5e> DW_AT_decl_line : 1
<5f> DW_AT_type : <0x67>
<63> DW_AT_location : 2 byte block: 91 70 (DW_OP_fbreg: -16)
This tells you that x is stored at fbreg+0, a at fbreg-12, and b at fbreg-16. Now you just need to examine location list to figure out how to derive fbreg from %ebp. The list for above code looks like this:
Contents of the .debug_loc section:
Offset Begin End Expression
00000000 00000000 00000001 (DW_OP_breg4: 4)
00000000 00000001 00000003 (DW_OP_breg4: 8)
00000000 00000003 00000023 (DW_OP_breg5: 8)
00000000 <End of list>
So for most of the body, fbreg is %ebp+8, which means that a is at %ebp-4. Disassembly confirms:
00000000 <foo>:
0: 55 push %ebp
1: 89 e5 mov %esp,%ebp
3: 83 ec 10 sub $0x10,%esp
6: 8b 45 08 mov 0x8(%ebp),%eax # 'x' => %eax
9: 89 45 fc mov %eax,-0x4(%ebp) # '%eax' => 'a'
...

Related

Decode Epson print (ESC-i) command decoding/encoding

I'm trying to understand the algorithm used for compression value = 1 with the Epson ESCP2 print command, "ESC-i". I have a hex dump of a raw print file which looks, in part, like the hexdump below (note little-endian format issues).
000006a 1b ( U 05 00 08 08 08 40 0b
units; (page1=08), (vt1=08), (hz1=08), (base2=40 0b=0xb40=2880)
...
00000c0 691b 0112 6802 0101 de00
esc i 12 01 02 68 01 01 00
print color1, compress1, bits1, bytes2, lines2, data...
color1 = 0x12 = 18 = light cyan
compress1 = 1
bits1 (bits/pixel) = 0x2 = 2
bytes2 is ??? = 0x0168 = 360
lines2 is # lines to print = 0x0001 = 1
00000c9 de 1200 9a05 6959
00000d0 5999 a565 5999 6566 5996 9695 655a fd56
00000e0 1f66 9a59 6656 6566 5996 9665 9659 6666
00000f0 6559 9999 9565 6695 9965 a665 6666 6969
0000100 5566 95fe 9919 6596 5996 5696 9666 665a
0000110 5956 6669 0456 1044 0041 4110 0040 8140
0000120 9000 0d00
1b0c 1b40 5228 0008 5200 4d45
FF esc # esc ( R 00 REMOTE1
The difficulty I'm having is how to decode the data, starting at 00000c9, given 2 bits/pixel and the count of 360. It's my understanding this is some form of tiff or rle encoding, but I can't decode it in a way that makes sense. The output was produced by gutenprint plugin for GIMP.
Any help would be much appreciated.
The byte count is not a count of the bytes in the input stream; it is a count of the bytes in the input stream as expanded to an uncompressed form. So when expanded, there should be a total of 360 bytes. The input bytes are interpreted as either a count of bytes to follow, if positive, in which case the count is the byte value +1; and if negative the count is a count of the number of times the immediately following byte should be expanded, again, +1. The 0D at the end is a terminating carriage return for the line as a whole.
The input stream is only considered as a string of whole bytes, despite the fact that the individual pixel/nozzle controls are only 2 bits each. So it is not really possible to use a repeat count for something like a 3-nozzle sequence; a repeat count must always specify a full byte 4-nozzle combination.
The above example then specifies:
0xde00 => repeat 0x00 35 times
0x12 => use the next 19 bytes as is
0xfd66 => repeat 0x66 4 times
0x1f => use the next 32 bytes as is
etc.

Why an ELF executable could have 4 LOAD segments?

There is a remote 64-bit *nix server that can compile a user-provided code (which should be written in Rust, but I don't think it matters since it uses LLVM). I don't know which compiler/linker flags it uses, but the compiled ELF executable looks weird - it has 4 LOAD segments:
$ readelf -e executable
...
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
...
LOAD 0x0000000000000000 0x0000000000000000 0x0000000000000000
0x0000000000004138 0x0000000000004138 R 0x1000
LOAD 0x0000000000005000 0x0000000000005000 0x0000000000005000
0x00000000000305e9 0x00000000000305e9 R E 0x1000
LOAD 0x0000000000036000 0x0000000000036000 0x0000000000036000
0x000000000000d808 0x000000000000d808 R 0x1000
LOAD 0x0000000000043da0 0x0000000000044da0 0x0000000000044da0
0x0000000000002290 0x00000000000024a0 RW 0x1000
...
On my own system all executables that I was looking at only have 2 LOAD segments:
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
...
LOAD 0x0000000000000000 0x0000000000000000 0x0000000000000000
0x00000000003000c0 0x00000000003000c0 R E 0x200000
LOAD 0x00000000003002b0 0x00000000005002b0 0x00000000005002b0
0x00000000000776c8 0x000000000009b200 RW 0x200000
...
What are the circumstances (compiler/linker versions, flags etc) under which a compiler might build an ELF with 4 LOAD segments?
What is the point of having 4 LOAD segments? I imagine that having a segment with read but not execute permission might help against certain exploits, but why have two such segments?
A typical BFD-ld or Gold linked Linux executable has 2 loadable segments, with the ELF header merged with .text and .rodata into the first RE segment, and .data, .bss and other writable sections merged into the second RW segment.
Here is the typical section to segment mapping:
$ echo "int foo; int main() { return 0;}" | clang -xc - -o a.out-gold -fuse-ld=gold
$ readelf -Wl a.out-gold
Elf file type is EXEC (Executable file)
Entry point 0x400420
There are 9 program headers, starting at offset 64
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
PHDR 0x000040 0x0000000000400040 0x0000000000400040 0x0001f8 0x0001f8 R 0x8
INTERP 0x000238 0x0000000000400238 0x0000000000400238 0x00001c 0x00001c R 0x1
[Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x0006b0 0x0006b0 R E 0x1000
LOAD 0x000e18 0x0000000000401e18 0x0000000000401e18 0x0001f8 0x000200 RW 0x1000
DYNAMIC 0x000e28 0x0000000000401e28 0x0000000000401e28 0x0001b0 0x0001b0 RW 0x8
NOTE 0x000254 0x0000000000400254 0x0000000000400254 0x000020 0x000020 R 0x4
GNU_EH_FRAME 0x00067c 0x000000000040067c 0x000000000040067c 0x000034 0x000034 R 0x4
GNU_STACK 0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW 0x10
GNU_RELRO 0x000e18 0x0000000000401e18 0x0000000000401e18 0x0001e8 0x0001e8 RW 0x8
Section to Segment mapping:
Segment Sections...
00
01 .interp
02 .interp .note.ABI-tag .dynsym .dynstr .gnu.hash .hash .gnu.version .gnu.version_r .rela.dyn .init .text .fini .rodata .eh_frame .eh_frame_hdr
03 .fini_array .init_array .dynamic .got .got.plt .data .bss
04 .dynamic
05 .note.ABI-tag
06 .eh_frame_hdr
07
08 .fini_array .init_array .dynamic .got .got.plt
This optimizes the number of mmaps that the kernel must perform to load such executable, but at a security cost: the data in .rodata shouldn't be executable, but is (because it's merged with .text, which must be executable). This may significantly increase the attack surface for someone trying to hijack a process.
Newer Linux systems, in particular using LLD to link binaries, prioritize security over speed, and put ELF header and .rodata into the first R-only segment, resulting in 3 load segments and improved security. Here is a typical mapping:
$ echo "int foo; int main() { return 0;}" | clang -xc - -o a.out-lld -fuse-ld=lld
$ readelf -Wl a.out-lld
Elf file type is EXEC (Executable file)
Entry point 0x201000
There are 10 program headers, starting at offset 64
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
PHDR 0x000040 0x0000000000200040 0x0000000000200040 0x000230 0x000230 R 0x8
INTERP 0x000270 0x0000000000200270 0x0000000000200270 0x00001c 0x00001c R 0x1
[Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
LOAD 0x000000 0x0000000000200000 0x0000000000200000 0x000558 0x000558 R 0x1000
LOAD 0x001000 0x0000000000201000 0x0000000000201000 0x000185 0x000185 R E 0x1000
LOAD 0x002000 0x0000000000202000 0x0000000000202000 0x001170 0x002005 RW 0x1000
DYNAMIC 0x003010 0x0000000000203010 0x0000000000203010 0x000150 0x000150 RW 0x8
GNU_RELRO 0x003000 0x0000000000203000 0x0000000000203000 0x000170 0x001000 R 0x1
GNU_EH_FRAME 0x000440 0x0000000000200440 0x0000000000200440 0x000034 0x000034 R 0x1
GNU_STACK 0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW 0
NOTE 0x00028c 0x000000000020028c 0x000000000020028c 0x000020 0x000020 R 0x4
Section to Segment mapping:
Segment Sections...
00
01 .interp
02 .interp .note.ABI-tag .rodata .dynsym .gnu.version .gnu.version_r .gnu.hash .hash .dynstr .rela.dyn .eh_frame_hdr .eh_frame
03 .text .init .fini
04 .data .tm_clone_table .fini_array .init_array .dynamic .got .bss
05 .dynamic
06 .fini_array .init_array .dynamic .got
07 .eh_frame_hdr
08
09 .note.ABI-tag
Not to be left behind, the newer BFD-ld (my version is 2.31.1) also makes ELF header and .rodata read-only, but fails to merge two R-only segments into one, resulting in 4 loadable segments:
$ echo "int foo; int main() { return 0;}" | clang -xc - -o a.out-bfd -fuse-ld=bfd
$ readelf -Wl a.out-bfd
Elf file type is EXEC (Executable file)
Entry point 0x401020
There are 11 program headers, starting at offset 64
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
PHDR 0x000040 0x0000000000400040 0x0000000000400040 0x000268 0x000268 R 0x8
INTERP 0x0002a8 0x00000000004002a8 0x00000000004002a8 0x00001c 0x00001c R 0x1
[Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x0003f8 0x0003f8 R 0x1000
LOAD 0x001000 0x0000000000401000 0x0000000000401000 0x00018d 0x00018d R E 0x1000
LOAD 0x002000 0x0000000000402000 0x0000000000402000 0x000110 0x000110 R 0x1000
LOAD 0x002e40 0x0000000000403e40 0x0000000000403e40 0x0001e8 0x0001f0 RW 0x1000
DYNAMIC 0x002e50 0x0000000000403e50 0x0000000000403e50 0x0001a0 0x0001a0 RW 0x8
NOTE 0x0002c4 0x00000000004002c4 0x00000000004002c4 0x000020 0x000020 R 0x4
GNU_EH_FRAME 0x002004 0x0000000000402004 0x0000000000402004 0x000034 0x000034 R 0x4
GNU_STACK 0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW 0x10
GNU_RELRO 0x002e40 0x0000000000403e40 0x0000000000403e40 0x0001c0 0x0001c0 R 0x1
Section to Segment mapping:
Segment Sections...
00
01 .interp
02 .interp .note.ABI-tag .hash .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn
03 .init .text .fini
04 .rodata .eh_frame_hdr .eh_frame
05 .init_array .fini_array .dynamic .got .got.plt .data .bss
06 .dynamic
07 .note.ABI-tag
08 .eh_frame_hdr
09
10 .init_array .fini_array .dynamic .got
Finally, some of these choices are affected by the --(no)rosegment (or -Wl,z,noseparate-code for BFD ld) linker option.

Extracting record from big endian data

I have the following code for network protocol implementation. As the protocol is big endian, I wanted to use the Bit_Order attribute and High_Order_First value but it seems I made a mistake.
With Ada.Unchecked_Conversion;
with Ada.Text_IO; use Ada.Text_IO;
with System; use System;
procedure Bit_Extraction is
type Byte is range 0 .. (2**8)-1 with Size => 8;
type Command is (Read_Coils,
Read_Discrete_Inputs
) with Size => 7;
for Command use (Read_Coils => 1,
Read_Discrete_Inputs => 4);
type has_exception is new Boolean with Size => 1;
type Frame is record
Function_Code : Command;
Is_Exception : has_exception := False;
end record
with Pack => True,
Size => 8;
for Frame use
record
Function_Code at 0 range 0 .. 6;
Is_Exception at 0 range 7 .. 7;
end record;
for Frame'Bit_Order use High_Order_First;
for Frame'Scalar_Storage_Order use High_Order_First;
function To_Frame is new Ada.Unchecked_Conversion (Byte, Frame);
my_frame : Frame;
begin
my_frame := To_Frame (Byte'(16#32#)); -- Big endian version of 16#4#
Put_Line (Command'Image (my_frame.Function_Code)
& " "
& has_exception'Image (my_frame.Is_Exception));
end Bit_Extraction;
Compilation is ok but the result is
raised CONSTRAINT_ERROR : bit_extraction.adb:39 invalid data
What did I forget or misunderstand ?
UPDATE
The real record in fact is
type Frame is record
Transaction_Id : Transaction_Identifier;
Protocol_Id : Word := 0;
Frame_Length : Length;
Unit_Id : Unit_Identifier;
Function_Code : Command;
Is_Exception : Boolean := False;
end record with Size => 8 * 8, Pack => True;
for Frame use
record
Transaction_Id at 0 range 0 .. 15;
Protocol_Id at 2 range 0 .. 15;
Frame_Length at 4 range 0 .. 15;
Unit_id at 6 range 0 .. 7;
Function_Code at 7 range 0 .. 6;
Is_Exception at 7 range 7 .. 7;
end record;
Where Transaction_Identifier, Word and Length are 16-bit wide.
These ones are displayed correctly if I remove the Is_Exception field and extend Function_Code to 8 bits.
The dump of the frame to decode is as following:
00000000 00 01 00 00 00 09 11 03 06 02 2b 00 64 00 7f
So my only problem is really to extract the 8th bit of the last byte.
So,
for Frame use
record
Transaction_Id at 0 range 0 .. 15;
Protocol_Id at 2 range 0 .. 15;
Frame_Length at 4 range 0 .. 15;
Unit_id at 6 range 0 .. 7;
Function_Code at 7 range 0 .. 6;
Is_Exception at 7 range 7 .. 7;
end record;
It seems you want Is_Exception to be the the LSB of the last byte?
With for Frame'Bit_Order use System.High_Order_First; the LSB will be bit 7,
(also, 16#32# will never be -- Big endian version of 16#4#, the bit pattern just doesn't match)
It may be more intuitive and clear to specify all of your fields relative to the word they're in, rather than the byte:
Unit_ID at 6 range 0..7;
Function_Code at 6 range 8 .. 14;
Is_Exception at 6 range 15 .. 15;
Given the definition of Command above, the legal values for the last byte will then be:
2 -> READ_COILS FALSE
3 -> READ_COILS TRUE
8 -> READ_DISCRETE_INPUTS FALSE
9 -> READ_DISCRETE_INPUTS TRUE
BTW,
by applying your update to your original program, and adding/changing the following, you program works for me
add
with Interfaces;
add
type Byte_Array is array(1..8) of Byte with Pack;
change, since we don't know the definition
Transaction_ID : Interfaces.Unsigned_16;
Protocol_ID : Interfaces.Unsigned_16;
Frame_Length : Interfaces.Unsigned_16;
Unit_ID : Interfaces.Unsigned_8;
change
function To_Frame is new Ada.Unchecked_Conversion (Byte_Array, Frame);
change
my_frame := To_Frame (Byte_Array'(00, 01, 00, 00, 00, 09, 16#11#, 16#9#));
I finally found what was wrong.
In fact, the Modbus Ethernet Frame definition mentioned that, in case of exception, the returned code should be the function code plus 128 (0x80) (see explanation on Wikipedia). That's the reason why I wanted to represent it through a Boolean value but my representation clauses were wrong.
The correct clauses are these ones :
for Frame use
record
Transaction_Id at 0 range 0 .. 15;
Protocol_Id at 2 range 0 .. 15;
Frame_Length at 4 range 0 .. 15;
Unit_id at 6 range 0 .. 7;
Is_Exception at 6 range 8 .. 8;
Function_Code at 6 range 9 .. 15;
end record;
This way, the Modbus network protocol is correctly modelled (or not but at least, my code is working).
I really thank egilhh and simonwright for making me find what was wrong and explain the semantics behind the aspects.
Obviously, I don't know who reward :)
Your original record declaration works fine (GNAT complains about the Pack, warning: pragma Pack has no effect, no unplaced components). The problem is with working out the little-endian Byte.
---------------------------------
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | BE bit numbers
---------------------------------
| c c c c c c c | e |
---------------------------------
| 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | LE bit numbers
---------------------------------
so if you want the Command to be Read_Discrete_Inputs, the Byte needs to have BE bit 4 (LE bit 3) set i.e. LE 16#8#.
Take a look at this AdaCore post on bit order and byte order to see how they handle it. After reading that, you will probably find that the bit order of your frame value is really 16#08#, which probably is not what you are expecting.
Big Endian / Little Endian typically refers to Byte order rather than bit order, so when you see that Network protocols are Big Endian, they mean Byte order. Avoid setting Bit_Order for your records. In modern systems, you will almost never need that.
Your record is only one byte in size, so byte order won't matter for it by itself. Byte order comes into play when you have larger field values (>8 bits long).
The bit_order pragma doesn't reverse the order that the bits appear in memory. It simply defines whether the most significant bit (left most) will be logically referred to as zero (High_Order_First) or the least significant bit will be referred to as zero (Low_Order_First) when interpreting the First_Bit and Last_Bit offsets from the byte position in the representation clause. Keep in mind that these offsets are taken from the MSB or LSB of the scalar the record component belongs to AS A VALUE. So in order for the byte positions to carry the same meaning on a little endian CPU as they do on a big endian CPU (as well as the in memory representation of multibyte machine scalars, which exist when one or more record components with the same byte position have a last_bit value which exceeds the capacity of a single byte) then 'Scalar_Storage_Order must also be specified.

eBPF: global variables and structs

So I have a simple eBPF code:
my.h:
#ifndef __MY_COMMON_H__
#define __MY_COMMON_H__
#include <linux/types.h>
struct foo {
int a;
int b;
int c;
int d;
};
#endif /* __MY_COMMON_H__ */
my_kern.c:
...
struct bpf_map_def SEC("maps") my_map = {
.type = BPF_MAP_TYPE_HASH,
.key_size = ...,
.value_size = ...,
.max_entries = MAX_ENTRIES,
};
struct foo my_foo = {
.a = 150000,
.b = 100,
.c = 10,
.d = 40,
};
SEC("sockops")
int my_bpf(struct bpf_sock_ops *sk_ops)
{
...
};
char _license[] SEC("license") = "GPL";
u32 _version SEC("version") = LINUX_VERSION_CODE;
I build the code with llvm-5.0, with no errors/warnings, however bpftool prog load ... fails:
libbpf: Program 'sockops' contains non-map related relo data pointing to section 6
Error: failed to load program
$ llvm-readelf-5.0 -s my_kern.o
There are 12 section headers, starting at offset 0xa90:
Section Headers:
[Nr] Name Type Address Off Size ES Flg Lk Inf Al
[ 0] NULL 0000000000000000 000000 000000 00 0 0 0
[ 1] .strtab STRTAB 0000000000000000 0009c0 0000cc 00 0 0 1
[ 2] .text PROGBITS 0000000000000000 000040 000000 00 AX 0 0 4
[ 3] sockops PROGBITS 0000000000000000 000040 0006e0 00 AX 0 0 8
[ 4] .relsockops REL 0000000000000000 000980 000040 10 11 3 8
[ 5] maps PROGBITS 0000000000000000 000720 00001c 00 WA 0 0 4
[ 6] .data PROGBITS 0000000000000000 00073c 00001c 00 WA 0 0 4
[ 7] .rodata.str1.16 PROGBITS 0000000000000000 000760 000093 01 AMS 0 0 16
[ 8] .rodata.str1.1 PROGBITS 0000000000000000 0007f3 00001d 01 AMS 0 0 1
[ 9] license PROGBITS 0000000000000000 000810 000004 00 WA 0 0 1
[10] version PROGBITS 0000000000000000 000814 000004 00 WA 0 0 4
[11] .symtab SYMTAB 0000000000000000 000818 000168 18 1 10 8
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings), l (large)
I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)
O (extra OS processing required) o (OS specific), p (processor specific)
$
Section 6 contains my my_foo structure, I could dump its contents with llvm-objdump.
This error does not happen if I define my_foo inside main() function for instance. Does it mean such global declarations are not permitted by eBPF convention?
eBPF knows nothing about global variables. When bpftool sends your program to the kernel, it only sends one piece of bytecode instructions that is supposed to be “self-contained” (at least if you don't use eBPF function calls, but eBPF functions are not yet supported by libbpf and bpftool so I assume this is not the case).
Anyway: when bpftool calls libbpf to load your program from the ELF file, it expects to find the whole self-contained program in one ELF section. There is an exception for maps, for which some metadata is placed into a specific ELF section. Other than this, libbpf does not know how to get the definition of your global variable my_foo from the .data section and to move it into the main section. This is why it warns you about non-map related relo[cation] data in this .data section.
my_kern.o
+----------------------------+
| ELF header |
+----------------------------+
|sockops |
| |
| eBPF instructions |
| | |
| ->“get my_foo from .data” | <- libbpf: “What am I supposed to do with this??”
| |
+----------------------------+
| Other ELF sections… |
+----------------------------+
|.data | <- libbpf: “I don't care about this section”
| my_foo |
+----------------------------+
I'm a true artist, aren't I?
So the problem actually comes from how clang handles your global variable here. If you move the definition inside the main function, clang apparently does not move it to its own .data section in the object file it creates. I suppose you are trying to move the variable to a header file, possibly to share it with other source files; I don't know if this is possible to have this to compile correctly, there may exist some flags for clang or some preprocessing directives that would help you, but this is beyond my knowledge.
Seems like static global variable relocation works now (kernel 5.4, Clang 10, Ubuntu 20.04). In my code the value of variable test persists between runs of BPF prog.
static __u64 test = 0;
SEC("cgroup_skb/egress")
int cb_pkt(struct __sk_buff *skb)
{
bpf_printk("Packet with size: %d\n", test);
test = skb->len;
return 1;
}

About MPEG-4 headers

I examined some MPEG-4 video headers and saw some byte arrays like below at the beginning:
00 00 01 B0 01 00 00 01 B5 89 13
I know 00 00 01 parts but what exactly B0 B1 and B5 89 13 parts mean? Actually, if I put this byte array infront of an MPEG-4 stream, it works fine.
But I don't know if those values works with different mpeg-4 stream sources ?
0x000001B0 -> Visual Object Sequence Start (VOSS) Code
0x000001B5 -> Visual Object Start (VOS) Code
You can find the complete MPEG-4 elementary video header details at "ISO/IEC 14496-2" documentation. Here are the details you asked for.
Visual Object Sequence Start (VOSS) Code
-> 4 bytes visual object sequence start code = long hex value of 0x000001B0
-> 8 bits profile/level indicator = 1 byte unsigned number
Visual Object Start (VOS) Code
-> 4 bytes visual object start code = long hex value of 0x000001B5
-> 1 bit has id marker flag = 1/4 nibble flag
_ID_Marker_Section_
-> 4 bits version id = 1 nibble unsigned value - only if marker is true
- version id types are ISO 14496-2 = 1
-> 3 bits visual object priority = 3/4 nibble unsigned value - only if marker is true
- priorities are 1 through to 7
-> 4 bits visual object type = 1 nibble unsigned value
- types are video = 1 ; still texture = 2 ; mesh = 3 ; face = 4
-> 1 bit video signal type = 1/4 nibble flag
- NOTE: if this is false Y has a sample range of 16 through to 235

Resources