So you've bought a shiny new M1/M2 Mac and you realise that you can't get virtualisation to work.
I tried every hypervisor and a lot of combinations, and it still took me days to get virtualisation working, even though VirtualBox 7.0.2 stated it would work with the M1/M2 chipset.
What I wanted to achieve was the following:
Shared Folders
Bridged Network Interface
RHEL based OS
Sounds simple, right? Well, this took a lot of trial and error. I read a lot of articles where people patched software just to get the basics going, but I couldn't find the patched files, so I wanted an out-of-the-box solution.
After hours of trying CentOS Stream 8 (which all our servers and development environments run), I tried installing Ubuntu 20.04 in Parallels with shared folders but without bridged networking. That was the light-bulb moment: the problem was the page size of the OS kernel, which means RHEL 8 will not work.
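For context (this is my understanding, worth verifying against Red Hat's release notes): RHEL 8-era aarch64 kernels are built with a 64 KiB page size, while Apple Silicon CPUs only support 4 KiB and 16 KiB pages, so a 64 KiB-page kernel cannot boot in a VM on these machines; RHEL 9 moved aarch64 to 4 KiB pages. You can check what a running guest uses with:

```shell
# Print the page size (in bytes) the running kernel was built with.
# 4096 boots fine on Apple Silicon; 65536 (RHEL 8 on aarch64) will not.
getconf PAGESIZE
```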
So I tried RHEL 9 in Parallels, and it booted and installed. I then tried to install the Guest Tools to get Shared Folders working, but this exposed a bug in the Guest Tools, which was disappointing given that they claim a working hypervisor for aarch64. It does work on Ubuntu, just not on RHEL 9.
I then tried VMware Fusion, which failed miserably: you can launch the VM, but there are no Shared Folders and no Bridged Network. So I quickly moved on to something I had tried earlier: UTM.
Now, UTM has three modes, and it is important to know the difference between them (correct me if I'm wrong):
Virtualisation with QEMU
Virtualisation with Apple Hypervisor
Emulation (everything is run through QEMU's CPU emulation)
I tried QEMU, but there were issues with Shared Folders, so I didn't spend any more time on it.
I then decided to try the Apple Hypervisor and noticed everything ran faster than it did under QEMU. Mounting the Shared Folder was easy and simple, you can add it to /etc/fstab for a permanent mount, and you can set the Bridged Network in the settings.
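For reference, here is a sketch of the mount, assuming UTM's Apple-hypervisor sharing exposes the directory over virtiofs with the default `share` tag (check the tag shown in your UTM settings):

```shell
# One-off mount inside the guest ('share' is the virtiofs tag from UTM)
sudo mkdir -p /media/share
sudo mount -t virtiofs share /media/share

# For a permanent mount, add this line to /etc/fstab:
#   share  /media/share  virtiofs  rw,nofail  0  0
```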
So after a lot of trial and error, I was able to get a RHEL-based VM running in UTM with the Apple Hypervisor, with the same functionality I used to have with Vagrant.
I installed WordPress on a VPS with Docker using these instructions: https://www.atlantic.net/vps-hosting/install-wordpress-with-docker-on-ubuntu-20-04/ Additionally, I only added domain forwarding to the VPS DNS (and that works). I didn't do anything beyond that: only those instructions and the DNS change.
The VPS has 16 GB RAM, 6 vCPU Cores, 400 GB SSD and Ubuntu 20.04 as the installed OS.
But the server execution time (looking at the HTTP request in PageSpeed Insights) is very bad: between 600 and 2000 milliseconds (server reaction time only, not total loading time). Looking at the instructions I used (from the linked website), what do you think the solution would be in this situation?
Thanks for any answers.
Try installing a caching plugin on your WordPress site, for example WP Fastest Cache.
There are a lot of caching plugins; click the link below to see more options.
https://blog.hubspot.com/website/best-wordpress-cache-plugins-to-speed-up-a-site
Hope that I've answered your question; if not, feel free to ask me again. Bye
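To see whether a caching plugin actually helps, you can measure the server reaction time (time to first byte) directly with curl, before and after enabling the cache. A minimal sketch, with example.com standing in for your own domain:

```shell
# Report time-to-first-byte and total time, in seconds.
URL="https://example.com/"   # replace with your own domain
curl -s -o /dev/null \
  -w 'ttfb: %{time_starttransfer}\ntotal: %{time_total}\n' \
  "$URL"
```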
I am currently running into a weird issue: foreach...%dopar% runs about 10x faster on my local laptop (Dell, Windows 10) with 15 cores than when the same R code is run in a Docker container with 8 cores. The code itself only sets the ncores parameter to 3, so I am puzzled by such a drastic difference in runtime. Has anyone here run into a similar issue with the doParallel package in Docker? If yes, how did you resolve it?
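One thing worth checking first (an assumption on my part, since the container setup isn't shown) is how many CPUs the container actually sees; parallel backends can behave very differently when a cgroup quota restricts them:

```shell
# CPUs the runtime can actually use (after any cgroup limits)
nproc
# CPUs the kernel reports regardless of limits
grep -c ^processor /proc/cpuinfo
# The container's CPU quota, if any (cgroup v2 path first, then v1)
cat /sys/fs/cgroup/cpu.max 2>/dev/null \
  || cat /sys/fs/cgroup/cpu/cpu.cfs_quota_us 2>/dev/null \
  || echo "no cgroup CPU quota files found"
```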
I have a docker compose setup that consists of:
wordpress:php8.1
mariadb:latest
traefik:v2.7
To rule out Windows networking differences, this is the benchmark run from inside:
curl --resolve example.com:443:172.18.0.3 --write-out '%{time_total}\n' --output /dev/null https://example.com
Where example.com is my custom domain and the IP is the traefik container's current IP. I run the command SSH'd into the VMware guest, and from PowerShell after typing wsl for WSL. I copied the project from one place to the other as-is (kudos to Docker for portability).
It consistently returns ~0.2 s for the VMware-backed instance, while it's 0.4-0.6 s for WSL. This represents the load of index.php itself, which includes the HTML source of a performant WP site using a hand-coded theme and no plugins. Locally, the staticised version shows no measurable difference, or a very slight one, all under 10 ms on both systems.
Rest of the config:
Windows 11 22H2 22610.1, WSL 2, VMware 16.1.0 build-17198959, Docker installed the same way on both and NOT Docker Desktop.
UFW off.
The vmdk and vhdx files are on the same SSD.
Not using /mnt files across filesystems: everything is under the home folder of each system, respectively.
OS is Ubuntu 20.04.4 LTS in WSL, and Ubuntu 21.10 in VMware. I tried Ubuntu 22.04 LTS in WSL and VMware too; same numbers.
I even tried with Nginx + PHP-FPM variant of WP in WSL, and OpenLiteSpeed in VMware, nothing changes the numbers.
The only difference I know of is that I had to sudo update-alternatives --config iptables and choose /usr/sbin/iptables-legacy to make Docker work in WSL, which I didn't need to do on VMware.
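For repeatability, the single curl call above can be looped and averaged; a sketch (adjust the domain and container IP to your own setup):

```shell
# Run the benchmark 10 times and print the average total time.
URL="https://example.com"
for i in $(seq 1 10); do
  curl -s -o /dev/null --resolve example.com:443:172.18.0.3 \
    -w '%{time_total}\n' "$URL"
done | awk '{ s += $1; n += 1 } END { printf "runs: %d  avg: %.3fs\n", n, s / n }'
```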
Several caveats with my answer:
I wish I had a solution for you, but your primary question is why, which this should answer. And perhaps the data here will help lead us to a solution.
I hope I'm wrong about this, and I may well be. I've always heard anecdotally that WSL2 performance was "near native." Your experience, however, coupled with my benchmarking below, leads me to believe that may not be the case.
That said, I'm going to report the data that I came up with in investigating this.
Short summary -- My benchmarking seems to show a substantial disk IO and memory performance delta between Hyper-V and VMWare that likely explains your WordPress results.
Supporting data and research
I started out with a similar test scenario as yours, but attempted to reduce it to as much of an MRE as I could:
Configuration
Hardware:
i9500
16GB of RAM
SSD
Fairly fresh/recent install of Windows 11 Pro, with WSL2 enabled
Fresh install of VMWare Workstation 16 Player
Virtualization:
Default VMWare settings (2 CPUs, 4GB RAM)
Default WSL2 settings (6 CPUs, 8GB RAM)
In both WSL2 and VMWare:
Ubuntu Server 20.04 guest/distribution
Docker installed in both (from official repo, not Docker Desktop)
ubuntu:latest (22.04) Docker image
MySQL server (MariaDB) and Sysbench installed in the Ubuntu 22.04 Docker container
Note that for the benchmarking below:
I closed VMWare when testing WSL2, and vice-versa.
I did not reboot the Windows host between tests. However, note that some of these tests were run multiple times, from both VMWare and WSL2/Hyper-V with no substantial differences in the results, so I do not believe a reboot would have changed the results measurably.
Benchmarking
I started off with some basic Sysbench testing of CPU and Memory. This was done from within the Docker container.
A simple sysbench cpu --time=300 run:
| Metric | VMWare | WSL2 |
| --- | ---: | ---: |
| events/sec | 1,250.97 | 1,252.89 |
| # events | 375,294.00 | 375,869.00 |
| Latency min (ms) | 0.77 | 0.77 |
| Latency avg (ms) | 0.80 | 0.80 |
| Latency max (ms) | 31.40 | 4.07 |
| Latency 95th percentile (ms) | 0.87 | 0.86 |
Pretty much an even matchup there.
sysbench memory run:
| Metric | VMWare | WSL2 |
| --- | ---: | ---: |
| Total operations | 64,449,416.00 | 6,456,274.00 |
| MiB transferred | 62,938.88 | 6,304.96 |
| Latency min (ms) | 0.00 | 0.00 |
| Latency avg (ms) | 0.00 | 0.00 |
| Latency max (ms) | 23.63 | 0.12 |
| Latency 95th percentile (ms) | 0.00 | 0.00 |
Ouch - WSL2's Docker image is running at about 10% of the memory bandwidth of VMWare. I'll be honest; it was tough to spot this until I inserted the comma separators here in the table ;-). At first glance, I thought the two were on par.
I decided to skip straight to MySQL testing, also using Sysbench, since this would probably provide the closest match to your WordPress usage. This was done (after a corresponding prepare) with:
sysbench oltp_read_write.lua --mysql-user=root --time=300 --tables=10 --table-size=1000000 --range_selects=off --report-interval=1 --histogram run
I'll skip the histogram and second-by-second results (but I have them saved if they are useful to anyone), but here's the summary data:
| Metric | VMWare | WSL2 |
| --- | ---: | ---: |
| Queries: read | 583,220 | 66,910 |
| Queries: write | 233,288 | 26,764 |
| Queries: other | 116,644 | 13,382 |
| Queries: total | 933,152 | 107,056 |
| Transactions | 58,322 | 6,691 |
| Ignored errors | 0 | 0 |
| Reconnects | 0 | 0 |
| Latency min (ms) | 2.08 | 14.54 |
| Latency avg (ms) | 5.14 | 44.83 |
| Latency max (ms) | 71.67 | 193.75 |
| Latency 95th percentile (ms) | 11.65 | 81.48 |
Again, ouch -- WSL2's MySQL performance (in Docker, at least) is benchmarking at around a tenth of that of VMWare. It's likely that most of your observed performance difference is represented in these results.
At this point, I began to suspect that the problem could be reproduced in a more generic (IO) way at the hypervisor level, ignoring WSL2 and Docker entirely. WSL2, of course, runs inside a (hidden to the user) Hyper-V backed VM, even though it doesn't require the full Hyper-V manager.
I proceeded to enable Hyper-V and install another Ubuntu 20.04 guest in it. I then installed Sysbench in both the VMWare and Hyper-V guest Ubuntu OS.
I then ran a disk IO comparison with:
sysbench fileio --file-total-size=15G prepare
sysbench fileio --file-total-size=15G --file-test-mode=rndrw --time=300 --max-requests=0 --histogram run
The results bore out the suspicion:
| Metric | VMWare Ubuntu Guest | Hyper-V Ubuntu Guest |
| --- | ---: | ---: |
| Reads/sec | 2,847.07 | 258.37 |
| Writes/sec | 1,898.05 | 172.25 |
| fsyncs/sec | 6,074.06 | 551.20 |
| Throughput read (MiB/s) | 44.49 | 4.04 |
| Throughput written (MiB/s) | 29.66 | 2.69 |
| Latency min (ms) | 0.00 | 0.00 |
| Latency avg (ms) | 0.09 | 1.02 |
| Latency max (ms) | 329.88 | 82.77 |
| Latency 95th percentile (ms) | 0.32 | 4.10 |
One interesting thing to note during this stage was that the prepare operation for Sysbench was faster on Hyper-V by about 30% (IIRC). I did not capture the results since the prepare step isn't supposed to be part of the benchmarking.
However, after reading your comment and benchmarking results on unzip being faster on WSL2, I'm thinking there may be a connection. Both VMWare and Hyper-V/WSL2 use dynamically resized virtual disks (sometimes called "sparse"). The size of the virtual disk on the host OS essentially starts as a near-0-byte file and grows as needed up to its maximum size.
It may be that either:
Hyper-V has a performance advantage when growing the virtual disk.
Or in our testing, VMWare needed to grow the disk for these operations but the Hyper-V/WSL2 disk already had excess free space (from previously deleted files) available.
I cannot say for sure exactly which order I did things in, and the only way to know for sure would be to "shrink/compress" the virtual disks and try again.
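The growth behaviour itself is easy to see with a plain sparse file on the host; this is a generic illustration of the mechanism, not tied to either hypervisor's disk format:

```shell
# A sparse file has a large apparent size but few allocated blocks.
truncate -s 1G sparse-disk.img
ls -lh sparse-disk.img    # apparent size: ~1.0G
du -h sparse-disk.img     # allocated: ~0

# Writing data forces the filesystem to allocate real blocks; this
# allocation-on-first-write is the growth cost speculated about above.
dd if=/dev/zero of=sparse-disk.img bs=1M count=8 conv=notrunc status=none
du -h sparse-disk.img     # allocated: ~8M
rm sparse-disk.img
```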
Summary
It appears, to my naïve eye and at least on the "Pro" level of Windows, that Hyper-V has some serious performance limitations when compared to VMWare.
Tuning attempts and other comparisons
I did attempt some tuning of the Hyper-V system, but I'm no expert in that area. Regardless, there's not much that we as users could do to extend any Hyper-V tuning to WSL2 -- Microsoft would have to make most of those changes.
I did try converting the dynamic VHDX to a fixed one, in hopes that it would increase IO, but it did not make a substantial change either.
I've also since tried:
Disabling swap in WSL2
Running the Sysbench tests with more threads
Setting a fixed 12GB RAM size for WSL2.
Running sysbench under WSL2 on my primary desktop, which has faster memory and an NVMe drive compared to my test system's SSD.
Memory speed was significantly improved - Apples-to-oranges, but my desktop's memory numbers were comparable to VMWare running on the lower-end test system.
However, this made no difference on the disk IO numbers. They were still in the same range as the test system.
Next steps
It would be great if you could, perhaps, run some similar benchmarking apart from your WordPress instance, since you already have the two environments set up. If we can corroborate the data, we should probably report it to, at least, the WSL team. They can hopefully either provide some guidance on how to tune WSL2 to near parity with VMWare, or work with the Hyper-V team on this.
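If it helps, the tests above collapse into one short pass that could be run in both guests for comparison (sysbench assumed to be installed, e.g. via `sudo apt-get install -y sysbench`; run times shortened here for a quick first look):

```shell
# Quick CPU / memory / disk IO comparison pass with sysbench.
if command -v sysbench >/dev/null 2>&1; then
  sysbench cpu --time=10 run | grep 'events per second'
  sysbench memory run | grep 'transferred'
  sysbench fileio --file-total-size=1G prepare > /dev/null
  sysbench fileio --file-total-size=1G --file-test-mode=rndrw --time=10 run \
    | grep 'MiB/s'
  sysbench fileio --file-total-size=1G cleanup > /dev/null
else
  echo "sysbench not installed; skipping"
fi
```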
Again, it does seem surprising to me that the disparity between Hyper-V and VMWare is so great. I still struggle to believe that I myself am not doing something wrong in the benchmarking.
After taking a closer look at WSL and classic VMs and refreshing my memory a little, I have reached a theory, but I cannot prove it.
I hope this answer helps in some way, or attracts someone with direct knowledge of the question.
I asked here in the comments and to myself:
Is it possible that Hyper-V is simply configured to use a much smaller share of 'raw power' than VMware (i.e. Microsoft not giving WSL much priority, with Windows getting almost all of the available resources)? Or is it a more intrinsic problem with Hyper-V's performance (its code, etc.)?
This comes from my own understanding that WSL seems intended for accessing Linux conveniences from Windows, not for resource-intensive activity (including network-heavy work) like hosting a web page. Given how WSL is integrated, it is intuitive to think it will run faster than a typical VM, but a VM is fully configurable: you can give it almost full access to your resources.
If you look at these answers, you can see that it doesn't really seem to be intended to replace a VM per se.
So I think that WSL is probably not configured for these kinds of tasks, nor is it meant to be configurable enough to change that.
I think the main use Microsoft was aiming at with WSL was to give Windows users a dynamic workflow for switching between Windows and Linux tooling (since, IMHO, Linux is much better at command-line tooling than Windows), but NOT to make WSL a full-fledged VM with all of a VM's characteristics. That also makes sense: why would you build, for example, a web page and host it in a Linux environment that has the burden of sharing resources with a Windows environment you are not using, and which is the 'main' one?
There are known networking bottlenecks in WSL2.
See this GitHub issue for potential workarounds: https://github.com/microsoft/WSL/issues/4901
Some solutions that have worked for others:
https://github.com/microsoft/WSL/issues/4901#issuecomment-664735137
https://github.com/microsoft/WSL/issues/4901#issuecomment-909723742
https://github.com/microsoft/WSL/issues/4901#issuecomment-957851617
If we run install.packages("dplyr") on a GCP 'RStudio Server Pro Standard' VM, it takes around 3 minutes to install (on an instance with 4 cores / 15 GB RAM).
This seems unusual, as installation would typically take ~20 seconds on a laptop with equivalent specs.
Why so slow, and is there a quick and easy way of speeding this up?
Notes
I use the RStudio Server Pro Standard image from GCP marketplace to start the instance
Keen to know if there are any 'startup scripts' or similar I can set to run after the instance starts, e.g. to install a collection of commonly used packages
@user5783745 you can also adjust the Makevars file to allow multithreaded compilation, which will help speed up package builds.
I followed this RStudio community post, and dropped MAKEFLAGS = -j4 into ~/.R/Makevars.
This basically halved the time it took to install dplyr from scratch on the RStudio Server Pro Standard for GCP instance I spun up (same as yours: 4 vCPUs, 15 GB RAM).
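For reference, the change described above amounts to the following (assuming 4 vCPUs; match `-j` to your core count):

```shell
# Tell R's package build system to run make with 4 parallel jobs
mkdir -p ~/.R
printf 'MAKEFLAGS = -j4\n' >> ~/.R/Makevars
```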