* Front panel

The case's manual has a terse illustration with two arrows to pull the front panel "away and up" from the rest of the case. Here too, the amount of force required to do that is terrifying. Notice how [[https://www.youtube.com/watch?v=nUD0HyzVpLg][our friend here]] cuts abruptly at 8:17; that's because the levels of violence required to tear that panel off are too graphic for YouTube.

* Front fan

Remember that fan from earlier, the one with only 3 holes for the motherboard's 4 pins? Turns out

1. that last "optional" pin is supposed to allow speed control; without it, the fan always spins at full speed;
2. the fan itself (ZA1225ASL) is [[https://www.youtube.com/watch?v=pd6gDY7LPlU][complete and utter crap]]: it cannot be disassembled, so no cleaning off the dirt, no greasing.

So the thing is loud, it always spins at full speed, and if one day it decides to become even louder than usual, you're SOL.

* Motherboard
** Firmware updates

Quoth ~fwupdmgr get-devices~:

#+begin_example
WARNING: UEFI capsule updates not available or enabled in firmware setup
See https://github.com/fwupd/fwupd/wiki/PluginFlag:capsules-unsupported for more information.
#+end_example

Quoth the wiki:

#+begin_quote
Most typically entering the firmware setup screen and enabling capsule updates will cause this warning to disappear, and also make firmware updates possible. The relevant option may be poorly labelled, for example "allow Windows UEFI updates".
#+end_quote

Not seeing any such option in the boot menu.

#+begin_quote
It is possible, but unlikely, that flashing the latest vendor BIOS, using either Windows or a LiveCD, will add support for [the thing that correlates with capsule updates being enabled].
#+end_quote

Well then. [[https://www.msi.com/Motherboard/B550M-A-PRO/support#bios][Vendor says]] "put this on a stick; reboot; ask the menu to flash from the stick".

Putting some feelers out first:

#+begin_quote
If you execute a UEFI update, this update might delete the existing UEFI boot entries

— [[https://wiki.archlinux.org/title/GRUB#Installation][ArchWiki]], 2024
#+end_quote

#+begin_quote
Like others in this forum, I too suffered from a reformatted EFI partition following a BIOS update on my desktop pc. I had no idea that the MSI BIOS team doesn’t care about Linux installs, so to my surprise, following the update, my system booted straight to windows.

[…]

Ultimately, I completely wiped and recreated the EFI partition with gparted (fat32), changed the structure to GPT with gdisk, and then mounted that partition in the /mnt/efi location, and then proceeded to generate a new fstab with genfstab. After arch-chroot’ing into my endeavoros install, I ran bootctl install (which complained about boot loader not setting esp information) and then reinstall-kernels. I updated the loader.conf with the correct default boot ID, and set the recommended options. That got me back into my system after quite a bit of trial and error.

— [[https://forum.endeavouros.com/t/endeavoros-efi-partition-wiped-by-msi-bios-update/54740][EndeavourOS forums]], May 2024
#+end_quote

#+begin_quote
when updating the bios, it cleared all my settings. Apparently, this includes clearing the list of boot loaders, which it set back to the default of just Windows. Sadly this bios does not provide the tools to add boot entries as, apparently, some do. To fix it, I managed to boot to a Linux live USB and add the missing entry using the efiboomgr command line tool.

— [[https://forum-en.msi.com/index.php?threads/updating-to-bios-7a32v1q1-wont-see-linux-uefi-boot.388109/][MSI AMD forums]], August 2023
#+end_quote

Welp. OT1H, I could dedicate a couple of weekends to learning the joys and wonders of efibootmgr, gdisk & friends. OTOH I sort of like keeping my desktop station… not bricked?

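For future reference, the =efibootmgr= part of those recovery stories looks small. A rough sketch, untested against this board: the helper below only /prints/ the command that would recreate a lost boot entry, so it can be reviewed before being run by hand as root. =boot_entry_cmd= is a made-up name, and every disk, partition, label and loader value passed to it is a placeholder to be checked against the actual machine.

```shell
# Hypothetical helper: print (do NOT run) the efibootmgr invocation that
# would recreate a UEFI boot entry.
#   $1 = disk holding the EFI system partition (placeholder, e.g. /dev/sda)
#   $2 = partition number of the EFI system partition
#   $3 = label shown in the firmware boot menu
#   $4 = loader path relative to the EFI partition, with backslashes
boot_entry_cmd() {
    disk=$1
    part=$2
    label=$3
    loader=$4
    # %s does not interpret backslashes, so the loader path is kept verbatim.
    printf "efibootmgr --create --disk %s --part %s --label '%s' --loader '%s'\n" \
        "$disk" "$part" "$label" "$loader"
}
```

Usage would be along the lines of =boot_entry_cmd /dev/sda 1 'openSUSE' '\EFI\opensuse\grubx64.efi'= (all four values are guesses). Saving the output of =efibootmgr -v= before flashing would give something to compare against afterwards.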
Pity, because otherwise I've had smooth and incident-free firmware updates on other stations with ~fwupdmgr~ 🤷

* SSD
** Failure

On November 19 2024, LDLC's off-brand SSD died on me. RIP.

Re-installed Tumbleweed on the replacement (Kingston SA400S3) on November 28. Since then…

*** Performance loss

Getting uncannily reproducible frame drops (60 ↘ 40±10, movement visibly choppy) in Hades Ⅱ when moving toward effects/particles-heavy areas. No idea WTF, those areas ran fine before.

- "High" graphics setting at native 1920×1080 resolution.
  - Tried "Low" graphics, lowered resolution, disabled vsync: symptoms persist.
- Not forcing any "compatibility tool" version, assuming this yields "Proton Experimental".
  - Tried a couple of old Proton versions: symptoms persist.
- Reinstalled game & nuked everything under
  - =~/.cache/mesa_shader_cache*=
  - =~/.cache/radv_builtin_shaders*=
  - =~/.config/unity3d=
  - =~/.local/share/Steam=
  - =~/.local/share/vulkan/=
  - =~/.steam*=
  in case "stale shaders" were to blame or something.
- Tumbleweed/Plasma/Wayland session.
  - Tried X11: symptoms persist.
- Reducing noise with =balooctl6 suspend=, =swapoff -a= (RAM nowhere near exhausted).

Well then.

**** CPU frequency scaling?

Started by noticing that the Plasma "Power Management" tray widget says "Power Profile" is "Not available". Not 100% sure whether that was the case with the old installation; maybe I had had something configured or installed to enable this?

Internet says "install and enable power-profiles-daemon", except that's on:

#+begin_example
$ systemctl status power-profiles-daemon.service
● power-profiles-daemon.service - Power Profiles daemon
     Loaded: loaded (/usr/lib/systemd/system/power-profiles-daemon.service; disabled; preset: disabled)
     Active: active (running) since Sun 2024-12-01 11:46:32 CET; 45min ago
 Invocation: b2545a02bc9642b7aeb5f370e8b50e7c
   Main PID: 2289 (power-profiles-)
      Tasks: 4 (limit: 18320)
        CPU: 52ms
     CGroup: /system.slice/power-profiles-daemon.service
             └─2289 /usr/libexec/power-profiles-daemon
#+end_example

But:

#+begin_example
$ powerprofilesctl
,* balanced:
    PlatformDriver: placeholder

  power-saver:
    PlatformDriver: placeholder
#+end_example

Internet says I am missing the right scaling driver, and seems very keen on enabling =amd_pstate=, which I do not seem to have available:

#+begin_example
$ cpupower frequency-info
analyzing CPU 5:
  driver: acpi-cpufreq
  CPUs which run at the same hardware frequency: 5
  CPUs which need to have their frequency coordinated by software: 5
  maximum transition latency: Cannot determine or is not supported.
  hardware limits: 1.40 GHz - 3.70 GHz
  available frequency steps: 3.70 GHz, 1.70 GHz, 1.40 GHz
  available cpufreq governors: ondemand performance schedutil
  current policy: frequency should be within 1.40 GHz and 3.70 GHz.
                  The governor "schedutil" may decide which speed to use
                  within this range.
  current CPU frequency: Unable to call hardware
  current CPU frequency: 3.30 GHz (asserted by call to kernel)
  boost state support:
    Supported: yes
    Active: no

$ zcat /proc/config.gz | grep -i pstate
CONFIG_X86_INTEL_PSTATE=y
CONFIG_X86_AMD_PSTATE=y
CONFIG_X86_AMD_PSTATE_DEFAULT_MODE=3
# CONFIG_X86_AMD_PSTATE_UT is not set
#+end_example

=/proc/config.gz= suggests the kernel configuration supports it, but =cpupower= does not seem to know about it.

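The driver in use can also be read straight from sysfs, without =cpupower=. A minimal sketch, assuming the usual =cpufreq/policyN/scaling_driver= layout; =scaling_drivers= is a made-up name, and its optional root argument is only there so the function can be pointed at a fake tree for testing:

```shell
# Print the distinct cpufreq scaling drivers found under a sysfs root.
#   $1 = sysfs root (assumption: defaults to the real /sys)
scaling_drivers() {
    root="${1:-/sys}"
    for f in "$root"/devices/system/cpu/cpufreq/policy*/scaling_driver; do
        # An unmatched glob stays literal; skip anything unreadable.
        [ -r "$f" ] || continue
        cat "$f"
    done | sort -u
}
```

Called without arguments, this should print the same name that =cpupower frequency-info= reports on its =driver:= line (=acpi-cpufreq= in the output above).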
=dmesg= offers:

#+begin_example
$ sudo dmesg -H
[…]
amd_pstate: the _CPC object is not present in SBIOS or ACPI disabled
#+end_example

Though:

#+begin_example
$ lscpu | grep -i cppc
Flags: […] cppc […]
#+end_example

So ACPI problem? Lots of posts mentioning =amd_= parameters on the kernel command-line but AFAIU those are stale with newer kernels (6.11 here) which automatically (attempt to) load the =amd_pstate= driver.

Went through the UEFI menu and found nothing related to ACPI or [[https://forum.level1techs.com/t/amd-p-state-driver/197885/24][X2APIC]]. Skeptical of UEFI settings being the culprit anyway, since I did not change them between the old and new installations.

/Some time later/

Probably not ACPI being disabled: =dmesg= is chock-full of ACPI noise. OTOH, using some diagnosis methods from [[https://bugzilla.kernel.org/show_bug.cgi?id=218171][this kernel bug report]]:

#+begin_example
$ find /sys/devices -name '*cppc*'
🦗
#+end_example

(=acpidump ; acpixtract ; iasl ; grep -i cpc *.dsl= also yields 🦗, but =iasl= complains about "unresolved" "control methods", so 🤷)

/Some time later/

[[https://wiki.archlinux.org/title/CPU_frequency_scaling#amd_pstate][ArchWiki]] does say "Change /Enable CPPC/ […] from /Auto/ to /Enabled/". My UEFI menu tucks that under /Overclocking → Advanced CPU Configuration → AMD CBS → CPPC CTRL/.

That change *does* convince Linux to enable =amd_pstate=; going over the previous tests in reverse order:

#+begin_example
$ [… acpidump && acpixtract && iasl … ] && grep -i cpc *.dsl
ssdt1.dsl: Name (_CPC, Package (0x17) // _CPC: Continuous Performance Control
[… repeats 12 times …]

$ find /sys/devices -name '*cppc*' -o -name '*pstate*' | tr -s '[:digit:]' N | sort -u
/sys/devices/system/cpu/amd_pstate
/sys/devices/system/cpu/cpufreq/policyN/amd_pstate_highest_perf
/sys/devices/system/cpu/cpufreq/policyN/amd_pstate_hw_prefcore
/sys/devices/system/cpu/cpufreq/policyN/amd_pstate_lowest_nonlinear_freq
/sys/devices/system/cpu/cpufreq/policyN/amd_pstate_max_freq
/sys/devices/system/cpu/cpufreq/policyN/amd_pstate_prefcore_ranking
/sys/devices/system/cpu/cpuN/acpi_cppc

$ sudo dmesg -H
[… ominous silence about amd_pstate …]

$ cpupower frequency-info
analyzing CPU 1:
  driver: amd-pstate-epp
  CPUs which run at the same hardware frequency: 1
  CPUs which need to have their frequency coordinated by software: 1
  maximum transition latency: Cannot determine or is not supported.
  hardware limits: 400 MHz - 4.31 GHz
  available cpufreq governors: performance powersave
  current policy: frequency should be within 2.38 GHz and 4.31 GHz.
                  The governor "powersave" may decide which speed to use
                  within this range.
  current CPU frequency: Unable to call hardware
  current CPU frequency: 3.57 GHz (asserted by call to kernel)
  boost state support:
    Supported: yes
    Active: yes
    AMD PSTATE Highest Performance: 255. Maximum Frequency: 4.31 GHz.
    AMD PSTATE Nominal Performance: 219. Nominal Frequency: 3.70 GHz.
    AMD PSTATE Lowest Non-linear Performance: 141. Lowest Non-linear Frequency: 2.38 GHz.
    AMD PSTATE Lowest Performance: 24. Lowest Frequency: 400 MHz.

$ powerprofilesctl
  performance:
    CpuDriver: amd_pstate
    Degraded: no

,* balanced:
    CpuDriver: amd_pstate
    PlatformDriver: placeholder

  power-saver:
    CpuDriver: amd_pstate
    PlatformDriver: placeholder
#+end_example

And lo, the 🍃↔🚀 slider appears in the Power Management tray widget. Nervous about entering the "Overclocking" UEFI zone tho, and concerned about these "Maximum frequencies".

/And does it even help with the game?/

🥁

No. No it does not; no discernible difference in FPS nor vibes.

Will assume this new baseline cannot hurt - OT1H "overclocking" is scary, OTOH Linux now has a finer handle on the CPU and hopefully will not overwork it to death?

**** Sᴇᴠᴇʀᴀʟ Wᴇᴇᴋꜱ Lᴀᴛᴇʀ

- [[https://www.gamingonlinux.com/forum/topic/5475/page=1/][ridge reports]] "bad frame pacing on ADMGPU",
  - when vsync is turned off: a non-factor in my testing,
  - lots of useful information in that thread tho and interesting-sounding pointers,
  - [[https://www.gamingonlinux.com/forum/topic/5475/page=2/#r42519][Shmerl]] says:
    - games can cause stutter by underloading the GPU, causing it to drop out of "high performance mode",
    - (=amdgpu_top= and =radeontop= do confirm that lag spikes correlate with GPU usage drop)
    - see [[https://gitlab.freedesktop.org/drm/amd/-/issues/1500][drm/amd#1500]]:
      - /lots/ of sysfs noodling there; unfortunately, none of the suggested settings for =power_dpm_force_performance_level= & =pp_power_profile_mode= change the symptoms.
- In [[https://gitlab.freedesktop.org/drm/amd/-/issues/3618#note_2689087][this drm/amd#3618 thread]], @agd5f suggests "6.11 stable kernels" include a fix for the issue at hand there and a further rework "was submitted to 6.13"; @mattipulkkinen reports happy results with 6.13-rc2 (FTR, symptoms persist here with 6.12.8).
- Piggybacked onto [[https://gitlab.freedesktop.org/mesa/mesa/-/issues/11300][mesa/mesa#11300]]:
  - common: Hades Ⅱ, iGPU, recent kernel & Mesa, Proton Experimental,
  - differences: Fedora, GNOME, X11,
  - noteworthy: good performance on Windows,
  - suggestion by @Venemo: downgrade & bisect Mesa;
    - tempting, though scared of bricking graphical sessions and/or ending up with a frankensystem (installing binaries under a prefix is probably easy, but then keeping track of config tweaks and cache artifacts sounds fraught).
- In [[https://gitlab.freedesktop.org/upower/power-profiles-daemon/-/issues/164][upower/power-profiles-daemon#164]], @Nyan reports problematic iGPU capping; not convinced this is applicable though, given the reported symptoms (video playback is fine here).
- Seen reports of Variable Refresh Rate causing problems:
  - searched high and low to understand why VRR appears nowhere in Plasma settings, despite the start menu turning up "Display Configuration" when searching for "VRR",
  - mystery solved by ~kscreen-doctor -o~: =Vrr: incapable= 🤷
- [[https://www.techpowerup.com/forums/threads/what-fixed-stuttering-and-random-framerate-spikes-in-games-for-me.327264/][aska33j proclaims]] that /disabling CPPC/ "fixed stuttering and random framerate spikes in games for [them]" so… roundtrip to UEFI, disabling that. The =amd_pstate= warning is back; the "Power Profile" slider is no longer accessible in the systray widget; no discernible effect in-game anyway.
- Looking at Steam forums, [[https://steamcommunity.com/app/1145350/discussions/1/596260472619121965/][some folks]] do report FPS drops /shortly after the update/:
  #+begin_quote
  it started fine after the major update, now suddenly im stuck with 40~50 fps with micro sutters

  — December 6 2024
  #+end_quote
- After AMD drivers & Mesa, figured I could look at vkd3d's issue tracker.
  [[https://github.com/doitsujin/dxvk/issues/4436][doitsujin/dxvk#4436]] and [[https://github.com/ValveSoftware/steam-for-linux/issues/11446][ValveSoftware/steam-for-linux#11446]] looked somewhat promising: reports of lag on "KDE Tumbleweed Wayland", reported not long before my symptoms began (November 2024); alas, ~LD_PRELOAD=~ does not help.
  - #+begin_quote
    Alternatively, remove the offending line in =/usr/share/drirc.d/00-radv-defaults.conf=
    #+end_quote
    /discovers [[https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/util/00-radv-defaults.conf][=/usr/share/drirc.d/=]]/

    Computers were a mistake.
- Peeked at [[https://github.com/HansKristian-Work/vkd3d-proton/blob/master/.github/ISSUE_TEMPLATE/bug_report.md][vkd3d-proton's issue template]] and idly ran with ~PROTON_LOG=1~. Over the course of 30 seconds or so, the log file gets flooded with 3MB's worth of =trace:unwind:dump_unwind_info= 🤨

**** This is insane

Selected subset of moving parts; "testability" considering ease of clean reverts:

| Part         | Testability |
|--------------+-------------------------------------------------------------------------------------|
| Linux kernel | 🫣 [[https://en.opensuse.org/SDB:InstallNewerKernel][some distro documentation]]; afraid of side-effects |
| AMD drivers  | 🤷 no clue; maybe inextricable from kernel? |
| Mesa         | 😬 easy to recompile; hard to control transient state in cache & config folders |
| Steam        | 🫥 under Steam's control |
| Wine         | 🫥 under Steam's control |
| Proton       | 👌 as long as I stick to versions under Steam's control; have not considered GE yet |
| vkd3d-proton | 🫥 under Steam's control |
| Hades Ⅱ      | 🫥 under Steam's control |

That's looking at software packages as individual blackboxes; config-wise, worth noting:

| Part       | Testability |
|------------+-------------------|
| AMD pstate | 😬 UEFI roundtrip |
| sysfs      | OK |

Let's throw in:

| Part          | Testability |
|---------------+-----------------------------------|
| Mobo firmware | 🔥 reports of nuked boot settings |
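
What makes the sysfs row "OK" is that a tweak is a one-line write, which can be reverted by writing the old value back. A minimal sketch of that save-then-write pattern; =sysfs_swap= is a made-up name, and the =card0= path in the usage note is an assumption (the card index may differ):

```shell
# Write a new value to a sysfs-style file and print the previous value,
# so that the caller can restore it later.
#   $1 = file to write to
#   $2 = new value
sysfs_swap() {
    file=$1
    new=$2
    old=$(cat "$file") || return 1
    printf '%s\n' "$new" > "$file" || return 1
    printf '%s\n' "$old"
}
```

In a root shell, something like =prev=$(sysfs_swap /sys/class/drm/card0/device/power_dpm_force_performance_level high)= would force the level for a test run, and =sysfs_swap /sys/class/drm/card0/device/power_dpm_force_performance_level "$prev"= would put things back afterwards.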