Diffstat (limited to 'guides')
-rw-r--r-- guides/sysadmin/machines/amdahl30/killing-time.org | 98
1 files changed, 55 insertions, 43 deletions
diff --git a/guides/sysadmin/machines/amdahl30/killing-time.org b/guides/sysadmin/machines/amdahl30/killing-time.org
index f781582..be38736 100644
--- a/guides/sysadmin/machines/amdahl30/killing-time.org
+++ b/guides/sysadmin/machines/amdahl30/killing-time.org
@@ -1,17 +1,16 @@
-* Failure
On November 19 2024, LDLC's off-brand SSD died on me. RIP.
Re-installed Tumbleweed on the replacement (Kingston SA400S3) on
-November 28. Since then…
-** Performance loss
-Getting uncannily reproducible frame drops (60 ↘ 40±10, movement
-visibly choppy) in Hades Ⅱ when moving toward effects/particles-heavy
-areas. No idea WTF, those areas ran fine before.
+November 28.
+
+Since then, I have been getting uncannily reproducible stuttering and
+frame drops (60↘40±10) in Hades Ⅱ when moving toward effect- or
+particle-heavy areas of the hub rooms (Crossroads, Training Grounds).
+No idea WTF, those areas ran fine before.
- "High" graphics setting at native 1920×1080 resolution.
- - Tried "Low" graphics, lowered resolution, disabled vsync: symptoms
- persist.
-- Not forcing any "compatibility tool" version, assuming this yields
- "Proton Experimental".
+ - Tried "Low" graphics, lowered resolution, disabled vsync, switched
+ to Windowed mode: symptoms persist.
+- Proton Experimental.
- Tried a couple of old Proton versions: symptoms persist.
- Reinstalled game & nuked everything under
- =~/.cache/mesa_shader_cache*=
@@ -23,15 +22,22 @@ areas. No idea WTF, those areas ran fine before.
in case "stale shaders" were to blame or something.
- Tumbleweed/Plasma/Wayland session.
- Tried X11: symptoms persist.
-- Reducing noise with =balooctl6 suspend=, =swapoff -a= (RAM nowhere
- near exhausted).
+- Reducing noise with
+ - ~balooctl6 suspend~
+ - ~swapoff -a~ (RAM nowhere near exhausted)
Well then.
-*** CPU frequency scaling?
+* CPU frequency scaling?
+(Hey 👋 A warning: this was the first rabbit hole I burrowed into.
+Spoiler alert: nothing I learned here solved the problem. Feel free
+to skip to the next section if you want to know how this ends
+{{{narrator(he wrote\, furiously hoping against hope that he would
+indeed see the end of this someday)}}})
+
Started by noticing that the Plasma "Power Management" tray widget
-says "Power Profile" is "Not available". Not 100% sure whether that
-was the case with the old installation; maybe I had had something
-configured or installed to enable this?
+says "Power Profile" is "Not available". Not sure whether that was
+the case with the old installation; maybe I had something configured
+or installed to enable this?
Internet says "install and enable power-profiles-daemon", except
that's on:
@@ -60,10 +66,18 @@ $ powerprofilesctl
PlatformDriver: placeholder
#+end_example
-Internet says I am missing the right scaling driver, and seems very
-keen on enabling =amd_pstate=, which I do not seem to have available:
+Internet says I am missing the right scaling driver, and sounds very
+keen on enabling =amd_pstate=, which I do not seem to have available.
+=/proc/config.gz= suggests the kernel configuration supports it, but
+=cpupower= does not appear to know about it:
#+begin_example
+$ zcat /proc/config.gz | grep -i pstate
+CONFIG_X86_INTEL_PSTATE=y
+CONFIG_X86_AMD_PSTATE=y
+CONFIG_X86_AMD_PSTATE_DEFAULT_MODE=3
+# CONFIG_X86_AMD_PSTATE_UT is not set
+
$ cpupower frequency-info
analyzing CPU 5:
driver: acpi-cpufreq
@@ -81,16 +95,9 @@ analyzing CPU 5:
boost state support:
Supported: yes
Active: no
-
-$ zcat /proc/config.gz | grep -i pstate
-CONFIG_X86_INTEL_PSTATE=y
-CONFIG_X86_AMD_PSTATE=y
-CONFIG_X86_AMD_PSTATE_DEFAULT_MODE=3
-# CONFIG_X86_AMD_PSTATE_UT is not set
#+end_example
-=/proc/config.gz= suggests the kernel configuration supports it, but
-=cpupower= does not seem to know about it. =dmesg= offers:
+=dmesg= offers:
#+begin_example
$ sudo dmesg -H
@@ -105,14 +112,15 @@ Flags: […] cppc […]
#+end_example
So ACPI problem? Lots of posts mentioning =amd_= parameters on the
-kernel command-line but AFAIU those are stale with newer kernels (6.11
-here) which automatically (attempt to) load the =amd_pstate= driver.
+kernel command-line, but AFAIU those posts are stale with newer
+kernels (6.11 here) which automatically (attempt to) load the
+=amd_pstate= driver.
Went through the UEFI menu and found nothing related to ACPI or
-[[https://forum.level1techs.com/t/amd-p-state-driver/197885/24][X2APIC]]. Skeptical UEFI settings anyway, since I did not change them
-between the old and new installations.
+[[https://forum.level1techs.com/t/amd-p-state-driver/197885/24][X2APIC]]. Skeptical of UEFI settings anyway, since I did not change
+them between the old and new installations.
-/Some time later/
+{{{narrator(Some time later)}}}
Probably not ACPI, =dmesg= is choke full of ACPI noise. OTOH, using
some diagnosis methods from [[https://bugzilla.kernel.org/show_bug.cgi?id=218171][this kernel bug report]]:
@@ -122,10 +130,10 @@ $ find /sys/devices -name '*cppc*'
🦗
#+end_example
-(=acpidump ; acpixtract ; iasl ; grep -i cpc *.dsl= also yields 🦗,
+(~acpidump ; acpixtract ; iasl ; grep -i cpc *.dsl~ also yields 🦗,
but =iasl= complains about "unresolved" "control methods", so 🤷)
-/Some time later/
+{{{narrator(Some time later)}}}
[[https://wiki.archlinux.org/title/CPU_frequency_scaling#amd_pstate][ArchWiki]] does say "Change /Enable CPPC/ […] from /Auto/ to /Enabled/".
My UEFI menu tucks that under /Overclocking → Advanced CPU
@@ -199,7 +207,7 @@ No. No it does not; no discernible difference in FPS nor vibes.
Will assume this new baseline cannot hurt - OT1H "overclocking" is
scary, OTOH Linux now has a finer handle on the CPU and hopefully will
not overwork it to death?
-*** Sᴇᴠᴇʀᴀʟ Wᴇᴇᴋꜱ Lᴀᴛᴇʀ
+* Sᴇᴠᴇʀᴀʟ Wᴇᴇᴋꜱ Lᴀᴛᴇʀ
- [[https://www.gamingonlinux.com/forum/topic/5475/page=1/][ridge reports]] "bad frame pacing on ADMGPU",
- when vsync is turned off: a non-factor in my testing,
- lots of useful information in that thread tho and
@@ -213,6 +221,8 @@ not overwork it to death?
- /lots/ of sysfs noodling there; unfortunately, none of the
suggested settings for =power_dpm_force_performance_level= &
=pp_power_profile_mode= change the symptoms.
+ - Since this forum seems full of knowledgeable folks, posted [[https://www.gamingonlinux.com/forum/topic/6437/][a new
+ topic]] there… but then [[https://www.gamingonlinux.com/forum/topic/6463/][the UK OSA dropped]].
- In [[https://gitlab.freedesktop.org/drm/amd/-/issues/3618#note_2689087][this drm/amd#3618 thread]], @agd5f suggests "6.11 stable kernels"
include a fix for the issue at hand there and a further rework "was
@@ -248,29 +258,31 @@ not overwork it to death?
- Looking at Steam forums, [[https://steamcommunity.com/app/1145350/discussions/1/596260472619121965/][some folks]] do report FPS drops /shortly
after the update/:
#+begin_quote
- it started fine after the major update, now suddenly im stuck with 40~50 fps with micro sutters
+ it started fine after the major update, now suddenly im stuck with
+ 40~50 fps with micro sutters
— December 6 2024
#+end_quote
- After AMD drivers & Mesa, figured I could look at vkd3d's issue
tracker. [[https://github.com/doitsujin/dxvk/issues/4436][doitsujin/dxvk#4436]] and
- [[ValveSoftware/steam-for-linux#11446]] looked somewhat promising:
+ [[https://github.com/ValveSoftware/steam-for-linux/issues/11446][ValveSoftware/steam-for-linux#11446]] looked somewhat promising:
reports of lag on "KDE Tumbleweed Wayland", reported not long before
my symptoms began (November 2024)); alas, ~LD_PRELOAD=~ does not
help.
-
- #+begin_quote
- Alternatively, remove the offending line in =/usr/share/drirc.d/00-radv-defaults.conf=
- #+end_quote
+ #+begin_quote
+ Alternatively, remove the offending line in
+ =/usr/share/drirc.d/00-radv-defaults.conf=
+ #+end_quote
- /discovers [[https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/util/00-radv-defaults.conf][=/usr/share/drirc.d/=]]/
+ {{{narrator(discovers [[https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/util/00-radv-defaults.conf][=/usr/share/drirc.d/=]])}}}
- Computers were a mistake.
+ Computers were a mistake.
- Peeked at [[https://github.com/HansKristian-Work/vkd3d-proton/blob/master/.github/ISSUE_TEMPLATE/bug_report.md][vkd3d-proton's issue template]] and idly ran with
~PROTON_LOG=1~. Over the course of 30 seconds or so, the log file
gets flooded with 3MB's worth of =trace:unwind:dump_unwind_info= 🤨
-*** This is insane
+* This is insane
Selected subset of moving parts; "testability" considering ease of
clean reverts:
@@ -297,5 +309,5 @@ Let's throw in:
| Part | Testability |
|---------------+-----------------------------------|
-| Mobo firmware | 🔥 reports of nuked boot settings |
+| Mobo firmware | 🔥 [[file:maintenance.org::*Firmware updates][reports]] of nuked boot settings |