• 0 Posts
  • 62 Comments
Joined 1 year ago
Cake day: July 4th, 2023

  • I agree that UI should always take priority. I shouldn’t have to do anything to guarantee this.

    I run a HZ_1000, tickless kernel with nohz_full set up. All of this has a throughput/bandwidth cost (about 2%) in exchange for better responsiveness by default.
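
    For anyone wanting to check their own setup, here is a minimal sketch. It assumes the distro ships the kernel config as /boot/config-$(uname -r) (some only expose /proc/config.gz) and that the kernel exports /sys/devices/system/cpu/nohz_full. The function name kernel_hz is made up for this example.

    ```shell
    # kernel_hz CONFIG_FILE: print the configured tick rate, i.e. the value
    # of the CONFIG_HZ= line in a kernel config file.
    # (Hypothetical helper; the path you pass in depends on your distro.)
    kernel_hz() {
        sed -n 's/^CONFIG_HZ=//p' "$1"
    }

    # Example usage (paths may differ on your system):
    #   kernel_hz "/boot/config-$(uname -r)"     # prints 1000 on a HZ_1000 kernel
    #   cat /sys/devices/system/cpu/nohz_full    # CPUs running in full tickless mode
    #   cat /proc/cmdline                        # look for nohz_full=... here
    ```

    An empty nohz_full list usually means the boot parameter was not set, even if the kernel was built with full tickless support.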

    But this is not enough, because short-burst UI tasks need near-zero wake-up latency. By the time the task scheduler has done its rebalancing, the UI task is already sleeping/halted again, and the cycle repeats. So nice values/priorities don’t work very well for UI tasks. The only way a UI task can run immediately is if it can preempt something or if the system has a somewhat idle CPU to put it on.

    The kernel has no way of knowing which tasks are like this. The ongoing EEVDF and sched_ext scheduler projects attempt to improve the situation. (EEVDF should allow specifying the desired latency, while sched_ext will likely allow tuning the latency automatically.)


  • No, I definitely want it to use as many resources as it can get.

    taskset -c 0 nice -n+5 bash -c 'while :; do :; done' &
    taskset -c 0 nice -n+0 bash -c 'while :; do :; done'
    

    Observe the CPU usage of the nice +5 job: it’s roughly 1/3 of the nice +0 job (about 25% vs. 75% with the default nice weights). End one of the tasks and the remaining one jumps back to 100%.

    Nice’ing doesn’t limit the maximum CPU bandwidth a task is allowed; it only matters when there is contention for that bandwidth, such as two tasks running on the same CPU thread. To me, this sounds like exactly what you want: run at full tilt when there is no contention.
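
    The split between two busy tasks on one CPU can be estimated from the scheduler's load weights. The sketch below is a simplification: it assumes both tasks are always runnable, pinned to the same CPU, and in the same cgroup (autogrouping or cgroup weights can change the picture). The weights are copied from the kernel's sched_prio_to_weight table for nice 0 to 10 only; the function names are made up for this example.

    ```shell
    # nice_weight N: load weight for nice value N (0..10 only in this sketch),
    # taken from the kernel's sched_prio_to_weight table.
    nice_weight() {
        case "$1" in
            0) echo 1024 ;;  1) echo 820 ;;  2) echo 655 ;;  3) echo 526 ;;
            4) echo 423 ;;   5) echo 335 ;;  6) echo 272 ;;  7) echo 215 ;;
            8) echo 172 ;;   9) echo 137 ;;  10) echo 110 ;;
            *) echo "unsupported nice value: $1" >&2; return 1 ;;
        esac
    }

    # nice_share A B: approximate percent of one CPU that a nice-A task gets
    # while competing with a single nice-B task (integer division, rounds down).
    nice_share() {
        wa=$(nice_weight "$1") || return 1
        wb=$(nice_weight "$2") || return 1
        echo $(( 100 * wa / (wa + wb) ))
    }

    # nice_share 5 0   # prints 24
    # nice_share 0 5   # prints 75
    ```

    With no competitor there is nothing to divide, which matches the point above: the weight only matters under contention.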


  • The kernel runs out of time to solve the NP-complete scheduling problem.

    More responsiveness requires more context-switching, which then subtracts from the available total CPU bandwidth. There is a point where the task scheduler and CPUs get so overloaded that a non-RT kernel can no longer guarantee timed events.

    So, web browsing is basically poison for the task scheduler under high load, unless you reserve some CPU bandwidth (with cgroups, etc.) for the foreground task beforehand.
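
    One way to approximate such a reservation is to give the heavy job a lower cgroup CPU weight rather than raising the foreground task. The sketch below only prints a systemd-run command for review; it assumes a systemd system on cgroup v2 with the cpu controller delegated to the user session (otherwise drop --user and run it as root). The function name low_weight_cmd and the weight of 20 are picked for illustration.

    ```shell
    # low_weight_cmd WEIGHT CMD...: print a systemd-run invocation that would
    # start CMD in its own transient scope (cgroup) with the given CPUWeight.
    # systemd's default CPUWeight is 100, so a lower value yields to other
    # cgroups when the CPUs are contended. (Hypothetical helper name.)
    low_weight_cmd() {
        weight=$1
        shift
        echo "systemd-run --user --scope -p CPUWeight=$weight -- $*"
    }

    # Print the command for a heavy build, then run it by hand if it looks right:
    #   low_weight_cmd 20 make -j16
    ```

    Like nice, a weight is relative: the build still gets the whole machine when nothing else wants it.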

    Since SMT threads also aren’t real cores (each is worth roughly 0.4 to 0.7 of an actual core), putting 16 tasks on a 16-thread/8-core machine is only going to slow down the execution of all other tasks on the shared cores. I usually leave one CPU thread for “housekeeping” if I need to do something else. If I don’t, some random task is going to be very pleased by not having to share a core. That “spare” CPU thread will be running literally everything else, so it may get saturated by the kernel tasks alone.

    nice +5 is more of a suggestion to “please run this task with a worse latency on a contended CPU.”

    (I think I should benchmark make -j15 vs. make -j16 to see what the difference is)
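
    A rough way to run that comparison, assuming a project with a working `clean` target and a build long enough that whole-second timing is meaningful. The function name time_build is made up for this example, and a single run per setting is noisy, so repeat it a few times before drawing conclusions.

    ```shell
    # time_build N: clean the tree, build with `make -jN`, and print
    # "N ELAPSED", where ELAPSED is wall-clock time in whole seconds.
    time_build() {
        n=$1
        make clean >/dev/null 2>&1
        start=$(date +%s)
        make -j"$n" >/dev/null 2>&1
        end=$(date +%s)
        echo "$n $(( end - start ))"
    }

    # Compare leaving one CPU thread free against using all 16:
    #   for j in 15 16; do time_build "$j"; done
    ```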