• 2 Posts
• 91 Comments
Joined 2 years ago · Cake day: June 15th, 2023

  • sntx@lemm.ee to Linux@lemmy.ml · Reassessing Wayland · 2 days ago

    To be honest, I switched to Wayland years ago precisely because of the better perceived input/cursor experience.

    Change my mind, but an average of half a frame of input latency (≈8 ms at 60 Hz) is a price worth paying when, in return, the cursor position on screen actually aligns with all the other content displayed.

    Plus, I’m very sensitive to tearing, so whenever it happens it feels to me like a major rendering error.

    As for the note that the cursor might visibly stutter: sure, but it’s a bit misleading. A game pinning the GPU at 100% and running at 5 FPS doesn’t mean your cursor will also be rendered at 5 FPS. So far I’ve only noticed cursor lag/stutter in OOM situations, never under heavy GPU or CPU load.

  • I’m also on a P2P-enabled 2x3090 setup with 48 GB of VRAM. Honestly, it’s a nice experience, but still somewhat limiting…

    I’m currently running deepseek-r1-distill-llama-70b-awq with the aphrodite engine (the same applies to llama-3.3-70b). It works great and is way faster than ollama, for example, but my max context is around 22k tokens. More VRAM would allow me more context; even more VRAM would allow speculative decoding, CUDA graphs, … (rough sketch of the launch below).

    Maybe I’ll drop down to a 35b model to get more context and a bit more speed, but I’m not sure I can justify the possible decrease in answer quality.
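
    For reference, a minimal sketch of that kind of launch, assuming the vLLM-style Python API that aphrodite-engine mirrors; the model path, context length, and memory fraction are illustrative, not my exact config:

    ```python
    # Minimal sketch, not an exact config: serve a 70B AWQ model across two
    # 3090s with tensor parallelism, via the vLLM-style API that
    # aphrodite-engine mirrors. Model path and numbers are illustrative.
    from vllm import LLM, SamplingParams

    llm = LLM(
        model="deepseek-r1-distill-llama-70b-awq",  # local path / repo holding the AWQ weights
        quantization="awq",
        tensor_parallel_size=2,        # shard the weights across both 3090s
        max_model_len=22_000,          # roughly where 48 GB tops out for this model
        gpu_memory_utilization=0.95,   # leave a little VRAM headroom
    )

    outputs = llm.generate(
        ["Explain why KV-cache size limits context length."],
        SamplingParams(max_tokens=256),
    )
    print(outputs[0].outputs[0].text)
    ```

    The weights themselves are a fixed cost; it’s the KV cache that eats whatever VRAM is left, which is why more VRAM translates almost directly into more context.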

  • Thanks for the writeup! So far I’ve been using ollama, but I’m always open to trying alternatives. To be honest, it seems I was oblivious that alternatives even existed.

    Your post is suggesting that the same models with the same parameters generate different results when run on different backends?

    I can see how the backend would have an influence on handling concurrent API calls, RAM/VRAM efficiency, supported hardware/drivers, and general speed.

    But going as far as having different context windows and quality-degradation issues is news to me (see the sketch below).
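
    One concrete way this can happen, as a minimal sketch assuming a local ollama instance on the default port (the model tag and num_ctx value are illustrative): ollama historically defaults to a small context window and silently truncates longer prompts unless you raise it per request, which alone can change outputs between backends.

    ```python
    # Minimal sketch, assuming a local ollama instance on the default port.
    # ollama's default context window has historically been small (num_ctx=2048),
    # silently truncating long prompts; raising it per request avoids that.
    # Model tag and num_ctx are illustrative assumptions.
    import requests

    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={
            "model": "llama3.3:70b",
            "prompt": "Summarize this long document: ...",
            "stream": False,
            "options": {"num_ctx": 8192},  # raise the context window explicitly
        },
        timeout=600,
    )
    print(resp.json()["response"])
    ```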