
I did something similar but using a K80 and an M40 I dug up from eBay for pennies. Be advised though, stay as far away as possible from the K80 - the drivers were one of the most painful tech things I've ever had to endure, even if 24GB of VRAM for 50 bucks sounds incredibly appealing. That said, I had a decent-ish HP workstation lying around with a 1200-watt power supply, so I had somewhere to put the two cards. The one thing to note here is that these types of GPUs have no cooling of their own. My solution was to 3D print a bunch of brackets, attach several Noctua fans and have them blow at full speed 24/7. Surprisingly, it worked way better than I expected - temperatures have never gone above 60 degrees. As a side effect, the CPUs also benefit from this hack: at idle, they sit in the mid-20-degree range. Mind you, the Noctua fans are located on the front and the back of the case: the ones on the front act as intake and the ones on the back as exhaust, and there are two more inside the case mounted directly in front of the GPUs.

The workstation was refurbished for just over 600 bucks, and another 120 bucks for the GPUs and another ~60 for the fans.

Edit: and before someone asks - no, I have not uploaded the STLs anywhere, because I haven't had the time and because this is a very niche use case, though I might. The back (exhaust) bracket came out brilliant on the first try - it was a sub-millimeter fit. Then I got cocky, thought I'd also nail the intake on the first try, and ended up re-printing it 4 times.



> Be advised though, stay as far away as possible from the K80 - the drivers were one of the most painful tech things I've ever had to endure, even if 24GB of VRAM for 50 bucks sounds incredibly appealing.

I thought the problem was that those cards have loads of RAM but lack really important compute capabilities such that they're kind of useless for actually running AI workloads on. Is that not the case?


> Is that not the case?

it is - they're laughably slow and not even supported by the latest CUDA

> NVIDIA Driver support for Kepler is removed beginning with R495. CUDA Toolkit development support for Kepler continues through CUDA 11.x.
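
For anyone checking a card before buying, the support cutoff in that note can be sketched as a simple compute-capability comparison. This is only an illustration: the compute capability values (3.7 for the K80, 5.2 for the M40) come from NVIDIA's published tables rather than from this thread, the 5.0 minimum for CUDA 12.x is an assumption consistent with the quoted note, and the names `COMPUTE_CAPABILITY`, `MIN_CC_CUDA_12` and `supported_by_cuda_12` are made up for the example.

```python
# Compute capabilities as listed by NVIDIA; these values are not from the thread.
COMPUTE_CAPABILITY = {
    "Tesla K80": (3, 7),  # Kepler
    "Tesla M40": (5, 2),  # Maxwell
}

# Assumption: CUDA 12.x toolkits target compute capability 5.0 and newer,
# in line with the quoted note that Kepler support ends with CUDA 11.x.
MIN_CC_CUDA_12 = (5, 0)


def supported_by_cuda_12(card: str) -> bool:
    """Return True if the card meets the assumed CUDA 12.x minimum."""
    return COMPUTE_CAPABILITY[card] >= MIN_CC_CUDA_12


if __name__ == "__main__":
    for card in COMPUTE_CAPABILITY:
        status = "ok" if supported_by_cuda_12(card) else "needs CUDA 11.x or older"
        print(f"{card}: {status}")
```

Under these assumptions the K80 is stuck on an 11.x toolkit while the M40 still clears the bar.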


But Deepseek R1 doesn't use CUDA, so maybe for this specific case, it isn't a big deal?


> it isn't a big deal?

friend, you shouldn't make comments like this unless you understand the definitions of the words. Deepseek wrote some parts of their kernels using PTX. newsflash: PTX support for features is in lockstep with CUDA support for the same features, i.e. if CUDA doesn't support a feature on this hardware, you couldn't write PTX that uses it either.


It is poor form to condemn someone for asking a question.

Thank you for providing the information to clear up ignorance though.


this is a question:

> is Deepseek's use of PTX instead of CUDA relevant here?

this is a conclusion/assumption thinly veiled as a question

> Deepseek R1 doesn't use CUDA, so ... it isn't a big deal?

note, genuine questions don't already presuppose an answer.


Asking if it is a big deal or not is definitely a question ;) Thank you for providing the information I was missing though.


The PTX hack is for the backend runner and training infra; the public weights are often executed using existing backends, especially the R1-distill-* models.


the two things (weights and kernels) have nothing to do with each other in the slightest. again i wish people would take a beat before commenting out of their depth and consider whether their comment adds to the conversation or not.


I'm running P41s in one of my test boxes. These don't support BF16, but they do support F16 and F32, and those are accelerated to a certain degree. They lack kernels that are as well optimized, but it's not terribly hard to adapt other ones for the purpose.
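
The dtype fallback this implies can be sketched roughly as below. It is not the kernel work itself, just the selection logic, and it rests on a labeled assumption: native BF16 arithmetic needs compute capability 8.0 or newer. `pick_dtype`, `BF16_MIN_CC` and `has_usable_fp16` are hypothetical names for the illustration.

```python
# Assumption: native BF16 arithmetic requires compute capability 8.0 or newer.
# Older cards fall back to F16 where it is usable, otherwise F32.
BF16_MIN_CC = (8, 0)


def pick_dtype(compute_capability, has_usable_fp16=True):
    """Pick a weight/activation dtype name for a given compute capability."""
    if compute_capability >= BF16_MIN_CC:
        return "bfloat16"
    return "float16" if has_usable_fp16 else "float32"


if __name__ == "__main__":
    # A hypothetical pre-8.0 card ends up on F16 instead of BF16.
    print(pick_dtype((6, 1)))
```

A model shipped in BF16 would then need its weights converted to the chosen dtype before the adapted kernels can use them.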

You don't get great out-of-the-box performance, but it only took me three work days or so, with no prior experience writing these kernels, to adapt, test, and validate a kernel using the acceleration hardware that was available.

They're not as powerful as others but still significantly better than running on a CPU alone and I'd bet my kernel is missing more advanced optimizations.

My issue with these was the power cable and fans. The author touches on the fans, and I did try a 3D printed shroud and some of the higher-pressure fans, but I could only run the cards in short stints. I ended up cutting a hole in the case with an angle grinder and building an enclosure that vents straight out of it, using two high-pressure SAN array fans per card that I harvested from the IT graveyard.

The power cable is NOT STANDARD on these. I had to find a weird specific cable to adapt the standard 8-pin GPU connector and each card takes two of these bad boys.


> K80 - the drivers were one of the most painful tech things I've ever had to endure

Well, for a dedicated LLM box it might be feasible to suffer with drivers a bit, no? What was your experience like with the software side?


Curious what HP workstation you have?


HP Z440, it's in the article.


My comment was not directed at the blog but at the person I responded to.


What kind of performance did you get out of that?


What’s the most pain you’ve ever felt?



