Hello,

I’m trying to build a homelab whose primary job is running a bunch of daily batch processing jobs. I used to do this on AWS but it was getting really expensive, so I currently do it on a Ryzen 5600 setup (it was on sale). Both on my current on-premises setup and on AWS, I have only been able to run a sample of the jobs I would like to run, and I essentially use this sample to approximate the rest (not ideal).

Each job currently takes approximately 2 minutes and uses about 2.4 MB of memory. The job is simply a Python script that reads data from a PostgreSQL database and does a type of multivariate linear approximation. It’s similar to an ML algorithm but it’s not suited for GPU processing (a lot of small matrices vs one big matrix). I need to run as many of these as possible on a daily basis. I would also like to run larger analyses on an ad hoc basis, but this type of study is not time constrained.
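To put rough numbers on "as many as possible", here is a back-of-the-envelope sketch of the daily capacity. It assumes each job is CPU-bound, runs on one worker at a time, and always takes the same wall-clock time; none of that has been measured, and the function names are just illustrative.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60


def jobs_per_day(workers: int, seconds_per_job: float,
                 window_seconds: int = SECONDS_PER_DAY) -> int:
    """Upper bound on jobs completed in the window.

    Assumes every worker runs one job at a time, back to back,
    and that jobs do not slow each other down (an idealisation).
    """
    jobs_per_worker = int(window_seconds // seconds_per_job)
    return workers * jobs_per_worker


def workers_needed(target_jobs: int, seconds_per_job: float,
                   window_seconds: int = SECONDS_PER_DAY) -> int:
    """Smallest number of workers that can finish target_jobs in the window."""
    jobs_per_worker = int(window_seconds // seconds_per_job)
    return math.ceil(target_jobs / jobs_per_worker)


if __name__ == "__main__":
    # 2-minute jobs, one worker per physical core of a 6-core CPU (assumed).
    print(jobs_per_day(workers=6, seconds_per_job=120))
    # Workers required for a hypothetical target of 10,000 jobs per day.
    print(workers_needed(target_jobs=10_000, seconds_per_job=120))
```

In practice the database and memory bandwidth will likely cap throughput below this bound, so it is only a starting point for sizing the core count.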

Besides running the script, the machine would also need to efficiently run the PostgreSQL database where the data is stored. I currently have a Fractal Node 804 case (mATX) that I’d like to re-use if possible. My only constraint is budget: I don’t want to spend more than a couple thousand Canadian dollars, though I’ll entertain more expensive options if they make sense. I am also in Canada.

Before I bite the bullet and buy a Threadripper, any ideas?

Thanks

  • thelastknowngod
    8 months ago

    To start, move the database to a different machine that has a fast SSD and lots of memory. If your workload is mostly doing reads from the db, consider breaking it into a single writer with one or more read replicas.
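One way the single-writer-plus-replicas idea could look from the job script's side is sketched below. This is only an illustration: the `ReadWriteRouter` class and the host names are hypothetical, and it assumes the replicas are already set up through PostgreSQL's own replication and that the jobs can tolerate slightly stale reads.

```python
import itertools


class ReadWriteRouter:
    """Chooses which PostgreSQL server a job should connect to.

    Writes always go to the primary; reads are spread round-robin
    across the replicas. Falls back to the primary if no replicas
    are configured.
    """

    def __init__(self, primary_dsn: str, replica_dsns: list[str]):
        self.primary_dsn = primary_dsn
        self._read_targets = itertools.cycle(replica_dsns or [primary_dsn])

    def dsn_for(self, readonly: bool) -> str:
        """Return the connection string to use for this unit of work."""
        if readonly:
            return next(self._read_targets)
        return self.primary_dsn


if __name__ == "__main__":
    # Hypothetical host names, for illustration only.
    router = ReadWriteRouter(
        primary_dsn="host=db-primary dbname=jobs",
        replica_dsns=["host=db-replica-1 dbname=jobs",
                      "host=db-replica-2 dbname=jobs"],
    )
    print(router.dsn_for(readonly=True))   # first replica
    print(router.dsn_for(readonly=True))   # second replica
    print(router.dsn_for(readonly=False))  # primary
```

The returned string would then be passed to whatever PostgreSQL driver the script already uses, so the batch jobs read from the replicas while the primary handles writes.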