• supert@lemmy.sdfeu.org
      1 year ago

      I can run 4-bit quantised Llama 70B on a pair of 3090s, or rent GPU server time. It’s expensive but not prohibitive.
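
      A minimal sketch of what that looks like with Hugging Face transformers + bitsandbytes, assuming a Llama-2-70B-class checkpoint you have access to and device_map="auto" to split the layers across the two 3090s (the model ID and prompt are just examples):

      ```python
      import torch
      from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

      model_id = "meta-llama/Llama-2-70b-chat-hf"  # example checkpoint (gated access)

      bnb_config = BitsAndBytesConfig(
          load_in_4bit=True,                 # ~35 GB of weights instead of ~140 GB
          bnb_4bit_quant_type="nf4",
          bnb_4bit_compute_dtype=torch.float16,
      )

      tokenizer = AutoTokenizer.from_pretrained(model_id)
      model = AutoModelForCausalLM.from_pretrained(
          model_id,
          quantization_config=bnb_config,
          device_map="auto",                 # shards layers across both GPUs
      )

      prompt = "Explain 4-bit quantisation in one sentence."
      inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
      out = model.generate(**inputs, max_new_tokens=64)
      print(tokenizer.decode(out[0], skip_special_tokens=True))
      ```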

      • anotherandrew@lemmy.mixdown.ca
        1 year ago

        I’m trying to get to the point where I can locally run a (slow) LLM that I’ve fed my huge ebook collection to, and ask it where to find info on $subject, getting title/page info back. The PDFs that are searchable aren’t too bad, but finding a way to OCR the older TIFF-scan PDFs, and getting it to “see” graphs/images, are the areas I’m stuck on.
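
        For the OCR part, a rough sketch assuming pdf2image + pytesseract (both need the poppler and tesseract binaries installed; the ebooks/ path and the indexing step are placeholders) that turns an image-only PDF into per-page text you could feed into a search index:

        ```python
        from pathlib import Path

        import pytesseract
        from pdf2image import convert_from_path

        def ocr_pdf(pdf_path: str) -> list[tuple[int, str]]:
            """Return (page_number, text) for each page of a scanned PDF."""
            pages = convert_from_path(pdf_path, dpi=300)  # render pages to images
            return [(i + 1, pytesseract.image_to_string(img)) for i, img in enumerate(pages)]

        for pdf in Path("ebooks/").glob("*.pdf"):
            for page_no, text in ocr_pdf(str(pdf)):
                # feed (filename, page_no, text) into whatever embedding/RAG index you use
                print(pdf.name, page_no, text[:80])
        ```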

    • Grimy@lemmy.world
      1 year ago

      I personally use RunPod. It doesn’t cost much, even for the high-end stuff. Tbh the OpenAI API is easier though, and gives mostly better results.
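
      For comparison, the OpenAI side is only a few lines with the openai>=1.0 Python client (the model name is just an example; the key is read from the OPENAI_API_KEY environment variable):

      ```python
      from openai import OpenAI

      client = OpenAI()  # reads OPENAI_API_KEY from the environment

      resp = client.chat.completions.create(
          model="gpt-4o-mini",  # example model; swap in whichever you actually use
          messages=[{"role": "user", "content": "Summarise retrieval-augmented generation in two sentences."}],
      )
      print(resp.choices[0].message.content)
      ```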

      • Communist@lemmy.ml
        1 year ago

        I specifically said “large context”. How many tokens can you get through before it goes insanely slow?