Hacker News

> GPT-4: 8 x 220B experts trained with different data/task distributions and 16-iter inference.

There was a post on HackerNews the other day about a 13B open source model.

Any 220B open source models? Why or why not?

I wonder what the 8 categories were, and what goes into classifying tokens to decide which expert should handle them. What if a token fits two models — how do the models route between each other?
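Nobody outside OpenAI knows the routing details, but the standard mixture-of-experts design uses a small learned gating network that scores each token and dispatches it to the top-k experts. A toy sketch (all names, shapes, and the top-k choice are illustrative assumptions, not GPT-4's actual mechanism):

```python
import numpy as np

rng = np.random.default_rng(0)

N_EXPERTS, D_MODEL, TOP_K = 8, 16, 2

# Hypothetical learned gating matrix: one score per expert per token.
W_gate = rng.standard_normal((D_MODEL, N_EXPERTS))

def route(token_embedding):
    """Return indices of the top-k experts and their softmax weights."""
    logits = token_embedding @ W_gate            # (N_EXPERTS,) scores
    top = np.argsort(logits)[-TOP_K:][::-1]      # best experts first
    weights = np.exp(logits[top] - logits[top].max())
    weights /= weights.sum()
    return top, weights

token = rng.standard_normal(D_MODEL)
experts, weights = route(token)
# The layer's output would be the weighted sum of the chosen experts'
# outputs, so a token can effectively "go between" two experts at once.
```

In this scheme routing happens per token inside the forward pass, not as a one-shot delegation before inference.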




220B open source models wouldn't be as useful for most users.

You already need two 24 GB RTX 3090 cards to run inference with a 65B model quantized to 4 bits. Going beyond that (already expensive) hardware is out of reach for the average hobbyist developer.
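The arithmetic behind that hardware requirement (a rough estimate for the weights alone, ignoring the KV cache and runtime overhead):

```python
def weight_gib(params_billion, bits):
    """Approximate weight storage in GiB at a given quantization width."""
    return params_billion * 1e9 * bits / 8 / 2**30

w65 = weight_gib(65, 4)
# ~30 GiB of weights: too big for a single 24 GiB card, but two cards
# give 48 GiB, leaving headroom for the KV cache and activations.
print(round(w65, 1))
```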


You could run it quantized to 4 bits on CPU with 256 GB of RAM, which is much cheaper to rent or buy. Sure, it might be somewhat slow, but for lots of use cases that doesn't matter.
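A sanity check on that memory figure (again just the weights, ignoring runtime overhead):

```python
params = 220e9
bytes_per_param = 0.5          # 4-bit quantization
gib = params * bytes_per_param / 2**30
# ~102 GiB of weights: fits comfortably in 256 GB of system RAM,
# with plenty left over for the KV cache and the OS.
print(round(gib, 1))
```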


Benchmarks I've run on a Ryzen 7950X with 128 GB RAM and an Nvidia GeForce 3060 with 12 GB VRAM show less than a 2x slowdown when not using the GPU, with llama.cpp as the inference platform and various ggml open-source models in the 7B-13B parameter range.

The Ryzen does best with 16 threads, not the 32 it is capable of, which is expected since it has 16 physical cores.
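For reference, pinning the thread count in llama.cpp is a single flag (model path and prompt are placeholders; `-t` sets threads, `-n` the number of tokens to generate):

```shell
# CPU-only run, matching physical cores rather than SMT threads
./main -m models/7B/ggml-model-q4_0.bin -t 16 -n 128 \
       -p "Explain mixture-of-experts in one paragraph."
```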


llama.cpp running on the GPU is pretty slow; better to try something else. The speedup going from CPU to an RTX 3090 is usually around 10x to 15x.


Google open-sourced (Apache 2.0) the Switch Transformers C-2048 model (1.6T parameters for 3.1 TB): https://huggingface.co/google/switch-c-2048


I think it’s just an ensemble of models, so you do some kind of pooling/majority vote on the output tokens.
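If it really were a plain ensemble, the pooling step could be as simple as a per-position majority vote over the member models' proposed tokens. A toy illustration of that idea (not a claim about how GPT-4 actually works):

```python
from collections import Counter

def majority_vote(proposals):
    """proposals: list of token sequences, one per ensemble member."""
    out = []
    for position in zip(*proposals):
        token, _count = Counter(position).most_common(1)[0]
        out.append(token)
    return out

votes = [
    ["the", "cat", "sat"],
    ["the", "cat", "ran"],
    ["the", "dog", "sat"],
]
print(majority_vote(votes))  # ['the', 'cat', 'sat']
```

In practice ensembles more often pool the models' probability distributions before sampling, since hard voting on already-sampled tokens discards most of the information.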


Would this happen before or after inference? Is there some sort of delegation based on subject matter?


If it's the output tokens being pooled, then it happens after inference.


same. i wish i had asked george instead of nodding along like an idiot. he probably wouldn't know, but at least he'd speculate in interesting ways.


It was a great interview, thank you.



