And it fits the definition, doesn't it, since it tokenizes inputs to compute them ...

p1esk · 2024-09-03T05:54:20 1725342860

Wait, do you really not know what a transformer is in the context of ML? It’s been dominating the field for 7 years now.

bschmidt1 · 2024-09-03T21:48:39 1725400119

Can't read? I just explained thoroughly what it is in the comment above. Do you understand what matrix transformations are?

Do you know that a vector in LLMs for word embeddings is the same thing as a vector in 3D game dev libraries like Three.js?

Sounds like you two are the only ones who don't get it.

p1esk · 2024-09-04T00:39:24 1725410364

Please do yourself a favor and google “transformer paper”. Open the very first result and read the PDF. Hopefully it will become clear what people mean when they say “transformer” in an ML context, and you will finally realize how silly you look in this thread.

dang · 2024-09-06T19:26:57 1725650817

You guys both broke the site guidelines badly in this thread. We have to ban accounts that post like this, so please don't.

If you'd please review https://news.ycombinator.com/newsguidelines.html and stick to the rules when posting here, we'd appreciate it.

bschmidt1 · 2024-09-04T06:42:29 1725432149

You still don't get it. For LLMs, a "transformer architecture" only means one that:

- Tokenizes sequences

- Converts tokens to vectors

- Performs vector/matrix transformations

- Converts back to tokens
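The four steps listed above can be sketched in a few lines of plain Python. This is a toy illustration only: the vocabulary, the 2-D embeddings, the weight matrix `W`, and every function name are hypothetical and chosen so the arithmetic is easy to follow. It contains no attention mechanism and nothing is trained, so it is a sketch of the list as written rather than of any particular model.

```python
# Toy version of: tokenize -> vectors -> matrix transformation -> tokens.
# All values below are made up for illustration.

VOCAB = ["the", "cat", "sat"]
TOKEN_TO_ID = {tok: i for i, tok in enumerate(VOCAB)}

# One hypothetical 2-D vector per token (a real model would learn these).
EMBEDDINGS = [
    [1.0, 0.0],  # "the"
    [0.0, 1.0],  # "cat"
    [1.0, 1.0],  # "sat"
]

# Hypothetical 2x2 matrix that swaps the two coordinates.
W = [
    [0.0, 1.0],
    [1.0, 0.0],
]


def tokenize(text):
    """Step 1: split on whitespace and map each word to its token id."""
    return [TOKEN_TO_ID[word] for word in text.lower().split()]


def embed(ids):
    """Step 2: look up the vector for each token id."""
    return [EMBEDDINGS[i] for i in ids]


def transform(vectors):
    """Step 3: multiply each row vector by the matrix W."""
    return [
        [sum(v[k] * W[k][j] for k in range(len(v))) for j in range(len(W[0]))]
        for v in vectors
    ]


def decode(vectors):
    """Step 4: map each vector back to the token with the nearest embedding."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [
        VOCAB[min(range(len(VOCAB)), key=lambda i: sq_dist(v, EMBEDDINGS[i]))]
        for v in vectors
    ]


def run_pipeline(text):
    return " ".join(decode(transform(embed(tokenize(text)))))


print(run_pipeline("the cat sat"))  # cat the sat
```

Because `W` swaps the coordinates, "the" and "cat" trade places while "sat" maps to itself.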

The matrix transformation part is why it's called a "transformer". Do some reading yourself: https://en.wikipedia.org/wiki/Transformer_(deep_learning_arc...

> how silly you look

You'll look twice as silly after thinking vectors are unique to LLMs, or that the word "transformer" has anything to do with LLMs rather than lower-level array math.

Consider that a "vector database" is a very specific technology - yet the word "vector" is not off limits in other database-related libraries, especially ones that deal with vectors.

In any case - if you think I'm trying to pass it off as something else: what I call a "transformer" does tokenize lots of text (breaks it down by ~word, ~pixel) and derives semantic values (AKA trains) to produce real-time completions of inputs by way of math, not lookups. It fits the definition even in the sense where "transformer" means something more abstract than the mathematical term.

dang · 2024-09-06T19:27:07 1725650827

You guys both broke the site guidelines badly in this thread. We have to ban accounts that post like this, so please don't.

If you'd please review https://news.ycombinator.com/newsguidelines.html and stick to the rules when posting here, we'd appreciate it.

bschmidt1 · 2024-09-07T02:28:22 1725676102

I didn't know it was that strict. No offense to the other poster, it was just a little disagreement :)