
They are just inquiring as to what the underlying data structure and algorithm is, not what function it performs, or the myriad of ways it can be used.



It's an inquiry with an embedded false dichotomy/assumption that n-grams are not used in LLMs, when in fact ChatGPT also uses n-grams/"Markov chains". Popular embeddings including those ChatGPT uses like text-embedding-ada-002 and later also use parts-of-speech codes. And the chat interface uses conventional NLP too. Maybe some people think it's nothing but "magical vectors" doing all the work, but that's incorrect.

If you google "Is ChatGPT just a glorified Markov chain?" you will amazingly get pages of results of people asking this question, just like "Is jQuery just a glorified monad?", as if to reduce something novel down to useless, mere philosophy that "we've had" for thousands of years. Imagine suggesting a state management library in React to improve FE dev and getting the retort "Isn't that just a state machine?" in a discounting manner, and imagine the rest of the team actually nodding their heads in agreement like a scene in Idiocracy. Welcome to Hacker News.

For smart people, the answer to any question like this is "No". Google is not a glorified Array. Bitcoin is not a glorified LinkedList. Language models are not glorified Markov chains. To even ask that is so reductionist and incorrect that any answer obfuscates what they actually are.

Here's a gist you can paste into your browser that shows how both n-grams and conventional NLP (parts-of-speech analysis) are used to derive vector embeddings in the first place: https://gist.github.com/bennyschmidt/ba79ba64faa5ba18334b4ae... (following in the style of text-embedding-ada-002 albeit much tinier)
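
For a rough idea of what "n-grams plus parts-of-speech feeding a vector" can look like, here is a minimal Python sketch. It is not the gist's code and not how text-embedding-ada-002 works internally; the toy POS table, the numeric POS codes, and the three chosen features are all made up for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical toy part-of-speech lookup; a real tagger would replace this.
TOY_POS = {"the": "DET", "cat": "NOUN", "dog": "NOUN", "sat": "VERB", "ran": "VERB"}
# Hypothetical numeric codes for each tag (assumption, chosen arbitrarily).
POS_CODES = {"UNK": 0, "DET": 1, "NOUN": 2, "VERB": 3}


def bigram_counts(tokens):
    """Count how often each token is directly followed by each other token."""
    counts = defaultdict(Counter)
    for current, following in zip(tokens, tokens[1:]):
        counts[current][following] += 1
    return counts


def embed(token, tokens, counts):
    """Build a tiny vector: [relative frequency, POS code, number of distinct followers]."""
    frequency = tokens.count(token) / len(tokens)
    pos_code = POS_CODES[TOY_POS.get(token, "UNK")]
    distinct_followers = len(counts[token])
    return [frequency, pos_code, distinct_followers]


tokens = "the cat sat the dog ran the cat ran".split()
counts = bigram_counts(tokens)
vector = embed("the", tokens, counts)
print(vector)
```

The bigram table here is the same structure a first-order Markov chain would sample from, which is the sense in which the two ideas are not mutually exclusive.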

They are not mutually exclusive concepts to begin with. Never have been. None of these comments even deserve these lengthy replies (I am likely responding to a mix of 12- and 24-year-olds who don't care that much anyway, just want to "win"), yet I feel compelled to explain.


I think this is way too harsh. What if someone is not interested in learning a subject deeply, but still genuinely wonders whether they get the gist of it and/or wants to know where to start, just in case? Of course one of them will eventually remember Markov chains and start drawing parallels with modern LLMs. It is only natural. No need to berate people for that.

edit: I do appreciate your work and explanation, btw.


They don't have good intentions.

Thanks.



