
I took a very cursory look at the code, and it looks like this is just a standard Markov chain. Is it doing something different?



I get this question only on Hacker News, and am baffled as to why (and also the question "isn't this just n-grams, nothing more?").

https://github.com/bennyschmidt/next-token-prediction

^ If you look at this GitHub repo, it should be obvious that it's a token prediction library - the browser demo video there clearly shows it being used with an <input /> to autocomplete text based on your domain-specific data. Is THAT a Markov chain, nothing more? What a strange question, the answer is an obvious "No" - it's a front-end library for predicting text and pixels (AKA tokens).
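For anyone unfamiliar with the mechanics, here's a minimal sketch (my own illustration, not the library's actual code) of the counting idea that next-token prediction builds on - a bigram predictor trained on raw text:

```javascript
// Build a table of "token -> counts of tokens that follow it"
function train(corpus) {
  const model = {};
  const tokens = corpus.toLowerCase().split(/\s+/);
  for (let i = 0; i < tokens.length - 1; i++) {
    const cur = tokens[i];
    const next = tokens[i + 1];
    model[cur] = model[cur] || {};
    model[cur][next] = (model[cur][next] || 0) + 1;
  }
  return model;
}

// Predict the most frequent continuation of a token
function predictNext(model, token) {
  const counts = model[token.toLowerCase()];
  if (!counts) return null;
  return Object.entries(counts).sort((a, b) => b[1] - a[1])[0][0];
}

const model = train("the cat sat on the mat the cat ran");
console.log(predictNext(model, "the")); // "cat"
```

A real autocomplete layers UI, tokenization, and longer n-gram contexts on top of this, but the prediction step is just "look up what usually comes next."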

https://github.com/bennyschmidt/llimo

This project, which uses the aforementioned library, is a chat bot. There's an added NLP layer that uses parts-of-speech analysis to transform your inputs into a cursor that is completed (AKA "answered"). See the video where I am chatting with the bot about Paris? Is that nothing more than a standard Markov chain? Nothing else going on? Again the answer is an obvious "No" - it's a chat bot. What about the NLP work, or the chat interface, etc. makes you ask if it's nothing more than a standard [insert vague philosophical idea]?
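To illustrate the "cursor" idea (a deliberately naive hypothetical, not llimo's actual NLP): a question gets rewritten into a statement prefix that a next-token predictor can then complete as the answer.

```javascript
// Turn a question into a completion "cursor".
// Extremely naive pattern matching, for illustration only -
// a real system would use parts-of-speech analysis here.
function toCursor(question) {
  const q = question.replace(/[?]/g, "").trim();
  // "What is X?" -> "X is" (the predictor then completes "X is ...")
  const match = q.match(/^what is (.+)$/i);
  if (match) return `${match[1]} is`;
  return q; // fall back to completing the raw input
}

console.log(toCursor("What is the capital of France?"));
// "the capital of France is"
```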

To me, your question is like when people asked whether jQuery "is just a monad". I don't understand the significance of the question - jQuery is a library for web development. Maybe there are some similarities to the philosophical concept of a "monad"? See: https://stackoverflow.com/questions/10496932/is-jquery-a-mon...

It's like saying "I looked at your website and have concluded it is nothing more than an Array."


They are just inquiring as to what the underlying data structure and algorithm is, not what function it performs, or the myriad of ways it can be used.


It's an inquiry with an embedded false dichotomy/assumption that n-grams are not used in LLMs, when in fact ChatGPT also uses n-grams/"Markov chains". Popular embeddings including those ChatGPT uses like text-embedding-ada-002 and later also use parts-of-speech codes. And the chat interface uses conventional NLP too. Maybe some people think it's nothing but "magical vectors" doing all the work, but that's incorrect.

If you google "Is ChatGPT just a glorified Markov chain?" you will amazingly get pages of results of people asking this question, just like "Is jQuery just a glorified monad?" - as if to reduce something novel to mere philosophy that "we've had" for thousands of years. Imagine suggesting a state management library in React to improve FE dev and getting the retort "Isn't that just a state machine?" in a discounting manner, and imagine the rest of the team actually nodding their heads in agreement like a scene in Idiocracy - welcome to Hacker News.

For smart people, the answer to any question like this is "No". Google is not a glorified Array. Bitcoin is not a glorified LinkedList. Language models are not glorified Markov chains. To even ask that is so reductionist and incorrect that any answer obfuscates what they actually are.

Here's a gist you can paste into your browser that shows how both n-grams and conventional NLP (parts-of-speech analysis) are used to derive vector embeddings in the first place: https://gist.github.com/bennyschmidt/ba79ba64faa5ba18334b4ae... (following in the style of text-embedding-ada-002 albeit much tinier)
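As a far smaller illustration of the same idea (my own sketch, much simpler than the gist - it skips the parts-of-speech codes and just uses co-occurrence counts): word vectors can be derived from nothing more than counting which words appear near which.

```javascript
// Derive tiny word vectors from co-occurrence counts within a window.
// Each word's vector is a row counting how often every vocab word
// appeared within `windowSize` tokens of it.
function cooccurrenceEmbeddings(corpus, windowSize = 2) {
  const tokens = corpus.toLowerCase().split(/\s+/);
  const vocab = [...new Set(tokens)];
  const index = Object.fromEntries(vocab.map((w, i) => [w, i]));
  const vectors = vocab.map(() => new Array(vocab.length).fill(0));
  tokens.forEach((word, i) => {
    const lo = Math.max(0, i - windowSize);
    const hi = Math.min(tokens.length - 1, i + windowSize);
    for (let j = lo; j <= hi; j++) {
      if (j !== i) vectors[index[word]][index[tokens[j]]] += 1;
    }
  });
  return { vocab, vectors, index };
}

const { vocab, vectors, index } = cooccurrenceEmbeddings("a b a c");
console.log(vectors[index["a"]]); // a's vector: counts of its neighbors
```

Real embedding models are trained, not counted, and add far more signal (including POS features, as in the gist) - but the starting material is distributional statistics over tokens, not "magical vectors."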

They are not mutually exclusive concepts to begin with. Never have been. None of these comments even deserve these lengthy replies (I am likely responding to a mix of 12- and 24-year-olds who don't care that much anyway, just want to "win"), yet I feel compelled to explain.


I think this is way too harsh. What if someone is not interested in learning a subject deeply, but still genuinely wonders whether they get the gist of it, or wants to know where to start in case they ever do? Of course one of them will eventually remember Markov chains and start drawing parallels with modern LLMs. It is only natural. No need to berate people for that.

edit: I do appreciate your work and explanation, btw.


They don't have good intentions.

Thanks.



