I find the whole thing ominous because there is no "there" there: there is no understanding in the GPT-2 system, yet it's able to generate increasingly plausible text. That greatly increases the amount of plausible nonsense that can be used to drown out actual research. You could certainly replace a lot of pop-sci and start several political movements with GPT-2... none of which would have any actual nutritional content.
I find it intriguing for exactly that reason. I agree there's no fundamental "there", but I suggest that you may find that ominous because it implies there's no fundamental understanding anywhere. Only stories that survive scrutiny.
GPT can write "about" something from a prompt. That's not much different from me interpreting data that I'm analyzing. I'm constantly generating stories and checking them until one story survives it all. How do I generate stories!? Seriously. I'm sure I have a GPT module in my left frontal cortex. I use it all the time when I think about actions I take, and it's what I try to ignore when I meditate. Its ongoing narrative feeds back into how I feel about things, which affects how I interact with things and what things I interact with... not necessarily as a goal-driven decision process, more as a feedback-driven randomized selection. Isn't this kind of the basis of Cognitive Behavioral Therapy, meditation, etc.? See [1, 2]. If you stick GPT and sentiment analysis into a room together, will they produce a rumination feedback loop like a depressed person?
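As a toy version of that question (purely a sketch: the model choices, the prompt-update rule, and the mood thresholding below are my own assumptions, not anything established about rumination), the loop is easy to wire up with off-the-shelf Hugging Face pipelines:

    # Hypothetical sketch of "GPT + sentiment analysis in a room".
    # gpt2 and the default sentiment model are arbitrary choices; the
    # mood-dependent prompt update is a crude stand-in for rumination.
    from transformers import pipeline

    generator = pipeline("text-generation", model="gpt2")
    sentiment = pipeline("sentiment-analysis")

    prompt = "I keep thinking about what went wrong today."
    for step in range(5):
        # Continue the current "inner narrative".
        text = generator(prompt, max_new_tokens=40,
                         num_return_sequences=1)[0]["generated_text"]

        # Score how the narrative "feels" (truncate so the classifier input stays short).
        mood = sentiment(text[-500:])[0]
        print(f"step {step}: {mood['label']} ({mood['score']:.2f})")

        # Feed the mood back into the next prompt: negative affect steers
        # the next round of story generation, positive affect softens it.
        if mood["label"] == "NEGATIVE":
            prompt = text + " And that makes it worse, because"
        else:
            prompt = text + " Still, maybe it will be fine, because"

Whether that loop actually spirals the way rumination does is an empirical question; the point is just how little machinery the feedback structure needs.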
Anyway, if you can tell a coherent story to justify a result (once presented with a result), one that is convincing enough for people to believe and internalize the result in their future studies, how is that different from understanding that result and teaching it to others? The act of teaching is itself story generation. Mental models are just story-driven hacks that allow people to generalize results in an insanely complex system.
1. Jonathan Haidt, The Happiness Hypothesis
2. Buddhism and Modern Psychology, Coursera
You could probably ask an undergraduate or pop science fan to write a paper title and first sentence of an abstract that would turn heads and get good results.
Faking an entire 10-page paper with figures and citations is much harder. I'm sure it'll happen next week, but until then I can still say that's where real understanding is demonstrated.
> there is no understanding in the GPT-2 system, but it's able to generate increasingly plausible text
Under some definition of "understanding". GPT understands how to link words and concepts in a broadly correct manner. As long as the training data is valid, it's very plausible that it could connect concepts that it genuinely and correctly understands to be compatible, but do so in a way that humans had not considered.
It can't do research or verify truth, but I've seen several examples of it coming up with an idea that, as far as I can tell, had never been explored but made perfect sense. It understood that those concepts fit together because it saw a chain connecting them through god knows how many input texts, yet a human would never have thought of it. That's still valuable.
As to how far that understanding can be developed... I'm not sure. It's hard to believe that algorithmically generated text would ever be able to somehow ensure that it produces true text, but then again ten years ago I would have scoffed at the idea of it getting as far as it already has.
Makes you wonder if humans also have less "there" than we give ourselves credit for. How much of sentence construction is just repeating familiar tropes in barely-novel ways?
To what extent have our brains already decided what to say while we still perceive ourselves as 'thinking about the wording'?
No, the text is generally very implausible if you know anything about the science. For example, it describes DNA twists with protein-folding descriptors, mixes up quantum computing with astronomy, inorganic chemistry with biochemistry, virology with bacteriology...
I was really impressed with GPT-2, but seeing this really gave me a feel for how little understanding it has.
A striking argument, at this point in time.
Quite likely this weakness will be overcome soon, when deep learning becomes integrated with [KR²]-spectrum methods...
This is already happening. Maybe not in Nature, but all over the "small" papers of this world (and so also in any Nature derivatives). There are people milling the same article with nonsense "results" through 3 journals each at Elsevier, Springer, Wiley, and RSC. And they stay up. Forever. And anyone looking into the topic will waste an hour, because these 10x-cited articles are just self-citations of the same crap.